"PA1_template.md" completed and "PA1_template.html" generated. #511

Open
wants to merge 2 commits into
base: master
96 changes: 91 additions & 5 deletions PA1_template.Rmd
@@ -5,21 +5,107 @@ output:
keep_md: true
---

```{r setup, include=FALSE}
knitr::opts_chunk$set(echo = TRUE)
```
# Activity Monitoring Data Analysis

## 0. Load the necessary libraries
```{r}
library(utils)
library(ggplot2)
```
## 1. Loading and preprocessing the data
```{r}
# Read the CSV file directly from the zipped archive
# (unz() opens a connection to the file inside the zip without extracting it to disk)
zipFilePath <- "activity.zip"
csvFileName <- "activity.csv"
data <- read.csv(unz(zipFilePath, csvFileName))

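# Aggregate total steps by date; the formula interface drops rows with NA steps,
# so days with no recorded steps are omitted from the result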
totalStepsByDay <- aggregate(steps ~ date, data, sum, na.rm = TRUE)
```
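
The choice of aggregation function affects how days with only missing values are treated. The following is a minimal sketch on made-up toy data (the values and the names `toy`, `byFormula`, and `byTapply` are illustrative assumptions, not taken from `activity.csv`). It contrasts the formula interface of `aggregate()`, which omits NA rows before summing, with `tapply()`, which keeps the all-NA day and reports it as 0 when `na.rm = TRUE`.

```r
# Toy data (hypothetical values): the first day has only missing steps
toy <- data.frame(
  date  = c("2012-10-01", "2012-10-01", "2012-10-02", "2012-10-02"),
  steps = c(NA, NA, 10, 20)
)

# Formula interface: rows with NA are dropped first (na.action = na.omit),
# so 2012-10-01 does not appear in the result at all
byFormula <- aggregate(steps ~ date, toy, sum)

# tapply with na.rm = TRUE keeps the all-NA day and reports a total of 0
byTapply <- tapply(toy$steps, toy$date, sum, na.rm = TRUE)

print(byFormula)
print(byTapply)
```

Which behavior is preferable depends on whether a day without recordings should count as zero steps or be left out of the histogram, mean, and median.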

## 2. What is the mean total number of steps taken per day?
```{r}
# Make a histogram of the total number of steps taken each day
hist(totalStepsByDay$steps, main = "Total Steps Taken Each Day", xlab = "Total Steps", ylab = "Frequency", col = "blue")

# Calculate and report the mean and median total number of steps taken per day
meanSteps <- mean(totalStepsByDay$steps)
medianSteps <- median(totalStepsByDay$steps)

# Print the results
cat("Mean total number of steps taken per day: ", meanSteps, "\n")
cat("Median total number of steps taken per day: ", medianSteps, "\n")
```

## 3. What is the average daily activity pattern?
```{r}
# Calculate the average number of steps taken in each 5-minute interval
averageStepsByInterval <- aggregate(steps ~ interval, data, mean, na.rm = TRUE)

# Make a time series plot
plot(averageStepsByInterval$interval, averageStepsByInterval$steps, type = "l", xlab = "Interval", ylab = "Average Number of Steps", main = "Average Daily Activity Pattern")

# Identify the interval with the maximum average number of steps
maxInterval <- averageStepsByInterval[which.max(averageStepsByInterval$steps), ]$interval

# Print the result
cat("The 5-minute interval with the maximum number of steps on average is:", maxInterval, "\n")
```
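
The reported interval is easier to read as a time of day. This sketch assumes the `interval` column encodes clock time as HHMM (for example, 835 for 08:35), which is how the commonly distributed `activity.csv` is laid out but is not stated in this PR. The helper name `intervalToClock` is hypothetical.

```r
# Hypothetical helper: convert an HHMM-style interval identifier to "HH:MM".
# Assumes interval values such as 0, 5, ..., 55, 100, 105, ..., 2355.
intervalToClock <- function(interval) {
  interval <- as.integer(interval)
  sprintf("%02d:%02d", interval %/% 100L, interval %% 100L)
}

clockLabels <- intervalToClock(c(0, 835, 2355))
print(clockLabels)
```

If that assumption holds, note that plotting `interval` directly on the x axis leaves gaps between 55 and 100, 155 and 200, and so on, because the identifier is not a count of minutes.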

## 4. Imputing missing values
```{r}
# Calculate and report the total number of missing values
totalNAs <- sum(is.na(data$steps))
cat("Total number of missing values: ", totalNAs, "\n")

# Calculate the mean for each interval
averageStepsByInterval <- aggregate(steps ~ interval, data, mean, na.rm = TRUE)

# Fill in each missing value with the mean of its 5-minute interval and create a new dataset
filledData <- data
for (i in seq_len(nrow(filledData))) {
  if (is.na(filledData$steps[i])) {
    filledData$steps[i] <- averageStepsByInterval$steps[which(averageStepsByInterval$interval == filledData$interval[i])]
  }
}

# Aggregate total steps by day for the filled dataset
totalStepsByDayFilled <- aggregate(steps ~ date, filledData, sum)

# Make a histogram
hist(totalStepsByDayFilled$steps, main = "Total Steps Taken Each Day (Filled Data)", xlab = "Total Steps", ylab = "Frequency", col = "green")

# Calculate mean and median
meanStepsFilled <- mean(totalStepsByDayFilled$steps)
medianStepsFilled <- median(totalStepsByDayFilled$steps)

# Print the results
cat("Mean total number of steps taken per day (Filled Data): ", meanStepsFilled, "\n")
cat("Median total number of steps taken per day (Filled Data): ", medianStepsFilled, "\n")
```
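
Interval-mean imputation can also be written without an explicit loop. The following is a minimal sketch on made-up toy data (the values and the names `toy`, `intervalMeans`, `filled`, and `isMissing` are illustrative assumptions). It uses `match()` to look up the mean for each missing row's interval in a single vectorized step.

```r
# Toy data (hypothetical values): two intervals, one missing value in each
toy <- data.frame(
  steps    = c(NA, 4, 6, NA, 10, 8),
  interval = c(0, 5, 0, 5, 0, 5)
)

# Mean steps per interval, ignoring missing rows:
# interval 0 -> mean(6, 10) = 8, interval 5 -> mean(4, 8) = 6
intervalMeans <- aggregate(steps ~ interval, toy, mean)

# Replace each NA with the mean of its own interval
filled <- toy
isMissing <- is.na(filled$steps)
filled$steps[isMissing] <- intervalMeans$steps[
  match(filled$interval[isMissing], intervalMeans$interval)
]

print(filled)
```

On a dataset of this size the loop and the vectorized form should give the same result; the vectorized form mainly avoids row-by-row indexing.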

## 5. Are there differences in activity patterns between weekdays and weekends?
```{r}
# Convert 'date' to Date type if it's not already
filledData$date <- as.Date(filledData$date)

# Create a new factor variable for weekday/weekend.
# POSIXlt's wday field (0 = Sunday, 6 = Saturday) is used so the result does not depend on the locale.
filledData$dayType <- ifelse(as.POSIXlt(filledData$date)$wday %in% c(0, 6), "weekend", "weekday")
filledData$dayType <- factor(filledData$dayType, levels = c("weekday", "weekend"))

# Aggregate average steps by interval and dayType
averageStepsByDayType <- aggregate(steps ~ interval + dayType, filledData, mean)

# Create the plot
ggplot(averageStepsByDayType, aes(x = interval, y = steps, color = dayType)) +
  geom_line() +
  xlab("Interval") +
  ylab("Average Number of Steps") +
  ggtitle("Average Number of Steps by Interval: Weekday vs Weekend") +
  scale_color_manual(values = c("weekday" = "blue", "weekend" = "red")) +
  theme(legend.position = "bottom")
```
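
`weekdays()` returns day names in the current locale, so comparing its output against fixed labels only works on machines configured for that one language. The following is a minimal sketch of a locale-independent check based on the `wday` field of `POSIXlt` (0 = Sunday, 6 = Saturday). The helper name `isWeekend` is hypothetical, and the example dates were chosen only because 2012-10-06 and 2012-10-07 fall on a Saturday and a Sunday.

```r
# Hypothetical helper: TRUE for Saturdays and Sundays, regardless of locale
isWeekend <- function(dates) {
  as.POSIXlt(as.Date(dates))$wday %in% c(0, 6)
}

# Friday, Saturday, Sunday, Monday
weekendFlags <- isWeekend(c("2012-10-05", "2012-10-06", "2012-10-07", "2012-10-08"))
print(weekendFlags)
```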
509 changes: 509 additions & 0 deletions PA1_template.html

Large diffs are not rendered by default.