CodeBook

This codebook describes the variables in the tidy dataset generated by run_analysis.R, which does the following:

    1. Merges the training and the test sets to create one data set.
    2. Extracts only the measurements on the mean and standard deviation for each measurement.
    3. Uses descriptive activity names to name the activities in the data set.
    4. Appropriately labels the data set with descriptive variable names.
    5. Creates a second, independent tidy data set with the average of each variable for each activity and each subject.
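The five steps above can be sketched on a tiny in-memory example. This is only an illustration under stated assumptions, not the actual contents of run_analysis.R: the toy values, the two-activity label table, and the simplified name cleanup in step 4 are all made up for the example.

```r
# Toy stand-ins for the real files; every value below is invented for illustration.
features <- c("tBodyAcc-mean()-X", "tBodyAcc-std()-X", "tBodyAcc-max()-X")
activity_labels <- data.frame(id = 1:2, name = c("WALKING", "SITTING"))

train <- data.frame(subject.id = c(1, 1, 3), label = c(1, 1, 2),
                    matrix(c(0.1, 0.3, 0.5,  0.2, 0.4, 0.6,  9, 9, 9), nrow = 3))
test  <- data.frame(subject.id = c(2, 2), label = c(2, 2),
                    matrix(c(0.7, 0.9,  0.8, 1.0,  9, 9), nrow = 2))

# 1. Merge the training and test sets.
merged <- rbind(train, test)
names(merged)[-(1:2)] <- features

# 2. Keep only the mean() and std() measurements (plus the two id columns).
keep <- grepl("-(mean|std)\\(\\)", features)
merged <- merged[, c(TRUE, TRUE, keep)]

# 3. Replace numeric class labels with descriptive activity names.
merged$activity <- factor(merged$label,
                          levels = activity_labels$id,
                          labels = as.character(activity_labels$name))
merged$label <- NULL

# 4. Give the measurement columns readable names (simplified cleanup only).
measure_cols <- setdiff(names(merged), c("subject.id", "activity"))
clean <- tolower(gsub("[()]", "", measure_cols))
clean <- gsub("-", ".", clean, fixed = TRUE)
names(merged)[match(measure_cols, names(merged))] <- clean

# 5. Average every measurement for each subject/activity pair.
tidy <- aggregate(merged[clean],
                  by = list(subject.id = merged$subject.id,
                            activity = merged$activity),
                  FUN = mean)
```

The result has one row per subject/activity pair, with subject.id and activity as the first two columns, which mirrors the layout of the tidy dataset described below.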

Original Data

Data Collection Description

The original data set was obtained from [http://archive.ics.uci.edu/ml/datasets/Human+Activity+Recognition+Using+Smartphones]

It is described as a "Human Activity Recognition database built from the recordings of 30 subjects performing activities of daily living while carrying a waist-mounted smartphone with embedded inertial sensors."

The experiments involved 30 subjects. Each subject performed six activities (WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING) with a sensor-equipped smartphone attached to their waist. The smartphone's accelerometer and gyroscope were used to capture 3-axial linear acceleration and 3-axial angular velocity at a constant rate of 50Hz.

The experiments were video-recorded so that the data could be labeled manually. The resulting dataset was randomly partitioned into two sets: 70% of the volunteers were selected to generate the training data and 30% to generate the test data.
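The 70/30 split can be checked against the subject IDs listed under "Required files" in this codebook. The snippet below is a small sanity check, not part of run_analysis.R; the variable names are chosen here for illustration.

```r
# Subject IDs as listed in this codebook for each partition.
train_subjects <- c(1, 3, 5, 6, 7, 8, 11, 14, 15, 16, 17, 19, 21,
                    22, 23, 25, 26, 27, 28, 29, 30)
test_subjects  <- c(2, 4, 9, 10, 12, 13, 18, 20, 24)

# The two partitions should not share any subject ...
no_overlap <- length(intersect(train_subjects, test_subjects)) == 0

# ... and together they should cover subjects 1 to 30 exactly.
covers_all <- setequal(c(train_subjects, test_subjects), 1:30)

# Share of volunteers assigned to the training set.
train_share <- length(train_subjects) / 30
```

With 21 training subjects and 9 test subjects, the training share is 21/30 = 0.7, which matches the 70% stated above.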

See the website above for further details on sensor signal pre-processing.

Required files

  • train/subject_train.txt, test/subject_test.txt: Each row identifies the subject who performed the activity for each window sample. Its range is from 1 to 30.

      train [1 3 5 6 7 8 11 14 15 16 17 19 21 22 23 25 26 27 28 29 30]
      test  [2 4 9 10 12 13 18 20 24]
    
  • activity_labels.txt: Links the class labels with their activity name.

      1 WALKING
      2 WALKING_UPSTAIRS
      3 WALKING_DOWNSTAIRS
      4 SITTING
      5 STANDING
      6 LAYING

  • train/y_train.txt, test/y_test.txt: Each row identifies the class label of an activity. Its range is from 1 to 6.

  • features.txt: List of all features. The tidy data includes only the mean and standard deviation features for each measurement.

  • train/X_train.txt, test/X_test.txt: Training and test set. Each row is a 561-feature vector with time and frequency domain variables.
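As a rough sketch of how the per-split files listed above fit together, the helper below reads the subject, label, and feature files for one split and binds them column-wise. The function name read_split and its data_dir argument are hypothetical and are not taken from run_analysis.R; the only assumption about the files is the layout described above (whitespace-separated, no header row).

```r
# Hypothetical helper: read one split ("train" or "test") from data_dir.
read_split <- function(data_dir, split) {
  # One subject id per row (1 to 30).
  subject <- read.table(file.path(data_dir, split, paste0("subject_", split, ".txt")),
                        col.names = "subject.id")
  # One activity class label per row (1 to 6).
  label <- read.table(file.path(data_dir, split, paste0("y_", split, ".txt")),
                      col.names = "label")
  # One feature vector per row; columns get default names V1, V2, ...
  x <- read.table(file.path(data_dir, split, paste0("X_", split, ".txt")))
  cbind(subject, label, x)
}

# Example usage (the path is a placeholder for wherever the data was unpacked):
# merged <- rbind(read_split("path/to/dataset", "train"),
#                 read_split("path/to/dataset", "test"))
```

Because the three files in a split have the same number of rows in the same order, a plain cbind is enough to line them up; no key-based join is needed.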

Tidy Data

This picture comes from the Course Discussion Forums. It illustrates the relationship between the original data and the tidy data. In this project, subject.id and activity are moved to the first and second columns.

Complete list of variables by column number

Note that I have cleaned the original variable names for better readability/understanding. See the 'Style' section in README for more info if necessary.

Column No. Variable Name Values
1 subject.id 1-30
2 activity WALKING, WALKING_UPSTAIRS, WALKING_DOWNSTAIRS, SITTING, STANDING, LAYING
3 time.body.acceleration.mean.x.averaged
4 time.body.acceleration.mean.y.averaged
5 time.body.acceleration.mean.z.averaged
6 time.body.acceleration.std.x.averaged
7 time.body.acceleration.std.y.averaged
8 time.body.acceleration.std.z.averaged
9 time.gravity.acceleration.mean.x.averaged
10 time.gravity.acceleration.mean.y.averaged
11 time.gravity.acceleration.mean.z.averaged
12 time.gravity.acceleration.std.x.averaged
13 time.gravity.acceleration.std.y.averaged
14 time.gravity.acceleration.std.z.averaged
15 time.body.acceleration.jerk.mean.x.averaged
16 time.body.acceleration.jerk.mean.y.averaged
17 time.body.acceleration.jerk.mean.z.averaged
18 time.body.acceleration.jerk.std.x.averaged
19 time.body.acceleration.jerk.std.y.averaged
20 time.body.acceleration.jerk.std.z.averaged
21 time.body.gyro.mean.x.averaged
22 time.body.gyro.mean.y.averaged
23 time.body.gyro.mean.z.averaged
24 time.body.gyro.std.x.averaged
25 time.body.gyro.std.y.averaged
26 time.body.gyro.std.z.averaged
27 time.body.gyro.jerk.mean.x.averaged
28 time.body.gyro.jerk.mean.y.averaged
29 time.body.gyro.jerk.mean.z.averaged
30 time.body.gyro.jerk.std.x.averaged
31 time.body.gyro.jerk.std.y.averaged
32 time.body.gyro.jerk.std.z.averaged
33 time.body.acceleration.magnitude.mean.averaged
34 time.body.acceleration.magnitude.std.averaged
35 time.gravity.acceleration.magnitude.mean.averaged
36 time.gravity.acceleration.magnitude.std.averaged
37 time.body.acceleration.jerk.magnitude.mean.averaged
38 time.body.acceleration.jerk.magnitude.std.averaged
39 time.body.gyro.magnitude.mean.averaged
40 time.body.gyro.magnitude.std.averaged
41 time.body.gyro.jerk.magnitude.mean.averaged
42 time.body.gyro.jerk.magnitude.std.averaged
43 freq.body.acceleration.mean.x.averaged
44 freq.body.acceleration.mean.y.averaged
45 freq.body.acceleration.mean.z.averaged
46 freq.body.acceleration.std.x.averaged
47 freq.body.acceleration.std.y.averaged
48 freq.body.acceleration.std.z.averaged
49 freq.body.acceleration.jerk.mean.x.averaged
50 freq.body.acceleration.jerk.mean.y.averaged
51 freq.body.acceleration.jerk.mean.z.averaged
52 freq.body.acceleration.jerk.std.x.averaged
53 freq.body.acceleration.jerk.std.y.averaged
54 freq.body.acceleration.jerk.std.z.averaged
55 freq.body.gyro.mean.x.averaged
56 freq.body.gyro.mean.y.averaged
57 freq.body.gyro.mean.z.averaged
58 freq.body.gyro.std.x.averaged
59 freq.body.gyro.std.y.averaged
60 freq.body.gyro.std.z.averaged
61 freq.body.acceleration.magnitude.mean.averaged
62 freq.body.acceleration.magnitude.std.averaged
63 freq.body.body.acceleration.jerk.magnitude.mean.averaged
64 freq.body.body.acceleration.jerk.magnitude.std.averaged
65 freq.body.body.gyro.magnitude.mean.averaged
66 freq.body.body.gyro.magnitude.std.averaged
67 freq.body.body.gyro.jerk.magnitude.mean.averaged
68 freq.body.body.gyro.jerk.magnitude.std.averaged
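The cleaned names above follow a regular pattern relative to the original feature names in features.txt (for example tBodyAcc-mean()-X or fBodyBodyGyroJerkMag-std()). The function below is one possible way to reproduce that pattern; clean_feature_name is a hypothetical helper written for this codebook, and the actual rules used by run_analysis.R are the ones described in the 'Style' section of the README.

```r
# Hypothetical helper: map an original feature name to the style used in this codebook.
clean_feature_name <- function(x) {
  x <- sub("^t", "time.", x)             # leading "t" marks the time domain
  x <- sub("^f", "freq.", x)             # leading "f" marks the frequency domain
  x <- gsub("Body", "body.", x)
  x <- gsub("Gravity", "gravity.", x)
  x <- gsub("Acc", "acceleration.", x)
  x <- gsub("Gyro", "gyro.", x)
  x <- gsub("Jerk", "jerk.", x)
  x <- gsub("Mag", "magnitude.", x)
  x <- gsub("[()]", "", x)               # drop the "()" after mean/std
  x <- gsub("-", ".", x, fixed = TRUE)   # use dots as the only separator
  x <- gsub("\\.+", ".", x)              # collapse runs of dots
  paste0(tolower(x), ".averaged")        # values are per-subject, per-activity means
}

clean_feature_name("tBodyAcc-mean()-X")
clean_feature_name("fBodyBodyGyroJerkMag-std()")
```

The doubled "body.body" in columns 63 to 68 is carried over from the original feature names (such as fBodyBodyGyroJerkMag-std()), so this sketch leaves it unchanged rather than collapsing it.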