Data Analyst Capstone Project

Capstone project for the end of the Google Advanced Data Analytics course on Coursera.

The main goal of the project is to analyze the data and build a model that predicts whether an employee will leave the company. The notebook is divided into four steps:

  1. Package import and dataset load
  2. Data exploration and visualization
    • Understand the variables
    • Clean the dataset (missing data, redundant data, outliers)
    • Boxplots, scatterplots, histograms and heatmaps
  3. Model building with two approaches
    • Model approach A: Logistic Regression
    • Model approach B: Tree-based Machine Learning
  4. Results and evaluation
    • Summary of model results
    • Conclusion and next steps
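
The four steps above can be sketched roughly as follows. This is a minimal illustration, not the notebook's actual code: the dataset is synthetic, the column names (`satisfaction`, `monthly_hours`, `tenure`, `left`) are hypothetical, and a random forest is assumed as one possible tree-based model.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Step 1: load the data. A synthetic stand-in is generated here;
# all column names are hypothetical.
rng = np.random.default_rng(0)
n = 1000
df = pd.DataFrame({
    "satisfaction": rng.uniform(0, 1, n),
    "monthly_hours": rng.normal(200, 40, n),
    "tenure": rng.integers(1, 11, n),
})
# Made-up labeling rule so the target is learnable:
# dissatisfied employees with long hours are marked as having left.
df["left"] = ((df["satisfaction"] < 0.4) & (df["monthly_hours"] > 200)).astype(int)

# Step 2: clean the dataset (duplicates, missing values, IQR-based outliers).
df = df.drop_duplicates().dropna()
q1, q3 = df["tenure"].quantile([0.25, 0.75])
iqr = q3 - q1
df = df[df["tenure"].between(q1 - 1.5 * iqr, q3 + 1.5 * iqr)]

# Step 3: fit both model approaches on the same train/test split.
X = df.drop(columns="left")
y = df["left"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)
models = {
    "logistic_regression": LogisticRegression(max_iter=1000),
    "random_forest": RandomForestClassifier(n_estimators=100, random_state=0),
}

# Step 4: evaluate each model on the held-out set with the F1 score.
scores = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    scores[name] = f1_score(y_test, model.predict(X_test))

print(scores)
```

Fitting both models on the same split keeps the comparison in step 4 fair. The F1 score is only one reasonable choice of metric when the classes are imbalanced; the notebook may report others.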