I. Project 1: COVID-19 Data analysis

Introduction

First identified in China, the SARS-CoV-2 coronavirus is challenging the entire world, including its major economies. The pandemic continues to spread, both to new countries and within those already affected. Almost 3.6 million confirmed cases and 257,000 deaths have been officially reported on the World Health Organization website. Africa has recorded 27,973 confirmed cases, i.e. 0.86% of the confirmed cases worldwide. In this study, we briefly summarize the situation in two African countries (Ghana and Rwanda) in order to highlight how their populations feel about the current difficult situation.

Datasets used

For this report, we use two data sources: a quantitative dataset from https://covid.ourworldindata.org and a qualitative dataset from a survey published on https://data.humdata.org. The analysis covers two countries, Ghana and Rwanda, so we extracted the parameters measured for these two countries from the two global datasets.
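The extraction step can be sketched as follows. This is a minimal illustration, not the project's actual code: the column name `location`, the helper `filter_countries`, and the sample rows are all assumptions, and the numbers are synthetic placeholders rather than real case counts.

```python
import csv
import io

# The two countries covered by the analysis.
COUNTRIES = {"Ghana", "Rwanda"}


def filter_countries(csv_text, countries=COUNTRIES, column="location"):
    """Return the rows of a CSV export whose country column is in `countries`.

    `column` is an assumed name for the country field; adjust it to match
    the header of the file actually downloaded.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    return [row for row in reader if row[column] in countries]


# Synthetic placeholder data, only to show the shape of the input.
sample = """location,date,total_cases
Ghana,2020-05-01,10
Kenya,2020-05-01,20
Rwanda,2020-05-01,30
"""

rows = filter_countries(sample)
print([row["location"] for row in rows])
```

The same filter can be applied to both sources, provided `column` is set to whatever each file calls its country field.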

II. Project 2: Predicting Neurodegenerative Diseases: e.g. Parkinson's disease

Introduction

Neurodegenerative diseases are a heterogeneous group of disorders characterized by the progressive degeneration of the structure and function of the nervous system. They are incurable and debilitating conditions that cause problems with mental functioning, also called dementias. An estimated 930,000 people in the United States could be living with Parkinson’s disease by 2020. The goal of this project is to build a model that accurately predicts the presence of a neurodegenerative disease in an individual. Early detection could help identify people who can participate in trials of neuroprotective agents and, once effective disease-modifying interventions have been identified, ultimately help halt disease progression.

Datasets used

This dataset (https://archive.ics.uci.edu/ml/datasets/Parkinsons) is composed of a range of biomedical voice measurements from 31 people, 23 of whom have Parkinson's disease (PD). Each column in the table is a particular voice measure, and each row corresponds to one of 195 voice recordings from these individuals ("name" column). The main aim of the data is to discriminate healthy people from those with PD, according to the "status" column, which is set to 0 for healthy and 1 for PD.
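Because the 195 recordings come from only 31 people, one design choice worth considering is to split the data by person rather than by recording, so that the same individual never appears in both the training and the test set. The sketch below shows one way to do that. It assumes that the "name" value ends with a recording index after the last underscore (e.g. `phon_R01_S01_1`); the helpers `subject_of` and `split_by_subject` are hypothetical, and the rows are synthetic placeholders.

```python
import random
from collections import defaultdict


def subject_of(name):
    # Assumed naming pattern "<subject>_<recording index>":
    # dropping the last underscore-separated part leaves the subject ID.
    return name.rsplit("_", 1)[0]


def split_by_subject(rows, test_fraction=0.25, seed=0):
    """Split rows into train/test so that each subject lands in one set only."""
    by_subject = defaultdict(list)
    for row in rows:
        by_subject[subject_of(row["name"])].append(row)

    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    n_test = max(1, round(len(subjects) * test_fraction))
    test_subjects = set(subjects[:n_test])

    train = [r for s in subjects if s not in test_subjects for r in by_subject[s]]
    test = [r for s in subjects if s in test_subjects for r in by_subject[s]]
    return train, test


# Synthetic placeholder rows: 8 subjects with 3 recordings each.
rows = [
    {"name": f"phon_R01_S{s:02d}_{i}", "status": s % 2}
    for s in range(1, 9)
    for i in range(1, 4)
]
train, test = split_by_subject(rows)
print(len(train), len(test))
```

Splitting by recording instead would let a model recognize a speaker it has already seen, which tends to overstate how well it detects PD in new individuals.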

III. Project 3: Time Series Analysis of NAICS

Introduction

  • NAICS Dataset Purpose

The North American Industry Classification System (NAICS) is an industry classification system developed by the statistical agencies of Canada, Mexico and the United States. Created against the background of the North American Free Trade Agreement, it is designed to provide common definitions of the industrial structure of the three countries and a common statistical framework to facilitate the analysis of the three economies. NAICS is based on supply-side or production-oriented principles, to ensure that industrial data, classified to NAICS, are suitable for the analysis of production-related issues such as industrial performance. NAICS is a comprehensive system encompassing all economic activities. It has a hierarchical structure. At the highest level, it divides the economy into 20 sectors. At lower levels, it further distinguishes the different economic activities in which businesses are engaged.

  • Task: Prepare the data set and analyze it as a data scientist (DS)
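As a starting point for preparing the data, the hierarchical structure described above can be sketched in code. In NAICS, the first two digits of a code identify the sector, and each additional digit (up to six) identifies a finer level. The helper `naics_ancestry` below is hypothetical and only illustrates how a code expands into its parent levels; it is not taken from the project.

```python
# Number of leading digits -> name of the NAICS level they identify.
LEVELS = {
    2: "sector",
    3: "subsector",
    4: "industry group",
    5: "industry",
    6: "national industry",
}


def naics_ancestry(code):
    """Return (level name, code prefix) pairs from the sector down to `code`."""
    code = str(code)
    if not code.isdigit() or not 2 <= len(code) <= 6:
        raise ValueError(f"expected a 2- to 6-digit NAICS code, got {code!r}")
    return [(LEVELS[n], code[:n]) for n in range(2, len(code) + 1)]


# An arbitrary four-digit code, used only to show the expansion.
for level, prefix in naics_ancestry("3118"):
    print(f"{level}: {prefix}")
```

Note that a few sectors span a range of two-digit codes (for example 31-33), so grouping records by sector needs a lookup from two-digit prefix to sector rather than the prefix alone.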