This project predicts local epidemics of dengue fever to help fight life-threatening pandemics. It was created as an entry to the DengAI: Predicting Disease Spread competition hosted by DrivenData.
The project predicts the number of dengue fever cases reported each week in the following locations:
- San Juan (Puerto Rico)
- Iquitos (Peru)
The predictor variables are environmental measurements describing changes in temperature, precipitation, vegetation, and more.
To allow interactive visualisation of the total cases over time by city, we built a simple Streamlit app. Use it to discover patterns in the data!
This is a time series project that uses Random Forest and Negative Binomial regression models to predict the total cases of dengue fever over time in each city.
We evaluated the models with the Mean Squared Error (MSE) metric.
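As a reminder of what the metric measures, here is a minimal sketch of MSE in plain Python. The function name `mean_squared_error` and the case counts are illustrative assumptions, not code or data taken from the notebooks.

```python
def mean_squared_error(y_true, y_pred):
    """Return the average squared difference between observed and predicted values."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    if len(y_true) == 0:
        raise ValueError("inputs must not be empty")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Made-up weekly case counts, for illustration only (not competition data).
observed = [4, 5, 4, 3, 6]
predicted = [3, 5, 6, 3, 4]

mse = mean_squared_error(observed, predicted)
print(f"MSE: {mse:.2f}")  # prints "MSE: 1.80"
```

Because the errors are squared, a few badly missed outbreak weeks raise the score far more than many small misses do, which is worth keeping in mind when comparing the two models.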
data/
- Contains the training and test datasets.

notebooks/
- Jupyter notebooks with exploratory data analysis and model development.

submissions/
- Prediction files ready for submission to the competition.

images/
- Visualizations generated during analysis, including the Scatterplot of total cases by city.
Clone the repository, install the required packages listed in requirements.txt, and run the Jupyter notebooks to replicate the analysis and predictions.
Contributions are welcome. Please open an issue or pull request if you would like to contribute to the project.
This project is open-source and available under the MIT license.