All of our fellows come in with at least an intermediate programming ability (usually in Python), some statistics knowledge, and some social science knowledge. We bring together people of all backgrounds, so we know that some will be stronger in some areas, and some might never have seen some of this material. To get everyone on the same page at the beginning, we send out a welcome email about a month before the fellowship begins with recommended background reading and what they'll need to install. We also spend a significant part of the first two weeks of the summer getting everyone up to speed with the tools they'll be using in their projects.
In order to be ready for the summer, you need to install some packages on your computer:
- SSH (PuTTY for Windows)
- Git (for version control)
- psql (PostgreSQL command line interface)
- Python tools
  - Python 3.6
  - Anaconda/Miniconda or pip + virtualenv
- Python Packages
  - pandas
  - matplotlib
  - scikit-learn
  - psycopg2
  - ipython
  - jupyter
- DBeaver (GUI to access various databases)
- Tableau (students can request a free education license)
- Sublime Text (text editor for coding)
- R
- RStudio
- OS X users - Follow these instructions
- Linux users - You probably know how to do it, but still check this for information on Python tools
- Windows users - We don't have a guide yet (any volunteers?)
- We will also hold a software setup session during the first week with technical mentors there to help anyone still having difficulty.
You should give all installed software a quick spin to check that it installed correctly. For your Python packages, try to import them. Type `python` in your shell, and then once you are in your Python session, try for example `import pandas`, `import matplotlib`, and so on. (You can quit with `exit()`.) Also try `ipython` and `jupyter notebook` in your terminal, and see if you get any errors.
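If you would rather check all the imports in one go, the manual steps above can be sketched as a small script. This is only a convenience sketch, not part of the official setup: the function name `check_imports` and the `PACKAGES` list are made up for illustration. Note that scikit-learn is imported as `sklearn` and ipython as `IPython`; jupyter is left out because it is checked from the terminal instead.

```python
import importlib
import sys

# Import names for the packages listed above (an assumption of this sketch:
# scikit-learn is imported as "sklearn", ipython as "IPython").
PACKAGES = ["pandas", "matplotlib", "sklearn", "psycopg2", "IPython"]


def check_imports(names):
    """Try to import each module name and return {name: True or False}."""
    results = {}
    for name in names:
        try:
            importlib.import_module(name)
            results[name] = True
        except ImportError:
            # Covers ModuleNotFoundError as well as broken installs
            # that fail while importing their own dependencies.
            results[name] = False
    return results


if __name__ == "__main__":
    print("Python version:", sys.version.split()[0])
    for name, ok in check_imports(PACKAGES).items():
        print(f"{name:12s} {'OK' if ok else 'MISSING'}")
```

Save it under any name you like and run it with `python`; every package reported as `MISSING` still needs to be installed.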
Similarly, try `psql` in your terminal; since you have no database server running locally, it should reply with something like `psql: could not connect to server: No such file or directory`. `ssh` should print a 'helpful' message, and `R` should drop you into an R session that you can quit with `q()`.
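A quick way to see which of these command-line tools are actually on your `PATH` is sketched below. It only checks that each command can be found, not that it runs correctly. The `TOOLS` list and the function name `find_tools` are assumptions made for this example; adjust the list to what you installed.

```python
import shutil

# Commands mentioned in this guide (hypothetical list for this sketch).
TOOLS = ["ssh", "git", "psql", "python", "ipython", "jupyter", "R"]


def find_tools(names):
    """Return {name: full path of the executable, or None if not found}."""
    return {name: shutil.which(name) for name in names}


if __name__ == "__main__":
    for name, path in find_tools(TOOLS).items():
        print(f"{name:10s} {path or 'NOT FOUND on PATH'}")
```

A tool reported as not found is either not installed or installed in a directory that your shell does not search.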
You need to generate an SSH key pair. To do this, follow the instructions on GitHub, namely 'Generating a new SSH key' and 'Adding your SSH key to ssh-agent'. Windows users probably want to use Git Bash or PuTTYgen (if you're on Linux or OS X, your standard terminal should be the bash shell you need).
The steps in 'Generating a new SSH key' create two new files (by default in `~/.ssh/`): one without a file extension (by default, it's called `id_rsa`), and one with the extension `.pub`. The latter is your _pub_lic key, which you will share with your project server so that it can recognize you; the former is your private key, which you must not share with anybody, as it will let you access your project server.
After having generated the key pair, you should set the correct file permissions for your private key: SSH requires that only you, the owner, are able to read/write it, and will give you an error otherwise. You can set the right permissions with this command: `chmod 600 ~/.ssh/nameofyourprivatekey` (where you'll have to substitute in the path and name of your private key that you chose during key generation).
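To verify that the permissions came out right, you can inspect the file mode yourself. The sketch below is one possible way to do that on Linux or OS X; the helper names `key_is_private` and `restrict_to_owner` are invented for this example, and `~/.ssh/id_rsa` is only the default key name, so substitute your own path.

```python
import os
import stat


def key_is_private(path):
    """Return True if group and others have no permission bits on the file."""
    mode = stat.S_IMODE(os.stat(path).st_mode)
    return (mode & 0o077) == 0


def restrict_to_owner(path):
    """Same effect as `chmod 600 path`: owner may read and write, nobody else."""
    os.chmod(path, stat.S_IRUSR | stat.S_IWUSR)


if __name__ == "__main__":
    # Assumed default location; replace with the name you chose.
    key = os.path.expanduser("~/.ssh/id_rsa")
    if not os.path.exists(key):
        print("No key found at", key)
    elif key_is_private(key):
        print("Permissions look fine for", key)
    else:
        print("Too permissive, run: chmod 600", key)
```

The script only reports the state of the key; `restrict_to_owner` is there in case you prefer to fix the permissions from Python rather than with `chmod`.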
We just started this repo, but we want the issues section to be a knowledge base for common problems.
If you have any trouble installing anything, check the closed issues first. If you don't find the answer there, feel free to open an issue and someone will help you.
To open issues, you need to create a GitHub account (you'll need it for the summer anyway).