| Lecturer: Hossein Hajiabolhassan, Data Science Center, Shahid Beheshti University | | | |
|---|---|---|---|
| Teaching Assistants: | | | |
| Behnaz H.M. Hoseyni | Yavar T. Yeganeh | Mostafa Khodayari | Esmail Mafakheri |
- Course Overview
- Main TextBooks
- Slides and Papers
- Lecture 1: Introduction
- Lecture 2: Toolkit Lab 1: Google Colab and Anaconda
- Lecture 3: Toolkit Lab 2: Getting Started with PyTorch
- Lecture 4: Deep Feedforward Networks
- Lecture 5: Toolkit Lab 3: Preprocessing Datasets by PyTorch
- Lecture 6: Regularization for Deep Learning
- Lecture 7: Toolkit Lab 4: Using a Neural Network to Fit the Data with PyTorch
- Lecture 8: Optimization for Training Deep Models
- Lecture 9: Convolutional Networks
- Lecture 10: Toolkit Lab 5: Using Convolutions to Generalize
- Lecture 11: Sequence Modeling: Recurrent and Recursive Networks
- Lecture 12: Toolkit Lab 6: Transfer Learning and Other Tricks
- Lecture 13: Practical Methodology
- Lecture 14: Toolkit Lab 7: Optimization Software (Ray Tune or Optuna)
- Lecture 15: Applications
- Lecture 16: Autoencoders
- Lecture 17: Generative Adversarial Networks
- Lecture 18: Graph Neural Networks
- Lecture 19: Attention Mechanisms and Transformers
- Additional Resources
- Class Time and Location
- Projects
- Grading
- Prerequisites
- Topics
- Account
- Academic Honor Code
- Questions
- Miscellaneous:
In this course, you will learn the foundations of Deep Learning, understand how to build
neural networks, and learn how to lead successful machine learning projects. You will learn
about Convolutional networks, RNNs, LSTM, Adam, Dropout, BatchNorm, and more.
Main TextBooks:
- Deep Learning (available online) by Ian Goodfellow, Yoshua Bengio, and Aaron Courville
- Deep Learning with PyTorch by Eli Stevens and Luca Antiga
- GitHub: Codes
- Programming PyTorch for Deep Learning by Ian Pointer
- GitHub: Codes
Additional TextBooks:
- Generative Deep Learning by David Foster
- GitHub: Codes
- Dive into Deep Learning by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
- GitHub: Codes
Recommended Slides & Papers:
Required Reading:
- Chapter 1 of the Deep Learning textbook.
- Slide: Introduction by Ian Goodfellow
Suggested Reading:
- Demo: 3D Fully-Connected Network Visualization by Adam W. Harley
Additional Resources:
- Video of lecture by Ian Goodfellow and discussion of Chapter 1 at a reading group in San Francisco organized by Alena Kruchkova
- Paper: On the Origin of Deep Learning by Haohan Wang and Bhiksha Raj
Applied Mathematics and Machine Learning Basics:
- Slide: Mathematics for Machine Learning by Avishkar Bhoopchand, Cynthia Mulenga, Daniela Massiceti, Kathleen Siminyu, and Kendi Muchungi
- Blog: A Gentle Introduction to Maximum Likelihood Estimation and Maximum A Posteriori Estimation (Getting Intuition of MLE and MAP with a Football Example) by Shota Horii
Required Reading:
- Blog: Google Colab Free GPU Tutorial by Fuat
- Blog: Managing Environments
- Blog: Kernels for Different Environments
Suggested Reading:
- Blog: Using Pip in a Conda Environment by Jonathan Helmus
- Blog: How to Import Dataset to Google Colab Notebook?
- Blog: How to Upload Large Files to Google Colab and Remote Jupyter Notebooks (For Linux Operating System) by Bharath Raj
Additional Resources:
- PDF: Conda Cheat Sheet
- Blog: Conda Commands (Create Virtual Environments for Python with Conda) by LipingY
- Blog: Colab Tricks by Rohit Midha
Required Reading:
- NoteBook: Chapter 3: It Starts with a Tensor from Deep Learning with PyTorch by Eli Stevens and Luca Antiga
Suggested Reading:
- Blog: Why PyTorch is the Deep Learning Framework of the Future by Dhiraj Kumar
- Blog: Torch Tensors & Types: A torch.Tensor is a multi-dimensional matrix containing elements of a single data type. Torch defines nine CPU tensor types and nine GPU tensor types.
Additional Resources:
- Blog: Learning PyTorch with Examples by Justin Johnson. This tutorial introduces the fundamental concepts of PyTorch through self-contained examples.
Building Dynamic Models Using the Subclassing API:
-
Object-Oriented Programming:
- Blog: Object-Oriented Programming (OOP) in Python 3 by the Real Python Team
- Blog: How to Explain Object-Oriented Programming Concepts to a 6-Year-Old
- Blog: Understanding Object-Oriented Programming Through Machine Learning by David Ziganto
- Blog: Object-Oriented Programming for Data Scientists: Build your ML Estimator by Tirthajyoti Sarkar
- Blog: Python Callable Class Method by Lalu Erfandi Maula Yusnu
Required Reading:
- Chapter 6 of the Deep Learning textbook.
- Slides: Deep Feed-forward Networks by Sargur Srihari
- Chapter 20 of Understanding Machine Learning: From Theory to Algorithms
- Slide: Neural Networks by Shai Shalev-Shwartz
- Slide: Backpropagation and Neural Networks by Fei-Fei Li, Justin Johnson, and Serena Yeung
- Blog: 7 Types of Neural Network Activation Functions: How to Choose?
- Blog: Back-Propagation, an Introduction by Sanjeev Arora and Tengyu Ma
Interesting Questions:
Suggested Reading:
- Blog: Epoch vs Batch Size vs Iterations by Sagar Sharma
- Blog: The Gradient by Khan Academy
- Blog: Calculus on Computational Graphs: Backpropagation by Christopher Olah
- PDF: SVM (Section 5: Lagrange Duality) by Andrew Ng
- Blog: Killer Combo: Softmax and Cross Entropy by Paolo Perrotta
Additional Resources:
- Blog: Activation Functions by Sefik Ilkin Serengil
- Paper: Mish: A Self Regularized Non-Monotonic Neural Activation Function by Diganta Misra
- Blog: Activation Functions
- Blog: Analytical vs Numerical Solutions in Machine Learning by Jason Brownlee
- Blog: Validating Analytic Gradient for a Neural Network by Shiva Verma
- Blog: Stochastic vs Batch Gradient Descent by Divakar Kapil
- Video: (.flv) of a presentation by Ian Goodfellow and a group discussion at a reading group at Google organized by Chintan Kaur.
- Extra Slide:
- Slide: Deep Feedforward Networks by Ian Goodfellow
- Slide: Feedforward Neural Networks (Lecture 2) by Ali Harakeh
- Slides: Deep Feedforward Networks 1 and 2 by U Kang
Required Reading:
- NoteBook: Chapter 4: Real-World Data Representation Using Tensors from Deep Learning with PyTorch by Eli Stevens and Luca Antiga
- Blog: How to Build a Streaming DataLoader with PyTorch by David MacLeod
- Blog: Building Efficient Custom Datasets in PyTorch by Syafiq Kamarul Azman
- Blog: A Beginner’s Tutorial on Building an AI Image Classifier using PyTorch by Alexander Wu
Suggested Reading:
- Blog: A Quick Guide To Python Generators and Yield Statements by Jason Rigden
- NoteBook: Iterable, Generator, and Iterator
- Blog: Vectorization in Python
- Blog: numpy.vectorize
Additional Resources:
- Blog: Iterables vs. Iterators vs. Generators by Vincent Driessen
- Blog: Writing Custom Datasets, Dataloaders and Transforms by Sasank Chilamkurthy
- Blog: TORCHVISION.DATASETS
Required Reading:
- Chapter 7 of the Deep Learning textbook.
- Slides: Regularization by Sargur Srihari
- Slide: Bagging and Random Forests by David Rosenberg
- Slide: Deep Learning Tutorial (Read the Part of Dropout) by Hung-yi Lee
Suggested Reading:
- Blog: Train Neural Networks With Noise to Reduce Overfitting by Jason Brownlee
- Blog: Generative VS Discriminative Models by Prathap Manohar Joshi
- Paper: Ensemble Methods in Machine Learning by Thomas G. Dietterich
- Blog: Conventional guide to Supervised learning with scikit-learn — Orthogonal Matching Pursuit (OMP)- Generalized Linear Models (10) by Venali Sonone
- Paper: Dropout: A Simple Way to Prevent Neural Networks from Overfitting by Nitish Srivastava, Geoffrey Hinton, Alex Krizhevsky, Ilya Sutskever, and Ruslan Salakhutdinov
- Blog: A Gentle Introduction to Weight Constraints in Deep Learning by Jason Brownlee
Additional Reading:
- Blog: Analysis of Dropout by Paolo Galeone
- Extra Slides:
- Slide: Regularization For Deep Models (Lecture 3) by Ali Harakeh
- Slide: Regularization for Deep Learning by Ian Goodfellow
- Slides: Regularization for Deep Learning 1 and 2 by U Kang
- Slide: Training Deep Neural Networks by Aykut Erdem
Required Reading:
- NoteBook: Chapter 5: The Mechanics of Learning from Deep Learning with PyTorch by Eli Stevens and Luca Antiga
- NoteBook: Chapter 6: Using a Neural Network to Fit the Data from Deep Learning with PyTorch by Eli Stevens and Luca Antiga
- NoteBook: Chapter 2: Image Classification with PyTorch from Programming PyTorch for Deep Learning by Ian Pointer
Suggested Reading:
- Blog: Properly Setting the Random Seed in ML Experiments. Not as Simple as You Might Imagine by Open Data Science
- Blog & NoteBook: How to Choose Loss Functions When Training Deep Learning Neural Networks by Jason Brownlee
- Blog: Why is my Validation Loss Lower than my Training Loss? by Adrian Rosebrock
- Blog: Saving/Loading Your Model in PyTorch by David Ashraf
- Blog: Saving and Loading Your Model to Resume Training in PyTorch by Rachit Jain
- Blog: Deep Learning with PyTorch: A 60 Minute Blitz — PyTorch by Soumith Chintala
Additional Resources:
- PDF: Self-Normalizing Neural Networks by Günter Klambauer, Thomas Unterthiner, Andreas Mayr, and Sepp Hochreiter
- Deep Learning via Pytorch by Ayoosh Kathuria
Required Reading:
- Chapter 8 of the Deep Learning textbook.
- Slides: Optimization for Training Deep Models by Sargur Srihari
- Blog: Batch Normalization in Deep Networks by Sunita Nayak
Suggested Reading:
- Lecture Note: Matrix Norms and Condition Numbers by Ralucca Gera
- Blog: Initializing Neural Networks by Katanforoosh & Kunin, deeplearning.ai, 2019
- Blog: How to Initialize Deep Neural Networks? Xavier and Kaiming Initialization by Pierre Ouannes
- Blog: What Is Covariate Shift? by Saeed Izadi
- Book: Dataset Shift in Machine Learning by Joaquin Quinonero-Candela, Masashi Sugiyama, Anton Schwaighofer, and Neil D. Lawrence
- Blog: Stay Hungry, Stay Foolish: This interesting blog contains the computation of back propagation of different layers of deep learning prepared by Aditya Agrawal
- Paper: FitNets: Hints for Thin Deep Nets by Adriana Romero, Nicolas Ballas, Samira Ebrahimi Kahou, Antoine Chassang, Carlo Gatta, and Yoshua Bengio
- Paper: Hyperband: A Novel Bandit-Based Approach to Hyperparameter Optimization by Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar
Additional Reading:
- Blog: Why Momentum Really Works by Gabriel Goh
- Blog: Understanding the Backward Pass Through Batch Normalization Layer by Frederik Kratzert
- Video of lecture / discussion: This video covers a presentation by Ian Goodfellow and group discussion on the end of Chapter 8 and entirety of Chapter 9 at a reading group in San Francisco organized by Taro-Shigenori Chiba.
- Blog: Preconditioning the Network by Nic Schraudolph and Fred Cummins
- Paper: Delving Deep into Rectifiers: Surpassing Human-Level Performance on ImageNet Classification by Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun
- Blog: Neural Network Optimization by Matthew Stewart
- Paper: Understanding the Disharmony between Dropout and Batch Normalization by Variance Shift by Xiang Li, Shuo Chen, Xiaolin Hu, and Jian Yang
- Blog: Restricted Boltzmann Machine, a Complete Analysis. Part 3: Contrastive Divergence Algorithm by Nguyễn Văn Lĩnh
- Blog: A (Slightly) Better Budget Allocation for Hyperband by Alexandre Abraham
- Blog: Massively Parallel Hyperparameter Optimization by Liam Li
- Extra Slides:
- Slide: Optimization for Training Deep Models (Lecture 4) by Ali Harakeh
- Slide: Optimization for Training Deep Models - Algorithms (Lecture 4) by Ali Harakeh
- Slide: Conjugate Gradient Descent by Aarti Singh
- Slide: Training Deep Neural Networks by Aykut Erdem
- Slides: Optimization for Training Deep Models 1 and 2 by U Kang
Required Reading:
- Chapter 9 of the Deep Learning textbook.
- Slides: Convolutional Networks by Sargur Srihari
Suggested Reading:
- Blog: Color Vision
- Blog: Convolutional Neural Networks CheatSheet by Afshine Amidi and Shervine Amidi
- Blog: Understanding Convolutions by Christopher Olah
- Blog: A Comprehensive Guide to Convolutional Neural Networks — the ELI5 Way by Sumit Saha
- Blog: A Gentle Introduction to 1×1 Convolutions to Manage Model Complexity by Jason Brownlee
- Blog: Global Average Pooling Layers for Object Localization by Alexis Cook
- Blog: Review: Spatial Pyramid Pooling by Sanchit Tanwar
- Blog: Maxpooling vs Minpooling vs Average Pooling by Madhushree Basavarajaiah
- Blog: A Basic Introduction to Separable Convolutions by Chi-Feng Wang
- Blog: Depth wise Separable Convolutional Neural Networks by Mayank Chaurasia
- Blog: Type of convolutions: Deformable and Transformable Convolution by Ali Raza
- Blog: Review: DilatedNet — Dilated Convolution (Semantic Segmentation) by Sik-Ho Tsang
- Blog: Region of Interest Pooling Explained by Tomasz Grel
- Blog: Understanding the Receptive Field of Deep Convolutional Networks by Nikolas Adaloglou
Additional Reading:
- Blog: Image Convolution Examples by Utkarsh Sinha
- Blog: Convolutions and Backpropagations by Pavithra Solai
- Extra Slides:
- Slide: Convolutional Neural Networks (Lecture 6) by Ali Harakeh
- Slide: Convolutional Networks by Ian Goodfellow
Fourier Transformation:
- Blog: Fourier Transformation and Its Mathematics by Akash Dubey
- Blog: Fourier Transformation for a Data Scientist by Nagesh Singh Chauhan
- Blog: Purrier Series (Meow) and Making Images Speak by Bilim Ne Güzel Lan
- Blog: Follow up to Fourier Series by Bilim Ne Güzel Lan
Required Reading:
- NoteBook: Chapter 8: Using Convolutions to Generalize from Deep Learning with PyTorch by Eli Stevens and Luca Antiga
- NoteBook: Chapter 3: Convolutional Neural Networks by Ian Pointer
Suggested Reading:
- Blog: Pytorch (Basics) — Intro to CNN by Akshaj Verma
Additional Resources:
Required Reading:
- Chapter 10 of the Deep Learning textbook.
- Slides: Sequence Modeling: Recurrent and Recursive Nets by Sargur Srihari
Suggested Reading:
- Blog: Understanding LSTM Networks by Christopher Olah
- Blog: Illustrated Guide to LSTM’s and GRU’s: A Step by Step Explanation by Michael Nguyen
Additional Reading:
- Video of lecture / discussion. This video covers a presentation by Ian Goodfellow and a group discussion of Chapter 10 at a reading group in San Francisco organized by Alena Kruchkova.
- Blog: Gentle introduction to Echo State Networks by Madalina Ciortan
- Blog: Understanding GRU Networks by Simeon Kostadinov
- Blog: Animated RNN, LSTM and GRU by Raimi Karim
- Extra Slides:
- Slide: Sequence Modeling: Recurrent and Recursive Networks by U Kang
- Slide: Training Recurrent Nets by Arvind Ramanathan
- Slide: An Introduction to: Reservoir Computing and Echo State Networks by Claudio Gallicchio
- Slide: Graphical Models vs. Deep Networks by Christian Wolf
Required Reading:
- NoteBook: Chapter 4: Transfer Learning and Other Tricks from Programming PyTorch for Deep Learning by Ian Pointer
Suggested Reading:
- Blog: Ideas on How to Fine-Tune a Pre-Trained Model in PyTorch by Florin-Daniel Cioloboc
- Blog: Visualizing Models, Data, and Training with TensorBoard
Additional Resources:
- Deep Learning via Pytorch by Ayoosh Kathuria
Required Reading:
- Chapter 11 of the Deep Learning textbook.
- Slides: Practical Methodology by Sargur Srihari
Suggested Reading:
- Metrics:
- Blog: Demystifying KL Divergence by Naoki Shibuya
- Blog: Demystifying Cross-Entropy by Naoki Shibuya
- Blog: Deep Quantile Regression by Sachin Abeywardana
- Blog: An Illustrated Guide to the Poisson Regression Model by Sachin Date
- Blog: Generalized Linear Models by Semih Akbayrak
- Blog: ROC curves and Area Under the Curve Explained (Video) by Data School
- Blog: Introduction to the ROC (Receiver Operating Characteristics) Plot
- Slide: ROC Curves by Maryam Shoaran
- Blog: Precision-Recall Curves by Andreas Beger
Additional Reading:
- Paper: The Relationship Between Precision-Recall and ROC Curves by Jesse Davis and Mark Goadrich
- Blog: Making sense of the Kullback–Leibler (KL) Divergence by Marko Cotra
- Extra Slides:
- Slide: Practical Methodology by Ian Goodfellow
- Slide: Practical Methodology by U Kang
Required Reading:
Ray provides a simple, universal API for building distributed applications:
- Blog: Key Concepts
- Code: A Basic Tune Tutorial
Suggested Reading:
- Blog: A Novice's Guide to Hyperparameter Optimization at Scale by Brandon M. Wood
- Algorithms:
- Blog: Tutorial #8: Bayesian Optimization by M.O. Ahmed and S. Prince
- Blog: Hyper-parameter Optimization Algorithms: a Short Review by Aloïs Bissuel
- Blog: HyperBand and BOHB: Understanding State of the Art Hyperparameter Optimization Algorithms by MJ Bahmani
- Blog: A (Slightly) Better Budget Allocation for Hyperband by Alexandre Abraham
- Paper: Hyperband: Bandit-Based Configuration Evaluation for Hyperparameter Optimization by Lisha Li, Kevin Jamieson, Giulia DeSalvo, Afshin Rostamizadeh, and Ameet Talwalkar
- Blog: Population Based Bandits: Provably Efficient Online Hyperparameter Optimization by Jack Parker-Holder and Amog Kamsetty
- Blog: Population Based Training of Neural Networks by Max Jaderberg, Valentin Dalibard, Simon Osindero, Wojciech M. Czarnecki, Jeff Donahue, Ali Razavi, Oriol Vinyals, Tim Green, Iain Dunning, Karen Simonyan, Chrisantha Fernando, and Koray Kavukcuoglu
- Extra Slides:
- Multifidelity Modelling by Neil Lawrence
Additional Resources:
- Blog: Spark, Dask, and Ray: Choosing the Right Framework by Nikolay Manchev
Optuna is an automatic hyperparameter optimization software framework, particularly designed for machine learning:
- Blog: Using Optuna to Optimize PyTorch Hyperparameters by Crissman Loomis
- Colab: Optuna
- Blog: Tutorial
- Blog: Efficient Hyperparameter Optimization with Optuna Framework by Brotoun Lab
- Blog: Announcing Optuna 2.0 by Hiroyuki Vincent Yamazaki
Required Reading:
- Chapter 12 of the Deep Learning textbook.
- Slide: Applications by U Kang
Suggested Reading:
- Blog: How Neural Networks Learn Distributed Representations By Garrett Hoffman
Additional Reading:
- Blog: 30 Amazing Applications of Deep Learning by Yaron Hadad
- Extra Slides:
- Slides: Applications by Sargur Srihari
Required Reading:
- Chapter 14 of the Deep Learning textbook.
- Slide: Autoencoders by Sargur Srihari
- Blog: Variational Autoencoders by Jeremy Jordan
Suggested Reading:
- Blog: Introduction to Autoencoders by Jeremy Jordan
- Blog: Understanding Variational Autoencoders (VAEs) by Joseph Rocca
- Blog: Tutorial - What is a Variational Autoencoder? by Jaan Altosaar
- Blog: Autoencoders vs PCA: When to Use? by Urwa Muaz
- Blog: Generative Modeling: What is a Variational Autoencoder (VAE)? by Peter Foy
- Blog: Variational Autoencoder: Intuition and Implementation by Agustinus Kristiadi
- Blog: Conditional Variational Autoencoder: Intuition and Implementation by Agustinus Kristiadi
Additional Reading:
- Blog: Intuitively Understanding Variational Autoencoder: And Why They’re so Useful in Creating Your Own Generative Text, Art and Even Music by Irhum Shafkat
- Blog: A High-Level Guide to Autoencoders by Shreya Chaudhary
- Extra Slides:
- Slide: Variational Autoencoders by Raymond Yeh, Junting Lou, and Teck-Yian Lim
- Slide: Autoencoders by U Kang
- Slide: Generative Models by Mina Rezaei
Required Reading:
- Slide: Generative Adversarial Networks (GANs) by Binglin, Shashank, and Bhargav
- Paper: NIPS 2016 Tutorial: Generative Adversarial Networks by Ian Goodfellow
Suggested Reading:
- Blog: Generative Adversarial Networks (GANs), Some Open Questions by Sanjeev Arora
- Paper: Generative Adversarial Networks: An Overview by Antonia Creswell, Tom White, Vincent Dumoulin, Kai Arulkumaran, Biswa Sengupta, and Anil A Bharath
Additional Reading:
- Blog: GANs Comparison Without Cherry-Picking by Junbum Cha
- Blog: New Progress on GAN Theory and Practice by Liping Liu
- Blog: Play with Generative Adversarial Networks (GANs) in your browser!
- Blog: The GAN Zoo by Avinash Hindupur
Required Reading:
- Slide: An introduction to graph neural networks and its applications in molecular property prediction by Hossein Hajiabolhassan
- Slide: Graph Neural Networks by Xiachong Feng
- Paper: A Comprehensive Survey on Graph Neural Networks by Zonghan Wu, Shirui Pan, Fengwen Chen, Guodong Long, Chengqi Zhang, and Philip S. Yu
Suggested Reading:
- Course: Basics of Graph Neural Networks by Zak Jost
- Blog: A Gentle Introduction to Graph Neural Networks
- Book: Graph Representation Learning by William L. Hamilton
- Tutorial: Representation Learning on Networks by Jure Leskovec
- Paper: A Practical Guide to Graph Neural Networks by Isaac Ronald Ward, Jack Joyner, Casey Lickfold, Stash Rowe, Yulan Guo, and Mohammed Bennamoun
- Blog: Deep Graph Library (DGL): A Python package that interfaces between existing tensor libraries and data being expressed as graphs.
Additional Reading:
- GitHub: Graph Neural Networks
- Extra Slides:
- Slide: Graph Neural Networks: Models and Applications by Yao Ma, Wei Jin, Jiliang Tang, Lingfei Wu and Tengfei Ma. This is a perfect slide about graph neural networks.
Required Reading:
- Slide: Attention Mechanisms by Marcos V. Treviso
- Book: Dive into Deep Learning (Chapter 10) by Aston Zhang, Zachary C. Lipton, Mu Li, and Alexander J. Smola
Suggested Reading:
- Blog: Transformer Architecture: The Positional Encoding by Amirhossein Kazemnejad
- Blog: Transformers From Scratch by Peter Bloem
- Blog: Transformers are Graph Neural Networks by Chaitanya K. Joshi
- Blog: How Transformers work in deep learning and NLP: an Intuitive Introduction by Nikolas Adaloglou
- Blog: The Illustrated Transformer by Jay Alammar
Additional Reading:
- Blog: Illustrated: Self-Attention by Raimi Karim
- Papers:
- Papers with Code: The mission of Papers With Code is to create a free and open resource with Machine Learning papers, code and evaluation tables.
- Deep Learning Papers Reading Roadmap by Flood Sung
- Awesome - Most Cited Deep Learning Papers by Terry Taewoong Um
- Deep Learning Courses:
- Deep Learning Do It Yourself! by Marc Lelarge, Jill-Jênn Vie, and Andrei Bursuc
- The Deep Learning Lecture Series 2020 by DeepMind Research Scientists and Research Engineers
- Deep Learning by François Fleuret
- Deep Learning by Aykut Erdem
- Mini Course in Deep Learning with PyTorch for AIMS by Alfredo Canziani
- Introduction to Pytorch Code Examples by Andrew Ng and Kian Katanforoosh
- The blog of Christopher Olah: Fascinating tutorials about neural networks
- The blog of Adit Deshpande: The Last 5 Years In Deep Learning
- Fascinating Tutorials on Deep Learning
- Deep Learning (Faster Data Science Education by Kaggle) by Dan Becker
Saturday and Monday 10:30-12:00 (Fall 2020)
Tuesday 16:00-18:00 (Fall 2020). Refer to the following link to check the assignments.
Projects are programming assignments that cover the topics of this course. Each project is written as a Jupyter Notebook. Projects require Python 3.7 as well as additional Python libraries.
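Before starting a project, it can help to confirm that the notebook kernel runs a suitable interpreter. The following is a minimal sketch using only the standard library; the helper name `meets_requirement` and the constant `REQUIRED` are illustrative and not part of the course material, and the check assumes that Python 3.7 is a minimum rather than an exact requirement.

```python
import sys

# Assumed minimum version, taken from the Python 3.7 requirement stated above.
REQUIRED = (3, 7)


def meets_requirement(version_info=sys.version_info, required=REQUIRED):
    """Return True if the (major, minor) version is at least `required`."""
    return tuple(version_info[:2]) >= required


if __name__ == "__main__":
    status = "OK" if meets_requirement() else "too old"
    print(f"Python {sys.version_info.major}.{sys.version_info.minor}: {status}")
```

Running this in the first cell of a notebook shows which interpreter the kernel uses, which is useful when several Conda environments are installed.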
Google Colab is a free cloud service that offers free GPU access.
- How to Use Google Colab by Souvik Mandal
- Primer for Learning Google Colab
- Deep Learning Development with Google Colab, TensorFlow, Keras & PyTorch
- Technical Notes On Using Data Science & Artificial Intelligence: To Fight For Something That Matters by Chris Albon
Students can include mathematical notation in the markdown cells of their Jupyter Notebooks using LaTeX.
- A Brief Introduction to LaTeX PDF
- Math in LaTeX PDF
- Sample Document PDF
- TikZ: A collection of LaTeX files of PGF/TikZ figures (including various neural networks) by Petar Veličković.
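As a small illustration of LaTeX in a markdown cell, the snippet below typesets an inline expression with single dollar signs and a displayed formula with double dollar signs. The formulas (softmax and cross-entropy) and the symbols $z$, $y$, $\hat{y}$, $K$, and $\eta$ are chosen only as an example and are not taken from the course assignments.

```latex
The learning rate is $\eta = 10^{-3}$.

$$
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}},
\qquad
L(y, \hat{y}) = -\sum_{i=1}^{K} y_i \log \hat{y}_i
$$
```

Pasting these lines into a markdown cell and running the cell should render the math in place.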
- Projects and Midterm – 50%
- Endterm – 50%
- First Midterm Examination: Saturday 1399/09/01, 10:30-12:00
- Second Midterm Examination: Saturday 1399/10/06, 10:30-12:00
- Final Examination: Wednesday 1399/10/24, 14:00-16:00
General mathematical sophistication; and a solid understanding of Algorithms, Linear Algebra, and Probability Theory, at the advanced undergraduate or beginning graduate level, or equivalent.
- Video: Professor Gilbert Strang's Video Lectures on linear algebra.
- Learn Probability and Statistics Through Interactive Visualizations: Seeing Theory was created by Daniel Kunin while an undergraduate at Brown University. The goal of this website is to make statistics more accessible through interactive visualizations (designed using Mike Bostock’s JavaScript library D3.js).
- Statistics and Probability: This website provides training and tools to help you solve statistics problems quickly, easily, and accurately - without having to ask anyone for help.
- Jupyter NoteBooks: Introduction to Statistics by Bargava
- Video: Professor John Tsitsiklis's Video Lectures on Applied Probability.
- Video: Professor Krishna Jagannathan's Video Lectures on Probability Theory.
Have a look at some reports of Kaggle or Stanford students (CS224N, CS224D) to get some general inspiration.
You need a GitHub account to share your projects. GitHub offers both free accounts and plans with private repositories. GitHub is like the hammer in your toolbox, so you need to have it!
Honesty and integrity are vital elements of academic work. All your submitted assignments must be entirely your own (or your own group's).
We will follow the standard approach of the Department of Mathematical Sciences:
- You can get help, but you MUST acknowledge the help on the work you hand in
- Failure to acknowledge your sources is a violation of the Honor Code
- You can talk to others about the algorithm(s) to be used to solve a homework problem, as long as you then mention their name(s) on the work you submit
- You should not use or look at the code of others when you write your own: you can talk to people, but you have to write your own solution/code
I hold office hours for this course on Saturdays (09:00-10:00 AM). If this is not convenient, email me at [email protected] or talk to me after class.