50.043 Database Systems and Big Data Course Handout

This page will be updated regularly. Sync up often.

Course Description

Database systems manage data, which is at the heart of modern computing applications. This course covers the fundamentals of traditional databases, such as Oracle and MySQL, and the core ideas behind recent big data systems.

Students will learn the important data management problems that these systems are designed to solve. They will work with the internal design and implementation of relational databases. They will also study the internals of state-of-the-art big data platforms, namely Apache Spark, and use them on the Amazon cloud (Amazon Web Services). By the end of the course, students will be able to determine the advantages and limitations of different database systems.

Resources

The main resources are lecture slides, tutorial sessions, and online documentation. There are no official textbooks, but the following are useful for reference and for a deeper understanding of some topics.

  1. Abraham Silberschatz, Henry Korth, S. Sudarshan. Database System Concepts. 6th edition. (DSC)
  2. Raghu Ramakrishnan, Johannes Gehrke. Database Management Systems. 3rd edition. (DBM)
  3. Hector Garcia-Molina, Jeffrey D. Ullman, Jennifer Widom. Database Systems: The Complete Book. 2nd edition. (DS)

Instructors

TAs

Communication

If you have course, assignment, or project related questions, please post them on the dedicated MS Teams channel.

Grading

Your final grade is computed as follows:

  1. Homework: 12% There will be 2 homework assignments, 6 points each.

  2. Project: 60% Group project, up to 3 students per group. Unless the instructors are notified otherwise, all group members receive the same grade for the project.

  3. Class participation: 3% Ask/answer questions during classes, spot mistakes, etc.

  4. Final: 25%
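As a quick sanity check of the breakdown above, the weighted sum can be sketched in Python. The weights come from the list; the function name `final_grade`, the idea of scoring each component as a fraction between 0 and 1, and the sample student are assumptions made purely for illustration, not an official grading tool.

```python
# Weights taken from the grading breakdown (percent of the final grade).
WEIGHTS = {
    "homework": 12,       # 2 assignments, 6 points each
    "project": 60,
    "participation": 3,
    "final": 25,
}


def final_grade(scores):
    """Combine per-component scores (fractions in [0, 1]) into a percentage.

    `scores` is a hypothetical input format: one fraction per component.
    """
    missing = set(WEIGHTS) - set(scores)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(WEIGHTS[name] * scores[name] for name in WEIGHTS)


# Made-up student, for illustration only.
example = {"homework": 0.9, "project": 0.8, "participation": 1.0, "final": 0.7}
print(round(final_grade(example), 1))  # prints 79.3
```

The weights add up to 100, so a perfect score in every component gives 100%.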

Things you need to prepare

  • If you are using Windows 10 or Windows 11, please install the Windows Subsystem for Linux (WSL) with Ubuntu.
  • If you are using Linux, you are all set.
  • If you are using Mac, please install Homebrew.
  • Make sure Java >8 and Ant are installed.
    • Ubuntu: sudo apt install ant ant-contrib
    • Mac: brew install ant ant-contrib
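One way to confirm the setup above is a small Python check that looks for `java` and `ant` on the `PATH` and reads the Java major version. This is a minimal sketch, not part of the course tooling: the helper names `parse_java_major` and `check_tools` are hypothetical, and the parser assumes the usual `version "..."` banner printed by `java -version`.

```python
import re
import shutil
import subprocess


def parse_java_major(version_output):
    """Extract the major Java version from `java -version` output, or None."""
    match = re.search(r'version "(\d+)(?:\.(\d+))?', version_output)
    if not match:
        return None
    major = int(match.group(1))
    # Releases up to Java 8 report themselves as 1.x (for example "1.8.0_281").
    if major == 1 and match.group(2) is not None:
        return int(match.group(2))
    return major


def check_tools():
    """Report whether java and ant are on PATH, plus the Java major version."""
    report = {
        "java": shutil.which("java") is not None,
        "ant": shutil.which("ant") is not None,
        "java_major": None,
    }
    if report["java"]:
        # The version banner usually goes to stderr, so read both streams.
        result = subprocess.run(
            ["java", "-version"], capture_output=True, text=True
        )
        report["java_major"] = parse_java_major(result.stderr + result.stdout)
    return report


if __name__ == "__main__":
    print(check_tools())
```

If `java` or `ant` shows up as `False`, rerun the install command for your platform and open a new terminal before checking again.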

Project

Please refer to the project page.

Submission Policy and Plagiarism

  1. You will do the assignments on your own and the project within your own team, and will not copy and paste solutions from anyone else.
  2. You will not post any solutions related to this course to a private or public repository that is accessible by others.
  3. Students are allowed to have a private repository for their assignment, which no one else can access.
  4. For projects, students can only invite their partners as collaborators to a private repository.
  5. Failing to follow the Code of Honour will result in failing the course and/or being submitted to the University Disciplinary Committee. The consequences apply to both the person who shares their work and the person who copies the work.

Schedule (22 Jan 2023 - 30 Apr 2023)

| Week | Lecture | Cohort | Reference | Remarks |
| --- | --- | --- | --- | --- |
| 1 (22/1) | Intro, ER Model | MySQL, AWS | DBM: Chapter 1-2, DS: Chapter 7 | Tuesday is a CNY holiday, Cohort Class Cancelled, Lecture will be conducted during Cohort Class slots. Please self-study the MySQL, AWS e-learning materials (from eDimension) |
| 2 (29/1) | Relational Model, Relational Algebra | Data Model, Relational Algebra | DBM: Chapter 3-4, DSC: Chapter 2 & 6 | Project Team Submission (5/2 23:59) |
| 3 (5/2) | SQL, NoSQL | SQL | DBM: Chapter 4-5, DSC: Chapter 2-4 | |
| 4 (12/2) | Functional Dependency, Normal Forms | Functional Dependency, Normal Forms | DBM: Chapter 19, DSC: Chapter 8 | Assignment 1 Submission (19/2 23:59) |
| 5 (19/2) | Storage, Index | Storage, Index | DBM: Chapter 19, DSC: Chapter 8 | |
| 6 (26/2) | Query Operations | Query Operations | DBM: Chapter 12-14, DSC: Chapter 12 | Project Lab 1 Submission (5/3 23:59) |
| 7 (5/3) | Recess Week | Recess Week | | Self-study flintrock and spark cluster setup (eDimension video tutorial) |
| 8 (12/3) | Query Optimization | Query Optimization | DBM: Chapter 15, DSC: Chapter 13 | Project Lab 2 Submission (19/3 23:59) |
| 9 (19/3) | Transaction Recovery and Concurrency | Transactions | DBM: Chapter 16-18, DSC: Chapter 14-16 | |
| 10 (26/3) | HDFS, MapReduce | HDFS, MapReduce | | |
| 11 (2/4) | Spark part 1 and 2 | Spark | | Project Lab 3 Submission (9/4 23:59) |
| 12 (9/4) | Guest Lecture | Consultation | | Assignment 2 Submission (16/4 23:59) |
| 13 (16/4) | Yarn, Revision | Consultation | | Project Lab 4 Submission (23/4 23:59) |
| 14 (23/4) | | | | |

Make-Up and Alternative Assessment

A make-up for the final exam will be administered only when there is an official Leave of Absence from OSA. There will be only one make-up; students who miss the make-up test will not be given another.