This is an advanced theory course that examines contemporary computational methods that enable machine learning and data science at scale. We will cover topics including randomized algorithms, concentration inequalities, dimensionality reduction, convex optimization, spectral methods, and compressed sensing. The course emphasizes rigorous mathematical analysis and algorithmic design.
Office: Room 1104, 370 Jay St., Brooklyn, NY
Instructor Office Hours: Mondays 10:00am–11:00am over Zoom, by appointment (please email at least a day in advance to schedule).
Lectures: Fridays 2:00pm–4:30pm, 6 MetroTech Center, Room 674.
Problem-Solving Session: Tuesdays 12:30pm, Room 826, 370 Jay Street. Led by the course assistant; weekly problem-solving and Q&A.
TA Office Hours: Wednesdays 12:00pm–1:00pm and Thursdays 10:00am–11:00am, 8th-floor common area, 370 Jay Street.
Reading Group: Details TBA. For students working on final projects. Open to all interested students for discussion and workshopping of research papers.
This course is mathematically rigorous and intended for graduate students and advanced undergraduates. Students should have prior coursework in machine learning, a strong background in algorithms, a solid understanding of linear algebra, experience with probability theory and random variables, and comfort with writing and understanding rigorous mathematical proofs.
Problem Sets: 40%, Midterm Exam: 25%, Final Project OR Final Exam: 25%, Participation: 10%.
There will be four problem sets throughout the semester. All assignments must be submitted via Gradescope by 11:59pm ET on the due dates listed below.
Formatting: Solutions must be written in LaTeX or Markdown.
Grading Policy: Each problem set will be graded as follows.
- For each problem you solve completely, clearly indicate that it is a complete solution.
- After submission, you will be asked to explain one randomly selected problem from among your completely solved problems on a whiteboard. You may reference your written homework during the explanation.
- The grade you receive on the explanation will be applied to all of your completely solved problems.
- For problems you do not solve completely, you may write "I don't know" to receive 25% credit on that problem. Problems left incomplete without writing "I don't know" receive 0% credit; there is no partial credit for incomplete solutions.
Example: If a problem set has 5 problems and you completely solve 3 of them, one of those 3 will be randomly selected for you to explain. If you receive 90% on your explanation, you receive 90% on all 3 completely solved problems. For the remaining 2 problems, you can write "I don't know" on each to receive 25% credit, or 0% if you leave them incomplete. Your final score would be: (90% + 90% + 90% + 25% + 25%) / 5 = 64%.
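For concreteness, the scoring rule reduces to a short calculation. Below is a minimal Python sketch that reproduces the worked example above (the function name and inputs are illustrative only, not part of any course tooling):

```python
def problem_set_score(num_problems, num_solved, explanation_pct, num_idk):
    """Return a problem set score (as a percentage) under the course policy:
    every completely solved problem receives the whiteboard explanation grade,
    each "I don't know" earns 25%, and anything else incomplete earns 0%."""
    assert num_solved + num_idk <= num_problems
    return (num_solved * explanation_pct + num_idk * 25) / num_problems

# The example above: 5 problems, 3 solved completely,
# 90% on the explanation, "I don't know" on the other 2.
print(problem_set_score(5, 3, 90, 2))  # 64.0
```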
| Assignment | Due Date |
|---|---|
| Problem Set 1 | Wednesday, February 4, 2026 |
| Problem Set 2 | Wednesday, February 25, 2026 |
| Problem Set 3 | Wednesday, March 25, 2026 |
| Problem Set 4 | Wednesday, April 22, 2026 |
Midterm Exam: Friday, March 13, 2026 (first half of class). Final Exam: Friday, May 8, 2026, 2:00pm–4:30pm (regular classroom). Students who choose to complete a final project may opt out of the final exam.
Project Proposal Due: Thursday, April 2, 2026. Guidelines: here
There is no textbook to purchase. Course material will consist of my slides and lecture notes scribed by Teal Witter, along with assorted online resources, including papers, notes from other courses, and publicly available surveys.
The following schedule is tentative and subject to change. Topics and resources will be updated throughout the semester.
| Week | Date | Topic | Resources |
|---|---|---|---|
| 1 | 1/23 | Random variables, concentration, Markov's inequality [slides] [notes] | Probability review, Alternative resource, Mark-and-recapture paper |
| 2 | 1/30 | Efficient hashing, Chebyshev's inequality | Princeton universality notes, Flajolet-Durand paper |
| 3 | 2/6 | Exponential tail bounds (Chernoff, Bernstein) | Terry Tao notes, Power of two choices |
| 4 | 2/13 | High-dimensional geometry | Foundations of Data Science (Ch. 2) |
| 5 | 2/20 | Johnson-Lindenstrauss lemma, dimensionality reduction | Princeton JL notes, Anupam Gupta's notes |
| 6 | 2/27 | High-dimensional nearest neighbor search, locality sensitive hashing | Stanford LSH notes, Indyk-Motwani analysis |
| 7 | 3/6 | Gradient descent, projected gradient descent | Stanford linear algebra, Mądry's notes, Moritz Hardt's notes, Sébastien Bubeck's book |
| 8 | 3/13 | Midterm + GD continuation, second-order conditions | Second-order conditions notes |
| — | 3/20 | SPRING BREAK (no class) | — |
| 9 | 3/27 | Online and stochastic gradient descent | Elad Hazan's book |
| 10 | 4/3 | Center of gravity, ellipsoid method, LP relaxation | Princeton ellipsoid method, Interior point method, Nisheeth Vishnoi's book |
| 11 | 4/10 | Singular value decomposition, Krylov subspace methods | Princeton power method, Foundations of Data Science (Ch. 3), Stanford notes |
| 12 | 4/17 | Spectral graph theory, spectral clustering, stochastic block model | Princeton SBM notes, Karate Club paper, Stanford spectral graph |
| 13 | 4/24 | SBM continuation, randomized numerical linear algebra, sketching for regression, ε-nets | Princeton sketching notes, ε-nets context |
| 14 | 5/1 | Fast JL, sparse recovery, compressed sensing | FJLT paper, Compressed sensing notes |
| 15 | 5/8 | FINAL EXAM (2:00pm) | Final exam preparation materials |
Note: Lecture notes, slides, and additional resources will be posted on Brightspace before each lecture.
Brightspace: TBA. Gradescope: TBA.
All students are expected to follow NYU's academic integrity policies. Violations will be taken seriously and may result in failure of the course and/or disciplinary action by the university.