This is an advanced theory course that examines contemporary computational methods that enable machine learning and data science at scale. We will cover topics including randomized algorithms, concentration inequalities, dimensionality reduction, convex optimization, spectral methods, and compressed sensing. The course emphasizes rigorous mathematical analysis and algorithmic design.
Office: Room 1104, 370 Jay St., Brooklyn, NY
Instructor Office Hours: Mondays 10:00am–11:00am over Zoom, by appointment (please email at least a day in advance to schedule).
Lectures: Fridays 2:00pm–4:30pm, 6 MetroTech Center, Room 674.
Problem-Solving Session: Tuesdays 12:30pm, Room 826, 370 Jay Street. Led by the course assistant; weekly problem-solving and Q&A.
TA Office Hours: Wednesdays 12:00pm–1:00pm and Thursdays 10:00am–11:00am, 8th-floor common area, 370 Jay Street.
Reading Group: Details TBA. For students working on final projects. Open to all interested students for discussion and workshopping of research papers.
This course is mathematically rigorous and intended for graduate students and advanced undergraduates. Students should have prior coursework in machine learning, a strong background in algorithms, a solid understanding of linear algebra, experience with probability theory and random variables, and comfort with writing and understanding rigorous mathematical proofs.
Problem Sets: 40%, Midterm Exam: 25%, Final Project OR Final Exam: 25%, Participation: 10%.
There will be four problem sets throughout the semester. All assignments must be submitted via Gradescope by 11:59pm ET on the due dates listed below.
Formatting: Solutions must be written in LaTeX or Markdown.
Grading Policy: Each problem set will be graded as follows.
- For each problem you solve completely, clearly indicate that it is a complete solution.
- After submission, you will be asked to explain one randomly selected problem from among your completely solved problems on a whiteboard. You may reference your written homework during the explanation.
- The grade you receive on the explanation will be applied to all of your completely solved problems.
- For problems you do not solve completely, you may write "I don't know" to receive 25% credit on that problem. Problems left incomplete without writing "I don't know" receive 0% credit; there is no partial credit for incomplete solutions.
Example: If a problem set has 5 problems and you completely solve 3 of them, one of those 3 will be randomly selected for you to explain. If you receive 90% on your explanation, you receive 90% on all 3 completely solved problems. For the remaining 2 problems, you can write "I don't know" on each to receive 25% credit, or 0% if you leave them incomplete. Your final score would be: (90% + 90% + 90% + 25% + 25%) / 5 = 64%.
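For concreteness, the scoring rule reduces to a short calculation. Below is a minimal Python sketch that reproduces the worked example above (the function name and inputs are illustrative only, not part of any course tooling):

```python
def problem_set_score(num_problems, num_solved, explanation_pct, num_idk):
    """Return a problem set score (as a percentage) under the course policy:
    every completely solved problem receives the whiteboard explanation grade,
    each "I don't know" earns 25%, and anything else incomplete earns 0%."""
    assert num_solved + num_idk <= num_problems
    return (num_solved * explanation_pct + num_idk * 25) / num_problems

# The example above: 5 problems, 3 solved completely,
# 90% on the explanation, "I don't know" on the other 2.
print(problem_set_score(5, 3, 90, 2))  # 64.0
```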
| Assignment | Due Date |
|---|---|
| Problem Set 1 | Wednesday, February 4, 2026 |
| Problem Set 2 | Wednesday, February 25, 2026 |
| Problem Set 3 | Wednesday, March 25, 2026 |
| Problem Set 4 | Wednesday, April 22, 2026 |
Midterm Exam: Friday, March 13, 2026 (first half of class). Final Exam: Friday, May 8, 2026, 2:00pm–4:30pm (regular classroom). Students who choose to complete a final project may opt out of the final exam.
Project Proposal Due: Thursday, April 2, 2026. Guidelines: here
There is no textbook to purchase. Course material will consist of my slides and lecture notes scribed by Teal Witter, along with assorted online resources, including papers, notes from other courses, and publicly available surveys.
The following schedule is tentative and subject to change. Topics and resources will be updated throughout the semester.
| Week | Date | Topic | Resources |
|---|---|---|---|
| 1 | 1/23 | Random variables, concentration, Markov's inequality [slides] [notes] | Probability review, Alternative resource, Mark-and-recapture paper |
| 2 | 1/30 | Efficient hashing, Chebyshev's inequality | Princeton universality notes, Flajolet-Durand paper |
| 3 | 2/6 | Exponential tail bounds (Chernoff, Bernstein) | Terry Tao notes, Power of two choices |
| 4 | 2/13 | High-dimensional geometry | Foundations of Data Science (Ch. 2) |
| 5 | 2/20 | Johnson-Lindenstrauss lemma, dimensionality reduction | Princeton JL notes, Anupam Gupta's notes |
| 6 | 2/27 | High-dimensional nearest neighbor search, locality sensitive hashing | Stanford LSH notes, Indyk-Motwani analysis |
| 7 | 3/6 | Gradient descent, projected gradient descent | Stanford linear algebra, Mądry's notes, Moritz Hardt's notes, Sébastien Bubeck's book |
| 8 | 3/13 | Midterm + GD continuation, second-order conditions | Second-order conditions notes |
| — | 3/20 | SPRING BREAK (no class) | — |
| 9 | 3/27 | Online and stochastic gradient descent | Elad Hazan's book |
| 10 | 4/3 | Center of gravity, ellipsoid method, LP relaxation | Princeton ellipsoid method, Interior point method, Nisheeth Vishnoi's book |
| 11 | 4/10 | Singular value decomposition, Krylov subspace methods | Princeton power method, Foundations of Data Science (Ch. 3), Stanford notes |
| 12 | 4/17 | Spectral graph theory, spectral clustering, stochastic block model | Princeton SBM notes, Karate Club paper, Stanford spectral graph |
| 13 | 4/24 | SBM continuation, randomized numerical linear algebra, sketching for regression, ε-nets | Princeton sketching notes, ε-nets context |
| 14 | 5/1 | Fast JL, sparse recovery, compressed sensing | FJLT paper, Compressed sensing notes |
| 15 | 5/8 | FINAL EXAM (2:00pm) | Final exam preparation materials |
Note: Lecture notes, slides, and additional resources will be posted on Brightspace before each lecture.
Brightspace: TBA. Gradescope: TBA.
All students are expected to follow NYU's academic integrity policies. Violations will be taken seriously and may result in failure of the course and/or disciplinary action by the university.