Optimization & Decision Intelligence Group

Optimization for Data Science (Autumn 2025)

This course provides an in-depth theoretical treatment of classical and modern optimization methods that are relevant in data science. After a general discussion of the role that optimization plays in learning from data, we give an introduction to the theory of (convex) optimization. Based on this, we present and analyze algorithms in the following four categories: first-order methods (gradient and coordinate descent, Frank-Wolfe, subgradient and mirror descent, stochastic and incremental gradient methods); second-order methods (Newton and quasi-Newton methods); non-convexity (local convergence, provable global convergence, cone programming, convex relaxations); min-max optimization (extragradient methods).

The emphasis is on the motivations and design principles behind the algorithms, on provable performance bounds, and on the mathematical tools and techniques to prove them. The goal is to equip students with a fundamental understanding of why optimization algorithms work, and what their limits are. This understanding will help in selecting suitable algorithms for a given application, but providing concrete practical guidance is not our focus.

Lecturers: Prof. Dr. Niao He (OAT Y21.1), Prof. Dr. Bernd Gärtner (OAT Z15), Dr. Zebang Shen (OAT Y21.2).

Guest Lecturers: Dr. Bingcong Li (OAT Y14) on Nonconvex Functions, Dr. Ya-Ping Hsieh (OAT Y14) on Min-Max Optimization.

Classes: Mon 14-15 (HG E 5), Tue 10-12 (HG D 1.1).

Useful Links: Course Catalogue (261-5110-00L). All materials, including announcements and Q&A, are available on Moodle.

Prerequisites: A solid background in analysis and linear algebra; some background in theoretical computer science (computational complexity, analysis of algorithms); the ability to understand and write mathematical proofs.

Schedule and Course Materials

Week	Date	Topic
1	Mon 15.09 (cancelled) Tue 16.09	Introduction and Optimization
2	Mon 22.09 Tue 23.09	Theory of Convex Functions
3	Mon 29.09 Tue 30.09	Gradient Descent
4	Mon 06.10 Tue 07.10	Projected Gradient Descent and Coordinate Descent
5	Mon 13.10 Tue 14.10	The Frank-Wolfe Algorithm
6	Mon 20.10 Tue 21.10	Newton’s Method and Quasi-Newton Methods
7	Mon 27.10 Tue 28.10	Nonconvex Functions
8	Mon 03.11 Tue 04.11	Subgradient Method
9	Mon 10.11 Tue 11.11	Mirror Descent, Smoothing, Proximal Algorithms
10	Mon 17.11 Tue 18.11	Min-Max Optimization, Part I
11	Mon 24.11 Tue 25.11	Min-Max Optimization, Part II
12	Mon 01.12 Tue 02.12	Stochastic Optimization: SGD
13	Mon 08.12 Tue 09.12	Finite Sum Optimization
14	Mon 15.12 Tue 16.12	Data Science Applications

Grading, Graded Assignments and Exam

There will be a written exam in the examination session. Furthermore, there will be two mandatory graded assignments during the semester. The final grade of the whole course will be calculated as a weighted average of the grades for the exam (70%) and the graded assignments (30%).

Concretely, let P1 and P2 be the performances in the two graded assignments, each measured as the percentage of points attained (between 0% and 100%). A graded assignment that is not handed in counts as a performance of 0%. Let PE be the performance in the final exam. Then the overall course performance is computed as P = 0.15*P1 + 0.15*P2 + 0.7*PE. A course performance of P >= 50% is guaranteed to lead to a passing grade, but depending on the overall performance of the cohort, we may lower the threshold for a passing grade.
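
As an illustration of the formula above, the following minimal Python sketch computes the overall course performance from the three percentages. The function names course_performance and guaranteed_pass are hypothetical helpers introduced for this example only, and the sample numbers are made up. The sketch only checks the guaranteed threshold of 50%; it says nothing about the actual passing threshold, which may be lowered.

```python
def course_performance(p1: float, p2: float, pe: float) -> float:
    """Return P = 0.15*P1 + 0.15*P2 + 0.7*PE, with all values in percent.

    p1, p2: performances in the two graded assignments (0 if not handed in).
    pe:     performance in the final exam.
    """
    for name, value in (("P1", p1), ("P2", p2), ("PE", pe)):
        if not 0.0 <= value <= 100.0:
            raise ValueError(f"{name} must be between 0 and 100, got {value}")
    return 0.15 * p1 + 0.15 * p2 + 0.7 * pe


def guaranteed_pass(p: float) -> bool:
    """True if P reaches the 50% mark that is guaranteed to pass.

    A False result does not imply failing, since the threshold may be lowered.
    """
    return p >= 50.0


# Made-up example: 80% on the first assignment, second assignment
# not handed in (counted as 0%), 60% on the exam.
p = course_performance(80.0, 0.0, 60.0)
print(f"P = {p:.1f}%, guaranteed pass: {guaranteed_pass(p)}")
```

In this made-up case the missing assignment costs up to 15 percentage points of P, so the exam alone has to carry the result.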

Graded Assignments (30%):

Twice during the semester, we will hand out graded assignments (compulsory continuous performance assessments). Solutions are expected to be typeset in LaTeX or similar. Assignments can be discussed with colleagues, but we expect an independent write-up.

The estimated release dates of the graded assignments are as follows: 28.10.2025 and 16.12.2025. You will have three weeks to finish each of the graded assignments.

Exam (70%):

Date to be determined. The written exam lasts 180 minutes. 4 pages (A4) of written material are allowed.

Additional Reading Materials

Convex Optimization, by Stephen Boyd and Lieven Vandenberghe

Convex Optimization: Algorithms and Complexity, by Sébastien Bubeck

High-Dimensional Statistics: A Non-Asymptotic Viewpoint, by Martin J. Wainwright

Assistants and Regular Exercises

Liang Zhang (OAT Y21.2)	Xiang Li (OAT Y23)	Jiduan Wu
Andrey Kharitenko (OAT Y14)	Haofeng Yang	Fangyuan Sun

Regular Exercises:

The exercises are discussed in the exercise classes. Students are expected to try to solve the problems beforehand. Your assistant is happy to look at your solutions and to correct or comment on them. We assign students to classes according to surnames. Attendance according to these assignments is not compulsory but encouraged. The details of the classes are as follows.

Group	Students with Surnames	Time	Room
A	A - L	Tue 14-16	HG D 1.2
B	M - Z	Fri 14-16 (cancelled)	CAB G 51