CS395T: Sublinear Algorithms (Fall 2016)

Grades will be based on the weighting of class components listed under Grading below. Plus and minus modifiers will not appear in the final grade.
Logistics: Tue/Thu 3:30 - 5:00
GDC 5.304
Unique Number: 51810
Course web page: http://www.cs.utexas.edu/~ecprice/courses/sublinear/
Professor: Eric Price
Email: ecprice@cs.utexas.edu
Office: GDC 4.510
Office Hours: Wednesday 3-4pm
TA: Zhao Song
Email: zhaos@utexas.edu
Office Hours: TBA
Content: This graduate course will study algorithms that can process very large data sets. In particular, we will consider algorithms for:
  • Data streams, where you don't have enough space to store all the data being generated.
  • Property testing, where you don't have enough time to look at all the data.
  • Compressed sensing, where you don't have enough measurement capacity to observe all the data.
Useful References: The last instantiation of this class was similar to this one. Other similar courses include Sublinear Algorithms (at MIT), Algorithms for Big Data (at Harvard), and Sublinear Algorithms for Big Datasets (at the University of Buenos Aires).
Problem Sets: Problem sets are due every other week at the beginning of class. Typewritten solutions are preferred.
  1. Problem Set 1. Due September 15.
  2. Problem Set 2. Due October 4.
  3. Problem Set 3. Due October 20.
  4. Problem Set 4. Due November 8.
  5. Problem Set 5. Due November 22.
Lectures:
  1. Thursday, August 25. Course overview; basic uniformity testing. [Lecture notes (pdf) (tex)]
  2. Tuesday, August 30. Concentration inequalities; distinct elements. [Lecture notes (pdf) (tex)]
  3. Thursday, September 1. More distinct elements algorithms and lower bounds. [Lecture notes (pdf) (tex)]
  4. Tuesday, September 6. Concentration of measure. [Lecture notes (pdf) (tex)] [scratch]
  5. Thursday, September 8. Subgamma variables; Johnson-Lindenstrauss. [Lecture notes (pdf) (tex)]
  6. Tuesday, September 13. Count-Min sketch. [Lecture notes (pdf) (source)] [scratch]
  7. Thursday, September 15. Count-Sketch. [Lecture notes (pdf) (tex)]
  8. Tuesday, September 20. L0 sampling; exact sparse recovery. [Lecture notes (pdf) (tex)] [scratch]
  9. Thursday, September 22. Graph sketching. [Lecture notes (pdf) (source)] [scratch]
  10. Tuesday, September 27. Coresets. [Lecture notes (pdf) (tex)] [scratch]
  11. Thursday, September 29. Cauchy distribution; Fp moment estimation. [Lecture notes (pdf) (tex)]
  12. Tuesday, October 4. Fp moment estimation lower bounds; packing/covering numbers. [Lecture notes (pdf) (tex)]
  13. Thursday, October 6. Maurey's empirical method; Restricted Isometry Property. [Lecture notes (pdf) (tex)]
  14. Thursday, October 13. Proving the RIP; iterative hard thresholding. [Lecture notes (pdf) (tex)]
  15. Tuesday, October 18. Model-based compressive sensing. [Lecture notes (pdf) (tex)]
  16. Thursday, October 20. L1 minimization. [Lecture notes (pdf) (tex)]
  17. Tuesday, October 25. Lower bounds for sparse recovery. [Lecture notes (pdf) (tex)]
  18. Thursday, October 27. Adaptive sparse recovery. [Lecture notes (pdf) (tex)]
  19. Tuesday, November 1. RIP-1; SSMP. [Lecture notes (pdf) (tex)]
  20. Thursday, November 3. Fourier uncertainty principle. Symmetrization; Dudley's entropy integral; start Fourier RIP. [Lecture notes (pdf) (tex)]
  21. Tuesday, November 8. Finish Fourier RIP. [Lecture notes (pdf) (tex)]
  22. Thursday, November 10. Property testing: monotonicity, grids.
  23. Tuesday, November 15. Property testing on graphs.
  24. Thursday, November 17. Distribution testing: uniformity, identity.
  25. Tuesday, November 22. Distribution testing: identity of pairs of distributions; independence.
The tentative outline for the course is as follows:
  • Uniformity testing
  • Concentration inequalities and Johnson-Lindenstrauss
  • Distinct elements counting
  • Heavy hitters
  • Graph sketching
  • Compressed sensing
  • Model-based compressed sensing
  • Sparse Fourier transforms
  • Property testing
  • Other streaming models: random order, distributional
Prerequisites: Mathematical maturity and comfort with undergraduate algorithms and basic probability. Ideally also familiarity with linear algebra.
Grading: 40%: Homework
30%: Final project
20%: Scribing lectures
10%: Participation
Scribing: In each class, two students will be assigned to take notes. These notes should be written up in a standard LaTeX format before the next class.
Homework policy: There will be a homework assignment roughly every two weeks.

Collaboration policy: You are encouraged to collaborate on homework. However, you must write up your own solutions. You should also state the names of those you collaborated with on the first page of your submission.

Final project: In lieu of a final exam, students will complete final projects. These may be done individually or in groups of 2-3. An ideal final project would present a piece of original research on a topic related to the course. Failing that, a project may be a literature survey covering several research papers in the field.

Students will present their results to the class during the last week of classes. The final paper will be due on the scheduled final exam day.

Students with Disabilities: Any student with a documented disability (physical or cognitive) who requires academic accommodations should contact the Services for Students with Disabilities area of the Office of the Dean of Students at 471-6259 (voice) or 471-4641 (TTY for users who are deaf or hard of hearing) as soon as possible to request an official letter outlining authorized accommodations.