Department: The University of Texas at Austin, Department of Computer Sciences
Instructor: Professor William H. Press
Term: Spring, 2008
Meets: MW 1:30 - 3:00 PM in PAR 201
Bayesian vs. frequentist approaches. Common univariate and multivariate distributions, statistical tests, contingency tables. Random deviates, Monte Carlo and stochastic simulation, bootstrap methods. Model fitting, multidimensional optimization, simulated annealing. Gaussian mixture models, EM methods. Hidden Markov models, Markov chain Monte Carlo and its generalizations. Entropy, mutual information, applications to clustering and classification. Emphasis will be on practical methods of computation with examples drawn mostly from bioinformatics, especially genomics, but with utility for other fields dealing with large amounts of experimental or observational data. Individual or collaborative projects using actual data (provided either by the student or by the instructor) will be encouraged.
Prerequisites: Graduate standing, or upper-division undergraduate standing with consent of instructor; mathematics including at least undergraduate multivariable calculus and linear algebra; must be comfortable programming in C++, Java, or C.
There will be no written exams or required problem sets as such. Grades will be based on class participation, contributions to this wiki (either creating new pages or adding material to pages created by the instructor), individual or group-collaborative projects (with appropriate results added to the wiki), and an individual 20-30 minute interview with each student by the instructor at the end of the term. During lectures I'll suggest optional problems which any interested student may solve and post for participation credit. (Multiple students may post solutions, or comment on each other's solutions.)
4/16/08 Be sure to sign up for an exam time here. Exam dates are May 5 and May 6, 2008.
Lecture notes as actually delivered are posted to the lecturenotes page. Student comments and discussion are appended to these pages. (This counts for participation credit!)
Lecture notes as later cleaned up, arranged by topic, and prefaced by summary slides are linked to the course outline and concepts list, here.
Dr. Sun Young Park gave a guest lecture on biomedical image analysis.
Students may contribute or collaborate on topical pages at any time at contributed.
Data files used as examples in the lectures, and some code files, are at data and code files.
Here are student contributions indexed by name.
Use this computer help page to exchange hints on getting Matlab, C++, and mex files working on your individual computers.
Remember, projects are due on Wednesday, May 7, 2008, at noon. Either send to me by email, or else drop off at my office (ACES 3.258).
Instructions for student projects are here.
Individual or group student project results will eventually be found at projects.
Every student will have free electronic access to Numerical Recipes Third Edition.
Some useful books are listed at books.
Useful links are listed at links.
In class, we'll show examples using Matlab and C++. You will need to be able to program in C++ or Java. If you program in C, it should not be hard for you to acquire enough C++ by following the examples in the lecture notes.
You can use Mathematica instead of Matlab if it is more familiar to you. If you haven't previously used either language, then you will learn basic Matlab programming in this course. (It's not hard.)
You may also be able to use Octave instead of Matlab. Octave is an open-source replacement for Matlab, and in limited testing it seems to have no problems running code written for Matlab. It's available for Linux, Windows, and Mac; more info here.