CS 4641-B Machine Learning — Spring 2019

Tuesday & Thursday 12:00pm-1:15pm, Klaus room 1443

Instructor: Brian Hrolenok
@cc.gatech.edu email: brian.hrolenok
Office: TSRB 241
Office Hours: Tu/Th 1:30pm-2:30pm (and by appointment)

Course description

CS 4641 is a 3-credit introductory course on Machine Learning intended for undergraduates. Machine Learning is the area within the broader field of Artificial Intelligence that focuses on algorithms for making the best decisions given data. The theoretical and practical specifics of each of these terms in a variety of problem domains form the core of ML research. This course is an introduction to a very broad and active field, and presents specific algorithms and approaches in a way that grounds them in broader classes within that field. Topics will include supervised and unsupervised learning, randomized search algorithms, Bayesian learning methods, and reinforcement learning. The course also covers theoretical concepts such as inductive bias, the PAC and Mistake-bound learning frameworks, the minimum description length principle, and Ockham's Razor. This course will include several individual programming and report-based assignments.

Learning objectives:

To provide a broad survey of approaches and techniques in ML.
To develop a deeper understanding of several major topics in ML.
To develop the design and programming skills that will help you to build computational artifacts that learn from data.
To develop the basic skills necessary to pursue research in ML.

Prerequisites. The official prerequisite for this course is CS 1331, although familiarity with the following topics will be useful:

Probability
Statistics
Linear algebra
Data structures
Computational complexity

Textbook: There is no textbook for this class. Specific readings from the literature will be provided via Canvas.

Assignments

All assignment submissions will be handled through Canvas, and are due by the date and time listed there. Canvas should be configured to allow submissions up to an hour late for most assignments, but these will incur a 10% penalty, and late submissions may be disabled if abused. Submissions by email will not be accepted.

Assignment 1: This first assignment asks you to explore multiple techniques in supervised learning, with a particular focus on comparative analysis. This project is open-ended and requires significant time to complete, which means you should start as early as possible.

Assignment 2: This assignment covers several randomized optimization techniques that are useful for solving complex optimization problems common in the ML domain. This project is shorter than Assignment 1, but there is less time before the due date and the midterm falls in the same period, so please start early.

Assignment 3: This assignment asks you to use several clustering, feature selection, and feature transformation algorithms on the datasets you've previously analyzed.

Assignment 4: This assignment asks you to explore Markov Decision Processes by designing a few problems, and solving them using Value Iteration, Policy Iteration, and Reinforcement Learning.

Exams

There will be two exams in this class: a midterm and a cumulative final. The date for the midterm may differ from what is listed below, but the intent is for you to have feedback before the grade-change deadline. The date for the final is fixed by the registrar. Both exams are closed-book, closed-notes, and in class. There will be no make-up exam unless previously arranged (well in advance), or excused by the Dean of Students.

Grading policies

Your TAs and I will strive to provide you reasonably detailed and timely feedback on every assignment and exam. If you have any questions about any of your grades please reach out to us, either by coming to scheduled office hours or via your "@gatech.edu" email address. If there is an error with your grade, please contact us within a week of when feedback is returned, otherwise we might not be able to change it.

Point breakdown:

Projects: 50%
Midterm: 25%
Final: 25%
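
As a rough illustration of how the breakdown above combines into a single course score, here is a minimal Python sketch. It assumes that the project scores are averaged with equal weight and that every score is on a 0-100 scale; neither assumption is stated in this syllabus, and the function name and the sample scores are hypothetical.

```python
# Weights taken from the point breakdown above.
WEIGHTS = {"projects": 0.50, "midterm": 0.25, "final": 0.25}


def course_score(project_scores, midterm, final):
    """Return the weighted course score on a 0-100 scale.

    Assumption: project scores are averaged with equal weight. The
    syllabus does not say how individual assignments are weighted.
    """
    projects = sum(project_scores) / len(project_scores)
    return (WEIGHTS["projects"] * projects
            + WEIGHTS["midterm"] * midterm
            + WEIGHTS["final"] * final)


# Hypothetical scores, for illustration only.
example = course_score([90, 80, 85, 95], midterm=78, final=88)
print(round(example, 2))  # 85.25
```

With these sample numbers the projects average 87.5, contributing 43.75 points, and the midterm and final contribute 19.5 and 22 points. Late penalties and any curve are not modeled here.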

Academic Integrity

All of the assignments in this class are individual work only. For some aspects of some assignments you are allowed and even encouraged to use resources publicly available on the Internet, with two caveats:

When you can use public resources, it will be explicitly stated. If it's not explicitly stated, assume it's not allowed. If you're unsure, ask first.
Thoroughly document where and when you obtained any code or libraries that you use which you did not write yourself. Otherwise, you run the risk of appearing to misrepresent another's work as your own. When in doubt, be explicit about where the code came from.

Being a student at Georgia Tech can be very stressful, and it's far too easy to overload your semester with difficult or time-intensive classes. When you have multiple assignments due in the same week, sometimes you have to decide how much time you can spend on each one, and sometimes there just aren't enough hours in the day. If that happens, come and talk to me about it; there's probably some way we can make things work. I'm far more willing to give extensions, hold extra office hours, and curve than I am willing to overlook violations of the honor code.

Schedule

The following is the tentative schedule for the spring 2019 semester. Please check your email and Canvas regularly for any changes, as this website may not be updated immediately.

Week 1 — January 8th & 10th.
Supervised Learning part 1. Decision trees.

Week 2 — January 15th & 17th.
Supervised Learning part 2. Regression and Classification, Neural Networks.

Week 3 — January 22nd & 24th.
Supervised Learning part 3. Instance-based learning and ensemble learning.

Week 4 — January 29th & 31st.
Supervised Learning part 4. Kernel Methods, SVMs, computational learning theory.

Week 5 — February 5th & 7th. Assignment 1 due (tentative).
Supervised Learning part 5. VC dimension, Bayesian learning.

Week 6 — February 12th & 14th.
Supervised Learning part 6. Bayesian inference, randomized optimization.

Week 7 — February 19th & 21st.
Midterm review. The midterm will be on Thursday, February 21st, during the normal class period.

Week 8 — February 26th & 28th. Assignment 2 due (tentative).
Unsupervised Learning part 1. Clustering.

Week 9 — March 5th & 7th.
Unsupervised Learning part 2. Feature selection.

Week 10 — March 12th & 14th.
Unsupervised Learning part 3. Feature transformation.

Week 11 — March 19th & 21st.
Spring break — no class.

Week 12 — March 26th & 28th.
Unsupervised Learning part 4. Information Theory.

Week 13 — April 2nd & 4th. Assignment 3 due (tentative).
Reinforcement Learning part 1. Markov Decision Processes.

Week 14 — April 9th & 11th.
Reinforcement Learning part 2. Q-Learning.

Week 15 — April 16th & 18th.
Reinforcement Learning part 3. Game theory.

Week 16 — April 23rd. Assignment 4 due by 11:59pm EDT on Sunday April 21st.
Course summary and final review.