Ref: Coursera - University of Washington Machine Learning Course.
ROUGH NOTES (!)
Updated: 7/6/26
MACHINE LEARNING
We will consider
MACHINE LEARNING: CASE STUDIES
[Machine learning is changing the world]
Old view of ML
Data -> ML Algorithm -> “My curve is better than your curve” -> Write a paper.
Intelligent Applications
Disruptive companies (Companies that take a market and completely change it) are often differentiated by INTELLIGENT APPLICATIONS using Machine Learning.
Eg: In the early days, Amazon disrupted the retail market by bringing product recommendations into its website.
Eg: Google disrupted the advertising market by doing really targeted advertising, using ML to figure out what people would click on.
Eg: Netflix, the movie distribution company, changed how we watch movies. (We don’t go to a shop and rent movies anymore, we go to the web and stream data.) At the core, there is a recommender system that helps me find the movies that I like, out of the many, many thousands of movies they are serving.
Eg: Consider Pandora, a music recommendation system.
Eg: Consider Facebook. It connects me with people who I might want to be friends with.
Eg: Consider Uber. It disrupted the taxi industry by optimizing how to connect drivers with people in real time.
We will get ready to build intelligent applications like these.
The machine learning pipeline
Data -> ML Method -> Intelligence.
By Intelligence, we mean things like: What product am I likely to buy right now?
[Why a case study approach?]
Case Study 1: Predicting house prices.
Data -> ML Method -> Intelligence.
The intelligence we are deriving is: An estimated value for some house that’s not on the market.
The data is: Other houses and their sale prices.
The ML method is something that relates the house attributes to the sale price. (This method is called Regression).
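A rough sketch of the regression idea, not the course's own code: fit a straight line relating one house attribute to the sale price, then use it to value a house that is not on the market. The data values, the single attribute (square feet) and the names `fit_line` and `predict_price` are all assumptions made up for illustration.

```python
# Hypothetical data: (square feet, sale price in dollars) for houses already sold.
sales = [(1000, 200_000), (1500, 300_000), (2000, 400_000), (2500, 500_000)]

def fit_line(points):
    """Least-squares fit of price = intercept + slope * sqft."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    var_x = sum((x - mean_x) ** 2 for x, _ in points)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return intercept, slope

intercept, slope = fit_line(sales)

def predict_price(sqft):
    """Estimate the price of a house that has not been sold yet."""
    return intercept + slope * sqft

print(round(predict_price(1800)))
```

Real house data would use many attributes (bedrooms, location, etc.) and would not lie exactly on a line; this only shows the Data -> ML Method -> Intelligence shape.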
Case Study 2: Sentiment analysis
Data -> ML Method -> Intelligence.
The data is: Reviews of some restaurants, each review has a rating.
Intelligence: We want to take a review (A sample review: “Sushi was awesome, the food was awesome, but the service was awful”) and classify whether it has a positive sentiment or a negative sentiment.
ML Method: We might analyze the text of this review in terms of how many times it uses “awesome words” versus how many times it uses “awful words”. From the other reviews that we have, we learn a decision boundary between positive and negative. (This method is a Classification method).
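A minimal sketch of the word-counting idea, under assumptions: the word lists, the weights and the names `count_words` and `classify` are hypothetical. In the real method the weights (and hence the decision boundary) are learned from the rated reviews; here they are fixed by hand just to show the mechanics.

```python
import re

POSITIVE_WORDS = {"awesome", "great", "delicious"}  # hypothetical "awesome words"
NEGATIVE_WORDS = {"awful", "terrible", "bland"}     # hypothetical "awful words"

def count_words(review):
    """Return (#positive words, #negative words) found in the review text."""
    tokens = re.findall(r"[a-z']+", review.lower())
    n_pos = sum(t in POSITIVE_WORDS for t in tokens)
    n_neg = sum(t in NEGATIVE_WORDS for t in tokens)
    return n_pos, n_neg

def classify(review, w_pos=1.0, w_neg=-1.0, bias=0.0):
    """Linear score; the decision boundary is where the score equals 0.
    The weights are hand-picked assumptions, not learned values."""
    n_pos, n_neg = count_words(review)
    score = w_pos * n_pos + w_neg * n_neg + bias
    return "positive" if score > 0 else "negative"

review = "Sushi was awesome, the food was awesome, but the service was awful"
print(count_words(review), classify(review))
```

With these hand-picked weights, the sample review has two positive words and one negative word, so it falls on the positive side of the boundary.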
Case Study 3: Document retrieval
Data -> ML Method -> Intelligence.
Intelligence: Finding an article or a book that’s of interest to our reader.
Data: Huge collection of articles we could recommend.
ML Method: We will try to find structure in this data, based on groups of related articles. Maybe there is a collection of articles about sports, about world news, etc. If we find this structure and annotate our collection of documents with these labels, we can use this for very rapid document retrieval. (This method is a Clustering method).
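A small sketch of how groups of related articles could be found, assuming k-means as the clustering algorithm (the notes only say “Clustering”). Each document is reduced to two made-up word counts; the document names, counts and starting centroids are hypothetical.

```python
# Hypothetical documents as (count of "goal", count of "election").
docs = {
    "match report":   (5, 0),
    "transfer news":  (4, 1),
    "vote count":     (0, 6),
    "campaign trail": (1, 5),
}

def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))

def kmeans(points, initial_centroids, n_iters=10):
    """Plain k-means: alternate between assigning points and moving centroids."""
    centroids = [list(c) for c in initial_centroids]
    labels = []
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest centroid.
        labels = [
            min(range(len(centroids)),
                key=lambda c: squared_distance(p, centroids[c]))
            for p in points
        ]
        # Update step: each centroid moves to the mean of its members.
        for c in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels, centroids

points = list(docs.values())
labels, centroids = kmeans(points, initial_centroids=[points[0], points[2]])
for name, label in zip(docs, labels):
    print(name, "-> cluster", label)
```

Once every document carries a cluster label, retrieval can look only inside the cluster of the article the reader is currently on, instead of scanning the whole collection.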
Case Study 4: Product recommendation
Data -> ML Method -> Intelligence.
Data: Your past purchases, and purchase histories of all customers.
Intelligence: Product recommendation.
ML Method: We will take the data and arrange it into a customers-by-products matrix (see the pic: the squares here indicate products that a customer actually purchased). From this matrix, we will learn features about users and features about products. Once we learn those features, we can use them to see how much agreement there is between the attributes the user likes and the attributes of the product. (This method is a Matrix Factorization method).
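A sketch of the last step only, assuming the user and product features have already been learned. The feature vectors, names and purchases below are invented; the actual factorization that learns them from the matrix is left as a black box here. “Agreement” is taken to be the dot product of the two feature vectors.

```python
# Hypothetical learned features (2 latent attributes per user and per product).
user_features = {
    "alice": [0.9, 0.1],
    "bob":   [0.2, 0.8],
}
product_features = {
    "running shoes": [1.0, 0.0],
    "tent":          [0.8, 0.3],
    "novel":         [0.1, 0.9],
    "cookbook":      [0.0, 1.0],
}
# Hypothetical purchase history (the filled squares of the matrix).
purchased = {"alice": {"running shoes"}, "bob": {"novel"}}

def score(user, product):
    """Agreement between what the user likes and what the product offers."""
    u, p = user_features[user], product_features[product]
    return sum(a * b for a, b in zip(u, p))

def recommend(user, top_n=1):
    """Rank products the user has not bought yet by predicted agreement."""
    candidates = [p for p in product_features if p not in purchased[user]]
    return sorted(candidates, key=lambda p: score(user, p), reverse=True)[:top_n]

print(recommend("alice"), recommend("bob"))
```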
Case Study 5: Visual product recommender
Data -> ML Method -> Intelligence.
Data: Images.
Intelligence: Suppose we input an image, say of a shoe. We want a set of results of shoes that are visually similar to the input image.
ML Method: To go from an image to a set of related images, we need to have very good features about that image to find other images that are similar. We will derive those features using neural networks. (This method is a Deep Learning method).
Roughly: Every layer of the neural network provides more and more descriptive features.
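A sketch of the retrieval step, assuming the image features already exist. The vectors below are made up by hand and stand in for what a neural network would output; no network is run here. Similar images are found by cosine similarity, which is one common choice and not necessarily the one used in the course.

```python
import math

# Hypothetical feature vectors standing in for neural-network outputs.
image_features = {
    "red sneaker":  [0.9, 0.1, 0.2],
    "blue sneaker": [0.8, 0.2, 0.3],
    "leather boot": [0.2, 0.9, 0.1],
    "sandal":       [0.1, 0.3, 0.9],
}

def cosine_similarity(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def most_similar(query_name, top_n=2):
    """Return the names of the images whose features are closest to the query."""
    query = image_features[query_name]
    others = [name for name in image_features if name != query_name]
    return sorted(
        others,
        key=lambda name: cosine_similarity(query, image_features[name]),
        reverse=True,
    )[:top_n]

print(most_similar("red sneaker", top_n=1))
```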
[Specialization overview]
Other ML classes: Laundry list of algorithms and methods. The problem with this approach is that, since you start with algorithms, you end up with really simplistic use cases and applications disconnected from reality.
This ML specialization: From use cases to models and algorithms.
Data -> ML Method -> Intelligence
ML Method aspects: Task; Models and parameters; Optimization algorithm
Intelligence aspects: Evaluation
We will defer the “Models and parameters” and “Optimization algorithm” aspects to the follow-up courses (for now, they will be black boxes). We will focus on the “Task” and “Evaluation” aspects now.
Course - 1 : ML Case Studies
This course.
Course - 2 : Regression
Case study: Predicting house prices.
Models:
- Linear regression
- Regularization
  - Ridge (L2)
  - Lasso (L1)
Algorithms:
- Gradient descent
- Coordinate descent
Concepts:
Loss functions, bias-variance tradeoff, cross validation, sparsity, overfitting, model selection.
Course - 3 : Classification
Case study: Analyzing sentiment.
Models:
- Linear classifiers (logistic regression, SVMs, perceptron)
- Kernels
- Decision trees
Algorithms:
- Stochastic gradient descent
- Boosting
Concepts:
Decision boundaries, MLE, ensemble methods, random forests, CART, online learning.
Course - 4 : Clustering and Retrieval
Case study: Finding documents.
Models:
- Nearest neighbours
- Clustering, mixtures of Gaussians
- Latent Dirichlet allocation (LDA)
Algorithms:
- KD-trees, locality-sensitive hashing (LSH)
- K-means
- Expectation-maximization (EM)
Concepts:
Distance metrics, approximation algorithms, hashing, sampling algorithms, scaling up with map-reduce.
Course - 5 : Matrix Factorization and Dimensionality Reduction (seems unavailable)
Case study: Recommending products.
Models:
- Collaborative filtering
- Matrix factorization
- PCA
Algorithms:
- Coordinate descent
- Eigendecomposition
- SVD
Concepts:
Matrix completion, eigenvalues, random projections, cold-start problem, diversity, scaling up.
Course - 6 : Capstone (seems unavailable)
An intelligent application using deep learning.
Build and deploy a recommender using product images and text sentiment.