coursera-uw-machine-learning-foundations-a-case-study-approach

The course can be found on Coursera.

A notebook for quick search can be found on my blog, SSQ.

Week 1 Introduction

  • Regression. Case study: Predicting house prices
  • Classification. Case study: Analyzing sentiment
  • Clustering & Retrieval. Case study: Finding documents
  • Matrix Factorization & Dimensionality Reduction. Case study: Recommending Products
  • Capstone. An intelligent application using deep learning
  • Getting familiar with IPython Notebook and SFrame

Week 2 Regression Predicting House Prices

  • Linear Regression
  • Adding higher order effects
  • Evaluating overfitting via training/test split
  • Adding other features
  • Other regression examples
  • Implement a linear regression model with several different features
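The Week 2 workflow (fit a linear model on training data, then judge it on held-out data) can be sketched in plain Python. This is only a minimal illustration and not the course's GraphLab/SFrame code: the square-footage and price values are made up, and fit_line and rmse are hypothetical helper names.

```python
import math

def fit_line(xs, ys):
    """Closed-form least squares for the model y = w0 + w1 * x."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    w1 = sxy / sxx
    w0 = mean_y - w1 * mean_x
    return w0, w1

def rmse(xs, ys, w0, w1):
    """Root mean squared error of the fitted line on (xs, ys)."""
    squared = [(y - (w0 + w1 * x)) ** 2 for x, y in zip(xs, ys)]
    return math.sqrt(sum(squared) / len(squared))

# Made-up data: living area in sq.ft. and price in thousands of dollars.
sqft = [800, 1000, 1200, 1500, 1800, 2000, 2400, 3000]
price = [160, 210, 230, 310, 350, 410, 470, 610]

# Hold out every 4th house as test data; train on the rest.
rows = list(zip(sqft, price))
train = [row for i, row in enumerate(rows) if i % 4 != 3]
test = [row for i, row in enumerate(rows) if i % 4 == 3]
train_x, train_y = zip(*train)
test_x, test_y = zip(*test)

w0, w1 = fit_line(train_x, train_y)
train_rmse = rmse(train_x, train_y, w0, w1)
test_rmse = rmse(test_x, test_y, w0, w1)
print("intercept:", w0, "slope:", w1)
print("train RMSE:", train_rmse, "test RMSE:", test_rmse)
```

Comparing the training RMSE with the test RMSE is what the training/test split is for: a model that only looks good on the training data is likely overfitting.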

Week 3 Classification Analyzing Sentiment

  • Classifier applications
  • Linear classifiers
  • Decision boundaries
  • Training and evaluating a classifier
  • What’s a good accuracy?
  • False positives, false negatives, and confusion matrices
  • Learning curves: How much data do I need?
  • Class probabilities
  • Implement a logistic regression model with several different features
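The Week 3 topics of false positives, false negatives, and confusion matrices can be illustrated with a small sketch. The labels below are made up (1 stands for positive sentiment, 0 for negative), and confusion_counts and accuracy are hypothetical helper names rather than anything from the course notebooks.

```python
def confusion_counts(y_true, y_pred, positive=1):
    """Count true/false positives and negatives for a binary classifier."""
    tp = fp = tn = fn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == positive and truth == positive:
            tp += 1
        elif pred == positive:
            fp += 1  # predicted positive, actually negative
        elif truth == positive:
            fn += 1  # predicted negative, actually positive
        else:
            tn += 1
    return {"tp": tp, "fp": fp, "tn": tn, "fn": fn}

def accuracy(counts):
    """Fraction of all predictions that were correct."""
    total = sum(counts.values())
    return (counts["tp"] + counts["tn"]) / total

# Made-up true labels and predictions.
y_true = [1, 1, 1, 0, 0, 1, 0, 1]
y_pred = [1, 0, 1, 0, 1, 1, 0, 1]

counts = confusion_counts(y_true, y_pred)
print(counts)
print("accuracy:", accuracy(counts))
```

Accuracy alone hides which kind of mistake the classifier makes, which is why the two error counts are reported separately.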

A Review of Machine Learning Foundations — A Case Study Approach by Coursera

Things i know..

On December 11, 2016 I completed the course “Machine Learning Foundations: A Case Study Approach” by Coursera. This course is a great introduction to the world of Machine Learning, and through this blog post my goal is to give a brief review of the course and its content.

What is it About?

“Machine Learning Foundations: A Case Study Approach” is an introductory course on common Machine Learning concepts such as regression, classification, clustering and similarity, recommender systems, and deep learning. It is a hands-on course (what they call a case study approach) that gives a practical understanding of common Machine Learning methods before diving into the theory behind them. The course serves as the foundation for the Machine Learning specialization, whose later courses cover the same topics in more detail.

Structure and Content

The course is structured in 6 weeks (about 10 hours of work per week), each covering a specific Machine Learning concept. Each concept is explained through a series of video lessons followed by a quiz (usually 5-10 questions) and a programming assignment in which you implement a small application using the method studied that week. The lessons focus on general principles rather than on specific implementations or tools.

Instructors recommend using Python with GraphLab (a Machine Learning modeling tool for developers and data scientists), but other languages or packages can be used as well. There’s no need to install the recommended packages on your local machine, since a GraphLab service running on Amazon’s cloud is already provided for each student enrolled in the course.

I enjoyed taking this course. It is introductory, so you might be able to skip it if you already have some experience with Machine Learning. Immediately after completing it, I started the Regression course, which, as explained above, is a lot more theoretical: algorithms are implemented from scratch instead of using third-party libraries.


Machine Learning Foundations: A Case Study Approach Quiz Answer

Team Networking Funda

  • In Data Science Quiz
  • In Machine Learning Specialization
  • On March 4, 2024

Get All Weeks Machine Learning Foundations: A Case Study Approach Quiz Answers


Machine Learning Foundations: A Case Study Approach Course Review

In our experience, this course is worth your time: we suggest you enroll in Machine Learning Foundations: A Case Study Approach and gain new skills from professionals, completely free.

If you get stuck on a quiz or a graded assessment in Machine Learning Foundations: A Case Study Approach, visit Networking Funda for the quiz answers.

Get All Course Quiz Answers of Machine Learning Specialization

Machine Learning: Regression Coursera Quiz Answers

Machine Learning: Classification Coursera Quiz Answers

Machine Learning: Clustering & Retrieval Quiz Answers

Team Networking Funda


We are Team Networking Funda, a group of passionate authors and networking enthusiasts committed to sharing our expertise and experiences in networking and team building. With backgrounds in Data Science, Information Technology, Health, and Business Marketing, we bring diverse perspectives and insights to help you navigate the challenges and opportunities of professional networking and teamwork.

Week 1: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: SFrames

Q 1: Download the Wiki People SFrame. Then open a new Jupyter notebook, import TuriCreate, and read the SFrame data.

Answer: Click here

Q 2: How many rows are in the SFrame? (Do NOT use commas or periods.)

Answer: 59071

Q 3: Which name is in the last row?

  • Conradign Netzer
  • Cathy Caruth
  • Fawaz Damrah

Q 4: Read the text column for Harpdog Brown. He was honored with:

  • A Grammy award for his latest blues album.
  • A gold harmonica to recognize his innovative playing style.
  • A lifetime membership in the Hamilton Blues Society.

Q 5: Sort the SFrame according to the text column, in ascending order. What is the name entry in the first row?

  • Zygfryd Szo
  • Digby Morrell
  • 007 James Bond
  • 108 (artist)
  • 8 Ball Aitken

Week 2: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Regression

Q 1: Which figure represents an overfitted model?


Q 2: True or false: The model that best minimizes training error is the one that will perform best for the task of prediction on new data.

Q 3: The following table illustrates the results of evaluating 4 models with different parameter choices on some data set. Which of the following models fits this data the best?

Q 4: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)


  • none of the above

Q 5: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)


  • none of the above

Q 6: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)


Q 7: Assume we fit the following quadratic function: f(x) = w0+w1*x+w2*(x^2) to the dataset shown (blue circles). The fitted function is shown by the green curve in the picture below. Out of the 3 parameters of the fitted function (w0, w1, w2), which ones are estimated to be 0? (Note: you must select all parameters estimated as 0 to get the question correct.)


Q 8: Which of the following plots would you not expect to see as a plot of training and test error curves?


Q 9: True or false: One always prefers to use a model with more features since it better captures the true underlying process.

Quiz 2: Predicting house prices

Q 1: Selection and summary statistics: We found the zip code with the highest average house price. What is the average house price of that zip code?

  • $2,160,607

Q 2: Filtering data: What fraction of the houses have living space between 2000 sq.ft. and 4000 sq.ft.?

  • Between 0.2 and 0.29
  • Between 0.3 and 0.39
  • Between 0.4 and 0.49
  • Between 0.5 and 0.59
  • Between 0.6 and 0.69

Q 3: Building a regression model with several more features: What is the difference in RMSE between the model trained with my_features and the one trained with advanced_features?

  • the RMSE of the model with advanced_features lower by less than $25,000
  • the RMSE of the model with advanced_features lower by between $25,001 and $35,000
  • the RMSE of the model with advanced_features lower by between $35,001 and $45,000
  • the RMSE of the model with advanced_features lower by between $45,001 and $55,000
  • the RMSE of the model with advanced_features lower by more than $55,000

Week 3: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Classification

Q 1: The simple threshold classifier for sentiment analysis described in the video (check all that apply):

  • Must have pre-defined positive and negative attributes
  • Must either count attributes equally or pre-define weights on attributes
  • Defines a possibly non-linear decision boundary

Q 2: For a linear classifier classifying between “positive” and “negative” sentiment in a review x, Score(x) = 0 implies (check all that apply):

  • The review is very clearly “negative”
  • We are uncertain whether the review is “positive” or “negative”
  • We need to retrain our classifier because an error has occurred

Q 3: For which of the following datasets would a linear classifier perform perfectly?


Q 4: True or false: High classification accuracy always indicates a good classifier.

Q 5: True or false: For a classifier classifying between 5 classes, there always exists a classifier with accuracy greater than 0.18.

Q 6: True or false: A false negative is always worse than a false positive.

Q 7: Which of the following statements are true? (Check all that apply)

  • Test error tends to decrease with more training data until a point, and then does not change (i.e., curve flattens out)
  • Test error always goes to 0 with an unboundedly large training dataset
  • Test error is never a function of the amount of training data

Quiz 2: Analyzing product sentiment

Q 1: Out of the 11 words in selected_words, which one is most used in the reviews in the dataset?

Q 2: Out of the 11 words in selected_words, which one is least used in the reviews in the dataset?

Q 3: Out of the 11 words in selected_words, which one got the most positive weight in the selected_words_model?

(Tip: when printing the list of coefficients, make sure to use print_rows(rows=12) to print ALL coefficients.)

Q 4: Out of the 11 words in selected_words, which one got the most negative weight in the selected_words_model?

Q 5: Which of the following ranges contains the accuracy of the selected_words_model on the test_data?

  • 0.811 to 0.841
  • 0.841 to 0.871
  • 0.871 to 0.901
  • 0.901 to 0.931

Q 6: Which of the following ranges contains the accuracy of the sentiment_model in the IPython Notebook from lecture on the test_data?

Q 7: Which of the following ranges contains the accuracy of the majority class classifier, which simply predicts the majority class on the test_data?

  • 0.811 to 0.843
  • 0.843 to 0.871
  • 0.901 to 0.931

Q 8: How do you compare the different learned models with the baseline approach where we are just predicting the majority class?

  • They all performed about the same.
  • The model learned using all words performed much better than the one using only the selected_words. And the model learned using the selected_words performed much better than just predicting the majority class.
  • The model learned using all words performed much better than the other two. The other two approaches performed about the same.
  • Simply predicting the majority class performed much better than the other two models.
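The majority class baseline mentioned in these questions can be sketched in a few lines. The labels below are made up (1 for positive, 0 for negative) and majority_class_accuracy is a hypothetical helper; the course notebook computes the same idea on the real product review data.

```python
from collections import Counter

def majority_class_accuracy(train_labels, test_labels):
    """Predict the most common training label for every test example."""
    majority_label, _ = Counter(train_labels).most_common(1)[0]
    correct = sum(1 for label in test_labels if label == majority_label)
    return majority_label, correct / len(test_labels)

# Made-up sentiment labels.
train_labels = [1] * 8 + [0] * 2
test_labels = [1, 1, 1, 0, 1]

label, baseline_accuracy = majority_class_accuracy(train_labels, test_labels)
print("majority label:", label)
print("baseline accuracy:", baseline_accuracy)
```

Any learned model is only interesting if its test accuracy clearly beats this baseline, which is what the comparison above is asking about.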

Q 9: Which of the following ranges contains the ‘predicted_sentiment’ for the most positive review for ‘Baby Trend Diaper Champ’, according to the sentiment_model from the IPython Notebook from lecture?

Q 10: Consider the most positive review for ‘Baby Trend Diaper Champ’ according to the sentiment_model from the IPython Notebook from lecture. Which of the following ranges contains the predicted_sentiment for this review, if we use the selected_words_model to analyze it?

Q 11: Why is the value of the predicted_sentiment for the most positive review found using the sentiment_model much more positive than the value predicted using the selected_words_model?

  • The sentiment_model is just too positive about everything.
  • The selected_words_model is just too negative about everything.
  • This review was positive, but used too many of the negative words in selected_words.
  • None of the selected_words appeared in the text of this review.

Week 4: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Clustering and similarity

Question 1: A country called Simpleland has a language with a small vocabulary of just “the”, “on”, “and”, “go”, “round”, “bus”, and “wheels”. For a word count vector with indices ordered as the words appear above, what is the word count vector for a document that simply says “the wheels on the bus go round and round”?

Please enter the vector of counts as follows: if the counts were [“the”=1, “on”=3, “and”=2, “go”=1, “round”=2, “bus”=1, “wheels”=1], enter 1321211.

Answer: 2111211
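A word count vector like this one can be computed mechanically. The sketch below uses only the Python standard library; word_count_vector is a hypothetical helper name, and the simple regular-expression tokenizer is an assumption that is good enough for this tiny vocabulary.

```python
import re
from collections import Counter

# Vocabulary in the order given in the question.
VOCAB = ["the", "on", "and", "go", "round", "bus", "wheels"]

def word_count_vector(text, vocab=VOCAB):
    """Count how often each vocabulary word occurs in the text."""
    words = re.findall(r"[a-z]+", text.lower())
    counts = Counter(words)
    return [counts[word] for word in vocab]

vector = word_count_vector("the wheels on the bus go round and round.")
print(vector)                            # [2, 1, 1, 1, 2, 1, 1]
print("".join(str(c) for c in vector))   # 2111211
```

The vector has one entry per vocabulary word, so a 7-word vocabulary always gives 7 counts.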

Question 2: In Simpleland, a reader is enjoying a document with a representation: [1 3 2 1 2 1 1]. Which of the following articles would you recommend to this reader next?

  • [7 0 2 1 0 0 1]
  • [1 7 0 0 2 0 1]
  • [1 0 0 0 7 1 2]
  • [0 2 0 0 7 1 1]
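One way to compare the candidate articles with the reader's document is to score each by cosine similarity. This is a sketch under an assumption: the quiz may intend a different similarity measure (for example a plain dot product), and cosine_similarity is a hypothetical helper written here for illustration.

```python
import math

def cosine_similarity(a, b):
    """Dot product of two vectors divided by the product of their lengths."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

query = [1, 3, 2, 1, 2, 1, 1]
candidates = [
    [7, 0, 2, 1, 0, 0, 1],
    [1, 7, 0, 0, 2, 0, 1],
    [1, 0, 0, 0, 7, 1, 2],
    [0, 2, 0, 0, 7, 1, 1],
]

scores = [cosine_similarity(query, candidate) for candidate in candidates]
best_index = max(range(len(candidates)), key=lambda i: scores[i])
for candidate, score in zip(candidates, scores):
    print(candidate, round(score, 3))
print("most similar:", candidates[best_index])
```

All four candidates happen to have the same vector length, so ranking them by plain dot product would give the same order.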

Question 3: A corpus in Simpleland has 99 articles. If you pick one article and perform 1-nearest neighbor search to find the closest article to this query article, how many times must you compute the similarity between two articles?

Question 4: For the TF-IDF representation, does the relative importance of words in a document depend on the base of the logarithm used? For example, take the words “bus” and “wheels” in a particular document. Is the ratio between the TF-IDF values for “bus” and “wheels” different when computed using log base 2 versus log base 10?
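The effect of the logarithm base can be checked numerically. The sketch below assumes the common form tf * log(N / (1 + df)) for TF-IDF; the corpus size and document frequencies are made up, and tf_idf is a hypothetical helper. Changing the base only multiplies every value by the same constant, because log_b(x) = ln(x) / ln(b).

```python
import math

def tf_idf(term_count, num_docs, docs_with_term, base):
    """TF-IDF with idf = log_base(num_docs / (1 + docs_with_term))."""
    idf = math.log(num_docs / (1 + docs_with_term), base)
    return term_count * idf

NUM_DOCS = 99  # made-up corpus size

# Made-up counts: "bus" appears 3 times and in 9 documents,
# "wheels" appears 2 times and in 4 documents.
ratio_base2 = tf_idf(3, NUM_DOCS, 9, 2) / tf_idf(2, NUM_DOCS, 4, 2)
ratio_base10 = tf_idf(3, NUM_DOCS, 9, 10) / tf_idf(2, NUM_DOCS, 4, 10)

print("ratio with log base 2: ", ratio_base2)
print("ratio with log base 10:", ratio_base10)
print("same ratio:", math.isclose(ratio_base2, ratio_base10))
```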

Question 5: Which of the following statements are true? (Check all that apply):

  • Deciding whether an email is spam or not spam using the text of the email and some spam / not spam labels is a supervised learning problem.
  • Dividing emails into two groups based on the text of each email is a supervised learning problem.
  • If we are performing clustering, we typically assume we either do not have or do not use class labels in training the model.

Question 6: Which of the following pictures represents the best k-means solution? (Squares represent observations, plus signs are cluster centers, and colors indicate assignments of observations to cluster centers.)


Quiz 2: Retrieving Wikipedia articles

Question 1: Top word count words for Elton John

  • (the, john, singer)
  • (england, awards, musician)
  • (the, in, and)
  • (his, the, since)
  • (rock, artists, best)

Question 2: Top TF-IDF words for Elton John

  • (furnish,elton,billboard)
  • (john,elton,fivedecade)
  • (the,of,has)
  • (awards,rock,john)
  • (elton,john,singer)

Question 3: The cosine distance between the articles for ‘Elton John’ and ‘Victoria Beckham’ (represented with TF-IDF) falls within which range?

  • 0.1 to 0.29
  • 0.3 to 0.49
  • 0.5 to 0.69
  • 0.7 to 0.89

Question 4: The cosine distance between the articles for ‘Elton John’ and ‘Paul McCartney’ (represented with TF-IDF) falls within which range?

  • 0.1 to 0.29

Question 5: Who is closer to ‘Elton John’, ‘Victoria Beckham’ or ‘Paul McCartney’?

  • Victoria Beckham
  • Paul McCartney

Question 6: Who is the nearest cosine-distance neighbor to ‘Elton John’ using raw word counts?

  • Cliff Richard
  • Roger Daltrey
  • George Bush

Question 7: Who is the nearest cosine-distance neighbor to ‘Elton John’ using TF-IDF?

  • Rod Stewart
  • Elvis Presley

Question 8: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using raw word counts?

  • Stephen Dow Beckham
  • Louis Molloy
  • Adrienne Corri
  • Mary Fitzgerald (artist)

Question 9: Who is the nearest cosine-distance neighbor to ‘Victoria Beckham’ using TF-IDF?

  • Caroline Rush
  • David Beckham
  • Carrie Reichardt

Week 5: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Recommender systems

Question 1: Recommending items based on global popularity can (check all that apply):

  • provide personalization
  • capture context (e.g., time of day)

Question 2: Recommending items using a classification approach can (check all that apply):

Question 3: Recommending items using a simple count-based co-occurrence matrix can (check all that apply):

Question 4: Recommending items using featurized matrix factorization can (check all that apply):

Question 5: Normalizing co-occurrence matrices is used primarily to account for:

  • people who purchased many items
  • items purchased by many
  • eliminating rare products

Question 6: A store has 3 customers and 3 products. Below are the learned feature vectors for each user and product. Based on this estimated model, which product would you recommend most highly to User #2?

  • Product #3
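With a matrix factorization model, the predicted rating for a user and product pair is the dot product of their learned feature vectors, and the recommended product is the one with the highest score. The vectors below are made up for illustration and are not the ones shown in the quiz figure; predicted_rating and the product names are hypothetical.

```python
def predicted_rating(user_vec, product_vec):
    """Matrix factorization score: dot product of the two feature vectors."""
    return sum(u * p for u, p in zip(user_vec, product_vec))

# Made-up learned feature vectors (two latent features each).
user_vec = [0.5, 2.0]
products = {
    "product_a": [3.0, 0.5],
    "product_b": [0.0, 1.0],
    "product_c": [1.0, 2.0],
}

scores = {name: predicted_rating(user_vec, vec) for name, vec in products.items()}
best = max(scores, key=scores.get)
print(scores)
print("recommend:", best)
```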

Question 7: For the liked and recommended items displayed below, calculate the recall and round to 2 decimal places. (As in the lesson, green squares indicate recommended items, and magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)


Answer: 0.33

Question 8: For the liked and recommended items displayed below, calculate the precision and round to 2 decimal places. (As in the lesson, green squares indicate recommended items, and magenta squares are liked items. Items not recommended are grayed out for clarity.) Note: enter your answer in American decimal format (e.g. enter 0.98, not 0,98)


Answer: 0.25
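Precision and recall for a recommender follow directly from the sets of liked and recommended items. The item sets below are made up and are not read from the quiz figure; precision_recall is a hypothetical helper.

```python
def precision_recall(recommended, liked):
    """Precision: share of recommended items that are liked.
    Recall: share of liked items that were recommended."""
    recommended, liked = set(recommended), set(liked)
    hits = recommended & liked
    precision = len(hits) / len(recommended) if recommended else 0.0
    recall = len(hits) / len(liked) if liked else 0.0
    return precision, recall

# Made-up example: 5 recommended items, 4 liked items, 2 in common.
recommended = {"a", "b", "c", "d", "e"}
liked = {"a", "b", "x", "y"}

precision, recall = precision_recall(recommended, liked)
print("precision:", round(precision, 2))
print("recall:", round(recall, 2))
```

Recommending more items can only keep or raise recall, but it tends to lower precision, which is the trade-off a precision-recall curve shows.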

Question 9: Based on the precision-recall curves in the figure below, which recommender would you use?


Quiz 2: Recommending songs

Question 1: Which of the artists below have had the most unique users listening to their songs?

  • Foo Fighters
  • Taylor Swift

Question 2: Which of the artists below is the most popular artist, the one with highest total listen_count, in the data set?

  • Kings of Leon

Question 3: Which of the artists below is the least popular artist, the one with smallest total listen_count, in the data set?

  • William Tabbert
  • Velvet Underground & Nico
  • The Cool Kids
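The two notions of popularity in these questions (number of unique listeners versus total listen_count) need different aggregations. The sketch below uses plain Python on made-up rows with placeholder artist names; the course assignment does the equivalent grouping on the real song data with SFrame.

```python
from collections import defaultdict

# Made-up rows: (user_id, artist, listen_count).
listens = [
    ("u1", "artist_a", 5),
    ("u2", "artist_a", 1),
    ("u3", "artist_a", 2),
    ("u1", "artist_b", 20),
    ("u2", "artist_b", 3),
    ("u3", "artist_c", 1),
    ("u1", "artist_a", 1),  # same user again: counts once as a unique user
]

users_by_artist = defaultdict(set)
total_listens = defaultdict(int)
for user_id, artist, listen_count in listens:
    users_by_artist[artist].add(user_id)
    total_listens[artist] += listen_count

unique_users = {artist: len(users) for artist, users in users_by_artist.items()}
most_unique_users = max(unique_users, key=unique_users.get)
most_popular = max(total_listens, key=total_listens.get)
least_popular = min(total_listens, key=total_listens.get)

print("unique users:", unique_users)
print("total listens:", dict(total_listens))
print(most_unique_users, most_popular, least_popular)
```

In this toy data the artist with the most unique users is not the one with the highest total listen count, which is why the quiz asks about them separately.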

Week 6: Machine Learning Foundations: A Case Study Approach Quiz Answer

Quiz 1: Deep learning

Question 1: Which of the following statements are true? (Check all that apply)

  • Linear classifiers are never useful, because they cannot represent XOR.
  • Linear classifiers are useful, because, with enough data, they can represent anything.
  • Having good non-linear features can allow us to learn very accurate linear classifiers.

Question 2: A simple linear classifier can represent which of the following functions? (Check all that apply)

  • x1 OR x2 OR NOT x3
  • x1 AND x2 AND NOT x3
  • x1 OR (x2 AND NOT x3)
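Whether a Boolean function can be represented by a linear classifier can be tested by brute force: pick weights and a bias, then compare the thresholded score with the target on all 8 inputs. The weights below are hand-picked for this sketch (many other choices work), and make_linear_classifier is a hypothetical helper.

```python
from itertools import product

def make_linear_classifier(weights, bias):
    """Return a classifier that predicts True when the weighted score is positive."""
    def predict(x1, x2, x3):
        score = bias + weights[0] * x1 + weights[1] * x2 + weights[2] * x3
        return score > 0
    return predict

# Expression -> (hand-picked weights, bias, target Boolean function).
cases = {
    "x1 OR x2 OR NOT x3": ((1, 1, -1), 0.5, lambda a, b, c: a or b or not c),
    "x1 AND x2 AND NOT x3": ((1, 1, -1), -1.5, lambda a, b, c: a and b and not c),
    "x1 OR (x2 AND NOT x3)": ((2, 1, -1), -0.5, lambda a, b, c: a or (b and not c)),
}

results = {}
for name, (weights, bias, target) in cases.items():
    classifier = make_linear_classifier(weights, bias)
    # The classifier represents the function if it agrees on every input.
    results[name] = all(
        classifier(x1, x2, x3) == bool(target(x1, x2, x3))
        for x1, x2, x3 in product([0, 1], repeat=3)
    )
    print(name, "->", results[name])
```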

Question 3: Which of the following neural networks can represent the following function? Select all that apply.

(x1 AND x2) OR (NOT x1 AND NOT x2)
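This function (true when x1 and x2 are equal) needs a hidden layer. The sketch below is one possible network with hand-picked weights and step activations, not necessarily any of the networks pictured in the quiz: one hidden unit detects x1 AND x2, another detects NOT x1 AND NOT x2, and the output unit ORs them.

```python
def step(z):
    """Threshold activation: 1 if the input is positive, else 0."""
    return 1 if z > 0 else 0

def xnor_network(x1, x2):
    h_and = step(x1 + x2 - 1.5)        # fires only for x1 = 1 and x2 = 1
    h_nor = step(-x1 - x2 + 0.5)       # fires only for x1 = 0 and x2 = 0
    return step(h_and + h_nor - 0.5)   # OR of the two hidden units

def target(x1, x2):
    """(x1 AND x2) OR (NOT x1 AND NOT x2)."""
    return int((x1 and x2) or ((not x1) and (not x2)))

truth_table = {(x1, x2): xnor_network(x1, x2) for x1 in (0, 1) for x2 in (0, 1)}
for inputs, output in truth_table.items():
    print(inputs, "->", output, "(target:", target(*inputs), ")")
```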


Question 4: Which of the following statements is true? (Check all that apply)

  • Features in computer vision act like local detectors.
  • Deep learning has had impact in computer vision, because it’s used to combine all the different hand-created features that already exist.
  • By learning non-linear features, neural networks have allowed us to automatically learn detectors for computer vision.

Question 5: If you have lots of images of different types of plankton labeled with their species name, and lots of computational resources, what would you expect to perform better predictions:

  • a deep neural network trained on this data.
  • a simple classifier trained on this data, using deep features as input, which were trained using ImageNet data.

Question 6: If you have a few images of different types of plankton labeled with their species name, what would you expect to perform better predictions:

Quiz 2: Deep features for image retrieval

Question 1: What’s the least common category in the training data?

Question 2: Of the images below, which is the nearest ‘cat’ labeled image in the training data to the first image in the test data (image_test[0:1])?


Question 3: Of the images below, which is the nearest ‘dog’ labeled image in the training data to the first image in the test data (image_test[0:1])?


Question 4: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘cat’ in the training data?

Question 5: For the first image in the test data, in what range is the mean distance between this image and its 5 nearest neighbors that were labeled ‘dog’ in the training data?

Question 6: On average, is the first image in the test data closer to its 5 nearest neighbors in the ‘cat’ data or in the ‘dog’ data?
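The comparison asked for here comes down to averaging the distances to the k nearest neighbors in each category. The sketch below uses made-up 2-D points in place of real deep features (which have many more dimensions), k = 3 instead of 5 to keep the data small, and a hypothetical helper mean_knn_distance with Euclidean distance.

```python
import math

def mean_knn_distance(query, points, k):
    """Mean Euclidean distance from query to its k nearest points."""
    distances = sorted(math.dist(query, point) for point in points)
    nearest = distances[:k]
    return sum(nearest) / len(nearest)

# Made-up feature vectors standing in for deep features.
query_image = (0, 0)
cat_features = [(0, 1), (1, 0), (3, 4), (6, 8), (0, 2)]
dog_features = [(3, 4), (6, 8), (0, 5), (5, 12)]

cat_mean = mean_knn_distance(query_image, cat_features, k=3)
dog_mean = mean_knn_distance(query_image, dog_features, k=3)

print("mean distance to nearest cats:", cat_mean)
print("mean distance to nearest dogs:", dog_mean)
print("closer to:", "cat" if cat_mean < dog_mean else "dog")
```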

Question 7: In what range is the accuracy of the 1-nearest neighbor classifier at classifying ‘dog’ images from the test set?

Based on our knowledge, we urge you to enroll in this course so you can pick up new skills from specialists. It will be worthwhile, we trust.

