Machine Learning: Classification


Provider: Coursera

Rating: 7.2 (Coursera courses have an average rating of 7.2 across 6 reviews)


Description


When you enroll in courses through Coursera, you can choose between a free plan and a paid plan:

  • Free plan: Audit only, with no certificate. You will have access to all course materials except graded items.
  • Paid plan: Commit to earning a Certificate—it's a trusted, shareable way to showcase your new skills.

About this course: Case Studies: Analyzing Sentiment & Loan Default Prediction

In our case study on analyzing sentiment, you will create models that predict a class (positive/negative sentiment) from input features (text of the reviews, user profile information, ...). In our second case study for this course, loan default prediction, you will tackle financial data, and predict when a loan is likely to be risky or safe for the bank. These tasks are examples of classification, one of the most widely used areas of machine learning, with a broad array of applications, including ad targeting, spam detection, medical diagnosis and image classification.

In this course, you will create classifiers that provide state-of-the-art performance on a variety of tasks. You will become familiar with the most successful techniques, which are most widely used in practice, including logistic regression, decision trees and boosting. In addition, you will be able to design and implement the underlying algorithms that can learn these models at scale, using stochastic gradient ascent. You will implement these techniques on real-world, large-scale machine learning tasks. You will also address significant tasks you will face in real-world applications of ML, including handling missing data and measuring precision and recall to evaluate a classifier. This course is hands-on, action-packed, and full of visualizations and illustrations of how these techniques will behave on real data. We've also included optional content in every module, covering advanced topics for those who want to go even deeper!

Learning Objectives: By the end of this course, you will be able to:

  • Describe the input and output of a classification model.
  • Tackle both binary and multiclass classification problems.
  • Implement a logistic regression model for large-scale classification.
  • Create a non-linear model using decision trees.
  • Improve the performance of any model using boosting.
  • Scale your methods with stochastic gradient ascent.
  • Describe the underlying decision boundaries.
  • Build a classification model to predict sentiment in a product review dataset.
  • Analyze financial data to predict loan defaults.
  • Use techniques for handling missing data.
  • Evaluate your models using precision-recall metrics.
  • Implement these techniques in Python (or in the language of your choice, though Python is highly recommended).

Created by: University of Washington
  • Taught by: Carlos Guestrin, Amazon Professor of Machine Learning, Computer Science and Engineering
  • Taught by: Emily Fox, Amazon Professor of Machine Learning, Statistics
Basic Info: Course 3 of 4 in the Machine Learning Specialization
Commitment: 7 weeks of study, 5-8 hours/week
Language: English
How To Pass: Pass all graded assignments to complete the course.
User Ratings: 4.7 stars average user rating

Coursework

Each course is like an interactive textbook, featuring pre-recorded videos, quizzes and projects.

Help from your peers

Connect with thousands of other learners: debate ideas, discuss course material, and get help mastering concepts.

Certificates

Earn official recognition for your work, and share your success with friends, colleagues, and employers.

University of Washington Founded in 1861, the University of Washington is one of the oldest state-supported institutions of higher education on the West Coast and is one of the preeminent research universities in the world.

Syllabus


WEEK 1


Welcome!



Classification is one of the most widely used techniques in machine learning, with a broad array of applications, including sentiment analysis, ad targeting, spam detection, risk assessment, medical diagnosis and image classification. The core goal of classification is to predict a category or class y from some inputs x. Through this course, you will become familiar with the fundamental models and algorithms used in classification, as well as a number of core machine learning concepts. Rather than covering all aspects of classification, you will focus on a few core techniques, which are widely used in the real world to get state-of-the-art performance. By following our hands-on approach, you will implement your own algorithms on multiple real-world tasks, and deeply grasp the core techniques needed to be successful with these approaches in practice. This introduction to the course provides you with an overview of the topics we will cover and the background knowledge and resources we assume you have.


8 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Welcome to the classification course, a part of the Machine Learning Specialization
  3. Video: What is this course about?
  4. Video: Impact of classification
  5. Video: Course overview
  6. Video: Outline of first half of course
  7. Video: Outline of second half of course
  8. Video: Assumed background
  9. Video: Let's get started!
  10. Reading: Software tools you'll need


Linear Classifiers & Logistic Regression



Linear classifiers are amongst the most practical classification methods. For example, in our sentiment analysis case-study, a linear classifier associates a coefficient with the counts of each word in the sentence. In this module, you will become proficient in this type of representation. You will focus on a particularly useful type of linear classifier called logistic regression, which, in addition to allowing you to predict a class, provides a probability associated with the prediction. These probabilities are extremely useful, since they provide a degree of confidence in the predictions. In this module, you will also be able to construct features from categorical inputs, and to tackle classification problems with more than two classes (multiclass problems). You will examine the results of these techniques on a real-world product sentiment analysis task.
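
As a concrete illustration of the representation described above, here is a minimal sketch of a linear sentiment classifier with a sigmoid link function. The vocabulary, coefficient values, and review text are invented for illustration and are not from the course materials.

```python
import numpy as np

# Hypothetical vocabulary and learned coefficients (illustrative only).
vocabulary = ["awesome", "great", "terrible", "awful"]
coefficients = np.array([1.2, 0.8, -1.5, -2.0])
intercept = 0.1

def score(review):
    # Linear classifier: score(x) = w0 + sum_j w_j * (count of word j in x).
    words = review.lower().split()
    counts = np.array([words.count(w) for w in vocabulary])
    return intercept + coefficients @ counts

def predict_probability(review):
    # Logistic regression passes the score through the sigmoid link,
    # yielding P(y = +1 | x) in (0, 1).
    return 1.0 / (1.0 + np.exp(-score(review)))

p = predict_probability("awesome product, great battery, terrible packaging")
print(f"P(positive) = {p:.3f}, predicted class = {+1 if p > 0.5 else -1}")
```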


18 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Linear classifiers: A motivating example
  3. Video: Intuition behind linear classifiers
  4. Video: Decision boundaries
  5. Video: Linear classifier model
  6. Video: Effect of coefficient values on decision boundary
  7. Video: Using features of the inputs
  8. Video: Predicting class probabilities
  9. Video: Review of basics of probabilities
  10. Video: Review of basics of conditional probabilities
  11. Video: Using probabilities in classification
  12. Video: Predicting class probabilities with (generalized) linear models
  13. Video: The sigmoid (or logistic) link function
  14. Video: Logistic regression model
  15. Video: Effect of coefficient values on predicted probabilities
  16. Video: Overview of learning logistic regression models
  17. Video: Encoding categorical inputs
  18. Video: Multiclass classification with 1 versus all
  19. Video: Recap of logistic regression classifier
  20. Reading: Predicting sentiment from product reviews

Graded: Linear Classifiers & Logistic Regression
Graded: Predicting sentiment from product reviews

WEEK 2


Learning Linear Classifiers



Once familiar with linear classifiers and logistic regression, you can now dive in and write your first learning algorithm for classification. In particular, you will use gradient ascent to learn the coefficients of your classifier from data. You will first need to define the quality metric for these tasks using an approach called maximum likelihood estimation (MLE). You will also become familiar with a simple technique for selecting the step size for gradient ascent. An optional, advanced part of this module will cover the derivation of the gradient for logistic regression. You will implement your own learning algorithm for logistic regression from scratch, and use it to learn a sentiment analysis classifier.
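
A minimal sketch of the learning algorithm this module builds, assuming labels encoded as 0/1 (the course's +1/-1 encoding gives an equivalent update): gradient ascent repeatedly moves the coefficients in the direction of the log-likelihood gradient. The toy data, step size, and iteration count are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def gradient_ascent(X, y, step_size=0.1, n_iter=500):
    """X: (n, d) features with a leading column of 1s for the intercept;
    y: labels in {0, 1}. Maximizes the log-likelihood via gradient ascent."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        # dLL/dw_j = sum_i x_ij * (y_i - P(y_i = 1 | x_i, w))
        errors = y - sigmoid(X @ w)
        w += step_size * (X.T @ errors)
    return w

# Toy dataset: intercept column plus one feature.
X = np.array([[1.0, 0.5], [1.0, 2.3], [1.0, -1.0], [1.0, 3.1]])
y = np.array([0.0, 1.0, 0.0, 1.0])
print("learned coefficients:", gradient_ascent(X, y))
```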


18 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Goal: Learning parameters of logistic regression
  3. Video: Intuition behind maximum likelihood estimation
  4. Video: Data likelihood
  5. Video: Finding best linear classifier with gradient ascent
  6. Video: Review of gradient ascent
  7. Video: Learning algorithm for logistic regression
  8. Video: Example of computing derivative for logistic regression
  9. Video: Interpreting derivative for logistic regression
  10. Video: Summary of gradient ascent for logistic regression
  11. Video: Choosing step size
  12. Video: Careful with step sizes that are too large
  13. Video: Rule of thumb for choosing step size
  14. Video: (VERY OPTIONAL) Deriving gradient of logistic regression: Log trick
  15. Video: (VERY OPTIONAL) Expressing the log-likelihood
  16. Video: (VERY OPTIONAL) Deriving probability y=-1 given x
  17. Video: (VERY OPTIONAL) Rewriting the log likelihood into a simpler form
  18. Video: (VERY OPTIONAL) Deriving gradient of log likelihood
  19. Video: Recap of learning logistic regression classifiers
  20. Reading: Implementing logistic regression from scratch

Graded: Learning Linear Classifiers
Graded: Implementing logistic regression from scratch

Overfitting & Regularization in Logistic Regression



As we saw in the regression course, overfitting is perhaps the most significant challenge you will face as you apply machine learning approaches in practice. This challenge can be particularly significant for logistic regression, as you will discover in this module, since you not only risk getting an overly complex decision boundary, but your classifier can also become overly confident about the probabilities it predicts. In this module, you will investigate overfitting in classification in significant detail, and obtain broad practical insights from some interesting visualizations of the classifiers' outputs. You will then add a regularization term to your optimization to mitigate overfitting. You will investigate both L2 regularization to penalize large coefficient values, and L1 regularization to obtain additional sparsity in the coefficients. Finally, you will modify your gradient ascent algorithm to learn regularized logistic regression classifiers. You will implement your own regularized logistic regression classifier from scratch, and investigate the impact of the L2 penalty on real-world sentiment analysis data.
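
The change to the learning algorithm is small. With an L2 penalty added to the quality metric, the derivative for each coefficient picks up an extra -2 * l2_penalty * w_j term; by convention the intercept is left unpenalized. A minimal sketch, reusing the 0/1-label setup above with an illustrative penalty value:

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def regularized_gradient_ascent(X, y, l2_penalty=1.0, step_size=0.1, n_iter=500):
    """Gradient ascent on log-likelihood minus l2_penalty * ||w[1:]||^2,
    where w[0] is the (unpenalized) intercept and y has labels in {0, 1}."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        gradient = X.T @ (y - sigmoid(X @ w))
        gradient[1:] -= 2.0 * l2_penalty * w[1:]   # L2 penalty contribution
        w += step_size * gradient
    return w
```

Larger l2_penalty values shrink the learned coefficients toward zero, which also pulls the predicted probabilities away from 0 and 1, countering the overconfidence discussed above.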


13 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Evaluating a classifier
  3. Video: Review of overfitting in regression
  4. Video: Overfitting in classification
  5. Video: Visualizing overfitting with high-degree polynomial features
  6. Video: Overfitting in classifiers leads to overconfident predictions
  7. Video: Visualizing overconfident predictions
  8. Video: (OPTIONAL) Another perspective on overfitting in logistic regression
  9. Video: Penalizing large coefficients to mitigate overfitting
  10. Video: L2 regularized logistic regression
  11. Video: Visualizing effect of L2 regularization in logistic regression
  12. Video: Learning L2 regularized logistic regression with gradient ascent
  13. Video: Sparse logistic regression with L1 regularization
  14. Video: Recap of overfitting & regularization in logistic regression
  15. Reading: Logistic Regression with L2 regularization

Graded: Overfitting & Regularization in Logistic Regression
Graded: Logistic Regression with L2 regularization

WEEK 3


Decision Trees



Along with linear classifiers, decision trees are amongst the most widely used classification techniques in the real world. This method is extremely intuitive, simple to implement and provides interpretable predictions. In this module, you will become familiar with the core decision tree representation. You will then design a simple, recursive greedy algorithm to learn decision trees from data. Finally, you will extend this approach to deal with continuous inputs, a fundamental requirement for practical problems. In this module, you will investigate a brand new case-study in the financial sector: predicting the risk associated with a bank loan. You will implement your own decision tree learning algorithm on real loan data.
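
A minimal sketch of the recursive greedy algorithm described above, restricted to binary 0/1 features and labels and using classification error to choose the split. The course also covers threshold splits for continuous inputs and stopping conditions, which are omitted here; the toy data and helper names are invented.

```python
import numpy as np

def majority(y):
    # Majority-class prediction at a leaf.
    return int(np.mean(y) >= 0.5)

def error_after_split(X, y, j):
    # Classification error if we split on feature j and predict the
    # majority class in each branch.
    mistakes = 0
    for branch in (y[X[:, j] == 0], y[X[:, j] == 1]):
        if len(branch) > 0:
            mistakes += int(np.sum(branch != majority(branch)))
    return mistakes / len(y)

def build_tree(X, y, features):
    # Stop when the node is pure or no features remain; otherwise greedily
    # pick the lowest-error split and recurse on each branch.
    if len(set(y)) == 1 or not features:
        return {"leaf": majority(y)}
    best = min(features, key=lambda j: error_after_split(X, y, j))
    remaining = [f for f in features if f != best]
    tree = {"split_on": best}
    for value in (0, 1):
        mask = X[:, best] == value
        tree[value] = (build_tree(X[mask], y[mask], remaining)
                       if mask.any() else {"leaf": majority(y)})
    return tree

# Toy loan data: binary features [short_term, high_income]; 1 = safe loan.
X = np.array([[1, 1], [1, 0], [0, 1], [0, 0]])
y = np.array([1, 1, 1, 0])
print(build_tree(X, y, features=[0, 1]))
```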


13 videos, 3 readings


  1. Reading: Slides presented in this module
  2. Video: Predicting loan defaults with decision trees
  3. Video: Intuition behind decision trees
  4. Video: Task of learning decision trees from data
  5. Video: Recursive greedy algorithm
  6. Video: Learning a decision stump
  7. Video: Selecting best feature to split on
  8. Video: When to stop recursing
  9. Video: Making predictions with decision trees
  10. Video: Multiclass classification with decision trees
  11. Video: Threshold splits for continuous inputs
  12. Video: (OPTIONAL) Picking the best threshold to split on
  13. Video: Visualizing decision boundaries
  14. Video: Recap of decision trees
  15. Reading: Identifying safe loans with decision trees
  16. Reading: Implementing binary decision trees

Graded: Decision Trees
Graded: Identifying safe loans with decision trees
Graded: Implementing binary decision trees

WEEK 4


Preventing Overfitting in Decision Trees



Out of all machine learning techniques, decision trees are amongst the most prone to overfitting. No practical implementation is possible without including approaches that mitigate this challenge. In this module, through various visualizations and investigations, you will investigate why decision trees suffer from significant overfitting problems. Using the principle of Occam's razor, you will mitigate overfitting by learning simpler trees. First, you will design algorithms that stop the learning process before the decision trees become overly complex. In an optional segment, you will design a very practical approach that learns an overly-complex tree, and then simplifies it with pruning. Your implementation will investigate the effect of these techniques on mitigating overfitting on our real-world loan data set.
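
As a sketch of where these ideas land in code, the early stopping conditions can be written as guards at the top of a recursive tree-builder like the one sketched in the previous module. The threshold values below are illustrative assumptions, not values from the course.

```python
def should_stop(y, features, depth, error_reduction,
                max_depth=10, min_node_size=10, min_error_reduction=0.0):
    """Return True if tree-growing should stop at this node.
    y: labels at the node; features: features still available for splitting;
    error_reduction: error(before split) - error(after best split)."""
    if len(set(y)) <= 1 or not features:
        return True                      # nothing left to split on
    if depth >= max_depth:
        return True                      # early stopping 1: depth limit
    if len(y) <= min_node_size:
        return True                      # early stopping 2: node too small
    if error_reduction <= min_error_reduction:
        return True                      # early stopping 3: split barely helps
    return False
```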


8 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: A review of overfitting
  3. Video: Overfitting in decision trees
  4. Video: Principle of Occam's razor: Learning simpler decision trees
  5. Video: Early stopping in learning decision trees
  6. Video: (OPTIONAL) Motivating pruning
  7. Video: (OPTIONAL) Pruning decision trees to avoid overfitting
  8. Video: (OPTIONAL) Tree pruning algorithm
  9. Video: Recap of overfitting and regularization in decision trees
  10. Reading: Decision Trees in Practice

Graded: Preventing Overfitting in Decision Trees
Graded: Decision Trees in Practice

Handling Missing Data



Real-world machine learning problems are fraught with missing data. That is, very often, some of the inputs are not observed for all data points. This challenge is very significant, happens in most cases, and needs to be addressed carefully to obtain great performance. Yet this issue is rarely discussed in machine learning courses. In this module, you will tackle the missing data challenge head on. You will start with the two most basic techniques to convert a dataset with missing data into a clean dataset, namely skipping missing values and imputing missing values. In an advanced section, you will also design a modification of the decision tree learning algorithm that builds decisions about missing data right into the model. You will also explore these techniques in your real-data implementation.
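
A minimal sketch of the two basic purification strategies using pandas; the toy loan table and the imputation choices (median for a numeric column, most frequent value for a categorical one) are illustrative assumptions.

```python
import numpy as np
import pandas as pd

loans = pd.DataFrame({
    "income": [50_000, np.nan, 72_000, 41_000],
    "term": ["36mo", "60mo", None, "36mo"],
    "safe_loan": [1, -1, 1, -1],
})

# Strategy 1: purification by skipping -- drop every row with a missing value.
skipped = loans.dropna()

# Strategy 2: purification by imputing -- fill missing values instead.
imputed = loans.copy()
imputed["income"] = imputed["income"].fillna(imputed["income"].median())
imputed["term"] = imputed["term"].fillna(imputed["term"].mode()[0])

print(len(skipped), "rows after skipping;", len(imputed), "rows after imputing")
```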


6 videos, 1 reading


  1. Reading: Slides presented in this module
  2. Video: Challenge of missing data
  3. Video: Strategy 1: Purification by skipping missing data
  4. Video: Strategy 2: Purification by imputing missing data
  5. Video: Modifying decision trees to handle missing data
  6. Video: Feature split selection with missing data
  7. Video: Recap of handling missing data

Graded: Handling Missing Data

WEEK 5


Boosting



One of the most exciting theoretical questions that have been asked about machine learning is whether simple classifiers can be combined into a highly accurate ensemble. This question led to the development of boosting, one of the most important and practical techniques in machine learning today. This simple approach can boost the accuracy of any classifier, and is widely used in practice, e.g., it's used by more than half of the teams who win the Kaggle machine learning competitions. In this module, you will first define the ensemble classifier, where multiple models vote on the best prediction. You will then explore a boosting algorithm called AdaBoost, which provides a great approach for boosting classifiers. Through visualizations, you will become familiar with many of the practical aspects of this technique. You will create your very own implementation of AdaBoost, from scratch, and use it to boost the performance of your loan risk predictor on real data.
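
A minimal sketch of the bookkeeping in one AdaBoost iteration, as described in this module: the weighted error of the current weak classifier determines its coefficient in the ensemble, and the data weights are then rescaled to focus on mistakes and renormalized. Training the weak classifier itself (e.g., a decision stump on the weighted data) is omitted.

```python
import numpy as np

def adaboost_step(alpha, y_true, y_pred):
    """alpha: current data point weights (nonnegative, summing to 1);
    y_true, y_pred: labels in {-1, +1}. Returns the weak classifier's
    ensemble coefficient and the updated, renormalized data weights."""
    mistakes = y_true != y_pred
    weighted_error = np.sum(alpha[mistakes])            # assumed in (0, 1)
    w_hat = 0.5 * np.log((1.0 - weighted_error) / weighted_error)
    # Scale weights up on mistakes, down on correct predictions.
    new_alpha = alpha * np.exp(np.where(mistakes, w_hat, -w_hat))
    return w_hat, new_alpha / new_alpha.sum()

# Toy example: 4 equally weighted points, one mistake.
alpha = np.full(4, 0.25)
w_hat, alpha = adaboost_step(alpha, np.array([1, -1, 1, 1]),
                             np.array([1, -1, 1, -1]))
print(f"coefficient = {w_hat:.3f}, new weights = {np.round(alpha, 3)}")
```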


13 videos, 3 readings


  1. Reading: Slides presented in this module
  2. Video: The boosting question
  3. Video: Ensemble classifiers
  4. Video: Boosting
  5. Video: AdaBoost overview
  6. Video: Weighted error
  7. Video: Computing coefficient of each ensemble component
  8. Video: Reweighing data to focus on mistakes
  9. Video: Normalizing weights
  10. Video: Example of AdaBoost in action
  11. Video: Learning boosted decision stumps with AdaBoost
  12. Reading: Exploring Ensemble Methods
  13. Video: The Boosting Theorem
  14. Video: Overfitting in boosting
  15. Video: Ensemble methods, impact of boosting & quick recap
  16. Reading: Boosting a decision stump

Graded: Exploring Ensemble Methods
Graded: Boosting
Graded: Boosting a decision stump

WEEK 6


Precision-Recall



In many real-world settings, accuracy or error are not the best quality metrics for classification. You will explore a case-study that significantly highlights this issue: using sentiment analysis to display positive reviews on a restaurant website. Instead of accuracy, you will define two metrics: precision and recall, which are widely used in real-world applications to measure the quality of classifiers. You will explore how the probabilities output by your classifier can be used to trade off precision against recall, and dive into this spectrum, using precision-recall curves. In your hands-on implementation, you will compute these metrics with your learned classifier on real-world sentiment analysis data.
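
A minimal sketch of these two metrics and of the threshold trade-off: raising the probability threshold above 0.5 makes the classifier more conservative about predicting positive, which tends to raise precision at the cost of recall. The toy probabilities and labels are invented.

```python
import numpy as np

def precision_recall(y_true, y_pred):
    # Labels in {-1, +1}; max(..., 1) guards against dividing by zero.
    true_pos = np.sum((y_pred == 1) & (y_true == 1))
    precision = true_pos / max(np.sum(y_pred == 1), 1)  # predicted + that are +
    recall = true_pos / max(np.sum(y_true == 1), 1)     # actual + that we found
    return precision, recall

probs = np.array([0.9, 0.8, 0.65, 0.55, 0.3])  # classifier's P(y = +1 | x)
y_true = np.array([1, 1, 1, -1, -1])
for threshold in (0.5, 0.7):
    y_pred = np.where(probs > threshold, 1, -1)
    p, r = precision_recall(y_true, y_pred)
    print(f"threshold={threshold}: precision={p:.2f}, recall={r:.2f}")
```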


8 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Case-study where accuracy is not best metric for classification
  3. Video: What is good performance for a classifier?
  4. Video: Precision: Fraction of positive predictions that are actually positive
  5. Video: Recall: Fraction of positive data predicted to be positive
  6. Video: Precision-recall extremes
  7. Video: Trading off precision and recall
  8. Video: Precision-recall curve
  9. Video: Recap of precision-recall
  10. Reading: Exploring precision and recall

Graded: Precision-Recall
Graded: Exploring precision and recall

WEEK 7


Scaling to Huge Datasets & Online Learning



With the advent of the internet, the growth of social media, and the embedding of sensors in the world, the amount of data that our machine learning algorithms must handle has grown tremendously over the last decade. This effect is sometimes called "Big Data". Thus, our learning algorithms must scale to bigger and bigger datasets. In this module, you will develop a small modification of gradient ascent called stochastic gradient, which provides significant speedups in the running time of our algorithms. This simple change can drastically improve scaling, but makes the algorithm less stable and harder to use in practice. In this module, you will investigate the practical techniques needed to make stochastic gradient viable, and thus obtain learning algorithms that scale to huge datasets. You will also address a new kind of machine learning problem, online learning, where the data streams in over time, and we must learn the coefficients as the data arrives. This task can also be solved with stochastic gradient. You will implement your very own stochastic gradient ascent algorithm for logistic regression from scratch, and evaluate it on sentiment analysis data.
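
A minimal sketch of the step from full gradient ascent to stochastic gradient ascent for logistic regression: shuffle the data, then update the coefficients from one data point at a time instead of summing the gradient over the entire dataset. Labels are assumed in {0, 1}; the step size, number of passes, and seed are illustrative.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def stochastic_gradient_ascent(X, y, step_size=0.05, n_passes=10, seed=0):
    """One coefficient update per data point per pass over the data."""
    rng = np.random.default_rng(seed)
    w = np.zeros(X.shape[1])
    for _ in range(n_passes):
        for i in rng.permutation(len(y)):   # shuffle before each pass
            # Gradient contribution of the single data point i.
            w += step_size * X[i] * (y[i] - sigmoid(X[i] @ w))
    return w
```

In the online learning setting this module introduces, the same inner update is simply applied to each data point as it arrives, with no stored dataset to shuffle.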


16 videos, 2 readings


  1. Reading: Slides presented in this module
  2. Video: Gradient ascent won't scale to today's huge datasets
  3. Video: Timeline of scalable machine learning & stochastic gradient
  4. Video: Why gradient ascent won't scale
  5. Video: Stochastic gradient: Learning one data point at a time
  6. Video: Comparing gradient to stochastic gradient
  7. Video: Why would stochastic gradient ever work?
  8. Video: Convergence paths
  9. Video: Shuffle data before running stochastic gradient
  10. Video: Choosing step size
  11. Video: Don't trust last coefficients
  12. Video: (OPTIONAL) Learning from batches of data
  13. Video: (OPTIONAL) Measuring convergence
  14. Video: (OPTIONAL) Adding regularization
  15. Video: The online learning task
  16. Video: Using stochastic gradient for online learning
  17. Video: Scaling to huge datasets through parallelization & module recap
  18. Reading: Training Logistic Regression via Stochastic Gradient Ascent

Graded: Scaling to Huge Datasets & Online Learning
Graded: Training Logistic Regression via Stochastic Gradient Ascent
