Classification: ROC and AUC


The previous section presented a set of model metrics, all calculated at a single classification threshold value. But if you want to evaluate a model's quality across all possible thresholds, you need different tools.

Receiver-operating characteristic curve (ROC)

The ROC curve is a visual representation of model performance across all thresholds. The long version of the name, receiver operating characteristic, is a holdover from WWII radar detection.

The ROC curve is drawn by calculating the true positive rate (TPR) and false positive rate (FPR) at every possible threshold (in practice, at selected intervals), then graphing TPR over FPR. A perfect model, which at some threshold has a TPR of 1.0 and an FPR of 0.0, can be represented by either a point at (0, 1) if all other thresholds are ignored, or by the following:

[Figure: ROC curve of a hypothetical perfect model]
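To make the construction concrete, here is a minimal sketch of the threshold sweep in Python. It assumes NumPy arrays of 0/1 labels and predicted scores; the function and variable names are illustrative, not part of this course.

```python
import numpy as np

def roc_points(y_true, y_score):
    """Compute (FPR, TPR) pairs by sweeping the decision threshold."""
    thresholds = np.unique(y_score)[::-1]  # sweep from high to low
    n_pos = np.sum(y_true == 1)
    n_neg = np.sum(y_true == 0)
    fpr, tpr = [], []
    for t in thresholds:
        predicted_pos = y_score >= t
        tp = np.sum(predicted_pos & (y_true == 1))  # true positives at t
        fp = np.sum(predicted_pos & (y_true == 0))  # false positives at t
        tpr.append(tp / n_pos)
        fpr.append(fp / n_neg)
    return np.array(fpr), np.array(tpr)
```

Plotting the returned TPR values against the FPR values traces the ROC curve; scikit-learn provides the same computation as sklearn.metrics.roc_curve.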

Area under the curve (AUC)

The area under the ROC curve (AUC) represents the probability that the model, if given a randomly chosen positive and negative example, will rank the positive higher than the negative.

The perfect model above, containing a square with sides of length 1, has an area under the curve (AUC) of 1.0. This means there is a 100% probability that the model will correctly rank a randomly chosen positive example higher than a randomly chosen negative example. In other words, looking at the spread of data points below, AUC gives the probability that the model will place a randomly chosen square to the right of a randomly chosen circle, independent of where the threshold is set.

[Figure: positive (square) and negative (circle) examples spread along the model's score axis]
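This ranking interpretation can be checked directly by sampling positive-negative pairs and counting how often the positive outscores the negative. The sketch below is an illustrative Monte Carlo estimate (NumPy assumed), not the exact geometric computation:

```python
import numpy as np

def auc_by_ranking(y_true, y_score, n_pairs=100_000, seed=0):
    """Estimate AUC as P(random positive scores higher than random negative)."""
    rng = np.random.default_rng(seed)
    pos = rng.choice(y_score[y_true == 1], n_pairs)
    neg = rng.choice(y_score[y_true == 0], n_pairs)
    # Ties count as half, matching the usual convention.
    return np.mean((pos > neg) + 0.5 * (pos == neg))
```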

In more concrete terms, a spam classifier with AUC of 1.0 always assigns a random spam email a higher probability of being spam than a random legitimate email. The actual classification of each email depends on the threshold that you choose.

For a binary classifier, a model that does exactly as well as random guesses or coin flips has a ROC that is a diagonal line from (0,0) to (1,1). The AUC is 0.5, representing a 50% probability of correctly ranking a random positive and negative example.

In the spam classifier example, a spam classifier with AUC of 0.5 assigns a random spam email a higher probability of being spam than a random legitimate email only half the time.

[Figure: diagonal ROC curve of a model with AUC of 0.5]
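You can confirm this baseline empirically: scores that carry no information about the labels produce an AUC near 0.5. A quick sanity check, assuming scikit-learn is available:

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=100_000)  # random 0/1 labels
y_score = rng.random(100_000)              # scores unrelated to the labels

print(roc_auc_score(y_true, y_score))      # ~0.5: chance-level ranking
```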

(Optional, advanced) Precision-recall curve

AUC and ROC work well for comparing models when the dataset is roughly balanced between classes. When the dataset is imbalanced, precision-recall curves (PRCs) and the area under those curves may offer a better comparative visualization of model performance. Precision-recall curves are created by plotting precision on the y-axis and recall on the x-axis across all thresholds.

[Figure: precision-recall curve plotted across all thresholds]
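As an illustration (scikit-learn assumed; the imbalanced data below is synthetic, invented for the example), a precision-recall curve and its summary area can be computed like this:

```python
import numpy as np
from sklearn.metrics import average_precision_score, precision_recall_curve

rng = np.random.default_rng(0)
# Imbalanced synthetic data: roughly 5% positives.
y_true = (rng.random(10_000) < 0.05).astype(int)
y_score = np.clip(rng.normal(0.3 + 0.3 * y_true, 0.15), 0.0, 1.0)

precision, recall, thresholds = precision_recall_curve(y_true, y_score)
print(f"Area under the PR curve (average precision): "
      f"{average_precision_score(y_true, y_score):.3f}")
```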

AUC and ROC for choosing model and threshold

AUC is a useful measure for comparing the performance of two different models, as long as the dataset is roughly balanced. (See Precision-recall curve, above, for imbalanced datasets.) The model with greater area under the curve is generally the better one.

[Figures: ROC curves and AUC values of two models being compared]
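For example, here is a minimal AUC comparison of two models on a shared validation split (scikit-learn assumed; the models and data are illustrative):

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeClassifier

X, y = make_classification(n_samples=2_000, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(X, y, random_state=0)

for name, model in [("logistic regression", LogisticRegression(max_iter=1_000)),
                    ("decision tree", DecisionTreeClassifier(max_depth=3))]:
    model.fit(X_train, y_train)
    auc = roc_auc_score(y_val, model.predict_proba(X_val)[:, 1])
    print(f"{name}: AUC = {auc:.3f}")
```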

The points on a ROC curve closest to (0,1) represent a range of the best-performing thresholds for the given model. As discussed in the Thresholds, Confusion matrix, and Choice of metric and tradeoffs sections, the threshold you choose depends on which metric is most important to the specific use case. Consider the points A, B, and C in the following diagram, each representing a threshold:

[Figure: ROC curve with three thresholds marked as points A, B, and C]

If false positives (false alarms) are highly costly, it may make sense to choose a threshold that gives a lower FPR, like the one at point A, even if TPR is reduced. Conversely, if false positives are cheap and false negatives (missed true positives) highly costly, the threshold for point C, which maximizes TPR, may be preferable. If the costs are roughly equivalent, point B may offer the best balance between TPR and FPR.
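One way to formalize this tradeoff is to weight false positives and false negatives by their costs and pick the threshold that minimizes the total. The sketch below assumes scikit-learn; the cost values are placeholders you would set from your use case, and a fuller treatment would also weight by class prevalence.

```python
import numpy as np
from sklearn.metrics import roc_curve

def pick_threshold(y_true, y_score, cost_fp=1.0, cost_fn=1.0):
    """Choose the threshold minimizing a simple weighted FP/FN cost."""
    fpr, tpr, thresholds = roc_curve(y_true, y_score)
    # (1 - tpr) is the false negative rate at each threshold.
    cost = cost_fp * fpr + cost_fn * (1.0 - tpr)
    return thresholds[np.argmin(cost)]
```

Raising cost_fp relative to cost_fn pushes the chosen threshold toward a point like A; raising cost_fn pushes it toward C; roughly equal costs land near B.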

Here is the ROC curve for the data we have seen before:

[Figure: ROC curve for the example dataset from earlier sections]

Exercise: Check your understanding

In practice, ROC curves are much less regular than the illustrations given above. Which of the following models, represented by their ROC curve and AUC, has the best performance?

[Four candidate ROC curves, each labeled with its AUC]

The correct choice is the model with the highest AUC, which corresponds to the best performance.

Which of the following models performs worse than chance?

[ROC curve dipping below the diagonal: AUC below 0.5]

This model has an AUC lower than 0.5, which means it performs worse than chance.

[ROC curve slightly above the diagonal]

This model performs slightly better than chance.

[Diagonal ROC curve: AUC of 0.5]

This model performs the same as chance.

[ROC curve hugging the top-left corner: AUC of 1.0]

This is a hypothetical perfect classifier.

(Optional, advanced) Bonus question

Which of the following changes can be made to the worse-than-chance model in the previous question to cause it to perform better than chance?

Reverse the predictions, so predictions of 1 become 0, and predictions of 0 become 1.

If a binary classifier reliably puts examples in the wrong classes more often than chance, switching the class labels immediately makes its predictions better than chance without having to retrain the model (see the sketch after this question).

Have it always predict the negative class.

This may or may not improve performance above chance. Also, as discussed in the Accuracy section, this isn't a useful model.

Have it always predict the positive class.

This may or may not improve performance above chance. Also, as discussed in the Accuracy section, this isn't a useful model.
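A quick way to see why reversing predictions works: negating the scores mirrors the ROC curve across the diagonal, so AUC becomes 1 - AUC. An illustrative check (scikit-learn assumed; the deliberately backwards scorer is invented for the example):

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)
y_true = rng.integers(0, 2, size=1_000)
# A deliberately backwards scorer: negatives get the higher scores.
y_score = np.where(y_true == 1,
                   rng.uniform(0.0, 0.5, size=1_000),
                   rng.uniform(0.4, 1.0, size=1_000))

print(roc_auc_score(y_true, y_score))        # well below 0.5
print(roc_auc_score(y_true, 1.0 - y_score))  # mirrored: 1 - AUC, above 0.5
```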

Imagine a situation where it's better to allow some spam to reach the inbox than to send a business-critical email to the spam folder. You've trained a spam classifier for this situation where the positive class is spam and the negative class is not-spam. Which of the following points on the ROC curve for your classifier is preferable?

[Figure: ROC curve for the spam classifier with points A, B, and C marked]

Point A

In this use case, it's better to minimize false positives, even if true positives also decrease.

Point B

This threshold balances true and false positives.

Point C

This threshold maximizes true positives (flags more spam) at a cost of more false positives (more legitimate emails flagged as spam).

Except as otherwise noted, the content of this page is licensed under the Creative Commons Attribution 4.0 License, and code samples are licensed under the Apache 2.0 License. For details, see the Google Developers Site Policies. Java is a registered trademark of Oracle and/or its affiliates.

Last updated 2024-09-03 UTC.
