Machine Learning with Spark
and Cassandra - Testing
Tests for Binary Classification Models,
Regression Models,
And Multi-class Classification Models
Series
Machine Learning with Spark and
Cassandra
● Environment Setup
● Data Pre-processing
● Testing
● Validation
● Model Selection Tests
How do we test machine
learning models?
● Tests are a statistical measure of how well our models
work.
● Calculated by running a model on held out data with known
properties and comparing model predictions to known
labels
● Works differently for different types of ML models
● An attempt to capture the potential performance on data the
model will see in day to day operation
When do we test?
On what data?
When to test?
● Whenever we have a trained model, we can start testing. Depending on the results and on how far along
we are, a test can send us on to the next steps or back to earlier ones.
○ Sometimes we go back to tune the parameters of our model.
○ Sometimes we may want to pick a new algorithm to train altogether.
○ Other times we move forwards to more complex testing strategies or onwards to deployment.
● The same calculations for test statistics can also be a part of the mathematical process for training
our model
What data to test on.
● Should always test on held out data, never the same data that was
used to train the model.
○ ML algorithms often involve optimizing test statistics on the
training dataset, so scores on the training set are optimistic. Testing on
the training set tells us little about how the model generalizes to real data.
● There exist multiple methods for choosing data to be held out; the
choice should always be made randomly.
○ Simplest method is to split data into two random chunks, train on one and
then test on the other
○ Can also split into three chunks, one for training, one for testing, one for
final validation
○ More complex schemes exist, to be covered next time in talk on validation
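The simplest scheme above, a random two-way split, can be sketched in plain Python. This is a minimal illustration rather than the code from the talk: the function name `train_test_split`, the 20% test fraction, and the fixed seed are all assumptions made for the example.

```python
import random

def train_test_split(rows, test_fraction=0.2, seed=42):
    """Shuffle a copy of rows and split it into (train, test) lists."""
    shuffled = list(rows)                  # copy so the caller's data is untouched
    random.Random(seed).shuffle(shuffled)  # seeded shuffle for reproducibility
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]

# Hypothetical dataset: 100 row ids standing in for real examples.
rows = list(range(100))
train, test = train_test_split(rows, test_fraction=0.2)
print(len(train), len(test))
```

On a Spark DataFrame, `DataFrame.randomSplit(weights, seed)` plays a similar role; passing three weights gives the train / test / validation split mentioned above.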
Binary Classification Tests
● Binary classifiers predict a boolean value. Sometimes the focus is on the presence
or absence of a particular thing, other times on picking between two categories.
● In order to test our binary classification models we use something called a confusion matrix. It
categorizes our predictions based on what value we predicted and what the actual value is, giving
four counts: true positives, false positives, true negatives, and false negatives.
● We use these values to compute more meaningful metrics.
● The most commonly used is accuracy. Accuracy is computed as
correct predictions divided by all predictions. It's a general measure
of how likely we are to correctly predict a given example.
● Recall is computed as the number of correctly identified positive
values divided by the number of actual positive values. It measures
how well our model detects the presence of positive values.
● Precision is calculated as the number of correctly identified positive
values divided by the number of positive predictions. It measures the
reliability of the positive prediction.
● We can use recall and precision to calculate a composite value, the
F1 score, which is their harmonic mean. If either recall or precision is
low, the F1 score will also be small. It emphasizes the importance of the
incorrect predictions.
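The metrics above can be worked through on a small example. The sketch below is plain Python, not the demo code from the talk. The helper names and the ten labels are made up for illustration, and returning 0.0 when a denominator is zero is an assumption of this example.

```python
def confusion_counts(actual, predicted):
    """Count true/false positives and negatives for boolean labels."""
    tp = fp = tn = fn = 0
    for a, p in zip(actual, predicted):
        if p and a:
            tp += 1      # predicted positive, actually positive
        elif p and not a:
            fp += 1      # predicted positive, actually negative
        elif a:
            fn += 1      # predicted negative, actually positive
        else:
            tn += 1      # predicted negative, actually negative
    return tp, fp, tn, fn

def binary_metrics(tp, fp, tn, fn):
    """Accuracy, recall, precision and F1 from confusion-matrix counts."""
    total = tp + fp + tn + fn
    accuracy = (tp + tn) / total if total else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # F1 is the harmonic mean of precision and recall.
    f1 = (2 * precision * recall / (precision + recall)
          if (precision + recall) else 0.0)
    return {"accuracy": accuracy, "recall": recall,
            "precision": precision, "f1": f1}

# Hypothetical held-out labels and model predictions.
actual    = [True, True, True, True, False, False, False, False, False, False]
predicted = [True, True, True, False, True, False, False, False, False, False]

counts = confusion_counts(actual, predicted)   # (tp, fp, tn, fn)
metrics = binary_metrics(*counts)
print(counts, metrics)
```

Here 3 of the 4 actual positives are found and 3 of the 4 positive predictions are right, so recall, precision, and F1 all come out to 0.75 while accuracy is 0.8.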
Test Error for Regression Models
● Regression models estimate functions, and produce predictions in the form of scalar values.
Classification tests do not work for them. Instead we use the difference between predicted and
actual values as a simple error metric.
● Adding error values without extra processing is a bad idea since
errors in different directions can cancel out.
● Instead we use metrics like the sum of squared error (SSE), a simple
measure that captures error over the entire test set.
● We can also use mean squared error (MSE), which in some cases is
better since it is independent of the number of examples in the test
set.
● Root mean squared error (RMSE) is sometimes preferable since it is
returned in the same units as our predictions rather than units
squared, but still maintains many of the statistical features of the
MSE.
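As a rough illustration of the three squared error measures, here is a short plain-Python sketch. The function name and the four data points are assumptions chosen so that the raw errors cancel to zero while the squared measures do not.

```python
import math

def squared_error_metrics(actual, predicted):
    """Return (SSE, MSE, RMSE) for paired actual and predicted values."""
    errors = [p - a for a, p in zip(actual, predicted)]
    sse = sum(e * e for e in errors)   # sum of squared errors
    mse = sse / len(errors)            # independent of test set size
    rmse = math.sqrt(mse)              # back in the units of the predictions
    return sse, mse, rmse

# Hypothetical test set: one prediction is 1 too low, one is 1 too high.
actual    = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 6.0, 2.0, 7.0]

raw_sum = sum(p - a for a, p in zip(actual, predicted))  # errors cancel out
sse, mse, rmse = squared_error_metrics(actual, predicted)
print(raw_sum, sse, mse, rmse)
```

The raw errors sum to 0.0 even though two predictions are wrong, while SSE is 2.0 and MSE is 0.5, which is why the errors are squared before being combined.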
● We sometimes prefer absolute error measures to squared error measures, which we calculate by
taking the absolute value of each error rather than squaring it.
● Large error values and therefore outliers are emphasized more by squared error measures.
● The absolute value function is not differentiable at zero, which makes it more difficult to calculate gradients.
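To see how much more a squared error measure reacts to an outlier, the sketch below compares mean absolute error with mean squared error on two sets of predictions that differ in a single value. The function names and numbers are invented for the example.

```python
def mean_absolute_error(actual, predicted):
    """Average of |predicted - actual| over the test set."""
    return sum(abs(p - a) for a, p in zip(actual, predicted)) / len(actual)

def mean_squared_error(actual, predicted):
    """Average of (predicted - actual)^2 over the test set."""
    return sum((p - a) ** 2 for a, p in zip(actual, predicted)) / len(actual)

# Hypothetical data: identical except for the last prediction.
actual       = [10.0, 10.0, 10.0, 10.0, 10.0]
clean        = [11.0,  9.0, 11.0,  9.0, 10.0]   # errors of at most 1
with_outlier = [11.0,  9.0, 11.0,  9.0, 20.0]   # one error of 10

mae_clean, mae_out = (mean_absolute_error(actual, clean),
                      mean_absolute_error(actual, with_outlier))
mse_clean, mse_out = (mean_squared_error(actual, clean),
                      mean_squared_error(actual, with_outlier))
print(mae_clean, mae_out, mse_clean, mse_out)
```

With these numbers the single outlier raises the absolute measure from 0.8 to 2.8, but raises the squared measure from 0.8 to 20.8.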
Confusion Matrices for Multiclass
Classification Models
● Multiclass classifiers predict a value that can take more than two, but still finitely many, possible values.
● We test them by building confusion matrices, similar to binary classification, but these cannot be
turned directly into test metrics.
● We build an n by n grid, where n is the number of possible classes and place each test result into its
cell based on what was predicted and the actual class of the example.
● We can then turn that into n individual matrices, one for each class. We treat correct predictions
on a particular class as true positives, and then all other predictions are classified based on their
relation to the class that the matrix is for.
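The n by n grid and its reduction to one matrix per class can be sketched as follows. This is an illustrative plain-Python version with made-up class names and labels. Putting the actual class on the rows and the predicted class on the columns is a convention chosen for this example.

```python
def confusion_matrix(actual, predicted, classes):
    """Build an n-by-n grid: rows are actual classes, columns are predicted."""
    index = {c: i for i, c in enumerate(classes)}
    n = len(classes)
    matrix = [[0] * n for _ in range(n)]
    for a, p in zip(actual, predicted):
        matrix[index[a]][index[p]] += 1
    return matrix

def one_vs_rest(matrix, k):
    """Collapse the grid into (tp, fp, tn, fn) counts for class index k."""
    total = sum(sum(row) for row in matrix)
    tp = matrix[k][k]                             # predicted k, actually k
    fn = sum(matrix[k]) - tp                      # actually k, predicted other
    fp = sum(row[k] for row in matrix) - tp       # predicted k, actually other
    tn = total - tp - fn - fp                     # everything not involving k
    return tp, fp, tn, fn

# Hypothetical three-class test set.
classes   = ["cat", "dog", "bird"]
actual    = ["cat", "cat", "cat", "dog", "dog", "bird", "bird", "bird"]
predicted = ["cat", "cat", "dog", "dog", "bird", "bird", "bird", "cat"]

matrix = confusion_matrix(actual, predicted, classes)
per_class = {c: one_vs_rest(matrix, i) for i, c in enumerate(classes)}
print(matrix)
print(per_class)
```

Each tuple in `per_class` is a binary confusion matrix for one class, so the binary metrics (accuracy, recall, precision, F1) can be computed from it directly.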
● From these new matrices, we can calculate our test metrics for each class. We can then combine
these values in various ways based on what is important for our application.
● We average scores together, and there are two common ways to do so. We can compute the metric for each
class and average with every class weighted equally (called macro-average), or we can pool the true
positive, false positive, and false negative counts across all classes before computing the metric, so that
every example counts equally and larger classes carry more weight (called micro-average).
● Macro-average can act as a general score, though it may obscure very high or low performance on
particular classes. If performance on the largest classes matters most we may choose to
micro-average, and if a particular class is important we can look at its individual test scores.
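The difference between the two averages shows up most clearly on imbalanced classes. The sketch below assumes the usual definitions: macro-average is the unweighted mean of the per-class scores, and micro-average computes the score once from counts pooled over all classes. The class names and counts are hypothetical, and precision is used as the example metric.

```python
def precision(tp, fp):
    """Precision from counts; 0.0 when there are no positive predictions."""
    return tp / (tp + fp) if (tp + fp) else 0.0

def macro_precision(counts):
    """Unweighted mean of per-class precision: every class counts equally."""
    scores = [precision(tp, fp) for tp, fp, fn in counts.values()]
    return sum(scores) / len(scores)

def micro_precision(counts):
    """Precision from pooled counts: every example counts equally."""
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    return precision(tp, fp)

# Hypothetical per-class (tp, fp, fn) counts for a large and a small class.
counts = {
    "common": (90, 5, 5),   # 95 actual examples, mostly predicted correctly
    "rare":   (5, 5, 5),    # 10 actual examples, half predicted correctly
}

macro = macro_precision(counts)
micro = micro_precision(counts)
print(round(macro, 4), round(micro, 4))
```

The weak result on the small class pulls the macro-average down to about 0.72, while the micro-average stays near 0.90 because the large class dominates the pooled counts.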
Demo
Any Questions?
Strategy: Scalable Fast Data
Architecture: Cassandra, Spark, Kafka
Engineering: Node, Python, JVM,CLR
Operations: Cloud, Container
Rescue: Downtime!! I need help.
 www.anant.us | solutions@anant.us | (855) 262-6826
3 Washington Circle, NW | Suite 301 | Washington, DC 20037

