2018 Catalytics, LLC - Proprietary and Confidential
Analyzing Breast Cancer Dataset with Azure Machine Learning (ML) Studio
Frank Mendoza
CEO, Catalytics
Chicago Technology for Value-Based Healthcare Meetup
January 23, 2018
Breast Cancer Wisconsin (Diagnostic) Dataset Description
• Total of 569 records in the dataset, donated in 1995
• 30 distinct numerical attributes (or features) associated with each record
• No categorical features available within the dataset
Location: https://archive.ics.uci.edu/ml/datasets/Breast+Cancer+Wisconsin+(Diagnostic)
Breast Cancer Wisconsin (Diagnostic) Dataset Description, cont.
• Column identified as “Diagnosis” is the dataset label
• M = malignant (200+ records)
• B = benign (300+ records)
[Figure: Example of Measurements]
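Core Steps to Build Predictive Models Using Machine Learning
[Figure: process diagram of the core steps; the only label recoverable from the extraction is step 5(a), Test API]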
Acquire Data & Prepare
• Dataset did not have any missing values
• Manipulation (normalization, etc.) was still required to ensure the training process would be successful
• Split data into three sets: one to train the model and two to test it
• Training = 311 records (~55%)
• Testing set 1 = 208 records (~37%)
• Testing set 2 = 50 records (~9%), held back to test the model after the API was created (step 5(a))
• Training set and Testing set 1 were uploaded to Azure Machine Learning (ML) Studio
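The split and normalization described above can be sketched in plain Python. This is an illustrative sketch only, not the Azure ML Studio workflow used in the talk: the function names, the random seed, and the choice of min-max scaling are assumptions (the deck says only "normalization, etc."). The record counts come from the slide.

```python
import random

def three_way_split(n_records, n_train, n_test1, seed=0):
    """Shuffle record indices and cut them into train / test 1 / test 2."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)  # seed is an arbitrary assumption
    train = indices[:n_train]
    test1 = indices[n_train:n_train + n_test1]
    test2 = indices[n_train + n_test1:]  # whatever remains is held back
    return train, test1, test2

def min_max_normalize(values):
    """Rescale one feature column to the range [0, 1] (assumed method)."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.0 for _ in values]  # constant column carries no information
    return [(v - lo) / (hi - lo) for v in values]

# Counts from the slide: 569 records, 311 for training, 208 for testing set 1.
train, test1, test2 = three_way_split(569, 311, 208)
shares = [round(100 * len(part) / 569, 1) for part in (train, test1, test2)]
```

Here `shares` holds each set's percentage of the 569 records, and the 50 records left in `test2` play the role of the set reserved for testing the deployed API.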
Training Predictive Model
Choosing algorithms
• Because the label has two classes (Benign vs. Malignant), a classification model was required
• Multiple models were developed to identify the best algorithm to use
• Two-Class Logistic Regression
• Two-Class Support Vector Machine
• Two-Class Boosted Decision Tree
• Two-Class Neural Network (WINNER)
Optimizing Neural Network Model
• Feature Selection: identify which attributes matter
[Figure: attributes ranked from Important to Less Important]
Feature Selection, continued
• Azure ML contains a module called “Permutation Feature Importance” that tests features to identify their importance
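The idea behind permutation feature importance can be sketched as follows: shuffle one feature column at a time and measure how much the model's accuracy drops. This is a minimal sketch of the general technique, not the Azure ML module's implementation. The toy data, the toy model, and all function names are hypothetical.

```python
import random

def accuracy(model, rows, labels):
    """Fraction of rows whose predicted label matches the true label."""
    correct = sum(1 for row, y in zip(rows, labels) if model(row) == y)
    return correct / len(rows)

def permutation_importance(model, rows, labels, n_features, seed=0):
    """Return the accuracy drop observed when each feature column is shuffled."""
    rng = random.Random(seed)
    baseline = accuracy(model, rows, labels)
    drops = []
    for j in range(n_features):
        column = [row[j] for row in rows]
        rng.shuffle(column)  # break the link between feature j and the label
        shuffled = [row[:j] + [v] + row[j + 1:] for row, v in zip(rows, column)]
        drops.append(baseline - accuracy(model, shuffled, labels))
    return drops

# Toy data (hypothetical, not from the deck): feature 0 decides the label,
# feature 1 is pure noise.
data_rng = random.Random(42)
rows = [[data_rng.random(), data_rng.random()] for _ in range(200)]
labels = ["M" if row[0] > 0.5 else "B" for row in rows]

def toy_model(row):
    """Stand-in for a trained classifier; it looks only at feature 0."""
    return "M" if row[0] > 0.5 else "B"

importances = permutation_importance(toy_model, rows, labels, n_features=2)
```

Shuffling the noise column leaves accuracy unchanged, so its importance is zero, while shuffling feature 0 costs the toy model a large share of its accuracy. Features with near-zero importance are candidates for removal.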
Cross Validation
• Azure ML contains a module called “Cross Validate Model” that evaluates a model by partitioning the data; it is used to check how the model will perform against unseen/new data
• 10 folds were used
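A 10-fold partition can be sketched in a few lines. This is a generic k-fold sketch, not the Azure ML module itself. The function name and seed are hypothetical, and using 519 records (the 311 + 208 uploaded to Azure ML Studio) is an assumption, since the deck does not say which records were cross-validated.

```python
import random

def k_fold_indices(n_records, k=10, seed=0):
    """Yield (train_idx, test_idx) pairs so each record is held out exactly once."""
    indices = list(range(n_records))
    random.Random(seed).shuffle(indices)
    # Spread the remainder so fold sizes differ by at most one record.
    base, extra = divmod(n_records, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test_idx = indices[start:start + size]
        train_idx = indices[:start] + indices[start + size:]
        start += size
        yield train_idx, test_idx

# Assumption: cross-validate the 519 records uploaded to Azure ML Studio.
folds = list(k_fold_indices(519, k=10))
fold_sizes = [len(test_idx) for _, test_idx in folds]
```

In each round the model would be trained on `train_idx` and scored on `test_idx`; averaging the ten scores gives an estimate of performance on data the model has not seen.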
Neural Network Classification Model, Optimized
• Feature selection allowed us to remove 14 attributes that did not contribute to improving the model
• Accuracy improved from 0.976 to 0.981
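To put the accuracy change in scale, the arithmetic below assumes the two figures were measured on the 208-record Testing set 1. The deck does not state which set was used, so this is an illustration under that assumption rather than a reported result. On 208 records, the two accuracies are consistent with 203 and 204 correct predictions.

```python
def accuracy(correct, total):
    """Accuracy = correct predictions / total predictions."""
    return correct / total

TEST_SET_SIZE = 208  # assumption: Testing set 1 from the deck

before = round(accuracy(203, TEST_SET_SIZE), 3)  # hypothetical count of correct predictions
after = round(accuracy(204, TEST_SET_SIZE), 3)   # one more correct prediction
remaining_features = 30 - 14                     # attributes kept after feature selection
```

Under that assumption the optimized model uses 16 of the 30 attributes and gains about one additional correct prediction, so the main benefit of feature selection here may be a simpler model rather than a large accuracy jump.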
AZURE ML DEMONSTRATION
AZURE ML API / EXCEL DEMONSTRATION
Frank Mendoza, CEO & Chief Catalyst
900 E. Pecan St, Suite 300-286 Pflugerville, TX 78660-8048
Phone: +1 (512) 767-8604
Fax: +1 (737) 703-5478
Email: Frank@CatalyticsConsulting.com
linkedin.com/in/fxmendoza
Twitter: @DataDrivenMind
Appendix
Attribute Information
1) ID number
2) Diagnosis (M = malignant, B = benign)
3-32) Ten real-valued features are computed for each cell nucleus:
a) radius (mean of distances from center to points on the perimeter)
b) texture (standard deviation of gray-scale values)
c) perimeter
d) area
e) smoothness (local variation in radius lengths)
f) compactness (perimeter^2 / area - 1.0)
g) concavity (severity of concave portions of the contour)
h) concave points (number of concave portions of the contour)
i) symmetry
j) fractal dimension ("coastline approximation" - 1)
Location: http://citeseerx.ist.psu.edu/viewdoc/download?doi=10.1.1.56.707&rep=rep1&type=pdf
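The step from ten base features to the 30 attributes in columns 3-32 can be sketched as follows. According to the UCI dataset description, each base feature is reported three times per image: as its mean, its standard error, and its "worst" (largest) value. The column-name strings and function names below are hypothetical. The compactness function follows the formula given in the list above.

```python
import math

BASE_FEATURES = [
    "radius", "texture", "perimeter", "area", "smoothness",
    "compactness", "concavity", "concave points", "symmetry",
    "fractal dimension",
]
STATISTICS = ["mean", "se", "worst"]  # hypothetical suffixes for the three statistics

def column_names():
    """ID and diagnosis, then ten means, ten standard errors, ten worst values."""
    return ["id", "diagnosis"] + [
        f"{feature}_{stat}" for stat in STATISTICS for feature in BASE_FEATURES
    ]

def compactness(perimeter, area):
    """Compactness as defined in the attribute list: perimeter^2 / area - 1.0."""
    return perimeter ** 2 / area - 1.0

columns = column_names()

# Worked example: a perfect circle of radius 1 gives 4*pi - 1.
circle = compactness(2 * math.pi * 1.0, math.pi * 1.0 ** 2)
```

A circle minimizes perimeter for a given area, so its compactness of 4π − 1 (about 11.57) is the smallest value the formula can produce; more irregular outlines score higher.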
