Introduction to Augustus: Overview
Open Data Group
September 17, 2009

Website and Community
Augustus is an open-source scoring engine for statistical and data mining models based on the Predictive Model Markup Language (PMML). It is written in Python and is freely available.
http://augustus.googlecode.com

Getting Augustus
Source
Documentation and Community
Using Augustus

Development and Use Cycle
1. Data Inputs
2. Model schema

Running Augustus
3. Obtain new model with Producer
4. Score with Consumer

Work Flows
Components
Producers and Consumers
Post Processing
Segments
Result of Scoring
Case Study: Auto
Work Flow Overview

Auto: Weighted Batch
Using the Baseline for Training:

$ cd WeightedBatch
`-- scripts
    |-- consume.py
    |-- postprocess.py
    `-- produce.py

http://code.google.com/p/augustus/source/browse/#svn/trunk/examples/auto/WeightedBatch

Input for the Producer
The Producer takes the training data set. In the code, we declare how we want to test the data:

    import augustus.modellib.baseline.producer.Producer as Producer

    def makeConfigs(inFile, outFile, inPMML, outPMML):
        # open data file
        inf = uni.UniTable().fromfile(inFile)
        # start the configuration file
        test = ET.SubElement(root, "test")
        test.set("field", "Automaker")
        test.set("weightField", "Count")
        test.set("testStatistic", "dDist")
        test.set("testType", "threshold")
        test.set("threshold", "0.475")

Input for the Producer (Continued)

        # use a discrete distribution model for test
        baseline = ET.SubElement(test, "baseline")
        baseline.set("dist", "discrete")
        baseline.set("file", str(inFile))
        baseline.set("type", "UniTable")
        # create the segmentation declarations for the two fields at this level
        '''
        Taken out for the example, other Use Cases will focus on Segments
        segmentation = ET.SubElement(test, "segmentation")
        makeSegment(inf, segmentation, "Color")
        '''
        # output the configuration file
        tree = ET.ElementTree(root)
        tree.write(outFile)

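The configuration-building step can be tried on its own with the standard library's xml.etree.ElementTree. This is a minimal sketch in modern Python 3, not the code from produce.py: the slides never show how `root` is created, so the root tag name "producerConfig" and the function name build_test_config are placeholders chosen for illustration. The attribute names and values are the ones on the slides.

```python
import xml.etree.ElementTree as ET


def build_test_config(in_file, root_tag="producerConfig"):
    """Build the <test> declaration from the slides under a placeholder root.

    root_tag is an assumption: the slides do not show how `root` is created.
    """
    root = ET.Element(root_tag)

    # Declare the test: which field to test, how to weight it, and the threshold.
    test = ET.SubElement(root, "test")
    test.set("field", "Automaker")
    test.set("weightField", "Count")
    test.set("testStatistic", "dDist")
    test.set("testType", "threshold")
    test.set("threshold", "0.475")

    # Use a discrete distribution read from the training file as the baseline.
    baseline = ET.SubElement(test, "baseline")
    baseline.set("dist", "discrete")
    baseline.set("file", str(in_file))
    baseline.set("type", "UniTable")

    return ET.ElementTree(root)


tree = build_test_config("wtraining.nab")
config_xml = ET.tostring(tree.getroot(), encoding="unicode")
print(config_xml)
```

Calling tree.write(path) instead of ET.tostring would write the same XML to a file, as the slide's makeConfigs does with outFile.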
Running the Producer (Training)

$ cd scripts
$ python2.5 produce.py -f wtraining.nab -t20
(0.000 secs)  Beginning timing
(0.000 secs)  Creating configuration file
(0.001 secs)  Creating input PMML file
(0.001 secs)  Starting producer
(0.000 secs)  Inputting configurations
(0.001 secs)  Inputting model
(0.008 secs)  Collecting stats for baseline distribution
(0.011 secs)  Events 20.067% processed
(0.009 secs)  Events 40.134% processed
(0.009 secs)  Events 60.201% processed
(0.009 secs)  Events 80.268% processed
(0.009 secs)  Events 100.000% processed
(0.000 secs)  Making test distributions from statistics
(0.002 secs)  Outputting PMML
(0.062 secs)  Lifetime of timer

Model generated by the Producer

<PMML version="3.1">
  <Header copyright=" " />
  <DataDictionary>
    <DataField dataType="string" name="Automaker" optype="categorical" />
    <DataField dataType="string" name="Color" optype="categorical" />
    <DataField dataType="float" name="Count" optype="continuous" />
  </DataDictionary>
  <BaselineModel functionName="baseline">
    <MiningSchema>
      <MiningField name="Automaker" />
      <MiningField name="Color" />
      <MiningField name="Count" />
    </MiningSchema>
  </BaselineModel>
</PMML>

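To check which fields a model like this declares, the PMML can be read back with the standard library. The sketch below embeds the model from the slide as a string; the helper names data_fields and mining_fields are illustrative and not part of Augustus. It assumes the file has no XML namespace, as on the slide; a PMML file that declares one would need namespace-qualified tag names.

```python
import xml.etree.ElementTree as ET

# The model from the slide, embedded so the example is self-contained.
PMML_TEXT = """<PMML version="3.1">
  <Header copyright=" " />
  <DataDictionary>
    <DataField dataType="string" name="Automaker" optype="categorical" />
    <DataField dataType="string" name="Color" optype="categorical" />
    <DataField dataType="float" name="Count" optype="continuous" />
  </DataDictionary>
  <BaselineModel functionName="baseline">
    <MiningSchema>
      <MiningField name="Automaker" />
      <MiningField name="Color" />
      <MiningField name="Count" />
    </MiningSchema>
  </BaselineModel>
</PMML>"""


def data_fields(pmml_text):
    """Return (name, dataType, optype) for every DataField in the dictionary."""
    root = ET.fromstring(pmml_text)
    return [
        (field.get("name"), field.get("dataType"), field.get("optype"))
        for field in root.iter("DataField")
    ]


def mining_fields(pmml_text):
    """Return the names of the fields the model's MiningSchema uses."""
    root = ET.fromstring(pmml_text)
    return [field.get("name") for field in root.iter("MiningField")]


for name, data_type, optype in data_fields(PMML_TEXT):
    print(name, data_type, optype)
print(mining_fields(PMML_TEXT))
```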
Model generated by the Producer (Cont)

Producer Output
The training step used the code in produce.py to generate a model and get expected results. Training generated the following files:

.
|-- consumer
|   `-- wtraining.nab.pmml    MODEL WITH EXPECTED VALUES BASED ON THE TRAINING DATA
`-- producer
    |-- wtraining.nab.pmml    BASELINE DATA, DATA DICTIONARY, MINING SCHEMA
    `-- wtraining.nab.xml     MODEL FILE USED FOR TRAINING

Training XML
Unitable

Running the Consumer

$ cd scripts
$ python2.5 consume.py -b wtraining.nab -f wscoring.nab
Ready to score

.
|-- consumer
|   |-- wscoring.nab.wtraining.nab.xml
|   `-- wtraining.nab.pmml
|-- postprocess
|   `-- wscoring.nab.wtraining.nab.xml
`-- producer
    |-- wtraining.nab.pmml
    `-- wtraining.nab.xml

This example generates a report in the postprocess directory.

Consumer (Scoring) output

$ cat consumer/wscoring.nab.wtraining.nab.xml
<pmmlDeployment>
  <inputData>
    <readOnce />
    <batchScoring />
    <fromFile name="../data/wscoring.nab" type="UniTable" />
  </inputData>
  <inputModel>
    <fromFile name="../consumer/wtraining.nab.pmml" />
  </inputModel>
  <output>
    <report name="report">
      <toFile name="../postprocess/wscoring.nab.wtraining.nab.xml" />
      <outputRow name="event">
        <score name="score" />
        <alert name="alert" />
        <segments name="segments" />
      </outputRow>
    </report>
  </output>
</pmmlDeployment>

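The deployment file ties together three paths (the data to score, the model to score with, and the report destination) plus the fields written for each event. As a rough illustration, the sketch below pulls those out with xml.etree.ElementTree; summarize_deployment is a hypothetical helper written for this example, not an Augustus function, and the XML is the configuration shown on the slide.

```python
import xml.etree.ElementTree as ET

# The consumer configuration from the slide, embedded for a self-contained example.
CONFIG_TEXT = """<pmmlDeployment>
  <inputData>
    <readOnce />
    <batchScoring />
    <fromFile name="../data/wscoring.nab" type="UniTable" />
  </inputData>
  <inputModel>
    <fromFile name="../consumer/wtraining.nab.pmml" />
  </inputModel>
  <output>
    <report name="report">
      <toFile name="../postprocess/wscoring.nab.wtraining.nab.xml" />
      <outputRow name="event">
        <score name="score" />
        <alert name="alert" />
        <segments name="segments" />
      </outputRow>
    </report>
  </output>
</pmmlDeployment>"""


def summarize_deployment(config_text):
    """Collect the input, model, and report paths and the per-event output fields."""
    root = ET.fromstring(config_text)
    output_row = root.find("output/report/outputRow")
    return {
        "data": root.find("inputData/fromFile").get("name"),
        "model": root.find("inputModel/fromFile").get("name"),
        "report": root.find("output/report/toFile").get("name"),
        "row_name": output_row.get("name"),
        # Iterating an element yields its direct children in document order.
        "row_fields": [child.tag for child in output_row],
    }


summary = summarize_deployment(CONFIG_TEXT)
for key, value in summary.items():
    print(key, "=", value)
```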
Scoring Report

$ cat postprocess/wscoring.nab.wtraining.nab.xml
<report>
  <event>
    <score>0.471458430077</score>
    <alert>True</alert>
    <Segments></Segments>
  </event>
</report>

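A report in this shape is straightforward to post-process. The following sketch is only an illustration of reading such a file, not the contents of postprocess.py: read_events is a hypothetical helper, and the report text is the single event from the slide. It converts each event's score to a float and its alert flag to a boolean, then keeps the events that raised an alert.

```python
import xml.etree.ElementTree as ET

# The scoring report from the slide, embedded for a self-contained example.
REPORT_TEXT = """<report>
  <event>
    <score>0.471458430077</score>
    <alert>True</alert>
    <Segments></Segments>
  </event>
</report>"""


def read_events(report_text):
    """Return one dict per <event> with a numeric score and a boolean alert flag."""
    root = ET.fromstring(report_text)
    events = []
    for event in root.iter("event"):
        events.append(
            {
                "score": float(event.findtext("score")),
                "alert": event.findtext("alert") == "True",
            }
        )
    return events


events = read_events(REPORT_TEXT)
alerts = [event for event in events if event["alert"]]
print(len(events), "event(s),", len(alerts), "alert(s)")
```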
Unitable
Key Features of Unitable
Key Features of Unitable (cont)
For more information

Augustus Overview Open Source Analytics
