Data Analytics Group (DAG)
Research Topics
Data Analytics Group
• 5 academics, 4 postdocs, and 15 PhD and exchange students.
• 8 Australian Research Council Discovery projects in the last 10 years.
• High-profile publications.
• Strong industry collaborations.
• Currently more than 1.5 million in funding.
Relationship discovery for forecasting and decision making
• Association analysis
• Causal inference
• What-if reasoning
• ARC project on causal relationship discovery.
Data Integration, Quality, Semantics
Jixue Liu – UniSA
• The mapping problem: Do the two tables show the same type of information? More specifically, is B the number of service years and C the salary?
• The quality problem: Does the data contain errors, and how can they be detected?
• The cleaning problem: What is the best way to correct the quality errors?
• D2D CRC project: identity resolution and data linkage in integrating police.

Table 1
Posi  SvcYrs  Sal
Lect  2       50
Lect  3       60
Lect  2       50

Table 2
A     B  C
Prof  2  60
Lect  5  70
Prof  3  70
Prof  2  65
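
The mapping question can be made concrete with a deliberately crude sketch. It assumes that columns holding the same kind of quantity have similar average values, and it matches each unlabelled column (B, C) to the labelled column (SvcYrs, Sal) whose mean is closest. The function name `guess_mapping` and the mean-distance heuristic are illustrative assumptions, not the group's method; real schema matching would use far richer evidence.

```python
from statistics import mean

# Numeric columns copied from the two tables on the slide.
named = {"SvcYrs": [2, 3, 2], "Sal": [50, 60, 50]}
unnamed = {"B": [2, 5, 3, 2], "C": [60, 70, 70, 65]}

def guess_mapping(unnamed_cols, named_cols):
    """Map each unlabelled column to the labelled column with the closest mean.

    Hypothetical heuristic for illustration only: it ignores units,
    distributions, and any semantic information about the columns.
    """
    return {
        u: min(named_cols, key=lambda n: abs(mean(values) - mean(named_cols[n])))
        for u, values in unnamed_cols.items()
    }

mapping = guess_mapping(unnamed, named)
print(mapping)
```

On this toy data the heuristic pairs B with SvcYrs and C with Sal, which matches the reading suggested on the slide. It would fail as soon as two candidate columns shared a similar scale.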
Text mining (mining unstructured data)
• NHMRC Centre of Research Excellence in Post-Marketing Surveillance of Medicines and Medical Devices.
  – Mining of social media to identify potential signals of medication safety.
• D2D CRC project
  – Beat the news: building classification and prediction systems for civil unrest events.
Outlier (anomaly) detection in sensor (time series) data
• SA Water
  – Development of smart data analytics tools to improve treatment performance assessment.
• Water Research Australia
  – Optimisation of existing instrumentation to achieve better process performance.
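
As a minimal illustration of outlier detection on sensor readings, the sketch below flags values whose modified z-score (based on the median and the median absolute deviation) exceeds a threshold. The readings, the function name `mad_outliers`, and the 3.5 threshold are assumptions made for this example; the slides do not say which detection method the projects use, and this version ignores the time ordering of the data.

```python
from statistics import median

def mad_outliers(series, threshold=3.5):
    """Return indices of points whose modified z-score exceeds `threshold`.

    Modified z-score = 0.6745 * |x - median| / MAD, where MAD is the
    median absolute deviation. Illustrative only.
    """
    med = median(series)
    abs_dev = [abs(x - med) for x in series]
    mad = median(abs_dev)
    if mad == 0:
        # More than half the points equal the median: flag anything that differs.
        return [i for i, d in enumerate(abs_dev) if d > 0]
    return [i for i, d in enumerate(abs_dev) if 0.6745 * d / mad > threshold]

# Hypothetical sensor readings with one obvious spike at index 4.
readings = [7.1, 7.0, 7.2, 7.1, 9.8, 7.0, 7.1]
print(mad_outliers(readings))
```

The median-based statistic is used here because a single large spike barely shifts it, whereas a mean and standard deviation would be pulled toward the outlier.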
Positive correlation between birth rate and stork population
• Would increasing the stork population increase the birth rate?
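
A small simulation makes the point behind this question. It assumes, purely for illustration, that a hidden common cause (called `area` here, standing for something like the size of a rural district) drives both stork counts and births, while storks have no effect on births. All variable names, coefficients, and noise levels are invented for the sketch.

```python
import random

def pearson(xs, ys):
    """Plain Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    vx = sum((x - mx) ** 2 for x in xs)
    vy = sum((y - my) ** 2 for y in ys)
    return cov / (vx * vy) ** 0.5

def simulate(n=2000, seed=0):
    """Return (observational correlation, correlation under intervention)."""
    rng = random.Random(seed)
    # Hypothetical confounder: drives both storks and births.
    area = [rng.random() for _ in range(n)]
    storks = [100 * a + rng.gauss(0, 10) for a in area]
    births = [1000 * a + rng.gauss(0, 100) for a in area]
    # Intervention: stork numbers are set independently of the confounder.
    storks_do = [rng.uniform(0, 100) for _ in range(n)]
    # Births still depend only on the confounder, not on storks.
    births_do = [1000 * a + rng.gauss(0, 100) for a in area]
    return pearson(storks, births), pearson(storks_do, births_do)

observed, intervened = simulate()
print(f"observed correlation:   {observed:.2f}")
print(f"after setting storks:   {intervened:.2f}")
```

Under these assumptions the observed correlation is strongly positive, yet it vanishes once stork numbers are set from outside, which is why a correlation alone does not justify the intervention.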
A brief book
http://www.springer.com/computer/ai/book/978-3-319-14432-0
Software and test data sets: http://nugget.unisa.edu.au/Causalbook/
Building easily interpretable causal models
Big data → causal predictions
Causal prediction methods
Causal reasoning (what if analysis)
Changes of causes or context → changes of outcomes?
Causal decision trees
• A decision may not be causally interpretable.
Causal decision trees provide:
• Classifications and
• Causal explanations.
Causal decision tree vs decision tree
Causal discovery for feature selection
Causal inference – do-calculus
Judea Pearl. Causality: Models, Reasoning, and Inference. Cambridge University Press, 2000.
X1    X2    …    Xn-1   Xn
5.2   7.5   …    6.5    5.2
5.6   7.2   …    6.6    5.3
…     …     …    …      …
5.4   7.1   …    7.1    5.7
5.7   6.9   …    6.9    5.8
[Figure annotations: +1, +0.8]
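
For orientation, the simplest result usually derived with do-calculus is the back-door adjustment from Pearl's book. The symbols are notation assumed here rather than taken from the slide: X is the cause, Y the outcome, and Z a set of covariates satisfying the back-door criterion relative to (X, Y).

```latex
% Effect of intervening on X (back-door adjustment)
P(y \mid do(x)) = \sum_{z} P(y \mid x, z)\, P(z)

% Ordinary conditioning, for comparison
P(y \mid x) = \sum_{z} P(y \mid x, z)\, P(z \mid x)
```

The two expressions differ only in how Z is weighted: the intervention uses the marginal P(z), while conditioning uses P(z | x), so they coincide when Z is independent of X.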
SA Water Sensor data management
Storage
Binary file:
• Non-standard formats
• Manually set up Excel to load files
• Difficult to search and compare
Database:
• Standard formats
• Automatic data inputs
• Convenient
• Easily searchable and comparable
Visualization
• Single plot
• Fragments
• Difficult to compare time series with different resolutions
• Multiple plots with different resolutions
• Various sources
• Interactive: show any range, zoom in and zoom out
• Web based: no installation
Analysis
Customized Analysis Functions
• Statistics and comparisons
• Prediction view
• Anomaly detection
• Easy data analysis
• Customized functions
• Expandable: data sources, analytic tools
Reporting
Weekly Summary
• Summarisation
• Comparison benchmarks
• Deviations
• Anomaly distributions
Automatically evaluate and quantify the water quality over a week.
Pattern modelling and recognition
• Filter anomalies
• Model new patterns
• Update new patterns
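
One possible reading of the filter, model, update cycle is sketched below: a running model of "normal" readings is kept, incoming points that deviate too far from it are filtered out as anomalies, and all other points update the model. The class name `RunningPattern`, the mean and standard deviation model, and the parameters `k` and `warmup` are assumptions for illustration; the slide does not describe the actual pattern model.

```python
class RunningPattern:
    """Running mean/variance model (Welford's method) that skips anomalies."""

    def __init__(self, k=3.0, warmup=5):
        self.k = k            # how many standard deviations count as anomalous
        self.warmup = warmup  # points accepted unconditionally at the start
        self.n = 0
        self.mean = 0.0
        self.m2 = 0.0         # sum of squared deviations from the mean

    def _update(self, x):
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)

    def observe(self, x):
        """Return True if x is filtered as an anomaly, else update the model."""
        if self.n >= self.warmup:
            std = (self.m2 / (self.n - 1)) ** 0.5
            if std > 0 and abs(x - self.mean) > self.k * std:
                return True   # filtered: the model is left unchanged
        self._update(x)
        return False

# Hypothetical stream with one spike at index 6.
stream = [7.0, 7.1, 6.9, 7.0, 7.1, 7.0, 12.0, 7.1]
model = RunningPattern()
flags = [model.observe(x) for x in stream]
print(flags)
```

Because flagged points never reach `_update`, a spike does not contaminate the learned pattern. The trade-off is that a genuine, lasting shift in the signal would keep being rejected unless some extra rule admitted it.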


Causal discovery
