SlideShare une entreprise Scribd logo
1  sur  49
Télécharger pour lire hors ligne
VP AIOps for the Autonomous Database
Sandesh Rao
Introduction to AutoML and Data Science
using the Oracle Autonomous Database
@sandeshr
https://www.linkedin.com/in/raosandesh/
https://www.slideshare.net/SandeshRao4
The following is intended to outline our general product direction. It is intended for information
purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any
material, code, or functionality, and should not be relied upon in making purchasing decisions. The
development, release, timing, and pricing of any features or functionality described for Oracle’s
products may change and remains at the sole discretion of Oracle Corporation.
Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and
prospects are “forward-looking statements” and are subject to material risks and uncertainties. A
detailed discussion of these factors and other risks that affect our business is contained in Oracle’s
Securities and Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and
Form 10-Q under the heading “Risk Factors.” These filings are available on the SEC’s website or on
Oracle’s website at http://www.oracle.com/investor. All information in this presentation is current as of
September 2019 and Oracle undertakes no duty to update any statement in light of new information or
future events.
Safe harbor statement
1. Overview of ML and the Autonomous Database
2. Journey of the DBA to Data Scientist
3. OML Examples
4. AutoML and what’s coming
5. Questions
Agenda
Tasks Specific to Business and Innovation
• Architecture, planning, data modeling
• Data security and lifecycle management
• Application related tuning
• End-to-End service level management
Maintenance Tasks
• Configuration and tuning of systems, network, storage
• Database provisioning, patching
• Database backups, H/A, disaster recovery
• Database optimization
Traditionally DBAs are Responsible for:
Value Scale
Innovation
Maintenance
Tasks Specific to Business and Innovation
• Architecture, planning, data modeling
• Data security and lifecycle management
• Application related tuning
• End-to-End service level management
Maintenance Tasks
• Configuration and tuning of systems, network, storage
• Database provisioning, patching
• Database backups, H/A, disaster recovery
• Database optimization
Freedom from Drudgery for DBA: More Time to Innovate and Improve the Business
Autonomous Database Removes Generic Tasks
Value Scale
Innovation
Maintenance
Machine Learning
Solving data-driven
problems
Discovering insights
Making predictions
Data Security
Data classification,
Data life-cycle mgmt
Application Tuning
SQL tuning,
connection mgmt
The Evolution of the DBA/Database Developer Role
Data Engineer
Architecture,
“data wrangler”
Data extraction
Data wrangling
Deriving new attributes
(“feature engineering”)
…
…
…
Import predictions & insights
Translate and deploy ML models
Automate
You Are Probably Already Doing Most of This Work!
Database Developer to Data Scientist Journey
1 - https://www.infoworld.com/article/3228245/data-science/the-80-20-data-science-dilemma.html
Typically 80% of the work
Most data scientists spend only 20 percent of their time
on actual data analysis and 80 percent of their time
finding, cleaning, and reorganizing huge amounts of
data, which is an inefficient data strategy1
Eliminated or minimized with Oracle
Data Management platform becomes
combine/hybrid DM + machine learning platform
Analytics Value vs. Maturity
Reports &
Dashboards
Data
Information
Predictions & Insights Appls with ML
Analytical Maturity
ValueofAnalytics
Diagnostic
Analysis &
Reports
Predictive /
Machine
Learning
“ML Enabled”
Applications
What Happened?
Why it Happened?
What WILL happen?
Automated ML Appls
Algorithms automatically sift through large amounts of data to discover
hidden patterns, new insights and make predictions
What is Machine Learning?
Identify most important factor (Attribute Importance)
Predict customer behavior (Classification)
Find profiles of targeted people or items (Classification
Predict or estimate a value (Regression)
Segment a population (Clustering)
Find fraudulent or “rare events” (Anomaly Detection)
Determine co-occurring items in a “basket” (Associations)X1
X2
A1A2A3A4 A5A6 A7
SupervisedLearningUnsupervisedLearning
Copyright © 2020 Oracle and/or its affiliates.
CRISP-DM Methodology
Six Major Steps
10 https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_miningCopyright © 2020 Oracle and/or its affiliates.
DATA UNDERSTANDING
Assemble the “right data”
Data profiling
• Data visualization
• Univariate statistics/group by
• Bi-variate statistics
DATA PREPARATION
Sampling/Stratified
Algorithm req’d transforms
• Auto Data Preparation
• Missing Values, Binning,
Normalization, etc.
• Unstructured data
• Aggregations
Domain specific transforms
• “Engineered Features”
Features Selection
MODELING
Algorithm settings/defaults
• Stratified sampling
• Feature selection
• Build model(s)
EVALUATION
Model evaluation
Model comparison
Model selection
DEPLOYMENT
In-DB ML model apply
• Real-time ML apply
• In-database, REST
Embed methodology
• Applications
• Dashboards
BUSINESS UNDERSTANDING
Well-defined
business problem
* Automated and/or system defaults
Oracle Machine Learning
Oracle Machine Learning extends Oracle Database(s)
and enables users to build “AI” applications and
analytics dashboards
OML delivers powerful in-database machine learning
algorithms, automated ML functionality via SQL APIs
and integration with open source Python* and R.
Oracle Machine Learning
OML Services*
Model Deployment and Management,
Cognitive Image and Text
OML4SQL
SQL API
OML4Py*
Python API
OML4R
R API
OML Notebooks
with Apache Zeppelin on
Autonomous Database
OML4Spark
R API on Big Data
Oracle Data Miner
Oracle SQL Developer extension
* Coming soonCopyright © 2020 Oracle and/or its affiliates.
Database Developer to Data Scientist Journey
Database Developer to Data Scientist Journey
• Business Understanding—Week 1
• Data Understanding—Week 2
• Data Preparation—Week 3
• Modeling (ML)—Week 4
• Evaluation—Week 5
• Deployment—Week 6
Six Major Steps (Oracle Machine Learning POV)
13
https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining
Copyright © 2020 Oracle and/or its affiliates.
Poorly Defined Better
Data Mining
Technique
Predict employees that leave
• Based on past employees that voluntarily left:
• Create New Attribute EmplTurnover à O/1
Predict customers that churn
• Based on past customers that have churned:
• Create New Attribute Churn à YES/NO
Target “best” customers
• Recency, Frequency Monetary (RFM) Analysis
• Specific Dollar Amount over Time Window:
• Who has spent $500+ in most recent 18 months
How can I make more $$? • What helps me sell soft drinks & coffee?
Which customers are likely to buy? • How much is each customer likely to spend?
Who are my “best customers”? • What descriptive “rules” describe “best customers”?
How can I combat fraud?
• Which transactions are the most anomalous?
• Then roll-up to physician, claimant, employee…
Week 1—Business Understanding
Start with a Well-Defined Business Problem Statement
Week 1—Business Understanding
Target “best” customers who have GOOD CREDIT and make payments
15
Be Extremely Specific in your Problem Statement
Copyright © 2020 Oracle and/or its affiliates.
Week 2—Data Understanding
16
Review the Data; Does it Makes Sense?
Are AGEs all positive, 0-120?
Are INCOME values weekly or monthly?
Are the LOAN_AMOUNTS reasonable?
Etc….
Copyright © 2020 Oracle and/or its affiliates.
Week 2—Data Understanding
17
Review the Data; Does it Makes Sense?
Copyright © 2019 Oracle and/or its affiliates.
Simple, exploratory graphs to understand
the data
Copyright © 2020 Oracle and/or its affiliates.
Week 2—Data Understanding
18
Review the Data; Does it Makes Sense?
Are AGEs all positive, 0-120?
Are INCOME values weekly or monthly?
Are the LOAN_AMOUNTS reasonable?
Etc….
Copyright © 2020 Oracle and/or its affiliates.
Week 3—Data Preparation
Prepare the Data, Create New Derived Attributes or “Engineered Features”
Source Attribute New Attribute/”Engineered Feature”
Date of Birth AGE
Address DISTANCE_TO_DESTINATION
COMMUTE_TIME
Call detail records (CDRs) #_DROPPED_CALLS
PERCENT_INTERNATIONAL
Salary PERCENT_VS_PEERS
Purchases TOTALS_PER_CATEGORY (e.g. Food,
Clothing)
Copyright © 2020 Oracle and/or its affiliates.
Week 3—Data Preparation
Oracle Machine Learning’s Auto Data Prep (ADP) and ML algorithms are designed with intelligent defaults
and can automatically deal with:
– Missing values
– Outliers
– Binning
– Too many distinct values
– Too many constants
– Trans data/aggregations
– Unstructured data
– Correlated data
20
Prepare the Data, Create New Derived Attributes or “Engineered Features”
Copyright © 2020 Oracle and/or its affiliates.
Week 4—Modeling (Machine Learning)
21
First, Identify the Key Attributes That Most Influence the Target Attribute
Copyright © 2020 Oracle and/or its affiliates.
Week 4—Modeling (Machine Learning)
22
Training and Testing ML Models using 60/40% Random Samples
Historical DataTrain Test
Build Model Test Model Evaluate ModelTrain ModelHistorical Data
Copyright © 2020 Oracle and/or its affiliates.
Week 4—Modeling (Machine Learning)
23
Build multiple models with different algorithms and settings
Copyright © 2020 Oracle and/or its affiliates.
Week 5—Model Evaluation (ML)
Randomly selected “hold
out” sample of data that
was used to train the ML
model
Compute Cumulative
Gains, Lift, Accuracy, etc.
Review the attributes used
in the model and model
coefficients
Make sure the model
makes sense
24
Next, test model accuracy
Copyright © 2020 Oracle and/or its affiliates.
Model Evaluation
Week 6—Deployment
Simple SQL Apply scripts run
100% inside the Database for
immediate ML model deployment
25
Apply the Models to Predict “Best Customers”
Model Apply/”Scoring”
Copyright © 2020 Oracle and/or its affiliates.
Week 6—Deployment
Simple SQL Apply scripts run 100% inside the Database
for model build, model apply and immediate ML model
deployment
26
Apply the Models to Predict “Best Customers”
Copyright © 2020 Oracle and/or its affiliates.
Model Build
Model Apply
Results
Congratulations!
Almost there J
Data Scientist
OML examples
Simple SQL Syntax—Statistical Comparisons (t-tests)
Compare AVE Purchase Amounts Men vs. Women Grouped_By INCOME_LEVEL
Statistical Functions
SELECT SUBSTR(cust_income_level, 1, 22) income_level,
AVG(DECODE(cust_gender, 'M', amount_sold, null)) sold_to_men,
AVG(DECODE(cust_gender, 'F', amount_sold, null)) sold_to_women,
STATS_T_TEST_INDEPU(cust_gender, amount_sold, 'STATISTIC', 'F') t_observed,
STATS_T_TEST_INDEPU(cust_gender, amount_sold) two_sided_p_value
FROM customers c, sales s
WHERE c.cust_id = s.cust_id
GROUP BY ROLLUP(cust_income_level)
ORDER BY income_level, sold_to_men, sold_to_women, t_observed;
STATS_T_TEST_INDEPU (SQL) Example;
P_Values < 05 show statistically
significantly differences in the amounts
purchased by men vs. women
Simple SQL Syntax—Attribute Importance - ML Model Build (PL/SQL)
OAA Model Build and Real-time SQL Apply Prediction
BEGIN
DBMS_DATA_MINING.CREATE_MODEL(
model_name => 'BUY_INSURANCE_AI',
mining_function => DBMS_DATA_MINING.ATTRIBUTE_IMPORTANCE,
data_table_name => 'CUST_INSUR_LTV',
case_id_column_name => 'cust_id',
target_column_name => 'BUY_INSURANCE',
settings_table_name => 'Att_Import_Mode_Settings');
END;
/
SELECT attribute_name, rank , attribute_value
FROM BUY_INSURANCE_AI
ORDER BY rank, attribute_name;
Model Results (SQL query)
ATTRIBUTE_NAME RANK ATTRIBUTE_VALUE
BANK_FUNDS 1 0.2161
MONEY_MONTLY_OVERDRAWN 2 0.1489
N_TRANS_ATM 3 0.1463
N_TRANS_TELLER 4 0.1156
T_AMOUNT_AUTOM_PAYMENTS 5 0.1095
A1A2A3A4 A5A6 A7
OML for R Model Build
> ore.odmAI (BUY_INSURANCE ~ ., CUST_INSUR_LTV)
Call:
ore.odmAI(formula = BUY_INSURANCE ~ ., data = CUST_INSUR_LTV)
Simple R Language Syntax—Attribute Importance
ML Model Build (R)
Model Results (R)
Importance:
importance rank
BANK_FUNDS 0.2161187797 1
MONEY_MONTLY_OVERDRAWN 0.1489347141 2
N_TRANS_ATM 0.1463026512 3
N_TRANS_TELLER 0.1155879786 4
T_AMOUNT_AUTOM_PAYMENTS 0.1095178647 5
A1A2A3A4 A5A6 A7
Copyright © 2020 Oracle and/or its affiliates.
OML for Python Model Build—Coming soon!
> ai_mod = ai(**setting) # Create AI model object
> ai_mod = ai_mod.fit(train_x, train_y)
Simple Python Language Syntax—Attribute Importance
ML Model Build (Python)
Model Results (Python)
Importance:
variable importance rank
BANK_FUNDS 0.2161187797 1
MONEY_MONTLY_OVERDRAWN 0.1489347141 2
N_TRANS_ATM 0.1463026512 3
N_TRANS_TELLER 0.1155879786 4
T_AMOUNT_AUTOM_PAYMENTS 0.1095178647 5
A1A2A3A4 A5A6 A7
Copyright © 2020 Oracle and/or its affiliates.
Oracle Machine Learning
Key Features:
• Collaborative UI for data
scientist and analysts
• Packaged with Autonomous Databases
• Quick start Example notebooks
• Easy access to shared notebooks,
templates, permissions, scheduler, etc.
• OML4SQL
• OML4Py coming soon
• Supports deployment of OML models
Machine Learning Notebooks included in Autonomous Databases
Copyright © 2020 Oracle and/or its affiliates.
Oracle Machine Learning
Key Features:
• Collaborative UI for data
scientist and analysts
• Packaged with Autonomous Databases
• Quick start Example notebooks
• Easy access to shared notebooks,
templates, permissions, scheduler, etc.
• OML4SQL
• OML4Py coming soon
• Supports deployment of OML models
Machine Learning Notebooks included in Autonomous Databases
Copyright © 2020 Oracle and/or its affiliates.
Oracle Machine Learning for R / Python
Transparency layer
‐ Leverage proxy objects so data remain in database
‐ Overload native functions translating functionality to SQL
‐ Use familiar R/Python syntax to manipulate database data
Parallel, distributed algorithms
‐ Scalability and performance
‐ Exposes in-database algorithms available from OML4SQL
Embedded execution
‐ Manage and invoke R or Python scripts in Oracle Database
‐ Data-parallel, task-parallel, and non-parallel execution
‐ Use open source packages to augment functionality
OML4Py, Automated Machine Learning - AutoML
‐ Feature selection, model selection, hyper-parameter tuning
Multiple Components/APIs of Oracle Machine Learning
Database
Server
Client
SQL Interfaces
SQL*Plus
SQLDeveloper
OML4Py OML4R
Copyright © 2020 Oracle and/or its affiliates.
* Coming soon
Multiple Languages UIs Supported for End Users & Apps Development
Oracle Machine Leaning
Application DevelopersDBAs
R & Python Data Scientists “Citizen” Data ScientistsNotebook Users & DS Teams
New! New!
Coming Soon! | AutoML – new with OML4Py
Auto Feature Selection
– Reduce # of features by
identifying most
predictive
– Improve performance
and accuracy
Increase data scientist productivity – reduce overall compute time
Auto Algorithm
Selection
Much faster than
exhaustive search
Auto Feature
Selection
De-noise data and
reduce # of features
AutoTune
Significant accuracy
improvement
Auto Algorithm Selection
– Identify in-database
algorithm that achieves
highest model quality
– Find best algorithm faster
than with exhaustive search
Auto Tune Hyperparameters
– Significantly improve
model accuracy
– Avoid manual or exhaustive
search techniques
Copyright © 2020 Oracle and/or its affiliates.
Enables non-expert users to leverage Machine Learning
Data
Table ML Model
Coming Soon! | OML AutoML User Interface
Automate production and deployment of ML models
• Enhance Data Scientist productivity
and user-experience
• Enable non-expert users to leverage ML
• Unify model deployment and monitoring
• Support model management
Features
• Minimal user input: data, target
• Model leaderboard
• Model deployment via REST
• Model monitoring
• Cognitive features for image and text
“Code-free” user interface supporting automated end-to-end machine
learning
Copyright © 2020 Oracle and/or its affiliates.
Coming Soon! | OML AutoML User Interface
Automate production and deployment of ML models
• Enhance Data Scientist productivity
and user-experience
• Enable non-expert users to leverage ML
• Unify model deployment and monitoring
• Support model management
Features
• Minimal user input: data, target
• Model leaderboard
• Model deployment via REST
• Model monitoring
• Cognitive features for image and text
“Code-free” user interface supporting automated end-to-end machine
learning
Copyright © 2020 Oracle and/or its affiliates.
Coming Soon! | Algorithms for Database 20c
Gradient Boosted Trees (XGBoost)
• Highly popular and powerful algorithm – Kaggle winners
• Classification, regression, ranking, survival analysis
MSET-SPRT
• Multivariate State Estimation Technique - Sequential
Probability Ratio Test (MSET-SPRT)
• Nonlinear, nonparametric anomaly detection
algorithm designed to monitor critical processes.
• Detects subtle anomalies while also producing
minimal false alarms.
• Calibrates expected behavior from historical normal
operational sequence of monitored signals.
• Re-implemented and sped up in-DB and based on original
Oracle Labs algorithm
Two major new ML algorithms
Copyright © 2020 Oracle and/or its affiliates.
Oracle Data Miner UI
Easy to use to define
analytical
methodologies that
can be shared
SQL Developer
Extension
Workflow API
and generates SQL
code for immediate
deployment
Drag and Drop, Workflows, Easy to Use UI for “Citizen Data Scientist”
Copyright © 2020 Oracle and/or its affiliates.
Manage and Analyze All Your Data
Big Data SQL / R
SQL / R / Python
Object
Store
“Engineered Features”
– Derived attributes
that reflect domain
knowledge—key to
best models e.g.:
• Counts
• Totals
• Changes
over time
Boil down the Data Lake
Architecturally,
lots of options
and flexibility
In-Database Machine Learning
More Models
Better Models
Faster, More Secure
Less Cost
Ready to Deploy!
No Need To Extract and
Move Data
Data stays in Database
Zero time required.
No production impact.
Data Preparation and
Transformation
Accelerated with
Automatic Data Prep
No separate environment
required. Much faster data prep.
Data stays protected and secured.
Data Mining and
Model Building
SQL, R, Python
Oracle Data Miner UI
OML Notebooks
Oracle Data Miner and AutoML
greatly speed model building.
Less skill required. No coding.
No Need to Transform
Production Data
Embedded Data
Preparation
No need for second
production instance.
Model Scoring
Accelerated Via
Exadata Database Machine
Faster model validation
Easy to repeat model building as often as needed
• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined
• OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data. etc,
Oracle’s Machine Learning & Adv. Analytics Algorithms
CLASSIFICATION
• Naïve Bayes
• Logistic Regression (GLM)
• Decision Tree
• Random Forest
• Neural Network
• Support Vector Machine
• Explicit Semantic Analysis
CLUSTERING
• Hierarchical K-Means
• Hierarchical O-Cluster
• Expectation Maximization (EM)
ANOMALY DETECTION
• One-Class SVM
TIME SERIES
• State of the art forecasting using
Exponential Smoothing
• Includes all popular models
e.g. Holt-Winters with trends,
seasons, irregularity, missing data
REGRESSION
• Linear Model
• Generalized Linear Model
• Support Vector Machine (SVM)
• Stepwise Linear regression
• Neural Network
• LASSO *
ATTRIBUTE IMPORTANCE
• Minimum Description Length
• Principal Comp Analysis (PCA)
• Unsupervised Pair-wise KL Div
• CUR decomposition for row & AI
ASSOCIATION RULES
• A priori/ market basket
PREDICTIVE QUERIES
• Predict, cluster, detect, features
SQL ANALYTICS
• SQL Windows, SQL Patterns,
SQL Aggregates
FEATURE EXTRACTION
• Principal Comp Analysis (PCA)
• Non-negative Matrix Factorization
• Singular Value Decomposition (SVD)
• Explicit Semantic Analysis (ESA)
TEXT MINING SUPPORT
• Algorithms support text
• Tokenization and theme extraction
• Explicit Semantic Analysis (ESA) for
document similarity
STATISTICAL FUNCTIONS
• Basic statistics: min, max,
median, stdev, t-test, F-test,
Pearson’s, Chi-Sq, ANOVA, etc.
R PACKAGES
• CRAN R Algorithm Packages
through Embedded R Execution
• Spark MLlib algorithm integration
EXPORTABLE ML MODELS
• REST APIs for deployment
X
1
X
2
A
1
A
2
A
3
A
4
A
5
A
6
A
7
ANALYTICAL SQL
• SQL Windows
• SQL Aggregate functions
• LAG/LEAD functions
• SQL for Pattern Matching
• Additional approximate
query
processing: APPROX_COUNT
, APPROX_SUM,
APPROX_RANK
• Regular Expressions
• Linear regression
• ANOVA (Analysis of
variance)
• Test Distribution fit
(e.g. Normal distribution
test, Binomial test, Weibull
test, Uniform
test, Exponential
test, Poisson test, etc.)
• Statistical Aggregates (min,
max, mean, median, stdev,
mode, quantiles, plus x
sigma, minus x sigma, top n
outliers, bottom n outliers)
STATISTICAL FUNCTIONS
• Descriptive statistics
(e.g. median, stdev, mode, sum,
etc.)
• Hypothesis testing
(t-test, F-test, Kolmogorov-
Smirnov test, Mann Whitney
test, Wilcoxon Signed Ranks test
• Correlations analysis
(parametric and nonparametric
e.g.
Pearson’s test for
correlation, Spearman's rho
coefficient, Kendall's tau-b
correlation coefficient)
• Ranking functions
• Cross Tabulations with Chi-square
statistics|
Oracle’s Machine Learning & Adv. Analytics Algorithms
Algorithms Operate on Data
ML and AI are just “Algorithms”
Move the Algorithms; Not the Data!;
It Changes Everything!
Thank You
Any Questions ?
Sandesh Rao
VP AIOps for the Autonomous Database
@sandeshr
https://www.linkedin.com/in/raosandesh/
https://www.slideshare.net/SandeshRao4
Introduction to AutoML and Data Science using the Oracle Autonomous Database - LuxOUG Jun 2020

Contenu connexe

Tendances

LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...
LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...
LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...Sandesh Rao
 
How to use Exachk effectively to manage Exadata environments OGBEmea
How to use Exachk effectively to manage Exadata environments OGBEmeaHow to use Exachk effectively to manage Exadata environments OGBEmea
How to use Exachk effectively to manage Exadata environments OGBEmeaSandesh Rao
 
Top 20 FAQs on the Autonomous Database
Top 20 FAQs on the Autonomous DatabaseTop 20 FAQs on the Autonomous Database
Top 20 FAQs on the Autonomous DatabaseSandesh Rao
 
Introducing new AIOps innovations in Oracle 19c - San Jose AICUG
Introducing new AIOps innovations in Oracle 19c - San Jose AICUGIntroducing new AIOps innovations in Oracle 19c - San Jose AICUG
Introducing new AIOps innovations in Oracle 19c - San Jose AICUGSandesh Rao
 
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...Sandesh Rao
 
Machine Learning and AI at Oracle
Machine Learning and AI at OracleMachine Learning and AI at Oracle
Machine Learning and AI at OracleSandesh Rao
 
Data meets AI - ATP Roadshow India
Data meets AI - ATP Roadshow IndiaData meets AI - ATP Roadshow India
Data meets AI - ATP Roadshow IndiaSandesh Rao
 
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should knowAIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should knowSandesh Rao
 
RAC Troubleshooting and Diagnosability Sangam2016
RAC Troubleshooting and Diagnosability Sangam2016RAC Troubleshooting and Diagnosability Sangam2016
RAC Troubleshooting and Diagnosability Sangam2016Sandesh Rao
 
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACAUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACSandesh Rao
 
What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0Sandesh Rao
 
Biwa summit 2015 oaa oracle data miner hands on lab
Biwa summit 2015 oaa oracle data miner hands on labBiwa summit 2015 oaa oracle data miner hands on lab
Biwa summit 2015 oaa oracle data miner hands on labCharlie Berger
 
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...Charlie Berger
 
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...Charlie Berger
 
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RAC
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RACNZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RAC
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RACSandesh Rao
 
Exachk Customer Presentation
Exachk Customer PresentationExachk Customer Presentation
Exachk Customer PresentationSandesh Rao
 
20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database 20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database Sandesh Rao
 
What's new in oracle trace file analyzer 18.2.0
What's new in oracle trace file analyzer 18.2.0What's new in oracle trace file analyzer 18.2.0
What's new in oracle trace file analyzer 18.2.0Sandesh Rao
 
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...Sandesh Rao
 
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...Sandesh Rao
 

Tendances (20)

LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...
LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...
LAD -GroundBreakers-Jul 2019 - The Machine Learning behind the Autonomous Dat...
 
How to use Exachk effectively to manage Exadata environments OGBEmea
How to use Exachk effectively to manage Exadata environments OGBEmeaHow to use Exachk effectively to manage Exadata environments OGBEmea
How to use Exachk effectively to manage Exadata environments OGBEmea
 
Top 20 FAQs on the Autonomous Database
Top 20 FAQs on the Autonomous DatabaseTop 20 FAQs on the Autonomous Database
Top 20 FAQs on the Autonomous Database
 
Introducing new AIOps innovations in Oracle 19c - San Jose AICUG
Introducing new AIOps innovations in Oracle 19c - San Jose AICUGIntroducing new AIOps innovations in Oracle 19c - San Jose AICUG
Introducing new AIOps innovations in Oracle 19c - San Jose AICUG
 
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...
AIOUG -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA'...
 
Machine Learning and AI at Oracle
Machine Learning and AI at OracleMachine Learning and AI at Oracle
Machine Learning and AI at Oracle
 
Data meets AI - ATP Roadshow India
Data meets AI - ATP Roadshow IndiaData meets AI - ATP Roadshow India
Data meets AI - ATP Roadshow India
 
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should knowAIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
AIOUG : ODEVCYathra 2018 - Oracle Autonomous Database What Every DBA should know
 
RAC Troubleshooting and Diagnosability Sangam2016
RAC Troubleshooting and Diagnosability Sangam2016RAC Troubleshooting and Diagnosability Sangam2016
RAC Troubleshooting and Diagnosability Sangam2016
 
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RACAUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
AUSOUG - NZOUG-GroundBreakers-Jun 2019 - 19c RAC
 
What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0What's new in Oracle Trace File Analyzer version 12.2.1.1.0
What's new in Oracle Trace File Analyzer version 12.2.1.1.0
 
Biwa summit 2015 oaa oracle data miner hands on lab
Biwa summit 2015 oaa oracle data miner hands on labBiwa summit 2015 oaa oracle data miner hands on lab
Biwa summit 2015 oaa oracle data miner hands on lab
 
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...
Oracle Database House Party_Oracle Machine Learning to Pick a Good Inexpensiv...
 
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
Oracle Machine Learning Overview and From Oracle Data Professional to Oracle ...
 
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RAC
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RACNZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RAC
NZOUG-GroundBreakers-2018 - Troubleshooting and Diagnosing 18c RAC
 
Exachk Customer Presentation
Exachk Customer PresentationExachk Customer Presentation
Exachk Customer Presentation
 
20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database 20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database
 
What's new in oracle trace file analyzer 18.2.0
What's new in oracle trace file analyzer 18.2.0What's new in oracle trace file analyzer 18.2.0
What's new in oracle trace file analyzer 18.2.0
 
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...
AUSOUG - Introducing New AI Ops Innovations in Oracle 19c Autonomous Health F...
 
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
 

Similaire à Introduction to AutoML and Data Science using the Oracle Autonomous Database - LuxOUG Jun 2020

The Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamThe Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamSenturus
 
Introduction to Machine Learning and Data Science using the Autonomous databa...
Introduction to Machine Learning and Data Science using the Autonomous databa...Introduction to Machine Learning and Data Science using the Autonomous databa...
Introduction to Machine Learning and Data Science using the Autonomous databa...Sandesh Rao
 
Big Data Discovery
Big Data DiscoveryBig Data Discovery
Big Data DiscoveryHarald Erb
 
Neoaug 2013 critical success factors for data quality management-chain-sys-co...
Neoaug 2013 critical success factors for data quality management-chain-sys-co...Neoaug 2013 critical success factors for data quality management-chain-sys-co...
Neoaug 2013 critical success factors for data quality management-chain-sys-co...Chain Sys Corporation
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...Dr. Wilfred Lin (Ph.D.)
 
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeCognizant
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIDenodo
 
Oracle Stream Analytics - Developer Introduction
Oracle Stream Analytics - Developer IntroductionOracle Stream Analytics - Developer Introduction
Oracle Stream Analytics - Developer IntroductionJeffrey T. Pollock
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It? Caserta
 
oracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptxoracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptxAdityaDas899782
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...DATAVERSITY
 
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeThomas Kelly, PMP
 
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision MakingFast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision MakingCodemotion
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysNEWYORKSYS-IT SOLUTIONS
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessInformatica
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...Big Data Week
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseCaserta
 

Similaire à Introduction to AutoML and Data Science using the Oracle Autonomous Database - LuxOUG Jun 2020 (20)

The Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science TeamThe Data Lake: Empowering Your Data Science Team
The Data Lake: Empowering Your Data Science Team
 
Introduction to Machine Learning and Data Science using the Autonomous databa...
Introduction to Machine Learning and Data Science using the Autonomous databa...Introduction to Machine Learning and Data Science using the Autonomous databa...
Introduction to Machine Learning and Data Science using the Autonomous databa...
 
Big Data Discovery
Big Data DiscoveryBig Data Discovery
Big Data Discovery
 
Neoaug 2013 critical success factors for data quality management-chain-sys-co...
Neoaug 2013 critical success factors for data quality management-chain-sys-co...Neoaug 2013 critical success factors for data quality management-chain-sys-co...
Neoaug 2013 critical success factors for data quality management-chain-sys-co...
 
5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...5 big data at work linking discovery and bi to improve business outcomes from...
5 big data at work linking discovery and bi to improve business outcomes from...
 
Kaizentric Presentation
Kaizentric PresentationKaizentric Presentation
Kaizentric Presentation
 
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data Lake
 
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BIAugmentation, Collaboration, Governance: Defining the Future of Self-Service BI
Augmentation, Collaboration, Governance: Defining the Future of Self-Service BI
 
Oracle Stream Analytics - Developer Introduction
Oracle Stream Analytics - Developer IntroductionOracle Stream Analytics - Developer Introduction
Oracle Stream Analytics - Developer Introduction
 
A6 big data_in_the_cloud
A6 big data_in_the_cloudA6 big data_in_the_cloud
A6 big data_in_the_cloud
 
What Data Do You Have and Where is It?
What Data Do You Have and Where is It? What Data Do You Have and Where is It?
What Data Do You Have and Where is It?
 
Data Science and Analytics
Data Science and Analytics Data Science and Analytics
Data Science and Analytics
 
oracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptxoracleadvancedanalyticsv2otn-2859525.pptx
oracleadvancedanalyticsv2otn-2859525.pptx
 
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
ADV Slides: What the Aspiring or New Data Scientist Needs to Know About the E...
 
Semantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data LakeSemantic 'Radar' Steers Users to Insights in the Data Lake
Semantic 'Radar' Steers Users to Insights in the Data Lake
 
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision MakingFast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
Fast Data Mining: Real Time Knowledge Discovery for Predictive Decision Making
 
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ NewyorksysWhat is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
What is OLAP -Data Warehouse Concepts - IT Online Training @ Newyorksys
 
Why an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business SuccessWhy an AI-Powered Data Catalog Tool is Critical to Business Success
Why an AI-Powered Data Catalog Tool is Critical to Business Success
 
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
BDW Chicago 2016 - Ramu Kalvakuntla, Sr. Principal - Technical - Big Data Pra...
 
Big Data's Impact on the Enterprise
Big Data's Impact on the EnterpriseBig Data's Impact on the Enterprise
Big Data's Impact on the Enterprise
 

Plus de Sandesh Rao

Whats new in Autonomous Database in 2022
Whats new in Autonomous Database in 2022Whats new in Autonomous Database in 2022
Whats new in Autonomous Database in 2022Sandesh Rao
 
Oracle Database performance tuning using oratop
Oracle Database performance tuning using oratopOracle Database performance tuning using oratop
Oracle Database performance tuning using oratopSandesh Rao
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022Sandesh Rao
 
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
Analysis of Database Issues using AHF and Machine Learning v2 -  SOUGAnalysis of Database Issues using AHF and Machine Learning v2 -  SOUG
Analysis of Database Issues using AHF and Machine Learning v2 - SOUGSandesh Rao
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021Sandesh Rao
 
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
15 Troubleshooting tips and Tricks for Database 21c - KSAOUGSandesh Rao
 
How to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsHow to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsSandesh Rao
 
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUGSandesh Rao
 
TFA Collector - what can one do with it
TFA Collector - what can one do with it TFA Collector - what can one do with it
TFA Collector - what can one do with it Sandesh Rao
 
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmeaIntroduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmeaSandesh Rao
 
Troubleshooting tips and tricks for Oracle Database Oct 2020
Troubleshooting tips and tricks for Oracle Database Oct 2020Troubleshooting tips and tricks for Oracle Database Oct 2020
Troubleshooting tips and tricks for Oracle Database Oct 2020Sandesh Rao
 
TFA, ORAchk and EXAchk 20.2 - What's new
TFA, ORAchk and EXAchk 20.2 - What's new TFA, ORAchk and EXAchk 20.2 - What's new
TFA, ORAchk and EXAchk 20.2 - What's new Sandesh Rao
 
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...Sandesh Rao
 
The Machine Learning behind the Autonomous Database ILOUG Feb 2020
The Machine Learning behind the Autonomous Database   ILOUG Feb 2020 The Machine Learning behind the Autonomous Database   ILOUG Feb 2020
The Machine Learning behind the Autonomous Database ILOUG Feb 2020 Sandesh Rao
 
Troubleshooting Tips and Tricks for Database 19c ILOUG Feb 2020
Troubleshooting Tips and Tricks for Database 19c   ILOUG Feb 2020Troubleshooting Tips and Tricks for Database 19c   ILOUG Feb 2020
Troubleshooting Tips and Tricks for Database 19c ILOUG Feb 2020Sandesh Rao
 
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019Sandesh Rao
 
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Sandesh Rao
 
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour Oct 2019
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour  Oct 2019Troubleshooting Tips and Tricks for Database 19c - EMEA Tour  Oct 2019
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour Oct 2019Sandesh Rao
 
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...Sandesh Rao
 
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...Sandesh Rao
 

Plus de Sandesh Rao (20)

Whats new in Autonomous Database in 2022
Whats new in Autonomous Database in 2022Whats new in Autonomous Database in 2022
Whats new in Autonomous Database in 2022
 
Oracle Database performance tuning using oratop
Oracle Database performance tuning using oratopOracle Database performance tuning using oratop
Oracle Database performance tuning using oratop
 
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022Analysis of Database Issues using AHF and Machine Learning v2 -  AOUG2022
Analysis of Database Issues using AHF and Machine Learning v2 - AOUG2022
 
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
Analysis of Database Issues using AHF and Machine Learning v2 -  SOUGAnalysis of Database Issues using AHF and Machine Learning v2 -  SOUG
Analysis of Database Issues using AHF and Machine Learning v2 - SOUG
 
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
AutoML - Heralding a New Era of Machine Learning - CASOUG Oct 2021
 
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
15 Troubleshooting tips and Tricks for Database 21c - KSAOUG
 
How to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata EnvironmentsHow to Use EXAchk Effectively to Manage Exadata Environments
How to Use EXAchk Effectively to Manage Exadata Environments
 
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG
15 Troubleshooting Tips and Tricks for database 21c - OGBEMEA KSAOUG
 
TFA Collector - what can one do with it
TFA Collector - what can one do with it TFA Collector - what can one do with it
TFA Collector - what can one do with it
 
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmeaIntroduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
Introduction to Machine learning - DBA's to data scientists - Oct 2020 - OGBEmea
 
Troubleshooting tips and tricks for Oracle Database Oct 2020
Troubleshooting tips and tricks for Oracle Database Oct 2020Troubleshooting tips and tricks for Oracle Database Oct 2020
Troubleshooting tips and tricks for Oracle Database Oct 2020
 
TFA, ORAchk and EXAchk 20.2 - What's new
TFA, ORAchk and EXAchk 20.2 - What's new TFA, ORAchk and EXAchk 20.2 - What's new
TFA, ORAchk and EXAchk 20.2 - What's new
 
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...
Oracle Autonomous Health Service- For Protecting Your On-Premise Databases- F...
 
The Machine Learning behind the Autonomous Database ILOUG Feb 2020
The Machine Learning behind the Autonomous Database   ILOUG Feb 2020 The Machine Learning behind the Autonomous Database   ILOUG Feb 2020
The Machine Learning behind the Autonomous Database ILOUG Feb 2020
 
Troubleshooting Tips and Tricks for Database 19c ILOUG Feb 2020
Troubleshooting Tips and Tricks for Database 19c   ILOUG Feb 2020Troubleshooting Tips and Tricks for Database 19c   ILOUG Feb 2020
Troubleshooting Tips and Tricks for Database 19c ILOUG Feb 2020
 
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
Troubleshooting Tips and Tricks for Database 19c - Sangam 2019
 
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
Oracle Real Application Clusters 19c- Best Practices and Internals- EMEA Tour...
 
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour Oct 2019
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour  Oct 2019Troubleshooting Tips and Tricks for Database 19c - EMEA Tour  Oct 2019
Troubleshooting Tips and Tricks for Database 19c - EMEA Tour Oct 2019
 
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...
LAD -GroundBreakers-Jul 2019 - Introduction to Machine Learning - From DBA's ...
 
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...
LAD - GroundBreakers - Jul 2019 - Using Oracle Autonomous Health Framework to...
 

Dernier

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .Alan Dix
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DaySri Ambati
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024Lorenzo Miniero
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Scott Keck-Warren
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsPixlogix Infotech
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Enterprise Knowledge
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxNavinnSomaal
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piececharlottematthew16
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteDianaGray10
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionDilum Bandara
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024BookNet Canada
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyAlfredo García Lavilla
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxhariprasad279825
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek SchlawackFwdays
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity PlanDatabarracks
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsMark Billinghurst
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.Curtis Poe
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostZilliz
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024Stephanie Beckett
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLScyllaDB
 

Dernier (20)

From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo DayH2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
H2O.ai CEO/Founder: Sri Ambati Keynote at Wells Fargo Day
 
SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024SIP trunking in Janus @ Kamailio World 2024
SIP trunking in Janus @ Kamailio World 2024
 
Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024Advanced Test Driven-Development @ php[tek] 2024
Advanced Test Driven-Development @ php[tek] 2024
 
The Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and ConsThe Ultimate Guide to Choosing WordPress Pros and Cons
The Ultimate Guide to Choosing WordPress Pros and Cons
 
Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024Designing IA for AI - Information Architecture Conference 2024
Designing IA for AI - Information Architecture Conference 2024
 
SAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptxSAP Build Work Zone - Overview L2-L3.pptx
SAP Build Work Zone - Overview L2-L3.pptx
 
Story boards and shot lists for my a level piece
Story boards and shot lists for my a level pieceStory boards and shot lists for my a level piece
Story boards and shot lists for my a level piece
 
Take control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test SuiteTake control of your SAP testing with UiPath Test Suite
Take control of your SAP testing with UiPath Test Suite
 
Advanced Computer Architecture – An Introduction
Advanced Computer Architecture – An IntroductionAdvanced Computer Architecture – An Introduction
Advanced Computer Architecture – An Introduction
 
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
New from BookNet Canada for 2024: BNC CataList - Tech Forum 2024
 
Commit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easyCommit 2024 - Secret Management made easy
Commit 2024 - Secret Management made easy
 
Artificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptxArtificial intelligence in cctv survelliance.pptx
Artificial intelligence in cctv survelliance.pptx
 
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
"Subclassing and Composition – A Pythonic Tour of Trade-Offs", Hynek Schlawack
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
Human Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR SystemsHuman Factors of XR: Using Human Factors to Design XR Systems
Human Factors of XR: Using Human Factors to Design XR Systems
 
How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.How AI, OpenAI, and ChatGPT impact business and software.
How AI, OpenAI, and ChatGPT impact business and software.
 
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage CostLeverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
Leverage Zilliz Serverless - Up to 50X Saving for Your Vector Storage Cost
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Developer Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQLDeveloper Data Modeling Mistakes: From Postgres to NoSQL
Developer Data Modeling Mistakes: From Postgres to NoSQL
 

Introduction to AutoML and Data Science using the Oracle Autonomous Database - LuxOUG Jun 2020

  • 1. VP AIOps for the Autonomous Database Sandesh Rao Introduction to AutoML and Data Science using the Oracle Autonomous Database @sandeshr https://www.linkedin.com/in/raosandesh/ https://www.slideshare.net/SandeshRao4
  • 2. The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation. Statements in this presentation relating to Oracle’s future plans, expectations, beliefs, intentions and prospects are “forward-looking statements” and are subject to material risks and uncertainties. A detailed discussion of these factors and other risks that affect our business is contained in Oracle’s Securities and Exchange Commission (SEC) filings, including our most recent reports on Form 10-K and Form 10-Q under the heading “Risk Factors.” These filings are available on the SEC’s website or on Oracle’s website at http://www.oracle.com/investor. All information in this presentation is current as of September 2019 and Oracle undertakes no duty to update any statement in light of new information or future events. Safe harbor statement
  • 3. 1. Overview of ML and the Autonomous Database 2. Journey of the DBA to Data Scientist 3. OML Examples 4. AutoML and what’s coming 5. Questions Agenda
  • 4. Tasks Specific to Business and Innovation • Architecture, planning, data modeling • Data security and lifecycle management • Application related tuning • End-to-End service level management Maintenance Tasks • Configuration and tuning of systems, network, storage • Database provisioning, patching • Database backups, H/A, disaster recovery • Database optimization Traditionally DBAs are Responsible for: Value Scale Innovation Maintenance
  • 5. Tasks Specific to Business and Innovation • Architecture, planning, data modeling • Data security and lifecycle management • Application related tuning • End-to-End service level management Maintenance Tasks • Configuration and tuning of systems, network, storage • Database provisioning, patching • Database backups, H/A, disaster recovery • Database optimization Freedom from Drudgery for DBA: More Time to Innovate and Improve the Business Autonomous Database Removes Generic Tasks Value Scale Innovation Maintenance
  • 6. Machine Learning Solving data-driven problems Discovering insights Making predictions Data Security Data classification, Data life-cycle mgmt Application Tuning SQL tuning, connection mgmt The Evolution of the DBA/Database Developer Role Data Engineer Architecture, “data wrangler”
  • 7. Data extraction Data wrangling Deriving new attributes (“feature engineering”) … … … Import predictions & insights Translate and deploy ML models Automate You Are Probably Already Doing Most of This Work! Database Developer to Data Scientist Journey 1 - https://www.infoworld.com/article/3228245/data-science/the-80-20-data-science-dilemma.html Typically 80% of the work Most data scientists spend only 20 percent of their time on actual data analysis and 80 percent of their time finding, cleaning, and reorganizing huge amounts of data, which is an inefficient data strategy1 Eliminated or minimized with Oracle Data Management platform becomes combine/hybrid DM + machine learning platform
  • 8. Analytics Value vs. Maturity Reports & Dashboards Data Information Predictions & Insights Appls with ML Analytical Maturity ValueofAnalytics Diagnostic Analysis & Reports Predictive / Machine Learning “ML Enabled” Applications What Happened? Why it Happened? What WILL happen? Automated ML Appls
  • 9. Algorithms automatically sift through large amounts of data to discover hidden patterns, new insights and make predictions What is Machine Learning? Identify most important factor (Attribute Importance) Predict customer behavior (Classification) Find profiles of targeted people or items (Classification Predict or estimate a value (Regression) Segment a population (Clustering) Find fraudulent or “rare events” (Anomaly Detection) Determine co-occurring items in a “basket” (Associations)X1 X2 A1A2A3A4 A5A6 A7 SupervisedLearningUnsupervisedLearning Copyright © 2020 Oracle and/or its affiliates.
  • 10. CRISP-DM Methodology Six Major Steps 10 https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_miningCopyright © 2020 Oracle and/or its affiliates. DATA UNDERSTANDING Assemble the “right data” Data profiling • Data visualization • Univariate statistics/group by • Bi-variate statistics DATA PREPARATION Sampling/Stratified Algorithm req’d transforms • Auto Data Preparation • Missing Values, Binning, Normalization, etc. • Unstructured data • Aggregations Domain specific transforms • “Engineered Features” Features Selection MODELING Algorithm settings/defaults • Stratified sampling • Feature selection • Build model(s) EVALUATION Model evaluation Model comparison Model selection DEPLOYMENT In-DB ML model apply • Real-time ML apply • In-database, REST Embed methodology • Applications • Dashboards BUSINESS UNDERSTANDING Well-defined business problem * Automated and/or system defaults
  • 11. Oracle Machine Learning Oracle Machine Learning extends Oracle Database(s) and enables users to build “AI” applications and analytics dashboards OML delivers powerful in-database machine learning algorithms, automated ML functionality via SQL APIs and integration with open source Python* and R. Oracle Machine Learning OML Services* Model Deployment and Management, Cognitive Image and Text OML4SQL SQL API OML4Py* Python API OML4R R API OML Notebooks with Apache Zeppelin on Autonomous Database OML4Spark R API on Big Data Oracle Data Miner Oracle SQL Developer extension * Coming soonCopyright © 2020 Oracle and/or its affiliates.
  • 12. Database Developer to Data Scientist Journey
  • 13. Database Developer to Data Scientist Journey • Business Understanding—Week 1 • Data Understanding—Week 2 • Data Preparation—Week 3 • Modeling (ML)—Week 4 • Evaluation—Week 5 • Deployment—Week 6 Six Major Steps (Oracle Machine Learning POV) 13 https://en.wikipedia.org/wiki/Cross-industry_standard_process_for_data_mining Copyright © 2020 Oracle and/or its affiliates.
  • 14. Poorly Defined Better Data Mining Technique Predict employees that leave • Based on past employees that voluntarily left: • Create New Attribute EmplTurnover à O/1 Predict customers that churn • Based on past customers that have churned: • Create New Attribute Churn à YES/NO Target “best” customers • Recency, Frequency Monetary (RFM) Analysis • Specific Dollar Amount over Time Window: • Who has spent $500+ in most recent 18 months How can I make more $$? • What helps me sell soft drinks & coffee? Which customers are likely to buy? • How much is each customer likely to spend? Who are my “best customers”? • What descriptive “rules” describe “best customers”? How can I combat fraud? • Which transactions are the most anomalous? • Then roll-up to physician, claimant, employee… Week 1—Business Understanding Start with a Well-Defined Business Problem Statement
  • 15. Week 1—Business Understanding Target “best” customers who have GOOD CREDIT and make payments 15 Be Extremely Specific in your Problem Statement Copyright © 2020 Oracle and/or its affiliates.
  • 16. Week 2—Data Understanding 16 Review the Data; Does it Makes Sense? Are AGEs all positive, 0-120? Are INCOME values weekly or monthly? Are the LOAN_AMOUNTS reasonable? Etc…. Copyright © 2020 Oracle and/or its affiliates.
  • 17. Week 2—Data Understanding 17 Review the Data; Does it Makes Sense? Copyright © 2019 Oracle and/or its affiliates. Simple, exploratory graphs to understand the data Copyright © 2020 Oracle and/or its affiliates.
  • 18. Week 2—Data Understanding 18 Review the Data; Does it Makes Sense? Are AGEs all positive, 0-120? Are INCOME values weekly or monthly? Are the LOAN_AMOUNTS reasonable? Etc…. Copyright © 2020 Oracle and/or its affiliates.
  • 19. Week 3—Data Preparation Prepare the Data, Create New Derived Attributes or “Engineered Features” Source Attribute New Attribute/”Engineered Feature” Date of Birth AGE Address DISTANCE_TO_DESTINATION COMMUTE_TIME Call detail records (CDRs) #_DROPPED_CALLS PERCENT_INTERNATIONAL Salary PERCENT_VS_PEERS Purchases TOTALS_PER_CATEGORY (e.g. Food, Clothing) Copyright © 2020 Oracle and/or its affiliates.
  • 20. Week 3—Data Preparation Oracle Machine Learning’s Auto Data Prep (ADP) and ML algorithms are designed with intelligent defaults and can automatically deal with: – Missing values – Outliers – Binning – Too many distinct values – Too many constants – Trans data/aggregations – Unstructured data – Correlated data 20 Prepare the Data, Create New Derived Attributes or “Engineered Features” Copyright © 2020 Oracle and/or its affiliates.
  • 21. Week 4—Modeling (Machine Learning) 21 First, Identify the Key Attributes That Most Influence the Target Attribute Copyright © 2020 Oracle and/or its affiliates.
  • 22. Week 4—Modeling (Machine Learning) 22 Training and Testing ML Models using 60/40% Random Samples Historical DataTrain Test Build Model Test Model Evaluate ModelTrain ModelHistorical Data Copyright © 2020 Oracle and/or its affiliates.
  • 23. Week 4—Modeling (Machine Learning) 23 Build multiple models with different algorithms and settings Copyright © 2020 Oracle and/or its affiliates.
  • 24. Week 5—Model Evaluation (ML) Randomly selected “hold out” sample of data that was used to train the ML model Compute Cumulative Gains, Lift, Accuracy, etc. Review the attributes used in the model and model coefficients Make sure the model makes sense 24 Next, test model accuracy Copyright © 2020 Oracle and/or its affiliates. Model Evaluation
  • 25. Week 6—Deployment Simple SQL Apply scripts run 100% inside the Database for immediate ML model deployment 25 Apply the Models to Predict “Best Customers” Model Apply/”Scoring” Copyright © 2020 Oracle and/or its affiliates.
  • 26. Week 6—Deployment Simple SQL Apply scripts run 100% inside the Database for model build, model apply and immediate ML model deployment 26 Apply the Models to Predict “Best Customers” Copyright © 2020 Oracle and/or its affiliates. Model Build Model Apply Results
  • 29. Simple SQL Syntax—Statistical Comparisons (t-tests) Compare AVE Purchase Amounts Men vs. Women Grouped_By INCOME_LEVEL Statistical Functions SELECT SUBSTR(cust_income_level, 1, 22) income_level, AVG(DECODE(cust_gender, 'M', amount_sold, null)) sold_to_men, AVG(DECODE(cust_gender, 'F', amount_sold, null)) sold_to_women, STATS_T_TEST_INDEPU(cust_gender, amount_sold, 'STATISTIC', 'F') t_observed, STATS_T_TEST_INDEPU(cust_gender, amount_sold) two_sided_p_value FROM customers c, sales s WHERE c.cust_id = s.cust_id GROUP BY ROLLUP(cust_income_level) ORDER BY income_level, sold_to_men, sold_to_women, t_observed; STATS_T_TEST_INDEPU (SQL) Example; P_Values < 05 show statistically significantly differences in the amounts purchased by men vs. women
  • 30. Simple SQL Syntax—Attribute Importance - ML Model Build (PL/SQL) OAA Model Build and Real-time SQL Apply Prediction BEGIN DBMS_DATA_MINING.CREATE_MODEL( model_name => 'BUY_INSURANCE_AI', mining_function => DBMS_DATA_MINING.ATTRIBUTE_IMPORTANCE, data_table_name => 'CUST_INSUR_LTV', case_id_column_name => 'cust_id', target_column_name => 'BUY_INSURANCE', settings_table_name => 'Att_Import_Mode_Settings'); END; / SELECT attribute_name, rank , attribute_value FROM BUY_INSURANCE_AI ORDER BY rank, attribute_name; Model Results (SQL query) ATTRIBUTE_NAME RANK ATTRIBUTE_VALUE BANK_FUNDS 1 0.2161 MONEY_MONTLY_OVERDRAWN 2 0.1489 N_TRANS_ATM 3 0.1463 N_TRANS_TELLER 4 0.1156 T_AMOUNT_AUTOM_PAYMENTS 5 0.1095 A1A2A3A4 A5A6 A7
  • 31. OML for R Model Build > ore.odmAI (BUY_INSURANCE ~ ., CUST_INSUR_LTV) Call: ore.odmAI(formula = BUY_INSURANCE ~ ., data = CUST_INSUR_LTV) Simple R Language Syntax—Attribute Importance ML Model Build (R) Model Results (R) Importance: importance rank BANK_FUNDS 0.2161187797 1 MONEY_MONTLY_OVERDRAWN 0.1489347141 2 N_TRANS_ATM 0.1463026512 3 N_TRANS_TELLER 0.1155879786 4 T_AMOUNT_AUTOM_PAYMENTS 0.1095178647 5 A1A2A3A4 A5A6 A7 Copyright © 2020 Oracle and/or its affiliates.
  • 32. OML for Python Model Build—Coming soon! > ai_mod = ai(**setting) # Create AI model object > ai_mod = ai_mod.fit(train_x, train_y) Simple Python Language Syntax—Attribute Importance ML Model Build (Python) Model Results (Python) Importance: variable importance rank BANK_FUNDS 0.2161187797 1 MONEY_MONTLY_OVERDRAWN 0.1489347141 2 N_TRANS_ATM 0.1463026512 3 N_TRANS_TELLER 0.1155879786 4 T_AMOUNT_AUTOM_PAYMENTS 0.1095178647 5 A1A2A3A4 A5A6 A7 Copyright © 2020 Oracle and/or its affiliates.
  • 33. Oracle Machine Learning Key Features: • Collaborative UI for data scientist and analysts • Packaged with Autonomous Databases • Quick start Example notebooks • Easy access to shared notebooks, templates, permissions, scheduler, etc. • OML4SQL • OML4Py coming soon • Supports deployment of OML models Machine Learning Notebooks included in Autonomous Databases Copyright © 2020 Oracle and/or its affiliates.
  • 34. Oracle Machine Learning Key Features: • Collaborative UI for data scientist and analysts • Packaged with Autonomous Databases • Quick start Example notebooks • Easy access to shared notebooks, templates, permissions, scheduler, etc. • OML4SQL • OML4Py coming soon • Supports deployment of OML models Machine Learning Notebooks included in Autonomous Databases Copyright © 2020 Oracle and/or its affiliates.
  • 35. Oracle Machine Learning for R / Python Transparency layer ‐ Leverage proxy objects so data remain in database ‐ Overload native functions translating functionality to SQL ‐ Use familiar R/Python syntax to manipulate database data Parallel, distributed algorithms ‐ Scalability and performance ‐ Exposes in-database algorithms available from OML4SQL Embedded execution ‐ Manage and invoke R or Python scripts in Oracle Database ‐ Data-parallel, task-parallel, and non-parallel execution ‐ Use open source packages to augment functionality OML4Py, Automated Machine Learning - AutoML ‐ Feature selection, model selection, hyper-parameter tuning Multiple Components/APIs of Oracle Machine Learning Database Server Client SQL Interfaces SQL*Plus SQLDeveloper OML4Py OML4R Copyright © 2020 Oracle and/or its affiliates. * Coming soon
  • 36. Multiple Languages UIs Supported for End Users & Apps Development Oracle Machine Leaning Application DevelopersDBAs R & Python Data Scientists “Citizen” Data ScientistsNotebook Users & DS Teams New! New!
  • 37. Coming Soon! | AutoML – new with OML4Py Auto Feature Selection – Reduce # of features by identifying most predictive – Improve performance and accuracy Increase data scientist productivity – reduce overall compute time Auto Algorithm Selection Much faster than exhaustive search Auto Feature Selection De-noise data and reduce # of features AutoTune Significant accuracy improvement Auto Algorithm Selection – Identify in-database algorithm that achieves highest model quality – Find best algorithm faster than with exhaustive search Auto Tune Hyperparameters – Significantly improve model accuracy – Avoid manual or exhaustive search techniques Copyright © 2020 Oracle and/or its affiliates. Enables non-expert users to leverage Machine Learning Data Table ML Model
  • 38. Coming Soon! | OML AutoML User Interface Automate production and deployment of ML models • Enhance Data Scientist productivity and user-experience • Enable non-expert users to leverage ML • Unify model deployment and monitoring • Support model management Features • Minimal user input: data, target • Model leaderboard • Model deployment via REST • Model monitoring • Cognitive features for image and text “Code-free” user interface supporting automated end-to-end machine learning Copyright © 2020 Oracle and/or its affiliates.
  • 39. Coming Soon! | OML AutoML User Interface Automate production and deployment of ML models • Enhance Data Scientist productivity and user-experience • Enable non-expert users to leverage ML • Unify model deployment and monitoring • Support model management Features • Minimal user input: data, target • Model leaderboard • Model deployment via REST • Model monitoring • Cognitive features for image and text “Code-free” user interface supporting automated end-to-end machine learning Copyright © 2020 Oracle and/or its affiliates.
  • 40. Coming Soon! | Algorithms for Database 20c Gradient Boosted Trees (XGBoost) • Highly popular and powerful algorithm – Kaggle winners • Classification, regression, ranking, survival analysis MSET-SPRT • Multivariate State Estimation Technique - Sequential Probability Ratio Test (MSET-SPRT) • Nonlinear, nonparametric anomaly detection algorithm designed to monitor critical processes. • Detects subtle anomalies while also producing minimal false alarms. • Calibrates expected behavior from historical normal operational sequence of monitored signals. • Re-implemented and sped up in-DB and based on original Oracle Labs algorithm Two major new ML algorithms Copyright © 2020 Oracle and/or its affiliates.
  • 41. Oracle Data Miner UI Easy to use to define analytical methodologies that can be shared SQL Developer Extension Workflow API and generates SQL code for immediate deployment Drag and Drop, Workflows, Easy to Use UI for “Citizen Data Scientist” Copyright © 2020 Oracle and/or its affiliates.
  • 42.
  • 43. Manage and Analyze All Your Data Big Data SQL / R SQL / R / Python Object Store “Engineered Features” – Derived attributes that reflect domain knowledge—key to best models e.g.: • Counts • Totals • Changes over time Boil down the Data Lake Architecturally, lots of options and flexibility
  • 44. In-Database Machine Learning More Models Better Models Faster, More Secure Less Cost Ready to Deploy! No Need To Extract and Move Data Data stays in Database Zero time required. No production impact. Data Preparation and Transformation Accelerated with Automatic Data Prep No separate environment required. Much faster data prep. Data stays protected and secured. Data Mining and Model Building SQL, R, Python Oracle Data Miner UI OML Notebooks Oracle Data Miner and AutoML greatly speed model building. Less skill required. No coding. No Need to Transform Production Data Embedded Data Preparation No need for second production instance. Model Scoring Accelerated Via Exadata Database Machine Faster model validation Easy to repeat model building as often as needed
  • 45. • OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined • OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data. etc, Oracle’s Machine Learning & Adv. Analytics Algorithms CLASSIFICATION • Naïve Bayes • Logistic Regression (GLM) • Decision Tree • Random Forest • Neural Network • Support Vector Machine • Explicit Semantic Analysis CLUSTERING • Hierarchical K-Means • Hierarchical O-Cluster • Expectation Maximization (EM) ANOMALY DETECTION • One-Class SVM TIME SERIES • State of the art forecasting using Exponential Smoothing • Includes all popular models e.g. Holt-Winters with trends, seasons, irregularity, missing data REGRESSION • Linear Model • Generalized Linear Model • Support Vector Machine (SVM) • Stepwise Linear regression • Neural Network • LASSO * ATTRIBUTE IMPORTANCE • Minimum Description Length • Principal Comp Analysis (PCA) • Unsupervised Pair-wise KL Div • CUR decomposition for row & AI ASSOCIATION RULES • A priori/ market basket PREDICTIVE QUERIES • Predict, cluster, detect, features SQL ANALYTICS • SQL Windows, SQL Patterns, SQL Aggregates FEATURE EXTRACTION • Principal Comp Analysis (PCA) • Non-negative Matrix Factorization • Singular Value Decomposition (SVD) • Explicit Semantic Analysis (ESA) TEXT MINING SUPPORT • Algorithms support text • Tokenization and theme extraction • Explicit Semantic Analysis (ESA) for document similarity STATISTICAL FUNCTIONS • Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc. R PACKAGES • CRAN R Algorithm Packages through Embedded R Execution • Spark MLlib algorithm integration EXPORTABLE ML MODELS • REST APIs for deployment X 1 X 2 A 1 A 2 A 3 A 4 A 5 A 6 A 7
  • 46. ANALYTICAL SQL • SQL Windows • SQL Aggregate functions • LAG/LEAD functions • SQL for Pattern Matching • Additional approximate query processing: APPROX_COUNT , APPROX_SUM, APPROX_RANK • Regular Expressions • Linear regression • ANOVA (Analysis of variance) • Test Distribution fit (e.g. Normal distribution test, Binomial test, Weibull test, Uniform test, Exponential test, Poisson test, etc.) • Statistical Aggregates (min, max, mean, median, stdev, mode, quantiles, plus x sigma, minus x sigma, top n outliers, bottom n outliers) STATISTICAL FUNCTIONS • Descriptive statistics (e.g. median, stdev, mode, sum, etc.) • Hypothesis testing (t-test, F-test, Kolmogorov- Smirnov test, Mann Whitney test, Wilcoxon Signed Ranks test • Correlations analysis (parametric and nonparametric e.g. Pearson’s test for correlation, Spearman's rho coefficient, Kendall's tau-b correlation coefficient) • Ranking functions • Cross Tabulations with Chi-square statistics| Oracle’s Machine Learning & Adv. Analytics Algorithms
  • 47. Algorithms Operate on Data ML and AI are just “Algorithms” Move the Algorithms; Not the Data!; It Changes Everything!
  • 48. Thank You Any Questions ? Sandesh Rao VP AIOps for the Autonomous Database @sandeshr https://www.linkedin.com/in/raosandesh/ https://www.slideshare.net/SandeshRao4