Copyright © 2018, Oracle and/or its affiliates. All rights reserved. |
When Data meets AI
AICUG Meetup – Santa Clara, Oracle
Sandesh Rao
VP AIOps, Autonomous Database
Safe Harbor Statement
The following is intended to outline our general product direction. It is intended for
information purposes only, and may not be incorporated into any contract. It is not a
commitment to deliver any material, code, or functionality, and should not be relied upon
in making purchasing decisions. The development, release, timing, and pricing of any
features or functionality described for Oracle’s products may change and remains at the
sole discretion of Oracle Corporation.
whoami
• Real Application Clusters - HA
• DataGuard - DR
• Machine Learning - AIOps
• Enterprise Management
• Sharding
• Big Data
• Operational Management
• Home Automation
• Geek
@sandeshr
https://www.linkedin.com/in/raosandesh/
Agenda
• What motivated us to go into Machine Learning?
• Which algorithms, tools & technologies are used?
• Oracle & Machine Learning initiatives and tools
• cx_Oracle and OML4Py
• Questions and Open Talk
Why Machine Learning for us and why now?
• Lots of data generated as exhaust from systems
– Cloud, different formats and interfaces, frameworks
• Machine Learning has become accessible
– Anyone can be a Data Scientist
– Algorithms are accessible as libraries such as scikit-learn, Keras, TensorFlow
– Sandbox to get started as easy as a docker init
• Business use cases
• How to find value from the data, fewer guesses to make decisions
ML Project Workflow
• Set Business Objectives
• Gather, Prepare and Cleanse Data
• Model Data
– Feature Extraction, Test, Train, Optimizer
– Loss Function, effectiveness
– Framework and Library to use
• Apply the Model as an inference engine
– Decision making using the Model’s output
– Tune Model till outcome is closer to Business Objective
[Workflow diagram: Set Business Objectives → Understand Use Case → Create Pseudo Code → Synthetic Data Generation → Pick Tools and Frameworks → Train/Test Model → Deploy Model → Measure Results and Feedback]
Types of Machine Learning
Supervised Learning
Predict future outcomes with the help of
training data provided by human experts
Semi-Supervised Learning
Discover patterns within raw data and make
predictions, which are then reviewed by human
experts, who provide feedback which is used to
improve the model accuracy
Unsupervised Learning
Find patterns without any external input other
than the raw data
Reinforcement Learning
Take decisions based on past rewards for this
type of action
Machine Learning Algorithms
Clustering
• Hierarchical k-means, Orthogonal Partitioning Clustering, Expectation-Maximization
Feature Extraction / Attribute Importance / Component Analysis
Classification
• Decision Tree, Naive Bayes, Random Forest, Logistic Regression, Support Vector Machine
Regression
• Multiple Regression, Support Vector Machine, Linear Model, LASSO, Random Forest, Ridge Regression, Generalized Linear Model, Stepwise Linear Regression
Association & Collaborative Filtering
Reinforcement Learning - brute force, Monte Carlo, temporal difference, ...
Neural Network & Deep Learning with Deep Neural Network
• Many different use cases
Modeling Phase – AutoML to the rescue
• Provide dataset to AutoML2
• Configuration parameters for model picked
• Dataset is divided into training set & testing set
• Actual training
• Evaluate performance of trained model
• Tweak model parameters, change predictors, change test/train data splits and change algorithms
• Pick model plus parameters depending on outcome and measure: F1, Precision, Recall, MSE
• Document all runs and apply A/B testing to see what the variations produce
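The measures named in the evaluation step (Precision, Recall, F1, MSE) can be computed by hand, which may help when comparing runs. The sketch below is a plain-Python illustration; the function names and the sample labels are assumptions made for this example and are not part of any AutoML tool.

```python
def precision_recall_f1(y_true, y_pred, positive=1):
    """Return (precision, recall, F1) for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


def mean_squared_error(y_true, y_pred):
    """Average squared difference, used for regression models."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels and predictions, for illustration only
labels = [1, 0, 1, 1, 0, 0, 1, 0]
predictions = [1, 0, 0, 1, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(labels, predictions)
mse = mean_squared_error([3.0, 5.0, 2.0], [2.5, 5.0, 4.0])
```

Precision and recall pull in different directions, so F1 is a common single number to rank classification runs; MSE plays the same role for regression runs.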
Tools & Libraries Assisting ML projects
What is Oracle Doing Around Machine Learning?
• Big Data Appliance
• Big Data Discovery, Big Data Preparation, Data Visualization Cloud
• Analytics Cloud
– Sales, Marketing, HCM on top of SaaS
• DaaS – Oracle Data Cloud, Eloqua, ...
• Oracle Labs (labs.oracle.com)
– Machine Learning Research Group
• Autonomous Database
– Zeppelin Notebooks preloaded with use cases
– Applied Machine Learning used for Implementing AIOps
Oracle AI Platform Cloud Service – Coming Soon…
• Collaborative end-to-end machine learning in the cloud
• Enables data science teams to
– Organize their work
– Access data and computing resources
– Build, Train, Deploy
– Manage models
• Collaborative, Self-Service, Integrated
• https://cloud.oracle.com/en_US/ai-platform
Oracle Autonomous Data Warehouse Cloud Key Features
Highly Elastic
Independently scale compute and
storage, without having to overpay for
fixed blocks of resources
Built-in Web-Based SQL ML Tool
Apache Zeppelin Oracle Machine Learning
notebooks ready to run ML from browser
Database migration utility
Dedicated cloud-ready migration tools
for easy migration from Amazon
Redshift, SQL Server and other databases
Enterprise Grade Security
Data is encrypted by default in the cloud,
as well as in transit and at rest
High-Performance Queries
and Concurrent Workloads
Optimized query performance with
preconfigured resource profiles for different
types of users
Oracle SQL
Autonomous DW Cloud is compatible with
all business analytics tools that support
Oracle Database
Self Driving
Fully automated database that tunes, patches, and upgrades itself while the
system is running
Cloud-Based Data Loading
Fast, scalable data-loading from Oracle
Object Store, AWS S3, or on-premises
Oracle Machine Learning and Advanced Analytics
• Support multiple data platforms, analytical engines, languages, UIs and deployment strategies
Strategy and Road Map
• Data platforms: Big Data / Big Data Cloud, Relational
• ML Algorithms: common core, parallel, distributed
• Languages: SQL, R, Python, etc.
• GUI: Data Miner, RStudio, Notebooks
• Advanced Analytics
• Oracle Database Cloud, DWCS
CLASSIFICATION
– Naïve Bayes
– Logistic Regression (GLM)
– Decision Tree
– Random Forest
– Neural Network
– Support Vector Machine
– Explicit Semantic Analysis
CLUSTERING
– Hierarchical K-Means
– Hierarchical O-Cluster
– Expectation Maximization (EM)
ANOMALY DETECTION
– One-Class SVM
TIME SERIES
– Holt-Winters, Regular & Irregular,
with and w/o trends & seasonal
– Single, Double Exp Smoothing
REGRESSION
– Linear Model
– Generalized Linear Model
– Support Vector Machine (SVM)
– Stepwise Linear regression
– Neural Network
– LASSO
ATTRIBUTE IMPORTANCE
– Minimum Description Length
– Principal Comp Analysis (PCA)
– Unsupervised Pair-wise KL Div
– CUR decomposition for row & AI
ASSOCIATION RULES
– A priori/ market basket
PREDICTIVE QUERIES
– Predict, cluster, detect, features
SQL ANALYTICS
– SQL Windows, SQL Patterns,
SQL Aggregates
• OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined
• OAA includes support for partitioned models and for transactional, unstructured, geo-spatial, and graph data, etc.
Oracle’s Machine Learning & Adv. Analytics Algorithms
FEATURE EXTRACTION
– Principal Comp Analysis (PCA)
– Non-negative Matrix Factorization
– Singular Value Decomposition (SVD)
– Explicit Semantic Analysis (ESA)
TEXT MINING SUPPORT
– Algorithms support text type
– Tokenization and theme extraction
– Explicit Semantic Analysis (ESA) for
document similarity
STATISTICAL FUNCTIONS
– Basic statistics: min, max,
median, stdev, t-test, F-test,
Pearson’s, Chi-Sq, ANOVA, etc.
R PACKAGES
– CRAN R Algorithm Packages
through Embedded R Execution
– Spark MLlib algorithm integration
EXPORTABLE ML MODELS
– C and Java code for deployment
Oracle Machine Learning
Key Features
• Collaborative UI for data scientists
– Packaged with Autonomous Data
Warehouse Cloud (V1)
– Easy access to shared notebooks,
templates, permissions, scheduler, etc.
– SQL ML algorithms API (V1)
– Supports deployment of ML analytics
Machine Learning Notebook for Autonomous Data Warehouse Cloud
Oracle Machine Learning UI in ADW
AI and ML with Python and Oracle
1 What is Python
2 Oracle’s Advanced Analytics
3 cx_Oracle Package
4 Oracle Machine Learning for Python
What is Python?
• An interpreted, object-oriented, high level, general purpose
programming language
• Designed for rapid application development and scripting to connect
existing components
• Open source scripting language and environment
https://www.python.org
• Created in the late 1980s
• World-wide usage
– Widely taught in Universities
– Many Data Scientists know and use Python
• Thousands of open source packages to enhance productivity
Popularity by question views for major PLs
Growth of major programming languages using Stack Overflow question views
https://insights.stackoverflow.com/trends?tags=python%2Cjavascript%2Cjava%2Cc%23%2Cphp%2Cc%2B%2B&utm_source=so-owned&utm_medium=blog&utm_campaign=gen-blog&utm_content=blog-link&utm_term=incredible-growth-python
Why use Python?
• Simple language
• Small
• Resembles English
• Strict punctuation rules
• Uniform code formatting
• Many third-party libraries
• PyPI – 168203 projects (https://pypi.python.org/pypi)
• Wide-spread user groups
• Heavily used for websites
• Increasingly used by data scientists
Python IDEs
Oracle’s Advanced Analytics
Multiple interfaces across platforms: SQL, R, Python*, GUI, Dashboards, Apps
• Users: R & Python programmers, Data / Business Analysts, Business Analysts/Mgrs, Domain End Users
• Interfaces: R & Python* Clients, SQL Developer / Oracle Data Miner, OAC/OBIEE/ODV, Applications
• Platform: Oracle Database Enterprise Edition, Oracle Cloud, Hadoop
• Oracle Advanced Analytics - Database Option: SQL, R & Python* integration for scalable, distributed, parallel in-database ML execution
• Oracle R Advanced Analytics for Hadoop: Big Data Connectors; parallel, distributed Spark-based algorithms
* Not yet released
Oracle Advanced Analytics differentiators
Work directly with data in Database and Hadoop
• Eliminate need to request extracts from IT/DBA – immediate access to database and Hadoop data
• Process data where they reside – minimize or eliminate data movement
Scalability and Performance
• Use parallel, distributed algorithms that scale to big data on Oracle Database and Hadoop platforms
• Leverage powerful engineered systems to build models on billions of rows of data or
millions of models in parallel
Ease of deployment
• Using Oracle Database, place Python, R, and SQL scripts immediately in production (no need to recode)
• Use production quality infrastructure without custom plumbing or extra complexity
Process support
• Maintain and ensure data security, backup, and recovery using existing processes
• Store, access, manage, and track analytics objects (models, scripts, workflows, data) in Oracle Database
Oracle’s Python Technologies
Supporting Oracle Database
• cx_Oracle package
• Oracle Machine Learning for Python
Component of the Oracle Advanced Analytics option to Oracle Database
cx_Oracle
• Python package enabling scalable and performant connectivity to Oracle Database
– Open source, publicly available on PyPI, OTN, and github
– Oracle is maintainer
• Oracle Database Interface for Python conforming to Python DB API 2.0 specification
– Optimized driver based on OCI
– Execute SQL statements from Python
– Enables transactional behavior for insert, update, and delete
Oracle Database
cx_Oracle
cx_Oracle - Requirements
• Easily installed from PyPI
• Support for Python 2 and 3
• Support for Oracle Client 11.2, 12.1, 12.2, 18
– Oracle's standard cross-version interoperability, allows easy upgrades and
connectivity to different Oracle Database versions
• Connect to Oracle Database 9.2, 10, 11, 12, 18
– (Depending on the Oracle Client version used)
• SQL and PL/SQL Execution
– Underlying Oracle Client libraries have optimizations: compressed fetch, pre-fetching,
client and server result set caching, and statement caching with auto-tuning
cx_Oracle Example
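import cx_Oracle
# Connect with a single "user/password@host/service" string
con = cx_Oracle.connect('pythonhol/welcome@127.0.0.1/orcl')
print(con.version)
con.close()
# Connect to a pooled (DRCP) server, passing a connection class and purity
con = cx_Oracle.connect('pythonhol', 'welcome', '127.0.0.1:/orcl:pooled',
                        cclass = "HOL", purity = cx_Oracle.ATTR_PURITY_SELF)
con.close()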
cx_Oracle Example
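cur = con.cursor() # opens cursor for statements to use
cur.execute('select * from departments order by department_id')
# Option 1: iterate over the cursor to print all rows
for result in cur:
    print(result)
# Option 2 (instead of option 1): fetch one row at a time as a tuple
row = cur.fetchone() # returns a single row and advances to the next
print(row)
row = cur.fetchone()
print(row)
# Option 3: fetch a batch of rows as a list of tuples
res = cur.fetchmany(numRows=3)
print(res)
# Each fetch continues where the previous one stopped, so re-execute
# the query before switching from one option to another
cur.close()
con.close()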
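Because cx_Oracle conforms to the Python DB API 2.0, the fetch behaviour above can be tried without an Oracle Database. The sketch below uses the standard-library sqlite3 module purely as a stand-in driver; the in-memory table and its rows are made up for illustration, and details such as the numRows keyword are specific to cx_Oracle.

```python
import sqlite3

# sqlite3 is only a stand-in for any DB API 2.0 driver here
con = sqlite3.connect(":memory:")
con.execute("create table departments (department_id integer, department_name text)")
con.executemany(
    "insert into departments values (?, ?)",
    [(10, "Administration"), (20, "Marketing"), (30, "Purchasing"),
     (40, "Human Resources"), (50, "Shipping")],
)

cur = con.cursor()
cur.execute("select * from departments order by department_id")
first = cur.fetchone()        # one row as a tuple; the cursor advances
next_two = cur.fetchmany(2)   # the following two rows as a list of tuples
rest = list(cur)              # iterating drains whatever is left
after_end = cur.fetchone()    # nothing left, so this is None
cur.close()
con.close()
```

The point of the sketch is that all fetch styles share one position in the result set, which is why mixing them without re-executing the query yields fewer rows than expected.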
Data Types (1)
Full listing for cx_Oracle
cx_Oracle Type | Oracle Type | Python Type
cx_Oracle.BINARY | RAW | bytes (Python 3), str (Python 2)
cx_Oracle.BFILE | BFILE | cx_Oracle.LOB
cx_Oracle.BOOLEAN | boolean (PL/SQL only) | bool
cx_Oracle.CLOB | CLOB | cx_Oracle.LOB
cx_Oracle.CURSOR | REF CURSOR | cx_Oracle.Cursor
cx_Oracle.DATETIME | DATE | datetime.datetime
cx_Oracle.FIXED_CHAR | CHAR | str
cx_Oracle.FIXED_NCHAR | NCHAR | str (Python 3), unicode (Python 2)
cx_Oracle.INTERVAL | INTERVAL DAY TO SECOND | datetime.timedelta
cx_Oracle.LOB | CLOB, BLOB, BFILE, NCLOB | cx_Oracle.LOB
cx_Oracle.LONG_BINARY | LONG RAW | bytes (Python 3), str (Python 2)
Data Types (2)
Full listing for cx_Oracle
cx_Oracle Type | Oracle Type | Python Type
cx_Oracle.LONG_STRING | LONG | str
cx_Oracle.NATIVE_FLOAT | BINARY_DOUBLE | float
cx_Oracle.NATIVE_INT | - | int (Python 3), long/int (Python 2)
cx_Oracle.NCHAR | NVARCHAR2 | str (Python 3), unicode (Python 2)
cx_Oracle.NCLOB | NCLOB | cx_Oracle.LOB
cx_Oracle.NUMBER | NUMBER | float
cx_Oracle.OBJECT | instances created by CREATE OR REPLACE TYPE | cx_Oracle.Object
cx_Oracle.ROWID | ROWID | str
cx_Oracle.STRING | VARCHAR2 | str
cx_Oracle.TIMESTAMP | TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE | datetime.datetime
Oracle Machine Learning
for Python (OML4Py)
Traditional Python and Database Interaction
• Access latency
• Memory limitation – data size
• Single threaded
• Paradigm shift: Python → SQL → Python
• Ad hoc production deployment
• Issues for backup, recovery, security
[Diagram: a Python script (run as a cron job) exchanges data with the Database either through SQL or through flat files (extract / export, read, export, load)]
Connectivity packages: mxODBC, pyodbc, turboodbc, JayDeBeApi, cx_Oracle
Oracle Machine Learning for Python
Oracle Advanced Analytics option to Oracle Database >= 18c
• Use Oracle Database as HPC environment
• Use in-database parallel and distributed
machine learning algorithms
• Manage Python scripts and
Python objects in Oracle Database
• Integrate Python results into applications
and dashboards via SQL
• Produce better models faster with
automated machine learning
[Diagram: Python client with Oracle Machine Learning for Python connects to Oracle Database on the database server machine (user tables, in-db stats); SQL interfaces: SQL*Plus, SQLDeveloper, …]
Oracle Machine Learning for Python
• Transparency layer
– Leverage proxy objects so data remain in database
– Overload Python functions translating functionality to SQL
– Use familiar Python syntax to manipulate database data
• Parallel, distributed algorithms
– Scalability and performance
– Exposes in-database algorithms from Oracle Data Mining
• Embedded Python execution
– Manage and invoke Python scripts in Oracle Database
– Data-parallel, task-parallel, and non-parallel execution
– Use open source Python packages
• Automated machine learning
– Feature selection, model selection, hyper-parameter tuning
OML4Py Transparency Layer
• Leverages proxy objects for database data: oml.DataFrame
# Create table from Pandas DataFrame data
DATA = oml.create(data, table = 'BOSTON')
# Get proxy object to DB table boston
DATA = oml.sync(table = 'BOSTON')
• Overloads Python functions translating functionality to SQL
• Uses familiar Python syntax to manipulate database data
DATA.shape
DATA.head()
DATA.describe()
DATA.std()
DATA.skew()
train_dat, test_dat = DATA.split()
train_dat.shape
test_dat.shape
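One way to picture a transparency layer is a proxy object that holds only a table name and turns each Python call into SQL. The sketch below is a toy illustration under that assumption: the TableProxy class, its methods, and the sample rows are hypothetical, sqlite3 stands in for the database, and none of this is OML4Py's actual implementation.

```python
import sqlite3


class TableProxy:
    """Hypothetical proxy: keeps a table name, never the rows themselves."""

    def __init__(self, con, table):
        self.con = con
        self.table = table  # interpolated into SQL for illustration only

    @property
    def shape(self):
        # Row count and column count are both computed by the database
        n_rows = self.con.execute(f"select count(*) from {self.table}").fetchone()[0]
        cur = self.con.execute(f"select * from {self.table} limit 0")
        return (n_rows, len(cur.description))

    def head(self, n=5):
        # Only the first n rows travel to the client
        return self.con.execute(f"select * from {self.table} limit ?", (n,)).fetchall()

    def mean(self, column):
        # The aggregate runs in the database; one number comes back
        return self.con.execute(f"select avg({column}) from {self.table}").fetchone()[0]


con = sqlite3.connect(":memory:")
con.execute("create table BOSTON (CRIM real, RM real, MEDV real)")
con.executemany("insert into BOSTON values (?, ?, ?)",
                [(0.006, 6.575, 24.0), (0.027, 6.421, 21.6), (0.027, 7.185, 34.7)])
DATA = TableProxy(con, "BOSTON")
shape = DATA.shape
first_rows = DATA.head(2)
avg_medv = DATA.mean("MEDV")
```

The design choice being illustrated is that the data stay in the database and only small results (a count, a few rows, an average) cross the network.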
Data Transfer-related functions
• oml.create(x, table[, oranumber, dbtypes, . . . ])
– Creates a table in Oracle Database from a Pandas DataFrame returning a proxy object
• oml.push(x[, oranumber, dbtypes])
– Pushes data to Oracle Database creating a temporary table returning a proxy object
• oml.sync(schema=None, regex_match=False, table=None, view=None, query=None)
– Creates a DataFrame proxy object in Python that represents an Oracle Database table
• oml.drop([table, view])
– Drops the named database table or view
• oml.dir()
– Returns the names of OML objects in the workspace
• oml.cursor()
– Returns a cx_Oracle cursor object of the current OML database connection
List of functions on OML DataFrame executed in-database
• KFold
• append
• columns
• concat
• corr
• count
• create_view
• crosstab
• cumsum
• describe
• drop
• drop_duplicates
• dropna
• head
• kurtosis
• materialize
• max
• mean
• median
• merge
• min
• nunique
• pivot_table
• pull
• rename
• round
• select_types
• shape
• skew
• sort_values
• split
• std
• sum
• t_dot
• tail
• types
Example – create a DataFrame
Example using crosstab on oml.DataFrame
OML4Py 1.0
Machine Learning algorithms in-Database
Classification
• Decision Tree
• Naïve Bayes
• Generalized Linear Model
• Support Vector Machine
• RandomForest
• Neural Network
Regression
• Generalized Linear Model
• Neural Network
• Support Vector Machine
Attribute Importance
• Minimum Description Length
Clustering
• Expectation Maximization
• Hierarchical k-Means
Feature Extraction
• Singular Value Decomposition
• Explicit Semantic Analysis
Market Basket Analysis
• Apriori – Association Rules
Anomaly Detection
• 1 Class Support Vector Machine
Supports integrated partitioned models, text mining
…plus open source Python packages for algorithms in combination with embedded Python execution
Connect to the database
[Diagram: Python user on laptop, client Python engine with OML4Py, connected to Oracle Database through the transparency layer]
import oml
import os
sid = os.environ["ORACLE_SID"]
# Build the connect descriptor so that the SID read above is actually used
dsn = ("(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521))"
       "(CONNECT_DATA=(SID=" + sid + ")))")
oml.connect(user="pyquser", password="pyquser", dsn=dsn)
oml.isconnected()
Invoke in-database aggregation function
[Diagram: Python user on desktop, client Python engine with OML4Py, transparency layer, Oracle Database with user tables and in-db stats]
ONTIME_S = oml.sync(table="ONTIME_S")
res = ONTIME_S.crosstab('DEST')
type(res)
res.head()
• Source data is an OML DataFrame, ONTIME_S, which is a proxy for an Oracle Database table
• The crosstab() function is overloaded to accept OML DataFrame objects and transparently generates SQL for execution in Oracle Database
• Returns an ‘oml.core.frame.DataFrame’ object
SQL executed in the database:
select DEST, count(*)
from ONTIME_S
group by DEST
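The GROUP BY query shown above can be run as-is against any SQL engine to see what a one-column crosstab returns. The sketch below is an illustration only: sqlite3 stands in for Oracle Database, the six destination rows are made up, and the Counter comparison shows the client-side work that in-database execution avoids.

```python
import sqlite3
from collections import Counter

# Made-up destinations standing in for the ONTIME_S table
flights = [("SFO",), ("LAX",), ("SFO",), ("JFK",), ("SFO",), ("LAX",)]

con = sqlite3.connect(":memory:")
con.execute("create table ONTIME_S (DEST text)")
con.executemany("insert into ONTIME_S values (?)", flights)

# In-database aggregation: only one row per destination leaves the database
in_db_counts = dict(
    con.execute("select DEST, count(*) from ONTIME_S group by DEST").fetchall()
)

# Client-side equivalent: every row has to be pulled first, then counted
client_counts = Counter(dest for (dest,) in flights)
con.close()
```

Both paths give the same counts; the difference is how much data moves between database and client.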
OML4Py Embedded Python
def fit(data):
    from sklearn.svm import LinearSVC
    x = data.drop('TARGET', axis = 1).values
    y = data['TARGET']
    return LinearSVC().fit(x, y)
oml.script.create('sk_svc_fit', fit, overwrite = True)
oml.script.dir()
mod = oml.table_apply(train_dat,
                      func = 'sk_svc_fit',
                      oml_input_type = 'pandas.DataFrame')
[Diagram, steps 1-4: client Python engine (OML4Py) → pyq*eval() interface in Oracle Database → extproc → DB Python engine (OML4Py), with access to user tables]
oml.group_apply – partitioned data flow
[Diagram, steps 1-4: client Python engine (OML4Py) → pyq*eval() interface in Oracle Database → multiple extproc processes, each running a DB Python engine (OML4Py), with access to user tables]
def build_lm(dat):
    from sklearn import linear_model
    lm = linear_model.LinearRegression()
    X = dat[['PETAL_WIDTH']]
    y = dat[['PETAL_LENGTH']]
    lm.fit(X, y)
    return lm
index = oml.DataFrame(IRIS['SPECIES'])
mods = oml.group_apply(
    IRIS[:, ['PETAL_LENGTH', 'PETAL_WIDTH', 'SPECIES']],
    index,
    func=build_lm)
sorted(mods.pull().items())
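The group-apply pattern itself (partition rows by a key, run one function per partition, collect one result per key) can be sketched in plain Python. Everything below is an assumption made for illustration: the helper names, the hand-written least-squares fit, and the sample measurements are made up, and the real oml.group_apply runs the partitions in Python processes at the database server rather than in a local loop.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def group_apply(rows, key, func):
    """Partition rows by key(row) and apply func to each partition."""
    groups = defaultdict(list)
    for row in rows:
        groups[key(row)].append(row)
    return {k: func(part) for k, part in groups.items()}


# (species, petal_width, petal_length): made-up values, not the real iris data
rows = [
    ("setosa", 0.2, 1.4), ("setosa", 0.4, 1.8), ("setosa", 0.6, 2.2),
    ("versicolor", 1.0, 3.0), ("versicolor", 1.5, 4.5), ("versicolor", 2.0, 6.0),
]
models = group_apply(
    rows,
    key=lambda row: row[0],
    func=lambda part: fit_line([(width, length) for _, width, length in part]),
)
```

Because each partition is independent, the per-group calls are the natural unit to hand to separate processes, which is what the partitioned data flow in the diagram suggests.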
Embedded Python Execution functions
• oml.do_eval(func[, func_value, func_owner, . . . ])
– Executes the user-defined Python function at the Oracle Database server machine
• oml.table_apply(data, func[, func_value, . . . ])
– Executes the user-defined Python function at the Oracle Database server machine supplying data
pulled from Oracle Database
• oml.row_apply(data, func[, func_value, . . . ])
– Partitions a table or view into row chunks and executes the user-defined python function on each chunk
within one or more Python processes running at the Oracle Database server machine
• oml.group_apply(data, index, func[, . . . ])
– Partitions a table or view by the values in column(s) specified in index and executes the user-defined python
function on those partitions within one or more Python processes running at the Oracle Database server machine
• oml.index_apply(times, func[, func_value, . . . ])
– Executes the user-defined python function multiple times inside Oracle Database server
Script repository functions for saving Python scripts in ODB
• oml.script.create(name, func[, is_global, . . . ])
– Creates a Python script, which contains a single function definition, in the
Oracle Database Python script repository
• oml.script.dir([name, regex_match, sctype])
– Lists the scripts present in the Oracle Database Python script repository
• oml.script.load(name[, owner])
– Loads the named script from the Oracle Database Python script repository as a callable object
• oml.script.drop(name[, is_global, silent])
– Drops the named script from the Oracle Database Python script repository
Datastore functions for saving Python objects in ODB
• oml.ds.save(objs, name[, description, . . . ])
– Saves Python objects to a datastore in the user’s Oracle Database schema
• oml.ds.dir([name, regex_match, dstype])
– Lists existing datastores available to the current session user
• oml.ds.describe(name[, owner])
– Describes the contents of the named datastore available to the current session user
• oml.ds.load(name[, objs, owner, to_globals])
– Loads Python objects from a datastore in the user’s Oracle Database schema
• oml.ds.delete(name[, objs, regex_match])
– Deletes one or more datastores from the user’s Oracle Database schema, or deletes specific objects from
within a datastore
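A datastore of this kind can be thought of as named collections of serialized Python objects kept in a database table. The sketch below illustrates that idea only: the Datastore class, its table layout, and the use of pickle with sqlite3 are assumptions made for this example and say nothing about how OML4Py stores objects internally.

```python
import pickle
import sqlite3


class Datastore:
    """Hypothetical datastore: pickled objects grouped under a name."""

    def __init__(self, con):
        self.con = con
        con.execute(
            "create table if not exists datastore ("
            "ds_name text, obj_name text, payload blob, "
            "primary key (ds_name, obj_name))"
        )

    def save(self, objs, name):
        # objs maps object names to Python objects
        for obj_name, obj in objs.items():
            self.con.execute(
                "insert or replace into datastore values (?, ?, ?)",
                (name, obj_name, pickle.dumps(obj)),
            )

    def dir(self):
        cur = self.con.execute("select distinct ds_name from datastore order by ds_name")
        return [row[0] for row in cur]

    def load(self, name):
        # Only unpickle data you trust; pickle can execute code on load
        cur = self.con.execute(
            "select obj_name, payload from datastore where ds_name = ?", (name,)
        )
        return {obj_name: pickle.loads(payload) for obj_name, payload in cur}

    def delete(self, name):
        self.con.execute("delete from datastore where ds_name = ?", (name,))


con = sqlite3.connect(":memory:")
ds = Datastore(con)
ds.save({"coefficients": [2.0, 1.0], "labels": ("setosa", "versicolor")}, name="ds_models")
names_before = ds.dir()
restored = ds.load("ds_models")
ds.delete("ds_models")
names_after = ds.dir()
```

Keeping objects next to the data means they fall under the same backup, recovery, and security processes as the tables, which is the benefit the slides attribute to storing Python objects in the database.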
Data Types
Mapping between OML4Py and Oracle Database
cx_Oracle Read | Python | cx_Oracle Write
varchar2, char, clob | str | varchar2, char, clob
number, binary_double, binary_float | float | number if oranumber == True (default), else binary_double
(none) | boolean | number if oranumber == True (default), else binary_double
raw, blob | bytes | raw, blob
AutoML – new with OML4Py in Oracle Advanced Analytics
• Goal: increase model quality and data scientist productivity while reducing overall compute time
• Auto Feature Selection
– Reduce the number of features by identifying most relevant
– Improve performance and accuracy
• Auto Model Selection for classification and regression
– Identify best algorithm to achieve maximum score
– Find best model many times faster than with exhaustive search techniques
• Auto Tuning of Hyper-parameters
– Significantly improve model accuracy
– Avoid manual or exhaustive search techniques
Auto Feature Selection: Motivation & Example
• Many real-world datasets have a large number of irrelevant features
• Slows down training
• Goal: Speed-up ML pipeline by selecting most relevant features
[Charts: "ML training time" (training time in seconds) and "Prediction Accuracy" (accuracy), each comparing two runs; annotations: 33x, +4%; OpenML dataset 312 with 1925 rows]
OpenML dataset 40996 (56000 rows, 784 columns)
Using SVM Gaussian with Auto Feature Selection
• Features reduced from 784 to 309
• Accuracy improves from 65.9% to 84.3%
• Training time reduced 1.3x
Auto Feature Selection: Evaluation for OAA SVM Gaussian
• 150 datasets with more than 500 cases
[Chart: accuracy (0.5 to 1) across 10 groups, two unlabeled series (Series1, Series2)]
• Avg accuracy gain: 2.5%
• Avg feature reduction: 52%
Auto Feature Selection Example
fs = FeatureSelection(mining_function = 'classification',
score_metric = 'accuracy')
selected_features = fs.reduce('dt', X_train, y_train)
X_train = X_train[:,selected_features]
Auto Model Selection Example
ms = ModelSelection(mining_function = 'classification',
score_metric = 'accuracy')
best_model = ms.select(X_train, y_train)
y_pred = best_model.predict(X_test)
Auto Tune Example
at = Autotune(mining_function = 'classification',
score_metric = 'accuracy')
evals = at.tune('dt', X_train, y_train)
mod = evals['best_model']
y_pred = mod.predict(X_test)
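For comparison, the exhaustive search that automated tuning is meant to avoid can be written in a few lines. The sketch below is a baseline grid search over one hyper-parameter of a toy threshold classifier; the function names, the candidate grid, and the validation data are all made up for illustration, and this is not the search strategy used by Autotune.

```python
def stump_accuracy(threshold, xs, ys):
    """Accuracy of the rule 'predict 1 when x >= threshold'."""
    hits = sum(1 for x, y in zip(xs, ys) if (1 if x >= threshold else 0) == y)
    return hits / len(ys)


def grid_search(candidates, xs, ys):
    """Score every candidate threshold and keep the best one."""
    scored = [(stump_accuracy(t, xs, ys), t) for t in candidates]
    best_score, best_threshold = max(scored)
    return {"best_threshold": best_threshold, "best_score": best_score, "evals": scored}


# Made-up validation data: one numeric feature and a binary label
x_val = [0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0]
y_val = [0, 0, 0, 1, 0, 1, 1, 1]
result = grid_search([1.0, 2.0, 3.5], x_val, y_val)
```

The cost of this approach grows with the product of all grid sizes, since every combination is trained and scored, which is why smarter search strategies matter once there are several hyper-parameters.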
OML4Py - Deployment Architecture
Oracle Database
Python 3 engine
OAA / OML4Py
Zeppelin / Jupyter
web interface
BDA / Hadoop
Big Data SQL
Web browser
Web browser
OML4Py Client
Python Engine
Python Script
Repository
Python Object
Datastore
Oracle Analytics Cloud
Oracle Data Visualization Desktop
OBIEE
Summary - Oracle Machine Learning for Python
• Oracle Database enabled with Python scripting language and environment
for the enterprise via Oracle Advanced Analytics option
• Oracle’s Python technologies extend Python for enterprise use
– Supports data analysis, exploration, and machine learning
– Enables streamlined production development
– Automates key data science steps for greater data scientist productivity,
while enhancing accuracy and performance
• Achieve performance and scalability leveraging Oracle Database as a
high performance compute engine
The Top 5 Reasons to Deploy Your Applications on Oracle RACThe Top 5 Reasons to Deploy Your Applications on Oracle RAC
The Top 5 Reasons to Deploy Your Applications on Oracle RACMarkus Michalewicz
 
Ground Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous DatabaseGround Breakers Romania: Oracle Autonomous Database
Ground Breakers Romania: Oracle Autonomous DatabaseMaria Colgan
 
20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database 20 Tips and Tricks with the Autonomous Database
20 Tips and Tricks with the Autonomous Database Sandesh Rao
 
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...
NZOUG - GroundBreakers-2018 -Using Oracle Autonomous Health Framework to Pres...Sandesh Rao
 

Hyperautomation and AI/ML: A Strategy for Digital Transformation Success.pdf
 
"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan"ML in Production",Oleksandr Bagan
"ML in Production",Oleksandr Bagan
 
Moving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdfMoving Beyond Passwords: FIDO Paris Seminar.pdf
Moving Beyond Passwords: FIDO Paris Seminar.pdf
 
Gen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdfGen AI in Business - Global Trends Report 2024.pdf
Gen AI in Business - Global Trends Report 2024.pdf
 
How to write a Business Continuity Plan
How to write a Business Continuity PlanHow to write a Business Continuity Plan
How to write a Business Continuity Plan
 
From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .From Family Reminiscence to Scholarly Archive .
From Family Reminiscence to Scholarly Archive .
 
What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024What's New in Teams Calling, Meetings and Devices March 2024
What's New in Teams Calling, Meetings and Devices March 2024
 
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
Tampa BSides - Chef's Tour of Microsoft Security Adoption Framework (SAF)
 

Data meets AI - AICUG - Santa Clara

  • 1. Copyright © 2018, Oracle and/or its affiliates. All rights reserved. | When Data meets AI AICUG Meetup – Santa Clara , Oracle Sandesh Rao VP AIOps , Autonomous Database
  • 2. Safe Harbor Statement: The following is intended to outline our general product direction. It is intended for information purposes only, and may not be incorporated into any contract. It is not a commitment to deliver any material, code, or functionality, and should not be relied upon in making purchasing decisions. The development, release, timing, and pricing of any features or functionality described for Oracle’s products may change and remains at the sole discretion of Oracle Corporation.
  • 3. whoami: Real Application Clusters (HA); DataGuard (DR); Machine Learning (AIOps); Enterprise Management; Sharding; Big Data; Operational Management; Home Automation Geek. @sandeshr, https://www.linkedin.com/in/raosandesh/
  • 4. Agenda
      • What motivated us to go into Machine Learning?
      • Which algorithms, tools & technologies are used?
      • Oracle & Machine Learning initiatives and tools
      • cx_Oracle and OML4Py
      • Questions and Open Talk
  • 5. Why Machine Learning for us and why now?
      • Lots of data generated as exhaust from systems
        – Cloud, different formats and interfaces, frameworks
      • Machine Learning has become accessible
        – Anyone can be a Data Scientist
        – Algorithms are accessible as libraries such as scikit-learn, Keras, TensorFlow
        – Sandbox to get started as easy as a docker init
      • Business use cases
      • How to find value from the data, fewer guesses to make decisions
  • 6. ML Project Workflow
      • Set Business Objectives
      • Gather, Prepare and Cleanse Data
      • Model Data
        – Feature Extraction, Test, Train, Optimizer
        – Loss Function, effectiveness
        – Framework and Library to use
      • Apply the Model as an inference engine
        – Decision making using the Model’s output
        – Tune Model till outcome is closer to Business Objective
      [Diagram boxes: Set Business Objectives; Understand Use Case; Create Pseudo Code; Synthetic Data Generation; Pick Tools and Frameworks; Train/Test Model; Deploy Model; Measure Results and Feedback]
  • 7. Types of Machine Learning
      • Supervised Learning: predict future outcomes with the help of training data provided by human experts
      • Semi-Supervised Learning: discover patterns within raw data and make predictions, which are then reviewed by human experts, whose feedback is used to improve model accuracy
      • Unsupervised Learning: find patterns without any external input other than the raw data
      • Reinforcement Learning: take decisions based on past rewards for this type of action
  • 8. Machine Learning Algorithms
      • Clustering: Hierarchical k-means, Orthogonal Partitioning Clustering, Expectation-Maximization
      • Feature Extraction / Attribute Importance / Component Analysis
      • Classification: Decision Tree, Naive Bayes, Random Forest, Logistic Regression, Support Vector Machine
      • Regression: Multiple Regression, Support Vector Machine, Linear Model, LASSO, Random Forest, Ridge Regression, Generalized Linear Model, Stepwise Linear Regression
      • Association & Collaborative Filtering
      • Reinforcement Learning: brute force, Monte Carlo, temporal difference, ...
      • Neural network & deep learning with Deep Neural Network: many different use cases
  • 9. Modeling Phase – AutoML to the rescue
      • Provide dataset to AutoML
      • Configuration parameters for model picked
      • Dataset is divided into training set & testing set
      • Actual training
      • Evaluate performance of trained model
      • Tweak model parameters, change predictors, change test/train data splits and change algorithms
      • Pick model plus parameters depending on outcome and measure: F1, Precision, Recall, MSE
      • Document all runs and apply A/B testing to see what the variations produce
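The evaluation step on this slide names F1, precision, recall and MSE. A minimal pure-Python sketch of those measures, together with a shuffled train/test split, might look like the following. The function names (`train_test_split`, `precision_recall_f1`, `mse`) and the toy labels are illustrative assumptions, not part of any Oracle or AutoML API.

```python
import random


def train_test_split(rows, test_fraction=0.25, seed=0):
    """Shuffle a copy of rows and return (train, test) lists."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    n_test = int(round(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def precision_recall_f1(y_true, y_pred, positive=1):
    """Precision, recall and F1 for one positive class."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


def mse(y_true, y_pred):
    """Mean squared error for regression outputs."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up labels, purely for illustration.
y_true = [1, 0, 1, 1, 0, 0]
y_pred = [1, 0, 0, 1, 1, 0]
precision, recall, f1 = precision_recall_f1(y_true, y_pred)

train, test = train_test_split(range(8))
```

Precision and recall suit classification models, while MSE applies to regression, so the "measure" chosen in the pick-a-model step depends on the mining function.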
  • 10. Tools & Libraries Assisting ML projects
  • 11. What is Oracle Doing Around Machine Learning?
      • Big Data Appliance
      • Big Data Discovery, Big Data Preparation, Data Visualization Cloud
      • Analytics Cloud – Sales, Marketing, HCM on top of SaaS
      • DaaS – Oracle Data Cloud, Eloqua, ...
      • Oracle Labs (labs.oracle.com) – Machine Learning Research Group
      • Autonomous Database
        – Zeppelin Notebooks preloaded with use cases
        – Applied Machine Learning used for implementing AIOps
  • 12. Oracle AI Platform Cloud Service – Coming Soon…
      • Collaborative end-to-end machine learning in the cloud
      • Enables data science teams to
        – Organize their work
        – Access data and computing resources
        – Build, Train, Deploy
        – Manage models
      • Collaborative, Self-Service, Integrated
      • https://cloud.oracle.com/en_US/ai-platform
  • 13. Oracle Autonomous Data Warehouse Cloud: Key Features
      • Highly Elastic: independently scale compute and storage, without having to overpay for fixed blocks of resources
      • Built-in Web-Based SQL ML Tool (Oracle Machine Learning): Apache Zeppelin-based Oracle Machine Learning notebooks, ready to run ML from the browser
      • Database Migration Utility: dedicated cloud-ready migration tools for easy migration from Amazon Redshift, SQL Server and other databases
      • Enterprise Grade Security: data is encrypted by default in the cloud, as well as in transit and at rest
      • High-Performance Queries and Concurrent Workloads: optimized query performance with preconfigured resource profiles for different types of users
      • Oracle SQL: Autonomous DW Cloud is compatible with all business analytics tools that support Oracle Database
      • Self Driving: fully automated database that tunes, patches and upgrades itself while the system is running
      • Cloud-Based Data Loading: fast, scalable data loading from Oracle Object Store, AWS S3, or on-premises
  • 14. Oracle Machine Learning and Advanced Analytics: Strategy and Road Map
      • Support multiple data platforms, analytical engines, languages, UIs and deployment strategies
      [Diagram labels: Big Data / Big Data Cloud; Relational; ML Algorithms (common core, parallel, distributed); SQL, R, Python, etc.; GUI, Data Miner, RStudio, Notebooks; Advanced Analytics; Oracle Database Cloud; DWCS]
  • 15. Oracle’s Machine Learning & Adv. Analytics Algorithms
      • CLASSIFICATION: Naïve Bayes; Logistic Regression (GLM); Decision Tree; Random Forest; Neural Network; Support Vector Machine; Explicit Semantic Analysis
      • CLUSTERING: Hierarchical K-Means; Hierarchical O-Cluster; Expectation Maximization (EM)
      • ANOMALY DETECTION: One-Class SVM
      • TIME SERIES: Holt-Winters, Regular & Irregular, with and w/o trends & seasonal; Single, Double Exp Smoothing
      • REGRESSION: Linear Model; Generalized Linear Model; Support Vector Machine (SVM); Stepwise Linear Regression; Neural Network; LASSO
      • ATTRIBUTE IMPORTANCE: Minimum Description Length; Principal Comp Analysis (PCA); Unsupervised Pair-wise KL Div; CUR decomposition for row & AI
      • ASSOCIATION RULES: Apriori / market basket
      • PREDICTIVE QUERIES: Predict, cluster, detect, features
      • SQL ANALYTICS: SQL Windows, SQL Patterns, SQL Aggregates
      • FEATURE EXTRACTION: Principal Comp Analysis (PCA); Non-negative Matrix Factorization; Singular Value Decomposition (SVD); Explicit Semantic Analysis (ESA)
      • TEXT MINING SUPPORT: Algorithms support text type; Tokenization and theme extraction; Explicit Semantic Analysis (ESA) for document similarity
      • STATISTICAL FUNCTIONS: Basic statistics: min, max, median, stdev, t-test, F-test, Pearson’s, Chi-Sq, ANOVA, etc.
      • R PACKAGES: CRAN R Algorithm Packages through Embedded R Execution; Spark MLlib algorithm integration
      • EXPORTABLE ML MODELS: C and Java code for deployment
      • OAA (Oracle Data Mining + Oracle R Enterprise) and ORAAH combined
      • OAA includes support for Partitioned Models, Transactional, Unstructured, Geo-spatial, Graph data, etc.
  • 16. Oracle Machine Learning: Key Features (Machine Learning Notebook for Autonomous Data Warehouse Cloud)
      • Collaborative UI for data scientists
        – Packaged with Autonomous Data Warehouse Cloud (V1)
        – Easy access to shared notebooks, templates, permissions, scheduler, etc.
        – SQL ML algorithms API (V1)
        – Supports deployment of ML analytics
  • 17. Oracle Machine Learning UI in ADW
  • 20. AI and ML with Python and Oracle
      1. What is Python
      2. Oracle’s Advanced Analytics
      3. cx_Oracle Package
      4. Oracle Machine Learning for Python
  • 21. What is Python?
      • An interpreted, object-oriented, high-level, general-purpose programming language
      • Designed for rapid application development and scripting to connect existing components
      • Open source scripting language and environment: https://www.python.org
      • Created in the late 1980s
      • World-wide usage
        – Widely taught in universities
        – Many Data Scientists know and use Python
      • Thousands of open source packages to enhance productivity
  • 22. Popularity by question views for major PLs: growth of major programming languages using Stack Overflow question views
      https://insights.stackoverflow.com/trends?tags=python%2Cjavascript%2Cjava%2Cc%23%2Cphp%2Cc%2B%2B&utm_source=so-owned&utm_medium=blog&utm_campaign=gen-blog&utm_content=blog-link&utm_term=incredible-growth-python
  • 23. Why use Python?
      • Small
      • Simple language
      • Resembles English
      • Strict punctuation rules
      • Uniform code formatting
      • Many third-party libraries: PyPI – 168203 projects, https://pypi.python.org/pypi
      • Wide-spread user groups
      • Heavily used for websites
      • Increasingly used by data scientists
  • 24. Python IDEs
  • 25. Oracle’s Advanced Analytics: multiple interfaces across platforms (SQL, R, Python*, GUI, Dashboards, Apps)
      • Users: R & Python programmers; Data / Business Analysts; Business Analysts/Mgrs; Domain End Users
      • Clients: R & Python* clients; SQL Developer / Oracle Data Miner; OAC/OBIEE/ODV; Applications
      • Platform:
        – Oracle Database Enterprise Edition with Oracle Advanced Analytics (Database Option): SQL, R & Python* integration for scalable, distributed, parallel in-database ML execution
        – Hadoop with Oracle R Advanced Analytics for Hadoop and Big Data Connectors: parallel, distributed Spark-based algorithms
        – Oracle Cloud
      * Not yet released
  • 26. Oracle Advanced Analytics differentiators
      • Work directly with data in Database and Hadoop
        – Eliminate need to request extracts from IT/DBA: immediate access to database and Hadoop data
        – Process data where they reside: minimize or eliminate data movement
      • Scalability and Performance
        – Use parallel, distributed algorithms that scale to big data on Oracle Database and Hadoop platforms
        – Leverage powerful engineered systems to build models on billions of rows of data or millions of models in parallel
      • Ease of deployment
        – Using Oracle Database, place Python, R, and SQL scripts immediately in production (no need to recode)
        – Use production quality infrastructure without custom plumbing or extra complexity
      • Process support
        – Maintain and ensure data security, backup, and recovery using existing processes
        – Store, access, manage, and track analytics objects (models, scripts, workflows, data) in Oracle Database
  • 27. Oracle’s Python Technologies Supporting Oracle Database
      • cx_Oracle package
      • Oracle Machine Learning for Python: component of the Oracle Advanced Analytics option to Oracle Database
  • 28. cx_Oracle
      • Python package enabling scalable and performant connectivity to Oracle Database
        – Open source, publicly available on PyPI, OTN, and GitHub
        – Oracle is maintainer
      • Oracle Database interface for Python conforming to the Python DB API 2.0 specification
        – Optimized driver based on OCI
        – Execute SQL statements from Python
        – Enables transactional behavior for insert, update, and delete
  • 29. cx_Oracle - Requirements
      • Easily installed from PyPI
      • Support for Python 2 and 3
      • Support for Oracle Client 11.2, 12.1, 12.2, 18
        – Oracle's standard cross-version interoperability allows easy upgrades and connectivity to different Oracle Database versions
      • Connect to Oracle Database 9.2, 10, 11, 12, 18
        – (Depending on the Oracle Client version used)
      • SQL and PL/SQL execution
        – Underlying Oracle Client libraries have optimizations: compressed fetch, pre-fetching, client and server result set caching, and statement caching with auto-tuning
  • 30. cx_Oracle Example
      import cx_Oracle

      # Connect with a single connect string and print the database version
      con = cx_Oracle.connect('pythonhol/welcome@127.0.0.1/orcl')
      print(con.version)
      con.close()

      # Connect to a pooled server (":pooled" requests Database Resident Connection Pooling)
      con = cx_Oracle.connect('pythonhol', 'welcome', '127.0.0.1:/orcl:pooled',
                              cclass="HOL", purity=cx_Oracle.ATTR_PURITY_SELF)
      con.close()
  • 31. cx_Oracle Example (assumes an open connection con)
      cur = con.cursor()  # opens cursor for statements to use
      cur.execute('select * from departments order by department_id')

      # The three options below are alternatives. A cursor is consumed as it is
      # read, so re-run execute() before switching from one option to another.

      # Option 1: iterate over the cursor and print every row
      for result in cur:
          print(result)

      # Option 2: return a single row as a tuple and advance to the next row
      row = cur.fetchone()
      print(row)
      row = cur.fetchone()
      print(row)

      # Option 3: return a list of tuples
      res = cur.fetchmany(numRows=3)
      print(res)

      cur.close()
      con.close()
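Because cx_Oracle conforms to the Python DB API 2.0 specification, the cursor pattern on this slide can be tried without an Oracle instance. The sketch below swaps in the standard-library `sqlite3` module, which follows the same specification; that substitution, the in-memory `departments` table and its rows are all assumptions made for illustration. Two differences to keep in mind: `sqlite3` uses `?` placeholders, and the batch size is passed positionally here rather than with cx_Oracle's `numRows` keyword.

```python
import sqlite3

# Stand-in database: sqlite3 instead of cx_Oracle (assumption of this sketch).
con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute(
    "create table departments (department_id integer, department_name text)"
)
cur.executemany(
    "insert into departments values (?, ?)",
    [
        (30, "Purchasing"),
        (10, "Administration"),
        (20, "Marketing"),
        (40, "Human Resources"),
    ],
)
con.commit()

query = "select * from departments order by department_id"

# Option 1: iterate over the cursor to read every row.
cur.execute(query)
all_rows = [row for row in cur]

# Option 2: fetchone() returns one tuple per call and advances the cursor.
cur.execute(query)
first = cur.fetchone()
second = cur.fetchone()

# Option 3: fetchmany() returns a list of tuples, here at most three.
cur.execute(query)
batch = cur.fetchmany(3)

cur.close()
con.close()
```

Each option re-runs `execute()` first, since a cursor that has already been read to the end yields no further rows.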
  • 32. Data Types (1): full listing for cx_Oracle
      cx_Oracle Type | Oracle Type | Python Type
      cx_Oracle.BINARY | RAW | bytes (Python 3), str (Python 2)
      cx_Oracle.BFILE | BFILE | cx_Oracle.LOB
      cx_Oracle.BOOLEAN | boolean (PL/SQL only) | bool
      cx_Oracle.CLOB | CLOB | cx_Oracle.LOB
      cx_Oracle.CURSOR | REF CURSOR | cx_Oracle.Cursor
      cx_Oracle.DATETIME | DATE | datetime.datetime
      cx_Oracle.FIXED_CHAR | CHAR | str
      cx_Oracle.FIXED_NCHAR | NCHAR | str (Python 3), unicode (Python 2)
      cx_Oracle.INTERVAL | INTERVAL DAY TO SECOND | datetime.timedelta
      cx_Oracle.LOB | CLOB, BLOB, BFILE, NCLOB | cx_Oracle.LOB
      cx_Oracle.LONG_BINARY | LONG RAW | bytes (Python 3), str (Python 2)
  • 33. Data Types (2): full listing for cx_Oracle
      cx_Oracle Type | Oracle Type | Python Type
      cx_Oracle.LONG_STRING | LONG | str
      cx_Oracle.NATIVE_FLOAT | BINARY_DOUBLE | float
      cx_Oracle.NATIVE_INT | - | int (Python 3), long/int (Python 2)
      cx_Oracle.NCHAR | NVARCHAR2 | str (Python 3), unicode (Python 2)
      cx_Oracle.NCLOB | NCLOB | cx_Oracle.LOB
      cx_Oracle.NUMBER | NUMBER | float
      cx_Oracle.OBJECT | instances created by CREATE OR REPLACE TYPE | cx_Oracle.Object
      cx_Oracle.ROWID | ROWID | str
      cx_Oracle.STRING | VARCHAR2 | str
      cx_Oracle.TIMESTAMP | TIMESTAMP, TIMESTAMP WITH TIME ZONE, TIMESTAMP WITH LOCAL TIME ZONE | datetime.datetime
  • 34. Oracle Machine Learning for Python (OML4Py)
  • 35. Traditional Python and Database Interaction
      • Access latency
      • Memory limitation: data size
      • Single threaded
      • Paradigm shift: Python → SQL → Python
      • Ad hoc production deployment
      • Issues for backup, recovery, security
      [Diagram: Python script run as a cron job; data moved between the database and flat files (extract/export, read, export, load); SQL access via mxODBC, pyodbc, turboodbc, JayDeBeApi, cx_Oracle]
  • 36. Oracle Machine Learning for Python: Oracle Advanced Analytics option to Oracle Database >= 18c
      • Use Oracle Database as HPC environment
      • Use in-database parallel and distributed machine learning algorithms
      • Manage Python scripts and Python objects in Oracle Database
      • Integrate Python results into applications and dashboards via SQL
      • Produce better models faster with automated machine learning
      [Diagram: Python client with Oracle Machine Learning for Python; SQL interfaces (SQL*Plus, SQLDeveloper, …); Oracle Database on the database server machine, with user tables and in-db stats]
  • 37. Oracle Machine Learning for Python
      • Transparency layer
        – Leverage proxy objects so data remain in database
        – Overload Python functions translating functionality to SQL
        – Use familiar Python syntax to manipulate database data
      • Parallel, distributed algorithms
        – Scalability and performance
        – Exposes in-database algorithms from Oracle Data Mining
      • Embedded Python execution
        – Manage and invoke Python scripts in Oracle Database
        – Data-parallel, task-parallel, and non-parallel execution
        – Use open source Python packages
      • Automated machine learning
        – Feature selection, model selection, hyper-parameter tuning
  • 38. OML4Py Transparency Layer
      • Leverages proxy objects for database data: oml.DataFrame
          # Create table from Pandas DataFrame data
          DATA = oml.create(data, table = 'BOSTON')
          # Get proxy object to DB table boston
          DATA = oml.sync(table = 'BOSTON')
      • Overloads Python functions translating functionality to SQL
      • Uses familiar Python syntax to manipulate database data
          DATA.shape
          DATA.head()
          DATA.describe()
          DATA.std()
          DATA.skew()
          train_dat, test_dat = DATA.split()
          train_dat.shape
          test_dat.shape
  • 39. Data Transfer-related functions
      • oml.create(x, table[, oranumber, dbtypes, . . . ]): creates a table in Oracle Database from a Pandas DataFrame, returning a proxy object
      • oml.push(x[, oranumber, dbtypes]): pushes data to Oracle Database, creating a temporary table and returning a proxy object
      • oml.sync(schema=None, regex_match=False, table=None, view=None, query=None): creates a DataFrame proxy object in Python that represents an Oracle Database table
      • oml.drop([table, view]): drops the named database table or view
      • oml.dir(): returns the names of OML objects in the workspace
      • oml.cursor(): returns a cx_Oracle cursor object of the current OML database connection
  • 40. List of functions on OML DataFrame executed in-database
      KFold, append, columns, concat, corr, count, create_view, crosstab, cumsum, describe, drop, drop_duplicates, dropna, head, kurtosis, materialize, max, mean, median, merge, min, nunique, pivot_table, pull, rename, round, select_types, shape, skew, sort_values, split, std, sum, t_dot, tail, types
  • 41. Example – create a DataFrame
  • 42. Example using crosstab on oml.DataFrame
  • 43. OML4Py 1.0 Machine Learning algorithms in-Database
      • Classification: Decision Tree, Naïve Bayes, Generalized Linear Model, Support Vector Machine, Random Forest, Neural Network
      • Regression: Generalized Linear Model, Neural Network, Support Vector Machine
      • Attribute Importance: Minimum Description Length
      • Clustering: Expectation Maximization, Hierarchical k-Means
      • Feature Extraction: Singular Value Decomposition, Explicit Semantic Analysis
      • Market Basket Analysis: Apriori (Association Rules)
      • Anomaly Detection: 1-Class Support Vector Machine
      • Supports integrated partitioned models, text mining
      • …plus open source Python packages for algorithms in combination with embedded Python execution
  • 44. Connect to the database
      import oml
      import os

      # Read the SID from the environment and splice it into the connect descriptor
      sid = os.environ["ORACLE_SID"]
      oml.connect(user="pyquser", password="pyquser",
                  dsn="(DESCRIPTION=(ADDRESS=(PROTOCOL=TCP)(HOST=...)(PORT=1521))"
                      "(CONNECT_DATA=(SID=" + sid + ")))")
      oml.isconnected()
      [Diagram: Python user on laptop; client Python engine with OML4Py; transparency layer; Oracle Database]
  • 45. Invoke in-database aggregation function
      ONTIME_S = oml.sync(table="ONTIME_S")
      res = ONTIME_S.crosstab('DEST')
      type(res)
      res.head()
      • Source data is a DataFrame, ONTIME_S, which is a proxy for an Oracle Database table
      • The crosstab() function is overloaded to accept OML DataFrame objects and transparently generates SQL for execution in Oracle Database:
          select DEST, count(*) from ONTIME_S group by DEST
      • Returns an 'oml.core.frame.DataFrame' object
      [Diagram: Python user on desktop; client Python engine with OML4Py; transparency layer; Oracle Database with user tables and in-db stats]
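The idea on this slide, a Python object that holds no data and turns method calls into SQL, can be sketched locally. The following is not OML4Py: `TableProxy` is a hypothetical class, `sqlite3` stands in for Oracle Database, and the `ONTIME_S` rows are made up. It only illustrates how a `crosstab` call could become the `select ... group by` statement shown above.

```python
import sqlite3


class TableProxy:
    """Hypothetical proxy: stores a table name and generates SQL on demand."""

    def __init__(self, con, table):
        self.con = con
        self.table = table
        self.last_sql = None  # kept so the generated SQL can be inspected

    def _run(self, sql):
        self.last_sql = sql
        return self.con.execute(sql).fetchall()

    @property
    def columns(self):
        # Column names come from the database catalog, not from local data.
        info = self.con.execute(f'pragma table_info("{self.table}")')
        return [row[1] for row in info]

    @property
    def shape(self):
        n_rows = self._run(f'select count(*) from "{self.table}"')[0][0]
        return (n_rows, len(self.columns))

    def crosstab(self, column):
        # Identifiers cannot be bound as parameters, so check the name
        # against the real column list before building the statement.
        if column not in self.columns:
            raise KeyError(column)
        # "order by" is added here only to make the output deterministic.
        return self._run(
            f'select "{column}", count(*) from "{self.table}" '
            f'group by "{column}" order by "{column}"'
        )


con = sqlite3.connect(":memory:")
con.execute("create table ONTIME_S (DEST text, DEPDELAY real)")
con.executemany(
    "insert into ONTIME_S values (?, ?)",
    [("SFO", 5.0), ("JFK", 12.0), ("SFO", 0.0), ("ORD", 7.5), ("SFO", 3.0)],
)

ONTIME_S = TableProxy(con, "ONTIME_S")
res = ONTIME_S.crosstab("DEST")
crosstab_sql = ONTIME_S.last_sql
```

Only the aggregated counts travel back to the client; the rows themselves stay in the database, which is the point of the transparency layer.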
  • 46. OML4Py Embedded Python
      def fit(data):
          from sklearn.svm import LinearSVC
          x = data.drop('TARGET', axis = 1).values
          y = data['TARGET']
          return LinearSVC().fit(x, y)

      # Store the function in the database script repository, then list the scripts
      oml.script.create('sk_svc_fit', fit, overwrite = True)
      oml.script.dir()

      # Run the stored script in the database against train_dat
      mod = oml.table_apply(train_dat,
                            func = 'sk_svc_fit',
                            oml_input_type = 'pandas.DataFrame')
      [Diagram, steps 1 to 4: client Python engine with OML4Py; pyq*eval() interface; Oracle Database with user tables; extproc; DB Python engine with OML4Py]
  • 47. oml.group_apply – partitioned data flow
      def build_lm(dat):
          from sklearn import linear_model
          lm = linear_model.LinearRegression()
          X = dat[['PETAL_WIDTH']]
          y = dat[['PETAL_LENGTH']]
          lm.fit(X, y)
          return lm

      # One model per SPECIES value
      index = oml.DataFrame(IRIS['SPECIES'])
      mods = oml.group_apply(
          IRIS[:,['PETAL_LENGTH', "PETAL_WIDTH", 'SPECIES']],
          index, func=build_lm)
      sorted(mods.pull().items())
      [Diagram, steps 1 to 4: client Python engine with OML4Py; pyq*eval() interface; Oracle Database with user tables; multiple extproc / DB Python engine pairs, each with OML4Py]
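The partitioned data flow above follows a split-apply pattern: rows are grouped by an index column and the same function is run once per group. A local, pure-Python analogue might look like this. The `group_apply` helper here is a hypothetical stand-in rather than `oml.group_apply`, the least-squares fit replaces scikit-learn's `LinearRegression`, and the measurements are made-up values, not the real iris data.

```python
from collections import defaultdict


def fit_line(points):
    """Ordinary least squares fit of y = slope * x + intercept."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def build_lm(rows):
    """Per-group function: regress PETAL_LENGTH on PETAL_WIDTH."""
    return fit_line([(r["PETAL_WIDTH"], r["PETAL_LENGTH"]) for r in rows])


def group_apply(records, key, func):
    """Partition records by records[key] and call func once per partition."""
    groups = defaultdict(list)
    for record in records:
        groups[record[key]].append(record)
    return {name: func(rows) for name, rows in groups.items()}


# Illustrative data constructed to lie on exact lines.
iris = [
    {"SPECIES": "setosa", "PETAL_WIDTH": 0.2, "PETAL_LENGTH": 1.4},
    {"SPECIES": "setosa", "PETAL_WIDTH": 0.3, "PETAL_LENGTH": 1.6},
    {"SPECIES": "setosa", "PETAL_WIDTH": 0.4, "PETAL_LENGTH": 1.8},
    {"SPECIES": "versicolor", "PETAL_WIDTH": 1.0, "PETAL_LENGTH": 3.0},
    {"SPECIES": "versicolor", "PETAL_WIDTH": 1.5, "PETAL_LENGTH": 4.5},
    {"SPECIES": "versicolor", "PETAL_WIDTH": 2.0, "PETAL_LENGTH": 6.0},
]

mods = group_apply(iris, "SPECIES", build_lm)
```

This sketch runs the groups one after another in a single process. In the slide's architecture each partition can be handed to a separate database-side Python engine, which is where the parallelism comes from.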
  • 48. Embedded Python Execution functions
      • oml.do_eval(func[, func_value, func_owner, . . . ]): executes the user-defined Python function at the Oracle Database server machine
      • oml.table_apply(data, func[, func_value, . . . ]): executes the user-defined Python function at the Oracle Database server machine, supplying data pulled from Oracle Database
      • oml.row_apply(data, func[, func_value, . . . ]): partitions a table or view into row chunks and executes the user-defined Python function on each chunk within one or more Python processes running at the Oracle Database server machine
      • oml.group_apply(data, index, func[, . . . ]): partitions a table or view by the values in the column(s) specified in index and executes the user-defined Python function on those partitions within one or more Python processes running at the Oracle Database server machine
      • oml.index_apply(times, func[, func_value, . . . ]): executes the user-defined Python function multiple times inside the Oracle Database server
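The chunked and repeated execution styles described above can be pictured with two small local helpers. These are hypothetical analogues written for illustration, not the `oml` functions: they run serially in the client process, the `chunk_size` parameter is an assumption of this sketch, and the index passed to the function starts at 0 here purely by choice.

```python
def row_chunks(rows, chunk_size):
    """Yield consecutive slices of rows with at most chunk_size items each."""
    for start in range(0, len(rows), chunk_size):
        yield rows[start:start + chunk_size]


def row_apply(rows, chunk_size, func):
    """Call func once per row chunk and collect the results in order."""
    return [func(chunk) for chunk in row_chunks(rows, chunk_size)]


def index_apply(times, func):
    """Call func a fixed number of times, passing the iteration index."""
    return [func(i) for i in range(times)]


# Sum each chunk of four values; the last chunk holds the remainder.
chunk_sums = row_apply(list(range(10)), 4, sum)

# Run the same function three times with a different index each time.
squares = index_apply(3, lambda i: i * i)
```

Chunking by rows suits work that treats each row independently, such as scoring, while index-based repetition suits simulations or repeated trials that need no input table.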
  • 49. Script repository functions for saving Python scripts in ODB
      • oml.script.create(name, func[, is_global, . . . ]): creates a Python script, which contains a single function definition, in the Oracle Database Python script repository
      • oml.script.dir([name, regex_match, sctype]): lists the scripts present in the Oracle Database Python script repository
      • oml.script.load(name[, owner]): loads the named script from the Oracle Database Python script repository as a callable object
      • oml.script.drop(name[, is_global, silent]): drops the named script from the Oracle Database Python script repository
  • 50. Datastore functions for saving Python objects in ODB
      • oml.ds.save(objs, name[, description, . . . ]): saves Python objects to a datastore in the user’s Oracle Database schema
      • oml.ds.dir([name, regex_match, dstype]): lists existing datastores available to the current session user
      • oml.ds.describe(name[, owner]): describes the contents of the named datastore available to the current session user
      • oml.ds.load(name[, objs, owner, to_globals]): loads Python objects from a datastore in the user’s Oracle Database schema
      • oml.ds.delete(name[, objs, regex_match]): deletes one or more datastores from the user’s Oracle Database schema, or deletes specific objects from within a datastore
  • 51. Data Types: mapping between OML4Py and Oracle Database
      cx_Oracle Read | Python | cx_Oracle Write
      varchar2, char, clob | str | varchar2, char, clob
      number, binary_double, binary_float | float | if oranumber == True then number (default) else binary_double
      (none listed) | boolean | if oranumber == True then number (default) else binary_double
      raw, blob | bytes | raw, blob
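The write side of this mapping, including the `oranumber` switch, can be restated as a small lookup. The helper below is hypothetical and exists only to make the table concrete; it returns the candidate Oracle types listed in the table for a Python value and does not decide which of several candidates a real driver would pick.

```python
def oracle_write_types(value, oranumber=True):
    """Return the Oracle column types the slide's table lists for a value.

    Hypothetical helper for illustration; not part of OML4Py or cx_Oracle.
    """
    # bool and float share one rule; oranumber chooses between the two types.
    if isinstance(value, (bool, float)):
        return ("number",) if oranumber else ("binary_double",)
    if isinstance(value, str):
        return ("varchar2", "char", "clob")
    if isinstance(value, bytes):
        return ("raw", "blob")
    # Other Python types (including int) are not listed in the table.
    raise TypeError(f"no mapping listed for {type(value).__name__}")


default_float = oracle_write_types(1.5)
double_float = oracle_write_types(1.5, oranumber=False)
text_types = oracle_write_types("BOSTON")
```

With `oranumber=True` (the default in the table) floating-point values are written as NUMBER; switching it off writes BINARY_DOUBLE instead.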
  • 52. AutoML – new with OML4Py in Oracle Advanced Analytics
      • Goal: increase model quality and data scientist productivity while reducing overall compute time
      • Auto Feature Selection
        – Reduce the number of features by identifying the most relevant
        – Improve performance and accuracy
      • Auto Model Selection for classification and regression
        – Identify best algorithm to achieve maximum score
        – Find best model many times faster than with exhaustive search techniques
      • Auto Tuning of Hyper-parameters
        – Significantly improve model accuracy
        – Avoid manual or exhaustive search techniques
  • 53. Auto Feature Selection: Motivation & Example
      • Many real-world datasets have a large number of irrelevant features
      • Irrelevant features slow down training
      • Goal: speed up the ML pipeline by selecting the most relevant features
      • OpenML dataset 40996 (56000 rows, 784 columns), using SVM Gaussian with Auto Feature Selection:
        – Features reduced from 784 to 309
        – Accuracy improves from 65.9% to 84.3%
        – Training time reduced 1.3x
      [Figure: bar charts "ML training time" (seconds, annotated 33x) and "Prediction Accuracy" (annotated +4%); caption: OpenML dataset 312 with 1925 rows]
  • 54. Auto Feature Selection: Evaluation for OAA SVM Gaussian
      • 150 datasets with more than 500 cases
      • Avg accuracy gain: 2.5%
      • Avg feature reduction: 52%
      [Figure: accuracy chart comparing two series]
  • 55. Auto Feature Selection Example
      fs = FeatureSelection(mining_function = 'classification',
                            score_metric = 'accuracy')
      selected_features = fs.reduce('dt', X_train, y_train)
      X_train = X_train[:,selected_features]
  • 56. Auto Model Selection Example
      ms = ModelSelection(mining_function = 'classification',
                          score_metric = 'accuracy')
      best_model = ms.select(X_train, y_train)
      y_pred = best_model.predict(X_test)
  • 57. Auto Tune Example
      at = Autotune(mining_function = 'classification',
                    score_metric = 'accuracy')
      evals = at.tune('dt', X_train, y_train)
      mod = evals['best_model']
      y_pred = mod.predict(X_test)
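To make concrete what the model selection and tuning examples automate, here is the plain baseline loop written out by hand: fit every candidate, score each on held-out data, keep the best. This is a deliberately simple exhaustive sketch, not how OML4Py's AutoML works internally (the slides present AutoML as avoiding exhaustive search). The candidate models, function names and data are all assumptions for illustration.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def fit_majority(X, y):
    """Baseline model: always predict the most frequent training label."""
    label = max(set(y), key=y.count)
    return lambda row: label


def fit_stump(X, y, feature=0):
    """One-feature threshold model; the cut point is its tuned parameter."""
    best_acc, best_cut = -1.0, None
    for cut in sorted({row[feature] for row in X}):
        preds = [1 if row[feature] > cut else 0 for row in X]
        acc = accuracy(y, preds)
        if acc > best_acc:
            best_acc, best_cut = acc, cut
    return lambda row: 1 if row[feature] > best_cut else 0


def select_model(candidates, X_train, y_train, X_val, y_val):
    """Fit each candidate and return (name, score, model) of the best one."""
    scored = []
    for name, fit in candidates.items():
        model = fit(X_train, y_train)
        score = accuracy(y_val, [model(row) for row in X_val])
        scored.append((score, name, model))
    best_score, best_name, best_model = max(scored, key=lambda s: s[0])
    return best_name, best_score, best_model


# Made-up one-feature dataset with binary labels.
X_train = [[1.0], [2.0], [3.0], [6.0], [7.0], [8.0], [9.0]]
y_train = [0, 0, 0, 1, 1, 1, 1]
X_val = [[2.5], [4.0], [7.5], [0.5]]
y_val = [0, 1, 1, 0]

candidates = {"majority": fit_majority, "stump": fit_stump}
best_name, best_score, best_model = select_model(
    candidates, X_train, y_train, X_val, y_val
)
```

The cost of this loop grows with the number of candidates and parameter values tried, which is the motivation the slides give for automating the search.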
  • 58. OML4Py - Deployment Architecture
      [Diagram: web browsers connecting to a Zeppelin / Jupyter web interface; OML4Py client Python engine; Oracle Database with Python 3 engine, OAA / OML4Py, Python Script Repository and Python Object Datastore; BDA / Hadoop via Big Data SQL; Oracle Analytics Cloud, Oracle Data Visualization Desktop, OBIEE]
  • 59. Summary - Oracle Machine Learning for Python
      • Oracle Database enabled with the Python scripting language and environment for the enterprise via the Oracle Advanced Analytics option
      • Oracle’s Python technologies extend Python for enterprise use
        – Supports data analysis, exploration, and machine learning
        – Enables streamlined production development
        – Automates key data science steps for greater data scientist productivity, while enhancing accuracy and performance
      • Achieve performance and scalability leveraging Oracle Database as a high performance compute engine