Financial Services companies are using machine learning to reduce fraud, streamline processes, and improve their bottom line. AWS provides tools that make it easy to use frameworks like MXNet and TensorFlow for predictive analytics, clustering, and more advanced data analyses. In this session, you'll hear how IHS Markit has used machine learning on AWS to help global banking institutions manage their commodities portfolios. You will also learn how Amazon Machine Learning can take the hassle out of AI.
2. Takeaways from today’s session
I. How is machine learning changing Financial Services?
II. Case Study: IHS Markit’s commodity inventory solution
III. How do customers get started?
IV. Best practices when building machine learning in the cloud
4. Artificial Intelligence
A system or service that can perform tasks usually requiring human intelligence.
[Diagram: nested circles, AI ⊃ ML ⊃ DL. AI: hardware + learning algorithms. ML: fit a function to the data. DL: fit a network structure to the data.]
7. The challenge: SCALE
• Data: PBs and EBs of existing data; aggressive migration
• Training: tons of GPUs; elastic capacity; pre-built images
• Prediction: tons of GPUs and CPUs; serverless; at the edge, on IoT devices
9. Financial Services use cases

Business groups and use cases:
• Asset Management: equity ratings, portfolio management, portfolio optimization
• Banking: credit card/lending fraud, customer onboarding, digital banking
• Compliance: anti-money laundering, counter-terrorism financing, due diligence
• Security: cyber-attack detection, data leak detection, virus detection

ML models:
- Machine learning (clustering, SVM, logistic regression, Gaussian processes)
- Deep learning (embeddings, matrix factorization, LSTM, conversational UI)
- Graphical models

AWS tools:
- Amazon Machine Learning
- Deep learning with MXNet
- Spark ML on Amazon EMR
- TensorFlow
12. Commodity inventory challenges
• Global warehouses are all different
• Diverse flow of paper documents, from warehouse inventory reports to bills of lading
• Digitize, classify, and aggregate the document flow
• Improve cost and risk management
• Localization diversity
• Eliminate error-prone manual operations
13. Commodity inventory solution

[Diagram: global warehouse locations and aluminum inventories. Singapore (Al 123.4 mT), Vlissingen (Al 234.5 mT), Qingdao (Al 567.8 mT)]

1. Report generation (vendors): Vendors continue to generate their reports as usual. Documents are redirected to a Markit server.
2. Process (IHS Markit): Markit scrapes the critical data from each warehouse's unique report using a variety of technologies, including OCR.
3. Audit: Markit stores a digital copy of the original document for auditing purposes.
4. Disseminate: The data is standardized for language and terminology, then compiled onto a customer portal.
5. Data access (trading firms): Dependent teams at the trading firms (logistics teams, finance teams, compliance teams, schedulers/traders) connect to the portal to pull a personalized view of the required data.
14. IHS Markit – cloud strategy
• Cloud first strategy
• Cloud native vs. lift and shift
• Time-to-market focus
• Investment upfront
• Refreshing the estate
• Financial transparency
• A DevOps world
• Global scale
• Cost savings in the end
15. Architecture approach
• Microservices design approach
• PaaS supporting orchestration and discovery
• OCR, classification, and analysis of documents
• Cloud enablement with some lock-in
• API driven
• Modern UI, SSO, Preferences, Notifications
• SDLC – Agile, CI/CD, high velocity of change
• Highly available (Multi-AZ/Global)
16. Machine learning enablement
• Document classification and templates to reduce exceptions
• Configuration data and models (Keras + Tensorflow)
• Use of prebuilt ML AMI-based images for tooling vs. DIY
• AWS compute with GPU on-demand for trainer
• Data lake of document classifications (strategic)
[Pipeline: document repository → OCR → classification → predictor → QA review, exposed via ML doc APIs; corrected predictions feed back into the classification trainer]
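The document pipeline described on this slide can be sketched as a chain of composed functions. Every function below is a stub standing in for the real components (OCR engines, Keras + TensorFlow classifiers, the QA review step); the names and field values are hypothetical, chosen only to show how corrected output flows back toward the trainer.

```python
# Minimal sketch of the pipeline: OCR -> classification -> prediction -> QA review.
def ocr(document):
    # stub for the OCR stage: extract raw text from a scanned document
    return document["raw_text"]

def classify(text):
    # stub classifier: a real model would match text to a warehouse-report template
    return "warehouse_inventory" if "inventory" in text else "unknown"

def predict(text, doc_class):
    # stub extraction of structured fields from a classified document
    return {"class": doc_class, "fields": {"metal": "Al"}}

def qa_review(prediction):
    # corrected predictions from this step would feed the classification trainer
    prediction["reviewed"] = True
    return prediction

doc = {"raw_text": "monthly inventory report"}
text = ocr(doc)
result = qa_review(predict(text, classify(text)))
print(result["class"], result["reviewed"])
```

The design point the slide makes is the feedback loop: QA-corrected predictions become new training data, which is why the trainer sits beside, not inside, the serving path.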
17. ML product maintenance

[Architecture: P2 instances running an ML AMI handle training. A job scheduler trains models with new data, and new models are designed as needed; trained models are loaded by the production predictor (AWS Lambda), which handles new documents and prediction corrections. Documents and results flow through a data lake on Amazon S3.]
19. Commodity tracker benefits
• 80% of docs STP (straight-through processing): focus human effort on the exceptions.
• 60% improvement in timeliness: reduce risk by cutting the lag between data collection and use.
• 50% increase in efficiency: empower teams to focus energy on value-creation activities.
21. Amazon Machine Learning (Amazon ML)
Easy-to-use, managed machine learning
service built for developers
Robust, powerful machine learning
technology based on Amazon’s internal
systems
Create models using your data already
stored in the AWS cloud
Deploy models to production in seconds
22. Easy to use and developer-friendly
Use the intuitive, powerful service console to build
and explore your initial models
• Data retrieval
• Model training, quality evaluation, fine-tuning
• Deployment and management
Automate model lifecycle with fully featured APIs and
SDKs
• Java, Python, .NET, JavaScript, Ruby, PHP
Easily create smart iOS and Android applications with
AWS Mobile SDK
23. Powerful machine learning technology
Based on Amazon’s battle-hardened internal systems
Not just the algorithms:
• Smart data transformations
• Input data and model quality alerts
• Built-in industry best practices
Grows with your needs
• Train on up to 100 GB of data
• Generate billions of predictions
• Obtain predictions in batches or real-time
24. Integrated with the AWS data ecosystem
Access data that is stored in Amazon S3,
Amazon Redshift, or MySQL databases in
Amazon RDS
Output predictions to Amazon S3 for easy
integration with your data flows
Use AWS Identity and Access Management
(IAM) for fine-grained data access permission
policies
25. Fully managed model and prediction services
End-to-end service, with no servers to provision
and manage
One-click production model deployment
Programmatically query model metadata to
enable automatic retraining workflows
Monitor prediction usage patterns with Amazon
CloudWatch metrics
26. Three supported types of predictions
Binary classification
• Predict the answer to a Yes/No question
Multiclass classification
• Predict the correct category from a list
Regression
• Predict the value of a numeric variable
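The three prediction types map directly onto the `ml_model_type` strings used in the `create_ml_model` call shown later. The chooser function below is our own illustration, not part of the service API; only the three uppercase type strings come from Amazon ML.

```python
# The three Amazon ML model types, keyed by the kind of question answered.
MODEL_TYPES = {
    "yes/no": "BINARY",        # binary classification
    "category": "MULTICLASS",  # multiclass classification
    "number": "REGRESSION",    # numeric regression
}

def model_type_for(question_kind):
    """Return the ml_model_type string for a kind of question (illustrative helper)."""
    return MODEL_TYPES[question_kind]

print(model_type_for("number"))  # REGRESSION
```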
31. Building smart applications with Amazon ML
Step 1 of 3: Train model
- Create a datasource object pointing to your data
- Explore and understand your data
- Transform data and train your model
32. Create a datasource object
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> ds = ml.create_data_source_from_s3(
...     data_source_id='my_datasource',
...     data_spec={
...         'DataLocationS3': 's3://bucket/input/data.csv',
...         'DataSchemaLocationS3': 's3://bucket/input/data.schema'},
...     compute_statistics=True)
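The `data.schema` file referenced in the datasource call is a JSON document describing each column. A minimal sketch is below; the attribute names and target are hypothetical, while the `attributeType` values (BINARY, CATEGORICAL, NUMERIC, TEXT) are the ones Amazon ML recognizes.

```python
import json

# Illustrative Amazon ML schema for a CSV with a hypothetical binary target.
schema = {
    "version": "1.0",
    "targetAttributeName": "will_default",   # hypothetical target column
    "dataFormat": "CSV",
    "dataFileContainsHeader": True,
    "attributes": [
        {"attributeName": "age", "attributeType": "NUMERIC"},
        {"attributeName": "employment", "attributeType": "CATEGORICAL"},
        {"attributeName": "notes", "attributeType": "TEXT"},
        {"attributeName": "will_default", "attributeType": "BINARY"},
    ],
}

schema_json = json.dumps(schema, indent=2)
print(schema_json)
```

This file would be uploaded to the `DataSchemaLocationS3` path before creating the datasource.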
34. Train your model
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> model = ml.create_ml_model(
...     ml_model_id='my_model',
...     ml_model_type='REGRESSION',
...     training_data_source_id='my_datasource')
35. Building smart applications with Amazon ML
Step 2 of 3: Evaluate and optimize
- Measure and understand model quality
- Adjust model interpretation
39. Building smart applications with Amazon ML
Step 3 of 3: Retrieve predictions
- Batch predictions
- Real-time predictions
40. Batch predictions
Asynchronous, large-volume prediction generation
Request through service console or API
Best for applications that deal with batches of data records
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> bp = ml.create_batch_prediction(
...     batch_prediction_id='my_batch_prediction',
...     batch_prediction_data_source_id='my_datasource',
...     ml_model_id='my_model',
...     output_uri='s3://examplebucket/output/')
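Because batch predictions are asynchronous, an application typically polls the job until it reaches a terminal status before reading the output from S3. Below is a sketch of such a loop; `poll` is a stub standing in for a call like `ml.get_batch_prediction(batch_prediction_id)`, and the terminal statuses match Amazon ML's COMPLETED/FAILED states.

```python
import time

def wait_for_batch(poll, delay=0.0, max_tries=10):
    """Poll a batch prediction until it completes or fails (illustrative sketch)."""
    for _ in range(max_tries):
        status = poll()["Status"]
        if status in ("COMPLETED", "FAILED"):
            return status
        time.sleep(delay)  # back off between polls
    raise TimeoutError("batch prediction did not finish")

# Stubbed status sequence standing in for repeated service calls:
responses = iter([{"Status": "PENDING"}, {"Status": "INPROGRESS"},
                  {"Status": "COMPLETED"}])
print(wait_for_batch(lambda: next(responses)))  # COMPLETED
```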
41. Real-time predictions
Synchronous, low-latency, high-throughput prediction generation
Request through service API, server, or mobile SDKs
Best for interactive applications that deal with individual data records
>>> import boto
>>> ml = boto.connect_machinelearning()
>>> ml.predict(
...     ml_model_id='my_model',
...     predict_endpoint='example_endpoint',
...     record={'key1': 'value1', 'key2': 'value2'})
{
    'Prediction': {
        'predictedValue': 13.284348,
        'details': {
            'Algorithm': 'SGD',
            'PredictiveModelType': 'REGRESSION'
        }
    }
}
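An application consumes the `predict()` response by reading the nested dictionary it returns. The sketch below pulls the fields out of a response shaped like the example above (the values are the example's, not live output).

```python
# A response dict shaped like the real-time predict() result shown above.
response = {
    "Prediction": {
        "predictedValue": 13.284348,
        "details": {"Algorithm": "SGD", "PredictiveModelType": "REGRESSION"},
    }
}

prediction = response["Prediction"]
value = prediction["predictedValue"]          # the numeric prediction itself
model_type = prediction["details"]["PredictiveModelType"]
print(value, model_type)
```

For BINARY models the same structure carries `predictedLabel` alongside `predictedScores`, so checking the model type before reading fields is a reasonable defensive pattern.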