2. Agenda
• Artificial Intelligence At Amazon
• Text-to-speech: Amazon Polly
• Object and face detection: Amazon Rekognition
• Machine Learning as a service: Amazon Machine Learning
• Spark MLlib on Amazon EMR
• Apache MXNet: Deep Learning
• Resources
3. • Artificial Intelligence: design software applications which
exhibit human-like behavior, e.g. speech, natural language
processing, reasoning or intuition
• Machine Learning: teach machines to learn without being
explicitly programmed
• Deep Learning: using neural networks, teach machines to
learn from complex data where features cannot be explicitly
expressed
4. Artificial Intelligence At Amazon
Thousands Of Employees Across The Company Focused on AI
• Discovery & Search
• Fulfilment & Logistics
• Enhance Existing Products
• Define New Categories Of Products
• Bring Machine Learning To All
16. Amazon Machine Learning
• Easy-to-use, managed machine learning service built for developers
• Robust, powerful technology based on Amazon’s internal systems
• Create regression and classification models using your data already stored in the AWS Cloud
• Deploy models to production in seconds
17. Fraud.net Uses AWS to Quickly, Easily Detect Online Fraud
Fraud.net is the world’s leading crowdsourced fraud prevention platform.
• Needed to build and train a larger number of more targeted machine-learning models
• Uses Amazon Machine Learning to provide more than 20 models
• Easily builds and trains models to effectively detect online payment fraud
• Reduces complexity and makes sense of emerging fraud patterns
• Saves clients $1 million weekly by helping them detect and prevent fraud
“Amazon Machine Learning helps us reduce complexity and make sense of emerging fraud patterns.”
Oliver Clark, CTO, Fraud.net
20. Amazon Elastic MapReduce (EMR)
• MapReduce, Apache Spark, Presto, etc.
• Launch a cluster in minutes
• Open source distribution or MapR distribution
• Elasticity of the cloud
• Built-in security features
• Pay by the hour and save with Spot instances
• Flexibility to customize
21. Integration with AWS backends
Amazon EMR connects to:
• Amazon DynamoDB: EMR-DynamoDB connector
• Amazon RDS: JDBC Data Source w/ Spark SQL
• Amazon Kinesis: Streaming data connectors
• Amazon ES: Elasticsearch connector
• Amazon Redshift: Spark-Redshift connector
• Amazon S3: EMR File System (EMRFS)
22. Running Spark jobs on EMR
• Amazon EMR Step API: submit a Spark application
• AWS Data Pipeline, or Airflow, Luigi, or other schedulers on EC2: create a pipeline to schedule job submission or create complex workflows
• AWS Lambda: use AWS Lambda to submit applications to the EMR Step API or directly to Spark on your cluster
• AWS Glue
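As a minimal sketch of the Step API route, the Python snippet below builds a step definition that runs `spark-submit` through `command-runner.jar` and, optionally, submits it with the boto3 `add_job_flow_steps` call. The step name, the S3 paths, the default region and the helper names `build_spark_step` and `submit_step` are hypothetical and only serve the illustration; the slides do not prescribe this code.

```python
from typing import Dict, List


def build_spark_step(name: str, app_uri: str, app_args: List[str]) -> Dict:
    """Build one EMR step that runs spark-submit through command-runner.jar."""
    return {
        "Name": name,
        # Keep the cluster running if the step fails.
        "ActionOnFailure": "CONTINUE",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": ["spark-submit", "--deploy-mode", "cluster", app_uri]
            + list(app_args),
        },
    }


def submit_step(cluster_id: str, step: Dict, region: str = "us-east-1") -> str:
    """Submit the step through the EMR Step API and return the new step id."""
    # Imported lazily so the builder above can be used without the AWS SDK.
    import boto3

    emr = boto3.client("emr", region_name=region)
    response = emr.add_job_flow_steps(JobFlowId=cluster_id, Steps=[step])
    return response["StepIds"][0]


# Hypothetical application and data locations, for illustration only.
step = build_spark_step(
    "spam-detector",
    "s3://my-bucket/jobs/spam_detector.py",
    ["s3://my-bucket/data/spam.txt", "s3://my-bucket/data/ham.txt"],
)
```

Calling `submit_step` with the id of a running cluster would queue the job. The same function could be called from an AWS Lambda handler, which is one way to read the Lambda option listed on this slide.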
23. Spark ML on Amazon EMR: spam detector
Adapted from https://github.com/databricks/learning-spark/blob/master/src/main/scala/com/oreilly/learningsparkexamples/scala/MLlib.scala
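The slide does not reproduce the code. The sketch below illustrates the general technique behind this kind of spam detector: hash the words of each message into a fixed-size term-frequency vector, then train a logistic regression on labelled spam and ham messages. It is a pure-Python stand-in written for this transcript, not the Spark MLlib API and not the code of the linked example. The feature count, learning rate, epoch count and sample messages are all assumptions.

```python
import math
import zlib
from typing import Dict, List, Tuple

NUM_FEATURES = 100  # size of the hashed feature space (assumed value)


def hashing_tf(text: str, num_features: int = NUM_FEATURES) -> Dict[int, float]:
    """Map each word to a bucket with a stable hash and count occurrences."""
    vector: Dict[int, float] = {}
    for word in text.lower().split():
        index = zlib.crc32(word.encode("utf-8")) % num_features
        vector[index] = vector.get(index, 0.0) + 1.0
    return vector


def predict(weights: List[float], bias: float, features: Dict[int, float]) -> float:
    """Return the estimated probability that a message is spam."""
    z = bias + sum(weights[i] * v for i, v in features.items())
    return 1.0 / (1.0 + math.exp(-z))


def train(
    examples: List[Tuple[Dict[int, float], float]],
    epochs: int = 500,
    learning_rate: float = 0.1,
) -> Tuple[List[float], float]:
    """Fit a logistic regression with plain stochastic gradient descent."""
    weights = [0.0] * NUM_FEATURES
    bias = 0.0
    for _ in range(epochs):
        for features, label in examples:
            error = predict(weights, bias, features) - label
            bias -= learning_rate * error
            for i, v in features.items():
                weights[i] -= learning_rate * error * v
    return weights, bias


# Tiny made-up training set: label 1.0 for spam, 0.0 for ham.
spam = ["get cheap stuff by sending money", "win money now cheap offer"]
ham = ["hi dad i started studying spark", "meeting notes for the spark project"]
training = [(hashing_tf(t), 1.0) for t in spam] + [(hashing_tf(t), 0.0) for t in ham]
weights, bias = train(training)

# Score a new message; the value depends on the toy data above.
score = predict(weights, bias, hashing_tf("send money to get cheap stuff"))
```

On an EMR cluster the same two stages would be expressed with Spark's own feature-hashing and classification classes, so that training is distributed across the cluster instead of running in a single process.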
25. Apache MXNet: Open Source library for Deep Learning
• Programmable: simple syntax, multiple languages
• Portable: highly efficient models for mobile and IoT
• High Performance: near linear scaling across hundreds of GPUs
• Most Open: accepted into the Apache Incubator
• Best On AWS: optimized for Deep Learning on AWS
https://mxnet.incubator.apache.org/
27. Running MXNet in Spark
• Perform Data processing and Deep Learning on the same cluster
• Amazon EMR supports GPU instances (g2 family)
• You can use them for distributed training
• This is still experimental
• More information:
https://github.com/apache/incubator-mxnet/tree/master/scala-package/spark
Using AWS, C-SPAN can sample a frame every six seconds for recognition against indexed faces in a database of 97,000 people. Previously, this was done manually: Indexers scrolled through screen captures to identify who was speaking at any given point and select an image to represent each individual in each video. C-SPAN expects to save 8,000 to 9,000 hours a year in labor by automating that process using Rekognition, and will be able to index 100% of its incoming footage and archives.
Today, we have announced Amazon ML, the newest addition to the Amazon Web Services family.
Amazon ML is easy to use, and intended for developers – people who are already most connected and familiar with data instrumentation, pipelines and storage.
Amazon ML is based on the same robust ML technology that is already used within Amazon’s internal systems, generating billions of predictions weekly.
Amazon ML is built to make it simple and reliable to use the data that you are already storing in the AWS cloud, in products like Amazon S3, Amazon Redshift and Amazon RDS.
And lastly, Amazon ML is built to eliminate the gap between having models and using these models to build smart applications. Production deployment is only a click away – and sometimes you won’t even need that one click.
STORY BACKGROUND
Fraud.net uses Amazon Machine Learning to support its machine-learning models.
The company uses Amazon DynamoDB and AWS Lambda to run code without provisioning and managing servers.
Uses Amazon Redshift for data analysis.
SOLUTION & BENEFITS
Launches and trains machine-learning models in almost half the time it took on other platforms.
Reduces complexity and makes sense of emerging fraud patterns.
Saves customers $1 million each week.
CONTENT TAGS
Main use case: Big Data, Analytics, & Business Intelligence (BI)
Keywords: online fraud, fraud detection, machine learning, big data, Amazon Machine Learning, business intelligence
AWS Services used: Amazon DynamoDB, Amazon Redshift, Amazon Machine Learning, Amazon S3, AWS Lambda
Benefits Realized: Agility, Better Performance, Ease of Use, Lower Cost, Reliability, Scalability/Elasticity, Speed