© 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved.
Introducing
Amazon Textract
AIM363
Ranju Das
General Manager
Amazon Textract
Wendy Tse
Sr. Product Manager
AWS
John Newton
Chief Technology Officer
Alfresco
Documents are important
Primary tool of record keeping, communicating, collaborating, and transacting
Finance
Insurance
Real estate
Accounting
Tax management
Medical
Legal
Business management
Education
And many more…
16.3M US mortgage applications ($2.1T) in 2016
About 240M W2 tax forms will be processed for FY2018 in the US
Need for processing documents
How documents are processed today
Optical Character Recognition (OCR)
Manual processing
Rules and template-based extraction
Challenges for processing documents
Manual processing
Expensive
Error-prone
Time-consuming
Challenges for processing documents
Manual processing
Output
1. Exempt is true
2. 28 is true
3. CPP/QPP is true
4. RPC/RRQ is true
Challenges for processing documents
Optical Character Recognition (OCR)
Error-prone
Flat bag of words
Simple documents only
Challenges for processing documents
Optical Character Recognition (OCR)
Output
Extract data quickly & No code templates to
accurately maintain
Challenges for processing documents
Optical Character Recognition (OCR)
Output
Start Date End Date Employer Name Position Held Reason for leaving
1/15/2009 6/30/2013 Any Company Head Baker Family Relocated
Challenges for processing documents
Output
Full Name Date of Birth Gender
John X Doe 01 01 1971
Male
First Middle Last MM DD YYYY
Female
Challenges for processing documents
Rules and template-based extraction
Limited by accuracy of OCR
Significant development and management overhead
Templates are brittle
Challenges for processing documents
Rules and template-based extraction
The well-known W2 US tax form has 100s of variants each year
It looks easy, but …
…not a single corresponding pixel value in common
Amazon Textract features
Text extraction
Table extraction
Form extraction
Amazon Textract—Text Extraction
Blocks: PAGE, PARAGRAPH, LINE, WORD
is washed by waves, and cooled
Amazon Textract: Text Extraction API
DetectDocumentText

Request
Name            Description
Document        Blob or Amazon S3 object

Response
Name            Description
Blocks          List of blocks identified from the document
ID              Unique ID of the unit
Relationships   CHILD
Block type      PAGE, PARAGRAPH, LINE, WORD
Pages           Contains number of pages in the document
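The response described on this slide can be walked with a few lines of Python. This is a minimal sketch, not the service's reference code: it assumes the field names used by the generally available API (`Id`, `BlockType`, `Text`, `Confidence`, `Relationships`), which may differ from the preview shown here, and `sample_response` is a hand-written stand-in rather than real service output. With boto3, a response of this shape would come from `boto3.client("textract").detect_document_text(Document=...)`.

```python
def extract_lines(response):
    """Return the text of every LINE block, in the order the blocks appear."""
    return [
        block["Text"]
        for block in response.get("Blocks", [])
        if block.get("BlockType") == "LINE"
    ]


# Hand-written sample shaped like a DetectDocumentText response (assumption,
# not real output). PAGE -> LINE -> WORD are linked by CHILD relationships.
sample_response = {
    "Blocks": [
        {"Id": "p1", "BlockType": "PAGE",
         "Relationships": [{"Type": "CHILD", "Ids": ["l1"]}]},
        {"Id": "l1", "BlockType": "LINE", "Confidence": 99.1,
         "Text": "is washed by waves, and cooled",
         "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}]},
        {"Id": "w1", "BlockType": "WORD", "Confidence": 99.5, "Text": "is"},
        {"Id": "w2", "BlockType": "WORD", "Confidence": 98.7, "Text": "washed"},
    ]
}

print(extract_lines(sample_response))
```

Filtering on `BlockType` keeps the sketch independent of how many pages or words the document contains.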
Amazon Textract—Table Extraction
Blocks: PAGE, TABLE, CELL
For each ’block’ you get:
• Text
• Confidence score
• Block relationships (e.g. cells within a table)
Amazon Textract: Table Extraction API
AnalyzeDocument with TABLES as the FeatureTypes parameter

Request
Name            Description
Document        Blob or Amazon S3 object
FeatureTypes    TABLES

Response
Name            Description
Blocks          List of blocks identified from the document
ID              Unique ID of the unit
Relationships   CHILD
Block type      PAGE, TABLE, CELL
Pages           Contains number of pages in the document
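To show how TABLE and CELL blocks fit together, here is a small sketch that rebuilds a table as rows of strings. It assumes the field names of the generally available API (`RowIndex` and `ColumnIndex` on CELL blocks, both 1-based, and CHILD relationships from TABLE to CELL and from CELL to WORD); the preview API on the slide may differ. The helper names and `sample_blocks` are hypothetical.

```python
def table_to_rows(blocks, table_id):
    """Rebuild one TABLE block as a list of rows, each a list of cell strings."""
    by_id = {block["Id"]: block for block in blocks}

    def child_ids(block):
        # Collect the IDs listed under every CHILD relationship of this block.
        return [
            child_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == "CHILD"
            for child_id in rel["Ids"]
        ]

    def cell_text(cell):
        words = (by_id[i] for i in child_ids(cell))
        return " ".join(w["Text"] for w in words if w["BlockType"] == "WORD")

    cells = [by_id[i] for i in child_ids(by_id[table_id])]
    cells = [c for c in cells if c["BlockType"] == "CELL"]
    if not cells:
        return []
    n_rows = max(c["RowIndex"] for c in cells)
    n_cols = max(c["ColumnIndex"] for c in cells)
    grid = [[""] * n_cols for _ in range(n_rows)]
    for c in cells:
        grid[c["RowIndex"] - 1][c["ColumnIndex"] - 1] = cell_text(c)
    return grid


def _word(word_id, text):
    return {"Id": word_id, "BlockType": "WORD", "Text": text}


def _cell(cell_id, row, col, word_ids):
    return {"Id": cell_id, "BlockType": "CELL", "RowIndex": row,
            "ColumnIndex": col,
            "Relationships": [{"Type": "CHILD", "Ids": word_ids}]}


# Hand-written sample (assumption): two columns of the employment table
# used earlier in the deck.
sample_blocks = [
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2", "c3", "c4"]}]},
    _cell("c1", 1, 1, ["w1", "w2"]), _cell("c2", 1, 2, ["w3", "w4"]),
    _cell("c3", 2, 1, ["w5"]), _cell("c4", 2, 2, ["w6"]),
    _word("w1", "Start"), _word("w2", "Date"),
    _word("w3", "End"), _word("w4", "Date"),
    _word("w5", "1/15/2009"), _word("w6", "6/30/2013"),
]

print(table_to_rows(sample_blocks, "t1"))
```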
Amazon Textract—Form Extraction
Blocks: PAGE, KEY_VALUE_SET
For each ’block’ of your document:
• Form field name (key) and field value (value) association
• Confidence score
• Page number
• Block relationships
Amazon Textract: Forms Extraction API
AnalyzeDocument with FORMS as the FeatureTypes parameter

Request
Name            Description
Document        Blob or Amazon S3 object
FeatureTypes    FORMS

Response
Name            Description
Blocks          List of blocks identified from the document
ID              Unique ID of the unit
Relationships   KEY, VALUE, CHILD
Block type      PAGE, KEY_VALUE_SET
Pages           Contains number of pages in the document
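The key and value association can be resolved by following relationships between KEY_VALUE_SET blocks. The sketch below assumes the shape of the generally available API, where a key block carries `EntityTypes` of `["KEY"]`, a VALUE relationship pointing at its value block, and CHILD relationships pointing at WORD blocks; the preview API on the slide may differ. `sample_blocks` is hand-written, and mapping an empty value to `None` is a choice made here to mirror the "Infer empty values" example in the deck.

```python
def key_value_pairs(blocks):
    """Map each form field name (key) to its value text, or None if empty."""
    by_id = {block["Id"]: block for block in blocks}

    def related_ids(block, rel_type):
        return [
            rel_id
            for rel in block.get("Relationships", [])
            if rel["Type"] == rel_type
            for rel_id in rel["Ids"]
        ]

    def text_of(block):
        words = (by_id[i] for i in related_ids(block, "CHILD"))
        return " ".join(w["Text"] for w in words if w["BlockType"] == "WORD")

    pairs = {}
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if not is_key:
            continue
        value_text = " ".join(
            text_of(by_id[i]) for i in related_ids(block, "VALUE")
        ).strip()
        pairs[text_of(block)] = value_text or None
    return pairs


# Hand-written sample (assumption): "First" has a value, "Middle" is empty.
sample_blocks = [
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w1"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w2"]}]},
    {"Id": "k2", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v2"]},
                       {"Type": "CHILD", "Ids": ["w3"]}]},
    {"Id": "v2", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"]},
    {"Id": "w1", "BlockType": "WORD", "Text": "First"},
    {"Id": "w2", "BlockType": "WORD", "Text": "John"},
    {"Id": "w3", "BlockType": "WORD", "Text": "Middle"},
]

print(key_value_pairs(sample_blocks))
```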
Amazon Textract: Sync and async
Sync: supports single-page documents such as images (e.g., mobile capture)
Async: for multi-page documents, up to 3,000 pages
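For the asynchronous path, a caller starts a job and then polls for the result. The sketch below covers only the polling half and assumes the boto3 method `get_document_text_detection` with its `JobId` and `NextToken` parameters and `JobStatus` field, as in the generally available API; the preview may have differed. `StubTextractClient` is a hypothetical stand-in so the example runs without AWS access, and the polling defaults are illustrative.

```python
import time


def collect_async_blocks(client, job_id, poll_seconds=5.0, max_polls=60,
                         sleep=time.sleep):
    """Poll an asynchronous text-detection job and return all of its blocks."""
    for _ in range(max_polls):
        page = client.get_document_text_detection(JobId=job_id)
        if page["JobStatus"] != "IN_PROGRESS":
            break
        sleep(poll_seconds)
    else:
        raise TimeoutError(f"job {job_id} still in progress after {max_polls} polls")

    if page["JobStatus"] != "SUCCEEDED":
        raise RuntimeError(f"job {job_id} ended with status {page['JobStatus']}")

    # Large results are paginated; follow NextToken until it is absent.
    blocks = list(page.get("Blocks", []))
    while page.get("NextToken"):
        page = client.get_document_text_detection(
            JobId=job_id, NextToken=page["NextToken"]
        )
        blocks.extend(page.get("Blocks", []))
    return blocks


class StubTextractClient:
    """Stand-in for a boto3 Textract client (assumption, for local runs only)."""

    def __init__(self, pages):
        self._pages = list(pages)

    def get_document_text_detection(self, **kwargs):
        return self._pages.pop(0)


stub_pages = [
    {"JobStatus": "IN_PROGRESS"},
    {"JobStatus": "SUCCEEDED", "Blocks": [{"Id": "a"}], "NextToken": "t"},
    {"JobStatus": "SUCCEEDED", "Blocks": [{"Id": "b"}]},
]
result = collect_async_blocks(
    StubTextractClient(stub_pages), "job-1", sleep=lambda seconds: None
)
print(result)
```

In a real deployment, a completion notification is usually preferable to polling; the loop here is only meant to make the sync versus async distinction concrete.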
Amazon Textract: Text Extraction simplified
Output
Extract data quickly & accurately
No code or templates to maintain
Amazon Textract: Table Extraction simplified
Output {
Start Date: 1/15/2009
End Date: 6/30/2013
Employer Name: Any Company
Position Held: Head Baker
Reason for leaving: Family relocated
}
Amazon Textract: Form Extraction simplified
Output
Full Name:
  First: John
  Middle: X
  Last: Doe
Date of Birth:
  MM: 01
  DD: 01
  YYYY: 1971
Gender:
  Male: True
  Female: False
Text Extraction: OCR reimagined
Orientation
Text Extraction: OCR reimagined
Structure variability
Text Extraction: OCR reimagined
Document variability
Beyond OCR: Segmentation and rectification
Photometric
Beyond OCR: Segmentation and rectification
Geometric
Beyond OCR: Table and cell detection
Understand document structure and context to find tables
Beyond OCR: Table and cell detection
Understand cells even without explicit boundaries
Beyond OCR: Table and cell detection
Variable-sized rows and columns
Beyond OCR: Field name (key) and value Extraction
Detect phrases or groups of words
Output
Full Name:
  First: John
  Middle: X
  Last: Doe
Date of Birth:
  MM: 01
  DD: 01
  YYYY: 1971
Gender:
  Male: True
  Female: False
Beyond OCR: Inferring key/value association
Detect structures of the same form without templates
Beyond OCR: Inferring key/value association
Key and value association
Beyond OCR: Inferring key/value association
Infer empty values
Output
Full Name:
  First: John
  Middle: null
  Last: Doe
Date of Birth:
  MM: 01
  DD: 01
  YYYY: 1971
Gender:
  Male: True
  Female: False
Reference architecture: Index and search documents
Input: Uploaded document images such as tax forms, credit applications, or medical notes
Amazon S3: Uploaded documents are stored in a data lake
AWS Lambda: A Lambda function is triggered to initiate document analysis using the Amazon Textract API
Amazon Textract: Automatically extract text, including key-value pairs and tables
Amazon Elasticsearch Service: Extracted data and confidence scores are indexed to enable document search
Output: Perform contextual search on millions of documents or integrate data into your document management system
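The Lambda step in this architecture has two small, testable pieces: reading the uploaded object's location from the S3 event, and turning extracted blocks into a document for the search index. The sketch below covers only those two pieces under stated assumptions: the event follows the standard S3 notification layout, block field names follow the generally available Textract API, and the output document's field names (`source`, `text`, `min_confidence`) are made up for illustration. The actual Textract and Elasticsearch calls are left out.

```python
from urllib.parse import unquote_plus


def s3_objects_from_event(event):
    """Return (bucket, key) pairs from an S3 notification event."""
    return [
        (record["s3"]["bucket"]["name"],
         # Object keys arrive URL-encoded in S3 event notifications.
         unquote_plus(record["s3"]["object"]["key"]))
        for record in event.get("Records", [])
    ]


def build_search_document(bucket, key, blocks):
    """Flatten LINE blocks into one document ready to be indexed."""
    lines = [b for b in blocks if b.get("BlockType") == "LINE"]
    confidences = [b["Confidence"] for b in lines if "Confidence" in b]
    return {
        "source": f"s3://{bucket}/{key}",
        "text": "\n".join(b["Text"] for b in lines),
        # Lowest line confidence, useful for flagging documents for review.
        "min_confidence": min(confidences) if confidences else None,
    }


# Hand-written samples (assumptions, not real service output).
sample_event = {
    "Records": [
        {"s3": {"bucket": {"name": "docs"},
                "object": {"key": "forms/w2+scan.png"}}}
    ]
}
sample_blocks = [
    {"Id": "l1", "BlockType": "LINE", "Text": "Employer Name",
     "Confidence": 98.0},
    {"Id": "l2", "BlockType": "LINE", "Text": "Any Company",
     "Confidence": 91.5},
    {"Id": "w1", "BlockType": "WORD", "Text": "Employer", "Confidence": 99.0},
]

for bucket, key in s3_objects_from_event(sample_event):
    print(build_search_document(bucket, key, sample_blocks))
```

Keeping the confidence score alongside the text follows the slide's point that both are indexed.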
Reference architecture: Form capture
Input: Customer uses mobile app to capture a photo of a W2 form
Amazon Textract: The Amazon Textract API is integrated into the end-user application to automatically extract text from the W2 form and auto-populate the form fields
Customer Application: Customers experience real-time capture of their tax information by taking a photo instead of manual data entry
Database: User-submitted data is loaded into a database for use in tax preparation
Reference architecture: Extract for NLP
Quickly turn extracted text/data into actionable insights
Input: Uploaded document images of medical notes, explanation of benefits, and patient forms
Amazon S3: Uploaded documents are stored in S3
Amazon Textract: Automatically extract words and lines of text, and tables
NLP: Use natural language processing to extract insights from medical documents
Amazon Elasticsearch Service: Easily search through extracted data and text insights
Output: Discover medical insights to improve patient care
Amazon Textract
Launch customers
John Newton
CTO & Co-Founder
Alfresco Software, Inc
Brad Christus
Sr. Director, Global Presales
Alfresco Software, Inc
More than 80% of business information is locked in unstructured content
Agenda
• Who is Alfresco?
• The need for intelligent OCR
• Alfresco + Amazon Textract
• Use case example
Who is Alfresco?
The need for intelligent OCR
• Paper is expensive
• Customers want mobile capture
• Manual data entry is slow and error-prone
• Access to embedded information speeds decisions and handling
• Extracting key-value and tabular data is necessary to integrate documents with line-of-business systems
• Storing documents with extracted information aids retrieval and proper compliance
Amazon Textract allows us to move beyond just Digital Transformation to Automatic Transformation
Alfresco Digital Business Platform & Amazon Textract
[Diagram labels: JPG, Amazon Textract, JSON]
Use cases
Enterprise document library
Loan applications
Claims & case processing
Transaction & logistics records
Research & analysis
Real-time video
Medical & personnel records
Government records & archives
Discovery & litigation
Media management
Deep Dive: Financial services
Accelerate loan applications
On average, it takes a person 6-8 weeks to close a mortgage loan
Any incorrect data could mean weeks of delay and hours of wasted employee time
With Alfresco and Amazon Textract, you can:
• Automate the loan application process
• Extract data across multiple sources with high levels of accuracy
• Increase security by automating the data entry layer
• Decrease loan processing times
Amazon Textract
Benefits
Amazon Textract Pricing
Price per 100 pages or images processed

Feature                                        Up to 1M pages/month   1M+ pages/month
Text Detection                                 $0.15                  $0.06
Table Extraction (Text Detection included)     $1.50                  $1.00
Key-Value Detection (Text Detection included)  $5.00                  $4.00
All                                            $6.50                  $5.00
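The pricing figures on this slide lend themselves to a quick worked estimate. The sketch below copies the per-100-page prices from the slide and assumes the lower price applies only to pages beyond the first 1M in a month (marginal tiering); the slide does not state how the tiers combine, so treat that as an assumption. The feature keys and function name are hypothetical.

```python
from decimal import Decimal

# US dollars per 100 pages, copied from the pricing slide:
# (price up to 1M pages/month, price beyond 1M pages/month).
PRICE_PER_100_PAGES = {
    "text": (Decimal("0.15"), Decimal("0.06")),
    "tables": (Decimal("1.50"), Decimal("1.00")),
    "forms": (Decimal("5.00"), Decimal("4.00")),
    "all": (Decimal("6.50"), Decimal("5.00")),
}
TIER_LIMIT = 1_000_000  # pages per month billed at the first price


def estimate_monthly_cost(feature, pages):
    """Estimate one month's cost in dollars, assuming marginal tiering."""
    if pages < 0:
        raise ValueError("pages must be non-negative")
    first_price, later_price = PRICE_PER_100_PAGES[feature]
    first_tier_pages = min(pages, TIER_LIMIT)
    later_tier_pages = max(pages - TIER_LIMIT, 0)
    total = first_tier_pages * first_price + later_tier_pages * later_price
    return total / 100  # prices are quoted per 100 pages


print(estimate_monthly_cost("text", 1_000))        # 1,000 pages of text detection
print(estimate_monthly_cost("tables", 1_500_000))  # 1.5M pages of table extraction
```

Under these assumptions, 1,000 pages of text detection come to $1.50, and 1.5M pages of table extraction come to $20,000 ($15,000 for the first 1M pages plus $5,000 for the remaining 500,000).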
Amazon Textract Free Tier
Features                                Free for first three months
Text Detection                          Up to 1,000 pages or images
Table Detection                         Up to 100 pages or images
Key-Value Detection                     Up to 100 pages or images
Text, Table, and Key-Value Detection    Up to 100 pages or images
Textract Regions
US West (Oregon)
US East (N. Virginia)
US East (Ohio)
EU (Ireland)
Amazon Textract
Preview
https://pages.awscloud.com/textract-preview.html
Learn more or sign up
Thank you!
Ranju Das
ranjudas@amazon.com
Wendy Tse
tsewendy@amazon.com
John Newton
john.newton@alfresco.com
Emerging Trends in Big Data, Analytics, Machine Learning, and Internet-of-Thi...Emerging Trends in Big Data, Analytics, Machine Learning, and Internet-of-Thi...
Emerging Trends in Big Data, Analytics, Machine Learning, and Internet-of-Thi...
 
Amazon SageMaker Ground Truth: Build High-Quality and Accurate ML Training Da...
Amazon SageMaker Ground Truth: Build High-Quality and Accurate ML Training Da...Amazon SageMaker Ground Truth: Build High-Quality and Accurate ML Training Da...
Amazon SageMaker Ground Truth: Build High-Quality and Accurate ML Training Da...
 
Using Data to Delight and Retain Customers with ML
Using Data to Delight and Retain Customers with MLUsing Data to Delight and Retain Customers with ML
Using Data to Delight and Retain Customers with ML
 
Machine Learning for the Enterprise, ft. Sony Interactive Entertainment (ENT2...
Machine Learning for the Enterprise, ft. Sony Interactive Entertainment (ENT2...Machine Learning for the Enterprise, ft. Sony Interactive Entertainment (ENT2...
Machine Learning for the Enterprise, ft. Sony Interactive Entertainment (ENT2...
 
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMakerBDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
BDA304 Build Deep Learning Applications with TensorFlow and Amazon SageMaker
 

Amazon Textract Introduces AI for Document Processing

  • 1.
  • 2. © 2018, Amazon Web Services, Inc. or its affiliates. All rights reserved. Introducing Amazon Textract (AIM363). Ranju Das, General Manager, Amazon Textract; Wendy Tse, Sr. Product Manager, AWS; John Newton, Chief Technology Officer, Alfresco
  • 3. Documents are important. Primary tool of record keeping, communicating, collaborating, and transacting: Finance, Insurance, Real estate, Accounting, Tax management, Medical, Legal, Business management, Education, and many more…
  • 4. 16.3M US mortgage applications ($2.1T) in 2016 *
  • 5. About 240M W2 tax forms will be processed for FY2018 in the US *
  • 6. Need for processing documents
  • 7. How documents are processed today: Optical Character Recognition (OCR); Manual processing; Rules and template-based extraction
  • 8. Challenges for processing documents. Manual processing: Expensive, Error-prone, Time-consuming
  • 9. Challenges for processing documents. Manual processing. Output: 1. Exempt is true 2. 28 is true 3. CPP/QPP is true 4. RPC/RRQ is true
  • 10. Challenges for processing documents. Optical Character Recognition (OCR): Error-prone, Flat bag of words, Simple documents only
  • 11. Challenges for processing documents. Optical Character Recognition (OCR). Output: Extract data quickly & No code templates to accurately maintain
  • 12. Challenges for processing documents. Optical Character Recognition (OCR). Output: Start Date End Date Employer Name Position Held Reason for leaving 1/15/2009 6/30/2013 Any Company Head Baker Family Relocated
  • 13. Challenges for processing documents. Output: Full Name Date of Birth Gender John X Doe 01 01 1971 Male First Middle Last MM DD YYYY Female
  • 14. Challenges for processing documents. Rules and template-based extraction: Limited by accuracy of OCR; Significant development and management overhead; Templates are brittle
  • 15. Challenges for processing documents. Rules and template-based extraction: The well-known W2 US tax form has 100s of variants each year
  • 16. It looks easy, but … not a single corresponding pixel value in common
  • 17.
  • 18. Amazon Textract features: Text extraction, Table extraction, Form extraction
  • 19. Amazon Textract: Text Extraction. Blocks: PAGE, PARAGRAPH, LINE, WORD. Sample text: "is washed by waves, and cooled"
  • 20. Amazon Textract: Text Extraction API: DetectDocumentText. Request parameters: Document (Blob or Amazon S3 object). Response fields: Blocks (list of blocks identified from the document); ID (unique ID of the unit); Relationships (CHILD); Block type (PAGE, PARAGRAPH, LINE, WORD); Pages (contains number of pages in the document)
  • 21. Amazon Textract: Table Extraction. Blocks: PAGE, TABLE, CELL. For each ‘block’ you get: • Text • Confidence score • Block relationships (e.g. cells within a table)
  • 22. Amazon Textract: Table Extraction API: AnalyzeDocument with “table” as FeatureTypes parameter. Request parameters: Document (Blob or Amazon S3 object); FeatureTypes (TABLES). Response fields: Blocks (list of blocks identified from the document); ID (unique ID of the unit); Relationships (CHILD); Block type (PAGE, TABLE, CELL); Pages (contains number of pages in the document)
  • 23. Amazon Textract: Form Extraction. Blocks: PAGE, KEY_VALUE_SET. For each ‘block’ of your document: • Form field name (key) and field value (value) association • Confidence score • Page number • Block relationships
  • 24. Amazon Textract: Forms Extraction API: AnalyzeDocument with “forms” as FeatureTypes parameter. Request parameters: Document (Blob or Amazon S3 object); FeatureTypes (FORMS). Response fields: Blocks (list of blocks identified from the document); ID (unique ID of the unit); Relationships (KEY, VALUE, CHILD); Block type (PAGE, KEY_VALUE_SET); Pages (contains number of pages in the document)
  • 25. Amazon Textract: Sync and async. Supports single-page documents such as images (e.g., mobile capture). For multi-page documents, up to 3,000 pages
  • 26. Amazon Textract: Text Extraction simplified. Output: Extract data quickly & accurately. No code or templates to maintain
  • 27. Amazon Textract: Table Extraction simplified. Output: { Start Date: 1/15/2009, End Date: 6/30/2013, Employer Name: Any Company, Position Held: Head Baker, Reason for leaving: Family relocated }
  • 28. Amazon Textract: Form Extraction simplified. Output: Full Name (First: John, Middle: X, Last: Doe); Date of Birth (MM: 01, DD: 01, YYYY: 1971); Gender (Male: True, Female: False)
  • 29.
  • 30. Text Extraction: OCR reimagined. Orientation
  • 31. Text Extraction: OCR reimagined. Structure variability
  • 32. Text Extraction: OCR reimagined. Document variability
  • 33. Beyond OCR: Segmentation and rectification. Photometric
  • 34. Beyond OCR: Segmentation and rectification. Geometric
  • 35. Beyond OCR: Table and cell detection. Understand document structure and context to find tables
  • 36. Beyond OCR: Table and cell detection. Understand cells even without explicit boundaries
  • 37. Beyond OCR: Table and cell detection. Variable-sized rows and columns
  • 38. Beyond OCR: Field name (key) and value extraction. Detect phrases or groups of words. Output: Full Name (First: John, Middle: X, Last: Doe); Date of Birth (MM: 01, DD: 01, YYYY: 1971); Gender (Male: True, Female: False)
  • 39. Beyond OCR: Inferring key/value association. Detect structures of the same form without templates
  • 40. Beyond OCR: Inferring key/value association. Key and value association
  • 41. Beyond OCR: Inferring key/value association. Infer empty values. Output: Full Name (First: John, Middle: null, Last: Doe); Date of Birth (MM: 01, DD: 01, YYYY: 1971); Gender (Male: True, Female: False)
  • 42.
  • 43. Reference architecture: Index and search documents. Input: Uploaded document images such as tax forms, credit applications, or medical notes. Amazon S3: Uploaded documents are stored in a data lake. AWS Lambda: A Lambda function is triggered to initiate document analysis using the Hieroglyph API. Amazon Textract: Automatically extract text, including key-value pairs and tables. Amazon Elasticsearch Service: Extracted data and confidence scores are indexed to enable document search. Output: Perform contextual search on millions of documents or integrate data into your document management system
  • 44. Reference architecture: Form capture. Input: Customer uses mobile app to capture a photo of a W2 form. Amazon Textract: The Amazon Textract API is integrated into the end-user application to automatically extract text from the W2 form and auto-populate the form fields. Customer Application: Customers experience real-time capture of their tax information by taking a photo instead of manual data entry. Database: User-submitted data is loaded into a database for use in tax preparation
  • 45. Reference architecture: Extract for NLP. Quickly turn extracted text/data into actionable insights. Input: Uploaded document images of medical notes, explanation of benefits, and patient forms. Amazon S3: Uploaded documents are stored in S3. Amazon Textract: Automatically extract words and lines of text, and tables. NLP: Use natural language processing to extract insights from medical documents. Amazon Elasticsearch Service: Easily search through extracted data and text insights. Output: Discover medical insights to improve patient care
  • 46.
  • 47. Amazon Textract launch customers
  • 48. John Newton, CTO & Co-Founder, Alfresco Software, Inc; Brad Christus, Sr. Director, Global Presales, Alfresco Software, Inc
  • 49. More than 80% of business information is locked in unstructured content
  • 50. Agenda • Who is Alfresco? • The need for intelligent OCR • Alfresco + Amazon Textract • Use case example
  • 51. Who is Alfresco?
  • 52. The need for intelligent OCR • Paper is expensive • Customers want mobile capture • Manual data entry is slow and error-prone • Access to embedded information speeds decisions and handling • Extracting key-value and tabular data is necessary to integrate documents with line-of-business systems • Storing documents with extracted information aids retrieval and proper compliance
  • 53. Amazon Textract allows us to move beyond just Digital Transformation to Automatic Transformation
  • 54. Alfresco Digital Business Platform & Amazon Textract (diagram labels: JSON, JPG, Amazon Textract)
  • 55. Use cases: Enterprise document library; Loan applications; Claims & case processing; Transaction & logistics records; Research & analysis; Real-time video; Medical & personnel records; Government records & archives; Discovery & litigation; Media management
  • 56. Deep Dive: Financial services. Accelerate loan applications. On average, it takes a person 6-8 weeks to close a mortgage loan. Any incorrect data could mean weeks of delay and hours of wasted employee time. With Alfresco and Amazon Textract, you can: • Automate the loan application process • Extract data across multiple sources with high levels of accuracy • Increase security by automating the data entry layer • Decrease loan processing times
  • 57. Amazon Textract Benefits
  • 58. Amazon Textract Pricing (per 100 pages or images processed). Text Detection: $0.15 up to 1M pages/month, $0.06 at 1M+ pages/month. Table Extraction (Text Detection included): $1.50 up to 1M pages/month, $1.00 at 1M+ pages/month. Key-Value Detection (Text Detection included): $5.00 up to 1M pages/month, $4.00 at 1M+ pages/month. All: $6.50 up to 1M pages/month, $5.00 at 1M+ pages/month
  • 59. Amazon Textract Free Tier (free for first three months). Text Detection: Up to 1,000 pages or images. Table Detection: Up to 100 pages or images. Key-Value Detection. Text, Table, and Key-Value Detection.
  • 60. Textract Regions: US West (Oregon), US East (N. Virginia), US East (Ohio), EU (Ireland)
  • 61. Amazon Textract Preview. Learn more or sign up: https://pages.awscloud.com/textract-preview.html
  • 62. Thank you! Ranju Das, ranjudas@amazon.com; Wendy Tse, tsewendy@amazon.com; John Newton, john.newton@alfresco.com
  • 63.
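
The Blocks and Relationships structure described on slides 20 to 24 can be walked with a few small helpers. The sketch below is an illustration only, not the service's SDK: the helper names (`related_ids`, `text_of`, `table_rows`, `key_values`) and the sample data are hypothetical, and the field names (`Id`, `BlockType`, `Text`, `RowIndex`, `ColumnIndex`, `EntityTypes`) are assumed from the publicly documented Textract response shape, which may differ from the preview shown in the slides.

```python
from collections import defaultdict


def related_ids(block, rel_type):
    """Return the Ids listed under one relationship type (e.g. CHILD, VALUE)."""
    ids = []
    for rel in block.get("Relationships", []):
        if rel["Type"] == rel_type:
            ids.extend(rel["Ids"])
    return ids


def text_of(block, by_id):
    """Join the text of a block's WORD children, in the order listed."""
    children = [by_id[i] for i in related_ids(block, "CHILD")]
    return " ".join(c["Text"] for c in children if c["BlockType"] == "WORD")


def table_rows(table, by_id):
    """Rebuild a TABLE block as a list of rows from its CELL children."""
    grid = defaultdict(dict)
    for cell_id in related_ids(table, "CHILD"):
        cell = by_id[cell_id]
        grid[cell["RowIndex"]][cell["ColumnIndex"]] = text_of(cell, by_id)
    return [[grid[r][c] for c in sorted(grid[r])] for r in sorted(grid)]


def key_values(blocks, by_id):
    """Pair each KEY block with the text of the VALUE block(s) it points to."""
    pairs = {}
    for block in blocks:
        is_key = (block["BlockType"] == "KEY_VALUE_SET"
                  and "KEY" in block.get("EntityTypes", []))
        if is_key:
            values = [by_id[i] for i in related_ids(block, "VALUE")]
            pairs[text_of(block, by_id)] = " ".join(
                text_of(v, by_id) for v in values)
    return pairs


def word(block_id, text):
    """Build a minimal WORD block for the made-up sample below."""
    return {"Id": block_id, "BlockType": "WORD", "Text": text}


# Made-up sample shaped like a response; not real service output.
sample_blocks = [
    word("w1", "Position"), word("w2", "Held"),
    word("w3", "Head"), word("w4", "Baker"),
    word("w5", "Gender:"), word("w6", "Male"),
    {"Id": "c1", "BlockType": "CELL", "RowIndex": 1, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w1", "w2"]}]},
    {"Id": "c2", "BlockType": "CELL", "RowIndex": 2, "ColumnIndex": 1,
     "Relationships": [{"Type": "CHILD", "Ids": ["w3", "w4"]}]},
    {"Id": "t1", "BlockType": "TABLE",
     "Relationships": [{"Type": "CHILD", "Ids": ["c1", "c2"]}]},
    {"Id": "k1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["KEY"],
     "Relationships": [{"Type": "VALUE", "Ids": ["v1"]},
                       {"Type": "CHILD", "Ids": ["w5"]}]},
    {"Id": "v1", "BlockType": "KEY_VALUE_SET", "EntityTypes": ["VALUE"],
     "Relationships": [{"Type": "CHILD", "Ids": ["w6"]}]},
]

by_id = {b["Id"]: b for b in sample_blocks}
rows = table_rows(by_id["t1"], by_id)
fields = key_values(sample_blocks, by_id)
print(rows)
print(fields)
```

In a real application the block list would come from the service response rather than a hand-built sample; every lookup goes through the `by_id` index because blocks refer to each other only by ID.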