Extracting your data shouldn’t be like pulling teeth
Turning Content into Data With Intelligent Data Extraction
Advanced Capture from DocuFi, Inc.
©2014 DocuFi
Time moves on…
Many businesses have moved beyond scanning for storage purposes only.
Users want the brainwork taken out of working with their scans and files.
Capture software should see the images, extract the content, and integrate it into the workflow.
Brain mri.jpg, National Institutes of Health
Key Elements of Intelligent Data Capture: Recognize, Extract, Integrate
Recognition technologies such as OCR and barcode recognition can be used to pull data painlessly from structured or unstructured scans or existing files.
OCR has the greatest impact on the growth of intelligent data extraction, and its potential continues to grow as the technologies improve.
Barcode recognition is the most trustworthy recognition technology for data capture and is widely deployed.
See “What Can Barcodes Do for Me?”
Other Recognition Technologies
OMR (Optical Mark Recognition)
• capturing human-marked data from document forms such as surveys and
• continues to improve in accuracy and demand
ICR (Intelligent Character Recognition)
• handwriting recognition
• not as accurate as OCR
• plays a limited role in some capture systems
• continues to improve in accuracy and demand
After the data has been captured (from barcode, OCR, etc.), pattern matching technology identifies the key data.
Regular expressions (regex) provide a fast and powerful method to search, extract and replace specific data found within scanned documents.
A regular expression is essentially a special text string that describes a search pattern. You could think of regular expressions as extremely powerful wildcards.
See Using Regular Expressions in Document
Management Data Capture and Indexing
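The search, extract and replace operations described above can be sketched with Python's standard `re` module. This is only an illustration: the OCR text, the `Invoice No:` label and the date format are invented for the example and do not come from ImageRamp.

```python
import re

# Hypothetical OCR output from one scanned page (invented for this example).
ocr_text = "Invoice No: INV-20417  Date: 03/15/2014  Total: $1,250.00"

# Search: find a date written as MM/DD/YYYY anywhere in the text.
date_match = re.search(r"\b\d{2}/\d{2}/\d{4}\b", ocr_text)

# Extract: capture only the identifier that follows the label.
invoice_match = re.search(r"Invoice No:\s*([A-Z]+-\d+)", ocr_text)

# Replace: rewrite the date as YYYY-MM-DD using the captured groups.
iso_text = re.sub(r"\b(\d{2})/(\d{2})/(\d{4})\b", r"\3-\1-\2", ocr_text)

print(date_match.group(0))     # the matched date
print(invoice_match.group(1))  # the captured invoice identifier
print(iso_text)                # the text with the date rewritten
```

The parentheses in a pattern mark the part to keep, which is what turns a plain search into an extraction.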
Regex lookahead and lookbehind, together with line item extraction, go beyond basic zonal OCR and let you identify and extract data from unstructured documents. They let you search for an identifiable keyword or string, such as “PO Number”, and then apply a word pattern to identify the desired text to extract.
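A minimal sketch of the keyword-then-pattern idea, using Python's `re` module rather than any ImageRamp feature. The sample text and the exact `PO Number: ` label are assumptions made for the example. Note that Python only accepts fixed-width lookbehind patterns, so the label has to be spelled out exactly.

```python
import re

# Hypothetical text recognized from an unstructured document.
text = "Ship To: Acme Dental\nPO Number: 4500123\nQty: 12 units\n"

# Lookbehind: match the token right after the keyword, without
# including the keyword itself in the result.
po = re.search(r"(?<=PO Number: )\w+", text)

# Lookahead: match digits only when they are followed by " units".
qty = re.search(r"\d+(?= units)", text)

print(po.group(0))
print(qty.group(0))
```

Because the keyword anchors the match, the value can sit anywhere on the page, which is the difference from zonal OCR where the position must be fixed.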
Real World Example
There’s a Mountain of It!
Here is a partial invoice where you might need to capture the “Catalogue Number” with line item extraction technology.
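Line item extraction of a Catalogue Number column could look roughly like the sketch below. The invoice rows, the `DX-1001` number format and the column layout are hypothetical, since the invoice image from the slide is not reproduced here.

```python
import re

# Hypothetical OCR text of a partial invoice (invented for illustration).
invoice_text = "\n".join([
    "Catalogue Number  Description       Qty  Unit Price",
    "DX-1001           Prophy paste      2    14.50",
    "DX-2040           Nitrile gloves M  10   8.25",
    "Subtotal                                 111.50",
])

# One pattern describes a whole line item; re.MULTILINE lets ^ and $
# anchor at every line, so header and subtotal rows are skipped.
LINE_ITEM = re.compile(
    r"^(?P<catalogue>[A-Z]{2}-\d{4})\s+"
    r"(?P<description>.+?)\s+"
    r"(?P<qty>\d+)\s+"
    r"(?P<unit_price>\d+\.\d{2})$",
    re.MULTILINE,
)

line_items = [m.groupdict() for m in LINE_ITEM.finditer(invoice_text)]
catalogue_numbers = [item["catalogue"] for item in line_items]
print(catalogue_numbers)
```

Each match yields one record per row, so the same pattern keeps working however many line items the invoice contains.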
So once the key data has been identified or “extracted”, how can it be used?
Split Files
A large single file can be split into multiple files based on information extracted from barcodes and content.
Name Files and Folders
Name files, folders and subfolders with extracted information from the file or system information.
Route Files
Route the files to another directory (and even create the folder and subfolder names) using content.
Index
Create indexes from extracted information for the “searchable” fields.
Create PDF Bookmarks
Create PDF bookmarks based on extracted information.
Validation
Data can be validated against business rules to reduce errors.
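As a rough sketch of two of these uses, the code below validates extracted fields against simple business rules and then builds a folder, subfolder and file name from them. The field names, the `INV-12345` rule and the `archive` root folder are hypothetical and only stand in for whatever rules a real deployment would define.

```python
import re
from pathlib import PurePosixPath

def validate(fields):
    """Return a list of rule violations (empty when the record passes)."""
    errors = []
    if not re.fullmatch(r"INV-\d{5}", fields.get("invoice_number", "")):
        errors.append("invoice_number must look like INV-12345")
    try:
        if float(fields.get("total", "")) <= 0:
            errors.append("total must be positive")
    except ValueError:
        errors.append("total must be numeric")
    return errors

def destination(fields, root="archive"):
    """Build root/vendor/year/invoice.pdf from extracted values."""
    # Replace characters that are unsafe in folder names with underscores.
    vendor = re.sub(r"[^A-Za-z0-9_-]+", "_", fields["vendor"]).strip("_")
    return PurePosixPath(root) / vendor / fields["year"] / f"{fields['invoice_number']}.pdf"

good = {"invoice_number": "INV-20417", "vendor": "Acme Dental, Inc.",
        "year": "2014", "total": "1250.00"}
bad = {"invoice_number": "20417", "total": "abc"}

print(validate(good))
print(destination(good))
print(validate(bad))
```

Running validation before routing keeps badly recognized values out of file names and indexes.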
Integrate
Integration means sharing the information with:
• A simple search and retrieval system
• A Document Management (DM) system
• An Enterprise Content Management (ECM) system
• A back-end application such as an Enterprise Resource Planning (ERP) system
Molar on implant, jbessade, own work, www.fr.wikipedia.org
Henry Schein: Dentrix, Dentrix Enterprise, Dentrix Ascend, Easy Dental, Viive, DentalVision, axiUm, …
ImageRamp can share the extracted data with anyone who can accept a standard XML or CSV file:
• Laserfiche
• FileNet
• MyMedicalRecords
• Eaglesoft
• Allscripts
• Dentrix
• Documentum
• Epic
• Anyone
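Producing a standard CSV or XML file from extracted values can be sketched with Python's standard library. The record fields and the `records`/`record` element names are assumptions for the example; a receiving system would dictate its own schema.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical extracted records (invented for illustration).
records = [
    {"invoice_number": "INV-20417", "catalogue_number": "DX-1001", "qty": "2"},
    {"invoice_number": "INV-20417", "catalogue_number": "DX-2040", "qty": "10"},
]

def to_csv(rows):
    """Serialize the records as CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()

def to_xml(rows):
    """Serialize the records as one <record> element per row."""
    root = ET.Element("records")
    for row in rows:
        rec = ET.SubElement(root, "record")
        for key, value in row.items():
            ET.SubElement(rec, key).text = value
    return ET.tostring(root, encoding="unicode")

csv_text = to_csv(records)
xml_text = to_xml(records)
print(csv_text)
print(xml_text)
```

Both outputs carry the same fields, so the choice between them depends only on what the target system accepts.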
So smile, this is where the content becomes data.
Remember our Real World Example?
There’s a Mountain of It!
If a stack of invoices were scanned at one time, the file could be split at each unique occurrence of the Invoice Number and named with the extracted invoice number. Furthermore, the Invoice Number could be shared with an AP system.
The Catalogue Numbers could be extracted and shared with an ERP for inventory purposes.
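The splitting step can be sketched as grouping page indexes by the invoice number found on each page. The `Invoice Number:` label, the `INV-` format and the sample pages are hypothetical, and the actual PDF splitting is left out; only the grouping logic is shown.

```python
import re

INVOICE_RE = re.compile(r"Invoice Number:\s*(INV-\d+)")

def split_by_invoice(pages):
    """Group page indexes by invoice number.

    A new group starts when a page shows an invoice number different
    from the current one. Pages without a number stay with the current
    invoice; pages before the first number are skipped.
    """
    documents = {}
    current = None
    for index, text in enumerate(pages):
        match = INVOICE_RE.search(text)
        if match and match.group(1) != current:
            current = match.group(1)
            documents.setdefault(current, [])
        if current is not None:
            documents[current].append(index)
    return documents

# Hypothetical OCR text for a three-page scan of two invoices.
pages = [
    "Invoice Number: INV-20417\nDX-1001 Prophy paste",
    "continued line items",
    "Invoice Number: INV-20418\nTotal due",
]

groups = split_by_invoice(pages)
file_names = [f"{number}.pdf" for number in groups]
print(groups)
print(file_names)
```

The dictionary keys double as file names, which matches the idea of naming each split file with its extracted invoice number.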
So what needs brushing up?
What does the future hold for intelligent data capture?
digicla, “Be good for your teeth and they will be good for you”.
Continued Improvement in Recognition Technologies Including:
• OCR expansion to include services like translation
• Better accuracy of ICR (handwriting recognition)
• Faster, more accurate
Increased Mobility Integration For Smart Phones, Tablets, etc.
Increased Cloud Computing Options
Improved Validation Against Complex Business Rules
Increased Technical Support to Manage the Complexity
Increased Information Governance Issues and Complexity
Want to Learn More about Document Imaging and Capture?
For more on:
• Extracting meta data
• Data extraction from unstructured data
• Intelligent data capture
• Data extraction
• Using regex to extract data
• Document scanning
• Extracting data
• Extract meta data
• Scanner software
• Barcode recognition
• OCR software
• Capture tutorial
• PDF scanning
• Scanning software
• Indexing
• Document indexing
• Automated capture
• Meta data
• Scan to index
• Batch processing
• Bulk scanning
• DocuFi
• ImageRamp
• Data capture
• Migration to document management
DocuFi, makers of ImageRamp, Intelligent Capture Solution
30 years’ experience in the Document Imaging and Capture market
Capture Products: www.docufi.com
Copyright ©2014
Just take a bite and get started with us.

