.consulting .solutions .partnership
Text Analysis with SAP HANA
October 2015 | Text Analysis with SAP HANA - SAP Inside Track Munich
1. Motivation
2. Text Analysis with SAP HANA
3. Enhancement Options - Dictionaries and Rules
Why do we need Text Analysis?
• According to Merrill Lynch, 80-90% of all potentially usable business information may originate in unstructured form
(Structure, Models and Meaning: Is "unstructured" data merely unmodeled?, Intelligent Enterprise, March 1, 2005.)
• The data may originate from:
  - Social networks
  - "Letters" from customers
  - ...
• What is the problem with unstructured data?
• It is unstructured!
  - Not organized
  - No predefined data model
  - No metadata, or a mix of data and metadata
• Result: we have a lot of information that is relevant for the business, but we cannot access it
How can we solve that issue?
• Text analysis: extracting high-quality information from texts
• Typical process of a text analysis:
  - Parsing the text
  - Adding features such as linguistic information
  - Entity recognition: is it an organization, a person, or a place? This includes domain facts such as requests
  - Sentiment analysis: what attitudinal information is "hidden" in the text?
  - Inserting the information into the database in a structured manner
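The process above can be illustrated with a deliberately tiny Python sketch. It is not how SAP HANA works internally: the lexicons, the `ta_result` table, and the function names are all hypothetical, and an in-memory SQLite database stands in for the target database.

```python
import re
import sqlite3

# Hypothetical toy lexicons; a real engine ships far richer linguistic resources.
ORGANIZATIONS = {"sap"}
SENTIMENT = {"great": "POSITIVE", "love": "POSITIVE", "slow": "NEGATIVE", "broken": "NEGATIVE"}


def analyze(doc_id, text):
    """Parse the text into tokens and classify every token found in a lexicon."""
    rows = []
    for position, match in enumerate(re.finditer(r"\w+", text)):
        token = match.group(0)
        key = token.lower()
        if key in ORGANIZATIONS:
            rows.append((doc_id, position, token, "ORGANIZATION"))
        elif key in SENTIMENT:
            rows.append((doc_id, position, token, "SENTIMENT_" + SENTIMENT[key]))
    return rows


def store(conn, rows):
    """Insert the extracted information into a table in structured form."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS ta_result "
        "(doc_id INTEGER, position INTEGER, token TEXT, type TEXT)"
    )
    conn.executemany("INSERT INTO ta_result VALUES (?, ?, ?, ?)", rows)
    conn.commit()


if __name__ == "__main__":
    connection = sqlite3.connect(":memory:")
    store(connection, analyze(1, "SAP support was great but the portal is slow"))
    for row in connection.execute("SELECT token, type FROM ta_result ORDER BY position"):
        print(row)
```

Once the facts sit in a table, they can be filtered, joined, and aggregated like any other structured data.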
What does this have to do with SAP HANA?
[Figure omitted, © SAP SE]
Fulltext Index - Basics
• Starting point: a database table containing the text (column types such as TEXT, NVARCHAR, BLOB …)
• Create a full-text index with the desired options (see the system view SYS.FULLTEXT_INDEXES)
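As a rough sketch of the second step, the Python helper below assembles a `CREATE FULLTEXT INDEX` statement with text analysis switched on. The schema, table, column, and index names are hypothetical, and only two options (`CONFIGURATION` and `TEXT ANALYSIS ON`) are shown; check the Search Developer Guide for the full option list of your revision. The helper only builds strings, so running the statements still requires a HANA connection of your choice.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def create_fulltext_index_sql(index, schema, table, column,
                              configuration="EXTRACTION_CORE"):
    """Build a CREATE FULLTEXT INDEX statement that enables text analysis."""
    config_literal = configuration.replace("'", "''")
    return (
        f"CREATE FULLTEXT INDEX {quote_ident(index)} "
        f"ON {quote_ident(schema)}.{quote_ident(table)}({quote_ident(column)}) "
        f"CONFIGURATION '{config_literal}' "
        "TEXT ANALYSIS ON"
    )


# The system view named on the slide lists existing full-text indexes.
LIST_INDEXES_SQL = "SELECT * FROM SYS.FULLTEXT_INDEXES"

if __name__ == "__main__":
    # Hypothetical object names, for illustration only.
    print(create_fulltext_index_sql("TWEETS_TA", "DEMO", "TWEETS", "BODY"))
    print(LIST_INDEXES_SQL)
```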
Entity Extraction
• To get valuable information out of the data, SAP delivers several configurations
• These configurations focus on entity and fact extraction under specific aspects
• Types of extraction:
  - EXTRACTION_CORE
  - EXTRACTION_CORE_ENTERPRISE
  - EXTRACTION_CORE_PUBLIC_SECTOR
  - EXTRACTION_CORE_VOICEOFCUSTOMER
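Whichever configuration is chosen, the extracted entities and facts end up in a result table that HANA names `$TA_<index name>`. The sketch below builds a summary query over that table; the schema and index names are hypothetical, and the column names `TA_TYPE` and `TA_NORMALIZED` are taken from the Text Analysis Developer Guide as I remember it, so verify them against your system.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def ta_summary_sql(schema, index_name):
    """Build a query that counts extracted entities per type and normalized form.

    Assumes the text analysis results live in a table called $TA_<index_name>
    in the same schema as the indexed table.
    """
    ta_table = "$TA_" + index_name
    return (
        "SELECT TA_TYPE, TA_NORMALIZED, COUNT(*) AS OCCURRENCES "
        f"FROM {quote_ident(schema)}.{quote_ident(ta_table)} "
        "GROUP BY TA_TYPE, TA_NORMALIZED "
        "ORDER BY OCCURRENCES DESC"
    )


if __name__ == "__main__":
    # Hypothetical schema and index names, for illustration only.
    print(ta_summary_sql("DEMO", "TWEETS_TA"))
```

Comparing the output of this query for two configurations on the same data is a quick way to see which one fits a use case.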
Custom Dictionary
• In many use cases you need to extend the dictionary with terms from your business domain
• Structure of a dictionary
[Figure omitted, © SAP SE]
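Since the slide's picture of the dictionary structure is not available here, the following Python sketch generates a small dictionary file instead. The element and attribute names (`dictionary`, `entity_category`, `entity_name`, `standard_form`, `variant`) and the namespace follow the Extraction Customization Guide as I recall it and should be checked against the current guide; the category and the CrossFit terms are made up for illustration.

```python
import xml.etree.ElementTree as ET

# Namespace as recalled from the SAP documentation; treat it as an assumption.
NAMESPACE = "http://www.sap.com/ta/4.0"


def build_dictionary(categories):
    """Build dictionary XML from {category: {standard_form: [variants]}}."""
    ET.register_namespace("", NAMESPACE)
    root = ET.Element(f"{{{NAMESPACE}}}dictionary")
    for category, entities in categories.items():
        category_el = ET.SubElement(
            root, f"{{{NAMESPACE}}}entity_category", name=category)
        for standard_form, variants in entities.items():
            entity_el = ET.SubElement(
                category_el, f"{{{NAMESPACE}}}entity_name",
                standard_form=standard_form)
            for variant in variants:
                ET.SubElement(entity_el, f"{{{NAMESPACE}}}variant", name=variant)
    return ET.tostring(root, encoding="unicode")


if __name__ == "__main__":
    # Hypothetical entity category and entries.
    print(build_dictionary({
        "CROSSFIT_TERM": {
            "Workout of the Day": ["WOD"],
            "As Many Rounds As Possible": ["AMRAP"],
        }
    }))
```

Each entity has one standard form and any number of variants, so different spellings in the text are mapped to one value in the results.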
Text Analysis with HANA – Workflow of Enhancement
1. Find the delivered extraction configuration that best fits your use case
2. Copy the configuration into the target folder
3. Create a new custom dictionary
4. Reference the dictionary in your copy of the configuration
5. Recreate the full-text index using your custom configuration
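The last step of the workflow can be sketched as a pair of SQL statements built in Python: drop the existing index, then create it again with the custom configuration. All object names are hypothetical, and the `package::name` form used for the configuration is an assumption based on how repository objects are usually referenced; adjust it to your landscape.

```python
def quote_ident(name):
    """Quote an SQL identifier, doubling any embedded double quotes."""
    return '"' + name.replace('"', '""') + '"'


def recreate_index_sql(schema, index, table, column, configuration):
    """Return the DROP and CREATE statements needed to rebuild the index."""
    config_literal = configuration.replace("'", "''")
    drop = f"DROP FULLTEXT INDEX {quote_ident(schema)}.{quote_ident(index)}"
    create = (
        f"CREATE FULLTEXT INDEX {quote_ident(index)} "
        f"ON {quote_ident(schema)}.{quote_ident(table)}({quote_ident(column)}) "
        f"CONFIGURATION '{config_literal}' "
        "TEXT ANALYSIS ON"
    )
    return [drop, create]


if __name__ == "__main__":
    # Hypothetical names; 'demo.ta::CROSSFIT_CONFIG' stands for the copied configuration.
    for statement in recreate_index_sql(
            "DEMO", "TWEETS_TA", "TWEETS", "BODY", "demo.ta::CROSSFIT_CONFIG"):
        print(statement)
```

Dropping and recreating the index forces the text analysis to run again, so the new dictionary is applied to all existing rows.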
Text Analysis with HANA – What’s next?
• Assume that we are in an industry-specific context or are mining for slang-like facts and entities
• Sports are a good example of this
• We use the example of CrossFit® … as there are some funny facts to extract
• Question: how can we extract complex entities from a text?
• Examples:
  - Did somebody attend a CrossFit training?
  - Does somebody want to join a CrossFit box?
Text Analysis with HANA – Text Analysis Extraction Rules
• Extraction rules (CGUL rules): a pattern-based language for pattern matching that uses character-based or token-based regular expressions combined with linguistic attributes to define custom entity types
• Goal of the rule sets:
  - Extract complex facts based on relations between entities and predicates
  - Identify entities in domain-specific language and capture facts expressed in new, popular "slang"
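The idea behind such rules can be shown with a Python analogy. This is not CGUL and does not use any SAP API: it is a hand-written token matcher with a hypothetical mini stem table, looking for a trigger verb ("attend", "join") followed closely by the entity "CrossFit".

```python
import re

# Hypothetical mini stem table; a real engine derives stems linguistically.
STEMS = {
    "attended": "attend", "attending": "attend", "attends": "attend",
    "joined": "join", "joining": "join", "joins": "join",
}
TRIGGER_STEMS = ("attend", "join")


def tokenize(text):
    """Split the text into lowercase word tokens paired with their stems."""
    words = re.findall(r"[A-Za-z]+", text.lower())
    return [(word, STEMS.get(word, word)) for word in words]


def extract_facts(text, window=4):
    """Return (predicate, entity) pairs: a trigger verb followed by
    'crossfit' within the next `window` tokens."""
    tokens = tokenize(text)
    facts = []
    for i, (_, stem) in enumerate(tokens):
        if stem not in TRIGGER_STEMS:
            continue
        for word, _ in tokens[i + 1:i + 1 + window]:
            if word == "crossfit":
                facts.append((stem, "crossfit"))
                break
    return facts


if __name__ == "__main__":
    print(extract_facts("I attended a CrossFit training yesterday"))
    print(extract_facts("I want to join a CrossFit box"))
```

A rule language expresses the same kind of pattern declaratively, with stems, parts of speech, and dictionary entities available as matching criteria instead of hand-written loops.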
[Figure: an extraction rule combines regular expressions, tokens, dictionaries, and luck]
Text Analysis with HANA – “Lessons Learned”
• Text analysis on SAP HANA is extremely powerful
• Besides the delivered content, you have many options to adapt the text analysis to extract the entities and facts that you need
• This also means you have many options that you can set the wrong way
• Since SPS 09, rules are compiled upon activation (no separate compilation necessary)
• The documentation is mostly OK but has room for improvement when it comes to extraction rules
• Creating custom dictionaries and text rules is cumbersome; finding an error (e.g. a typo) is hell
  - No support in the IDE
  - You can usually activate all objects and create the index … but the index remains empty
Q&A
Dr. Christian Lechner
Principal IT Consultant
+49 (0) 171 7617190
christian.lechner@msg-systems.com
http://scn.sap.com/people/christian.lechner
@lechnerc77
Text Analysis with HANA – Resources
• SAP HANA Search Developer Guide (Fulltext Index Options)
help.sap.com -> Search Developer Guide
• SAP HANA Text Analysis Developer Guide:
help.sap.com -> TA Developer Guide
• SAP HANA Text Analysis Language Reference Guide:
help.sap.com -> TA Language Reference Guide
• SAP HANA Text Analysis Extraction Customization Guide:
help.sap.com -> TA Extraction Customization Guide
• YouTube Playlist of SAP HANA Academy:
Text Analysis and Search

 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...Apidays New York 2024 - The value of a flexible API Management solution for O...
Apidays New York 2024 - The value of a flexible API Management solution for O...
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
Architecting Cloud Native Applications
Architecting Cloud Native ApplicationsArchitecting Cloud Native Applications
Architecting Cloud Native Applications
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
Why Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire businessWhy Teams call analytics are critical to your entire business
Why Teams call analytics are critical to your entire business
 
Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024Manulife - Insurer Transformation Award 2024
Manulife - Insurer Transformation Award 2024
 
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
Web Form Automation for Bonterra Impact Management (fka Social Solutions Apri...
 
AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024AXA XL - Insurer Innovation Award Americas 2024
AXA XL - Insurer Innovation Award Americas 2024
 

Text Analysis with SAP HANA

  • 2. Text Analysis with SAP HANA. October 2015 | Text Analysis with SAP HANA - SAP Inside Track Munich. Agenda: 1. Motivation; 2. Text Analysis with SAP HANA; 3. Enhancement Options - Dictionaries and Rules
  • 3. Agenda (repeated): 1. Motivation; 2. Text Analysis with SAP HANA; 3. Enhancement Options - Dictionaries and Rules
  • 4. Why do we need Text Analysis?
      • According to Merrill Lynch, 80-90% of all potentially usable business information may originate in unstructured form (Structure, Models and Meaning: Is "unstructured" data merely unmodeled?, Intelligent Enterprise, March 1, 2005).
      • The data may originate from:
          • Social networks
          • "Letters" from customers
          • ...
      • What is the problem with unstructured data? It is unstructured!
          • Not organized
          • No pre-defined data model
          • No metadata, or a mix of data and metadata
      • Result: we have a lot of information that is relevant for the business, but we cannot access it.
  • 5. How can we solve that issue?
      • Text Analysis: extracting high-quality information from texts
      • Typical process of a text analysis:
          • Parsing the text
          • Adding features such as linguistic information
          • Entity recognition: is it an organization, a person, or a place? This includes domain facts like requests.
          • Sentiment analysis: what attitudinal information is "hidden" in the text?
          • Inserting the information into the database in a structured manner
  • 6. Agenda (repeated): 1. Motivation; 2. Text Analysis with SAP HANA; 3. Enhancement Options - Dictionaries and Rules
  • 7. What does this have to do with SAP HANA? (© SAP SE)
  • 8. Fulltext Index - Basics
      • Starting point: a database table containing the text (types like TEXT, NVARCHAR, BLOB …)
      • Create a fulltext index including options (see system view SYS.FULLTEXT_INDEXES)
  • 9. Entity Extraction
      • To get valuable information out of the data, SAP delivers several configurations
      • These configurations focus on entity and fact extraction under specific aspects
      • Types of extraction:
          • EXTRACTION_CORE
          • EXTRACTION_CORE_ENTERPRISE
          • EXTRACTION_CORE_PUBLIC_SECTOR
          • EXTRACTION_CORE_VOICEOFCUSTOMER
  • 11. Agenda (repeated): 1. Motivation; 2. Text Analysis with SAP HANA; 3. Enhancement Options - Dictionaries and Rules
  • 12. Custom Dictionary
      • In several use cases you need to enhance the dictionary to cover your business domain
      • Structure of a dictionary (© SAP SE)
  • 13. Text Analysis with HANA – Workflow of Enhancement
      1. Find the extraction configuration that fits your use case best
      2. Copy the configuration into the target folder
      3. Create a new custom dictionary
      4. Reference the dictionary in your configuration copy
      5. Recreate the fulltext index using your custom configuration
  • 15. Text Analysis with HANA – What's next?
      • Assume that we are in an "industry"-specific context or mining for "slang"-like facts and entities
      • Sports are a good example of this!
      • We use the example of CrossFit®, as there are some funny facts to extract
      • Question: How can we extract complex entities from a text?
      • Examples:
          • Did somebody attend a CrossFit training?
          • Does somebody want to join a CrossFit box?
  • 16. Text Analysis with HANA – Text Analysis Extraction Rules
      • Extraction rules (CGUL rules): a pattern-based language for pattern matching that uses character- or token-based regular expressions combined with linguistic attributes to define custom entity types.
      • Goal of the rule sets:
          • Extract complex facts based on relations between entities and predicates.
          • Identify entities in domain-specific language and capture facts expressed in new, popular "slang"
  • 17. Text Analysis with HANA – Text Analysis Extraction Rules (diagram labels): Extraction Rule; Regular Expressions; Tokens; Luck; Dictionaries
  • 19. Text Analysis with HANA – "Lessons Learned"
      • Text Analysis on SAP HANA is extremely powerful
      • Besides the delivered content, you have many options to adapt the text analysis to extract the entities and facts that you need
      • This also means you have many options that you can set the wrong way
      • Since SP09, rules get compiled upon activation (no separate compilation necessary)
      • The documentation is mostly OK but has room for improvement regarding extraction rules
      • Creating custom dictionaries and text rules is cumbersome; finding an error (e.g. a typo) is hell
          • No support in the IDE
          • You can usually activate all objects and create the index … but the index remains empty
  • 20. Q&A
  • 21. .consulting .solutions .partnership Dr. Christian Lechner Principal IT Consultant +49 (0) 171 7617190 christian.lechner@msg-systems.com http://scn.sap.com/people/christian.lechner @lechnerc77
  • 22. Text Analysis with HANA – Resources
      • SAP HANA Search Developer Guide (Fulltext Index Options): help.sap.com -> Search Developer Guide
      • SAP HANA Text Analysis Developer Guide: help.sap.com -> TA Developer Guide
      • SAP HANA Text Analysis Language Reference Guide: help.sap.com -> TA Language Reference Guide
      • SAP HANA Text Analysis Extraction Customization Guide: help.sap.com -> TA Extraction Customization Guide
      • YouTube Playlist of SAP HANA Academy: Text Analysis and Search
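
Slides 12 and 13 mention the structure of a custom dictionary, but the figure did not survive in this transcript. The following is a minimal sketch of what step 3 of the workflow ("Create a new custom dictionary") might produce. The element and attribute names (dictionary, entity_category, entity_name, standard_form, variant) and the namespace are assumptions based on the layout described in the Extraction Customization Guide and should be checked against that guide. The category name CROSSFIT_EXERCISE, the entries, and the helper build_dictionary are purely hypothetical.

```python
import xml.etree.ElementTree as ET

# Assumed namespace of text analysis dictionaries; verify against the
# Extraction Customization Guide before using it.
TA_NAMESPACE = "http://www.sap.com/ta/4.0"


def build_dictionary(category, entries):
    """Return dictionary XML for a single entity category.

    `entries` maps a standard form to a list of variant spellings.
    This helper is illustrative and not part of any SAP tooling.
    """
    root = ET.Element("dictionary", {"xmlns": TA_NAMESPACE})
    cat = ET.SubElement(root, "entity_category", {"name": category})
    for standard_form, variants in entries.items():
        entity = ET.SubElement(cat, "entity_name",
                               {"standard_form": standard_form})
        for variant in variants:
            ET.SubElement(entity, "variant", {"name": variant})
    return ET.tostring(root, encoding="unicode")


# Hypothetical CrossFit vocabulary, in the spirit of slide 15.
xml_text = build_dictionary(
    "CROSSFIT_EXERCISE",
    {"Burpee": ["burpees"], "Workout of the Day": ["WOD"]},
)
print(xml_text)
```

The generated file would then be referenced from the copied configuration (step 4) before the fulltext index is recreated (step 5). How that reference is written is not covered by this sketch.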

Editor's notes

  1. Text analysis in SAP HANA is a suite of natural-language processing capabilities based on linguistic, statistical and machine-learning algorithms that model and structure the information content of textual sources in multiple languages. This technology forms the foundation for advanced text processing for a range of applications including search, business intelligence or exploratory data analysis.
  2. Fulltext index options:
      • LANGUAGE COLUMN <column_name> - Defines the column where the language of a document is specified.
      • LANGUAGE DETECTION ( <string_literal_list> ) - The set of languages to be considered during language detection.
      • MIME TYPE COLUMN <column_name> - Defines the column where the mime-type of a document is specified.
      • FUZZY SEARCH INDEX <on_off> - Specifies whether a fuzzy search index should be used.
      • PHRASE INDEX RATIO <index_ratio>, with <index_ratio> ::= <exact_numeric_literal> - Specifies the percentage of the phrase index; the value must be between 0.0 and 1.0. The phrase index stores information about the occurrence of words and the proximity of words to one another. If a phrase index is present, phrase searches are sped up (e.g. SELECT * FROM T WHERE CONTAINS(COLUMN1, '"cats and dogs"')). A value of 1.0 means that the internal phrase index can use 100% of the memory size of the fulltext index.
      • CONFIGURATION <string_literal> - The path to a custom configuration file for text analysis.
      • SEARCH ONLY <on_off> - Defines if the original document should be stored or only the search results. When set to ON, the original document content is not stored.
      • FAST PREPROCESS <on_off> - If set to ON, fast preprocessing is used, i.e. linguistic searches are not possible.
      • TEXT ANALYSIS <on_off> - Enables text analysis capabilities on the indexed column. Text analysis can extract entities such as persons, products, or places from documents, which are stored in a new table.
      • MIME TYPE <string_literal> - The default mime type used for preprocessing. The value must be a valid mime type.
      • TOKEN SEPARATORS <string_literal> - A set of characters used for token separation. Only ASCII characters are considered.
      • <change_tracking_elem> ::= SYNC[HRONOUS] | ASYNC[HRONOUS] [FLUSH [QUEUE] <flush_queue_elem>] - The type of index to be created. SYNC[HRONOUS] creates a synchronous fulltext index; ASYNC[HRONOUS] creates an asynchronous fulltext index.
      • FLUSH [QUEUE] <flush_queue_elem>, with <flush_queue_elem> ::= EVERY <integer_literal> MINUTES | AFTER <integer_literal> DOCUMENTS | EVERY <integer_literal> MINUTES OR AFTER <integer_literal> DOCUMENTS - Specifies when to update the fulltext index if an asynchronous index is used. When DOCUMENTS is specified, the fulltext index will be updated after the specified number of changes to the table, including updates and deletes.
      • TEXT MINING <on_off> - Enables text mining capabilities on the indexed column. Text mining provides functionality that can compare documents by examining the terms used within them.
      • TEXT MINING CONFIGURATION <string_literal> - The path to a custom configuration file for text mining. If not specified, DEFAULT.textminingconfig is used.
  3. Entity Extraction is the identification of named entities (persons, organizations etc.), which eliminates the 'noise' in textual data by highlighting salient information. This process transforms unstructured text into structured information.  Fact Extraction is a higher-level semantic processing that links entities as "facts" in domain-specific applications. For example, "Voice of the Customer" classifies sentiments with their corresponding topics.
  4. CGUL - Custom Grouper User Language
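
Slide 8 and editor's note 2 describe creating a fulltext index with options, and slide 9 lists the delivered extraction configurations. As a rough sketch of how these pieces fit together, the Python helper below assembles a CREATE FULLTEXT INDEX statement from a few of the options named in note 2 (CONFIGURATION, LANGUAGE DETECTION, SYNC/ASYNC, TEXT ANALYSIS). The table TWEETS, the column TEXT, the index name TWEETS_TA, and both helper functions are hypothetical. The "$TA_<index name>" result table and its TA_* columns are stated here as an assumption about where HANA stores extracted entities; check the TA Developer Guide for the exact layout.

```python
def create_fulltext_index_sql(index, table, column, configuration,
                              language_detection=(), asynchronous=True):
    """Assemble a CREATE FULLTEXT INDEX statement with text analysis enabled.

    Illustrative only: identifiers are interpolated without escaping, so this
    must not be fed with untrusted input.
    """
    parts = [
        f'CREATE FULLTEXT INDEX "{index}" ON "{table}" ("{column}")',
        f"CONFIGURATION '{configuration}'",
    ]
    if language_detection:
        langs = ", ".join(f"'{lang}'" for lang in language_detection)
        parts.append(f"LANGUAGE DETECTION ({langs})")
    parts.append("ASYNC" if asynchronous else "SYNC")
    parts.append("TEXT ANALYSIS ON")
    return "\n  ".join(parts) + ";"


def ta_results_sql(index):
    """Query the result table that is assumed to be named $TA_<index name>."""
    return (f'SELECT TA_TOKEN, TA_TYPE, TA_NORMALIZED FROM "$TA_{index}" '
            "ORDER BY TA_COUNTER;")


# Hypothetical table and index names; the configuration is one of slide 9's.
print(create_fulltext_index_sql(
    "TWEETS_TA", "TWEETS", "TEXT",
    "EXTRACTION_CORE_VOICEOFCUSTOMER",
    language_detection=("EN", "DE"),
))
print(ta_results_sql("TWEETS_TA"))
```

For step 5 of the enhancement workflow on slide 13, the same helper would be called with the name of the copied custom configuration instead of a delivered one.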