Knowledge Mining with
Cognitive Search (What's New?)
Sammy Deprez
Hi There!
• Husband, Father of 2
• Partner at
• Core Member at
• Passionate about #Data and #AI
• Belgian organizer
• “AITalk” Host
• What is Cognitive Search
• Quick Setup
• New features
What is Cognitive Search?
• Free form text search
• Relevance
• Geo-search
• Filter and facets
• UX
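To make the bullets above concrete, here is a minimal Python sketch of a request body for the search documents endpoint that combines free-text search, a filter, a geo-distance filter and facets. The index fields `category` and `location` are hypothetical and not taken from the slides; `search`, `filter`, `facets`, `top` and `count` are standard query parameters.

```python
import json


def build_search_request(text, category=None, near=None, radius_km=10, facets=()):
    """Build a JSON-serializable body for POST /indexes/<index>/docs/search.

    `category` and `location` are hypothetical field names in the index.
    `near` is a (longitude, latitude) tuple.
    """
    filters = []
    if category is not None:
        # OData string literals use single quotes; embedded quotes are doubled.
        filters.append("category eq '{}'".format(category.replace("'", "''")))
    if near is not None:
        lon, lat = near
        # geo.distance returns kilometers between the field and the given point.
        filters.append(
            "geo.distance(location, geography'POINT({} {})') le {}".format(lon, lat, radius_km)
        )

    body = {"search": text, "count": True, "top": 10}
    if filters:
        body["filter"] = " and ".join(filters)
    if facets:
        body["facets"] = list(facets)
    return body


if __name__ == "__main__":
    request = build_search_request(
        "knowledge mining", category="slides", near=(4.35, 50.85), facets=["category"]
    )
    print(json.dumps(request, indent=2))
```

Relevance scoring happens on the service side, so the sketch only shapes the query.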
What is Cognitive Search?
Data Source
Skillset
Index
Indexer
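The four building blocks above are tied together by the indexer, which refers to the other three by name. A minimal Python sketch of that relationship follows; `demodata` is the data source name used later in this deck, while the skillset, index and indexer names are hypothetical.

```python
def build_indexer(name, data_source, skillset, index):
    """Return an indexer definition that links a data source, a skillset and an index."""
    return {
        "name": name,
        "dataSourceName": data_source,   # where the raw documents come from
        "skillsetName": skillset,        # enrichment steps applied to each document
        "targetIndexName": index,        # where the enriched documents are stored
    }


if __name__ == "__main__":
    # Hypothetical names, except "demodata" which appears in the data source snippet.
    print(build_indexer("demo-indexer", "demodata", "demo-skillset", "demo-index"))
```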
General new stuff
• Knowledge Store
• Soft Delete
• Updating
• Skills
• Index
• Indexer
• Data Source
• Incremental Enrichment
• WebApp Designer
New Skills
• PII Detection (Personally Identifiable Information)
• Text Translation
• Custom Entity Lookup
Questions?
Now or
@sammydeprez
Code Snippets (Enable Delete, DataSource)
{
  "@odata.context": "https://acs-demo.search.windows.net/$metadata#datasources/$entity",
  "@odata.etag": "\"0x8D7D4E9B760831D\"",
  "name": "demodata",
  "description": null,
  "type": "azureblob",
  "subtype": null,
  "credentials": {
    "connectionString": "DefaultEndpointsProtocol=https;AccountName****************"
  },
  "container": {
    "name": "data",
    "query": null
  },
  "dataChangeDetectionPolicy": null,
  "dataDeletionDetectionPolicy": {
    "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
  }
}
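A data source definition like the one above is typically sent to the service with a PUT request. The sketch below only builds the request with the Python standard library and does not send it. The function name and its arguments are illustrative, and `api_version` is left as a parameter because the version that supports this deletion policy has to be checked against the service documentation.

```python
import json
import urllib.request

SOFT_DELETE_POLICY = {
    "@odata.type": "#Microsoft.Azure.Search.NativeBlobSoftDeleteDeletionDetectionPolicy"
}


def build_datasource_request(service, name, container, connection_string, api_key, api_version):
    """Build (but do not send) a PUT request that creates or updates a blob data source."""
    body = {
        "name": name,
        "type": "azureblob",
        "credentials": {"connectionString": connection_string},
        "container": {"name": container},
        # Lets the indexer remove documents whose blobs were soft-deleted.
        "dataDeletionDetectionPolicy": SOFT_DELETE_POLICY,
    }
    url = "https://{}.search.windows.net/datasources/{}?api-version={}".format(
        service, name, api_version
    )
    return urllib.request.Request(
        url,
        data=json.dumps(body).encode("utf-8"),
        headers={"Content-Type": "application/json", "api-key": api_key},
        method="PUT",
    )
```

Passing the returned object to `urllib.request.urlopen` would perform the call; keep the connection string and API key out of source control.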
Code Snippets (PII Skill)
{
  "@odata.type": "#Microsoft.Skills.Text.PIIDetectionSkill",
  "defaultLanguageCode": "en",
  "minimumPrecision": 0.5,
  "maskingMode": "replace",
  "maskingCharacter": "*",
  "inputs": [
    {
      "name": "text",
      "source": "/document/merged_content/translated_text"
    }
  ],
  "outputs": [
    {
      "name": "piiEntities"
    },
    {
      "name": "maskedText"
    }
  ]
}
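To illustrate what `"maskingMode": "replace"` with `"maskingCharacter": "*"` produces, here is a small local simulation in Python. It is not the service implementation: it assumes each detected entity is reported with an `offset` and `length`, and it treats those offsets as plain Python string indices.

```python
def mask_text(text, entities, masking_character="*"):
    """Replace every character covered by an entity with the masking character.

    The masked text keeps the same length as the input, so entity offsets
    remain valid for both strings.
    """
    chars = list(text)
    for entity in entities:
        start = entity["offset"]
        end = min(start + entity["length"], len(chars))
        for i in range(start, end):
            chars[i] = masking_character
    return "".join(chars)


if __name__ == "__main__":
    sample = "Call Sammy at 555-0100"
    detected = [
        {"text": "Sammy", "offset": 5, "length": 5},
        {"text": "555-0100", "offset": 14, "length": 8},
    ]
    print(mask_text(sample, detected))
```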
Code Snippets (Custom Entity Skill)
{
  "@odata.type": "#Microsoft.Skills.Text.CustomEntityLookupSkill",
  "context": "/document",
  "entitiesDefinitionUri": "****.json",
  "inputs": [
    {
      "name": "text",
      "source": "/document/merged_content/translated_text"
    }
  ],
  "outputs": [
    {
      "name": "entities",
      "targetName": "matchedEntities"
    }
  ]
}
Code Snippets (Custom Entity JSON)
[
  {
    "name": "Sammy Deprez",
    "description": "Managing Partner at Arinti",
    "aliases": [
      { "text": "depresa", "caseSensitive": false },
      { "text": "SammyD", "caseSensitive": true }
    ]
  }
]
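The effect of the `caseSensitive` flag on aliases can be shown with a small local simulation in Python. This is only a sketch of the idea, not the skill's matching algorithm: it uses plain substring matching without fuzzy matching or word boundaries, and it assumes the entity name itself is matched case-insensitively.

```python
import re


def find_matches(text, entities):
    """Return, per entity, the places where its name or one of its aliases occurs."""
    results = []
    for entity in entities:
        # Assumption: the root entity name is matched case-insensitively.
        terms = [{"text": entity["name"], "caseSensitive": False}]
        terms += entity.get("aliases", [])
        matches = []
        for term in terms:
            flags = 0 if term.get("caseSensitive", False) else re.IGNORECASE
            for found in re.finditer(re.escape(term["text"]), text, flags):
                matches.append(
                    {"text": found.group(0), "offset": found.start(), "length": len(found.group(0))}
                )
        if matches:
            matches.sort(key=lambda match: match["offset"])
            results.append({"name": entity["name"], "matches": matches})
    return results


if __name__ == "__main__":
    definition = [
        {
            "name": "Sammy Deprez",
            "aliases": [
                {"text": "depresa", "caseSensitive": False},
                {"text": "SammyD", "caseSensitive": True},
            ],
        }
    ]
    print(find_matches("sammyd and SammyD met DEPRESA.", definition))
```

In the usage example the lowercase `sammyd` is skipped because that alias is case-sensitive, while `DEPRESA` still matches.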
Code Snippets (Incremental Enrichment, Indexer)
"cache": {
  "enableReprocessing": true,
  "storageConnectionString": "DefaultEndpointsProtocol=https;AccountName=saacsdemo;AccountKey=****************;EndpointSuffix=core.windows.net"
}
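Since the `cache` property carries a storage connection string, one option is to inject it from the environment rather than hard-coding it. The Python sketch below adds the `cache` block to an existing indexer definition; the environment variable name `ENRICHMENT_CACHE_CONNECTION_STRING` and the function name are hypothetical.

```python
import copy
import os


def with_enrichment_cache(indexer, env_var="ENRICHMENT_CACHE_CONNECTION_STRING"):
    """Return a copy of the indexer definition with incremental enrichment enabled."""
    connection_string = os.environ.get(env_var)
    if not connection_string:
        raise RuntimeError("Set {} before building the indexer definition".format(env_var))
    updated = copy.deepcopy(indexer)  # leave the caller's definition untouched
    updated["cache"] = {
        "enableReprocessing": True,
        "storageConnectionString": connection_string,
    }
    return updated
```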
Code Snippets (PII + Custom Entity Mapping, Indexer)
{
  "sourceFieldName": "/document/piiEntities",
  "targetFieldName": "piiEntities",
  "mappingFunction": null
},
{
  "sourceFieldName": "/document/maskedText",
  "targetFieldName": "maskedText",
  "mappingFunction": null
},
{
  "sourceFieldName": "/document/matchedEntities",
  "targetFieldName": "customEntities",
  "mappingFunction": null
}
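Conceptually, each mapping above copies one node of the enriched document into one index field. The Python sketch below simulates that locally on a nested dictionary. It assumes simple paths without wildcards and ignores `mappingFunction`, so it is an illustration rather than the indexer's behavior.

```python
def resolve_path(enriched, path):
    """Follow a path such as "/document/maskedText" through nested dictionaries."""
    node = enriched
    for part in path.strip("/").split("/"):
        node = node[part]
    return node


def apply_output_field_mappings(enriched, mappings):
    """Build the index document by copying each source node to its target field."""
    return {
        mapping["targetFieldName"]: resolve_path(enriched, mapping["sourceFieldName"])
        for mapping in mappings
    }


if __name__ == "__main__":
    enriched_document = {
        "document": {
            "maskedText": "Call ***** now",
            "matchedEntities": [{"name": "Sammy Deprez"}],
        }
    }
    mappings = [
        {"sourceFieldName": "/document/maskedText", "targetFieldName": "maskedText"},
        {"sourceFieldName": "/document/matchedEntities", "targetFieldName": "customEntities"},
    ]
    print(apply_output_field_mappings(enriched_document, mappings))
```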
Code Snippets (PII + Custom Entity Fields, Index)
{
  "name": "piiEntities",
  "type": "Collection(Edm.ComplexType)",
  "fields": [
    { "name": "text", "type": "Edm.String" },
    { "name": "type", "type": "Edm.String" },
    { "name": "subtype", "type": "Edm.String" },
    { "name": "offset", "type": "Edm.Int32" },
    { "name": "length", "type": "Edm.Int32" },
    { "name": "score", "type": "Edm.Double" }
  ]
},
{
  "name": "maskedText",
  "type": "Edm.String"
},
{
  "name": "customEntities",
  "type": "Collection(Edm.ComplexType)",
  "fields": [
    { "name": "name", "type": "Edm.String" },
    { "name": "description", "type": "Edm.String" },
    { "name": "id", "type": "Edm.String" },
    {
      "name": "matches",
      "type": "Collection(Edm.ComplexType)",
      "fields": [
        { "name": "text", "type": "Edm.String" },
        { "name": "offset", "type": "Edm.Int32" },
        { "name": "length", "type": "Edm.Int32" },
        { "name": "matchDistance", "type": "Edm.Double" }
      ]
    }
  ]
}
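Nested complex fields like these are usually referenced by a slash-separated path, for example `customEntities/matches/text`. As a rough sketch, the Python helper below walks a field list in this shape and lists the path of every leaf field; the helper name is illustrative.

```python
def field_paths(fields, prefix=""):
    """Return slash-separated paths for every leaf field in an index field list."""
    paths = []
    for field in fields:
        path = prefix + field["name"]
        if "fields" in field:
            # Complex type: descend into its sub-fields.
            paths.extend(field_paths(field["fields"], path + "/"))
        else:
            paths.append(path)
    return paths


if __name__ == "__main__":
    schema = [
        {"name": "maskedText", "type": "Edm.String"},
        {
            "name": "customEntities",
            "type": "Collection(Edm.ComplexType)",
            "fields": [
                {"name": "name", "type": "Edm.String"},
                {
                    "name": "matches",
                    "type": "Collection(Edm.ComplexType)",
                    "fields": [{"name": "text", "type": "Edm.String"}],
                },
            ],
        },
    ]
    for path in field_paths(schema):
        print(path)
```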


Editor's notes

  1. SaaS (search as a service) solution for adding a rich search experience over your own content within your own application.
  2. Full-text search: simple query syntax, Lucene query syntax, scoring, autocomplete, search suggestions, synonyms, highlighting.
  3. Sources: Azure SQL Database, Cosmos DB SQL API, Azure Blob Storage, Azure Table Storage, (private) Azure Data Lake Gen 2, (private) Cosmos DB Gremlin API, (private) Cosmos DB Cassandra API. Skillsets: Conditional, Custom Entity Lookup, Document Extraction, Entity Recognition, Image Analysis, Key Phrase Extraction, Language Detection, OCR, PII Detection, Sentiment, Shaper, Text Merger, Text Split, Text Translation, Custom Web API.