What makes a search engine "intelligent"? In this talk I discuss MarkLogic's full-text search features and demonstrate how to enhance search functionality using MarkLogic's new Search API to deliver better, faster results automatically. You will learn how to use the Search API to return indexed facets alongside results, and how to use query expansion to add automatic semantic search for known entities and to expand thesaurus terms, reducing false negatives.
13. Enrich Your Query!
• Infer
– Use knowledge about the user
– Look for meaning in search terms
• Enrich
– Translate into more complex query
– Gain speed, accuracy
14. Enrich Your Query!
• Strategies
– Custom term handling
• Works well for single term transformations
• See: http://developer.marklogic.com/try/ninja/page13
– Roll your own parser
• A lot of work (see Michael Blakeley’s xqysp)
– Work between parse and search steps
15. Search API Overview
• The Search API is an XQuery library module designed to
simplify creating search applications:
o Parser
o Constraints
o Faceting
o Snippets
• High performance, scalability
• Extensible
16. Search API Extensibility
• Search API provides several points to hook in
• Hooks are defined in the Search API options XML node
o Custom constraints
o Custom grammar
o Custom snippets
o Custom term handling
o Search operators
17. Search API Basics
• Search API module:
import module namespace search = "http://marklogic.com/appservices/search"
  at "/MarkLogic/appservices/search/search.xqy";
• Main entry point: search:search()
• parses $qtext with given $options
• executes search
• returns <search:response>
o set of <search:result>s
o facets
o snippets
o metrics and other info
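A minimal sketch of the call this slide describes. The query text and the single option shown are illustrative assumptions, not the speaker's configuration; return-metrics is already on by default and is spelled out only to show where options go.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

(: Options node: the place where constraints, grammar, snippet and
   term-handling hooks would be configured. :)
let $options :=
  <options xmlns="http://marklogic.com/appservices/search">
    <return-metrics>true</return-metrics>
  </options>

(: Parses the query text, executes the search, and returns a
   <search:response> element. :)
return search:search("1996 United States Olympics", $options)
```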
20. Search API Extensibility
• Term handler:
• Parser:
let $custom-parser-output :=
  my:parse($qtext)
return
  search:resolve(
    $custom-parser-output,
    $options
  )
21. Search API Basics
• Search API parser: search:parse()
o 1st half of search:search()
o returns annotated cts:query XML
• Execute search: search:resolve()
o 2nd half of search:search()
o accepts cts:query XML as input
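The two halves can be called separately, which leaves room to rewrite the query in between. A minimal sketch, assuming empty (default) options and an illustrative query string:

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

let $options := <options xmlns="http://marklogic.com/appservices/search"/>
let $qtext   := "1996 United States Olympics"

(: First half: query text in, annotated cts:query XML out. :)
let $parsed := search:parse($qtext, $options)

(: Any enrichment of $parsed would happen here, between the two halves. :)

(: Second half: cts:query XML in, <search:response> out. :)
return search:resolve($parsed, $options)
```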
23. Our Use Case
• O’Connor’s Online
– Search portal built on MarkLogic
– Legal rules and commentaries content
– Problem
• Users will enter citation numbers, abbreviations, etc. expecting
complete results
• Text editorial content follows different conventions
– Solution
• Detect special cases pre-search and enrich query
24. Example: detect year
• Content:
– MarkLogic database of news/op-ed articles
• Organized into year directories:
/content/1990
/content/1991
/content/1992
...
/content/2012
• Year is in directory structure, not article text
– But users will still include year in search terms
25. How to transform query?
• Recursive typeswitch
(function mapping on):
do-stuff-here($q)
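The code for this slide did not survive extraction. A hypothetical skeleton of a recursive typeswitch over cts:query XML might look like the following; local:transform is an invented name, and the explicit for loop avoids relying on function mapping. The word-query branch returns the node unchanged and marks where a real rewrite would go.

```xquery
xquery version "1.0-ml";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical skeleton: walks cts:query XML and rebuilds it unchanged. :)
declare function local:transform($q as node()) as node()*
{
  typeswitch ($q)
    (: Leaf of interest: replace this branch with the real rewrite. :)
    case element(cts:word-query) return $q
    (: Any other element: copy it and recurse into its children. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:transform($child)
      }
    (: Text nodes and anything else pass through untouched. :)
    default return $q
};

local:transform(
  <cts:and-query>
    <cts:word-query><cts:text>1996</cts:text></cts:word-query>
    <cts:word-query><cts:text>Olympics</cts:text></cts:word-query>
  </cts:and-query>
)
```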
27. Example: detect year
let $terms := "1996 United States Olympics"
return local:detect-year(search:parse($terms))
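The body of local:detect-year() is not part of the transcript. One possible sketch, under two assumptions: years run from 1990 to 2012 as in the directory listing, and the directory URIs end with a trailing slash. A word query whose text is such a year is swapped for a directory query; everything else is copied.

```xquery
xquery version "1.0-ml";

import module namespace search = "http://marklogic.com/appservices/search"
    at "/MarkLogic/appservices/search/search.xqy";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical implementation, not the speaker's code. :)
declare function local:detect-year($q as node()) as node()*
{
  typeswitch ($q)
    case element(cts:word-query) return
      let $text := fn:string($q/cts:text[1])
      return
        if (fn:matches($text, "^(199[0-9]|200[0-9]|201[0-2])$"))
        then
          (: Build the cts:query, then serialize it back to its XML form. :)
          <x>{
            cts:directory-query(fn:concat("/content/", $text, "/"), "infinity")
          }</x>/*
        else $q
    (: Other elements: copy and recurse. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:detect-year($child)
      }
    default return $q
};

let $terms := "1996 United States Olympics"
return local:detect-year(search:parse($terms))
```

The result can be handed to search:resolve() in place of the original parse output.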
28. Example: detect year
• Strategy depends on your content model
• Other possibilities
– date detection
– date ranges
– locations
– etc.
29. search:parse() Strategy
• Weakness
– Limited to single-word tokens
• Similar to custom term handling
• What about multiple tokens?
– Analyze query string text directly using regex
• Dangerous
– Transform cts:query XML into intermediate form
• Preserve Boolean logic & grouping
• Preserve phrases
• Preserve constraints
30. Building Intermediate Query
• The hack
– Basically, undoing some of the parser's work
– Text "run" concept
• Similar to WordprocessingML
31. Building Intermediate Query
• Intermediate query strategy
1. Flatten query
2. Join sibling words in <run>
3. Transform <run>s
4. Convert <run>s back to word queries
32. Example: multi-word thesaurus
• Content:
– Same MarkLogic database of news/op-ed articles from
detect-year() example
• Query:
– Same as before: "1996 United States Olympics"
– Start with the search:parse() output
33. Example: multi-word thesaurus
• Intermediate query strategy
1. Flatten query
2. Join sibling words in <run>
3. Transform <run>s
4. Convert <run>s back to word queries
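The flattening code (step 1) is missing from the transcript. A hypothetical sketch of local:unnest-ands(): an and-query nested directly inside another and-query is replaced by its own children, so that sibling words end up at the same level. It is a simplification: attributes and any option or annotation children of the dropped and-query are not treated specially.

```xquery
xquery version "1.0-ml";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical implementation, not the speaker's code. :)
declare function local:unnest-ands($q as node()) as node()*
{
  typeswitch ($q)
    case element(cts:and-query) return
      let $children :=
        for $child in $q/node() return local:unnest-ands($child)
      return
        (: Nested and-query: hand its children up to the parent. :)
        if ($q/parent::cts:and-query)
        then $children
        else element { fn:node-name($q) } { $q/@*, $children }
    (: Other elements: copy and recurse. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:unnest-ands($child)
      }
    default return $q
};

local:unnest-ands(
  <cts:and-query>
    <cts:word-query><cts:text>1996</cts:text></cts:word-query>
    <cts:and-query>
      <cts:word-query><cts:text>United</cts:text></cts:word-query>
      <cts:word-query><cts:text>States</cts:text></cts:word-query>
    </cts:and-query>
  </cts:and-query>
)
```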
38. Example: multi-word thesaurus
• Intermediate query strategy
1. Flatten query
2. Join sibling words in <run>
3. Transform <run>s
4. Convert <run>s back to word queries
39. Example: multi-word thesaurus
2. Join sibling words in <run>:
• Typeswitch on cts:word-query:
1. Ignore phrases
2. Delete if query is not the first.
3. Take first word-query in sequence and join with its following siblings into a <run>
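The three rules above could be sketched as follows. This is a hypothetical local:create-runs(), assuming that a phrase is a word query whose text contains whitespace and that the input has already been flattened.

```xquery
xquery version "1.0-ml";

declare namespace cts = "http://marklogic.com/cts";

(: True for a word query holding a single token; false for phrases,
   other query types, and the empty sequence. :)
declare function local:is-word($q as node()?) as xs:boolean
{
  $q instance of element(cts:word-query)
    and fn:not(fn:matches(fn:string($q/cts:text[1]), "\s"))
};

(: Hypothetical implementation, not the speaker's code. :)
declare function local:create-runs($q as node()) as node()*
{
  typeswitch ($q)
    case element(cts:word-query) return
      (: 1. Phrases are left alone. :)
      if (fn:not(local:is-word($q))) then $q
      (: 2. A word that follows another word was already absorbed. :)
      else if (local:is-word($q/preceding-sibling::*[1])) then ()
      else
        (: 3. First word: join it with the words that follow, up to
              the first sibling that is not a single word. :)
        let $stop := $q/following-sibling::*[fn:not(local:is-word(.))][1]
        let $rest := $q/following-sibling::*[local:is-word(.)]
                       [fn:empty($stop) or . << $stop]
        return
          <run>{
            fn:string-join(($q, $rest)/cts:text[1]/fn:string(.), " ")
          }</run>
    (: Other elements: copy and recurse. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:create-runs($child)
      }
    default return $q
};

local:create-runs(
  <cts:and-query>
    <cts:word-query><cts:text>1996</cts:text></cts:word-query>
    <cts:word-query><cts:text>United</cts:text></cts:word-query>
    <cts:word-query><cts:text>States</cts:text></cts:word-query>
    <cts:word-query><cts:text>Olympics</cts:text></cts:word-query>
  </cts:and-query>
)
```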
40. Example: multi-word thesaurus
2. Join sibling words in <run>:
• Input:
– search:parse("1996 United States Olympics")/local:unnest-ands(.)/local:create-runs(.)
• Output:
41. Example: multi-word thesaurus
2. Join sibling words in <run>:
• Input:
– search:parse("1996 (sprint OR marathon) United States Olympics")/local:unnest-ands(.)/local:create-runs(.)
• Output:
42. Example: multi-word thesaurus
• Intermediate query strategy
1. Flatten query
2. Join sibling words in <run>
3. Transform <run>s
4. Convert <run>s back to word queries
43. Example: multi-word thesaurus
3. Transform <run>s:
1. Store terms in thesaurus
2. Build cts:or-query of thesaurus terms
3. Using cts:or-query of terms, cts:highlight() <run>s,
and replace with thesaurus synonyms
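A hypothetical sketch of steps 2 and 3. It assumes a thesaurus document at the URI thesaurus.xml in MarkLogic's thesaurus schema; the <expand> and <synonym> wrapper elements are invented here as an intermediate form. cts:highlight() evaluates the replacement expression once per match, with $cts:text bound to the matched text.

```xquery
xquery version "1.0-ml";

declare namespace thsr = "http://marklogic.com/xdmp/thesaurus";

(: Hypothetical implementation, not the speaker's code. :)
declare function local:thsr-expand($runs as node()*, $q-thsr as cts:query)
  as node()*
{
  for $node in $runs
  return
    cts:highlight($node, $q-thsr,
      (: Wrap each matched term together with its synonyms. :)
      <expand term="{ $cts:text }">{
        for $syn in fn:doc("thesaurus.xml")
                      //thsr:entry[fn:lower-case(thsr:term)
                                     eq fn:lower-case($cts:text)]
                      /thsr:synonym/thsr:term
        return <synonym>{ fn:string($syn) }</synonym>
      }</expand>)
};

(: Step 2: an or-query over every term stored in the thesaurus. :)
let $q-thsr :=
  cts:or-query(
    fn:doc("thesaurus.xml")
      //thsr:entry/thsr:term/cts:word-query(fn:string(.))
  )
(: Step 3: highlight the thesaurus terms found inside a <run>. :)
return
  local:thsr-expand(<run>1996 United States Olympics</run>, $q-thsr)
```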
50. Example: multi-word thesaurus
• Intermediate query strategy
1. Flatten query
2. Join sibling words in <run>
3. Transform <run>s
4. Convert <run>s back to word queries
51. Example: multi-word thesaurus
4. Convert <run>s back to word queries
– Typeswitch:
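The typeswitch itself is missing from the transcript. A hypothetical local:resolve-runs() is sketched here, assuming an invented intermediate form in which each expanded term inside a <run> is wrapped as <expand term="..."> with <synonym> children. Plain text becomes one word query per token; an expanded term becomes an or-query of the term and its synonyms.

```xquery
xquery version "1.0-ml";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical implementation, not the speaker's code. :)
declare function local:resolve-runs($q as node()) as node()*
{
  typeswitch ($q)
    case element(run) return
      for $part in $q/node()
      return
        typeswitch ($part)
          (: Expanded term: the original term OR any of its synonyms. :)
          case element(expand) return
            <cts:or-query>{
              for $t in (fn:string($part/@term),
                         $part/synonym/fn:string(.))
              return <cts:word-query><cts:text>{ $t }</cts:text></cts:word-query>
            }</cts:or-query>
          (: Plain text: one word query per whitespace-separated token. :)
          case text() return
            for $t in fn:tokenize(fn:normalize-space($part), " ")[. ne ""]
            return <cts:word-query><cts:text>{ $t }</cts:text></cts:word-query>
          default return ()
    (: Other elements: copy and recurse. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:resolve-runs($child)
      }
    default return $q
};

local:resolve-runs(
  <cts:and-query>
    <run>1996 <expand term="United States"><synonym>USA</synonym></expand> Olympics</run>
  </cts:and-query>
)
```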
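52. Example: multi-word thesaurus
4. Convert <run>s back to word queries
Input:
(: Build an or-query over every term stored in the thesaurus. :)
let $q-thsr :=
  cts:or-query(
    doc("thesaurus.xml")
      //thsr:entry/thsr:term/cts:word-query(string(.))
  )
(: Parse the query text, flatten it, and join sibling words into <run>s. :)
let $runs := search:parse("1996 United States Olympics")
  /local:unnest-ands(.)/local:create-runs(.)
(: Replace thesaurus terms inside the <run>s with their synonyms. :)
let $expanded := local:thsr-expand($runs, $q-thsr)
(: Turn the <run>s back into word queries. :)
return local:resolve-runs($expanded)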
53. Example: multi-word thesaurus
4. Convert <run>s back to word queries
Output:
55. Enrich Your Query!
• Takeaway
1. No added GUI
2. Didn't ask the user for additional input
3. Able to build more robust query before
executing search
57. Search API Hacking
• Many potential applications:
– Automatic spell correction:
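The spell-correction example itself is not in the transcript. One way it might look, assuming a spelling dictionary has already been loaded at the hypothetical URI /dictionaries/spell.xml: a misspelled word query is widened into an or-query of the original word and the first suggestion. Phrases are not treated specially in this sketch.

```xquery
xquery version "1.0-ml";

import module namespace spell = "http://marklogic.com/xdmp/spell"
    at "/MarkLogic/spell.xqy";

declare namespace cts = "http://marklogic.com/cts";

(: Hypothetical URI of a spelling dictionary already in the database. :)
declare variable $DICT as xs:string := "/dictionaries/spell.xml";

(: Hypothetical implementation, not the speaker's code. :)
declare function local:spell-correct($q as node()) as node()*
{
  typeswitch ($q)
    case element(cts:word-query) return
      let $word := fn:string($q/cts:text[1])
      let $fix :=
        if (spell:is-correct($DICT, $word)) then ()
        else spell:suggest($DICT, $word)[1]
      return
        (: Correctly spelled, or no suggestion: keep the query as is. :)
        if (fn:empty($fix)) then $q
        else
          <cts:or-query>{
            $q,
            <cts:word-query><cts:text>{ $fix }</cts:text></cts:word-query>
          }</cts:or-query>
    (: Other elements: copy and recurse. :)
    case element() return
      element { fn:node-name($q) } {
        $q/@*,
        for $child in $q/node() return local:spell-correct($child)
      }
    default return $q
};

local:spell-correct(
  <cts:and-query>
    <cts:word-query><cts:text>Olypmics</cts:text></cts:word-query>
  </cts:and-query>
)
```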
58. Search API Hacking
• Many potential applications:
– Detect entities
• Transform text into element-based query
• Fewer false positives and exclusions
• Leverage indexes:
"New York Times"
59. Search API Hacking
• Other ideas
– Regex unparsed query string
• Apply constraints, operators, etc. as configured in Search API based on keywords/patterns
– Custom term handler
• single-term transformations
– Combine with data enrichment on ingestion
• MarkLogic Entity Framework
• Linguistic processing
60. Hazards
• Chaos
– Daisy-chained transformations can have unintended consequences
– Performance
• Pre-search transformations need to be fast
• Make sure to leverage indexes as much as possible
• Larger queries do take longer