Correlation Technology Solutions
Compared to
“Massive Semantic Infrastructure” Solutions
When enterprises or governments face difficult, critical problems – intractable problems
that simply must be dealt with – the solutions that are constructed in response to these
problems are often “non-optimal”. Typically, the solutions are expensive, and stunningly
complex. Also typically, these solutions do not perform very well – despite their expense
and complexity. These attributes – expense, complexity, poor outcomes – are especially
prevalent in those cases where computer software is the basis for such solutions. These
“non-optimal” software solutions can be found in many of the 1200 vertical market
sectors for enterprise (identified by NAICS[2007]), and in every sphere of government
operations. Research from Make Sence Florida, Inc. has shown that non-optimal
software solutions are likely to use one or more of three approaches:
“Massive Semantic Infrastructure Solutions” – Systems that require large natural
language databases, ontologies, taxonomies, and concept repositories, and utilize tagging,
threading, entity recognition, and other similar corpus analysis techniques in preparation
for answering user queries.
“Subjective Statistical Model Solutions” – Systems that rely upon statistical models
influenced by subjective human judgments in establishing base or conditional
probabilities of events or outcomes – particularly those which purport to capture *all*
possible events in a complex real-world domain. Such systems typically utilize Bayesian
statistical techniques and include Neural Networks.
“Brute Force Computing Solutions” – Systems which achieve results from the power of
modern day computers to perform a relatively simple process at high speed against large
volumes of data. Keyword searches are a typical example.
The purpose of this document is limited to examining how “Massive Semantic
Infrastructure Solutions” differ from Correlation Technology Solutions. The example used
in this discussion is a large, well-known enterprise software company, referred to below
as “Company A”, together with that company’s primary product, which we call
“M-Technology”.
Company A is in fact our "poster child" for what we call "non-optimal, massive semantic
infrastructure solutions". We begin with the practical issues, because the practical
aspects of a Company A solution illustrate perfectly why Company A's "M-Technology"
compares so poorly to Correlation Technology.

We often like to recount this true story. At a NYC Search Engine Expo at which we
presented in 2008, a senior staff member of a “Major US Government Financial
Institution” stopped by our booth and, after listening to our explanation of Correlation
Technology, started to complain about Company A - which his organization had
purchased. He said, "for Company A to find a 21-word email I sent (in the past), I had to
remember and enter into the search interface 20 of the words."
Here's why this happens. Before the Company A system can answer a single question, an
enormous set of massive Natural Language databases must be installed and verified.
Then, equally massive dictionaries, thesauri, "concept" repositories, ontologies, lexicons,
and other semantic infrastructure components must be installed and linked. Then, the
corpus (all of the documents) is subjected to indexing, threading, entity recognition, and
other "associative" and "tagging" processes. These require days or weeks of dedicated
server time and huge amounts of memory and data storage. Finally, the system is ready
to do some work. But despite all of this effort, complexity and expense, the Company A
system appears "stupid".
The Company A system appears "stupid" because Company A software is based entirely
on an externally imposed "formal" construct of human language. The "meaning" part of
"M-Technology" is in fact constrained to those standard meanings and uses of words
consistent with established academic models. Words are fixed in their allowed use as
only specific parts of speech. The word proximities examined in texts are disregarded if
they do not meet pre-set statistical thresholds of confidence. Syntactically modeled
sentence decomposition is rigidly adhered to, and indexing schemes for "organic"
keyword search are not much improved from their original implementations in the 1990s.
All of these formalisms are observed despite the fact that human expression is riotously,
deliriously chaotic and adaptive on a moment-by-moment basis. Writers of even the
shortest communication incorporate cultural memes that no dictionary, no ontology, no
concept map, no semantic infrastructure component could keep current or sort out.
Humans create and utilize idiomatic, vernacular, and colloquial terms, and new uses for
existing terms, with astounding rapidity and ease, and with astounding confidence that
such terms and every nuance of meaning they carry will be perfectly understood
and appreciated by the recipients of their expression (and they usually are). The trouble
is that Company A (and its peers) cannot make sense of anything not hardwired into the
software's semantic components.
It is certainly true that a corpus of only very formal documents – such as
government reports, academic papers, and so on – will, with the proper lexicons, be well
served by a Company A type approach. It is also true that Company A has been
obliged to provide facilities for users to "make their own lexicons" and to "define their
own concepts" (so, with massive and amazingly time-consuming and costly
customization, Company A’s product will work better). The fact remains that wherever
human expression and comprehension are informal (such as the majority of email in an
enterprise, human speech captured in transcripts, and almost all the other categories of
text produced by amateur and professional writers for any purpose), Company A exposes
its users to the type of frustration described above.
If the original text does not conform to, or is not confined to, the formal
parameters of the academic models used, Company A can often have a lot of trouble
locating that text. As a last resort, a super-majority of "word matches" was required by
Company A to find the employee's email, because all the "M-Technology" was worthless.
The same result could have been achieved with a universally available - and free - Unix
text search utility.
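To make the "super-majority of word matches" behavior concrete, here is a minimal sketch in Python. It is only an illustration of plain word-overlap matching, not Company A's actual method or any Unix utility's implementation. The function names, the tokenization rule, and the 0.95 threshold (roughly 20 of 21 words) are assumptions introduced for this example.

```python
import re


def words(text):
    """Lowercase a text and split it into word tokens (assumed tokenization)."""
    return re.findall(r"[a-z0-9']+", text.lower())


def overlap_ratio(query, document):
    """Fraction of the document's distinct words that also appear in the query."""
    doc_words = set(words(document))
    if not doc_words:
        return 0.0
    return len(doc_words & set(words(query))) / len(doc_words)


def supermajority_search(query, corpus, threshold=0.95):
    """Return documents whose word overlap with the query meets the threshold.

    The threshold is a hypothetical stand-in for the "20 of 21 words"
    requirement in the anecdote above.
    """
    return [doc for doc in corpus if overlap_ratio(query, doc) >= threshold]


# Hypothetical two-message corpus for illustration only.
CORPUS = [
    "the quarterly report is late again",
    "lunch at noon tomorrow",
]
```

With this kind of matching, a user who remembers only "quarterly report" finds nothing in `CORPUS`, while a user who retypes nearly the whole message does. That is the frustration the anecdote describes.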
Correlation Technology, in contrast, "permits" a far more "relaxed" and "natural" model
of human language. Our one-way, exhaustive transform of data into Knowledge
Fragments (which we call "Acquisition") captures all the significant relations between
words - as they are actually expressed in the text. Unlike "M-Technology", Correlation
Technology does not coerce the text into conformity with a set of formalisms or analyze
the text using such formalisms. We "allow" every nuance to be captured without concern
for whether some artificial rule is observed.
The Correlation process discovers knowledge from the corpus by constructing chains of
iteratively associated Knowledge Fragments, and then analyzing the "Answer Space"
(like the “result set” for RDBMS/SQL) of Correlations. Associations between words can
be as formal or informal as desired or required for the application. We provide in the
Correlation Technology Platform the ability to "dial in" more than 20 differing levels of
"fuzzy association" that actually capture - without imposing any rules which prevent the
discovery of knowledge - all the types of formalisms "understood" by "M-Technology".
Further, any additional "reference" preferred for associating words can be "plugged in".
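The chain-building idea described above can be sketched in a few lines of Python. This is not the Correlation Technology Platform's implementation, which this document does not specify. It assumes, purely for illustration, that a Knowledge Fragment can be modeled as a (term, relation, term) triple, that chains are found by breadth-first search, and that chain length is capped by a hypothetical `max_links` parameter. All names below are invented for the example.

```python
from collections import deque

# Hypothetical Knowledge Fragments, modeled as (term, relation, term) triples.
FRAGMENTS = [
    ("gold", "is", "metal"),
    ("metal", "conducts", "electricity"),
    ("gold", "resists", "corrosion"),
    ("corrosion", "damages", "wiring"),
    ("wiring", "carries", "electricity"),
]


def build_index(fragments):
    """Group fragments by their first term so chains can be extended quickly."""
    index = {}
    for frag in fragments:
        index.setdefault(frag[0], []).append(frag)
    return index


def correlate(fragments, origin, destination, max_links=4):
    """Collect every chain of fragments that links origin to destination.

    The returned list plays the role of an "Answer Space" in this sketch.
    """
    index = build_index(fragments)
    answer_space = []
    queue = deque([(origin, [])])
    while queue:
        term, chain = queue.popleft()
        if term == destination and chain:
            answer_space.append(chain)
            continue
        if len(chain) >= max_links:
            continue
        # Terms already on this chain, so that a chain never loops back.
        visited = {origin} | {frag[2] for frag in chain}
        for frag in index.get(term, []):
            if frag[2] not in visited:
                queue.append((frag[2], chain + [frag]))
    return answer_space


if __name__ == "__main__":
    for chain in correlate(FRAGMENTS, "gold", "electricity"):
        print(" -> ".join(f"{a} {rel} {b}" for a, rel, b in chain))
```

In this sketch, a later step that ranks or filters the returned chains would stand in for what the text calls "Refinement". Looser or stricter ways of deciding that two terms are associated would change which fragments are allowed to link.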
By means of Correlation, knowledge is "emergent", meaning that the analysis of the
Answer Space (a process we call "Refinement") will reveal the desired solutions - if they
exist in the corpus. When the task is Enterprise Search, our Acquisition, Correlation and
Refinement functions will reveal those emails, memos, or documents that the user wants.
Correlation Technology solutions are possible for every product offered by Company A.
In each of these solutions, we believe the Correlation Technology approach will prove far
more effective, far more flexible, and far more straightforward in implementation. While
Correlation Technology solutions can be large scale, every Company A implementation
dwarfs Correlation Technology implementations for an equivalent corpus. While the
complexity of the Correlation Technology solution is obvious, that complexity does not
flow from the hopeless attempt to capture in stone the torrent of human expression and
comprehension, and in fact, Correlation Technology is intrinsically "simple".

For Business Inquiries:
Contact: Carl Wimmer
carl@makesence.us
Mobile: (702) 767-7001

For Technical Inquiries:
Contact: Mark Bobick
m.bobick@correlationconcepts.com
Mobile: (702) 882-5664


Essay: Comparison of Semantic Solutions to Correlation Technology Solutions

  • 1. Correlation Technology Solutions Compared to “Massive Semantic Infrastructure” Solutions When enterprise or government face difficult, critical problems – intractable problems that simply must be dealt with – the solutions that are constructed in response to these problems are often “non-optimal”. Typically, the solutions are expensive, and stunningly complex. Also typically, these solutions do not perform very well – despite their expense and complexity. These attributes – expense, complexity, poor outcomes – are especially prevalent in those cases where computer software is the basis for such solutions. These “non-optimal” software solutions can be found in many of the 1200 vertical market sectors for enterprise (identified by NAICS[2007]), and in every sphere of government operations. Research from Make Sence Florida, Inc. has shown that non-optimal software solutions are likely to use one or more of three approaches: “Massive Semantic Infrastructure Solutions” – Systems that require large natural language databases, ontologies, taxonomies, and concept repositories, and utilize tagging, threading, entity recognition, and other similar corpus analysis techniques in preparation for answering user queries. “Subjective Statistical Model Solutions” – Systems that rely upon statistical models influenced by subjective human judgments in establishing base or conditional probabilities of events or outcomes – particularly those which purport to capture *all* possible events in a complex real-world domain. Such systems typically utilize Bayesian statistical techniques and include Neural Networks. “Brute Force Computing Solutions” – Systems which achieve results from the power of modern day computers to perform a relatively simple process at high speed against large volumes of data. Keyword searches are a typical example. 
The purpose of this document is limited to the examination of how “Massive Semantic Infrastructure Solutions” differ from Correlation Technology Solutions. A large well known enterprise software company which is referred to below as “Company A” and that company’s primary product, which we call “M-Technology” is the example used in this discussion. Company A is in fact our "poster-child" for what we call "non-optimal, massive semantic infrastructure solutions". We like to begin with the practical issues, because the practical aspects of a Company A solution illustrate perfectly why Company A's "M-Technology" compares so poorly to Correlation Technology. 1
  • 2. We often like to recount this true story. At a NYC Search Engine Expo at which we presented in 2008, a senior staff member of a “Major US Government Financial Institution” stopped by our booth and, after listening to our explanation of Correlation Technology, started to complain about Company A - which his organization had purchased. He said, "for Company A to find a 21-word email I sent (in the past), I had to remember and enter into the search interface 20 of the words." Here's why this happens. Before the Company A system can answer a single question, an enormous set of massive Natural Language databases must be installed and verified. Then, equally massive dictionaries, thesauri, "concept" repositories, ontologies, lexicons, and other semantic infrastructure components must be installed and linked. Then, the corpus (all of the documents) is subjected to indexing, threading, entity recognition, and other "associative" and "tagging" processes. These require days or weeks of dedicated server time and huge amounts of memory and data storage. Finally, the system is ready to do some work. But despite all of this effort, complexity and expense, the Company A system appears "stupid". The Company A system appears "stupid" because Company A software is based entirely on an externally imposed "formal" construct of human language. The "meaning" part of "M-technology" in fact is constrained to those standard meanings and uses of words consistent with established academic models. Words are fixed in their allowed use as only specific parts of speech. The word proximities examined in texts are disregarded if they do not meet pre-set statistical thresholds of confidence. Syntactically modeled sentence decomposition is rigidly adhered to, and indexing schemes for "organic" keyword search are not much improved from their original implementations in the 1990's. 
All of these formalisms are observed despite the fact that human expression is riotously, deliriously chaotic and adaptive on a moment-by-moment basis. Writers of even the shortest communication incorporate cultural memes that no dictionary, no ontology, no concept map, no semantic infrastructure component could keep current or sort out. Humans create and utilize idiomatic, vernacular, and colloquial terms and uses for terms with astounding rapidity and ease, and with astounding confidence in the belief that such terms and every nuance of meaning carried by such terms will be perfectly understood and appreciated by the recipients of their expression (and they usually are). Trouble is, Company A (and its peers) cannot make sense of anything not hardwired into the software's semantic components.
It is certainly true that a corpus of only very formal documents, such as government reports and academic papers, will with the proper lexicons be well served by a Company A type approach. It is also true that Company A has been obliged to provide users with facilities to "make their own lexicons" and to "define their own concepts" (so, with massive and amazingly time-consuming and costly customization, Company A's product will work better). The fact remains that wherever human expression and comprehension are informal (such as the majority of email in an enterprise, human speech captured in transcripts, and almost all the other categories of text produced by amateur and professional writers for any purpose), Company A exposes its users to the type of frustration described above. If the original text does not conform to, or is not confined to, the formal parameters of the academic models used, Company A can often have a lot of trouble locating that text. In the last resort, a super-majority of "word matches" was required by Company A to find the employee's email, because all the "M-technology" was worthless. The same result could have been achieved with a universally available, and free, Unix text search utility.
Correlation Technology, in contrast, "permits" a far more "relaxed" and "natural" model of human language. Our one-way, exhaustive transform of data into Knowledge Fragments (which we call "Acquisition") captures all the significant relations between words as they are actually expressed in the text. Unlike "M-technology", Correlation Technology does not coerce the text into conformity with a set of formalisms or analyze the text using such formalisms. We "allow" every nuance to be captured without concern for whether some artificial rule is observed. The Correlation process discovers knowledge from the corpus by constructing chains of iteratively associated Knowledge Fragments, and then analyzing the "Answer Space" (analogous to the "result set" of an RDBMS/SQL query) of Correlations. Associations between words can be as formal or informal as desired or required for the application. The Correlation Technology Platform provides the ability to "dial in" more than 20 differing levels of "fuzzy association" that capture all the types of formalisms "understood" by "M-technology", without imposing any rules which prevent the discovery of knowledge. Further, any additional "reference" preferred for associating words can be "plugged in".
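The idea of chaining associated fragments can be pictured with a toy sketch. The code below is not the Correlation Technology implementation, whose internals are not described here. It assumes, purely for illustration, that a Knowledge Fragment is an ordered pair of adjacent significant words, that a Correlation is a chain of fragments in which each fragment begins with the word that ended the previous one, and that the Answer Space can be ranked by chain length. The function names `acquire`, `correlate`, and `rank_answer_space`, the stop-word list, and the `max_links` limit are all hypothetical.

```python
import re
from collections import defaultdict

# Hypothetical stop-word list; chosen only to keep the example small.
STOP_WORDS = {"a", "an", "the", "is", "are", "of", "to", "in", "and", "by"}


def acquire(corpus):
    """Toy 'Acquisition': reduce each sentence to ordered pairs of adjacent
    significant words. The pair representation is an assumption."""
    fragments = set()
    for document in corpus:
        for sentence in re.split(r"[.!?]+", document.lower()):
            words = [w for w in re.findall(r"[a-z0-9']+", sentence)
                     if w not in STOP_WORDS]
            fragments.update(zip(words, words[1:]))
    return fragments


def correlate(fragments, origin, destination, max_links=6):
    """Toy 'Correlation': extend chains one fragment at a time, linking a
    fragment when its first word equals the last word of the chain.
    Returns every chain from origin to destination (the 'Answer Space')."""
    successors = defaultdict(set)
    for left, right in fragments:
        successors[left].add(right)

    answer_space = []
    stack = [[origin]]
    while stack:
        chain = stack.pop()
        if len(chain) > 1 and chain[-1] == destination:
            answer_space.append(chain)
            continue
        if len(chain) > max_links:  # chain already holds max_links fragments
            continue
        for next_word in successors[chain[-1]]:
            if next_word not in chain:  # avoid cycles
                stack.append(chain + [next_word])
    return answer_space


def rank_answer_space(answer_space):
    """Toy analysis of the Answer Space: shortest chains first."""
    return sorted(answer_space, key=lambda chain: (len(chain), chain))


if __name__ == "__main__":
    corpus = ["The memo mentions the merger.",
              "The merger worries the auditors."]
    fragments = acquire(corpus)
    for chain in rank_answer_space(correlate(fragments, "memo", "auditors")):
        print(" -> ".join(chain))
```

In this sketch no single document links "memo" to "auditors"; the connection only appears when fragments from both sentences are chained, which is the sense in which the text calls the result "emergent". A looser or stricter notion of association would correspond to changing how `acquire` pairs words, which is where the "fuzzy association" levels described above would plausibly apply.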
By means of Correlation, knowledge is "emergent", meaning that the analysis of the Answer Space (a process we call "Refinement") will reveal the desired solutions if they exist in the corpus. When the task is Enterprise Search, our Acquisition, Correlation and Refinement functions will reveal those emails, memos, or documents that the user wants.
Correlation Technology solutions are possible for every product offered by Company A. In each of these solutions, we believe the Correlation Technology approach will prove far more effective, far more flexible, and far more straightforward in implementation. While Correlation Technology solutions can be large scale, every Company A implementation dwarfs the Correlation Technology implementation for an equivalent corpus. While a Correlation Technology solution is obviously complex, that complexity does not flow from the hopeless attempt to set in stone the torrent of human expression and comprehension, and Correlation Technology is in fact intrinsically "simple".
For Business Inquiries: Contact: Carl Wimmer carl@makesence.us Mobile: (702) 767-7001
For Technical Inquiries: Contact: Mark Bobick m.bobick@correlationconcepts.com Mobile: (702) 882-5664