Leveraging the Semantic Web, Schema.org, Semantic Search and more
San Diego Semantic Web Meetup
By: Barbara Starr
Twitter: @BarbaraStarr
Email: bstarr@algebraixData.com
• Pursued a doctorate in Artificial Intelligence in South
Africa in the 1980s.
• Recruited to build intelligent/predictive trading systems
on Wall Street
• Migrated to government-based contracts, several of
which turned into real world products like
– SIRI (PAL from DARPA)
– WATSON (AQUAINT; IBM Watson Labs was a team
member)
• From the vantage of a semantic technologist, I keenly
watched the evolution of the Semantic Web.
• “Shocked into the real world” when working as a
consultant @ Overstock
• Today: SVP, Product Management, AlgebraixData
Meta Information
ME
By: Barbara Starr
Twitter: @BarbaraStarr
Email: bstarr@algebraixData.com
Linkedin: http://www.linkedin.com/in/barbarastarr
My favorite author:
Isaac Asimov
Favorite book:
I, Robot
Favorite character:
MULTIVAC
Additional Metainformation
For the purpose of this talk:
same-as
MY ROBOT or Artificially Intelligent Entity or Search Engine
OWL
I explain things
from a Search
Engine Point of
View! 
SEARCH ENGINE POINT OF VIEW
How can I exploit
metadata or
“semantic
search”??
SEARCH ENGINE POINT OF VIEW
RICH SNIPPETS 2009
tiles
Searchmonkey 2008
I can directly extract
information to
enhance SERP displays
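As a rough illustration of what "directly extracting information" from a page can look like, here is a minimal sketch that reads flat microdata with Python's standard library. The class name, the sample page and its values are assumptions made for this example; real search engine extractors are not public and handle nesting, multiple items and malformed markup.

```python
from html.parser import HTMLParser


class MicrodataExtractor(HTMLParser):
    """Collect itemprop name/value pairs from flat (non-nested) microdata."""

    def __init__(self):
        super().__init__()
        self.item_type = None
        self.properties = {}
        self._current_prop = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if "itemtype" in attrs:
            self.item_type = attrs["itemtype"]
        prop = attrs.get("itemprop")
        if prop is None:
            return
        if "content" in attrs:
            # <meta itemprop="..." content="..."> carries its value in the attribute.
            self.properties[prop] = attrs["content"]
        else:
            # Otherwise the value is the element's text, read in handle_data.
            self._current_prop = prop

    def handle_data(self, data):
        if self._current_prop and data.strip():
            self.properties[self._current_prop] = data.strip()
            self._current_prop = None

    def handle_endtag(self, tag):
        self._current_prop = None


def extract(html):
    """Return (item type, properties) for the single item on the page."""
    parser = MicrodataExtractor()
    parser.feed(html)
    return parser.item_type, parser.properties


# Hypothetical page fragment, invented for this sketch.
PAGE = """
<div itemscope itemtype="http://schema.org/Offer">
  <span itemprop="name">Desert Pale Ale</span>
  <meta itemprop="priceCurrency" content="USD">
  <span itemprop="price">9.99</span>
</div>
"""

item_type, properties = extract(PAGE)
```

The extracted name and price are the kind of values a results page could show next to the link.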
SEARCH ENGINE POINT OF VIEW
I can search directly on
consumed metadata!
SEARCH ENGINE POINT OF VIEW
I can provide direct
answers to queries by
searching on
consumed, verified and
validated information
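A toy sketch of answering a query straight from consumed facts rather than from page text. The FactStore class and its methods are hypothetical names for this illustration, and the facts are stored as simple subject/property/value triples using schema.org property names.

```python
from collections import defaultdict


class FactStore:
    """Tiny in-memory store of (subject, property, value) facts."""

    def __init__(self):
        self._index = defaultdict(list)

    def add(self, subject, prop, value):
        self._index[(subject, prop)].append(value)

    def answer(self, subject, prop):
        """Return every known value, or an empty list if nothing was consumed."""
        return list(self._index.get((subject, prop), []))


store = FactStore()
store.add("I, Robot", "author", "Isaac Asimov")
store.add("I, Robot", "datePublished", "1950")

author = store.answer("I, Robot", "author")
unknown = store.answer("I, Robot", "publisher")
```

A question such as "who wrote I, Robot" then becomes a lookup, and an empty result signals that no validated fact is available.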
SEARCH ENGINE POINT OF VIEW
I can even aggregate
answers or deduce
them (like a timeline of
events)
SEARCH ENGINE POINT OF VIEW
I can even use it in
conjunction with
machine learning
techniques, e.g. to
train other
components
I can detect
relevancy
signals, i.e. what
content to show
to what
audience
I can use it to
assist in
interpreting a
user query
Penn Treebank tagset
?
SEARCH ENGINE POINT OF VIEW
Really interesting in terms
of exposing long tail
content too. It makes
things findable for me
when pages are published
with structured markup!
I meant the
beer brewer
in Arizona
SEARCH ENGINE POINT OF VIEW
I’m a Search Engine Robot
I could really use
this stuff. And it
is like the Tower
of Babel out
there!
Microdata
Microformats
RDFa
Multiple conflicting
vocabularies that I will
have to align internally
and multiple syntax
formats as well.
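To make "aligning vocabularies internally" concrete, here is a minimal sketch that maps property names from several vocabularies onto one internal key. The ALIGNMENT table and the align function are assumptions for illustration only; the property names themselves (schema.org name and url, Open Graph og:title and og:url, hCard fn, Dublin Core title) are real.

```python
# Hypothetical internal alignment table: external property -> canonical key.
ALIGNMENT = {
    "http://schema.org/name": "name",
    "og:title": "name",
    "fn": "name",  # hCard formatted name (microformats)
    "http://purl.org/dc/terms/title": "name",
    "http://schema.org/url": "url",
    "og:url": "url",
}


def align(record):
    """Keep only properties we know how to align; the first value seen wins."""
    aligned = {}
    for key, value in record.items():
        canonical = ALIGNMENT.get(key)
        if canonical is not None:
            aligned.setdefault(canonical, value)
    return aligned


open_graph_record = {
    "og:title": "Example Brewing",
    "og:url": "http://example.com/",
    "og:type": "website",
}
aligned = align(open_graph_record)
```

Every vocabulary added means another set of rows in a table like this, which is the maintenance burden the slide describes.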
Prior to Schema.org (i.e. June 2011)
GoodRelations for e-commerce
?
SEARCH ENGINE POINT OF VIEW
Time to get Serious!
What has been the history?
Percentage of URLs with embedded metadata in various formats
Five-fold increase between
March, 2009 and October,
2010
Another five-fold increase
between October 2010 and
January, 2012
RDFa exploded in 2012 (source: Peter Mika, Yahoo)
Current state of metadata on the Web
• 31% of webpages, 5% of domains contain some metadata
  – Analysis of the Bing Crawl (US crawl, January, 2012)
  – RDFa is most common format
    • By URL: 25% RDFa, 7% microdata, 9% microformat
    • By eTLD (PLD): 4% RDFa, 0.3% microdata, 5.4% microformat
  – Adoption is stronger among large publishers
    • Especially for RDFa and microdata
• See also
  – P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012
  – H. Mühleisen, C. Bizer. Web Data Commons - Extracting Structured Data from Two Large Web Corpora, LDOW 2012
Prolific growth of the LOD Cloud
Timeline of RDFa and Semantic Web Adoption
As of Semtech 2011
Inevitable passage of
Semantic Web adoption –
culminating in schema.org
SEARCH ENGINE POINT OF VIEW
Align and consume
many vocabularies
that may not be of
interest to search
engines?
Rather, mandate vocabulary and syntax: microdata
A Search Engine
alliance has the power
to MANDATE
vocabulary and syntax!
Initial alliance: Google, Yahoo, Bing. Then Yandex and subsequently Pinterest
Sample portion
SEARCH ENGINE POINT OF VIEW
On the other hand
– Not wise to
ignore standards
bodies like W3C
No mandate on Syntax
SEARCH ENGINE POINT OF VIEW
Did I tell you I
don’t like spam?
SEARCH ENGINE POINT OF VIEW
Make sure you are
not cloaking by
feeding one set of
information to me
and another to
human users!
Ensure your data
feeds match
information with
the structured
markup or
“metadata” on
your web pages.
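One way to check that a data feed and the on-page markup agree is to compare them property by property before publishing. The sketch below assumes both have already been reduced to plain dictionaries; the function name and the sample values are hypothetical.

```python
def find_mismatches(feed_item, markup_item):
    """Return {property: (feed value, markup value)} for every property that differs."""
    mismatches = {}
    for prop in sorted(set(feed_item) | set(markup_item)):
        feed_value = feed_item.get(prop)
        markup_value = markup_item.get(prop)
        if feed_value != markup_value:
            mismatches[prop] = (feed_value, markup_value)
    return mismatches


# Invented example data: the price differs and availability is missing on the page.
feed = {"name": "Desert Pale Ale", "price": "9.99", "availability": "InStock"}
markup = {"name": "Desert Pale Ale", "price": "8.99"}

problems = find_mismatches(feed, markup)
```

An empty result means the two sources tell the same story; anything else is worth fixing before a crawler sees it.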
Your Logo
SEARCH ENGINE POINT OF VIEW
Serving
RELEVANT
ANSWERS are
IMPERATIVE!
& central to my
very being!
SEARCH ENGINE POINT OF VIEW
ELSE I AM
SEARCH ENGINE POINT OF VIEW
Adding context in
search verticals really
helps me serve up
relevant information
(Seriously increases my
recall), as does
geospatial information.
Consumed information -
Structured Data Dashboard
Google’s “SearchVerticals”
Notice any correlations?
I would advise you to!
OH! and be sure to
check out Moore's law
SEARCH ENGINE POINT OF VIEW
I also have a pretty
good understanding of
big data and web
intelligence so I can
leverage them!
SIRI
“Amazing fact: same
amount of computing to
answer one Google Search
query as all the computing
done -- in flight and on the
ground -- for the entire
Apollo program!”
SEARCH ENGINE POINT OF VIEW
I can leverage
metadata for
better image
search
SIRI
I can combine it with
computer vision
techniques.
I can enhance
user’s shopping
experience.
SEARCH ENGINE POINT OF VIEW
? Know rather than
Recognize?
INTRODUCING THE KNOWLEDGE GRAPH
Symbolic
reasoning vs
stochastic
reasoning (the latter is
more like NLP or
PageRank)
SEARCH ENGINE POINT OF VIEW
Folks finding answers
on my page never
even have to click
through to yours!
And speaking of
the knowledge
graph or
knowledge
carousel!
I can even now
start to derive
associations or
relationships
between entities.
SEARCH ENGINE POINT OF VIEW
Check out this great highlighter.
The information is available
only to me and not to any other
search or social engines!
Can you believe I have been
accused of hijacking semantic
markup?
I find it so helpful that I
would really like to be
able to keep all that
validated verified
information to myself!
SEARCH ENGINE POINT OF VIEW
And extended my data
highlighter to include the
following types of entities
(check your webmaster
tools for this)
I have since created
the structured markup
helper (and added
support for JSON-LD as
well as microdata)!
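For readers who have not seen JSON-LD, here is a minimal sketch of producing a schema.org description as a script block with Python's json module. The helper name and the sample organization are assumptions; the "@context" and "@type" keys and the application/ld+json media type are part of JSON-LD itself.

```python
import json

SCRIPT_OPEN = '<script type="application/ld+json">'
SCRIPT_CLOSE = "</script>"


def to_json_ld_script(item_type, properties):
    """Serialize one schema.org item as an embeddable JSON-LD script element."""
    document = {"@context": "https://schema.org", "@type": item_type}
    document.update(properties)
    body = json.dumps(document, indent=2, sort_keys=True)
    # Escape "</" so a value can never close the script element early;
    # "\/" is a valid JSON escape for "/".
    body = body.replace("</", "<\\/")
    return SCRIPT_OPEN + "\n" + body + "\n" + SCRIPT_CLOSE


# Hypothetical organization, invented for this sketch.
snippet = to_json_ld_script(
    "Organization",
    {"name": "Example Brewing", "url": "http://example.com/"},
)
```

Unlike microdata, the block sits apart from the visible HTML, so it can be generated from the same record that feeds the page template.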
SEARCH ENGINE POINT OF VIEW
They are also leveraging it
in their newly released
graph search!
Not only that, they are even
building an entity graph not
dissimilar from my
knowledge graph!
My social counterparts
have been leveraging
structured markup
(RDFa) for their
Open Graph protocol for
quite some time.
The Open Graph Protocol enables you to
integrate your Web pages into the social graph
Example of crowdsourced entity graph info source: places
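Open Graph markup is carried in meta elements whose property attribute starts with "og:". The following sketch reads those tags with the standard library; the parser class, helper function and sample page are assumptions for illustration and say nothing about how any social network actually consumes the data.

```python
from html.parser import HTMLParser


class OpenGraphParser(HTMLParser):
    """Collect <meta property="og:..." content="..."> pairs."""

    def __init__(self):
        super().__init__()
        self.properties = {}

    def handle_starttag(self, tag, attrs):
        if tag != "meta":
            return
        attrs = dict(attrs)
        prop = attrs.get("property") or ""
        if prop.startswith("og:") and "content" in attrs:
            self.properties[prop] = attrs["content"]


def parse_open_graph(html):
    parser = OpenGraphParser()
    parser.feed(html)
    return parser.properties


# Hypothetical page head, invented for this sketch.
PAGE_HEAD = """
<head>
  <meta property="og:title" content="Example Brewing">
  <meta property="og:type" content="website">
  <meta property="og:url" content="http://example.com/">
  <meta name="description" content="Not an Open Graph tag">
</head>
"""

og = parse_open_graph(PAGE_HEAD)
```

The title, type and URL are what turn a plain page into a typed object that can be placed in a graph.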
SEARCH ENGINE POINT OF VIEW
My social counterparts
ought to have a field day in
terms of both targeted
advertising and in creating
engaging user experiences
by leveraging their more
recent innovations.
SEARCH ENGINE POINT OF VIEW
Knowledge Graphs are
now ubiquitous, and
the term has become
common vernacular!
LINKED IN SNAPSHOTS
ADDED PUBMED
Knowledge Graph
Knowledge Graph
SEARCH ENGINE POINT OF VIEW
I am starting to use
hashtags in search so
I can merge topics
and entities in graphs,
like some of my social
counterparts!
SEARCH ENGINE POINT OF VIEW
I am even now measuring
my trending “entities” in my
top charts, rather than
“strings”.
SEARCH ENGINE POINT OF VIEW
LIST IS GROWING FAST!
LATEST DRAFT ON ACTION
TYPES – July 2013
Via publicvocabs@w3
SEARCH ENGINE POINT OF VIEW
Check the list to see
what is coming out
next! Schema.org is
dynamic and is
growing!
Mark up information not
yet consumed by search
engines to get the
advantage of extra lift
when it is adopted.
SEARCH ENGINE POINT OF VIEW
Thank you for your
time! 
And just by the by,
this technology is still in
its nascent stages. Can
you imagine what I will
be able to do soon?
Barbara Starr
Email: bstarr@AlgebraixData
Twitter: @BarbaraStarr
Resources to help you!
Make sure to use
them wisely!
Remember, if you want
to make the search
engines happy, put
yourself in their shoes!
PageRank is now only 1
of over 200 signals that
Google uses!
Resources at this point in time
Caveat: Some training may be required for some of the tools
Programming Languages:
JavaScript: Microdatajs
Live microdata
PHP: Microdataphp
Ruby: RDF Microdata
RDF Lib plugin
PerlRuby: RDF Microdata Gem
Mida
Java: Sindice any23 library
Publishing
Form Based tools:
Schema Creator
Microdata generator
Standalone tools
Web.instadata
Editors:
Topbraid Composer
Protege
Platforms:
Drupal
Joomla
Wordpress (about 7 of them)
Virtuoso
Topbraid Composer
Validators, Testers and More
Check.rdfa.info
Sindice Inspector
Rich Snippets Testing Tool
Bing Validator
Structured Data Linter
Online parser/viewer and RSS generator
Validator.nu
Google Structured Data Tester
GoodRelations Resources ……
GoodRelations: Resources, generators, validators, more, ….
More Resources
From the mouth of
Other Semantic Web Resources
OpenCalais – Can extract information about people, places and things
AlchemyAPI – named entity extraction, topic recognition, keyword tagging, more ….
Cogito – Expert System
Franz Inc. – Gruff
Pool Party
JSON-LD playground
YAHOO! Glimmer
Many More….
For more info contact:
Barbara Starr
Twitter: @BarbaraStarr
Email: bstarr@algebraixdata.com
LinkedIn: http://www.linkedin.com/in/barbarastarr
Bye for now

Leveraging the semantic web meetup, Semantic Search, Schema.org and more

  • 1. Leveraging the Semantic Web, Schema.org, Semantic Search and more San Diego Semantic Web Meetup By: Barbara Starr Twitter: @BarbaraStarr Email: bstarr@algebraixData.com
  • 2. • Pursued a doctorate in Artificial Intelligence in South Africa in the 1980s. • Recruited to build intelligent/predictive trading systems on Wall Street. • Migrated to government-based contracts, several of which turned into real-world products like – SIRI (PAL from DARPA) – WATSON (Acquaint - IBM Watson Labs was a team member) • From the vantage of a semantic technologist, I keenly watched the evolution of the Semantic Web. • “Shocked into the real world” when working as a consultant @ Overstock • Today – SVP Product Management, AlgebraixData Meta Information ME By: Barbara Starr Twitter: @BarbaraStarr Email: bstarr@algebraixData.com Linkedin: http://www.linkedin.com/in/barbarastarr My favorite author: Isaac Asimov Favorite book: I, Robot Favorite character: MULTIVAC
  • 3. Additional Metainformation For the purpose of this talk: same-as MY ROBOT or Artificially Intelligent Entity or Search Engine OWL I explain things from a Search Engine Point of View! 
  • 4. SEARCH ENGINE POINT OF VIEW How can I exploit metadata or “semantic search”??
  • 5. SEARCH ENGINE POINT OF VIEW RICH SNIPPETS 2009 tiles Searchmonkey 2008 I can directly extract information to enhance SERP displays
  • 6. SEARCH ENGINE POINT OF VIEW I can search directly on consumed metadata!
  • 7. SEARCH ENGINE POINT OF VIEW I can provide direct answers to queries by searching on consumed, verified and validated information
  • 8. SEARCH ENGINE POINT OF VIEW I can even aggregate answers or deduce them (like a timeline of events)
  • 9. SEARCH ENGINE POINT OF VIEW I can even use it in conjunction with machine learning techniques, e.g. to train other components. I can detect relevancy signals, i.e. what content to show to what audience. I can use it to assist in interpreting a user query. Penn Treebank tagset ?
  • 10. SEARCH ENGINE POINT OF VIEW Really interesting in terms of exposing long tail content too. It makes things findable for me when pages are published with structured markup! I meant the beer brewer in Arizona
  • 11. SEARCH ENGINE POINT OF VIEW I’m a Search Engine Robot. I could really use this stuff. And it is like the Tower of Babel out there! Microdata Microformats RDFa Multiple conflicting vocabularies that I will have to align internally, and multiple syntax formats as well. Prior to Schema.org (i.e. June 2011) GoodRelations for e-commerce ?
  • 12. SEARCH ENGINE POINT OF VIEW Time to get Serious!
  • 13. What has been the history? Percentage of URLs with embedded metadata in various formats Five-fold increase between March, 2009 and October, 2010 Another five-fold increase between October 2010 and January, 2012 RDFa exploded in 2012 – Source Peter Mika - Yahoo
  • 14. Current state of metadata on the Web • 31% of webpages, 5% of domains contain some metadata – Analysis of the Bing Crawl (US crawl, January 2012) – RDFa is the most common format • By URL: 25% RDFa, 7% microdata, 9% microformat • By eTLD (PLD): 4% RDFa, 0.3% microdata, 5.4% microformat – Adoption is stronger among large publishers • Especially for RDFa and microdata • See also – P. Mika, T. Potter. Metadata Statistics for a Large Web Corpus, LDOW 2012 – H. Mühleisen, C. Bizer. Web Data Commons - Extracting Structured Data from Two Large Web Corpora, LDOW 2012
  • 15. Prolific growth of the LOD Cloud
  • 16. Timeline of RDFa and Semantic Web Adoption As of Semtech 2011 Inevitable passage of Semantic Web adoption – culminating in schema.org
  • 17. SEARCH ENGINE POINT OF VIEW Align and consume many vocabularies that may not be of interest to search engines? Rather mandate vocabulary And Syntax - microdata A Search Engine alliance has the power to MANDATE vocabulary and syntax! Initial alliance: Google, Yahoo, Bing. Then Yandex and subsequently Pinterest
  • 19. SEARCH ENGINE POINT OF VIEW On the other hand – Not wise to ignore standards bodies like W3C No mandate on Syntax
  • 20. SEARCH ENGINE POINT OF VIEW Did I tell you I don’t like spam?
  • 21. SEARCH ENGINE POINT OF VIEW Make sure you are not cloaking by feeding one set of information to me and another to human users! Ensure your data feeds match information with the structured markup or “metadata” on your web pages.
  • 22. SEARCH ENGINE POINT OF VIEW Serving RELEVANT ANSWERS is IMPERATIVE! & central to my very being!
  • 23. SEARCH ENGINE POINT OF VIEW ELSE I AM
  • 25. SEARCH ENGINE POINT OF VIEW Adding context in search verticals really helps me serve up relevant information (Seriously increases my recall), as does geospatial information. Consumed information - Structured Data Dashboard Google’s “SearchVerticals” Notice any correlations? I would advise you to!
  • 26. OH! And be sure to check out Moore's law. SEARCH ENGINE POINT OF VIEW I also have a pretty good understanding of big data and web intelligence so I can leverage them! SIRI “Amazing fact: same amount of computing to answer one Google Search query as all the computing done -- in flight and on the ground -- for the entire Apollo program!”
  • 27. SEARCH ENGINE POINT OF VIEW I can leverage metadata for better image search SIRI I can combine it with computer vision techniques. I can enhance user’s shopping experience.
  • 29. SEARCH ENGINE POINT OF VIEW ? Know rather than Recognize? INTRODUCING THE KNOWLEDGE GRAPH Symbolic reasoning vs stochastic reasoning (Latter is more like NLP or page rank)
  • 30. SEARCH ENGINE POINT OF VIEW ♫ Folks finding answers on my page never even have to click through to yours! And speaking of the knowledge graph or knowledge carousel! I can even now start to derive associations or relationships between entities.
  • 31. SEARCH ENGINE POINT OF VIEW Check out this great highlighter. The information is available only to me and not to any other search or social engines! Can you believe I have been accused of hijacking semantic markup? I find it so helpful that I would really like to be able to keep all that validated verified information to myself!
  • 32. SEARCH ENGINE POINT OF VIEW And extended my data highlighter to include the following types of entities (check your webmaster tools for this). I have since created the structured markup helper! And added support for JSON-LD as well as microdata.
  • 33. SEARCH ENGINE POINT OF VIEW They are also leveraging it in their newly released graph search! Not only that, they are even building an entity graph not dissimilar from my knowledge graph! My social counterparts have been leveraging structured markup (RDFa) for their Open Graph protocol for quite some time. The Open Graph Protocol enables you to integrate your Web pages into the social graph. Example of crowdsourced entity graph info source - places
  • 34. SEARCH ENGINE POINT OF VIEW My social counterparts ought to have a field day in terms of both targeted advertising and in creating engaging user experiences by leveraging their more recent innovations.
  • 35. SEARCH ENGINE POINT OF VIEW Knowledge Graphs are now ubiquitous, and the term has become common vernacular! LINKED IN SNAPSHOTS ADDED PUBMED Knowledge Graph Knowledge Graph
  • 36. SEARCH ENGINE POINT OF VIEW I am starting to use hashtags in search so I can merge topics and entities in graphs, like some of my social counterparts! LINKED IN SNAPSHOTS ADDED PUBMED Knowledge Graph Knowledge Graph
  • 37. SEARCH ENGINE POINT OF VIEW I am even now measuring my trending “entities” in my top charts, rather than “strings”.
  • 38. SEARCH ENGINE POINT OF VIEW LIST IS GROWING FAST! LATEST DRAFT ON ACTION TYPES – July 2013 Via publicvocabs@w3
  • 39. SEARCH ENGINE POINT OF VIEW Check the list to see what is coming out next! Schema.org is dynamic and is growing! Mark up information not yet consumed by search engines to get the advantage of extra lift when it is adopted.
  • 40. SEARCH ENGINE POINT OF VIEW Thank you for your time!  And just by the by, this technology is still in its nascent stages. Can you imagine what I will be able to do soon? Barbara Starr Email: bstarr@AlgebraixData Twitter: @BarbaraStarr Resources to help you! Make sure to use them wisely! Remember, if you want to make the search engines happy, put yourself in their shoes! PageRank is now only 1 of over 200 signals that Google uses!
  • 41. Resources at this point in time Caveat: Some training may be required for some of the tools Programming Languages: JavaScript: Microdatajs Live microdata PHP: Microdataphp Ruby: RDF Microdata RDF Lib plugin Perl Ruby: RDF Microdata Gem Mida Java: Sindice any23 library Publishing Form Based tools: Schema Creator Microdata generator Standalone tools Web.instadata Editors: Topbraid Composer Protege Platforms: Drupal Joomla Wordpress (about 7 of them) Virtuoso Topbraid Composer Validators, Testers and More Check.rdfa.info Sindice Inspector Rich Snippets Testing Tool Bing Validator Structured data Linter Online Parser/viewer and RSS generator Validator.nu Google Structured Data Tester
  • 42. Goodrelations Resources …… Goodrelations: Resources, generators, validators, more, ….
  • 44. Other Semantic Web Resources OpenCalais – Can extract information about people, places and things AlchemyAPI – named entity extraction, topic recognition, keyword tagging, more …. Cogito – Expert System Franz Inc. – Gruff Pool Party JSON-LD playground YAHOO! Glimmer Many More…. For more info contact: Barbara Starr Twitter: @BarbaraStarr Email: bstarr@algebraixdata.com Linkedin: http://www.linkedin.com/in/barbarastarr Caveat: Some training may be required for some of the tools Topbraid Composer
  • 45. By Barbara Starr Twitter: @BarbaraStarr Linkedin :http://www.linkedin.com/in/barbarastarr E-mail : bstarr@algebraixdata.com Bye for now
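
Slides 10, 21 and 32 talk about publishing pages with structured markup, keeping data feeds consistent with that markup, and JSON-LD support. The following is only a minimal Python sketch of those two ideas, not the method of any particular search engine. The business name, city, URL and the helper names (brewery_jsonld, to_script_tag, feed_mismatches) are all hypothetical and chosen for illustration; only the schema.org terms (Brewery, PostalAddress, name, url, address, addressLocality, addressRegion) come from the schema.org vocabulary.

```python
import json


def brewery_jsonld(name, city, region, url):
    """Build a schema.org Brewery description as a JSON-LD dictionary."""
    return {
        "@context": "https://schema.org",
        "@type": "Brewery",
        "name": name,
        "url": url,
        "address": {
            "@type": "PostalAddress",
            "addressLocality": city,
            "addressRegion": region,
        },
    }


def to_script_tag(data):
    """Wrap the JSON-LD in the script element that is embedded in a page."""
    body = json.dumps(data, indent=2)
    return '<script type="application/ld+json">\n' + body + "\n</script>"


def feed_mismatches(feed_record, markup, fields=("name", "url")):
    """Return the fields whose data-feed value differs from the page markup."""
    return [f for f in fields if feed_record.get(f) != markup.get(f)]


# Hypothetical example data: a small brewer in Arizona (long-tail content).
markup = brewery_jsonld(
    "Example Brewing Co.", "Tempe", "AZ", "https://brewery.example/"
)
feed_ok = {"name": "Example Brewing Co.", "url": "https://brewery.example/"}
feed_bad = {"name": "Cheap Beer Deals!!!", "url": "https://brewery.example/"}

print(to_script_tag(markup))
print(feed_mismatches(feed_ok, markup))   # no differences
print(feed_mismatches(feed_bad, markup))  # the name differs from the markup
```

A check like feed_mismatches could run before a feed is submitted, so that the feed and the on-page markup never tell the search engine two different stories.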