7. Automatic Annotation
Spotting Disambiguation
Heading into the Los Angeles Lakers' Tuesday night tilt with the Golden State Warriors, Kobe Bryant needed 25 points to pass Michael Jordan for the second-most points scored by a player for a single NBA franchise. He got 30. We now have a definitive answer; finally, the debate can end.
12. Motivating Vision
Next-Generation Search will be
Information Extraction + Ontology + Inference
Object 1
…
Albert Einstein was a German-born theoretical physicist …
Which German Scientists Taught at US Universities?
…
Object 3
New Jersey is a state in the Northeastern region
…
Object 2
…
Einstein was a guest lecturer at the Institute for Advanced Study in New Jersey
…
14. AQL script for Techpedia Project Title
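-- Locate every occurrence of the "Project Title" label, ignoring case.
create view ProjectName as
  extract
    regex /Project Title/
      with flags 'CASE_INSENSITIVE'
      on D.text as title
  from DetaggedDoc D;

-- Take the 10 tokens to the right of each label as the candidate title.
create view ProjectTitle as
  select RightContextTok(P.title, 10) as ProjectTitle
  from ProjectName P;

output view ProjectTitle;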
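For readers without an AQL runtime, the same idea can be sketched in Python. This is only a rough analogue, not how AQL itself is implemented: the helper names (`right_context_tok`, `project_titles`) are hypothetical, and the tokenizer is an assumption (runs of word characters, or single punctuation marks), which may differ from the tokenizer the AQL engine uses.

```python
import re

# Assumed tokenization: a run of word characters, or one punctuation mark.
TOKEN = re.compile(r"\w+|[^\w\s]")

def right_context_tok(text, end, n):
    """Return the text covering up to n tokens to the right of offset `end`."""
    last_end = end
    count = 0
    for m in TOKEN.finditer(text, end):
        last_end = m.end()
        count += 1
        if count == n:
            break
    return text[end:last_end]

def project_titles(text, n=10):
    """Find each 'Project Title' label (any case) and return its right context."""
    return [
        right_context_tok(text, m.end(), n)
        for m in re.finditer(r"Project Title", text, re.IGNORECASE)
    ]
```

Because the window is a fixed number of tokens, the result can run past the end of the title into the next field, which is why a downstream cleanup step is usually needed.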
15. AQL script for course prerequisite
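-- Find the "Prerequisites" heading and capture the text that follows it.
create view PrerequisiteWord as
  extract regex /(Prerequisites\n\s*)(.*)/
    on D.text
    return group 1 as prereqWord and
           group 2 as prereq
  from DetaggedDoc D;

-- Mark each line break inside the captured text as a split boundary.
create view prereq as
  extract
    P.prereq as match,
    regex /\n/ on P.prereq as boundary
  from PrerequisiteWord P;

-- Split the captured text at the boundaries and keep a single result.
create view CoursePrerequisite as
  extract split using P.boundary
    retain left split point on P.match
    as CoursePrerequisite
  from prereq P
  limit 1;

output view CoursePrerequisite;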
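The capture-then-split pattern above can be approximated in Python as follows. This is a loose sketch under stated assumptions, not a faithful port: the function name `first_prerequisite_line` is hypothetical, the heading is assumed to be the word "Prerequisites" on its own line, and `re.DOTALL` is assumed so that the captured text may span several lines before it is split.

```python
import re

# Assumed heading layout: "Prerequisites", a line break, optional whitespace,
# then the text of interest.
HEADING = re.compile(r"Prerequisites\n\s*(.*)", re.DOTALL)

def first_prerequisite_line(text):
    """Return the first line after the 'Prerequisites' heading, or None."""
    m = HEADING.search(text)
    if m is None:
        return None
    # Split the captured text at line breaks and keep only the first piece.
    return m.group(1).split("\n", 1)[0].strip()
```

Returning `None` when the heading is missing leaves it to the caller to decide on a placeholder value.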
16. AQL script for Professor Name
-- A professor name is a first name, or a name initial, immediately followed by a last name.
create view ProfessorName as
  (select CombineSpans(F.fname, L.lname) as Pname
   from FirstName F, LastName L
   where FollowsTok(F.fname, L.lname, 0, 0)
   consolidate on F.fname using 'ContainedWithin'
   order by GetText(F.fname)
   limit 1)
  union all
  (select CombineSpans(F.Nameinitial, L.lname) as Pname
   from FNameInitial F, LastName L
   where FollowsTok(F.Nameinitial, L.lname, 0, 0)
   consolidate on F.Nameinitial using 'ContainedWithin'
   order by GetText(F.Nameinitial)
   limit 1);
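The first branch of this view (first name immediately followed by last name) can be illustrated with a small Python sketch. It is an approximation under assumptions: `find_spans` and `professor_name` are hypothetical helpers, the name dictionaries are passed in as plain word lists, "immediately follows" is taken to mean that only whitespace separates the two spans, and the consolidation step is omitted.

```python
import re

def find_spans(text, words):
    """Return (begin, end) offsets of whole-word matches for the given words."""
    if not words:
        return []
    pattern = r"\b(?:" + "|".join(re.escape(w) for w in words) + r")\b"
    return [m.span() for m in re.finditer(pattern, text)]

def professor_name(text, first_names, last_names):
    """Combine a first-name span with the last-name span right after it."""
    candidates = []
    for f_begin, f_end in find_spans(text, first_names):
        for l_begin, l_end in find_spans(text, last_names):
            # Zero tokens in between: the gap may hold only whitespace.
            if l_begin >= f_end and text[f_end:l_begin].strip() == "":
                candidates.append((text[f_begin:f_end], text[f_begin:l_end]))
    # Order by the first-name text and keep a single result.
    candidates.sort()
    return candidates[0][1] if candidates else None
```

The combined span runs from the start of the first name to the end of the last name, so any whitespace between them is preserved in the result.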
18. Output view for Project Title
Output View ProjectTitle:
[Document.text[17-73]: ' : DESIGN & ANALYSIS OF PRESSURE VES...'(1 fields)]
ProjectTitle: : DESIGN & ANALYSIS OF PRESSURE VESSEL
College : U
[Document.text[276-349]: ' : Design and Analysis of Electrical...'(1 fields)]
ProjectTitle: : Design and Analysis of Electrical Overhead Traveling Crane
College
[Document.text[566-623]: ' : Blungers: For Slip and Glaze Prep...'(1 fields)]
ProjectTitle: : Blungers: For Slip and Glaze Preparation
College :
[Document.text[833-880]: ' : DESIGN OF STEAM CONDENSER\n Col...'(1 fields)]
ProjectTitle: : DESIGN OF STEAM CONDENSER
19. Output view for College Name
Output View CollegeName:
[Document.text[68-113]: ' : U.V.PATEL COLLEGE OF ENGINEERING...'(1 fields)]
CollegeName: : U.V.PATEL COLLEGE OF ENGINEERING
Guide
[Document.text[349-394]: ' : U.V.PATEL COLLEGE OF ENGINEERING...'(1 fields)]
CollegeName: : U.V.PATEL COLLEGE OF ENGINEERING
Guide
[Document.text[621-666]: ' : U.V.PATEL COLLEGE OF ENGINEERING...'(1 fields)]
CollegeName: : U.V.PATEL COLLEGE OF ENGINEERING
Guide
[Document.text[873-918]: ' : U.V.PATEL COLLEGE OF ENGINEERING...'(1 fields)]
CollegeName: : U.V.PATEL COLLEGE OF ENGINEERING
20. Output view for Guide Name
Output View GuideName:
[Document.text[116-150]: ' MR. Bhavesh Patel\n Team Members'(1 fields)]
Guide: MR. Bhavesh Patel
Team Members
[Document.text[397-418]: ' Mr. Bhavesh P. Patel'(1 fields)]
Guide: Mr. Bhavesh P. Patel
[Document.text[669-680]: ' PROF. V.B.'(1 fields)]
Guide: PROF. V.B.
[Document.text[921-941]: ' A.R. ISRANI\n Team'(1 fields)]
Guide: A.R. ISRANI
21. Output view for Team Members
Output View TeamMembers:
[Document.text[153-219]: ' Jimit Vyas & Mahavir Solanki\n Abs...'(1 fields)]
members: Jimit Vyas & Mahavir Solanki
Abstract : The significance read
[Document.text[437-471]: ' Vishal A. Patel,Bhavik H. Khamar,'(1 fields)]
members: Vishal A. Patel,Bhavik H. Khamar,
[Document.text[704-758]: ' PATEL JAYRAM ,PATEL KETUL ,PATEL TUS...'(1 fields)]
members: PATEL JAYRAM ,PATEL KETUL ,PATEL TUSHAR
Abstract :
[Document.text[952-1007]: ' HARSHAD PATEL,NITIN NAHAR,SANDEEP PA...'(1 fields)]
members: HARSHAD PATEL,NITIN NAHAR,SANDEEP PARMAR
23. Output view for Course Details
CourseId: CS 717
CourseName: Statistical Relational Learning
CoursePrerequisite: N/A
CourseHomepage: Not Available
CourseContent:
* What is Relational Learning? Need for RL.
* Discussion on the three elements of relational models: (a) logic for representing types, relations and complex dependencies between them, (b) uncertainty, and (c) learning and inferencing
* Basics of 0-order and First order logic - includes types of clauses, syntax and semantics, ....
CourseReferences: 1.011 Inductive Logic Programming: Techniques and Applications, N. Lavrac and S. Dzeroski. Ellis
ProfessorName: G.Ramakrishnan
25. What is annotation?
Short description of page/document
Explicit vs. implicit
Metadata and not content
Types of annotation
Concise description
Abbreviation
Opinion
Web URL annotation like wiki links
“Using Annotations in Enterprise Search” by Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, and Eugene Shekita. In Proceedings of the 15th International Conference on World Wide Web, 2006.
26. Flow of annotations in search
“Using Annotations in Enterprise Search” by Pavel A. Dmitriev, Nadav Eiron, Marcus Fontoura, and Eugene Shekita. In Proceedings of the 15th International Conference on World Wide Web, 2006.
27. Definition of Ontology
‘A formal, explicit specification of a shared conceptualization’ Gruber (1993)
formal: must be machine understandable
explicit: types of concepts and constraints must be clearly defined
shared: not private to some individual, but accepted by a group
conceptualization: an abstract model of some phenomenon in the world formed by identifying the …
28. Processes to create a Domain Ontology
Ontology acquisition
Automatic extraction of ontological knowledge from base vocabulary and domain specific text sources
Merging into one ontology
Refinement and Extension
Evaluation and Assessment
Editor's notes
Ontologies today are available in many different forms: as artifacts of a tedious knowledge-engineering process, as information that was extracted automatically from informal electronic sources, or as simple “light-weight” ontologies