2. Contents
• Franz and customers
• Two Use Cases
– Amdocs: a real time semantic platform for telecom that
knows everything about everyone in real time
– Real time news and social network analysis using the
Linked Open Data Cloud
• Scalability?
• Integration with other NoSQL databases – Solr, MongoDB
3. Franz Inc – Who We Are
• Private, founded 1984
• We are an AI and
Semantic Technology company
• Out of Berkeley
8. How is it different from an RDB
and why is it more flexible?
• No Schema.
– Say whatever you want to say but
– ontologies may constrain what you put in triple store
• No Link Tables
– because you can do one‐to‐many relationships directly
• No Indexing Choices
– Can add new data attributes (predicates) on‐the‐fly that
are available for querying in real time, because
everything is automatically indexed.
• Takes anything you give it: it is trivial to consume
– Rows and columns from RDB, XML, RDF(S), OWL, Text and
Extracted Entities, JSON
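A minimal sketch of the points on this slide, assuming a toy in-memory store rather than AllegroGraph itself. The class `TinyTripleStore`, its methods, and the identifiers such as `customer:42` and `hasDevice` are hypothetical names used only for illustration.

```python
from collections import defaultdict


class TinyTripleStore:
    """Toy in-memory triple store (illustrative only, not the AllegroGraph API).

    Every triple is indexed by subject, predicate and object on insert,
    so there are no indexing choices to make.
    """

    def __init__(self):
        self.triples = set()
        self.by_subject = defaultdict(set)
        self.by_predicate = defaultdict(set)
        self.by_object = defaultdict(set)

    def add(self, s, p, o):
        triple = (s, p, o)
        self.triples.add(triple)
        self.by_subject[s].add(triple)
        self.by_predicate[p].add(triple)
        self.by_object[o].add(triple)

    def match(self, s=None, p=None, o=None):
        """Return all triples matching the pattern; None acts as a wildcard."""
        candidates = [self.triples]
        if s is not None:
            candidates.append(self.by_subject.get(s, set()))
        if p is not None:
            candidates.append(self.by_predicate.get(p, set()))
        if o is not None:
            candidates.append(self.by_object.get(o, set()))
        return set.intersection(*candidates)


store = TinyTripleStore()
# One-to-many without a link table: repeat the predicate.
store.add("customer:42", "hasDevice", "device:A")
store.add("customer:42", "hasDevice", "device:B")
# A new predicate added on the fly: no schema change, no index definition.
store.add("customer:42", "preferredLanguage", "nl")

devices = sorted(o for _, _, o in store.match(s="customer:42", p="hasDevice"))
print(devices)
```

The new `preferredLanguage` predicate is queryable as soon as it is added, because `add` updates all three indices for every triple.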
12. Use Case Amdocs
Build a semantic platform
that knows everything
about everyone
in real time.
22. Telco Call Center Volume
Quadruples
Since 2007
• On average, each call
– Lasts 10 minutes
– Goes through 68 screens
• One call costs 3 months’ profit from that customer
• It’s getting worse every day!
23. Typical Interaction Begins in the
Dark
• The unknown – why calling? How to help?
• No real‐time context, insight & guidance
• Diagram labels: Bill, Past Payments, Plan, Calculator (avg peak usage), Device, Past Statements, Interactions (Memos)
• High AHT, poor FCR, low customer and agent satisfaction
25. AIDA Maps Events to
Concepts
Events from many source systems are transformed into a set of related business concepts
• Many events
– Interactions, Orders, Bills, Payments, Collections, Charge dispute, Customer, Pay instructions, Individual, Device Activated, Device heartbeat, Subscriptions, Device changes
• Triple Store with business concepts
– Subjective: "good payer"
– Patterns: "always pays 2 days late"
– Trends: "improving payer"
– Geospatial: "within 5 miles of the tower"
– Time, Chronology of events: "within 5 minutes of an outage"
– Probability: "probably will call about the bill"
– Absence of occurrence: "missed payment"
– Relationship between: "friend of a friend"
26. Events Decision Engine Actions
[Architecture diagram. Labels: SBA Application Server Container; Amdocs Event Collector; Amdocs Integration Framework; Event Ingestion; Inference Engine (Business Rules); Bayesian Belief Network; Scheduled Events; Events; “Sesame”; AllegroGraph Triple Store DB; Event Data Sources: Operational Systems (RM, CRM, OMS, CRM), NW, Web 2.0]
27. AIDA Event Collection
[Pipeline diagram: Event Sources, then Amdocs Event Collector (Collection, Parsing, Mapping, Publishing), then Ingestion, then Inference & Decision]
• Events are collected from many heterogeneous,
configured event sources
– Phone calls, texting, video upload, roaming, etc.
– iTunes download, web site interaction, media upload
– Emails, support calls
– Bill payment or non‐payment
– Phones stop working or disconnect
• All fused and mapped into a single event
knowledge base
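The fusing step above can be sketched roughly as follows. This is only an illustration under assumptions: the record layouts, the source names `cdr` and `billing`, and the predicate names are hypothetical and do not come from the Amdocs Event Collector.

```python
def map_call_record(record):
    """Map a hypothetical call-detail record to event triples."""
    event_id = f"event:{record['id']}"
    return [
        (event_id, "rdf:type", "PhoneCall"),
        (event_id, "customer", f"customer:{record['msisdn']}"),
        (event_id, "timestamp", record["start"]),
    ]


def map_payment(record):
    """Map a hypothetical billing record to event triples."""
    event_id = f"event:{record['payment_id']}"
    return [
        (event_id, "rdf:type", "BillPayment"),
        (event_id, "customer", f"customer:{record['account']}"),
        (event_id, "timestamp", record["paid_on"]),
        (event_id, "amount", record["amount"]),
    ]


# One mapper per configured event source (source names are assumptions).
MAPPERS = {"cdr": map_call_record, "billing": map_payment}


def ingest(knowledge_base, source, record):
    """Parse and map one source record, then publish it into the shared set."""
    knowledge_base.update(MAPPERS[source](record))


kb = set()
ingest(kb, "cdr", {"id": 1, "msisdn": "42", "start": "2010-06-01T10:00:00"})
ingest(kb, "billing", {"payment_id": 7, "account": "42",
                       "paid_on": "2010-06-02T09:30:00", "amount": 59.90})

# Both events now sit in one knowledge base and share the same customer node.
events_for_customer = sorted(
    s for s, p, o in kb if p == "customer" and o == "customer:42"
)
print(events_for_customer)
```

Once every source is mapped to the same vocabulary, a single pattern such as "all events for this customer" spans call records and payments alike.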
28. AIDA Semantic Inference
• Define rules that create higher level concepts
– Event (mapping) rules ‐ Map event data into the domain ontology
– Automatic rules – Compute new properties defined by the ontology
– On‐demand rules ‐ perform inference for the services
• Rules triggered upon event ingestion, service request or schedule
• Semantic rule inference generates new triples from existing ones
[Ontology diagram. Labels: Customer; Bills; Charges; Amount; Due Date; Payments; Payment; Pattern; “Timeliness”; Devices; Make; Model; Status; values Good, Bad, Improving, Worsening, Early, OnTime, Late]
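How rule inference generates new triples from existing ones can be illustrated with a small forward-chaining loop. This is a sketch under assumptions, not AIDA's inference engine: the rule `late_payer_rule` and the predicates `timeliness`, `paidBy` and `hasLatePayment` are hypothetical.

```python
def late_payer_rule(triples):
    """If a payment is Late and was made by a customer, link customer to it."""
    late = {s for s, p, o in triples if p == "timeliness" and o == "Late"}
    return {
        (o, "hasLatePayment", s)
        for s, p, o in triples
        if p == "paidBy" and s in late
    }


def forward_chain(triples, rules):
    """Apply all rules repeatedly until no rule produces a new triple."""
    known = set(triples)
    while True:
        new = set()
        for rule in rules:
            new |= rule(known) - known
        if not new:
            return known
        known |= new


facts = {
    ("payment:1", "timeliness", "Late"),
    ("payment:1", "paidBy", "customer:42"),
    ("payment:2", "timeliness", "OnTime"),
    ("payment:2", "paidBy", "customer:42"),
}
closure = forward_chain(facts, [late_payer_rule])
inferred = closure - facts
print(inferred)
```

The loop runs to a fixed point, so rules that build on triples inferred by other rules would also fire in later rounds.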
29. Semantic Inference – Using Business
Rules to generate high level concepts
• AIDA provides Workbench for business
rule construction
• Utilizes a sophisticated
magnetic block GUI for
business analysts
• Rules triggered to infer
and generate new
business concepts
• Example: “Late Payment” defined in Workbench
Each business rule defines an attribute. This rule defines
an attribute of the PaymentDetails class called timeliness
rule PaymentDetails.timeliness
{
  if date within EarlyPeriod days after customerBill.billDate
  then timeliness = Early ;
  else if date not within LatePeriod days after customerBill.billDate
  then timeliness = Late ;
  else timeliness = OnTime ;
}
• Java code
• All classes and their attributes are defined in the application ontology
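For readers who want to trace the timeliness rule, here is a rough Python equivalent. The slide gives no values for EarlyPeriod and LatePeriod, so the 10 and 30 days below are assumptions, as is the premise that a payment is never dated before its bill.

```python
from datetime import date

EARLY_PERIOD_DAYS = 10  # assumed value; the slide does not give one
LATE_PERIOD_DAYS = 30   # assumed value; the slide does not give one


def timeliness(payment_date, bill_date,
               early_period=EARLY_PERIOD_DAYS, late_period=LATE_PERIOD_DAYS):
    """Classify a payment as Early, OnTime or Late relative to the bill date.

    Assumes payment_date is on or after bill_date.
    """
    days_after_bill = (payment_date - bill_date).days
    if days_after_bill <= early_period:
        return "Early"
    if days_after_bill > late_period:  # "not within LatePeriod days"
        return "Late"
    return "OnTime"


bill_date = date(2010, 6, 1)
print(timeliness(date(2010, 6, 5), bill_date))
print(timeliness(date(2010, 6, 20), bill_date))
print(timeliness(date(2010, 7, 15), bill_date))
```

With these assumed periods, the three calls cover a payment 4 days, 19 days and 44 days after the bill date.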
30. Decisioning – Probabilistic
Assessment
• AIDA also incorporates Bayesian Belief Networks (BBN)
• These are graphical models for reasoning under uncertainty
• Important part of decision making – the likelihood of something happening,
estimated by how often it occurred in the past (primarily used in medical research
until recently)
• Evidence consists of observations on certain nodes leading to conclusions
[Belief network diagram: Evidence nodes lead to Conclusions nodes. Labels: Bill, Payment Pattern, Payment, Expect Payment Arrangement Setup, Expect Payment]
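The idea of estimating a likelihood from how often something occurred in the past can be sketched with plain frequency counts. This is deliberately much simpler than a Bayesian Belief Network: it covers a single evidence-to-conclusion link, and the history data and outcome names are invented for illustration.

```python
from collections import Counter

# Hypothetical history: (evidence observed, what the customer did next).
history = [
    ("late_payment_pattern", "called_about_bill"),
    ("late_payment_pattern", "called_about_bill"),
    ("late_payment_pattern", "requested_payment_arrangement"),
    ("late_payment_pattern", "no_call"),
    ("on_time_pattern", "no_call"),
    ("on_time_pattern", "no_call"),
    ("on_time_pattern", "called_about_bill"),
]


def ranked_outcomes(history, evidence):
    """Estimate P(outcome | evidence) by relative frequency, highest first."""
    counts = Counter(outcome for ev, outcome in history if ev == evidence)
    total = sum(counts.values())
    if total == 0:
        return []
    # Sort by descending count, then by name so ties are deterministic.
    ordered = sorted(counts.items(), key=lambda item: (-item[1], item[0]))
    return [(outcome, count / total) for outcome, count in ordered]


ranked = ranked_outcomes(history, "late_payment_pattern")
for outcome, probability in ranked:
    print(f"{outcome}: {probability:.2f}")
```

A real belief network combines many such conditional tables across connected nodes; the ranking step at the end is the part that resembles a list of conclusions ordered by probability.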
31. Presenting insight to the CSR
• Process opens relevant screen for reference and action
• Prediction on reason for the call – ranked by probability
• Presentation of recent interactions and events
• Prioritized recommended treatment and script
34. So why a triple store
• Flexibility, flexibility and flexibility
– Change the schema on a daily basis
– Customers create new policies which in turn will create
new schemas on the fly
• Needed to work with meaning
– RDF describes data
• Needed to be declarative for everything
– Most RTBI is a combination of data in the DB and Java
variables in the application.
37. How would you do this with
your standard search engine
• Give me a newspaper text with a republican and a democrat that serve on
two subcommittees that have the same parent committee.
• Which [democrat|republican] is most vocal about the oil spill disaster?
• Given this text, find all the other texts that have the same people and the
same main topics but not democrats in the text.
• Which newspaper favors [democrats|republicans]?
• Which [democrat|republican|senator|representative] got the most
attention in the last week?
• Give me the distribution of the most important topics yesterday.
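The first question above is a graph join rather than a keyword search. A minimal sketch over a handful of made-up triples follows. The people, committees and the predicates `party`, `servesOn` and `parent` are hypothetical; only `has-people` is taken from the deck's own vocabulary.

```python
from itertools import permutations

# Invented sample data for illustration only.
triples = {
    ("alice", "party", "democrat"),
    ("bob", "party", "republican"),
    ("carol", "party", "republican"),
    ("alice", "servesOn", "subcommittee:energy"),
    ("bob", "servesOn", "subcommittee:environment"),
    ("carol", "servesOn", "subcommittee:aviation"),
    ("subcommittee:energy", "parent", "committee:commerce"),
    ("subcommittee:environment", "parent", "committee:commerce"),
    ("subcommittee:aviation", "parent", "committee:transport"),
    ("article:1", "has-people", "alice"),
    ("article:1", "has-people", "bob"),
    ("article:2", "has-people", "alice"),
    ("article:2", "has-people", "carol"),
}


def objects(subject, predicate):
    """All objects o such that (subject, predicate, o) is in the data."""
    return {o for s, p, o in triples if s == subject and p == predicate}


def seats(person):
    """(subcommittee, parent committee) pairs for a person."""
    return {
        (sub, parent)
        for sub in objects(person, "servesOn")
        for parent in objects(sub, "parent")
    }


def matching_articles():
    """Articles naming a democrat and a republican who sit on two different
    subcommittees of the same parent committee."""
    hits = set()
    articles = {s for s, p, _ in triples if p == "has-people"}
    for article in articles:
        people = objects(article, "has-people")
        for dem, rep in permutations(people, 2):
            if "democrat" not in objects(dem, "party"):
                continue
            if "republican" not in objects(rep, "party"):
                continue
            for sub_d, parent_d in seats(dem):
                for sub_r, parent_r in seats(rep):
                    if sub_d != sub_r and parent_d == parent_r:
                        hits.add(article)
    return hits


result = matching_articles()
print(result)
```

In a triple store the same question would normally be a single declarative graph pattern; the nested loops here only make the joins explicit.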
38. The process
• We spider more than 300 on‐line newspapers and thousands of
blogs daily
• And search specifically for all the members of the Senate and
House of Representatives and the executive branch
• Apply an entity extractor to the text and extract the main concepts
– About 150 triples per text…
• Hook up these concepts with a detailed database of each
politician and with information from the linked open data
cloud
42. From News Article to
• People (has‐people)
– And their roles
• Places (has‐places)
– And the county, state, country they are in
• Organizations (has‐organizations)
– Government departments, company names, etc.
• Main Categories (has‐domains)
– Politics, sports, ministries, energy, finance, economics,
ecology, oil, mining industry, etc.
• Main Concepts (has‐main‐groups)
– Other important nouns and phrases in a text
69. Query performance notes:
Wins
• Indices are small enough to fit in memory of conventional
machines
• Simultaneous access to indices (see next slide)
• Pipeline architecture
– Stream based processing (all nodes can be active in
parallel. Most nodes can begin before the end of data is
reached.)
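Stream-based processing can be illustrated with Python generators. This is a single-threaded sketch, so it shows only that downstream nodes start before upstream data is exhausted, not true parallelism, and it is not the engine's actual implementation. The node names and sample data are assumptions.

```python
from itertools import islice

scanned = []  # records how many triples the scan node actually touched


def scan(index):
    """Source node: yields triples one at a time."""
    for triple in index:
        scanned.append(triple)
        yield triple


def filter_predicate(stream, predicate):
    """Filter node: passes through triples with the given predicate."""
    for s, p, o in stream:
        if p == predicate:
            yield (s, p, o)


def project_subject(stream):
    """Projection node: keeps only the subject of each triple."""
    for s, _, _ in stream:
        yield s


# Invented data: 1000 triples alternating between two predicates.
index = [
    (f"article:{i}", "has-people" if i % 2 == 0 else "has-places", "x")
    for i in range(1000)
]

pipeline = project_subject(filter_predicate(scan(index), "has-people"))
first_three = list(islice(pipeline, 3))
print(first_three, len(scanned))
```

Each node pulls from the one before it on demand, so asking for three results touches only the first few triples of the index instead of all 1000.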