Semantic Technology 
in Publishing & Finance 
Triplestores and inference, applications in Finance, the GraphDB engine, 
text-mining, projects and solutions for financial media and publishers 
Keystone Industrial Panel 
ISWC 2014, Riva del Garda, 18 Oct 2014 
Semantic Technology in Publishing & Finance Oct 2014 #1
Outline 
• Introduction to Ontotext 
• Clients, cases 
• Text Mining, Media and Publishing Solution 
• SemTech applications in Finance 
• Wrap-up 
Ontotext 
• Information management company providing text analysis, 
data management and state-of-the-art semantic technology 
• 75 employees, headquartered in Sofia, Bulgaria
• Sales presence in London, Washington DC, and Boston 
• Clients include BBC, AstraZeneca, US DoD, OUP, Wiley, Getty… 
• Over 400 person-years in R&D to create a one-stop shop for: 
– Content enrichment 
– Data management 
– Graph database engine 
• Open and standards-compliant technology:
– RDF(S), OWL, GATE, Sesame 
Interlinking Text and Data 
Semantic Annotation
[Figure: the article pmid:17714090 (Clinical and experimental pharmacology …, author Ian A Yang) annotated with linked concepts. Concept labels shown: Bronchial Diseases, Respiration Disorders, COPD, Chronic Obstructive Airway Diseases, Asthma. Identifiers shown: umls:C0035204, umls:C0006261, umls:C000496.]
Semantic Annotation 
Semantic Annotation goes far beyond 
tagging. It allows search using enrichment, 
linking and rules to return explicit and 
implicit results – complete intelligence. 
Content Enrichment
• Text Mining & Classification
• Curation
• Quality Monitoring
Data Management
• Ontologies and Semantic Annotation
• Web mining
• Identity Resolution
Graph Database
• Standards Based
• 24-7 Resiliency
• Hybrid Semantic Queries & Search
What is RDF Good for? 
• Metadata-based content management 
– Metadata represents a re-usable result of content analytics 
– It can be repurposed allowing for a wide range of applications 
– Most of the search engines do analytics, but the results are not 
explicit; so, they cannot be validated, refined and used by other 
applications 
• Linking text and structured data 
– Allows structured, uniform and efficient access to diverse domain 
models, taxonomies, dictionaries, reference databases 
• Reference data management 
– E.g. product catalogs and taxonomies that are too structured to be 
managed with NoSQL, but too diverse and interconnected for SQL 
• Using Linked Open Data (LOD)
– A growing amount of diverse public data can be used in enterprise
Knowledge Management applications
LOD: Growing Exponentially 
Linked Data datasets per year:
2007: 27, 2008: 43, 2009: 89, 2010: 162, 2011: 295, 2012: 822, 2013: 2,289
• July 2013 stats: 2,289 datasets (http://stats.lod2.eu/)
• Growing exponentially (see the dotted trend line) 
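The growth claim can be checked with a few lines of arithmetic. The sketch below, in Python, takes the dataset counts from this slide and computes the year-over-year growth factors and the compound annual growth rate. The helper names (`yearly_growth`, `average_annual_rate`) are illustrative and not part of the original deck.

```python
# Dataset counts per year, as read off the slide (2007-2013).
counts = {2007: 27, 2008: 43, 2009: 89, 2010: 162,
          2011: 295, 2012: 822, 2013: 2289}


def yearly_growth(series):
    """Return the year-over-year growth factor for each year after the first."""
    years = sorted(series)
    return {year: series[year] / series[prev]
            for prev, year in zip(years, years[1:])}


def average_annual_rate(series):
    """Compound annual growth rate between the first and the last year."""
    years = sorted(series)
    span = years[-1] - years[0]
    return (series[years[-1]] / series[years[0]]) ** (1 / span) - 1


growth = yearly_growth(counts)
rate = average_annual_rate(counts)

if __name__ == "__main__":
    for year, factor in growth.items():
        print(f"{year}: x{factor:.2f}")
    print(f"average annual growth: {rate:.0%}")
```

Every year-over-year factor is above 1.5, and the count roughly doubles each year on average, which is what an exponential trend looks like.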
How Does Inference Help? 
• Intelligent mapping of queries to data 
– This matters a lot when an application should query a dataset 
combined from 10+ sources, which evolve independently 
– There is no way an application developer can stay on top of all schemata
and all datasets, all the time 
• Finding patterns and inferring new relationships 
– Think of someone constantly looking for patterns that elicit new 
relations, which can match patterns that elicit other relations … 
– Or someone who goes deeper and deeper into finding new ways to 
rewrite a query, over and over again, until all alternatives are 
exhausted 
• Get deeper and more complete results
• Cheaper data integration, easier querying 
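As a rough illustration of the points above, the following Python sketch applies two RDFS-style rules to a tiny set of triples, over and over, until nothing new can be derived. It is a toy forward-chaining loop under stated assumptions, not how GraphDB or any production triplestore implements inference. The class and instance names (`Bank`, `FinancialInstitution`, `Organization`, `acme_bank`) are hypothetical.

```python
from typing import Set, Tuple

Triple = Tuple[str, str, str]


def infer(triples: Set[Triple]) -> Set[Triple]:
    """Apply two RDFS-style rules until no new triples appear (a fixpoint)."""
    closed = set(triples)
    while True:
        new = set()
        for (s, p, o) in closed:
            for (s2, p2, o2) in closed:
                if p2 != "subClassOf" or s2 != o:
                    continue
                if p == "subClassOf":
                    # Rule 1: subClassOf is transitive.
                    new.add((s, "subClassOf", o2))
                elif p == "type":
                    # Rule 2: an instance of a class is an instance of its superclasses.
                    new.add((s, "type", o2))
        if new <= closed:
            return closed
        closed |= new


def instances_of(triples: Set[Triple], cls: str) -> Set[str]:
    """A very simple 'query': all subjects typed with the given class."""
    return {s for (s, p, o) in triples if p == "type" and o == cls}


# Hypothetical data: the source only says that acme_bank is a Bank.
explicit = {
    ("Bank", "subClassOf", "FinancialInstitution"),
    ("FinancialInstitution", "subClassOf", "Organization"),
    ("acme_bank", "type", "Bank"),
}
inferred = infer(explicit)

if __name__ == "__main__":
    print(instances_of(explicit, "Organization"))
    print(instances_of(inferred, "Organization"))
```

A query for all organizations finds nothing in the explicit data but finds `acme_bank` once inference has run. The application developer never had to know that this particular source models banks two levels below `Organization`.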
Ontotext Technology Portfolio 
BBC: The Perfect Application 
Since 2000, semantic technology had been striving for:
• Pertinent applications, a really good use case
• Real high-profile projects to prove its maturity
The “Dynamic Semantic Publishing” architecture
implemented by the BBC for its FIFA World Cup 2010
website filled this gap!
It demonstrates:
• How an RDF database serves well where RDBMSs fail
• How text-mining and triplestores complement one another 
• How inference adds value at a decent scale 
• 24/7 live operation that cannot work without a functional triplestore 
Ontotext and BBC 
Profile 
• Mass media broadcaster founded in 1922 
• 23,000 employees and over 5 billion 
pounds in annual revenue. 
Goals 
• Create a dynamic semantic publishing 
platform that assembled web pages on-the-fly 
using a variety of data sources 
• Deliver highly relevant data to web site 
visitors with sub-second response 
Challenges 
• BBC journalists author and publish content
which is then statically rendered. The
costs and time to do this were high.
• Diverse content was difficult to navigate,
content re-use was not flexible
• User experience needed to be improved
with relevant content
"The goal is to be able to more easily and 
accurately aggregate content, find it and 
share it across many sources. From these 
simple relationships and building blocks you 
can dynamically build up incredibly rich sites 
and navigation on any platform." 
John O’Donovan 
Chief Technical Architect 
BBC: The Perfect Application (ctd) 
• The BBC’s FIFA World Cup 2010 project was widely
recognized as the best showcase for SemTech 
– It used OWLIM as a triplestore (chosen after a thorough evaluation) 
– It triggered a wave of adoption of the technology 
• The next milestone: London 2012 Olympic Games 
– The two most important websites used the DSP architecture: the 
official one, operated by Press Association, and the one of the BBC 
– Ontotext text-mining technology was used for content enrichment 
• Four years later this application pattern is still the 
best use case 
– And there are still no other triplestores that can survive such load, 
judging by the LDBC Semantic Publishing Benchmark, public 
information and feedback from the industry 
Ontotext and AstraZeneca 
Profile 
• Global biopharma company
• $28 billion in sales in 2012 
• $4 billion in R&D across three continents 
Goals 
• Efficient design of new clinical studies 
• Quick access to all of the data 
• Improved evidence based decision-making 
• Strengthen the knowledge feedback loop 
• Enable predictive science 
Challenges 
• Over 7,000 studies and 23,000 documents
are difficult to obtain
• Searches returning 1,000 – 10,000 results 
• Document repositories not designed for 
reuse 
• Tedious process to arrive at evidence 
based decisions 
Context-based Disambiguation 
Ontotext and LMI 
Profile 
• Established in 1961 to enable federal
agencies
• Specializes in logistics, financial, 
infrastructure & information management 
Goals 
• Unlock large collections of complex 
documents 
• Improve analyst productivity 
• Create an application they can sell to US 
Federal agencies 
Challenges 
• Analysts taking hours to find, download 
and search documents, using inaccurate 
keyword searches 
• Needed a knowledge base to search 
quickly and guide the analysts – highly 
relevant searches 
• Extracts knowledge from collection of 
documents 
• Uses GraphDB to intuitively search and filter 
• Knowledge base used to suggest searches 
• Hyper speed performance 
• Huge savings in analyst time 
• Accurate results 
Some of our clients
[Figure: client logos; one is captioned "The most popular financial newspaper"]
Publishing and Media Solution 
Solution Features 
• Dedicated solutions for media and publishing 
• Based on the Ontotext Semantic Platform 
• Mature implementation and continuous adaptation 
methodology 
• Introducing advanced features to the authoring, 
editorial and publishing phases of content and data 
workflows 
Methodology
Architecture Overview 
Authoring 
Related assets – as you type 
Related entities and concepts 
Entity profiles and facts on the fly 
Create higher value content at the same cost 
Contextual Authoring 
Curation 
Continuous adaptation through editorial feedback 
Query driven publishing templates 
Dynamic re-purposing and reuse 
New publishing products with the same content 
Example of Client Integrated Curation 
Example Curation Tool: Press Association
Monitoring and Curation Tool
Continuous Adaptation 
Publishing 
Dynamic construction of products (e.g. topic 
pages) 
Personalized content streams 
Semantics driven trend and user analytics 
Behavior driven personal asset streams 
User Behavior Tracking
[Figure: user behavior model. Labels shown: perform; Action (comments, votes, posts, preview, read); Article; Search (FTS query, tag, category, date); Result; "contains" and "leads to" links between search, results, preview and read; tag set; category taxonomy; Search Log.]
Personalized Recommendations
[Figure: labels shown: User Profile; Behavioural and Contextual Similarity; Reads]
Methodology 
Complete Domain Ontology 
Example KB for 50 daily publications 
Methodology 
Design of Machine Learning Pipeline 
Discovering Suspicious Relationships 
What It Takes to Make It Work?
• Have a database of locations, with part-of info
• Have a database with companies, with dependencies
• Define semantics for the relevant relationships:
– sub-region and control are transitive relationships
– located-in is transitive over sub-region
• Define the semantics of suspicious relationships
# The prefixes ptop:, fibo: and my: are assumed to be declared elsewhere.
CONSTRUCT { ?orgA my:suspiciousLink ?orgB } WHERE {
  # A is located in x and controls some entity y
  ?orgA ptop:locatedIn ?x ; fibo:controls ?y .
  # y controls B and is itself located in z
  ?y fibo:controls ?orgB ; ptop:locatedIn ?z .
  # B is located in the same place x as A
  ?orgB ptop:locatedIn ?x .
  # z, where the intermediary y sits, is an offshore zone
  ?z a ptop:OffshoreZone .
}
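To make the role of inference in the query above concrete, here is a minimal Python sketch of the same pattern over in-memory sets of pairs. It is an illustration under stated assumptions, not how a triplestore evaluates SPARQL. All organization and place names (`AlphaCorp`, `HoldCo`, `ShellCo`, `BetaLtd`, `CityA`, `CountryA`, `IslandZ`) and both function names are hypothetical.

```python
def transitive_closure(pairs):
    """Return the transitive closure of a binary relation given as a set of pairs."""
    closure = set(pairs)
    while True:
        extra = {(a, d) for (a, b) in closure for (c, d) in closure if b == c}
        if extra <= closure:
            return closure
        closure |= extra


def suspicious_links(sub_region, located_in, controls, offshore_zones):
    """Mirror the CONSTRUCT pattern: A and B share a location, and A controls B
    through an intermediary located in an offshore zone."""
    region = transitive_closure(sub_region)      # sub-region is transitive
    control = transitive_closure(controls)       # control is transitive
    # located-in is transitive over sub-region: if X is located in R and
    # R is a sub-region of S, then X is also located in S.
    located = set(located_in) | {
        (x, s) for (x, r) in located_in for (r2, s) in region if r == r2
    }
    places = {}
    for org, place in located:
        places.setdefault(org, set()).add(place)

    links = set()
    for org_a, y in control:
        if not (places.get(y, set()) & offshore_zones):
            continue  # the intermediary must be located in an offshore zone
        for y2, org_b in control:
            if y2 == y and places.get(org_a, set()) & places.get(org_b, set()):
                links.add((org_a, org_b))
    return links


# Hypothetical data set
sub_region = {("CityA", "CountryA")}
located_in = {("AlphaCorp", "CityA"), ("BetaLtd", "CountryA"), ("ShellCo", "IslandZ")}
controls = {("AlphaCorp", "HoldCo"), ("HoldCo", "ShellCo"), ("ShellCo", "BetaLtd")}
offshore_zones = {"IslandZ"}

links = suspicious_links(sub_region, located_in, controls, offshore_zones)

if __name__ == "__main__":
    print(links)
```

In this data the link between `AlphaCorp` and `BetaLtd` appears only because of the two inferred facts: `AlphaCorp` controls `ShellCo` indirectly through `HoldCo`, and `AlphaCorp` is located in `CountryA` because `CityA` is a sub-region of it. Passing an empty `sub_region` set yields no links.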
Use Cases 
• Investigating networks of linked entities 
– As a prerequisite for risk assessment and compliance research
• Risk assessment 
– Tracing information about suspicious entities 
– Identifying risk-indicators across multiple sources 
– Identifying risks related to linked entities 
– Determining exposure against a group of linked entities 
• Compliance-related research 
– Fraud detection, insider trading, etc. 
• Searching in large policies and regulations
– See Open Policy
How: Semantic BI/Data-warehouses 
• Imagine an integrated database that allows querying
across siloed databases
– E.g. bond market data vs. risk assessment vs. equity markets vs. M&A 
– A lot of duplicate data across various databases in different 
departments of banks, and data is simply not linked or organized in a 
unified data model 
• Benefits compared to the mainstream technology: 
– Lower cost of development and maintenance; 
– Direct benefit from industry standards, using inference 
– Real-time updates, unlike traditional data-warehousing, where 
updates should often be scheduled overnight 
– Support for a wide variety of analytical queries, which are far more 
flexible than traditional approaches 
Ontotext and top 3 Business Media 
Profile 
• Top 3 business media 
• Focused both on B2C publishing and B2B
services
Goals 
• Create a horizontal platform for both data 
and content based on semantics and serve 
all functionality through it 
Challenges 
• Critical part of the entire workflow 
• Multiple development projects in parallel
with up to 2 months between
inception and go-live
• GraphDB used not only for data, but for 
content storage as well 
• Horizontal platform with focus on 
organizations, people, GPEs and relations 
between them 
• Automatic extraction of all these concepts 
and relationships 
• Separate stream of work for a user behavior 
based recommendation of relevant content 
and data across the entire media 
Reference Projects: BCA/Euromoney 
• BCA/Euromoney Macroeconomics Reports 
– Implementation of the Euromoney Semantic Platform 
• Automatically generate metadata about: 
– Markets, geo-political entities, economies, currencies, indicators, indices; 
– Themes of the report; 
– Economic and market conditions; 
– Views of the economist with horizon, focus of the view, and prediction; 
– Suggested trades of tradable objects (bonds, commodities, equities). 
• Semantic indices powering various services: 
– Live Charts – serving macroeconomic charts with the possibility to add
additional data series/indices;
– Macroeconomists' dashboard of views with their objects, sentiment,
horizon, and agreement/disagreement.
Ontotext and Euromoney 
Profile 
• Euromoney Institutional Investor PLC, the 
international online information and events 
group 
Goals 
• Create a horizontal platform to serve 100
different publications
• Create a new publishing and information
platform which would include the latest
authoring, storing, and display technologies,
including semantic annotation, search and a
triplestore repository
Challenges 
• Different domains covered 
• Sophisticated content analytics incl.
relation, template and scenario extraction
• Analytics of reports and news of various 
domains 
• Extraction of sophisticated macroeconomic
views on markets and market conditions;
trades, condition and trade horizons, assets, 
asset allocations, etc. 
• Multi-faceted search 
• Completely new content and data 
infrastructure 
Wrap up 
• Ontotext has a full stack of semantic technologies 
• Triplestores combine the strengths of NoSQL and SQL
• Inference fosters discovery in diverse dynamic data
• GraphDB is in a league of its own:
– Standards-compliant – comprehensive support for OWL and SPARQL
– Efficient inference through the entire life-cycle of the data
– High-availability cluster architecture – proven and mature
– FTS and NoSQL Connectors for seamless integration 
• End-to-end solution for Media and Publishing 
– Authoring, curation and publishing through adaptive text-mining 
• All the above proven with industry leaders 
Semantic Technology in Publishing & Finance Oct 2014 #50

Contenu connexe

Tendances

Semantic E-Commerce - Use Cases in Enterprise Web Applications
Semantic E-Commerce - Use Cases in Enterprise Web ApplicationsSemantic E-Commerce - Use Cases in Enterprise Web Applications
Semantic E-Commerce - Use Cases in Enterprise Web ApplicationsLinked Enterprise Date Services
 
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...semanticsconference
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Cambridge Semantics
 
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...Kerstin Diwisch | Towards a holistic visualization management for knowledge g...
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...semanticsconference
 
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprisesemanticsconference
 
Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big DataMarin Dimitrov
 
Linked data the next 5 years - From Hype to Action
Linked data the next 5 years - From Hype to ActionLinked data the next 5 years - From Hype to Action
Linked data the next 5 years - From Hype to ActionAndreas Blumauer
 
Five creative search solutions using text analytics
Five creative search solutions using text analyticsFive creative search solutions using text analytics
Five creative search solutions using text analyticsEnterprise Knowledge
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyOntotext
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata managementOpen Data Support
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Dr. Haxel Consult
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataSemantic Web Company
 
Data Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementData Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementRENDER project
 
Semantics in Financial Services -David Newman
Semantics in Financial Services -David NewmanSemantics in Financial Services -David Newman
Semantics in Financial Services -David NewmanPeter Berger
 
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...semanticsconference
 
MongoDB Case Study in Healthcare
MongoDB Case Study in HealthcareMongoDB Case Study in Healthcare
MongoDB Case Study in HealthcareMongoDB
 
Delivering a Linked Data warehouse and realising the power of graphs
Delivering a Linked Data warehouse and realising the power of graphsDelivering a Linked Data warehouse and realising the power of graphs
Delivering a Linked Data warehouse and realising the power of graphsBen Gardner
 
II-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data ExplorationII-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data ExplorationDr. Haxel Consult
 

Tendances (20)

Semantic E-Commerce - Use Cases in Enterprise Web Applications
Semantic E-Commerce - Use Cases in Enterprise Web ApplicationsSemantic E-Commerce - Use Cases in Enterprise Web Applications
Semantic E-Commerce - Use Cases in Enterprise Web Applications
 
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...David Kuilman | Creating a Semantic Enterprise Content model to support conti...
David Kuilman | Creating a Semantic Enterprise Content model to support conti...
 
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
Transforming Data Management and Time to Insight with Anzo Smart Data Lake®
 
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...Kerstin Diwisch | Towards a holistic visualization management for knowledge g...
Kerstin Diwisch | Towards a holistic visualization management for knowledge g...
 
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for EnterpriseChalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
Chalitha Perera | Cross Media Concept and Entity Driven Search for Enterprise
 
Semantic Technologies for Big Data
Semantic Technologies for Big DataSemantic Technologies for Big Data
Semantic Technologies for Big Data
 
Sebastian Hellmann
Sebastian HellmannSebastian Hellmann
Sebastian Hellmann
 
Linked data the next 5 years - From Hype to Action
Linked data the next 5 years - From Hype to ActionLinked data the next 5 years - From Hype to Action
Linked data the next 5 years - From Hype to Action
 
Five creative search solutions using text analytics
Five creative search solutions using text analyticsFive creative search solutions using text analytics
Five creative search solutions using text analytics
 
Gaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive TechnologyGaining Advantage in e-Learning with Semantic Adaptive Technology
Gaining Advantage in e-Learning with Semantic Adaptive Technology
 
Introduction to metadata management
Introduction to metadata managementIntroduction to metadata management
Introduction to metadata management
 
Linked Open Data in the World of Patents
Linked Open Data in the World of Patents Linked Open Data in the World of Patents
Linked Open Data in the World of Patents
 
PROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked DataPROPEL . Austrian's Roadmap for Enterprise Linked Data
PROPEL . Austrian's Roadmap for Enterprise Linked Data
 
Data Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data ManagementData Collection and Integration, Linked Data Management
Data Collection and Integration, Linked Data Management
 
Metadata in Business Intelligence
Metadata in Business IntelligenceMetadata in Business Intelligence
Metadata in Business Intelligence
 
Semantics in Financial Services -David Newman
Semantics in Financial Services -David NewmanSemantics in Financial Services -David Newman
Semantics in Financial Services -David Newman
 
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...
Joe Pairman | Multiplying the Power of Taxonomy with Granular, Structured Con...
 
MongoDB Case Study in Healthcare
MongoDB Case Study in HealthcareMongoDB Case Study in Healthcare
MongoDB Case Study in Healthcare
 
Delivering a Linked Data warehouse and realising the power of graphs
Delivering a Linked Data warehouse and realising the power of graphsDelivering a Linked Data warehouse and realising the power of graphs
Delivering a Linked Data warehouse and realising the power of graphs
 
II-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data ExplorationII-SDV 2012 Towards Unified Access Systems for Data Exploration
II-SDV 2012 Towards Unified Access Systems for Data Exploration
 

Similaire à Semantic Technology in Publishing & Finance

Webinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in PublishingWebinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in PublishingOntotext
 
Values & Vision - Cloud Sandboxes for BIG Earth Sciences
Values & Vision - Cloud Sandboxes for BIG Earth SciencesValues & Vision - Cloud Sandboxes for BIG Earth Sciences
Values & Vision - Cloud Sandboxes for BIG Earth Sciencesterradue
 
On-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudOn-Demand RDF Graph Databases in the Cloud
On-Demand RDF Graph Databases in the CloudMarin Dimitrov
 
Summary of Day 1
Summary of Day 1Summary of Day 1
Summary of Day 1Europeana
 
Neo4j PartnerDay Amsterdam 2017
Neo4j PartnerDay Amsterdam 2017Neo4j PartnerDay Amsterdam 2017
Neo4j PartnerDay Amsterdam 2017Neo4j
 
Community Management
Community ManagementCommunity Management
Community ManagementLee Schlenker
 
Martin Voigt | Streaming-based Text Mining using Deep Learning and Semantics
Martin Voigt | Streaming-based Text Mining using Deep Learning and SemanticsMartin Voigt | Streaming-based Text Mining using Deep Learning and Semantics
Martin Voigt | Streaming-based Text Mining using Deep Learning and Semanticssemanticsconference
 
Streaming-based Text Mining using Deep Learning and Semantics
Streaming-based Text Mining using Deep Learning and SemanticsStreaming-based Text Mining using Deep Learning and Semantics
Streaming-based Text Mining using Deep Learning and SemanticsLinked Enterprise Date Services
 
Ontos NLP Stack, Sep. 2016
Ontos NLP Stack, Sep. 2016Ontos NLP Stack, Sep. 2016
Ontos NLP Stack, Sep. 2016Martin Voigt
 
Easy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance ProfessionalEasy SPARQLing for the Building Performance Professional
Easy SPARQLing for the Building Performance ProfessionalMartin Kaltenböck
 
RDF Presentation
RDF PresentationRDF Presentation
RDF Presentationjonathan959
 
Text Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceText Analytics & Linked Data Management As-a-Service
Text Analytics & Linked Data Management As-a-ServiceMarin Dimitrov
 
Enterprise Semantic Technology
Enterprise Semantic TechnologyEnterprise Semantic Technology
Enterprise Semantic TechnologyThomas Kelly, PMP
 
22 May 2014 CDE competition: Information processing and sensemaking presentation
22 May 2014 CDE competition: Information processing and sensemaking presentation22 May 2014 CDE competition: Information processing and sensemaking presentation
22 May 2014 CDE competition: Information processing and sensemaking presentationDefence and Security Accelerator
 
The Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data FutureThe Canadian Linked Data Initiative: Charting a Path to a Linked Data Future
The Canadian Linked Data Initiative: Charting a Path to a Linked Data FutureNASIG
 
Atlas Ad-Serving Server - Advertising at scale at Facebook
Atlas Ad-Serving Server - Advertising at scale at FacebookAtlas Ad-Serving Server - Advertising at scale at Facebook
Atlas Ad-Serving Server - Advertising at scale at FacebookTrieu Nguyen
 
EPrints Update, Les Carr, University of Southampton
EPrints  Update, Les Carr, University of SouthamptonEPrints  Update, Les Carr, University of Southampton
EPrints Update, Les Carr, University of SouthamptonRepository Fringe
 
Smart cities no ai without ia
Smart cities   no ai without iaSmart cities   no ai without ia
Smart cities no ai without iaFredric Landqvist
 
BDVA i Spaces - What they are, how to become one, value and collaborations
BDVA i Spaces - What they are, how to become one, value and collaborationsBDVA i Spaces - What they are, how to become one, value and collaborations
BDVA i Spaces - What they are, how to become one, value and collaborationsBig Data Value Association
 

Similaire à Semantic Technology in Publishing & Finance (20)

Webinar: Metadata Enrichment in Publishing
Webinar: Metadata Enrichment in PublishingWebinar: Metadata Enrichment in Publishing
Semantic Technology in Publishing & Finance

  • 1. Semantic Technology in Publishing & Finance
      Triplestores and inference, applications in Finance, the GraphDB engine, text-mining, projects and solutions for financial media and publishers
      Keystone Industrial Panel
      ISWC 2014, Riva del Garda, 18 Oct 2014
  • 2. Outline
      • Introduction to Ontotext
      • Clients, cases
      • Text Mining, Media and Publishing Solution
      • SemTech applications in Finance
      • Wrap-up
  • 3. Ontotext
      • Information management company providing text analysis, data management and state-of-the-art semantic technology
      • 75 employees, headquartered in Sofia, Bulgaria
      • Sales presence in London, Washington DC, and Boston
      • Clients include BBC, AstraZeneca, US DoD, OUP, Wiley, Getty…
      • Over 400 person-years in R&D to create a one-stop shop for:
          – Content enrichment
          – Data management
          – Graph database engine
      • Open and standard compliant technology:
          – RDF(S), OWL, GATE, Sesame
  • 4. Interlinking Text and Data
  • 5. Semantic Annotation
      [Diagram labels: Bronchial Diseases; pmid:17714090; Clinical and experimental pharmacology …; umls:C0035204; COPD; Respiration Disorders; umls:C0006261; Chronic Obstructive Airway Diseases; Asthma; umls:C000496; Ian A Yang]
  • 6. Semantic Annotation
      Semantic Annotation goes far beyond tagging. It allows search using enrichment, linking and rules to return explicit and implicit results – complete intelligence.
      Content Enrichment
      • Text Mining & Classification
      • Curation
      • Quality Monitoring
      Data Management
      • Ontologies and Semantic Annotation
      • Web mining
      • Identity Resolution
      Graph Database
      • Standards Based
      • 24-7 Resiliency
      • Hybrid Semantic Queries & Search
  • 7. What is RDF Good for?
      • Metadata-based content management
          – Metadata represents a re-usable result of content analytics
          – It can be repurposed, allowing for a wide range of applications
          – Most search engines do analytics, but the results are not explicit, so they cannot be validated, refined and used by other applications
      • Linking text and structured data
          – Allows structured, uniform and efficient access to diverse domain models, taxonomies, dictionaries, reference databases
      • Reference data management
          – E.g. product catalogs and taxonomies that are too structured to be managed with NoSQL, but too diverse and interconnected for SQL
      • Using open linked data (LOD)
          – A growing amount of diverse public data can be used in enterprise Knowledge Management applications
  • 8. LOD: Growing Exponentially
      Linked Data Datasets
        Year      2007  2008  2009  2010  2011  2012  2013
        Datasets    27    43    89   162   295   822  2,289
      • July 2013 stats: 2,289 datasets (http://stats.lod2.eu/)
      • Growing exponentially (see the dotted trend line)
  • 9. How Does Inference Help?
      • Intelligent mapping of queries to data
          – This matters a lot when an application should query a dataset combined from 10+ sources, which evolve independently
          – There is no way an application developer can stay on top of all schemata and all datasets, all the time
      • Finding patterns and inferring new relationships
          – Think of someone constantly looking for patterns that elicit new relations, which can match patterns that elicit other relations …
          – Or someone who goes deeper and deeper into finding new ways to rewrite a query, over and over again, until all alternatives are exhausted
      • Get deeper and more complete results
      • Cheaper data integration, easier querying
  • 10. Ontotext Technology Portfolio
  • 11. Outline
      • Introduction to Ontotext
      • Clients, cases
      • Text Mining, Media and Publishing Solution
      • SemTech applications in Finance
      • Wrap-up
  • 12. BBC: The Perfect Application
      Since 2000, semantic technology had been striving for:
      • Pertinent applications, a really good use case
      • Real high-profile projects to prove its maturity
      The “Dynamic Semantic Publishing” architecture implemented by the BBC for its FIFA World Cup 2010 website filled this gap! It demonstrates:
      • How an RDF database serves well where RDBMSs fail
      • How text-mining and triplestores complement one another
      • How inference adds value at a decent scale
      • 24/7 live operation that cannot work without a functional triplestore
  • 13. Ontotext and BBC
      Profile
      • Mass media broadcaster founded in 1922
      • 23,000 employees and over 5 billion pounds in annual revenue
      Goals
      • Create a dynamic semantic publishing platform that assembles web pages on the fly using a variety of data sources
      • Deliver highly relevant data to website visitors with sub-second response
      Challenges
      • BBC journalists author and publish content which is then statically rendered. The costs and time to do this were high.
      • Diverse content was difficult to navigate, content re-use was not flexible
      • User experience needed to be improved with relevant content
      "The goal is to be able to more easily and accurately aggregate content, find it and share it across many sources. From these simple relationships and building blocks you can dynamically build up incredibly rich sites and navigation on any platform."
      John O’Donovan, Chief Technical Architect
  • 14. BBC: The Perfect Application (ctd)
      • The BBC’s FIFA World Cup 2010 project was widely recognized as the best showcase for SemTech
          – It used OWLIM as a triplestore (chosen after a thorough evaluation)
          – It triggered a wave of adoption of the technology
      • The next milestone: London 2012 Olympic Games
          – The two most important websites used the DSP architecture: the official one, operated by Press Association, and the one of the BBC
          – Ontotext text-mining technology was used for content enrichment
      • Four years later this application pattern is still the best use case
          – And there are still no other triplestores that can survive such load, judging by the LDBC Semantic Publishing Benchmark, public information and feedback from the industry
  • 15. Ontotext and AstraZeneca
      Profile
      • Global bio-pharma company
      • $28 billion in sales in 2012
      • $4 billion in R&D across three continents
      Goals
      • Efficient design of new clinical studies
      • Quick access to all of the data
      • Improved evidence-based decision-making
      • Strengthen the knowledge feedback loop
      • Enable predictive science
      Challenges
      • Over 7,000 studies and 23,000 documents are difficult to obtain
      • Searches returning 1,000 – 10,000 results
      • Document repositories not designed for reuse
      • Tedious process to arrive at evidence-based decisions
  • 16. Context-based Disambiguation
  • 17. Ontotext and LMI
      Profile
      • Established in 1961 to enable federal agencies
      • Specializes in logistics, financial, infrastructure & information management
      Goals
      • Unlock large collections of complex documents
      • Improve analyst productivity
      • Create an application they can sell to US Federal agencies
      Challenges
      • Analysts taking hours to find, download and search documents, using inaccurate keyword searches
      • Needed a knowledge base to search quickly and guide the analysts – highly relevant searches
      • Extracts knowledge from a collection of documents
      • Uses GraphDB to intuitively search and filter
      • Knowledge base used to suggest searches
      • Hyper speed performance
      • Huge savings in analyst time
      • Accurate results
  • 18. Some of our clients
      The most popular financial newspaper
  • 19. Outline
      • Introduction to Ontotext
      • Clients, cases
      • Text Mining, Media and Publishing Solution
      • SemTech applications in Finance
      • Wrap-up
  • 20. Publishing and Media Solution
  • 21. Solution Features
      • Dedicated solutions for media and publishing
      • Based on the Ontotext Semantic Platform
      • Mature implementation and continuous adaptation methodology
      • Introducing advanced features to the authoring, editorial and publishing phases of content and data workflows
  • 22. Methodology
  • 23. Architecture Overview
  • 24. Authoring
      • Related assets – as you type
      • Related entities and concepts
      • Entity profiles and facts on the fly
      • Create higher value content at the same cost
  • 25. Contextual Authoring
  • 26. Curation
      • Continuous adaptation through editorial feedback
      • Query driven publishing templates
      • Dynamic re-purposing and reuse
      • New publishing products with the same content
  • 27. Example of Client Integrated Curation
  • 28. Example Curation Tool: PressAssociation
  • 29. Monitoring and Curation
      Curation Tool
  • 30. Continuous Adaptation
  • 31. Publishing
      • Dynamic construction of products (e.g. topic pages)
      • Personalized content streams
      • Semantics driven trend and user analytics
      • Behavior driven personal asset streams
  • 32. User Behavior Tracking
      [Diagram labels: perform; comments; votes; posts; preview; read; Article; contains; leads to read; leads to preview; Search; Action; Result; Date; FTS Q.; Tag; Cat; Tag set; results; cat taxonomy; Search Log]
  • 33. Personalized Recommendations
      [Diagram labels: User Profile; Behavioural and Contextual Similarity; Reads]
  • 34. Methodology
  • 35. Methodology
  • 36. Methodology
  • 37. Methodology
  • 38. Complete Domain Ontology
  • 39. Example KB for 50 daily publications
  • 40. Methodology
  • 41. Design of Machine Learning Pipeline
  • 42. Outline
      • Introduction to Ontotext
      • Clients, cases
      • Text Mining, Media and Publishing Solution
      • SemTech applications in Finance
      • Wrap-up
  • 43. Discovering Suspicious Relationships
  • 44. What It Takes to Make It Work?
      • Have a database of locations, with part-of info
      • Have a database with companies, with dependencies
      • Define semantics for the relevant relationships:
          – sub-region and control are transitive relationships
          – Located-in is transitive over sub-region
      • Define the semantics of suspicious relationships

        CONSTRUCT { ?orgA my:suspiciousLink ?orgB }
        WHERE {
          ?orgA ptop:locatedIn ?x ;
                fibo:controls ?y .
          ?y fibo:controls ?orgB ;
             ptop:locatedIn ?z .
          ?orgB ptop:locatedIn ?x .
          ?z a ptop:OffshoreZone .
        }
  • 45. Use Cases
      • Investigating networks of linked entities
          – As prerequisite for risk assessment and compliance research
      • Risk assessment
          – Tracing information about suspicious entities
          – Identifying risk-indicators across multiple sources
          – Identifying risks related to linked entities
          – Determining exposure against a group of linked entities
      • Compliance-related research
          – Fraud detection, insider trading, etc.
      • Searching in large policies and regulations
          – See Open Policy
  • 46. How: Semantic BI/Data-warehouses
      • Imagine an integrated database that allows querying across siloed databases
          – E.g. bond market data vs. risk assessment vs. equity markets vs. M&A
          – A lot of duplicate data across various databases in different departments of banks, and data is simply not linked or organized in a unified data model
      • Benefits compared to the mainstream technology:
          – Lower cost of development and maintenance
          – Direct benefit from industry standards, using inference
          – Real-time updates, unlike traditional data-warehousing, where updates should often be scheduled overnight
          – Support for a wide variety of analytical queries, which are far more flexible than traditional approaches
  • 47. Ontotext and top 3 Business Media
      Profile
      • Top 3 business media
      • Focused both on B2C publishing and B2B services
      Goals
      • Create a horizontal platform for both data and content based on semantics and serve all functionality through it
      Challenges
      • Critical part of the entire workflow
      • Multiple development projects in parallel with up to 2 months between inception and go-live
      • GraphDB used not only for data, but for content storage as well
      • Horizontal platform with focus on organizations, people, GPEs and relations between them
      • Automatic extraction of all these concepts and relationships
      • Separate stream of work for a user behavior based recommendation of relevant content and data across the entire media
  • 48. Reference Projects: BCA/Euromoney
      • BCA/Euromoney Macroeconomics Reports
          – Implementation of the Euromoney Semantic Platform
      • Automatically generate metadata about:
          – Markets, geo-political entities, economies, currencies, indicators, indices
          – Themes of the report
          – Economic and market conditions
          – Views of the economist with horizon, focus of the view, and prediction
          – Suggested trades of tradable objects (bonds, commodities, equities)
      • Semantic indices powering various services:
          – Live Charts: serving macroeconomic charts with the possibility to add additional data series/indices
          – Macroeconomists dashboard of views with their objects, sentiment, horizon, and agreement/disagreement
  • 49. Ontotext and Euromoney
      Profile
      • Euromoney Institutional Investor PLC, the international online information and events group
      Goals
      • Create a horizontal platform to serve 100 different publications
      • Create a new publishing and information platform that includes the latest authoring, storing, and display technologies, including semantic annotation, search and a triple store repository
      Challenges
      • Different domains covered
      • Sophisticated content analytics incl. relation, template and scenario extraction
      • Analytics of reports and news of various domains
      • Extraction of sophisticated macroeconomic views on markets and market conditions; trades, condition and trade horizons, assets, asset allocations, etc.
      • Multi-faceted search
      • Completely new content and data infrastructure
  • 50. Wrap up
      • Ontotext has a full stack of semantic technologies
      • Triplestores combine the strengths of NoSQL and SQL
      • Inference fosters discovery in diverse dynamic data
      • GraphDB is in a league of its own:
          – Standard compliant – comprehensive support for OWL and SPARQL
          – Efficient inference through the entire life-cycle of the data
          – High-availability cluster architecture – proven and mature
          – FTS and NoSQL Connectors for seamless integration
      • End-to-end solution for Media and Publishing
          – Authoring, curation and publishing through adaptive text-mining
      • All the above proven with industry leaders
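
The inference rules and the CONSTRUCT pattern from slide 44 can be tried out on a toy dataset. The following is a minimal sketch in plain Python, not how GraphDB evaluates rules: the predicate names (`controls`, `subRegionOf`, `locatedIn`, `type`) are simplified stand-ins for the `fibo:` and `ptop:` terms, the companies and places are invented for illustration, and the naive fixpoint loop is an assumption chosen for readability.

```python
# Toy forward-chaining inference plus the slide-44 "suspicious link" pattern.
# All entity names below are hypothetical sample data.

TRANSITIVE = {"controls", "subRegionOf"}

TRIPLES = [
    ("Sofia", "subRegionOf", "Bulgaria"),
    ("AcmeBG", "locatedIn", "Sofia"),
    ("AcmeBG", "controls", "ShellCo"),
    ("ShellCo", "locatedIn", "Cayman"),
    ("Cayman", "type", "OffshoreZone"),
    ("ShellCo", "controls", "Holding"),
    ("Holding", "controls", "TargetBG"),
    ("TargetBG", "locatedIn", "Bulgaria"),
]


def infer(triples):
    """Apply the two rule kinds from the slide until nothing new is derived."""
    facts = set(triples)
    while True:
        new = set()
        for s, p, o in facts:
            for s2, p2, o2 in facts:
                if o != s2:
                    continue
                # Rule 1: controls and subRegionOf are transitive.
                if p == p2 and p in TRANSITIVE:
                    new.add((s, p, o2))
                # Rule 2: locatedIn is transitive over subRegionOf.
                if p == "locatedIn" and p2 == "subRegionOf":
                    new.add((s, "locatedIn", o2))
        if new <= facts:
            return facts
        facts |= new


def suspicious_links(facts):
    """Match the graph pattern of the CONSTRUCT query against a set of facts."""
    located = {(s, o) for s, p, o in facts if p == "locatedIn"}
    controls = {(s, o) for s, p, o in facts if p == "controls"}
    offshore = {s for s, p, o in facts if p == "type" and o == "OffshoreZone"}
    links = set()
    for org_a, y in controls:
        # ?y must be located in some offshore zone ?z.
        if not any(who == y and z in offshore for who, z in located):
            continue
        for y2, org_b in controls:
            if y2 != y:
                continue
            # ?orgA and ?orgB must share a location ?x.
            for who, x in located:
                if who == org_a and (org_b, x) in located:
                    links.add((org_a, "suspiciousLink", org_b))
    return links


if __name__ == "__main__":
    print(suspicious_links(set(TRIPLES)))    # explicit facts only
    print(suspicious_links(infer(TRIPLES)))  # with inferred facts
```

On this sample data the pattern matches nothing against the explicit facts alone. The link between AcmeBG and TargetBG only appears once the transitive control chain and the Sofia-in-Bulgaria location have been inferred, which is the point slide 9 makes about deeper and more complete results.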

Editor's notes

  1. AstraZeneca is a world leader in bio-pharmaceuticals with over $28 billion in revenue and $4 billion invested in R&D. Innovation and creating great medicines are core values. To continue on their path to success, they needed to design new clinical studies efficiently through instant access to all of their data. By doing this, they could improve evidence-based decision making and create a knowledge feedback loop through accessing historical clinical studies and related documents. But with over 7,000 studies and 23,000 documents, the volume of unstructured data was overwhelming them. They would search for what they thought were relevant results and instead get anywhere between 1,000 and 10,000 results that needed review. The fact is their document repository was not designed for reuse. They had not extracted the meaning they needed from their documents. They had not created reusable metadata allowing for the search results to be highly targeted. The tedious process of document review was slowing down the innovation engine. They needed to reduce the onerous manual effort, gain complete visibility, decrease the time to locate knowledge and arrive at instant analytical results. [click to build the slide] Ontotext was able to extract meaning from the unstructured documents and optimize their knowledge repository for flexible semantic searches that leveraged newly created metadata. The data was stored in a way that optimized it for navigation and retrieval. To do this, we used vocabularies for drugs and biomarkers while also creating a master database linking all the information together in OWL. We indexed all of the disambiguated data, allowing users to find targeted results from context-based search. In the end this created less patient and regulatory risk, used fewer resources to locate and analyze clinical data and enhanced the innovation process, allowing AZ to create new drugs faster.
All of their goals were met and they are using this system in production today.
  2. To add steps 1, 2 and 3 and so on. Make it more interactive. Add explanation to the slide
  3. LMI was founded in 1961 to provide solutions to enable federal government agencies. Specializes in logistics, acquisition and financial management, infrastructure management, information management and policy and program support.
  4. Publishing solution covering 3 main phases of publishers' workflows
  5. Typical pipeline structure including gazetteers, rules, ML for tagging and disambiguation; RDF output
  6. Main layers of the architecture, including data, analytics and UIs
  7. Contextual authoring is the key message
  8. A wireframe for typical contextual recommendations of data and content
  9. Curation as a part of the editorial process. Emphasis is on quality control and adaptation of the text analytics, but also on monitoring and instance data curation.
  10. Client annotation/curation tool
  11. Another example for annotation and disambiguation tool based on our APIs
  12. Ontotext curation and monitoring module of annotation jobs
  13. Continuous adaptation of the text analytics modules / concept extraction services. Initial creation of gold standards and then adapting the ML models according to new examples and editorial feedback.
  14. Main focus is on dynamic publishing of content based on metadata templates and on personalization driven by behavior and context modeling.
  15. An example user behavior model
  16. Behavior and context driven recommendations for a client.
  17. Focus on methodologies for implementing a solution: business and functional requirements; SME analysis; definition of annotation types and main domain model concepts; design of ETL jobs and annotation guidelines.
  18. Zoom in
  19. Domain model design – Information architecture, a combination of the work of an architect and the knowledge of a subject matter expert.
  20. Example domain ontology EM
  21. Example KB Newz
  22. Initial Concept Extraction Pipeline
  23. Iterative implementation of ML-driven extraction
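
Note 5 describes a pipeline of gazetteers, rules and ML that ends in RDF output. The first, second and last of those stages might look roughly like the sketch below. It is an illustration only, not the Ontotext pipeline: the gazetteer entries, the `http://example.org/` URIs, the `mentions` predicate and the capitalization rule are all assumptions, and the ML tagging and disambiguation stage is left out entirely.

```python
import re

# Hypothetical gazetteer: lower-cased surface form -> (entity URI, entity type).
GAZETTEER = {
    "bbc": ("http://example.org/org/BBC", "Organization"),
    "london": ("http://example.org/place/London", "Location"),
}

MENTIONS = "http://example.org/mentions"  # hypothetical predicate


def annotate(text):
    """Gazetteer lookup followed by one hand-written filtering rule."""
    annotations = []
    for match in re.finditer(r"[A-Za-z]+", text):
        surface = match.group()
        entry = GAZETTEER.get(surface.lower())
        # Rule: keep a gazetteer hit only if the surface form is capitalized.
        if entry and surface[0].isupper():
            uri, kind = entry
            annotations.append((match.start(), match.end(), uri, kind))
    return annotations


def to_ntriples(doc_uri, annotations):
    """Serialize each annotation as one N-Triples statement."""
    return [f"<{doc_uri}> <{MENTIONS}> <{uri}> ." for _, _, uri, _ in annotations]


if __name__ == "__main__":
    text = "The BBC covered the London 2012 Olympic Games."
    for line in to_ntriples("http://example.org/doc/1", annotate(text)):
        print(line)
```

Keeping the character offsets in each annotation is what lets a curation tool, as in notes 10 to 12, show editors the exact span that produced a triple.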