Manuel Noya talks about the science-industry relationship driven by competitive intelligence and how to surf emerging technologies
Workshop title: TDM unlocking a goldmine of information
Training overview:
Text and Data Mining (TDM) is a natural ‘next step’ in open science. It can lead to new and unexpected discoveries and increase the impact of publications and repositories. This workshop showcases examples of successful TDM and infrastructural solutions for researchers. We will also discuss what is needed to make the most of infrastructures and how publishers and repositories can open up their content.
DAY 2 - PARALLEL SESSION 4 & 5
3. Who I am
Researcher turned entrepreneur:
BSc Chemical Engineering (USC, Spain) and MSc Materials Science (UPM, Spain).
Know-how: technology scouting, competitive intelligence, NPD (new product
development), early-stage startups and innovation strategy.
Researcher for NEOKER (Spain), an award-winning spin-off from USC (Spain) now scaling up in China, and for SRI International (Menlo Park, CA), with R&D projects in materials science for over 3 years.
Cofounded Linknovate.com in May 2012 in Palo Alto (CA) and went through the Stanford University accelerator program (StartX) in 2013.
The company provides clients such as BMW AG and REPSOL with competitive intelligence software tools.
CEO since 2014, now a seven-person company growing internationally.
4. Our credentials
We founded Linknovate in 2012 at Stanford University, CA.
We have been awarded 3 EU projects (grants) by the EC and have gone through 2 of the most prestigious startup
accelerators in the world.
Some of our satisfied clients:
13. User Case: “Insects as an Alternative Source of Proteins”
User Generated Data (detailed, fresh, assessing willingness). Expert profile
1. Technology Maturity (2 aspects)
Insects as Protein Source: TRL 8 or higher. Already on the market
Industrial Processing of Insects: TRL 7. Less than 1.5 years to be on the market
2. Fresh User-Generated data: “We focus on bringing modern agricultural techniques to the farming of insects. Currently our focus is primarily on the production of insects as food and as an ingredient.”
“We have been focused on producing insects for food for over three years. With regard to TRL, we have products on the market, so arguably TRL8-9.”
3. Willingness: “Joint tech development / R&D collaboration; Partnerships - e.g. grant applications like NSF, DOE or EU projects; Consider an investment (equity or other). We're open to discussing opportunities.”
4. Companies/Groups recommended: “(…) the only North American company that we have seen any credible results from is Tiny Farms, though there are indications that at least one of the established CPG companies has begun making significant investments in R&D. Looking further afield, there are a number of companies in Europe (particularly France and the Netherlands) that seem sophisticated.”
Email: gm@aspirefg.com
14. User Case: “biometric sensors”
User Generated Data (classified in categories). Companies’ visualization
[Figure: chart with the labels TRL and relevance; universities, research centers, companies; camera-based, wearables, others; applied research, validated proof of concept, prototype in lab environment.]
18. Ranking of Active Entities Worldwide (II)
Works on systems that reduce irrelevant information during searches.
Works on text-mining-based bioassay neighboring analysis, either as a standalone tool or as a complement to the PubChem bioassay neighboring process, to enable efficient integration of assay results and generate hypotheses for the discovery of bioactivities of the tested reagents.
19. Ranking of Active Entities Worldwide (III)
Study investigating the usefulness of natural language processing (NLP) as an adjunct to dictionary-
based concept normalization. Methods used: two biomedical concept normalization systems,
MetaMap and Peregrine, with and without the use of a rule-based NLP module.
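As a rough illustration of the setup described in this study, the sketch below compares plain dictionary lookup with the same lookup preceded by a small rule-based step. It is a toy example only and does not reflect how MetaMap or Peregrine work internally; the dictionary, the concept IDs, the two rules and the function names are all hypothetical.

```python
import re

# Hypothetical toy dictionary: surface form -> concept ID (IDs are made up).
CONCEPT_DICT = {
    "heart attack": "C001",
    "myocardial infarction": "C001",
    "aspirin": "C002",
}

def rule_based_variants(term):
    """Illustrative rule-based step: generate simple variants of a term."""
    # Rule 1: treat hyphens as spaces.
    variants = {term, term.replace("-", " ")}
    # Rule 2: strip a trailing plural "s" from every variant so far.
    for v in list(variants):
        if v.endswith("s") and len(v) > 3:
            variants.add(v[:-1])
    return variants

def normalize(text, use_rules=False):
    """Return the sorted concept IDs of dictionary terms found in text."""
    tokens = re.findall(r"[a-z0-9\-]+", text.lower())
    max_len = max(len(key.split()) for key in CONCEPT_DICT)
    found = set()
    for i in range(len(tokens)):
        # Try every token window up to the longest dictionary entry.
        for n in range(1, max_len + 1):
            if i + n > len(tokens):
                break
            candidate = " ".join(tokens[i:i + n])
            forms = rule_based_variants(candidate) if use_rules else {candidate}
            for form in forms:
                if form in CONCEPT_DICT:
                    found.add(CONCEPT_DICT[form])
    return sorted(found)

sentence = "Two heart-attacks after aspirin"
print(normalize(sentence))                  # dictionary lookup only
print(normalize(sentence, use_rules=True))  # dictionary lookup plus rules
```

In this toy case the rule-based step lets the lookup recover a hyphenated plural that exact matching misses, which is the kind of effect such a comparison is meant to measure.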
Interested in automated extraction of useful biomedical information from unstructured text, especially in named entity recognition and relationship extraction as fundamental approaches relevant to systems biology.
20. Insights from Linknovate.com
Academic sources dominate over ‘industrial signals’.
We may still be waiting for the ‘great’ stuff to land?
Algorithmia over ‘Document Classification’, ‘Ontologies’, ‘Feature Analysis’, ‘Rule Extraction’, etc.
21. Insights from Linknovate.com (II)
High academic activity, primarily with universities as the most involved organisations.
Highlight: small and medium-sized companies are the most active in 2017…
22. Global Data Comparison
Comparing search engines with complementary results
Results show similar trends in all 3 engines and data sources.
Sources
• Linknovate (publications, conf proceedings, grants, patent apps, news, web monitoring)
• PubMed (publications, conf proceedings)
• Google Patents (patents)
24. Stanford University – U.S.A.
https://www.stanford.edu/
References in this topic
Text Mining for Adverse Drug Events: the Promise, Challenges, and State of the Art
This article provides an overview of recent advances in pharmacovigilance driven by the application of text mining.
A Framework for the Automatic Extraction of Rules from Online Text
This paper presents a general-purpose framework for acquiring more complex relationships from text and then encoding this knowledge as rules.
Learning the Structure of Biomedical Relationships from Unstructured Text
Here we describe a novel algorithm, Ensemble Biclustering for Classification (EBC), that learns the structure of biomedical relationships automatically from text, overcoming differences in word choice and sentence structure.
26. Elon University – U.S.A.
https://www.elon.edu/home/
CRI: RUI: CI-EN: Infrastructure to Enable Mining and
Analysis of Open Source Software Engineering
Artifacts
Awarded an NSF grant in 2014
This NSF CRI supported Research at Undergraduate Institutions (RUI)
project will integrate, expand and enhance several distinct data
sources currently used by three research communities: those who
study Free, Libre, and Open Source Software (FLOSS), the larger
empirical software engineering research community, and researchers
engaged in data mining and text mining.
27. Old Dominion University – U.S.A.
https://www.odu.edu/
Collaboration with James Madison University on a publication related to the competitive analysis that three companies perform on the content their customers generate on social networks
This paper describes an in-depth case study which applies text mining
to analyze unstructured text content on Facebook and Twitter sites of
the three largest pizza chains: Pizza Hut, Domino's Pizza and Papa
John's Pizza. The results reveal the value of social media competitive
analysis and the power of text mining as an effective technique to
extract business value from the vast amount of available social media
data.
Social media competitive analysis and text mining: A case
study in the pizza industry
28. University of Illinois at Urbana - Champaign – U.S.A.
http://illinois.edu/
In this paper we introduce a text cube architecture designed to
organize social media data in multiple dimensions and
hierarchies for efficient information query and visualization from
multiple perspectives.
SocialCube: A Text Cube Framework for Analyzing Social
Media Data
Collaboration with Cornell University and the company Intelligent Automation, Inc. on a publication related to the study of social and cultural behaviors through the content generated by users of social networks
30. IBM – U.S.A.
https://www.ibm.com/us-en/
Towards comprehensive longitudinal healthcare data
capture
Collaboration with Wright State University on a publication about the use of text mining on unstructured clinical texts.
In this work therefore, we explore a pattern-based approach for
extracting Smoker Semantic Types (SST) from unstructured clinical
notes.
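To make the idea of a pattern-based approach concrete, here is a minimal sketch that assigns a smoking-status label to a clinical note using regular expressions. It is not the method from the publication: the three labels, the patterns and the function name are hypothetical and chosen only for illustration.

```python
import re

# Hypothetical labels and patterns. Order matters: the more specific
# negated and past-tense patterns are tried before the generic "smoker".
PATTERNS = [
    ("never_smoker",
     re.compile(r"\b(never smoked|non-?smoker|denies smoking)\b", re.I)),
    ("former_smoker",
     re.compile(r"\b(former smoker|ex-?smoker|quit smoking|stopped smoking)\b", re.I)),
    ("current_smoker",
     re.compile(r"\b(current smoker|smokes|smoker)\b", re.I)),
]

def extract_smoking_status(note):
    """Return (label, matched text) for the first pattern that fires, else None."""
    for label, pattern in PATTERNS:
        match = pattern.search(note)
        if match:
            return label, match.group(0)
    return None

notes = [
    "Patient is a non-smoker.",
    "Pt quit smoking in 2010.",
    "Smokes 1 pack per day.",
    "No relevant social history recorded.",
]
for note in notes:
    print(note, "->", extract_smoking_status(note))
```

Trying the patterns in a fixed order is a simple way to keep "non-smoker" or "ex-smoker" from being labelled as a current smoker; a real system would need far broader patterns and proper negation handling.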
33. KBSI (Knowledge Based Systems, Inc.) – U.S.A.
https://www.kbsi.com/
SOME PARTNERS
34. Amenity Analytics – U.S.A.
http://www.amenityanalytics.com/
$7.6M of raised funds in August 2017
INVESTORS
Amenity Analytics provides a next-generation text-mining AI platform: a leading-edge text analytics platform that allows customers to identify actionable signals from unstructured data.
35. AYLIEN – Ireland
http://aylien.com/
$1.14M of raised funds
$580k in March 2016
Aylien is an artificial intelligence
startup that focuses on creating
technologies that help machines
understand humans better. The firm
provides text analysis and news
APIs that allow users to make sense
of human-generated content at
scale. They also provide a range of
content analysis solutions to
developers, data scientists,
marketers and academics.
INVESTORS
36. Lexalytics – U.S.A.
https://www.lexalytics.com/
Lexalytics transforms global
conversations into meaningful and
actionable insights. Their leading
text analysis platforms process
billions of pieces of unstructured
data, translating thoughts and
feelings into profitable decisions for
their customers. Lexalytics helps
companies implement vital
feedback and monitoring programs
that create an ongoing dialogue
with their customers.
37. Bitext – U.S.A.
https://www.bitext.com/
$900k of raised funds in January 2015
Bitext develops multilingual analytics technology in 30 languages. The company takes a linguistic approach to text analysis, using linguistic knowledge as its scientific base.
38. Text & Data Mining for Competitive Intelligence
manuel@linknovate.com | skype: manu_noia
Thank you!