Data growth is rapidly outpacing Moore's Law: data sets keep getting larger, and deriving insights from them is becoming more and more complex. Lily, a software product made by Outerthought, allows you to store, index and search vast quantities of data. In the next few years, successful business models will be based on the monetization of data. Steven Noels will highlight the raison d'être of Lily and discuss the challenges that every data-intensive organisation encounters.
Sirris innovate2011 - Lily, Smart Data at scale made easy, Steven Noels, Outerthought
1. Lily: Smart data, at scale, made easy
from content storage to scaling smart data
IIC » TECHNOLOGIEPARK 3 » B-9052 ZWIJNAARDE (GENT) » www.outerthought.org
2. the pain
[Chart labels: data, moore, need for distributed processing]
3. the pain
» growth of data sets
» smart businesses need to apply analytics to activities
» doing business online means real-time
» talent shortage
4. LILY
The Real-time Platform built for the Age of Data.
We manage, track and measure your data and users,
and do the mat(c)hmaking in-between:
» provide you with business intelligence and analytics
» harvest user profiles and learn their interests
» dynamically engage your users using quality recommendations
5. where would you use lily?
» large collections of data
  » content repositories
  » library catalogs
  » (media) asset management
  » product catalogs
  » ‘live’ archives
» large groups of users
  » e-commerce / retail
  » news / media
» ... if you want to use big data, but you need easy.
7. beyond content management: data + analytics
[Diagram labels: recommendations, call to action, personalised, revenue, product / service, audience data]
8. LILY 2.0: smart data
[Diagram: SMARTER DATA. Labels: data processing, relations, recommendations, semantic augmentation, analytics, usage metrics, domain knowledge (patterns, rules, keywords, lists, ...)]
9. roadmap
» now: highly-scalable data repository: store, index and search
» next: with real-time usage stats gathering and analytics
» later: and built-in context- and user-sensitive
recommendations
» built on top of HBase (modelled on Google BigTable) and Solr
» identical, robust technology in use at Facebook, Twitter,
StumbleUpon, Yahoo!
» scales widely over distributed (cloud) infrastructure
11. status june 2011
» Lily 1.0.1 released - in development since Q4/09
» some customers - DIY retail / media / news
» e-commerce platform project
» Lily as the data (integration) tier
» first contrib: FrogPond (annotated Java <> Lily mapper)
https://bitbucket.org/calmera/frogpond
12. Next up: usage stats
» sits in CRUD-path
» tracks user ops against records
» from both perspectives
» arbitrary K/V properties: time, location, ...
» automatically builds user profiles (as records)
» tied to record ops
» indexed access
» time dimension: trending
[Diagram labels: record, user, interactions, recommendations, indexes, time]
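The usage-stats idea on this slide can be sketched in a few lines of Python. This is a minimal in-memory illustration under stated assumptions, not Lily's API: the names `UsageEvent`, `UsageLog`, `track`, `user_profile` and `trending` are hypothetical, and a real deployment would persist and index events in the repository rather than in dictionaries.

```python
from collections import Counter, defaultdict
from dataclasses import dataclass, field


@dataclass
class UsageEvent:
    user_id: str
    record_id: str
    op: str            # e.g. "create", "read", "update", "delete"
    timestamp: float   # seconds since some epoch
    props: dict = field(default_factory=dict)  # arbitrary K/V: location, ...


class UsageLog:
    """Hypothetical in-memory stand-in for a usage-stats store."""

    def __init__(self):
        self.by_user = defaultdict(list)    # user perspective
        self.by_record = defaultdict(list)  # record perspective

    def track(self, event):
        # Would be called from the CRUD path; stores the event under both keys.
        self.by_user[event.user_id].append(event)
        self.by_record[event.record_id].append(event)

    def user_profile(self, user_id):
        # The profile is derived purely from the user's tracked operations.
        events = self.by_user[user_id]
        return {
            "user_id": user_id,
            "ops": dict(Counter(e.op for e in events)),
            "records": sorted({e.record_id for e in events}),
        }

    def trending(self, since, top_n=3):
        # Time dimension: rank records by operations at or after `since`.
        counts = Counter(
            record_id
            for record_id, events in self.by_record.items()
            for e in events
            if e.timestamp >= since
        )
        return counts.most_common(top_n)


log = UsageLog()
log.track(UsageEvent("alice", "rec-1", "read", 100.0, {"location": "Gent"}))
log.track(UsageEvent("alice", "rec-2", "update", 200.0))
log.track(UsageEvent("bob", "rec-2", "read", 210.0))
log.track(UsageEvent("bob", "rec-2", "read", 220.0))

profile = log.user_profile("alice")
hot = log.trending(since=150.0)
```

Storing each event under both the user and the record mirrors the "from both perspectives" bullet: profiles and trending lists then become simple lookups instead of full scans.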
13. from usage stats to recommendations ‘light’
» grouping of users based on
  » shared properties
  » shared record access
» grouping of records based on
  » shared properties
  » shared user operations
[Diagram labels: record, user, connections, recommendations]
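One way to read "grouping of users based on shared record access" is as a set-overlap score between users. The sketch below assumes Jaccard similarity and a hypothetical `recommend_light` helper; the slides do not say which measure Lily uses, so treat the measure, the threshold and all names as illustrative assumptions.

```python
from collections import defaultdict


def jaccard(a, b):
    # Overlap of two sets of record ids, between 0.0 and 1.0.
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def recommend_light(access, user, min_similarity=0.3):
    """access maps user id -> set of record ids that user touched (assumed shape)."""
    seen = access.get(user, set())
    scores = defaultdict(float)
    for other, records in access.items():
        if other == user:
            continue
        sim = jaccard(seen, records)
        if sim < min_similarity:
            continue  # not enough shared record access to count as a connection
        for record in records - seen:
            scores[record] += sim  # weight unseen records by user similarity
    # Highest score first; ties broken by record id for stable output.
    return sorted(scores, key=lambda r: (-scores[r], r))


access = {
    "alice": {"rec-1", "rec-2"},
    "bob": {"rec-1", "rec-2", "rec-3"},
    "carol": {"rec-2", "rec-4"},
    "dave": {"rec-9"},
}
suggestions = recommend_light(access, "alice")
```

Grouping records by shared user operations works the same way with the mapping inverted (record id -> set of user ids).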
14. full-on recommendations
» look at real-time-capable Mahout algorithms
» pre-index or -calculate as much as possible
» save as secondary indexes
» present recommendations as part of record API
» allow user to contribute ‘domain knowledge’ to
record processing pipeline
» pattern detection, keywords, ontologies, ...
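The "pre-calculate, save as secondary indexes, present as part of record API" bullets can be illustrated with a toy example. It uses plain co-occurrence counting as a stand-in, not a Mahout algorithm, and `build_related_index`, `read_record`, the `sessions` input and the `recommendations` field are all hypothetical names chosen for this sketch.

```python
from collections import Counter, defaultdict
from itertools import combinations


def build_related_index(sessions, top_n=2):
    """Offline step: count co-occurring records, keep the top N per record."""
    co_counts = defaultdict(Counter)
    for records in sessions:
        for a, b in combinations(sorted(set(records)), 2):
            co_counts[a][b] += 1
            co_counts[b][a] += 1
    index = {}
    for record, counts in co_counts.items():
        # Most frequent first; ties broken by record id for stable output.
        ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
        index[record] = [other for other, _ in ranked[:top_n]]
    return index


def read_record(store, related_index, record_id):
    # Read path: a cheap lookup attaches the precomputed recommendations.
    record = dict(store[record_id])
    record["recommendations"] = related_index.get(record_id, [])
    return record


store = {
    "rec-1": {"title": "Drill"},
    "rec-2": {"title": "Drill bits"},
    "rec-3": {"title": "Screws"},
}
sessions = [
    ["rec-1", "rec-2"],
    ["rec-1", "rec-2", "rec-3"],
    ["rec-2", "rec-3"],
]
index = build_related_index(sessions)
result = read_record(store, index, "rec-1")
```

The point of the split is that all counting happens ahead of time, so serving a recommendation costs one dictionary lookup at read time.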
18. Thank you!
for your attention
for your questions
» stevenn@outerthought.org
» @stevenn