These slides are from my Goto Amsterdam presentation. In the talk I went into detail about how we're building a high-performance relevance platform at Hippo with Couchbase and Elasticsearch. I also covered why we chose Couchbase for storage and how Elasticsearch can be used for search and analytics, and I shared how we integrate and leverage both products full circle from within our Hippo CMS product.
5. follow the Hippo trail
OneHippo @ Goto
“The capability of a search
engine or function to
retrieve data appropriate
to a user's needs.”
http://www.thefreedictionary.com/relevance
7. follow the Hippo trail
How we deliver
relevant content
@Hippo
8. follow the Hippo trail
Registration
Visitor - entity making HTTP requests
Collector - records data about a visitor or their behavior
Example: location collector (GeoIPCollector)
Targeting Data - all data about a specific visitor
Example: IP address is located in Amsterdam
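The registration step above can be sketched in a few lines. This is a minimal illustration, not Hippo's implementation: the `Visitor` class, the `collect` method, the `register` function and the hard-coded lookup table are all assumptions made for the example. A real location collector would consult a GeoIP database, and the IP address used here comes from a range reserved for documentation.

```python
from dataclasses import dataclass, field


@dataclass
class Visitor:
    """An entity making HTTP requests, identified by a hypothetical id."""
    visitor_id: str
    targeting_data: dict = field(default_factory=dict)


class GeoIPCollector:
    """Hypothetical location collector: maps an IP address to a city."""

    collector_id = "geo"

    # Stand-in for a real GeoIP database; the entry is made up for the example.
    _CITY_BY_IP = {"203.0.113.7": "Amsterdam"}

    def collect(self, request: dict) -> dict:
        city = self._CITY_BY_IP.get(request["remote_addr"])
        return {"city": city} if city else {}


def register(visitor: Visitor, request: dict, collectors: list) -> None:
    """Run every collector and keep its result as targeting data."""
    for collector in collectors:
        data = collector.collect(request)
        if data:
            visitor.targeting_data[collector.collector_id] = data


visitor = Visitor("visitor-1")
register(visitor, {"remote_addr": "203.0.113.7"}, [GeoIPCollector()])
print(visitor.targeting_data)
```

Keying the targeting data by collector keeps each collector's output independent, so adding a second collector (weather, for instance) would not touch the location data.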
9. follow the Hippo trail
Matching
Characteristic - a type of fact about visitors
Example: "comes from a city", "experiences a type of
weather"
Target Group - the specification of a Characteristic
Example: "comes from a European city", "comes from
Amsterdam"
Persona - one or more target groups that describe a
certain type of visitor
Example: "Jim, the European urban consumer",
"Alice, the Pet owner"
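The relationship between characteristics, target groups and personas can be illustrated with a short sketch. Everything here is an assumption made for the example: the class names, the "owns a pet" target group, the list of European cities, and in particular the scoring rule (the fraction of a persona's target groups that a visitor matches), which is one plausible way to pick a persona and not necessarily how Hippo does it.

```python
from dataclasses import dataclass
from typing import Callable

EUROPEAN_CITIES = {"Amsterdam", "Berlin", "Paris"}  # illustrative subset


@dataclass(frozen=True)
class TargetGroup:
    """A specification of a characteristic, e.g. 'comes from a European city'."""
    name: str
    characteristic: str
    matches: Callable[[dict], bool]  # predicate over a visitor's targeting data


@dataclass(frozen=True)
class Persona:
    """One or more target groups that describe a certain type of visitor."""
    name: str
    target_groups: tuple

    def score(self, targeting_data: dict) -> float:
        # Assumed rule: fraction of target groups that the visitor matches.
        hits = sum(1 for g in self.target_groups if g.matches(targeting_data))
        return hits / len(self.target_groups)


european = TargetGroup("comes from a European city", "city",
                       lambda d: d.get("city") in EUROPEAN_CITIES)
pet_owner = TargetGroup("owns a pet", "pets",
                        lambda d: bool(d.get("pets")))

PERSONAS = [
    Persona("Jim, the European urban consumer", (european,)),
    Persona("Alice, the Pet owner", (pet_owner,)),
]


def best_persona(targeting_data: dict):
    """Return the best-scoring persona name, or None if nothing matches."""
    best = max(PERSONAS, key=lambda p: p.score(targeting_data))
    return best.name if best.score(targeting_data) > 0 else None


print(best_persona({"city": "Amsterdam"}))
```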
10. follow the Hippo trail
What do we store?
Request log
Targeting data
Statistics
Averages, e.g. how many visitors became which persona
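As an illustration of the kind of statistic mentioned above, the following sketch computes which share of visitors ended up with which persona. The function name and the input shape (one persona name per visitor) are assumptions for the example; the slides do not show how the statistics are actually computed or stored.

```python
from collections import Counter


def persona_shares(assigned_personas):
    """Return the fraction of visitors assigned to each persona."""
    counts = Counter(assigned_personas)
    total = sum(counts.values())
    return {persona: n / total for persona, n in counts.items()}


shares = persona_shares(["Jim", "Jim", "Alice", "Jim"])
print(shares)
```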
23. follow the Hippo trail
NoSQL to the rescue
24. follow the Hippo trail
Suitable types
• Key-value store
• Document database
25. follow the Hippo trail
Assessment Criteria
• Maturity
• Data model
• Consistency model
• Performance
• Replication
• Caching model
• Query model
• Monitoring
• Scalability
• Reliability
• Support
40. follow the Hippo trail
Flexible data model
• Native JSON support
• Incremental MapReduce
• Gives power to the developer
41. follow the Hippo trail
How we run
Couchbase @Hippo
42. follow the Hippo trail
Load Balancer
Database cluster
Hippo Delivery Tier
Couchbase cluster
• Request log data
• Targeting data
• Statistics data
43. follow the Hippo trail
Query capabilities
• Querying via views
• Secondary indexes via views
• Views based on MapReduce
• Lacks some advanced query capabilities
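To make the view model concrete, here is a small in-memory emulation of a MapReduce view. It is a conceptual sketch only: Couchbase views are defined as JavaScript map functions that emit key/value pairs, with an optional reduce step, and the server maintains the index incrementally. The Python below rebuilds the index from scratch, and the document ids and fields are made up for the example.

```python
from collections import defaultdict

# Hypothetical JSON documents, keyed by document id.
docs = {
    "req::1": {"type": "request", "city": "Amsterdam", "path": "/"},
    "req::2": {"type": "request", "city": "Berlin", "path": "/news"},
    "req::3": {"type": "request", "city": "Amsterdam", "path": "/news"},
    "stat::1": {"type": "statistics"},
}


def map_by_city(doc_id, doc):
    """Map step: emit one (key, value) pair per request document."""
    if doc.get("type") == "request":
        yield doc["city"], 1


def build_view(documents, map_fn):
    """Group emitted rows by key; the result acts as a secondary index."""
    index = defaultdict(list)
    for doc_id, doc in documents.items():
        for key, value in map_fn(doc_id, doc):
            index[key].append((doc_id, value))
    return dict(sorted(index.items()))


def reduce_count(rows):
    """Reduce step: sum the emitted values for one key."""
    return sum(value for _, value in rows)


view = build_view(docs, map_by_city)
counts = {key: reduce_count(rows) for key, rows in view.items()}
print(counts)
```

Looking up `view["Amsterdam"]` is the secondary-index use case, and `counts` is the reduced result. Anything beyond key lookups, key ranges and reductions (free-text matching, for example) does not fit this model well, which is the kind of gap the last bullet refers to.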
44. follow the Hippo trail
Elasticsearch
• Apache Lucene
• Designed to be distributed
• Schema free
• Apache 2 licensed
• RESTful API
45. follow the Hippo trail
Added value of ES
• Full text search
• Faceted search
• Geo spatial search
• All in (near) real-time
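The three search types listed above can be combined in a single request body, sent as JSON to an index's `_search` endpoint over the RESTful API. The sketch below only builds that body. The field names (`path`, `location`, `city`) are hypothetical, `location` is assumed to be mapped as a geo point, and the grouping uses the aggregations syntax of later Elasticsearch releases; the versions current at the time of this talk expressed faceted search through a separate facets API.

```python
import json


def build_search_body(text, lat, lon, distance="10km"):
    """Build a query combining full-text, geo and faceted search."""
    return {
        "query": {
            "bool": {
                # Full-text search on a hypothetical "path" field.
                "must": {"match": {"path": text}},
                # Geospatial filter on a hypothetical geo point field.
                "filter": {
                    "geo_distance": {
                        "distance": distance,
                        "location": {"lat": lat, "lon": lon},
                    }
                },
            }
        },
        # Facet-style counts per city (terms aggregation).
        "aggs": {"by_city": {"terms": {"field": "city"}}},
    }


body = build_search_body("news", 52.37, 4.89)
print(json.dumps(body, indent=2))
```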
46. follow the Hippo trail
Replicating to ES
Diagram components: Hippo Delivery Tier (Java API, write and read), Couchbase Server Cluster, Elasticsearch Server Cluster
Replication: Couchbase to Elasticsearch over XDCR, via the Couchbase ES transport plugin