4. You can query up a storm
● SELECT firstname,lastname FROM users WHERE username='tcodd';
firstname | lastname
-----------+----------
Ted | Codd
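● SELECT * FROM videos WHERE videoid = 'b3a76c6b-7c7f-4af6-964f-803a9283c401' and videoname>'N';
videoid | videoname | description | tags | upload_date | username
--------------------------------------+-------------------------+------------------------------------------------------+----------------+--------------------------+----------
b3a76c6b-7c7f-4af6-964f-803a9283c401 | Now my dog plays piano! | My dog learned to play the piano because of the cat. | dogs,piano,lol | 2012-08-30 16:50:00+0000 | ctodd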
6. ● Can I slice a slice (or run a subquery)?
● Can I do advanced WHERE clauses?
● Can I union two slices server side?
● Can I join data from two tables without two
request/response round trips?
● What about procedures?
● Can I write functions or aggregation functions?
7. Let's look at the APIs we have
http://www.slideshare.net/aaronmorton/apachecon-nafeb2013
8. But none of those APIs do what I want, and it seems simple enough to do...
10. Why not just do it client side?
● Move processing close to data
– Idea borrowed from Hadoop
● Doing work close to the source can result in:
– Less network IO
– Less memory spent encoding/decoding 'throwaway' data
– New storage and access paradigms
11. Vert.x + Cassandra
● What is Vert.x?
– Distributed event bus which spans the server and
even penetrates into the client side for effortless
'real-time' web applications
● What are the cool features?
– Asynchronous
– Hot re-loadable modules
– Modules can be written in Groovy, Ruby, Java, JavaScript
http://vertx.io
13. HTTP Transport
● HTTP is easy to use on firewalled networks
● Easy to secure
● Easy to compress
● The de facto way to do everything anyway
● IntraVert attempts to limit round-trips
– It does not try to provide a terse binary format
14. JSON Payload
● Simple nested types like list, map, String
● Request is composed of N operations
● Each operation has a configurable timeout
● Again, IntraVert attempts to limit round-trips
– It does not try to provide a terse message format
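As a rough illustration of "a request is composed of N operations", the sketch below bundles several operations into one JSON body so they travel in a single round trip. Only the `type`/`op` shape comes from the CREATEFILTER example in this deck. The envelope key `e`, the operation types `SETKEYSPACE`, `SETCOLUMNFAMILY` and `SLICE`, their fields, and the `timeout` field name are all assumptions made for illustration, not confirmed IntraVert API.

```javascript
// Sketch only: everything except the "type"/"op" shape is an assumption.
function operation(type, op, timeoutMs) {
  const entry = { type: type, op: op };
  if (timeoutMs !== undefined) {
    entry.timeout = timeoutMs; // hypothetical per-operation timeout field
  }
  return entry;
}

// Bundle N operations into one request body (one HTTP round trip).
function buildRequest(operations) {
  return JSON.stringify({ e: operations }); // "e" envelope key is assumed
}

const body = buildRequest([
  operation('SETKEYSPACE', { keyspace: 'beerks' }),
  operation('SETCOLUMNFAMILY', { columnfamily: 'beers' }),
  operation('SLICE', { rowkey: 'beers', start: 'a', end: 'z', size: 100 }, 5000),
]);

console.log(body);
```

The point of the design is visible in the shape alone: three operations, one payload, one response to wait for.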
15. Why not use lightning-fast transport and serialization library X?
● X's language/code gen issues
● You probably cannot tcpdump X
● Net-admins don't like 90 jars for health checks
● IntraVert attempts to limit round-trips:
– Prepared statements
– Server side filtering
– Other cool stuff
19. Application requirement
● A user request needs to know which beers are
“Breakfast Stouts”
● Common “solutions”:
– Write a copy of the data sorted by type
– Request all the data and parse on client side
20. Using an IntraVert filter
● Send a function to the server
● Function is applied to subsequent get or slice
operations
● Only results of the filter are returned to the
client
21. Defining a filter JavaScript
● Syntax to create a filter
{
"type": "CREATEFILTER",
"op": {
"name": "stouts",
"spec": "javascript",
"value": "function(row) { if (row['value'] == 'Breakfast Stout') return row; else return null; }"
}
},
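To make the filter semantics concrete, here is a minimal local sketch of what the server is described as doing: run the function over each column of a slice and return only the non-null results. The `applyFilter` helper, the sample data, and the `{ name, value }` column shape are assumptions for illustration; only the filter body and the CREATEFILTER fields come from the slide.

```javascript
// Filter body taken from the slide above.
const stouts = function (row) {
  if (row['value'] == 'Breakfast Stout') return row; else return null;
};

// Local stand-in for the server (hypothetical helper): apply the filter to
// every column of a slice and keep only the non-null results.
function applyFilter(filter, columns) {
  return columns.map(filter).filter(function (col) { return col !== null; });
}

// Made-up sample slice; the { name, value } column shape is an assumption.
const slice = [
  { name: 'beer-001', value: 'Breakfast Stout' },
  { name: 'beer-002', value: 'Pale Ale' },
  { name: 'beer-003', value: 'Breakfast Stout' },
];

const matches = applyFilter(stouts, slice);
console.log(matches.map(function (col) { return col.name; }));

// The CREATEFILTER operation can carry the function source as a string,
// which avoids hand-escaping quotes inside the JSON.
const createFilter = {
  type: 'CREATEFILTER',
  op: { name: 'stouts', spec: 'javascript', value: stouts.toString() },
};
console.log(JSON.stringify(createFilter));
```

Compared with the "request all the data and parse on the client" approach, the non-matching columns never leave the server in the first place.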
22. Defining a filter Groovy/Java
● We can define a Groovy closure or Java filter
{
"type": "CREATEFILTER",
"op": {
"name": "stouts",
"spec": "groovy",
"value": "{ row -> if (row['value'] == 'Breakfast Stout') return row else return null }"
}
},
29. Application Requirements
● User wishes to intersect the column names of
two slices/queries
● Common “solutions”
– Pull all results to client and apply the intersection
there
30. Server Side MultiProcessor
● Send a class that implements MultiProcessor
interface to server
● public List<Map> multiProcess(Map<Integer,Object> input, Map params);
● Do one or more get/slice operations as input
● Invoke MultiProcessor on input
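The intersection requirement can be sketched as follows. The real MultiProcessor is a Java interface, as the signature above shows; this is only a JavaScript model of the logic such a class might apply, kept in JavaScript for consistency with the filter example. It assumes `input` maps an operation id to that operation's result as an array of `{ name, value }` columns, and that `params` names the two operations to intersect via hypothetical `left` and `right` keys.

```javascript
// JavaScript model of: List<Map> multiProcess(Map<Integer,Object> input, Map params)
// Assumption: input[id] is the array of { name, value } columns returned by
// the get/slice operation with that id.
function multiProcess(input, params) {
  const first = input[params.left];   // "left" is a hypothetical param name
  const second = input[params.right]; // "right" is a hypothetical param name
  const namesInSecond = new Set(second.map(function (col) { return col.name; }));
  // Keep only column names present in both slices.
  return first
    .filter(function (col) { return namesInSecond.has(col.name); })
    .map(function (col) { return { name: col.name }; });
}

// Made-up results of two earlier slice operations, keyed by operation id.
const input = {
  1: [{ name: 'a', value: '1' }, { name: 'b', value: '2' }, { name: 'c', value: '3' }],
  2: [{ name: 'b', value: '9' }, { name: 'c', value: '8' }, { name: 'd', value: '7' }],
};

const common = multiProcess(input, { left: 1, right: 2 });
console.log(common);
```

Run server side, only the intersected names would cross the network instead of both full slices.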
34. Imagine you want to insert this data
● User wishes to enter this event for multiple column
families
– 09/10/2011 11:12:13
– App=Yahoo
– Platform=iOS
– OS=4.3.4
– Device=iPad2,1
– Resolution=768x1024
– Events: videoPlayPercent=38, Taste=great
http://www.slideshare.net/charmalloc/jsteincassandranyc2011
38. IntraVert status
● Still pre-1.0
● Good docs
– https://github.com/zznate/intravert-ug/wiki/_pages
● Functionally equivalent to Thrift (most features
ported)
● CQL support
● Virgil (coming soon)
● HBase-like scanners (coming soon)