HOW TO
MANAGE LARGE
AMOUNTS OF
DATA WITH
ITERATEE
BY @WAXZCE – QUENTIN ADAM
SCALA DAYS 2014 BERLIN
MY DAY TO DAY WORK :
CLEVER CLOUD, MAKE YOUR
APP RUN ALL THE TIME
And learn a lot of things about your code, apps, and good/bad design…
KEEP YOUR APPS ONLINE. MADE WITH
NODE.JS, SCALA, JAV...
AND LEARN A LOT OF THINGS ABOUT
YOUR CODE, APPS, AND GOOD/BAD
DESIGN…
WHY STREAMS ARE SO
TRENDY THESE DAYS?
2 MAIN
REASONS
DATA ARE COMING
BIGGER AND BIGGER
WE NEED TO ACT WHILE
DATA TRANSFERRING IS
OCCURRING
&
WE ARE NOW USING
LOTS OF PERMANENT
CONNECTION
EXAMPLE
WHAT IS INSIDE AN
HTTP REQUEST ?
Verb
• The action
Resource
• The object of the action
Headers
• The context of the action...
IN MANY CASE THE REQUEST IS
MANIPULATE ALL FROM MEMORY
UPLOADING A
20 GB FILE ON
A POST
REQUEST
AND STORE IT
EXAMPLE
IT’S A BAD IDEA TO PUT THE BODY
PART IN MEMORY
CREATE A TEMP FILE TO STORE THE
DATA
Client
upload
file
Create a
temp file
Send a
file in a
backend
Then
answer to
the client
The file is
transferring two
times
NOT SO GOOD
Client
upload file
Directly
stream it to
the backend
Then
answer to
the client
LET’S ACT ON STREAM!
CLASSIC JAVA STREAM
MANAGEMENT
CLASSIC JAVA STREAM
MANAGEMENT
• Buffers
• Buffer management
• Buffer exception
• Buffer overflow
CLASSIC JAVA STREAM
MANAGEMENT
• Low performances if not buffered
• Not modular
• Thread blocking
• Code is ugly
• No back...
I CAN’T SLEEP IN THE NIGHT
WE HAVE TO FIND A BETTER WAY
THAT’S
WHY WE
NEED
ITERATEE
DATA
STRUCTURES
TO EXPRESS
DATA STREAM
MANAGEMENT
Like a recipe
Consume the data
ITERATEE : HOW TO MANAGE A STREAM
Produce the data
ENUMERATOR : DATA STREAM
Set of tools to do cool things with Iteratee and Enumerator
ENUMERATEE
SIMPLE EXAMPLE TO START
AN ENUMERATOR
AN ITERATEE
INPUTS
• EOF
• Empty
• El
ITERATEE STATES
• Done
• Error
• Cont
AN ITERATEE
AN ITERATEE TO
MANAGE THE STREAM
AN ITERATEE
ITERATEE
COMPOSITION
PURE FUNCTIONAL
TYPE SAFE
COMPOSITION
SO WE WILL
JUST MANAGE
THE BODY
STREAM
JUST DO NOT REWRITE HTTP PARSER
CREATE TEMP FILE
Built in on play with
CREATE TEMP FILE
Built in on play with
BODY
PARSERS
REQUEST HEADERS -> ITERATEE[ARRAY[BYTE],
EITHER[SIMPLERESULT, ?]]
GET FILE AND
CALCULATE HASH FROM
CHUNK
DEMO
I WANT TO
USE FORMS
SO
PARTHANDLER
ALLOW YOU TO
COMPOSE A
BODYPARSER
ITERATEE <3 COMPOSITION
NOW WE CAN ACCESS
TO STREAMS 
DATA STORAGE
I HATE FILE
SYSTEM
I HATE FILE
SYSTEM
DO NOT USE THE FILE
SYSTEM AS A DATASTORE
File system are POSIX compliant
• POSIX is ACID
• POSIX is powerful but is a bot...
• Key = hash of the chunk
STORE FILE IN CHUNK INSIDE K/V
STORE
STORE AS META DATA
THE LIST OF CHUNKS
PROCESS
File is post
by browser
Stream is
chunk
nomalized
Each chunk
is async
store in kv
List of chunk
hash is store
with...
NEW DEMO
• Play 2 + iteratee
• Datastore is riak 
ARE YOU CONVINCE
TO USE STREAM ?
DATA STREAMING
PROGRAMMING IS NOW
TRENDING
THX TO @CLEMENTD
CLEMENT DELAFARGUE
I’m @waxzce on twitter
I’m the CEO of
A PaaS provider, give it a try
;-)
THX FOR LISTENING
& QUESTIONS TIME
Coupon for Cle...
How to manage large amounts of data with Iteratee - ScalaDays Berlin 2014
How to manage large amounts of data with Iteratee - ScalaDays Berlin 2014
How to manage large amounts of data with Iteratee - ScalaDays Berlin 2014
Prochain SlideShare
Chargement dans... 5
×

How to manage large amounts of data with Iteratee - ScalaDays Berlin 2014

1,987

Published on

How to manage a large file with HTTP? Is it possible to manage data stream when you are in a HTTP POST request? How to be relax when managing data stream (ie : not use while(true) hack or something I/O block)?

This talk is about Play! iteratee and how to use it on real project : what is an iteratee? why it's useful? how to use it? how to manage it? is it clean? is your code readable or not?

http://scaladays.org/#schedule/How-to-manage-large-amounts-of-data-with-Iteratee

Published in: Technologies
0 commentaires
10 mentions J'aime
Statistiques
Remarques
  • Soyez le premier à commenter

Aucun téléchargement
Vues
Total des vues
1,987
Sur Slideshare
0
À partir des ajouts
0
Nombre d'ajouts
10
Actions
Partages
0
Téléchargements
44
Commentaires
0
J'aime
10
Ajouts 0
No embeds

No notes for slide

How to manage large amounts of data with Iteratee - ScalaDays Berlin 2014

  1. 1. HOW TO MANAGE LARGE AMOUNTS OF DATA WITH ITERATEE BY @WAXZCE – QUENTIN ADAM SCALA DAYS 2014 BERLIN
  2. 2. MY DAY TO DAY WORK : CLEVER CLOUD, MAKE YOUR APP RUN ALL THE TIME
  3. 3. And learn a lot of things about your code, apps, and good/bad design… KEEP YOUR APPS ONLINE. MADE WITH NODE.JS, SCALA, JAVA, RUBY, PHP, PYTHON, GO…
  4. 4. AND LEARN A LOT OF THINGS ABOUT YOUR CODE, APPS, AND GOOD/BAD DESIGN…
  5. 5. WHY STREAMS ARE SO TRENDY THESE DAYS?
  6. 6. 2 MAIN REASONS
  7. 7. DATA ARE COMING BIGGER AND BIGGER
  8. 8. WE NEED TO ACT WHILE DATA TRANSFERRING IS OCCURRING
  9. 9. &
  10. 10. WE ARE NOW USING LOTS OF PERMANENT CONNECTION
  11. 11. EXAMPLE
  12. 12. WHAT IS INSIDE AN HTTP REQUEST ? Verb • The action Resource • The object of the action Headers • The context of the action Body • Optional • The datas
  13. 13. IN MANY CASE THE REQUEST IS MANIPULATE ALL FROM MEMORY
  14. 14. UPLOADING A 20 GB FILE ON A POST REQUEST AND STORE IT EXAMPLE
  15. 15. IT’S A BAD IDEA TO PUT THE BODY PART IN MEMORY
  16. 16. CREATE A TEMP FILE TO STORE THE DATA
  17. 17. Client upload file Create a temp file Send a file in a backend Then answer to the client The file is transferring two times
  18. 18. NOT SO GOOD
  19. 19. Client upload file Directly stream it to the backend Then answer to the client
  20. 20. LET’S ACT ON STREAM!
  21. 21. CLASSIC JAVA STREAM MANAGEMENT
  22. 22. CLASSIC JAVA STREAM MANAGEMENT • Buffers • Buffer management • Buffer exception • Buffer overflow
  23. 23. CLASSIC JAVA STREAM MANAGEMENT • Low performances if not buffered • Not modular • Thread blocking • Code is ugly • No back pressure • Error handling is bad • i/o management and business code are mixed
  24. 24. I CAN’T SLEEP IN THE NIGHT
  25. 25. WE HAVE TO FIND A BETTER WAY
  26. 26. THAT’S WHY WE NEED ITERATEE
  27. 27. DATA STRUCTURES TO EXPRESS DATA STREAM MANAGEMENT
  28. 28. Like a recipe Consume the data ITERATEE : HOW TO MANAGE A STREAM
  29. 29. Produce the data ENUMERATOR : DATA STREAM
  30. 30. Set of tools to do cool things with Iteratee and Enumerator ENUMERATEE
  31. 31. SIMPLE EXAMPLE TO START
  32. 32. AN ENUMERATOR
  33. 33. AN ITERATEE
  34. 34. INPUTS • EOF • Empty • El
  35. 35. ITERATEE STATES • Done • Error • Cont
  36. 36. AN ITERATEE
  37. 37. AN ITERATEE TO MANAGE THE STREAM
  38. 38. AN ITERATEE
  39. 39. ITERATEE COMPOSITION
  40. 40. PURE FUNCTIONAL
  41. 41. TYPE SAFE
  42. 42. COMPOSITION
  43. 43. SO WE WILL JUST MANAGE THE BODY STREAM JUST DO NOT REWRITE HTTP PARSER
  44. 44. CREATE TEMP FILE Built in on play with
  45. 45. CREATE TEMP FILE Built in on play with
  46. 46. BODY PARSERS REQUEST HEADERS -> ITERATEE[ARRAY[BYTE], EITHER[SIMPLERESULT, ?]]
  47. 47. GET FILE AND CALCULATE HASH FROM CHUNK
  48. 48. DEMO
  49. 49. I WANT TO USE FORMS
  50. 50. SO PARTHANDLER ALLOW YOU TO COMPOSE A BODYPARSER ITERATEE <3 COMPOSITION
  51. 51. NOW WE CAN ACCESS TO STREAMS 
  52. 52. DATA STORAGE
  53. 53. I HATE FILE SYSTEM
  54. 54. I HATE FILE SYSTEM
  55. 55. DO NOT USE THE FILE SYSTEM AS A DATASTORE File system are POSIX compliant • POSIX is ACID • POSIX is powerful but is a bottleneck • File System is the nightmare of ops • File System creates coupling (host provider/OS/language) • SPOF-free multi tenant File System is a unicorn
  56. 56. • Key = hash of the chunk STORE FILE IN CHUNK INSIDE K/V STORE
  57. 57. STORE AS META DATA THE LIST OF CHUNKS
  58. 58. PROCESS File is post by browser Stream is chunk nomalized Each chunk is async store in kv List of chunk hash is store with file id No overhead for data store 
  59. 59. NEW DEMO • Play 2 + iteratee • Datastore is riak 
  60. 60. ARE YOU CONVINCE TO USE STREAM ?
  61. 61. DATA STREAMING PROGRAMMING IS NOW TRENDING
  62. 62. THX TO @CLEMENTD CLEMENT DELAFARGUE
  63. 63. I’m @waxzce on twitter I’m the CEO of A PaaS provider, give it a try ;-) THX FOR LISTENING & QUESTIONS TIME Coupon for Clever Cloud trial : berlin_scala_days
  1. A particular slide catching your eye?

    Clipping is a handy way to collect important slides you want to go back to later.

×