7. Flow
A flow is a sequence of operations on streams of tuples that move through pipes
• A flow compiles to one or more MapReduce jobs
• Inputs and outputs are called “Taps”
• Each Tap produces or receives a stream of tuples that all share the same format
• A flow can have multiple inputs and multiple outputs
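To make the tap and pipe vocabulary concrete, here is a minimal sketch of a word-count flow in plain Java. It is a toy model only and does not use the Cascading API: the class and method names (FlowSketch, sourceTap, splitWords, countWords, runFlow) are hypothetical, tuples are modeled as String arrays, and everything runs in memory rather than as MapReduce jobs.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Toy model of a flow. NOT the Cascading API: all names here are hypothetical.
class FlowSketch {
    // Source "tap": yields tuples that share one format, a single "line" field.
    static List<String[]> sourceTap() {
        List<String[]> tuples = new ArrayList<>();
        tuples.add(new String[] {"the quick fox"});
        tuples.add(new String[] {"the lazy dog"});
        return tuples;
    }

    // Pipe step 1: turn each "line" tuple into one "word" tuple per token.
    static List<String[]> splitWords(List<String[]> lines) {
        List<String[]> words = new ArrayList<>();
        for (String[] tuple : lines) {
            for (String word : tuple[0].split("\\s+")) {
                if (!word.isEmpty()) {
                    words.add(new String[] {word});
                }
            }
        }
        return words;
    }

    // Pipe step 2: group by "word" and count; output tuples are (word, count).
    static List<String[]> countWords(List<String[]> words) {
        Map<String, Integer> counts = new TreeMap<>();
        for (String[] tuple : words) {
            counts.merge(tuple[0], 1, Integer::sum);
        }
        List<String[]> out = new ArrayList<>();
        for (Map.Entry<String, Integer> e : counts.entrySet()) {
            out.add(new String[] {e.getKey(), String.valueOf(e.getValue())});
        }
        return out;
    }

    // The "flow": tuples from the source pass through each pipe step in order.
    static List<String[]> runFlow(List<String[]> source) {
        return countWords(splitWords(source));
    }

    public static void main(String[] args) {
        // Sink "tap": here simply standard output, one tab-separated tuple per line.
        for (String[] tuple : runFlow(sourceTap())) {
            System.out.println(String.join("\t", tuple));
        }
    }
}
```

In this sketch each step consumes and produces tuples of a fixed shape, which is the property the Tap bullet describes. A real flow would read from and write to external storage and be planned into MapReduce jobs.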
13. More functionality
• Inner and outer joins natively supported
• Seamlessly branch and merge pipes of tuples
• Integrate diverse data sources
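As an illustration of what joining two tuple streams means, the following sketch implements an inner join and a left outer join on the first field of each tuple. It is plain in-memory Java written for this example, not Cascading's join operation: JoinSketch, join, and the keepUnmatchedLeft flag are hypothetical names, and the sample data is made up.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Toy join over tuple streams. NOT the Cascading API: all names are hypothetical.
class JoinSketch {
    // Joins (key, leftValue) tuples with (key, rightValue) tuples on the key.
    // Output tuples are (key, leftValue, rightValue).
    // keepUnmatchedLeft = false gives an inner join; true gives a left outer
    // join, where unmatched left tuples are kept with a null right value.
    static List<String[]> join(List<String[]> left, List<String[]> right,
                               boolean keepUnmatchedLeft) {
        // Index the right stream by key so each left tuple can find its matches.
        Map<String, List<String>> rightByKey = new HashMap<>();
        for (String[] tuple : right) {
            rightByKey.computeIfAbsent(tuple[0], k -> new ArrayList<>()).add(tuple[1]);
        }
        List<String[]> out = new ArrayList<>();
        for (String[] tuple : left) {
            List<String> matches = rightByKey.get(tuple[0]);
            if (matches != null) {
                for (String rightValue : matches) {
                    out.add(new String[] {tuple[0], tuple[1], rightValue});
                }
            } else if (keepUnmatchedLeft) {
                out.add(new String[] {tuple[0], tuple[1], null});
            }
        }
        return out;
    }

    public static void main(String[] args) {
        // Made-up sample streams: (userId, name) and (userId, item).
        List<String[]> users = Arrays.asList(
                new String[] {"1", "alice"}, new String[] {"2", "bob"});
        List<String[]> orders = Arrays.asList(
                new String[] {"1", "book"}, new String[] {"1", "pen"});
        for (String[] tuple : join(users, orders, true)) {
            System.out.println(Arrays.toString(tuple));
        }
    }
}
```

The only difference between the two join types here is whether left tuples without a match survive. In a MapReduce setting the same matching would typically happen by grouping both streams on the join key rather than by building an in-memory index.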
14. Why not Pig?
• Pig is a custom language for writing MapReduce workflows
• Because it’s a custom language, intermixing “plain logic” between flows is painful
• Pig is not nearly as flexible as Cascading for custom needs
15. Learn more
• Tutorial: http://blog.rapleaf.com/dev/?p=33
• Website: http://www.cascading.org