2. In a single project, I was using:
‣ Matlab for linear algebra
‣ R for stats & visualization
‣ C for the fast stuff
‣ Ruby to tie it all together
My Data Science Stack circa 2009
3. What is the “Two Language Problem”?
a.k.a. “Ousterhout’s dichotomy”
“systems languages” | “scripting languages”
static | dynamic
compiled | interpreted
user types | standard types
fast | slow
hard | easy
4. What is the “Two Language Problem”?
Because of this dichotomy, a two-tier compromise is standard:
‣ for convenience, use a scripting language (Python, R, Matlab)
‣ but do all the hard stuff in a systems language (C, C++, Fortran)
Pragmatic for many applications, but has drawbacks
‣ aren’t the hard parts exactly where you need an easier language?
‣ forces vectorization everywhere, even when awkward or wasteful
‣ creates a social barrier – a wall between users and developers
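To make the vectorization drawback concrete, here is a minimal sketch in plain Python. The task (a running total that is never allowed to drop below zero) and the function names `clamped_total_loop` and `clamped_total_vectorized` are illustrative assumptions, not taken from the talk. `itertools.accumulate` stands in for the whole-array primitives (cumulative sum, running minimum) that a vectorized style would rely on.

```python
from itertools import accumulate


def clamped_total_loop(deltas):
    """Running total that never drops below zero, written as a plain loop."""
    level = 0.0
    out = []
    for d in deltas:
        # Each step depends on the previous one: add, then clamp at zero.
        level = max(0.0, level + d)
        out.append(level)
    return out


def clamped_total_vectorized(deltas):
    """Same result using only whole-sequence operations.

    Relies on the identity: clamped[t] = cumsum[t] - min(0, running_min(cumsum)[t]).
    """
    sums = list(accumulate(deltas))     # temporary 1: cumulative sum
    lows = list(accumulate(sums, min))  # temporary 2: running minimum of the sums
    return [s - min(0.0, m) for s, m in zip(sums, lows)]


# Hypothetical input data, chosen only for illustration.
deltas = [3.0, -5.0, 2.0, -1.0, 4.0]
print(clamped_total_loop(deltas))
print(clamped_total_vectorized(deltas))
```

The loop states the rule directly. The whole-sequence version needs a non-obvious identity plus two full-length temporaries, which is the kind of awkward and wasteful rewrite the slide refers to when the scripting language makes explicit loops too slow to use.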
9. Rube Goldberg Revised
Here’s my data science stack today:
‣ Matlab → Julia for linear algebra
‣ R → Julia for stats & visualization
‣ C → Julia for the fast stuff
‣ Ruby → Julia to tie it all together
10. Why Try Julia?
‣ Because it’s fun!
some people enjoy trying out and learning new languages
‣ You’re in a world of pain with other tools
R / Python / Matlab aren’t fast enough for the work you do
Rcpp / Cython / C++ are unappealing, not productive enough
Julia is in the sweet spot for speed & high productivity
‣ For the fancy features
multiple dispatch, coroutine-based I/O, macros & metaprogramming,
efficient custom types, advanced linear algebra, …