Personal Information
Company/Workplace
San Francisco Bay Area, QC United States
Occupation
Data scientist at Stitch Fix
Industry
Retail
About
Data paranoid, failed entrepreneur, ex-stock trader, father, Canadian in the US, Shanghainese.
Programming since age 13 (QBasic in DOS on a 386 PC with a 5.25-inch floppy disk). Once studied physics, then went to Canada to learn more about business. Built a company, then got hit by the financial crisis. Got married and moved to the US. Moved to Silicon Valley with my wife as she got a job there.
Love freedom and enjoy all the randomness in life.
Highest Kaggle rank: 1076th / 300k https://www.kaggle.com/piggybox
http://stackoverflow.com/users/2102764/piggybox
https://github.com/piggybox
Keywords
database
time-series
functional programming
inventory
spark redshift data-engineering spark-summit
spark
redshift
data quality
data cleansing
machine learning
etl
data munging
data wrangling
- Presentations
- Documents
- Infographics
Kubernetes on AWS at Zalando: Failures & Learnings - DevOps NRW
Henning Jacobs • 6 years ago
Top 5 Mistakes to Avoid When Writing Apache Spark Applications
Cloudera, Inc. • 8 years ago
(BDT303) Running Spark and Presto on the Netflix Big Data Platform
Amazon Web Services • 8 years ago
Spark shuffle introduction
colorant • 9 years ago
Streaming SQL
Julian Hyde • 8 years ago
Choosing an HDFS data storage format- Avro vs. Parquet and more - StampedeCon 2015
StampedeCon • 8 years ago
Effective testing for spark programs Strata NY 2015
Holden Karau • 8 years ago