Personal Information
Company/Workplace
San Francisco Bay Area United States
Profession
Data analytics developer
Industry
Technology / Software / Internet
Website
http://xinhstechblog.blogspot.com
About
Developer and team lead with 9+ years of experience in analytics, big data, and data science. At Samsung SDS, I worked on data science projects. As a developer, I used Spark and Scala for data munging, exploration, machine learning, and data pipelines. As a scrum master, I facilitated the Agile process in a team of developers and data scientists.
From 2010 to 2012, at LLNL, a research and development lab, I worked on a text-processing data pipeline for a document search application, as well as analytics with Hadoop, Pig, HBase, and Solr.
From 2005 to 2009, I worked on Web search at Yahoo!, implementing distributed applications to analyze Web data, consisting of many billions of web pages, in the ...
Tags
dataframe
big data
spark
ops
data pipeline
dcos
production
scala
data science
data munging
analytics
spark sql
Presentations (3)
Likes (5)
The Future of Real-Time in Spark
Reynold Xin • 8 years ago
Dato Keynote
Turi, Inc. • 8 years ago
Introducing DataFrames in Spark for Large Scale Data Science
Databricks • 9 years ago
Scalding: Twitter's Scala DSL for Hadoop/Cascading
johnynek • 11 years ago
Data Workflows for Machine Learning - Seattle DAML
Paco Nathan • 10 years ago