Personal Information
Company/Workplace
San Francisco Bay Area United States
Profession
Senior Research Engineer at Netflix
Industry
Education
Website
www.dbtsai.com
About
Big Data Machine Learning Engineer with a strong background in computer science, theoretical physics, and mathematics. I have a deep understanding of implementing data mining algorithms in scalable ways, not just using them as a consumer.
I'm a big fan of Scala, and have been using it to develop scalable, distributed data mining algorithms with Apache Spark. I've been involved in open-source Apache Spark development as a contributor. Apache Spark is a fast, general engine for large-scale data processing, and it fits into the open-source Hadoop ecosystem.
Specialties:
• Machine Learning and Data Mining.
• Distributed/Parallel Computing and Big Data Processing.
• Expert in Apache Hadoop.
Keywords
machine learning
spark
mapreduce
hadoop
mllib
alpine data labs
big data
logistic regression
netflix
data mining
apache spark
multinomial
l-bfgs
recommendation
pipeline
kernel methods
linear models
polynomial mapping
feature engineering
linear regression
ml
spark summit
elastic-net
batch layer
serving layer
speed layer
spark streaming
pig
lambda architecture
real time
storm
stream
large scale
iot
internet of things
svd
k-means
unsupervised learning
Presentations (9)
Likes (4)
Distributed Time Travel for Feature Generation at Netflix
sfbiganalytics
•
8 years ago
Introducing Windowing Functions (pgCon 2009)
PostgreSQL Experts, Inc.
•
11 years ago
Multinomial Logistic Regression with Apache Spark
DB Tsai
•
10 years ago