Personal Information
Company/Workplace
United States
Occupation
Data Scientist
Industry
Technology / Software / Internet
About
I lead the development and deployment of scalable models, with expertise in both real-time and big data architecture.
= Apache: Spark, Hadoop, Pig, Hive, and Oozie.
= Python: scikit-learn, pandas, NumPy, and Luigi.
= R: PivotalR, MADlib, and time series analysis with X-12-ARIMA.
= Modeling: MLlib, H2O, yhat, and Sense.
= Machine Learning: Random Forests, Clustering, Association Rules, and Logistic Regression.
= Software Development: Streaming, Distributed Systems, REST APIs.
= Visualization: Matplotlib, ggplot2, Seaborn, and D3.
= Database: Hive, Postgres, and SQL.
I build data science pipelines and frameworks (see my presentations below).
Tags
model
classification
machine learning
kaggle
predictive analytics
analytics
data science
software
scikit-learn
logistic regression
xgboost
tensorflow
pipeline
pandas
python
gradient boosting
random forest
framework
stock market
regression
market analysis
change point
nfl
fantasy
sports
Presentations (3)
Likes (4)
AlphaPy
Robert Scott • 7 years ago
kaggle_meet_up
Marios Michailidis • 7 years ago
Kaggle Winning Solution Xgboost algorithm -- Let us learn from its author
Vivian S. Zhang • 8 years ago
General Tips for participating Kaggle Competitions
Mark Peng • 8 years ago