4. The Maslow Pyramid of Data Science
IT Infrastructure
Software Engineering
Quantitative Analytics
Domain
Data
Alex Gilgur. Data Science & Predictive SPC
5. Data Science = Nuclear Energy
Blow up in our face
6. Data Science = Nuclear Energy
Blow up in our face
…or…
7. Data Science = Nuclear Energy
Blow up in our face
…or…
Give us Power
8. What’s the Team We’re Rooting For?
DATA
INFORMATION
9. What’s the Team We’re Rooting For?
DATA
INFORMATION
INFORMATION
INFORMATION
10. What’s the Team We’re Rooting For?
INFORMATION
INFORMATION
INFORMATION
22. CENTRAL LIMIT THEOREM
The distribution of the arithmetic means of random samples taken from any
distribution with finite variance converges asymptotically to a normal
distribution as the sample size tends to infinity.
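A small simulation can make the theorem concrete. The sketch below is only an illustration: the exponential source distribution, the sample sizes, the trial count, and the helper names `sample_means` and `skewness` are all assumptions chosen for the demo. It measures how the skewness of the sample means shrinks toward zero (the value for a normal distribution) as the sample size grows.

```python
import random
import statistics

def sample_means(n, trials, rng):
    """Return `trials` arithmetic means, each over n draws from Exp(1)."""
    return [
        statistics.fmean(rng.expovariate(1.0) for _ in range(n))
        for _ in range(trials)
    ]

def skewness(xs):
    """Sample skewness: the mean cubed z-score (0 for a symmetric shape)."""
    m = statistics.fmean(xs)
    s = statistics.pstdev(xs)
    return statistics.fmean(((x - m) / s) ** 3 for x in xs)

rng = random.Random(0)  # fixed seed so the run is repeatable

# Skewness of the means for growing sample sizes (illustrative choices).
skew_by_n = {n: skewness(sample_means(n, 5000, rng)) for n in (1, 5, 50)}

for n, skew in skew_by_n.items():
    print(f"sample size {n:>2}: skewness of the means = {skew:.2f}")
```

The exponential distribution is strongly right-skewed, so a single draw (sample size 1) keeps that skew, while means over larger samples look progressively more symmetric.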
38. A Few Words About Forecasting
Methods:
● EWMA
● ARIMA
● Regression
EWMA models are very specific and computationally fast, but they must be told the
form of the trend (linear or exponential) and of the seasonality (additive or multiplicative).
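To show what "being told the trend" means in practice, here is a minimal sketch of exponential smoothing with an explicitly linear (additive) trend, often called Holt's method. It is an assumption that this variant matches what the slide has in mind; the function name `holt_linear_forecast` and the smoothing constants are hypothetical, and seasonality is left out entirely.

```python
def holt_linear_forecast(series, alpha=0.5, beta=0.5, horizon=3):
    """Exponential smoothing with an additive (linear) trend.

    alpha smooths the level and beta smooths the trend. Both defaults
    are illustrative, not tuned values.
    """
    if len(series) < 2:
        raise ValueError("need at least two observations")
    level = series[0]
    trend = series[1] - series[0]  # simple initial trend estimate
    for y in series[1:]:
        prev_level = level
        # New level: blend the observation with the previous projection.
        level = alpha * y + (1 - alpha) * (level + trend)
        # New trend: blend the latest level change with the old trend.
        trend = beta * (level - prev_level) + (1 - beta) * trend
    # Project the final level forward along the final trend.
    return [level + (h + 1) * trend for h in range(horizon)]

history = [10.0, 12.0, 14.0, 16.0, 18.0]
forecast = holt_linear_forecast(history)
print(forecast)
```

The linear form of the trend is hard-coded in the update equations. An exponential trend or a seasonal term would need different equations, which is the sense in which the model has to be told what to expect.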
An ARIMA model implicitly accounts for trend, seasonality, and stationarity of the
data. The autocorrelation of the ARIMA residuals reveals any periodicities that the
model has missed.
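The residual check can be sketched without any forecasting library. In the example below the residuals are synthetic, a hypothetical 7-step cycle standing in for a periodicity the model failed to capture; no ARIMA model is actually fitted. The helper name `autocorrelation` and the white-noise band are likewise assumptions for illustration.

```python
import math

def autocorrelation(xs, lag):
    """Sample autocorrelation of xs at a positive lag."""
    n = len(xs)
    mean = sum(xs) / n
    denom = sum((x - mean) ** 2 for x in xs)
    num = sum((xs[t] - mean) * (xs[t - lag] - mean) for t in range(lag, n))
    return num / denom

# Synthetic stand-in for model residuals: a 7-step cycle left in the data.
residuals = [math.sin(2 * math.pi * t / 7) for t in range(140)]

acf = {lag: autocorrelation(residuals, lag) for lag in range(1, 15)}

# Rough 95% band for white-noise residuals: +/- 1.96 / sqrt(n).
band = 1.96 / math.sqrt(len(residuals))
significant_lags = [lag for lag, r in acf.items() if abs(r) > band]

# The lag with the largest positive autocorrelation suggests the period.
strongest_lag = max(acf, key=acf.get)
print("strongest lag:", strongest_lag)
```

If the residuals were close to white noise, few or no lags would fall outside the band. A strong peak at one lag points to a period that should be added to the model.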
For stationary data, use ARIMA
For non-stationary data, use EWMA
EWMA and ARIMA overlap
When to use Regression:
● data are monotonic.
● seasonality is NOT statistically significant.
● EWMA and ARIMA fail.
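As a rough illustration of the first condition, the sketch below checks that a series is monotonic and then fits an ordinary least-squares trend line to extrapolate one step ahead. The strict monotonicity test and the names `is_monotonic` and `fit_line` are assumptions made for the example; noisy real data would call for a statistical trend test instead, and the significance of seasonality is not tested here.

```python
def is_monotonic(series):
    """True if the series never decreases or never increases."""
    diffs = [b - a for a, b in zip(series, series[1:])]
    return all(d >= 0 for d in diffs) or all(d <= 0 for d in diffs)

def fit_line(series):
    """Least-squares fit of y = intercept + slope * t for t = 0..n-1."""
    n = len(series)
    t_mean = (n - 1) / 2
    y_mean = sum(series) / n
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(series))
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    slope = sxy / sxx
    return y_mean - slope * t_mean, slope

usage = [3.0, 5.0, 7.0, 9.0, 11.0]  # illustrative values only

if is_monotonic(usage):
    intercept, slope = fit_line(usage)
    # Extrapolate to the next time step, t = len(usage).
    next_value = intercept + slope * len(usage)
    print(intercept, slope, next_value)
```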
When to use Quantile Regression:
● Upper and Lower bounds behave differently.
● Outliers are possible.
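The first condition can be illustrated with data whose spread grows over time, so that the upper and lower bounds follow different slopes. The sketch below is a brute-force linear quantile regression suitable only for small data sets. It relies on the known property that some line minimizing the pinball loss passes through two data points. The synthetic data and the function names are assumptions for the demo.

```python
import random

def pinball_loss(residual, tau):
    """Asymmetric (pinball) loss for one residual y - prediction."""
    return tau * residual if residual >= 0 else (tau - 1) * residual

def fit_quantile_line(xs, ys, tau):
    """Return (loss, intercept, slope) of the best tau-quantile line.

    Tries every line through two data points with distinct x values,
    which is enough to find a minimizer of the total pinball loss.
    """
    best = None
    n = len(xs)
    for i in range(n):
        for j in range(i + 1, n):
            if xs[i] == xs[j]:
                continue
            slope = (ys[j] - ys[i]) / (xs[j] - xs[i])
            intercept = ys[i] - slope * xs[i]
            loss = sum(
                pinball_loss(y - (intercept + slope * x), tau)
                for x, y in zip(xs, ys)
            )
            if best is None or loss < best[0]:
                best = (loss, intercept, slope)
    return best

# Synthetic fan-shaped data: the noise amplitude grows with x.
rng = random.Random(1)
xs = list(range(1, 41))
ys = [2.0 * x + x * rng.uniform(-1.0, 1.0) for x in xs]

loss_hi, b_hi, m_hi = fit_quantile_line(xs, ys, 0.9)  # upper bound
loss_lo, b_lo, m_lo = fit_quantile_line(xs, ys, 0.1)  # lower bound
print(f"upper slope {m_hi:.2f}, lower slope {m_lo:.2f}")
```

A single least-squares line would report one slope for this data. Fitting the 0.9 and 0.1 quantiles separately lets the two bounds diverge, and the absolute-value style loss makes each fit less sensitive to outliers than squared error.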
For each data set, we can run a model competition: score each candidate model by a
weighted sum of its goodness of fit, its suitability for forecasting, the stationarity
of the data, and the variability of the data, then select the model that scores
best for that data set.
EWMA
ARIMA
Quantile Regression
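The model competition described above can be sketched as a weighted-sum ranking. Everything numeric here is a placeholder: the weights and the per-model component scores are made-up values on a 0 to 1 scale, shown only to illustrate the mechanics. The slides do not say how each component is measured, and the names `weighted_score` and `run_competition` are hypothetical.

```python
def weighted_score(components, weights):
    """Weighted sum of the named component scores."""
    return sum(weights[name] * components[name] for name in weights)

def run_competition(candidates, weights):
    """Score every candidate and return (winner, all scores)."""
    scores = {
        model: weighted_score(components, weights)
        for model, components in candidates.items()
    }
    winner = max(scores, key=scores.get)
    return winner, scores

# Placeholder weights for the four criteria named on the slide.
weights = {
    "goodness_of_fit": 0.4,
    "forecast_suitability": 0.3,
    "stationarity": 0.2,
    "variability": 0.1,
}

# Placeholder component scores for one hypothetical data set.
candidates = {
    "EWMA": {"goodness_of_fit": 0.8, "forecast_suitability": 0.6,
             "stationarity": 0.5, "variability": 0.7},
    "ARIMA": {"goodness_of_fit": 0.9, "forecast_suitability": 0.7,
              "stationarity": 0.8, "variability": 0.6},
    "Quantile Regression": {"goodness_of_fit": 0.7, "forecast_suitability": 0.8,
                            "stationarity": 0.6, "variability": 0.9},
}

winner, scores = run_competition(candidates, weights)
print(winner, scores)
```

Running the same competition per data set, with component scores computed from that data, would let a different model win for each series.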