1. L-3 Data Tactics Data Science:
Why DT Data Science?
asymptotically approaching perfect
2. Data Science Team
The Team:
(Geoffrey B., Nathan D., Rich H., Keegan H., David P., Ted P., Shrayes R., Robert R., Jonathan T., Adam VE., Max W.)
Graduates from top universities…
…many of whom are EMC Data Science Certified.
Advanced degrees include:
mathematics, computer science, astrophysics, electrical
engineering, mechanical engineering, statistics, social sciences.
Base competencies (horizontals): clustering, association rules,
regression, naive Bayesian classifier, decision trees, time-series,
text analysis.
Going beyond the base (verticals)...
3. Horizontals and Verticals
Clustering || Regression || Decision Trees || Text Analysis
Association Rules || Naive Bayesian Classifier || Time Series Analysis
econometrics || spatial econometrics || graph theory algorithms || astrophysical time-series analysis
path planning algorithms || Bayesian statistics || constrained optimizations || numerical integration techniques
PCA || bagging/boosting || hierarchical models || IRT || space-time || latent class analysis
structural equation modeling || mixture models || SVM || maxent || CART || autoregressive models
ICA || factor analysis || random forest || dimensional reduction || topic models || sentiment analysis
frequency domain patterns || unsupervised by supervised || change-point models || LUBAP || DLISA
4. Team Design
Centralized Structure:
+1 Standardized processes
+1 Strategic goals met
-1 Customer goals not met
Decentralized Structure:
+1 Customer goals met
-1 No strategic goals met
-1 Inconsistent & redundant
Hybrid Structure (L-3 DT DS Team):
+1 Standardized processes
+1 Strategic goals met
+1 Customer goals met
8. Why Data Science [Business]???
Why are analytics important?
(Business, Analytics, Practical)
"We need to stop reinventing the cloud
and start using it!"
(Dave Boyd)
Using the cloud = doing data science
9. Why are analytics important?
(Business, Analytics, Practical)
Analytics:
No Free Lunch (NFL) theorems: no algorithm performs better
than any other when performance is averaged uniformly
over all possible problems of a particular type. Algorithms must
therefore be designed for a particular domain or style of problem;
there is no such thing as a general-purpose algorithm.
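The NFL claim can be illustrated on a toy problem. The sketch below is not from the slides: it assumes a tiny domain of 3-bit inputs, enumerates every possible labeling of those inputs as a "problem", and compares two hypothetical learners (`majority_learner` and `nearest_neighbor_learner`, names chosen here for illustration) on the points they were not trained on. Averaged uniformly over all problems, both land at chance.

```python
from itertools import product

# Toy domain: all 3-bit inputs. A "problem" is any labeling of these 8 points.
inputs = list(product([0, 1], repeat=3))
train_idx = [0, 1, 2, 3, 4]   # points the learner gets to see
test_idx = [5, 6, 7]          # held-out points (off-training-set)

def majority_learner(train):
    """Always predict the most common training label (ties go to 1)."""
    ones = sum(y for _, y in train)
    label = 1 if 2 * ones >= len(train) else 0
    return lambda x: label

def nearest_neighbor_learner(train):
    """Predict the label of the closest training point (Hamming distance)."""
    def predict(x):
        nearest = min(train, key=lambda xy: sum(a != b for a, b in zip(xy[0], x)))
        return nearest[1]
    return predict

def average_off_training_accuracy(learner):
    """Accuracy on held-out points, averaged over every possible labeling."""
    correct = 0
    total = 0
    for labels in product([0, 1], repeat=len(inputs)):
        train = [(inputs[i], labels[i]) for i in train_idx]
        model = learner(train)
        correct += sum(model(inputs[i]) == labels[i] for i in test_idx)
        total += len(test_idx)
    return correct / total

for name, learner in [("majority", majority_learner),
                      ("nearest neighbor", nearest_neighbor_learner)]:
    print(name, average_off_training_accuracy(learner))
```

Neither learner beats the other once every labeling counts equally; an advantage only appears when the set of problems is restricted to a domain with structure the algorithm was built to exploit.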
Meaning you need tool-makers! Not tool users!
Why Data Science [Data Science]???
10. If this guy doesn’t scale - none of us do. We need data science.
[Figure panels: Data Scales, Web Scales, Academic Publications Scale, IC Scales; axes: N, t]
Why Data Science [Practical]???
N=Amount of data; t=time
11. Big Data needs Data Science, but Data Science does
not need Big Data. We excel with Big and Small Data.
BIG DATA, small data - it doesn’t really matter.
Big P vs. Big N vs. small n vs. small p
N: records
P: columns (variables)
...it doesn’t matter, because data size alone is not
enough to find vagaries in data:
Generalization = Data + Knowledge.
Data = rough + smooth
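One way to read "Data = rough + smooth" is as a fit plus a residual. The sketch below is an illustration only: it assumes a running median of three as the smoother (the slides do not name one), and the series `data` and the helper `running_median_smooth` are made up for the example.

```python
from statistics import median

def running_median_smooth(values, window=3):
    """Smooth a series with a centered running median (shorter at the edges)."""
    half = window // 2
    smooth = []
    for i in range(len(values)):
        lo = max(0, i - half)
        hi = min(len(values), i + half + 1)
        smooth.append(median(values[lo:hi]))
    return smooth

# Hypothetical series: a slow upward trend with one spike at index 3.
data = [2.0, 3.0, 2.5, 9.0, 3.5, 4.0, 4.5]

smooth = running_median_smooth(data)            # the "smooth" part
rough = [d - s for d, s in zip(data, smooth)]   # the "rough" part (residual)

# By construction, data = smooth + rough at every point.
for d, s, r in zip(data, smooth, rough):
    print(f"data={d:5.2f}  smooth={s:5.2f}  rough={r:5.2f}")
```

The spike ends up in the rough component while the smooth component follows the trend, which is the sense in which more rows alone do not separate the two; the choice of smoother is the "knowledge" part.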
Philosophy:
12. DT Data Science Ethos:
“We are Data Dogmatic!”
We are NOT “Data Agnostic”
...this should represent an early warning
system about any corporate culture
claiming to “do” data science.
The IT notion of data is dead.
14. “Analytics in Perspective” reflects how people arrive at
decisions.
GOOD: Induction, Abduction, Circumscription, Counterfactuals.
BAD: Deduction, Speculation, Justification, Groupthink
Data Science Perspective...
15. What can dogs teach us about data science?
Dogs and Data Science:
Just as there are odors that dogs can smell and
we cannot, as well as sounds that dogs can hear
and we cannot, so too there are wavelengths of
light we cannot see and flavors we cannot taste.
Why then, given our brains wired the way they
are, does the remark "Perhaps there are thoughts
we cannot think," surprise you? Evolution, so far,
may possibly have blocked us from being able to
think in some directions; there could be
unthinkable thoughts.
The point is: analysts have biases and self-schemas
that may preclude them from asking
certain questions of data and thinking in certain
directions. Data Science is about allowing data to
speak and communicate in novel ways.
16. Data Science for Government (DS4G)
DS4G 4 Everyone! - Train everyone!
Created and delivered by practitioners of Data
Science!
FREE!
July 28th @ 11am - 3:30pm; followed by L-3
Data Tactics Quarterly Data Science
Brown Bag (4pm - 5:30pm).
18. Data Science for Government (DS4G)
Data Science for Government
An L-3 Event
July 28, 2014
Introduction by Will Grannis
Vice President and Chief Technology Officer, L-3 National Security Solutions
!
Organized by Richard Heimann
Chief Data Scientist, L-3 National Security Solutions
Speakers:
Nathan Danneman: Nathan’s background is in political science, with specializations in applied statistics and international conflict. He
finished his PhD in June of 2013, and joined Data Tactics in May of that same year. He recently co-authored Social Media Mining with R, is
active in the local Data Science community and currently supports DARPA. Nathan is also EMC Data Science Certified.
Richard Heimann: Richard’s background is in quantitative geography with specializations in spatial statistics and spatially explicit theory.
He currently leads the Data Science Team at L-3 NSS and is adjunct faculty at UMBC and an instructor at GMU teaching related topics.
He recently co-authored Social Media Mining with R and formerly supported DARPA. Richard is also EMC Data Science Certified.
Theodore Procita: Ted is an information technologist with ten years' experience embracing open-source technology to build large-scale
parallel processing systems for data manipulation and analysis. He's supported government customers in research at NRL and DARPA
along with members of the IC. Ted is also EMC Data Science Certified.
Shrayes Ramesh: Shrayes’s background is in economics and statistics. Shrayes completed his undergraduate degree in cognitive science at the University of
Virginia and his PhD at the University of Maryland in 2012. Shrayes joined the Data Tactics team in July 2013 and
currently supports DARPA. He is a former instructor of the EMC Data Science course and is himself EMC Data Science Certified.
Max Watson: Max’s background is in physics and applied mathematics. Max completed his undergraduate degree at the University of
California, Berkeley and completed his PhD at the University of California, Santa Barbara in 2012. Max specializes in large-scale simulations,
signal analysis and statistical physics. He joined the Data Tactics team in January 2014 and has supported DHS. Max is also EMC Data
Science Certified.
19. Thank you...
Questions?
Email us!
Homepage: http://www.data-tactics.com
Blog: http://datatactics.blogspot.com
Twitter: https://twitter.com/DataTactics
Or, me (Rich Heimann) at rheimann@data-tactics-corp.com