2. 2 Data Tactics – Open Source Advocate for Lowering IC IT Costs
Data Tactics Overview
Big Data and Open Source Software
Government Open Source
Data Tactics Contributions to Open Source
Questions
3. Data Tactics Overview
4. Our Staff
200+ Employees
• 90% TS/SCI cleared, many with polygraphs
• 25% have Advanced Degrees and Doctorates
• High percentage of Military and Intelligence Community Veterans
• Over 10% of Staff are "Data Scientists"
• Three World Class Semantic Researchers
Certification Highlights
• Project Management: CMMI, Project+, and PMP
• Software Development
• VMware
• Cyber Security
• Cloudera Certified Engineers
• Over 40% of Technical Staff
• Hadoop
• Puppet
• MapR
• Greenplum
5. Data Tactics: What We Do
• Data Architecture
• Innovation and Design
• Assessment and Benchmarking
• Collaboration and Uniformity
• Data Engineering
• Discovery, Ingestion, and Cleansing
• Scientific Analysis
• Large Scale Computation and Platforms
• Data Management
• Security and Assurance
• Infrastructure and Administration
• Visualization and Dissemination
6. Our Methodologies
• Bridging the Academic and Operational Gap
• Translating our operational experience from tactical ground
operations and academia into actionable requirements
• Tactical Engineering
• Team diversity provides a greater understanding of customer
requirements which translates to focused and efficient solutions
• Experience comes from the DoD and Intelligence Community not
just analytical or technical experience
• Staff is trained across multiple technical disciplines
• Right to Left Approach
• Solving our customers’ problems starting at the "right" with "what do
they want from their data"
• No "cookie cutter" approach
7. Data Tactics Core Customers
"...cultivating, strengthening, and advancing your data…"
Today's decision makers are tasked to gather, correlate and analyze information from
ever increasing data sources in shorter amounts of time. Data Tactics is focused on
solving the problems of data management facing the DoD, intelligence community, law
enforcement and the private sector. From tactical to strategic efforts, our team has led
the creation, integration and implementation of innovative and proven solutions in the
world of data alignment, modeling and analytics.
8. Data Tactics Key Open Source Big Data Efforts
DNI
INFORMATION TECHNOLOGY EFFICIENCY (ITE): FIRST IC/DoD Cloud
NSA
TEST, INTEGRATION, DEPLOYMENT AND SUSTAINMENT (TIDS)
T3 - Security Engineering / Information Assurance
ARMY
DCGS-A STANDARD CLOUD (DSC): FIRST DoD Production Cloud
DCGS-A EDGE NODE (DEN) / TACTICAL EDGE NODE (D-TEN)
INSCOM ENTERPRISE PLATFORM (IEP): FIRST DoD R&D Cloud
AIR FORCE
AIR FORCE TENCAP: FIRST DoD Implementation of NSA Ghost Machine Architecture
DARPA
NEXUS 7: FIRST DARPA Cloud
MORE EYES: FIRST Deployed DARPA Cloud
XDATA: Integration and CASE Project
9. Globally Deployed Open Source Clouds
5 Clouds on UNCLASS
• 4 at DT in VA (DARPA)
• 1 at DT in VA (IRAD)
17 Clouds on SIPRNET
• 4 at DT in Tyson’s
• 1 at GISA, Ft. Bragg
• 2 in Hawaii
• 2 in Germany
• 7 at APG in MD
• 1 in Afghanistan
7 Clouds on TS/SCI
• 1 at AF TENCAP, CO
• 1 at NRL, DC
• 1 at DT in VA
• 1 at INSCOM
• 3 for DSC
4 Clouds on Coalition Networks
• 1 on CX-I in Afghanistan
• 1 on CX-T in Afghanistan
• 1 Cloud on BICES
• 1 Cloud in Germany
Cloud and Big Data Domains are where we live.
Data is the hard problem.
10. Big Data and Open Source Software
11. Government World of Big Data
• The government is moving to Big Data through incorporation of IaaS, PaaS, DaaS, and SaaS
• Data center consolidation requirements
• The federal government's current IT budget is $76B, of which $19B is infrastructure
• Migration to Government Open Source Software (GOSS) Cloud solutions is expected to save $24B
• The White House Big Data Initiative 2012
• $200M in R&D funding
• Six federal departments participating
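To put the budget figures on this slide in proportion, the short Python sketch below works out the shares they imply. It uses only the three dollar amounts quoted on the slide; the variable and function names are illustrative, and the slide does not say over what period the $24B saving is expected, so the second percentage is only a rough comparison against one year's budget.

```python
# Figures quoted on the slide, in billions of US dollars.
IT_BUDGET_B = 76.0
INFRASTRUCTURE_B = 19.0
PROJECTED_GOSS_SAVINGS_B = 24.0  # time period not stated on the slide


def share(part: float, whole: float) -> float:
    """Return `part` as a percentage of `whole`."""
    return 100.0 * part / whole


infrastructure_share = share(INFRASTRUCTURE_B, IT_BUDGET_B)
savings_share = share(PROJECTED_GOSS_SAVINGS_B, IT_BUDGET_B)

print(f"Infrastructure share of IT budget: {infrastructure_share:.1f}%")  # 25.0%
print(f"Projected savings vs. IT budget:   {savings_share:.1f}%")         # 31.6%
```

On these numbers, infrastructure is a quarter of the IT budget, and the projected saving is larger than the entire infrastructure line.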
12. Definition: OSS
Open Source Software (OSS)
• Software that is licensed to users with the following
freedoms:
• To run the software for any purpose;
• To study and modify the software; and
• To freely redistribute copies of either the original
or modified software without royalty payments or
other restrictions on who can receive them.
"Free software is a matter of liberty, not price. To understand the concept, you
should think of free as in free speech, not as in free beer."
– Richard Stallman
13. OSS as a Focus Area for Success
• Maximizing use of Open Source software lowers costs and provides a higher ROI
• Lowers the cost of entry for new systems
• Lowers the Operations and Maintenance (O&M) costs
14. OSS as a Focus Area for Success
Enterprise availability at neutral cost provides affordability
at scale
Linked Open Data Cloud Linked Classified Data Cloud
15. OSS as a Focus Area for Success
• Supports Participation by many Agencies and Developers
• Encourages Transparency, Participation, Task Sharing, and
Collaboration between all agencies in the Intelligence Community
• Provides Enterprise Platforms, Standards, and Rules of the Road to
provide the fertile substrate for innovation
16. Development Process: OSS*
[Diagram: OSS development process]
• Developers: any agency can participate by contributing to, fixing, or extending the codebase
• Governance validates fixes and contributions
• Trusted Developers: manage the "official" version of the program in the Trusted Repository. All developers can contribute, but not all code goes into "trunk"
• Distributor: governance for stable releases to the user community
• User: constant beta, user as a developer
*Open Source Software (OSS) in U.S. Government Acquisitions, by David A. Wheeler, March 2008 (Revised Dec. 17, 2010)
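The flow on this slide (anyone may contribute, trusted developers decide what enters trunk, stable releases are cut from trunk) can be sketched in a few lines of Python. This is a toy model under stated assumptions, not a description of any real repository tool: the class names, the `passes_review` flag, and the agency names are all hypothetical stand-ins for whatever a project's governance actually checks.

```python
from __future__ import annotations

from dataclasses import dataclass, field


@dataclass
class Contribution:
    author: str          # any agency or developer may submit
    description: str
    passes_review: bool  # hypothetical stand-in for the governance checks


@dataclass
class TrustedRepository:
    trusted_developers: set[str]
    trunk: list[Contribution] = field(default_factory=list)
    rejected: list[Contribution] = field(default_factory=list)

    def submit(self, contribution: Contribution, reviewer: str) -> bool:
        """Only a trusted developer may decide whether code enters trunk."""
        if reviewer not in self.trusted_developers:
            raise PermissionError(f"{reviewer} is not a trusted developer")
        if contribution.passes_review:
            self.trunk.append(contribution)
            return True
        self.rejected.append(contribution)
        return False

    def stable_release(self) -> list[str]:
        """A release contains only what governance accepted into trunk."""
        return [c.description for c in self.trunk]


repo = TrustedRepository(trusted_developers={"trusted_dev"})
accepted = repo.submit(
    Contribution("agency_a", "fix ingest bug", passes_review=True),
    reviewer="trusted_dev",
)
declined = repo.submit(
    Contribution("agency_b", "experimental feature", passes_review=False),
    reviewer="trusted_dev",
)
release = repo.stable_release()
print(release)
```

The point of the sketch is the separation of roles: contribution is open to all, while the decision about what reaches users sits with a small governed group.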
17. Government Open Source
18. Government Off-The-Shelf (GOTS)
• Definition:
• Software and/or hardware products that are custom developed by technical staff of the government agency, for one agency and for a mission need
• Challenges:
• Limited Governance
• Participation and Collaboration Muted
• Result:
• Duplication of effort
• LOE and TTM increase
Minimal Awareness or Mechanisms to Address Problem
19. Government Open Source Software (GOSS)
• Definition:
• Computer software available in source code form, for which the source code and certain other rights normally reserved for the development agency are provided broadly (within government community)
• Benefits:
• The power of distributed peer review and transparency of process
• Improved quality, reliability and flexibility
• Result:
• Lower Cost
• End to predatory vendor lock-in
Knowledge and Insight, vs. Best Guess
20. GOSS – Benefits to Governance
• Ability to submit patches, features and enhancements for
inclusion in the baseline and for your custom requirements
• Costs reduced if patch is accepted: source agency no longer needs
to reapply and retest patch against new releases.
• Capabilities enhanced, costs reduced when agencies make use of patches
submitted by other agencies
• Interoperability across the government
• Influence through the GOSS community at large to request
additional core features
• Risk management through full visibility into ongoing
development
• Voting power through the GOSS Advisory Board to set
priorities for features (aka Project Governance)
The talent and technology throughout the extended community is
leveraged in a cohesive and productive fashion
21. GOSS – Benefits to Security
Eliminate the Fear, Uncertainty, and Doubt
• Enterprise Class Security is comparable to commercial applications
• Open Source projects quickly fix security issues –100% faster
• More eyes on:
• Specialties of extended community are brought to bear
• External vulnerability analysis and testing
Visibility, Transparency, and Broad Expertise Delivers a More Secure
Product
Aligned with DFARS clauses and DoD Open Source Agreements
22. What is Needed for a GOSS Project
• A Product that has Broad Value
and Relevancy
• Community Space to Work
• Code Repository, Bug
Tracking, Feature
Requests, Roadmaps,
Documentation,
Collaboration
• Governance Processes
• Charters, Roles &
Responsibilities
• A Community with a Passion for
Success
23. Data Tactics Contributions to Open Source
24. Contributions to OSS Community
• GeoTools
• Part of the Open GeoServer project
• Contributed MongoDB database driver
• Katta
• Various Code fixes and contributions
• Xarm Motif C++ library
• eXtended Template Library
• MilDroid
• Many other individual contributions by our software
engineers
25. Data Tactics Use of OSS
• Integration and Analytic Tools
• Eclipse
• Gephi
• Git
• Subversion
• Pig
• Sqoop
• Flume
• Mahout
• R
• Data Storage, Access, and Organization
• Katta
• Accumulo
• Hbase
• Riak
• SOLR
• CouchDb
• MongoDb
• Terrastore
• Hive
• Mysql
• HDFS
• Distributed Computation Frameworks and Services
• Hadoop MR
• Spark
• Storm
• Giraph
• Resource Scheduling, Management, Coordination, Queue
• Mesos
• Yarn
• Zookeeper
• Apache Active MQ
• Operating Systems, other
• RHEL
• Centos
• Debian
• Puppet
• Nagios
• Ganglia
• OpenLDAP
• Apache HTTPD
• JBOSS
26. Champions of OSS and GOSS
• DNI
• Alex S. Voultepsis
• NSA
• SE-Linux
• SE-Android
• Ozone Widget Framework (OWF)
• Accumulo
• DISA
• Forge.mil
• Mil-OSS
• DoD-CIO
• Dan Risacher
• MITRE
• Dr. David Wheeler
• DoD
• FalconView
• NASA
• OpenStack
• Open Government Initiative
• Whirlwind
• DOE
• 28 projects
• White House
• Drupal
• Army Research Lab
• Ping
• Navy
• TOR
28. Contact Info
Lee Shabe
Vice President
LShabe@Data-Tactics.com
Cell: (703) 963-3523
Office: (571) 297-2136
Will Conroy
Fed/Intel Division Manager
WConroy@Data-Tactics.com
Cell: (703) 307-4359
Office: (571) 297-2125
Bruce Goldfeder
BGoldfeder@Data-Tactics.com
Cell: (703) 304-7518
Office: (571) 297-2157
Eric Whyne
EWhyne@Data-Tactics.com
Cell: (570) 205-3283
Data Tactics Corporation
7901 Jones Branch Dr.
Suite 700
McLean, VA 22102
www.Data-Tactics.com
Twitter: @DataTactics
Blog: http://datatactics.blogspot.com
LinkedIn: http://www.linkedin.com/company/data-tactics-corporation
29. Backup Slides
30. Follow Successful Open Source Methods,
Policies, and Governance Models
• Apache Project
• JBOSS
• GNU Licensing (Copyleft)
• Linux
• PostgreSQL
• Many, many more
31. Government Paradigm Shift
Commercial Reality
• Realization that purchasing software is fundamentally
different from purchasing battleships or airframes
• Enterprise cost savings for GOSS are tremendous
• Enterprise capability gains far exceed cost concerns
• Faster, cheaper, more secure products
• Unlimited application and extension
• Government Oversight Monitors the Entire GOSS
Ecosystem and Government IT Consumers
• Support those that leverage GOSS
• Financial Incentives
• Infrastructure, Standards, and Best Practices
• Follow the ABCs of procurement: Adopt, Buy, Create
• Adopt, adapt, and extend existing GOSS
• If not available, next consider buying a commercial (COTS) capability
• If not available or not cost effective, create a new GOSS project
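The Adopt, Buy, Create ordering above is effectively a three-step decision rule, which the following Python sketch makes explicit. The function name and its three boolean inputs are illustrative assumptions; a real procurement decision would weigh many more factors than these flags.

```python
from enum import Enum


class Decision(Enum):
    ADOPT = "adopt, adapt, and extend existing GOSS"
    BUY = "buy a commercial (COTS) capability"
    CREATE = "create a new GOSS project"


def procurement_decision(
    goss_available: bool,
    cots_available: bool,
    cots_cost_effective: bool,
) -> Decision:
    """Apply the Adopt, Buy, Create ordering from the slide."""
    # Step A: prefer an existing GOSS capability whenever one exists.
    if goss_available:
        return Decision.ADOPT
    # Step B: otherwise buy COTS, but only if it exists and is cost effective.
    if cots_available and cots_cost_effective:
        return Decision.BUY
    # Step C: nothing suitable exists, so start a new GOSS project.
    return Decision.CREATE


print(procurement_decision(True, True, True).name)
print(procurement_decision(False, True, True).name)
print(procurement_decision(False, True, False).name)
```

Note that the rule is ordered: an available GOSS capability wins even when a cost-effective COTS product also exists.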
32. GOSS – OMB Guidance
From: Vivek Kundra
Daniel Gordon
Victoria Espinel
...agencies should analyze
alternatives that include
proprietary, open-source, and
mixed source technologies.
... considering factors such as
performance, cost, security,
interoperability, ability to share
or re-use, and availability of
quality support.
33. Updating the Model for Government Software
• Current Model of GOTS procurement is inefficient with
respect to:
• Cost
• Technical Risk
• Schedule Risk
• Mission Risk
• Lock in Risk for both product and service providers
• Currently 93 Federal Government Agencies have their own IT
development infrastructures and budgets, with little to no interaction or
collaboration
• Missions may differ, but system commonalities exist and can be
exploited for efficiencies
34. Linked IC Data Cloud
Linked Open Data Cloud Linked Classified Data Cloud
35. Linked IC Data Cloud
Linked Open Data Cloud
Linked Classified Data Cloud
Linked IC Data Cloud
36. Government Cloud Nodes
#1: "DI/ODNI"
#2: "DI/DHS"
#3: "DI/DOS"
#4: "DI/NSA"
#5: "DI/CIA"
#6: "DI/FBI"
#7: "Q2J"
Q2J program
DEEPINSIGHT program
CATALYST program -- Phase A
37. Data and Utility Clouds
LAYER 4
Cloud Analytics
LAYER 3
Cloud Services
LAYER 2
Cloud Software
Hardware
UtilityCLOUD
LAYER 1 GHOSTMACHINE
Cloud Hardware
38. IC Network Domains
SCION (FBI) JWICS (ODNI) NSA Net (NSA)
ADN (CIA)
= Compute Cluster, a collection of
computing and storage resources
39. IC Compute Clusters
SCION (FBI) JWICS (ODNI) NSA Net (NSA)
ADN (CIA)
QL program Q2J program NSA’s SECURE HUB
FBI’s SECURE HUB I2P program WOLFDEN program
CIA’s SECURE HUB DEEPINSIGHT program
= Compute Cluster, a collection of
computing and storage resources
40. Virtual Network Overlay
Editor's notes
Maximizing use of Free and Open Source Software
• Lower Costs and Higher ROI
• Lowers the cost for entry for new systems
• Lowers the Operations and Maintenance (O&M) costs
• Enterprise availability at neutral cost
• Provides affordability at scale
• Supports Participation by many Agencies and Developers
• Encourages Transparency, Participation, Task Sharing, and Collaboration between all agencies in the Intelligence Community
• Provides Enterprise Platforms, Standards, and Rules of the Road to provide the fertile substrate for innovation
• Infrequently shared within or beyond agency
• Agency provides funding and sets priorities
• "Department Level" application
• Typically not built with the Enterprise in mind
• Limited extensibility
• Finished product may be shared with other agencies
Limited Governance of Product
• One agency controls root codebase
• Many agencies desire additional features
• As the codebase changes, interoperability, accreditation, and merging are lessened
• Specific mission codebase not built for extensibility
Participation and Collaboration are Muted
• Many agencies have the talent and technology on hand to contribute to a better product
• Programmatic Fundamentals limit collaboration
• How does one program for one agency contribute resources to another program at another agency?
• Agencies "extend" the product with local patches
• Sharing produces codebase forks
Duplication of development effort
• You need it, I need it – we both do it
• Duplication of effort across the full lifecycle: C&A, Operations, Maintenance
Time to market increases
• LOE increases over time
• Product interoperability is perturbed or eliminated over time
Minimal Awareness or Mechanisms to Address Problem
Government Open Source Software (GOSS)
• Computer software available in source code form, for which the source code and certain other rights normally reserved for the development agency are provided broadly (within government community)
• Permits other agencies to study, change, improve, and contribute to the software in a cohesive and synchronized fashion using Program Governance
• An evolutionary way of developing, distributing, licensing and consuming software, taking advantage of cost and common task sharing
• The power of distributed peer review and transparency of process
• The power of code visibility and transparency
Results:
• Better quality
• High reliability
• More flexibility
• Lower cost
• End to predatory vendor lock-in