Cloud Analytics Playbook

An enormous amount of valuable information is out there, waiting to be transformed into differentiating services. Booz Allen Hamilton uses its Cloud Analytics Reference Architecture to build technology infrastructures that can withstand the weight of massive datasets and deliver the deep insights organizations need to drive innovation.

1.0 | Summary
The problems of explosive data growth and how cloud analytics provide the solution (page 4)

2.0 | Differentiation
Introduces the Architecture and explains how Booz Allen’s unique approach to people, processes, and technology gets the job done (page 10)

3.0 | Depth
Takes the Architecture apart layer by layer with detailed visuals that show you how we frame a solution (page 16)

4.0 | Successes
Presents real-world examples from the hundreds of organizations who have successfully worked with Booz Allen to implement an analytics solution using the Architecture (page 31)

Prefer to read this on your iPad?

Search “Booz Allen” at the iTunes App Store®,
or simply scan the QR code.




1.0
Summary




in this section


A majority of executives believe their
companies are unprepared to leverage
their data. We look at why that is and
how to change it.
 Extracting True Insights
 The Growing Data Analysis Gap




 We are living in the greatest age of information discovery the world has ever known.

 According to recent industry research, we now generate more data every 2 days than we did from the dawn of early civilization
 through the year 2003 combined. And data rates are still growing—approximately 40% each year.

 Fueled in large part by the more than five billion mobile phones in use around the globe, our world is increasingly measured,
 instrumented, monitored, and automated in ways that generate incredible amounts of rich and complex data. Unfortunately,
 the number of big data analysts and the capabilities of traditional tools aren’t keeping pace with this unprecedented data growth.

 At Booz Allen, we’ve watched this trend for some time now; we call it the “data analysis gap.” It’s clear that data has outstripped
 common analytics tools and staffing levels. In order to move forward, organizations must be able to analyze data on a massive
 scale and quickly use it to provide deeper insights, create new products, and differentiate their services.




 [Figure: the Data Analysis Gap. Quantity of data plotted against time, 2009 to 2020, with zones labeled sustainable, challenging, and missed opportunities.]
 By 2020, the amount of information in our economy will grow 44 times. Very few organizations are prepared for this wave of data.
 (Source: IDC)
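
As a rough consistency check on the two growth figures quoted in this section (about 40% per year, and 44 times by 2020), the compound-growth arithmetic can be sketched as follows. The 11-year span is an assumption read off the chart's time axis (2009 to 2020), and the function names are illustrative rather than anything defined in the Playbook.

```python
def growth_factor(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding annual_rate for the given years."""
    return (1.0 + annual_rate) ** years


def implied_annual_rate(total_factor: float, years: int) -> float:
    """Annual growth rate that compounds to total_factor over the given years."""
    return total_factor ** (1.0 / years) - 1.0


# Assumption: the chart spans 2009 to 2020, i.e. 11 years of compounding.
YEARS = 2020 - 2009

if __name__ == "__main__":
    print(f"40% per year for {YEARS} years -> {growth_factor(0.40, YEARS):.1f}x")
    print(f"44x over {YEARS} years -> {implied_annual_rate(44.0, YEARS):.1%} per year")
```

Under that assumption the two figures are broadly consistent: 40% annual growth compounds to roughly 40 times over 11 years, and a 44-fold increase corresponds to an annual rate of about 41%.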




Preparing for What’s Ahead

The ability to compete and win in the information economy will come from powerful analytics that draw insights and value from data, and from high-fidelity visualizations that present those insights in impactful, intuitive ways. Both will become key influencers of corporate decision making and consumer purchasing.

Many of the world’s IT systems are not ready for the technology revolution happening as organizations seek to transform how they use data. Their infrastructures face three major challenges:

Volume
Not enough storage capacity and analytical capabilities to handle massive volumes of data

Variety
Data comes in many different formats, which can be difficult and expensive to integrate

Velocity
Inability to process data in real time in order to extract the most value from it

To help organizations overcome these hurdles and prepare for what’s next, Booz Allen has pioneered strategies for the implementation of the Digital Enterprise, a way of using technology, machine-based analytics, and human-powered analysis to create competitive and mission advantage.

Done right, analytics hosted in the cloud will help:
▶▶ Improve overall performance and efficiency
▶▶ Better understand customer and employee needs
▶▶ Translate data into actionable intelligence and faster decision making
▶▶ Reduce IT costs
▶▶ Improve scalability to handle future growth

However, before you invest in a cloud analytics solution, you should fully understand the scope of what’s involved and engage in the proper planning to ensure that all the right elements will be in place.

A Framework for the Future




Booz Allen has a framework for intelligently integrating cloud computing technology and advanced analytic capabilities,
called the Cloud Analytics Reference Architecture. The Architecture is designed to solve compute-intensive problems that
were previously out of reach for most organizations, including large-scale image processing, sensor data correlation, social
network analysis, encryption/decryption, data mining, simulations, and pattern recognition.

At the core of the Architecture are systems that accommodate petabytes of data at reasonable cost and allow analytics
to run at previously unattainable scales in reasonable amounts of time. However, human insights and action are still the
fundamental drivers.

The purpose of the Architecture is to allow machines to do 80% of the work—the mundane tasks they are best suited for—
and enable people to do the 20% of the work they do best, tasks that involve analysis and creativity.

The Transformative Power of Cloud Analytics

Booz Allen is the leader in the emerging field of cloud analytics. Our unique approach combines cloud and other technologies with superior analytic tradecraft to create breakthroughs in how organizations capture, store, correlate, pre-compute, and extract value from large sets of data.

To understand the power of cloud analytics, it helps to see the progression from basic data analytics performed in most organizations today. As an infrastructure is built out along the continuum to cloud analytics, the size and scale of data it can process increases along with the ability to drive performance and improve decision making.

Basic data analysis usually happens in core business functions with smaller datasets. Reports are usually created on a “one-off” basis, limited to distribution within a specific department to support routine decision making.

Analysis using standard cloud computing solutions extends basic analytic techniques to large or very large datasets. This is a logical entry point for cloud solutions because cloud technology is the most efficient, cost-effective way to run analytics on large amounts of data.

Advanced analytics is where predictive capabilities are brought into the mix. It’s generally used to evaluate the future impact of strategic decisions. However, it represents a step back in terms of the size of datasets that can be manipulated.

Cloud analytics transcends the limits of the other forms of analysis. It delivers insights to answer previously unanswerable questions such as:

▶▶ How can we gain competitive advantage in our market space?
▶▶ Where can we save money within our organization?
▶▶ How should we turn our data into a product?

From Data to Digital Enterprise

Booz Allen clients have a wide variety of data analysis challenges and IT infrastructures. Our flexible, scalable Cloud Analytics Reference Architecture has three stages or entry points to accommodate these differences.

In each stage, we enable shifts in technology investments while helping manage risk and maximize the reward. That means leveraging the assets you already own and taking logical steps to add what’s needed. This is the only way to build a structure to instrument data so you can truly experience breakthrough analytics.

STAGE 1: IT EFFICIENCIES
The focus is on saving money and reducing risk. You may have already begun some of these initiatives; we leverage what’s working now as we discover new ways to increase efficiency.
▶ Data center consolidation
▶ Server and data consolidation
▶ Increased automation
▶ Modernized security posture and metrics
▶ Reduced licensing costs

STAGE 2: SMART DATA
We begin to modernize applications to handle the demands of advanced analytics. Faster, reusable, and more intuitive applications will enable everyone in your organization to work smarter.
▶ Enhanced Enterprise Data Architecture
▶ Clarify pedigree (data tagging)
▶ Multidimensional indexing
▶ Adopt distributed database
▶ Reusable applications

STAGE 3: CLOUD ANALYTICS
Significant improvements in performance are realized when you achieve success in managing the flow of information at scale and derive the fullest value from your data.
▶ Create deep insight into relevant mission data at scale
▶ Ask and answer previously unanswerable questions

[Figure axis: IT Maturity]

A Better Approach

As the leader in cloud analytics, Booz Allen has a proven approach delivered by some of the industry’s best talent. Here’s why we’re different:

Technical framework
Our Architecture combines the collective experience of thousands of people who have road tested technologies from across the cloud solution landscape in hundreds of client organizations, ranging from the U.S. Federal Government to commercial and international clients.

Best practices
We have an exclusive set of lessons learned and breadth of technical knowledge that saves time and money while reducing risk.

Core principles
These are “rules of the road” we’ve developed to build the most effective solution with the highest return on investment. They encompass everything from how data should be stored to how to improve relationships with the end users of your data.

Critical skill sets
We bring technologists as architecture and solutions specialists, domain experts who know your industry and your data, and data scientists who explore and examine data from disparate sources and recommend how best to use it. No one else in the industry offers a better combination of talent.

Vendor neutrality
Our approach utilizes a broad ecosystem of products and custom systems culled from an exhaustive survey of available options. In the crowded, fragmented, and continually evolving landscape of cloud solutions, we recommend only the best fit and value for your organization.

A LOOK AHEAD

Section 2.0
Differentiation: Introduces and diagrams the Architecture, and explains how it reflects Booz Allen’s unique approach. You’ll also read about our core design principles, extensive service offerings, and technology choices. Pages 10–15

Section 3.0
Depth: Takes the Architecture apart layer by layer with detailed visuals, design concepts, and recommended solutions from the cloud vendor landscape. The section ends with a look at how security is built into all levels. Pages 16–30

Section 4.0
Successes: Presents real-world examples from our extensive file of case studies. We present the solutions and challenges, describe and diagram the implementations, and explain the results. Pages 31–35

2.0
Differentiation




in this section


Booz Allen’s approach to cloud
analytics is unmatched in the
industry. Read about our unique
principles and best practices.
A Layered Framework

The Booz Allen Cloud Analytics Reference Architecture incorporates a wide range of services to move from a technology infrastructure with chaotic, distributed data burdened by noise to large-scale data processing and analytics characterized by speed, precision, security, scalability, and cost efficiency.

However, Booz Allen’s approach is about much more than infrastructure. We start with your need to make better sense and better use of your mission data, and build from there.

[Figure: layers of the Cloud Analytics Reference Architecture, listed top to bottom]

Human Insights and Actions
Enabled by customizable interfaces and visualizations of the data
▶ Visualization, Reporting, Dashboards, and Query Interface

Analytics and Services
Your tools for analysis, modeling, testing, and simulations
▶ Services (SOA)
▶ Analytics and Discovery
▶ Views and Indexes
▶ Streaming Indexes

Data Management
The single, secure repository for all of your valuable data
▶ Data Lake
▶ Metadata Tagging
▶ Data Sources

Infrastructure
The technology platform for storing and managing your data
▶ Infrastructure/Management

A FRAMEWORK FOR SECURITY
Page 30 details our security processes

Booz Allen Cloud Analytics Service Offerings

Cloud strategy and economics
Delivery of strategy, technology, and economic analysis for evaluating and planning all of the business, technical, operational, and financial aspects of a cloud transition

Cloud application migration
Expertise in the assessment, prioritization, architectural mapping, re-engineering, and optimization of workloads that have high value and are ready for migration to the cloud

Advanced cloud analytics
Delivery of scalable analytics platforms allowing the processing of information at extreme scale; and eDiscovery: high-volume, full text indexing, and context-based search of information

Infrastructure
Design and implementation of IaaS offerings to provide global access to data storage, computing, and networking services on demand through self-service portals

Cloud security
Unified risk management approach to define cloud security requirements, controls, and a continuous monitoring framework to address data protection, identity, privacy, regulatory, and compliance risks

Software and platform
Expertise in the secure implementation of SaaS and PaaS service delivery models, data migration, and integration with existing enterprise infrastructure and applications

VDI deployment and integration
Delivery of flexible and dynamic virtual desktop infrastructure to simplify management, reduce licensing costs, and increase desktop security and data protection requirements

Data center migration and optimization
Identify critical factors, design, and execute the transformation of legacy IT systems to virtualized and cloud computing environments

Complex Ecosystem

We’ll help you navigate the crowded, fragmented, and continually evolving vendor ecosystem to design a best-of-breed solution for your organization.

[Figure: vendor ecosystem grouped by layer: Human Insights and Actions; Analytics and Services; Data Management; Infrastructure; Data Integration]

Booz Allen Cloud Analytics Core Principles

In-situ processing
The Architecture demands that “you send the question to the data,” because most big data processes are disk I/O-bound. In-situ processing means that most of the computation is done locally to the data, so that analytics run faster. This can enhance existing analytic capabilities and/or allow you to ask entirely new types of questions.

Use commodity hardware
Hardware should be expected to fail as the normal condition. The Architecture supports both scalability and fault tolerance to achieve optimal application load balancing.

Schema on read
If you have all the source data indexed and queryable, plus the ability to create aggregations, then you can manage complex ontologies and demands in a very efficient manner.

Throw away nothing
Near-linear scalable hardware and software systems allow much more data to be stored, which enables reprocessing of historical data with new algorithms and correlations that bring new insights.

Economies of scale
What used to be called service-oriented architecture (SOA) means that you can define the value and cost of services in your enterprise, and plan your development actions, either to reduce the cost of low-value components or increase the scale of high-value components.

Change development process
In order to develop a tight, iterative relationship with your end users, you can develop/research a new capability in hours (not months), and the process of discovery and integration with the rest of the enterprise begins much sooner, too.

Data tagging
You can now afford to tag all of your data for sensitivity or other controls (such as geographic). This is the fastest, most reliable way to instrument change across your entire Data Lake.
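
Three of the principles above (throw away nothing, data tagging, and schema on read) can be illustrated together with a small sketch. It is a toy in-memory model under stated assumptions, not the Architecture's actual implementation: the `DataLake` and `RawRecord` classes, the `sales_view` parser, and the tag names are all hypothetical and chosen only for illustration.

```python
import json
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, FrozenSet, Iterator, List


@dataclass(frozen=True)
class RawRecord:
    """A record kept exactly as received, plus its control tags (hypothetical)."""
    payload: str                                   # raw text, never rewritten
    tags: FrozenSet[str] = field(default_factory=frozenset)


class DataLake:
    """Toy in-memory stand-in for a Data Lake: append-only, no schema on write."""

    def __init__(self) -> None:
        self._records: List[RawRecord] = []

    def ingest(self, payload: str, *tags: str) -> None:
        # Throw away nothing: store the raw payload and tag it at ingest time.
        self._records.append(RawRecord(payload, frozenset(tags)))

    def read(self, schema: Callable[[str], Dict[str, Any]],
             allowed_tags: FrozenSet[str]) -> Iterator[Dict[str, Any]]:
        # Data tagging: release only records whose tags are all permitted.
        # Schema on read: the caller's parser gives the raw text its structure.
        for record in self._records:
            if record.tags <= allowed_tags:
                yield schema(record.payload)


def sales_view(payload: str) -> Dict[str, Any]:
    """One possible schema, applied at query time rather than at load time."""
    doc = json.loads(payload)
    return {"region": doc.get("region", "unknown"),
            "amount": float(doc.get("amt", 0))}


lake = DataLake()
lake.ingest('{"region": "EU", "amt": "12.5", "note": "free text"}', "geo:eu")
lake.ingest('{"region": "US", "amt": 40}', "geo:us", "sensitive")

# A reader without the "sensitive" tag sees only the first record.
public_rows = list(lake.read(sales_view, frozenset({"geo:eu", "geo:us"})))
# A reader cleared for every tag sees both records.
all_rows = list(lake.read(sales_view, frozenset({"geo:eu", "geo:us", "sensitive"})))
```

Because the raw payloads are never altered, a different view (for example, one that also extracts the `note` field) could be added later without reloading any data, which is the practical benefit the schema-on-read principle describes.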
Deeper Insights

The Cloud Analytics Reference Architecture enables staff at all levels to quickly gather and act on granular insights from all of your available data, regardless of its format or location. Below are some of the ways human insights and actions are enhanced by this new framework, which fosters greater collaboration and teamwork, and, ultimately, delivers the highest business value from your information and your computing infrastructure.




Human Insights and Actions
Decisionmakers, Investigators, Interdictors, and Analysts
▶▶ Real-time alerting, situational awareness, and dissemination specific to their clearance level
▶▶ Investigate and provide feedback on reporting
▶▶ Interact and search using tailored tools

Analytics and Services
Analysts and Data Scientists
▶▶ Create and use many views into the same data
▶▶ Automatically find trends and outliers
▶▶ Evaluate analysis methods to determine and enhance best-of-breed tradecraft

Data Management
Developers and Data Scientists
▶▶ No longer constrained by years-old schemas
▶▶ Catalog and index the data that is relevant today
▶▶ Free to create new views and reporting metrics
▶▶ Reference undiscovered trends in original data
▶▶ Apply advanced machine learning and statistical methods
▶▶ In-situ hypothesis testing

Infrastructure
System Administrators and IT Staff
▶▶ Reduce IT costs through commoditization and economies of scale
▶▶ Meet long-term scalability requirements




3.0 | Depth

In this section


We diagram and describe each layer
of the Cloud Analytics Reference
Architecture, including our design
principles and technology choices.
Reference Architecture

Booz Allen’s Cloud Analytics Reference Architecture provides a holistic approach to people, processes, and technology in four tightly integrated layers.

layer 1
Human Insights and Actions
Building on results and outputs from various analytical methods, multiple data visualizations can be created in your new cloud analytics solution. These are used to compose the interactive, real-time dashboard interfaces your decision-makers and analysts need to make sense of your data.

layer 2
Analytics and Services
Both traditional and “Big Data” tools and software can operate on the information stored in your Data Lake, producing the advanced, specific analysis, modeling, testing, and simulations you need for decision making.

layer 3
Data Management
Your Data Lake is a secure, distributed repository of a wide variety of data sources. Security, metadata, and indexing of Big Data are enabled by distributed key value systems (NoSQL), but the Architecture allows for traditional relational databases as well.

layer 4
Infrastructure
This foundational layer allows for quick, streamlined, low-risk deployment of the cloud implementation. The plug-and-play, vendor-neutral framework is unique to Booz Allen.

Key Attributes

By design, the Booz Allen Cloud Analytics Reference Architecture:

▶▶ Is reliable, allowing distributed storage and replication of bytes across networks and hardware that is assumed to fail at any time
▶▶ Allows for massive, world-scale storage that separates metadata from data
▶▶ Supports a write-once, sporadic append, read-many usage structure
▶▶ Stores records of various sizes, from a few bytes up to a few terabytes in size
▶▶ Allows compute cycles to be easily moved to the data store, instead of moving the data to a processor farm

A FRAMEWORK FOR SECURITY
Page 30 details our security processes




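layer 1

Human Insights and Actions
architecture model

[Figure: architecture model of the Human Insights and Actions layer, which sits above Analytics and Services. Legible labels: Monitoring, Exploratory, Geospatial, Line Chart.]
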
Human Insights and Actions (continued)

PRINCIPLES AND TECHNOLOGIES

In analytics solutions built on the Architecture, the data that’s available and the desired results drive the interfaces, not the other way around. When user communities and stakeholders aren’t restricted by their tools, they can perform complex visualizations to identify patterns they previously couldn’t see.

That freedom defines the guiding principles behind this first layer of the Architecture:

▶▶ Design and build the framework so that the desired data and analytic results define the visualization
▶▶ Reuse results and outputs of analytics across different visualizations
▶▶ Decouple the underlying analytics and data access from the visualizations and interfaces so that it’s possible to build customized, interactive dashboard interfaces composed of dynamically linked visualizations

TECHNOLOGY EXAMPLES

HTML5, JavaScript, OWF, Synapse
    Lightweight, custom web-based applications and dashboards tailored to specific user communities or stakeholders for data exploration, event alerting, and monitoring, as well as continuous quality improvement

Commercial products (Splunk, Pentaho, Datameer Business Infographics, etc.)
    Out-of-the-box, easy-to-build dashboards for historical trending and real-time monitoring to analyze user transactions, customer behavior, network patterns, security threats, and fraudulent activity

Adobe Flex and Adobe Flash
    Despite the rise of HTML5, Adobe Flex and Flash applications still remain strong candidates for quickly building and deploying rich user interfaces
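
The decoupling and reuse principles for this layer can be sketched in a few lines of Python. This is an illustrative assumption, not Booz Allen's implementation: `events_per_hour`, `as_table`, and `as_bars` are hypothetical names, and plain text stands in for real dashboard widgets. The analytic is computed once and knows nothing about rendering; each view consumes the same result.

```python
from collections import Counter


def events_per_hour(events):
    """Analytic: count events by hour. It knows nothing about rendering."""
    return dict(sorted(Counter(e["hour"] for e in events).items()))


def as_table(result):
    """View 1: a plain-text table of the result."""
    return "\n".join(f"{hour:02d}:00  {count}" for hour, count in result.items())


def as_bars(result):
    """View 2: a text bar chart built from the same result."""
    return "\n".join(f"{hour:02d}:00  {'#' * count}" for hour, count in result.items())


events = [{"hour": 9}, {"hour": 9}, {"hour": 14}]
result = events_per_hour(events)                            # computed once
dashboard = [view(result) for view in (as_table, as_bars)]  # reused by both views
print("\n\n".join(dashboard))
```

Adding a third visualization means writing one more view function; the analytic and the data access stay untouched.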




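layer 2

Analytics and Services
architecture model

[Figure: architecture model of the Analytics and Services layer, which sits below Human Insights and Actions. Legible labels: Time Series; Social Network Analysis; R, SAS, Matlab, Mathematica; MapReduce, Hive, Pig, Hama.]
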
Analytics and Services (continued)

PRINCIPLES AND TECHNOLOGIES

Frequently where data is concerned, the whole is greater than the sum of its parts. In the most strategic business decisions, the ability to combine multiple types of analyses creates a holistic picture that can lead to much more valuable insight. With the Cloud Analytics Reference Architecture, you can implement different types of analytical methods.

This integrated approach is an anchor for the guiding principles behind our Analytics and Services layer:

▶▶ Allow both traditional and Big Data analysis tools and software to operate on a centralized repository of data (the Data Lake)
▶▶ Integrate results and outputs of analyses and visualize them on dashboards for decision making
▶▶ Decouple tools from the various types of analyses to make the system more extensible and adaptable
▶▶ Include a service-oriented architecture layer to reuse results and outputs in many different ways relevant to different stakeholders and decisionmakers
▶▶ Incorporate Certified Catastrophe Risk Analysis (CCRA) to allow a variety of data analysis tools and software to be integrated and used; it also enables results and outputs of analyses to be visualized and used across multiple interfaces

TECHNOLOGY EXAMPLES

Data Mining
    Data mining is used to discover patterns in large datasets and draws from multiple fields including artificial intelligence, machine learning, statistics, and database systems.

Machine Learning
    Machine learning is used to learn classifiers and prediction models in the absence of an expert and employs many algorithms in the areas of decision trees, association learning, artificial neural networks, inductive logic programming, support vector machines, clustering, Bayesian networks, genetic algorithms, reinforcement learning, and representation learning.

Natural Language Processing (NLP)
    NLP is used to process unstructured and semi-structured documents for the purposes of information retrieval, sentiment analysis, statistical machine translation, and classification.

Network Analysis
    Network analysis, using graph theory and social network analysis, is used to understand associations and relationships between entities of interest.

Statistical Analysis
    Traditional statistical methods using univariate and multivariate analysis on relatively small datasets are employed to make inferences, test hypotheses, and summarize data.
layer 2




Analytics and Services                                                (continued)



discovering your data



Before working with Booz Allen, most clients faced a fundamental
challenge with data discovery. They didn’t know what data was actually
available or how to sort through all of it to identify the most important
business problems or trends it could reveal.

Technical framework

[Figure: Search, Analytics, and Discovery shown as three interrelated activities.]

Discovery is intimately related to search and analysis. All three feed into insight in a nonlinear fashion. A search-discovery-analytics process that solves business problems without consuming disproportionate resources meets these user needs:

▶▶ Real-time, ad hoc access to content
▶▶ Aggressive prioritization based on importance to the user and the business
▶▶ Data-driven decision making, which relies on the ability to try different approaches and ideas in order to discover previously unimagined insights
▶▶ Feedback/learning from the past intelligently applied to today’s data

How Booz Allen simplifies discovery

Other solutions require analysts to break down data into numerous subsets and samples before it can be digested. This expensive, time-consuming process is one of the major roadblocks to turning data into true business intelligence.

Even though the Booz Allen Cloud Analytics Reference Architecture supports the most advanced analysis, it can also allow your staff to sift through all of your data on a basic level. Without tedious or sophisticated sampling and complex tools, they can discover what’s useful and what’s not useful for a specific business problem.

How does the Architecture support fast, efficient, and scalable search on entire datasets, not just samples?

▶▶ Bulk and soft real-time indexing enable the solution to handle billions of records with subsecond search and faceting
▶▶ Large-scale, cost-effective storage and processing capabilities accommodate “whole data” consumption and analysis; in-memory caching of critical data ensures applications meet performance requirements
▶▶ NLP and machine learning tools can scale to enhance discovery and analysis on very large datasets
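
As a toy illustration of search with faceting over a whole dataset rather than a sample, the Python sketch below builds an in-memory inverted index. It is a teaching aid under stated assumptions: the four sample documents, the `index` structure, and the `search` function are invented for this example, and a production system would use a distributed search engine rather than a Python dictionary.

```python
from collections import Counter, defaultdict

# Hypothetical documents standing in for records in the Data Lake.
docs = [
    {"id": 1, "text": "malware detected on server", "source": "system logs"},
    {"id": 2, "text": "fraud alert on account", "source": "transaction logs"},
    {"id": 3, "text": "malware signature updated", "source": "security logs"},
    {"id": 4, "text": "server malware scan clean", "source": "system logs"},
]

# Index every document, not a sample: term -> set of document ids.
index = defaultdict(set)
by_id = {}
for doc in docs:
    by_id[doc["id"]] = doc
    for term in doc["text"].lower().split():
        index[term].add(doc["id"])


def search(term, facet_field="source"):
    """Return matching ids plus a count of matches per facet value."""
    hits = sorted(index.get(term.lower(), set()))
    facets = Counter(by_id[i][facet_field] for i in hits)
    return hits, dict(facets)


hits, facets = search("malware")
print(hits, facets)
```

The facet counts tell a user where the matches come from before they open a single record, which is what makes sifting through all of the data practical.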
Analytics and Services                                             (continued)



HOW THE DATA SCIENCE LIFECYCLE DISTILLS INSIGHTS




The data science lifecycle consists of three basic steps:



Step 1
First, data is sampled using a cloud analytics platform. This step may involve a sophisticated analytic that runs in the cloud, such as one that crawls a social network to find people with certain types of relationships with an individual or organization. This sampling can be done using either high-level query languages that are specially made for scalable cloud analytics or low-level developer interfaces.



Step 2
Next, a data scientist models the data sample in order to understand it better. This is usually done using
a statistical modeling environment on the data scientist’s workstation.




Step 3
Finally, once a trend is established using the model, the data scientist works with analysts and domain
experts to explain the trend and yield insights.




This cycle is repeated until the data science team reaches actionable insights and intelligence that
can be presented to senior leadership for decision making purposes. Information may be delivered in
a visualization, dashboard, or written report.
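
The three steps can be walked through with a small, self-contained Python sketch. Everything here is an assumption made for illustration: the dataset is synthetic, a simple random sample stands in for the cloud-side sampling analytic, and an ordinary least-squares line stands in for whatever statistical model a data scientist would actually choose.

```python
import random
from statistics import mean

# Stand-in for the full dataset held in the cloud platform (synthetic values).
random.seed(7)
full_data = [(day, 100 + 2.5 * day + random.uniform(-5, 5)) for day in range(10_000)]

# Step 1: sample the data (here, a simple random sample).
sample = random.sample(full_data, 500)

# Step 2: model the sample (here, an ordinary least-squares line).
xs, ys = zip(*sample)
x_bar, y_bar = mean(xs), mean(ys)
slope = (
    sum((x - x_bar) * (y - y_bar) for x, y in sample)
    / sum((x - x_bar) ** 2 for x in xs)
)
intercept = y_bar - slope * x_bar

# Step 3: state the trend in plain terms for analysts and domain experts.
summary = f"Values rise by about {slope:.1f} per day from a baseline near {intercept:.0f}."
print(summary)
```

In practice the loop runs many times: the explanation from Step 3 usually suggests a different sample or a different model for the next pass.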




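layer 3

Data Management
architecture model

[Figure: architecture model of the Data Management layer, which sits below Human Insights and Actions and Analytics and Services. Legible labels: Batch, Structured, Unstructured.]
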
Data Management (continued)

PRINCIPLES AND TECHNOLOGIES

A central feature of the Architecture, the Data Lake delivers on the promise of cloud analytics to offer previously hidden insights and drive better decisions. It’s a secure repository for data of all types and origins. Instead of precategorizing data, which restricts its usability from the moment it enters your organization, the Architecture combines unstructured, structured, and streaming data types and makes them available for many different forms of analysis.

The following principles demonstrate how the Architecture enables your organization to use this repository of enterprise data to the best advantage:

▶▶ Provide inherent replication of the data through a distributed file system
▶▶ Use distributed key value (NoSQL) data storage to enable security and metadata tagging at the data level as well as indexing for specialized retrieval
▶▶ Relax schema constraints and provide the flexibility to adapt to changing data sources and types with the schema-on-read approach of distributed key value data storage
▶▶ Store the Data Lake on commodity hardware and scale linearly in performance and storage
▶▶ Don’t presummarize or precategorize data
▶▶ Enable rapid ingest of data, aggressive indexing, and dynamic question-focused datasets through scale

TECHNOLOGY EXAMPLES

Hadoop Distributed File System (HDFS)
    The primary open-source, distributed storage system creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, rapid computations.

Accumulo
    NoSQL store based on Google’s BigTable design features cell-level security access labels and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.

HBase, Cassandra, MongoDB
    Open-source NoSQL databases focused on a combination of consistency, availability, and partition tolerance.

Neo4j
    NoSQL scalable graph database storing data in nodes and the relationships of a graph.
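
Two of the principles above, security tagging at the data level and schema-on-read, can be illustrated with a tiny in-memory key/value store in Python. This is a simplified sketch, not the Accumulo API: the `store` dictionary and the `put` and `scan` functions are hypothetical, and a single label per cell stands in for the richer visibility expressions a real system supports.

```python
import json

# Each cell: (row, column, visibility label) -> raw value, stored as ingested.
store = {}


def put(row, column, visibility, raw_value):
    """Write a raw value without validating it against any schema."""
    store[(row, column, visibility)] = raw_value


def scan(authorizations, parse=json.loads):
    """Yield parsed cells the caller is authorized to see.

    The schema is applied here, at read time, by the `parse` function.
    """
    for (row, column, visibility), raw in sorted(store.items()):
        if visibility in authorizations:
            yield row, column, parse(raw)


put("patient:17", "visit", "medical", '{"date": "2013-04-02", "clinic": "A"}')
put("patient:17", "billing", "finance", '{"amount": 120.0}')
put("patient:18", "visit", "medical", '{"date": "2013-04-05"}')  # new shape, no migration

clinician = list(scan({"medical"}))
auditor = list(scan({"medical", "finance"}))
print(len(clinician), len(auditor))
```

Because values are stored raw and interpreted only when read, a new data source with a different shape can be ingested immediately, and the label on each cell decides who sees it.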




layer 3




Data Management                                                (continued)


DATA LAKE


Booz Allen works with organizations in corporate and government sectors that have an urgent
need to make sense of volumes of data from diverse sources, including those that had been
inaccessible or extremely difficult to utilize, such as streams from social networks. Now analysts
and decisionmakers can form new connections between all of this data to uncover previously
hidden trends and relationships.




[Figure: Data Lake diagram. Labels: Enterprise Data; Machine-to-machine communication; Transaction logs; Sensor Data; Intrusion and malware detection; Fraud Detection; Cyber; Government; Defense; Finance; Healthcare; Security Logs; Regulatory Compliance; Enhanced Situational Awareness; Enhanced Forecasting; Email; Quarterly Filings; Reports; System Logs; Financials; Press Articles]

Individual organizations require different types of data. Not all types of data listed above may apply to every organization.
                                                         Enhanced Media Content
                          Booz Allen’s strategy and technology consultants are highly regarded subject matter experts. Through groundbreaking
                                conference keynotes, whiteboard talks, and papers, they help educate and shape the analytics industry.




                            We invite you and your team to take advantage of the educational resources listed below to gain strategic insights
                                     about the use of analytics, explore technical topics in depth, and stay on top of the latest trends.




Presentations

Yahoo! Hadoop Summit: Biometric Databases and Hadoop
Invented and demonstrated methods for dense data correlation (e.g., imagery and biometrics) within a Hadoop distributed computing platform using new machine learning parallel methods.

Yahoo! Hadoop Summit: Culvert—A Robust Framework for Secondary Indexing of Structured and Unstructured Data
Demonstration of Booz Allen’s secondary indexing solutions and design patterns, which support online index updates as well as a variation of the HIVE query language over Accumulo and other BigTable-like databases to allow indexing one or more columns in a table.

Slidecast: Hadoop World—Protein Alignment
Demonstration of advanced analytics in using protein alignment sequences to identify disease markers using Hadoop, HBase, Accumulo, and novel machine learning concepts.

Slidecast: Innovative Cyber Defense with Cloud Analytics
Presentation on improving intelligence analysis through a hybrid cloud approach to analytics, with descriptions and diagrams from Booz Allen client solutions.

Slidecast: Integrating Tahoe with Hadoop’s MapReduce
Invented and demonstrated method to use least-authority encrypted file system as plugin to HDFS within Hadoop cluster.

Videos

Cloud Whiteboard Playlist
Short instructional videos on a range of topics from introductory talks for executives to tutorials for data analysts. Check back frequently for new material.

  Cloud Analytics for Executive Leadership
  Booz Allen Principal Josh Sullivan discusses how analysis of data can be used as a tool to provide insight to executives.

  Informed Decision Making: Sampling Techniques for Cloud Data
  Booz Allen Data Scientist Ed Kohlwey explains how sampling large amounts of data can be useful for program managers to make informed decisions.

  Developer Perspectives: The FuzzyTable Database
  Booz Allen Data Scientist Drew Farris explains how to use the FuzzyTable biometrics database.

Workshop

O’Reilly Strata Conference: Beyond MapReduce—Getting Creative with Parallel Processing
Technical discussion of MapReduce as an excellent environment for some parallel computing tasks and the many ways to use a cluster beyond MapReduce.

Papers

Massive Data Analytics in the Cloud
Overview of the business impact of cloud computing, and how data clouds are shaping new advances in intelligence analysis.

Learn More

Scan the QR code, or go directly to:
boozallen.com/cloud




layer 4




Infrastructure
architecture model

  Human Insights and Actions

  Analytics and Services




                               Virtual Desktop
                                  Integration
Infrastructure                           (continued)



PRINCIPLES AND TECHNOLOGIES




Infrastructure is the foundation for any cloud implementation. What makes the Booz Allen Cloud Analytics Reference Architecture unique is its plug-and-play, vendor-neutral framework. This framework not only allows a greater range of choices in selecting resources and building services, it also allows for a faster, more streamlined, more secure, and lower risk deployment.

The following principles guide the infrastructure layer of the Architecture:

▶▶ Make it easy to transform physical resources from legacy IT systems to secure, virtualized data centers and trusted cloud computing environments
▶▶ Implement core services to provide the mechanisms to realize on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service
▶▶ Employ virtualization to increase utilization of existing assets and resources, and improve operational effectiveness
▶▶ Engineer in-depth security to provide controls and continuous monitoring in order to fully address data protection, identity, privacy, regulatory, and compliance risks

TECHNOLOGY EXAMPLES

Amazon Web Services, Microsoft Azure, Puppet, VMware, vSphere   Cloud tool chain for provisioning, configuration, orchestration, and monitoring of virtual environment. These tools provide the building blocks for IaaS, PaaS, and foundation for SaaS. Run multiple operating systems and virtual network platforms on the same hardware, sharing computing, storage, and networking resources.

Security through VMware, McAfee, Symantec, Cisco, TripWire, EnCase   Protect assets (physical, logical, and virtual) while automating governance and compliance.




Reference Architecture
Security Framework

a framework for security

The Architecture is designed to protect your data at rest and in flight, with security controls embedded in each layer. This is obviously more than just a technology challenge. We understand the need to embed new processes and training regimens so your staff handles sensitive data correctly. We also advise you on how to secure your facilities and ensure that all off-premise facilities have the right controls in place as well.

Business
Assets to Be Protected                                      Geography
Threats and Processes that Require Security                 Distributed Sites | Remote Workers | Jurisdictions
Organizational Security                                     Time Dependencies
Governance | Supply Chain | Strategic Partnerships          Transaction Throughput | Lifetimes and Deadlines

Business Layer
Conceptual
Business Attributes                                         Technical and Management Security Strategies                Roles and Responsibilities
Security Requirements to Support the Business               Trust Relationships                                         Time Dependencies
Control and Enablement Objectives                           Security Domains, Boundaries, and Associations              When Is Protection Relevant?
Resulting from Risk Assessment

Business Layer
Logical
Business Information to Be Secured                          Interrelationships                                          Security Services
Security and Risk                                           Attributes                                                  Authentication Confidentiality and Integrity
                                                                                                                        Protection | Strategic Partnerships
Security Entities                                           Management Policy

Business Layer
Physical
Security-Related Data Structures                            Security Mechanisms                                         Human Interface
Tables | Messages | Pointers | Certificates | Signatures    Encryption | Access Control | Digital Signatures            Screen Formats | User Interaction
                                                                                                                        Access Control | Systems
Security Rules                                              Security Infrastructure
Conditions | Practices | Procedures                         Physical Layout of Hardware, Software, and                  Time Dependencies
                                                            Communication Lines                                         Sequence of Processes and Sessions

Business Layer
Component
Security IT Products                                        Personnel Management Tools                                  Time Dependencies
Risk Management                                             Identities | Roles | Functions | Access Controls | Lists    Time Schedules | Clocks | Timers and Interrupts
Tools for Monitoring and Reporting
Security Process                                            Locator Tools
Tools, Standards, and Protocols                             Dynamic Inventory of Nodes | Addresses and Locations

Business Layer
Service Management
Service Delivery Management                                 Management of Security Operations                           Management of Environment
Assurance of Operational Continuity                         Admin | Backups | Monitoring | Emergency Response           Buildings | Sites | Platforms and Networks
Operational Risk Management                                 Personnel Management                                        Management Schedule
Risk Assessment | Monitoring and Reporting                  Account Provisioning | User Support | Management            Security-Related | Calendar and Timetable
Case Studies

4.0 Successes
in this section

These case studies show how Booz Allen uses superior technology and analytics expertise to solve complex problems for clients in a wide range of corporate and government sectors.

Case Studies: Example One

Improving Intelligence Analysis

Mission
To fulfill their mission, this organization requires data correlation, quick access to analytic results, ad-hoc queries, advanced scalable analytics, and real-time alerting. To provide their analysts with a continuous pipeline of prioritized, actionable information, they needed a secure, scalable, automated solution that would more quickly and precisely sift through large (and growing) volumes of complex data characterized by a variety of formats and noise. In addition, they needed to leverage their existing analytics infrastructure in the new platform.

Solutions
Booz Allen worked closely with the client to adopt a data cloud implementation by augmenting the legacy relational databases with cloud computing and analytics. The design focused on keeping transactional-based queries in the current relational databases, while doing the “heavy lifting” in the cloud and outputting the interesting, processed, or desired analytic results into relational data stores for quick transactional access.

With many existing systems and applications dependent on the legacy relational database for transactional queries of data, Booz Allen pulled together excess servers from the client’s infrastructure to build a hybrid cloud solution. Also, as the client’s needs change to adapt to the mission, the solution is scalable and flexible to support future innovation and evolution without reengineering.

Interfaces and Visualizations
Dashboards, web applications, client applications, and rich clients interfaced and integrated with advanced analytics infrastructure and legacy relational databases through a SOA business logic layer.

Analytics and Services
The solution called for predictive analytics to forecast potential events from existing data and anomaly detection to extract potentially significant information and patterns. The solution leveraged the core principle of cloud analytics that enables automated analysis techniques, precomputation, and aggressive indexing.

Data Management
The data sources had multiple formats, were large in size, and distressed with noise. The solution created deep insight through fusion of different data types at scale. The solution enabled the ability to follow the lineage or pedigree of the data, allowing the client to map cost in relation to the value of the data or how well it is being used.

Infrastructure
The solution used Accumulo (distributed key value systems/NoSQL database) for content normalization and indexing, MapReduce as the precomputation engine, and HDFS for scalable ingest and storage.




                         Impact
                         Rather than simply focus on gaining IT efficiencies by using cloud technology for infrastructure, Booz Allen focused on applying cloud analytics
                         and in-depth understanding of the organization’s operational and mission needs to extract more value faster from massive datasets. The
                         new cloud solution provided immediate and striking improvements across the increasing volume of structured and unstructured data using
                         aggressive indexing techniques, on-demand analytics, and precomputed results for common analytics.

                         The final solution combined sophistication with scalability, moving the organization from a situation in which analysts stitched together sparse
                         bits of data to a platform for distilling real-time, actionable information from the full aggregation of data.
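
The hybrid pattern described in this case study (bulk precomputation in the cloud, with results written back to a relational store for quick transactional access) can be sketched as follows. This is a minimal illustration under stated assumptions, not the client solution: plain Python functions stand in for MapReduce jobs, an in-memory SQLite database stands in for the legacy relational database, and the record fields and the `event_counts` table are hypothetical.

```python
import sqlite3
from collections import Counter
from itertools import chain


def map_phase(record):
    """Emit (key, 1) pairs; the key here is a hypothetical (source, event_type)."""
    yield (record["source"], record["event_type"]), 1


def reduce_phase(pairs):
    """Sum the counts per key, as a reducer would."""
    totals = Counter()
    for key, count in pairs:
        totals[key] += count
    return totals


def publish(totals, conn):
    """Write precomputed results to a relational table for fast lookups."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS event_counts ("
        "source TEXT, event_type TEXT, total INTEGER, "
        "PRIMARY KEY (source, event_type))"
    )
    conn.executemany(
        "INSERT OR REPLACE INTO event_counts VALUES (?, ?, ?)",
        [(source, event_type, n) for (source, event_type), n in totals.items()],
    )
    conn.commit()


# Hypothetical raw records; in practice these would be read from bulk storage.
records = [
    {"source": "feed_a", "event_type": "alert"},
    {"source": "feed_a", "event_type": "alert"},
    {"source": "feed_b", "event_type": "login"},
]

conn = sqlite3.connect(":memory:")
publish(reduce_phase(chain.from_iterable(map(map_phase, records))), conn)

# Transactional consumers query only the small, precomputed table.
row = conn.execute(
    "SELECT total FROM event_counts WHERE source = ? AND event_type = ?",
    ("feed_a", "alert"),
).fetchone()
```

The heavy pass over the raw records happens once, in batch; existing applications keep issuing ordinary SQL against a table that is cheap to query.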
Case Studies: Example Two

Planning and Responding to Disaster

[Figure labels: Geotagging; Link Analysis; Risk Scoring; Predictive Modeling; Provider Profile; Clean, Validate, Normalize, Integrate; Financial; Provider Registration; Geolocation; Licensing; Exclusion; Claims; Online Activities; Cases/Rulings]

Mission
This organization, which is responsible for disaster planning and response, found that social media could provide timely situational awareness for biological (and other disaster) events. They wanted a solution to better characterize and forecast emerging disaster events using social media data as it streams in real time. With such a solution in place, the organization could increase overall preparedness by leveraging event characterization to accurately predict the impact and improve the response.

In order to reach their goal, the organization needed higher levels of confidence in the social media data on which they would base their decisions. The specific challenges the new solution had to overcome included data ingestion and normalization, social media vocabulary, social media characterization, information extraction, and geographical isolation of events.

Solutions
Booz Allen developed a framework to capture, normalize, and transform open-source media used to characterize and forecast disaster events, in real time. The framework incorporated computational and analytical approaches to turn the noise from social media into valuable information using algorithms such as term frequency-inverse document frequency (TF-IDF), natural language processing (NLP), and predictive modeling to characterize and forecast the numbers of sick, dead, and hospitalized, as well as to extract symptoms, geography, and demographics for specific illness events.

The solution framework was implemented in the cloud, taking advantage of the flexible computational power and storage. The new cloud infrastructure allowed Booz Allen’s data capturing and visualization tool, Splunk, to mine through and analyze vast amounts of data in real time, while outputting characterization and forecasting metrics of captured events.

Interfaces and Visualizations
The solution included dashboards that characterized events captured in social media. The visual analyses include event extraction counts, time series counts, forecasting counts, a symptom tag cloud, and geographical isolation.

Analytics and Services
TF-IDF and NLP algorithms were used to classify and extract relevant information from the data. Booz Allen developed predictive models for forecasting event frequency and counts. The algorithms were written in Python and incorporated into Splunk located on Amazon Web Services (AWS).

Data Management
The solution framework captured live, streaming open-source media such as Twitter and RSS feeds. Data was captured in Splunk and stored on AWS.



Impact
The new Booz Allen solution, which builds upon current best practices in cyber terrorism, enables near real-time situational awareness through a standalone surveillance system that captures, transforms, and analyzes massive volumes of social media data. By leveraging social media data and analytics for more timely and accurate disaster characterization, the organization is able to more effectively plan and respond.





Cloud Playbook

  • 2.
  • 3. An enormous amount of valuable information is out there, waiting to be transformed into differentiating services. Booz Allen Hamilton uses its Cloud Analytics Reference Architecture to build technology infrastructures that can withstand the weight of massive datasets and deliver the deep insights organizations need to drive innovation.
      ▶ 1.0 | Summary: The problems of explosive data growth and how cloud analytics provide the solution (page 4)
      ▶ 2.0 | Differentiation: Introduces the Architecture and explains how Booz Allen’s unique approach to people, processes, and technology gets the job done (page 10)
      ▶ 3.0 | Depth: Takes the Architecture apart layer by layer with detailed visuals that show you how we frame a solution (page 16)
      ▶ 4.0 | Successes: Presents real-world examples from the hundreds of organizations who have successfully worked with Booz Allen to implement an analytics solution using the Architecture (page 31)
    Prefer to read this on your iPad? Search “Booz Allen” at the iTunes App Store®, or simply scan the QR code.
  • 4. 1.0 Summary. In this section: A majority of executives believe their companies are unprepared to leverage their data. We look at why that is and how to change it.
  • 5. Extracting True Insights: The Growing Data Analysis Gap
    We are living in the greatest age of information discovery the world has ever known. According to recent industry research, we now generate more data every 2 days than we did from the dawn of early civilization through the year 2003 combined. And data rates are still growing, at approximately 40% each year. Fueled in large part by the more than five billion mobile phones in use around the globe, our world is increasingly measured, instrumented, monitored, and automated in ways that generate incredible amounts of rich and complex data.
    Unfortunately, the number of big data analysts and the capabilities of traditional tools aren’t keeping pace with this unprecedented data growth. At Booz Allen, we’ve watched this trend for some time now; we call it the “data analysis gap.” It’s clear that data has outstripped common analytics tools and staffing levels. In order to move forward, organizations must be able to analyze data on a massive scale and quickly use it to provide deeper insights, create new products, and differentiate their services.
    [Figure: the Data Analysis Gap. Quantity of data plotted against time, 2009 to 2020, with regions labeled sustainable, challenging, and missed opportunities. By 2020, the amount of information in our economy will grow 44 times. Very few organizations are prepared for this wave of data. (Source: IDC)]
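    The two growth figures quoted on this slide (roughly 40% per year, and 44 times by 2020) can be checked against each other with simple compounding. The sketch below assumes annual compounding and reads the chart's time axis as an 11-year span from 2009 to 2020; both are assumptions, since the slide does not state how IDC derived the number. The function names are illustrative only.

```python
def compound_growth(annual_rate: float, years: int) -> float:
    """Total growth multiple after compounding annual_rate for the given number of years."""
    return (1.0 + annual_rate) ** years


def implied_annual_rate(multiple: float, years: int) -> float:
    """Annual rate that compounds to the given growth multiple over the given years."""
    return multiple ** (1.0 / years) - 1.0


# Assumption: the chart's axis (2009 to 2020) is treated as 11 compounding periods.
YEARS = 2020 - 2009

growth_at_40_percent = compound_growth(0.40, YEARS)
rate_for_44x = implied_annual_rate(44.0, YEARS)

print(f"40% per year for {YEARS} years: {growth_at_40_percent:.1f}x")
print(f"Annual rate implied by 44x in {YEARS} years: {rate_for_44x:.1%}")
```

    Under these assumptions, 40% annual growth compounds to a little over 40 times in 11 years, and a 44-fold increase implies a rate of about 41% per year, so the two quoted figures are roughly consistent.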
  • 6. Preparing for What’s Ahead
    The ability to compete and win in the information economy will come from powerful analytics that draw insights and value from data, and from high-fidelity visualizations that present those insights in impactful, intuitive ways. Both will become key influencers of corporate decision making and consumer purchasing.
    Many of the world’s IT systems are not ready for the technology revolution happening as organizations seek to transform how they use data. Their infrastructures face three major challenges:
      ▶ Volume: Not enough storage capacity and analytical capabilities to handle massive volumes of data
      ▶ Variety: Data comes in many different formats, which can be difficult and expensive to integrate
      ▶ Velocity: Inability to process data in real time in order to extract the most value from it
    To help organizations overcome these hurdles and prepare for what’s next, Booz Allen has pioneered strategies for the implementation of the Digital Enterprise, a way of using technology, machine-based analytics, and human-powered analysis to create competitive and mission advantage.
    Done right, analytics hosted in the cloud will help:
      ▶ Improve overall performance and efficiency
      ▶ Better understand customer and employee needs
      ▶ Translate data into actionable intelligence and faster decision making
      ▶ Reduce IT costs
      ▶ Improve scalability to handle future growth
    However, before you invest in a cloud analytics solution, you should fully understand the scope of what’s involved and engage in the proper planning to ensure that all the right elements will be in place.
    A Framework for the Future
    Booz Allen has a framework for intelligently integrating cloud computing technology and advanced analytic capabilities, called the Cloud Analytics Reference Architecture.
    The Architecture is designed to solve compute-intensive problems that were previously out of reach for most organizations, including large-scale image processing, sensor data correlation, social network analysis, encryption/decryption, data mining, simulations, and pattern recognition. At the core of the Architecture are systems that accommodate petabytes of data at reasonable cost and allow analytics to run at previously unattainable scales in reasonable amounts of time. However, human insights and action are still the fundamental drivers. The purpose of the Architecture is to allow machines to do 80% of the work (the mundane tasks they are best suited for) and enable people to do the 20% of the work they do best, tasks that involve analysis and creativity.
  • 7. The Transformative Power of Cloud Analytics
    Booz Allen is the leader in the emerging field of cloud analytics. Our unique approach combines cloud and other technologies with superior analytic tradecraft to create breakthroughs in how organizations capture, store, correlate, pre-compute, and extract value from large sets of data.
    To understand the power of cloud analytics, it helps to see the progression from basic data analytics performed in most organizations today. As an infrastructure is built out along the continuum to cloud analytics, the size and scale of data it can process increases along with the ability to drive performance and improve decision making.
    Basic data analysis usually happens in core business functions with smaller datasets. Reports are usually created on a “one-off” basis, limited to distribution within a specific department to support routine decision making.
    Analysis using standard cloud computing solutions extends basic analytic techniques to large or very large datasets. This is a logical entry point for cloud solutions because cloud technology is the most efficient, cost-effective way to run analytics on large amounts of data.
    Advanced analytics is where predictive capabilities are brought into the mix. It’s generally used to evaluate the future impact of strategic decisions. However, it represents a step back in terms of the size of datasets that can be manipulated.
    Cloud analytics transcends the limits of the other forms of analysis. It delivers insights to answer previously unanswerable questions such as:
      ▶ How can we gain competitive advantage in our market space?
      ▶ Where can we save money within our organization?
      ▶ How should we turn our data into a product?
  • 8. From Data to Digital Enterprise
    Booz Allen clients have a wide variety of data analysis challenges and IT infrastructures. Our flexible, scalable Cloud Analytics Reference Architecture has three stages or entry points to accommodate these differences. In each stage, we enable shifts in technology investments while helping manage risk and maximize the reward.
    STAGE 1: The focus is on saving money and reducing risk. That means leveraging the assets you already own and taking logical steps to add what’s needed. This is the only way to build a structure to instrument data at scale so you can truly experience breakthrough analytics.
    STAGE 2: We begin to modernize applications to handle the demands of advanced analytics. Faster, reusable, and more intuitive applications will enable everyone in your organization to work smarter.
    STAGE 3: Significant improvements in performance are realized when you achieve success in managing the flow of information at scale and derive the fullest value from your data.
    You may have already begun some of these initiatives; we leverage what’s working now as we discover new ways to increase efficiency.
    [Figure: three steps plotted against IT Maturity]
    1 IT EFFICIENCIES
      ▶ Data center consolidation
      ▶ Server and data consolidation
      ▶ Increased automation
      ▶ Modernized security posture and metrics
      ▶ Reduced licensing costs
    2 SMART DATA
      ▶ Enhanced Enterprise Data Architecture
      ▶ Clarify pedigree (data tagging)
      ▶ Multidimensional indexing
      ▶ Adopt distributed database
      ▶ Reusable applications
    3 CLOUD ANALYTICS
      ▶ Create deep insight into relevant mission data
      ▶ Ask and answer previously unanswerable questions
  • 9. A Better Approach
    As the leaders in cloud analytics, Booz Allen has a proven approach delivered by some of the industry’s best talent. Here’s why we’re different:
      ▶ Technical framework: Our Architecture combines the collective experience of thousands of people who have road tested technologies from across the cloud solution landscape in hundreds of client organizations, ranging from the U.S. Federal Government to commercial and international clients.
      ▶ Best practices: We have an exclusive set of lessons learned and breadth of technical knowledge that saves time and money while reducing risk.
      ▶ Core principles: These are “rules of the road” we’ve developed to build the most effective solution with the highest return on investment. They encompass everything from how data should be stored to how to improve relationships with the end users of your data.
      ▶ Critical skill sets: We bring technologists as architecture and solutions specialists, domain experts who know your industry and your data, and data scientists who explore and examine data from disparate sources and recommend how best to use it. No one else in the industry offers a better combination of talent.
      ▶ Vendor neutrality: Our approach utilizes a broad ecosystem of products and custom systems culled from an exhaustive survey of available options. In the crowded, fragmented, and continually evolving landscape of cloud solutions, we recommend only the best fit and value for your organization.
    A Look Ahead
      ▶ Section 2.0, Differentiation: Introduces and diagrams the Architecture, and explains how it reflects Booz Allen’s unique approach. You’ll also read about our core design principles, extensive service offerings, and technology choices. Pages 10–15
      ▶ Section 3.0, Depth: Takes the Architecture apart layer by layer with detailed visuals, design concepts, and recommended solutions from the cloud vendor landscape. The section ends with a look at how security is built into all levels. Pages 16–30
      ▶ Section 4.0, Successes: Presents real-world examples from our extensive file of case studies. We present the solutions and challenges, describe and diagram the implementations, and explain the results. Pages 31–35
  • 10. 2.0 Differentiation. In this section: Booz Allen’s approach to cloud analytics is unmatched in the industry. Read about our unique principles and best practices.
  • 11. A Layered Framework
    The Booz Allen Cloud Analytics Reference Architecture incorporates a wide range of services to move from a technology infrastructure with chaotic, distributed data burdened by noise to large-scale data processing and analytics characterized by speed, precision, security, scalability, and cost efficiency.
    However, Booz Allen’s approach is about much more than infrastructure. We start with your need to make better sense and better use of your mission data, and build from there.
      ▶ Human Insights and Actions: Enabled by customizable interfaces and visualizations of the data
      ▶ Analytics and Services: Your tools for analysis, modeling, testing, and simulations
      ▶ Data Management: The single, secure repository for all of your valuable data
      ▶ Infrastructure: The technology platform for storing and managing your data
    [Figure: diagram components, top to bottom: Visualization, Reporting, Dashboards, and Query Interface; Services (SOA); Analytics and Discovery; Views and Indexes; Streaming Indexes; Data Lake; Metadata Tagging; Data Sources; Infrastructure/Management]
    A FRAMEWORK FOR SECURITY: Page 30 details our security processes.
  • 12. Booz Allen Cloud Analytics Service Offerings
      ▶ Cloud Strategy and Economics: Delivery of strategy, technology, and economic analysis for evaluating and planning all of the business, technical, operational, and financial aspects of a cloud transition
      ▶ Cloud Application Migration: Expertise in the assessment, prioritization, architectural mapping, re-engineering, and optimization of workloads that have high value and are ready for migration to the cloud
      ▶ Advanced Cloud Analytics: Delivery of scalable analytics platforms allowing the processing of information at extreme scale; and eDiscovery: high-volume, full text indexing, and context-based search of information
      ▶ Infrastructure: Design and implementation of IaaS offerings to provide global access to data storage, computing, and networking services on demand through self-service portals
      ▶ Cloud Security: Unified risk management approach to define cloud security requirements, controls, and a continuous monitoring framework to address data protection, identity, privacy, regulatory, and compliance risks
      ▶ Software and Platform Integration: Expertise in the secure implementation of SaaS and PaaS service delivery models, data migration, and integration with existing enterprise infrastructure and applications
      ▶ VDI Deployment and Optimization: Delivery of flexible and dynamic virtual desktop infrastructure to simplify management, reduce licensing costs, and increase desktop security and data protection requirements
      ▶ Data Center Migration: Identify critical factors, design, and execute the transformation of legacy IT systems to virtualized and cloud computing environments
  • 13. Complex Ecosystem
    We’ll help you navigate the crowded, fragmented, and continually evolving vendor ecosystem to design a best-of-breed solution for your organization.
    [Figure: vendor landscape grouped by Human Insights and Actions, Analytics and Services, Data Management, Infrastructure, and Data Integration]
  • 14. Booz Allen Cloud Analytics Core Principles
      ▶ In-situ processing: The Architecture demands that “you send the question to the data,” because most big data processes are disk I/O-bound. In-situ processing means that most of the computation is done locally to the data, so that analytics run faster. This can enhance existing analytic capabilities and/or allow you to ask entirely new types of questions.
      ▶ Use commodity hardware: Hardware should be expected to fail as the normal condition. The Architecture supports both scalability and fault tolerance to achieve optimal application load balancing.
      ▶ Schema on read: If you have all the source data indexed and queryable, plus the ability to create aggregations, then you can manage complex ontologies and demands in a very efficient manner.
      ▶ Throw away nothing: Near-linear scalable hardware and software systems allow much more data to be stored, which enables reprocessing of historical data with new algorithms and correlations that bring new insights.
      ▶ Economies of scale: What used to be called service-oriented architecture (SOA) means that you can define the value and cost of services in your enterprise, and plan your development actions, either to reduce the cost of low-value components or increase the scale of high-value components.
      ▶ Change development process: In order to develop a tight, iterative relationship with your end users, you can develop/research a new capability in hours (not months), and the process of discovery and integration with the rest of the enterprise begins much sooner, too.
      ▶ Data tagging: You can now afford to tag all of your data for sensitivity or other controls (such as geographic). This is the fastest, most reliable way to instrument change across your entire Data Lake.
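    As a rough illustration of how three of these principles (throw away nothing, schema on read, and data tagging) can fit together, here is a minimal in-memory sketch. It is not Booz Allen's implementation: the `DataLake` class, the `claims_schema` view, the tag names, and the sample records are all hypothetical and exist only to show the idea.

```python
import json
from dataclasses import dataclass, field


@dataclass
class DataLake:
    """Toy in-memory store (hypothetical): raw payloads are kept verbatim with tags."""
    records: list = field(default_factory=list)

    def ingest(self, raw: str, tags: dict) -> None:
        # "Throw away nothing": keep the raw payload untouched; no schema on write.
        self.records.append({"raw": raw, "tags": dict(tags)})

    def query(self, schema, allowed_sensitivity: set):
        # Data tagging: tags decide which records this caller may see.
        # Schema on read: the caller's schema interprets raw data at query time.
        for rec in self.records:
            if rec["tags"].get("sensitivity") not in allowed_sensitivity:
                continue
            row = schema(rec["raw"])
            if row is not None:
                yield row


def claims_schema(raw: str):
    """One hypothetical view: parse JSON payloads and keep two fields."""
    try:
        doc = json.loads(raw)
    except json.JSONDecodeError:
        return None  # this view cannot read the payload, so it skips it
    if not isinstance(doc, dict) or "amount" not in doc:
        return None
    return {"provider": doc.get("provider"), "amount": float(doc["amount"])}


lake = DataLake()
lake.ingest('{"provider": "A", "amount": "120.50"}', {"sensitivity": "public", "region": "US"})
lake.ingest('{"provider": "B", "amount": 75}', {"sensitivity": "restricted", "region": "US"})
lake.ingest("free text note, not JSON", {"sensitivity": "public", "region": "EU"})

public_rows = list(lake.query(claims_schema, {"public"}))
all_rows = list(lake.query(claims_schema, {"public", "restricted"}))
print(public_rows)
print(len(all_rows), "rows visible with restricted access")
```

    The free-text record stays in the store even though the current view cannot parse it, so a later schema could still make use of it without re-ingesting anything.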
  • 15. Deeper Insights
    The Cloud Analytics Reference Architecture enables staff at all levels to quickly gather and act on granular insights from all of your available data, regardless of its format or location. Below are some of the ways human insights and actions are enhanced by this new framework, which fosters greater collaboration and teamwork, and, ultimately, delivers the highest business value from your information and your computing infrastructure.
    Layers: Human Insights and Actions; Analytics and Services; Data Management; Infrastructure
    Decision Makers, Investigators, Interdictors, and Analysts
    ▶▶ Real-time alerting, situational awareness, and dissemination specific to their clearance level
    ▶▶ Investigate and provide feedback on reporting
    ▶▶ Interact and search using tailored tools
    Analysts and Data Scientists
    ▶▶ Create and use many views into the same data
    ▶▶ Automatically find trends and outliers
    ▶▶ Evaluate analysis methods to determine and enhance best-of-breed tradecraft
    ▶▶ Reference undiscovered trends in original data
    ▶▶ Apply advanced machine learning and statistical methods
    ▶▶ In-situ hypothesis testing
    Developers and Data Scientists
    ▶▶ No longer constrained by years-old schemas
    ▶▶ Catalog and index the data that is relevant today
    ▶▶ Free to create new views and reporting metrics
    System Administrators and IT Staff
    ▶▶ Reduce IT costs through commoditization and economies of scale
    ▶▶ Meet long-term scalability requirements
  • 16. 3.0 | Depth
    In this section: We diagram and describe each layer of the Cloud Analytics Reference Architecture, including our design principles and technology choices.
  • 17. Reference Architecture
    Booz Allen’s Cloud Analytics Reference Architecture provides a holistic approach to people, processes, and technology in four tightly integrated layers.
    Layer 1: Human Insights and Actions
    Building on results and outputs from various analytical methods, multiple data visualizations can be created in your new cloud analytics solution. These are used to compose the interactive, real-time dashboard interfaces your decision-makers and analysts need to make sense of your data.
    Layer 2: Analytics and Services
    Both traditional and “Big Data” tools and software can operate on the information stored in your Data Lake, producing the advanced, specific analysis, modeling, testing, and simulations you need for decision making.
    Layer 3: Data Management
    Your Data Lake is a secure, distributed repository of a wide variety of data sources. Security, metadata, and indexing of Big Data are enabled by distributed key value systems (NoSQL), but the Architecture allows for traditional relational databases as well.
    Layer 4: Infrastructure
    This foundational layer allows for quick, streamlined, low-risk deployment of the cloud implementation. The plug-and-play, vendor-neutral framework is unique to Booz Allen.
    Key Attributes
    By design, the Booz Allen Cloud Analytics Reference Architecture:
    ▶▶ Is reliable, allowing distributed storage and replication of bytes across networks and hardware that is assumed to fail at any time
    ▶▶ Allows for massive, world-scale storage that separates metadata from data
    ▶▶ Supports a write-once, sporadic append, read-many usage structure
    ▶▶ Stores records of various sizes, from a few bytes up to a few terabytes
    ▶▶ Allows compute cycles to be easily moved to the data store, instead of moving the data to a processor farm
    A Framework for Security: Page 30 details our security processes
  • 18. Layer 1: Human Insights and Actions (architecture model)
    [Figure labels: Monitoring; Exploratory; Geospatial; Line Chart; Analytics and Services]
  • 19. Human Insights and Actions (continued): Principles and Technologies
    In analytics solutions built on the Architecture, the data that’s available and the desired results drive the interfaces, not the other way around. When user communities and stakeholders aren’t restricted by their tools, they can perform complex visualizations to identify patterns they previously couldn’t see.
    That freedom defines the guiding principles behind this first layer of the Architecture:
    ▶▶ Design and build the framework so that the desired data and analytic results define the visualization
    ▶▶ Reuse results and outputs of analytics across different visualizations
    ▶▶ Decouple the underlying analytics and data access from the visualizations and interfaces so that it’s possible to build customized, interactive dashboard interfaces composed of dynamically linked visualizations
    Technology Examples
    HTML5, JavaScript, OWF, Synapse: Lightweight, custom web-based applications and dashboards tailored to specific user communities or stakeholders for data exploration, event alerting, and monitoring, as well as continuous quality improvement
    Commercial products (Splunk, Pentaho, Datameer Business Infographics, etc.): Out-of-the-box, easy-to-build dashboards for historical trending and real-time monitoring to analyze user transactions, customer behavior, network patterns, security threats, and fraudulent activity
    Adobe Flex and Adobe Flash: Despite the rise of HTML5, Adobe Flex and Flash applications still remain strong candidates for quickly building and deploying rich user interfaces
  • 20. Layer 2: Analytics and Services (architecture model)
    [Figure labels: Human Insights and Actions; Time Series; Social Network Analysis; R, SAS, Matlab, Mathematica; MapReduce, Hive, Pig, Hama]
  • 21. Analytics and Services (continued): Principles and Technologies
    Frequently where data is concerned, the whole is greater than the sum of its parts. In the most strategic business decisions, the ability to combine multiple types of analyses creates a holistic picture that can lead to much more valuable insight. With the Cloud Analytics Reference Architecture, you can implement different types of analytical methods.
    This integrated approach is an anchor for the guiding principles behind our Analytics and Services layer:
    ▶▶ Allow both traditional and Big Data analysis tools and software to operate on a centralized repository of data (the Data Lake)
    ▶▶ Integrate results and outputs of analyses and visualize them on dashboards for decision making
    ▶▶ Decouple tools from the various types of analyses to make the system more extensible and adaptable
    ▶▶ Include a service-oriented architecture layer to reuse results and outputs in many different ways relevant to different stakeholders and decisionmakers
    ▶▶ Incorporate Certified Catastrophe Risk Analysis (CCRA) to allow a variety of data analysis tools and software to be integrated and used; it also enables results and outputs of analyses to be visualized and used across multiple interfaces
    Technology Examples
    Data Mining: Data mining is used to discover patterns in large datasets and draws from multiple fields including artificial intelligence, machine learning, statistics, and database systems.
    Machine Learning: Machine learning is used to learn classifiers and prediction models in the absence of an expert and employs many algorithms in the areas of decision trees, association learning, artificial neural networks, inductive logic programming, support vector machines, clustering, Bayesian networks, genetic algorithms, reinforcement learning, and representation learning.
    Natural Language Processing (NLP): NLP is used to process unstructured and semi-structured documents for the purposes of information retrieval, sentiment analysis, statistical machine translation, and classification.
    Network Analysis: Graph theory and social network analysis are used to understand associations and relationships between entities of interest.
    Statistical Analysis: Traditional statistical methods using univariate and multivariate analysis on relatively small datasets are employed to make inferences, test hypotheses, and summarize data.
  • 22. Layer 2: Analytics and Services (continued)
    Discovering Your Data
    Before working with Booz Allen, most clients faced a fundamental challenge with data discovery. They didn’t know what data was actually available or how to sort through all of it to identify the most important business problems or trends it could reveal.
    Technical Framework
    Discovery is intimately related to search and analysis. All three feed into insight in a nonlinear fashion. A search-discovery-analytics process that solves business problems without consuming disproportionate resources meets these user needs:
    ▶▶ Real-time, ad hoc access to content
    ▶▶ Aggressive prioritization based on importance to the user and the business
    ▶▶ Data-driven decision making, which relies on the ability to try different approaches and ideas in order to discover previously unimagined insights
    ▶▶ Feedback/learning from the past intelligently applied to today’s data
    [Figure labels: Search; Analytics; Discovery]
    How Booz Allen Simplifies Discovery
    Other solutions require analysts to break down data into numerous subsets and samples before it can be digested. This expensive, time-consuming process is one of the major roadblocks to turning data into true business intelligence.
    Even though the Booz Allen Cloud Analytics Reference Architecture supports the most advanced analysis, it can also allow your staff to sift through all of your data on a basic level. Without tedious or sophisticated sampling and complex tools, they can discover what’s useful and what’s not useful for a specific business problem.
    How does the Architecture support fast, efficient, and scalable search on entire datasets, not just samples?
    ▶▶ Bulk and soft real-time indexing enable the solution to handle billions of records with subsecond search and faceting
    ▶▶ Large-scale, cost-effective storage and processing capabilities accommodate “whole data” consumption and analysis; in-memory caching of critical data ensures applications meet performance requirements
    ▶▶ NLP and machine learning tools can scale to enhance discovery and analysis on very large datasets
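The search and faceting described on slide 22 can be sketched with a toy inverted index in Python. This is only an in-memory illustration under stated assumptions, not the indexing technology the Architecture uses: the `FacetedIndex` class, its method names, and the sample log records are hypothetical.

```python
from collections import Counter, defaultdict

class FacetedIndex:
    """Toy inverted index: maps each term to the ids of records containing it."""

    def __init__(self):
        self._postings = defaultdict(set)  # term -> record ids
        self._facets = {}                  # record id -> facet dictionary

    def add(self, record_id, text, facets):
        """Index a record at ingest so later searches avoid scanning raw text."""
        self._facets[record_id] = facets
        for term in text.lower().split():
            self._postings[term].add(record_id)

    def search(self, query):
        """Return the ids of records that contain every term in the query."""
        terms = query.lower().split()
        if not terms:
            return set()
        hits = set(self._postings.get(terms[0], set()))
        for term in terms[1:]:
            hits &= self._postings.get(term, set())
        return hits

    def facet_counts(self, hits, facet_name):
        """Count the matching records per value of one facet."""
        return Counter(
            self._facets[i][facet_name]
            for i in hits
            if facet_name in self._facets[i]
        )

# Hypothetical sample records.
index = FacetedIndex()
index.add(1, "Login failure from unknown host", {"source": "system log"})
index.add(2, "Login success", {"source": "system log"})
index.add(3, "Suspicious login reported by email", {"source": "email"})

login_hits = index.search("login")
by_source = index.facet_counts(login_hits, "source")
```

The work is done once at ingest, so a query touches only the posting sets for its terms. A production system would distribute those posting sets across nodes, which this sketch does not attempt.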
  • 23. Analytics and Services (continued): How the Data Science Lifecycle Distills Insights
    The data science lifecycle consists of three basic steps:
    Step 1: First, data is sampled using a cloud analytics platform. This step may involve a sophisticated analytic that runs in the cloud, such as one that crawls a social network to find people with certain types of relationships with an individual or organization. This sampling can be done using either high-level query languages that are specially made for scalable cloud analytics or low-level developer interfaces.
    Step 2: Next, a data scientist models the data sample in order to understand it better. This is usually done using a statistical modeling environment on the data scientist’s workstation.
    Step 3: Finally, once a trend is established using the model, the data scientist works with analysts and domain experts to explain the trend and yield insights.
    This cycle is repeated until the data science team reaches actionable insights and intelligence that can be presented to senior leadership for decision making purposes. Information may be delivered in a visualization, dashboard, or written report.
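Steps 1 and 2 of the lifecycle on slide 23 can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than the tooling the deck describes: the dataset is synthetic, the helper names are hypothetical, and an ordinary least-squares line stands in for whatever statistical model a data scientist would actually choose.

```python
import random

def sample_records(records, k, seed=0):
    """Step 1: draw a reproducible random sample from the full dataset."""
    rng = random.Random(seed)
    return rng.sample(records, k)

def fit_trend(points):
    """Step 2: fit y = slope * x + intercept by ordinary least squares."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    sxx = sum((x - mean_x) ** 2 for x, _ in points)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in points)
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

# Hypothetical dataset: (day number, event count), generated from y = 3x + 5.
full_data = [(day, 3 * day + 5) for day in range(10_000)]

sample = sample_records(full_data, k=200)
slope, intercept = fit_trend(sample)
```

Step 3, explaining the trend with analysts and domain experts, is a human activity and has no code counterpart here.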
  • 24. Layer 3: Data Management (architecture model)
    [Figure labels: Human Insights and Actions; Analytics and Services; Batch; Structured; Unstructured]
  • 25. Data Management (continued): Principles and Technologies
    A central feature of the Architecture, the Data Lake delivers on the promise of cloud analytics to offer previously hidden insights and drive better decisions. It’s a secure repository for data of all types and origins. Instead of precategorizing data, which restricts its usability from the moment it enters your organization, the Architecture combines unstructured, structured, and streaming data types and makes them available for many different forms of analysis.
    The following principles demonstrate how the Architecture enables your organization to use this repository of enterprise data to the best advantage:
    ▶▶ Provide inherent replication of the data through a distributed file system
    ▶▶ Use distributed key value (NoSQL) data storage to enable security and metadata tagging at the data level as well as indexing for specialized retrieval
    ▶▶ Relax schema constraints and provide the flexibility to adapt to changing data sources and types with the schema-on-read approach of distributed key value data storage
    ▶▶ Store the Data Lake on commodity hardware and scale linearly in performance and storage
    ▶▶ Don’t presummarize or precategorize data
    ▶▶ Enable rapid ingest of data, aggressive indexing, and dynamic question-focused datasets through scale
    Technology Examples
    Hadoop Distributed File System (HDFS): The primary open-source, distributed storage system creates multiple replicas of data blocks and distributes them on compute nodes throughout a cluster to enable reliable, rapid computations.
    Accumulo: NoSQL store based on Google’s BigTable design features cell-level security access labels and a server-side programming mechanism that can modify key/value pairs at various points in the data management process.
    HBase, Cassandra, MongoDB: Open-source NoSQL databases focused on a combination of consistency, availability, and partition tolerance.
    Neo4j: NoSQL scalable graph database storing data in nodes and the relationships of a graph.
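The idea of security tagging at the data level, which slide 25 attributes to stores such as Accumulo, can be sketched as a toy in-memory store. This is not Accumulo’s API: the `TaggedStore` and `Cell` names, the labels, and the sample values are hypothetical, and the check is simplified to “the reader must hold every label on the cell”, which is an assumption of this sketch.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Cell:
    row: str
    column: str
    value: str
    required_labels: frozenset  # the reader must hold every label listed here

class TaggedStore:
    """Toy in-memory key/value store with a visibility check on each cell."""

    def __init__(self):
        self._cells = []

    def put(self, row, column, value, labels=()):
        """Store a value together with the labels needed to read it."""
        self._cells.append(Cell(row, column, value, frozenset(labels)))

    def scan(self, authorizations):
        """Return only the cells the reader is cleared to see."""
        held = frozenset(authorizations)
        return [c for c in self._cells if c.required_labels <= held]

# Hypothetical sample data and labels.
store = TaggedStore()
store.put("patient-1", "name", "J. Doe", labels={"pii"})
store.put("patient-1", "zip", "20001", labels={"pii", "geo"})
store.put("patient-1", "visit_count", "4")  # untagged: visible to every reader

public_view = [c.column for c in store.scan(set())]
analyst_view = [c.column for c in store.scan({"pii"})]
```

Because the label travels with each cell instead of with a table or file, the same scan serves readers with different clearances without maintaining separate copies of the data.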
  • 26. Layer 3: Data Management (continued): Data Lake
    Booz Allen works with organizations in corporate and government sectors that have an urgent need to make sense of volumes of data from diverse sources, including those that had been inaccessible or extremely difficult to utilize, such as streams from social networks. Now analysts and decisionmakers can form new connections between all of this data to uncover previously hidden trends and relationships.
    [Figure labels, data types: Enterprise Data; Machine-to-machine communication; Transaction logs; Sensor Data; Security Logs; Email; Quarterly Filings; Reports; System Logs; Financials; Press Articles]
    [Figure labels, sectors and uses: Cyber; Government; Defense; Finance; Healthcare; Intrusion and malware detection; Fraud Detection; Regulatory Compliance; Enhanced Situational Awareness; Enhanced Forecasting]
    Individual organizations require different types of data. Not all types of data listed above may apply to every organization.
  • 27. Enhanced Media Content
    Booz Allen’s strategy and technology consultants are highly regarded subject matter experts. Through groundbreaking conference keynotes, whiteboard talks, and papers, they help educate and shape the analytics industry.
    We invite you and your team to take advantage of the educational resources listed below to gain strategic insights about the use of analytics, explore technical topics in depth, and stay on top of the latest trends.
    Presentations
    ▶▶ Yahoo! Hadoop Summit: Biometric Databases and Hadoop. Invented and demonstrated methods for dense data correlation (e.g., imagery and biometrics) within a Hadoop distributed computing platform using new machine learning parallel methods.
    ▶▶ Yahoo! Hadoop Summit: Culvert: A Robust Framework for Secondary Indexing of Structured and Unstructured Data. Demonstration of Booz Allen’s secondary indexing solutions and design patterns, which support online index updates as well as a variation of the HIVE query language over Accumulo and other BigTable-like databases to allow indexing one or more columns in a table.
    ▶▶ Slidecast: Hadoop World: Protein Alignment. Demonstration of advanced analytics in using protein alignment sequences to identify disease markers using Hadoop, HBase, Accumulo, and novel machine learning concepts.
    ▶▶ Slidecast: Innovative Cyber Defense with Cloud Analytics. Presentation on improving intelligence analysis through a hybrid cloud approach to analytics, with descriptions and diagrams from Booz Allen client solutions.
    ▶▶ Slidecast: Integrating Tahoe with Hadoop’s MapReduce. Invented and demonstrated method to use least-authority encrypted file system as plugin to HDFS within Hadoop cluster.
    Videos
    ▶▶ Cloud Whiteboard Playlist. Short instructional videos on a range of topics from introductory talks for executives to tutorials for data analysts. Check back frequently for new material.
    ▶▶ Cloud Analytics for Executive Leadership. Booz Allen Principal Josh Sullivan discusses how analysis of data can be used as a tool to provide insight to executives.
    ▶▶ Informed Decision Making: Sampling Techniques for Cloud Data. Booz Allen Data Scientist Ed Kohlwey explains how sampling large amounts of data can be useful for program managers to make informed decisions.
    ▶▶ Developer Perspectives: The FuzzyTable Database. Booz Allen Data Scientist Drew Farris explains how to use the FuzzyTable biometrics database.
    Workshop
    ▶▶ O’Reilly Strata Conference: Beyond MapReduce: Getting Creative with Parallel Processing. Technical discussion of MapReduce as an excellent environment for some parallel computing tasks and the many ways to use a cluster beyond MapReduce.
    Papers
    ▶▶ Massive Data Analytics in the Cloud. Overview of the business impact of cloud computing, and how data clouds are shaping new advances in intelligence analysis.
    Learn More
    Scan the QR code, or go directly to: boozallen.com/cloud
  • 28. Layer 4: Infrastructure (architecture model)
    [Figure labels: Human Insights and Actions; Analytics and Services; Virtual Desktop Integration]
  • 29. Infrastructure (continued): Principles and Technologies
    Infrastructure is the foundation for any cloud implementation. What makes the Booz Allen Cloud Analytics Reference Architecture unique is its plug-and-play, vendor-neutral framework. This framework not only allows a greater range of choices in selecting resources and building services, it also allows for a faster, more streamlined, more secure, and lower risk deployment.
    The following principles guide the infrastructure layer of the Architecture:
    ▶▶ Make it easy to transform physical resources from legacy IT systems to secure, virtualized data centers and trusted cloud computing environments
    ▶▶ Implement core services to provide the mechanisms to realize on-demand self-service, broad network access, resource pooling, rapid elasticity, and measured service
    ▶▶ Employ virtualization to increase utilization of existing assets and resources, and improve operational effectiveness
    ▶▶ Engineer in-depth security to provide controls and continuous monitoring in order to fully address data protection, identity, privacy, regulatory, and compliance risks
    Technology Examples
    Amazon Web Services, Microsoft Azure, Puppet, VMware, vSphere: Cloud tool chain for provisioning, configuration, orchestration, and monitoring of the virtual environment. These tools provide the building blocks for IaaS and PaaS and the foundation for SaaS. Run multiple operating systems and virtual network platforms on the same hardware, sharing computing, storage, and networking resources.
    Security through VMware, McAfee, Symantec, Cisco, TripWire, EnCase: Protect assets (physical, logical, and virtual) while automating governance and compliance.
  • 30. Reference Architecture Security Framework
    A Framework for Security
    The Architecture is designed to protect your data at rest and in flight, with security controls embedded in each layer. This is obviously more than just a technology challenge. We understand the need to embed new processes and training regimens so your staff handles sensitive data correctly. We also advise you on how to secure your facilities and ensure that all off-premise facilities have the right controls in place as well.
    [Figure: security framework matrix. Cell labels by layer:]
    Business Layer: Business Assets to Be Protected; Geography; Threats and Processes that Require Security; Distributed Sites | Remote Workers | Jurisdictions; Organizational Security; Time Dependencies; Governance | Supply Chain | Strategic Partnerships; Transaction Throughput | Lifetimes and Deadlines
    Conceptual: Business Attributes; Technical and Management Security Strategies; Roles and Responsibilities; Security Requirements to Support the Business; Trust Relationships; Time Dependencies; Control and Enablement Objectives; Security Domains, Boundaries, and Associations; When Is Protection Relevant?; Resulting from Risk Assessment
    Logical: Business Information to Be Secured; Interrelationships; Security Services; Security and Risk Attributes; Authentication Confidentiality and Integrity Protection | Strategic Partnerships; Security Entities; Management Policy
    Physical: Security-Related Data Structures; Security Mechanisms; Human Interface; Tables | Messages | Pointers | Certificates | Signatures; Encryption | Access Control | Digital Signatures; Screen Formats | User Interaction Access Control | Systems; Security Rules; Security Infrastructure; Physical Layout of Hardware, Software, and Communication Lines; Conditions | Practices | Procedures; Time Dependencies; Sequence of Processes and Sessions
    Component: Security IT Products; Personnel Management Tools; Time Dependencies; Risk Management; Identities | Roles | Functions | Access Controls | Lists; Time Schedules | Clocks | Timers and Interrupts; Tools for Monitoring and Reporting; Security Process; Locator Tools; Dynamic Inventory of Nodes | Addresses and Locations; Tools, Standards, and Protocols
    Service Management: Service Delivery Management; Management of Security Operations; Management of Environment; Assurance of Operational Continuity; Admin | Backups | Monitoring | Emergency Response; Buildings | Sites | Platforms and Networks; Operational Risk Management; Personnel Management; Management Schedule; Risk Assessment | Monitoring and Reporting; Account Provisioning | User Support | Management Security-Related | Calendar and Timetable
  • 31. 4.0 | Successes: Case Studies
    In this section: These case studies show how Booz Allen uses superior technology and analytics expertise to solve complex problems for clients in a wide range of corporate and government sectors.
  • 32. Case Studies, Example One: Improving Intelligence Analysis
    Mission
    To fulfill their mission, this organization requires data correlation, quick access to analytic results, ad-hoc queries, advanced scalable analytics, and real-time alerting. To provide their analysts with a continuous pipeline of prioritized, actionable information, they needed a secure, scalable, automated solution that would more quickly and precisely sift through large (and growing) volumes of complex data characterized by a variety of formats and noise. In addition, they needed to leverage their existing analytics infrastructure in the new platform.
    Solutions
    Booz Allen worked closely with the client to adopt a data cloud implementation by augmenting the legacy relational databases with cloud computing and analytics. The design focused on keeping transactional-based queries in the current relational databases, while doing the “heavy lifting” in the cloud and outputting the interesting, processed, or desired analytic results into relational data stores for quick transactional access.
    With many existing systems and applications dependent on the legacy relational database for transactional queries of data, Booz Allen pulled together excess servers from the client’s infrastructure to build a hybrid cloud solution. Also, as the client’s needs change to adapt to the mission, the solution is scalable and flexible to support future innovation and evolution without reengineering.
    Interfaces and Visualizations: Dashboards, web applications, client applications, and rich clients interfaced and integrated with advanced analytics infrastructure and legacy relational databases through a SOA business logic layer.
    Analytics and Services: The solution called for predictive analytics to forecast potential events from existing data and anomaly detection to extract potentially significant information and patterns. The solution leveraged the core principle of cloud analytics that enables automated analysis techniques, precomputation, and aggressive indexing.
    Data Management: The data sources had multiple formats, were large in size, and were degraded by noise. The solution created deep insight through fusion of different data types at scale. The solution enabled the ability to follow the lineage or pedigree of the data, allowing the client to map cost in relation to the value of the data or how well it is being used.
    Infrastructure: The solution used Accumulo (distributed key value systems/NoSQL database) for content normalization and indexing, MapReduce as the precomputation engine, and HDFS for scalable ingest and storage.
    Impact
    Rather than simply focus on gaining IT efficiencies by using cloud technology for infrastructure, Booz Allen focused on applying cloud analytics and in-depth understanding of the organization’s operational and mission needs to extract more value faster from massive datasets. The new cloud solution provided immediate and striking improvements across the increasing volume of structured and unstructured data using aggressive indexing techniques, on-demand analytics, and precomputed results for common analytics. The final solution combined sophistication with scalability, moving the organization from a situation in which analysts stitched together sparse bits of data to a platform for distilling real-time, actionable information from the full aggregation of data.
  • 33. Case Studies, Example Two: Planning and Responding to Disaster
    Mission
    This organization, which is responsible for disaster planning and response, found that social media could provide timely situational awareness for biological (and other disaster) events. They wanted a solution to better characterize and forecast emerging disaster events using social media data as it streams in real time. With such a solution in place, the organization could increase overall preparedness by leveraging event characterization to accurately predict the impact and improve the response.
    In order to reach their goal, the organization needed higher levels of confidence in the social media data on which they would base their decisions. The specific challenges the new solution had to overcome included data ingestion and normalization, social media vocabulary, social media characterization, information extraction, and geographical isolation of events.
    Solutions
    Booz Allen developed a framework to capture, normalize, and transform open-source media used to characterize and forecast disaster events in real time. The framework incorporated computational and analytical approaches to turn the noise from social media into valuable information using algorithms such as term frequency-inverse document frequency (TF-IDF), natural language processing (NLP), and predictive modeling to characterize and forecast the numbers of sick, dead, and hospitalized, as well as to extract symptoms, geography, and demographics for specific illness events.
    The solution framework was implemented in the cloud, taking advantage of the flexible computational power and storage. The new cloud infrastructure allowed Booz Allen’s data capturing and visualization tool, Splunk, to mine through and analyze vast amounts of data in real time, while outputting characterization and forecasting metrics of captured events.
    Interfaces and Visualizations: The solution included dashboards that characterized events captured in social media. The visual analyses include event extraction counts, time series counts, forecasting counts, a symptom tag cloud, and geographical isolation.
    Analytics and Services: TF-IDF and NLP algorithms were used to classify and extract relevant information from the data. Booz Allen developed predictive models for forecasting event frequency and counts. The algorithms were written in Python and incorporated into Splunk located on Amazon Web Services (AWS).
    Data Management: The solution framework captured live, streaming open-source media such as Twitter and RSS feeds. Data was captured in Splunk and stored on AWS.
    [Figure labels: Geotagging; Link Analysis; Risk Scoring; Predictive Modeling; Provider Profile; Clean, Validate, Normalize, Integrate; Provider Registration; Geolocation; Licensing; Exclusion; Financial Claims; Online Activities; Cases/Rulings]
    Impact
    The new Booz Allen solution, which builds upon current best practices in cyber terrorism, enables near real-time situational awareness through a standalone surveillance system that captures, transforms, and analyzes massive volumes of social media data. By leveraging social media data and analytics for more timely and accurate disaster characterization, the organization is able to more effectively plan and respond.
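Slide 33 names TF-IDF as one of the algorithms used to pull relevant terms out of social media text. The Python sketch below shows one common variant (term frequency normalized by document length, inverse document frequency as the natural log of N over document frequency). It is not the case study’s actual code: the formula variant, the `tf_idf` helper, and the sample posts are assumptions made for illustration.

```python
import math
from collections import Counter

def tf_idf(documents):
    """Score each term in each tokenized document with TF-IDF.

    tf  = occurrences of the term / number of tokens in the document
    idf = ln(number of documents / number of documents containing the term)
    """
    n_docs = len(documents)
    doc_freq = Counter(term for doc in documents for term in set(doc))
    scores = []
    for doc in documents:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return scores

# Hypothetical, already-tokenized posts.
posts = [
    "fever cough cough reported downtown".split(),
    "traffic reported downtown".split(),
    "fever cases rising".split(),
]

scores = tf_idf(posts)
top_term = max(scores[0], key=scores[0].get)
```

Terms that appear in many posts, such as `reported`, receive low weights, while a term repeated within one post and rare elsewhere rises to the top, which is the property that makes the measure useful for surfacing symptom words in noisy streams.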