Building enterprise advance analytics platform

By Raymond Fu - Practice Architect
This lecture talks about the best practices in building an advanced analytics platform to help companies apply machine learning, deep learning and data science to their structured and unstructured data.

Presented at the Southern California Data Science Conference, Sept. 25, 2016, at USC.


Published in: Technology

  1. Building Enterprise Advance Analytics Platform
      SoCal Data Science Conference, 09.25.2016
      Raymond Fu, Practice Architect, Trace3
  2. Raymond Fu, Practice Architect, Trace3
      • 16 years of IT experience specializing in big data, business intelligence, and enterprise architecture.
      • 10-year corporate career with Bank of America, highlighted by leading many data integration and warehousing initiatives from mergers and acquisitions.
      • Founded his own technology company, Xceed Consulting Group, in 2012, enabling data-driven solutions.
      • Joined California-based consulting company Trace3 in 2016 as a practice architect for the Data Intelligence team.
      • Blog: Everything About Data
      • Twitter: @RaymondxFu
  3. Big Data Disruption
      • Typically, organizations have a firm grasp of the people, process, and technology required to deliver capabilities, articulate an end-to-end roadmap, and identify platforms and resources.
      • Big Data disrupts the traditional architecture paradigm. Organizations may have an idea or interest, but they don’t necessarily know what will come out of it.
      • The answer or outcome for an initial question will trigger the next set of questions. It requires a unique combination of skill sets, the likes of which are new and not in abundance.
      • The pursuit of the answer is advanced analytics.
  4. Advanced Analytics Definition
      • The process, tools, technology, and collaboration to create predictive models that enable/drive strategic and operational decisions. The predictive models (1) generate insights and hypotheses and (2) test/score them through experiments, so organizations KNOW what works better.
      • Predictive models are created using machine learning, deep learning, advanced data management tools, and visualization tools.
      • An integral part of Advanced Analytics is the operationalization of the predictive models so they can be rapidly scored and decisioned at scale.
  5. Advanced Analytics Relevancy
      • Organizations’ goals
      • Advanced Analytics’ goals
      • What’s different today
      • Obstacles to the goals
  6. Advanced Analytics Process
      Data management
        • Connectivity
        • Landing
        • Ingestion
        • Knowledge
        • Preparation
      Analytics creation (business modeling)
        • Domain knowledge
        • Hypothesis development
        • Model architecture
        • Algorithm selection and development
        • Feature engineering
        • Visualization
        • Data mining
        • Statistical data shaping
        • Training
        • Cross-validation testing
        • Environment and libraries
        • Collaboration
        • Reproducibility
      Analytics operationalization (model production and deployment)
        • Production feature generation, modeling, testing
        • Deployment
        • Parallel experiments
        • Performance assessment
        • Continuous integration and deployment
        • Model iteration and redeployment
        • Real-time (R-T) and batch scoring
        • Decisioning
      Organization and business impact
        • Business metric assessment
      Role groups listed on the slide: IT/DE, DS | LoB, DS | DS, IT/DE, LoB | LoB, DS, IT/DE
  7. Enterprise Big Data Strategy
      • Information management
        • Data architecture, data governance, and metadata management.
        • Address key issues such as data integration and data quality.
      • Data platform modernization
        • Enterprise data warehouse offload.
        • Data lake platform assessment.
      • Advanced Analytics
        • Methodology
        • Tools recommendation
        • Operationalization
  8. Enterprise Architecture Approach
      • Step 1 – Establish Business Context and Scope (incubate ideas)
      • Step 2 – Establish an Architecture Vision
      • Step 3 – Assess the Current State
      • Step 4 – Establish Future State and Economic Model
      • Step 5 – Develop a Strategic Roadmap
      • Step 6 – Establish Governance over the Architecture
  9. Establishing an Architecture Vision
      The architecture development process needs to be more fluid than, and different from, an SDLC-like architecture process. It must allow organizations to continuously assess progress, correct course where needed, balance cost, and gain acceptance.
  10. Advanced Analytics Capabilities (Category / Capability / Items)
      Organization and business impact
        Fast, informed decisions
          • Time from question to hypothesis to model implementation to informed decision
        Strategic and operational role
          • Degree of input into business/policy decisions
          • Perceived and quantified value of analytics
      Analytics operationalization
        Model performance
          • Execution of experiments in parallel
          • Model performance for scoring and decisioning
        Model deployment
          • Continuous integration and deployment
      Analytics creation
        Efficient model creation
          • Use of data mining and visualization tools
          • Rapidly spun-up environment customized to individual data scientists that enables execution of large data sets and highly mathematical algorithms
          • Collaboration among data scientists and between data scientists and lines of business; reuse of data sets and models
          • Model reproducibility (including versions, algorithms, data sets, parameters, notes, environment)
        Appropriate model selection
          • Understanding, and appropriate use, of model architecture and algorithms, feature engineering, hyperparameterization, statistical and mathematical concepts, training and validation, scoring, and decisioning
          • Use of ML and DL concepts, tools, and libraries
          • Use of graph systems
      Data management
        Data capability
          • Infrastructure and tools to access and cleanse data
        Data knowledge and confidence
          • Understanding of, and confidence in, data (e.g. what is available, their relationships)
        Data access
          • Access to internal and external data through infrastructure, logical associations, and tools
  11. Enterprise Information Management Capabilities
  12. Advanced Analytics Reference Architecture
  13. Architecture diagram (labels only, in slide order)
      • Structured data source; Unstructured data source
      • RDBMS; Big Data; Business Intelligence / Data Visualization; Advanced Analytics
      • HDFS; NoSQL; Cloud Storage; ETL; Teradata
      • Operation; CRM; ERP; Accounting
      • Clickstream; Sensor Info; Images/Video; Event Logs; Social Media
      • Tools; Real-time Streaming; Library (ML and DL); Online ML; AWS; Azure; torch
      • Machine Learning API; Google Prediction; AWS; Azure; BigML; IBM Watson
  14. Advanced Analytics Services (Service Type / Services)
      Overall Assessment
        • Advanced Analytics assessment
      Architecture
        • Architecture for data science
        • Architecture for cloud analytics
      ETL/ELT
        • Data source identification and integration
        • Data virtualization
        • Data preparation
      Data analysis and modeling (data science)
        • Statistical / quantitative analysis
        • Descriptive analysis
        • Predictive modeling
        • Machine learning
        • Deep learning
        • Graph systems
        • Simulation and optimization
      Visualization and insight presentation and recommendations
        • Data exploration / mining / advanced visualization to understand the data
        • Insight presentation and recommendations
      Tools recommendation
        • Infrastructure
        • Software tools
        • Software environment, programming, libraries
      Process improvement
        • Analytics process improvement
        • Data governance
        • Model governance
        • Continuous integration and deployment of models
      Organizational capabilities
        • Advanced analytics organization structure and roles
        • Advanced analytics training
        • Advanced analytics staff augmentation
  15. Best Practice
      • Align Analytics with Specific Business Goals
      • Ease Skills Shortage with Standards and Governance
      • Optimize Knowledge Transfer with a Center of Excellence
      • Top Payoff is Aligning Unstructured with Structured Data
      • Plan Your Discovery Lab for Performance
      • Align with the Cloud Operating Model
  16. Example 1: Oracle
  17. Example 2: Google Cloud Platform – Building Blocks
  18. Example 2: Google Cloud Platform – Stepping Stone
  19. Thank you!
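
The Advanced Analytics Process slide names training, cross-validation testing, batch scoring, and decisioning as steps but shows no implementation. The sketch below is one minimal way to picture how those steps chain together. It is not from the deck: the one-feature linear model, the synthetic data, the function names (`fit_line`, `k_fold_mse`, `batch_decide`), and the decision threshold are all assumptions chosen for illustration, and a real platform would likely use an ML library instead of hand-written least squares.

```python
import random
from statistics import mean


def fit_line(xs, ys):
    """Training step (illustrative): ordinary least squares for y = a + b*x."""
    mx, my = mean(xs), mean(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    b = sxy / sxx
    return my - b * mx, b  # (intercept, slope)


def predict(model, x):
    """Score a single record with the fitted model."""
    a, b = model
    return a + b * x


def k_fold_mse(xs, ys, k=5, seed=0):
    """Cross-validation testing: mean held-out squared error over k folds."""
    idx = list(range(len(xs)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]  # k disjoint folds covering all rows
    errors = []
    for held_out in folds:
        held = set(held_out)
        train = [i for i in idx if i not in held]
        model = fit_line([xs[i] for i in train], [ys[i] for i in train])
        errors.append(mean((predict(model, xs[i]) - ys[i]) ** 2 for i in held_out))
    return mean(errors)


def batch_decide(model, xs, threshold):
    """Batch scoring plus decisioning: flag records whose score exceeds threshold."""
    results = []
    for x in xs:
        score = predict(model, x)
        results.append((x, score, score > threshold))
    return results


# Synthetic data (assumption): y is roughly 3 + 2x with Gaussian noise.
rng = random.Random(42)
xs = [float(i) for i in range(40)]
ys = [3.0 + 2.0 * x + rng.gauss(0, 1.0) for x in xs]

cv_error = k_fold_mse(xs, ys, k=5)   # estimate out-of-sample error first
model = fit_line(xs, ys)             # then train on all available data
decisions = batch_decide(model, [1.0, 10.0, 30.0], threshold=40.0)

print(f"cross-validated MSE: {cv_error:.3f}")
for x, score, flagged in decisions:
    print(f"x={x:5.1f} score={score:7.2f} flagged={flagged}")
```

Keeping validation (`k_fold_mse`) separate from the final fit mirrors the slide's split between analytics creation and operationalization: the cross-validated error informs whether the model is worth deploying, while `batch_decide` stands in for the production scoring and decisioning path.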