Gse uk-cedrinemadera-2018-shared

  1. Big Data to AI Analytics Trends and Directions: Cedrine Madera, PhD Executive Information Architect Member of IBM Academy Of Technology
  2. Unleashing your data and making the shift to a Data-Driven Organization Value Uses of Data Efficiency Modernization Data Decision Monetization Operations Reporting & Data Warehousing Self-Service Analytics New Business Models Data Science Analytics maturity level From information driven to data driven
  3. BIG DATA, MACHINE LEARNING AND COGNITIVE/AI
  4. Cognitive BUSINESS VALUE 1990’s DATA WAREHOUSE 2012 BIG DATA 2014 Data Lake Store and analyse growing volumes of data to meet analytics requirements - Information driven Systems Integrate unstructured data – Apache Hadoop experimentation - hybrid information & data driven systems To support digital transformation, data driven model Strong analytics foundations to go to AI Information systems Velocity / Variety / Volume of Data 2017 Cognitive Information System 2018 Infuse AI
  5. Semantic • Artificial Intelligence (AI) • Intelligence exhibited by machines or software • Machine Learning (ML) • Type of AI that enables computers to learn without being explicitly programmed • Deep Learning (DL) • Type of ML, based on neural networks loosely modeled after the brain • learns features and representations of data • Training • neural “inspired”, fed by millions of data points • repetition drives weighting and connections Cognitive Systems : A category of technologies that uses natural language processing and machine learning to enable people and machines to interact more naturally to extend and magnify human expertise and cognition. These systems will learn and interact to provide expert assistance to scientists, engineers, lawyers, and other professionals in a fraction of the time it now takes. Machine Learning Deep Learning Break tasks into Artificial Neural Networks Advanced Analytics: NoSQL, Hadoop & Analytics Human Intelligence Exhibited by Machines Cognitive / AI “Trained” using large amounts of data & ability to learn how to perform the task
  6. What the market is saying… https://www.forbes.com/sites/brentdykes/2017/01/11/crawl-with-analytics-before-running-with-artificial-intelligence/#61efd2f8299c Ovum : 2017 Trends to Watch: Analytics Machine learning and automation is the enterprise reality of AI science fiction “A market for algorithms will emerge..” Upgrading data architectures must balance new capabilities with existing investments IDC Crawl With Analytics Before Running With Artificial Intelligence
  7. No Artificial Intelligence without Information Architecture
  8. The descriptive Analytics challenges Functional • Regulation & compliance (GDPR) • Silos • All data types Non functional • Scalability • Reliability • Security • Data governance • Data Gravity Descriptive analytics can be classified into three areas that answer certain kinds of questions: • Standard reporting and dashboards: What happened? How does it compare to our plan? What is happening now? • Ad-hoc reporting: How many? How often? Where? • Analysis/query/drill-down: What exactly is the problem? Why is it happening?
  9. The Predictive Analytics challenges Functional • Information system coverage extension • Skills- open technologies • Machine Learning Non functional • Volume • Security • transparency Predictive analytics can be classified into six categories: •Data mining: What data is correlated with other data? •Pattern recognition and alerts: When should I take action to correct or adjust a process or piece of equipment? •Monte-Carlo simulation: What could happen? •Forecasting: What if these trends continue? •Root cause analysis: Why did something happen? •Predictive modeling: What will happen next if?
  10. The Prescriptive Analytics challenges Functional •Business rules automation Non functional •Real time •Historical data volume Prescriptive analytics, which is part of “advanced analytics,” is based on the concept of optimization, which can be divided into two areas: •Optimization: How can we achieve the best outcome? •Stochastic optimization: How can we achieve the best outcome and address uncertainty in the data to make better decisions?
  11. The Data governance challenges Functional CDO- CPO Ethics & Analytics Regulations Non functional •Data Life cycle •Data Security •Data quality Data governance (DG) refers to the overall management of the availability, usability, integrity, and security of the data employed in an enterprise.
  12. The Data Architecture challenges Functional HTAP* Data Lake IoT Non functional • Volume • Cost • Data Security • Data quality • Real time Data architecture is a set of rules, policies, standards and models that govern and define the type of data collected and how it is used, stored, managed and integrated within an organization and its database systems. It provides a formal approach to creating and managing the flow of data and how it is processed across an organization’s IT systems and applications. *Hybrid Transactional Analytical Processing
  13. How can z Systems help solve these challenges? Analytics, machine learning, data governance, data architecture
  14. The descriptive Analytics challenges Accelerators IBM DB2 Analytics Accelerator DB2 BLU DASHDB SIMD SMT • Data movement – ETL • INZA-predictive modelling • Queries • Open language R-Scala(Spark) • Archives • Federation • DB2 z/OS- IMS-VSAM-Oracle Technology breadth: to simplify, to alleviate, to secure Data gravity: volume, sensitivity, cost HTAP enablement
  15. The Predictive Analytics challenges Open Framework Machine Learning IBM SPSS Apache Spark IBM Machine Learning on z/OS R Technology breadth: to simplify, to alleviate, to secure Data gravity: volume, sensitivity, cost HTAP enablement
  16. Machine Learning Basics Identifies patterns in historical data Builds/trains behavioral models from patterns Makes recommendations Machine learning is everywhere, influencing nearly everything we do… Netflix personalized movie recommendations Waze personalized driving experience 7 out of 10 financial customers would take recommendations from a robot advisor
  17. Machine Learning - Process Data Ingestion Data Cleaning and Transformation Model Training Testing and Validation Deployment Model Selection From experimentation to production… the real data science challenge
  18. Machine Learning can be applied to a Variety of Use Cases Across Problem Types and Industries Machine learning can help IT departments… batch optimization, predictive maintenance/failure,… be embedded into any expert system.
  19. The Data governance challenges Move analytics power & security to data Ethics framework into Analytics project HW accelerator Memory extended zIIP eligibility Zero cost – Zero latency for IDAA Apache Spark Pervasive encryption MDM Machine Learning Privacy by design and by default Technology breadth: to simplify, to alleviate, to secure Data gravity: volume, sensitivity, cost HTAP enablement
  20. The analytics ethics dilemma with personal data: how GDPR could slow down analytics projects. New analytics or machine learning projects will require ethical policies by design and by default.
  21. The importance of the ethical dimension in analytics and machine learning projects
  22. Recommendations for GDPR readiness with Analytics and Machine learning projects • Check whether personal data is processed in big data analytics, and consider using appropriate techniques to anonymize the personal data in the dataset(s) before analysis... • Become transparent about their processing of personal data by using a combination of innovative approaches in order to provide meaningful privacy notices at appropriate stages throughout a big data project. • Embed a privacy impact assessment framework into their big data processing activities to help identify privacy risks and assess the necessity and proportionality of a given project. • Adopt a privacy by design approach in the development and application of their big data analytics. This should include implementing technical and organizational measures to address matters including data security, data minimization and data segregation... • Develop ethical principles to help reinforce key data protection principles. Organizations should create ethics boards to help scrutinize projects and assess complex issues arising from big data analytics... • Implement innovative techniques to develop auditable machine learning algorithms. Internal and external audits should be undertaken with a view to explaining the rationale behind algorithmic decisions and checking for bias, discrimination and errors...
  23. The Data Architecture challenges Federated data lake Hybrid cloud integration IDAA Apache Spark DashDB Linux on z Technology breadth: to simplify, to alleviate, to secure Data gravity: volume, sensitivity, cost HTAP enablement
  24. Reasons to limit data movement to build a physical data lake Data gravity – analytic processing moves to where the data resides Data sensitivity – encrypt data in case of a data breach Real time analytics requirements Data governance high requirements : •Data quality : reduce data copies •Data security : regulations (such as GDPR) •Data life cycle management : alleviate and optimize data management
  25. The hybrid data lake federated approach To alleviate data movement To use federated data approach To respect data gravity To leverage existing data set To limit data discrepancy Use z Systems as one of the physical repositories Leave z Systems data in place Show your data scientists how easy it is to access z data
  26. Imperatives to implement Data Lake hybrid scenario Reduce complexity of information supply chain, e.g. • Avoid data movement • Simplify data transformation • Use in-DB transformation • Use temporary tables structures Adhere to innovative and novel Analytics concepts, e.g. • Limit number of data marts and data cubes • Use aggregation on the fly • Allow for agile usage patterns • Leverage HTAP* architecture
  27. Technologies to use for hybrid data lake approach Leverage state-of-the-art technology, e.g. HW accelerators Special-purpose appliances In-memory processing Use federation technique whenever possible, e.g. Federated SQL queries, leaving data in place Federated analytical processing, leaving data in place Open Framework (e.g. Apache Spark) *Hybrid Transactional Analytical Processing
  28. Data in IBM DB2 Analytics Accelerator • An extension of a DB2 for z/OS system • ETL process acceleration and alleviation • Accelerating SQL access to z/OS data, including IMS, VSAM ... loaded by IDAA Loader • Managing huge volumes of history data (HPSS) • R queries accelerator • Apache Spark on z/OS queries accelerator Transparent and easy data scientists access • Through JDBC or API from Spark on distributed including Linux on z With Spark on z/OS as well as Machine Learning on z/OS z Systems as a Data Lake Repository in a hybrid approach: make z Data Simple
  29. Descriptive Predictive Prescriptive Data architecture Data governance Technology breadth with IBM Z Ask your Information Architect to leverage them! Wrap up of the presentation Analytics From information driven to data driven, IBM Z can help to meet the challenge!
  30. Thank you Cedrine Madera, PhD Executive Information Architect Member of IBM Academy Of Technology
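
The machine learning process on slide 17 (data ingestion, data cleaning and transformation, model training, testing and validation, model selection, deployment) can be sketched in a few lines of Python. This is only a minimal illustration, not the speaker's method: the synthetic data, the two candidate models, the 60/20/20 split and every function name below are assumptions made for the example, and it uses the Python standard library rather than Apache Spark or IBM Machine Learning on z/OS.

```python
import random
import statistics


def ingest(n=300, seed=0):
    """Data ingestion: synthetic (x, y) records; about 5% have a missing y."""
    rng = random.Random(seed)
    rows = []
    for _ in range(n):
        x = rng.uniform(0.0, 10.0)
        y = 3.0 * x + 2.0 + rng.gauss(0.0, 1.0)
        rows.append((x, None if rng.random() < 0.05 else y))
    return rows


def clean(rows):
    """Data cleaning: drop records whose target value is missing."""
    return [(x, y) for x, y in rows if y is not None]


def split(rows, seed=1):
    """Shuffle, then split 60/20/20 into train, validation and test sets."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    a, b = int(0.6 * len(shuffled)), int(0.8 * len(shuffled))
    return shuffled[:a], shuffled[a:b], shuffled[b:]


def train_mean(train):
    """Baseline model: always predict the mean target seen in training."""
    mean_y = statistics.fmean(y for _, y in train)
    return lambda x: mean_y


def train_linear(train):
    """Model training: least-squares fit of y = slope * x + intercept."""
    mx = statistics.fmean(x for x, _ in train)
    my = statistics.fmean(y for _, y in train)
    sxy = sum((x - mx) * (y - my) for x, y in train)
    sxx = sum((x - mx) ** 2 for x, _ in train)
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model, rows):
    """Mean squared error of a model on a set of (x, y) records."""
    return statistics.fmean((model(x) - y) ** 2 for x, y in rows)


def run_pipeline():
    """Run every stage and return the selected model with its scores."""
    train, validation, test = split(clean(ingest()))
    candidates = {"mean": train_mean(train), "linear": train_linear(train)}
    # Model selection: compare candidates on the validation set only.
    validation_error = {name: mse(m, validation) for name, m in candidates.items()}
    best = min(validation_error, key=validation_error.get)
    model = candidates[best]
    # Testing: score the selected model once on data it has never seen.
    # "Deployment" here is simply returning the callable to the caller.
    return best, validation_error, mse(model, test), model


if __name__ == "__main__":
    best, validation_error, test_error, predict = run_pipeline()
    print("selected model:", best)
    print("validation MSE:", validation_error)
    print("test MSE:", test_error)
    print("prediction for x = 4:", predict(4.0))
```

Choosing the model on the validation set and reporting its error on the untouched test set keeps the final figure honest, which is one small part of the "from experimentation to production" gap that slide 17 points to.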