4. The insurer's needs
Customer / network knowledge: 360° view, preferences, policyholder profiling, targeting of policyholder needs, fraud detection, control of reimbursement costs
Back-office efficiency: leaner processes and lower costs; prevention rather than correction
Prediction and prevention: offer prevention services to build loyalty and make predictions more reliable; predict claims
5. Insurance users' needs
Cost reduction: insurance adapted to their needs and their profile
Back-office efficiency: simpler processes for declarations, claim handling, and reimbursements
Prevention and safety: receive preventive information services; take part in loyalty programs to obtain discounts; anticipate potential problems (preventive medicine)
8. Data types
Unstructured: audio, video, images. Meaningless without adding some structure.
Semi-structured: JSON, XML, sensor data, social media, device data, web logs. Flexible data model structure.
Structured: CSV, columnar storage (Parquet, ORC). Strict data model structure.
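As a rough illustration of the difference, the short Python sketch below reads the same measurement once from a structured CSV row and once from a semi-structured JSON document; the file contents and field names are made up for the example.

```python
import csv, io, json

# Structured: CSV with a fixed, strict schema - every row has the same columns.
csv_data = io.StringIO("policy_id,claim_amount\nP-001,1250.50\n")
for row in csv.DictReader(csv_data):
    print(row["policy_id"], float(row["claim_amount"]))

# Semi-structured: JSON with a flexible schema - nested and optional fields are allowed.
json_data = '{"policy_id": "P-001", "claim": {"amount": 1250.50, "photos": ["img1.jpg"]}}'
doc = json.loads(json_data)
print(doc["policy_id"], doc["claim"]["amount"], len(doc["claim"].get("photos", [])))
```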
9. Big Data introduces a new culture of experimentation
Historical campaign effectiveness → understanding customer patterns to discover cross-selling opportunities
Generating year-end financial reports → real-time financial tracking and recommendations to increase revenue
Generating year-end financial reports → real-time product offers and promotions based on behavior
Collecting historical data on device performance → real-time monitoring to identify proactive maintenance
Shipping features without understanding their success → shipping successful features by correlating user actions with product experience
11. However, Big Data comes with challenges
Obtaining the skills and capabilities
Determining how to derive value
Integration with existing IT investments
*Gartner: Survey Analysis – Hadoop Adoption Drivers and Challenges (Stamford, CT.: Gartner, 2015)
12. Too much information … kills information
Too early is not on time; too late is no longer on time.
But beware: big data is not magic.
Gartner predicts that 2017 will
see 60 percent of big data
projects fail.
https://www.networkworld.com/article/3170137/cloud-computing/why-big-data-projects-fail-and-how-to-make-2017-different.html
16. How to do Big Data
• Start with a concrete experiment (SCOPE)
• Pre-action
• Take stock of the available data sources (you already have some…)
• Remove the legal barriers (GDPR, PCI, health data…)
• Catalogue internal skills (technology, business)
• Switch to Agile mode for iterative development
• Action
• Use the cloud to keep costs under control (on-premises investments are heavy)
• Separate the data layer from the processing layer (the two are often coupled)
• Prioritise functional needs (360° view, fraud detection, bots, use of cognitive services, back-office processes)
• Project by project, your data lake will take shape…
17. For the implementation
• Brainstorm
• 3-3 technique ("I would like to improve…", "I would not change…")
• Look at off-the-shelf solutions (Cortana gallery, …)
• https://gallery.cortanaintelligence.com/browse
• Decide the order of the projects to run
• Quick ROI
• Easy to integrate
• Overall benefit
18. Example implementation
Architecture diagram (lambda-style, on Azure):
SPEED layer (real-time analytics, real-time monitor): data streams → Event Hubs → JSON to rowset, score and enrich via a Machine Learning web service, with reference data from Blob Storage → Power BI (real time)
BATCH layer (historical analytics): raw data in Blob Storage → Data Lake Analytics → data loading into SQL Data Warehouse → Analysis Services; Machine Learning for data science; orchestrated with Data Factory; together forming the data lake
SERVING layer: SQL DB queried by Power BI with DirectQuery; self-service data analysis, BI solutions, data warehouse & BI
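To make the SPEED-layer ingress concrete, here is a minimal sketch of pushing one JSON event into Event Hubs with the azure-eventhub Python SDK; the connection string, hub name and event fields are placeholders, and the JSON-to-rowset / scoring step is assumed to run in the downstream stream-processing job.

```python
import json
from azure.eventhub import EventHubProducerClient, EventData

# Placeholders - supply your own namespace connection string and hub name.
CONN_STR = "<event-hubs-connection-string>"
EVENT_HUB_NAME = "<hub-name>"

def send_event(producer: EventHubProducerClient, payload: dict) -> None:
    """Serialize one event as JSON and send it in a single batch."""
    batch = producer.create_batch()
    batch.add(EventData(json.dumps(payload)))
    producer.send_batch(batch)

if __name__ == "__main__":
    producer = EventHubProducerClient.from_connection_string(
        conn_str=CONN_STR, eventhub_name=EVENT_HUB_NAME
    )
    with producer:
        # Hypothetical telemetry event feeding the real-time (SPEED) path.
        send_event(producer, {"policy_id": "P-001", "sensor": "crash", "value": 1})
```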
Editor's Notes
Hi, I’m ___, and I’m excited to discuss Microsoft’s Cortana Intelligence Suite, and the value it can deliver to your organization.
Data-driven companies
Future State/Benefit outcome
With all of this new data available, we are creating an insight economy.
Uncovering new insights by collecting and analyzing this data carries the promise of competitive advantage and efficiency savings: better understanding customers by predicting what they might buy based on behavior and demographics, optimizing the supply chain toward better or faster routes, and reducing the risk of fraud by identifying suspicious behavior – it's all about the data.
Those that don't harness data are at a major disadvantage.
understand the past, monitor the present, and predict the future
MIT: data-driven decision environments have 5% higher productivity, 6% higher profit and up to 50% higher market value than other businesses.
Relational databases (RDBMS) work with structured data. Non-relational databases (NoSQL) work with semi-structured data
Key Points:
Businesses can use new data streams to gain a competitive advantage.
Microsoft is uniquely equipped to help you manage the growing volume and variety of data: structured, unstructured, and streaming.
Talk Track:
Does it not seem like every day there is a new kind of data that we need to understand?
New data types continue to expand—we need to be prepared to collect that data so that the organization can then go do something with it.
Structured data, the type of data we have been working with for years, continues to accelerate. Think how many transactions are occurring across your business.
Unstructured data, the typical source of all our big data, takes many forms and originates from various places across the web including social.
Streaming data is the data at the heart of the Internet of Things revolution. Just think about how many things in your organization are smart or instrumented and generating data every second.
All of this means that data volumes are growing and bringing new capacity challenges. You are also dealing with an enormous opportunity, taking all of this data and putting it to work. In order to take advantage of all this data, you first need a platform that enables you to collect any data—no matter the size or type. The Microsoft data platform is uniquely complete and can help you collect any data using a flexible approach:
Collecting data on-premises with SQL Server
SQL Server can help you collect and manage structured, unstructured, and streaming data to power all your workloads: OLTP, BI, and Data Warehousing
With new in-memory capabilities that are built into SQL Server 2014, you get the benefit of breakthrough speed with your existing hardware and without having to rewrite your apps.
If you’ve been considering the cloud, SQL Server provides an on-ramp to help you get started. Using the wizards built into SQL Server Management Studio, extending to the cloud by combining SQL and Microsoft Azure is simple.
Capture new data types using the power and flexibility of the Microsoft Azure Cloud
Azure is well equipped to provide the flexibility you need to collect and manage any data in the cloud in a way that meets the needs of your business.
Big data in Azure: HDInsight: an Apache Hadoop-based analytics solution that allows cluster deployment in minutes, scale up or down as needed, and insights through familiar BI tools.
SQL Databases: managed relational SQL Database-as-a-service that offers business-ready capabilities built on SQL Server technology.
Blobs: a cloud storage solution offering the simplest way to store large amounts of unstructured text or binary data, such as video, audio, and images.
Tables: a NoSQL key/value storage solution that provides simple access to data at a lower cost for applications that do not need robust querying capabilities.
Intelligent Systems Service: cloud service that helps enterprises embrace the Internet of Things by securely connecting, managing, and capturing machine-generated data from a variety of sensors and devices to drive improvements in operations and tap into new business opportunities.
Machine Learning: if you’re looking to anticipate business challenges or opportunities, or perhaps expand your data practice into data science, Azure’s new Machine Learning service—cloud-based predictive analytics— can help. ML Studio is a fully-managed cloud service that enables data scientists and developers to efficiently embed predictive analytics into their applications, helping organizations use massive data sets and bring all the benefits of the cloud to machine learning.
Document DB: a fully managed, highly scalable, NoSQL document database service
Azure Stream Analytics: real-time event processing engine that helps uncover insights from devices, sensors, infrastructure, applications, and data
Azure Data Factory: enables information production by orchestrating and managing diverse data
Azure Event Hubs: a scalable service for collecting data from millions of “things” in seconds
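As one concrete example of "collect any data", this is a minimal sketch of landing an unstructured file in Blob storage with the azure-storage-blob Python SDK; the connection string, container and file names are placeholders, and the container is assumed to exist already.

```python
from azure.storage.blob import BlobServiceClient

# Placeholder - use your own storage account connection string.
CONN_STR = "<storage-account-connection-string>"

service = BlobServiceClient.from_connection_string(CONN_STR)
# Assumes a container named "rawdata" already exists in the account.
blob = service.get_blob_client(container="rawdata", blob="claims/photo-001.jpg")

# Upload an unstructured binary file (image, audio, video ...) to the raw-data container.
with open("photo-001.jpg", "rb") as data:
    blob.upload_blob(data, overwrite=True)
```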
Microsoft Analytics Platform System:
In the past, to provide users with reliable, trustworthy information, enterprises gathered relational and transactional data in a single data warehouse.
But this traditional data warehouse is under pressure, hitting limits amidst massive change.
Data volumes are projected to grow tenfold over the next five years. End users want real-time responses and insights.
They want to use non-relational data, which now constitutes 85 percent of data growth. They want access to “cloud-born” data, data that was created from growing cloud IT investments.
Your enterprise can only cope with these shifts with a modern data warehouse—the Microsoft Analytics Platform System is the answer.
The Analytics Platform System brings Microsoft’s massively parallel processing (MPP) data warehouse technology—the SQL Server Parallel Data Warehouse (PDW), together with HDInsight, Microsoft’s 100 percent Apache Hadoop distribution—and delivers it as a turnkey appliance.
Now you can collect relational and non-relational data in one appliance.
You can have seamless integration of the relational data warehouse and Hadoop with PolyBase.
All of these options give you the flexibility to get the most out of your existing data capture investments while providing a path to a more efficient and optimized data environment that is ready to support new data types.
Relational databases (RDBMS) work with structured data. Non-relational databases (NoSQL) work with semi-structured data
Data is now the key strategic business asset. Every device, every customer, every activity – everything that’s happening in the world around us - is producing incredibly rich data that can help us create new experiences, new efficiencies, new business models and even new inventions. Leveraging this data can be the differentiator for your business. For example, IDC estimates companies that are leaders in using data assets to their advantage will capture $1.6 trillion more in business value than those that lag behind.
While data is pervasive, actionable intelligence from data is elusive. Our customers want to transform data to intelligent action and reinvent their business processes. To do this they need to more easily analyze massive amounts of data – so they can move from seeing “what happened” and understanding “why it happened” to predicting “what will happen” and ultimately, knowing “what should I do”. Only then can they create the intelligent enterprise.
Why Big Data?
It's complex – it's normal to feel lost
Take a step back to understand
Why are we interested?
Current state/Negative
Hadoop
Cortana Intelligence delivers an end-to-end platform with an integrated and comprehensive set of tools and services to help you build intelligent applications that let you easily take advantage of Advanced Analytics and intelligence capabilities.
First, Cortana Intelligence provides services to bring data in, so that you can analyze it. It provides information management capabilities like Azure Data Factory so that you can pull data from any source (relational DB like SQL or non-relational ones like your Hadoop cluster) in an automated and scheduled way, while performing the necessary data transforms (like setting certain data columns as dates vs. currency etc). Think ETL (Extract, Transform, Load) in the cloud. Event Hubs does the same for IoT type ingestion of data that streams in from lots of end points.
The data brought in then can be persisted in flexible big data storage services like Data Lake Store and Azure SQL Data Warehouse.
You can then use a wide range of analytics services from Machine Learning to Azure Data Lake Analytics to Azure HDInsight to Azure Stream Analytics to analyze the data stored in the big data storage. This means you can create analytics services and models specific to your business need (say real time demand forecasting).
The resultant analytics services and models created by taking these steps can then be surfaced as interactive dashboards and visualizations via Power BI.
These same analytics services and models can also be integrated into various UIs (web apps, mobile apps, or rich client apps), or with Cortana, so end users can naturally interact with them via speech, and can be proactively notified by Cortana when the analytics model finds a new anomaly (unusual growth in certain product purchases, in the case of the real-time demand forecasting example above) or anything else that deserves the business users' attention. Similar integration can occur with Cognitive Services or Bot Framework based applications.
At a high level though, Cortana Intelligence capabilities are in three main areas: data, analytics and intelligence.
<Transition>: We’re going to dive into each one, starting with data.
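As a rough sketch of the "embed predictive analytics into applications" step described above, the snippet below posts one record to a deployed machine-learning web service and reads back the score; the endpoint URL, API key and payload shape are hypothetical, since each deployed service defines its own request format.

```python
import requests

# Hypothetical endpoint and key for a deployed scoring web service.
SCORING_URL = "https://example.azure.net/score"
API_KEY = "<api-key>"

def score(record: dict) -> dict:
    """Send one record for scoring and return the service's JSON response."""
    response = requests.post(
        SCORING_URL,
        json={"data": [record]},          # payload shape is illustrative only
        headers={"Authorization": f"Bearer {API_KEY}"},
        timeout=30,
    )
    response.raise_for_status()
    return response.json()

if __name__ == "__main__":
    print(score({"policy_id": "P-001", "claim_amount": 1250.50}))
```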
Highlights
Event Hubs + ASA (vs. Kafka/Spark Streaming):
The message structure is simple
The required business logic was immediate to implement
Strong integration with ADL, AML, PBI
Price
With a more complex message structure, Hadoop or other technologies might be better suited. In addition, we are forcing a CEP tool into a micro-batching scenario.
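A minimal sketch of what the "JSON to rowset" step in the speed layer amounts to, written here in plain Python rather than as an actual Stream Analytics query; the event fields are illustrative.

```python
import json

def json_events_to_rows(raw_events: list[str]) -> list[tuple]:
    """Flatten a micro-batch of JSON events into fixed-column rows for the serving layer."""
    rows = []
    for raw in raw_events:
        event = json.loads(raw)
        # Illustrative columns; a real job would project whatever fields it needs.
        rows.append((event.get("policy_id"), event.get("sensor"), event.get("value")))
    return rows

batch = ['{"policy_id": "P-001", "sensor": "crash", "value": 1}',
         '{"policy_id": "P-002", "sensor": "speed", "value": 87}']
print(json_events_to_rows(batch))
```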
Serving layer with SQL DB (vs. many things… e.g. NoSQL technologies or even search)
Classic BI workload
In-memory columnar
Data volumes are not an issue even for a scale-up layer (1 day of data + a few historical metrics)
Integration w/ PBI (DirectQuery)
PBI + ASA alone didn't fit because it lacks analytic power
Data Lake on Azure Storage (vs. HDFS or ADL)
No real game changer in ADL (i.e. ACL, file size / amount), but it would fit as well
ADLA and Polybase support
ADLA for pre-processing of data (the model was much more convenient than Hadoop Hive)
It is a service by itself and integrates super well with many other services for processing
HDFS would force the Hadoop ecosystem
Datawarehouse = SQL DW + AAS
SQL DW has a strong integration with Azure Storage for data loading
AAS was a revelation for them – no Hadoop component provides the same user experience
Data Science
Learn anomalies vs. learn SLA (big difference in consumption)
Placeholder, would require much more work with the internal user
This is the heart of anomaly detection... Huge impact on the organization
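Since the notes only flag anomaly detection as the heart of the use case without detail, here is a minimal sketch of one common approach (a rolling z-score on a metric); the window size, threshold and data series are arbitrary illustrations, not the project's actual model.

```python
import random
from statistics import mean, stdev

def zscore_anomalies(values, window=20, threshold=3.0):
    """Flag points whose z-score against the preceding window exceeds the threshold."""
    anomalies = []
    for i in range(window, len(values)):
        history = values[i - window:i]
        mu, sigma = mean(history), stdev(history)
        if sigma > 0 and abs(values[i] - mu) / sigma > threshold:
            anomalies.append((i, values[i]))
    return anomalies

# Illustrative series: a mostly stable signal with one injected spike.
random.seed(0)
series = [10 + random.gauss(0, 0.5) for _ in range(40)]
series[30] = 25.0  # the anomaly we expect to detect
print(zscore_anomalies(series))
```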