Legal Compliance and Privacy
• Regulations on what data, where, and from which people
• Protecting sensitive information from prying eyes, inside and out
• Guaranteeing non-discrimination in machine learning
• Anti-trust and anti-competition regulations
Data Breaches and Analytics Integrity
• Have we been breached? By whom? What did they access?
• Are our analytics correct? Are they tampered with?
• Did our data transmit correctly?
• Did input streams ingest correctly?
• Is there malicious intent from any of our data suppliers?
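A minimal sketch of one way to answer "Did our data transmit correctly?": the sender publishes a SHA-256 digest alongside the payload and the receiver recomputes it. The function names and the sample payload are hypothetical and not part of the original slides. A plain digest only catches accidental corruption; it does not stop a deliberate attacker who can also replace the digest.

```python
import hashlib


def sha256_hex(payload: bytes) -> str:
    """Return the SHA-256 digest of a payload as a hex string."""
    return hashlib.sha256(payload).hexdigest()


def transmitted_correctly(received: bytes, expected_digest: str) -> bool:
    """Compare the digest of what arrived with the digest the sender published."""
    return sha256_hex(received) == expected_digest


# Hypothetical payload, for illustration only.
original = b"account=42,amount=100.00"
digest = sha256_hex(original)

intact = transmitted_correctly(original, digest)
corrupted = transmitted_correctly(b"account=42,amount=900.00", digest)
```

Here `intact` is true and `corrupted` is false, because a single changed byte yields a different digest.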
Case In Point
• Financial Industry
• Our entire company was looted by a hacker inducing devastating trades
• Health Care Industry
• Massive lawsuit over mental health records made accessible by rogue analysis
• Credit Reporting Bureaus
• Class action lawsuit for malicious bogus content inserted by a rogue provider
• Intellectual Property Generating Firms
• A competitor just bought a new company with an exact copy of our stack?
Where did it go wrong?
• Spoofing and Identity Theft
• Gap in capabilities between attackers and defenders
• Security versus scalability myth
Specific Issues to Address
• Due diligence for legal battles on specific breaches or illicit access
• Inability to detect intrusions
• Excessive trust in identities in ‘restricted environments’
• Need to solve these without performance hits
Nice to haves
• Did my data set linking actually work?
• Did this new analytic tool produce ‘quality’ results?
• What questions can I ask?
• Have my sensor arrays just fried and started producing garbage output?
• Is someone tampering with my input data?
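One possible approach to the tampering question above, sketched under assumptions: each data supplier signs its records with a shared secret key using an HMAC, and the consumer verifies the tag before ingesting. The key, record contents, and function names are illustrative only; the slides do not prescribe this mechanism.

```python
import hashlib
import hmac


def sign(record: bytes, key: bytes) -> str:
    """Compute an HMAC-SHA256 tag for a record with a shared secret key."""
    return hmac.new(key, record, hashlib.sha256).hexdigest()


def is_untampered(record: bytes, tag: str, key: bytes) -> bool:
    """Verify the tag using a constant-time comparison."""
    return hmac.compare_digest(sign(record, key), tag)


# Hypothetical key and record, for illustration only.
key = b"shared-secret-between-supplier-and-consumer"
record = b"sensor=7,reading=21.5"
tag = sign(record, key)

accepted = is_untampered(record, tag, key)
rejected = is_untampered(b"sensor=7,reading=99.9", tag, key)
```

Unlike a bare checksum, the tag cannot be recomputed by someone who alters the record without knowing the key, so `rejected` is false here while `accepted` is true.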
• Option A – extend the TCP/IP stack with a security layer
• Option B – rebuild the entire stack from the ground up
• Option C – There is no C
Example B: Project Moonstone
• A set of design and project planning docs - not code
• Designed to provide security capabilities as a framework
• Replace your Hadoop, Spark, and other data science systems
• Adapters that allow these systems to operate inside Moonstone
• Inside a pre-built Scrum project framework
• Little overhead required
• Integrated modular distributed anti-virus and intrusion detection
Qu Secure Data Science Language Concept
• Primitives for building other systems, with built-in graph analysis and SQL
• Derived from Scala and Erlang
• Security everywhere – no trusted places
• Auditing guarantees
Qu Concept Overview
• A DataSet contains Tables, which are collections of Nodes
• Nodes contain Links to other Nodes, in the same Table or in a different one
• All are immutable: they cannot change once created
• DataSets control access to the DataStores that load and create DataSets
• All versions of all data are stored, but some are offloaded to bulk storage
• Every DataSet, Table, Node, and Link carries a creation timestamp
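The data model above can be sketched roughly as follows. Qu is only a concept in these slides, so this is not Qu code: it is a Python approximation using frozen dataclasses, and every field name and the `with_node` helper are assumptions made for illustration. DataStore access control and bulk-storage offloading are left out.

```python
from dataclasses import FrozenInstanceError, dataclass, field
from datetime import datetime, timezone
from typing import Tuple


def _now() -> datetime:
    """Creation timestamp in UTC."""
    return datetime.now(timezone.utc)


@dataclass(frozen=True)
class Link:
    target_table: str  # may name the owning Table or a different one
    target_node: str
    created: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class Node:
    key: str
    links: Tuple[Link, ...] = ()
    created: datetime = field(default_factory=_now)


@dataclass(frozen=True)
class Table:
    name: str
    nodes: Tuple[Node, ...] = ()
    created: datetime = field(default_factory=_now)

    def with_node(self, node: Node) -> "Table":
        """Return a new Table version; the old version is left untouched."""
        return Table(self.name, self.nodes + (node,))


@dataclass(frozen=True)
class DataSet:
    name: str
    tables: Tuple[Table, ...] = ()
    created: datetime = field(default_factory=_now)


# Hypothetical data, for illustration only.
alice = Node("alice", links=(Link("accounts", "acct-1"),))
people_v1 = Table("people")
people_v2 = people_v1.with_node(alice)
dataset = DataSet("demo", tables=(people_v2,))

try:
    alice.key = "mallory"  # any write to a frozen instance raises
    mutated = True
except FrozenInstanceError:
    mutated = False
```

Because nothing is changed in place, `people_v1` and `people_v2` coexist as separate versions, which is the property that makes storing all versions of all data possible.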
Moonstone on Qu
• Identity Graph, 3-connectedness, and the SecureSocket Interface
• Data cleaning as a security module
• ‘System Temperature’ and automated intrusion reactions
• Automated evaluations and auditing interfaces
• Detecting perimeter threats