Big Data Analytics

Ben Fountain
May 2016


  1. Big Data Analytics. Ben Fountain, May 2016
  2. What is Big Data?
     “The dynamically linked super set of multiple significant scale discrete data sets.” -Oscar Wilde
     Characteristics include:
     • Large volumes, typically adding terabytes of data daily
     • Aggregation of many historically discrete data sets
     • Dynamic links between the data sets
     Consequently:
     • Any analysis is a point-in-time position
  3. Why care?
     • Better intelligence, which can be leveraged in business, healthcare etc. to target efforts.
     • The cost of a DNA analysis has fallen by around five orders of magnitude since the process became possible, making personalised medicines a reality in the near future.
     • If you are investing in Big Data projects, the risk of data loss doesn’t necessarily change. The volume of loss is potentially colossal, with impacts that aren’t understood for an extended period.
     • Customers hold concerns about companies taking on the role of an Orwellian Big Brother.
  4. There’s no Best Practice… yet
     Breaches
     • Snowden showed that Government organisations with a specific focus on security struggle to control Big Data and the associated risks.
     • The Panama Papers showed that legal firms with an inherently high level of confidentiality in their practices struggle.
     Compliance issues
     • Harder to define the purpose of data exploration.
     • Big Data breaches tend to be… bigger.
     • Regulators will expect technology to be used equally to exploit and control Big Data.
  5. Key Controls for Big Data
     1. Track all access that collects, views, and manipulates sensitive data, and ensure that it is encrypted at each point.
     2. Encryption keys for sensitive data can't be stored at the same location as the data.
     3. All access and processing of data must be logged. These logs must be subject to human and automated monitoring.
     4. Use automated scanning to constantly monitor systems for vulnerabilities and malware.
     5. Monitor network egress for anomalies in traffic.
     6. Create a number of "false flag" records. Configure alerts and blocks to identify and prevent data breaches.
  6. How to use Big Data Analytics?
     • Prescriptive Analytics: How can we influence the future?
     • Predictive Analytics: How can we plan for the future?
     • Diagnostic Analytics: Why did this happen?
     • Descriptive Analytics: Do we know what happened?
     [Diagram labels: Analytics Maturity; Historical Analytics; Proactive Analytics]
  7. Police use of Predictive Analytics
     The California city of Fresno is just one of the police departments in the US already using a software program called “Beware” to generate “threat scores” about an individual, address or area. As reported by the Washington Post in January, the software works by processing “billions of data points, including arrest reports, property records, commercial databases, deep web searches and the [person’s] social media postings”.
     Photo: Nick Otto/For The Washington Post
     Quote: https://www.theguardian.com/technology/2016/feb/04/us-police-data-analytics-smart-cities-crime-likelihood-fresno-chicago-heat-list
  8. How to do it well
     Staff appropriately
     • Specialist skills are in demand:
       • Big Data
       • Data Management
     • Have a plan to recruit and retain them!
     Data Quality
     • Big Data leaders show maturity in data quality
  9. Final Point
     Big Data is a prerequisite of the desire for better analytics, the desire to better understand. Of itself, it's just a large data set waiting to be breached.
  10. Points of contact
      Ben Fountain, Senior Consultant
      M: +44 (0) 7545 503 311
      E: ben.fountain@nccgroup.trust
      NCC Group Blogs: https://www.nccgroup.trust/uk/about-us/newsroom-and-events/blogs/
      TED Talks on Big Data: https://www.ted.com/search?q=big+data
  11. Experiment
      “The dynamically linked super set of multiple significant scale discrete data sets.” -Oscar Wilde
      Well, that’s a lie.
  12. NCC Locations
      Europe: Manchester (Head Office), Amsterdam, Basingstoke, Cambridge, Copenhagen, Cheltenham, Delft, Edinburgh, Glasgow, Leatherhead, Leeds, London, Luxembourg, Madrid, Malmö, Milton Keynes, Munich, Vilnius, Zurich
      Australia: Sydney
      North America: Atlanta, Austin, Chicago, Kitchener, New York, San Francisco, Seattle, Sunnyvale

Editor's notes

  • So, I’m starting by defining how I think of Big Data.

    I am experimenting by falsely attributing this definition.
    As of today, this definition has zero hits on Google.
    The experiment is to search over time and see how, or if, Google manages to find the quote and attribute it to Oscar or to me.
  • With Big Data you are sifting a larger data set, looking for more specific information than has previously been possible. Sometimes patterns emerge that weren’t previously identified at a macro scale. That happens more often in scientific efforts; business is typically looking to better exploit an existing market than to break new ground.

    So what are you looking to analyse? What are the data sets and how have they been compiled? What is their provenance? What about the data quality? Where Big Data projects have provided meaningful benefits, a trend shows that these companies have three aspects in place:

    Strong staff who are interested in asking the right questions, not obsessed with ‘big data’ as a buzzword.
    Mature data quality processes. Big Data doesn’t change the Garbage In, Garbage Out principle, so these are a must.
    A responsible approach, which has several aspects:
    Big data can expose more details that are not palatable to the general public or sometimes to the company; you need to recognise that the analysis may challenge the hypothesis.
    RBAC is critical; exposing these data sets can result in significant harm to your organisation and to everyone referred to, either directly or indirectly.
    Compliance becomes critical as soon as you have data sets which correlate to identify individuals instead of groups.
    Whilst the goals are personalised healthcare, advertising that predicts what we want just in time for us to purchase it, and the automatic identification of criminals, far too often we have found that new technologies tend to be exploited for less laudable ends.
    Under the GDPR, big data will be associated with big fines…

  • Gunter Ollman of NCC Domain Services proposed that these controls give an overlapping set that works together across network, vulnerability, behaviour and (to a degree) stupidity to jointly reduce the likelihood and impact of a breach.

    Track all access and processing of the data, and encrypt sensitive data as soon as possible, ideally at the source.
    Don’t leave the keys in the same place as the data.
    Log everything and monitor it. Leverage anomaly detection systems to cut the noise until humans can realistically review the remaining volume of data.
    Use automated scanning to constantly monitor systems for vulnerabilities and malware.
    Monitor network egress for anomalies in traffic.
    Create a number of "false flag" records. These will automatically alert your security team if they are accessed. Configure alerts and blocks to identify and prevent data breaches.
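
The "false flag" record control can be illustrated with a minimal Python sketch. Everything named here is an assumption made for illustration and does not come from the slides: the decoy record IDs, the `AccessEvent` fields, and the `find_decoy_hits` and `should_block` helpers. The idea is only that no legitimate workflow should touch a decoy, so any hit in the access log is worth an alert and a block.

```python
from dataclasses import dataclass

# Hypothetical IDs of decoy ("false flag") records seeded among real ones.
# No legitimate workflow should ever touch them.
DECOY_RECORD_IDS = frozenset({"cust-000417", "cust-982113"})


@dataclass(frozen=True)
class AccessEvent:
    """One entry from an access log (field names are assumptions)."""
    user: str
    record_id: str
    action: str  # for example "read" or "export"


def find_decoy_hits(events, decoy_ids=DECOY_RECORD_IDS):
    """Return the access events that touched a decoy record."""
    return [event for event in events if event.record_id in decoy_ids]


def should_block(event, decoy_ids=DECOY_RECORD_IDS):
    """Decide whether a request should be refused because it targets a decoy."""
    return event.record_id in decoy_ids


events = [
    AccessEvent("alice", "cust-100001", "read"),
    AccessEvent("svc-report", "cust-000417", "export"),  # touches a decoy
    AccessEvent("bob", "cust-100002", "update"),
]

alerts = find_decoy_hits(events)
for event in alerts:
    print(f"ALERT: {event.user} performed '{event.action}' on decoy {event.record_id}")
```

In practice the decoy list would itself be sensitive and would probably be held away from the data it protects, in the same spirit as the advice about encryption keys.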
  • We can split companies' use of big data into what happened and what will happen, and further segment that to provide a maturity model.

    Descriptive analytics is where most big data activity in the IT sector remains at the moment: log collation and some analysis.
    In some instances we have a breach and move to diagnostic analytics as we look to analyse the detail, but this takes effort, and because many organisations still do not report breaches, the patterns are not always clear enough to derive a confident conclusion. This is a reactive position.
    Predictive analytics is where some of the more advanced, security-focussed organisations are moving. Threat modelling efforts sit here.
    Prescriptive analytics can sound like crystal-ball gazing that strays into pre-crime, yet it is happening now for several police forces in the US. https://www.theguardian.com/technology/2016/feb/04/us-police-data-analytics-smart-cities-crime-likelihood-fresno-chicago-heat-list
  • When officers respond to a call, Beware checks the address and gets the names of the residents; these are checked against public data sources to threat-model them with a RAG (red/amber/green) rating.

    How this is done is a trade secret, but it could identify a PTSD sufferer who has tweeted about having bad experiences. Your tweets could influence whether the officer approaches the door, and if you are flagged red, say because your account has recently been hacked, then the outcome may be violent.

    http://www.aclunc.org/docs/201512-social_media_monitoring_softare_pra_response.pdf
  • Traditional IT staff are often the wrong fit for big data; they focus on the T and not the I.
    Specialist skills are required, and only a few organisations truly work at Big Data exabyte scales, so those skills are in high demand.

    Analysis improves when the importance of data quality is embedded in all your systems, so that data sets are filtered as they progress through downstream systems before they hit the Big Data aggregation point.
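
As a rough illustration of filtering data before it reaches the aggregation point, the sketch below applies a quality gate to incoming records. The field names (`customer_id`, `age`), the validity rules, and the function names are all hypothetical; real rules would depend on the data sets involved.

```python
def is_valid(record):
    """Apply hypothetical quality rules to a single record (a dict)."""
    customer_id = record.get("customer_id")
    age = record.get("age")
    has_id = isinstance(customer_id, str) and customer_id.strip() != ""
    plausible_age = isinstance(age, int) and 0 <= age <= 120
    return has_id and plausible_age


def quality_gate(records):
    """Split records into those fit for aggregation and those needing review."""
    accepted, rejected = [], []
    for record in records:
        (accepted if is_valid(record) else rejected).append(record)
    return accepted, rejected


raw = [
    {"customer_id": "c-1", "age": 34},
    {"customer_id": "", "age": 51},      # missing identifier
    {"customer_id": "c-3", "age": 430},  # implausible value
    {"customer_id": "c-4"},              # missing field
]

accepted, rejected = quality_gate(raw)
print(len(accepted), "accepted,", len(rejected), "rejected")
```

Keeping the rejected records rather than silently dropping them lets the source system owners see and fix what went wrong upstream.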

  • I am experimenting by falsely attributing this definition.

    As of today, this definition has zero hits on Google.

    I’ve configured a Google alert to track this quote and I’m looking forward to seeing who it gets attributed to.
