3. The Nature of Policing Is Changing
“The increasing availability of information and new technologies offers us huge potential to improve how we
protect the public. It sets new expectations about the services we provide.”
Police IT systems need to adapt to keep up with those changes
Load Data As Is
In reality, there are two kinds of ETL:
1. The kind that focuses on transformation, cleansing, and quality, because a business process requires standardization in order to count things, do math, or disambiguate.
2. The kind that has to be done merely to allow an RDBMS to function, because of its dependency on harmonized, normalized, and deorthogonalized data.
We eliminate #2 completely, and enable a more iterative approach to #1.
How does MarkLogic (ML) deliver this integration?
Ingest data as is. No upfront data modeling required. ETL tool not required.
Structured and unstructured data. This includes scalar, text, geospatial, binary, semantic triples.
Data and metadata. Schemas accepted, but not required.
Data Formats (XML, JSON, binary etc.) with efficient tokenized storage
Methods
Content Pump - high-speed data loading, serial writes
REST APIs
Java Client API
Node.js Client API
Java / .NET XCC
Competitive Advantage?
Data modeling and transformation are not mandatory prerequisites to loading data.
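The "load as is" idea can be sketched with a toy in-memory store. This is an illustration only, not the product's API: the class `AsIsStore`, its `ingest` method, and the URIs are hypothetical names chosen for the example. The point it shows is that the format is detected only to pick a parser, and no target schema is enforced before the data is accepted.

```python
import json
import xml.etree.ElementTree as ET


class AsIsStore:
    """Toy in-memory document store (hypothetical): accepts content without a predefined schema."""

    def __init__(self):
        self.docs = {}

    def ingest(self, uri, raw, metadata=None):
        # Detect the format only to choose a parser; no target model is enforced.
        fmt, parsed = self._sniff(raw)
        self.docs[uri] = {"format": fmt, "content": parsed, "metadata": metadata or {}}
        return fmt

    @staticmethod
    def _sniff(raw):
        if isinstance(raw, bytes):
            return "binary", raw
        text = raw.strip()
        try:
            return "json", json.loads(text)
        except ValueError:
            pass  # not JSON, try XML next
        try:
            return "xml", ET.fromstring(text)
        except ET.ParseError:
            return "text", raw  # fall back to storing plain text unchanged


store = AsIsStore()
formats = [
    store.ingest("/crime/1.json", '{"offence": "burglary", "suspect": "N. Brown"}'),
    store.ingest("/intel/7.xml", "<report><subject>Nigel Brown</subject></report>", {"source": "intel"}),
    store.ingest("/calls/3.txt", "Caller reported a break-in at the rear of the property."),
    store.ingest("/photos/9.bin", b"\x89PNG"),
]
```

Metadata is optional here, mirroring "schemas accepted, but not required": any modeling can happen later and iteratively, once the data is already loaded.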
Quote is from the NPCC’s Policing Vision 2025
http://www.npcc.police.uk/NPCCBusinessAreas/ReformandTransformation/PolicingVision2025.aspx
Need information quickly,
need confidence that it is correct,
and need it to be complete,
To know that they have a complete picture of a suspect/perpetrator
Need to be able to see a full picture of a suspect in a live intelligence situation
To have all of the right information to hand during a live intelligence situation
The major problem is that their silos are blocking them
Lots of small steps that taken individually do not seem insurmountable but taken together cause paralysis
And the status quo often ends up winning in this situation because if you can’t figure out where to start then you start anywhere
One PCC of a force shared that highly placed ministers are telling the police forces that they need to get on with their transformation
Story:
Imagine you’re a police analyst and you need to provide a full history of an individual to allow officers to proceed with a terrorism investigation
Currently the analysts need to log into several different databases.
At some forces there are more than 10 different systems holding important data
None of these systems is useful for combining and sharing the results of the analysis
So in order to share, they then copy and paste the data into a Word document or Excel spreadsheet
This means that important intelligence reports are then further cut off from their source data and are not reproducible
Google-like search appliance
Lists of documents - Keyword Searches
No context
False positives: for example, a search for Nigel Brown will bring back tons of false positives
How do you see the person in their context if you can only search for their name and not their relationships?
The Promise - Access all data from a single user interface
Migrate all of the data to a new relational database and use a search engine
Schema design for this number and complexity of systems is time consuming
Losing the unstructured data, e.g. the police intelligence report or the storm call transcript, i.e. the valuable data that gives meaning and context
Does more than just put the data in one place
Allows you to discover how that data is related by building connections between the information
It’s the powerful combination of unstructured data and the semantic triples that help surface the relationships between people, events, locations and objects
Able to exploit all of the value in the unstructured data that has heretofore been considered dark matter and therefore difficult to exploit
An example of the IDEAL SOLUTION
Create a model for the delivery of shared services for future application development and for rolling out as good practice to other forces
Increase effectiveness and outcomes of analysts and police officers
Prove that data-driven application design can produce results quickly
Single record for a person - not a list of documents
Name misspellings
Network diagram showing the relationships that have been identified by the system
Information is organised by its context, e.g. person, rather than by its format, e.g. storm call record
Combine external data sources with your rich linked data set
Dictionaries of words/phrases associated with grooming highlighted in search results
Geospatial search making use of cleansed address data to show crime patterns over time and in geographic location
When there is confidence that a complete 360-degree view of an individual can be identified, you can start to use statistical analysis to identify behavioral patterns that indicate vulnerability
Hand over to JW
Higher contrast colour of purple.
SB:
- Explain up-front why you are using phonetic names – to deal with misspelling etc.
- Same for dates – so I can type in a date in any format.
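A minimal sketch of both ideas, under stated assumptions. The notes do not say which phonetic algorithm is used, so classic Soundex stands in here; the date format list and the day-first (UK) reading of `03/04/2017` are also assumptions, and `soundex` and `normalise_date` are illustrative names.

```python
from datetime import date, datetime

# Soundex digit groups: consonants that sound alike share a digit.
CODES = {}
for digit, letters in enumerate(["bfpv", "cgjkqsxz", "dt", "l", "mn", "r"], start=1):
    for ch in letters:
        CODES[ch] = str(digit)


def soundex(name):
    """Return a 4-character Soundex key so that similar-sounding spellings match."""
    letters = [c for c in name.lower() if c.isalpha()]
    if not letters:
        return ""
    first = letters[0]
    out = []
    prev = CODES.get(first, "")
    for c in letters[1:]:
        if c in "hw":
            continue  # h and w do not break a run of identical codes
        code = CODES.get(c, "")
        if code and code != prev:
            out.append(code)
        prev = code  # a vowel resets prev, so the same code may repeat after it
    return (first.upper() + "".join(out) + "000")[:4]


# Assumed input formats, day-first as is usual in the UK.
DATE_FORMATS = ["%Y-%m-%d", "%d/%m/%Y", "%d-%m-%Y", "%d %B %Y", "%d %b %Y"]


def normalise_date(text):
    """Parse a date typed in any of the assumed formats; return None if none match."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None
```

With this sketch, "Brown", "Browne" and "Braun" all map to the same key, so a misspelled name still finds the person, and `03/04/2017`, `2017-04-03` and `3 April 2017` all resolve to the same date value.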
SB:
- Out of UPRN can get lat, long
- Give a clearer description of a dimension combination
- A combination of fields that uniquely identify the entity.
SB:
- Calculating hashes – so every unique entity can be uniquely identified.
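One way to picture a dimension combination and its hash, as a sketch only. The field names (`forename`, `surname`, `date_of_birth`), the normalisation rules, and the choice of SHA-256 are all assumptions for illustration; the notes do not specify them.

```python
import hashlib

# Hypothetical dimension combination: the fields that together identify a person.
PERSON_DIMENSIONS = ("forename", "surname", "date_of_birth")


def entity_key(record, dimensions):
    """Hash the chosen fields so every unique entity gets one stable identifier."""
    # Normalise each field so case and spacing differences do not change the hash.
    parts = [" ".join(str(record.get(f, "")).lower().split()) for f in dimensions]
    canonical = "\x1f".join(parts)  # unit separator keeps field boundaries unambiguous
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


crime_record = {"forename": "Nigel", "surname": "Brown", "date_of_birth": "1980-02-14", "offence": "burglary"}
intel_record = {"forename": " nigel ", "surname": "BROWN", "date_of_birth": "1980-02-14", "source": "intel"}
other_person = {"forename": "Nigel", "surname": "Brown", "date_of_birth": "1975-06-01"}

same = entity_key(crime_record, PERSON_DIMENSIONS) == entity_key(intel_record, PERSON_DIMENSIONS)
different = entity_key(crime_record, PERSON_DIMENSIONS) != entity_key(other_person, PERSON_DIMENSIONS)
```

Records from different source systems that agree on the dimension combination get the same key, while fields outside the combination (such as `offence`) do not affect it.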
Explanation of transitive
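For the explanation of transitive, a small worked sketch may help: if A is part of B and B is part of C, then A is part of C, even though no record states that directly. The triples, the `partOf` predicate, and the function name below are hypothetical examples, and the plain breadth-first traversal is an assumption about how such inference could be illustrated, not a description of the platform's engine.

```python
from collections import defaultdict, deque


def transitive_closure(triples, predicate):
    """Return every (subject, predicate, object) implied by chaining one predicate."""
    edges = defaultdict(set)
    for s, p, o in triples:
        if p == predicate:
            edges[s].add(o)
    inferred = set()
    for start in list(edges):
        seen = set()
        queue = deque(edges[start])
        while queue:
            node = queue.popleft()
            if node in seen:
                continue  # already reached by another path
            seen.add(node)
            queue.extend(edges.get(node, ()))
        inferred.update((start, predicate, node) for node in seen)
    return inferred


triples = [
    ("12 High St", "partOf", "Ward A"),
    ("Ward A", "partOf", "District B"),
    ("District B", "partOf", "Force C"),
    ("Nigel Brown", "livesAt", "12 High St"),
]
closure = transitive_closure(triples, "partOf")
```

Here `closure` contains the three stated `partOf` facts plus three inferred ones, including `("12 High St", "partOf", "Force C")`; the `livesAt` triple is left alone because only the named predicate is treated as transitive.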
Loosely Integrated Multi System Stack
SB:
- We have demonstrated the approach of a multi-model data platform with semantics.
- Other vendors try to replicate this using multiple technologies – but this approach is different.
Architects paid by the box.
Doing joins is incredibly expensive.
Brittle
Question to Jen – still thinking of a good question
Multi model is a good thing
Multi-model in one place is even better.
Simpler, transactional, all backed up together.
Query across three models – all at once.