Watch this talk here: https://www.confluent.io/online-talks/introducing-events-and-stream-processing-nationwide-building-society
Open Banking regulations compel the UK’s largest banks and building societies to enable their customers to share personal information securely with other regulated companies. As a result, companies such as Nationwide Building Society are re-architecting their processes and infrastructure around customer needs to reduce the risk of losing relevance and the ability to innovate.
In this online talk, you will learn why, when facing Open Banking regulation and rapidly increasing transaction volumes, Nationwide decided to take load off their back-end systems through real-time streaming of data changes into Apache Kafka®. You will hear how Nationwide started their journey with Apache Kafka®, beginning with the initial use case of creating a real-time data cache using Change Data Capture, Confluent Platform and Microservices. Rob Jackson, Head of Application Architecture, will also cover how Confluent enabled Nationwide to build the stream processing backbone that is being used to re-engineer the entire banking experience including online banking, payment processing and mortgage applications.
View now to:
- Explore the technologies used by Nationwide to meet the challenges of Open Banking
- Understand how Nationwide is using KSQL and the Kafka Streams framework to join topics and process data
- Learn how Confluent Platform can enable enterprises such as Nationwide to embrace the event-streaming paradigm
- See a working demo of the Nationwide system, and what happens when the underlying infrastructure breaks
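The topic-joining mentioned above can be illustrated with a tiny stand-in. This is plain Python rather than KSQL or Kafka Streams, and the topic contents and field names are invented for illustration; it only mirrors the shape of a stream-table join, where a stream of transactions is enriched with the latest account details held in a compacted table topic.

```python
# Minimal stand-in for the kind of stream-table join KSQL or Kafka Streams
# performs: enrich a stream of transactions with account details.
# Plain Python; topics, keys and fields are invented for illustration.

accounts_table = {  # compacted "table" topic: latest value per key
    "A1": {"holder": "Alice", "branch": "Leeds"},
    "B2": {"holder": "Bob", "branch": "York"},
}

transactions_stream = [  # unbounded "stream" topic, shown here as a list
    {"account": "A1", "amount": -25},
    {"account": "B2", "amount": 300},
]

def join(stream, table):
    """Left-join each stream record against the table on the account key."""
    for txn in stream:
        details = table.get(txn["account"], {})
        yield {**txn, **details}

enriched = list(join(transactions_stream, accounts_table))
```

In real Kafka Streams the table side would be a `KTable` kept up to date by new events, so the same join logic runs continuously rather than over a finite list.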
2. Contents
1. Who is Nationwide Building Society?
2. What is the business challenge we’re responding to?
3. What is the Speed Layer?
4. Typical current state architecture
5. Target state architecture
6. How does data flow through the Speed Layer?
7. How we consume data from the Speed Layer
8. How the Speed Layer is deployed
9. Progress
10. Streaming assessment
11. Value achieved
12. Demo
3. Who is Nationwide Building Society?
• Formed in 1884 and renamed to become the Nationwide Building Society in 1970
• We’re the largest building society in the world
• A major provider of mortgages, loans, savings and current accounts in the UK; launched the first (or second) Internet Banking service in 1997
• We recently announced an investment of an additional £1.4 billion (total £4.1bn) over 5 years to simplify, digitise and transform our IT estate
• Confluent and Kafka form the heart of an important part of that investment
4. What is the business challenge we’re responding to?
• Regulation such as Open Banking
• Business growth
• 24x7 availability expectations from customers and regulators
• Cloud adoption
• Capitalising on our data
• A need for agility and innovation
… and our existing platforms were making this difficult
5. What is the Speed Layer?
DEFINITION
The Speed Layer will be the preferred source of data for high-volume read-only data requests and event sourcing. It will deliver secure, near-real-time customer, account and transaction information from back-end systems to front-end systems with speed and resilience. It will use the latest technologies, built for cloud, highly available and distributed. It will provide NBS with its first event-based real-time data platform, ready for digital.
FOUR KEY CHARACTERISTICS
SCALABILITY: The Speed Layer platform will be built on a cloud-ready PaaS architecture to allow for significant and frictionless scaling that is cost efficient.
FAST AND AGILE: The Speed Layer will unlock data in systems of record, enabling digital and agile development teams to rapidly deliver new features and services.
RICH DATA SET: Provide a rich, accessible data set enhanced with data and analytics from Open Banking and social media. It will also future-proof for other interactions such as IoT.
RESILIENT: Reduce the load on core systems and isolate them from the demands of the digital platforms: mobile, internet and Open Banking in particular. Built with proven, scalable, cloud-ready components for greater capacity and resilience.
6. As-is logical E2E Architecture
[Diagram: a request passes through the layers in sequence: API Gateway → Channel Web Services → Enterprise Web Services → Back-end Services → Mainframes]
Fairly normal, is there a problem?
7. Target System Architecture
[Diagram: writes flow from the API Gateway through protocol adapters, Channel Services and Enterprise Services to the mainframes and other sources of data; reads are served by microservices from Kafka topics, which are fed from those sources by CDC and stream processing]
8. Data Flow Diagram
[Diagram: System of Record(s) → CDC Replication Engine (source DB) → Kafka raw topic → stream processing microservice → Kafka published topic → materialisation microservice → NoSQL tables → REST APIs → consuming applications]
1. Change Data Capture (CDC) is deployed to the System of Record (SoR) and pushes changes from the source database to a Kafka topic.
2. Kafka raw topics contain data in the format of the source system. There is one raw topic per replicated table. Data is typically held here for c.7 days.
3. Stream processing (the Kafka Streams framework) is used to transform the data into processed data, made available to consumers through “published topics”.
4. Kafka published topics retain data long term (in line with retention policies and GDPR) and can be used by many Speed Layer microservices.
5. Speed Layer microservices are consumers of the Kafka published topics and push the data they need into their own persistence store (NoSQL, in-memory, etc.).
6. APIs expose the data to consumers.
7. Channel applications call Speed Layer microservices to request data.
8. Note: applications can also subscribe to events and respond to them without materialising them in a database, e.g., a push notification to a device.
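The numbered steps above can be sketched end to end as a toy pipeline. This is a minimal simulation in plain Python, not real Kafka, CDC or MongoDB: lists stand in for topics, a dict stands in for the NoSQL store, and every name and field (`raw_topic`, `ACCT_NO`, etc.) is illustrative rather than taken from the Nationwide system.

```python
# Toy end-to-end sketch of the Speed Layer data flow (steps 1-7).
# Plain-Python stand-ins: a list per Kafka topic, a dict for the NoSQL store.
# All names and fields are illustrative; no real Kafka, CDC or MongoDB here.

raw_topic = []        # step 2: one raw topic per replicated table
published_topic = []  # step 4: long-term processed data
nosql_store = {}      # step 5: the materialisation microservice's own store

def cdc_capture(change):
    """Step 1: CDC pushes a source-format change onto the raw topic."""
    raw_topic.append(change)

def stream_process():
    """Step 3: transform raw (source-format) records into a published shape."""
    while raw_topic:
        rec = raw_topic.pop(0)
        published_topic.append({
            "account_id": rec["ACCT_NO"],           # rename mainframe fields
            "balance_pence": int(rec["BAL"] * 100)  # normalise units
        })

def materialise():
    """Step 5: a microservice consumes published events into its store."""
    for event in published_topic:
        nosql_store[event["account_id"]] = event

def get_account(account_id):
    """Step 6: the API layer reads only from the materialised store."""
    return nosql_store.get(account_id)

# Step 7: a channel application requests data via the API; the read
# never touches the system of record.
cdc_capture({"ACCT_NO": "12345678", "BAL": 10.50})
stream_process()
materialise()
print(get_account("12345678"))
```

The key property the sketch preserves is that reads are served entirely from the materialised store, so the source system only ever sees the CDC feed.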
9. Consumption Patterns Overview
There are three main approaches for consuming data from the Speed Layer:
1. EVENT DRIVEN: Kafka consumers listen and respond to messages arriving in near real time and take immediate action on receipt of the message. In this pattern there is no need to materialise the data.
2. REQUEST DRIVEN: usage-specific data sets are materialised and exposed through APIs. Consumers are microservices that subscribe to topics and materialise data to their requirements.
3. FUNCTIONAL SERVICE: functionally aligned, enterprise-level data stores are materialised. A set of functional microservices is created, for example an “account” microservice from which all consuming microservices and applications read account data when needed.
[Diagram: in each pattern, producers feed the Speed Layer, and subscribing microservices and apps consume from it]
Legacy applications and/or services can be re-written to consume data from the Speed Layer to improve performance and reduce compute demand on other systems.
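The three patterns can be contrasted in a small sketch. This is illustrative plain Python, not real Kafka: a list of events stands in for a published topic, and the event shapes and service names are invented.

```python
# Sketch of the three consumption patterns against one published topic.
# Illustrative only: a list of invented events stands in for a Kafka topic.

events = [
    {"type": "payment", "account": "A1", "amount": -25},
    {"type": "deposit", "account": "A1", "amount": 100},
    {"type": "payment", "account": "B2", "amount": -10},
]

# 1. EVENT DRIVEN: act immediately on each message; nothing is materialised.
notifications = []
for ev in events:
    if ev["type"] == "payment":
        notifications.append(f"push: {ev['amount']} from {ev['account']}")

# 2. REQUEST DRIVEN: materialise a usage-specific view, serve it via an API.
balance_view = {}
for ev in events:
    balance_view[ev["account"]] = balance_view.get(ev["account"], 0) + ev["amount"]

def balance_api(account):
    """Usage-specific API over the materialised view."""
    return balance_view.get(account, 0)

# 3. FUNCTIONAL SERVICE: one shared "account" service materialises an
#    enterprise-level view that many consumers read, instead of each
#    consumer building its own.
class AccountService:
    def __init__(self, topic):
        self.history = {}
        for ev in topic:
            self.history.setdefault(ev["account"], []).append(ev)

    def transactions(self, account):
        return self.history.get(account, [])

accounts = AccountService(events)
```

The difference is in who owns the materialised state: nobody (event driven), the consuming microservice (request driven), or a shared functional service (functional).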
10. Multi-site deployment and resilience
[Diagram: Primary DC for SoRs, Standby DC for SoRs, Cloud hosting]
1. CDC writes to a local Kafka cluster, i.e., in the same DC as the mainframe
2. Kafka topics are replicated to a separate Kafka cluster in our 2nd DC
3. Independent database clusters in each datacentre
4. When required, Kafka topics are replicated using Confluent Replicator to cloud providers
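As a sketch of points 2 and 4 (topic replication between clusters), the following simulates copying records from a primary cluster's topic to a standby, tracking an offset per topic the way a real replicator does. Real deployments would use Confluent Replicator; the class, topic and DC names here are purely illustrative.

```python
# Toy simulation of cross-DC topic replication.
# Dicts of lists stand in for Kafka clusters; a per-topic offset records
# how far replication has progressed. All names are illustrative.

class Cluster:
    def __init__(self, name):
        self.name = name
        self.topics = {}

    def produce(self, topic, record):
        self.topics.setdefault(topic, []).append(record)

def replicate(source, target, topic, offsets):
    """Copy any records past the last replicated offset to the target."""
    records = source.topics.get(topic, [])
    start = offsets.get(topic, 0)
    for record in records[start:]:
        target.produce(topic, record)
    offsets[topic] = len(records)

primary = Cluster("dc1")   # local cluster, same DC as the mainframe
standby = Cluster("dc2")   # cluster in the second DC
offsets = {}

primary.produce("accounts.raw", {"op": "update", "acct": "123"})
replicate(primary, standby, "accounts.raw", offsets)

primary.produce("accounts.raw", {"op": "insert", "acct": "456"})
replicate(primary, standby, "accounts.raw", offsets)  # only the new record moves
```

Because replication is incremental and one-directional, the standby cluster can serve reads (or take over) without the primary re-sending its whole history.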
11. Progress so far…
• Architectural PoC completed:
1. Initial logical proving
2. Functional and non-functional proving
3. Load testing/benchmarking in Azure and IBM labs
• Speed Layer project launched to deliver the production capability and first use cases
1. Split into 3 use cases, with the first one code complete and use cases 2 & 3 progressing well
• Adopting Confluent Kafka across multiple LOBs
1. Speed Layer
2. Event-based designs for originations journeys
3. High-volume messaging in Payments
• Working on a Streaming Maturity Assessment with Confluent
12. Adopting an Enterprise Event-Streaming Platform is a Journey
Nationwide is nearly here, with the Speed Layer plus platforms for Mortgages & Payments, but there is more potential to share common ways of working and utilise a common platform for more use cases.
The five stages, in increasing order of VALUE:
1. Early interest: a developer downloads Kafka and experiments; pilot(s).
2. Identify a project / start to set up a pipeline: LOB(s); small teams experimenting; 1-3 basic pipeline use cases moved into production, but fragmented.
3. Mission-critical, but disparate LOBs: multiple mission-critical use cases in production with scale, DR & SLAs. Streaming is clearly delivering business value, with C-suite visibility, but fragmented across LOBs.
4. Mission-critical, connected LOBs: a streaming platform managing the majority of mission-critical data processes, globally, with multi-datacenter replication across on-prem and hybrid clouds.
5. Central Nervous System: all data in the organization managed through a single streaming platform. Typically digital natives / digital pure players, probably using Machine Learning & AI.
13. Expected value (this time next year)
• Enables agility and autonomy in digital development teams
• The first use case alone will remove c.7bn requests/year from the HPNS
• Will help us maintain our service availability despite unprecedented demand
• Kafka and streaming being adopted across multiple lines of business
• The move to microservices with Confluent Kafka enables Nationwide to onboard new use cases quickly and easily
• Speed Layer, streaming and Kafka will help Nationwide head off the threat from agile challenger banks
• The Speed Layer will help Nationwide provide a better customer experience, leading to better customer retention and new revenue streams
14. Demo of Speed Layer
• Why we did the Proof of Concept
• Functional walk-through
• Non-functional view
Editor's Notes
Good morning all and thanks for joining this webcast.
I’m Rob Jackson, Head of Application Architecture for Nationwide, and also from Nationwide we have Pete Cracknell, who’ll be doing a demo for you today.
Today we’re going to talk to you about an architecture we’ve called the “Speed Layer”. You might be familiar with the concept of a Speed Layer from Lambda Architectures and the world of data architecture, but this isn’t that; it’s just a name that stuck, so sorry for any confusion we’ve caused with the name…
I’ll talk to you about the reasons why we’re doing this architecture and contrast it with our current state architecture
Describe how we can consume data from the Speed Layer
A bit about how it’s deployed
Where we’re heading next
The best bit is the demo and then we’ll do a Q&A with Tim (Vincent) from Confluent.
I hope that’s ok!
Formed in 1884
Renamed to the name we are now in 1970
World’s largest building society
In the UK we’re a major provider of mortgages, loans, etc.
Launched our first Internet Banking service in 1997, and it was the first (or second)
Recently announced a large investment of 4.1bn
What I’m talking to you about today forms an important part of that investment
I think the main headline for why we’re doing the speed layer is “digital disruption”
Some might not see Open Banking as digital disruption as it’s a regulatory requirement
However, it means we have to expose our data through APIs, and if we don’t offer good digital services through our own apps, customers will use other banks and organisations apps that use our APIs to disintermediate us.
The other reason to mention Open Banking is that it was the catalyst for work on the Speed Layer. OB had the potential for high and unpredictable read volumes along with stringent requirements for availability.
We knew the other CMA9 were building OB with similar data caches to protect their core SORs from this load and we intended to do the same.
I’m sure you’ll recognise the other headings there: higher volumes, 24x7 expectations, people expecting to use their data in new ways, e.g., how easy is it to search your emails vs. your bank’s transaction history?
The final point is that, despite all the good work we do with them, our core ledgers do not make it easy to use data in new ways, aggregate multiple data sources, push events to customers and scale cost effectively.
The Speed Layer is one of the answers to digital disruption…
First looking at the definition: It’s a source of data that can be queried for a near real-time copy of mainframe data. On top of that it introduces event sourcing and stream processing into the society.
It’s built using modern technologies: Confluent Kafka, MongoDB, Microservices, OpenShift and will initially be deployed on-prem, but is very much an enabler for cloud – which I’ll come back to later.
The 4 key characteristics:
Resilient: the application is designed to tolerate infrastructure failure, with built-in data redundancy, horizontal scaling and automated recovery.
Fast and agile: once we’ve extracted data from SORs, we can allow consumers to join, aggregate, structure, query and search data in ways the SORs do not easily allow.
Rich Data set: allows us to enrich SOR data with analytics, 3rd-party sources, etc.
And Scalability: scalability is designed into the technologies we’re using. Kafka and MongoDB are both heavily used in internet-scale deployments. My favourite statistic for Kafka is that Alibaba use it to source events at a peak rate of 425 million TPS. A big number for us is a small proportion of that.
This isn’t really our current state, it’s just a representative sample of one small part of our estate and fairly common.
But it allows me to describe a fairly normal transaction path
A REST/http request for some data comes in from a device on the internet
That hits an API Gateway in our datacentre, which then makes onward http requests until it eventually finds its way to the data in the mainframe
If any of those layers is unresponsive the request will time out
And of course, if the mainframe is not available, it doesn’t work at all. Thankfully, that doesn’t happen very often.
If we want to move any of those components out into the cloud, that’s ok, but they still have to call back into our data centre to get to the mainframe.
So, it’s not wrong and it’s how write requests will continue, perhaps with a simplified estate, fewer layers and modern technologies.
However, for read requests, we can do something different.
This shows how Speed layer for reads and event sourcing will sit alongside our enterprise middleware
A write comes into the SOR by existing means: batch, middleware, legacy services, payments gateway, whatever…
It’s picked up by CDC and pushed into Kafka where it’s processed and stored before being materialised, in our case that’s MongoDB.
Using this pattern, read requests are removed from our SORs or even our Data Centres, replicated to where it’s needed and materialised to requirements.
Going into that in a bit more detail…
I’ll just talk you through the data flow…
Step 1 is CDC on the mainframe.
Of course, this enables multiple data sources, for example, batch files using Kafka Connect, applications creating events, but for us right now, it’s Change Data Capture on our Mainframes.
So that’s how it works, next we’ll look at how consumers use it.
These show the ways in which we can consume data from the speed layer
There are cases for all of these, but I’m very much looking forward to seeing the first 2 come to fruition
Event driven – consumers subscribe to topics and act on events.
Request driven – data is materialised to requirements
Functional – these are our core shared services that we expect to be re-used.
This shows how SL is an enabler for cloud and a good place to describe how resilience is baked into the architecture. Pete will show this for real during the demo.
Slide 10
Slide 11
We’re now getting into next steps
We’re about to embark on a streaming assessment with Confluent’s help.
We’re at around step 3: we’re using Kafka and streaming in SL, Mortgages and Payments, but we’re currently doing things slightly differently on different platforms
We want to look at new use cases, new demand and what capabilities we need to create to support that demand.
This will feed into the various roadmaps, including the SI squad but also the IT Strategy.
Final slide before questions.
Pete was heading up the architectural proving team when we did this
I approached him with an architecture I wanted to prove and I think Pete’s approach was to try to break it. He’ll let you know how he got on in the Q&A
He can also cover the alternatives we looked at for stream processing and maybe some of the stuff we learnt along the way.