A look at ESG concerns and the agility needed to address pressures to transform energy organizations through decarbonization. Presented at the Future Oil & Gas conference, November 2021.
Extending open source and hybrid cloud to drive OT transformation - Future Oil & Gas conference
1. Extending open source and hybrid cloud to drive OT transformation
John Archer
Senior Principal BDM - AI/Edge
archer@redhat.com
Future Oil & Gas
Nov 16-17, 2021
7. Action Plan? How to prioritize?
● Condition models, preventative maintenance, consumption trends, feedstocks
● Border taxes; pass the carbon tax down to customers
● Change business lines with renewables, DER, EV, storage, biofuels, hydrogen, ammonia
● What is my board thinking? End-to-end process impacts
Today this is all in silos: difficult to analyze, and culturally and/or security sensitive in many organizations
8. Red Hat OpenShift: Innovation without limitation
Big ideas drive business innovation: cloud-native, containers, Kubernetes, hybrid cloud, DevOps, machine learning, AI, 5G, Internet of things, digital transformation, automation, security, open source communities, and the open organization.
Every organization in every geography and in every industry can innovate and create more customer value and differentiation with open source technologies and an open culture.
9. Red Hat OpenShift: Delivering innovation without limitation
But innovating isn’t always easy
● Innovation: innovate at speed.
● Flexibility: adapt to market changes.
● Growth: grow new customer experiences and lines of business.
10. Red Hat industrial edge
Open Source Initiatives around Industrial Edge Computing
● Collect and focus the best ideas
● A global alliance solving critical manufacturing challenges
● LF Energy is an open source foundation focused on the power systems sector
● A standards-based, open, secure, and interoperable process control architecture
● Open source data platform for the energy industry
● The interoperability standard for secure and reliable information exchange in industrial automation
● Powers the world’s leading commercial IoT solutions
● A framework of open source software components to build platforms that support the development of Smart Solutions faster, easier and cheaper
● The cross-vendor open source connectivity solution for Smart Factories and Smart Products
● We share the vision of a continuous data exchange for all contributors along the automotive value chain
11. Data in a large enterprise
Energy organization data challenges
● Data silos: Slow data access puts projects at risk
● Legacy tech & poor automation: Error-prone, manual processes are unacceptable in a modern event-driven environment
● Lack of cross-team collaboration: Demands a person in the loop with institutional knowledge; analysts assemble personal, stale datasets
● No data governance process: Without a centralized understanding of company assets, few models are capable of deployment
[Diagram: "Business has a question" flows through sources (seismic, wellbore, fluids, core analysis, pipeline) to subject matter experts, data scientists, geologists and geophysicists, and then to models, before "Business gets an answer." Each stage takes months of effort; 80% of the time goes to making data available.]
12. Data in a large enterprise
Consolidating data isn’t sufficient
● Silos still exist: Data warehouses & lakes leave data in its original form
● Little change in time to result: Still need SMEs, manual processes, no governance, etc.
● Consumers handle data preparation: Data consumers are still responsible for transformations
● Lacking a business-based data model: Data should be transformed into the form the business needs and understands; requiring automation forces global understanding as teams self-service based on their needs
[Diagram: sources feed a warehouse / lake, then data solutions, then consumers, via APIs, intelligent applications, and events.]
13. Red Hat industrial edge
The Need
“As a shop floor IT person, I want to get rid of all the different bespoke and customized hardware solutions for PLCs, SCADA, HMI, MES, etc. They are expensive, inflexible and hard to maintain.
I want a single unified software platform based on standard hardware, so I can easily add new features and functions defined purely in software, even from different vendors.
That would help me improve the efficiency and agility of my plant.”
14. Red Hat industrial edge
Our focus use cases

Enterprise Edge (Digital Enterprise Edge): extend cloud/data center approaches to new contexts / distributed locations / OT
● Standardized distributed operations
● Modernized application environments (OT and IT)
● Modernized network infrastructure

Industrial Edge (Operations Edge): leverage edge/AI/serverless to transform OT environments
● Automation/integration of monitoring & control processes
● Predictive analytics
● Production optimization
● Supply chain optimization

Provider Edge: network and compute specifically to support remote/mobile use cases
● Aggregation, access and far edge
● Manages a network for others: telecommunications service providers; creates reliable, low-latency networks

Vehicle Edge (Connected Product Edge): create new offerings or customer/partner engagement models
● Vehicle edge (onboard & offboard): in-vehicle OS; autonomous driving; infotainment up to ASIL-B; quality management
15. Red Hat Edge Topologies
Central data center: cluster management and application deployment
Regional data center: Kubernetes node control
Edge sites:
● Single node edge servers: low bandwidth or disconnected sites
● Remote worker nodes: environments that are space constrained
● 3 node clusters: small footprint with high availability
16. Our edge platforms
Consistent operations at scale
● Small footprint device edge: An image-based deployment option of RHEL that includes transactional OS updates and intelligent OS rollbacks; intended for, but not limited to, containerized applications.
● Single node edge servers: A Red Hat OpenShift deployment on a single node (supervisor + worker) with resources to run a full Kubernetes cluster as well as application workloads.
● Remote worker nodes: Red Hat OpenShift supervisors reside in a central location, with reliably-connected workers distributed at edge sites sharing a control plane.
● Edge clusters (3+ node HA): Red Hat OpenShift supervisors and workers reside on the same nodes; a high availability (HA) setup with just 3 servers.
17. Overview of Red Hat OpenShift Data Science
Key features of Red Hat OpenShift Data Science:
● Increased capabilities/collaboration: Combines Red Hat components, open source software, and ISV certified software available on Red Hat Marketplace
● Rapid experimentation use cases: Model outputs are hosted on the Red Hat OpenShift managed service or exported for integration into an intelligent application
● Cloud service: Available on Red Hat OpenShift Dedicated (AWS) and Red Hat OpenShift Service on AWS
● Core data science workflow: Provides data scientists and intelligent application developers the ability to build, train, and deploy ML models
Addressing AI/ML experimentation and integration use cases on a managed platform
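The build, train, and deploy steps of that core workflow can be sketched in a few lines of plain Python. This is an illustrative toy, not a product API: the least-squares model and the sample numbers are assumptions made for the example.

```python
# Minimal sketch of a build/train/deploy loop: fit a simple least-squares
# line to historical readings, then serve predictions from the fitted model.

def train(xs, ys):
    """Fit y = a*x + b by ordinary least squares."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    a = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) \
        / sum((x - mx) ** 2 for x in xs)
    return a, my - a * mx

def predict(model, x):
    """The 'deployed' model: a callable an intelligent application can use."""
    a, b = model
    return a * x + b

# "Build" a model from historical data, then answer a new question with it.
model = train([1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8])
print(predict(model, 5))
```

A managed platform's job is everything around this loop: notebooks for the build step, compute for training, and hosting for the deployed callable.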
18. And the services and partners to guide you to success
18
RED HAT OPEN INNOVATION LABS
RED HAT CONTAINER ADOPTION PROGRAM
● EXPERIMENT: Rapidly build prototypes, do DevOps, and be agile.
● CATALYZE INNOVATION: Bring modern application development back to your team.
● IMMERSE YOUR TEAM: Work side by side with experts in a residency-style engagement.
FRAMEWORK FOR SUCCESSFUL CONTAINER
ADOPTION AND I.T. TRANSFORMATION
Mentoring, training, and side-by-side collaboration
SYSTEM INTEGRATORS
Or work with our ecosystem of certified systems integrators, including…
19. AI/ML Key Execution Challenges
● Readily usable data lacking: Lots of data is collected, but finding and preparing the right data across a multitude of sources with varying quality is difficult
● Talent shortage: A lack of key skills makes it difficult to find and secure talent to maintain operations
● Unavailability of infrastructure & software: No rapid availability of infrastructure and software tools slows data scientists and developers
● Lack of collaboration across teams: Unable to implement quickly due to slow, manual and siloed operations
● Slow CPU processing: Data sets continue to increase in size, but CPUs are not getting faster and are not able to parallelize processes well
20. Overview of Red Hat OpenShift Data Science
Our approach to AI/ML: data as the foundation
● Hybrid cloud: Represents a workload requirement for our platforms across hybrid cloud.
● Open source efficiency: Applicable to Red Hat’s existing core business in order to increase open source development and production efficiency.
● Intelligent platforms: Valuable as specific services and product capabilities, providing an intelligent platform experience.
● Intelligent apps: Lets customers build intelligent apps using Red Hat products and our broader partner ecosystem.
21. Overview of Red Hat OpenShift Data Science
Depth and scale without lock-in
● Red Hat portfolio and services: Complement common data science tools in Red Hat OpenShift Data Science with other Red Hat products and cloud services
● Partner ecosystem: Access specialized capabilities by adding certified ISV ecosystem products and services from Red Hat Marketplace
● Managed cloud platform: Deployed on Red Hat OpenShift and managed on Amazon Web Services, providing access to compute and accelerators based on your workload
Capabilities delivered through the combination of Red Hat and partner ecosystem
22. 22
Edge is bringing transformation to operational technology
Red Hat industrial edge
OT
Software-defined
everything
▸ Real-world, real-time interaction
▸ Convergence of planning & execution
▸ Implementation of data-driven insights
▸ Integration of formerly closed systems
IT
Software-defined
platforms
▸ Standard, scalable hardware
▸ Cloud-native applications
▸ Flexibility and agility
▸ Convergence of data platforms
23. 100+ Red Hat OpenShift certified operators
Red Hat Marketplace
Application Runtimes
Customer Code
AI / ML
Databases & Big Data
Networking
Security
Monitoring & Logging
DevOps Tools
Storage
24. Edge computing simplified deployment
Validated Patterns: simplifying the creation of edge stacks
Bringing the Red Hat portfolio and ecosystem together, from services to the infrastructure
● Config as code: go beyond documentation, using a GitOps process to simplify deployment
● Highly reproducible: so that you can scale out your deployments with consistency
● From POC to production: ensure your teams are ready to operate at scale
● Open for collaboration: anyone can suggest improvements and contribute
If Scope 3 emissions have gained such attention, it is partly due to their massive downstream impact in the Automotive and Oil & Gas industries. In the Automotive sector, Scope 3 has considerable influence, accounting for 95% of total induced emissions. In the Oil & Gas industry, Scope 3 alone represents 85% of the industry's emissions [1]. Shell, for instance, has 90% of its emissions stemming from its supply chain and the use of its products.
Is your data trustworthy?
You need to understand the ‘lineage’ of the data, and you need to ‘label’ your data so that you know which data you have used.
For warehouses and data lakes, you need to keep the timeliness of the data in mind.
Data gravity: some countries will not let data out of the country (you need a hybrid solution).
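As a sketch of what such lineage labeling could look like in practice (the `DatasetLabel` class, its fields, and the source names are illustrative assumptions, not an API from the talk):

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone

@dataclass
class DatasetLabel:
    """Illustrative lineage label that travels with a dataset."""
    source: str                                   # where the data came from
    retrieved_at: datetime                        # timeliness: when it was pulled
    parents: list = field(default_factory=list)   # upstream sources, oldest first

    def derive(self, source):
        """Label a dataset derived from this one, preserving its lineage."""
        return DatasetLabel(source=source,
                            retrieved_at=datetime.now(timezone.utc),
                            parents=self.parents + [self.source])

# Raw pull from a (hypothetical) plant historian, then a cleaned derivative.
raw = DatasetLabel("historian/site-1", datetime.now(timezone.utc))
clean = raw.derive("etl/dedup-v2")
print(clean.parents)  # lineage traces back to the original source
```

With labels like this attached, "which data did this model actually use?" becomes a lookup rather than an investigation.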
Intro - Business innovation is driven by big ideas:
Wow - what an incredible time we live in. What an exciting time to be alive!
Business is moving faster than ever before. Today, we can do things we could only dream of a few years ago.
Technology, open source communities and new ways of collaboration are driving business innovation
No longer are we looking at startups and Web 2.0 companies like Facebook, Uber and Airbnb for inspiration as to what innovation looks like.
Today, every organization in every geography and any industry can innovate, create more customer value and differentiation and compete on an equal playing field.
And with Red Hat OpenShift, we’re building on our heritage of Red Hat Enterprise Linux to provide you with a platform that enables your organization to innovate faster.
But why is being able to move faster and innovate so important?
So the question comes back to you… Do you need to deliver solutions faster? Will delivering solutions faster help your organization innovate and exceed its goals?
<Let customer talk>
See also here:
https://docs.google.com/presentation/d/1kCQJs0GaYFvmQv1RPov8yUEeNmyfE0tDUc2Aj2hZ8AU/edit#slide=id.gd93df9f22e_0_0
Data ecosystems are becoming more complex, especially as cloud-based data platforms are added to the mix
This means that the process by which the business gets answers to its questions is also becoming more complex
In a modern data ecosystem, massive amounts of data sit in a variety of locations and formats: databases, data lakes and warehouses, both on-prem and in the cloud.
Worse yet, this data can be siloed. Adding to the silos, there can be a lack of cross-team collaboration.
For example, NOC assets may be looked after by different groups, and are therefore stored in different locations. You may need to obtain permission to access the data you are interested in, plus the services of that team’s SME for help in obtaining the data you want.
When data is pulled out of a silo, legacy tech and poor automation may produce error-prone data.
In order to effectively analyze the data, it needs to be put into a common format automatically and be subject to data governance:
Data governance (DG) is the process of managing the availability, usability, integrity and security of the data in enterprise systems, based on internal data standards and policies that also control data usage.
Effective data governance ensures that data is consistent and trustworthy and doesn't get misused.
It's increasingly critical as organizations face new data privacy regulations and rely more and more on data analytics to help optimize operations and drive business decision-making.
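A minimal sketch of what enforcing such a policy can look like in code; the field names, the 30-day timeliness window, and the policy shape are invented for the example:

```python
from datetime import datetime, timedelta, timezone

# Illustrative governance policy: which fields a record must carry
# (usability) and how fresh it must be (timeliness).
POLICY = {
    "required_fields": {"well_id", "timestamp", "pressure_kpa"},
    "max_age": timedelta(days=30),
}

def passes_governance(record, now=None):
    """Check one record against the policy before it is used for analysis."""
    now = now or datetime.now(timezone.utc)
    if not POLICY["required_fields"] <= record.keys():
        return False                     # missing a required field
    return now - record["timestamp"] <= POLICY["max_age"]

fresh = {"well_id": "W-17", "pressure_kpa": 101.3,
         "timestamp": datetime.now(timezone.utc)}
stale = dict(fresh, timestamp=datetime.now(timezone.utc) - timedelta(days=90))
print(passes_governance(fresh), passes_governance(stale))  # True False
```

The point is that the rules live in one policy object applied uniformly, instead of in each analyst's head.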
This requires you to consolidate it into a single location by moving and/or copying it into that location. (The next slide addresses why consolidating the data into a single location may not be a good idea.)
All these steps are extremely time consuming, can get quite expensive, and add zero value, while also increasing security risks by having multiple copies of your data lying around.
Ultimately, these steps drive down your ability to provide insights at the speed of business.
It is no surprise that 80% of an enterprise’s time is spent on making data available for analysis, while 20% is spent on finding answers to its questions.
Let’s address why consolidating the data into a single location is not a good idea
Consolidation is extremely time consuming and isn’t sufficient
Silos still exist
Data Warehouses & Lakes leave data in original form.
Little change in time to result
Still need SMEs, manual process, no governance, etc.
Consumers handle data preparation
Data consumers still responsible for transformations
This can be dangerous: consumers may choose the metric system instead of the imperial system for the data. If the organization uses the imperial system and bases its volume calculations on it, costly errors will be made when consumers go on to use their model/analytics on other data sets within the organization.
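A toy guard against exactly that mix-up: normalize every volume to one unit at the boundary instead of trusting each consumer's convention. The `to_m3` helper is an invented example; the conversion factor is the standard barrels-per-cubic-metre approximation.

```python
# Tag every volume with its unit so metric and imperial can't mix silently.
BBL_PER_M3 = 6.2898  # barrels per cubic metre (approximate)

def to_m3(value, unit):
    """Normalize a volume to cubic metres; reject unknown units loudly."""
    if unit == "m3":
        return value
    if unit == "bbl":
        return value / BBL_PER_M3
    raise ValueError(f"unknown volume unit: {unit}")

# Combining mixed-unit sources is now safe: everything is normalized first.
total = to_m3(100.0, "bbl") + to_m3(50.0, "m3")
print(round(total, 2))  # 65.9
```

Failing loudly on an unknown unit is the key design choice: a raised error is cheap, a silently wrong volume calculation is not.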
Lacking business-based data model
At the end of the day, the Data should be transformed into the form the business needs and understands.
Requiring automation forces global understanding as teams now self-service based on their needs
All these items seem to suggest that getting good data in a timely manner is impossible. It’s not, because the time is right for changing the way we store, access, gather and prepare data, due to a number of factors:
Cloud computing (public & on-prem)
Usage of open source technologies such as containers
Edge computing
Increased programming capabilities
Standards adoption
Most important of all, we have O&G C-suite-level acceptance
With that, let’s look at how we can consume data using some of these factors
Enterprise Edge: horizontal stuff, usable from IT / OT, not specific to
Operations Edge: vertical, OT specific, Industrial specific
Provider Edge: Telco Specific, Private 5G solutions
Vehicle Edge: rather Automotive-specific; more detail on Vehicle edge here: https://docs.google.com/presentation/d/1Fc4-bWCxsSxAG8DWs18T57sm-KA0qrlRDCrYHslp6TE/edit#slide=id.gee4169c65d_0_110
Single Node now joins our previously announced 3 Node clusters and Remote worker nodes.
3 Node clusters for sites that require high availability, but in a smaller, 3 node footprint
Remote worker nodes where only worker nodes are in smaller edge locations while the controller nodes are in larger sites like regional data centers
Single node, which will provide both software high availability* and a smaller footprint - this is currently scheduled to be available in the second half of 2021
* If a container fails, Kubernetes is able to restart it. Obviously, hardware failures are not covered when you are running on a single server.
And Red Hat and its system integrator partners are there to help you on every step of the journey, from the culture change and skills development needed to move to cloud native development, to the modernization of existing applications to containers, to optimizing processes for developers and IT
Challenges
Data science is transformational, but its potential is being limited by five key things. If we solve these things, we can get better insights that translate into business value.
Organizations now have access to huge amounts of data, and it is growing exponentially. There is so much data that it’s next to impossible to process all of it. Making matters worse, it’s inconsistent: it’s from different sources, different time periods, in different formats.
The end of Moore’s law means that CPUs aren’t just automatically getting significantly faster year after year. Popular data science tools are CPU-constrained, making users sit through long periods of processing time. This is exacerbated by the flood of incoming data making data sets larger than ever.
The popular data science tools are spread out among dozens of software repositories; many of them are open source and revised frequently, and it’s very challenging to find the right versions that will all work together.
The next disruptive evolution in technology is not about new companies disrupting traditional incumbents—it’s about traditional incumbents in “old fashioned industries” using technology to connect their preexisting infrastructures to create increased efficiency.
Red Hat has traditionally served IT organizations. The journey they have been on for the last decade-plus, as software-defined platforms have become prevalent, is now coming to OT. This opens up even more potential value as planning and execution systems converge and formerly closed systems are replaced by open architectures designed to support data-driven insights.
Red Hat’s edge computing Validated Patterns are repositories of configuration templates in the form of Kubernetes manifests that describe an edge computing stack fully declaratively and comprehensively; from its services down to the supporting infrastructure. Validated Patterns facilitate complex, highly reproducible deployments and are ideal for operating these deployments at scale using GitOps operational practices.
Use a GitOps model to deliver the Pattern as code
Use as a POC, modified to fit a particular need that you can evolve into a real deployment.
Highly reproducible - great for operating at scale
Open for collaboration, so anyone can suggest improvements, contribute to them