Big data security
by Felix Rosbach on April 29, 2019
There is an incredible number of people, devices, and sensors that generate, communicate, and
share data. Analyzing this data gives organizations the ability to gain customer insights, develop
better applications, and improve efficiency and effectiveness – or simply make better decisions.
While these insights are bringing many benefits to companies, there are also increasing concerns
over the trustworthiness of this data as well as the security and compliance challenges regarding
the way it is used.
Here are the top 3 challenges for big data security and compliance in 2019:
We need more data. Or do we?
Almost everything we use today creates data – from our smartphones, to connected TVs, to our
smartwatches. According to IDC, by 2025, 175 zettabytes (1 ZB = 10^21 bytes) of data will have been created
worldwide.
On the organizational level, this also includes the large amount of data that was accumulated
internally as well as that which comes from complex infrastructure.
Much of this data, such as emails, spreadsheets, and Word documents, is held in unstructured
form. In addition, a lot of data is created in an ad hoc manner which causes significant problems
because it is hard for an organization to know what exists and where it is stored.
And looking at the term big data from a broader perspective, much more potential comes from
utilizing data from external sources like social media, publicly available data from government
databases, and data from other organizations.
The combination of data sets holds a lot of value when gaining insights or trying to make
decisions based on consumer preferences.
Challenge 1: Ethics and Compliance
A lot of data that is used to gain insights can be attributed to individuals. Personally identifiable
information is everywhere – sometimes even in unexpected places. Many consumers aren’t
aware of how their data is being used and what organizations do with it. Concerns about the use
of big data are leading to ever stricter regulations on how organizations can collect, store, and
use information.
Big data magnifies the security, compliance, and governance challenges that apply to normal
data, in addition to increasing the potential impact of data breaches.
Organizations have to comply with regulations and legislation when collecting and processing
data.
While data protection legislation around the globe differs in certain aspects, it all shares the same
basic principles. It’s all about taking care of personal information, data privacy, and controlling
how data is used. Users have to be able to understand what data is collected. The processing of that
data needs to be legitimized by user consent.
Looking at the sheer amount of data organizations have to process, protecting and managing data
is becoming more and more complicated.
Challenge 2: Poor Data Management
When there is no clear ownership for big data and poor control over its lifecycle, data
management becomes a true challenge.
Many organizations tend to see security as a technology issue, meaning that security is just
another requirement IT departments have to fulfill and that it is a problem that can be solved by
just buying yet another security solution.
Great data governance is more than that: it starts at the board level. The board has to define
business goals for the use of big data together with acceptable risk and compliance requirements.
There must be clearly defined responsibility for the data, and its lifecycle must be properly
managed. To comply with data privacy regulations, organizations must be able to audit the way
data is acquired, processed, analysed and secured as well as the way the outcomes of analytics
are used.
Challenge 3: Insecure Infrastructure
Security by design is great. But looking at the vast number of devices and infrastructure components
that produce data, many of them aren’t constructed with security in mind. Especially when it comes
to IoT devices, the limited ability to resist cyberattacks becomes even more problematic.
Sometimes it isn’t even possible to upgrade their defenses.
This could not only impact the trustworthiness of data, it could also give hackers access to
vulnerable infrastructure.
In addition, the technology that is used to process this data was designed with massive scalability
in mind and not necessarily to enforce security controls.
While the absence of security by design is nothing new, complex big data environments only
make things worse. There are enough vulnerabilities and backdoors in on-premises big data
analytics environments. With the use of cloud services, especially when it comes to hybrid or
multi-cloud environments, we have reached another level of complexity with new challenges and
risks.
Relying on out-of-the-box security delivered by cloud providers and improperly set security controls
can lead to exposed data on the internet.
Additionally, data sent to cloud services is often unprotected. A lot of data breaches have
occurred because the simplest countermeasures were non-existent or not integrated properly.
To make sure this doesn’t happen to you, adopting a privacy by design approach is crucial. You
have to make sure that your data management is under control and that data is protected
anywhere it is used, stored, or in motion. Pseudonymize it whenever possible.
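One common way to pseudonymize an identifier is to replace it with a keyed hash, so the same person always maps to the same token but the token cannot be reversed without the key. The sketch below is only an illustration under stated assumptions: the key value, the field names, and the helper names `pseudonymize` and `pseudonymize_record` are hypothetical and do not come from the article.

```python
import hashlib
import hmac

# Hypothetical secret; in practice it would come from a key management system.
SECRET_KEY = b"replace-with-a-managed-secret"


def pseudonymize(value: str, key: bytes = SECRET_KEY) -> str:
    """Return a stable keyed-hash token (HMAC-SHA256) for a personal identifier."""
    # Normalize so that trivially different spellings map to the same token.
    normalized = value.strip().lower().encode("utf-8")
    return hmac.new(key, normalized, hashlib.sha256).hexdigest()


def pseudonymize_record(record: dict, pii_fields: set) -> dict:
    """Copy a record, replacing only the named PII fields with tokens."""
    return {
        name: pseudonymize(str(val)) if name in pii_fields else val
        for name, val in record.items()
    }


record = {"email": "Jane.Doe@example.com", "country": "DE", "basket_total": 42.5}
safe = pseudonymize_record(record, {"email"})
```

Because the token is stable, analysts can still join and count by customer without seeing the address itself. Whoever holds the key can re-link the data, so the key needs at least the same protection as the original values.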
Data Governance in a Big Data World
Robust governance programs will always be rooted in people and process, but you also need to
choose the right technology, especially when working with big data.
By Mitesh Shah
September 15, 2017
Organizations across the globe are investing in systems capable of housing and processing data
in ways previously unimagined. In some cases, enterprises are even replatforming their existing
IT environments based on these new systems. These big data systems have yielded tangible
results: increased revenues and lower costs. Yet positive outcomes are far from guaranteed. To
truly get value from one's data, these new platforms must be governed.
The term data governance strikes fear in the hearts of many data practitioners. Because it is often
vaguely defined and misunderstood, many simply turn to a technology-only approach to solve
their governance needs. The complexity that comes with many big data systems makes this
technology-based approach especially appealing even though it's well known that technology
alone will rarely suffice. What is perhaps less known is that technologies themselves must be
revisited when optimizing for data governance today.
Defining Data Governance
Before we define what data governance is, perhaps it would be helpful to understand what data
governance is not.
Data governance is not data lineage, stewardship, or master data management. Each of these
terms is often heard in conjunction with -- and even in place of -- data governance. In truth, these
practices are components of some organizations' data governance programs. They are important
components, but they are merely components nonetheless.
At its core, data governance is about formally managing important data throughout the enterprise
and thus ensuring value is derived from it. Although maturity levels will vary by organization,
data governance is generally achieved through a combination of people and process, with
technology used to simplify and automate aspects of the process.
Take, for example, security. Even basic levels of governance require that an enterprise's
important, sensitive data assets are protected. Processes must prevent unauthorized access to
sensitive data and expose all or parts of this data to users with a legitimate "need to know."
People must help identify who should or should not have access to certain types of data.
Technologies such as identity management systems and permission management capabilities
simplify and automate key aspects of these tasks. Some data platforms simplify chores even
further by tying into existing username/password-based registries, such as Active Directory, and
allowing for greater expressiveness when assigning permissions, beyond the relatively few
degrees of freedom afforded by POSIX mode bits.
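To illustrate what "greater expressiveness" can mean, the sketch below contrasts a classic POSIX read check, where exactly one of the owner, group, or other bits applies, with a richer rule that combines several groups and an explicit deny list. The rule format and the names `Principal`, `mode_bits_allow_read`, and `acl_allows_read` are hypothetical and are not taken from any particular data platform.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Principal:
    user: str
    groups: frozenset


def mode_bits_allow_read(mode: int, owner: str, group: str, who: Principal) -> bool:
    """POSIX-style check: one owner, one group, and everyone else."""
    if who.user == owner:
        return bool(mode & 0o400)  # owner read bit
    if group in who.groups:
        return bool(mode & 0o040)  # group read bit
    return bool(mode & 0o004)      # other read bit


def acl_allows_read(allow_groups: frozenset, deny_users: frozenset, who: Principal) -> bool:
    """Hypothetical richer rule: any listed group grants access unless the user is denied."""
    return who.user not in deny_users and bool(who.groups & allow_groups)


analyst = Principal("dana", frozenset({"finance", "audit"}))

# Mode 0o640 with owner "etl" and group "etl": the analyst falls under "other".
posix_result = mode_bits_allow_read(0o640, "etl", "etl", analyst)

# The richer rule can grant read access to two groups at once.
acl_result = acl_allows_read(frozenset({"finance", "audit"}), frozenset(), analyst)
```

With mode bits, granting the analyst access means changing the file's single group or opening it to everyone; the richer rule expresses the intent directly.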
We should also recognize that as the speed and volume of data increase, it will be nearly
impossible for humans (e.g., data stewards or security analysts) to classify this data in a timely
manner. Organizations are sometimes forced to keep new data locked down in a holding cell
until someone has appropriately classified and exposed it to end users. Valuable time is lost.
Fortunately, technology providers are developing innovative ways to automatically classify data,
either directly when ingested or soon thereafter. By leveraging such technologies, a key
prerequisite of the authorization process is satisfied while minimizing time to insight.
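A very simple form of automatic classification at ingest is pattern matching on field values. The sketch below assumes two illustrative patterns (email addresses and US social security numbers) and hypothetical helper names; production classifiers typically combine many more signals.

```python
import re

# Illustrative patterns only; real classifiers need far broader coverage.
PATTERNS = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}


def classify_record(record: dict) -> dict:
    """Map each field name to the set of sensitive data types detected in its value."""
    labels = {}
    for name, value in record.items():
        text = str(value)
        found = {kind for kind, pattern in PATTERNS.items() if pattern.search(text)}
        if found:
            labels[name] = found
    return labels


def sensitivity(record: dict) -> str:
    """Label a record 'restricted' if any field looks sensitive, else 'general'."""
    return "restricted" if classify_record(record) else "general"


incoming = {"note": "contact bob@example.com", "id": "123-45-6789", "city": "Berlin"}
labels = classify_record(incoming)
```

Records labeled "general" could be exposed to end users immediately, while "restricted" records go through the authorization process, which shortens the time data spends in the holding cell.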
How is Data Governance Different in the Age of Big Data?
By now, most of us are familiar with the three V's of big data:
Volume: The volume of data housed in big data systems can reach into the petabytes and beyond.
Variety: Data is no longer only in simple relational format; it can be structured, semistructured, or even unstructured; data repositories span files, NoSQL tables, and streams.
Velocity: Data needs to be ingested quickly from devices around the globe, including IoT sources. Data must be analyzed in real time.
Governing these systems can be complicated. Organizations are typically forced to stitch
together separate clusters, each of which has its own business purpose or stores and processes
unique data types such as files, tables, or streams. Even if the stitching itself is done carefully,
gaps are quickly exposed because securing data sets consistently across multiple repositories can
be extremely error-prone.
Converged architectures greatly simplify governance. In converged systems, several data types
(e.g., files, tables, and streams) are integrated into a single data repository that can be governed
and secured all at once. There is no stitching to be done per se because the entire system is cut
from and governed against the same cloth.
Beyond the three V's, there is another, more subtle difference. Most, if not all, big data
distributions include an amalgamation of different analytics and machine learning engines sitting
"atop" the data store(s). Spark and Hive are just two of the more popular ones in use today. This
flexibility is great for end users because they can simply pick the tool best suited to their specific
analytics needs. The trouble from a governance perspective is that these tools don't always honor
the same security mechanisms or protocols, nor do they log actions completely, consistently, or
in repositories that can scale -- at least not "out of the box."
As a result, big data practitioners might be caught flat-footed when trying to meet compliance or
auditor demands about, for example, data lineage -- a component of governance that aims to
answer the question "Where did this data come from and what happened to it over time?"
Streams-Based Architecture for Data Lineage
Luckily, it is possible to solve for data lineage using a more prescriptive approach and in systems
that scale in proportion to the demands of big data. In particular, a streams-based architecture
allows organizations to "publish" data (or information about data) that is ingested and
transformed within the cluster. Consumers can then "subscribe" to this data and populate
downstream systems in whatever way is deemed necessary.
It is now a simple matter to answer basic lineage questions such as, "Why do my results look
wrong?" Just use the stream to rewind and replay the sequence of events to determine where
things went awry. Moreover, administrators can even replay events from the stream to recreate
downstream systems should they get corrupted or fail.
This is arguably a more compliance-friendly approach to solving for data lineage, but certain
conditions must be met. Specifically:
The streams must be immutable (i.e., published events cannot be dropped or changed)
Permissions are set for publishers and subscribers of all events
Audit logs are set to record who consumed data and when
The streams allow for global replication, allowing for high availability should a given site fail
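The first three conditions can be sketched with a small in-memory, append-only log. This is a toy model under stated assumptions, not a real streaming platform: the class and function names are hypothetical, immutability is only by convention (the class offers no update or delete method), and global replication is out of scope.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone


@dataclass(frozen=True)
class Event:
    """One lineage record; frozen so its fields cannot be reassigned after publishing."""
    offset: int
    dataset: str
    action: str
    actor: str


@dataclass
class LineageStream:
    publishers: set
    subscribers: set
    events: list = field(default_factory=list)
    audit_log: list = field(default_factory=list)

    def publish(self, actor: str, dataset: str, action: str) -> int:
        """Append an event if the actor holds publish permission; return its offset."""
        if actor not in self.publishers:
            raise PermissionError(f"{actor} is not allowed to publish")
        event = Event(len(self.events), dataset, action, actor)
        self.events.append(event)  # append only; events are never edited or removed
        return event.offset

    def replay(self, actor: str, from_offset: int = 0) -> tuple:
        """Return events from an offset onward and record who read them and when."""
        if actor not in self.subscribers:
            raise PermissionError(f"{actor} is not allowed to subscribe")
        self.audit_log.append((actor, from_offset, datetime.now(timezone.utc)))
        return tuple(self.events[from_offset:])


def lineage_of(stream: LineageStream, actor: str, dataset: str) -> list:
    """Answer 'what happened to this data over time?' by replaying the stream."""
    return [e.action for e in stream.replay(actor) if e.dataset == dataset]


stream = LineageStream(publishers={"etl"}, subscribers={"auditor"})
stream.publish("etl", "orders", "ingested from crm")
stream.publish("etl", "clicks", "ingested from web")
stream.publish("etl", "orders", "joined with customers")
history = lineage_of(stream, "auditor", "orders")
```

Replaying from offset 0 is also how a corrupted downstream system would be rebuilt: the subscriber simply consumes the full sequence again.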
Summary
Robust governance programs will always be rooted in people and process, but the right choice
and use of technology are critical. The unique set of challenges posed by big data makes this
statement true now more than ever. Technology can be used to simplify aspects of governance
(such as security) and close gaps that would otherwise cause problems for key practices (such as
data lineage).
Security Think Tank: Data governance is essential to data security
Why is it important to know where data flows, with whom
it's shared and where it lives at rest – and what is the best
way of achieving this?
By Raef Meeuwisse
Published: 04 Jun 2018
It’s another Isaca conference and there are hundreds of security professionals in the room, all
itching to understand the latest thinking on where the greatest amount of cyber risk resides. Is it
cryptojacking, attacks from nation states, rogue insiders, vulnerabilities in CPU design, fileless
malware?
The reality is that there is no master list of cyber risks. When it comes to cyber threats, there is
no one size fits all. The cyber security risks are mostly dependent on the organisations
themselves – what products and services they deliver, what information of value those activities
contain, how good their existing cyber security defences are, how much a company may have
accidentally antagonised a potential hacker – but here comes the real problem:
In order to protect your information of value – you have to know about it. You have to know not
just that you have it, but where you have it, and where you allow it to go. This is because, if you
don’t know what your information of value is, or where you allow it to flow – you have no
chance to ensure that you apply the right security to it.
There is a wave of agreement that rolls through the audience. Everyone agrees that you need data
governance. After all, there is a reason that the discipline is called “information security”.
Based on my own audit experience (and now also based on asking this same question at many
conferences), I ask the audience this question – and I invite you (the reader) to think about your
own response: Put your hand in the air if your enterprise has a single, up-to-date inventory of all
your information of value, including all of the places you expect and allow that data to flow –
suppliers, cloud services, applications…?
In rooms filled with 200 or more security professionals, how many hands do you think go up in
the air? Sometimes a few and, often, none at all.
So, how does this happen? How do so many security professionals get left without a basic
component they need to help deliver security efficiently and effectively?
Well, this has nothing to do with the competence of the professionals in the audience. They are
always well aware of the need for effective data governance. They know that any datasets of
value should always be subject to data classification. There is also no lack of training and
guidance; for example, there are plenty of great data governance resources available across the
Isaca publications and education programs.
In my experience there are three main drivers for the issues around data governance.
1. Cybersecurity used to be about network security: For quite a while, it was possible to treat the
company network like a gated community. Security professionals were expected to cut corners
and protect the community (the network) rather than go through the more expensive and
laborious process of getting each dataset of potential value to go through a data classification
process. That method no longer works, but getting governance structures to adjust back to
placing data governance at the centre of their security universe is somewhat similar to asking a
supertanker to make an immediate 90 degree turn and then drive across for a few miles.
When all the structures and departments are geared up for certain roles and then the objectives
change, it takes time to re-engineer and re-equip those roles. This leads on to the second
challenge:
2. Not enough time allocated to security staff for training and education: The technology
landscape is changing faster than ever. If you want your security professionals to keep pace,
then just expecting them to be able to keep up by allocating a small amount of training time and
budget will not serve you or your staff very well. I spend over 50% of my time on research and I
barely keep up with the essentials.
We need to recognise that security professionals need a much more sizable percentage of their
time allocated to continuing professional education than most other roles. Isaca requires an
average of 40 CPE [continuing professional education] hours per year for members to maintain
their certifications. The reality is that most of us have to attain far more than the minimum. If
you are an organisation struggling to recruit or retain staff, you might want to look at just how
much ongoing training and education you offer security personnel.
3. The CISO is not reporting to the main executive: Just where do you plug in a chief
information security officer (CISO)? In Isaca’s State of cybersecurity 2018 survey, it was found
that even within the security staff themselves, there was a ‘striking lack of consensus’ about
where they should report. Only 35% of survey respondents indicated their roles are reporting into
a security function, 30% reporting into CIOs (chief information officers) and the remaining 35%
reporting to a multitude of functions and departments.
Reporting to the CEO
In my opinion, there is only one effective place for a CISO to report and that is directly to the
CEO (chief executive officer) as part of the main executive. If there were any doubt, I offer this as
further evidence: I have only ever seen a few hands go up to acknowledge that their organisation
had managed to get their data assets and data flows inventoried.
When I asked a follow-on question, there was a 100% correlation. The only people that had their
data governance working properly also had a CISO reporting into their CEO.
What does this mean in practice? It means that understanding your data of value, and where you
permit it to travel, is key to being able to achieve effective security.
However, it also seems like the organisations most likely to get to that position are those that
recognise the need to put their CISO right up on the main board with the chief financial officer.
You might think twice before allowing money to just flow around without the right checks and
balances. It’s time to think about information flows with the same level of integrity.
Data security has long since breached the walls of tech blogs and legal briefs as a topic of
concern. Thanks to recent revelations and subsequent high-profile apologies and government
hearings, data security is having its day around kitchen tables, in classrooms, and, of course,
across the vast digital universe of social media. Anyone who was somehow still asleep to the
pitfalls and perils of dealing in data and protecting personal information is certainly wide awake
now.
Today’s data management and storage landscape, where data entropy and data sprawl are
rampant, has wide-reaching consequences that go far beyond data accessibility, interpretability,
and ease-of-use for companies. Data security may be on everyone’s minds these days, but it will
only continue to become more challenging as data collection and storage become more
sophisticated.
Many companies are storing significant data in low-structure distributed systems, open source
databases, and even unmanaged devices. In these not-uncommon scenarios, how can a user know
that a query doesn’t return personal data if they don’t know what is in the data store? And even if
the data is in a well-known system, how can administrators be sure that this same personal data
isn’t provided to the wrong individuals? In other instances, personal data may need to be shared
or placed online, although it must be obfuscated first, but in a way that does not affect its value.
Companies need a new approach to security – new forms of data profiling via heuristics and new
technology to ensure the safety of all data, effective masking of personal data, and compliance
with new data protection and privacy laws like GDPR (General Data Protection Regulation).
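Masking that preserves analytical value usually hides the identifying part of a value and keeps the part analysts still need. The sketch below is one hedged illustration: it keeps the domain of an email address and the last four digits of a number. The function names and the choice of what to keep are assumptions, not a prescribed method.

```python
import re

EMAIL = re.compile(r"([A-Za-z0-9._%+-]+)@([A-Za-z0-9.-]+\.[A-Za-z]{2,})")


def mask_email(text: str) -> str:
    """Hide the local part of each email address but keep the domain for analysis."""
    return EMAIL.sub(lambda m: "***@" + m.group(2), text)


def mask_digits(value: str, visible: int = 4) -> str:
    """Replace all but the last `visible` digits with '*', keeping separators."""
    total = sum(ch.isdigit() for ch in value)
    to_hide = max(total - visible, 0)
    out = []
    for ch in value:
        if ch.isdigit() and to_hide > 0:
            out.append("*")
            to_hide -= 1
        else:
            out.append(ch)
    return "".join(out)


masked_note = mask_email("mail jane.doe@example.com now")
masked_card = mask_digits("4111-1111-1111-1234")
```

Keeping the domain still allows grouping by provider or organization, and keeping the last four digits still allows a support agent to confirm which card is meant, while neither output identifies the person on its own.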
Data security may be a hot topic of the moment, but other, less newsworthy issues are no less
important when it comes to data management. No matter how data is being used, governance is
crucial. Companies need audit trails of who did what with which data. CIOs live in constant fear
of an auditor walking in and asking where a value within a business application comes from and
questioning its reliability. Why is this type of dread so prevalent? Enterprises are waking up from
the last decade of the “democratization of data” and realizing that they have antiquated
governance processes in place – or, worse yet, none at all.
In the past, a company might have rolled out applications at a rate of one per year – across the
entire company. Now, each business unit within an organization rolls out one or more
applications each month, with little or no tracking of where the data is coming from, who is
using it, and where it is going. With no business model, no impact analysis, and no lineage
metadata to show any auditor who might come knocking on the door at any time, companies are
taking a massive legal risk. This not only keeps executives awake at night, but gives them
nightmares should they risk closing their eyes.
It is hard to keep pace with the rate of change within one company, let alone the near-constant
changes brought by political regulations and technological advances. Given the potential risks
that data security and data governance pose to a company, it is vital that businesses seek out new
technology that will equip them to face the security challenges of today and prepare them to meet
whatever challenges tomorrow may bring.
This article was originally published on the SAP HANA blog and is republished by permission.
Big data privacy: Four ways a data governance strategy supports
security, privacy and trust
By Daniel Teachey, Insights editor
Do a web search for “big data” and you can find countless articles about “delivering value” from
big data. While figuring out what to do with this data is important, a topic that isn’t quite as hot –
but might be much more important – is big data privacy, which focuses on whether big data is
protected in compliance with your organization’s existing standards.
Big data privacy falls under the broad spectrum of IT governance and is a critical component of
your IT strategy. You need a level of confidence in how any data is handled to make sure your
organization isn't at risk of a nasty, often public data exposure. That extends to privacy for all
your data, including big data sets that are increasingly becoming part of the mainstream IT
environment.
Privacy is also related to issues like data monetization. If the data that you have isn’t secure,
high-quality or fit for purpose, can you trust the monetary value placed on that data? And, as the
amount of data grows, do you have a strategy for larger privacy efforts, or big data privacy, in
your organization?
Big data privacy vs. traditional data privacy standards
Of course, data privacy is not a new topic. By the 1970s, it was a recognized concern for issues
such as medical records or financial information. In those early days, the first data privacy
principles adopted what were often called “Fair Information Practices” (FIP).
The FIP efforts in organizations followed five tenets.
Openness. There should be no systems for collecting personal data that are kept secret.
Disclosure. Organizations should provide a way for individuals to learn what information is available and how it is used.
Secondary usage. Information collected for one purpose should not be used for another purpose without the consent of the individual. (Note: this was the hardest to implement – and thereby became the least practiced tenet.)
Correction. Individuals should have the ability to correct or amend erroneous information.
Security. Any organization creating, maintaining, using or disseminating identifiable personal data must assure the data is being used correctly and must take precautions to prevent misuse.
With new privacy-based regulations like HIPAA and Sarbanes-Oxley, more organizations have a
more defined business need to safeguard data privacy. This has led to an expansion beyond the
tenets of a FIP approach.
As our domain of data has evolved, a new focus is tracking the source of information (also
known as lineage). It’s also important to understand the quality of the actual information and the
usage of the information as it pertains to personal privacy and industry compliance criteria. This
gets more complicated as data becomes an asset both for the organization and the consumer.
The push toward self-service and the need for big data privacy
As with any other privacy or security issue, you must balance big data privacy issues against your
business goals. Why do you collect and manage data in the first place? You’re typically using it
to fuel an operational effort (supporting sales) or an analytical effort (learning who to sell to).
For e-commerce or online customer experiences, that data is more visible to the customer
throughout their journey. As a result, the data can have a more direct impact on the bottom line.
After all, without a good e-commerce experience, customers may choose to go elsewhere.
Similarly, a poor online support program may lead to increased churn.
This “transparency” comes with some risk. More self-service interactions with customers mean
you are collecting and packaging more information about customers: their accounts, their
purchases and their preferences. More data can lead to a better customer experience, but it can
also put you at risk. There is simply a greater risk of exposure of personal or confidential
information.
As a result, data and IT governance efforts are finding a new push as organizations begin to
collect data for more public consumption. And business and IT, once mortal enemies
(almost), are now realizing that data is everyone’s responsibility.
Preparing for privacy in a big data world
When planning big data privacy efforts, a starting point is to understand the sources of data and
how this data is used. As we all know, that conversation rapidly goes in the direction of how we
should or should not use or exploit the data.
However, there is a tendency to avoid the delicate subject of how to support privacy of the
individual and how to protect data in an increasingly digital world. The complicating factor is
how to keep a balance between:
The value to end users.
The level of privacy and protection that's necessary for both you and your customer.
This issue has to be addressed if you want your digital business practices to be seen as credible to
your customers.
Strike a balance: A best practice checklist
The recommended approach for clarifying these concerns is to blend your business rules and IT
rules. If you can accomplish this collaborative effort through the use of governance solutions to
establish a big data privacy framework within your IT environment, then all the better.
Here are a few data governance best practices as they relate to big data privacy:
Define what data governance means – to your company and to your project. When it comes to big data, you don’t need to develop a separate data governance program or framework. You just need a data governance program and framework that support big data.
Know your culture. One size does not fit all. Some organizations are better suited for a top-down governance approach, while others will work better from the bottom up.
Design your data governance framework. Identify the “what” and “how” before specifying the “who.” Make use of existing committees and processes.
Treat data governance as a long-term program. Implement it as a series of tightly scoped initiatives. Plan for the activities and resources required to execute and maintain governance policies.
Organizations that already embrace centralized or shared services that are integrated with
functional business processes will have a less difficult path than those starting from scratch.
However, the effort to establish meaningful and sustainable data governance and management
will still:
Require a business context considered relevant and valuable to the end users.
Make mistakes that may require multiple attempts before results are sustained.
Depend on a determined commitment to achieve the vision of data as a corporate asset, and a willingness to learn from mistakes and try again.
Of course, there will always be debates, both within an organization and in the market overall,
around governance, security and trust. Regardless of the details, it’s vital to have a big data
privacy effort in place. These steps should be addressed during the design and implementation
process – and as part of reviews and proof-of-concept trials – to make sure they fit in your big
data environment.
Data is one of the key assets of an educational institution. Sitting on a mountain of student data,
everyone at the institution is already aware of this fact. However, what most stakeholders are
unaware of is how to take advantage of the information at hand, to achieve the diverse goals at
an organizational, operational and student level. For instance, the right use of data can help boost
student retention rates, as well as sustain or gain high global ranking for the university.
Moreover, as students demand customized experiences, and evolving funding frameworks entail
new requirements, there is a growing need for precise, reliable data that provides insights
regarding enrollment, retention and graduation rates.
As a result, institutions are trying to leverage existing data to enhance time efficiencies, talent
and resources such as infrastructure, while gauging the success of academic programs and
student services. But the reality for a majority of higher educational establishments is that while
there exists a deluge of data, it is unidentified, scattered and underutilized. The solution to
addressing these challenges lies in enacting a proper data governance model.
The data governance imperative
A robust data governance framework comprises policies, systems and practices that foster
transparency and seamless access to precise, trustworthy and consistent data. Reliable,
centralized data promotes a shared vision, with the capacity to drive informed decision making at
every level of the institution. Drawing from good governance, organizations can harness high-
quality data to even democratize it, and accelerate outcomes such as increased annual student
enrollment. But what makes data governance an immediate requirement?
For one, the future of every institution depends on it. By applying analytics on accurate data,
higher education providers can now derive varied insights about students, such as identifying
those likely to fail or drop out of a course. These insights can give universities and schools an
opportunity to address the problem in time. Moreover, institutions can seek funding by
presenting evidence-based reports on improved graduation rates, and predictive analytics reports
to estimate future results.
However, a lot of the aggregated data is not clearly defined. For example, the term ‘students’
could refer to multiple groups such as full-time students, undergraduates, graduates or those
continuing education. A well-structured data governance plan will empower education providers
to define their data, and eliminate ambiguous terminology. Furthermore, in many institutions,
systems and processes don't function in unison, thereby causing uneven data distribution. For
instance, one system may capture attendance and student performance, while another
unconnected system will have information on demographics. Consequently, stakeholders have
access to only partial information and cannot make an informed decision at any level. Since data
governance is not the responsibility of only the IT department, it transitions the organization
from compartmental reporting to cross-functional reporting, with all departments working
toward realizing unified objectives.
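The two problems described above, ambiguous terms and disconnected systems, can be illustrated with a minimal Python sketch. Everything in it is hypothetical and chosen only for illustration: the category names, the record fields, the sample values, and the assumption that both systems share a student ID. A small data dictionary pins down what 'students' can mean, and a merge on the shared ID combines the attendance system with the demographics system so that gaps become visible.

```python
from enum import Enum


class StudentCategory(Enum):
    """Hypothetical agreed definitions for the ambiguous term 'students'."""
    FULL_TIME = "full-time"
    UNDERGRADUATE = "undergraduate"
    GRADUATE = "graduate"
    CONTINUING_EDUCATION = "continuing education"


# Two unconnected systems, keyed by an assumed shared student ID.
attendance_system = {
    "s001": {"attendance_rate": 0.92, "gpa": 3.4},
    "s002": {"attendance_rate": 0.61, "gpa": 2.1},
}
demographics_system = {
    "s001": {"category": StudentCategory.UNDERGRADUATE, "age": 19},
    "s003": {"category": StudentCategory.GRADUATE, "age": 27},
}


def unified_view(*systems):
    """Merge per-student records from several systems into one view."""
    merged = {}
    for system in systems:
        for student_id, fields in system.items():
            merged.setdefault(student_id, {}).update(fields)
    return merged


def incomplete_records(view, required_fields):
    """Return the IDs of students whose merged record lacks a required field."""
    return sorted(
        student_id
        for student_id, record in view.items()
        if not required_fields.issubset(record)
    )


view = unified_view(attendance_system, demographics_system)
missing = incomplete_records(view, {"attendance_rate", "category"})
print(missing)
```

In this sample, only student s001 appears in both systems, so a stakeholder looking at either system alone would see a partial picture of the other two students.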
Three key elements of a well-designed data governance framework:
Data governance council - Typically, the team comprises senior staff from different
departments who can supervise the strategy and operations across the institution. The
council is meant to define, review and validate business rules, data definitions, data
quality, and data security. They also ratify data usability and data consistency.
Data stewardship committee - The brief for this team is to ensure compliance with the
business rules established by the data governance council. They verify if the rules are
being applied to the data, and authorize the publishing of data that is fit for stakeholder
consumption.
Data automation committee - This team has two broad responsibilities. The first
requires the committee to enable, automate and ratify the data governance council’s
business rules that are applied to processes and data. Second, they are also responsible for
ensuring that metadata is collected and saved in a metadata repository tool.
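As a rough illustration of how the three elements could fit together, the Python sketch below models rules defined by the council, a stewardship check that applies them, and a publishing gate. The rule names, the record fields and the 0.0 to 4.0 GPA range are assumptions made for this example, not part of any particular framework.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass(frozen=True)
class BusinessRule:
    """A rule as the governance council might define it (hypothetical shape)."""
    name: str
    check: Callable[[dict], bool]


# Rules defined and validated by the council (illustrative examples only).
COUNCIL_RULES = [
    BusinessRule("student_id_present", lambda r: bool(r.get("student_id"))),
    BusinessRule("gpa_in_range", lambda r: 0.0 <= r.get("gpa", -1.0) <= 4.0),
]


def steward_review(records, rules):
    """Stewardship step: list (record index, rule name) for every violation."""
    return [
        (index, rule.name)
        for index, record in enumerate(records)
        for rule in rules
        if not rule.check(record)
    ]


def fit_to_publish(records, rules):
    """Authorize publishing only when no rule is violated."""
    return not steward_review(records, rules)


records = [
    {"student_id": "s001", "gpa": 3.4},
    {"student_id": "", "gpa": 4.7},
]
violations = steward_review(records, COUNCIL_RULES)
print(violations)
print(fit_to_publish(records, COUNCIL_RULES))
```

Keeping the rules in one list that every check reads from mirrors the idea that the council owns the definitions while other teams only apply them.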
Conclusion
While data governance may either be absent or under-developed in most organizations, setting
up a well-defined framework is critical to ensuring a sustainable future for higher education
providers. The many complexities involved in establishing such a governance model can be
addressed with technology. Whichever approach institutions employ, good data governance will
result in good institutional governance. With a clear understanding of data ownership,
accessibility, quality and security, stakeholders can be confident about the veracity of data, and
secure a future inspired by innovation and informed decisions.
Data governance is taking on a front-and-center role among businesses as they grapple with and
try to correct poor data management practices of the past.
This data management methodology is how an organization manages the data it employs, in
terms of availability, usability, integrity and security. Developed and overseen by a governing
body, it involves defining data management procedures and establishing a plan to ensure those
procedures are followed.
The need to bring new rigor to data management partially springs from the explosion of data
being generated. In 2012, 2.5 exabytes (2.5×10^18 bytes) of data were created every day, according to
IBM. That’s 2.5 billion gigabytes. Given this overwhelming volume, it’s not surprising that
much of corporate data can be lost or become corrupt.
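The conversion from exabytes to gigabytes can be checked with a few lines of Python. The sketch assumes decimal (SI) units, where one gigabyte is 10^9 bytes and one exabyte is 10^18 bytes.

```python
# Decimal (SI) units: 1 gigabyte = 10**9 bytes, 1 exabyte = 10**18 bytes.
BYTES_PER_GIGABYTE = 10**9
BYTES_PER_EXABYTE = 10**18

# 2.5 exabytes, written as 25/10 so the arithmetic stays in exact integers.
daily_bytes = 25 * BYTES_PER_EXABYTE // 10
daily_gigabytes = daily_bytes // BYTES_PER_GIGABYTE

print(f"{daily_gigabytes:,} gigabytes per day")
```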
And there’s a significant price to pay. As Larry English, one of the earliest pioneers of data
quality put it, “Process failure and information scrap and rework caused by defective information
cost the United States alone $1.5 trillion or more.”
This cost is leading businesses to devise holistic data management strategies that surmount the issues
created by piecemeal approaches of the past. Typically, the tendency was to apply data quality
and integration technologies locally or departmentally. That created silos of good data when
more effective data management would traverse departments, applications, business units and
divisions.
As companies move forward on the continuum from undisciplined to governed, they will have
adopted a “think globally, act globally” perspective. They will have achieved a single view of the
enterprise through master data management, gaining the benefits resulting from the integration of
high quality data with business process management systems.
Data Governance Maturity Model
Better data governance (establishing, codifying and enforcing enterprise-wide best data
management practices) is key for improving data infrastructure that was built for a different
information era. The disjointed structure, with data held in disparate applications and multiple
locations across the organization, results in lost or flawed data that hinders performance and
drives up the cost of doing business.
What’s driving more holistic strategies today is the Data Governance Maturity Model. This helps
businesses understand where their data management practices currently stand and indicates a
path that can be taken to eventually evolve into a single, unified approach.
The model is based on four distinct stages for use of enterprise applications: Undisciplined,
Reactive, Proactive and Governed. Investments in both internal resources and third-party
technology are necessary for each stage. Yet, as the organization evolves through each stage, data
becomes more universally consistent, accurate and reliable, making the rewards grow and the
risks decrease.
Using the model, businesses can identify where they are on the continuum marked by
technologies where data consolidation and integration commonly occur. Typically it begins with
more limited areas like database marketing and moves through to more global areas like business
process management integration. Businesses advance along the continuum by starting with
smaller projects to get more value from data (like database marketing) and
then taking on bigger projects.
A holistic data management strategy is what leading businesses aim to put in place, and the Data
Governance Maturity Model is employed by many as a means to achieve this. As recounted
earlier in this article, the model helps businesses understand where their data management
practices stand today and indicates the path that can be taken to eventually transform them into a
single, unified data management approach.
There are four distinct stages to the model. Advancing systematically through them
requires understanding the characteristics of each stage and what it takes to move forward.
Stage One: Undisciplined
In this stage, an organization has defined few rules and policies to set standards for data quality
and integration. Executives aren’t likely to recognize the cost of data that’s been poorly
managed.
A number of characteristics are common to companies in this stage. In the people category, for
example, success rests on the competence of a few individuals. Management neither contributes
to nor buys into data quality efforts. On the technology side, undisciplined organizations do not
perform data profiling, analysis or auditing, and cleansing and standardization are isolated
occurrences. In terms of policies, there are no defined data quality processes, resources are not
optimized and problems are addressed as they occur through manually based processes.
Organizations at this stage face very high risk levels, as data issues can drive away customers
and improper procedures can hamper productivity. At the same time, poor data quality returns few
rewards. At some point, the price being paid for poor data is realized and an effort is made to
quantify the impact to spur change.
Advancing to the Reactive stage starts with establishing objectives for data governance,
including the size and scope of governance efforts and what critical data assets are necessary.
Also important are tech components that can handle data quality and data integration for cross-
functional teams. It also is key to centralize, in a single repository, business rules for core data
quality functions and use them across applications.
Stage Two: Reactive
As implied, companies at this stage tend to locate and manage data problems after the fact. Both
ERP and CRM applications are used selectively, and while some employees understand the data
quality issue, corporate management is largely unsupportive. Some 45 to 50% of organizations
fall into this category.
Businesses at this stage typically have a group of database administrators or other employees
who are responsible for data success. While data quality initiatives may benefit from individual
contributions, there are no standard, non-siloed procedures. In terms of technology, tactical data
quality tools are often available and are utilized by certain applications like CRM and ERP.
However, data is not integrated across business units. While there are rules for data governance,
the tendency remains reactive when it comes to data issues. Likewise, data management
processes tend to respond to recent data issues.
This is another high-risk stage, given the lack of data integration and overall inconsistency of
data. Rewards are correspondingly limited: returns come only through individual
processes and individuals, because the benefits of data quality are not widely recognized.
Advancing to the next stage (Proactive) is not easy. It requires creation of a new strategic vision
to guide the processes for improving data that links to tangible business results. Moreover, it
requires getting organizational buy-in, a challenge when business units have previously had
significant autonomy over their applications and data structures.
Best practices are then used to establish cross-functional business rules for data integrity,
overseen by a data governance team that includes stakeholders and day-to-day data stewards.
Stage Three: Proactive
Organizations that reach this stage gain a significant advantage: an optimal risk/reward ratio
and an environment where data starts to become an asset that can drive informed decisions. Less
than 10% of all companies have reached this stage.
This stage is characterized by management that understands and is committed to the role of data
governance, treating data as a strategic asset. Data stewards maintain corporate data
policies and procedures, and ensure that ongoing monitoring maintains data integrity. Real-time
rather than reactive activity is the practice. Preventive rules and processes are in place as perspective
shifts from problem correction to prevention. Risks are moderated because better, more reliable
information is behind decision-making, and rewards are medium to high in light of improved
data quality.
Moving to the next phase involves the solidification of the unified approach that is taking hold
for all corporate information. Moving to the Governed stage requires the organization to
assemble and integrate the pieces that have already been put into place. A framework emerges to
organize the work of brand stewards, supported by business analysts and IT professionals.
With data now robust and reliable enough for high-end process management, the foundation
for Business Process Management integration is now in place.
Stage Four: Governed
At this stage, data quality, integration and synchronization are integral to business processes
under a unified data governance strategy.
In this environment, the strategy has executive-level sponsorship, with the CEO directly
behind it. Employees are involved in data strategy and delivery, and zero-defect policies rule.
Data quality and data integration tools are standardized while data is continually inspected, with
deviations immediately resolved. New initiatives are weighed carefully against their potential
effect on the existing data structure. Automated policies ensure data’s consistency, accuracy and
reliability enterprise-wide.
Risk is low given the tight, organization-wide controls over master data that ensure high-quality
information about customers, prospects, inventory and products. Rewards are high, given
the improvement of data-led insights into the business that increase management’s
decision-making confidence.
The company that reaches this stage has effected a major culture change, where managing data is
less a tactical challenge and more the basis of a sophisticated data strategy and framework that
helps drive the business – and is a significant competitive differentiator.
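The four stages and the risk and reward levels attributed to each can be summarized in a small lookup structure. The Python sketch below is only one way to encode the model as described above; the class and function names are hypothetical, and the risk and reward labels paraphrase the wording used for each stage.

```python
from enum import IntEnum


class MaturityStage(IntEnum):
    UNDISCIPLINED = 1
    REACTIVE = 2
    PROACTIVE = 3
    GOVERNED = 4


# (risk, reward) levels as described in the text for each stage.
RISK_REWARD = {
    MaturityStage.UNDISCIPLINED: ("very high", "few"),
    MaturityStage.REACTIVE: ("high", "limited"),
    MaturityStage.PROACTIVE: ("moderate", "medium to high"),
    MaturityStage.GOVERNED: ("low", "high"),
}


def next_stage(stage):
    """Return the following stage, or None once Governed is reached."""
    if stage is MaturityStage.GOVERNED:
        return None
    return MaturityStage(stage + 1)


current = MaturityStage.REACTIVE
risk, reward = RISK_REWARD[current]
print(f"{current.name}: risk {risk}, rewards {reward}")
print(f"next stage: {next_stage(current).name}")
```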
Data governance is the common denominator that makes security, privacy and risk more
effective and easier.
Security, privacy and risk do not have to be scary, but with GDPR, CCPA and organizations
moving to a risk-based approach to security (rather than focusing only on compliance), they
have become daunting challenges. What is typically at the heart of organizations? Data and
information. The common denominator that makes security, privacy and risk more effective and,
dare I say it, easier? Data governance.
What Is Data Governance?
Data governance is the capability within an organization to help provide for and protect high-
quality data throughout the lifecycle of that data. This includes data integrity, security,
availability and consistency. Data governance includes people, processes and technology that
help enable appropriate handling of the data across the organization.
Data governance program policies include:
Delineating accountability for those responsible for data and data assets
Assigning responsibility to appropriate levels in the organization for managing and
protecting the data
Determining who can take what actions, with what data, under what circumstances, using
what methods (See Data Governance Institute for details.)
Identifying safeguards to protect data
Providing integrity controls to provide for the quality and accuracy of data
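The policy item about who can take what actions, with what data, under what circumstances and using what methods maps naturally onto a deny-by-default lookup. The following Python sketch is an assumption-laden illustration: the roles, dataset name, circumstances and methods are invented for the example, and real programs would typically rely on an existing access-control system rather than hand-written checks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AccessPolicy:
    """One hypothetical policy entry: who may do what, to which data, when, and how."""
    role: str
    action: str
    dataset: str
    circumstance: str  # "any" matches every circumstance
    method: str


POLICIES = {
    AccessPolicy("analyst", "read", "customer_orders", "business_hours", "reporting_tool"),
    AccessPolicy("data_steward", "update", "customer_orders", "any", "admin_console"),
}


def is_allowed(role, action, dataset, circumstance, method):
    """Allow a request only if some policy entry matches it (deny by default)."""
    request = (role, action, dataset, method)
    return any(
        (p.role, p.action, p.dataset, p.method) == request
        and p.circumstance in ("any", circumstance)
        for p in POLICIES
    )


print(is_allowed("analyst", "read", "customer_orders", "business_hours", "reporting_tool"))
print(is_allowed("analyst", "update", "customer_orders", "business_hours", "reporting_tool"))
```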
How Does Data Governance Help with Privacy Management?
You have to know what data you have, where it is, how it is used and whom it is shared with to
comply with applicable privacy regulations. You also need to have processes to obtain
appropriate consents and to access and delete the data.
Privacy regulations are basically a business case for data governance. Imagine if organizations
had already done extensive data mapping exercises prior to GDPR. Imagine if they had known the
where, why, what and how of their data before GDPR was passed. The transition to GDPR would
have been far less painful.
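A data mapping exercise of the kind mentioned above amounts to keeping a structured record of what each dataset is, where it lives, why it is used and whom it is shared with. The Python sketch below shows one possible shape for such a record and two simple queries over it. The field names, dataset names and recipients are hypothetical and are not taken from any regulation or tool.

```python
from dataclasses import dataclass, field


@dataclass
class DataMapEntry:
    """Hypothetical data-map record: what, where, why, and with whom."""
    name: str
    location: str
    purpose: str
    contains_personal_data: bool
    shared_with: list = field(default_factory=list)


data_map = [
    DataMapEntry("newsletter_signups", "crm_db", "marketing", True, ["email_vendor"]),
    DataMapEntry("server_metrics", "monitoring_cluster", "operations", False),
    DataMapEntry("payroll", "hr_system", "salary processing", True),
]


def personal_data_shared_externally(entries):
    """Names of datasets that hold personal data and are shared with another party."""
    return [e.name for e in entries if e.contains_personal_data and e.shared_with]


def locations_of_personal_data(entries):
    """Where personal data lives, e.g. to route an access or deletion request."""
    return sorted({e.location for e in entries if e.contains_personal_data})


print(personal_data_shared_externally(data_map))
print(locations_of_personal_data(data_map))
```

With such a map in place, answering a deletion request becomes a lookup rather than a search across systems.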
How Does Data Governance Help Cybersecurity?
To protect against threats, organizations need to know what data to protect and how to help keep
it protected. Information protection is at the core of security, but how can you protect it if you do
not know what data you have, where your data is, how it is used, whom it is shared with and how
it is shared? Businesses can no longer have perimeter protections in place and call it a day; the
perimeter has expanded to suppliers, cloud vendors, partners, and so on. So, managing your data
in a structured, responsible and law-abiding way will make it more efficient for security
professionals to protect it.
How Does Data Governance Help an Organization Manage Information Risk?
You need to know which data is most sensitive and critical to your organization (your most valuable
information) so that you can allocate more resources to protecting that data. No organization will
be 100% secure, and very few organizations have unlimited resources, whether people or money, to
implement, operate and improve cybersecurity measures. Therefore, businesses must take a
risk-based approach and focus on the most sensitive data assets.
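A risk-based approach of this kind can be sketched as a simple ranking exercise. In the Python example below, each asset gets a score equal to sensitivity multiplied by criticality, and a limited budget is spent on the highest-scoring assets first. The 1 to 5 scales, the cost figures, the asset names and the greedy selection are all assumptions made for illustration; real risk assessments use richer methods.

```python
def rank_by_risk(assets):
    """Order assets from highest to lowest score (sensitivity x criticality)."""
    return sorted(
        assets,
        key=lambda a: a["sensitivity"] * a["criticality"],
        reverse=True,
    )


def assets_to_prioritize(assets, budget):
    """Pick the highest-scoring assets whose protection cost still fits the budget."""
    chosen, remaining = [], budget
    for asset in rank_by_risk(assets):
        if asset["cost"] <= remaining:
            chosen.append(asset["name"])
            remaining -= asset["cost"]
    return chosen


# Hypothetical assets, scored on assumed 1-5 scales, with assumed protection costs.
assets = [
    {"name": "customer_pii", "sensitivity": 5, "criticality": 4, "cost": 60},
    {"name": "public_brochures", "sensitivity": 1, "criticality": 1, "cost": 10},
    {"name": "source_code", "sensitivity": 4, "criticality": 4, "cost": 40},
    {"name": "internal_wiki", "sensitivity": 2, "criticality": 2, "cost": 30},
]

print([a["name"] for a in rank_by_risk(assets)])
print(assets_to_prioritize(assets, budget=100))
```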
Times are changing. Is it easy to design and implement a data governance program? No, or
organizations would have them in place today. However, given privacy regulations, the evolving
threat landscape, the age of digitization and the expanding organizational boundaries, data
governance is no longer a choice for organizations that need quality data, protected from
cybercriminals, and in compliance with data protection laws.
Carisa Brockman has worked as part of the AT&T family for over 18 years (through
acquisitions). She is well-versed in business management practices and has focused on strategic
planning, information risk management, compliance management, enterprise policy
management, cross-functional process design and management, consolidation and integration of
enterprise security functions, and organizational effectiveness.