Dark Data Revelation and its
Potential Benefits
What is dark
data?
The information assets organizations
collect, process and store during
regular business activities, but
generally fail to use for other
purposes.
- IT Glossary by Gartner
In simple terms, dark data is all the potentially useful data an
organization possesses but does not meaningfully use or analyze
to improve the business.
The enormous digital universe

                                   2013      2020
Total size of digital universe     4.4 ZB    44 ZB
Data useful if analyzed            22%       37%
Data from mobile devices           17%       27%
Data from embedded systems         2%        10%
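To make the scale of these figures concrete, the short sketch below multiplies the total size of the digital universe by the share that would be useful if analyzed. It uses only the numbers reported on the slide; the variable and function names are illustrative assumptions.

```python
# Figures as reported on the slide; names here are illustrative only.
universe_zb = {2013: 4.4, 2020: 44.0}     # total size in zettabytes
useful_share = {2013: 0.22, 2020: 0.37}   # share useful if analyzed


def useful_volume_zb(year):
    """Zettabytes that would be useful if analyzed, for a given year."""
    return universe_zb[year] * useful_share[year]


for year in (2013, 2020):
    print(f"{year}: {useful_volume_zb(year):.2f} ZB potentially useful")

# Ratio of potentially useful data in 2020 to that in 2013.
growth = useful_volume_zb(2020) / useful_volume_zb(2013)
print(f"Growth factor: {growth:.1f}x")
```

On these figures, the potentially useful volume grows from just under 1 ZB to a little over 16 ZB, which is faster growth than the tenfold increase in total size alone.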
According to IDC (a research firm), up to 90 percent of the
digital universe is unstructured data.
Traditional sources of dark data
Server log files
Networking machine data
Point-of-sale feeds
Customer queries recorded in calls, emails, forms
Underused employee data
Meeting notes
Unstructured information arising out of business mails and presentations
Unused data resulting from business research and surveys
Why is it
important?
Businesses invest heavily in collecting
data; however, tangible value can be
derived only once companies
understand their dark data and how
it can be applied.
It is also a sensible step for any company which is getting
started with big data and building a data warehouse.
In this case, dark data can be a reliable source of historical
data.
3 facets of dark data
01 Existing unstructured data
02 Nontraditional unstructured data
03 Data in the deep web
Existing unstructured data
Many businesses already have large collections of both structured and
unstructured data.
Unstructured data such as emails, notes, messages, documents, logs,
and notifications (including those from IoT devices) stays confined to the
organization and remains largely unused, either because tools and
techniques are lacking or because the data never reaches a database.
These data assets could hold valuable insights related
to competitors, pricing and consumer behavior.
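As an illustration of how unused logs can be given structure, here is a minimal sketch that parses raw access-log text into records and summarizes them. The sample lines, the Common Log Format layout, and all names are assumptions made for this example; real logs will vary.

```python
import re
from collections import Counter

# Hypothetical access-log lines in Common Log Format; real logs will differ.
SAMPLE_LOG = """\
203.0.113.5 - - [10/Oct/2023:13:55:36 +0000] "GET /pricing HTTP/1.1" 200 5123
203.0.113.9 - - [10/Oct/2023:13:56:01 +0000] "GET /pricing HTTP/1.1" 200 5123
198.51.100.7 - - [10/Oct/2023:13:57:12 +0000] "GET /checkout HTTP/1.1" 500 312
"""

LINE_RE = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def parse_log(text):
    """Turn raw log text into a list of dicts, skipping lines that do not match."""
    records = []
    for line in text.splitlines():
        match = LINE_RE.match(line)
        if match:
            records.append(match.groupdict())
    return records


records = parse_log(SAMPLE_LOG)
hits_per_path = Counter(r["path"] for r in records)          # page popularity
errors = [r for r in records if r["status"].startswith("5")]  # server-side failures

print(hits_per_path.most_common())
print(len(errors), "server errors")
```

Once the lines are structured records, they can be loaded into a database or warehouse alongside existing structured data.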
Nontraditional unstructured data
Data in web pages, audio and video files, and still
images is largely untapped. It can be mined using data
extraction solutions, computer vision, advanced pattern
recognition, and video and sound analytics.
This can help businesses perform advanced analytics on data
present in nontraditional formats to better understand their
customers, employees, operations, and markets.
Data present in the deep web
The deep web is the largest pool of
unused information: data curated by
academics, consortia, government
agencies, communities, and other
third-party domains.
Companies can potentially gather competitive intelligence
using emerging search tools developed to help users
target scientific research, activist data, or even hobbyist
threads found in the deep web.
One example of such a tool is Stanford University's search
engine, Hidden Web Exposer, which scrapes the deep web
for information using a task-specific, human-assisted
approach.
Potential risks
associated with dark
data
Legal and regulatory issues
If the stored data is covered by legal
regulations, as credit card data is,
its exposure could expose
companies to financial and legal
liabilities.
Intelligence risk
Companies could intentionally or
unintentionally disclose proprietary
or sensitive data on business
operations, products, financial status
and business plans.
PR disaster
Companies are regarded as the
protectors of the data they collect. Any
loss of data, especially sensitive and
confidential data, can therefore lead to a loss of
reputation.
Opportunity costs
If a company does not analyze and
process its dark data but its competitors
do, those competitors will be in a better
position to capture more market share by
leveraging the insights from dark data.
Practical applications
of dark data
Personalization in retail
Stitch Fix, an online subscription shopping service, uses images from
social media and other sources to track emerging fashion trends and
evolving customer preferences.
1. Questionnaire filled in by clients
2. Customer's Pinterest board and social media scanned
3. Data augmentation
4. Deeper insight into the customer's style preferences
5. Appropriate clothing shipped to the customer
Fraud detection
A financial services firm wanted to gain insight from its trading terminal data to find
correlations between trading patterns and abuses such as money laundering and other fraudulent
activities.
Most of the data was dark owing to its volume and geographically scattered storage.
Once the firm was able to use this previously underutilized data and had completed the
data preparation and analysis needed to identify suspect patterns in transactional records, it
built sophisticated predictive models that identify activities
indicating potential fraud, so that measures can be taken to prevent fraud before it occurs.
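The slide does not say which models the firm built. As a much simpler stand-in for the idea of finding suspect patterns in transactional records, the sketch below flags amounts that sit far from an account's usual level using the modified z-score (median and median absolute deviation). The records, the threshold of 3.5, and the function name are assumptions for illustration, not the firm's method.

```python
from statistics import median

# Hypothetical transaction records: (account_id, amount).
transactions = [
    ("A", 120.0), ("A", 95.0), ("A", 110.0), ("A", 105.0), ("A", 9800.0),
    ("B", 40.0), ("B", 42.0), ("B", 38.0), ("B", 41.0),
]


def flag_outliers(records, threshold=3.5):
    """Flag amounts far from an account's median, using the modified z-score."""
    by_account = {}
    for account, amount in records:
        by_account.setdefault(account, []).append(amount)

    flagged = []
    for account, amounts in by_account.items():
        med = median(amounts)
        mad = median(abs(a - med) for a in amounts)  # median absolute deviation
        if mad == 0:
            continue  # no spread to compare against
        for amount in amounts:
            score = 0.6745 * (amount - med) / mad
            if abs(score) > threshold:
                flagged.append((account, amount))
    return flagged


print(flag_outliers(transactions))
```

A rule like this only surfaces candidates for review; a predictive model of the kind described above would typically be trained on many more features than the amount alone.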
Approaching dark data
Getting the right data
Instead of attempting to discover and
collect all of the dark data hidden
within and outside your organization,
work with the business team to find
answers for specific business
problems.
Being open to third party data
Source data from the web to
augment your own data with publicly
available demographic, location, and
statistical information.
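Augmenting in-house data with public data usually comes down to a join on a shared key. The sketch below joins customer records to a demographic lookup by postal code; every record, field name, and value is made up for illustration.

```python
# Hypothetical in-house customer records keyed by postal code.
customers = [
    {"customer_id": 1, "postal_code": "10001", "orders": 4},
    {"customer_id": 2, "postal_code": "94105", "orders": 1},
    {"customer_id": 3, "postal_code": "00000", "orders": 2},
]

# Hypothetical public demographic table (values are made up for illustration).
demographics = {
    "10001": {"median_income": 90000, "population": 25000},
    "94105": {"median_income": 150000, "population": 12000},
}


def augment(records, lookup, key="postal_code"):
    """Left-join each record with lookup data; unmatched records get None fields."""
    empty = {"median_income": None, "population": None}
    return [{**rec, **lookup.get(rec[key], empty)} for rec in records]


augmented = augment(customers, demographics)
for row in augmented:
    print(row)
```

Keeping unmatched customers with empty fields, rather than dropping them, makes gaps in the third-party data visible instead of silently shrinking the data set.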
Building data talent
Data scientists are valuable
resources, especially those who have
the skills to combine deep modeling
and statistical techniques with
industry or function-specific insights.
Utilizing advanced visualization tools
Advanced visualization software can boost
business intelligence by repackaging big data into
smaller, more meaningful chunks, delivering value
to users much faster.
This is crucial since information can be more
easily consumed when presented as an
infographic, a dashboard, or another type of
visual representation.
Future of dark data
Most companies will learn to tap
into their dark data more effectively; that is the direction in which
a connected and measurable world is heading.
The real value will go to those businesses that
open up their data sources internally in a secure and
responsible manner, so that the
workforce is empowered to become problem
solvers in their own right.
Looking to augment data assets with web data?
Reach out to PromptCloud, a pioneer in custom, managed and cloud-based web
extraction services.
https://www.promptcloud.com | sales@promptcloud.com

Dark Data Revelation and its Potential Benefits

  • 1. Dark Data Revelation and its Potential Benefits
  • 2. What is dark data? "The information assets organizations collect, process and store during regular business activities, but generally fail to use for other purposes." (IT Glossary by Gartner)
  • 3. In simple terms, dark data is all the useful data an organization possesses but does not meaningfully use or analyze to improve the business.
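  • 4. The enormous digital universe. 2013: total size of digital universe 4.4 ZB; data useful if analyzed 22%; data from mobile devices 17%; data from embedded systems 2%. 2020: total size of digital universe 44 ZB; data useful if analyzed 37%; data from mobile devices 27%; data from embedded systems 10%.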
  • 5. According to IDC (a research firm), up to 90 percent of the digital universe is unstructured data.
  • 6. Traditional sources of dark data: server log files; networking machine data; point-of-sale feeds; customer queries recorded in calls, emails, and forms; underused employee data; meeting notes; unstructured information arising from business emails and presentations; unused data resulting from business research and surveys.
  • 7. Why is it important? Businesses invest heavily in collecting data; however, tangible value can be derived only once companies understand their dark data and how it can be applied.
  • 8. Examining dark data is also a sensible step for any company that is getting started with big data and building a data warehouse. In this case, dark data can be a reliable source of historical data.
  • 9. 3 facets of dark data: (01) existing unstructured data; (02) nontraditional unstructured data; (03) data in the deep web.
  • 10. Existing unstructured data: Many businesses already have large collections of both structured and unstructured data.
  • 11. Unstructured data such as emails, notes, messages, documents, logs, and notifications (including from IoT devices) is confined to the organization and remains largely unused, either for lack of tools and techniques or because it is not held in a database. These data assets could hold valuable insights about competitors, pricing, and consumer behavior.
  • 12. Nontraditional unstructured data: Data in web pages, audio and video files, and still images is largely untapped and can be mined via data extraction solutions, computer vision, advanced pattern recognition, and video and sound analytics.
  • 13. This can help businesses perform advanced analytics on data present in nontraditional formats to better understand their customers, employees, operations, and markets.
  • 14. Data present in the deep web: The deep web holds the largest pool of unused information: data curated by academics, consortia, government agencies, communities, and other third-party domains.
  • 15. Companies can potentially curate competitive intelligence using an emerging class of search tools developed to help users target scientific research, activist data, or even hobbyist threads found in the deep web.
  • 16. One example of such a tool is Stanford University’s Hidden Web Exposer, a search engine that scrapes the deep web for information using a task-specific, human-assisted approach.
  • 18. Legal and regulatory issues: If stored data is covered by legal regulations, as credit card data is, its exposure could subject companies to financial and legal liabilities.
  • 19. Intelligence risk: Companies could intentionally or unintentionally disclose proprietary or sensitive data on business operations, products, financial status and business plans.
  • 20. PR disaster: Companies are regarded as protectors of the data they collect, so any loss of data, especially sensitive and confidential data, can damage their reputation.
  • 21. Opportunity costs: If a company neglects to analyze and process its dark data while its competitors do so, those competitors will be better positioned to capture market share by leveraging the insights from dark data.
  • 23. Personalization in retail: Stitch Fix, an online subscription shopping service, uses images from social media and other sources to track emerging fashion trends and evolving customer preferences. The process: clients fill out a questionnaire; the customer’s Pinterest board and social media are scanned; the data is augmented; deeper insight into the customer’s style preferences is gained; appropriate clothing is shipped to the customer.
  • 24. Fraud detection: A financial services firm wanted to gain insight from its trading terminal data to find correlations between trading patterns and abuses such as money laundering and other fraudulent activities. Most of the data was dark owing to its volume and geographically scattered storage. Once the firm was able to use this previously underutilized data, it completed the data preparation and analysis needed to identify suspect patterns in transactional records. It then used the analyzed data to build sophisticated predictive models that can identify activities indicating potential fraud and take measures to prevent fraud before it occurs.
  • 26. Getting the right data: Instead of attempting to discover and collect all of the dark data hidden within and outside your organization, work with the business team to find answers to specific business problems.
  • 27. Being open to third-party data: Source data from the web to augment your own data with publicly available demographic, location, and statistical information.
  • 28. Building data talent: Data scientists are valuable resources, especially those who have the skills to combine deep modeling and statistical techniques with industry or function-specific insights.
  • 29. Utilizing advanced visualization tools: Advanced visualization software can boost business intelligence by repackaging big data into smaller, more meaningful chunks, delivering value to users much faster. This is crucial since information can be more easily consumed when presented as an infographic, a dashboard, or another type of visual representation.
  • 30. Future of dark data: Most companies will learn to tap into their dark data more effectively; that is the direction in which a connected and measurable world is heading. The real value will go to businesses that open their data sources internally in a secure and responsible manner, so that employees are empowered to become problem solvers in their own right.
  • 31. Looking to augment data assets with web data? Reach out to PromptCloud, a pioneer in custom, managed and cloud-based web extraction services. https://www.promptcloud.com | sales@promptcloud.com