WEB MINING
Presented by:
Gaurav Uniyal
161340101008
C.S.E.(Final Year)
Introduction
• Web mining applies data mining techniques to extract and
uncover knowledge from web documents and services.
• The goal is to make the web more useful and more profitable,
and to make our interaction with it more efficient.
Web Mining Services
This technology has enabled e-commerce to do personalized
marketing, which eventually results in higher trade volumes.
WWW Describes…
• Web: a huge, widely distributed, highly heterogeneous,
semi-structured, hypertext/hypermedia, interconnected
information repository.
• The web is a huge collection of documents plus
– Hyperlink information.
– Access and usage information.
Tasks to Conduct
• Resource Finding.
• Information selection & Pre-processing.
• Generalization.
• Analysis.
Web Mining Classification
Web Mining
• Web Content Mining
– Web Page Content Mining
– Search Result Mining
• Web Structure Mining
• Web Usage Mining
– General Access Pattern Tracking
– Customized Usage Tracking
Web Content Mining
• Discovery of useful information from web content, data,
and documents.
• Information retrieval view.
• Database view.
Web Structure Mining
• Analogy with citation analysis: researchers have proposed
using citations among journal articles to evaluate the
quality of research papers.
• Analogy with customer behavior: the quality of a product is
evaluated from the opinions of other customers rather than
from the product's description or advertisement.
• In the same way, hyperlinks pointing to a page can be read
as endorsements of that page.
Web Usage Mining
• Also known as web log mining.
• Definition: discovery of meaningful patterns from data
generated by client-server transactions or from web server
logs.
• Typical sources of data:
– Automatically generated data stored in server access
logs, referrer logs, agent logs, and client-side cookies.
– User profiles.
– Metadata: page attributes, content attributes, usage
data.
• Generate simple statistical reports:
– A summary report of hits and bytes transferred.
– A list of top requested URLs.
– A list of top referrers.
– A list of the most common browsers used.
– Hits per hour/day/week/month reports.
– Hits per domain report.
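The simple reports listed above can be sketched in a few lines of Python. This is only an illustration: it assumes the access log uses the Common Log Format, and the names LOG_PATTERN, summarize and SAMPLE_LOG, as well as the sample entries, are made up for the example.

```python
import re
from collections import Counter

# Assumed layout (Common Log Format):
# host ident user [time] "METHOD url PROTOCOL" status bytes
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) (?P<bytes>\d+|-)'
)


def summarize(lines):
    """Return hits, bytes transferred, top URLs and hits per host."""
    hits = 0
    total_bytes = 0
    urls = Counter()
    hosts = Counter()
    for line in lines:
        match = LOG_PATTERN.match(line)
        if match is None:
            continue  # skip lines that do not look like log entries
        hits += 1
        if match.group("bytes") != "-":  # "-" means no body was sent
            total_bytes += int(match.group("bytes"))
        urls[match.group("url")] += 1
        hosts[match.group("host")] += 1
    return {
        "hits": hits,
        "bytes": total_bytes,
        "top_urls": urls.most_common(3),
        "hits_per_host": dict(hosts),
    }


# Hypothetical log entries, invented for this sketch.
SAMPLE_LOG = [
    '10.0.0.1 - - [10/Oct/2023:13:55:36 +0000] "GET /index.html HTTP/1.1" 200 2326',
    '10.0.0.2 - - [10/Oct/2023:13:56:01 +0000] "GET /products.html HTTP/1.1" 200 1500',
    '10.0.0.1 - - [10/Oct/2023:13:57:12 +0000] "GET /index.html HTTP/1.1" 304 -',
    'malformed line',
]

report = summarize(SAMPLE_LOG)
print(report)
```

Referrer and browser reports would follow the same counting pattern, but they need the combined log format, which this sketch does not parse.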
• Learn:
– Who is visiting your site.
– The path visitors take through your pages.
– How much time visitors spend on each page.
– The most common starting page.
– What content your visitors are going through.
– Where visitors are leaving your site.
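One possible way to answer these questions is to group requests into sessions and then look at the first page, the last page and the gaps between requests. The sketch below assumes the log has already been reduced to (visitor, timestamp in seconds, url) records and uses a 30-minute inactivity timeout; both the record layout and the timeout are assumptions made for illustration.

```python
from collections import Counter, defaultdict

SESSION_TIMEOUT = 30 * 60  # seconds of inactivity that end a session (assumed)


def build_sessions(records, timeout=SESSION_TIMEOUT):
    """Split each visitor's requests into sessions of (timestamp, url) pairs."""
    by_visitor = defaultdict(list)
    for visitor, timestamp, url in records:
        by_visitor[visitor].append((timestamp, url))
    sessions = []
    for visits in by_visitor.values():
        visits.sort()
        current = [visits[0]]
        for previous, visit in zip(visits, visits[1:]):
            if visit[0] - previous[0] > timeout:
                sessions.append(current)  # gap too long: close the session
                current = []
            current.append(visit)
        sessions.append(current)
    return sessions


def navigation_stats(sessions):
    """Count entry pages, exit pages and average time spent per page."""
    entry_pages = Counter(session[0][1] for session in sessions)
    exit_pages = Counter(session[-1][1] for session in sessions)
    dwell = defaultdict(list)
    for session in sessions:
        # Time on a page is estimated as the gap until the next request.
        for (start, url), (end, _) in zip(session, session[1:]):
            dwell[url].append(end - start)
    avg_dwell = {url: sum(times) / len(times) for url, times in dwell.items()}
    return entry_pages, exit_pages, avg_dwell


# Hypothetical click records, invented for this sketch.
RECORDS = [
    ("a", 0, "/home"),
    ("a", 60, "/products"),
    ("a", 180, "/checkout"),
    ("b", 10, "/home"),
    ("b", 40, "/about"),
    ("a", 5000, "/home"),
]

sessions = build_sessions(RECORDS)
entry_pages, exit_pages, avg_dwell = navigation_stats(sessions)
print(entry_pages.most_common(1), exit_pages, avg_dwell)
```

The time spent on the last page of a session cannot be measured this way, because no later request marks when the visitor left.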
Design of Web Log Miner
• The web log is filtered to generate a relational database.
• A data cube is generated from the database.
• OLAP is used to drill down and roll up in the cube.
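A toy version of this pipeline, assuming the filtered log has been reduced to rows with three dimensions (day, hour, url) and that the measure is a hit count. The schema, the sample rows and the helper names build_cube and roll_up are assumptions for illustration, not part of any OLAP product.

```python
from collections import Counter

# Hypothetical relational rows produced by filtering the web log.
DIMENSIONS = ("day", "hour", "url")
ROWS = [
    ("2023-10-10", 13, "/index.html"),
    ("2023-10-10", 13, "/products.html"),
    ("2023-10-10", 14, "/index.html"),
    ("2023-10-11", 9, "/index.html"),
]


def build_cube(rows):
    """Base cuboid: hit count for every (day, hour, url) cell."""
    return Counter(rows)


def roll_up(cube, keep):
    """Aggregate the cube over every dimension not listed in `keep`."""
    indexes = [DIMENSIONS.index(name) for name in keep]
    result = Counter()
    for cell, count in cube.items():
        result[tuple(cell[i] for i in indexes)] += count
    return result


cube = build_cube(ROWS)
hits_per_day = roll_up(cube, ["day"])               # roll up to the day level
hits_per_day_hour = roll_up(cube, ["day", "hour"])  # drill down: day, then hour
print(hits_per_day)
print(hits_per_day_hour)
```

Rolling up drops dimensions and sums the counts. Drilling down is the reverse step, done here by asking for a finer set of dimensions from the base cube.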
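Structures
• Hubs: pages that link to many authorities.
• Authorities: pages that are linked to by many hubs.
• Mutually reinforcing relationship: a good hub points to
many good authorities, and a good authority is pointed to
by many good hubs.
• Hyperlinks can be used to infer the notion of authority.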
HITS
• HITS stands for Hyperlink-Induced Topic Search.
• It explores interactions between hubs and authoritative
pages.
• Expand the root set into a base set.
• Apply weight propagation.
• Systems based on the HITS algorithm include IBM's Clever
project. Google's PageRank is a related link-analysis
algorithm, not HITS itself.
• Difficulties from ignoring textual context:
– Drifting: when hubs contain multiple topics.
– Topic hijacking: when many pages from a single web
site point to the same single popular site.
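The weight-propagation step can be sketched as follows. This is a minimal illustration, assuming the base set is already available as a dictionary that maps each page to the pages it links to. The page names, the fixed iteration count and the normalization by Euclidean length are choices made for the example.

```python
import math


def normalize(scores):
    """Scale scores so that their squares sum to 1."""
    norm = math.sqrt(sum(value * value for value in scores.values())) or 1.0
    return {node: value / norm for node, value in scores.items()}


def hits(graph, iterations=50):
    """Return (hub, authority) scores for a link graph {page: [linked pages]}."""
    nodes = set(graph)
    for targets in graph.values():
        nodes.update(targets)
    hub = {node: 1.0 for node in nodes}
    auth = {node: 1.0 for node in nodes}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages that link in.
        new_auth = {node: 0.0 for node in nodes}
        for source, targets in graph.items():
            for target in targets:
                new_auth[target] += hub[source]
        # Hub update: sum the authority scores of the pages linked to.
        new_hub = {
            node: sum(new_auth[target] for target in graph.get(node, ()))
            for node in nodes
        }
        auth = normalize(new_auth)
        hub = normalize(new_hub)
    return hub, auth


# Hypothetical base set: three hub pages pointing at two candidate authorities.
BASE_SET = {
    "hub1": ["authA", "authB"],
    "hub2": ["authA", "authB"],
    "hub3": ["authA"],
}

hub, auth = hits(BASE_SET)
print(sorted(auth.items(), key=lambda item: -item[1]))
```

In this toy graph authA ends up with the highest authority score because all three hubs link to it, and hub1 and hub2 outrank hub3 because they point to both authorities.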
Applications of Web Mining
• Improve web server system performance.
• Improve site design.
• Intrusion detection.
• Predict users' actions.
• Enhance the quality and delivery of internet
information services to the end user.
• Facilitate adaptive sites and personalization.
Thank You!
