Web mining
10/9/2013 1
Web mining applies data mining techniques to extract and
uncover knowledge from web documents and services.
Its aim is to make the web more useful and more profitable,
and to increase the efficiency of our interaction with the web.
The Web: a huge, widely distributed, highly heterogeneous,
semi-structured, hypertext/hypermedia, interconnected
information repository.
The Web is a huge collection of documents plus:
– Hyperlink information
– Access and usage information
Web mining subtasks:
– Resource finding.
– Information selection and pre-processing.
– Generalization.
– Analysis.
Web mining
– Web content mining
  – Web page content mining
  – Search result mining
– Web structure mining
– Web usage mining
  – General access pattern tracking
  – Customized usage tracking
Web content mining: discovery of useful information from web
contents, data, and documents.
– Information retrieval view.
– Database view.
Researchers have proposed using citations among journal articles
to evaluate the quality of research papers.
Customer behavior: evaluate the quality of a product based on
the opinions of other customers (instead of the product's
description or advertisement).
Web usage mining is also known as web log mining.
Definition: discovery of meaningful patterns from data generated
by client-server transactions or from web server logs.
Typical sources of data:
– Automatically generated data stored in server access logs,
  referrer logs, agent logs, and client-side cookies.
– User profiles.
– Metadata: page attributes, content attributes, usage data.
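As an illustration of the first data source, the sketch below parses one server access log line. It assumes the log uses the common "combined" layout (host, timestamp, request, status, bytes, referrer, agent); the sample line, the field names, and the `parse_log_line` helper are hypothetical and would need adjusting to a real server's configuration.

```python
import re
from datetime import datetime

# Assumed layout: host ident user [time] "request" status bytes "referrer" "agent"
LOG_PATTERN = re.compile(
    r'(?P<host>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+)[^"]*" '
    r'(?P<status>\d{3}) (?P<bytes>\d+|-)'
    r'(?: "(?P<referrer>[^"]*)" "(?P<agent>[^"]*)")?'
)

def parse_log_line(line):
    """Return a dict of fields for one log line, or None if it does not match."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    record = match.groupdict()
    record["time"] = datetime.strptime(record["time"], "%d/%b/%Y:%H:%M:%S %z")
    record["status"] = int(record["status"])
    # A "-" in the bytes field means no body was sent.
    record["bytes"] = 0 if record["bytes"] == "-" else int(record["bytes"])
    return record

# Hypothetical sample line, not taken from a real log.
sample = ('192.0.2.1 - - [09/Oct/2013:13:55:36 +0000] "GET /index.html HTTP/1.1" '
          '200 2326 "http://example.com/start" "Mozilla/5.0"')
record = parse_log_line(sample)
print(record["url"], record["status"], record["referrer"])
```

Lines that do not match are returned as `None` so that a later cleaning step can count and discard them rather than fail.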
Generate simple statistical reports:
A summary report of hits and bytes transferred
A list of top requested URLs
A list of top referrers
A list of most common browsers used
Hits per hour/day/week/month reports
Hits per domain reports
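The reports listed above are mostly simple counts. A minimal sketch, assuming the log has already been parsed into tuples; the `records` data and its field order are made up for illustration:

```python
from collections import Counter

# Hypothetical, already-parsed log records: (hour, url, referrer, bytes_sent)
records = [
    (9, "/index.html", "http://example.com/", 2326),
    (9, "/products.html", "/index.html", 5120),
    (10, "/index.html", "http://search.example/", 2326),
    (10, "/contact.html", "/index.html", 1024),
    (10, "/index.html", "http://example.com/", 2326),
]

# Summary report of hits and bytes transferred.
total_hits = len(records)
total_bytes = sum(r[3] for r in records)

# Top requested URLs and top referrers.
top_urls = Counter(r[1] for r in records).most_common(2)
top_referrers = Counter(r[2] for r in records).most_common(2)

# Hits per hour; the same pattern gives per-day or per-domain counts.
hits_per_hour = Counter(r[0] for r in records)

print(total_hits, total_bytes)
print(top_urls)
print(sorted(hits_per_hour.items()))
```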
Learn:
Who is visiting your site
The path visitors take through your pages
How much time visitors spend on each page
The most common starting page
Where visitors are leaving your site
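The path-related questions above can be sketched by grouping page views per visitor. The example assumes one session per visitor and estimates time on a page as the gap to that visitor's next request; the `views` data is hypothetical, and real logs would need proper session splitting first.

```python
from collections import Counter, defaultdict

# Hypothetical page views: (visitor_id, seconds_since_midnight, url)
views = [
    ("a", 0, "/index.html"),
    ("a", 30, "/products.html"),
    ("a", 90, "/checkout.html"),
    ("b", 10, "/index.html"),
    ("b", 25, "/contact.html"),
    ("c", 40, "/products.html"),
]

# Group views per visitor in time order (one session per visitor is assumed).
paths = defaultdict(list)
for visitor, ts, url in sorted(views, key=lambda v: (v[0], v[1])):
    paths[visitor].append((ts, url))

# Most common starting pages and the pages where visitors leave.
entry_pages = Counter(path[0][1] for path in paths.values())
exit_pages = Counter(path[-1][1] for path in paths.values())

# Time on a page is the gap to the visitor's next request; the last page
# of a path has no next request, so it is left out of the estimate.
time_on_page = defaultdict(list)
for path in paths.values():
    for (ts, url), (next_ts, _) in zip(path, path[1:]):
        time_on_page[url].append(next_ts - ts)
average_time = {url: sum(t) / len(t) for url, t in time_on_page.items()}

print(entry_pages.most_common(1))
print(dict(exit_pages))
print(average_time)
```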
The web log is filtered to generate a relational database.
A data cube is generated from the database.
OLAP is used to drill down and roll up in the cube.
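A toy version of the cube operations, assuming the cleaned log has three dimensions (day, hour, URL) and the measure is a hit count. The `rows` data and the `roll_up` and `slice_cube` helpers are illustrative only; a real system would use a database or OLAP engine rather than in-memory counters.

```python
from collections import Counter

# Hypothetical cleaned log rows: (day, hour, url)
rows = [
    ("2013-10-08", 9, "/index.html"),
    ("2013-10-08", 9, "/products.html"),
    ("2013-10-08", 14, "/index.html"),
    ("2013-10-09", 9, "/index.html"),
    ("2013-10-09", 9, "/index.html"),
]

DIMENSIONS = ("day", "hour", "url")

# Base cuboid: hit counts at the finest granularity (day, hour, url).
cube = Counter(rows)

def roll_up(cube, keep):
    """Aggregate the cube onto the dimensions named in `keep`."""
    idx = [DIMENSIONS.index(d) for d in keep]
    result = Counter()
    for cell, hits in cube.items():
        result[tuple(cell[i] for i in idx)] += hits
    return result

def slice_cube(cube, dimension, value):
    """Keep only the cells where `dimension` equals `value`."""
    i = DIMENSIONS.index(dimension)
    return Counter({cell: hits for cell, hits in cube.items() if cell[i] == value})

hits_per_day = roll_up(cube, ["day"])                # roll-up to day level
hits_per_day_hour = roll_up(cube, ["day", "hour"])   # finer view: day and hour
index_only = slice_cube(cube, "url", "/index.html")  # slice on one URL

print(dict(hits_per_day))
print(dict(hits_per_day_hour))
print(sum(index_only.values()))
```

Moving from `hits_per_day` to `hits_per_day_hour` corresponds to a drill-down; the reverse direction is a roll-up.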
Web log -> [data cleaning] -> database -> [data cube creation] -> data cube -> [OLAP] -> sliced and diced cube -> [data mining] -> knowledge (patterns)
Hubs.
Authorities.
Mutually reinforcing relationship.
Finding authoritative web pages.
Hyperlinks can be used to infer the notion of authority.
[Figure: Hub-Authority Relations, with groups labeled Hubs and Authorities]
HITS stands for Hyperlink-Induced Topic Search.
It explores interactions between hubs and authoritative pages:
– Expand the root set into a base set.
– Apply weight propagation.
Systems based on hyperlink analysis such as HITS:
– e.g. Google (whose PageRank is a related link-analysis algorithm, not HITS itself).
Difficulties from ignoring textual contexts:
– Drifting: when hubs contain multiple topics.
– Topic hijacking: when many pages from a single web site
  point to the same single popular site.
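The weight-propagation step can be sketched as follows. This is a minimal illustration that assumes the base set is already available as an adjacency dictionary; the page names, the fixed iteration count, and the `hits` function are assumptions for the example, not part of the original slides.

```python
import math

def hits(links, iterations=50):
    """Return (hub, authority) scores for a graph given as {page: [pages it links to]}."""
    pages = set(links) | {t for targets in links.values() for t in targets}
    hub = {p: 1.0 for p in pages}
    auth = {p: 1.0 for p in pages}
    for _ in range(iterations):
        # Authority update: sum the hub scores of the pages linking in.
        auth = {p: 0.0 for p in pages}
        for src, targets in links.items():
            for t in targets:
                auth[t] += hub[src]
        # Hub update: sum the authority scores of the pages linked to.
        hub = {p: sum(auth[t] for t in links.get(p, [])) for p in pages}
        # Normalize both score vectors so the values stay bounded.
        for scores in (auth, hub):
            norm = math.sqrt(sum(v * v for v in scores.values())) or 1.0
            for p in scores:
                scores[p] /= norm
    return hub, auth

# Hypothetical base set: three hub-like pages pointing at two candidate authorities.
base_set = {
    "hub1": ["authA", "authB"],
    "hub2": ["authA", "authB"],
    "hub3": ["authA"],
    "authA": [],
    "authB": [],
}
hub, auth = hits(base_set)
print(max(auth, key=auth.get))
print(sorted(hub, key=hub.get, reverse=True)[:2])
```

Good hubs point to good authorities and good authorities are pointed to by good hubs, which is the mutually reinforcing relationship; because the scores use links only, the drifting and hijacking problems listed above are not addressed by this sketch.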
Improve web server system performance.
Improve site design.
Intrusion detection.
Predict users' actions.
Enhance the quality and delivery of Internet information
services to the end user.
Facilitate adaptive sites and personalization.
