SlideShare une entreprise Scribd logo
1  sur  30
WEB MINING
• Created by-
• SUSHIL KASAR.
RATIONALE
WEB is a rich source of knowledge that can be useful to
many application.
Mining means extracting something useful or value able
from a baser substance, such as mining gold from the earth.
Source..? Billions of web pages and Billions of visitors and
contributors.
Knowledge..?the hyperlink structure and diversity of
languages.
Purpose..?To improve users efficiency and effectiveness in
searching for information on the web. "World wide web,
internet“.
INTRODUCTION.
The information on the internet is in the form of static and
dynamic web pages of various areas from education,
industry to every walk of life including blogs.
• As per the web sites’ survey more than 160,000,000 web
sites are having inter, intra linked web pages. The speed
of web information is rapid.
• The way the web sites and web pages are accessed , it is
useful from the business perspective for giving future
directions for decision making.
• Data mining (sometimes called data or knowledge
discovery) is the process of analyzing data from different
perspectives and summarizing it into useful information.
• The Web Mining is an application of the data mining
techniques to find interesting and potentially useful
knowledge from web data.
Data Mining and Web Mining
 Data mining: turn data into knowledge.
 Web mining is to apply data mining techniques
to extract and uncover knowledge from web
documents and services. ie.web data.
 Eg-Facebook
recommendations,friends,place,pages..etc.is an
part of web mining.
WEB MINING.ImageFile
Web mining is the application of data mining
techniques to extract knowledge from web data.
Web Data is,
1.web content- text, image,record etc.
www.web_content.image.com
2.web structure-hyperlinks,tags etc.
www.web.structureimage.com
3.web usage-http logs, app server logs, etc.
www.web.usage.com
Note-above are the hyperlinks,you add your own
with supportive images.
FEATURES and benefits
1.Information filtering techniques try to learn
about users’ interests based on their
evaluation and actions, and then to use this
information to analyze new documents.
2.It Increase the value of each visitor.
Improve the visitor’s experience at the
websites.
3.WEB MINING allows you to look for pattern in
data through content mining,structure mining
and usage mining.
4.Web mining is attractive for companies,
because of several advantages.
In the most general sense it can contribute to
the increase of profit. By actually selling more
products or services or by Minimizing the
costs. In order to do this marketing
intelligence is requried.This intelligence can
focus on marketing stratergies and
competative analyses or on the relationship of
the customers.
CHALLENGES IN WEB MINING.
• Information is Huge.
• Information is diverse.
• Information is redundant.
How Web Mining is Going to Make
the Users Life Easy?
• “Solution for Business Decision Problems of E-Commerce for Retailer’s
Web Site Solved Using Web Mining.”
• In an e-commerce web site a reduction in user’s web site behavior
analysis time and web site usage, trends will be a value addition
provided by web mining. This e-commerce web site, required to develop
a web data mining system for business users and data analysts as an end
to end solution comprising of data gathering, cleansing, ETL operations,
warehousing.
• The business intelligence systems created user friendly, flexible,
dynamic, multidimensional factual reporting, supported by visualization,
and web data mining techniques.
• In the e-commerce web site the data gathering and data sources includes
not only customer registration and demographic information but also
web click-streams, response to direct-mail, email campaigns, and orders
placed through a website, call center amongst the other sources. The
quantity of data can vary above 100 million records.
• The E-commerce Web Site Architecture can collect
additional click stream data besides the data in the web
logs, web logs have sensitive information about customer’s
login, session information, IP addresses indicating the area,
region they belong to and their age, frequency of using the
web site etc.
• The focus on Business to Customer(B2C) e-commerce for
retailers helps in understanding and fulfilling the business
needs to develop the required expertise and design out of
the box reports and analysis of the domain’s future trends
and patterns of customer behavior understand in a better
way to the business user.
• It can answer the business questions such as to identify
heavy spenders at the web site, which are the customers
who express willingness to receive emails from the web site
are heavy spenders? Such answers reflect the customer’s
loyalty, based on these results promotion offers and
discount offers can also be derived by the business decision
maker and possibility to increase in customers can be
increases for registrations to web site.
Solution to the Search Engine Problems and
How Web Mining Can Help in Improving
the Business Decisions.
• As the search engines use enormous information
existing in the web sites, web pages, it is a
challenging task to engineer, implement and to
improvise the search engine.
• This specifies that indexing of web pages involves a
huge task.
• Per day tens of millions of queries are given to search
engine.
• It helps in problems of how to effectively deal
with uncontrolled hypertext collection where
anyone can publish anything they want. Web
Mining Applications have been used by these
web sites such as Web search e.g., Google and
Yahoo , Web Vertical Search e.g., FatLens and
Become, Web Recommendations e.g.,
Amazon.com , Web Advertising e.g., Google and
Yahoo,
• Web site design e.g., landing page optimization.
The Various Business Areas Where Web
Mining has Helped in Improving
the Business Decision Making.
• E-Business-Analysis of click-stream data(data
regarding to web browsing) i.e. web mining uncovers
real-time e-business opportunities across geography.
It provides ways to target right customers and
understand their needs and to customize services
and strategies in near-or-real time. The area of
advertising is no exception for utilizing the
opportunities provided by online customer analytics
to promote right products in real time to the right
customer. It also helps in effectiveness of a web site
as a channel for marketing by quantifying the user’s
behavior while on the web site.
• CRM-Analytical CRM utilizes business intelligence and reporting
methodologies such as data mining and analytical processing to
CRM applications. While the earlier CRM implementations focus on
improving operational efficiencies in the sales and service functions
through tailor-made solutions for call-center management,
analytical CRM solutions use intelligence solutions to analyze the
data, identify the demographic profiles and measure the purchase
frequency and other behavioral patterns of the customers.
• With the amount of available online content, today organizations
put premium on understanding, adopting and managing the same,
convert them into appropriate knowledge suitable to serve their
customers better, and thus improve the operations and accelerate
the process of delivery of products to markets. The WorldWide Web
is a fertile area for web Mining and it can provide applications,
methods, algorithms to be beneficial in various real-world
applications with respect to the critical e-CRM function.
• Customer Behavior-Web Mining helps in
understanding the concerns such as current and
future probability of every customer, relationship
between behavior and the loyalty at the website
The models based on customer-centric web
behavior can be used not only for identifying
improvements in the appeal of web site
segmentation, which are based on web behavior
providing a precise basis for personalization but
also for predicting customer’s future behavior
that is essential for website content planning and
design.
• Cross Selling-Web Data mining usage which will allow to
cross- sell into web store application with a minimal effort.
• Web Site Service Quality Improvement-The World Wide
Web is one of the most used interfaces to access remote
data and commercial, noncommercial services and the
number of actors involved in these transactions is growing
very quickly.
• Everyone using the Web Experiences knows that how the
connection to a popular website may be very slow during
rush hours and it is well known that web users tend to
leave a site if the wait time for a page to be served exceeds
a given value. Therefore, performance and service quality
attributes have gained enormous relevance in service
design and deployment.
TYPES OF WEB MINING
WEB MINING
Web usage
mining
Web
structure
mining
Web content
mining
WEB USAGE MINING
• Web usage mining is the process of
extracting useful information from server
logs. i.e.User’s history.
• Web usage mining is the process of finding
out what users are looking on internet.
• User identification, session creation, robot
detection.
• And filtering, and extracting usage path
patterns.
WEB STRUCTURE MINING
• The structure of a typical Web graph consists of
Web pages as nodes, and hyperlinks as edges
connecting between two related pages.
• Web Structure Mining is the process of
discovering structure information from the Web.
-This type of mining can be performed either at the
(intra-page) document level or at the (inter-page)
hyperlink level.
The research at the hyperlink level is also called
Hyperlink Analysis.
WEB STRUCTURE TERMINOLOGY.
• Web-graph: A directed graph that represents
the Web.
• ‰
Node: Each Web page is a node of the Web-
graph.
• ‰
Link: Each hyperlink on the Web is a directed
edge of the Web-graph.
WEB CONTENT MINING
• Web Content Mining is the process of
extracting useful information from the
contents of Web documents.
• Content data corresponds to the collection of
facts a Web page was designed to convey to
the users.
• It may consist of text, images, audio, video, or
structured records such as lists and tables.
• The web content mining is differentiated from
two different points of view. Information
Retrieval View and Database View.
Common Mining Techniques
• The more basic and popular mining techniques
include:
• ™
Classification
• ™
Clustering
• ™
Associations.
The other significant ideas:
• ™
Topic Identification, tracking and drift analysis
• ™
Concept hierarchy creation
• ™
Relevance of content.
• Document Classification.
“Supervised” technique.
Categories are defined and documents are
assigned to one or more existing categories.
The “definition” of a category is usually in the
form of a term that is produced during a
“training” phase.
Training is performed through the use of
documents that have already been classified
(often by hand) as belonging to a category.
• Document Clustering.
• “Unsupervised” technique.
• Document clustering is the act of collecting
similar documents into bins, where similarity is
some function on a document.
• Clustering are performed online as well offline.
• Document clustering involves the use of
descriptors and descriptor extraction. Descriptors
are sets of words that describe the contents
within the cluster. Document clustering is
generally considered to be a centralized process.
Examples of document clustering include web
document clustering for search users.
• Topic Identification and Tracking.
• Combination of Clustering and Classification.
• As new documents are added to a collection.
• An attempt is made to assign each document
to an existing topic (category).
• The collection is also checked for the
emergence of new topics. the drift in the
topics are also identified.
• Concept Hierarchy Creation.
• Creation of concept hierarchies is important
to understand the category and sub categories
a document belongs to Key Factors.
Organization of categories. e.g. Flat, Tree, or
Network.
• Maximum number of categories per
document.
• Category Dimensions e.g. Subject, Location,
Time, Alphabetical, Numerical.
APPLICATION AREAS:-
• WEB MINING in E-commerce and services.
1. www.e-commerce&services
• WEB MINING in Advertising based sites.
2.www.Inadversting.com
• WEB MINING in Repositories Information.
3.www.InRepository.sushil'sgmailaccount.com
• Note-above are the hyperlinks,add supportive
images.
Conclusion
• In today’s era where the entire world has become a global village and the driving
force is internet having ebusiness to internet blogs to search engines, the major
questions in front of the business users is while they would like to retain the
existing customers and also would like to understand the patterns and trends of
customer behavior so that their decisions can be supported with facts represented
with visualizations and appropriate reporting made possible with web mining. The
success of accuracy of deriving patterns is directly proportional to the amount of
sample data used for the data mining techniques.
• The advantages of using web mining in search engines and e-commerce, CRM,
customer behavior analysis, cross selling; web site service quality improvement is
noticeable. The recommendation of using web mining techniques can be applied
successfully with a keen analysis of clearly understood business needs and
requirements. Also one more governing factor is the amount of data, as the data is
voluminous the results can be more towards the correct trends and patterns to be
predicted from the given set of data.
• But although the web mining techniques can be applied to even the small web
sites with a few number of web pages and links within them, web mining may not
be the answer for its improvement as it will not be the optimum solution as far as
the cost factor in terms of parameters such as complexity of web mining
techniques using algorithms may not be recommended.
• Possible applications can be On-line social networking community software
applications can use web mining techniques to explore the effectiveness of on-line
networking, also areas such as knowledgemanagement web sites and web mining
can also be useful in bioinformatics, e-governance and e-learning.
WEB MINING.

Contenu connexe

Tendances

Web Mining & Text Mining
Web Mining & Text MiningWeb Mining & Text Mining
Web Mining & Text MiningHemant Sharma
 
Web Scraping and Data Extraction Service
Web Scraping and Data Extraction ServiceWeb Scraping and Data Extraction Service
Web Scraping and Data Extraction ServicePromptCloud
 
Web mining (structure mining)
Web mining (structure mining)Web mining (structure mining)
Web mining (structure mining)Amir Fahmideh
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on irPrimya Tamil
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notesAnandh Arumugakan
 
Query processing-and-optimization
Query processing-and-optimizationQuery processing-and-optimization
Query processing-and-optimizationWBUTTUTORIALS
 
Web search engines ( Mr.Mirza )
Web search engines ( Mr.Mirza )Web search engines ( Mr.Mirza )
Web search engines ( Mr.Mirza )Ali Saif Mirza
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMSai Kumar Ale
 
Link analysis : Comparative study of HITS and Page Rank Algorithm
Link analysis : Comparative study of HITS and Page Rank AlgorithmLink analysis : Comparative study of HITS and Page Rank Algorithm
Link analysis : Comparative study of HITS and Page Rank AlgorithmKavita Kushwah
 

Tendances (20)

Web Mining & Text Mining
Web Mining & Text MiningWeb Mining & Text Mining
Web Mining & Text Mining
 
Web Scraping and Data Extraction Service
Web Scraping and Data Extraction ServiceWeb Scraping and Data Extraction Service
Web Scraping and Data Extraction Service
 
Web mining
Web mining Web mining
Web mining
 
Web mining (1)
Web mining (1)Web mining (1)
Web mining (1)
 
Web mining (structure mining)
Web mining (structure mining)Web mining (structure mining)
Web mining (structure mining)
 
Web search vs ir
Web search vs irWeb search vs ir
Web search vs ir
 
Web usage-mining
Web usage-miningWeb usage-mining
Web usage-mining
 
The impact of web on ir
The impact of web on irThe impact of web on ir
The impact of web on ir
 
Web Information Retrieval and Mining
Web Information Retrieval and MiningWeb Information Retrieval and Mining
Web Information Retrieval and Mining
 
Web Mining
Web Mining Web Mining
Web Mining
 
CS6007 information retrieval - 5 units notes
CS6007   information retrieval - 5 units notesCS6007   information retrieval - 5 units notes
CS6007 information retrieval - 5 units notes
 
web mining
web miningweb mining
web mining
 
Seo and page rank algorithm
Seo and page rank algorithmSeo and page rank algorithm
Seo and page rank algorithm
 
Query processing-and-optimization
Query processing-and-optimizationQuery processing-and-optimization
Query processing-and-optimization
 
Web search engines ( Mr.Mirza )
Web search engines ( Mr.Mirza )Web search engines ( Mr.Mirza )
Web search engines ( Mr.Mirza )
 
Web crawler
Web crawlerWeb crawler
Web crawler
 
Web mining
Web miningWeb mining
Web mining
 
WEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEMWEB BASED INFORMATION RETRIEVAL SYSTEM
WEB BASED INFORMATION RETRIEVAL SYSTEM
 
Link analysis : Comparative study of HITS and Page Rank Algorithm
Link analysis : Comparative study of HITS and Page Rank AlgorithmLink analysis : Comparative study of HITS and Page Rank Algorithm
Link analysis : Comparative study of HITS and Page Rank Algorithm
 
Spatial databases
Spatial databasesSpatial databases
Spatial databases
 

Similaire à WEB MINING.

Website Development and Business Growth-Leveragingthe Digital Landscape.pptx
Website Development and Business Growth-Leveragingthe Digital Landscape.pptxWebsite Development and Business Growth-Leveragingthe Digital Landscape.pptx
Website Development and Business Growth-Leveragingthe Digital Landscape.pptxClaraM27
 
Ecommerce by bhawani nandan prasad
Ecommerce by bhawani nandan prasadEcommerce by bhawani nandan prasad
Ecommerce by bhawani nandan prasadBhawani N Prasad
 
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...bzebian
 
web analysis document | web analysis document | web analysis document | web a...
web analysis document | web analysis document | web analysis document | web a...web analysis document | web analysis document | web analysis document | web a...
web analysis document | web analysis document | web analysis document | web a...nazen2
 
Personal Web Usage Mining
Personal Web Usage MiningPersonal Web Usage Mining
Personal Web Usage MiningDaminda Herath
 
Personal web usage mining
Personal web usage miningPersonal web usage mining
Personal web usage miningDaminda Herath
 
Webtrends presentation
Webtrends presentationWebtrends presentation
Webtrends presentationcitsweb
 
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketer
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B MarketerA Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketer
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketerskijumpman
 
AB1401-SEM5-GROUP4
AB1401-SEM5-GROUP4AB1401-SEM5-GROUP4
AB1401-SEM5-GROUP4Jeremy Chia
 
ACCELERATED DIGITAL MARKETING PROGRAME 2018
ACCELERATED DIGITAL MARKETING PROGRAME 2018ACCELERATED DIGITAL MARKETING PROGRAME 2018
ACCELERATED DIGITAL MARKETING PROGRAME 2018Stephen Dube
 
Using Google Analytics To Market Your Software Idea
Using Google Analytics To Market Your Software IdeaUsing Google Analytics To Market Your Software Idea
Using Google Analytics To Market Your Software IdeaPierre DeBois
 
Web mining and social media mining
Web mining and social media miningWeb mining and social media mining
Web mining and social media miningRoxana Tadayon
 
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...IOSR Journals
 
Business Intelligence: A Rapidly Growing Option through Web Mining
Business Intelligence: A Rapidly Growing Option through Web  MiningBusiness Intelligence: A Rapidly Growing Option through Web  Mining
Business Intelligence: A Rapidly Growing Option through Web MiningIOSR Journals
 

Similaire à WEB MINING. (20)

Web mining
Web miningWeb mining
Web mining
 
Website Development and Business Growth-Leveragingthe Digital Landscape.pptx
Website Development and Business Growth-Leveragingthe Digital Landscape.pptxWebsite Development and Business Growth-Leveragingthe Digital Landscape.pptx
Website Development and Business Growth-Leveragingthe Digital Landscape.pptx
 
Ecommerce by bhawani nandan prasad
Ecommerce by bhawani nandan prasadEcommerce by bhawani nandan prasad
Ecommerce by bhawani nandan prasad
 
clickstream analysis
 clickstream analysis clickstream analysis
clickstream analysis
 
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...
Lavacon 2012 How Documentation Teams Can Use Web Analytics to Expand their Co...
 
Pxc3893553
Pxc3893553Pxc3893553
Pxc3893553
 
web analysis document | web analysis document | web analysis document | web a...
web analysis document | web analysis document | web analysis document | web a...web analysis document | web analysis document | web analysis document | web a...
web analysis document | web analysis document | web analysis document | web a...
 
Personal Web Usage Mining
Personal Web Usage MiningPersonal Web Usage Mining
Personal Web Usage Mining
 
Personal web usage mining
Personal web usage miningPersonal web usage mining
Personal web usage mining
 
Webtrends presentation
Webtrends presentationWebtrends presentation
Webtrends presentation
 
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketer
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B MarketerA Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketer
A Practitioner’s Guide to Web Analytics: Designed for the B-to-B Marketer
 
AB1401-SEM5-GROUP4
AB1401-SEM5-GROUP4AB1401-SEM5-GROUP4
AB1401-SEM5-GROUP4
 
Web usage mining
Web usage miningWeb usage mining
Web usage mining
 
ACCELERATED DIGITAL MARKETING PROGRAME 2018
ACCELERATED DIGITAL MARKETING PROGRAME 2018ACCELERATED DIGITAL MARKETING PROGRAME 2018
ACCELERATED DIGITAL MARKETING PROGRAME 2018
 
Using Google Analytics To Market Your Software Idea
Using Google Analytics To Market Your Software IdeaUsing Google Analytics To Market Your Software Idea
Using Google Analytics To Market Your Software Idea
 
Web mining and social media mining
Web mining and social media miningWeb mining and social media mining
Web mining and social media mining
 
Web
WebWeb
Web
 
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...Performance of Real Time Web Traffic Analysis Using Feed  Forward Neural Netw...
Performance of Real Time Web Traffic Analysis Using Feed Forward Neural Netw...
 
Web analytics
Web analyticsWeb analytics
Web analytics
 
Business Intelligence: A Rapidly Growing Option through Web Mining
Business Intelligence: A Rapidly Growing Option through Web  MiningBusiness Intelligence: A Rapidly Growing Option through Web  Mining
Business Intelligence: A Rapidly Growing Option through Web Mining
 

Plus de Sushil kasar

Microsoft Power BI
Microsoft Power BIMicrosoft Power BI
Microsoft Power BISushil kasar
 
ITIL- Service operations- Incident management
ITIL- Service operations- Incident management ITIL- Service operations- Incident management
ITIL- Service operations- Incident management Sushil kasar
 
Data visualization
Data visualizationData visualization
Data visualizationSushil kasar
 
Sas Statistical Analysis System
Sas Statistical Analysis SystemSas Statistical Analysis System
Sas Statistical Analysis SystemSushil kasar
 
Datawarehousing_Project_using_MS SQL server.
Datawarehousing_Project_using_MS SQL server.Datawarehousing_Project_using_MS SQL server.
Datawarehousing_Project_using_MS SQL server.Sushil kasar
 
Financial statements.
Financial statements.Financial statements.
Financial statements.Sushil kasar
 

Plus de Sushil kasar (7)

Microsoft Power BI
Microsoft Power BIMicrosoft Power BI
Microsoft Power BI
 
ITIL- Service operations- Incident management
ITIL- Service operations- Incident management ITIL- Service operations- Incident management
ITIL- Service operations- Incident management
 
Data visualization
Data visualizationData visualization
Data visualization
 
Sas Statistical Analysis System
Sas Statistical Analysis SystemSas Statistical Analysis System
Sas Statistical Analysis System
 
Datawarehousing_Project_using_MS SQL server.
Datawarehousing_Project_using_MS SQL server.Datawarehousing_Project_using_MS SQL server.
Datawarehousing_Project_using_MS SQL server.
 
Android
Android Android
Android
 
Financial statements.
Financial statements.Financial statements.
Financial statements.
 

Dernier

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationRadu Cotescu
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024The Digital Insurer
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Servicegiselly40
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CVKhem
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Drew Madelung
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...Neo4j
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?Antenna Manufacturer Coco
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxKatpro Technologies
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEarley Information Science
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptxHampshireHUG
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slidespraypatel2
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfsudhanshuwaghmare1
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Scriptwesley chun
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Igalia
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonAnna Loughnan Colquhoun
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024Results
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘RTylerCroy
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfEnterprise Knowledge
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking MenDelhi Call girls
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Miguel Araújo
 

Dernier (20)

Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
CNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of ServiceCNv6 Instructor Chapter 6 Quality of Service
CNv6 Instructor Chapter 6 Quality of Service
 
Real Time Object Detection Using Open CV
Real Time Object Detection Using Open CVReal Time Object Detection Using Open CV
Real Time Object Detection Using Open CV
 
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...Workshop - Best of Both Worlds_ Combine  KG and Vector search for  enhanced R...
Workshop - Best of Both Worlds_ Combine KG and Vector search for enhanced R...
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptxFactors to Consider When Choosing Accounts Payable Services Providers.pptx
Factors to Consider When Choosing Accounts Payable Services Providers.pptx
 
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptxEIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
EIS-Webinar-Prompt-Knowledge-Eng-2024-04-08.pptx
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Slack Application Development 101 Slides
Slack Application Development 101 SlidesSlack Application Development 101 Slides
Slack Application Development 101 Slides
 
Boost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdfBoost Fertility New Invention Ups Success Rates.pdf
Boost Fertility New Invention Ups Success Rates.pdf
 
Automating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps ScriptAutomating Google Workspace (GWS) & more with Apps Script
Automating Google Workspace (GWS) & more with Apps Script
 
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
Raspberry Pi 5: Challenges and Solutions in Bringing up an OpenGL/Vulkan Driv...
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024A Call to Action for Generative AI in 2024
A Call to Action for Generative AI in 2024
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdfThe Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
The Role of Taxonomy and Ontology in Semantic Layers - Heather Hedden.pdf
 
08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men08448380779 Call Girls In Friends Colony Women Seeking Men
08448380779 Call Girls In Friends Colony Women Seeking Men
 
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
 

WEB MINING.

  • 1. WEB MINING • Created by- • SUSHIL KASAR.
  • 2. RATIONALE WEB is a rich source of knowledge that can be useful to many application. Mining means extracting something useful or value able from a baser substance, such as mining gold from the earth. Source..? Billions of web pages and Billions of visitors and contributors. Knowledge..?the hyperlink structure and diversity of languages. Purpose..?To improve users efficiency and effectiveness in searching for information on the web. "World wide web, internet“.
  • 3. INTRODUCTION. The information on the internet is in the form of static and dynamic web pages of various areas from education, industry to every walk of life including blogs. • As per the web sites’ survey more than 160,000,000 web sites are having inter, intra linked web pages. The speed of web information is rapid. • The way the web sites and web pages are accessed , it is useful from the business perspective for giving future directions for decision making. • Data mining (sometimes called data or knowledge discovery) is the process of analyzing data from different perspectives and summarizing it into useful information. • The Web Mining is an application of the data mining techniques to find interesting and potentially useful knowledge from web data.
  • 4. Data Mining and Web Mining  Data mining: turn data into knowledge.  Web mining is to apply data mining techniques to extract and uncover knowledge from web documents and services. ie.web data.  Eg-Facebook recommendations,friends,place,pages..etc.is an part of web mining.
  • 5. WEB MINING.ImageFile Web mining is the application of data mining techniques to extract knowledge from web data. Web Data is, 1.web content- text, image,record etc. www.web_content.image.com 2.web structure-hyperlinks,tags etc. www.web.structureimage.com 3.web usage-http logs, app server logs, etc. www.web.usage.com Note-above are the hyperlinks,you add your own with supportive images.
  • 6. FEATURES and benefits 1.Information filtering techniques try to learn about users’ interests based on their evaluation and actions, and then to use this information to analyze new documents. 2.It Increase the value of each visitor. Improve the visitor’s experience at the websites. 3.WEB MINING allows you to look for pattern in data through content mining,structure mining and usage mining.
  • 7. 4.Web mining is attractive for companies, because of several advantages. In the most general sense it can contribute to the increase of profit. By actually selling more products or services or by Minimizing the costs. In order to do this marketing intelligence is requried.This intelligence can focus on marketing stratergies and competative analyses or on the relationship of the customers.
  • 8. CHALLENGES IN WEB MINING. • Information is Huge. • Information is diverse. • Information is redundant.
  • 9. How Web Mining is Going to Make the Users Life Easy? • “Solution for Business Decision Problems of E-Commerce for Retailer’s Web Site Solved Using Web Mining.” • In an e-commerce web site a reduction in user’s web site behavior analysis time and web site usage, trends will be a value addition provided by web mining. This e-commerce web site, required to develop a web data mining system for business users and data analysts as an end to end solution comprising of data gathering, cleansing, ETL operations, warehousing. • The business intelligence systems created user friendly, flexible, dynamic, multidimensional factual reporting, supported by visualization, and web data mining techniques. • In the e-commerce web site the data gathering and data sources includes not only customer registration and demographic information but also web click-streams, response to direct-mail, email campaigns, and orders placed through a website, call center amongst the other sources. The quantity of data can vary above 100 million records.
  • 10. • The E-commerce Web Site Architecture can collect additional click stream data besides the data in the web logs, web logs have sensitive information about customer’s login, session information, IP addresses indicating the area, region they belong to and their age, frequency of using the web site etc. • The focus on Business to Customer(B2C) e-commerce for retailers helps in understanding and fulfilling the business needs to develop the required expertise and design out of the box reports and analysis of the domain’s future trends and patterns of customer behavior understand in a better way to the business user. • It can answer the business questions such as to identify heavy spenders at the web site, which are the customers who express willingness to receive emails from the web site are heavy spenders? Such answers reflect the customer’s loyalty, based on these results promotion offers and discount offers can also be derived by the business decision maker and possibility to increase in customers can be increases for registrations to web site.
  • 11. Solution to the Search Engine Problems and How Web Mining Can Help in Improving the Business Decisions. • As the search engines use enormous information existing in the web sites, web pages, it is a challenging task to engineer, implement and to improvise the search engine. • This specifies that indexing of web pages involves a huge task. • Per day tens of millions of queries are given to search engine.
  • 12. • It helps in problems of how to effectively deal with uncontrolled hypertext collection where anyone can publish anything they want. Web Mining Applications have been used by these web sites such as Web search e.g., Google and Yahoo , Web Vertical Search e.g., FatLens and Become, Web Recommendations e.g., Amazon.com , Web Advertising e.g., Google and Yahoo, • Web site design e.g., landing page optimization.
  • 13. The Various Business Areas Where Web Mining has Helped in Improving the Business Decision Making. • E-Business-Analysis of click-stream data(data regarding to web browsing) i.e. web mining uncovers real-time e-business opportunities across geography. It provides ways to target right customers and understand their needs and to customize services and strategies in near-or-real time. The area of advertising is no exception for utilizing the opportunities provided by online customer analytics to promote right products in real time to the right customer. It also helps in effectiveness of a web site as a channel for marketing by quantifying the user’s behavior while on the web site.
  • 14. • CRM-Analytical CRM utilizes business intelligence and reporting methodologies such as data mining and analytical processing to CRM applications. While the earlier CRM implementations focus on improving operational efficiencies in the sales and service functions through tailor-made solutions for call-center management, analytical CRM solutions use intelligence solutions to analyze the data, identify the demographic profiles and measure the purchase frequency and other behavioral patterns of the customers. • With the amount of available online content, today organizations put premium on understanding, adopting and managing the same, convert them into appropriate knowledge suitable to serve their customers better, and thus improve the operations and accelerate the process of delivery of products to markets. The WorldWide Web is a fertile area for web Mining and it can provide applications, methods, algorithms to be beneficial in various real-world applications with respect to the critical e-CRM function.
  • 15. • Customer Behavior-Web Mining helps in understanding the concerns such as current and future probability of every customer, relationship between behavior and the loyalty at the website The models based on customer-centric web behavior can be used not only for identifying improvements in the appeal of web site segmentation, which are based on web behavior providing a precise basis for personalization but also for predicting customer’s future behavior that is essential for website content planning and design.
  • 16. • Cross Selling-Web Data mining usage which will allow to cross- sell into web store application with a minimal effort. • Web Site Service Quality Improvement-The World Wide Web is one of the most used interfaces to access remote data and commercial, noncommercial services and the number of actors involved in these transactions is growing very quickly. • Everyone using the Web Experiences knows that how the connection to a popular website may be very slow during rush hours and it is well known that web users tend to leave a site if the wait time for a page to be served exceeds a given value. Therefore, performance and service quality attributes have gained enormous relevance in service design and deployment.
  • 17. TYPES OF WEB MINING WEB MINING Web usage mining Web structure mining Web content mining
  • 18. WEB USAGE MINING • Web usage mining is the process of extracting useful information from server logs. i.e.User’s history. • Web usage mining is the process of finding out what users are looking on internet. • User identification, session creation, robot detection. • And filtering, and extracting usage path patterns.
  • 19. WEB STRUCTURE MINING • The structure of a typical Web graph consists of Web pages as nodes, and hyperlinks as edges connecting between two related pages. • Web Structure Mining is the process of discovering structure information from the Web. -This type of mining can be performed either at the (intra-page) document level or at the (inter-page) hyperlink level. The research at the hyperlink level is also called Hyperlink Analysis.
  • 20. WEB STRUCTURE TERMINOLOGY. • Web-graph: A directed graph that represents the Web. • ‰ Node: Each Web page is a node of the Web- graph. • ‰ Link: Each hyperlink on the Web is a directed edge of the Web-graph.
  • 21. WEB CONTENT MINING • Web Content Mining is the process of extracting useful information from the contents of Web documents. • Content data corresponds to the collection of facts a Web page was designed to convey to the users. • It may consist of text, images, audio, video, or structured records such as lists and tables.
  • 22. • The web content mining is differentiated from two different points of view. Information Retrieval View and Database View.
  • 23. Common Mining Techniques • The more basic and popular mining techniques include: • ™ Classification • ™ Clustering • ™ Associations. The other significant ideas: • ™ Topic Identification, tracking and drift analysis • ™ Concept hierarchy creation • ™ Relevance of content.
  • 24. • Document Classification. “Supervised” technique. Categories are defined and documents are assigned to one or more existing categories. The “definition” of a category is usually in the form of a term that is produced during a “training” phase. Training is performed through the use of documents that have already been classified (often by hand) as belonging to a category.
  • 25. • Document Clustering. • “Unsupervised” technique. • Document clustering is the act of collecting similar documents into bins, where similarity is some function on a document. • Clustering are performed online as well offline. • Document clustering involves the use of descriptors and descriptor extraction. Descriptors are sets of words that describe the contents within the cluster. Document clustering is generally considered to be a centralized process. Examples of document clustering include web document clustering for search users.
  • 26. • Topic Identification and Tracking. • Combination of Clustering and Classification. • As new documents are added to a collection. • An attempt is made to assign each document to an existing topic (category). • The collection is also checked for the emergence of new topics. the drift in the topics are also identified.
  • 27. • Concept Hierarchy Creation. • Creation of concept hierarchies is important to understand the category and sub categories a document belongs to Key Factors. Organization of categories. e.g. Flat, Tree, or Network. • Maximum number of categories per document. • Category Dimensions e.g. Subject, Location, Time, Alphabetical, Numerical.
  • 28. APPLICATION AREAS:- • WEB MINING in E-commerce and services. 1. www.e-commerce&services • WEB MINING in Advertising based sites. 2.www.Inadversting.com • WEB MINING in Repositories Information. 3.www.InRepository.sushil'sgmailaccount.com • Note-above are the hyperlinks,add supportive images.
  • 29. Conclusion • In today’s era where the entire world has become a global village and the driving force is internet having ebusiness to internet blogs to search engines, the major questions in front of the business users is while they would like to retain the existing customers and also would like to understand the patterns and trends of customer behavior so that their decisions can be supported with facts represented with visualizations and appropriate reporting made possible with web mining. The success of accuracy of deriving patterns is directly proportional to the amount of sample data used for the data mining techniques. • The advantages of using web mining in search engines and e-commerce, CRM, customer behavior analysis, cross selling; web site service quality improvement is noticeable. The recommendation of using web mining techniques can be applied successfully with a keen analysis of clearly understood business needs and requirements. Also one more governing factor is the amount of data, as the data is voluminous the results can be more towards the correct trends and patterns to be predicted from the given set of data. • But although the web mining techniques can be applied to even the small web sites with a few number of web pages and links within them, web mining may not be the answer for its improvement as it will not be the optimum solution as far as the cost factor in terms of parameters such as complexity of web mining techniques using algorithms may not be recommended. • Possible applications can be On-line social networking community software applications can use web mining techniques to explore the effectiveness of on-line networking, also areas such as knowledgemanagement web sites and web mining can also be useful in bioinformatics, e-governance and e-learning.