Microsoft Sequence Clustering and Association Rules
Overview
Introduction
DMX queries
Interpreting the sequence clustering model
Microsoft Sequence Clustering algorithm principles and parameters
Markov chain model
Introduction to Microsoft Association Rules
Association algorithm principles and parameters
Microsoft Sequence Clustering and Association Rules
The Microsoft Sequence Clustering algorithm is a sequence analysis algorithm provided by Microsoft SQL Server Analysis Services. The algorithm finds the most common sequences by grouping, or clustering, sequences that are similar.
Examples:
Data that describes the click paths that are created when users navigate or browse a Web site.
Data that describes the order in which a customer adds items to a shopping cart at an online retailer.
DMX Queries
By querying the data mining schema rowset, you can find various kinds of information about the model, such as:
Basic metadata
The date and time that the model was created and last processed
The name of the mining structure that the model is based on
The column used as the predictable attribute
DMX Queries
The following query returns the parameters that were used to build and train the sample model:
SELECT MINING_PARAMETERS
FROM $system.DMSCHEMA_MINING_MODELS
WHERE MODEL_NAME = 'Sequence Clustering'
DMX Queries: Getting a List of Sequences for a State
The following query returns the complete list of first states in the model, before the sequences are grouped into clusters. It returns the list of sequences (NODE_TYPE = 13) that have the model root node as parent (PARENT_UNIQUE_NAME = 0). The FLATTENED keyword makes the results easier to read.
SELECT FLATTENED NODE_UNIQUE_NAME,
  (SELECT ATTRIBUTE_VALUE AS [Product 1],
     [Support] AS [Sequence Support],
     [Probability] AS [Sequence Probability]
   FROM NODE_DISTRIBUTION) AS t
FROM [Sequence Clustering].CONTENT
WHERE NODE_TYPE = 13
AND [PARENT_UNIQUE_NAME] = 0
A sample result of this query is shown in the next figure.
DMX Queries
You reference the value returned for NODE_UNIQUE_NAME to get the ID of the node that contains all sequences for the model. You pass this value to the query as the ID of the parent node, to get only the transitions included in this node, which happens to contain a list of all sequences for the model.
Interpreting the sequence clustering model
A sequence clustering model has a single parent node that represents the model and its metadata. The parent node, which is labeled (All), has a related sequence node that lists all the transitions that were detected in the training data.
The algorithm also creates a number of clusters, based on the transitions that were found in the data and any other input attributes included when creating the model. Each cluster contains its own sequence node that lists only the transitions that were used in generating that specific cluster.
Microsoft Sequence Clustering Algorithm Principles
The Microsoft Sequence Clustering algorithm is a hybrid algorithm that combines clustering techniques with Markov chain analysis to identify clusters and their sequences. It works on sequence data, which typically represents a series of events or transitions between states in a dataset.
The algorithm examines all transition probabilities and measures the differences, or distances, between all the possible sequences in the dataset to determine which sequences are the best to use as inputs for clustering.
After the algorithm has created the list of candidate sequences, it uses the sequence information as an input for the EM (Expectation Maximization) method of clustering.
Markov chain model
In addition to its set of states, a Markov chain contains a matrix of transition probabilities. The transitions emanating from a given state define a distribution over the possible next states. The equation $P(x_i = G \mid x_{i-1} = A) = 0.15$ means that, given the current state A, the probability of the next state being G is 0.15.
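A transition matrix of this kind can be estimated from observed sequences by simple counting. The sketch below is only an illustration of that idea, not the way Analysis Services builds its model; the function name `estimate_transitions` and the click paths are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(sequences):
    """Estimate P(next state | current state) by counting adjacent pairs."""
    counts = defaultdict(Counter)
    for seq in sequences:
        # zip(seq, seq[1:]) yields every (current, next) pair in the sequence
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1

    transitions = {}
    for current, next_counts in counts.items():
        total = sum(next_counts.values())
        # Normalize each row so that it is a distribution over next states
        transitions[current] = {s: c / total for s, c in next_counts.items()}
    return transitions


# Hypothetical click paths through a Web site
paths = [
    ["Home", "Products", "Cart"],
    ["Home", "Products", "Home"],
    ["Home", "Search", "Products", "Cart"],
    ["Home", "Products", "Cart"],
]

matrix = estimate_transitions(paths)
for state, row in matrix.items():
    print(state, row)
```

Each row of `matrix` sums to 1, which is the "distribution over the possible next states" described above.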
Markov chain model
Based on the Markov chain, for any given sequence $x = \{x_1, x_2, x_3, \ldots, x_L\}$ of length $L$, you can calculate the probability of the sequence as follows:
$$P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1}, \ldots, x_1)\, P(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots P(x_1)$$
In a first-order Markov chain, the probability of each state $x_i$ depends only on the previous state $x_{i-1}$:
$$P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1})\, P(x_{L-1} \mid x_{L-2}) \cdots P(x_2 \mid x_1)\, P(x_1)$$
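The first-order formula can be evaluated directly as a product of one initial probability and one transition probability per step. The following is a minimal sketch under stated assumptions: only the value P(G | A) = 0.15 comes from the slide, while the other probabilities, the `initial` distribution, and the function name `sequence_probability` are made up for illustration.

```python
def sequence_probability(sequence, initial, transitions):
    """P(x) = P(x1) * P(x2 | x1) * ... * P(xL | xL-1) for a first-order chain."""
    if not sequence:
        raise ValueError("sequence must contain at least one state")
    # Start with the probability of the first state, P(x1)
    prob = initial.get(sequence[0], 0.0)
    # Multiply by one transition probability per adjacent pair
    for prev, cur in zip(sequence, sequence[1:]):
        prob *= transitions.get(prev, {}).get(cur, 0.0)
    return prob


# Hypothetical two-state chain; only P(G | A) = 0.15 is taken from the text
initial = {"A": 0.5, "G": 0.5}
transitions = {
    "A": {"A": 0.85, "G": 0.15},
    "G": {"A": 0.40, "G": 0.60},
}

p = sequence_probability(["A", "G", "G"], initial, transitions)
print(p)  # P(A) * P(G | A) * P(G | G) = 0.5 * 0.15 * 0.60
```

Unseen states or transitions are given probability 0 here; a real model would more likely apply some form of smoothing.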
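Microsoft Sequence Clustering Parameters
CLUSTER_COUNT specifies the approximate number of clusters to be built by the algorithm. Setting the CLUSTER_COUNT parameter to 0 causes the algorithm to use heuristics to best determine the number of clusters to build. The default is 10.
MAXIMUM_STATES specifies the maximum number of states for a non-sequence attribute that the algorithm supports. The default is 100.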
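Microsoft Sequence Clustering Parameters
MINIMUM_SUPPORT specifies the minimum number of cases that is required in support of an attribute to create a cluster. The default is 10.
MAXIMUM_SEQUENCE_STATES specifies the maximum number of states that a sequence can have. The default is 64.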
Introduction to Microsoft Association Rules The Microsoft Association Rules Viewer in Microsoft SQL Server Analysis Services displays mining models that are built with the Microsoft Association algorithm. The Microsoft Association algorithm is an association algorithm provided by Analysis Services that is useful for recommendation engines.  A recommendation engine recommends products to customers based on items they have already bought, or in which they have indicated an interest.  The Microsoft Association algorithm is also useful for market basket analysis.
Structure of an Association Model The top level has a single node (Model Root) that represents the model.  The second level contains nodes that represent qualified item sets and rules.
Association Algorithm Principles
The Microsoft Association Rules algorithm belongs to the Apriori association family. The two steps in the Microsoft Association Rules algorithm are:
1. Find the frequent item sets.
2. Generate association rules based on the frequent item sets.
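The two steps can be sketched with a textbook Apriori-style search. This is a generic illustration, not Microsoft's implementation: the function names, the basket data, and the use of an absolute support count are all assumptions made for the example.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Step 1: level-wise search for itemsets whose support count >= min_support."""
    transactions = [frozenset(t) for t in transactions]
    items = {item for t in transactions for item in t}
    current = {frozenset([item]) for item in items}  # candidate 1-itemsets
    frequent = {}
    size = 1
    while current:
        counts = {c: sum(1 for t in transactions if c <= t) for c in current}
        level = {c: n for c, n in counts.items() if n >= min_support}
        frequent.update(level)
        size += 1
        # Join: combine frequent itemsets into candidates that are one item larger
        candidates = {a | b for a in level for b in level if len(a | b) == size}
        # Prune: keep a candidate only if all of its smaller subsets are frequent
        current = {
            c for c in candidates
            if all(frozenset(s) in level for s in combinations(c, size - 1))
        }
    return frequent


def association_rules(frequent, min_probability):
    """Step 2: derive rules lhs -> rhs with P(rhs | lhs) >= min_probability."""
    rules = []
    for itemset, support in frequent.items():
        for r in range(1, len(itemset)):
            for lhs in combinations(itemset, r):
                lhs = frozenset(lhs)
                # Every subset of a frequent itemset is itself frequent
                probability = support / frequent[lhs]
                if probability >= min_probability:
                    rules.append((lhs, itemset - lhs, probability))
    return rules


# Hypothetical market baskets
baskets = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk", "butter"},
    {"bread", "milk"},
]

found = frequent_itemsets(baskets, min_support=2)
rules = association_rules(found, min_probability=0.4)
for lhs, rhs, probability in rules:
    print(sorted(lhs), "->", sorted(rhs), round(probability, 3))
```

Counting the item sets dominates the cost because every candidate is checked against every transaction; generating the rules only reuses the counts already collected.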
Association Algorithm Parameters
MINIMUM_PROBABILITY is a threshold parameter. It defines the minimum probability for an association rule. Its value is within the range of 0 to 1. The default value is 0.4.
MINIMUM_IMPORTANCE is a threshold parameter for association rules. Rules with importance less than MINIMUM_IMPORTANCE are filtered out.
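To make the two thresholds concrete, the sketch below scores a rule A => B from raw counts and applies both filters. It assumes importance is defined as log(P(B | A) / P(B | not A)); the base-10 logarithm, the function names, the example counts, and the `min_importance` value of 0.0 are assumptions for illustration, and the server's own scoring may differ in detail.

```python
import math


def rule_importance(n_total, n_lhs, n_rhs, n_both):
    """log10(P(rhs | lhs) / P(rhs | not lhs)), computed from case counts."""
    if n_lhs == 0 or n_lhs == n_total:
        raise ValueError("lhs must occur in some, but not all, cases")
    p_rhs_given_lhs = n_both / n_lhs
    p_rhs_given_not_lhs = (n_rhs - n_both) / (n_total - n_lhs)
    if p_rhs_given_lhs == 0:
        return -math.inf
    if p_rhs_given_not_lhs == 0:
        return math.inf
    return math.log10(p_rhs_given_lhs / p_rhs_given_not_lhs)


def keep_rule(n_total, n_lhs, n_rhs, n_both,
              min_probability=0.4, min_importance=0.0):
    """Apply both thresholds; min_importance=0.0 is an illustrative choice."""
    probability = n_both / n_lhs
    importance = rule_importance(n_total, n_lhs, n_rhs, n_both)
    return probability >= min_probability and importance >= min_importance


# Hypothetical counts: 100 baskets, 40 contain A, 50 contain B, 30 contain both
print(rule_importance(100, 40, 50, 30))  # positive: A makes B more likely
print(keep_rule(100, 40, 50, 30))
```

An importance of 0 means the left-hand side carries no information about the right-hand side, and a negative value means the right-hand side is less likely when the left-hand side is present.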
Association Algorithm Parameters
MAXIMUM_ITEMSET_SIZE specifies the maximum size of an itemset. Setting the value to 0 means that there is no size limit on the itemset; Microsoft's technical reference lists the default as 3.
MINIMUM_ITEMSET_SIZE specifies the minimum size of an itemset. Microsoft's technical reference lists the default as 1.
MAXIMUM_ITEMSET_COUNT defines the maximum number of itemsets.
Association Algorithm Parameters
OPTIMIZED_PREDICTION_COUNT defines the number of items to be cached to optimize predictions.
AUTODETECT_MINIMUM_SUPPORT represents the sensitivity of the algorithm used to autodetect minimum support. To automatically detect the smallest appropriate value of minimum support, set this value to 1.0. To turn off autodetection, set this value to 0.
Summary
Introduction to sequence clustering
DMX queries
The sequence clustering model
Microsoft Sequence Clustering algorithm principles and parameters
Markov chain model
Introduction to Microsoft Association Rules
Association algorithm principles and parameters
Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net

Finology Group – Insurtech Innovation Award 2024Finology Group – Insurtech Innovation Award 2024
Finology Group – Insurtech Innovation Award 2024
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
GenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day PresentationGenCyber Cyber Security Day Presentation
GenCyber Cyber Security Day Presentation
 
What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?What Are The Drone Anti-jamming Systems Technology?
What Are The Drone Anti-jamming Systems Technology?
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men08448380779 Call Girls In Civil Lines Women Seeking Men
08448380779 Call Girls In Civil Lines Women Seeking Men
 
Data Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt RobisonData Cloud, More than a CDP by Matt Robison
Data Cloud, More than a CDP by Matt Robison
 
Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024Partners Life - Insurer Innovation Award 2024
Partners Life - Insurer Innovation Award 2024
 

MS SQL SERVER: Microsoft sequence clustering and association rules

  • 2. Overview: Introduction; DMX Queries; Interpreting the sequence clustering model; Microsoft Sequence Clustering Algorithm Principles and Parameters; Markov chain model; Introduction to Microsoft Association Rules; Association Algorithm Principles and Parameters.
  • 3. Microsoft Sequence Clustering and Association Rules. The Microsoft Sequence Clustering algorithm is a sequence analysis algorithm provided by Microsoft SQL Server Analysis Services. The algorithm finds the most common sequences by grouping, or clustering, sequences that are identical. Examples: data that describes the click paths that are created when users navigate or browse a Web site; data that describes the order in which a customer adds items to a shopping cart at an online retailer.
  • 4. DMX Queries. By querying the data mining schema rowset, you can find various kinds of information about the model, such as: basic metadata; the date and time that the model was created and last processed; the name of the mining structure that the model is based on; and the column used as the predictable attribute.
  • 5. DMX Queries. Query to return the parameters that were used to build and train the sample model: SELECT MINING_PARAMETERS FROM $system.DMSCHEMA_MINING_MODELS WHERE MODEL_NAME = 'Sequence Clustering'
  • 6. DMX Queries: Getting a List of Sequences for a State. Query to return the complete list of first states in the model, before the sequences are grouped into clusters: SELECT FLATTENED NODE_UNIQUE_NAME, (SELECT ATTRIBUTE_VALUE AS [Product 1], [Support] AS [Sequence Support], [Probability] AS [Sequence Probability] FROM NODE_DISTRIBUTION) AS t FROM [Sequence Clustering].CONTENT WHERE NODE_TYPE = 13 AND [PARENT_UNIQUE_NAME] = 0. The query returns the list of sequences (NODE_TYPE = 13) that have the model root node as parent (PARENT_UNIQUE_NAME = 0). The FLATTENED keyword makes the results easier to read. A sample result of this query is shown in the next figure.
  • 7. DMX Queries. You reference the value returned for NODE_UNIQUE_NAME to get the ID of the node that contains all sequences for the model. You pass this value to the query as the ID of the parent node to get only the transitions included in this node, which happens to contain a list of all sequences for the model.
  • 8. Interpreting the sequence clustering model. A sequence clustering model has a single parent node that represents the model and its metadata. The parent node, which is labeled (All), has a related sequence node that lists all the transitions that were detected in the training data. The algorithm also creates a number of clusters, based on the transitions that were found in the data and any other input attributes included when creating the model. Each cluster contains its own sequence node that lists only the transitions that were used in generating that specific cluster.
  • 9. Interpreting the sequence clustering model
  • 10. Microsoft Sequence Clustering Algorithm Principles. The Microsoft Sequence Clustering algorithm is a hybrid algorithm that combines clustering techniques with Markov chain analysis to identify clusters and their sequences. The sequence data it uses typically represents a series of events or transitions between states in a dataset. The algorithm examines all transition probabilities and measures the differences, or distances, between all the possible sequences in the dataset to determine which sequences are the best to use as inputs for clustering. After the algorithm has created the list of candidate sequences, it uses the sequence information as an input for the EM (Expectation Maximization) method of clustering.
  • 11. Markov chain model. A Markov chain contains a matrix of transition probabilities. The transitions emanating from a given state define a distribution over the possible next states. The equation $P(x_i = G \mid x_{i-1} = A) = 0.15$ means that, given the current state A, the probability of the next state being G is 0.15.
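To make the idea of a transition matrix concrete, here is a minimal Python sketch that stores each row as a distribution over next states. Only the value P(G | A) = 0.15 comes from the slide; the four state names and every other number are invented for illustration, and the helper names are hypothetical.

```python
# Hypothetical first-order transition matrix over four states.
# Only P(G | A) = 0.15 is taken from the slide; all other values are invented.
TRANSITIONS = {
    "A": {"A": 0.35, "C": 0.25, "G": 0.15, "T": 0.25},
    "C": {"A": 0.20, "C": 0.30, "G": 0.30, "T": 0.20},
    "G": {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25},
    "T": {"A": 0.10, "C": 0.40, "G": 0.20, "T": 0.30},
}


def next_state_probability(current, nxt, transitions=TRANSITIONS):
    """Return P(x_i = nxt | x_{i-1} = current)."""
    return transitions[current][nxt]


def is_row_stochastic(transitions, tol=1e-9):
    """Check that every row is a probability distribution (sums to 1)."""
    return all(abs(sum(row.values()) - 1.0) <= tol for row in transitions.values())
```

Each row must sum to 1 because, from any current state, the chain has to move to exactly one next state.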
  • 12. Markov chain model. Based on the Markov chain, for any given sequence $x = \{x_1, x_2, x_3, \ldots, x_L\}$ of length $L$, you can calculate the probability of the sequence as follows:
    $P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1}, \ldots, x_1)\, P(x_{L-1} \mid x_{L-2}, \ldots, x_1) \cdots P(x_1)$
    In a first-order Markov chain, the probability of each state $x_i$ depends only on the previous state $x_{i-1}$:
    $P(x) = P(x_L, x_{L-1}, \ldots, x_1) = P(x_L \mid x_{L-1})\, P(x_{L-1} \mid x_{L-2}) \cdots P(x_2 \mid x_1)\, P(x_1)$
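The first-order formula can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the Analysis Services implementation: the two states, the initial probabilities, the transition values, and the function name are all hypothetical.

```python
# Hypothetical model: initial-state probabilities and first-order transitions.
INITIAL = {"A": 0.5, "B": 0.5}
TRANSITIONS = {
    "A": {"A": 0.9, "B": 0.1},
    "B": {"A": 0.4, "B": 0.6},
}


def sequence_probability(seq, initial=INITIAL, transitions=TRANSITIONS):
    """P(x) = P(x_1) * product over i >= 2 of P(x_i | x_{i-1})."""
    if not seq:
        raise ValueError("sequence must contain at least one state")
    prob = initial[seq[0]]
    # Walk over consecutive (previous, current) pairs of states.
    for prev, cur in zip(seq, seq[1:]):
        prob *= transitions[prev][cur]
    return prob
```

For the sequence A, A, B the function multiplies P(A) = 0.5, P(A | A) = 0.9 and P(B | A) = 0.1, giving 0.045. For long sequences, summing log-probabilities instead of multiplying avoids numerical underflow.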
  • 15. Introduction to Microsoft Association Rules The Microsoft Association Rules Viewer in Microsoft SQL Server Analysis Services displays mining models that are built with the Microsoft Association algorithm. The Microsoft Association algorithm is an association algorithm provided by Analysis Services that is useful for recommendation engines. A recommendation engine recommends products to customers based on items they have already bought, or in which they have indicated an interest. The Microsoft Association algorithm is also useful for market basket analysis.
  • 16. Structure of an Association Model The top level has a single node (Model Root) that represents the model. The second level contains nodes that represent qualified item sets and rules.
  • 19. Association Algorithm Parameters. MINIMUM_PROBABILITY is a threshold parameter. It defines the minimum probability for an association rule. Its value is within the range of 0 to 1. The default value is 0.4. MINIMUM_IMPORTANCE is a threshold parameter for association rules. Rules with importance less than MINIMUM_IMPORTANCE are filtered out.
  • 20. Association Algorithm Parameters. MAXIMUM_ITEMSET_SIZE specifies the maximum size of an itemset. The default value is 0, which means that there is no size limit on the itemset. MINIMUM_ITEMSET_SIZE specifies the minimum size of the itemset. The default value is 0. MAXIMUM_ITEMSET_COUNT defines the maximum number of itemsets.
  • 21. Association Algorithm Parameters. OPTIMIZED_PREDICTION_COUNT defines the number of items to be cached to optimize predictions. AUTODETECT_MINIMUM_SUPPORT represents the sensitivity of the algorithm used to autodetect minimum support. To automatically detect the smallest appropriate value of minimum support, set this value to 1.0. To turn off autodetection, set this value to 0.
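The role of the threshold parameters can be illustrated with a small brute-force Python sketch. It is not the algorithm Analysis Services uses: the transactions are invented, the function names are hypothetical, and the arguments `min_size`, `max_size` and `min_probability` only mimic MINIMUM_ITEMSET_SIZE, MAXIMUM_ITEMSET_SIZE and MINIMUM_PROBABILITY. Here `max_size=None` stands in for the "no limit" setting, and importance is not computed.

```python
from itertools import combinations

# Hypothetical market-basket transactions, invented for illustration.
TRANSACTIONS = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Number of transactions that contain every item in itemset."""
    return sum(1 for t in transactions if itemset <= t)


def frequent_itemsets(transactions, min_support, min_size=1, max_size=None):
    """Brute-force enumeration of itemsets within the size limits."""
    items = sorted(set().union(*transactions))
    upper = len(items) if max_size is None else min(max_size, len(items))
    found = {}
    for size in range(max(min_size, 1), upper + 1):
        for combo in combinations(items, size):
            count = support(frozenset(combo), transactions)
            if count >= min_support:
                found[frozenset(combo)] = count
    return found


def rules(itemsets, transactions, min_probability=0.4):
    """Single-item-consequent rules lhs -> rhs with P(rhs | lhs) >= min_probability."""
    out = []
    for itemset, count in itemsets.items():
        if len(itemset) < 2:
            continue
        for rhs in itemset:
            lhs = itemset - {rhs}
            lhs_count = support(lhs, transactions)
            if lhs_count == 0:
                continue  # rule probability is undefined without supporting cases
            prob = count / lhs_count
            if prob >= min_probability:
                out.append((lhs, rhs, prob))
    return out
```

With a minimum support of 2, the pair {butter, milk} is dropped because it appears in only one transaction. Raising `min_probability` to 0.7 then keeps only the rule butter -> bread, since every basket containing butter also contains bread.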
  • 22. Summary: Introduction to sequence clustering; DMX Queries; The sequence clustering model; Microsoft Sequence Clustering Algorithm Principles and Parameters; Markov chain model; Introduction to Microsoft Association Rules; Association Algorithm Principles and Parameters.
  • 23. Visit more self help tutorials Pick a tutorial of your choice and browse through it at your own pace. The tutorials section is free, self-guiding and will not involve any additional support. Visit us at www.dataminingtools.net