Introduction to XLMiner™: Partition Data
XLMiner and Microsoft Office are registered trademarks of their respective owners.
Introduction to Partition Data
The data sets used in data mining are generally very large. One way to make them easier to mine is to partition the data, that is, to divide the data set into multiple partitions that are mutually exclusive: no data record appears in more than one partition. Partitioning generally results in 3 sets of data:
Training data set: This partition is used to build the mining model.
Validation data set: This partition is used to check whether the model built from the training set is accurate. It consists of records whose result (the value of the variable to be predicted) is already known, so the results produced by the model can be compared with the actual results.
Test data set: This partition is used to estimate how the model would perform on real-world data.
http://dataminingtools.net
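The idea of mutually exclusive training, validation and test partitions can be sketched in a few lines of Python. This is a minimal illustration only, not XLMiner's own procedure: the function name `partition`, the default ratios and the seed value are assumptions chosen for the example.

```python
import random

def partition(records, ratios=(0.5, 0.3, 0.2), seed=12345):
    """Randomly split records into disjoint training, validation and test lists.

    ratios and seed are illustrative defaults, not XLMiner settings.
    """
    if abs(sum(ratios) - 1.0) > 1e-9:
        raise ValueError("ratios must sum to 1")
    shuffled = list(records)
    # A seeded generator makes the random assignment repeatable.
    random.Random(seed).shuffle(shuffled)
    n = len(shuffled)
    n_train = round(n * ratios[0])
    n_valid = round(n * ratios[1])
    # Slicing one shuffled list guarantees that no record lands in two partitions.
    training = shuffled[:n_train]
    validation = shuffled[n_train:n_train + n_valid]
    test = shuffled[n_train + n_valid:]
    return training, validation, test

training, validation, test = partition(range(100))
print(len(training), len(validation), len(test))
```

Because the three lists are slices of a single shuffled copy, they cannot overlap, and together they contain every original record exactly once.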
Types of Partitions
XLMiner allows us to create 2 kinds of partitions:
Standard Partition: Creates 3 partitions based on the partition ratios provided. Data records are selected randomly, so every record has the same chance as any other record of being placed in a given partition.
Specify percentages: Unlike Automatic, this option lets the user specify the partition ratios as percentages.
Equal partitions: Selecting this option sets a partitioning ratio of 33.3 (training) : 33.3 (validation) : 33.3 (test).
Partition with oversampling: This method of partitioning is used when the percentage of successes in the output variable is very low in the data set, but we want to train the model on data with a particular percentage of successes.
Data Set used for Partition
Standard Partition (Automatic) - Step 1
Standard Partition (Automatic) - Output: Testing Set, Validation Set
Standard Partition (Specify) - Step 1
Selecting “Specify percentages” allows us to set the partitioning ratios as needed. Here we have set a ratio of 50 (training) : 30 (validation) : 20 (test).
Standard Partition (Equal) - Step 1
Selecting “Equal” sets the partitioning ratio at 33.3% for each partition, creating 3 equal-sized partitions.
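Percentages such as 50:30:20 or 33.3:33.3:33.3 have to be turned into whole numbers of rows. The sketch below shows one way to do that, handing leftover rows to the partitions with the largest fractional remainders. How XLMiner itself rounds is not stated in the slides, so the helper `partition_sizes` and its rounding rule are assumptions made for illustration.

```python
def partition_sizes(n_rows, percentages):
    """Turn partition percentages into whole row counts that add up to n_rows.

    Hypothetical helper: the largest-remainder rounding used here is an
    assumption, not documented XLMiner behaviour.
    """
    total = sum(percentages)
    exact = [n_rows * p / total for p in percentages]
    sizes = [int(x) for x in exact]
    # Rows lost to truncation go to the partitions with the largest fractions.
    leftover = n_rows - sum(sizes)
    order = sorted(range(len(exact)), key=lambda i: exact[i] - sizes[i], reverse=True)
    for i in order[:leftover]:
        sizes[i] += 1
    return sizes

specified = partition_sizes(100, [50, 30, 20])    # training, validation, test
equal = partition_sizes(100, [33.3, 33.3, 33.3])
print(specified, equal)
```

With 100 rows, the "Equal" setting cannot give three identical sizes, so one partition receives the extra row.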
Oversampled Partition – Data Set
To oversample a data set, it must contain at least 1 variable that takes exactly 2 distinct values. Only such a variable can be used as the output variable, and one of its 2 values is chosen as the success class (the class that is oversampled).
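A simplified sketch of building an oversampled training set is shown below. It first checks that the output variable has exactly 2 distinct values, then draws successes and failures separately so that successes make up the requested share. The function name, its parameters, the field name `bought` and the sample data are all hypothetical, and the sampling scheme is a plain stratified draw without replacement rather than XLMiner's actual algorithm.

```python
import random

def oversampled_training_set(records, target, success_value,
                             success_share=0.5, size=20, seed=12345):
    """Draw a training set in which success_share of the rows are successes.

    Illustrative only: all names and defaults here are assumptions.
    """
    values = {r[target] for r in records}
    if len(values) != 2:
        raise ValueError("output variable must take exactly 2 distinct values")
    if success_value not in values:
        raise ValueError("success_value does not occur in the data")
    successes = [r for r in records if r[target] == success_value]
    failures = [r for r in records if r[target] != success_value]
    n_success = round(size * success_share)
    n_failure = size - n_success
    if n_success > len(successes) or n_failure > len(failures):
        raise ValueError("not enough records for the requested training set")
    rng = random.Random(seed)
    # Sample each class separately, without replacement, then mix the rows.
    training = rng.sample(successes, n_success) + rng.sample(failures, n_failure)
    rng.shuffle(training)
    return training

# Made-up data: 100 records, of which only 10 are successes (bought == 1).
records = [{"id": i, "bought": 1 if i % 10 == 0 else 0} for i in range(100)]
training = oversampled_training_set(records, "bought", success_value=1)
print(sum(r["bought"] for r in training), "successes in", len(training), "rows")
```

In the made-up data only 10% of the records are successes, while the resulting training set contains them at the requested 50%.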
Oversampled Partition – Step 1
Oversampled Partition – Output: The records in the training data set
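Oversampled Partition – Output
Rows in validation set = 27; rows in test set = 12, taken as 30% of the validation data. Note that 30% of 27 is about 8, not 12, so the 30% presumably applies to the validation data before the test rows were taken out: 30% of 39 rows is about 12, which leaves 27 rows for validation.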
