Data Reduction:
Attribute Subset Selection
and Data Cube Aggregation
PREPARED BY: RAJAN SHAH
DMBI
SVIT, VASAD
Data Reduction
Data Reduction techniques can be applied to obtain
a reduced representation of the data set that is
much smaller in volume, yet closely maintains the
integrity of the original data.
That is, mining on the reduced data set should be
more efficient yet produce the same (or almost the
same) analytical results.
Data Reduction Strategies
1. Dimensionality Reduction
2. Numerosity Reduction
3. Data Compression
Dimensionality Reduction
Dimensionality Reduction is the process of
reducing the number of random variables or
attributes under consideration.
Attribute Subset Selection is a method of
dimensionality reduction in which irrelevant,
weakly relevant, or redundant attributes are
detected and removed.
Numerosity Reduction
These techniques replace the original data volume
by alternative, smaller forms of data
representation. They may be Parametric or
Non-Parametric.
Parametric Methods: A model is used to estimate
the data, so that only the model parameters need to
be stored and not the actual data. These methods
assume that the data fits some model and estimate
the model parameters.
Examples: Regression and Log-Linear Models.
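A minimal Python sketch of the parametric idea, assuming a simple linear model y = w * x + b. The data points and the helper name fit_line are made up for illustration; only the two fitted parameters would need to be stored.

```python
# Parametric numerosity reduction: keep two regression parameters
# instead of the raw (x, y) points. The data below is made up.

def fit_line(xs, ys):
    """Least-squares fit of y = w * x + b; returns (w, b)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    w = sxy / sxx
    b = mean_y - w * mean_x
    return w, b

xs = [1.0, 2.0, 3.0, 4.0, 5.0]
ys = [3.0, 5.0, 7.0, 9.0, 11.0]  # these points lie exactly on y = 2x + 1

w, b = fit_line(xs, ys)

# The stored parameters are enough to estimate any y from its x.
estimates = [w * x + b for x in xs]
print(w, b)
```

Because the toy points lie exactly on a line, the estimates match the original values. With real data the model only approximates the points, so this kind of reduction is generally lossy.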
Cont…
Non-Parametric Methods: Do not assume a model for
the data and instead store reduced representations
of the data, such as Histograms, Clustering,
Sampling and Data Cube Aggregation.
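As a rough illustration of two of these methods, the Python sketch below builds an equal-width histogram and draws a simple random sample without replacement. The price values, the bucket count, the sample size and the function name equal_width_histogram are all assumptions chosen for the example.

```python
import random

def equal_width_histogram(values, num_buckets):
    """Return (lowest value, bucket width, counts per bucket).

    Assumes the values are not all identical, so the width is non-zero.
    """
    lo, hi = min(values), max(values)
    width = (hi - lo) / num_buckets
    counts = [0] * num_buckets
    for v in values:
        # The maximum value is clamped into the last bucket.
        idx = min(int((v - lo) / width), num_buckets - 1)
        counts[idx] += 1
    return lo, width, counts

# Made-up price data.
prices = [1, 1, 5, 5, 5, 8, 10, 10, 14, 15, 18, 20, 21, 25, 30]

# Histogram: 15 values are summarized by 3 bucket counts.
lo, width, counts = equal_width_histogram(prices, 3)
print(counts)

# Sampling: keep 5 of the 15 values (seeded so the run is repeatable).
sample = random.Random(0).sample(prices, 5)
print(sample)
```

Both outputs are much smaller than the input, but neither lets the original values be recovered exactly.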
Data Compression
Transformations are applied so as to obtain a
“COMPRESSED” representation of the original data.
If the original data can be reconstructed from the
compressed data without loss of any information, the
reduction is called Lossless. If only an approximation
of the original data can be reconstructed, it is
called Lossy.
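The difference can be sketched in Python. Here zlib from the standard library stands in for a lossless transformation, and rounding to whole numbers stands in for a lossy one. The sensor-style readings are made up, and both techniques are illustrative choices rather than methods named in the slides.

```python
import zlib

# Made-up, highly repetitive readings.
readings = [20.125, 20.125, 20.25, 20.25, 20.25, 20.375] * 50
raw = ",".join(str(r) for r in readings).encode("utf-8")

# Lossless: decompression gives back exactly the original bytes.
compressed = zlib.compress(raw)
restored = zlib.decompress(compressed)
lossless_ok = restored == raw
print(len(raw), len(compressed), lossless_ok)

# Lossy: rounding keeps an approximation and discards the fraction,
# so the original readings cannot be recovered from it.
rounded = [round(r) for r in readings]
lossy_ok = rounded == readings
print(lossy_ok)
```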
Attribute Subset Selection
Also known as Feature Selection, this is a
procedure for finding a subset of features (relevant
to the mining task) that produces a “better” model
for the given data set, i.e. it removes irrelevant or
redundant attributes, which can slow down the
mining process.
AIM: To find a minimum set of attributes such that
the resulting probability distribution of the data
classes is as close as possible to the original
distribution obtained using all attributes.
Advantages
Mining on a reduced set of attributes reduces the
number of attributes appearing in the discovered
patterns, helping to make the patterns easier to
detect and understand.
How To Find a GOOD Subset?
For n attributes, there are 2^n possible subsets, so
an exhaustive search is usually too expensive. The
methods applied are therefore typically “greedy”:
while searching through attribute space, they
always make what looks to be the locally best choice
in the hope that it will lead to a globally optimal
result.
The BEST and WORST attributes are typically
determined using tests of statistical significance,
which assume that the attributes are independent of
each other.
Information Gain can be used to evaluate attributes.
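A small Python sketch of evaluating attributes by information gain (the entropy of the class labels minus the weighted entropy after splitting on an attribute). The four-row data set and the attribute names student, region and buys are invented for illustration.

```python
from collections import Counter
from math import log2

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((c / total) * log2(c / total) for c in Counter(labels).values())

def information_gain(rows, attribute, target):
    """Entropy of the target minus the weighted entropy after the split."""
    base = entropy([r[target] for r in rows])
    groups = {}
    for r in rows:
        groups.setdefault(r[attribute], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return base - remainder

# Made-up data: "student" decides the class, "region" is irrelevant.
rows = [
    {"student": "yes", "region": "north", "buys": "yes"},
    {"student": "yes", "region": "south", "buys": "yes"},
    {"student": "no", "region": "north", "buys": "no"},
    {"student": "no", "region": "south", "buys": "no"},
]

ig_student = information_gain(rows, "student", "buys")
ig_region = information_gain(rows, "region", "buys")
print(ig_student, ig_region)
```

On this toy data the relevant attribute gets a gain of one bit and the irrelevant one gets zero, so a greedy method would pick student first.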
Methods: Stepwise Forward
Selection
It starts with no variables in the
model, tests the addition
of each variable using a chosen
model fit criterion, adds the
variable (if any) whose
inclusion gives the most
statistically significant
improvement of the fit, and
repeats this process until
no variable improves the model
to a statistically significant extent.
Example:
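A minimal Python sketch of stepwise forward selection. As a simplification, the statistical significance test is replaced by a plain threshold (min_gain) on the improvement in training accuracy. The scoring function, the threshold and the data are assumptions made for illustration, not the criterion used in the slides.

```python
from collections import Counter, defaultdict

def training_accuracy(rows, attributes, target):
    """Fraction of rows predicted correctly by a majority vote inside each
    group of rows that share the same values on `attributes`."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[a] for a in attributes)].append(r[target])
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return correct / len(rows)

def forward_selection(rows, candidates, target, min_gain=0.01):
    """Greedily add the attribute that improves the score the most."""
    selected = []
    best_score = training_accuracy(rows, selected, target)
    while True:
        remaining = [a for a in candidates if a not in selected]
        if not remaining:
            break
        score, attr = max(
            (training_accuracy(rows, selected + [a], target), a) for a in remaining
        )
        if score - best_score < min_gain:
            break  # no attribute improves the fit enough
        selected.append(attr)
        best_score = score
    return selected

# Made-up data: the class depends on A1 and A4; A2 is noise.
columns = ("A1", "A2", "A4", "cls")
table = [
    (1, 0, 1, "yes"), (1, 1, 1, "yes"), (1, 0, 1, "yes"), (1, 1, 0, "no"),
    (0, 0, 0, "no"), (0, 1, 0, "no"), (0, 0, 1, "no"), (0, 1, 0, "no"),
]
rows = [dict(zip(columns, values)) for values in table]

selected = forward_selection(rows, ["A1", "A2", "A4"], "cls")
print(sorted(selected))
```

On this toy data the procedure keeps A1 and A4 and stops before adding A2, because A2 brings no further improvement.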
Stepwise Backward
Elimination
It involves starting with all
candidate variables, testing
the deletion of each variable
using a chosen model fit
criterion, deleting the
variable (if any) whose loss
gives the most statistically
insignificant deterioration of
the model fit, and repeating
this process until no further
variables can be deleted
without a statistically
significant loss of fit.
Example:
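A minimal Python sketch of stepwise backward elimination. As a simplification, the significance test is replaced by a threshold (max_loss) on the drop in training accuracy when an attribute is removed. The scoring function, the threshold and the data are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def training_accuracy(rows, attributes, target):
    """Fraction of rows predicted correctly by a majority vote inside each
    group of rows that share the same values on `attributes`."""
    groups = defaultdict(list)
    for r in rows:
        groups[tuple(r[a] for a in attributes)].append(r[target])
    correct = sum(Counter(g).most_common(1)[0][1] for g in groups.values())
    return correct / len(rows)

def backward_elimination(rows, candidates, target, max_loss=0.01):
    """Start with all attributes and greedily drop the least useful one."""
    selected = list(candidates)
    best_score = training_accuracy(rows, selected, target)
    while selected:
        # Score the model with each attribute removed in turn and keep
        # the removal that hurts the fit the least.
        score, attr = max(
            (training_accuracy(rows, [s for s in selected if s != a], target), a)
            for a in selected
        )
        if best_score - score > max_loss:
            break  # every remaining removal costs too much
        selected.remove(attr)
        best_score = score
    return selected

# Made-up data: the class depends on A1 and A4; A2 is noise.
columns = ("A1", "A2", "A4", "cls")
table = [
    (1, 0, 1, "yes"), (1, 1, 1, "yes"), (1, 0, 1, "yes"), (1, 1, 0, "no"),
    (0, 0, 0, "no"), (0, 1, 0, "no"), (0, 0, 1, "no"), (0, 1, 0, "no"),
]
rows = [dict(zip(columns, values)) for values in table]

kept = backward_elimination(rows, ["A1", "A2", "A4"], "cls")
print(kept)
```

On this toy data only A2 can be removed without hurting the fit, so A1 and A4 remain.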
Bi-Directional Selection and
Elimination
The stepwise forward
selection and backward
elimination methods can
be combined so that, at
each step, the procedure
selects the best attribute
and removes the worst
from among the
remaining attributes.
Example: Suppose that
when A1 (best) is
selected, A2 (worst) is
eliminated at the same
time. Similarly, when A4
is selected, A5 is
eliminated, and when A6
is selected, A3 is
eliminated, thus
forming the reduced set
{A1, A4, A6}.
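The A1 to A6 example can be traced with a short Python sketch. It assumes each attribute has a fixed, independent relevance score. The score values below are invented and chosen only so that the ranking matches the order described in the example.

```python
def bidirectional_selection(scores):
    """At each step, select the best remaining attribute and eliminate
    the worst remaining one, until no attributes are left."""
    remaining = dict(scores)
    selected = []
    while remaining:
        best = max(remaining, key=remaining.get)
        selected.append(best)
        del remaining[best]
        if remaining:
            worst = min(remaining, key=remaining.get)
            del remaining[worst]
    return selected

# Hypothetical relevance scores (higher means more relevant).
scores = {"A1": 0.9, "A2": 0.1, "A3": 0.4, "A4": 0.8, "A5": 0.2, "A6": 0.6}

reduced = bidirectional_selection(scores)
print(reduced)
```

With these scores the steps are: select A1 and drop A2, select A4 and drop A5, select A6 and drop A3.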
Decision Tree Induction
Decision Tree Induction constructs a flowchart-like
structure where each internal (non-leaf) node denotes
a test on an attribute, each branch corresponds to an
outcome of the test, and each external (leaf) node
denotes a class prediction.
At each node, the algorithm chooses the “best”
attribute to partition the data into individual classes.
All the attributes that do not appear in the tree are
assumed to be irrelevant, while the attributes that
belong to the tree form the reduced data set.
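A compact Python sketch of this idea, using an ID3-style tree that splits on the attribute with the highest information gain. The choice of ID3, the helper names and the data are assumptions for illustration. The attributes that appear in the finished tree form the reduced set.

```python
from collections import Counter
from math import log2

def entropy(labels):
    total = len(labels)
    return -sum(c / total * log2(c / total) for c in Counter(labels).values())

def gain(rows, attr, target):
    groups = {}
    for r in rows:
        groups.setdefault(r[attr], []).append(r[target])
    remainder = sum(len(g) / len(rows) * entropy(g) for g in groups.values())
    return entropy([r[target] for r in rows]) - remainder

def build_tree(rows, attrs, target):
    """Return a class label (leaf) or a tuple (attribute, {value: subtree})."""
    labels = [r[target] for r in rows]
    majority = Counter(labels).most_common(1)[0][0]
    if len(set(labels)) == 1 or not attrs:
        return majority
    best = max(attrs, key=lambda a: gain(rows, a, target))
    if gain(rows, best, target) <= 1e-12:
        return majority  # no attribute separates the classes any further
    rest = [a for a in attrs if a != best]
    branches = {}
    for value in sorted({r[best] for r in rows}):
        subset = [r for r in rows if r[best] == value]
        branches[value] = build_tree(subset, rest, target)
    return (best, branches)

def attributes_used(tree):
    """Collect every attribute tested at an internal node."""
    if not isinstance(tree, tuple):
        return set()
    attr, branches = tree
    used = {attr}
    for subtree in branches.values():
        used |= attributes_used(subtree)
    return used

# Made-up data: the class depends on A1 and A4; A2 is noise.
columns = ("A1", "A2", "A4", "cls")
table = [
    (1, 0, 1, "yes"), (1, 1, 1, "yes"), (1, 0, 1, "yes"), (1, 1, 0, "no"),
    (0, 0, 0, "no"), (0, 1, 0, "no"), (0, 0, 1, "no"), (0, 1, 0, "no"),
]
rows = [dict(zip(columns, values)) for values in table]

tree = build_tree(rows, ["A1", "A2", "A4"], "cls")
reduced_set = attributes_used(tree)
print(sorted(reduced_set))
```

On this toy data A2 never appears in the tree, so it is treated as irrelevant and the reduced set is A1 and A4.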
Cont…
Data Cube Aggregation
A data cube is generally used to make data easier to
interpret. It is especially useful for representing
data along several dimensions together with measures
that reflect business requirements. Each dimension of
a cube represents a certain characteristic of the
database.
Data Cubes store multidimensional aggregated
information.
Data cubes provide fast access to precomputed,
summarized data, thereby benefiting online
analytical processing (OLAP) as well as data mining.
Categories of Data Cube
Dimensions: Represent
categories of data such
as time or location.
Each dimension includes
different levels of
categories.
Example:
Cont…
Measures: These are the
actual data values that
occupy the cells as
defined by the
dimensions selected.
Measures include facts or
variables typically stored
as numerical fields.
Example:
Cont…
Example: For a data set of employees with their
dept_id and salary, a data cube can be used to
aggregate the data so that the resulting data
summarizes the total salary for each dept_id.
The resulting data is smaller in volume, without loss
of the information necessary for the analysis task.
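The employee example can be sketched in Python as a sum of salary grouped by dept_id. The rows, the emp_id field and the function name total_salary_by_dept are made up for illustration.

```python
from collections import defaultdict

# Made-up employee records; emp_id is an assumed extra field.
employees = [
    {"emp_id": 1, "dept_id": "D1", "salary": 50000},
    {"emp_id": 2, "dept_id": "D1", "salary": 60000},
    {"emp_id": 3, "dept_id": "D2", "salary": 45000},
    {"emp_id": 4, "dept_id": "D3", "salary": 70000},
    {"emp_id": 5, "dept_id": "D3", "salary": 30000},
    {"emp_id": 6, "dept_id": "D3", "salary": 40000},
]

def total_salary_by_dept(rows):
    """Aggregate the salary measure along the dept_id dimension."""
    totals = defaultdict(int)
    for r in rows:
        totals[r["dept_id"]] += r["salary"]
    return dict(totals)

totals = total_salary_by_dept(employees)
print(totals)
```

Six detail rows are reduced to three summary cells, which is all that an analysis of total salary per department needs.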
Cont…
Concept Hierarchies may exist for each attribute,
allowing the analysis of data at multiple abstraction
levels.
The cube created at the lowest abstraction level is
called the Base Cuboid.
The cube created at the highest abstraction level is
called the Apex Cuboid.
A data cube can be 2-D, 3-D or higher-dimensional.
When replying to data mining requests, the smallest
available cuboid relevant to the given task should be
used.
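A small Python sketch of the cuboids of a 3-D cube: one cuboid per subset of the dimensions, from the base cuboid (all dimensions) up to the apex cuboid (no dimensions, a single grand total). The sales rows, the dimension names year, city and item, and the measure amount are invented for illustration.

```python
from collections import defaultdict
from itertools import combinations

# Made-up sales facts with three dimensions and one measure.
sales = [
    {"year": 2023, "city": "Vasad", "item": "pen", "amount": 10},
    {"year": 2023, "city": "Vasad", "item": "book", "amount": 20},
    {"year": 2023, "city": "Anand", "item": "pen", "amount": 5},
    {"year": 2024, "city": "Vasad", "item": "pen", "amount": 15},
    {"year": 2024, "city": "Anand", "item": "book", "amount": 30},
    {"year": 2024, "city": "Anand", "item": "pen", "amount": 20},
]
dimensions = ("year", "city", "item")

def cuboid(rows, dims, measure):
    """Sum the measure for every combination of values of `dims`."""
    cells = defaultdict(int)
    for r in rows:
        cells[tuple(r[d] for d in dims)] += r[measure]
    return dict(cells)

# One cuboid for each subset of the dimensions: 2^3 = 8 in total.
lattice = {
    dims: cuboid(sales, dims, "amount")
    for k in range(len(dimensions) + 1)
    for dims in combinations(dimensions, k)
}

base = lattice[dimensions]   # lowest abstraction level
apex = lattice[()]           # highest abstraction level: the grand total
by_year = lattice[("year",)]
print(len(lattice), apex, by_year)
```

A request for total sales per year can be answered from the two-cell year cuboid instead of the six-cell base cuboid, which is the sense in which the smallest relevant cuboid should be used.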
Example
References
https://www.slideshare.net/algum/data-cubes-7923771
https://en.wikipedia.org/wiki/Data_cube