International Journal of Information Technology & Management Information System (IJITMIS), ISSN 0976 –
6405(Print), ISSN 0976 – 6413(Online), Volume 5, Issue 2, May - August (2014), pp. 40-50 © IAEME
KNOWLEDGE DISCOVERY FROM VEHICLE E-GOVERNANCE
DATA USING DATA WAREHOUSING AND DATA MINING
Pushpal Desai¹
¹(M.Sc. (I.T.) Programme, VNSGU, Surat, India)
ABSTRACT
This paper discusses multidimensional schema design, data cubes and OLAP operations on
Vehicle e-governance data, together with a proposed data mining model and its
implementation on the same data. In the first phase, a clustering algorithm is applied to
identify important clusters in the Vehicle e-governance data. In the second phase, an
association rules mining algorithm is applied to explore novel relationships within the
important data clusters observed in the first phase. The results indicate that novel
relationships can be found using the proposed model.
Keywords: Clustering, Association Rules Analysis, Microsoft SQL Server Analysis
Services.
I. INTRODUCTION
Inmon, who is known as the father of data warehousing, defines a data warehouse as “a
subject oriented, integrated, nonvolatile, and time variant collection of data in support of
management decisions” [7] [8]. Han and Kamber define data mining as “extracting or
mining knowledge from large amounts of data” [2]. Data warehousing and data mining
algorithms are applied in many domains for knowledge discovery. They have been used
successfully in Banking, Insurance, Finance, Marketing, Education, Telecommunication,
Medical Science, Power Industry, Weather Forecasting, Product Design, Customer
Relationship Management (CRM) and other fields. In our earlier research work, we tried to
find association rules from Birth Registration, Deceased Registration, Property and Vehicle
e-governance data [4] [5] [6]. In this article, an association rules algorithm is applied to
find interesting patterns and relationships in the different clusters of Vehicle
e-governance data.
INTERNATIONAL JOURNAL OF INFORMATION TECHNOLOGY &
MANAGEMENT INFORMATION SYSTEM (IJITMIS)
ISSN 0976 – 6405(Print)
ISSN 0976 – 6413(Online)
Volume 5, Issue 2, May - August (2014), pp. 40-50
© IAEME: http://www.iaeme.com/IJITMIS.asp
Journal Impact Factor (2014): 6.2217 (Calculated by GISI)
www.jifactor.com
II. METHODOLOGY
To aid understanding of the proposed knowledge discovery model, a flowchart of the model
is shown in Figure 1. The proposed model for knowledge discovery involves three major
phases.
In the first phase, various data preprocessing tasks are performed on the source data to
convert it into clean and consistent data.
Fig 1: Proposed Model for Knowledge discovery for e-governance data
In the second phase, the data warehouse is designed from the preprocessed data,
considering the various analytical needs of the organization. In the first task, the
dimensions, facts and measures are identified with the organization’s analytical purpose
in mind. In the next task, the multidimensional schema is developed from the dimension
tables and fact tables. In the last task, data cubes are created and OLAP operations such
as drill-down, slice and dice are performed on them.
In the third phase, clustering and association rules mining algorithms are used to
discover knowledge from the data warehouse. In the first task, a clustering algorithm is
applied to the data cube to identify its major clusters or groups. In the second task, an
association rules mining algorithm is applied to find novel and interesting relationships
in the data clusters observed in the first task.
III. MULTIDIMENSIONAL SCHEMA DESIGN
A multidimensional schema is designed for the Vehicle e-governance data, and OLAP
operations are performed on data cubes for data analysis. Typically, automobile companies
keep adding new models, so frequent updates are required in the data warehouse. The
Snowflake schema design is proposed because it stores data in normalized form, which
allows vehicle models to be updated easily in the data warehouse. In contrast, the Star
schema design stores data in de-normalized form, which makes such updates more
difficult. In the proposed Snowflake schema design, the VehicleRegistrationbase table was
used as the fact table and Modelmasterbase, Companynamemaster, Vehicletypebase and
Sitemaster were used as dimension tables. The measures include Vehicle Registration
Count, Vehicle Amount and Tax Amount. Figure 2 shows the proposed Snowflake schema
design for the Vehicle data.
Fig 2: Proposed Snowflake Schema Design for Vehicle Data
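The normalized layout described above can be illustrated with a minimal sketch in Python using an in-memory SQLite database. The table names are adapted from the paper; every column name, the sample rows and the use of SQLite in place of Microsoft SQL Server are assumptions made purely for illustration and are not taken from the original schema.

```python
import sqlite3

# Hypothetical column layout: only the table names come from the paper.
SCHEMA = """
CREATE TABLE Companynamemaster (
    company_id   INTEGER PRIMARY KEY,
    company_name TEXT NOT NULL
);
CREATE TABLE Vehicletypebase (
    type_id   INTEGER PRIMARY KEY,
    type_name TEXT NOT NULL
);
CREATE TABLE Sitemaster (
    site_id   INTEGER PRIMARY KEY,
    site_name TEXT NOT NULL
);
-- Snowflaking: the model dimension points at the company dimension
-- instead of repeating the company name on every model row.
CREATE TABLE Modelmasterbase (
    model_id   INTEGER PRIMARY KEY,
    model_name TEXT NOT NULL,
    company_id INTEGER NOT NULL REFERENCES Companynamemaster(company_id)
);
CREATE TABLE VehicleRegistrationbase (
    registration_id   INTEGER PRIMARY KEY,
    model_id          INTEGER NOT NULL REFERENCES Modelmasterbase(model_id),
    type_id           INTEGER NOT NULL REFERENCES Vehicletypebase(type_id),
    site_id           INTEGER NOT NULL REFERENCES Sitemaster(site_id),
    registration_year INTEGER NOT NULL,
    vehicle_amount    REAL,
    tax_amount        REAL
);
"""


def build_demo_warehouse():
    """Create the sketch schema and load a few invented rows."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(SCHEMA)
    conn.execute("INSERT INTO Companynamemaster VALUES (1, 'Hero Honda')")
    conn.execute("INSERT INTO Vehicletypebase VALUES (1, 'MOTORCYCLE')")
    conn.execute("INSERT INTO Sitemaster VALUES (1, 'Site A')")
    conn.execute("INSERT INTO Modelmasterbase VALUES (1, 'Splender', 1)")
    conn.executemany(
        "INSERT INTO VehicleRegistrationbase VALUES (?, ?, ?, ?, ?, ?, ?)",
        [
            (1, 1, 1, 1, 2003, 40000.0, 1600.0),
            (2, 1, 1, 1, 2004, 41000.0, 1640.0),
        ],
    )
    return conn


def add_model(conn, model_id, model_name, company_id):
    """Adding a new model touches one dimension row and no fact rows."""
    conn.execute(
        "INSERT INTO Modelmasterbase VALUES (?, ?, ?)",
        (model_id, model_name, company_id),
    )


def registrations_per_company(conn):
    """Registration count per company, reached through the model dimension."""
    return conn.execute(
        """
        SELECT c.company_name, COUNT(*)
        FROM VehicleRegistrationbase AS f
        JOIN Modelmasterbase AS m ON m.model_id = f.model_id
        JOIN Companynamemaster AS c ON c.company_id = m.company_id
        GROUP BY c.company_name
        ORDER BY c.company_name
        """
    ).fetchall()
```

In this sketch, a call such as `add_model(conn, 2, "Hypothetical Model", 1)` inserts a single row into the model dimension, which is the kind of easy update the normalized design is intended to support.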
After implementing the Snowflake schema, data cubes are created and OLAP operations such
as Slice, Dice, Drill-down and Roll-up are performed on the Vehicle Data Cube using
Microsoft SQL Server Analysis Services [1] [3].
IV. CLUSTERING
Owner Surname, Vehicle Model, Vehicle Company, Vehicle Type, Vehicle Price, Vehicle Tax
and Registration Year were used as input attributes and Registration Date was used as the
key column to generate a clustering model for the Vehicle Data Cube. The cluster model is
used to identify important groups of data in the source data. Clustering was performed
with the K-means algorithm in Microsoft SQL Server Analysis Services [1] [3].
Fig 3: Proposed Clustering Data Mining Model for Vehicle Data Cube.
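To make the clustering step more concrete, the following is a minimal sketch of the basic K-means procedure (Lloyd's algorithm) in plain Python. It is not the Microsoft Analysis Services implementation used in this work. It handles numeric attributes only, and the two attributes (price and tax, in thousands), the sample points and the initial centroids are all invented for illustration.

```python
def nearest_centroid(point, centroids):
    """Index of the centroid with the smallest squared Euclidean distance."""
    return min(
        range(len(centroids)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, centroids[i])),
    )


def kmeans(points, initial_centroids, max_iterations=100):
    """Alternate assignment and centroid update until the centroids stop moving."""
    centroids = [tuple(c) for c in initial_centroids]
    for _ in range(max_iterations):
        labels = [nearest_centroid(p, centroids) for p in points]
        updated = []
        for i, old in enumerate(centroids):
            members = [p for p, label in zip(points, labels) if label == i]
            if members:
                # New centroid is the per-attribute mean of the cluster members.
                updated.append(tuple(sum(axis) / len(members) for axis in zip(*members)))
            else:
                # Keep an empty cluster's centroid where it was.
                updated.append(old)
        if updated == centroids:
            break
        centroids = updated
    labels = [nearest_centroid(p, centroids) for p in points]
    return centroids, labels


# Hypothetical (price, tax) pairs in thousands; not taken from the paper's data.
sample_points = [
    (40.0, 1.6), (42.0, 1.7), (45.0, 1.8),
    (300.0, 12.0), (320.0, 13.0), (350.0, 14.0),
]
sample_centroids, sample_labels = kmeans(sample_points, [(40.0, 1.6), (350.0, 14.0)])
```

With these invented points the low-priced and high-priced vehicles fall into separate clusters. Categorical inputs such as Owner Surname or Vehicle Type would need an encoding or a different distance measure, which this sketch does not attempt.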
V. ASSOCIATION RULES MINING
The Association Rules algorithm was applied to the Vehicle cluster data. For example, to
find interesting relationships in the vehicle data, the ‘Car’, ‘Motorcycle’,
‘Autorickshaw’ and ‘Moped and Scooter’ cluster data were used. Owner Surname and Vehicle
Type were used as input fields, and Vehicle Company and Vehicle Model Name were used as
predict-only attributes. The Apriori algorithm was used to find association rules in the
important data clusters [1] [3].
Fig 4: Proposed Association Rules Data Mining Model for Vehicle Data Clusters.
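As an illustration of how the confidence of such rules can be obtained, the sketch below counts support and confidence for rules of the fixed shape (Company Name, Vehicle Type) -> Model Name. It is a simplification: it does not perform the level-wise candidate generation of the full Apriori algorithm, and it does not compute the "importance" score reported by Analysis Services. The sample records and the threshold values are invented for the example.

```python
from collections import Counter


def mine_rules(records, min_support=2, min_confidence=0.5):
    """Return rules ((company, vehicle_type), model, confidence), best first.

    support    = number of records containing both sides of the rule
    confidence = support / number of records containing the left-hand side
    """
    lhs_counts = Counter()
    rule_counts = Counter()
    for company, vehicle_type, model in records:
        lhs_counts[(company, vehicle_type)] += 1
        rule_counts[(company, vehicle_type, model)] += 1

    rules = []
    for (company, vehicle_type, model), support in rule_counts.items():
        confidence = support / lhs_counts[(company, vehicle_type)]
        if support >= min_support and confidence >= min_confidence:
            rules.append(((company, vehicle_type), model, confidence))
    rules.sort(key=lambda rule: rule[2], reverse=True)
    return rules


# Hypothetical registrations; the counts are not taken from the paper's data.
sample_records = [
    ("Ford", "CAR", "FORD IKON"),
    ("Ford", "CAR", "FORD IKON"),
    ("Ford", "CAR", "FORD IKON"),
    ("Ford", "CAR", "OTHER MODEL"),
    ("Honda", "CAR", "HONDA CITY"),
    ("Honda", "CAR", "HONDA CITY"),
]
sample_rules = mine_rules(sample_records)
```

On these invented records, the rule for "HONDA CITY" has confidence 1.0 and the rule for "FORD IKON" has confidence 0.75, while the single "OTHER MODEL" record is dropped because it falls below the minimum support.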
VI. RESULTS
The data cube was created considering “Vehicle Registration Count” and “VAT”
measures. The “Model Masterbase”, “Vehicle Typebase”, “Year master”, “Site master” and
“Company master” tables were selected as dimension tables.
Fig 5: Vehicle Data cube’s Dimensions and Measures
In the drill-down and dice operation on the “Vehicle Data Cube”, two dimensions, “Year”
and “Company Code”, were selected. “Registration Year” was restricted to “2003” through
“2005” and “Company Code” to 1 – “Hero Honda”. The result shown in Figure 6 indicates
that “48,189” “Hero Honda” vehicles were registered from the year “2003” to the year
“2005”.
Fig 6: Drill-down and Dice operation on Vehicle Data Cube with Two Dimensions
In a further drill-down operation on the “Vehicle Data Cube”, the “Model Id” dimension
with value 1 – “Splender” was added. Figure 7 shows that “18,015” vehicles were
registered for this particular vehicle model. The roll-up operation can be performed on
all of the above data cubes by removing dimensions used in the drill-down operations.
Fig 7: Drill-down and Dice operations on Vehicle Data Cube with Three Dimensions
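The dice, drill-down and roll-up operations described above can be sketched in a few lines of Python. This is only an illustration of the idea, not the Analysis Services cube browser; the `aggregate` helper, the field names and all of the counts in the sample facts are hypothetical.

```python
from collections import Counter


def aggregate(facts, dimensions, filters=None):
    """Sum the 'count' measure grouped by `dimensions`.

    `filters` maps a dimension name to the set of allowed values (a dice).
    Adding a dimension to `dimensions` is a drill-down; removing one is a roll-up.
    """
    filters = filters or {}
    totals = Counter()
    for row in facts:
        if all(row[dim] in allowed for dim, allowed in filters.items()):
            totals[tuple(row[dim] for dim in dimensions)] += row["count"]
    return dict(totals)


# Hypothetical pre-aggregated facts; the numbers are invented.
sample_facts = [
    {"year": 2003, "company": "Hero Honda", "model": "Splender", "count": 5},
    {"year": 2004, "company": "Hero Honda", "model": "Splender", "count": 7},
    {"year": 2004, "company": "Hero Honda", "model": "Other model", "count": 3},
    {"year": 2006, "company": "Hero Honda", "model": "Splender", "count": 4},
    {"year": 2004, "company": "TVS", "model": "Other model", "count": 2},
]

# Dice: restrict the year range and the company.
dice = {"year": {2003, 2004, 2005}, "company": {"Hero Honda"}}
by_company = aggregate(sample_facts, ["company"], dice)

# Drill-down: add the model dimension under the same dice.
by_company_and_model = aggregate(sample_facts, ["company", "model"], dice)

# Roll-up: drop the model dimension and the dice to get totals per company.
rolled_up = aggregate(sample_facts, ["company"])
```

Because the query is expressed as a list of dimensions and a filter, the same function answers each question without a fixed, predefined report.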
The clustering algorithm created 10 clusters from the Vehicle data. The “Cluster Diagram”
for this model is shown in Figure 8.
Fig 8: Cluster Diagram for Vehicle data
The cluster profile of the same model indicates the presence of variables such as
“Company Name”, “Owner Surname” and “Vehicle Type Name”. The variable “Company Name” has
many states, such as "Maruti Suzuki", "Hero Honda", "Bajaj", "Honda", "Huyndai",
"Tata Motors", "TVS" and Others. The variable “Owner Surname” has states such as
"Patel", "Wala", "Shah", "Desai", "SINGH", "SHAIKH", "KHAN", "PATIL" and Others.
The “Vehicle Type Name” variable has the states "CAR", "MOTORCYCLE", "AUTORICKSHAW",
"MOPED_SCOOTER" and "COMMERCIAL".
Fig 9: Cluster Profile for Vehicle data Clustering Model
To properly understand the data in each cluster, questions such as the following need to
be answered:
• Which clusters contain data of “AUTORICKSHAW”? What are the names of the
companies that manufactured the “AUTORICKSHAW”? What are the surnames of citizens
who purchased “AUTORICKSHAW”?
• Which clusters contain data of “MOTORCYCLE”? What are the names of the
companies that manufactured the “MOTORCYCLE”? What are the surnames of citizens
who purchased “MOTORCYCLE”?
To answer such questions, the cluster diagram’s shading variables are used to understand
the impact of different variables and their states. For example, the result for the
“Vehicle Type Name” variable with the “AUTORICKSHAW” state is shown in Figure 10. The
result indicates that “Cluster 7” has a 100% population for the “AUTORICKSHAW” state.
Figure 10: Cluster Diagram for Vehicle data (Vehicle Type Name =
“AUTORICKSHAW”)
Further analysis can be performed by viewing the characteristics of “Cluster 7”, which
are provided in Figure 11.
Figure 11: “Cluster 7” Characteristic for Vehicle data
The cluster diagram for the “Vehicle Type Name” variable with the value “Motorcycle” is
shown in Figure 12. The result indicates that Cluster 3, Cluster 4 and Cluster 9 contain
population for this state. Cluster 3 has a 100% population for the value “Motorcycle”.
Figure 12: Cluster Diagram for Vehicle data (Vehicle Type Name = “Motorcycle”)
The characteristics of “Cluster 3” shown in Figure 13 indicate that the “Company Name”
variable is present with two values, “Hero Honda” and “TVS”. The probability of the
“Hero Honda” value is 97.09%, whereas the probability of the “TVS” value is 1.23%.
For the “Owner Surname” field, the value “Patel” is present with 92.72% probability.
Figure 13: “Cluster 3” Characteristic for Vehicle data
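The percentages read from the cluster diagrams and cluster characteristics can be understood as the share of records in a cluster that take a given state. The following minimal Python sketch computes that share from cluster-labelled rows. The function name, the row layout and the sample rows are assumptions for illustration; the actual figures in this paper come from the Analysis Services cluster viewer, not from this code.

```python
from collections import Counter


def state_share_by_cluster(rows, variable, state):
    """Fraction of each cluster's rows whose `variable` equals `state`."""
    totals = Counter()
    matches = Counter()
    for row in rows:
        totals[row["cluster"]] += 1
        if row[variable] == state:
            matches[row["cluster"]] += 1
    return {cluster: matches[cluster] / totals[cluster] for cluster in totals}


# Hypothetical cluster assignments; not the paper's data.
sample_rows = [
    {"cluster": 3, "vehicle_type": "MOTORCYCLE"},
    {"cluster": 3, "vehicle_type": "MOTORCYCLE"},
    {"cluster": 3, "vehicle_type": "MOTORCYCLE"},
    {"cluster": 3, "vehicle_type": "MOTORCYCLE"},
    {"cluster": 4, "vehicle_type": "MOTORCYCLE"},
    {"cluster": 4, "vehicle_type": "CAR"},
    {"cluster": 7, "vehicle_type": "AUTORICKSHAW"},
    {"cluster": 7, "vehicle_type": "AUTORICKSHAW"},
]
motorcycle_share = state_share_by_cluster(sample_rows, "vehicle_type", "MOTORCYCLE")
```

In this invented sample, cluster 3 consists entirely of motorcycles, cluster 4 is split evenly, and cluster 7 contains none, which mirrors the way a shaded cluster diagram highlights the clusters in which a state is concentrated.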
In the “Vehicle Data Cluster”, the “Company Name”, “Vehicle Type” and “Model Name”
variables were used to find novel relationships. “Company Name” and “Vehicle Type” were
used as input fields and “Model Name” was used as a predict-only attribute. Many
interesting relationships were found with the association rules mining model.
For example, the results provided in Table 1 indicate that “Ford” company’s car model
“Ford Ikon” is likely to be sold in the city of Surat. Similarly, “Toyota” company’s car
model “Qualis”, “Honda” company’s car model “Honda City”, “Mahindra” company’s car model
“Scorpio”, “Fiat” company’s car model “Palio”, “Tata Motors” company’s car model “Indica”
and “Huyndai” company’s car model “Santro” are most likely to be sold in the city of
Surat.
Table 1: Association Rules for Company Name, Vehicle Type = “Car” and Model Name attributes
Rule Confidence | Rule Importance | Association Rule
0.866 | 4.476980868 | Company Name = Ford, Vehicle Type Name = CAR -> Model Name = FORD IKON
0.7   | 4.381475564 | Company Name = Toyota, Vehicle Type Name = CAR -> Model Name = QUALIS
0.941 | 4.209951446 | Company Name = Honda, Vehicle Type Name = CAR -> Model Name = HONDA CITY
0.694 | 4.08037921  | Company Name = Mahindra, Vehicle Type Name = CAR -> Model Name = SCORPIO TURBO
0.68  | 4.072957414 | Company Name = Fiat, Vehicle Type Name = CAR -> Model Name = PALIO
0.719 | 3.680400153 | Company Name = Tata Motors, Vehicle Type Name = CAR -> Model Name = INDICA
0.749 | 3.613327965 | Company Name = Huyndai, Vehicle Type Name = CAR -> Model Name = SANTRO
Similarly, interesting relationships between moped / scooter manufacturers and their
models were found. The results provided in Table 2 suggest that “Suzuki” company’s model
“Access 125”, “Honda” company’s model “Honda Activa”, “TVS” company’s model “Pep” and
“Hero Honda” company’s model “Pleasure” are most likely to be sold.
Table 2: Association Rules for Company Name, Vehicle Type = “Moped_Scooter” and Model Name attributes
Rule Confidence | Rule Importance | Association Rule
0.828 | 4.458154689 | Company Name = SUZUKI, Vehicle Type Name = MOPED_SCOOTER -> Model Name = ACCESS 125
0.83  | 3.362508672 | Company Name = Honda, Vehicle Type Name = MOPED_SCOOTER -> Model Name = HONDA ACTIVA
0.434 | 3.060536766 | Vehicle Type Name = MOPED_SCOOTER -> Model Name = HONDA ACTIVA
0.49  | 2.973270492 | Company Name = TVS, Vehicle Type Name = MOPED_SCOOTER -> Model Name = PEP
0.711 | 2.7995445   | Vehicle Type Name = MOPED_SCOOTER, Company Name = Hero Honda -> Model Name = PLEASURE
VII. CONCLUSION
This practical research demonstrates that data cube operations such as drill-down,
roll-up, slice and dice could be extremely useful to administrators working at the
municipal corporation, because they can query the data along several dimensions. Cube
operations also give administrators considerable freedom, because queries are not fixed
in nature as they normally are in OLTP systems. Data cube operations allow administrators
to execute ad hoc queries that are not possible in OLTP systems. These results can be
used by automobile companies to increase sales of their products by focusing on specific
communities residing in the city of Surat. The results are unique in the sense that
e-governance data can be used by private companies to increase their sales, improve the
marketing of their products and analyze the vehicle purchase trends of citizens.
Furthermore, the results of clustering and association rules mining give a better
understanding of the data and reveal hidden trends and new relationships in e-governance
data.
VIII. LIMITATIONS
All results are based on data provided by the municipal corporation for research
purposes only. Hence the results may change if data warehousing and data mining are
applied to the actual data sets.
IX. REFERENCES
[1] Brian Larson 2008. Delivering Business Intelligence with Microsoft SQL Server 2008,
McGraw-Hill.
[2] Han and Kamber 2011. Data Mining: Concepts and Techniques, Morgan Kaufmann
Publishers.
[3] Jamie MacLennan et al. 2008. Data Mining with SQL Server® 2008, Wiley.
[4] Pushpal Desai and Dr. Apurva Desai 2011, The Study on Data Warehouse and
Data Mining for Birth Registration System of the Surat City, International Journal
of Computer Applications, Number 4 - Article 2, 2011, pp. 1-5, ISBN: 978-93-80746-
63-0.
[5] Pushpal Desai and Dr. Apurva Desai 2012, An empirical analysis using data mining on
property tax - e-governance data, In the proceedings of National Seminar on Natural
language Processing and Data Mining, Department of Computer Science, Surat, India.
[6] Pushpal Desai and Dr. Apurva Desai 2012, An empirical analysis based on association
rules mining on E-Governance system, In the proceedings of International Conference
& Workshop on Recent Trends in Technology 2012, TCET, Mumbai, India.
[7] W. H. Inmon 2005. Building the Data Warehouse, Wiley.
[8] W. H. Inmon et al. 2001 Corporate Information Factory, Wiley.
[9] Kuldeep Deshpande and Dr. Bhimappa Desai, “A Critical Study of Requirement
Gathering and Testing Techniques for Datawarehousing”, International Journal of
Information Technology and Management Information Systems (IJITMIS), Volume 5,
Issue 1, 2014, pp. 60 - 71, ISSN Print: 0976 – 6405, ISSN Online: 0976 – 6413.
[10] Pushpal Desai, “Building Aggregates in the Data Warehouse: A Case Study of Birth,
Deceased and Property Registration E-Governance Data”, International Journal of
Advanced Research in Engineering & Technology (IJARET), Volume 5, Issue 6, 2014,
pp. 8 - 14, ISSN Print: 0976-6480, ISSN Online: 0976-6499.

Strategies for Unlocking Knowledge Management in Microsoft 365 in the Copilot...
 
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, AdobeApidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
Apidays New York 2024 - Scaling API-first by Ian Reasor and Radu Cotescu, Adobe
 
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
04-2024-HHUG-Sales-and-Marketing-Alignment.pptx
 
Scaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organizationScaling API-first – The story of a global engineering organization
Scaling API-first – The story of a global engineering organization
 
Strategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a FresherStrategies for Landing an Oracle DBA Job as a Fresher
Strategies for Landing an Oracle DBA Job as a Fresher
 
presentation ICT roal in 21st century education
presentation ICT roal in 21st century educationpresentation ICT roal in 21st century education
presentation ICT roal in 21st century education
 
The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024The 7 Things I Know About Cyber Security After 25 Years | April 2024
The 7 Things I Know About Cyber Security After 25 Years | April 2024
 
Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024Axa Assurance Maroc - Insurer Innovation Award 2024
Axa Assurance Maroc - Insurer Innovation Award 2024
 
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
Connector Corner: Accelerate revenue generation using UiPath API-centric busi...
 
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemkeProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
ProductAnonymous-April2024-WinProductDiscovery-MelissaKlemke
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
🐬 The future of MySQL is Postgres 🐘
🐬  The future of MySQL is Postgres   🐘🐬  The future of MySQL is Postgres   🐘
🐬 The future of MySQL is Postgres 🐘
 
[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf[2024]Digital Global Overview Report 2024 Meltwater.pdf
[2024]Digital Global Overview Report 2024 Meltwater.pdf
 
A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?A Year of the Servo Reboot: Where Are We Now?
A Year of the Servo Reboot: Where Are We Now?
 
Tech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdfTech Trends Report 2024 Future Today Institute.pdf
Tech Trends Report 2024 Future Today Institute.pdf
 
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
Apidays Singapore 2024 - Building Digital Trust in a Digital Economy by Veron...
 
2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...2024: Domino Containers - The Next Step. News from the Domino Container commu...
2024: Domino Containers - The Next Step. News from the Domino Container commu...
 

Knowledge discovery from vehicle e governance data using data warehousing an

  • 1. International Journal of Information Technology & Management Information System (IJITMIS), ISSN 0976 – 6405 (Print), ISSN 0976 – 6413 (Online), Volume 5, Issue 2, May - August (2014), pp. 40-50 © IAEME
    KNOWLEDGE DISCOVERY FROM VEHICLE E-GOVERNANCE DATA USING DATA WAREHOUSING AND DATA MINING
    Pushpal Desai (M.Sc. (I.T.) Programme, VNSGU, Surat, India)
    ABSTRACT
    In this paper, the multidimensional schema design, data cube and OLAP operations on Vehicle e-governance data are discussed. The proposed data mining model and its implementation on Vehicle e-governance data are also discussed. In the first phase, a Clustering data mining algorithm is implemented to identify important clusters in the Vehicle e-governance data. In the second phase, an Association Rules Mining algorithm is applied to explore novel relationships in the important data clusters observed in the first phase. The results indicate that novel relationships can be found using the proposed model.
    Keywords: Clustering, Association Rules Analysis, Microsoft SQL Server Analysis Services.
    I. INTRODUCTION
    Inmon, who is known as the father of data warehousing, defines a data warehouse as “a subject oriented, integrated, nonvolatile, and time variant collection of data in support of management decisions” [7] [8]. Han and Kamber defined data mining as “Extracting or mining knowledge from large amount of data” [2]. Data warehouse and data mining algorithms are applied in various domains for knowledge discovery. They have been used successfully in Banking, Insurance, Finance, Marketing, Education, Telecommunication, Medical Science, Power Industry, Weather Forecasting, Product Design, Customer Relationship Management (CRM) and other domains. In our earlier research work, we tried to find association rules from Birth registration, Decease Registration, Property and Vehicle e-governance data [4] [5] [6].
    In this article, the Association Rules algorithm is applied to find interesting patterns and relationships in the different clusters of Vehicle e-governance data.
    © IAEME: http://www.iaeme.com/IJITMIS.asp. Journal Impact Factor (2014): 6.2217 (Calculated by GISI), www.jifactor.com
  • 2. II. METHODOLOGY
    To provide a better understanding of the proposed knowledge discovery model, a flowchart of the model is shown in Figure 1. The proposed model for knowledge discovery involves three major phases. In the first phase, various data preprocessing tasks are performed on the source data to convert it into clean and consistent data.
    Fig 1: Proposed Model for Knowledge discovery for e-governance data
    In the second phase, the data warehouse is designed from the preprocessed data, considering the various analytical needs of the organization. In the first task, the dimensions, facts and measures are identified, keeping in mind the organization’s analytical purpose. In the next task, the multidimensional schema design is developed from the dimension tables and fact tables. In the last task, data cubes are created and OLAP operations such as drill-down, slice and dice are performed on them.
    In the third phase, clustering and association rules mining algorithms are used to discover knowledge from the data warehouse. In the first task, a clustering algorithm is applied to the data cube to identify its major clusters or groups. In the second task, an association rules mining algorithm is applied to find novel and interesting relationships in the data clusters observed in the first task.
  • 3. III. MULTIDIMENSIONAL SCHEMA DESIGN
    The multidimensional schema is designed for Vehicle e-governance data, and OLAP operations are performed on data cubes for data analysis. Typically, automobile companies keep adding new models, and hence frequent updates are required in the data warehouse. The Snowflake Schema design is proposed because it stores data in normalized form, which allows vehicle models to be updated easily in the data warehouse. In contrast, the Star Schema design stores data in de-normalized form, which makes it more difficult to update data in the data warehouse.
    In the proposed Snowflake Schema design, the VehicleRegistrationbase table was used as the Fact Table, and Modaelmasterbase, Companynamemaster, Vehicletypebase and Sitemaster were used as Dimension Tables. The measures include Vehicle Registration Count, Vehicle Amount and Tax Amount. Figure 2 shows the proposed Snowflake Schema design of the Vehicle data.
    Fig 2: Proposed Snowflake Schema Design for Vehicle Data
    After implementing the Snowflake Schema, data cubes are created and OLAP operations such as Slice, Dice, Drill-down and Roll-up are performed on the Vehicle Data Cube using Microsoft SQL Server Analysis Services [1] [3].
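As a rough illustration of why the normalized design makes model updates simple, the schema described above can be sketched with Python's built-in sqlite3 module. The table names are those given in the text; every column name, the link from the model table to the company and vehicle type tables, and all sample rows are assumptions made for this sketch, since Figure 2 is not reproduced here. The paper itself uses Microsoft SQL Server, not SQLite.

```python
import sqlite3

# In-memory database; table names follow the text, columns are hypothetical.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE Companynamemaster (company_code INTEGER PRIMARY KEY, company_name TEXT);
CREATE TABLE Vehicletypebase   (type_code INTEGER PRIMARY KEY, type_name TEXT);
-- Snowflake step (assumed): the model dimension points to further dimension tables.
CREATE TABLE Modaelmasterbase (
    model_id     INTEGER PRIMARY KEY,
    model_name   TEXT,
    company_code INTEGER REFERENCES Companynamemaster(company_code),
    type_code    INTEGER REFERENCES Vehicletypebase(type_code)
);
CREATE TABLE Sitemaster (site_id INTEGER PRIMARY KEY, site_name TEXT);
-- Fact table: one row per registration, carrying the measures.
CREATE TABLE VehicleRegistrationbase (
    registration_id   INTEGER PRIMARY KEY,
    registration_year INTEGER,
    model_id          INTEGER REFERENCES Modaelmasterbase(model_id),
    site_id           INTEGER REFERENCES Sitemaster(site_id),
    vehicle_amount    REAL,
    tax_amount        REAL
);
""")

# Invented sample rows, for illustration only.
conn.executemany("INSERT INTO Companynamemaster VALUES (?, ?)",
                 [(1, "Hero Honda"), (2, "TVS")])
conn.execute("INSERT INTO Vehicletypebase VALUES (1, 'MOTORCYCLE')")
conn.execute("INSERT INTO Modaelmasterbase VALUES (1, 'Splender', 1, 1)")
conn.execute("INSERT INTO Sitemaster VALUES (1, 'Site A')")

# A newly launched model needs exactly one new row in the model dimension.
conn.execute("INSERT INTO Modaelmasterbase VALUES (2, 'Example Model', 2, 1)")

conn.executemany(
    "INSERT INTO VehicleRegistrationbase VALUES (?, ?, ?, ?, ?, ?)",
    [(1, 2003, 1, 1, 40000.0, 2000.0),
     (2, 2004, 1, 1, 41000.0, 2050.0),
     (3, 2005, 2, 1, 35000.0, 1750.0)],
)

# Registration count and total tax per company, reached through the model dimension.
rows = conn.execute("""
    SELECT c.company_name, COUNT(*), SUM(f.tax_amount)
    FROM VehicleRegistrationbase AS f
    JOIN Modaelmasterbase  AS m ON m.model_id = f.model_id
    JOIN Companynamemaster AS c ON c.company_code = m.company_code
    GROUP BY c.company_name
    ORDER BY c.company_name
""").fetchall()
print(rows)
```

In this layout the company name is stored once, so renaming a company or adding a model does not touch the fact table.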
  • 4. IV. CLUSTERING
    The Owner Surname, Vehicle Model, Vehicle Company, Vehicle Type, Vehicle Price, Vehicle Tax and Registration Year are used as input parameters, and Registration Date is used as the key column, to generate the Clustering model for the Vehicle Data Cube. The Cluster model is used to identify important groups of data in the source data. The Clustering is performed with the K-means algorithm using Microsoft SQL Server Analysis Services [1] [3].
    Fig 3: Proposed Clustering Data Mining Model for Vehicle Data Cube.
    V. ASSOCIATION RULES MINING
    The Association Rules algorithm was applied to the Vehicle Cluster data. For example, to find interesting relationships in the vehicle data, the ‘Car’, ‘Motorcycle’, ‘Autorickshaw’ and ‘Moped and Scooter’ cluster data are used. The Owner Surname and Vehicle Type were used as input fields, and Vehicle Company and Vehicle Model Name were used as predict-only attributes. The Apriori algorithm was used to find Association Rules in the important data clusters [1] [3].
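The level-wise idea behind Apriori, as used in the association rules step above, can be sketched in a few lines of plain Python. This is a minimal textbook-style sketch, not the Analysis Services implementation; the function names, the five sample records and the support threshold are all assumptions chosen for illustration.

```python
from itertools import combinations

def apriori(transactions, min_support):
    """Return {itemset: count} for every itemset whose support >= min_support."""
    n = len(transactions)
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    frequent = {s: c for s, c in counts.items() if c / n >= min_support}
    result = dict(frequent)
    k = 2
    while frequent:
        prev = list(frequent)
        # Join step: merge frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must itself be frequent.
        candidates = {c for c in candidates
                      if all(frozenset(s) in frequent for s in combinations(c, k - 1))}
        counts = {c: sum(1 for t in transactions if c <= t) for c in candidates}
        frequent = {s: c for s, c in counts.items() if c / n >= min_support}
        result.update(frequent)
        k += 1
    return result

def confidence(itemsets, antecedent, consequent):
    """confidence(A -> B) = count(A and B) / count(A)."""
    return itemsets[antecedent | consequent] / itemsets[antecedent]

# Invented records: each one is a set of (attribute, value) items.
def record(company, vtype, model):
    return frozenset({("Company Name", company),
                      ("Vehicle Type Name", vtype),
                      ("Model Name", model)})

transactions = [
    record("Honda", "CAR", "HONDA CITY"),
    record("Honda", "CAR", "HONDA CITY"),
    record("Honda", "CAR", "OTHER MODEL"),
    record("Honda", "MOPED_SCOOTER", "HONDA ACTIVA"),
    record("Ford", "CAR", "FORD IKON"),
]

itemsets = apriori(transactions, min_support=0.4)
lhs = frozenset({("Company Name", "Honda"), ("Vehicle Type Name", "CAR")})
rhs = frozenset({("Model Name", "HONDA CITY")})
rule_confidence = confidence(itemsets, lhs, rhs)
print(len(itemsets), rule_confidence)
```

Items are stored as (attribute, value) pairs so that a rule such as Company Name = Honda, Vehicle Type Name = CAR -> Model Name = HONDA CITY can be read directly off the itemsets.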
  • 5. Fig 4: Proposed Association Rules Data Mining Model for Vehicle Data Clusters.
    VI. RESULTS
    The data cube was created considering the “Vehicle Registration Count” and “VAT” measures. The “Model Masterbase”, “Vehicle Typebase”, “Year master”, “Site master” and “Company master” tables were selected as dimension tables.
    Fig 5: Vehicle Data cube’s Dimensions and Measures
  • 6. In the drill-down and dice operation on the “Vehicle Data Cube”, two dimensions, “Year” and “Company Code”, were selected. The “Registration Year” values “2003” to “2005” and the “Company Code” value 1 – “Hero Honda” were selected. The result shown in Figure 6 indicates that 48,189 “Hero Honda” vehicles were registered from the year 2003 to the year 2005.
    Fig 6: Drill-down and Dice operation on Vehicle Data Cube with Two Dimensions
    In a further drill-down operation on the “Vehicle Data Cube”, the “Model Id” dimension with value 1 – “Splender” was added. Figure 7 shows that 18,015 vehicles were registered for this particular vehicle model. The roll-up operation can be performed on all of the above data cubes by removing the dimensions used in the drill-down operations.
    Fig 7: Drill-down and Dice operations on Vehicle Data Cube with Three Dimensions
    The Clustering data mining algorithm created 10 clusters from the Vehicle data. The “Cluster Diagram” is shown in Figure 8.
    Fig 8: Cluster Diagram for Vehicle data
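The dice, drill-down and roll-up operations described above can be mimicked on a small list of records in plain Python. The sketch below only illustrates the semantics of the operations, not how Analysis Services evaluates a cube; the helper names `dice` and `aggregate` and all sample records are assumptions made for this example.

```python
from collections import Counter

# Invented registration records; each dict key plays the role of a dimension.
records = [
    {"year": 2003, "company": "Hero Honda", "model": "Splender"},
    {"year": 2004, "company": "Hero Honda", "model": "Splender"},
    {"year": 2005, "company": "Hero Honda", "model": "Other"},
    {"year": 2006, "company": "Hero Honda", "model": "Splender"},
    {"year": 2004, "company": "TVS", "model": "Pep"},
]

def dice(rows, predicate):
    """Dice: keep only the rows that satisfy conditions on several dimensions."""
    return [r for r in rows if predicate(r)]

def aggregate(rows, dims):
    """Registration count grouped by the given dimensions."""
    return Counter(tuple(r[d] for d in dims) for r in rows)

# Dice on two dimensions: years 2003 to 2005 and company "Hero Honda".
diced = dice(records,
             lambda r: 2003 <= r["year"] <= 2005 and r["company"] == "Hero Honda")
by_company = aggregate(diced, ("company",))

# Drill-down: add the model dimension to see finer detail.
by_model = aggregate(diced, ("company", "model"))

# Roll-up: remove dimensions again to return to the coarser total.
total = aggregate(diced, ())

print(by_company, by_model, total)
```

Roll-up here is simply aggregation over fewer dimensions, which matches the description of removing the dimensions that were added during drill-down.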
  • 7. The Cluster profile of the same model indicates the presence of variables such as “Company Name”, “Owner Surname” and “Vehicle Type Name”. The variable “Company Name” has many states, such as "Maruti Suzuki", "Hero Honda", "Bajaj", "Honda", "Huyndai", "Tata Motors", "TVS" and Others. The variable “Owner Surname” has states such as "Patel", "Wala", "Shah", "Desai", "SINGH", "SHAIKH", "KHAN", "PATIL" and Others. The “Vehicle Type Name” variable has the states "CAR", "MOTORCYCLE", "AUTORICKSHAW", "MOPED_SCOOTER" and "COMMERCIAL".
    Fig 9: Cluster Profile for Vehicle data Clustering Model
    To understand the data in each cluster properly, questions such as the following need to be answered:
      • Which clusters contain data of “AUTORICKSHAW”? What are the names of the companies that manufactured the “AUTORICKSHAW”? What are the surnames of citizens who purchased “AUTORICKSHAW”?
      • Which clusters contain data of “MOTORCYCLE”? What are the names of the companies that manufactured the “MOTORCYCLE”? What are the surnames of citizens who purchased “MOTORCYCLE”?
    To answer such questions, the cluster diagram’s shading variable is used to examine the impact of each variable and its states. For example, the result for the “Vehicle Type Name” variable with the “AUTORICKSHAW” state is shown in Figure 10. The result indicates that “Cluster 7” has 100% population for the “AUTORICKSHAW” state.
  • 8. Figure 10: Cluster Diagram for Vehicle data (Vehicle Type Name = “AUTORICKSHAW”)
    Further analysis can be performed by viewing the characteristics of “Cluster 7”, which are provided in Figure 11.
    Figure 11: “Cluster 7” Characteristic for Vehicle data
    The cluster diagram for the “Vehicle Type Name” variable with the “Motorcycle” value is shown in Figure 12. The result indicates that Cluster 3, Cluster 4 and Cluster 9 have population for this state. Cluster 3 has 100% population for the value “Motorcycle”.
  • 9. Figure 12: Cluster Diagram for Vehicle data (Vehicle Type Name = “Motorcycle”)
    The characteristics of “Cluster 3” shown in Figure 13 indicate that the “Company Name” variable is present with two values, “Hero Honda” and “TVS”. For the “Hero Honda” value the probability is 97.09%, whereas for the “TVS” value the probability is 1.23%. For the “Owner Surname” field, the value “Patel” is present with 92.72% probability.
    Figure 13: “Cluster 3” Characteristic for Vehicle data
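The probabilities reported in a cluster characteristic view can be read as the share of each state among the records assigned to that cluster. The sketch below shows that calculation on invented data; the function name `cluster_characteristics`, the record layout and the cluster labels are assumptions, and the Analysis Services viewer may compute its figures differently in detail.

```python
from collections import Counter

def cluster_characteristics(rows, cluster_id, attribute):
    """Share of each state of `attribute` among the rows of one cluster."""
    members = [r for r in rows if r["cluster"] == cluster_id]
    counts = Counter(r[attribute] for r in members)
    return {state: count / len(members) for state, count in counts.items()}

# Invented records with a cluster label already assigned to each one.
rows = [
    {"cluster": 3, "Company Name": "Hero Honda", "Vehicle Type Name": "MOTORCYCLE"},
    {"cluster": 3, "Company Name": "Hero Honda", "Vehicle Type Name": "MOTORCYCLE"},
    {"cluster": 3, "Company Name": "Hero Honda", "Vehicle Type Name": "MOTORCYCLE"},
    {"cluster": 3, "Company Name": "TVS", "Vehicle Type Name": "MOTORCYCLE"},
    {"cluster": 7, "Company Name": "Bajaj", "Vehicle Type Name": "AUTORICKSHAW"},
]

company_profile = cluster_characteristics(rows, 3, "Company Name")
type_profile = cluster_characteristics(rows, 3, "Vehicle Type Name")
print(company_profile, type_profile)
```

A cluster with "100% population" for a state corresponds to a profile in which that state has share 1.0, as `type_profile` does for the invented Cluster 3 above.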
  • 10. In the “Vehicle Data Cluster”, the “Company Name”, “Vehicle Type” and “Model Name” variables were used to find novel relationships. “Company Name” and “Vehicle Type” were used as input fields, and “Model Name” was used as the predict-only attribute. Many interesting relationships were found by the association rules mining model. For example, the results provided in Table 1 indicate that the “Ford” company’s car model “Ford Ikon” is likely to be sold in the city of Surat. Similarly, the “Toyota” company’s car model “Qualis”, the “Honda” company’s car model “Honda City”, the “Mahindra” company’s car model “Scorpio”, the “Fiat” company’s car model “Palio”, the “Tata Motors” company’s car model “Indica” and the “Huyndai” company’s car model “Santro” are most likely to be sold in the city of Surat.
    Table 1: Association Rules for Company Name, Vehicle Type = “Car” and Model Name attributes
    Rule Confidence   Rule Importance   Association Rule
    0.866             4.476980868       Company Name = Ford, Vehicle Type Name = CAR -> Model Name = FORD IKON
    0.7               4.381475564       Company Name = Toyota, Vehicle Type Name = CAR -> Model Name = QUALIS
    0.941             4.209951446       Company Name = Honda, Vehicle Type Name = CAR -> Model Name = HONDA CITY
    0.694             4.08037921        Company Name = Mahindra, Vehicle Type Name = CAR -> Model Name = SCORPIO TURBO
    0.68              4.072957414       Company Name = Fiat, Vehicle Type Name = CAR -> Model Name = PALIO
    0.719             3.680400153       Company Name = Tata Motors, Vehicle Type Name = CAR -> Model Name = INDICA
    0.749             3.613327965       Company Name = Huyndai, Vehicle Type Name = CAR -> Model Name = SANTRO
    Similarly, interesting relationships between moped / scooter manufacturers and their models were found.
    The results provided in Table 2 suggest that the “Suzuki” company’s model “Access 125”, the “Honda” company’s model “Honda Activa”, the “TVS” company’s model “Pep” and the “Hero Honda” company’s model “Pleasure” are most likely to be sold.
    Table 2: Association Rules for Company Name, Vehicle Type = “Moped_Scooter” and Model Name attributes
    Rule Confidence   Rule Importance   Association Rule
    0.828             4.458154689       Company Name = SUZUKI, Vehicle Type Name = MOPED_SCOOTER -> Model Name = ACCESS 125
    0.83              3.362508672       Company Name = Honda, Vehicle Type Name = MOPED_SCOOTER -> Model Name = HONDA ACTIVA
    0.434             3.060536766       Vehicle Type Name = MOPED_SCOOTER -> Model Name = HONDA ACTIVA
    0.49              2.973270492       Company Name = TVS, Vehicle Type Name = MOPED_SCOOTER -> Model Name = PEP
    0.711             2.7995445         Vehicle Type Name = MOPED_SCOOTER, Company Name = Hero Honda -> Model Name = PLEASURE
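To make the two columns of the rule tables easier to interpret, the sketch below computes a rule's confidence and an importance score from raw counts. Confidence is the standard ratio count(A and B) / count(A). For importance, the sketch assumes the definition log(P(B|A) / P(B|not A)) commonly given for the Microsoft Association Rules algorithm; the base-10 logarithm, the absence of any count smoothing, the function names and the example counts are all assumptions, so the numbers are not expected to reproduce the values in the tables.

```python
import math

def rule_confidence(n_a, n_ab):
    """confidence(A -> B) = count(A and B) / count(A)."""
    return n_ab / n_a

def rule_importance(n_a, n_ab, n_not_a, n_not_a_b):
    """Assumed definition: log10( P(B | A) / P(B | not A) ).

    No smoothing is applied, so the score is infinite when B never
    occurs without A. Real tools may smooth the counts to avoid this.
    """
    p_b_given_a = n_ab / n_a
    p_b_given_not_a = n_not_a_b / n_not_a
    if p_b_given_not_a == 0:
        return math.inf
    return math.log10(p_b_given_a / p_b_given_not_a)

# Invented counts: 50 records match A, 40 of them also match B;
# 100 records do not match A, 8 of them match B.
conf = rule_confidence(n_a=50, n_ab=40)
imp = rule_importance(n_a=50, n_ab=40, n_not_a=100, n_not_a_b=8)
print(conf, imp)
```

Under this reading, a positive importance means the consequent is more likely when the antecedent holds than when it does not, which is why rules are often ranked by importance rather than by confidence alone.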
  • 11. VII. CONCLUSION
    The practical research demonstrates that data cube operations such as drill-down, roll-up, slice and dice could be extremely useful to administrators working at the municipal corporation, as they are able to query data across several dimensions. The cube operations also give administrators a lot of freedom, as queries are not fixed in nature as they normally are in OLTP systems. Data cube operations allow administrators to execute ad hoc queries which are not possible in OLTP systems. These results can be utilized by automobile companies to increase sales of their products by focusing on specific communities residing in the city of Surat. The results are unique in the sense that e-governance data can be utilized by private companies to increase their sales, improve the marketing of their products and analyze the vehicle purchase trends of citizens. Furthermore, the results of Clustering and Association Rules data mining give a better understanding of the data and reveal hidden trends and new relationships in e-governance data.
    VIII. LIMITATIONS
    All results are based on data provided by the municipal corporation for research purposes only. Hence the results may change if data warehousing and data mining are applied to actual data sets.
    IX. REFERENCES
    [1] Brian Larson 2008. Delivering Business Intelligence with Microsoft SQL Server 2008, McGraw-Hill.
    [2] Han and Kamber 2011. Data Mining Concepts and Techniques, Morgan Kaufmann Publishers.
    [3] Jamie MacLennan et al. 2008. Data Mining with SQL Server® 2008, Wiley.
    [4] Pushpal Desai and Dr. Apurva Desai 2011, The Study on Data Warehouse and Data Mining for Birth Registration System of the Surat City, International Journal of Computer Applications, Number 4 - Article 2, 2011, pp. 1-5, ISBN: 978-93-80746-63-0.
    [5] Pushpal Desai and Dr. Apurva Desai 2012, An empirical analysis using data mining on property tax - e-governance data, In the proceedings of National Seminar on Natural language Processing and Data Mining, Department of Computer Science, Surat, India.
    [6] Pushpal Desai and Dr. Apurva Desai 2012, An empirical analysis based on association rules mining on E-Governance system, In the proceedings of International Conference & Workshop on Recent Trends in Technology 2012, TCET, Mumbai, India.
    [7] W. H. Inmon 2005. Building the Data Warehouse, Wiley.
    [8] W. H. Inmon et al. 2001. Corporate Information Factory, Wiley.
    [9] Kuldeep Deshpande and Dr. Bhimappa Desai, “A Critical Study of Requirement Gathering and Testing Techniques for Datawarehousing”, International Journal of Information Technology and Management Information Systems (IJITMIS), Volume 5, Issue 1, 2014, pp. 60 - 71, ISSN Print: 0976 – 6405, ISSN Online: 0976 – 6413.
    [10] Pushpal Desai, “Building Aggregates in the Data Warehouse: A Case Study of Birth, Deceased and Property Registration E-Governance Data”, International Journal of Advanced Research in Engineering & Technology (IJARET), Volume 5, Issue 6, 2014, pp. 8 - 14, ISSN Print: 0976-6480, ISSN Online: 0976-6499.