Research Inventy: International Journal of Engineering and Science
Vol. 2, Issue 12 (May 2013), pp. 65-68
ISSN (e): 2278-4721, ISSN (p): 2319-6483, www.researchinventy.com
Warehouse Creator: A Generic Enterprise Solution
Majid Zaman (1), Muheet Ahmed Butt (2)
(1) Scientist, Directorate of IT & SS, University of Kashmir, J&K, India
(2) Scientist, PG Department of Computer Science, University of Kashmir, Srinagar, J&K, India
Abstract: With the advent of the 21st century, enterprises began developing warehousing solutions, and the early solutions were such that multiple warehouses were created within a single enterprise. Although data warehousing has become a benchmark for data integration, no generic tool that cuts across warehouse type, complexity and enterprise has been proposed so far. In this paper, we propose a tool that can create a warehouse irrespective of type, complexity, nature of the enterprise's work, etc. The proposed tool creates the dimension tables and the fact table by evaluating the data present in the various data sets (data sources). The user/administrator intervention required is minimal; however, access to the data sources is essential.
Keywords: OLAP, Data Warehouse, Enterprise, Data Sources
I. INTRODUCTION
The 1980s and 1990s were an era of automation, aimed at easing industry and office work. The primary goal in this era was the digitization of data, and little thought was given to the integration and centralization of data and resources. With the advent of the 21st century, industry realized that enterprises had multiple heterogeneous [3][1][2] data sources and allied infrastructures. Ralph Kimball published the book “The Data Warehouse Toolkit” in 1996, but it was not until 2000 onwards that people realized the essence of warehousing and its implementations began to emerge. The motivation behind this project is the everyday data crisis. Enterprises across the globe spend millions of rupees to manage and integrate data, yet there is still no single industry-wide or enterprise-wide data integration tool or regulation, and enterprises are still in the process of standardizing data integration using the warehouse as a tool kit.
Figure 1 illustrates the existing enterprise solution, in which n heterogeneous data sources are interconnected via the internet or an intranet, as the case may be.
Figure 1: Warehouse Enterprise Solution (components shown: Library OLAP on MySQL/Windows, Exam OLAP on MsSql/Windows, Admin OLAP on MySQL/Linux, and a wired and wireless switch)
Enterprise users have to access each data source separately in order to retrieve the information of their choice. The problems encountered by an enterprise user are listed below:
[1] A user interested in information retrieval is required to know a query language in order to access information from a data source.
[2] The user should be well aware of the database architecture.
[3] The user should be well versed with the information sources, i.e. know where the information of interest is stored.
[4] The access rights of enterprise users are an area of concern.
[5] The geographical distance between database servers can become a cause for concern, besides other issues.
Data warehousing is the most feasible solution for overcoming the problems mentioned above, among others. In addition, enterprise users can be given web-based access to retrieve the information of their interest without having the right to update any piece of data/information in any data source.
II. PROBLEM STATEMENT
In order to develop a data warehousing tool that can integrate data and create a fully functional warehouse, the following requirements have to be fulfilled:
[1] The tool should be able to create a warehouse from the existing OLAP servers.
[2] The solution should be installable in any enterprise so that a warehouse can be created there.
[3] Data is extracted from heterogeneous sources.
[4] The network is both wired and wireless.
[5] New data sources can be added at any time and incorporated into the warehouse.
[6] A web-based interface has to be provided with the warehouse in order to give users information retrieval access based on keywords.
[7] Warehouse creation should not manipulate table data on the remote servers.
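Requirement [5] implies that the tool keeps some record of which sources currently feed the warehouse. The paper does not describe that record, so the following Python sketch is only one possible shape for it: the names DataSource and SourceRegistry, the field names and the IP addresses are all hypothetical, and passwords are deliberately left out of the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSource:
    """One registered server (all field names are illustrative assumptions)."""
    name: str      # label chosen by the administrator
    engine: str    # e.g. "mysql" or "mssql"
    host: str      # IP address or host name of the server
    username: str  # account used for read access


class SourceRegistry:
    """Keeps track of the data sources that feed the warehouse."""

    def __init__(self):
        self._sources = {}

    def add(self, source):
        # Refuse to silently overwrite an existing registration.
        if source.name in self._sources:
            raise ValueError("source already registered: " + source.name)
        self._sources[source.name] = source

    def remove(self, name):
        del self._sources[name]

    def names(self):
        return sorted(self._sources)


# Placeholder servers loosely modelled on the three sources in Figure 1.
registry = SourceRegistry()
registry.add(DataSource("library", "mysql", "10.0.0.11", "reader"))
registry.add(DataSource("exam", "mssql", "10.0.0.12", "reader"))
registry.add(DataSource("admin", "mysql", "10.0.0.13", "reader"))
registry.remove("admin")
registered = registry.names()
```

Because sources are keyed by name, a new server can be registered at any time without touching the entries that already exist.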
III. DATA SETS
Data sets are composed of data from databases and from legacy sources, which can be any other source. Data warehouses and data marts are designed to assist individuals and organizations in managing and extracting meaning from enormous amounts of data. Data mining is used to analyze these data sets and to predict and forecast future trends. Data warehouses and data marts are used to store and analyse historical data in order to make better choices and estimates about the future. The purpose of many of these activities and approaches is to relate data sets to each other, group related data together [5], and ensure that users can access the data they need.
3.1 Data in Databases
Data from different databases is extracted and placed into the warehouse. The data resides in heterogeneous [3][4] databases such as MySQL, MSSQL, etc.
3.2 Legacy Sources
Legacy sources such as text files also form a source of data. Data from these sources can be extracted and placed into the warehouse.
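To make the two kinds of source concrete, the sketch below reads rows from a database table and from a delimited text file into the same list-of-dictionaries form, so that later steps need not care where a row came from. It is an illustration under stated assumptions, not the authors' code: an in-memory SQLite database stands in for MySQL/MSSQL, the legacy file is assumed to be comma-separated with a header line, and the function names and sample values are invented.

```python
import csv
import io
import sqlite3


def rows_from_database(conn, table):
    """Read every row of `table` as a dict; the source is only queried."""
    # Note: this changes the row format for later cursors on `conn` as well.
    conn.row_factory = sqlite3.Row
    # Table names cannot be bound as parameters, so quote the identifier.
    quoted = '"{}"'.format(table.replace('"', '""'))
    cursor = conn.execute("SELECT * FROM " + quoted)
    return [dict(row) for row in cursor.fetchall()]


def rows_from_text(text, delimiter=","):
    """Read a delimited legacy export (header line first) as dicts."""
    reader = csv.DictReader(io.StringIO(text), delimiter=delimiter)
    return [dict(row) for row in reader]


# Stand-in database source (SQLite instead of MySQL/MSSQL).
db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE student (roll_no TEXT, name TEXT)")
db.execute("INSERT INTO student VALUES ('101', 'Asha')")

# Stand-in legacy source: the contents of a text file.
legacy = "roll_no,name\n102,Bilal\n"

combined = rows_from_database(db, "student") + rows_from_text(legacy)
```

Values read from the text file arrive as strings, so any type conversion would have to happen in a later cleaning step.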
IV. DESIGN
The detailed design is described diagrammatically below. The DFD lists the n modules required for the creation of the warehouse. The local data source is where the data warehouse will be created from n heterogeneous databases, e.g. Oracle, MySql, MsSql, etc. The local data source itself can be Oracle, MySql, MsSql, Db2, etc., depending upon the system requirements. The modules are required for listing the various data sources and subsequently retrieving data from them. Data is only read from the data sources; the tool is not allowed to modify the contents of the OLAPs/transactional servers, which continue to work for their designated purpose.
V. IMPLEMENTATION
The solution has two distinct parts:
5.1 Creation of Warehouse:
[1] Identification and addition of servers based on IP address, username and password.
[2] View the list of database servers added.
[3] Remove database servers from the list.
[4] Add source, where the sources are the tables required for the creation of the warehouse.
[5] Create Warehouse: associate a name with the data warehouse to be created and select a primary table. The primary table is the table from which the creation starts.
At the back end, the data warehouse is created along with the fact table and the dimension tables. The dimension tables are populated by extracting and cleaning information from the tables added at step [4]. Once the data has been fetched into the dimension tables, data matching the structure of the fact table is fetched and stored in the fact table.
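The order described above (dimensions first, then the fact table) can be sketched as follows. This is a minimal illustration, not the tool's actual algorithm: the paper does not specify how the schema is derived, so the table and column names are fixed by hand here, SQLite stands in for the local warehouse database, "cleaning" is reduced to whitespace and letter-case normalisation, and the sample rows are invented.

```python
import sqlite3


def clean(value):
    """Normalise a text value: collapse whitespace and unify letter case."""
    return " ".join(str(value).split()).title()


def build_warehouse(source_rows):
    """Create dimension tables first, then a fact table that references them."""
    wh = sqlite3.connect(":memory:")
    wh.executescript(
        """
        CREATE TABLE dim_student (student_id INTEGER PRIMARY KEY, name TEXT UNIQUE);
        CREATE TABLE dim_subject (subject_id INTEGER PRIMARY KEY, title TEXT UNIQUE);
        CREATE TABLE fact_result (
            student_id INTEGER REFERENCES dim_student(student_id),
            subject_id INTEGER REFERENCES dim_subject(subject_id),
            marks INTEGER
        );
        """
    )
    # Step 1: populate the dimensions with cleaned, de-duplicated values.
    for row in source_rows:
        wh.execute("INSERT OR IGNORE INTO dim_student (name) VALUES (?)",
                   (clean(row["name"]),))
        wh.execute("INSERT OR IGNORE INTO dim_subject (title) VALUES (?)",
                   (clean(row["subject"]),))
    # Step 2: populate the fact table by looking up the dimension keys.
    for row in source_rows:
        wh.execute(
            """
            INSERT INTO fact_result (student_id, subject_id, marks)
            SELECT s.student_id, j.subject_id, ?
            FROM dim_student AS s, dim_subject AS j
            WHERE s.name = ? AND j.title = ?
            """,
            (int(row["marks"]), clean(row["name"]), clean(row["subject"])),
        )
    wh.commit()
    return wh


# Invented sample rows with inconsistent spacing and capitalisation.
source_rows = [
    {"name": "asha  khan", "subject": "physics", "marks": "78"},
    {"name": "Asha Khan", "subject": "Chemistry", "marks": "81"},
    {"name": "bilal dar", "subject": "Physics", "marks": "64"},
]
warehouse = build_warehouse(source_rows)
student_count = warehouse.execute("SELECT COUNT(*) FROM dim_student").fetchone()[0]
fact_count = warehouse.execute("SELECT COUNT(*) FROM fact_result").fetchone()[0]
```

Filling the dimensions before the fact table matters because each fact row stores only the keys of its dimension rows, and those keys do not exist until the dimension rows have been inserted.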
5.2 Use of Warehouse:
Once the warehouse has been created for the enterprise, it is made functional:
[1] The GUI accepts a search string from the user.
[2] The search is made on the fact table.
[3] The primary information extracted from the fact table is displayed.
[4] Depending upon the requirement, the user can view the information of his/her choice.
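A keyword search of this kind might look like the sketch below. The paper gives no details of the query, so everything here is an assumption: a tiny hand-built warehouse in SQLite, a fact table joined to one dimension, and a case-insensitive substring match via LIKE. The search string is passed as a bound parameter rather than pasted into the SQL text.

```python
import sqlite3

# A tiny hand-built warehouse (invented schema and sample data).
wh = sqlite3.connect(":memory:")
wh.executescript(
    """
    CREATE TABLE dim_student (student_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE fact_result (student_id INTEGER, subject TEXT, marks INTEGER);
    INSERT INTO dim_student VALUES (1, 'Asha Khan'), (2, 'Bilal Dar');
    INSERT INTO fact_result VALUES
        (1, 'Physics', 78), (1, 'Chemistry', 81), (2, 'Physics', 64);
    """
)


def search(conn, text):
    """Return fact rows whose student name or subject contains `text`."""
    # '%' and '_' typed by the user act as LIKE wildcards in this sketch.
    pattern = "%" + text + "%"
    return conn.execute(
        """
        SELECT s.name, f.subject, f.marks
        FROM fact_result AS f
        JOIN dim_student AS s ON s.student_id = f.student_id
        WHERE s.name LIKE ? OR f.subject LIKE ?
        ORDER BY s.name, f.subject
        """,
        (pattern, pattern),
    ).fetchall()


hits = search(wh, "phys")
```

The rows returned correspond to the "primary information" of step [3]; a further query keyed on the selected row would serve the drill-down of step [4].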
VI. CONCLUSION
Data warehouse creation methodologies are available to enhance the quality of the data at several stages in the process of developing a data warehouse. Data warehouse creation tools can be useful in automating many of the activities involved in cleansing the data: parsing, standardizing, correction, matching, transformation, etc. Many of the tools specialize in auditing the data, detecting patterns in the data, and comparing the data to business rules [4]. Once these problems have been isolated, the warehouse builder can determine which features of the data quality tools address the specific needs of the data sources to be used. The association of the data quality tool with every step of the data cleansing process plays a very important role in ensuring that the sanity of the data is protected at every stage.
REFERENCES
[1]. EMA Butt, SMK Quadri, EM Zaman, “Star Schema Implementation for Automation of Examination Records”, WORLDCOMP, 2012 FEC3568, Las Vegas, USA.
[2]. MA Butt, SMK Quadri, M Zaman, “Data Warehouse Implementation of Examination Databases”, International Journal of Computer Applications 44 (5), 18-23.
[3]. M Zaman, MA Butt, DSMK Quadri, “User Desired Information Translation”, Journal of Global Research in Computer Science 3 (6), 51-53.
[4]. Neill, P., “It’s a Dirty Job: Cleaning Data in the Warehouse”, Gartner Group, January 12, 1998.
[5]. EMA Butt, SMK Quadri, EM Zaman, “Star Schema Implementation for Automation of Examination Records”, WORLDCOMP, 2012 FEC3568, Las Vegas, USA.
[6]. Zaman, Majid, S. MK Quadri, and Muheet Ahmed Butt, “Generic Search Optimization for Heterogeneous Data Sources”, International Journal of Computer Applications 44 (5), 2012, 14-17.
[7]. EM Zaman, EMA Butt, “Information Integration: An Enterprise Solution”, International Journal of Engineering Science and Innovative Technology (IJESIT), Volume 2, Issue 1, January 2013, 315-317.

 
AWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of TerraformAWS Community Day CPH - Three problems of Terraform
AWS Community Day CPH - Three problems of Terraform
 
CNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In PakistanCNIC Information System with Pakdata Cf In Pakistan
CNIC Information System with Pakdata Cf In Pakistan
 
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data DiscoveryTrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
TrustArc Webinar - Unlock the Power of AI-Driven Data Discovery
 
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
Apidays New York 2024 - APIs in 2030: The Risk of Technological Sleepwalk by ...
 
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin WoodPolkadot JAM Slides - Token2049 - By Dr. Gavin Wood
Polkadot JAM Slides - Token2049 - By Dr. Gavin Wood
 

J0212065068

Research Inventy: International Journal of Engineering and Science
Vol. 2, Issue 12 (May 2013), pp. 65-68
ISSN(e): 2278-4721, ISSN(p): 2319-6483, www.researchinventy.com

Warehouse Creator: A Generic Enterprise Solution

Majid Zaman (1), Muheet Ahmed Butt (2)
(1) Scientist, Directorate of IT & SS, University of Kashmir, J&K, India
(2) Scientist, PG Department of Computer Science, University of Kashmir, Srinagar, J&K, India

Abstract: With the advent of the 21st century, enterprises developed warehousing solutions, and the early solutions were such that multiple warehouses were created within a single enterprise. Though data warehousing has become a benchmark of data integration, no generic tool that cuts across warehouse type, complexity, and enterprise has yet been proposed. In this paper, we propose a tool that can create a warehouse irrespective of type, complexity, enterprise work, etc. The proposed tool creates the dimension tables and the fact table by evaluating the data present in various data sets (data sources). The user/administrator intervention required is minimal; however, access to the data sources is essential.

Keywords: OLAP, Data Warehouse, Enterprise, Data Sources

I. INTRODUCTION
The 1980s and 1990s were an era of automating activities to ease industry and office work. In this era the primary goal was the digitization of data, and little thought was given to the integration and centralization of data and resources. With the advent of the 21st century, industry realized that enterprises had multiple heterogeneous [3][1][2] data sources and allied infrastructures. In 1996 Ralph Kimball published the book "The Data Warehouse Toolkit", but it was not until 2000 onwards that people realized the essence of warehousing and its implementation began to emerge. The motivation behind this project is the everyday data crisis.
Every now and then, enterprises across the globe spend millions of rupees to manage and integrate data; yet there is no single industry-wide or enterprise-wide data integration tool or regulation, and enterprises are still in the process of standardizing data integration using the warehouse as a toolkit. Figure 1 illustrates the existing enterprise solution, in which n heterogeneous data sources are interconnected via the internet or an intranet, as the case may be.

[Figure 1: Warehouse Enterprise Solution. Labels: Library OLAP (MySQL, Windows); Exam OLAP (MS SQL, Windows); Admin OLAP (MySQL, Linux); Switch (wired and wireless).]

An enterprise user has to access each data source in order to retrieve the information of his/her choice. The problems encountered by the enterprise user are listed below.
[1] A user interested in information retrieval is required to know a query language in order to access information from a data source.
[2] The user should be well aware of the database architecture.
[3] The user should be well versed with the information sources, i.e. where the information of his/her interest is saved.
[4] Access rights of enterprise users are an area of concern.
[5] The geographical distance of database servers can become a cause for concern, besides other issues.

Data warehousing is the most feasible solution to the problems listed above, among others. In addition, enterprise users can be given web-based access to retrieve the information of their interest without having the right to update any piece of data/information in any data source.

II. PROBLEM STATEMENT
In order to develop a data warehousing tool that can integrate data and create a fully functional warehouse, the following requirements have to be fulfilled:
[1] The tool should have the ability to create a warehouse from existing OLAPs.
[2] The solution can be installed in any enterprise and a warehouse can be created.
[3] Data is extracted from heterogeneous sources.
[4] The network is both wired and wireless.
[5] New data sources can be added at any time and incorporated into the warehouse.
[6] A web-based interface has to be provided with the warehouse in order to give users keyword-based information retrieval access.
[7] Warehouse creation should not manipulate table data on the remote server.

III. DATA SETS
Data sets are composed of data from databases and from legacy sources, which can be any other source. Data warehouses and data marts are designed to assist individuals and organizations in managing and extracting meaning from enormous amounts of data. Data mining is used to analyze these data sets, predict future trends, and produce forecasts.
Data warehouses and data marts are used to store and analyse historical data in order to make better choices and estimates about the future. The purpose of many of these activities and approaches is to relate data sets to each other, group related data together [5], and ensure that users can access the data they need.

3.1 Data in Databases
Data from different databases is extracted and placed into the warehouse. The data resides in [3][4] heterogeneous databases such as MySQL, MS SQL, etc.

3.2 Legacy Sources
Legacy sources such as text files also form a source of data. Data from these sources can be extracted and placed into the warehouse.

IV. DESIGN
The detailed design is described diagrammatically below. The DFD shown below lists the n modules required for creation of the warehouse. The local data source is where the data warehouse will be created from n heterogeneous databases, e.g. Oracle, MySQL, MS SQL, etc. The local data source itself can be Oracle, MySQL, MS SQL, DB2, etc., depending upon system requirements. The modules listed below are required for registering the various data sources and subsequently retrieving data from them. Data is only read from the data sources; the tool is not allowed to modify the contents of the OLAPs/transactional servers, which continue to work for their designated purpose.
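The read-only extraction described above, covering both database sources and legacy text sources, can be illustrated with a minimal sketch. It is not the authors' implementation: in-memory SQLite stands in for a remote MySQL or MS SQL server, the legacy source is assumed to be a comma-separated text file, and every name used (rows_from_database, rows_from_text, exam_db, library_text, staged) is hypothetical.

```python
import csv
import io
import sqlite3


def rows_from_database(conn, table):
    """Read every row of `table` as a dict. Only a SELECT is issued,
    so the source data is never modified."""
    # The table name is assumed to come from a trusted administrator.
    cur = conn.execute(f'SELECT * FROM "{table}"')
    columns = [d[0] for d in cur.description]
    return [dict(zip(columns, row)) for row in cur]


def rows_from_text(text):
    """Read a comma-separated legacy text source as a list of dicts."""
    return list(csv.DictReader(io.StringIO(text)))


# Hypothetical database source (stand-in for a remote MySQL/MS SQL server).
exam_db = sqlite3.connect(":memory:")
exam_db.execute("CREATE TABLE results (roll_no TEXT, course TEXT, marks INTEGER)")
exam_db.executemany(
    "INSERT INTO results VALUES (?, ?, ?)",
    [("R001", "CS101", 78), ("R002", "CS101", 64)],
)

# Hypothetical legacy source: the contents of a comma-separated text file.
library_text = "roll_no,book\nR001,Database Systems\nR002,Compilers\n"

# Staging area: rows from every source in one common shape (list of dicts).
staged = {
    "exam.results": rows_from_database(exam_db, "results"),
    "library.loans": rows_from_text(library_text),
}
```

Bringing every source into one common row shape before loading is one possible design; it keeps the later warehouse-building step independent of the database engine behind each source.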
V. IMPLEMENTATION
The solution has two distinct parts:

5.1 Creation of Warehouse
[1] Identify and add servers based on:
    - IP address
    - username
    - password
[2] View the list of database servers added.
[3] Remove database servers from the list.
[4] Add sources, where sources are the tables required for the creation of the warehouse.
[5] Create warehouse: associate a name with the data warehouse to be created and select a primary table. The primary table is the table from which the creation starts. In the back end, the data warehouse is created along with the fact table and dimension tables. Dimension tables are populated by extracting and cleaning information from the tables added at step [4]. Once the data is fetched into the dimension tables, data matching the structure of the fact table is fetched and stored in the fact table.

5.2 Use of Warehouse
Once the warehouse is created for the enterprise, it is made functional.

[Figure]

[1] The GUI accepts a search string from the user.
[2] The search is made on the fact table.
[3] Primary information extracted from the fact table is displayed.
[4] Depending upon the requirement, the user can view the information of his/her choice.
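The creation and use steps above can be sketched as follows. This is an illustrative sketch under stated assumptions rather than the tool described in the paper: the warehouse is an in-memory SQLite database, the source rows are assumed to have been extracted already, cleaning is reduced to trimming whitespace, and all table, column, and function names (dim_student, dim_course, fact_result, search) are hypothetical. The search joins the fact table to its dimensions so that a keyword can match descriptive text; the paper only states that the search is made on the fact table.

```python
import sqlite3

# Hypothetical rows already extracted from the tables added as sources.
students = [("R001", "Aisha"), (" R002 ", "Bilal ")]
courses = [("CS101", "Databases"), ("CS102", "Networks")]
# Rows of the primary table, from which warehouse creation starts.
results = [("R001", "CS101", 78), ("R002", "CS101", 64), ("R001", "CS102", 81)]

wh = sqlite3.connect(":memory:")
wh.executescript("""
CREATE TABLE dim_student (student_key INTEGER PRIMARY KEY,
                          roll_no TEXT UNIQUE, name TEXT);
CREATE TABLE dim_course  (course_key INTEGER PRIMARY KEY,
                          code TEXT UNIQUE, title TEXT);
CREATE TABLE fact_result (student_key INTEGER REFERENCES dim_student(student_key),
                          course_key  INTEGER REFERENCES dim_course(course_key),
                          marks INTEGER);
""")

# Populate the dimension tables first, with light cleaning (trim whitespace).
wh.executemany(
    "INSERT OR IGNORE INTO dim_student (roll_no, name) VALUES (?, ?)",
    [(roll.strip(), name.strip()) for roll, name in students],
)
wh.executemany(
    "INSERT OR IGNORE INTO dim_course (code, title) VALUES (?, ?)",
    [(code.strip(), title.strip()) for code, title in courses],
)

# Then load the fact table, resolving each row to its dimension keys.
for roll_no, code, marks in results:
    wh.execute(
        "INSERT INTO fact_result "
        "SELECT s.student_key, c.course_key, ? "
        "FROM dim_student s, dim_course c "
        "WHERE s.roll_no = ? AND c.code = ?",
        (marks, roll_no, code),
    )


def search(term):
    """Keyword search over the fact table joined to its dimensions."""
    pattern = f"%{term}%"
    return wh.execute(
        "SELECT s.name, c.title, f.marks FROM fact_result f "
        "JOIN dim_student s ON s.student_key = f.student_key "
        "JOIN dim_course c ON c.course_key = f.course_key "
        "WHERE s.name LIKE ? OR s.roll_no LIKE ? OR c.title LIKE ? "
        "ORDER BY s.name, c.title",
        (pattern, pattern, pattern),
    ).fetchall()
```

For example, search("Aisha") returns one row per fact recorded for that student, each carrying the student name, course title, and marks.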
VI. CONCLUSION
Data warehouse creation methodologies are available to enhance the quality of the data at several stages in the process of developing a data warehouse. Data warehouse creation tools can be useful in automating many of the activities involved in cleansing the data: parsing, standardizing, correction, matching, transformation, etc. Many of the tools specialize in auditing the data, detecting patterns in the data, and comparing the data to business rules [4]. Once these problems have been isolated, the warehouse builder can determine which features of the data quality tools address the specific needs of the data sources to be used. Associating a data quality tool with every step of the data cleansing process plays a very important role in ensuring that the sanity of the data is protected at every stage.

REFERENCES
[1]. EMA Butt, SMK Quadri, EM Zaman, "Star Schema Implementation for Automation of Examination Records", WORLDCOMP, 2012 FEC3568, Las Vegas, USA.
[2]. MA Butt, SMK Quadri, M Zaman, "Data Warehouse Implementation of Examination Databases", International Journal of Computer Applications 44 (5), 18-23.
[3]. M Zaman, MA Butt, DSMK Quadri, "User Desired Information Translation", Journal of Global Research in Computer Science 3 (6), 51-53.
[4]. Neill, P. (1998). "It's a Dirty Job: Cleaning Data in the Warehouse", Gartner Group, January 12, 1998.
[5]. EMA Butt, SMK Quadri, EM Zaman, "Star Schema Implementation for Automation of Examination Records", WORLDCOMP, 2012 FEC3568, Las Vegas, USA.
[6]. Zaman, Majid, S. MK Quadri, and Muheet Ahmed Butt. "Generic Search Optimization for Heterogeneous Data Sources." International Journal of Computer Applications 44.5 (2012): 14-17.
[7]. EM Zaman, EMA Butt, "Information Integration: An Enterprise Solution", International Journal of Engineering Science and Innovative Technology (IJESIT), Volume 2, Issue 1, January 2013: 315-317.