Shikha Gautam
Asst.Professor
 Data Warehouse Process and Technology: Warehousing
Strategy, Warehouse management and Support Processes.
 Warehouse Planning and Implementation.
 Hardware and Operating Systems for Data Warehousing,
Client/Server Computing Model & Data Warehousing, Parallel
Processors & Cluster Systems, Distributed DBMS implementations.
 Warehousing Software, Warehouse Schema Design.
 Data Extraction, Cleanup & Transformation Tools,
Warehouse Metadata
“Storage or warehousing provides the place
utility as part of logistics for any business and
along with Transportation is a critical
component of customer service standards”.
 To support the company’s customer policy.
 To maintain a source of supply without interruptions.
 To support changing market conditions and sudden
changes in demand.
 To provide customers with the right mix of products
at all times and all locations.
 To ensure least logistics cost for a desired level of
customer service.
 More cost-effective decision making.
 Better enterprise intelligence: Increasing
quality and flexibility of enterprise analysis.
 Enhanced customer service.
 Business re-engineering: Knowing what
information is important provides direction and
priority for re-engineering efforts.
 Information system re-engineering.
 Private warehouses: A private warehouse is a storage facility
that is mostly owned by big companies or single manufacturing
units. It is also known as a proprietary warehouse.
 Public warehouses: A public warehouse is a facility that stores
inventory for many different businesses, as opposed to a
"private warehouse".
 Contract warehouses: A contract warehouse handles the
shipping, receiving and storage of goods on a contract
basis. This type of warehouse usually requires a client to
commit to services for a particular period of time.
 An integrated warehouse strategy focuses on two
questions:
1. How many warehouses should be employed?
2. Which warehouse types should be used to meet
market requirements?
 Many firms utilize a combination of private, public,
and contract facilities.
 It involves the following activities:
1. Establish sponsorship.
2. Identify enterprise needs.
3. Determine measurement cycle.
4. Validate measures.
5. Design data warehouse architecture.
6. Apply appropriate technologies.
7. Implement the data warehouse.
1. Establish sponsorship: Establishing the right
sponsorship chain helps ensure successful development
and implementation. The sponsorship chain should include
a data warehousing manager and two key individuals.
2. Identify enterprise needs: Interviews with key
enterprise managers and analysis of other pertinent
documentation are techniques used to determine
enterprise needs.
3. Determine measurement cycle: Describe
the cycles or time periods used for each measure. Are
quarters, months or hours appropriate to capture
useful measurement data? Is historical data
needed?
4. Validate measures: After determining and
identifying enterprise needs, it is necessary to
perform a “reality check” on them. The feedback is used to
refine the measures.
5. Design data warehouse architecture: This activity
involves active user participation in facilitated design
sessions.
6. Apply appropriate technologies: The enterprise selects
technologies and addresses key technology issues, security policies, etc.
7. Implement the data warehouse: Loading preliminary
data, designing the user interface, developing standard
queries and reports, etc.
There are four major processes that build a data
warehouse:
1. Extract and load data: Data extraction takes data
from the source systems. Data load takes the
extracted data and loads it into the data warehouse.
It involves:
 Controlling the process: Determining when to start data
extraction. It ensures that the tools, the logic modules,
and the programs are executed in the correct sequence and at
the correct time.
 When to initiate extract: The data warehouse should
represent a single, consistent version of the information to
the user. The data therefore needs to be in a consistent state
when extraction starts.
 Loading the data: Data is loaded into a temporary data
store where it is cleaned up and made consistent.
2. Cleaning and transforming the data: Clean
and transform the loaded data into a structure,
partition the data, and aggregate it.
3. Backup and archive the data: In order to recover the
data in the event of data loss, software failure, or
hardware failure, it is necessary to keep regular
backups.
4. Managing queries and directing them to the
appropriate data sources: This process manages the queries,
helps speed up their execution, and directs
them to their most effective data sources.
It ensures that all the system resources are used in the
most effective way and monitors actual query profiles.
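The idea of directing a query to its most effective data source can be sketched as follows. This is only an illustration under stated assumptions: the catalogue, the table names `sales_detail` and `sales_by_month`, and the rule "pick the smallest table that holds every requested column" are hypothetical and not taken from the slides.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TableInfo:
    """Hypothetical catalogue entry for a detail or summary table."""
    name: str
    columns: frozenset
    row_count: int


def route_query(requested_columns, catalogue):
    """Return the smallest table that holds all requested columns."""
    needed = frozenset(requested_columns)
    candidates = [t for t in catalogue if needed <= t.columns]
    if not candidates:
        raise LookupError(f"no table covers columns {sorted(needed)}")
    return min(candidates, key=lambda t: t.row_count)


# Illustrative catalogue: one detail table and one monthly aggregate.
CATALOGUE = [
    TableInfo("sales_detail",
              frozenset({"order_id", "month", "region", "amount"}), 5_000_000),
    TableInfo("sales_by_month",
              frozenset({"month", "region", "amount"}), 1_200),
]

print(route_query(["month", "amount"], CATALOGUE).name)  # sales_by_month
print(route_query(["order_id"], CATALOGUE).name)         # sales_detail
```

A query that needs only monthly totals is answered from the small aggregate table, while a query that needs order-level columns falls back to the detail table.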
 A warehouse management system (WMS) is a
software application designed to support and
optimize warehouse or distribution center
management.
 Such systems facilitate management in daily planning,
organizing, staffing, directing, and controlling the
utilization of available resources to move and store
materials into, within, and out of a warehouse, while
supporting staff in the performance of material
movement and storage in and around a warehouse.
1. Load management: Relates to the collection of
information from internal or external sources.
Loading process includes summarizing,
manipulating and changing the data structures
into a format that lends itself to analytical
processing.
2. Warehouse Management: The management
tasks include ensuring its availability, the
effective backup of its contents, and its security.
3. Query management: Relates to the provision of
access to the contents of the warehouse and may
include the partitioning of information into
different areas with different privileges for
different users.
Access may be provided through custom-built
applications or ad hoc query tools.
 Includes loading preliminary data, implementing
transformation programs, designing the user interface,
developing standard queries and reports, and
training warehouse users.
[Figure: implementation steps: ETL, design user interface, develop standard queries, train users]
The process of extracting data from source systems and
bringing it into the data warehouse is commonly
called ETL, which stands for:
 Extraction: retrieving all the required data from the
source system using as few resources as possible,
 Transformation, and
 Loading.
 Ways to perform the extract:
 Update notification: If the source system is able to
provide a notification that a record has been changed,
this is the easiest way to get the data.
 Incremental extract: Some source systems cannot send
notifications but are able to identify which records have
been modified and provide an extract of such records. With
a daily incremental extract, deleted records may not be
handled properly.
 Full extract: The full extract requires keeping a copy of
the last extract in the same format in order to be able to
identify changes. It handles deletions as well.
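The full-extract approach can be illustrated with a minimal sketch that compares the previous extract with the current one. It assumes, purely for illustration, that each extract fits in memory as a dictionary keyed by primary key; the function name `diff_extracts` and the sample records are hypothetical.

```python
def diff_extracts(previous, current):
    """Compare two full extracts keyed by primary key.

    Both arguments map primary key -> record (any comparable value).
    Returns (inserted, updated, deleted) as sorted lists of keys.
    """
    prev_keys = set(previous)
    curr_keys = set(current)
    inserted = sorted(curr_keys - prev_keys)
    deleted = sorted(prev_keys - curr_keys)
    updated = sorted(k for k in prev_keys & curr_keys
                     if previous[k] != current[k])
    return inserted, updated, deleted


# Hypothetical extracts from two consecutive days.
yesterday = {1: ("Asha", "Delhi"), 2: ("Ravi", "Pune"), 3: ("Meena", "Agra")}
today = {1: ("Asha", "Delhi"), 2: ("Ravi", "Mumbai"), 4: ("Kiran", "Goa")}

inserted, updated, deleted = diff_extracts(yesterday, today)
print(inserted, updated, deleted)  # [4] [2] [3]
```

Because the previous copy is kept, a key that is missing from the current extract shows up as a deletion, which is the case an incremental extract may miss.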
2. Clean: Ensures the quality of the data in the data
warehouse.
3. Transform: Applies a set of rules to transform the
data from the source to the target.
This includes converting measured data to the same
dimension, using the same units, so that it can later be joined.
It also requires joining data from several sources,
generating aggregates, generating surrogate keys,
sorting, and deriving new calculated values.
4. Load: Ensures that the load is performed correctly
and using as few resources as possible. The target of the
load process is often a database. Referential integrity
needs to be maintained by the ETL tool to ensure consistency.
5. Managing the ETL process:
The ETL process may fail. This can be
caused by missing values in one of the reference tables, or
simply by a connection or power outage. It is necessary to
design the ETL process with fail-recovery in mind.
6. Staging:
A staging area or landing zone is an intermediate storage
area used for data processing during the ETL process.
Primary motivations for their use are to increase
efficiency of ETL processes, ensure data integrity and
support data quality operations.
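The staging, transform, load and fail-recovery points above can be sketched together using Python's built-in `sqlite3` module. This is a minimal sketch under stated assumptions: the table names (`staging_shipment`, `dim_product`, `fact_shipment`), the columns and the pound-to-kilogram rule are hypothetical and chosen only to show unit conversion, surrogate keys and an all-or-nothing load.

```python
import sqlite3

LB_TO_KG = 0.45359237  # one avoirdupois pound in kilograms


def run_etl(conn, source_rows):
    """Stage raw rows, convert weights to kg, and load in one transaction."""
    conn.executescript("""
        CREATE TABLE IF NOT EXISTS staging_shipment (
            product_code TEXT, weight REAL, unit TEXT);
        CREATE TABLE IF NOT EXISTS dim_product (
            product_key INTEGER PRIMARY KEY AUTOINCREMENT,  -- surrogate key
            product_code TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS fact_shipment (
            product_key INTEGER NOT NULL REFERENCES dim_product(product_key),
            weight_kg REAL NOT NULL);
    """)
    with conn:  # commits on success, rolls back if any step raises
        # Staging area: replace the previous batch with the new raw rows.
        conn.execute("DELETE FROM staging_shipment")
        conn.executemany(
            "INSERT INTO staging_shipment VALUES (?, ?, ?)", source_rows)
        # Generate surrogate keys for product codes not seen before.
        conn.execute("""
            INSERT OR IGNORE INTO dim_product (product_code)
            SELECT DISTINCT product_code FROM staging_shipment""")
        # Convert to a common unit; the join guarantees that every fact
        # row points at an existing dimension row.
        conn.execute("""
            INSERT INTO fact_shipment (product_key, weight_kg)
            SELECT d.product_key,
                   CASE s.unit WHEN 'lb' THEN s.weight * ? ELSE s.weight END
            FROM staging_shipment AS s
            JOIN dim_product AS d USING (product_code)""", (LB_TO_KG,))


conn = sqlite3.connect(":memory:")
run_etl(conn, [("A1", 10.0, "kg"), ("B2", 2.0, "lb"), ("A1", 5.0, "kg")])
print(conn.execute("SELECT COUNT(*) FROM fact_shipment").fetchone()[0])  # 3
```

If a batch contains a bad row (for example a missing weight), the `NOT NULL` constraint raises `sqlite3.IntegrityError` and the whole batch is rolled back, so the warehouse tables stay as they were before the failed run.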
 Commercial tools : Ab Initio, IBM InfoSphere
DataStage, Informatica, Oracle Data
Integrator and SAP Data Integrator.
 Open source ETL tools: CloverETL, Apatar,
Pentaho and Talend.
 Data warehouses come in all shapes and sizes,
which directly affects the cost and
time involved.
 The steps listed below summarize some of
the points to consider:
 Get professional advice
 Plan the data
 Identify who will use the data warehouse
 Plan integration with external applications
The key steps in developing a data warehouse can
be summarized as follows:
 Project initiation
 Requirements analysis
 Design (architecture, databases and applications)
 Construction (selecting and installing tools,
developing data feeds and building reports)
 Deployment (release & training)
 Maintenance
 The client/server model applies to the software architecture that
describes processing between applications and supporting services.
 It represents distributed, cooperative processing; the
relationship between client and server is a relationship
between both hardware and software components.
 It covers a wide range of functions, services and other
aspects of the distributed environment.
 Host-based application processing is performed on one
computer system with attached unintelligent, “dumb”
terminals.
 A single stand-alone PC or an IBM mainframe with
attached character-based display terminals is an example
of a host-based processing environment.
 Host-based processing is totally non-distributed.
 Slave computers are attached to master computer and
perform application-processing-related functions only as
directed by their master.
 Distribution of processing tends to be unidirectional,
from master to slaves.
 Slaves are capable of some limited local application
processing.
 E.g. Mainframe (host) computer, such as IBM 3090 used
with cluster controllers and intelligent terminals.
 This generation used two models:
1. Shared-device LAN processing environment: PCs
are attached to a system device that allows these
PCs to share a common resource, such as a file on a
hard disk (file server) or a printer (print server).
E.g. Microsoft’s LAN Manager, which allows a LAN
to have a system dedicated to file and print services.
2. Client/server LAN processing environment:
An extension of shared-device processing.
E.g. SYBASE SQL Server.
An application running on a PC sends a read request
to its database server. The database server processes it
locally and sends only the requested records to the PC
application.
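The difference between the two models can be shown with a small simulation. It is a sketch only: the in-memory `RECORDS` list stands in for a file or table on the server, and the function names are hypothetical.

```python
# 1000 hypothetical customer records held on the server; every tenth is in Delhi.
RECORDS = [{"id": i, "city": "Delhi" if i % 10 == 0 else "Pune"}
           for i in range(1000)]


def file_server_read():
    """Shared-device model: the whole file crosses the network."""
    return list(RECORDS)


def database_server_read(predicate):
    """Client/server model: the server filters and ships only matches."""
    return [r for r in RECORDS if predicate(r)]


def wanted(record):
    return record["city"] == "Delhi"


shipped_by_file_server = file_server_read()          # client must filter these
shipped_by_db_server = database_server_read(wanted)  # filtered on the server

print(len(shipped_by_file_server), len(shipped_by_db_server))  # 1000 100
```

In the shared-device case the client receives all 1000 records and filters them itself; in the client/server case only the 100 matching records are sent.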
 Moves from two-tiered architecture to multi-tiered
architecture.
 The computing model deals with servers
dedicated to application, data, transaction
management and system management.
 Supports data structures ranging from relational to
multidimensional to multimedia.
 A distributed database system consists of
loosely coupled sites that share no physical
component.
 Database systems that run on each site are
independent of each other.
 Transactions may access data at one or more
sites.
 In a homogeneous distributed database
 All sites have identical software
 Are aware of each other and agree to cooperate in processing
user requests.
 Each site surrenders part of its autonomy in terms of right to
change schemas or software
 Appears to user as a single system
 In a heterogeneous distributed database
 Different sites may use different schemas and software
▪ Difference in schema is a major problem for query processing
▪ Difference in software is a major problem for transaction processing
 Sites may not be aware of each other and may provide only
limited facilities for cooperation in transaction processing
DDBMS architectures are generally developed
depending on three parameters −
 Distribution − It states the physical distribution of
data across the different sites.
 Autonomy − It indicates the distribution of control
of the database system and the degree to which each
constituent DBMS can operate independently.
 Heterogeneity − It refers to the uniformity or
dissimilarity of the data models, system components
and databases.
 Data Replication
 Fragmentation
The three dimensions of distribution
transparency are −
 Location transparency
 Fragmentation transparency
 Replication transparency
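The three kinds of transparency can be illustrated with a toy catalogue that hides where fragments and replicas live. This is a sketch under assumptions: the class `DistributedTable`, the key-range fragmentation rule and the site names are hypothetical, and plain dictionaries stand in for the sites.

```python
class DistributedTable:
    """Toy table that hides fragment placement and replication from callers."""

    def __init__(self, fragment_ranges, placement):
        self.fragment_ranges = fragment_ranges  # fragment -> (low, high), high exclusive
        self.placement = placement              # fragment -> list of site names
        self.sites = {s: {} for names in placement.values() for s in names}

    def _fragment_for(self, key):
        # Fragmentation transparency: callers never see the key ranges.
        for name, (low, high) in self.fragment_ranges.items():
            if low <= key < high:
                return name
        raise KeyError(key)

    def put(self, key, row):
        # Replication transparency: one logical write updates every copy.
        for site in self.placement[self._fragment_for(key)]:
            self.sites[site][key] = row

    def get(self, key, down=frozenset()):
        # Location transparency: the caller never names a site.
        for site in self.placement[self._fragment_for(key)]:
            if site not in down:
                return self.sites[site][key]
        raise ConnectionError("no replica reachable")


customers = DistributedTable(
    fragment_ranges={"low_ids": (0, 1000), "high_ids": (1000, 2000)},
    placement={"low_ids": ["site1", "site2"], "high_ids": ["site3", "site4"]},
)
customers.put(42, "Asha")
customers.put(1500, "Ravi")
print(customers.get(42), customers.get(1500, down={"site3"}))  # Asha Ravi
```

The caller reads and writes by key only; which fragment holds the row, how many copies exist, and which site answers are all decided inside the class.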
[Figure: Sites 1 to 4 connected through a communication network]
 Data warehouse operations mainly consist of huge data loads and
index builds, generation of materialized views, and queries over large
volumes of data. The underlying I/O system of a data warehouse should
be built to meet these heavy requirements.
 Architecture options:
1. Symmetric multiprocessing (SMP): two or more identical
processors are connected to a single, shared main memory.
2. Massively parallel processing (MPP): a large number of processors
perform a set of coordinated computations in parallel.
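The MPP idea of coordinated computation can be sketched as partition, local aggregate, then merge. This is an illustration only: a thread pool stands in for separate processors (CPython threads do not give true CPU parallelism), rows are spread round-robin, and the function names and sample data are hypothetical.

```python
from collections import Counter
from concurrent.futures import ThreadPoolExecutor


def local_totals(rows):
    """Work done by one node on its own partition of (region, amount) rows."""
    totals = Counter()
    for region, amount in rows:
        totals[region] += amount
    return totals


def parallel_totals(rows, node_count=4):
    """Spread rows over nodes, aggregate locally, then merge the partials."""
    partitions = [rows[i::node_count] for i in range(node_count)]
    with ThreadPoolExecutor(max_workers=node_count) as pool:
        partials = list(pool.map(local_totals, partitions))
    merged = Counter()
    for partial in partials:
        merged.update(partial)  # Counter.update adds the counts together
    return dict(merged)


sales = [("north", 10), ("south", 5), ("north", 7), ("east", 3), ("south", 1)]
print(parallel_totals(sales, node_count=2))
```

Each node touches only its own partition, and the only shared step is the final merge of the small partial results.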
 Number of CPUs
 Memory of data warehouse
 Number of Disks
 The server OS determines:
 how quickly the server can fulfill client requests,
 how many clients it can support concurrently and
reliably, and
 how efficiently system resources such as memory,
disk I/O and communication components are
utilized.
 Multiuser Support
 Preemptive multitasking
 Multithreaded Design
 Memory protection: Concurrent tasks should not
violate each other's memory.
 Scalability
 Security
 Reliability
 Availability
 Relatively small and highly secure.
 Simplified architecture, extensibility, portability,
real-time support, robust system security and multiprocessor
support.
 This architecture results in a highly modular OS that
can support multiple OS “personalities” by configuring
outside services as needed.
 For example, the Mach 3.0 microkernel was used by IBM to allow
DOS, OS/2 and AIX to coexist on a single machine.
 Distributed Memory Architecture:
 Shared-Nothing Architecture
 Shared Disk Architecture
[Figure: four processor units (PU), each with its own local memory, connected by an interconnection network]
[Figure: four processor units (PU), each with its own local memory, connected by an interconnection network to a global shared disk subsystem]
 A cluster is a set of loosely coupled SMP machines
connected by a high-speed interconnection
network.
 A cluster behaves just like a single large
machine.

Warehouse Planning and Implementation

  • 2.  Data Warehouse Process and Technology: Warehousing Strategy, Warehouse management and Support Processes.  Warehouse Planning and Implementation.  H/w and O.S. for Data Warehousing, C/Server Computing Model & Data Warehousing, Parallel Processors & Cluster Systems, Distributed DBMS implementations.  Warehousing Software, Warehouse Schema Design.  Data Extraction, Cleanup & Transformation Tools, Warehouse Metadata
  • 3. “Storage or warehousing provides the place utility as part of logistics for any business and along with Transportation is a critical component of customer service standards”.
  • 4.  To support the company’s customer policy.  To maintain a source of supply without interruptions.  To support changing market conditions and sudden changes in demand.  To provide customers with the right mix of products at all times and all locations.  To ensure least logistics cost for a desired level of customer service.
  • 5.  More cost effective decision making.  Better enterprise intelligence: Increasing quality and flexibility of enterprise analysis.  Enhanced customer service.  Business re-engineering: Knowing what information is important provides direction and priority for re-engineering efforts.  Information system re-engineering.
  • 6.  Private warehouses: It is a storage facility that is mostly owned by big companies or single manufacturing units. It is also known as proprietary warehousing.  Public warehouses: It is a facility that stores inventory for many different businesses as opposed to a "private warehouse”.  Contract warehouses: A contract warehouse handles the shipping, receiving and storage of goods on a contract basis. This type of warehouse usually requires a client to commit to services for a particular period of time.
  • 7.  An integrated warehouse strategy focuses on two questions: 1. How many warehouses should be employed. 2. Which warehouse types should be used to meet market requirements.  Many firms utilize a combination of private, public, and contract facilities.
  • 8.  It involves following activities: 1. Establish sponsorship. 2. Identify enterprise needs. 3. Determine measurement cycle. 4. Validate measures. 5. Design data warehouse architecture. 6. Apply appropriate technologies. 7. Implementing data warehouse.
  • 9. 1. Establish sponsorship: Establishing the right sponsorship chain will ensure successful development and implementation. Sponsorship chain should include a data warehousing manager and two key individuals. 2. Identify Enterprise needs: Interview with key enterprise manager and analysis other pertinent documentations are techniques used to determine enterprise needs.
  • 10. 3. Determine measurement cycle: Describing the cycles or time period used for the measure. Are quarters, months or hours are appropriate to capture useful measurement data? Does it need historical data? 4. Validate measures: After determining and identifying enterprise needs, it is necessary to “reality check” of it. The feedback will be used for refining the measures.
  • 11. 5. Design data warehouse architecture: This activity involves active user participation in facilitated design sessions. 6. Apply appropriate technologies: Enterprise selects technology, key technology issues, security policies etc. 7. Implementing data warehouse: Loading preliminary data, designing user interface, developing standard queries and reports etc.
  • 12.
  • 13. There are four major processes that build a data warehouse: 1. Extract and load data: Data extraction takes data from the source systems. Data load takes the extracted data and loads it into the data warehouse. It involves:  Controlling the Process: Determining when to start data extraction. It ensures that the tools, the logic modules, and the programs are executed in correct sequence and at correct time.
  • 14.  When to Initiate Extract: Data warehouse should represent a single, consistent version of the information to the user. So, Data needs to be in a consistent state.  Loading the Data : Data is loaded into a temporary data store where it is cleaned up and made consistent. 2. Cleaning and transforming the data: Clean and transform the loaded data into a structure, Partition the data and Aggregation.
  • 15. 3. Backup and Archive the data: In order to recover the data in the event of data loss, software failure, or hardware failure, it is necessary to keep regular back ups. 4. Managing queries & directing them to the appropriate data sources: Manages the queries, helps speed up the execution time of queries, Directs the queries to their most effective data sources. Ensures that all the system sources are used in the most effective way, Monitors actual query profiles.
  • 16.
  • 17.  A warehouse management system (WMS) is a software application designed to support and optimize warehouse or distribution center management.  It helps management with the daily planning, organizing, staffing, directing, and controlling of available resources to move and store materials into, within, and out of a warehouse, while supporting staff in the performance of material movement and storage in and around a warehouse.
  • 18. 1. Load management: Relates to the collection of information from internal or external sources. The loading process includes summarizing, manipulating, and changing the data structures into a format that lends itself to analytical processing. 2. Warehouse management: The management tasks include ensuring the warehouse's availability, the effective backup of its contents, and its security.
  • 19. 3. Query management: Relates to the provision of access to the contents of the warehouse and may include the partitioning of information into different areas with different privileges for different users. Access may be provided through custom-built applications or ad hoc query tools.
  • 20.
  • 21.  Implementing the data warehouse includes loading preliminary data, implementing transformation programs, designing the user interface, developing standard queries and reports, and training warehouse users.
  • 22. Implementation steps: ETL; design user interface; develop standard queries; train users.
  • 23. The process of extracting data from source systems and bringing it into the data warehouse is commonly called ETL, which stands for:  Extraction: Retrieves all the required data from the source system using as few resources as possible.  Transformation, and  Loading.
  • 24.  Ways to perform the extract:  Update notification: If the source system is able to provide a notification that a record has been changed, this is the easiest way to get the data.  Incremental extract: Some source systems are able to identify which records have been modified and provide an extract of such records. With a daily extract, we may not be able to handle deleted records.  Full extract: The full extract requires keeping a copy of the last extract in the same format in order to be able to identify changes. It handles deletions as well.
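The full-extract approach above depends on comparing the current extract with the saved copy of the previous one. The following is a minimal sketch of that comparison, assuming each extract is held as a dictionary keyed by a record identifier; the function name and the sample records are made up for illustration.

```python
def diff_full_extracts(previous, current):
    """Compare two full extracts, each a dict mapping record key -> field tuple."""
    inserted = {k: current[k] for k in current.keys() - previous.keys()}
    deleted = {k: previous[k] for k in previous.keys() - current.keys()}
    updated = {
        k: current[k]
        for k in current.keys() & previous.keys()
        if current[k] != previous[k]
    }
    return inserted, updated, deleted

# Hypothetical extracts taken on two consecutive days.
previous_extract = {1: ("Alice", "Delhi"), 2: ("Bob", "Pune"), 3: ("Carol", "Agra")}
current_extract = {1: ("Alice", "Delhi"), 2: ("Bob", "Mumbai"), 4: ("Dev", "Goa")}

inserted, updated, deleted = diff_full_extracts(previous_extract, current_extract)
print("inserted:", inserted)
print("updated:", updated)
print("deleted:", deleted)
```

Record 3 is reported as deleted because it is absent from the current extract. An incremental extract that only lists modified records would not reveal this.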
  • 25. 2. Clean: Ensures the quality of the data in the data warehouse. 3. Transform: Applies a set of rules to transform the data from the source to the target. This includes converting measured data to the same dimension, using the same units, so that it can later be joined. The step may also require joining data from several sources, generating aggregates, generating surrogate keys, sorting, and deriving new calculated values.
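Three of the transformations listed above (unit conversion, surrogate key generation, and aggregation) can be sketched in a few lines. The source names, product codes, and the choice of kilograms as the common unit are assumptions for the example only.

```python
from itertools import count

# Conversion factors to the common unit (kilograms).
TO_KG = {"kg": 1.0, "g": 0.001, "lb": 0.45359237}

def transform(rows):
    """rows: iterable of (source_system, natural_key, weight, unit)."""
    next_key = count(1)
    surrogate_keys = {}   # (source, natural key) -> warehouse surrogate key
    facts = []
    totals_kg = {}        # aggregate: total weight per surrogate key
    for source, natural_key, weight, unit in rows:
        business_key = (source, natural_key)
        if business_key not in surrogate_keys:
            surrogate_keys[business_key] = next(next_key)
        sk = surrogate_keys[business_key]
        weight_kg = weight * TO_KG[unit]      # same dimension, same unit
        facts.append({"product_sk": sk, "weight_kg": weight_kg})
        totals_kg[sk] = totals_kg.get(sk, 0.0) + weight_kg
    facts.sort(key=lambda f: f["product_sk"])  # sorting step
    return facts, totals_kg, surrogate_keys

rows = [
    ("erp", "P-10", 2.0, "kg"),
    ("shop", "77", 500.0, "g"),
    ("erp", "P-10", 1.0, "lb"),
]
facts, totals_kg, surrogate_keys = transform(rows)
print(surrogate_keys)
print(totals_kg)
```

The surrogate key is assigned per (source, natural key) pair, so the same product code arriving from two systems would get two keys unless a matching rule is added.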
  • 26. 4. Load: Ensures that the load is performed correctly and with as few resources as possible. The target of the load process is often a database. Referential integrity needs to be maintained by the ETL tool to ensure consistency. 5. Managing the ETL process: The ETL process may fail. This can be caused by missing values in one of the reference tables, or simply by a connection or power outage. It is necessary to design the ETL process with fail-recovery in mind.
  • 27. 6. Staging: A staging area, or landing zone, is an intermediate storage area used for data processing during the ETL process. The primary motivations for its use are to increase the efficiency of ETL processes, ensure data integrity, and support data quality operations.
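The load and fail-recovery points above can be illustrated with Python's built-in `sqlite3` module. In this sketch a batch is loaded inside one transaction, so a row that violates referential integrity causes the whole batch to be rolled back. The `product` and `sales` tables and their columns are hypothetical.

```python
import sqlite3

def load_batch(conn, facts):
    """Load one batch atomically; return True on success, False on failure."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.executemany(
                "INSERT INTO sales (product_id, amount) VALUES (?, ?)", facts
            )
        return True
    except sqlite3.IntegrityError:
        return False

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when enabled
conn.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE sales ("
    " sale_id INTEGER PRIMARY KEY,"
    " product_id INTEGER NOT NULL REFERENCES product(product_id),"
    " amount REAL)"
)
conn.execute("INSERT INTO product VALUES (1, 'widget')")
conn.commit()

ok_good = load_batch(conn, [(1, 9.5), (1, 3.0)])
# The second row refers to product 99, which is missing from the reference table.
ok_bad = load_batch(conn, [(1, 1.0), (99, 2.0)])
row_count = conn.execute("SELECT COUNT(*) FROM sales").fetchone()[0]
print(ok_good, ok_bad, row_count)
```

Because the failed batch leaves no partial rows behind, it can simply be rerun after the reference table is fixed.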
  • 28.  Commercial tools: Ab Initio, IBM InfoSphere DataStage, Informatica, Oracle Data Integrator and SAP Data Integrator.  Open source ETL tools: CloverETL, Apatar, Pentaho and Talend.
  • 29.  Data warehouses come in all shapes and sizes, which have a direct bearing on the cost and time involved.  The steps listed below summarize some of the points to consider:  Get professional advice  Plan the data  Decide who will use the data warehouse  Plan integration with external applications
  • 30. The key steps in developing a data warehouse can be summarized as follows:  Project initiation  Requirements analysis  Design (architecture, databases and applications)  Construction (selecting and installing tools, developing data feeds and building reports)  Deployment (release & training)  Maintenance
  • 31.
  • 32.  The client/server computing model applies to the software architecture that describes processing between applications and supporting services.  It represents distributed cooperative processing; the relationship between client and server is a relationship between hardware and software components.  It covers a wide range of functions, services, and other aspects of the distributed environment.
  • 33.  Host-based application processing is performed on one computer system with attached unintelligent, “dumb” terminals.  A single stand-alone PC and an IBM mainframe with attached character-based display terminals are examples of host-based processing environments.  Host-based processing is totally non-distributed.
  • 34.
  • 35.  Slave computers are attached to a master computer and perform application-processing-related functions only as directed by their master.  Distribution of processing tends to be unidirectional, from master to slaves.  Slaves are capable of some limited local application processing.  E.g. a mainframe (host) computer, such as the IBM 3090, used with cluster controllers and intelligent terminals.
  • 36.
  • 37.  This generation is used to model: 1. Shared-device LAN processing environment: PCs are attached to a system device that allows them to share a common resource, such as a hard disk (file server) or a printer (print server). E.g. Microsoft’s LAN Manager, which allows a LAN to have a system dedicated to file and print services.
  • 38.
  • 39. 2. Client/server LAN processing environment: An extension of shared-device processing. E.g. SYBASE SQL Server: an application running on a PC sends a read request to its database server. The database server processes the request locally and sends only the requested records to the PC application.
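The difference between the shared-device and client/server environments can be shown with a toy simulation. Here both "servers" are plain Python functions over an in-memory list; the record layout and the `region` field are assumptions for the example, and no real network or database is involved.

```python
# 1000 hypothetical records held on the server side.
RECORDS = [
    {"id": i, "region": "north" if i % 4 == 0 else "south"} for i in range(1000)
]

def file_server_read():
    # Shared-device model: the whole file is shipped to the PC,
    # which must then filter it itself.
    return list(RECORDS)

def database_server_read(predicate):
    # Client/server model: the server evaluates the request locally
    # and ships only the matching records.
    return [r for r in RECORDS if predicate(r)]

shipped_by_file_server = file_server_read()
client_side_result = [r for r in shipped_by_file_server if r["region"] == "north"]

shipped_by_db_server = database_server_read(lambda r: r["region"] == "north")

print(len(shipped_by_file_server), "records shipped by the file server")
print(len(shipped_by_db_server), "records shipped by the database server")
```

Both approaches give the PC the same answer, but the database server moves far fewer records across the LAN.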
  • 40.
  • 41.  Moves from two-tiered architecture to multi-tiered architecture.  The computing model deals with servers dedicated to applications, data, transaction management, and system management.  Supports data structures ranging from relational to multidimensional to multimedia.
  • 42.
  • 43.  A distributed database system consists of loosely coupled sites that share no physical component.  Database systems that run on each site are independent of each other.  Transactions may access data at one or more sites.
  • 44.  In a homogeneous distributed database:  All sites have identical software  Sites are aware of each other and agree to cooperate in processing user requests  Each site surrenders part of its autonomy in terms of the right to change schemas or software  The database appears to the user as a single system  In a heterogeneous distributed database:  Different sites may use different schemas and software ▪ Difference in schema is a major problem for query processing ▪ Difference in software is a major problem for transaction processing  Sites may not be aware of each other and may provide only limited facilities for cooperation in transaction processing
  • 45. DDBMS architectures are generally developed depending on three parameters −  Distribution − It states the physical distribution of data across the different sites.  Autonomy − It indicates the distribution of control of the database system and the degree to which each constituent DBMS can operate independently.  Heterogeneity − It refers to the uniformity or dissimilarity of the data models, system components and databases.
  • 46.  Data Replication  Fragmentation The three dimensions of distribution transparency are −  Location transparency  Fragmentation transparency  Replication transparency
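Fragmentation, replication, and the transparency that hides them from the user can be sketched with a small in-memory class. This is only an illustration under stated assumptions: the site names, the modulo rule used to pick a fragment, and the number of copies are invented for the example and do not describe any particular DDBMS.

```python
class DistributedCustomers:
    """Hides fragmentation and replication behind one insert/lookup interface."""

    def __init__(self, site_names, copies=2):
        self.names = list(site_names)
        self.sites = {name: {} for name in self.names}
        self.copies = copies

    def _replica_sites(self, customer_id):
        # Horizontal fragmentation: the key decides the fragment. Each
        # fragment is stored on `copies` consecutive sites (replication).
        first = customer_id % len(self.names)
        return [
            self.names[(first + i) % len(self.names)] for i in range(self.copies)
        ]

    def insert(self, customer_id, row):
        for name in self._replica_sites(customer_id):
            self.sites[name][customer_id] = row

    def lookup(self, customer_id, unavailable=()):
        # The caller never names a site: location, fragmentation, and
        # replication are all resolved here.
        for name in self._replica_sites(customer_id):
            if name not in unavailable and customer_id in self.sites[name]:
                return self.sites[name][customer_id]
        return None

db = DistributedCustomers(["delhi", "pune", "agra"])
for cid in range(6):
    db.insert(cid, {"name": f"c{cid}"})

print(db.lookup(4))
print(db.lookup(4, unavailable={"pune"}))  # served from the replica
```

The `unavailable` argument is a stand-in for a site failure; it shows why a replicated fragment can still be read when one copy is down.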
  • 48.
  • 49.
  • 50.
  • 51.  Data warehouse operations mainly consist of huge data loads and index builds, generation of materialized views, and queries over large volumes of data. The underlying I/O system of a data warehouse should be built to meet these heavy requirements.  Architecture options: 1. Symmetric multiprocessing (SMP): Two or more identical processors are connected to a single, shared main memory. 2. Massively parallel processing (MPP): A large number of processors perform a set of coordinated computations in parallel.  Number of CPUs  Memory of data warehouse  Number of disks
  • 52.  The server OS determines:  how quickly the server can fulfill client requests,  how many clients it can support concurrently and reliably,  how efficiently system resources such as memory, disk I/O, and communication components are utilized.
  • 53.  Multiuser support  Preemptive multitasking  Multithreaded design  Memory protection: Concurrent tasks should not violate each other's memory.  Scalability  Security  Reliability  Availability
  • 54.  Relatively small and highly secure than uniprocessors.  Simplified architecture, extensibility, portability, real-time support, robust system security, and multiprocessor support.  This architecture results in a highly modular OS that can support multiple OS “personalities” by configuring outside services as needed.  E.g. the Mach 3.0 microkernel, used by IBM to allow DOS, OS/2, and AIX to coexist on a single machine.
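The idea of coordinated computations in parallel can be sketched as a partitioned aggregation: each worker sums its own partition, and the partial results are then combined. This is a simulation only. It uses threads from Python's standard `concurrent.futures` module, which do not give true CPU parallelism in CPython, and the sales data and worker count are assumptions for the example.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(rows, n):
    # Round-robin partitioning: every worker gets its own slice of the data.
    return [rows[i::n] for i in range(n)]

def local_sum(part):
    # Work done independently by one worker on its own partition.
    return sum(amount for _, amount in part)

def parallel_total(rows, workers=4):
    parts = partition(rows, workers)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        partials = list(pool.map(local_sum, parts))
    # Coordination step: combine the partial aggregates.
    return partials, sum(partials)

# Hypothetical fact rows: (sale_id, amount).
sales = [(i, i % 10) for i in range(1000)]
partials, total = parallel_total(sales)
print(partials, total)
```

The same split, compute, and combine pattern applies whether the workers share memory (SMP) or each own their data (MPP); what changes is how the partitions reach the workers.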
  • 55.  Distributed Memory Architecture:  Shared-Nothing Architecture  Shared Disk Architecture
  • 57. [Figure: four processor units (PU), each with its own local memory, connected through an interconnection network to a global shared disk subsystem.]
  • 58.  A cluster is a set of loosely coupled SMP machines connected by a high-speed interconnection network.  A cluster behaves just like a single large machine.