IBM InfoSphere DataStage online training. DataStage is an ETL tool used to build data integration solutions. It is available in several versions.
For further details kindly visit:
http://kerneltraining.com/
https://www.facebook.com/KernelTraining
We provide a free IBM InfoSphere DataStage demo class.
Get live, interactive classes.
Lifetime access to material.
Real-time, hands-on experience.
24/7 support.
Visit our website: www.kerneltraining.com
Contact us: 08099776681
Email Us: sales@kerneltraining.com
IBM InfoSphere DataStage Introduction Online Training
For more details, please contact us:
US: +1 718 819 9361
INDIA: +91 8099776681
Email us: sales@kerneltraining.com
Welcome to IBM DataStage 9.1
http://kerneltraining.com/ibm-data-stage/
DATA WAREHOUSE
A data warehouse is a copy of transaction data specifically structured for
querying and reporting.
This definition of the data warehouse focuses on data storage.
An expanded definition of data warehousing also includes business intelligence
tools, tools to extract, transform, and load data into the repository, and tools
to manage and retrieve metadata.
A data warehouse can be normalized or denormalized.
It can be a relational database, multidimensional database, flat file,
hierarchical database, object database, etc.
Data in a data warehouse often changes,
and data warehouses often focus on a specific activity or entity.
Reasons for Dirty Data
Dummy Values
Absence of Data
Multipurpose Fields
Cryptic Data
Contradicting Data
Inappropriate Use of Address Lines
Violation of Business Rules
Reused Primary Keys
Non-Unique Identifiers
Data Integration Problems
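Several of the problems listed above can be detected with simple profiling checks before any cleansing takes place. The following is a minimal Python sketch, not DataStage functionality. The sample records, the field names (customer_id, phone), and the set of dummy values are assumptions made up for illustration.

```python
from collections import Counter

# Hypothetical placeholder values that often stand in for real data.
DUMMY_VALUES = {"999-999-9999", "N/A", "UNKNOWN", "XXXXX"}


def profile(records, key_field, checked_fields):
    """Return counts of dummy values, missing values, and duplicate keys."""
    dummy = 0
    missing = 0
    for rec in records:
        for field in checked_fields:
            value = rec.get(field)
            if value is None or str(value).strip() == "":
                missing += 1          # absence of data
            elif str(value).strip().upper() in DUMMY_VALUES:
                dummy += 1            # dummy value
    # Non-unique identifiers: keys that appear on more than one record.
    key_counts = Counter(rec.get(key_field) for rec in records)
    duplicate_keys = sorted(k for k, n in key_counts.items() if n > 1)
    return {"dummy": dummy, "missing": missing, "duplicate_keys": duplicate_keys}


# Made-up sample data for illustration only.
sample = [
    {"customer_id": 1, "phone": "555-0100"},
    {"customer_id": 2, "phone": "999-999-9999"},
    {"customer_id": 2, "phone": ""},
    {"customer_id": 3, "phone": None},
]
report = profile(sample, "customer_id", ["phone"])
print(report)
```

A report like this only flags candidates. Deciding whether a flagged value is truly wrong still depends on the business rules of the source system.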
Data Cleansing
Source systems contain dirty data that must be cleansed.
ETL software contains rudimentary data cleansing capabilities.
Specialized data cleansing software is often used. It is important
for performing name and address correction and householding functions.
Leading data cleansing vendors include Vality (Integrity),
Harte-Hanks (Trillium), and Firstlogic (i.d.Centric).
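Householding groups customer records that share the same address once the address has been standardized. A rough Python sketch of the idea follows. It is not how any of the vendors above implement it: the abbreviation table, the record layout, and the function names are hypothetical, and real cleansing tools use far richer rules.

```python
import re
from collections import defaultdict

# Hypothetical abbreviation table; real tools use far larger rule sets.
ABBREVIATIONS = {"street": "st", "avenue": "ave", "road": "rd", "apartment": "apt"}


def normalize_address(address):
    """Lower-case, drop punctuation, and apply the abbreviation table."""
    words = re.sub(r"[^\w\s]", "", address.lower()).split()
    return " ".join(ABBREVIATIONS.get(w, w) for w in words)


def household(records):
    """Group customer names by normalized address."""
    groups = defaultdict(list)
    for name, address in records:
        groups[normalize_address(address)].append(name)
    return dict(groups)


# Made-up sample data for illustration only.
customers = [
    ("Ann Lee", "12 Main Street"),
    ("Bob Lee", "12 main st."),
    ("Cara Diaz", "7 Oak Avenue"),
]
households = household(customers)
print(households)
```

Here the first two customers fall into one household because both addresses normalize to the same string.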
DataStage
In its simplest form, DataStage performs data transformation and movement from
source systems to target systems in batch and in
real time. The data sources may include indexed
files, sequential files, relational databases, archives,
external data sources, enterprise applications, and
message queues.
DataStage Administrator
Use DataStage Administrator to:
Specify general server defaults
Add and delete projects
Set project properties
Access the DataStage Repository by command interface
DataStage Designer
Use DataStage Designer to:
Specify how the data is extracted
Specify data transformations
Decode (denormalize) data going into the data mart using referenced lookups
Aggregate data
Split data into multiple outputs on the basis of defined constraints
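The operations in this list (decoding through a lookup, aggregating, and splitting on constraints) can be illustrated outside DataStage with a few lines of Python. This is only a sketch of the concepts, not a DataStage job design. The lookup table, the column names, and the threshold of 100 are hypothetical.

```python
from collections import defaultdict

# Hypothetical reference data used to decode a region code.
REGION_LOOKUP = {"N": "North", "S": "South"}

# Made-up input rows for illustration only.
rows = [
    {"region_code": "N", "amount": 120},
    {"region_code": "S", "amount": 40},
    {"region_code": "N", "amount": 30},
    {"region_code": "X", "amount": 10},
]

# Decode (denormalize): replace the code with its description via a lookup.
decoded = [
    {"region": REGION_LOOKUP.get(r["region_code"], "Unknown"), "amount": r["amount"]}
    for r in rows
]

# Aggregate: total amount per region.
totals = defaultdict(int)
for r in decoded:
    totals[r["region"]] += r["amount"]

# Split: route each row to an output according to a constraint.
large = [r for r in decoded if r["amount"] >= 100]
small = [r for r in decoded if r["amount"] < 100]

print(dict(totals))
print(len(large), len(small))
```

Rows whose code is missing from the lookup are labeled "Unknown" here. Rejecting them to a separate output would be an equally reasonable choice.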
DataStage Director
Use DataStage Director to run, schedule, and monitor your DataStage
jobs. You can also gather statistics as a job runs and
review logs for debugging purposes.
The DataStage Director window is divided into two panes:
The Job Category pane lists all of the jobs in the repository.
The right pane shows one of three views: Status view, Schedule view, or Log view.
Frequently Seen Status Codes
Code  Status
1     Finished
2     Finished (see log)
9     Has been reset
11    Validated OK
12    Validated (see log)
21    Has been reset
99    Compiled
0     Running
3     Aborted
8     Failed validation
13    Failed validation
96    Aborted
97    Stopped
98    Not Compiled
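The status list above can be captured as a lookup table so that a monitoring script can translate a numeric code into its label. The sketch below copies the codes and labels from the list. The function name describe_status and the fallback text are assumptions, and how the numeric code is obtained from DataStage is not covered here.

```python
# Codes and labels copied from the status list above.
JOB_STATUS = {
    0: "Running",
    1: "Finished",
    2: "Finished (see log)",
    3: "Aborted",
    8: "Failed validation",
    9: "Has been reset",
    11: "Validated OK",
    12: "Validated (see log)",
    13: "Failed validation",
    21: "Has been reset",
    96: "Aborted",
    97: "Stopped",
    98: "Not Compiled",
    99: "Compiled",
}


def describe_status(code):
    """Return the label for a status code (hypothetical helper)."""
    return JOB_STATUS.get(code, f"Unknown status ({code})")


print(describe_status(2))
print(describe_status(42))
```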
DataStage: Getting Started
Set up a project – Before you can create any DataStage jobs,
you must set up your project by entering information about
your data.
Create a job – When a DataStage project is installed, it is empty,
and you must create the jobs you need in DataStage Designer.
Define table definitions
Develop the job – Jobs are designed and developed using the
Designer. Each data source, the data warehouse, and each
processing step is represented by a stage in the job design. The
stages are linked together to show the flow of data.
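The idea of a job as stages linked together can be pictured as a chain of functions in which each stage consumes the output of the previous one. The Python sketch below is only an analogy, not DataStage code. The stage names and the sample rows are hypothetical, and the target simply collects rows instead of loading a warehouse.

```python
def source_stage():
    """Hypothetical source: yields rows as dictionaries."""
    yield {"id": 1, "name": " alice "}
    yield {"id": 2, "name": "BOB"}


def transform_stage(rows):
    """Processing step: tidy the name column."""
    for row in rows:
        yield {**row, "name": row["name"].strip().title()}


def target_stage(rows):
    """Hypothetical target: collects rows instead of loading a warehouse."""
    return list(rows)


# Linking the stages: each stage's output feeds the next stage's input.
loaded = target_stage(transform_stage(source_stage()))
print(loaded)
```

Because the stages are generators, rows flow through one at a time, which loosely mirrors how data moves along the links of a job design.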
DataStage Designer: Transformer Stage
The Transformer stage performs any data conversion required before the
data is output to another stage in the job design.
After you are done, compile and run the job.
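The kind of conversion a Transformer stage performs can be thought of as a set of derivations, one per output column, applied to every input row. The Python sketch below illustrates that idea only and does not use DataStage's own expression syntax. The column names, the date format, and the sample row are assumptions.

```python
from datetime import datetime

# Hypothetical derivations: output column name -> function of the input row.
DERIVATIONS = {
    "customer_id": lambda row: int(row["cust_id"]),
    "full_name": lambda row: f'{row["first"].strip()} {row["last"].strip()}',
    "order_date": lambda row: datetime.strptime(row["date"], "%d/%m/%Y").date().isoformat(),
}


def transform(row):
    """Apply every derivation to one input row and return the output row."""
    return {column: derive(row) for column, derive in DERIVATIONS.items()}


# Made-up input row for illustration only.
output = transform({"cust_id": "042", "first": " Ann", "last": "Lee ", "date": "05/03/2014"})
print(output)
```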