
Informatica Interview Questions & Answers

50-55 hours Training + Assignments + Actual Project Based Case Studies
All attendees will receive,
Assignment after each module, Video recording of every session
Notes and study material for examples covered.
Access to the Training Blog & Repository of Materials


What is Informatica?

Informatica is a tool that supports every step of the Extraction, Transformation and Load (ETL) process. Nowadays it is also used as an integration tool.

Informatica is easy to use. It has a simple visual interface, similar to forms in Visual Basic: you drag and drop objects (known as transformations) and design the process flow for data extraction, transformation and load. These process flow diagrams are known as mappings. Once a mapping is built, it can be scheduled to run as and when required. In the background, the Informatica server takes care of fetching data from the source, transforming it, and loading it into the target systems or databases.

Informatica can communicate with all major data sources (mainframe, RDBMS, flat files, XML, VSAM, SAP, etc.) and can move and transform data between them. It can move huge volumes of data very efficiently, often better than bespoke programs written for one specific data movement. It can throttle transactions (apply big updates in small chunks to avoid long locks and filling the transaction log). It can join data from two distinct data sources (even an XML file can be joined with a relational table).

In all, Informatica can integrate heterogeneous data sources and convert raw data into useful information.

Before we start working in Informatica, let's get an idea of the company that owns this product.
Some facts and figures about Informatica Corporation:
• Founded in 1993, based in Redwood City, California
• 1000+ employees; 2300+ customers; 79 of the Fortune 100 companies
• NASDAQ stock symbol: INFA; stock price: $15.10 (03/17/2006)
• Revenues in fiscal year 2005: $267.4M
• Informatica Developer Network: 20,000 members

In short, Informatica is the world's leading ETL tool, and it is rapidly gaining market share as an enterprise integration platform.

Informatica Software Architecture Illustrated

Informatica's ETL product, known as Informatica PowerCenter, consists of 3 main components.

1. Informatica PowerCenter Client Tools:
These are the development tools installed on the developer's machine. They enable a developer to:
• Define the transformation process, known as a mapping (Designer)
• Define run-time properties for a mapping, known as sessions (Workflow Manager)
• Monitor execution of sessions (Workflow Monitor)
• Manage the repository, useful for administrators (Repository Manager)
• Report metadata (Metadata Reporter)

2. Informatica PowerCenter Repository:
The repository is the heart of the Informatica tools. It is a kind of data inventory where everything related to mappings, sources, targets, etc. is kept; all the metadata for your application is stored here. All the client tools and the Informatica Server fetch data from the repository. An Informatica client and server without a repository are like a PC without memory or a hard disk: able to process data, but with no data to process. The repository can be treated as the back end of Informatica.

3. Informatica PowerCenter Server:
The server is where all execution takes place. It makes physical connections to the sources and targets, fetches data, applies the transformations specified in the mapping, and loads the data into the target system.

The architecture diagram lists the following sources and targets:

Sources
• Standard: RDBMS, Flat Files, XML, ODBC
• Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
• EAI: MQ Series, Tibco, JMS, Web Services
• Legacy: Mainframes (DB2, VSAM, IMS, IDMS, Adabas), AS400 (DB2, Flat File)
• Remote Sources

Targets
• Standard: RDBMS, Flat Files, XML, ODBC
• Applications: SAP R/3, SAP BW, PeopleSoft, Siebel, JD Edwards, i2
• EAI: MQ Series, Tibco, JMS, Web Services
• Legacy: Mainframes (DB2), AS400 (DB2)
• Remote Targets

This is sufficient knowledge to start with Informatica, so let's go straight to development.
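The division of labor among the three components described above can be sketched in a few lines of Python. This is only an illustration of the idea, not how PowerCenter is implemented: the `repository` dictionary, the `save_mapping` and `run_session` functions, and the sample source and target names are all hypothetical.

```python
# Repository role: holds metadata only (mapping definitions), never the data itself.
repository = {}

def save_mapping(name, source, transformations, target):
    """Client-tool role: store a mapping definition in the repository."""
    repository[name] = {
        "source": source,
        "transformations": transformations,
        "target": target,
    }

def run_session(name, sources, targets):
    """Server role: fetch the mapping from the repository, then read, transform, load."""
    mapping = repository[name]
    rows = sources[mapping["source"]]
    for transform in mapping["transformations"]:
        rows = [transform(row) for row in rows]
    targets.setdefault(mapping["target"], []).extend(rows)
    return len(rows)

# Hypothetical source data and an empty set of targets.
sources = {"customers_csv": [{"name": " ada "}, {"name": "grace"}]}
targets = {}

save_mapping(
    "m_clean_names",
    "customers_csv",
    [
        lambda row: {"name": row["name"].strip()},  # trim whitespace
        lambda row: {"name": row["name"].title()},  # normalize capitalization
    ],
    "dw_customers",
)

loaded = run_session("m_clean_names", sources, targets)
print(loaded, targets["dw_customers"])
```

The point of the sketch is that `run_session` knows nothing about the mapping until it reads it from the repository, which mirrors the statement that the server and client tools both fetch their definitions from the repository.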
Informatica Product Line

Informatica is a powerful ETL tool from Informatica Corporation, a leading provider of enterprise data integration and ETL software. The important products provided by Informatica Corporation are listed below:
• Power Center
• Power Mart
• Power Exchange
• Power Center Connect
• Power Channel
• Metadata Exchange
• Power Analyzer
• Super Glue

Power Center & Power Mart: Power Mart is a departmental version of Informatica for building, deploying, and managing data warehouses and data marts. Power Center is used for corporate enterprise data warehouses, and Power Mart is used for departmental data warehouses such as data marts. Power Center supports global repositories and networked repositories and can be connected to several sources. Power Mart supports a single repository and can be connected to fewer sources than Power Center. Power Mart can grow into an enterprise implementation, and its codeless environment helps developer productivity.

Power Exchange: Informatica Power Exchange, as a standalone service or together with Power Center, helps organizations leverage data by avoiding manual coding of data extraction programs. Power Exchange supports batch, real-time and changed data capture options on mainframe (DB2, VSAM, IMS, etc.), midrange (AS400 DB2, etc.), relational databases (Oracle, SQL Server, DB2, etc.) and flat files on Unix, Linux and Windows systems.

Power Center Connect: This is an add-on to Informatica Power Center. It helps to extract data and metadata from ERP systems such as PeopleSoft, SAP and Siebel, from IBM's MQSeries, and from other third-party applications.

Power Channel: This helps to transfer large amounts of encrypted and compressed data over LAN and WAN, through firewalls, transfer files over FTP, etc.
Metadata Exchange: When used with Power Center, Metadata Exchange enables organizations to take advantage of the time and effort already invested in defining data structures within their IT environment. For example, an organization may be using data modeling tools such as Erwin, Embarcadero, Oracle Designer or Sybase Power Designer to develop data models. The functional and technical teams will have spent much time and effort creating the data model's data structures (tables, columns, data types, procedures, functions, triggers, etc.). With Metadata Exchange, these data structures can be imported into Power Center to identify source and target mappings, which saves time and effort. The Informatica developer does not need to create these data structures again.
Power Analyzer: Power Analyzer provides organizations with reporting facilities. It makes accessing, analyzing, and sharing enterprise data simple and easily available to decision makers, and it helps them gain insight into business processes and develop business intelligence. With Power Analyzer, an organization can extract, filter, format, and analyze corporate information from data stored in a data warehouse, data mart, operational data store, or other data storage models. Power Analyzer works best with a dimensional data warehouse in a relational database. It can also run reports on data in any relational table that does not conform to the dimensional model.

Super Glue: Super Glue is used for loading metadata from several sources into a centralized place. Reports can be run against Super Glue to analyze the metadata.

Note: This is not a complete tutorial on Informatica. We will add more tips and guidelines on Informatica in the near future, so please check back soon. To know more about Informatica, visit its official website, www.informatica.com.

Business Case: Why do we need ETL Tools?

Think of GE: the company has 100+ years of history and a presence in almost all industries. Over these years the company's management style has changed from bookkeeping to SAP. This transition did not happen in a single day. On the way from bookkeeping to SAP, the company used a wide array of technologies, ranging from mainframes to PCs, data storage ranging from flat files to relational databases, and programming languages ranging from COBOL to Java. As a result, different businesses, or to be precise different sub-businesses within a business, run different applications on different hardware and different architectures. Technologies were introduced as and when they were invented and as and when they were required.
This led directly to a scenario in which the company's HR department runs on Oracle Applications, Finance runs SAP, some part of the process chain is supported by mainframes, some data is stored in Oracle, some on mainframes, some in VSAM files, and the list goes on. If one day the company requires a consolidated report of its assets, there are two ways to produce it:
• First, completely manually: generate different reports from different systems and integrate them.
• Second, fetch all the data from the different systems and applications, build a Data Warehouse, and generate reports as required.

Obviously the second approach is the better one. Fetching the data from different systems, making it coherent, and loading it into a Data Warehouse requires some kind of extraction, cleansing, integration, and load. ETL stands for Extraction, Transformation & Load. ETL tools provide the facility to extract data from different non-coherent systems, cleanse it, merge it and load it into target systems.

Informatica Repository Manager

Q. What type of repositories can be created using Informatica Repository Manager?
A. Informatica PowerCenter includes the following types of repositories:
• Standalone Repository: A repository that functions individually and is unrelated to any other repository.
• Global Repository: A centralized repository in a domain. It can contain objects shared across the repositories in the domain. The objects are shared through global shortcuts.
• Local Repository: A repository within a domain that is not the global repository. A local repository can connect to a global repository using global shortcuts and can use objects in its shared folders.
• Versioned Repository: This can be either a local or a global repository, but it allows version control for the repository. A versioned repository can store multiple copies, or versions, of an object. This feature makes it possible to develop, test and deploy metadata into the production environment efficiently.

Q. What is a code page?
A. A code page contains the encoding that specifies characters in a set of one or more languages. The code page is selected based on the source of the data. For example, if the source contains Japanese text, then a code page that supports Japanese text should be selected. When a code page is chosen, the program or application for which it is set refers to a specific set of data that describes the characters the application recognizes. This influences the way that application stores, receives, and sends character data.

Q. Which databases can PowerCenter Server on Windows connect to?
A. PowerCenter Server on Windows can connect to the following databases:
• IBM DB2
• Informix
• Microsoft Access
• Microsoft Excel
• Microsoft SQL Server
• Oracle
• Sybase
• Teradata

Q. Which databases can PowerCenter Server on UNIX connect to?
A. PowerCenter Server on UNIX can connect to the following databases:
• IBM DB2
• Informix
• Oracle
• Sybase
• Teradata

Informatica Mapping Designer

Q. How do you execute a PL/SQL script from an Informatica mapping?
A. The Stored Procedure (SP) transformation can be used to execute PL/SQL scripts. The PL/SQL procedure name is specified in the SP transformation. Whenever the session is executed, the session calls the PL/SQL procedure.

Q. How can you define a transformation? What are the different types of transformations available in Informatica?
A. A transformation is a repository object that generates, modifies, or passes data. The Designer provides a set of transformations that perform specific functions. For example, an Aggregator transformation performs calculations on groups of data. Below are the various transformations available in Informatica:
• Aggregator
• Application Source Qualifier
• Custom
• Expression
• External Procedure
• Filter
• Input
• Joiner
• Lookup
• Normalizer
• Output
• Rank
• Router
• Sequence Generator
• Sorter
• Source Qualifier
• Stored Procedure
• Transaction Control
• Union
• Update Strategy
• XML Generator
• XML Parser
• XML Source Qualifier

Q. What is a source qualifier? What is meant by Query Override?
A. The Source Qualifier represents the rows that the PowerCenter Server reads from a relational or flat file source when it runs a session. When a relational or flat file source definition is added to a mapping, it is connected to a Source Qualifier transformation. The PowerCenter Server generates a query for each Source Qualifier transformation whenever it runs the session. The default query is a SELECT statement containing all the source columns. The Source Qualifier can override this default query by changing the default settings of the transformation properties. The list of selected ports, and the order in which they appear in the default query, should not be changed in the overridden query.

Q. What is the Aggregator transformation?
A. The Aggregator transformation performs aggregate calculations, such as averages and sums. Unlike the Expression transformation, which permits calculations on a row-by-row basis only, the Aggregator transformation performs calculations on groups. The Aggregator transformation contains group-by ports that indicate how to group the data. While grouping the data, the Aggregator transformation outputs the last row of each group unless otherwise specified in the transformation properties. The aggregate functions available in Informatica are: AVG, COUNT, FIRST, LAST, MAX, MEDIAN, MIN, PERCENTILE, STDDEV, SUM, VARIANCE.

Q. What is Incremental Aggregation?
A. Whenever a session is created for a mapping that contains an Aggregator transformation, the session option for Incremental Aggregation can be enabled. When PowerCenter performs incremental aggregation, it passes new source data through the mapping and uses historical cache data to perform the new aggregation calculations incrementally.

Q. How is the Union transformation used?
A. The Union transformation is a multiple-input-group transformation that can be used to merge data from various sources (or pipelines). It works just like the UNION ALL statement in SQL, which combines the result sets of two SELECT statements.

Q. Can two flat files be joined with the Joiner transformation?
A. Yes, the Joiner transformation can be used to join data from two flat file sources.

Q. What is a Lookup transformation?
A. This transformation is used to look up data in a flat file or in a relational table, view or synonym. It compares the Lookup transformation ports (input ports) to the source column values based on the lookup condition. The returned values can then be passed to other transformations.
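The lookup behavior described above can be sketched in Python as a rough analogy. This is not Informatica code: the `CUSTOMERS` table, the `build_lookup_cache` and `lookup` helpers, and the column names are hypothetical, and the lookup condition is assumed to be a simple equality on one column.

```python
# Hypothetical lookup source: a small customer table.
CUSTOMERS = [
    {"cust_id": 101, "name": "Acme", "region": "East"},
    {"cust_id": 102, "name": "Globex", "region": "West"},
]

def build_lookup_cache(rows, key):
    """Index the lookup source by the column used in the lookup condition."""
    return {row[key]: row for row in rows}

def lookup(cache, value, return_port, default=None):
    """Compare an input value to the lookup column and return one column of the match."""
    match = cache.get(value)
    return match[return_port] if match is not None else default

cache = build_lookup_cache(CUSTOMERS, "cust_id")

# Rows flowing through the pipeline; the second has no match in the lookup source.
orders = [{"order_id": 1, "cust_id": 102}, {"order_id": 2, "cust_id": 999}]

# Pass the returned value on with each row, as a downstream transformation would see it.
enriched = [dict(order, region=lookup(cache, order["cust_id"], "region")) for order in orders]
print(enriched)
```

Rows without a match receive the default value (`None` here) rather than being dropped, so downstream logic can decide how to treat them.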
Q. Can a lookup be done on flat files?
A. Yes.

Q. What is the difference between a connected lookup and an unconnected lookup?
A. A connected lookup takes input values directly from other transformations in the pipeline. An unconnected lookup does not take inputs directly from any other transformation, but it can be used in any transformation (such as Expression) and invoked as a function using a :LKP expression. So an unconnected lookup can be called multiple times in a mapping.

Q. What is a mapplet?
A. A mapplet is a reusable object that is created using the Mapplet Designer. The mapplet contains a set of transformations and allows that transformation logic to be reused in multiple mappings.

Q. What does reusable transformation mean?
A. A reusable transformation can be used in multiple mappings. It is stored as metadata separate from any mapping that uses it. Whenever a reusable transformation is changed, the change is reflected in every mapping that uses it, and those mappings may be invalidated.

Q. What is update strategy and what are the options for update strategy?
A. Informatica processes the source data row by row. By default every row is marked to be inserted into the target table. If a row has to be updated or inserted based on some logic, the Update Strategy transformation is used. A condition can be specified in the Update Strategy to mark the processed row for update or insert. The following options are available for update strategy:
• DD_INSERT: Flags the row for insertion. The equivalent numeric value of DD_INSERT is 0.
• DD_UPDATE: Flags the row for update. The equivalent numeric value of DD_UPDATE is 1.
• DD_DELETE: Flags the row for deletion. The equivalent numeric value of DD_DELETE is 2.
• DD_REJECT: Flags the row for rejection. The equivalent numeric value of DD_REJECT is 3.
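The row-flagging idea behind the four options above can be sketched in Python. The numeric values come from the list above; everything else (the `flag_row` function, the `existing_ids` set, the `deleted` marker, and the decision rules themselves) is a hypothetical illustration, not Informatica expression syntax.

```python
# Numeric values as listed for the update strategy options.
DD_INSERT, DD_UPDATE, DD_DELETE, DD_REJECT = 0, 1, 2, 3

def flag_row(row, existing_ids):
    """Return the flag for one row, using made-up rules for illustration."""
    if row["id"] is None:
        return DD_REJECT   # no usable key: reject the row
    if row.get("deleted"):
        return DD_DELETE   # source marks the row as removed
    if row["id"] in existing_ids:
        return DD_UPDATE   # key already present in the target
    return DD_INSERT       # default: new row

existing_ids = {1, 2, 3}   # keys assumed to be in the target already
rows = [
    {"id": 1, "name": "Ada"},
    {"id": 4, "name": "Linus"},
    {"id": 2, "name": "Bob", "deleted": True},
    {"id": None, "name": "?"},
]

flags = [flag_row(row, existing_ids) for row in rows]
print(flags)  # [1, 0, 2, 3]
```

Each row is evaluated independently, which matches the row-by-row processing described in the answer above.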
Visit the following websites for other FAQs:
http://www.coolinterview.com/type.asp?iType=18
http://www.geekinterview.com/FAQs/Informatica
http://www.allinterview.com/Interview-Questions/Informatica/page1.html
Q. What are the different types of lookups?
A. The different types of lookups are static lookup, dynamic lookup and static persistent lookup.

Q. What is Test Load in Informatica?
A. If we want to test the execution of a mapping without loading any data into the target database, we select the test load option.

Q. Why is parameterization important and how does it help?
A. Parameterization lets us avoid changing the mappings when any of the variables in a mapping change. By putting values that are likely to change (such as a schema name or an update user_id) in a parameter file, we improve the maintainability of the code.

Q. Is there any advantage to having fewer transformations in a mapping?
A. More transformations mean a longer execution time for a mapping, because the data has to pass through another transformation process.

Q. What is the overhead of a dynamic lookup over a static lookup?
A. Because the values in a dynamic lookup can change while the mapping is executing, PowerCenter has to keep checking for changes. This causes additional overhead.

Q. What is the advantage of sending sorted data into a Joiner transformation?
A. When sorted data is sent through a Joiner transformation, we can take advantage of the sorted input option available in the Joiner. This makes the join operation take less time than it otherwise would.

Q. Is it possible to have a mapping with update strategy, using bulk mode?
A. No, update strategy can be executed only in Normal mode.

Q. How can a mapping be made to fail if the number of rejects exceeds a certain threshold?
A. The session option "abort on error" sets the number of errors after which the session is aborted. By default it is zero.

By Sujith Nair

Informatica interview questions & FAQs
• What is a source qualifier?
• What is a surrogate key?
• What is the difference between a Mapplet and a reusable transformation?
• What is DTM session?
• What is a Mapplet?
• What is a lookup function? What is the default transformation for the lookup function?
• What is the difference between a connected lookup and an unconnected lookup?
• What is update strategy and what are the options for update strategy?
• What is a subject area?
• What is the difference between truncate and delete statements?
• What kinds of update strategies are normally used (Type 1, 2 & 3) and what are the differences?
• What is the exact syntax of an update strategy?
• What are bitmap indexes and how and why are they used?
• What is bulk bind? How does it improve performance?
• What are the different ways to filter rows using Informatica transformations?
• What is a referential integrity error? How do you rectify it?
• What is DTM process?
• What is target load order?
• What exactly is a shortcut and how do you use it?
• What is a shared folder?
• What are the different transformations where you can use a SQL override?
• What is the difference between Bulk and Normal mode and where exactly is it defined?
• What is the difference between a Local and a Global repository?
• What are data driven sessions?
• What are the common errors while running an Informatica session?
• What are worklets and what is their use?
• What is change data capture?
• What exactly is tracing level?
• What is the difference between constraint-based load ordering and target load plan?
• What is a deployment group and what is its use?
• When and how is a partition defined using Informatica?
• How do you improve performance in an Update Strategy?
• How do you validate all the mappings in the repository at once?
• How can you join two or more tables without using the source qualifier override SQL or a Joiner transformation?
• How can you define a transformation? What are the different types of transformations in Informatica?
• How many repositories can be created in Informatica?
• How many minimum groups can be defined in a Router transformation?
• How do you define partitions in Informatica?
• How can you improve performance in an Aggregator transformation?
• How does Informatica know that the input is sorted?
• How many worklets can be defined within a workflow?
• How do you define a parameter file? Give an example of its use.
• If you join two or more tables, pull about two columns from each table into the source qualifier, then pull just one column from the source qualifier into an Expression transformation and do a 'generate SQL' in the source qualifier, how many columns will show up in the generated SQL?
• In a Type 1 mapping with one source and one target table, what is the minimum number of Update Strategy transformations to be used?
• At what levels can you define parameter files and what is the order?
• In a session log file, where can you find the reader and the writer details?
• For joining three heterogeneous tables, how many Joiner transformations are required?
• Can you look up a flat file using Informatica?
• While running a session, what default files are created?
• Describe the use of materialized views and how they are different from a normal view.
