Wednesday, November 19, 2014

Informatica Powercenter Express - Sequence Generator Transformation


The Sequence Generator transformation is a passive transformation used to create unique primary key values, replace missing primary keys, or cycle through a sequential range of numbers. 

 * You can use a range of values from 0 to 9,223,372,036,854,775,807 with the smallest interval of 1.


Sequence Generator Ports

Pass-Through Ports
Pass-through ports are input and output ports that receive input data and return the same data to a mapping without changing it. You must add at least one input port to the transformation and connect it to an upstream source or transformation before you link the NEXTVAL output port to targets.
NEXTVAL Port
Connect the NEXTVAL port to a downstream transformation or target to generate a sequence of numbers. If you connect NEXTVAL to multiple transformations, the Integration Service generates the same sequence of numbers for each transformation. You connect the NEXTVAL port to generate the sequence based on the Start Value and Increment Value properties. If the Sequence Generator is not configured to cycle through the sequence, the NEXTVAL port generates sequence numbers up to the configured end value.

Sequence Generator Transformation Properties
Reset If enabled, the Integration Service resets the sequence data object to the start value when the mapping run completes. If disabled, the Integration Service increments the current value after the mapping run completes and uses that value in the next mapping run.
Tracing Level Level of detail about the transformation that the Integration Service writes into the mapping log.
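The NEXTVAL, Start Value, Increment Value, Cycle, and Reset behavior described above can be sketched in Python. This is a minimal conceptual model, not the actual Integration Service implementation; the class name and parameter names are my own.

```python
class SequenceGenerator:
    """Conceptual sketch of Sequence Generator semantics: a start value,
    an increment, an end value, optional cycling, and reset."""

    def __init__(self, start=0, increment=1, end=9223372036854775807, cycle=False):
        self.start = start
        self.increment = increment
        self.end = end
        self.cycle = cycle
        self.current = start

    def nextval(self):
        if self.current > self.end:
            if self.cycle:
                self.current = self.start  # wrap back to the start value
            else:
                raise OverflowError("sequence exhausted and Cycle is disabled")
        value = self.current
        self.current += self.increment
        return value

    def reset(self):
        # With Reset enabled, the next mapping run starts over from the start value.
        self.current = self.start

seq = SequenceGenerator(start=1, increment=1, end=3, cycle=True)
values = [seq.nextval() for _ in range(5)]
print(values)  # [1, 2, 3, 1, 2]
```

With Cycle enabled, the sequence wraps around after reaching the end value; with Reset enabled, `reset()` models the Integration Service starting the next run from the start value instead of the saved current value.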

*In case of any questions, feel free to leave comments on this page and I will get back as soon as I can.

Monday, November 17, 2014

Informatica Powercenter Express - Rank Transformation


The Rank transformation is an active transformation that limits records to a top or bottom range. Use a Rank transformation to return the largest or smallest numeric value in a port or group, or to return the strings at the top or the bottom of a mapping sort order.

Rank Transformation Ports
Output Passes data to downstream transformations.
Pass-Through Passes data unchanged.
Variable Used for local variables. You can use a variable port to store values or calculations to use in an expression. Variable ports cannot be input or output ports. They pass data within the transformation.
Group by Specifies the ports that define the groups within which rows are ranked.
Rank Port Designates the column for which you want to rank values.

Rank Index
The Developer tool creates a RANKINDEX port for each Rank transformation. The Data Integration Service uses the Rank Index port to store the ranking position for each row in a group. After the Rank transformation identifies all rows that belong to a top or bottom rank, it then assigns rank index values. If two rank values match, they receive the same value in the rank index and the transformation skips the next value. The RANKINDEX is an output port only. You can pass the rank index to another transformation in the mapping or directly to a target.
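The tie behavior described above (matching values share a rank index and the next index is skipped, like SQL's RANK() rather than DENSE_RANK()) can be sketched in Python. The function name and sample values are hypothetical.

```python
def rank_index(values, top=True):
    """Sketch of RANKINDEX assignment: tied values receive the same
    rank index, and the next index is skipped."""
    ordered = sorted(values, reverse=top)
    indexes = []
    for pos, v in enumerate(ordered, start=1):
        if indexes and v == ordered[pos - 2]:
            indexes.append(indexes[-1])  # tie: reuse the previous rank index
        else:
            indexes.append(pos)          # the skip falls out of using the position
    return list(zip(ordered, indexes))

print(rank_index([90, 85, 90, 70]))
# [(90, 1), (90, 1), (85, 3), (70, 4)]  -- index 2 is skipped after the tie
```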

Advanced Properties for Rank Transformation
Top/Bottom Specifies whether you want the top or bottom ranking for a column.
Number of Ranks Number of rows to include in the top or bottom ranking.
Case-Sensitive String Comparison Specifies whether the Data Integration Service uses case-sensitive string comparisons when it ranks strings. Clear this option to have the Data Integration Service ignore case for strings. 
Cache Directory Local directory where the Data Integration Service creates the index cache files and data cache files. Default is the CacheDir system parameter.
Rank Data Cache Size Data cache size for the transformation. Default is Auto.
Rank Index Cache Size Index cache size for the transformation. Default is Auto.
Tracing Level Amount of detail that appears in the log for this transformation. You can choose terse, normal, verbose initialization, or verbose data. Default is normal.

Scenario for Rank Transformation
Find the two highest-paid employees in each department.
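The scenario above corresponds to a Rank transformation with the salary column as the rank port, department as the group-by port, Top/Bottom set to Top, and Number of Ranks set to 2. A minimal Python sketch with hypothetical sample rows:

```python
from collections import defaultdict

# Hypothetical sample rows: (name, department, salary)
employees = [
    ("Amy", "Sales", 9000), ("Bob", "Sales", 7000), ("Cal", "Sales", 6000),
    ("Dee", "IT", 8000), ("Eli", "IT", 8500),
]

# Group by department, then keep the top 2 salaries per group,
# as a Rank transformation with Number of Ranks = 2 would.
groups = defaultdict(list)
for name, dept, salary in employees:
    groups[dept].append((name, salary))

top2 = {dept: sorted(rows, key=lambda r: r[1], reverse=True)[:2]
        for dept, rows in groups.items()}
print(top2)
```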


*In case of any questions, feel free to leave comments on this page and I will get back as soon as I can.

Friday, November 14, 2014

ETL & Informatica


ETL stands for Extraction, Transformation and Loading. Informatica provides a single enterprise data integration platform to help access, transform, and integrate data from a large variety of systems and deliver that information to other transactional systems, real-time business processes and users.

Extraction
The first part of an ETL process involves extracting the data from the source systems. PowerCenter has the ability to extract structured, unstructured, semi-structured, relational, mainframe, or file data.

Transformation
In this stage, a series of rules or functions are applied to the data extracted from the source. Some data sources require very little or even no manipulation of data. Informatica Developer provides a set of transformations that perform specific functions such as selecting only certain columns to load, deriving a new calculated value, filtering, sorting, joining data from multiple sources, aggregation, generating surrogate-key values, etc.

Loading
The load phase writes the data into the end target. Some data is loaded to the target directly without any transformation logic, while other data is loaded only after the transformation logic or business rules are applied.

Thursday, November 13, 2014

A Look at Informatica Powercenter Express


Informatica PowerCenter Express is a basic ETL tool with potential for growth, useful for small deployments. It is an easy tool to use: you drag and drop objects to design a process flow for data extraction, transformation, and loading. PowerCenter Express is not PowerCenter; the free Express version can only process a quarter of a million rows per day. This tool can be used for moving small amounts of data to databases, for training, for testing, and for proofs of concept that may or may not make it into production in the enterprise.

Every transformation you can find in PowerCenter is in Express as well. We can move data from relational databases, flat files, and XML, and even unstructured and social media data from sources like Twitter and Facebook. PowerCenter Express combines an ETL tool with a data profiling tool, so you can drill in right there to see and show the data you're working with without switching tools. Informatica has also dropped the source and target distinction.

With PowerCenter Express, departmental groups and mid-sized companies can quickly and cost-effectively incorporate data integration capabilities into daily business practices to unleash the potential of their data.
PowerCenter Express installs in about 10 minutes and is available as a free download from the Informatica Marketplace.


Monday, November 10, 2014

Informatica Powercenter Express - Expression Transformation


The Expression transformation is a passive transformation that you can use to perform non-aggregate calculations in a single row. You can also use the Expression transformation to test conditional statements before you pass the results to a target or other transformations. You can enter multiple expressions in a single Expression transformation by creating an expression for each output port.

Expression Transformation Ports
Input Receives data from upstream transformations. Input ports provide values used in a calculation.
Output Provides the return value of the expression. You enter the expression as a configuration option for the output port. You can also configure a default value for each port.
Variable Used for local variables.

Advanced Properties for Expression Transformation

Tracing Level Amount of detail that appears in the log for this transformation. You can choose terse, normal, verbose initialization, or verbose data. Default is normal.

Scenario for Expression transformation

Calculate the new salaries for employees after a 10% salary hike.
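In the transformation this would be a single expression on an output port, something like SALARY * 1.10, evaluated once per row. A minimal Python sketch with hypothetical port names and sample rows:

```python
# One expression per output port; here NEW_SALARY = SALARY * 1.10,
# applied row by row as a passive Expression transformation would.
rows = [{"EMPNO": 1, "SALARY": 1000.0}, {"EMPNO": 2, "SALARY": 2500.0}]

for row in rows:
    row["NEW_SALARY"] = round(row["SALARY"] * 1.10, 2)

print([r["NEW_SALARY"] for r in rows])  # [1100.0, 2750.0]
```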


*In case of any questions, feel free to leave comments on this page and I will get back as soon as I can.

Saturday, November 8, 2014

Informatica Powercenter Express - Joiner Transformation


The Joiner transformation joins source data from two related heterogeneous sources residing in different locations or file systems. You can also join data from the same source. The Joiner transformation uses a condition that matches one or more pairs of columns between the two sources.

Joiner Ports
The two input pipelines include a master pipeline and a detail pipeline or a master and a detail branch. The master pipeline ends at the Joiner transformation, while the detail pipeline continues to the target.
Master
Ports that link to the master source in the mapping.
Detail
Ports that link to the detail source in the mapping.

Join Types
The Joiner transformation supports the following types of join:
Normal Join
In this join, the Data Integration Service discards all rows of data from the master and detail source that do not match, based on the condition.
Master Outer Join
This join keeps all rows of data from the detail source and the matching rows from the master source. It discards the unmatched rows from the master source.
Detail Outer Join
This join keeps all rows of data from the master source and the matching rows from the detail source. It discards the unmatched rows from the detail source.
Full Outer Join
This join keeps all rows of data from both the master and detail sources.
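The four join types can be sketched in Python over lists of dicts. This is a conceptual model only; the function and column names are hypothetical, and unmatched rows simply pass through with their own columns rather than being padded with nulls as the Data Integration Service would do.

```python
def joiner(master, detail, key, join_type="normal"):
    """Sketch of Joiner join types. The condition matches
    master[key] == detail[key]."""
    out = []
    matched_master = set()
    for d in detail:
        hits = [m for m in master if m[key] == d[key]]
        if hits:
            for m in hits:
                matched_master.add(id(m))
                out.append({**m, **d})       # matched row: merged columns
        elif join_type in ("master_outer", "full_outer"):
            out.append(d)                    # keep unmatched detail rows
    if join_type in ("detail_outer", "full_outer"):
        out.extend(m for m in master if id(m) not in matched_master)
    return out

master = [{"DEPTNO": 10, "DNAME": "Sales"}, {"DEPTNO": 20, "DNAME": "IT"}]
detail = [{"DEPTNO": 10, "ENAME": "Amy"}, {"DEPTNO": 30, "ENAME": "Bob"}]

print(joiner(master, detail, "DEPTNO", "normal"))
# [{'DEPTNO': 10, 'DNAME': 'Sales', 'ENAME': 'Amy'}]
```

Note the naming: a master outer join keeps all detail rows (discarding unmatched master rows), while a detail outer join keeps all master rows, exactly as described above.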

Advanced Properties for Joiner Transformation

Joiner Data Cache Size
Data cache size for the transformation. Default is Auto.
Joiner Index Cache Size
Index cache size for the transformation. Default is Auto.
Cache Directory
Local directory where the Data Integration Service creates the index cache files and data cache files. Default is the CacheDir system parameter.
Sorted Input
Indicates that input data is presorted by groups. Choose Sorted Input to join sorted data.
Master Sort Order
Specifies the sort order of the master source data. Choose Ascending if the master source data is in ascending order.
Tracing Level
Amount of detail that appears in the log for this transformation. Default is normal.

Tips to Improve performance while using Joiner Transformation

  • Designate the master source as the source with fewer duplicate key values.  If the master source contains many rows with the same key value, the Data Integration Service must cache more rows, which can decrease performance.
  • Designate the master source as the source with fewer rows. The Joiner transformation compares each row of the detail source against the master source. The fewer rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
  • Perform joins in a database when possible. Performing a join in a database is faster than performing a join during the mapping run. The type of database join you use can affect performance. Normal joins are faster than outer joins and result in fewer rows. 
  • Join sorted data when possible. The Data Integration Service increases performance by minimizing disk input and disk output. You see the greatest performance increase when you work with large data sets.

Scenario for Joiner transformation

Join the department name from department table to employee table
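This scenario is a normal join on the department number, with the department table as the master source since it has fewer rows (per the performance tips above, the master is the side that gets cached). A minimal Python sketch with hypothetical sample tables:

```python
# Hypothetical sample tables; DEPT is the master source (fewer rows)
# and is cached as a lookup dict, mirroring how the master side is cached.
departments = {10: "Sales", 20: "IT"}                  # DEPTNO -> DNAME
employees = [("Amy", 10), ("Bob", 20), ("Cal", 10)]    # (ENAME, DEPTNO)

# Normal join on DEPTNO: unmatched rows on either side are discarded.
joined = [(name, departments[deptno]) for name, deptno in employees
          if deptno in departments]
print(joined)  # [('Amy', 'Sales'), ('Bob', 'IT'), ('Cal', 'Sales')]
```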


*In case of any questions, feel free to leave comments on this page and I will get back as soon as I can.