Thursday, January 22, 2015

Unconnected Lookup Transformation In Informatica - Role Playing Date Dimension


Unconnected Lookup Transformation is a passive transformation since it can return only one column from each row to the return port. It is not connected to any other transformations in the mapping.

  • It receives input from the result of a :LKP expression from transformations such as an Expression transformation or Aggregator transformation or Update Strategy Transformation.
  • The Integration Service queries the lookup source or cache based on the lookup ports and condition in the transformation.
  • The :LKP expression passes parameters to the Lookup transformation and receives a result back from the lookup.
  • If there is no match for the lookup condition, the Integration Service returns NULL.
  • You can call the Lookup transformation more than once in a mapping.
  • Unlike Connected lookup, it can use only Static cache.

The main advantage of using an unconnected lookup is the fact that you can use it in any other transformation like a function  in other programming languages. The Lookup Cache is calculated only once as part of the session initialization and hence more efficient, since all subsequent look-ups will re-use the Cache.
The only constraint is that the cache has to be static, i.e. the data in your look-up table shouldn’t be changing once the lookup Cache has been created.

Scenario
Lookup Order Date, Due Date and Ship date in Date Dimension.



*In case of any questions, feel free to leave comments on this page and I would get back as soon as I can.

Friday, January 16, 2015

Extracting Twitter Data and Using it in a Mapping - Informatica Powercenter Express


In the Previous blog, we saw how to create a Twitter App, configure Open Authentication and create a Twitter Connection. The next step is to create the Data Object


Create New Data Object

A Twitter data object is a physical data object that represents data based on Twitter resource such as Followers, Profile and Place.
1. Create a new Data Object by clicking File -> New -> Data Object and choose Twitter Data Object.
2. Give a name to the Data Object, location and choose connection created in the above steps. When creating a new data object Selected Resources section is empty. You need to choose resources by clicking Add button.
3. There will be various Twitter Resources shown like Search, Profile lookup, Followers and Friends. For now, just select Search as we are going to search for the tweets based on keywords. Click OK and complete the Data Object creation step by clicking Finish.

Create New Data Object Operation

Create new Data Object Operation on Data Object just created by right clicking on Twitter_Data_Object -> New -> Data Object Operation.
Name the Data Object Operation and select the resource that we added during the creation of data object i.e., Search. Click on Finish button.
Informatica creates two operations Search and Output. In the properties of Output read operation, go to Advanced tab. Use Query Parameter to specify search criteria for tweets to search and endsAfter property can be used to limit number of records fetched by the connection.
Query parameter: QUERY=Informatica


Twitter Data Object Operation Advanced Properties

Query Parameter is the Twitter search criteria. Each resource requires one of the mandatory parameters.
Followers - screen_name,user_id
Friends - screen_name,user_id
Profile Lookup - screen_name,user_id
Search - QUERY
For example, you specify the screen name in the following query parameter to extract the Twitter profile of the Twitter screen name xyzscreen_name=xyz

Token file  is the file name and format of a file that contains a list of access tokens and access secrets. The CSV file must be on the machine where Informatica Services is installed. The Data Integration Service uses the access token and access secret at run time to establish a connection. The Token File is applicale for Followers and Friends Twitter Resource.

Ends After  controls the number of rows requested. Specify a time or a positive numeral:
We can specify the amount of time that the Data Integration Service runs the mapping. Enter the duration in the format, hh:mm. Default is blank, which indicates that the Data Integration Service runs the mapping until you stop it or until there is data.
The Data Integration Service will request the minimum number of rows which is 100. The maximum number of rows requested is based on the rows returned by the Twitter application. Enter the number of rows as a positive numeral.

Create  Mapping

In the Developer, using above Data Object as source and any relational database or flat file as target, mapping can be created.

  

*In case of any questions, feel free to leave comments on this page and I would get back as soon as I can.

Saturday, January 10, 2015

Connecting to Twitter - Informatica Powercenter Express


We can create a Twitter connection to read historical or real-time Twitter data through the Data Integration Service. For example, we can create a Twitter connection and define a query that searches for tweets containing a keyword such as Informatica. For real-time data, use a Twitter streaming connection to continuously read tweets that include a keyword. You can load the extracted data to a target and then use the data for data mining and analysis. 

Create Twitter Application

1. If you are creating a Twitter Connection for the first time, you would need to create a Twitter application  first on Twitter Developer website https://apps.twitter.com/

2. Then login and select the Create New App on the Right hand top corner.

3. Fill up the form with details like Name of the app, description, website and Callback URL. Note that, Callback URL can be a placeholder URL. If you keep it empty, it might give you error while fetching data using Informatica.

4. Agree to the Developer Agreement and Create your Twitter Application.

5. Once the Application is created, the screen will show Consumer Key. Click on the manage Keys and access Tokens . Note the Consumer Key, Consumer Secret, Access Token and Access Token Secret information, since it is required during Twitter Connection in Informatica PowerCenter Express.

 
 
Open Authentication Configuration

A Twitter connection and a Twitter Streaming connection requires open authentication (OAuth). You configure OAuth in the OAuth Utility to send secure authorized requests to Twitter API.
1. Open OAuth utility by using the URL http://<hostname.domain.com>:<port>/ows

2. Select Twitter in the Application.

3. Fill in the Consumer Key and Consumer Secret and Authorize.

4. In the Next page, select the Authorize App.

Create Twitter Connection in Informatica Developer

In this blog, I  will create a Twitter connection to pull history data, similar steps can be followed to use Twitter Streaming Connection.
1. Go to Create New Connection and select the connection type as Social Media.

2. Enter Connection name and click next.


3. Enter the consumer key and consumer secret.



4. Select yes for Oauth Details and fill in the Access Token and Access Secret.

 
5. Click Test Connection to verify the connection to the Twitter web site and Finish.

* In the Next Blog, I have covered how to Create a Data Object,  Data Object Operation and Using this Twitter data in a Mapping.