Joiner transformation joins source data from
two related heterogeneous sources from different locations or from different
file systems. You can also join data from the same source. The Joiner
transformation uses a condition that
matches one or more pairs of columns between the two sources.
Joiner Ports
The two input
pipelines include a master pipeline and a
detail pipeline or a master and a detail
branch. The master pipeline ends at the Joiner transformation, while the detail
pipeline continues to the target.
Master
Ports that link to
the master source in the mapping.
Detail
Ports that link to
the detail source in the mapping.
Join Types
The Joiner
transformation supports the following types of join
Normal Join
In this join, the
Data Integration Service discards all rows of data from the master and detail
source that do not match, based on the condition.
Master Outer Join
This join keeps all
rows of data from the detail source and the matching rows from the master
source. It discards the unmatched rows from the master source.
Detail Outer Join
This join keeps all
rows of data from the master source and the matching rows from the detail
source. It discards the unmatched rows from the detail source.
Full Outer Join
This join keeps all
rows of data from both the master and detail sources.
Advanced properties for joiner transformation
Joiner Data Cache Size
Data cache size for
the transformation. Default is Auto.
Joiner Index Cache Size
Index cache size for
the transformation. Default is Auto.
Cache Directory
Local directory
where the Data Integration Service creates the index cache files and data cache
files. Default is the CacheDir system parameter.
Sorted Input
Indicates that input
data is presorted by groups. Choose Sorted Input to join sorted data.
Master Sort Order
Specifies the sort
order of the master source data. Choose Ascending if the master source data is
in ascending order.
Tracing Level
Amount of detail
that appears in the log for this transformation. Default is normal.
Tips to Improve performance while using Joiner
Transformation
- Designate the master source as the source with fewer duplicate key values. If the master source contains many rows with the same key value, the Data Integration Service must cache more rows, which can decrease performance.
- Designate the master source as the source with fewer rows. The Joiner transformation compares each row of the detail source against the master source. The fewer rows in the master, the fewer iterations of the join comparison occur, which speeds the join process.
- Perform joins in a database when possible. Performing a join in a database is faster than performing a join in during the mapping run . The type of database join you use can affect performance. Normal joins are faster than outer joins and result in fewer rows.
- Join sorted data when possible. The Data Integration Service increases performance by minimizing disk input and disk output. You see the greatest performance increase when you work with large data sets.
Scenario for Joiner transformation
Join the department
name from department table to employee table
*In case of any
questions, feel free to leave comments on this page and I would get back as
soon as I can.
No comments:
Post a Comment