Hi ,
I am new to clover etl and I am making changes to an existing clover ETL application and here is where I was getting stuck :
I have two table A and B and I am extracting columns from the database A to the Datamart table A_DM using HASH JOIN :
<Node id="JOIN_TABLES"
type="HASH_JOIN"
joinKey="EMPLOYEE_ID"
transformClass="clover.xforms.RecordTransformDefault"
leftOuterJoin="false"
hashTableSize="1">
<attr name="cloverLang:1">
${out.0.EMP_PHONE_NUM} = ${in.1.EMP_PHONE};
</attr>
</Node>
How does the HASH JOIN know how to join the 2 tables , what I mean by that is does the type HASH JOIN know the primary keys , foreign keys ? If so how does it know ?
I am new to clover etl and I am making changes to an existing clover ETL application and here is where I was getting stuck :
I have two table A and B and I am extracting columns from the database A to the Datamart table A_DM using HASH JOIN :
<Node id="JOIN_TABLES"
type="HASH_JOIN"
joinKey="EMPLOYEE_ID"
transformClass="clover.xforms.RecordTransformDefault"
leftOuterJoin="false"
hashTableSize="1">
<attr name="cloverLang:1">
${out.0.EMP_PHONE_NUM} = ${in.1.EMP_PHONE};
</attr>
</Node>
How does the HASH JOIN know how to join the 2 tables , what I mean by that is does the type HASH JOIN know the primary keys , foreign keys ? If so how does it know ?
-
Hello,
you have to define keys in your Joiner: attribute joinKey should look like $EmployeeID=$EmployeeID;#, where the left side defines primary key and the right side - foreign key. When the key name is in both cases the same, you can use only joinKey=EmployeeID like in your example. For complete documentation see HashJoin component; checking HashJoin example could be useful too. -
Thanks Agata , that did help me !!
Further more the issue I am dealing with is is that I want to Join three tables and all the three tables do not have any one key in common , for example :
Tables :
1) Employee : Key : Employee_ID
2) Customer : Key : Employee_ID, ORDER_ID
3) Order : Key : Order_ID
Can i do some thing like this which would join these 3 tables :
<Node id="JOIN_TABLES"
type="HASH_JOIN"
joinKey="Employee_ID#ORDER_ID=Order_ID"
transformClass="comclover.xforms.RecordTransformDefault"
leftOuterJoin="false"
hashTableSize="1">
Does this provider a proper join since only one table has common keys, in this case that is the "Customer" table.
If not I was planning on Joining the Employee & Customer and then take the result of this join and join it with Order ?
Second part of the question :
I want to use the following type DB_INPUT_TABLE based on the previous join :
<Node id="UPD_FACT_TABLE"
type="DB_INPUT_TABLE"
commit="12345"
batchMode="Y"
dbConnection="ORACLE_DATAMART"
dbColumns="ORDER_ID:EMPLOYEE_ID"
sourceTable="MY_DB.TRACK_ORDER"
sqlQuery="select TRACKING_TYPE from TRACK_ORDER where ORDER_ID = ? and EMPLOYEED_ID = ? />
</Phase>
How does the '?' know that they have to be replaced buy ORDER_ID and EMPLOYEE_ID from the previous HASH_JOIN ??
Please let me know, I am trying various options and unable to get any thing working. -
Hello,
your key for three tables is proper or you can write it more explicitly: joinKey=$Employee_ID=$Employee_ID;#$ORDER_ID=$Order_ID;#". For joining data from database you can't use DBInputTable - this only for reading data; for joining data use DBJoin component. See also examples: DBJoin example and DBUnloadParametrized example.[/i] -
Hi Agata,
I am using version 2.1 and it only offers HASH_JOIN and it does not support 3 way join. so I tried doing some thing like this.
<Node id="[b]JOIN_TABLES_1_&_2[/b]"
type="HASH_JOIN"
joinKey="Employee_ID"
transformClass="com.clover.xforms.RecordTransformDefault"
leftOuterJoin="false"
hashTableSize="1">
<Node id="[b]JOIN_TABLE_1_&_2-WITH_3[/b]"
type="HASH_JOIN"
joinKey="ORDER_ID=Order_ID"
transformClass="com.clover.xforms.RecordTransformDefault"
leftOuterJoin="false"
hashTableSize="1">
THE EDGE looks like this
<Edge id="JOIN_TABLES_1_&_2_to_JOIN_TABLE_1_&_2-WITH_3"
fromNode="JOIN_TABLES_1_&_2:0"
toNode="JOIN_TABLE_1_&_2-WITH_3:0"
metadata="employee_customer_order_superset.fmt"/>
where the superset involves the metadata of all the three tables : employee,customer,order.
The issue is JOIN_TABLE_1_&_2-WITH_3 is refusing to accept the join key: ORDER_ID=Order_ID and throws a message that
java.lang.RuntimeException: Field name specified as a key doesn't exist: ORDER_ID=Order_ID
at org.jetel.data.RecordKey.init(RecordKey.java:117)
at org.jetel.component.HashJoin.init(HashJoin.java:299)
at org.jetel.graph.Phase.init(Phase.java:150) -
In such old version use joinKey attribute for primary key and slaveOverrideKey for foreign key. By the way why do use so old unsupported version? -
I am working on an existing application, and I have a very specific change and the time line doe not allow me to change the version and retest the entire application. That is the reason I am stuck to the older version. -
Hi Agata,
I did a work around the joins, what I did was:
1) created a View in the data base which select all the columns.
2) Call the recordTransformDefaults guy and the I ma trying to insert.
But along the path when I reach a particular point it throws he this error :
component.DBOutputTable (run:395) - RESULT_ERROR:
java.sql.SQLException: Incompatible Clover & JDBC field types Clover type: string
I checked the metadata for the particular column its of the type string in both the places i.e from and to coloumn.
Any Ideas ? -
Try to extract metadata from db (in GUI or by AnalyzeDB) and then check fields' types.
Please sign in to leave a comment.
Comments 8