Hi All,
I am wondering if anybody can help with the below question/query to do with building Foreign Key Relationships within a database table generated by CloverETL.
I have a dimension table (customer) which has a unique id (customer_id). The customer ID is a surrogate key generated when written to the database and there is 20,000 rows (approx).
When I load the orders table, I need to reference back to the customer via customer_id. To do this I use a merge join to merge orders & Order_Lines together.
I do not have the customer_id when merging the data but do have customer_account_code – which is unique but is a legacy data attribute.
In the customer dimension I have customer_id, customer_account_code: A direct one to one mapping.
My Question is; what would be the simplest, efficient method to resolve customer_id from customer_account_code?
I tried to use DBLookupTable with no success – would anyone have an example? The issue I had was ArrayOutOfBounds on DBLookupTable:153.
Is DBLookupTable the most efficient method as I know it implements caching, however, would you have 20,000 calls to the database if all customer had ordered?
I don't really want to start writing dbinput<-->temp file and reading it into to a simpleLookup object so Ideally, I require a lookuptable that is stored in memory and is populated by a DBInput Table. I can then interrogate the memory for the customer_id from the mergeJoin Component? Would one exist/simple to develop?
Any help would be greatly appreciated.
Many Regards,
Ian.
I am wondering if anybody can help with the below question/query to do with building Foreign Key Relationships within a database table generated by CloverETL.
I have a dimension table (customer) which has a unique id (customer_id). The customer ID is a surrogate key generated when written to the database and there is 20,000 rows (approx).
When I load the orders table, I need to reference back to the customer via customer_id. To do this I use a merge join to merge orders & Order_Lines together.
I do not have the customer_id when merging the data but do have customer_account_code – which is unique but is a legacy data attribute.
In the customer dimension I have customer_id, customer_account_code: A direct one to one mapping.
My Question is; what would be the simplest, efficient method to resolve customer_id from customer_account_code?
I tried to use DBLookupTable with no success – would anyone have an example? The issue I had was ArrayOutOfBounds on DBLookupTable:153.
Is DBLookupTable the most efficient method as I know it implements caching, however, would you have 20,000 calls to the database if all customer had ordered?
I don't really want to start writing dbinput<-->temp file and reading it into to a simpleLookup object so Ideally, I require a lookuptable that is stored in memory and is populated by a DBInput Table. I can then interrogate the memory for the customer_id from the mergeJoin Component? Would one exist/simple to develop?
Any help would be greatly appreciated.
Many Regards,
Ian.
-
Hello Ian !
I just stumbled on this post. Apparently it has slipped through my attention.
To make the reply short:
1) if your 20k customers ordered something and you are using DBLookupTable, then you will definitely access DB 20k times - no caching can improve this
2) You can now feed SimpleLookupTable from DB - using SQLDataParser - as of version 1.1.7 - but you have to manualy set up everything. For using temp file, Clover now offers a lookup table definition stored in XML graph definition.
3) If you use mergeJoin with slave records coming from DBInputTable, you may join it with master data (orders) - it is almost equivalent to building lookup table in memory.
David.
Please sign in to leave a comment.
Comments 1