Hello,
Has it ever been tested, or is it known from reasoning, which performs better in CloverETL: multiple smaller joins or one huge join?
(Assume ExtMergeJoin component)
Many Thanks,
Parsa
-
Hi Parsa,
ExtMergeJoin processes record streams from its input ports that are already sorted by the join key. Because of this, it is very fast and memory efficient, whether it processes one large dataset or many smaller ones.
Most of the work is done before ExtMergeJoin: reading records into memory, sorting them, etc. Here I recommend reading already sorted data (ORDER BY in SQL, sorted text files, ...) rather than sorting as part of the ETL.
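To illustrate why a merge join over pre-sorted inputs stays cheap, here is a minimal Python sketch of the general sort-merge join technique. It is not CloverETL's actual implementation; the function name merge_join and the sample data are made up for illustration. It assumes both inputs are already sorted ascending by the join key and performs an inner join.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are ALREADY sorted ascending by `key`.

    Hypothetical helper for illustration only. Each input is read once,
    front to back; only the right-hand records of the current key are
    held in memory.
    """
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    end = (None, None)  # sentinel meaning "this stream is exhausted"

    lk, lg = next(left_groups, end)
    rk, rg = next(right_groups, end)
    while lg is not None and rg is not None:
        if lk < rk:
            # Left key has no partner yet: advance the left stream.
            lk, lg = next(left_groups, end)
        elif lk > rk:
            # Right key has no partner yet: advance the right stream.
            rk, rg = next(right_groups, end)
        else:
            # Keys match: pair every left record with every right record.
            right_rows = list(rg)
            for left_row in lg:
                for right_row in right_rows:
                    yield left_row, right_row
            lk, lg = next(left_groups, end)
            rk, rg = next(right_groups, end)


# Made-up sample data, sorted by the first field (the join key).
customers = [(1, "Ann"), (2, "Bob"), (4, "Dee")]
orders = [(1, "book"), (1, "pen"), (3, "cup"), (4, "mug")]

for customer, order in merge_join(customers, orders):
    print(customer, order)
```

Since each stream is only walked once and nothing beyond the current key group is buffered, the cost of the join itself grows linearly with the input size. The expensive part is getting the data sorted in the first place.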
I hope this helps.