
Multiple small joins VS one big join - performance-wise?!

Comments 1

  • admin
    Hi Parsa,

    ExtMergeJoin processes record streams from its input ports that are already sorted by the join key. Because of this, it is extremely fast and memory efficient, whether it processes one large dataset or many smaller ones.
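
To illustrate why sorted inputs keep a merge join cheap, here is a minimal Python sketch of the general technique. It is not ExtMergeJoin's actual implementation; the function name `merge_join` and the tuple-based record layout are assumptions made for illustration. Both inputs are read once, front to back, and only the rows sharing the current key on the right side are held in memory.

```python
from itertools import groupby
from operator import itemgetter


def merge_join(left, right, key=itemgetter(0)):
    """Inner-join two iterables that are ALREADY sorted ascending by key.

    Hypothetical helper for illustration only. Yields (left_row, right_row)
    pairs for every combination of rows with equal keys.
    """
    # groupby bundles consecutive rows with the same key, which is only
    # meaningful because both inputs are sorted.
    left_groups = groupby(left, key)
    right_groups = groupby(right, key)
    end = (None, None)  # sentinel returned when an input is exhausted

    lkey, lrows = next(left_groups, end)
    rkey, rrows = next(right_groups, end)
    while lrows is not None and rrows is not None:
        if lkey < rkey:
            # No partner on the right: skip this left group.
            lkey, lrows = next(left_groups, end)
        elif lkey > rkey:
            # No partner on the left: skip this right group.
            rkey, rrows = next(right_groups, end)
        else:
            # Keys match: buffer only the right rows of this one key.
            buffered = list(rrows)
            for lrow in lrows:
                for rrow in buffered:
                    yield lrow, rrow
            lkey, lrows = next(left_groups, end)
            rkey, rrows = next(right_groups, end)
```

Because neither input is ever rewound or fully loaded, the cost is a single linear pass over each stream regardless of how the data is split up.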

    Most of the work is done before ExtMergeJoin: reading the records into memory, sorting them, and so on. I recommend reading data that is already sorted (ORDER BY in SQL, sorted text files, ...) rather than sorting it as part of the ETL.
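
As a rough sketch of the "read already sorted data" advice, the snippet below lets the database do the ordering with ORDER BY so that no separate sort step is needed downstream. It uses Python's standard `sqlite3` module purely as a stand-in; the helper name `read_sorted` and any table or column names passed to it are hypothetical.

```python
import sqlite3


def read_sorted(conn, table, key_column):
    """Yield all rows of `table`, ordered by `key_column` in the database.

    Hypothetical helper for illustration. The table and column names are
    interpolated into the SQL text, so pass only trusted identifiers.
    """
    query = f'SELECT * FROM "{table}" ORDER BY "{key_column}"'
    # Iterating the cursor streams rows one by one, already in key order.
    yield from conn.execute(query)
```

An index on the join key usually lets the database return rows in order without an explicit sort of its own, which is where this approach tends to pay off.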

    I hope this helps.
