Performance Tuning

  • dpavlis
    Hi!

    From your list of optimization items, 1) and 2) should help the most. "ParallelGC" is useful only if you have enough CPUs. Since you mentioned that only 2 are available and your graph is probably complex, that option may not help. Increasing the sizes of internal buffers usually does not help much. As for giving your JVM more memory - that will help only if you are sorting data in your transformation and need to increase the sort component's buffer size for in-memory sorting.

    Some general rules which you may apply during the graph refactoring:

    • when parsing data, try to convert them immediately into the "native" type - int, long, date, etc.

    • process only data fields which you need - i.e. if you have on input 15 fields, but only 6 are needed down the road, drop the rest as soon as possible (using Reformat, for instance)

    • if you are sorting data, make sure you give the sorter enough memory (so it can sort as much data as possible in memory without swapping to disk)

    • first filter, then sort (not the other way around, as we often see)

    • prefer hash joins, unless your input data are already sorted according to your join key
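
    To make the rules above a bit more concrete, here is a minimal sketch in plain Python. It is not the ETL tool's API, just an illustration of the same ideas: parse straight into native types, keep only the fields you need, filter before sorting, and join through a hash lookup instead of sorting both inputs. All names here (Order, parse_orders, filter_then_sort, hash_join) and the sample data are made up for the example.

```python
import csv
import io
from datetime import date, datetime
from typing import Dict, Iterable, Iterator, List, NamedTuple, Tuple

# Hypothetical sample input: five columns, of which only four are needed later.
RAW = """order_id,customer_id,amount_cents,placed_on,note
1,10,2500,2024-03-02,gift wrap
2,11,400,2024-03-01,
3,10,9900,2024-02-27,rush
4,12,1200,2024-03-03,
"""

# Hypothetical lookup side of the join (customer 12 is deliberately missing).
CUSTOMERS: Dict[int, str] = {10: "Acme", 11: "Birch"}


class Order(NamedTuple):
    """Only the fields needed downstream, already in native types."""
    order_id: int
    customer_id: int
    amount_cents: int
    placed_on: date


def parse_orders(raw: str) -> Iterator[Order]:
    """Convert each CSV row to native types at parse time and drop unused columns."""
    for row in csv.DictReader(io.StringIO(raw)):
        # The "note" column is never copied, so it costs nothing in later steps.
        yield Order(
            order_id=int(row["order_id"]),
            customer_id=int(row["customer_id"]),
            amount_cents=int(row["amount_cents"]),
            placed_on=datetime.strptime(row["placed_on"], "%Y-%m-%d").date(),
        )


def filter_then_sort(orders: Iterable[Order], min_cents: int) -> List[Order]:
    """Filter first so the sort only has to handle the rows that survive."""
    kept = [o for o in orders if o.amount_cents >= min_cents]
    kept.sort(key=lambda o: o.placed_on)
    return kept


def hash_join(orders: Iterable[Order], customers: Dict[int, str]) -> List[Tuple[int, str, int]]:
    """Inner join via a dictionary lookup; neither input needs to be sorted."""
    return [
        (o.order_id, customers[o.customer_id], o.amount_cents)
        for o in orders
        if o.customer_id in customers
    ]
```

    For example, hash_join(filter_then_sort(parse_orders(RAW), 1000), CUSTOMERS) sorts only the three orders that pass the filter and then joins them without any extra sort. If both inputs were already sorted on the join key, a merge-style join would be the alternative to consider.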
  • anweston
    Heya,

    Thank you so much for your reply! I will keep your tips in mind when refactoring the graph - I really appreciate the help...

    Anna :-)