Customer Portal

Pivot one table into another

Comments 6

  • Avatar
    avackova
    0
    Comment actions Permalink
    Hello,
    if you don't know what properties can be contained in GenericProperties table, it can be very hard. But if properties are known I would do it in following way:
  • Avatar
    twelveeighty
    0
    Comment actions Permalink

    Hello,
    if you don't know what properties can be contained in GenericProperties table, it can be very hard. But if properties are known I would do it in following way:

    "avackova"


    Fantastic! I kept on searching the manuals for "pivot", since I thought that was the standard term for turning row-based data into column-based data, but I guess it's called "denormalization" in CloverETL.

    And yes, the properties are known (or unknowns can be ignored), so this should work.

    Thanks!
  • Avatar
    twelveeighty
    0
    Comment actions Permalink
    OK - wait a minute - something's not right - I don't see the "denormalizer" in the Transformers "Palette" in the CloverETL Designer. I downloaded the Community Edition. Is this not included?
  • Avatar
    avackova
    0
    Comment actions Permalink
    Hello,
    in CloverETL CE it requires a little bit more work and some more time and memory for processing, but is also possible:
  • Avatar
    twelveeighty
    0
    Comment actions Permalink

    Hello,
    in CloverETL CE it requires a little bit more work and some more time and memory for processing, but is also possible:

    "avackova"


    Excellent. Thank you for that second solution.

    I also noticed that I am allowed to use the denormalizer node in the "open source" command-line driven version. I got it to work by editing the .frm file and use the XML syntax. Obviously, I can't use the GUI anymore after that on this file, but for my evaluation that is OK, since we'll be able to justify the "commercial" version once we know this works.

    I have one more technical question on these two approaches you suggested: 1) denormalizer + exthashjoin or 2) exthashjoin + dedup. In general, does CloverETL need to load the entire resultset in memory for the exthashjoin to work, or is it smart enough to only load as many rows from the "slave" table into RAM as it needs for a matching "key" in the master and then free up that memory for the next master key? My final dataset will be big and probably won't fit in RAM if the entire dataset must be loaded into RAM.
  • Avatar
    avackova
    0
    Comment actions Permalink
    ExtHashJoin loads all slave data into memory. ExtMergeJoin should be used for merging large data sets. It requires sorted data, but merge it "on the fly" and consume the memory for slaves with the same key only. ExtMergeJoin is available in commercial gui version and in open-source engine, but not in CE version.

Please sign in to leave a comment.