partition instance or partition number

Comments 6

  • etl_1234
    Let me rephrase my question to avoid confusion. If I have a Reformat with allocation set to 3 workers, is there a way in the Java or CTL2 code to fetch the current worker instance number, i.e. 0, 1 or 2?

    I am using a Cluster environment and am able to set the component allocation, i.e. the number of workers is set properly. I can also see that number of workers being executed. However, I need the worker instance number in my transformation.
  • dpavlis
    It looks like you are mixing up two concepts. One is partitioning records in a data flow into different output ports/edges of a graph; the Partition component can be used for this. It may be perceived as an "intelligent" CASE statement: you can send records out through different ports of the component based on a defined key or ranges of values, or use a CTL function to decide.
    See the Partition component help.
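
    To illustrate the "intelligent CASE statement" idea, here is a minimal plain-Java sketch of range-based routing. It is not the Partition component's actual interface; the class name, the method name getOutputPort and the range boundaries are all made up for illustration.

```java
public class RangePartitionSketch {
    // Hypothetical exclusive upper bounds for ports 0 and 1.
    // Keys at or above the last bound fall through to the final port.
    private static final int[] UPPER_BOUNDS = {100, 1000};

    /** Picks an output port for a numeric key, like a CASE over value ranges. */
    public static int getOutputPort(int key) {
        for (int port = 0; port < UPPER_BOUNDS.length; port++) {
            if (key < UPPER_BOUNDS[port]) {
                return port;
            }
        }
        return UPPER_BOUNDS.length; // the "default" branch
    }

    public static void main(String[] args) {
        for (int key : new int[] {5, 100, 250, 5000}) {
            System.out.println(key + " -> port " + getOutputPort(key));
        }
    }
}
```

    The point is only that the decision depends on the record's content, not on where the record will be processed.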

    Then there is a different concept: parallelizing the processing of data, where you can use the ClusterPartition component to split the data flow into chunks which are then processed in parallel. These chunks undergo the same processing but on different nodes of a cluster. The user can influence the distribution of records into individual chunks to a certain extent (as far as it makes sense). You may use round-robin distribution, hashing (based on a key), intervals, or again a user-defined CTL function, but without an explicit link to a physical node.
    That said, you can actually write a CTL partitioning function and "sort of" link it to a physical node. If ClusterPartition has CTL-defined partitioning, the init(integer partitionCount) function is called first. The partitionCount is the number of "output ports", i.e. the different worker nodes which will be used. Their order (and thus their numbers) should correspond to the allocation definition, if you have listed cluster nodes there. But be warned: if you change your allocation definition, you need to update your partitioning function.
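
    As a rough sketch of how such a partitioning function behaves, the plain-Java class below mimics the init(partitionCount) call and then assigns records by hash or round-robin. It does not use the real CloverETL classes; the class name, the byHash and roundRobin methods and the node names in the allocation list are assumptions made for the example.

```java
import java.util.List;

public class ClusterPartitionSketch {
    private int partitionCount; // number of "output ports" (workers)
    private int next;           // round-robin cursor

    /** Mirrors init(integer partitionCount): called once before any record is routed. */
    public void init(int partitionCount) {
        if (partitionCount <= 0) {
            throw new IllegalArgumentException("partitionCount must be positive");
        }
        this.partitionCount = partitionCount;
        this.next = 0;
    }

    /** Hash-based distribution: the same key always lands on the same port. */
    public int byHash(String key) {
        return Math.floorMod(key.hashCode(), partitionCount);
    }

    /** Round-robin distribution: ports are used in turn, 0, 1, 2, 0, ... */
    public int roundRobin() {
        int port = next;
        next = (next + 1) % partitionCount;
        return port;
    }

    public static void main(String[] args) {
        // Hypothetical allocation list; port i is assumed to map to entry i.
        List<String> allocation = List.of("node1", "node2", "node3");
        ClusterPartitionSketch partitioner = new ClusterPartitionSketch();
        partitioner.init(allocation.size());
        for (String key : List.of("alice", "bob", "carol")) {
            int port = partitioner.byHash(key);
            System.out.println(key + " -> port " + port + " (" + allocation.get(port) + ")");
        }
    }
}
```

    The port-to-node mapping in main is exactly the fragile part: it only holds as long as the order of nodes in the allocation stays the same.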

    I have asked our developers to check whether, if the partitioning function were written in Java, we could get an exact "match" between output port and worker node in the ClusterPartition component. In that case you could have "robust" partitioning that is resistant to changes in the allocation structure. We will see.

    Anyway, check the documentation of ClusterPartition; you may also check the presentation about CloverETL's clustering concept.
  • etl_1234
    Thank you David. Let me explain the scenario again. Sorry for the confusion.

    I have clusterCopy followed by a Reformat (allocation of 3 workers), which means I am copying the same set of records to 3 Reformat components. Now, in the Reformat transformation I need the worker instance, i.e. at run time it should be able to tell which worker instance number it is. I went through the Java code and saw there is a CloverWorker.class.

    Is there any way to get the particular worker instance number within the transformation?

    Log:

    Starting parallel worker on node "**" (1 of 3)
    Starting parallel worker on node "**" (2 of 3)
    Starting parallel worker on node "**" (3 of 3)

    I am looking for the numbers shown in the log above (the 1, 2 and 3 in "1 of 3", "2 of 3" and "3 of 3"). Is there any way to get the instance number?
  • dpavlis
    Hi,

    After consulting with our R&D guys, I can suggest the following two parameters, which are set globally for any graph running in clustered mode:

    • WORKER_ID - ID (number) of the current worker, e.g. 1, 2, 3 ... up to WORKER_COUNT

    • WORKER_COUNT - how many workers are running/processing the particular graph in total


    However, these parameters are set just "once" for the graph. If you use multiple allocations with different setups, these numbers might be misleading (they would be valid for only one of the cases).
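
    Since the original question asked for an instance number of 0, 1 or 2 while WORKER_ID starts at 1, a transformation would probably want to convert and sanity-check the value. The plain-Java sketch below shows one way to do that. It assumes the two parameter values reach the code as strings (for instance through the usual ${WORKER_ID} style of graph parameter reference); the class and method names are invented for the example.

```java
public class WorkerInfoSketch {
    /**
     * Converts the 1-based WORKER_ID into a 0-based instance number,
     * checking it against WORKER_COUNT first.
     */
    public static int zeroBasedInstance(String workerId, String workerCount) {
        int id = Integer.parseInt(workerId.trim());
        int count = Integer.parseInt(workerCount.trim());
        if (id < 1 || id > count) {
            throw new IllegalArgumentException(
                "WORKER_ID " + id + " is outside 1.." + count);
        }
        return id - 1;
    }

    public static void main(String[] args) {
        // Hypothetical resolved values for the second of three workers.
        int instance = zeroBasedInstance("2", "3");
        System.out.println("zero-based worker instance: " + instance);
    }
}
```

    The range check is there because of the caveat above: with several differently configured allocations in one graph, the two values may not describe the component you are in.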

    Hope this helps.
  • etl_1234
    Thank you David. Can you please let me know how this WORKER_ID can be used in the transformation?
  • etl_1234
    It's working fine and resolving to the correct value.

    Thank you David for your valuable inputs.