Pass Metadata and Filename as Parameter in Designer

pfield

March 25, 2020 00:00

Answered

Hi,

I would like to pass a filename and metadata into a graph via parameters. I did this before using Server, but I am now restricted to Designer and want to know whether it is still possible.

What I would like to happen is to have a parent graph feeding the child 2 parameters:
1) File Name (to be passed into the Reader URL)
2) Metadata (Metadata is externalized and would be used by the edge coming out of the Reader)

This way, I can use the same graph no matter how many files I need to read. Hoping there is a solution...

Thanks,
Paul

Comments 12

dpavlis

March 26, 2020 17:29
Hi Paul,

This is exactly what Server and JobFlow components are good for. If you need this to work in Designer without Server, you need to use parameters (an external parameter file) and re-create the content of the parameter file with new values each time before you run the child graph. There is also no direct way to execute a child graph from within a parent graph in Designer. Again, this is easily done on Server through JobFlow, but with Designer only it means manually executing the parent graph (which modifies a parameter file) and then manually executing the child graph.
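
As a rough illustration of the "re-create the parameter file before each run" workaround, the sketch below overwrites an external parameter file with new values. It assumes the XML layout (a `GraphParameters` root with `GraphParameter` elements carrying `name` and `value` attributes) found in recent CloverDX `.prm` files; that layout, the parameter names `INPUT_FILE` and `METADATA_FILE`, and the file name `child.prm` are assumptions for illustration and should be checked against a parameter file produced by your own Designer version.

```python
import xml.etree.ElementTree as ET


def write_parameter_file(path, params):
    """Overwrite `path` with one GraphParameter element per entry in `params`.

    Assumption: the target is an XML parameter file of the form
    <GraphParameters><GraphParameter name="..." value="..."/></GraphParameters>.
    """
    root = ET.Element("GraphParameters")
    for name, value in params.items():
        # Each parameter becomes an element with name/value attributes.
        ET.SubElement(root, "GraphParameter", name=name, value=value)
    ET.ElementTree(root).write(path, encoding="UTF-8", xml_declaration=True)
```

A call such as `write_parameter_file("child.prm", {"INPUT_FILE": "data-in/customers.csv", "METADATA_FILE": "meta/customers.fmt"})` would be run before each manual execution of the child graph, so that the child picks up the new values from the file.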
pfield

March 26, 2020 19:14
Thanks for the reply David.

Would there be any way of passing a parameter via RunGraph using command line arguments? See the related post below. This would allow for some form of synchronous execution.

https://forum.cloverdx.com/viewtopic.php?t=4576

Paul
dpavlis

April 02, 2020 12:33
Hello Paul,

If you define a parameter in your parent graph (where your RunGraph component is) and then define a parameter with the same name in your child graph, there is a way to pass the parameter value from the parent graph to the child.
You can use the "Graph parameters to pass" property (multiple names can be specified, separated by semicolons). This works only if "The same JVM" is set to true.

If you want to run the two graphs in separate JVMs, the parameters have to be specified as "Command line arguments".

I have attached 2 simple examples which illustrate that.
- Parent_Child_RunGraph_SameJVM.zip
- Parent_Child_RunGraph_SeparateJVM.zip
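
For the separate-JVM case, the "Command line arguments" value is a string of `-P:name=value` options, which is the syntax quoted further down in this thread. The sketch below shows one way a parent process might assemble such a string. The parameter names `FILE_NAME` and `METADATA_FILE` are hypothetical, and the helper is not part of CloverDX.

```python
def build_clover_args(params):
    """Return a string of -P:name=value options, one per parameter.

    Values are inserted verbatim; quoting of values that contain spaces is
    not handled here and would depend on how the arguments are parsed.
    """
    parts = []
    for name, value in params.items():
        # Reject names that could not be told apart from the value.
        if not name or "=" in name or any(ch.isspace() for ch in name):
            raise ValueError(f"invalid parameter name: {name!r}")
        parts.append(f"-P:{name}={value}")
    return " ".join(parts)


# Hypothetical names for the two values the parent should hand to the child.
args = build_clover_args({
    "FILE_NAME": "data-in/orders.csv",
    "METADATA_FILE": "meta/orders.fmt",
})
print(args)
# prints: -P:FILE_NAME=data-in/orders.csv -P:METADATA_FILE=meta/orders.fmt
```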
pfield

January 18, 2021 12:58
Hi David,

Using a different JVM works well, as I can pass different values through "cloverCmdLineArgs" on the RunGraph input. One drawback I can see is that, because it's in a different JVM, I cannot follow the process in the execution log. Is there a way to navigate to the graph in the execution log or access the alternate JVM used?

Also, I see that "-P:Parameter Name=Parameter Value" is used to pass in the parameter value, so I wondered what else can be passed into the JVM via command line arguments. A link to any documentation would be much appreciated.

Thanks!
Paul
dpavlis

January 21, 2021 08:29
Hi Paul,

Not sure what you are trying to achieve. If you would like the transformation graph executed in the separate JVM to log info into a specific file, you can use the "Log file URL" parameter of the RunGraph component.

The "-P:" command line parameter is not for the JVM but for the second (other) Clover runtime you end up executing. Various JVM parameters can be found in the official Java/JVM documentation - look for "java command". The most useful is probably -Xmx<size> (e.g. -Xmx1024M), which sets the maximum heap memory the JVM will use. This may be useful if your transformation executed through RunGraph needs a lot of memory to process all the data.

Nonetheless, as stated previously, this is not the officially supported way of orchestrating several data transformation jobs, and it has many drawbacks.
pfield

January 21, 2021 12:46
Hi David,

Thank you for the response - this is helpful.

When I launch a graph via RunGraph using the same JVM, I can track all graphs executed in the Execution Tab, click into them, see what data passed through the edge etc. When using a different JVM I cannot; the Execution Tab only shows the "parent" graph that uses the RunGraph. The Log File URL attribute on the RunGraph is definitely helpful, so thank you for referencing it.

I appreciate that using Clover Server is the best way to orchestrate synchronous processing and is a far better solution... however, for my use case I need to work within the capabilities of the Designer product.

Thanks!
Paul
dpavlis

January 22, 2021 10:21
Hi Paul,

I understand your situation. The problem with RunGraph executed from Designer (with "separate JVM" switched on) is that Clover ends up executing a totally independent process at the operating system level. Thus, it has no idea that there is some other CloverDX transformation running which produces tracking statistics etc. It only sees an OS-level process and treats it as if you had executed any other program.
The small advantage over using the SystemExecute component in this scenario is that, because you are using the RunGraph component, Clover knows that the OS-level process will be another Clover transformation and helps with passing some control data and parameters. But once the separate JVM and transformation are launched, Clover loses much of the control and can treat it only as if you had executed, for example, a shell script. There are STDIN/STDOUT/STDERR, information on whether the process finished and, if so, what the status code was.
Therefore, the Execution Tab lacks info which is otherwise available (if you run within the same JVM or on CloverDX Server).
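
The "independent OS-level process" view described above can be illustrated outside CloverDX. In the sketch below, a parent Python process launches a child process and afterwards knows only its standard output, standard error, and exit code, and nothing about what happened inside it. The child is a stand-in one-line Python script, not a CloverDX graph, and `run_child` is a hypothetical helper.

```python
import subprocess
import sys


def run_child(code):
    """Run `code` in a separate Python process and return what the parent can see."""
    result = subprocess.run(
        [sys.executable, "-c", code],
        capture_output=True,
        text=True,
    )
    return result.returncode, result.stdout, result.stderr


# Stand-in child: writes one line to each stream and exits with status 3.
child_code = (
    "import sys; "
    "print('processed batch'); "
    "print('warning: slow disk', file=sys.stderr); "
    "sys.exit(3)"
)
status, out, err = run_child(child_code)
print(status, out.strip(), err.strip())
```

Anything richer than these three values, such as per-edge record counts, would have to be written by the child to a log or file that the parent reads afterwards.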
pfield

April 25, 2021 20:07
Hi David,

I have a follow up question relevant to this thread...

When passing parameters into a RunGraph (via Command line arguments) that calls a graph containing a subgraph, I am getting the following error in the log file:

CloverDX license for subgraphs is expired or not available

My question is, when not running in the same JVM, do we need to pass in any additional setup parameters linking to licenses etc?

Thanks,
Paul
pfield

April 26, 2021 09:58
Hi,

Shortly after my prior post I observed in the documentation for RunGraph that you cannot run a graph with a subgraph when using a separate JVM. This would explain my issue.

However, when removing the subgraph, I got some more errors:

Secure parameters are supported only in CloverDX Server environment

This appears to be caused by using a secure parameter in the graph being executed. Is this also unsupported? For testing purposes, I removed the secure parameter and then tried to execute again, but got the following error:

Cannot find class: jk.StreamFile

The StreamFile class is used by a CustomJavaTransformer component within the graph. Is this also unsupported?

Thanks,
Paul
admin

May 13, 2021 05:56
Hi Paul,
The Secure Parameter functionality is available only in the Server Environment. It requires setting the Master Password in the Server console and the Secure parameters are then automatically decrypted by the Server in graph runtime.

Regarding your other issue, I would appreciate some details about your use case and why you need to run the graph using the "separate JVM" function in the first place. As was mentioned before, this setup has certain limitations, and I believe that there might be some workaround for your use case.

However, if you insist on keeping the RunGraph "The same JVM" parameter set to "false", you might want to try a different setup of your CustomJavaTransformer component. Unfortunately, I was not able to reproduce the exact error message you are seeing, but I also ran into some issues when CustomJavaTransformer was called by RunGraph with separate JVMs. This kind of behavior suggests that the custom class might not be on the classpath, for example. In my case, I was able to resolve the situation by setting the "Algorithm" property instead of "Algorithm class" in the CustomJavaTransformer component. Please give it a try and let me know whether it works for you as well.

Best Regards,
Eva
pfield

May 13, 2021 17:43
Hi Eva,

Your recommendation of using the "Algorithm" property worked great - THANK YOU!

Regarding my use case, let me explain as best I can; I am keen to hear if there is a better approach...

I have a graph which collects files from a large directory (100GB+) and converts them into base64 before passing them to a web service (via HTTPConnector) for processing. The files that I need to convert are only a subset (say 200k of 2m), and if I were to try to read all those files into the graph runtime, my machine would definitely hit heap space problems.
To get around the problem, I used a CustomJavaTransformer that accepts a file path, reads the file as bytes, and converts it to base64. This way I only ever hold the files I need at runtime. This works well up to around 10-15k files, then I start to hit heap space issues.
To solve that problem, I created a parent graph to batch up the files that need processing and then feed a RunGraph to execute the child graph, passing in Batch and Batch Size parameters (e.g. Batch=1, Batch_Size=10000 means the first 10000 files would get processed). The parent graph then just works its way through the batches without me having to intervene. Having "The Same JVM" set to false means I can dynamically pass in my Batch and Batch_Size parameters via the "Command line arguments" parameter, meaning the parent just figures out how many batches are needed and the rest is done for me.
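
The batching scheme described above can be sketched as follows. This is an illustration only: it assumes 1-based batch numbers, as in the Batch=1, Batch_Size=10000 example, and the function names are hypothetical rather than taken from the actual graphs. The last helper mirrors the idea of reading a single file by path and base64-encoding it only when it is needed.

```python
import base64


def batch_count(total_files, batch_size):
    """Number of batches needed to cover `total_files` items."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    # Integer ceiling division.
    return (total_files + batch_size - 1) // batch_size


def batch_slice(paths, batch, batch_size):
    """Return the paths belonging to the 1-based batch number `batch`."""
    if batch < 1:
        raise ValueError("batch numbers start at 1")
    start = (batch - 1) * batch_size
    return paths[start:start + batch_size]


def encode_file(path):
    """Read one file as bytes and return its content as base64 text."""
    with open(path, "rb") as handle:
        return base64.b64encode(handle.read()).decode("ascii")


# Hypothetical file list; in practice this would be the subset to convert.
paths = [f"file_{i}.bin" for i in range(25)]
for batch in range(1, batch_count(len(paths), 10) + 1):
    print(batch, len(batch_slice(paths, batch, 10)))
```

With 25 paths and a batch size of 10, the loop visits batches 1 to 3, the last one holding the remaining 5 paths.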

Hopefully that makes sense, let me know if you need any more details.

Thanks,
Paul
admin

May 26, 2021 08:15
Hi Paul,

Your project is quite challenging, and we don't have many options if we want to rely on Designer only. I think it is safe to say that you should keep the solution as it is right now (as long as it works with the "Algorithm" property). I am aware that the RunGraph component has its limitations, but I am afraid I have been informed that the direction is to keep the option to run consecutive jobs on the Server rather than to adjust and improve this option in Designer. In fact, the RunGraph component is going to be deprecated soon and removed entirely in the future (it is assumed that you can use ExecuteGraph in a jobflow instead).

Anyway, I have at least one good piece of news for you. The Secure Parameters functionality is, in fact, available in the Designer without the CloverDX Server environment. I am going to research in more detail why the error message said otherwise. To use it, you should just set the Master Password on the Designer Runtime level. For more information see the following link:
https://doc.cloverdx.com/latest/designe ... sword.html
I am very sorry that I haven't found this one before.

Best Regards,
Eva
