I have a requirement to capture the last run key from a graph execution, store it, and then re-inject it into the reader for the next run (in order to incrementally pull only new records). The challenge is that the source is an HTTP connection and I do not have access to write to an RDBMS. Does Clover have a mechanism that will allow this (without writing to a file and joining it back in)?
-
Hello,
when executing the graph through the HTTP API (see the graph_run operation), e.g. from an HttpConnector component, you get a runId as output, so you can use it to configure the next Reader. -
Sorry for the confusion, but I am not executing a graph through the HTTP API. I am using an HTTP connector inside the graph to source data from a REST API (e.g. the Twitter search API). To prevent me from pulling the same data again and again, Twitter provides an ID which I can concatenate with the URL to pull only the records added since my last run. Currently I am storing this value in a flat file and joining it back in, but it is sloppy. -
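For readers following along, here is a minimal sketch of the "concatenate the ID with the URL" idea, written in Python rather than inside a Clover graph. The endpoint and the `since_id` query parameter are assumptions based on the Twitter search API mentioned above, and `build_search_url` is a hypothetical helper name, not something from this thread.

```python
from urllib.parse import urlencode

# Assumed endpoint; substitute whatever REST URL the graph actually calls.
SEARCH_URL = "https://api.twitter.com/1.1/search/tweets.json"


def build_search_url(query, since_id=None):
    """Build the request URL, adding the last stored ID when one exists."""
    params = {"q": query}
    if since_id:
        # Only ask for records newer than the last one already pulled.
        params["since_id"] = since_id
    return SEARCH_URL + "?" + urlencode(params)
```

On the first run there is no stored ID, so the helper simply leaves the parameter out and the full result set is requested.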
Hello,
this is the proper approach. Our incremental reading also uses an external file for storing the incremental key. -
You can also use parameters for this purpose:
1. create a separate parameter file twitter.prm holding the parameter TWITTER_ID (do that in the GUI to get the syntax correct).
2. in the graph, use the syntax ${TWITTER_ID} to access the actual parameter value.
3. at the end of the graph run, overwrite the twitter.prm file with the updated value of TWITTER_ID.
Note: When overwriting the twitter.prm file, you need to make sure that the resulting file has the correct syntax.
I would also recommend making the implementation fail-proof, i.e., handling the case when the twitter.prm file gets corrupted or is empty. -
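The read/overwrite cycle and the fail-proofing described above could look roughly like the following Python sketch. It assumes the parameter file uses a plain `TWITTER_ID=<digits>` line (the actual .prm syntax depends on the Clover version, so check a file created in the GUI), and `read_last_id` / `write_last_id` are hypothetical helper names. The write step is skipped when no new ID was produced, so an empty run does not clobber the stored value.

```python
import os
import re
import tempfile

PARAM_NAME = "TWITTER_ID"  # parameter name from the steps above


def read_last_id(path, default="0"):
    """Return the stored ID, or `default` if the file is missing, empty, or malformed."""
    try:
        with open(path, encoding="utf-8") as f:
            text = f.read()
    except FileNotFoundError:
        return default
    # Assumed syntax: a line of the form TWITTER_ID=<digits>.
    pattern = r"^\s*" + PARAM_NAME + r"\s*=\s*([0-9]+)\s*$"
    match = re.search(pattern, text, re.MULTILINE)
    return match.group(1) if match else default


def write_last_id(path, new_id):
    """Overwrite the parameter file only when a valid new ID is available."""
    if new_id is None or not re.fullmatch(r"[0-9]+", str(new_id)):
        return False  # no records processed: leave the previous file untouched
    directory = os.path.dirname(os.path.abspath(path))
    # Write to a temporary file first so a crash cannot leave a half-written file.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as f:
        f.write(f"{PARAM_NAME}={new_id}\n")
    os.replace(tmp_path, path)  # replace the old file in one step
    return True
```

Writing to a temporary file and renaming it is one way to keep the file from ending up corrupted; falling back to a default on read covers the case where it is corrupted or empty anyway.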
Jan,
I originally started with this approach; however, I had issues with the parameter file being overwritten even if no records were processed. Without writing my own null-handling logic, is there a way to avoid this (e.g. using partitions on the writer)? -
Partitioning using a lookup table, which would put null values into a different file, should work. A sample implementation is attached. The contents of the ${LOOKUP_DIR}/OutputPartitioningTest.dat file should be:
|_null