Nodes retrying many times (infinite loop?)

  • admin
    Hi jfuentesve,

    There is no such mechanism at the phase or graph level. This should be handled by the component itself: it should fail when a service timeout is detected.

    What are the problematic components?
  • jfuentesve
    Hello kubosj, thanks for the response.


    The component in this case is a custom DataService writer, driven by a class that we wrote and that takes some arguments. This class extends Node, like this:

    MetadataServiceWriter extends org.jetel.graph.Node


    But looking through Node's attributes and setters, I can't find a way to tell it how to retry. I can see how to deal with errors (there is a threshold for the maximum number of errors), but in this case it is not an error, just a timeout, and it is not treated as an error.

    The request times out after a minute, which is fine. The problem is that the node then retries it. That is OK for a while, since sometimes the service is just too busy. But when the service is down or stuck (alive but zombie, or paused by a debugger, for instance), a test gets stuck retrying forever and never fails. It blocks the whole code validation process, and by the time someone finds out in the morning it could have been running for hours, with everyone blocked because of it. Someone then manually kills the task (the graph), and when we explore the logs we find that a DS Writer was stuck retrying after every timeout. It NEVER fails :S

    We want it to fail, after a certain number of retries... is that possible?
  • admin
    Hi,

    this is how I understand your description:
    * you wrote your own component that calls an external service
    ** on its input it gets a lot of records containing the parameters for calling the service
    ** on its output port it produces the call result
    * the external service may be unresponsive
    * your component retries the service call many times, which causes problems when the service is down and there are a lot of input records

    Here are some thoughts:

    1] you can define custom properties on your custom component, something like:
    * "service call timeout" - makes it possible to change the default 1 minute timeout
    * "retry count" - how many times the component should retry calling the service for one input record
    * "ignore after N fails" - after how many records with a failed call the component should ignore the rest
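
    As a rough illustration of the three properties above, here is a minimal sketch of a settings holder in plain Java. The class name, the property keys (serviceCallTimeoutMs, retryCount, ignoreAfterFails) and the defaults of 3 retries and 10 failed records are assumptions made up for this example, not CloverETL attributes. Only the 1 minute default timeout comes from this thread.

```java
import java.util.Properties;

/** Hypothetical settings holder; the keys and defaults are illustrative only. */
final class ServiceWriterSettings {
    final long callTimeoutMillis; // "service call timeout"
    final int retryCount;         // "retry count": retries per input record
    final int ignoreAfterFails;   // "ignore after N fails": failed records tolerated

    ServiceWriterSettings(long callTimeoutMillis, int retryCount, int ignoreAfterFails) {
        if (callTimeoutMillis <= 0 || retryCount < 0 || ignoreAfterFails < 1) {
            throw new IllegalArgumentException("invalid retry settings");
        }
        this.callTimeoutMillis = callTimeoutMillis;
        this.retryCount = retryCount;
        this.ignoreAfterFails = ignoreAfterFails;
    }

    /** Reads the settings from properties, falling back to the assumed defaults. */
    static ServiceWriterSettings fromProperties(Properties props) {
        long timeout = Long.parseLong(props.getProperty("serviceCallTimeoutMs", "60000"));
        int retries = Integer.parseInt(props.getProperty("retryCount", "3"));
        int ignoreAfter = Integer.parseInt(props.getProperty("ignoreAfterFails", "10"));
        return new ServiceWriterSettings(timeout, retries, ignoreAfter);
    }
}
```

    In a real component these values would arrive through whatever attribute mechanism the component already uses; the Properties object here is only a stand-in.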

    2] you can set component properties in the graph from outside
    * use ${PARAMETER} in the property value
    * pass this PARAMETER to the graph from outside, depending on the production/test environment
    * use this for the properties in 1]
    * see http://doc.cloveretl.com/documentation/ ... ments.html and the "-P" parameter

    3] react to the custom properties in your component
    * a component is done when it has read all its inputs and exits the execute() method
    * inside execute() you can react to the properties from 1] and 2]
    ** e.g. when "ignore after N fails" is exceeded, just read the input records and send output records (without calling the service)
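
    The logic described in 3] could look roughly like the sketch below. It is plain Java and deliberately does not use the org.jetel API: ServiceCall, BoundedRetryWriter and the "FAILED"/"SKIPPED" markers are hypothetical names for this example, and records are simplified to strings. In the real component the same loop would live inside execute() and read from and write to the ports.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.TimeoutException;

/** Hypothetical stand-in for the external service client. */
interface ServiceCall {
    String call(String record) throws TimeoutException;
}

final class BoundedRetryWriter {
    private final ServiceCall service;
    private final int retryCount;       // retries per record after the first attempt
    private final int ignoreAfterFails; // failed records tolerated before skipping the rest
    private int failedRecords = 0;

    BoundedRetryWriter(ServiceCall service, int retryCount, int ignoreAfterFails) {
        if (retryCount < 0 || ignoreAfterFails < 1) {
            throw new IllegalArgumentException("invalid retry settings");
        }
        this.service = service;
        this.retryCount = retryCount;
        this.ignoreAfterFails = ignoreAfterFails;
    }

    /** Produces one output per input record, so the loop always terminates. */
    List<String> execute(List<String> records) {
        List<String> out = new ArrayList<>();
        for (String record : records) {
            if (failedRecords >= ignoreAfterFails) {
                out.add("SKIPPED"); // limit reached: pass through without calling the service
                continue;
            }
            out.add(callWithRetries(record));
        }
        return out;
    }

    /** Tries the call at most retryCount + 1 times, then gives up on this record. */
    private String callWithRetries(String record) {
        for (int attempt = 0; attempt <= retryCount; attempt++) {
            try {
                return service.call(record);
            } catch (TimeoutException e) {
                // A timeout counts as a failed attempt; the for loop bounds the retries.
            }
        }
        failedRecords++;
        return "FAILED";
    }
}
```

    The point of the design is that both loops are bounded: each record costs at most retryCount + 1 calls, and once ignoreAfterFails records have failed, the remaining records cost no calls at all.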

    In general, our components solve this internally, because a general concept would be confusing and not powerful enough. See e.g. http://doc.cloveretl.com/documentation/ ... table.html and its "Max error count" property.
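
    If the component should fail outright rather than carry on, a guard in the spirit of a maximum error count might be sketched as follows. This is not how any CloverETL component implements "Max error count"; ErrorBudget and its methods are hypothetical names, and the assumption is that an exception thrown out of the component's processing makes the graph fail.

```java
/** Hypothetical guard: counts errors and fails once the limit is exceeded. */
final class ErrorBudget {
    private final int maxErrorCount;
    private int errors = 0;

    ErrorBudget(int maxErrorCount) {
        if (maxErrorCount < 0) {
            throw new IllegalArgumentException("maxErrorCount must be >= 0");
        }
        this.maxErrorCount = maxErrorCount;
    }

    /** Records one error; throws once more than maxErrorCount errors have occurred. */
    void recordError(Exception cause) {
        errors++;
        if (errors > maxErrorCount) {
            throw new IllegalStateException(
                "Giving up after " + errors + " errors (limit " + maxErrorCount + ")", cause);
        }
    }

    int errorCount() {
        return errors;
    }
}
```

    Calling recordError(...) from the timeout handler would treat every timeout as an error, so a stuck service ends the run after a known number of timeouts instead of blocking it for hours.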

    I hope this helps.
