Is Clover restricted to using only 1 CPU?

  • avackova
    Hello,
    CloverETL can use multiple CPUs, but DataReader runs in a single thread, so it uses only one CPU. The commercial versions of the Clover family include ParallelReader, which lets you set the number of threads used for reading.
  • bala
    I guess ParallelReader uses multiple threads (the configured level of parallelism) to read the given input file. My question is this: total CPU usage stays between 20% and 50% when we run a Clover graph containing reader, reformat, and universal writer components, even though most of the memory is free. Is there any particular reason for this?

    Clover version used: 2.7.1

    Thanks,
    Bala
  • avackova
    Each component runs in its own thread, and the JVM decides how the work is distributed across the CPUs. I ran a more complicated graph and monitored both of my processors: both were working, and for a while both were at 80% load.
  • mzatopek
    As already mentioned, each component runs in a separate thread and the JVM takes care of allocating physical processors. In a runtime environment with more processors than graph components, you will very likely not use all of the processors, because I/O operations are probably the bottleneck of the whole processing, which is why the processors sit idle. In our experience, ParallelReader should help improve the performance of this type of graph. Try it and let us know the result.
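
The "one thread per component" model described above can be pictured with a small standalone sketch. This is not CloverETL code: the class `PipelineSketch`, the `EOF` sentinel, and the queue sizes are all hypothetical, chosen only to illustrate three stages (reader, reformat, writer) that each run in their own thread and hand records over through bounded queues. With three stages there are at most three busy threads, so on a machine with more cores some of them will sit idle regardless of how the JVM schedules the work.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

public class PipelineSketch {
    // Hypothetical end-of-stream marker used only by this sketch.
    private static final String EOF = "\u0000EOF";

    // Runs reader -> reformat -> writer, one thread per stage,
    // and returns the records the writer stage received.
    static List<String> run(List<String> input) throws InterruptedException {
        BlockingQueue<String> readerToReformat = new ArrayBlockingQueue<>(16);
        BlockingQueue<String> reformatToWriter = new ArrayBlockingQueue<>(16);
        List<String> output = new ArrayList<>();

        // Stage 1: a single-threaded reader feeding the first queue.
        Thread reader = new Thread(() -> {
            try {
                for (String line : input) {
                    readerToReformat.put(line);
                }
                readerToReformat.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reader");

        // Stage 2: transforms each record (upper-casing stands in for real work).
        Thread reformat = new Thread(() -> {
            try {
                String line;
                while (!(line = readerToReformat.take()).equals(EOF)) {
                    reformatToWriter.put(line.toUpperCase());
                }
                reformatToWriter.put(EOF);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "reformat");

        // Stage 3: collects the transformed records.
        Thread writer = new Thread(() -> {
            try {
                String line;
                while (!(line = reformatToWriter.take()).equals(EOF)) {
                    output.add(line);
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "writer");

        reader.start();
        reformat.start();
        writer.start();
        reader.join();
        reformat.join();
        writer.join();
        return output;
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(run(List.of("a", "b", "c")));
    }
}
```

If the reader stage is slow (for example, waiting on disk), the other two threads spend most of their time blocked in `take()`, which would show up as low overall CPU usage.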
  • bala
    Hi,
    Thanks. I started using the ParallelReader component in my graph instead of the universal data reader.
    I am currently facing one issue: whenever I load a CSV file of 150 records with ParallelReader, it throws the error "ERROR - [PARALLEL_READER0].fileURL - Input file 'CSV FILE NAME' is too small and/or level of parallelism is too high." (The default parallelism value of 2 was used for the test.)
    Please let me know whether there is a way to configure ParallelReader to read small files.

    Thanks,
    Bala
  • mzatopek
    Unfortunately, this is one of the limitations of ParallelReader. We have already removed this limitation, but no public release contains the update yet. In any case, ParallelReader brings no performance improvement for such a small input file. I assumed you had a large amount of data, since you were experimenting with processor load.
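
To illustrate why a reader that splits a file across threads might reject small inputs, here is a rough sketch. It is an assumption-laden illustration, not ParallelReader's actual logic: the class `ChunkSplitSketch`, the `MIN_CHUNK_BYTES` threshold, and both methods are hypothetical. The idea is simply that each worker needs a reasonably sized byte range, and that a caller could lower the parallelism for small files instead of failing.

```java
public class ChunkSplitSketch {
    // Hypothetical minimum number of bytes per worker; not a CloverETL setting.
    static final long MIN_CHUNK_BYTES = 4096;

    // Splits [0, fileSize) into one [start, end) byte range per worker.
    // Throws if the chunks would fall below the hypothetical minimum.
    static long[][] split(long fileSize, int parallelism) {
        if (parallelism < 1) {
            throw new IllegalArgumentException("parallelism must be at least 1");
        }
        long chunk = fileSize / parallelism;
        if (chunk < MIN_CHUNK_BYTES) {
            throw new IllegalArgumentException(
                "Input file is too small and/or level of parallelism is too high");
        }
        long[][] ranges = new long[parallelism][2];
        for (int i = 0; i < parallelism; i++) {
            ranges[i][0] = i * chunk;
            // The last worker also takes the remainder of the file.
            ranges[i][1] = (i == parallelism - 1) ? fileSize : (i + 1) * chunk;
        }
        return ranges;
    }

    // Lowers the requested parallelism so every worker gets at least
    // MIN_CHUNK_BYTES; never returns less than 1.
    static int effectiveParallelism(long fileSize, int requested) {
        long max = Math.max(1, fileSize / MIN_CHUNK_BYTES);
        return (int) Math.min(requested, max);
    }

    public static void main(String[] args) {
        long fileSize = 6000; // a small CSV, size chosen for illustration
        int workers = effectiveParallelism(fileSize, 2);
        System.out.println("workers = " + workers);
    }
}
```

A real implementation would also have to align each byte range to record boundaries, which this sketch leaves out.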