Parallelizing data in a flat file

Comments 14

  • avackova
    Uff, I've got it :wink:, although it was not easy at all.
    The attached graph implements the following algorithm:
    • read the input data
    • add a column number to each input record
    • sort the records according to this number
    • format each group of records into one record
      • add an empty record if needed
    • store the data in the new format
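The steps above can be sketched in plain Java, outside CloverETL. This is only an illustration under assumptions, not the attached graph: the class `ParallelizeSketch`, the methods `parallelize` and `emptyRecord`, and the choice of three output rows with two fields per record are all hypothetical. Iterating row by row stands in for the numbering and sorting steps.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Plain-Java illustration only; none of these names come from CloverETL.
class ParallelizeSketch {

    // An empty record consists only of the delimiters between its (empty) fields.
    static String emptyRecord(int fieldsPerRecord) {
        StringBuilder sb = new StringBuilder();
        for (int f = 1; f < fieldsPerRecord; f++) {
            sb.append(';');
        }
        return sb.toString();
    }

    // Record i ends up in output row (i % rows), column (i / rows).
    static List<String> parallelize(List<String> records, int rows, int fieldsPerRecord) {
        int columns = (records.size() + rows - 1) / rows; // ceiling division
        List<String> lines = new ArrayList<>();
        for (int r = 0; r < rows; r++) {
            StringBuilder line = new StringBuilder();
            for (int c = 0; c < columns; c++) {
                int i = c * rows + r;
                if (c > 0) {
                    line.append(';');
                }
                // Pad with an empty record when the last column is incomplete.
                line.append(i < records.size() ? records.get(i) : emptyRecord(fieldsPerRecord));
            }
            lines.add(line.toString());
        }
        return lines;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("1;a", "2;b", "3;c", "4;d", "5;e", "6;f", "7;g");
        for (String line : parallelize(input, 3, 2)) {
            System.out.println(line);
        }
    }
}
```

With seven two-field records and three rows, the second and third rows each receive one empty record as padding.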
  • blekota74
    Thank you for your fast response, but I get an error when I execute the graph:

    INFO  [main] - ***  CloverETL framework/transformation graph, (c) 2002-2011 Javlin a.s, released under GNU Lesser General Public License  ***
    INFO [main] - Running with CloverETL library version 3.1.0 build#17 compiled 16/06/2011 16:06:35
    INFO [main] - Running on 2 CPU(s), OS Windows XP, architecture x86, Java version 1.6.0_26, max available memory for JVM 253440 KB
    INFO [main] - Loading default properties from: defaultProperties
    INFO [main] - Graph definition file: graph/Parallizing.grf
    INFO [main] - Graph revision: 1.48 Modified by: user Modified: Thu Jun 30 17:32:23 CEST 2011
    INFO [main] - Checking graph configuration...
    INFO [main] - Graph configuration is valid.
    INFO [main] - Graph initialization (Parallizing)
    INFO [main] - [Clover] Initializing phase: 0
    INFO [main] - Compiling dynamic class FormatInput...
    ERROR [main] - Error during graph initialization !
    Element [1309427766212:Parallizing]-Phase 0 can't be initilized.
    at org.jetel.graph.TransformationGraph.init(TransformationGraph.java:458)
    at org.jetel.graph.runtime.EngineInitializer.initGraph(EngineInitializer.java:202)
    at org.jetel.graph.runtime.EngineInitializer.initGraph(EngineInitializer.java:165)
    at org.jetel.main.runGraph.runGraph(runGraph.java:364)
    at org.jetel.main.runGraph.main(runGraph.java:328)
    Caused by: DENORMALIZER0 ...FATAL ERROR !
    Reason: Used Java Platform doesn't provide any java compiler!
    at org.jetel.graph.Phase.init(Phase.java:174)
    at org.jetel.graph.TransformationGraph.init(TransformationGraph.java:456)
    ... 4 more
    Caused by: java.lang.IllegalStateException: Used Java Platform doesn't provide any java compiler!
    at org.jetel.util.compile.DynamicCompiler.compile(DynamicCompiler.java:109)
    at org.jetel.util.compile.DynamicJavaClass.instantiate(DynamicJavaClass.java:66)
    at org.jetel.component.Denormalizer.createDenormalizerDynamic(Denormalizer.java:216)
    at org.jetel.component.Denormalizer.createRecordDenormalizer(Denormalizer.java:269)
    at org.jetel.component.Denormalizer.init(Denormalizer.java:241)
    at org.jetel.graph.Phase.init(Phase.java:165)
    ... 5 more
  • jurban
    Hi,

    Are you running CloverETL with a JRE or a JDK? A JDK is required to run Java transformations, and such a transformation is used in the graph provided by Agata.

    Best regards,
    Jaro
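A quick way to check which kind of runtime is in use is to ask the standard `javax.tools` API for a compiler. This is a minimal sketch, assuming CloverETL locates the compiler through that same API, which the thread does not confirm; the class `CompilerCheck` and the method `describe` are hypothetical names.

```java
import javax.tools.JavaCompiler;
import javax.tools.ToolProvider;

// Stand-alone check, not part of CloverETL.
class CompilerCheck {

    // ToolProvider.getSystemJavaCompiler() returns null when no compiler is available.
    static String describe(JavaCompiler compiler) {
        return compiler == null
                ? "No Java compiler found: probably running on a JRE"
                : "Java compiler available: running on a JDK";
    }

    public static void main(String[] args) {
        // java.home shows which installation the JVM was started from.
        System.out.println("java.home = " + System.getProperty("java.home"));
        System.out.println(describe(ToolProvider.getSystemJavaCompiler()));
    }
}
```

Run it with the same `java` executable that starts CloverETL; otherwise the result says nothing about the runtime the graph actually uses.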
  • blekota74
    I set the path to the JDK and it works. I tried to understand the Java code in the given example, in the object named 'Format many to one' (component type: Denormalizer), and I think the documentation is lacking. For instance, I can't find references for classes like DataFormatter or ByteArrayOutputStream. I googled for cloveretl DataFormatter and couldn't find any information.
    I work mainly with UTF-8, and when I use non-English characters in the source file (encoded as UTF-8) I get an error (I set 'Denormalize source charset' to UTF-8):
    ERROR [WatchDog] - Node DENORMALIZER0 finished with status: Error occurred in nested transformation: ERROR caused by: Message: Denormalization failed! caused by: java.lang.RuntimeException: Exception when converting the field value: g zażółć gęślą jaźń a koń pędź (field name: 'field_B') to ISO-8859-1. (original cause: Input length = 1)

    Below is the full content of my example data file:
    field_A;field_B
    1;a
    2;b
    3;c
    4;d
    5;e
    6;f
    7;g zażółć gęślą jaźń a koń pędź
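The "Input length = 1" part of the error is the message of a standard `java.nio.charset.UnmappableCharacterException`. The sketch below reproduces it with plain JDK classes, independently of CloverETL; the class `EncodeCheck` and the method `tryEncode` are hypothetical names, and the sample text is a shortened version of the value in `field_B`.

```java
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;

// Stand-alone reproduction, not CloverETL code.
class EncodeCheck {

    // Returns null on success, or a description of the failure
    // when the text contains a character the charset cannot represent.
    static String tryEncode(String text, String charsetName) {
        try {
            // A fresh encoder reports unmappable characters by throwing,
            // whereas String.getBytes() would silently substitute '?'.
            Charset.forName(charsetName).newEncoder().encode(CharBuffer.wrap(text));
            return null;
        } catch (CharacterCodingException e) {
            return e.getClass().getSimpleName() + ": " + e.getMessage();
        }
    }

    public static void main(String[] args) {
        // Polish sample text written with Unicode escapes,
        // so the encoding of this source file does not matter.
        String text = "g za\u017c\u00f3\u0142\u0107";
        System.out.println("ISO-8859-1: " + tryEncode(text, "ISO-8859-1"));
        System.out.println("UTF-8: " + tryEncode(text, "UTF-8"));
    }
}
```

ISO-8859-1 has no code points for several of the Polish letters, so the encoder stops at the first one it cannot map; UTF-8 can encode all of them.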
  • avackova
    Hello,

    • To handle Polish characters you need to set the proper charset on the Writer.
    • Javadoc and source files (of the open source part of CloverETL Engine) can be downloaded from the CloverETL page on SourceForge.
  • blekota74
    Well, I checked the input data (debug) on the DENORMALIZER0 object and it is OK, but there is no data at the output. To me it seems to be a problem with a class that can't handle multibyte characters (when I remove all Polish characters it works properly).
    Exception when converting the field value: g zażółć gęślą jaźń a koń pędź (field name: 'field_B') to ISO-8859-1. (original cause: Input length = 1)
  • avackova
    Please change the charset on the Writer (screenshot: UniversalDataWriter.png).
    The charset in the Denormalizer is used only for decoding an external transformation source.
  • blekota74
    Of course I did that, and the error still exists. In my opinion the problem is in node DENORMALIZER0. Please try with my input file.
    INFO  [main] - ***  CloverETL framework/transformation graph, (c) 2002-2011 Javlin a.s, released under GNU Lesser General Public License  ***
    INFO [main] - Running with CloverETL library version 3.1.0 build#17 compiled 16/06/2011 16:06:35
    INFO [main] - Running on 2 CPU(s), OS Windows XP, architecture x86, Java version 1.6.0_21, max available memory for JVM 253440 KB
    INFO [main] - Loading default properties from: defaultProperties
    INFO [main] - Graph definition file: graph/Parallizing.grf
    INFO [main] - Graph revision: 1.66 Modified by: informatyk Modified: Wed Jul 13 13:23:27 CEST 2011
    INFO [main] - Checking graph configuration...
    INFO [main] - Graph configuration is valid.
    INFO [main] - Graph initialization (Parallizing)
    INFO [main] - [Clover] Initializing phase: 0
    INFO [main] - Compiling dynamic class FormatInput...
    INFO [main] - Dynamic class FormatInput successfully compiled and instantiated.
    INFO [main] - [Clover] phase: 0 initialized successfully.
    INFO [main] - register MBean with name:org.jetel.graph.runtime:type=CLOVERJMX_1309427766212_0
    INFO [WatchDog] - Starting up all nodes in phase [0]
    INFO [WatchDog] - Successfully started all nodes in phase!
    ERROR [WatchDog] - Graph execution finished with error
    ERROR [WatchDog] - Node DENORMALIZER0 finished with status: Error occurred in nested transformation: ERROR caused by: Message: Denormalization failed! caused by: java.lang.RuntimeException: Exception when converting the field value: g zażółć gęślą jaźń a koń pędź (field name: 'field_B') to ISO-8859-1. (original cause: Input length = 1)

    Record: #0|field_A|S->7
    #1|field_B|S->g zażółć gęślą jaźń a koń pędź
    #2|key|i->0

    ERROR [WatchDog] - Node DENORMALIZER0 error details:
    org.jetel.exception.TransformException: Message: Denormalization failed! caused by: java.lang.RuntimeException: Exception when converting the field value: g zażółć gęślą jaźń a koń pędź (field name: 'field_B') to ISO-8859-1. (original cause: Input length = 1)

    Record: #0|field_A|S->7
    #1|field_B|S->g zażółć gęślą jaźń a koń pędź
    #2|key|i->0

    at org.jetel.component.denormalize.DataRecordDenormalize.appendOnError(DataRecordDenormalize.java:54)
    at org.jetel.component.Denormalizer.processInput(Denormalizer.java:381)
    at org.jetel.component.Denormalizer.execute(Denormalizer.java:452)
    at org.jetel.graph.Node.run(Node.java:425)
    at java.lang.Thread.run(Unknown Source)
    Caused by: java.lang.RuntimeException: Exception when converting the field value: g zażółć gęślą jaźń a koń pędź (field name: 'field_B') to ISO-8859-1. (original cause: Input length = 1)

    Record: #0|field_A|S->7
    #1|field_B|S->g zażółć gęślą jaźń a koń pędź
    #2|key|i->0

    at org.jetel.data.formatter.DataFormatter.write(DataFormatter.java:263)
    at FormatInput.append(FormatInput.java from JavaSourceFileObject:57)
    at org.jetel.component.Denormalizer.processInput(Denormalizer.java:379)
    ... 3 more
    Caused by: java.nio.charset.UnmappableCharacterException: Input length = 1
    at java.nio.charset.CoderResult.throwException(Unknown Source)
    at java.nio.charset.CharsetEncoder.encode(Unknown Source)
    at org.jetel.data.DataField.toByteBuffer(DataField.java:278)
    at org.jetel.data.formatter.DataFormatter.write(DataFormatter.java:228)
    ... 5 more
    INFO [WatchDog] - [Clover] Post-execute phase finalization: 0
    INFO [WatchDog] - [Clover] phase: 0 post-execute finalization successfully.
    INFO [WatchDog] - Execution of phase [0] finished with error - elapsed time(sec): 0
    ERROR [WatchDog] - !!! Phase finished with error - stopping graph run !!!
    INFO [WatchDog] - -----------------------** Summary of Phases execution **---------------------
    INFO [WatchDog] - Phase# Finished Status RunTime(sec) MemoryAllocation(KB)
    INFO [WatchDog] - 0 ERROR 0 15867
    INFO [WatchDog] - ------------------------------** End of Summary **---------------------------
    INFO [WatchDog] - WatchDog thread finished - total execution time: 5 (sec)
    INFO [main] - Freeing graph resources.
    ERROR [main] - Execution of graph failed !
  • avackova
    Yes, you're right. Change line 22 of the transformation to:
    	DataFormatter formatter = new DataFormatter("UTF-8");
  • blekota74
    Now I have no errors, but the result file looks as if it were UTF-8 encoded twice. When I set the encoding to UTF-8 in my editor I see:
    1;a;4;d;7;g zażółć gęślą jaźń a koń pędź
    2;b;5;e;;
    3;c;6;f;;

    To me the text looks as if no UTF-8 encoding were used.

    But when I copied the text above into a text editor (without UTF-8 encoding), saved it, and opened it with UTF-8 encoding, it is OK.
    1;a;4;d;7;g zażółć gęślą jaźń a koń pędź
  • avackova
    Do you have the same charset everywhere? Attached graph works for me.
  • blekota74
    I still get wrong results when I execute your graph.

    My input file (shown as ANSI Windows, code page 1250; when I switch the encoding to UTF-8 the content is displayed properly):
    field_A;field_B
    1;a
    2;b
    3;c
    4;d
    5;e
    6;f
    7;g zażółć gęślą jaźń a koń pędź

    Output:
    1;a
    ;4;d
    ;7;g zażółć gęślą jaźń a koń pędź
    2;b
    ;5;e
    ;;
    3;c
    ;6;f
    ;;

    To me the problem is in the DENORMALIZER: the input is correct (I can see all the characters properly in debug mode) but the output is wrong.
  • avackova
    I think I've found where the problem is. In the Denormalizer we need to format the data with the same charset that we use to convert it from bytes back to a string before sending it to the next Writer (it doesn't matter what charset is set on the Reader or Writer), or we can send it as bytes. The first solution means that the charset used with DataFormatter (line 22: DataFormatter formatter = new DataFormatter("UTF-8");) needs to be the same as the charset used for converting the ByteArrayOutputStream to a string (line 75: value = output.toString("UTF-8");).
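The effect of mismatched charsets can be shown with the JDK alone. This is a minimal sketch, not the transformation from the graph: `CharsetRoundTrip` and `roundTrip` are hypothetical names, and `String.getBytes` stands in for whatever DataFormatter does internally when it writes bytes.

```java
import java.io.ByteArrayOutputStream;
import java.io.UnsupportedEncodingException;

// Stand-alone illustration, not CloverETL code.
class CharsetRoundTrip {

    // Encodes text into a byte buffer with one charset and decodes it with another.
    static String roundTrip(String text, String encodeCharset, String decodeCharset) {
        try {
            byte[] bytes = text.getBytes(encodeCharset);
            ByteArrayOutputStream output = new ByteArrayOutputStream();
            output.write(bytes, 0, bytes.length);
            // Same call shape as "value = output.toString("UTF-8");" quoted in the thread.
            return output.toString(decodeCharset);
        } catch (UnsupportedEncodingException e) {
            throw new IllegalArgumentException("Unknown charset", e);
        }
    }

    public static void main(String[] args) {
        // Polish sample text written with Unicode escapes.
        String text = "za\u017c\u00f3\u0142\u0107";
        // Matching charsets: the text survives the round trip.
        System.out.println(text.equals(roundTrip(text, "UTF-8", "UTF-8")));      // true
        // Mismatched charsets: each multibyte character turns into several wrong ones.
        System.out.println(text.equals(roundTrip(text, "UTF-8", "ISO-8859-1"))); // false
    }
}
```

If a string garbled this way is later written out as UTF-8 again, the file ends up looking double-encoded, which is consistent with the symptom reported earlier in the thread.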
  • blekota74
    Now it is OK :)
    Thank you.
