EOF marker x1A throwing error on CSV input

6 comments

  • vacekm
    Hello,

Thank you for the praise; I'll forward it to our development team.

    As for your problem, I've put together an example graph that shows two different workarounds. I think the best approach is to set the data policy on DataReader to lenient and then perhaps dump the bad records into an extra file so you can review them manually later.
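
    If it helps, the lenient reader plus a file for the rejected records looks roughly like this in the graph XML (the component ids, metadata name and file URLs below are only placeholders, not taken from your project):

    <!-- tolerate bad records instead of failing the whole run -->
    <Node id="DATA_READER0" type="DATA_READER" dataPolicy="lenient"
          fileURL="${DATAIN_DIR}/cdi_keys.csv"/>
    <!-- writer that collects the rejected records for later review -->
    <Node id="BAD_RECORDS" type="DATA_WRITER" fileURL="${DATAOUT_DIR}/bad_records.txt"/>
    <!-- the reader's second output port (port 1) carries the rejected records -->
    <Edge id="EDGE_ERR" fromNode="DATA_READER0:1" toNode="BAD_RECORDS:0" metadata="error_metadata"/>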
  • dnielsen
    I tried your suggestions this morning. The failing row is not treated as an error record: it is not written to the error output, and the error count is zero. Below is the log output.

    It appears none of the records in the final block get committed: 2,349,304 rows are read, the last one is in error, and only 2,346,701 are committed.

    One more bit of praise for you. I've been testing this load against dm/express (v 4.4). It has no driver for SQLite and is unable to bulk load like Clover. What took Clover about two minutes was still running when I killed dm/express after 20 minutes. It must have been trying to commit after every read. Completely unacceptable. Kudos to you, again.


    INFO [main] - *** CloverETL framework/transformation graph, (c) 2002-2012 Javlin a.s, released under GNU Lesser General Public License ***
    INFO [main] - Running with CloverETL library version 3.3.0.M2 build#074 compiled 02/05/2012 13:21:39
    INFO [main] - Running on 2 CPU(s), OS Windows 2003, architecture x86, Java version 1.6.0_20, max available memory for JVM 253440 KB
    INFO [main] - Loading default properties from: defaultProperties
    INFO [main] - Graph definition file: graph/load_cdi_keys.grf
    INFO [main] - Graph revision: 1.13 Modified by: dnielsen Modified: Fri May 25 06:55:45 EDT 2012
    INFO [main] - Checking graph configuration...
    INFO [main] - Graph configuration is valid.
    WARN [main] - Incompatible Clover & JDBC field types - field seqno. Clover type: integer, sql type: VARCHAR
    INFO [main] - Graph initialization (load_cdi_keys)
    INFO [main] - Initializing connection:
    INFO [main] - DBConnection driver[org.jetel.connection.jdbc.driver.JdbcDriver@1185844]:jndi[null]:url[jdbc:sqlite:d:/cloveretl/projects/test_ii/data-out/test.db]:user[null] ... OK
    INFO [main] - [Clover] Initializing phase: 0
    INFO [main] - drop table if exists cdi_keys
    INFO [main] -
    create table cdi_keys (
    seqno integer,
    addr integer,
    resi integer,
    hhld integer,
    indv integer
    )
    INFO [main] - [Clover] phase: 0 initialized successfully.
    INFO [main] - [Clover] Initializing phase: 1
    INFO [main] - [Clover] phase: 1 initialized successfully.
    INFO [main] - register MBean with name:org.jetel.graph.runtime:type=CLOVERJMX_1337799034753_0
    INFO [WatchDog] - Pre-execute initialization of connection:
    INFO [WatchDog] - DBConnection driver[org.jetel.connection.jdbc.driver.JdbcDriver@1185844]:jndi[null]:url[jdbc:sqlite:d:/cloveretl/projects/test_ii/data-out/test.db]:user[null] ... OK
    INFO [WatchDog] - Starting up all nodes in phase [0]
    INFO [WatchDog] - Successfully started all nodes in phase!
    INFO [WatchDog] - [Clover] Post-execute phase finalization: 0
    INFO [WatchDog] - [Clover] phase: 0 post-execute finalization successfully.
    ...

    ERROR [WatchDog] - Graph execution finished with error
    ERROR [WatchDog] - Node DATA_READER0 finished with status: ERROR caused by: Parsing error: Unexpected end of file in record 2349304, field 1 ("seqno"), metadata "cdi_keys_csv"; value: ''
    ERROR [WatchDog] - Node DATA_READER0 error details:
    org.jetel.exception.BadDataFormatException: Parsing error: Unexpected end of file in record 2349304, field 1 ("seqno"), metadata "cdi_keys_csv"; value: ''
    at org.jetel.data.parser.DataParser.parsingErrorFound(DataParser.java:560)
    at org.jetel.data.parser.DataParser.parseNext(DataParser.java:538)
    at org.jetel.data.parser.DataParser.getNext(DataParser.java:179)
    at org.jetel.util.MultiFileReader.getNext(MultiFileReader.java:416)
    at org.jetel.component.DataReader.execute(DataReader.java:269)
    at org.jetel.graph.Node.run(Node.java:416)
    at java.lang.Thread.run(Thread.java:619)
    INFO [exNode_0_1337799034753_SQLITE_CDI_KEYS] - Number of commited records: 2346701
    INFO [WatchDog] - [Clover] Post-execute phase finalization: 1
    INFO [WatchDog] - [Clover] phase: 1 post-execute finalization successfully.
    INFO [WatchDog] - Execution of phase [1] finished with error - elapsed time(sec): 217
    ERROR [WatchDog] - !!! Phase finished with error - stopping graph run !!!
    INFO [WatchDog] - Post-execute finalization of connection:
    INFO [WatchDog] - DBConnection driver[org.jetel.connection.jdbc.driver.JdbcDriver@1185844]:jndi[null]:url[jdbc:sqlite:d:/cloveretl/projects/test_ii/data-out/test.db]:user[null] ... OK
    INFO [WatchDog] - -----------------------** Summary of Phases execution **---------------------
    INFO [WatchDog] - Phase# Finished Status RunTime(sec) MemoryAllocation(KB)
    INFO [WatchDog] - 0 FINISHED_OK 0 5145
    INFO [WatchDog] - 1 ERROR 216 9476
    INFO [WatchDog] - ------------------------------** End of Summary **---------------------------
    INFO [WatchDog] - WatchDog thread finished - total execution time: 217 (sec)
    INFO [main] - Freeing graph resources.
    ERROR [main] - Execution of graph failed !
  • dnielsen
    I've attached a small file that reproduces the problem, and the graph I used to load it.

    dm/express tripped on the character as well, but it only flashed a warning and completed processing without dropping any records.

    Notepad++ loads the file and identifies the character: it labels it SUB, and its value is x1A. Now I am starting to wonder if there are multiple instances of that character in the file.

    dvn
  • vacekm
    Hello,

    You need to select the last field of your metadata, scroll to the very bottom of the properties on the right, and set "EOF as delimiter" to true. You can see that in the example as well. Otherwise the line is not recognized as a record, since it doesn't contain any delimiter.
    The omitted commit is correct: if there's an error in a batch, the whole batch is not committed. That's what happened here.
    If you want to commit after every record, you have to check the "Atomic SQL query" option on the DBOutputTable component.
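
    In the metadata XML, that property ends up as an eofAsDelimiter attribute on the field, for example (the field name here is only an illustration):

    <Field name="indv" type="integer" eofAsDelimiter="true"/>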
  • dnielsen
    This is weird. It has been set; it is set in the graph I sent to you.

    <Field eofAsDelimiter="true" label="individual" name="indv" type="integer">
      <attr name="description"><![CDATA[individual level key]]></attr>
    </Field>
    </Record>


    The only time I get this to load successfully is if I edit the file from the mainframe, make sure the last record ends with CRLF, and delete the EOF marker. Any other variation fails.

    Is the EOF marker not used anymore? Looking at other files on the network, they do not end with x1A.
  • vacekm
    Hello,

    I see the problem; I made a typo in my last post, sorry for that. You need to turn on "EOF as delimiter" on the FIRST field of your metadata, not the last one as you have it.
    The "EOF as delimiter" option provides an alternative delimiter for the field. If the UniversalDataReader finds no delimiter at all on the input, it rejects the data as "not a record": the input is only treated as a record once at least one delimiter is recognized. It's a subtle difference. In your case the first delimiter is a semicolon, which is not there, but if the first delimiter can be either a semicolon or EOF, the record will go through.
    I turned on the "EOF as delimiter" option on the "seqno" field, set the data policy to "controlled", and your example graph runs fine.
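
    For reference, the corrected metadata would look roughly like this (the field names come from your create table; the semicolon and CRLF delimiters are assumptions, so adjust them to match your file):

    <Record fieldDelimiter=";" name="cdi_keys_csv" recordDelimiter="\r\n" type="delimited">
      <!-- "EOF as delimiter" on the FIRST field: a record that ends at EOF
           instead of a field delimiter is still accepted -->
      <Field eofAsDelimiter="true" name="seqno" type="integer"/>
      <Field name="addr" type="integer"/>
      <Field name="resi" type="integer"/>
      <Field name="hhld" type="integer"/>
      <Field label="individual" name="indv" type="integer"/>
    </Record>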
