Metadata Question

Comments 8

  • avackova
    Hello Anna,
    set eofAsDelimiter="true" on the last field.
  • anweston
    Heya,

    This almost works the way I want it to. We are parsing the source file(s) with each row as a single field, then breaking the "field" into separate fields. When I have the FMT as:

    <?xml version="1.0" encoding="UTF-8"?>
    <Record name="RECORD_ONE_FIELD_RECORD_" type="delimited" recordDelimiter="\n">
    <Field name="FIELD_INPUT_ROW_NUM" type="numeric" nullable="false" auto_filling="source_row_count" />
    <Field name="FIELD_ROW" type="string" nullable="false" eofAsDelimiter="true" />
    </Record>

    any source file that does not have a "\n" at the end of its final data row parses to the correct number of rows. BUT a source file that does have a "\n" on its final data row (so the last row is just the end-of-file character) now parses an extra row with one empty field.

    Is there any way to configure the FMT so that both cases will parse the correct number of rows?

    Thanks,
    Anna
  • avackova
    Hello Anna,
    when I added the missing delimiter for the FIELD_INPUT_ROW_NUM field, CloverETL 2.9.2 reads the data properly.
  • anweston
    Heya,

    Turns out the file I was using as a test was messed up (there was some DOS-to-Linux end-of-line stuff going on) - it now works the way I want it to. :-)

    I did not have to alter the FMT, though. What do you mean by "I added missing delimiter for FIELD_INPUT_ROW_NUM"? We have defined the delimiter in the <Record> tag as 'recordDelimiter="\n"'. Am I missing something? Just asking in case there's something that's working that shouldn't be and could cause an issue later on.


    Thanks,
    Anna
  • avackova
    Hello Anna,
    the field has no delimiter, and no default field delimiter is specified either. The checkConfig method reports "Graph configuration is invalid (Field delimiter for the field 'FIELD_INPUT_ROW_NUM' in the record element 'RECORD_ONE_FIELD_RECORD_' not found!)." I can't guarantee that a graph with such metadata will always work.
  • anweston
    Heya,

    Interesting.

    I am now testing an upgrade to 2.9.2 with this test case. We run with "-skipcheckconfig" because we are auto-generating our graph and we get a small performance gain by not running that. The FMT I provided runs just fine with "-skipcheckconfig" on.

    Omit the "-skipcheckconfig" and I get the error you report.

    ERROR - Field delimiter for the field 'FIELD_INPUT_ROW_NUM' in the record element 'RECORD_ONE_FIELD_RECORD_' not found!

    If I add delimiter="%" to the <Field> tag or fieldDelimiter="%" to the <Record> tag, it runs with no warnings.

    This delimiter is bogus: the first field (FIELD_INPUT_ROW_NUM) is an auto-generated field, so there is only one field in the file. If I choose a character that might be in my file, it may try to parse the row as two fields. Is this correct behaviour?

    Thanks,
    Anna
  • avackova
    Hello Anna,
    if you add a delimiter to the <Field> tag, it is not taken into account when reading the file, so it can't cause any problems. But if you use the metadata somewhere else for formatting the data, the graph may fail.
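
    For reference, a minimal sketch of the FMT from earlier in this thread with a delimiter added to the auto-filled field. It assumes the delimiter="%" value mentioned above; the character itself is arbitrary, and the behaviour described in the comment is only what was reported here for CloverETL 2.9.2, not something verified against other versions.

    ```xml
    <?xml version="1.0" encoding="UTF-8"?>
    <Record name="RECORD_ONE_FIELD_RECORD_" type="delimited" recordDelimiter="\n">
    <!-- delimiter="%" is a placeholder so that checkConfig finds a field delimiter;
         per this thread it is not used when reading, because the field is auto-filled -->
    <Field name="FIELD_INPUT_ROW_NUM" type="numeric" nullable="false" auto_filling="source_row_count" delimiter="%" />
    <Field name="FIELD_ROW" type="string" nullable="false" eofAsDelimiter="true" />
    </Record>
    ```

    If the same metadata is reused for writing data, the placeholder delimiter would presumably appear in the output, so a separate metadata definition may be safer there.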
  • anweston
    Heya,

    OK, I will try adding "delimiter" to the <Field> tag in case we decide to run graphs without "-skipcheckconfig". I was concerned because I found that adding fieldDelimiter="," to the <Record> tag (and there are commas in the file) seemed to cause parsing errors.

    Thanks,
    Anna