Quoting field and universal data reader

  • mzatopek
    Hello Peter!

    The warning message is probably caused by an invalid quoting format. According to the current parsing algorithm, the closing quote character must be followed immediately by the appropriate field delimiter. I suppose your data file doesn't satisfy this condition. Can you please confirm?

    For instance, a whitespace gap between the closing quote and the delimiter:

    "Martin" ;"Zatopek

    Thanks, Martin.
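The rule described above (a closing quote must be followed immediately by the field delimiter) can be illustrated with a minimal Python sketch. This is not CloverETL's parser; the function name `split_quoted_line` and its error messages are hypothetical, and the final closing quote on `"Zatopek"` is added here as an assumption so that the whitespace gap is the only problem in the input.

```python
def split_quoted_line(line, delimiter=",", quote='"'):
    """Split one line into fields (hypothetical helper, not CloverETL code).

    A closing quote must be followed immediately by the delimiter
    or by the end of the line; anything else is rejected.
    """
    fields = []
    pos = 0
    while True:
        if line.startswith(quote, pos):
            # Quoted field: find the matching closing quote.
            end = line.find(quote, pos + 1)
            if end == -1:
                raise ValueError(f"unterminated quote at column {pos}")
            fields.append(line[pos + 1:end])
            pos = end + 1
            if pos == len(line):
                return fields
            if not line.startswith(delimiter, pos):
                raise ValueError(f"expected delimiter at column {pos}")
            pos += len(delimiter)
        else:
            # Unquoted field: read up to the next delimiter.
            end = line.find(delimiter, pos)
            if end == -1:
                fields.append(line[pos:])
                return fields
            fields.append(line[pos:end])
            pos = end + len(delimiter)


print(split_quoted_line('"Martin";"Zatopek"', delimiter=";"))
# ['Martin', 'Zatopek']
try:
    split_quoted_line('"Martin" ;"Zatopek"', delimiter=";")
except ValueError as err:
    print("rejected:", err)
# rejected: expected delimiter at column 8
```

A strict parser of this kind rejects the gap instead of guessing, which is one way a warning like the one discussed could arise.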
  • pmularien
    Hello Martin,

    Thanks for moving this conversation from email to the forum. Here are some additional details. I do not believe it is invalid or incorrect data.

    Let me paste my file definition below, and a sample:

    File Definition:

    <Record name="PersonCSVWithProfession" type="delimited">
      <Field delimiter="," name="Profession" nullable="true" size="50" type="string"/>
      <Field delimiter="," name="LastName" nullable="false" size="35" type="string"/>
      <Field delimiter="," name="FirstName" nullable="false" size="15" type="string"/>
      <Field delimiter="," name="MiddleName" nullable="true" size="12" type="string"/>
      <Field delimiter="," name="Suffix" nullable="true" size="3" type="string"/>
      <Field delimiter="," name="Address1" nullable="false" size="100" type="string"/>
      <Field delimiter="," name="Address2" nullable="true" size="100" type="string"/>
      <Field delimiter="," name="Address3" nullable="true" size="100" type="string"/>
      <Field delimiter="," name="City" nullable="false" shift="0" size="25" type="string"/>
      <Field delimiter="," name="StateAbbrev" nullable="false" size="2" type="string"/>
      <Field delimiter="\n" name="ZipCode" nullable="false" shift="0" size="5" type="string"/>
    </Record>


    Sample Record:

    "XY","SAMPLE","SAMPLE","J","","123 ANYWHERE ST","","","FALMOUTH","MA","02540"


    Reader Definition:

    <Node id="inputFile" type="DATA_READER"
    fileURL="${INPUT_FILE}"
    dataPolicy="controlled"
    skipFirstLine="true"
    quotedStrings="true"
    />


    As you can see, the delimiter for every field directly follows the closing quote. If I remove the quotes from the input file and remove the quotedStrings="true" attribute from the reader, everything works as expected. Thus, I believe it's a bug. I am using CloverETL 2.4.1.

    Thanks in advance!
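As an independent sanity check of the claim above, the sample record can be parsed with Python's standard `csv` module, which is a different parser from CloverETL's. The field names are copied from the file definition in the post; the helper name `parse_record` is hypothetical. This is only a sketch showing that the sample is well-formed quoted CSV with the expected number of fields.

```python
import csv
import io

# Field names copied from the Record definition in the post.
FIELD_NAMES = [
    "Profession", "LastName", "FirstName", "MiddleName", "Suffix",
    "Address1", "Address2", "Address3", "City", "StateAbbrev", "ZipCode",
]

SAMPLE = '"XY","SAMPLE","SAMPLE","J","","123 ANYWHERE ST","","","FALMOUTH","MA","02540"\n'


def parse_record(text):
    """Parse one quoted, comma-delimited line into a dict keyed by field name."""
    reader = csv.reader(io.StringIO(text), delimiter=",", quotechar='"')
    row = next(reader)
    if len(row) != len(FIELD_NAMES):
        raise ValueError(f"expected {len(FIELD_NAMES)} fields, got {len(row)}")
    return dict(zip(FIELD_NAMES, row))


record = parse_record(SAMPLE)
print(record["LastName"], record["City"], record["ZipCode"])
# SAMPLE FALMOUTH 02540
```

The empty quoted fields ("") come back as empty strings, which matches the fields marked nullable="true" in the definition.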
  • mzatopek
    Peter, just a quick response. It seems it really is a bug. If you need a quick workaround, try using the Delimited Data Reader instead. I'll give you more information about this issue tomorrow.

    Martin
  • pmularien
    Thanks. Unfortunately, the Delimited Data Reader didn't work either. I don't recall the reason now, but it was a different problem. Let me know if you need any further information and I can supply it. Thanks!
  • mzatopek
    So, as I said, it really is a bug in the Universal Data Reader component. We will release a fix in update 2.4.2 next week. If you need to solve this issue earlier, you can download the 2.4 branch from our public SVN repository, or I can send you a binary package as a "pre-release" of 2.4.2. It's up to you.

    Martin
  • mzatopek
    We are also very interested in the Delimited Data Reader bug you mentioned. If you run into this issue again, please send me a bug report. Thanks.
  • pmularien
    Thanks Martin, I will confirm the fix in SVN and then wait for the official 2.4.2 release. I appreciate the quick follow-up :)
  • pmularien
    I will try to narrow down the Delimited Data Reader issue and report a bug.
  • pmularien
    Just to follow up - the quotes bug with the universal data reader is indeed fixed in 2.4.2 - thank you!
  • pmularien
    Well, hopefully you are still reading this. I ran into an issue with the quoted characters today - it turns out that a single quote is considered a quote character, so if I have a field like this:

    "O'Neil"

    The single quote is considered the "end of string" and the file parsing falls apart. I have never seen a product behave like this - is it a bug, and/or is there any way to alter this behavior?
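The symptom described above can be reproduced with a small Python sketch. It does not reflect CloverETL's actual implementation, which is not shown in this thread; the names `QUOTE_CHARS`, `read_quoted`, and `match_opening` are hypothetical. With `match_opening=False` any quote character ends the string, which mimics the reported behaviour; with `match_opening=True` only the character that opened the field can close it, which is one plausible remedy.

```python
QUOTE_CHARS = ('"', "'")  # assumption: both characters are treated as quotes


def read_quoted(field, match_opening=True):
    """Return the content of a quoted field (hypothetical helper).

    match_opening=False: any quote character terminates the string.
    match_opening=True:  only the opening quote character terminates it.
    """
    if not field or field[0] not in QUOTE_CHARS:
        return field  # not quoted, return unchanged
    opening = field[0]
    closers = (opening,) if match_opening else QUOTE_CHARS
    for i in range(1, len(field)):
        if field[i] in closers:
            return field[1:i]
    raise ValueError("unterminated quoted field")


print(read_quoted('"O\'Neil"', match_opening=False))
# O
print(read_quoted('"O\'Neil"', match_opening=True))
# O'Neil
```

In the first call the apostrophe is taken as the end of the string, so the rest of the field would be misread as stray data, which is consistent with parsing falling apart on names like O'Neil.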
  • avackova
    Hello,
    it really was a bug. It has been fixed, and the fix will be in today's release.
