I am reading in a CSV file and getting the following error. The error goes away if I switch the charset to ISO-8859-1, but I set it to UTF-8 because the input file contains Unicode characters and I need to preserve them. I thought UTF-8 could read anything? What did I do wrong?
Thanks,
Perri
Component [UniversalDataReader:UNIVERSAL_DATA_READER4] finished with status ERROR. (Out0: 2613 recs)
Error when parsing record #2614 field effdate value "23-N"
Character decoding error occurred. Set correct charset. Current charset is UTF-8
Input length = 1
------------------------------------------------------------------------------------------------------
ERROR [main] - Execution of graph failed !
-
Hi Perri,
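Unfortunately, UTF-8 cannot read everything. Not every sequence of bytes is valid UTF-8, so the parser may run into a sequence it cannot decode. ISO-8859-1, by contrast, assigns a character to every byte value, which is why the error goes away with it, although the characters you get may not be the intended ones. See http://en.wikipedia.org/wiki/UTF-8#Inva ... _sequences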
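To illustrate the difference outside CloverETL, here is a minimal Python sketch. The sample word and the helper name try_decode are made up for the example; they are not taken from your file, and I am only assuming that your data contains a byte of this kind.

```python
# Hypothetical sample: "café" saved as ISO-8859-1, so "é" is the single byte 0xE9.
latin1_bytes = "café".encode("iso-8859-1")
utf8_bytes = "café".encode("utf-8")


def try_decode(data: bytes, charset: str) -> str:
    """Return the decoded text, or a short description of the failure."""
    try:
        return data.decode(charset)
    except UnicodeDecodeError as err:
        return f"decode error at byte {err.start}: {err.reason}"


# 0xE9 announces a multi-byte UTF-8 sequence, but nothing follows it,
# so a strict UTF-8 decoder rejects the input.
print(try_decode(latin1_bytes, "utf-8"))

# ISO-8859-1 maps every byte value to a character, so it never fails.
print(try_decode(latin1_bytes, "iso-8859-1"))

# Properly encoded UTF-8 decodes without trouble.
print(try_decode(utf8_bytes, "utf-8"))
```

So a file that is mostly UTF-8 but contains even one stray single-byte character will fail under a strict UTF-8 reader, while ISO-8859-1 will accept it and possibly show the wrong characters for the parts that really are UTF-8.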
In CloverETL it should be possible to use a lenient/controlled policy on readers to skip invalid lines, but this will only be available from 4.1M1 onward; see https://bug.javlin.eu/browse/CLO-5043 for details.
So the current workaround may be to clean up the data with an external tool before processing it in CloverETL.
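As one possible form of such an external cleanup, here is a rough Python sketch. It assumes the file is mostly UTF-8 with a few lines that were saved in ISO-8859-1; that is only a guess about your data, and the function name clean_to_utf8 is made up for this example. If the stray lines are in some other encoding, the fallback will produce wrong characters rather than an error, so it is worth checking the reported line numbers by hand.

```python
from typing import List, Tuple


def clean_to_utf8(data: bytes, fallback: str = "iso-8859-1") -> Tuple[bytes, List[int]]:
    """Re-encode a mixed-encoding file as UTF-8, line by line.

    Lines that are valid UTF-8 are kept unchanged. Lines that are not are
    decoded with the fallback charset (an assumption about the data) and
    re-encoded as UTF-8. Returns the cleaned bytes together with the
    1-based numbers of the lines that had to be repaired.
    """
    cleaned_lines = []
    repaired = []
    # Splitting on the newline byte is safe: 0x0A never appears inside a
    # multi-byte UTF-8 sequence.
    for number, raw in enumerate(data.split(b"\n"), start=1):
        try:
            text = raw.decode("utf-8")
        except UnicodeDecodeError:
            text = raw.decode(fallback)
            repaired.append(number)
        cleaned_lines.append(text.encode("utf-8"))
    return b"\n".join(cleaned_lines), repaired


# Hypothetical input: two UTF-8 lines followed by one ISO-8859-1 line.
sample = "id,name\n1,café\n".encode("utf-8") + "2,Señor\n".encode("iso-8859-1")
cleaned, repaired = clean_to_utf8(sample)
print(repaired)
print(cleaned.decode("utf-8"))
```

For a real file you would read the input with open(path, "rb"), pass the bytes to the function, and write the result back with open(out_path, "wb"). The cleaned file can then be read in CloverETL with the charset set to UTF-8.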
I hope this helps.