Customer Portal

Having trouble with Streaming vs. In Memory

Comments 1

  • Avatar
    Vladimir Barton
    0
    Comment actions Permalink
    Hi Mike,
    there is an essential difference in the SpreadsheerDataReader read modes in the way how data is stored in the memory. The In-memory mode stores the whole input file in the memory which allows for faster reading and makes it a good candidate for smaller files. In the Streaming mode, the file is being read continuously without the need of storing in the memory. This mode is designed for reading large-sized files without the risk of running out of memory. Although your use case seems to coincide with the way how these 2 modes are designed to work I would not dare to conclude that using the former or the latter is behind the error you encountered. In order to narrow down the cause of both the first issue (incorrect number format) and the second issue (the reported error message), I would like to ask you for more details:

    • I would like to test run the graph on my end in order to replicate the same behavior of the system. Kindly attach the graph file(s), subgraph file(s), any externalized items (such as metadata) and sample input data which would enable me to run the graph and recreate the issue. Feel free to mask any sensitive information that might be contained within these items.

    • Are the input XSL files edited in Excel (or a similar program) by real users or are they just an output of a 3rd party software or process?

    • When you open these files in Excel (or a similar program) can you see any unusual/unexpected format already?

    • Please attach the whole stack trace of the graph run when you received the reported error message.

    • You mentioned a number of smaller input files and some other supposedly bigger files. Could you give me a rough estimate of the number of rows and columns in those files? In other words, what is a small input file in terms of your use case?

    • You also mentioned that depending on the order you selected the spreadsheets, the outcome is different. Could you clarify which order of files being read results in the proper column format and which order does not?

    • What is your CloverETL Designer (and possibly CloverETL Server) version? For instance, 4.8.2.

    Kind regards,

Please sign in to leave a comment.