Customer Portal

Multiple output files

Comments 5

  • avackova
    Hello,
    to detect the input file, use the autofilling feature with the source_name function. Then, when writing, set this field in the partitionKey attribute. To avoid saving the file name with the data, also list it in the excludeFields attribute (see the Data Writer component).
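    In graph XML terms, that advice amounts to something like the fragment below. This is a sketch only; the field name input_file_name and the component id are illustrative, and the Data Writer attributes shown are limited to the ones mentioned above:

    ```xml
    <!-- Metadata: an extra field filled automatically with the source file name -->
    <Field auto_filling="source_name" name="input_file_name" type="string"/>

    <!-- Data Writer: partition by that field, but exclude it from the written data -->
    <Node id="DATA_WRITER0" type="DATA_WRITER"
          fileURL="${DATAOUT_DIR}/#"
          partitionKey="input_file_name"
          excludeFields="input_file_name"/>
    ```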
  • shylupriya
    Thanks for the reply.

    But what I actually want is to pick multiple input files and produce multiple output files corresponding to the input files,
    i.e. if I pick "n" input files, I should get "n" output files.
    How can I do that?

    Thanks in advance
  • shylupriya
    Hi,
    Thanks for the reply.

    But what I actually want is to pick multiple input files and produce a corresponding output file for each.
    For example:
    Input files: in1.txt
    in2.txt
    in3.txt
    Output files:
    out1.txt
    out2.txt
    out3.txt
  • avackova
    Hello,
    try the following graph:
    <?xml version="1.0" encoding="UTF-8"?>
    <Graph id="1279265421952" name="OIA" revision="1.46">
    <Global>
    <Metadata id="Metadata0" >
    <Record name="data" recordDelimiter="\n" type="delimited">
    <Field eofAsDelimiter="true" name="data" type="string"/>
    <Field auto_filling="source_name" delimiter="." name="input_file_name" type="string"/>
    </Record>
    </Metadata>
    <Metadata id="Metadata1" >
    <Record fieldDelimiter="," name="files" recordDelimiter="\n" type="delimited">
    <Field name="input" type="string"/>
    <Field name="output" type="string"/>
    </Record>
    </Metadata>
    <Property fileURL="workspace.prm" id="GraphParameter0"/>
    <LookupTable id="LookupTable0" initialSize="512" key="input" metadata="Metadata1" name="files" type="simpleLookup"/>
    <Dictionary/>
    </Global>
    <Phase number="0">
    <Node fileURL="${DATAIN_DIR}/*.txt" id="DATA_READER0" trim="false" type="DATA_READER"/>
    <Node id="DENORMALIZER0" key="input_file_name" type="DENORMALIZER">
    <attr name="denormalize"><![CDATA[//#CTL1
    // This transformation defines the way in which multiple input records
    // (with the same key) are denormalized into one output record.

    // This function is called for each input record from a group of records
    // with the same key.
    function append() {
    }

    // This function is called once after the append() function was called for all records
    // of a group of input records defined by the key.
    // It creates a single output record for the whole group.
    function transform() {
    list parts = split($input_file_name,"/");
    $input := $input_file_name;
    $output := replace(parts[length(parts) - 1],"in","out");
    return OK
    }

    // Called during component initialization.
    // function boolean init() {}

    // Called during each graph run before the transform is executed. May be used to allocate and initialize resources
    // required by the transform. All resources allocated within this method should be released
    // by the postExecute() method.
    // function void preExecute() {}

    // Called only if append() throws an exception.
    // function integer appendOnError(string errorMessage, string stackTrace) {
    // }

    // Called only if transform() throws an exception.
    //function integer transformOnError(string errorMessage, string stackTrace) {
    //}

    // Called after transform() to return the resources that have been used to their initial state
    // so that next group of records with different key may be parsed.
    // function void clean() {}

    // Called during each graph run after the entire transform was executed. Should be used to free any resources
    // allocated within the preExecute() method.
    // function void postExecute() {}

    // Called to return a user-defined error message when an error occurs.
    // function string getMessage() {}
    ]]></attr>
    </Node>
    <Node id="LOOKUP_TABLE_READER_WRITER0" lookupTable="LookupTable0" type="LOOKUP_TABLE_READER_WRITER"/>
    <Node id="SIMPLE_COPY0" type="SIMPLE_COPY"/>
    <Edge fromNode="DATA_READER0:0" id="Edge2" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (output)" toNode="SIMPLE_COPY0:0"/>
    <Edge debugMode="true" fromNode="DENORMALIZER0:0" id="Edge1" inPort="Port 0 (in)" metadata="Metadata1" outPort="Port 0 (out)" toNode="LOOKUP_TABLE_READER_WRITER0:0"/>
    <Edge fromNode="SIMPLE_COPY0:0" id="Edge3" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 0 (out)" toNode="DATA_WRITER0:0"/>
    <Edge fromNode="SIMPLE_COPY0:1" id="Edge4" inPort="Port 0 (in)" metadata="Metadata0" outPort="Port 1 (out)" toNode="DENORMALIZER0:0"/>
    </Phase>
    <Phase number="1">
    <Node append="false" excludeFields="input_file_name" fileURL="${DATAOUT_DIR}/#" id="DATA_WRITER0" partition="LookupTable0" partitionFileTag="keyNameFileTag" partitionKey="input_file_name" partitionOutFields="output" type="DATA_WRITER"/>
    </Phase>
    </Graph>
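    The transform() above derives each output file name from the full source path: it splits on "/", keeps the last segment, and replaces "in" with "out". For illustration only, here is a rough Python equivalent of that mapping (the graph itself runs CTL1, not Python, and map_output_name is a name made up for this sketch; note that CTL1's replace() is regex-based, which for the literal pattern "in" behaves the same as Python's str.replace):

    ```python
    def map_output_name(input_path: str) -> str:
        # Mirror split($input_file_name, "/") and parts[length(parts) - 1]:
        # keep only the file name from the full source path.
        parts = input_path.split("/")
        file_name = parts[-1]
        # Mirror replace(..., "in", "out"), so in1.txt becomes out1.txt.
        return file_name.replace("in", "out")

    print(map_output_name("${DATAIN_DIR}/in1.txt"))  # out1.txt
    ```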
  • bartosik
    Beautiful. This is a nice simple and elegant demonstration of a number of CloverETL features.

    1. How to solve the problem presented: how to take a series of input files and create a corresponding set of output files.
    2. Demonstrates the use of phases. One phase to populate the lookup table, one to process the data.
    3. One way to read multiple input files: ${DATAIN_DIR}/*.txt
    4. How to build a lookup table.
    5. How to partition using that lookup table.

    This was a big help to me.
