Customer Portal

Splitting input row into multiple columns using forward slash

Comments 4

  • Avatar
    David Pavlis
    1
    Comment actions Permalink

    Hi there,

    You did not really explain what exactly you want to do with the data. So I will take an assumption that you want to read your input data line by line and then create output (let's assume a file) where there will be 10 columns (the number can be higher) which will be populated by splitting your input data records into pieces by slash (your example contains backslash) and then populating output columns. If you read a line which gets split into 5 pieces, then you would populate the first 5 columns of your output, if split into 10 then you would populate all 10 output columns.

    The transformation graph would look like the picture below. The assumption is that your input data is a text file, with each such path string on a new line (but more complex structure could be parsed too).

    You would use input metadata with just one field called line with default delimiter \n (new line). The output metadata would have 10 fields called field1...field10.

    The Map component (Split&Map) would contain a transformation like this:

    function integer transform() {
        
        string[] elements = split($in.0.line,"/");
        for(integer i=0;i<length(elements);i++){
          setStringValue($out.0,i, elements[i]);
        }
        return ALL;
                
    }

    The nice thing is that should you need to split & populate wider records (e.g. up to 20 elements), you just need to add enough "space" to the output record/metadata - additional 10 fields (field11..field20 for example). The names are arbitrary, we are using Clover's ability to address output fields by their order number,  not name and we do it dynamically.

    Also, if you actually needed to split the data using backslash, then the code would need to be modified to something like this:

    string[] elements = split($in.0.line,"\\\\");

    The four backslashes there are because backslash is a control character both in Clover's CTL language and also in Regex. So we are escaping everything twice.

    Your Input & Output metadata (for those 2 Edges in the transformation graph):

     

  • Avatar
    CloverDxNewBie
    0
    Comment actions Permalink

    Dear David, You have understood the problem exactly, please accept my apologies for not defining the requirements very clear, appreciated your prompt and swift response. sorry for the delay in replying back.

    One more question, within the same transform function, can i also do the further splits for example mainly we are doing using "\\\\" but i have to split using underscore "_"

    for example: consider a line contains a single column (same to the above) :

    D:\Shares\PRD\assets\Producten\000\000\000\hires\000000001_02.jpg

    In the above below code will work to split into 10 columns using backward slash "\" but if you check the last value "000000001_02.jpg" - I want further split using "underscore" i.e. values will be "000000001" & "02.jpg"

     

    function integer transform() {
        
      string[] elements = split($in.0.line,"\\\\");
        for(integer i=0;i<length(elements);i++){
          setStringValue($out.0,i, elements[i]);
        }
        return ALL;
                
    }

    Many thanks, appreciate. it helped

  • Avatar
    David Pavlis
    1
    Comment actions Permalink

    If that is the only change (adding additional "splitting" character) then this should work:

    string[] elements = split($in.0.line,"[\\\\_]");

    So changed the Regex expression. You could add other characters between [] to split also based on that - e.g.

    [\\\\_ ]

    would split also if there is a space in the string.

  • Avatar
    CloverDxNewBie
    0
    Comment actions Permalink

    Yes great David Pavlis - You are a genius, thank you very much, really appreciated. 

Please sign in to leave a comment.