Import mail attachment in DB based on mail subject filter

  • slechtaj
    Hi,

    The approach you described can be shortened into a single phase. Since EmailReader has two output ports, you can use both at once and read the subjects and the emails at the same time. You need to enrich the records on both ports with MessageID, which you will use to join the two streams. On the first output port (where the subjects are stored), apply a Filter component to keep only the “right” records, based on a pattern or rule defined in the Filter component. You now have two edges to join: one with the filtered subjects and one with all attachment records. Since you want only the records that have a match on both ports, use the INNER JOIN join type on the MessageID attribute in an ExtMergeJoin component. The output of ExtMergeJoin will then contain only those records (with attachment path and other required information) whose message passed the subject filter. These records can be sent directly to a reader component and processed further (transformed as required and inserted into the database).

    So again in short:

    • EmailReader with both output ports used.

    • Attach a Filter component to the first output port to keep only the valid messages.

    • Add ExtMergeJoin (with the default INNER JOIN join type) and join the two streams on the MessageID retrieved from EmailReader.

    • The output port of ExtMergeJoin will carry only the data you need (attachment information filtered by the subject filter).
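
    The steps above can be sketched outside CloverETL to show the data flow. This is only an illustration in Python, not CloverETL code: the `Message` and `Attachment` classes, the sample records, and the `^Invoice` pattern are all hypothetical, and the join uses a dictionary lookup rather than a merge join.

```python
from dataclasses import dataclass
import re


@dataclass
class Message:
    """Hypothetical stand-in for a record on the first output port."""
    message_id: str
    subject: str


@dataclass
class Attachment:
    """Hypothetical stand-in for a record on the second output port."""
    message_id: str
    path: str


def filter_by_subject(messages, pattern):
    """Keep only the messages whose subject matches the pattern (the Filter step)."""
    rule = re.compile(pattern)
    return [m for m in messages if rule.search(m.subject)]


def inner_join(messages, attachments):
    """Pair each attachment with its message; records without a match are dropped."""
    by_id = {m.message_id: m for m in messages}
    return [
        (by_id[a.message_id].subject, a.path)
        for a in attachments
        if a.message_id in by_id
    ]


# Made-up sample data: two messages, one attachment each.
messages = [
    Message("<id-1>", "Invoice 2012-09"),
    Message("<id-2>", "Lunch on Friday?"),
]
attachments = [
    Attachment("<id-1>", "/tmp/invoice.pdf"),
    Attachment("<id-2>", "/tmp/menu.pdf"),
]

selected = inner_join(filter_by_subject(messages, r"^Invoice"), attachments)
print(selected)  # [('Invoice 2012-09', '/tmp/invoice.pdf')]
```

    Only the attachment whose message passed the subject filter survives the join, which is the same effect the INNER JOIN has in the graph.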


    For more information about EmailReader, please refer to our documentation: http://doc.cloveretl.com/documentation/ ... eader.html

    Hope this helps.
    Jan
  • scipio
    Thanks. Using ExtMergeJoin was a great idea.

    I did get this error though from CloverETL:

    Data input 0 is not sorted in ascending order. Record #3: Key field="MessageID". Current="MessageID:<CAOPFz2Sw7CYnCXTG4EG2cdjV1n=SR4+dcW52M6abY-45e2rcqg@mail.gmail.com>"; Previous="MessageID:<1686222707.184840.1348632498203.JavaMail.root@li202-140>".

    It looks like it expects the MessageIDs to arrive in either ascending or descending order. I have no control over this.

    Any ideas?
  • slechtaj
    Hi,

    That's simply because ExtMergeJoin works only with sorted data. To sort the incoming records by the join key (MessageID), use an ExtSort component (see our documentation for more information) on both input edges.
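
    To illustrate why sorting matters, here is a minimal sketch of a generic sort-merge inner join in Python. It is not how ExtMergeJoin is implemented; the `merge_join` function, its error text, and the sample records are assumptions made for the example. A merge join walks both inputs in step, so it only finds matches if both sides are ordered on the join key.

```python
from operator import itemgetter


def merge_join(left, right, key):
    """Inner-join two lists of dicts that are both sorted ascending on `key`."""
    def check_sorted(records, port):
        for i in range(1, len(records)):
            if records[i][key] < records[i - 1][key]:
                raise ValueError(f"Data input {port} is not sorted in ascending order")

    check_sorted(left, 0)
    check_sorted(right, 1)

    joined, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        lk, rk = left[i][key], right[j][key]
        if lk < rk:
            i += 1          # left record has no partner yet, move on
        elif lk > rk:
            j += 1          # right record has no partner, skip it
        else:
            # Emit every right-hand record sharing this key, then advance left.
            k = j
            while k < len(right) and right[k][key] == lk:
                joined.append({**left[i], **right[k]})
                k += 1
            i += 1
    return joined


# Made-up records arriving in arbitrary MessageID order.
subjects = [
    {"MessageID": "<c@example>", "subject": "Invoice 3"},
    {"MessageID": "<a@example>", "subject": "Invoice 1"},
]
attachments = [
    {"MessageID": "<b@example>", "path": "b.pdf"},
    {"MessageID": "<a@example>", "path": "a.pdf"},
    {"MessageID": "<c@example>", "path": "c.pdf"},
]

by_id = itemgetter("MessageID")

try:
    merge_join(subjects, attachments, "MessageID")
except ValueError as err:
    print(err)  # unsorted input is rejected

# Sorting both inputs on the join key first makes the join work.
result = merge_join(sorted(subjects, key=by_id), sorted(attachments, key=by_id), "MessageID")
print([r["path"] for r in result])  # ['a.pdf', 'c.pdf']
```

    Sorting each input on MessageID before the join plays the role of the sort component placed on each input edge.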
  • scipio
    Awesome! This works splendidly.

    BTW, I used FastSort instead of the ExtSort you suggested. It seems to be the fastest sorting transformer you have.
