Customer Portal

Using allocation with DBInputTable

Comments 5

  • Avatar
    Pedro Vazquez Rosario
    0
    Comment actions Permalink
    Hi hewills,

    Could you please provide me with following information:

    • How many records are you processing?

    • How long does this process usually take?

    • What Database Server are you utilizing (vendor, version)?

    • Can you please provide me with your graph, I would like to look over your use case as well are how you configured it (please remove any sensitive data)
  • Avatar
    hewills
    0
    Comment actions Permalink
    Hi, thanks for taking a look.

    How many records are you processing?
    ~ 3,000,000

    How long does this process usually take?
    7.5 minutes

    What Database Server are you utilizing (vendor, version)?
    Oracle 12c database, on Red Hat 7.2
    Clover Server is running on CentOS
    I don't have access to the source server that I'm pulling data from, but it is probably similar.

    Can you please provide me with your graph, I would like to look over your use case as well are how you configured it (please remove any sensitive data). See attached.

    allocation_example.grf
  • Avatar
    Pedro Vazquez Rosario
    0
    Comment actions Permalink
    Hi hewills,

    After looking over your graph, I have a few suggestions that will improve the overall performance:

    • I would recommend using a Normalizer component to be able to create multiple queries

    • I would recommend removing the ParallelGather and adding a ParallelPartition before the DBInputTable

    • The ParallelPartition will distribute incoming data records among different workers (please read more information here)


    I have taken the liberty of adding a simple graph that simulates your use case

    Edit: I changed the Query URL processing type to Discrete on the DBInputTable

    send_optimization_data.grf
  • Avatar
    svecp
    0
    Comment actions Permalink
    There might be a major performance improvement in case that particular table is partitioned. You can query each partition separately, using "PARTITION" clause and instead of using ROW_NUMBER() you can select the whole partition. That should lower DB overhead considerably. See https://docs.oracle.com/cd/B28359_01/server.111/b28286/statements_10002.htm#SQLRF01702.

    But that really depends on your use case and DB structure.
  • Avatar
    hewills
    0
    Comment actions Permalink
    Thanks Pedro! I had a chance to use your design, and it sped up the graph as expected. As you said, using the 'ParallelPartition' seems to be key.
    Using 'PARTITION' wouldn't be an option for the database we are currently working with. But I'll keep it in mind for the future.

Please sign in to leave a comment.