Wednesday, August 3, 2016

"Copy rows to result" twice in Pentaho PDI

I have two different datasets, and I need to do two different command-line operations with them in the same job. This seems simple enough, but if the second set happens to be empty, strange things can happen.

Set 1 comes from the Input_Field1 transformation and holds the words "bat", "cat", and "hat".

Set 2 comes from the Input_Field2 transformation, filtered to be empty for this example: I have a list of words that start with "z", and I'm filtering out any words that don't start with "a".

Think of this as a list of files in a directory, possibly filtered to get back any that were created before yesterday. But what if none fit those criteria, because today is Monday?

The setup:

Main.kjb has two job steps, Echo_Field1 and Echo_Field2. For this example, each job has the simple task of echoing back the fields it receives.
The Echo_Field1 job (Echo_Field2 looks the same) uses echoField1.bat, a batch file with the single statement "echo %1".

When I initially ran this job, I received the "bat, cat, hat" output for both the Field1 and Field2 steps. What to do? How can I stop the Field1 output from interfering with the step for Field2, if Field2 has nothing to say?

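The trick is to check the "Clear list of result rows before execution" box in the Advanced tab of the job entry details for the Input_Field2 transformation. That way, an empty Input_Field2 no longer leaves the Field1 rows sitting in the result for Echo_Field2 to pick up.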
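To make the behavior easier to picture, here's a toy model in Python. It is not PDI's API, just a sketch of what I observed: ToyJob, run_transformation, run_echo, and run_main are names I made up, the "z" words are placeholders, and the rule that an empty transformation leaves earlier result rows untouched is my assumption based on the output above.

```python
from typing import List, Tuple


class ToyJob:
    """Toy stand-in for a PDI job; NOT the real PDI API."""

    def __init__(self) -> None:
        # One list of result rows shared by every entry in the job.
        self.result_rows: List[str] = []

    def run_transformation(self, rows: List[str], clear_before: bool = False) -> None:
        """Mimic a transformation that ends in "Copy rows to result"."""
        if clear_before:
            # Models the "Clear list of result rows before execution" box.
            self.result_rows = []
        if rows:
            self.result_rows = list(rows)
        # Assumption: with no rows to copy, earlier result rows stay in place.

    def run_echo(self) -> List[str]:
        """Mimic an echo job that runs once per result row."""
        return list(self.result_rows)


def run_main(set2: List[str], clear_before_field2: bool) -> Tuple[List[str], List[str]]:
    """Run the four entries of the toy Main job in order."""
    job = ToyJob()
    job.run_transformation(["bat", "cat", "hat"])                    # Input_Field1
    field1_out = job.run_echo()                                      # Echo_Field1
    job.run_transformation(set2, clear_before=clear_before_field2)   # Input_Field2
    field2_out = job.run_echo()                                      # Echo_Field2
    return field1_out, field2_out


z_words = ["zebra", "zoo"]  # hypothetical sample words
set2_empty = [w for w in z_words if w.startswith("a")]  # nothing starts with "a"

print(run_main(set2_empty, clear_before_field2=False))
print(run_main(set2_empty, clear_before_field2=True))
```

With the box unchecked, the second echo repeats "bat", "cat", "hat"; with it checked, the second echo gets an empty list.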

Now Field1 and Field2 can express themselves freely, even if that means not saying anything.

And it works if Field2 has entries as well.
