Tuesday, October 4, 2016

Exit a batch script when an error occurs

Through Pentaho's shell script step, we can integrate file manipulation, Confluence interaction, Tableau Server interaction, and just about any other application that has a command-line interface into our ETL.

There is a dark side to the shell script step, though: it may leave you thinking that your job ran flawlessly when in reality one of its commands failed along the way.

I've got a very simple test job here. It calls a script to rename the file "foo.bar" to "bar.foo". However, that file doesn't exist, so the shell script step exits with a failure status.

I add a second statement to my script -- this time, to rename a file that does exist. What's going to happen?

If the first statement fails, the second one still runs. In this case, it ran successfully and the "hello.world" file was renamed. This means that according to Pentaho, the success condition for this shell script step was met, we see a nice green check-mark, and if this job contained more steps that depended on the success of the shell script step, everything would proceed as normal. Which probably isn't OK.
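As a rough sketch, the two-statement script might look like the following. It assumes the renames are done with `ren`, and the target name `world.hello` is made up for illustration, since the post does not say what "hello.world" was renamed to.

```bat
@echo off
rem Sketch of the test script. "world.hello" is an assumed target name.

rem foo.bar does not exist, so this command fails.
ren foo.bar bar.foo

rem The script carries on regardless. hello.world exists, so this rename
rem succeeds, and the script finishes on a successful command.
ren hello.world world.hello
```

Because the last command succeeds, the step reports success even though the first rename never happened.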

Fortunately, there's a way around this that doesn't involve breaking these commands up into multiple job steps. As we see on this Microsoft page, the double pipe (||) gives us a way to execute a command only if the preceding command fails.

If we add "|| exit 1" after every command, we force the script, and with it the job step, to stop as soon as any command fails. Not only does the script exit, but it exits with a return code of 1, which signals failure.
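Applied to the same test script, that could look like this. Again, this is a sketch that assumes `ren` and the hypothetical target name `world.hello`.

```bat
@echo off
rem Each command is followed by "|| exit 1", so the first failure
rem stops the script and returns exit code 1 to the caller.

rem foo.bar does not exist: the rename fails and the script exits here.
ren foo.bar bar.foo || exit 1

rem Not reached in this scenario, because the script has already exited.
ren hello.world world.hello || exit 1
```

Note that a plain `exit 1` ends the whole command interpreter session. If the script is called from another batch file or an interactive prompt, `exit /b 1` ends only the script while still setting the return code.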

There are other ways to accomplish a similar result. For example, the same page referenced above indicates that we can use &&: "Use to run the command following && only if the command preceding the symbol is successful."

However, this would require chaining all of our commands together on the same line, which hurts readability when the commands are long or complex, or when there are more than a couple of them.
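For comparison, a sketch of the && version of the same script, using the same assumed `ren` commands and hypothetical `world.hello` name:

```bat
@echo off
rem The second rename runs only if the first one succeeds.
rem foo.bar does not exist, so the chain stops after the first command
rem and the script ends on that failed command.
ren foo.bar bar.foo && ren hello.world world.hello
```

With only two short commands this is still readable, but every additional command makes the line longer, which is where the "|| exit 1" form starts to look better.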