Where I work, we've been using Snakemake[1] or data pipeliens. It's like GNU-make but with a Pythonic syntax. It determines which processing steps need to be executed by building a directed acyclic graph from the end to the beginning; it figures out which operations are needed to produce a given output, then looks at those and figures out their inputs, and so it. It can even submit parallel processing jobs to a batch-queueing cluster.