My git workflow includes the hack and ship commands for easy tracking of a shared master branch and for conveniently delivering commits. Feature branches are cheap and fast in git, and I often spawn new branches to try stuff out or work on unrelated things.
chop is for chopping down the current working branch after it has been shipped and is no longer needed. The script changes the current branch to master and then deletes the branch you were previously on. If you give a branch name as an argument, that branch becomes the new current branch instead.
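To make the behaviour concrete, here is a minimal sketch of such a command written in Ruby. It is an illustration only, not the original script: the method names chop_commands and chop are made up for this example, and it assumes that git branch -d (which refuses to delete unmerged branches) is the kind of deletion you want.

```ruby
#!/usr/bin/env ruby
# Hypothetical sketch of a "chop" command; not the original script.

# Build the git commands that chop `current` and leave you on `target`.
def chop_commands(current, target = "master")
  raise ArgumentError, "refusing to chop #{target} itself" if current == target

  [
    ["git", "checkout", target],
    ["git", "branch", "-d", current] # -d refuses to delete unmerged work
  ]
end

# Look up the current branch and run the commands in order,
# stopping at the first one that fails.
def chop(target = "master")
  current = `git rev-parse --abbrev-ref HEAD`.strip
  chop_commands(current, target).each do |cmd|
    system(*cmd) or abort("chop: '#{cmd.join(' ')}' failed")
  end
end
```

To use it as a script, you would call chop(ARGV.first || "master") at the bottom of the file. Keeping the command list separate from the code that runs it makes it easy to check what would happen before touching a real repository.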
I use this small script multiple times every day, and I really like its name. There is not a whole lot of functionality, but as this is an often-repeated action, it makes sense to automate it.
Some web-applications have to ingest an enormous amount of new data on a regular basis. Import scripts easily become an ever-growing procedural mess, annoying to maintain. In this post I show a bit of code which can be used to simplify and unify such import scripts.
Assume you have a pipeline of post-import steps to run. This can be organized in numerous ways. Simplest is to just have a bunch of methods called one after the other once you have the data loaded:
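As an illustration of that simplest approach, here is a small Ruby sketch. The class NaiveImport and the step names update_categories, update_prices and rebuild_search_index are placeholders invented for this example; real steps would do actual work instead of just logging their name.

```ruby
# Hypothetical post-import steps; all names here are placeholders.
class NaiveImport
  attr_reader :log

  def initialize
    @log = []
  end

  def update_categories;    @log << :update_categories;    end
  def update_prices;        @log << :update_prices;        end
  def rebuild_search_index; @log << :rebuild_search_index; end

  # Every run starts from the top, even if only the last step failed before.
  def run
    update_categories
    update_prices
    rebuild_search_index
  end
end

NaiveImport.new.run
```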
Now, assume once in a while one of the steps fails for an unexpected reason. You know, data from external sources is rarely as clean as we’d like. So you need to fix a few things and retry the import. However, as data sizes grow, and with them the running time of the import, it can be a huge waste to redo all the work just because a misplaced comma made the final step fail.
Exceptions are the obvious way to report fatal data errors, and implicit or explicit transactions are the obvious way to ensure consistency of the import. But how can these easily be combined into a resume-friendly import mechanism?
Enter the bulk importer step runner with trivial progress reporting:
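What follows is a minimal sketch of such a step runner in plain Ruby, not a definitive implementation. The method names all_steps, run_import_step and import_updaters and the class name ImportModel come from the text; everything else (the StepFailed error, the completed list, the first_step argument and the three placeholder steps) is assumed for illustration. In a Rails application each step would probably also run inside a database transaction, which is left out here so the example runs on its own.

```ruby
class ImportModel
  # Hypothetical error type that names the step to resume from.
  class StepFailed < StandardError; end

  attr_reader :completed

  def initialize
    @completed = []
  end

  # The ordered list of step methods to run.
  def all_steps
    %i[update_categories update_prices rebuild_search_index]
  end

  # Run a single step; on failure, report which step to resume from.
  def run_import_step(step)
    puts "Running #{step}..."
    send(step)
    @completed << step
  rescue StandardError => e
    raise StepFailed, "#{step} failed (#{e.message}); resume with import_updaters(:#{step})"
  end

  # Run every step from first_step onwards (all of them by default).
  def import_updaters(first_step = all_steps.first)
    index = all_steps.index(first_step) or raise ArgumentError, "unknown step: #{first_step}"
    all_steps.drop(index).each { |step| run_import_step(step) }
  end

  # Placeholder steps; replace these with the real post-import work.
  def update_categories; end
  def update_prices; end
  def rebuild_search_index; end
end

ImportModel.new.import_updaters
```

After a failure you fix the data and call import_updaters with the name of the failed step, so the steps that already completed are not repeated.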
Notice you obviously have to change the model name (ImportModel above) and provide the actual implementation for the individual steps.
all_steps returns the list of methods to run,
run_import_step runs a single step with error handling, and
import_updaters runs all the relevant updaters.
Easy performance statistics
As a bit of bonus functionality, the following can be used for reporting import progress with timing statistics after each step completes:
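Here is a minimal sketch of what such a helper could look like. Only the name report_progress and the idea of passing a comment and a block come from the text; the out keyword, the message format and the use of a monotonic clock are assumptions made for this example.

```ruby
# Print a comment, run the block, then print how long the block took.
# Returns whatever the block returns.
# The `out:` keyword is a hypothetical addition that makes the output testable.
def report_progress(comment, out: $stdout)
  out.puts "#{comment}..."
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  out.puts format("%s done in %.2fs", comment, elapsed)
  result
end
```

A monotonic clock is used instead of Time.now so that a system clock adjustment in the middle of a long import cannot skew the reported duration.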
Usage is simple - just call report_progress with a comment to print and a block of code, like this:
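A usage sketch could look like the following. To keep the snippet runnable on its own it includes a compact stand-in for report_progress, and update_prices is a hypothetical step that merely sleeps for a moment instead of doing real import work.

```ruby
# Compact stand-in for report_progress so this snippet runs on its own.
def report_progress(comment)
  puts "#{comment}..."
  started = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  result = yield
  elapsed = Process.clock_gettime(Process::CLOCK_MONOTONIC) - started
  puts format("%s done in %.2fs", comment, elapsed)
  result
end

# Hypothetical step; stands in for real import work.
def update_prices
  sleep 0.05
  :prices_updated
end

# Wrap the step in a block and pass a comment describing it.
status = report_progress("Updating prices") { update_prices }
```

Since the helper returns the block's value, wrapping an existing call does not change what the surrounding code receives.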
What do you use to make data-imports easier to manage?