There's a great package in the Debian repos called parallel.
If you need to hash a large volume of data quickly, it lets you use the power of all 8 cores.
The basic idea: parallel launches multiple copies of the same command, fully using the capabilities of multicore, multithreaded systems.
Typically you launch the same number of processes as you have cores.
seq 1000 | parallel -j 8 --workdir "$PWD" ./yourcommand
seq <number of tasks> | parallel -j 64 --workdir "$PWD" ./yourcommand
Note that -j takes a single number (or a percentage of cores), not a range, so pick a value like 64 or 128.
However, if your script is self-terminating (meaning once it fulfills its purpose it exits on its own),
you can truly saturate the board. On the BPI-M3 I routinely use 64 workers.
The load average goes way up, but the amount of work done per minute almost triples.
If you compile software from source, make uses the same argument syntax for parallel jobs:
make -j 64