Run a limited number of commands in parallel from a bash for loop

Issue

I have ~10K files that I'd like to run a command on, and it would be faster if ~10 of those files were processed at the same time. The filenames are hash values, so there is no obvious pattern to iterate over. Below is the sequential for loop that goes through the files one at a time:

Sequential for loop:

src_dir='/path/to/src'
dest_dir='/path/to/dest'
for f in "$src_dir"/*.mp4; do echo "$dest_dir/new_$(basename "$f")"; done

Parallel for loop that starts ALL the commands at once:

src_dir='/path/to/src'
dest_dir='/path/to/dest'
for f in "$src_dir"/*.mp4; do echo "$dest_dir/new_$(basename "$f")" & done

Is there a straightforward way of limiting the number of parallel jobs that start simultaneously in a for loop?

Note: I understand that parallel --link -j {num} {command} ::: x y z would let me start {num} commands at the same time, but I'm not sure how to easily supply the x y z arguments because my filenames are hash values.
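(For reference, GNU parallel can take the arguments straight from a shell glob, so the hash-valued filenames never need to be typed out. A minimal sketch, assuming GNU parallel is installed and using the echo placeholder from the question in place of the real command:

src_dir='/path/to/src'
dest_dir='/path/to/dest'
# {/} is GNU parallel's replacement string for the basename of each input;
# -j 10 caps the number of jobs running at once
parallel -j 10 echo "$dest_dir"/new_{/} ::: "$src_dir"/*.mp4

The question below is about doing the same thing with plain bash.)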

Solution

Try it like this:

a=1   # iteration counter
n=10  # desired number of simultaneous processes
for i in {1..100}; do
    ((a % n == 0)) && wait   # every n-th iteration, wait for the running batch to finish
    ((a++))
    sleep 5 &                # stand-in for the real command, run in the background
done
wait                         # collect the final batch of jobs
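Applied to the files from the question, the same counter-and-wait pattern might look like this (a sketch that keeps the echo placeholder; substitute the real command for the background job):

src_dir='/path/to/src'
dest_dir='/path/to/dest'
n=10  # desired number of simultaneous processes
a=1
for f in "$src_dir"/*.mp4; do
    ((a % n == 0)) && wait   # pause every n-th iteration until the batch drains
    ((a++))
    echo "$dest_dir/new_$(basename "$f")" &
done
wait  # collect the last batch of jobs

Note that plain wait blocks until every job in the current batch has exited. On bash 4.3 or newer, wait -n returns as soon as any single job finishes, which keeps a rolling pool of n jobs instead of processing in batches.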

Answered By – Ivan

This answer, collected from Stack Overflow, is licensed under CC BY-SA 2.5, CC BY-SA 3.0 and CC BY-SA 4.0.
