Batch Calculate and Verify MD5 Checksum With GNU Parallel


Calculating or verifying md5 checksums for a large number of files is always a time-consuming process. Here are some notes on how to speed up md5 checksum work with GNU parallel.


1. Install GNU parallel

On Debian:

sudo apt update
sudo apt install parallel

2. Calculate md5 checksum in batch

find /data -type f | parallel -j 64 md5sum > md5.txt

The -j option sets the maximum number of jobs that run at the same time. Without it, GNU parallel runs one job per CPU thread.
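
If file names may contain newlines or other unusual characters, a NUL-delimited variant of the command is safer. The following is a minimal sketch, not a definitive recipe: checksum_tree is a hypothetical helper name, and it assumes GNU find plus nproc and md5sum from coreutils are installed.

```shell
# Hypothetical helper: write the md5 checksum of every regular file
# under a directory ($1) to an output file ($2).
checksum_tree() {
    dir=$1
    out=$2
    # -print0 and -0 pass NUL-delimited names, so odd file names survive.
    # -j "$(nproc)" runs one job per CPU thread reported by nproc.
    # -k keeps the output in the same order as the input.
    find "$dir" -type f -print0 | parallel -0 -k -j "$(nproc)" md5sum > "$out"
}

# Example call (paths are placeholders):
# checksum_tree /data md5.txt
```

Keep the output file outside the directory being scanned, otherwise find may pick it up while it is still being written.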

3. Verify md5 checksum in batch

cat md5.txt | parallel --pipe -N1 md5sum -c

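--pipe makes parallel split its standard input into chunks and feed each chunk to the standard input of one md5sum -c job. Combined with --pipe, -N sets the number of records (here, lines of md5.txt) that each job receives.

With -N1 every file gets its own md5sum process. If the files are small, use -N100 so that each process verifies 100 files and the process startup cost is spread out.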
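
For a large checksum list it can help to print only the failures and rely on the exit status. The following is a minimal sketch under a few assumptions: verify_checksums is a hypothetical helper name, the md5sum is the coreutils one (which supports --quiet), and the batch size of 100 is just the value suggested above.

```shell
# Hypothetical helper: verify a checksum list ($1) in parallel.
# Prints only the entries that fail and returns non-zero if any job failed.
verify_checksums() {
    list=$1
    # --pipe -N100: each md5sum -c job reads up to 100 lines on stdin.
    # --quiet: md5sum does not print "OK" lines for files that verify.
    parallel --pipe -N100 md5sum -c --quiet < "$list"
}

# Example call (file name is a placeholder):
# verify_checksums md5.txt && echo "all files verified"
```

GNU parallel's exit status reflects how many jobs failed, so a zero status means every batch verified cleanly.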
