GO-FIND-DUPLICATES(1)

NAME

go-find-duplicates - Find duplicate files (photos, videos, music, documents) on your computer, portable hard drives etc.

SYNOPSIS

$ go install github.com/m-manu/go-find-duplicates@latest

DESCRIPTION

Find duplicate files (photos, videos, music, documents) on your computer, portable hard drives etc.

README

Go Find Duplicates

Introduction

A blazingly fast, simple-to-use tool to find duplicate files (photos, videos, music, documents etc.) on your computer, portable hard drives and other storage.

Note:

  • This tool just reads your files and creates a 'duplicates report' file
  • It does not delete or otherwise modify your files in any way 🙂
  • So, it's very safe to use 👍

How to install?

  1. Install Go 1.22 or later
  2. Run command:
    go install github.com/m-manu/go-find-duplicates@latest
    
  3. Add the following line to your .bashrc/.zshrc file:
    export PATH="$PATH:$HOME/go/bin"
    

How to use?

Run directly (preferred)

go-find-duplicates {dir-1} {dir-2} ... {dir-n}

Command line options

Running go-find-duplicates --help displays the following:

go-find-duplicates is a tool to find duplicate files and directories

Usage: go-find-duplicates [flags] <dir-1> <dir-2> ... <dir-n>

where, arguments are readable directories that need to be scanned for duplicates

Flags (all optional):
  -x, --exclusions string   path to file containing newline-separated list of file/directory names to be excluded
                            (if this is not set, by default these will be ignored:
                            .DS_Store, System Volume Information, $RECYCLE.BIN etc.)
  -h, --help                display help
  -m, --minsize uint        minimum size of file in KiB to consider (default 4)
  -o, --output string       following modes are accepted:
                             print = just prints the report without creating any file
                              text = creates a text file in the output directory with basic information
                               csv = creates a csv file in the output directory with detailed information
                              json = creates a JSON file in the output directory with basic information
                            (default "text")
  -f, --outputfile string   output file path (will be created, but directory needs to be writeable)
  -p, --parallelism uint8   extent of parallelism (defaults to number of cores minus 1)
  -q, --quiet               quiet mode: no output on stdout/stderr, except for duplicates/errors
  -t, --thorough            apply thorough check of uniqueness of files
                            (caution: this makes the scan very slow!)
      --version             display version (1.8.0) and exit (useful for incorporating this in scripts)

For more details: https://github.com/m-manu/go-find-duplicates
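
The --exclusions flag expects a file with one file or directory name per line. As a rough illustration of that format only (not the tool's actual code), the sketch below loads such a list into a set. The function name parseExclusions, the trimming of whitespace and the skipping of blank lines are all assumptions made for this example.

```go
package main

import (
	"bufio"
	"fmt"
	"strings"
)

// parseExclusions turns newline-separated text into a set of names.
// Trimming whitespace and skipping blank lines are assumptions of this
// sketch, not documented behaviour of go-find-duplicates.
func parseExclusions(content string) map[string]struct{} {
	names := make(map[string]struct{})
	scanner := bufio.NewScanner(strings.NewReader(content))
	for scanner.Scan() {
		name := strings.TrimSpace(scanner.Text())
		if name == "" {
			continue
		}
		names[name] = struct{}{}
	}
	return names
}

func main() {
	// Hypothetical contents of a file passed via --exclusions.
	content := ".DS_Store\nSystem Volume Information\n\n$RECYCLE.BIN\n"
	excluded := parseExclusions(content)

	for _, name := range []string{".DS_Store", "photo.jpg"} {
		_, skip := excluded[name]
		fmt.Printf("%s excluded: %t\n", name, skip)
	}
}
```

A set keyed by name keeps each lookup cheap during a scan of many files, which is why a map is used here rather than a slice.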

Run via Docker

docker run --rm -v /Volumes/PortableHD:/mnt/PortableHD manumk/go-find-duplicates:latest go-find-duplicates -o print /mnt/PortableHD

In the above command:

  • option --rm removes the container when it exits
  • option -v mounts the host directory /Volumes/PortableHD as /mnt/PortableHD inside the container

How does this identify duplicates?

By default, this tool treats files as duplicates if all of the following conditions hold:

  1. file extension is the same
  2. file size is the same
  3. CRC32 hash of "crucial bytes" is the same
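
The three conditions above can be sketched as a comparable key in Go. This is a minimal illustration and not the tool's implementation: the README does not define "crucial bytes", so this sketch assumes they are the first and last edgeLen bytes of a file. The names fileKey, crucialBytes, keyFor, groupByKey and edgeLen are hypothetical, as is the lower-casing of extensions. Files are held in memory here to keep the example self-contained.

```go
package main

import (
	"fmt"
	"hash/crc32"
	"path/filepath"
	"sort"
	"strings"
)

// edgeLen is a hypothetical number of bytes sampled from each end of a file.
const edgeLen = 4096

// fileKey combines the three default conditions into one comparable value.
type fileKey struct {
	Ext  string
	Size int64
	CRC  uint32
}

// crucialBytes returns the head and tail of data (an assumption of this
// sketch); small inputs are returned whole.
func crucialBytes(data []byte) []byte {
	if len(data) <= 2*edgeLen {
		return data
	}
	sample := make([]byte, 0, 2*edgeLen)
	sample = append(sample, data[:edgeLen]...)
	sample = append(sample, data[len(data)-edgeLen:]...)
	return sample
}

// keyFor builds the key from extension, size and CRC32 of the sample.
func keyFor(name string, data []byte) fileKey {
	return fileKey{
		Ext:  strings.ToLower(filepath.Ext(name)), // case-folding is an assumption
		Size: int64(len(data)),
		CRC:  crc32.ChecksumIEEE(crucialBytes(data)),
	}
}

// groupByKey buckets file names by key; buckets with more than one name
// are likely duplicates.
func groupByKey(files map[string][]byte) map[fileKey][]string {
	groups := make(map[fileKey][]string)
	for name, data := range files {
		k := keyFor(name, data)
		groups[k] = append(groups[k], name)
	}
	for _, names := range groups {
		sort.Strings(names)
	}
	return groups
}

func main() {
	files := map[string][]byte{
		"trip/IMG_001.jpg":   []byte("same picture bytes"),
		"backup/IMG_001.JPG": []byte("same picture bytes"),
		"notes.txt":          []byte("same picture bytes"),
	}
	for key, names := range groupByKey(files) {
		if len(names) > 1 {
			fmt.Printf("likely duplicates (%s, %d bytes): %v\n", key.Ext, key.Size, names)
		}
	}
}
```

Hashing only a sample keeps the work per file roughly constant regardless of file size, which is consistent with the speed the README describes. The trade-off is that two files of equal size that differ only outside the sampled bytes would share a key.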

If the default above isn't enough for your requirements, you can use the command line option --thorough to switch to a SHA-256 hash of the entire file contents. But remember, this makes the scan much slower!

When tested on my portable hard drive containing more than 172k files (videos, audio files, images and documents), the results with and without the --thorough option were the same!
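
To show what a full-content comparison involves, here is a minimal Go sketch of hashing an entire input with SHA-256. It is an illustration of the general technique, not the tool's code; the helper names digest and digestBytes are hypothetical.

```go
package main

import (
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// digest streams r through SHA-256, so large inputs need not fit in memory.
func digest(r io.Reader) (string, error) {
	h := sha256.New()
	if _, err := io.Copy(h, r); err != nil {
		return "", err
	}
	return hex.EncodeToString(h.Sum(nil)), nil
}

// digestBytes is a convenience wrapper for in-memory data.
func digestBytes(data []byte) (string, error) {
	return digest(bytes.NewReader(data))
}

func main() {
	// Two equal-sized inputs that differ in a single byte in the middle.
	a := bytes.Repeat([]byte{'x'}, 1024)
	b := bytes.Repeat([]byte{'x'}, 1024)
	b[512] = 'y'

	da, errA := digestBytes(a)
	db, errB := digestBytes(b)
	if errA != nil || errB != nil {
		fmt.Println("hashing failed:", errA, errB)
		return
	}
	fmt.Println("identical content:", da == db)
}
```

Because digest accepts any io.Reader, a handle returned by os.Open could be passed to it directly. Every byte has to be read, so the cost grows with file size, which matches the slowdown the --thorough option warns about.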

How to build?

go build

SEE ALSO

clihub    3/4/2026    GO-FIND-DUPLICATES(1)