dupels is inspired by the ls command, but lists directory contents grouped by checksum (currently MD5; options for other cryptographic hash functions are coming soon). Its main use case is identifying duplicate files in nested directories efficiently.
The following example lists every duplicated audio sample across 4 drum kits in the drum_kits directory.
The -d 3 option descends at most 3 subdirectories deep when looking for files, and the -o option omits all files with
a unique checksum.
The >-- separator marks the boundary between checksum groups (here, groups of duplicates, since all unique files are omitted).
$ dupels -d 3 -o drum_kits
drum_kits/kit_1/open hat/oh (wod).wav
drum_kits/kit_1/open hat/oh (baby pluto).wav
>--
drum_kits/kit_3/REAL TRAPPER PERCZ/SF RT PERC 20.wav
drum_kits/kit_0/Hi Hats/Dp Beats- Hi Hat (3).wav
>--
drum_kits/kit_1/claps/clap (baby pluto).wav
drum_kits/kit_1/claps/clap (wod).wav
>--
drum_kits/kit_3/REAL TRAPPER SOUNDFONTZ/ZSF_Brass_Ensemble_SE.sf2
drum_kits/kit_2/VST Presets/Soundfonts/Brass Ensemble SE.sf2
>--
drum_kits/kit_1/808s/classic zay.wav
drum_kits/kit_0/808s/Dp Beats- 808(9).wav
>--
drum_kits/kit_3/REAL TRAPPER PERCZ/SF RT PERC 47.wav
drum_kits/kit_3/REAL TRAPPER PERCZ/SF RT PERC 37.wav
>--
drum_kits/kit_2/VST Presets/Other/OFFICIAL D. RICH PIZZICATO.sf2
drum_kits/kit_2/VST Presets/Soundfonts/Piccolo (2).sf2
drum_kits/kit_2/VST Presets/Soundfonts/Pizzicato_1.sf2
>--
drum_kits/kit_3/REAL TRAPPER SOUNDFONTZ/Pizzicato Strings.sf2
drum_kits/kit_3/REAL TRAPPER SOUNDFONTZ/Piccolo (5).sf2
drum_kits/kit_2/VST Presets/Soundfonts/Piccolo (5).sf2
>--
drum_kits/kit_1/kicks/kick (baby pluto).wav
drum_kits/kit_1/kicks/kick (wod).wav
>--
drum_kits/kit_3/REAL TRAPPER CLAPZ/SF RT CLAP 2.wav
drum_kits/kit_0/Claps/Dp Beats- Clap(1).wav
drum_kits/kit_0/Claps/Dp Beats- Clap (5).wav
>--
drum_kits/kit_1/sfx/ripsquadd riser 2.wav
drum_kits/kit_0/FX/Dp Beats- Drop (3).wav
>--
drum_kits/kit_2/VST Presets/Soundfonts/Synths (2).SF2
drum_kits/kit_2/VST Presets/Soundfonts/Synths (1).SF2
>--
drum_kits/kit_3/REAL TRAPPER SOUNDFONTZ/Orchestra Hits.sf2
drum_kits/kit_2/VST Presets/Soundfonts/Orchestra Hits.sf2

Because duplicates are detected by checksum, false negatives can occur. For example, two MP3 files might sound identical but still have different checksums if one is encoded at 128 kbps and the other at 320 kbps. Although they are perceptually the same, their bytes differ, so their checksums differ. False positives, where two files with different contents produce the same checksum, are extremely rare: for any given pair of files the likelihood is about 1 in 2^128 for MD5 (unless the files are deliberately engineered to collide). As a disclaimer, this tool is meant to aid productivity and file management, NOT to dictate definitive decisions.
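The grouping behaviour described above can be illustrated with a minimal Rust sketch. It is not the dupels implementation: the function names (digest, group_by_digest, render) are hypothetical, file contents are passed in memory rather than read from disk, and because the Rust standard library has no MD5, the sketch substitutes the 64-bit DefaultHasher purely as a stand-in digest.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::BTreeMap;
use std::hash::Hasher;

/// Stand-in digest. ASSUMPTION: dupels uses MD5; this sketch uses the
/// standard library's 64-bit DefaultHasher only for illustration.
fn digest(bytes: &[u8]) -> u64 {
    let mut hasher = DefaultHasher::new();
    hasher.write(bytes);
    hasher.finish()
}

/// Group (name, contents) pairs by digest.
/// Names keep their input order within each group.
fn group_by_digest(files: &[(&str, &[u8])]) -> Vec<Vec<String>> {
    let mut groups: BTreeMap<u64, Vec<String>> = BTreeMap::new();
    for (name, contents) in files {
        groups
            .entry(digest(contents))
            .or_default()
            .push(name.to_string());
    }
    groups.into_values().collect()
}

/// Render the groups one name per line, with `separator` between groups.
/// When `omit_unique` is true, groups containing a single file are skipped.
fn render(groups: &[Vec<String>], separator: &str, omit_unique: bool) -> String {
    let kept: Vec<String> = groups
        .iter()
        .filter(|group| !omit_unique || group.len() > 1)
        .map(|group| group.join("\n"))
        .collect();
    kept.join(&format!("\n{}\n", separator))
}

fn main() {
    // Hypothetical in-memory "files": two identical samples and one unique one.
    let files = [
        ("kit_1/kicks/kick (wod).wav", "same bytes".as_bytes()),
        ("kit_1/kicks/kick (baby pluto).wav", "same bytes".as_bytes()),
        ("kit_0/snare.wav", "different bytes".as_bytes()),
    ];
    let groups = group_by_digest(&files);
    println!("{}", render(&groups, ">--", true));
}
```

With omit_unique set, only the two kick paths are printed, since the snare's contents differ and it lands in a group of its own. This also shows why false negatives arise: any byte-level difference, however inaudible, changes the digest.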
On Linux, you can download the latest release from the GitHub Releases page and extract the appropriate archive for your system:
wget https://github.com/srdlj/dupels/releases/latest/download/dupels-linux.tar.gz
tar -xzf dupels-linux.tar.gz
cd dupels-linux
./dupels --help

Or, for the zip archive:
wget https://github.com/srdlj/dupels/releases/latest/download/dupels-linux.zip
unzip dupels-linux.zip
cd dupels-linux
./dupels --help

On macOS, you can download the latest release from the GitHub Releases page and extract the appropriate archive for your system:
curl -LO https://github.com/srdlj/dupels/releases/latest/download/dupels-macos.tar.gz
tar -xzf dupels-macos.tar.gz
cd dupels-macos
./dupels --help

Or, for the zip archive:
curl -LO https://github.com/srdlj/dupels/releases/latest/download/dupels-macos.zip
unzip dupels-macos.zip
cd dupels
./dupels --help

Download the Windows release from the GitHub Releases page:
With zip:
Expand-Archive -Path .\dupels-windows.zip -DestinationPath .\dupels-windows
cd .\dupels-windows
.\dupels.exe --help

Or, with tar:
tar -xzf .\dupels-windows.tar.gz
cd .\dupels-windows
.\dupels.exe --help
To build from source, ensure you have Rust installed, then run:
git clone https://github.com/srdlj/dupels.git
cd dupels
cargo build --release --package dupels-cli

The compiled binary will be located at:
- target/release/dupels (Linux/macOS)
- target/release/dupels.exe (Windows)
You can then run:
./target/release/dupels --help
or on Windows:
.\target\release\dupels.exe --help
For details, run dupels --help to display usage:
$ dupels --help
Usage: dupels [OPTIONS] [FILE]

Arguments:
  [FILE]  Displays the name of files contained within a directory. If no operand is given, the contents of the current directory are displayed

Options:
  -a                               Include directory entries whose names begin with a dot (.)
  -r                               Generate the file names in a direcotry tree by walking the tree top-down.
                                   If the -d option is specified, walk to the depth specified, otherwise the default is depth of 2.
  -d, --depth <DEPTH>              Specifies the depth to generate file names during walk.
                                   The -d option implies the -r option.
  -s, --seperator <SEPERATOR>      Specify the seperator to use when listing the filenames [default: >--]
  -o, --omit                       Omit displaying files that are unique
      --max-threads <MAX_THREADS>  Specify the maximum number of threads to use.
                                   The default is the number of logical cores on the machine.
  -h, --help                       Print help
  -V, --version                    Print version
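To illustrate how the -r, -d and -a options interact, here is a minimal Rust sketch of a depth-limited, top-down directory walk using only the standard library. It is not taken from the dupels source: the walk function, its parameters, and the convention that a depth of 0 lists only the starting directory are assumptions made for illustration, and dupels may count depth differently.

```rust
use std::fs;
use std::io;
use std::path::{Path, PathBuf};

/// Collect regular files under `dir`, descending at most `max_depth`
/// subdirectory levels below it.
/// ASSUMPTION: depth 0 means "only the files directly inside `dir`".
fn walk(
    dir: &Path,
    max_depth: usize,
    include_hidden: bool,
    out: &mut Vec<PathBuf>,
) -> io::Result<()> {
    for entry in fs::read_dir(dir)? {
        let entry = entry?;
        let path = entry.path();

        // Skip dot-entries unless asked to include them (like the -a option).
        let hidden = entry.file_name().to_string_lossy().starts_with('.');
        if hidden && !include_hidden {
            continue;
        }

        let file_type = entry.file_type()?;
        if file_type.is_dir() {
            // Recurse only while there is depth budget left.
            if max_depth > 0 {
                walk(&path, max_depth - 1, include_hidden, out)?;
            }
        } else if file_type.is_file() {
            out.push(path);
        }
    }
    Ok(())
}

fn main() -> io::Result<()> {
    // Walk the current directory to a depth of 2, skipping dot-entries.
    let mut files = Vec::new();
    walk(Path::new("."), 2, false, &mut files)?;
    files.sort();
    for file in &files {
        println!("{}", file.display());
    }
    Ok(())
}
```

Passing the remaining depth down the recursion keeps the walk top-down and bounds how far it can descend, which is one simple way to get the behaviour the help text describes.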
If you’ve found a bug:
- Head over to issues
- Create a new issue with a clear title, detailed description, and steps to reproduce if applicable.
- Label the issue appropriately (e.g., bugs labeled bugs, feature requests labeled enhancements).
If you’d like to add a new feature:
- Open an issue first to discuss your idea.
- Once approved, submit a pull request with your implementation.
- Create a new branch for your change. Label it /feature/<your_username>/<brief_feature_name>
- Make your changes with clear, atomic commits.
- Write tests and update documentation as needed.
- Submit a pull request with a clear title and description.
- Ensure all CI tests pass before requesting review.
- Breaking changes must be documented in the PR description.
- Code needs to be readable.
- Code should have helpful docstrings when applicable.
It's recommended to use the devcontainer that's already set up for this project.
You will need:
- Docker installed and running.
- An editor or IDE that supports Dev Containers, such as Visual Studio Code with the Dev Containers extension.
To get started:
- Clone the repo.
- Open the project in your editor/IDE.
- From your command palette, select the option to "Reopen in Container".
- Done!
By contributing, you agree that your contributions will be licensed under the LICENSE file in the root of this repository.
- DupeLs-GUI (maybe egui?)
- Option to allow users to choose different cryptographic hash functions (SHA256, SHA1, etc.)
- Option to target popular formats: Audio -> wav, mp3, m4a, etc. Images -> jpg, png, gif, svg, etc. Video -> mp4, mov, etc.
- Optimize recursive search
- Introduce threads/parallel computing (checksum calculation causing bottlenecks)
- Optimize MD5 checksum calculation (build from scratch)
- Better error handling