Image Deduplicator
Detect and remove exact or similar duplicate images in folders using perceptual and MD5 hashing with configurable similarity and actions.
v1.0.0
Description
Find and remove duplicate or similar images in a folder, using MD5 hashing for exact matches and perceptual hashing for near-duplicates. Use it when the user wants to clean up duplicate images, find near-duplicates, or deduplicate an image dataset.
Features
- Exact Duplicates: Find images with identical content
- Similar Images: Detect visually similar images (threshold configurable)
- Hash-based: Fast MD5 hashing for exact duplicates
- Perceptual Hash: pHash for finding similar images
- Batch Processing: Process large image folders
- Multiple Actions: List, delete, or move duplicates
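The exact-duplicate feature above can be sketched with the standard library alone. This is a minimal illustration, not the code in scripts/dedupe.py: the function names md5_of_file and find_exact_duplicates are hypothetical, and it assumes "identical content" means byte-identical files.

```python
import hashlib
from collections import defaultdict
from pathlib import Path


def md5_of_file(path, chunk_size=1 << 20):
    """Return the hex MD5 digest of a file, read in chunks to bound memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_exact_duplicates(folder, extensions=("jpg", "jpeg", "png", "bmp")):
    """Group files under `folder` whose bytes are identical (hypothetical helper)."""
    wanted = {"." + ext.lower() for ext in extensions}
    groups = defaultdict(list)
    # Sorting makes the "first occurrence" in each group deterministic.
    for path in sorted(Path(folder).rglob("*")):
        if path.is_file() and path.suffix.lower() in wanted:
            groups[md5_of_file(path)].append(path)
    # Only digests shared by two or more files form a duplicate group.
    return [paths for paths in groups.values() if len(paths) > 1]
```

Two files that show the same picture but were saved with different compression or metadata produce different MD5 digests, which is the gap the perceptual hash is meant to cover.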
Usage
# Find exact duplicates
python scripts/dedupe.py scan /path/to/images/
# Find similar images (90% similarity)
python scripts/dedupe.py scan /path/to/images/ --threshold 90
# Delete duplicates (keeps first occurrence)
python scripts/dedupe.py scan /path/to/images/ --action delete
# Move duplicates to a folder
python scripts/dedupe.py scan /path/to/images/ --action move --output /path/to/dupes/
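The list, delete, and move actions shown above might be handled roughly as follows. This is a sketch under assumptions: apply_action is a hypothetical name, the input is a list of duplicate groups (each a list of paths), and the first path in every group is the one that is kept.

```python
import shutil
from pathlib import Path


def apply_action(groups, action="list", output=None):
    """Apply `action` to every file in each group except the first one."""
    handled = []
    for group in groups:
        keeper, *extras = group  # the first occurrence is always kept
        for path in extras:
            path = Path(path)
            if action == "delete":
                path.unlink()
            elif action == "move":
                if output is None:
                    raise ValueError("action 'move' needs an output folder")
                target_dir = Path(output)
                target_dir.mkdir(parents=True, exist_ok=True)
                shutil.move(str(path), str(target_dir / path.name))
            elif action != "list":
                raise ValueError(f"unknown action: {action}")
            handled.append(path)
    return handled
```

The sketch does not guard against two duplicates with the same file name landing in the output folder; a real tool would likely rename on collision. Running with the list action first is a cheap way to review what delete would remove.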
Examples
$ python scripts/dedupe.py scan ./images/
Scanning images...
Found 150 images
Computing hashes...
Found 5 duplicate groups:
Group 1 (3 files):
./images/photo1.jpg
./images/photo1_copy.jpg
./images/photo1_final.jpg
Group 2 (2 files):
./images/screenshot.png
./images/screenshot (1).png
Total: 5 duplicate groups, 8 duplicate files
Installation
pip install pillow imagehash
Options
- --threshold: Similarity threshold (0-100), default: 100 (exact)
- --action: What to do with duplicates (list, delete, move)
- --output: Output folder for --action move
- --extensions: File extensions to scan (default: jpg,jpeg,png,bmp)
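How --threshold relates to perceptual hashes is not spelled out here. One plausible reading, shown below purely as an assumption, is that a percentage maps to a maximum Hamming distance between two 64-bit hashes (the size imagehash.phash produces by default). The names max_distance, hamming, and is_similar are hypothetical, and hashes are plain integers so the sketch runs without extra packages.

```python
def max_distance(threshold, hash_bits=64):
    """Largest Hamming distance still counted as 'similar' at this threshold.

    Assumed mapping: 100 means zero differing bits, 0 means any hash matches.
    """
    if not 0 <= threshold <= 100:
        raise ValueError("threshold must be between 0 and 100")
    return (100 - threshold) * hash_bits // 100


def hamming(a, b):
    """Number of differing bits between two integer hashes."""
    return bin(a ^ b).count("1")


def is_similar(hash_a, hash_b, threshold, hash_bits=64):
    """True when the two hashes differ in no more bits than the threshold allows."""
    return hamming(hash_a, hash_b) <= max_distance(threshold, hash_bits)
```

Under this assumed mapping, --threshold 90 would tolerate up to 6 differing bits out of 64. With the imagehash package, subtracting two hash objects yields the same Hamming distance, so the comparison would carry over directly.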