Photo De-Duplication

dedup


Install Windows

cd dedup
python -m venv C:\Users\$env:USERNAME\Common\env_dedup

Check whether the Scripts or bin folder exists under C:\Users\$env:USERNAME\Common\env_dedup, then activate the environment:

& C:\Users\$env:USERNAME\Common\env_dedup\Scripts\Activate.ps1

If you get an error like "running scripts is disabled on this system", your PowerShell execution policy is too restrictive. You can temporarily allow scripts for the current PowerShell session by running:

Set-ExecutionPolicy -Scope Process -ExecutionPolicy Bypass

After that, repeat the Activate.ps1 command above.

If that does not work, try the bin folder instead:
& C:\Users\$env:USERNAME\Common\env_dedup\bin\Activate.ps1
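The Scripts-or-bin check can also be done from Python. The following is a minimal sketch, not part of this repository: `find_activate_dir` is a hypothetical helper name, and it assumes the environment was created with `python -m venv` so that an Activate.ps1 file sits in one of the two folders.

```python
from pathlib import Path
from typing import Optional


def find_activate_dir(venv_dir: str) -> Optional[Path]:
    """Return the folder inside venv_dir that holds Activate.ps1, or None.

    Hypothetical helper: checks Scripts first, then bin, mirroring the
    manual check described above.
    """
    root = Path(venv_dir)
    for name in ("Scripts", "bin"):
        candidate = root / name
        if (candidate / "Activate.ps1").is_file():
            return candidate
    return None


if __name__ == "__main__":
    # Assumes the venv location used in this README, relative to the
    # current user's home folder.
    venv = Path.home() / "Common" / "env_dedup"
    print(find_activate_dir(str(venv)))
```

If the function returns None, the environment was probably not created at that path.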

Install the required packages with pip:

pip install xxhash opencv-python pillow
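To confirm that the install worked inside the active environment, a small check like the one below may help. It is only a sketch: `missing_packages` is a hypothetical helper, and the one detail to know is that the import names differ from the pip names (`opencv-python` is imported as `cv2`, `pillow` as `PIL`).

```python
import importlib.util

# pip package name -> module name used in "import" statements
REQUIRED = {"xxhash": "xxhash", "opencv-python": "cv2", "pillow": "PIL"}


def missing_packages(required=REQUIRED):
    """Return the pip names whose import module cannot be found."""
    return [pip_name for pip_name, module in required.items()
            if importlib.util.find_spec(module) is None]


if __name__ == "__main__":
    missing = missing_packages()
    if missing:
        print("Missing:", " ".join(missing))
    else:
        print("All required packages are installed.")
```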

Usage Windows:

Again, check whether the folder is named Scripts or bin, then activate the environment and run the script:

& C:\Users\$env:USERNAME\Common\env_dedup\Scripts\Activate.ps1
python dedup.py 0.jpg 1.jpg

Windows Automated Directory use (not tested yet):

get_dups.bat %USERPROFILE%\Pictures

Install Linux

cd dedup
python3 -m venv myenv
source myenv/bin/activate
pip install xxhash opencv-python pillow

Usage Linux:

cd dedup
source myenv/bin/activate
python dedup.py 0.jpg 1.jpg

Get more details on scores:

Append scores to the command to print more detail: matrix deviation score, decomposed similarity, combined similarity, and general score.

python dedup.py 0.jpg 1.jpg scores

Linux Automated Directory use:

./get_dups.sh .
  OR: 
./get_dups.sh $HOME/Pictures

Files made by get_dups Scripts:

Error levels:

0 = not a duplicate
1 = duplicate
2 = close match
5 = same GPS geolocation
8 = invalid image
9 = file too small or too big

Possible files: dups.txt, alike.txt, sameGPS.txt, invalid.txt, size.txt.
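To illustrate how a driver script could sort results into these files, here is a rough Python sketch. It is not the get_dups script itself, and it rests on several assumptions: that dedup.py reports the error level as its process exit code, that each non-zero level maps to the file listed in the same position above, that every pair of images is compared, and that the listed file extensions are the ones of interest. The names `scan`, `run_dedup`, `RESULT_FILES`, and `IMAGE_SUFFIXES` are hypothetical.

```python
import itertools
import subprocess
import sys
from pathlib import Path

# ASSUMPTION: each error level maps to the file listed in the same position.
RESULT_FILES = {1: "dups.txt", 2: "alike.txt", 5: "sameGPS.txt",
                8: "invalid.txt", 9: "size.txt"}
IMAGE_SUFFIXES = {".jpg", ".jpeg", ".png"}  # ASSUMPTION: illustrative only


def run_dedup(a, b):
    """ASSUMPTION: dedup.py reports the error level as its exit code."""
    cmd = [sys.executable, "dedup.py", str(a), str(b)]
    return subprocess.run(cmd).returncode


def scan(root, out_dir=".", compare=run_dedup):
    """Compare every pair of images under root and log non-zero results.

    Returns a dict of {result file name: number of pairs written}.
    """
    images = sorted(p for p in Path(root).rglob("*")
                    if p.is_file() and p.suffix.lower() in IMAGE_SUFFIXES)
    counts = {}
    for a, b in itertools.combinations(images, 2):
        name = RESULT_FILES.get(compare(a, b))
        if name is None:  # 0 (not a duplicate) or an unknown level
            continue
        # Append the pair as one tab-separated line.
        with open(Path(out_dir) / name, "a", encoding="utf-8") as fh:
            fh.write(f"{a}\t{b}\n")
        counts[name] = counts.get(name, 0) + 1
    return counts
```

Calling `scan("Pictures")` from the dedup folder would then compare all image pairs under Pictures. Pairwise comparison grows quadratically with the number of images, so the real scripts may well take a different approach.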

[Image of ScreenShot]