kaishi.image.filters.similar

Class definition for filtering similar images in a dataset.

Module Contents

class kaishi.image.filters.similar.FilterSimilar

Bases: kaishi.core.pipeline_component.PipelineComponent

Filter near-duplicate image files, detected via perceptual hashing (using the imagehash library).
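To illustrate the idea behind perceptual hashing, the following is a minimal pure-Python sketch of an average hash and a Hamming distance. It is not kaishi's implementation and does not use imagehash; the function names `average_hash` and `hamming_distance` and the toy pixel grids are assumptions made for this example only.

```python
def average_hash(pixels):
    """Return a tuple of bits: 1 where a pixel is brighter than the mean, else 0."""
    flat = [value for row in pixels for value in row]
    mean = sum(flat) / len(flat)
    return tuple(1 if value > mean else 0 for value in flat)


def hamming_distance(hash_a, hash_b):
    """Count the positions at which two equal-length hashes differ."""
    if len(hash_a) != len(hash_b):
        raise ValueError("hashes must have the same length")
    return sum(a != b for a, b in zip(hash_a, hash_b))


# Toy 4x4 grayscale "images" (hypothetical data, intensities in 0-255).
original = [
    [10, 20, 200, 210],
    [15, 25, 205, 215],
    [10, 20, 200, 210],
    [15, 25, 205, 215],
]
# Same picture, slightly brightened: the bit pattern stays the same.
brightened = [[value + 5 for value in row] for row in original]
# Mirrored picture: the bright and dark halves swap, so every bit flips.
mirrored = [list(reversed(row)) for row in original]

distance_near = hamming_distance(average_hash(original), average_hash(brightened))
distance_far = hamming_distance(average_hash(original), average_hash(mirrored))
print(distance_near, distance_far)
```

A small distance means the two images look alike even though their bytes differ, which is why a hash comparison can catch near duplicates that an exact file comparison would miss.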

__call__(self, dataset)

Perform the filter operation on the specified dataset.

Parameters

dataset (kaishi.image.dataset.ImageDataset) – dataset to perform the operation on

configure(self, perceptual_hash_threshold=3)

Configure the filter with a perceptual hash threshold.

Parameters

perceptual_hash_threshold (int or float) – threshold for determining whether two images are similar; image pairs whose perceptual hash difference is greater than this value are deemed not similar (default: 3)
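The threshold semantics can be sketched as follows. This is an illustrative assumption about how such a filter might behave, not kaishi's actual code: the helper `filter_similar`, the integer hashes, and the file names are all hypothetical. A pair counts as similar when its Hamming distance is less than or equal to the threshold, and as not similar when the distance is greater.

```python
def filter_similar(hashes, threshold=3):
    """Split names into (kept, filtered) using a Hamming-distance threshold.

    `hashes` maps a file name to an integer hash (hypothetical input format).
    """
    kept, filtered = [], []
    for name, value in hashes.items():
        # XOR leaves a 1 bit wherever the two hashes differ; counting those
        # bits gives the Hamming distance. Distances above the threshold
        # mean "not similar", so only distances <= threshold mark a duplicate.
        is_near_duplicate = any(
            bin(value ^ hashes[other]).count("1") <= threshold for other in kept
        )
        (filtered if is_near_duplicate else kept).append(name)
    return kept, filtered


# Hypothetical 8-bit hashes for four files.
hashes = {
    "a.jpg": 0b11110000,
    "b.jpg": 0b11110001,  # distance 1 from a.jpg
    "c.jpg": 0b00001111,  # distance 8 from a.jpg
    "d.jpg": 0b11111111,  # distance 4 from both a.jpg and c.jpg
}
kept, filtered = filter_similar(hashes, threshold=3)
print(kept, filtered)
```

With a threshold of 3, only `b.jpg` is filtered; raising the threshold to 4 would also filter `d.jpg`. A larger threshold therefore removes more images at the cost of more false matches.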