kaishi.core.filters.duplicate_files

Class definition for filtering duplicate files.

Module Contents

class kaishi.core.filters.duplicate_files.FilterDuplicateFiles

Bases: kaishi.core.pipeline_component.PipelineComponent

Filter duplicate files, detected via hashing.

__call__(self, dataset)
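
The class description says duplicates are detected via hashing. The sketch below illustrates that general idea using only the Python standard library. It is not kaishi's implementation: the helper names `file_digest` and `split_duplicates`, the choice of SHA-256, and the use of a plain list of paths in place of a dataset object are all assumptions made for illustration.

```python
import hashlib


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file's contents (hypothetical helper)."""
    hasher = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in chunks so large files are never loaded into memory at once.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            hasher.update(chunk)
    return hasher.hexdigest()


def split_duplicates(paths):
    """Return (kept, duplicates), keeping the first file seen for each digest.

    Assumption: a list of paths stands in for whatever the real dataset
    object holds; the actual structure is not documented here.
    """
    seen = set()
    kept, duplicates = [], []
    for path in paths:
        digest = file_digest(path)
        if digest in seen:
            duplicates.append(path)
        else:
            seen.add(digest)
            kept.append(path)
    return kept, duplicates
```

Files are compared by content only, so two files with different names but identical bytes count as duplicates, and the first one encountered is the one kept.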