kaishi.core.file_group
Class definition for reading/writing files of various types.
Module Contents
-
class kaishi.core.file_group.FileGroup(recursive: bool)
Class for reading and performing general operations on groups of files.
-
__getitem__(self, key)
Get a specific file object.
-
load_dir(self, source: str, file_initializer, recursive: bool)
Read file names in a directory.
- Parameters
source (str) – Directory to load from
file_initializer (kaishi file initializer class, e.g. kaishi.core.file.File) – Data file class to initialize each file with
recursive (bool) – Whether to also read files from subdirectories
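The difference between a flat and a recursive directory read can be sketched with the standard library alone. This is only an illustration of the idea, not kaishi's implementation; `list_files` is a hypothetical helper and does not exist in kaishi.

```python
import os
import tempfile


def list_files(source: str, recursive: bool = False) -> list:
    """Return sorted file paths under `source`, relative to `source`.

    Hypothetical helper for illustration; not part of kaishi.
    """
    found = []
    if recursive:
        # os.walk visits `source` and every directory below it.
        for root, _dirs, files in os.walk(source):
            for name in files:
                full_path = os.path.join(root, name)
                found.append(os.path.relpath(full_path, source))
    else:
        # Only regular files sitting directly in `source`.
        for name in os.listdir(source):
            if os.path.isfile(os.path.join(source, name)):
                found.append(name)
    return sorted(found)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as demo_dir:
        open(os.path.join(demo_dir, "a.txt"), "w").close()
        os.mkdir(os.path.join(demo_dir, "sub"))
        open(os.path.join(demo_dir, "sub", "b.txt"), "w").close()
        print(list_files(demo_dir, recursive=False))
        print(list_files(demo_dir, recursive=True))
```

In a real FileGroup, each discovered name would presumably be handed to file_initializer to build a file object; that step is omitted here because its details are not documented on this page.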
-
get_pipeline_options(self)
Returns available pipeline options for this dataset.
- Returns
list of uninitialized pipeline component objects
- Return type
list
-
configure_pipeline(self, choices: list = None, verbose: bool = False)
Configures the sequence of components in the data processing pipeline.
- Parameters
choices (list) – list of pipeline choices
verbose (bool) – flag to indicate verbosity
-
file_report(self, max_file_entries=16, max_filter_entries=10)
Show a report of valid and invalid data.
- Parameters
max_file_entries (int) – maximum number of file list entries to print
max_filter_entries (int) – maximum number of entries to print per filter category (e.g. duplicates, similar, etc.)
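How a cap such as max_file_entries or max_filter_entries might shorten a printed list can be sketched as follows. The helper `format_capped` and its output layout are assumptions made for illustration; the actual report format produced by kaishi is not shown on this page.

```python
def format_capped(title: str, entries: list, max_entries: int) -> list:
    """Build report lines for `entries`, showing at most `max_entries` of them.

    Hypothetical helper for illustration; not part of kaishi.
    """
    lines = [f"{title} ({len(entries)} total)"]
    lines.extend(f"  {entry}" for entry in entries[:max_entries])
    hidden = len(entries) - max_entries
    if hidden > 0:
        # Summarize whatever was cut off instead of printing it.
        lines.append(f"  ... {hidden} more")
    return lines


if __name__ == "__main__":
    for line in format_capped("Duplicates", ["a.jpg", "b.jpg", "c.jpg"], 2):
        print(line)
```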
-
run_pipeline(self, verbose: bool = False)
Run the pipeline as configured.
- Parameters
verbose (bool) – flag to indicate verbosity
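The methods above suggest a three-step workflow: list the available components, choose an ordered subset, then run it. The sketch below mimics that pattern with stand-in classes so that it runs without kaishi installed. `SketchGroup`, `DropEmpty`, and `Lowercase` are hypothetical, and selecting components by class name is an assumption, since this page does not say what form `choices` takes.

```python
class DropEmpty:
    """Hypothetical filter component: removes empty names."""

    def __call__(self, items):
        return [item for item in items if item]


class Lowercase:
    """Hypothetical transform component: lowercases every name."""

    def __call__(self, items):
        return [item.lower() for item in items]


class SketchGroup:
    """Stand-in that mirrors the documented method names only."""

    def __init__(self, items):
        self.items = list(items)
        self.pipeline = []

    def __getitem__(self, key):
        return self.items[key]

    def get_pipeline_options(self):
        # Uninitialized components, i.e. classes rather than instances.
        return [DropEmpty, Lowercase]

    def configure_pipeline(self, choices=None, verbose=False):
        # Assumption: choices are class names, applied in the given order.
        options = {cls.__name__: cls for cls in self.get_pipeline_options()}
        self.pipeline = [options[name]() for name in (choices or [])]
        if verbose:
            print("Configured:", [type(c).__name__ for c in self.pipeline])

    def run_pipeline(self, verbose=False):
        for component in self.pipeline:
            if verbose:
                print("Running", type(component).__name__)
            self.items = component(self.items)


if __name__ == "__main__":
    group = SketchGroup(["A.TXT", "", "b.txt"])
    group.configure_pipeline(["DropEmpty", "Lowercase"], verbose=True)
    group.run_pipeline(verbose=True)
    print(group.items)
```

Keeping configuration separate from execution means the same group can be re-run with a different component order without reloading the files.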
-