kaishi.tabular.file_group
Class definition for group of tabular files.
Module Contents

class kaishi.tabular.file_group.TabularFileGroup(source: str, recursive: bool, use_predefined_pipeline: bool = False, out_dir: str = None)
Bases: kaishi.core.file_group.FileGroup
Object containing groups of kaishi.tabular.file.File objects with methods to perform common operations on them.

_get_indexes_with_valid_dataframe(self)
Get a list of indexes with valid dataframes.
- Returns: indexes with valid dataframes
- Return type: list

_get_valid_dataframes(self)
Get a list of valid dataframe objects.
- Returns: valid dataframes
- Return type: list[pandas.core.frame.DataFrame]
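
This reference does not say what makes a dataframe "valid". The sketch below shows one way the two helpers above could relate, under the assumption that a file whose load failed carries `df = None`. `FakeFile`, `indexes_with_valid_dataframe`, and `valid_dataframes` are hypothetical names used only for illustration; they are not part of kaishi.

```python
from dataclasses import dataclass
from typing import Any, List, Optional


@dataclass
class FakeFile:
    """Hypothetical stand-in for a loaded tabular file.

    Assumption: `df` is None when the file could not be read.
    """
    name: str
    df: Optional[Any] = None


def indexes_with_valid_dataframe(files: List[FakeFile]) -> List[int]:
    # Keep the positions of files whose dataframe was loaded.
    return [i for i, f in enumerate(files) if f.df is not None]


def valid_dataframes(files: List[FakeFile]) -> List[Any]:
    # Reuse the index helper so both results always stay in step.
    return [files[i].df for i in indexes_with_valid_dataframe(files)]
```

Building the second helper on the first keeps the index list and the dataframe list aligned, which matters if later steps need to map a dataframe back to its file.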

save(self, out_dir: str, file_format: str = 'csv')
Save the processed dataset as individual files or as one file with all the data.
- Parameters
  - out_dir (str) – The path of the output directory. If the directory does not exist, it will be created.
  - file_format (str) – The format of output files. Currently only supports "csv".
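
The two parameter rules above (create the directory if it is missing, accept only "csv") can be illustrated with a small standard-library sketch. This is not kaishi's implementation; `prepare_output_dir` and `SUPPORTED_FORMATS` are hypothetical names, and raising `ValueError` for an unsupported format is an assumption.

```python
import os

# Assumption: "csv" is the only accepted format, as stated in the parameter docs.
SUPPORTED_FORMATS = ("csv",)


def prepare_output_dir(out_dir: str, file_format: str = "csv") -> str:
    """Validate the format and make sure the output directory exists."""
    if file_format not in SUPPORTED_FORMATS:
        raise ValueError(f"Unsupported file format: {file_format!r}")
    # exist_ok=True makes the call safe when the directory is already there.
    os.makedirs(out_dir, exist_ok=True)
    return out_dir
```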

load_all(self)
Load all files from the source directory.

run_pipeline(self, verbose: bool = False)
Run the pipeline as configured.
- Parameters
  - verbose (bool) – Flag indicating verbosity.

report(self)
Print a report of the dataset in its current state.
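
The public methods above suggest a load, process, report, save workflow. The sketch below strings them together using only the signatures listed in this reference. `process_directory` is a hypothetical helper, not part of kaishi, and the call order is an assumption (for example, this reference does not say whether `load_all` must be called explicitly before `run_pipeline`). The class is passed in as an argument so the helper does not need kaishi at import time.

```python
def process_directory(group_cls, source, out_dir, recursive=False, verbose=False):
    """Run a load -> pipeline -> report -> save pass over a directory.

    `group_cls` is expected to expose the TabularFileGroup interface
    documented above; the ordering of the calls is an assumption.
    """
    # Documented constructor: (source, recursive, use_predefined_pipeline=False, out_dir=None)
    group = group_cls(source, recursive)
    group.load_all()                        # load all files from the source directory
    group.run_pipeline(verbose=verbose)     # run the pipeline as configured
    group.report()                          # print the dataset's current state
    group.save(out_dir, file_format="csv")  # "csv" is the only documented format
    return group


# Usage with the real class (requires kaishi to be installed; paths are placeholders):
#   from kaishi.tabular.file_group import TabularFileGroup
#   process_directory(TabularFileGroup, "path/to/input", "path/to/output")
```

With `use_predefined_pipeline` left at its default of `False`, the pipeline may have no components configured, in which case `run_pipeline` might do little; check the pipeline configuration before relying on this sequence.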