kaishi.image.generator

Data generator for image datasets.

Module Contents

kaishi.image.generator.augment_and_label(imobj)

Augment an image with common issues and return the modified image together with its label vector.

Labels at the output layer (probabilities, no softmax): [DOCUMENT, RECTIFIED, ROTATED_RIGHT, ROTATED_LEFT, UPSIDE_DOWN, STRETCHING]

Parameters

imobj (kaishi.image.file.ImageFile) – image object to randomly augment and label

Returns

augmented image and the label vector describing the augmentations applied

Return type

kaishi.image.file.ImageFile and numpy.array
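
The label layout listed above can be illustrated with a minimal sketch. It does not reproduce kaishi's augmentation logic; it only shows how a six-element label vector in the documented order might be built. The helper make_label_vector is hypothetical, and plain Python lists stand in for numpy.array so the snippet runs with the standard library alone. The note "probabilities, no softmax" suggests that more than one position can be set at once, which is the assumption made here.

```python
# Label order as listed in the documentation for augment_and_label.
LABELS = [
    "DOCUMENT",
    "RECTIFIED",
    "ROTATED_RIGHT",
    "ROTATED_LEFT",
    "UPSIDE_DOWN",
    "STRETCHING",
]


def make_label_vector(active):
    """Return a list of 0.0/1.0 values, one position per entry in LABELS.

    Hypothetical helper, not part of kaishi. `active` is an iterable of
    label names that should be switched on.
    """
    active = set(active)
    unknown = active - set(LABELS)
    if unknown:
        raise ValueError(f"unknown labels: {sorted(unknown)}")
    return [1.0 if name in active else 0.0 for name in LABELS]


# Assumption: positions are independent, so several can be set together.
vector = make_label_vector({"DOCUMENT", "ROTATED_RIGHT"})
print(vector)  # [1.0, 0.0, 1.0, 0.0, 0.0, 0.0]
```

Which labels kaishi actually combines for a given augmentation is not stated in this reference, so the chosen pair is purely illustrative.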

kaishi.image.generator.train_generator(self, batch_size: int = 32, string_to_match: str = None)

Generator for training the data labeler. Operates on a kaishi.image.dataset.ImageDataset object.

Parameters
  • self (kaishi.image.dataset.ImageDataset) – image dataset

  • batch_size (int) – batch size for generated data

  • string_to_match (str) – string to match (ignores files without this string in the relative path)

Returns

batch arrays and label vectors

Return type

numpy.array and list
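
The batching and path filtering described by batch_size and string_to_match can be sketched as follows. This is a simplified illustration under stated assumptions, not kaishi's implementation: the function batched_paths and the sample paths are hypothetical, it yields lists of paths rather than image arrays and labels, and it stops after one pass over the input (whether the real generator loops indefinitely is not stated here).

```python
def batched_paths(paths, batch_size=32, string_to_match=None):
    """Yield lists of at most `batch_size` paths.

    Hypothetical helper, not part of kaishi. Paths that do not contain
    `string_to_match` are skipped when a string is given.
    """
    batch = []
    for path in paths:
        if string_to_match is not None and string_to_match not in path:
            continue  # ignore files without the string in their path
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch


paths = ["train/a.jpg", "val/b.jpg", "train/c.jpg", "train/d.jpg"]
batches = list(batched_paths(paths, batch_size=2, string_to_match="train"))
print(batches)  # [['train/a.jpg', 'train/c.jpg'], ['train/d.jpg']]
```

Writing the loop as a generator keeps only one batch in memory at a time, which is the usual reason for feeding training data this way.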

kaishi.image.generator.generate_validation_data(self, n_examples: int = 400, string_to_match: str = None)

Generate a reproducibly random validation data set.

Parameters
  • self (kaishi.image.dataset.ImageDataset) – image dataset

  • n_examples (int) – number of examples in the validation set

  • string_to_match (str) – string to match (ignores files without this string in the relative path)

Returns

stacked validation examples (first dimension is batch) and stacked labels

Return type

numpy.array and numpy.array
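
One common way to obtain a "reproducibly random" selection is to draw from a random number generator with a fixed seed. The sketch below illustrates that idea only; it is not taken from kaishi, and the function sample_validation_paths, its seed parameter, and the sample paths are hypothetical. It selects file paths rather than producing stacked arrays.

```python
import random


def sample_validation_paths(paths, n_examples=400, string_to_match=None, seed=0):
    """Return a repeatable random selection of paths.

    Hypothetical helper, not part of kaishi. The `seed` argument is an
    assumption made for this sketch.
    """
    candidates = [
        p for p in paths if string_to_match is None or string_to_match in p
    ]
    # A dedicated Random instance avoids touching the global random state.
    rng = random.Random(seed)
    # Never request more items than are available.
    return rng.sample(candidates, min(n_examples, len(candidates)))


paths = [f"scans/img_{i:03d}.jpg" for i in range(50)]
first = sample_validation_paths(paths, n_examples=5, string_to_match="scans")
second = sample_validation_paths(paths, n_examples=5, string_to_match="scans")
print(first == second)  # True: the same seed gives the same selection
```

Fixing the seed means validation metrics from different training runs are computed on the same examples and can be compared directly.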