Nkululeko: how to augment the training set

To augment data with Nkululeko, you can use the augment interface.
In the DATA section of your configuration file, you specify the method and the name of the output file that lists the augmented samples. Two methods are available:

  • augment: the classic augmentation, e.g. cropping the data or adding a bit of noise.
  • random-splice: a special method introduced in this paper that randomly splices and re-connects the audio samples.

For classic augmentation:

[DATA]
# select the samples to augment: either train, test, or all
augment = train
# file name to store the augmented data (can then be added to training)
augment_result = augment.csv

or, if you want to use random-splicing:

[DATA]
# random_splice: select the samples to be random spliced: either train, test, or all
random_splice = train
# random_splice_result: file name to store the random spliced data (can then be added to training)
random_splice_result = random_spliced.csv

and then call the interface:

python -m nkululeko.augment --config myconfig.ini
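
To give an intuition for what random-splicing does, here is a minimal sketch in plain Python. It is not Nkululeko's implementation: the function name random_splice, the n_cuts parameter and the list-of-numbers "signal" are assumptions made for illustration only. The idea shown is simply to cut a signal at random positions and re-connect the pieces in a shuffled order.

```python
import random


def random_splice(samples, n_cuts=3, seed=None):
    """Cut `samples` at random positions and re-join the pieces in shuffled order.

    Hypothetical helper for illustration, not part of Nkululeko.
    """
    rng = random.Random(seed)
    if len(samples) < 2 or n_cuts < 1:
        return list(samples)
    # Choose distinct interior cut points, so that no segment is empty.
    n_cuts = min(n_cuts, len(samples) - 1)
    cuts = sorted(rng.sample(range(1, len(samples)), n_cuts))
    bounds = [0] + cuts + [len(samples)]
    segments = [samples[a:b] for a, b in zip(bounds, bounds[1:])]
    # Re-connect the segments in random order.
    rng.shuffle(segments)
    return [x for segment in segments for x in segment]


signal = list(range(10))  # stand-in for audio samples
spliced = random_splice(signal, n_cuts=3, seed=0)
print(spliced)
```

The spliced signal has the same length and contains the same samples as the original; only the order of the segments changes.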

Currently, apart from random-splicing, Nkululeko simply uses the augmentations that are specified as a demo in the audiomentations documentation, i.e.:

self.audioment = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.015, p=0.5),
    TimeStretch(min_rate=0.8, max_rate=1.25, p=0.5),
    PitchShift(min_semitones=-4, max_semitones=4, p=0.5),
    Shift(min_fraction=-0.5, max_fraction=0.5, p=0.5),
])

Each of these manipulations is applied at random, with a probability of 0.5 (the p parameter), to the samples of your training set.
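
What "applied randomly" means can be sketched in plain Python. The code below is a toy stand-in, not the audiomentations code: compose, add_gaussian_noise and shift are hypothetical functions written for this illustration, the shift is assumed to wrap around, and time stretching and pitch shifting are left out because they need real signal processing. It only shows that each transform in the chain is switched on or off independently with probability p.

```python
import random


def add_gaussian_noise(samples, rng, min_amplitude=0.001, max_amplitude=0.015):
    # Draw one noise amplitude per call, then add scaled Gaussian noise.
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    return [s + rng.gauss(0.0, 1.0) * amplitude for s in samples]


def shift(samples, rng, min_fraction=-0.5, max_fraction=0.5):
    # Roll the signal by a random fraction of its length (wrap-around assumed).
    n = len(samples)
    if n == 0:
        return []
    k = int(round(rng.uniform(min_fraction, max_fraction) * n)) % n
    return samples[-k:] + samples[:-k] if k else list(samples)


def compose(transforms, p=0.5, seed=None):
    """Return a function that applies each transform with probability p."""
    rng = random.Random(seed)

    def apply(samples):
        out = list(samples)
        applied = []
        for transform in transforms:
            if rng.random() < p:  # independent coin flip per transform
                out = transform(out, rng)
                applied.append(transform.__name__)
        return out, applied

    return apply


augment = compose([add_gaussian_noise, shift], p=0.5, seed=1)
signal = [0.0, 0.1, 0.2, 0.3]
augmented, applied = augment(signal)
print(applied, augmented)
```

With p=0.5 and four transforms, as in the Nkululeko snippet above, some samples pass through several manipulations and some through none at all.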

You should find the augmented files in the storage folder inside the result folder of your experiment, where you can listen to them.

Once your augmentations have been processed, you can add them to the training in a new experiment:

[DATA]
databases = ['original data', 'augment']
augment = my_augmentations.csv
augment.type = csv
augment.absolute_path = True
augment.split_strategy = train
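
If you set up many experiments, you might generate this DATA section with Python's standard configparser module instead of typing it by hand. The following is only a sketch: it reuses the keys and values from the example above, and a real Nkululeko configuration needs further sections that are not shown here.

```python
import configparser
import io

config = configparser.ConfigParser()
# Keys and values copied from the DATA section shown above.
config["DATA"] = {
    "databases": "['original data', 'augment']",
    "augment": "my_augmentations.csv",
    "augment.type": "csv",
    "augment.absolute_path": "True",
    "augment.split_strategy": "train",
}

# Write to an in-memory buffer; use open("myconfig.ini", "w") to write a file.
buffer = io.StringIO()
config.write(buffer)
ini_text = buffer.getvalue()
print(ini_text)
```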
