Nkululeko: how to augment the training set

To do data augmentation with Nkululeko, you can use the augment or the aug_train interface.
The difference is that the former only augments samples, whereas the latter augments the training set of a configuration and then immediately performs the training, including the augmented files.

In the AUGMENT section of your configuration file, you specify the augmentation method and the name of the output file list. The available methods are:

traditional: the classic augmentation, e.g. cropping the data or adding a bit of noise. Nkululeko uses the audiomentations package for this.
random_splice: a special method, introduced in this paper, that randomly splices and re-connects the audio samples.
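
To make the idea behind random splicing concrete, here is a minimal sketch in plain Python. It only illustrates the principle (cut the signal at random positions and re-join the pieces in a shuffled order); it is not Nkululeko's implementation, and the function name random_splice as well as the parameters n_cuts and seed are assumptions made for this example.

```python
import random


def random_splice(samples, n_cuts=3, seed=None):
    """Cut `samples` at random positions and re-join the pieces in shuffled order.

    Hypothetical helper for illustration only, not part of Nkululeko.
    """
    rng = random.Random(seed)
    if len(samples) < 2 or n_cuts < 1:
        return list(samples)
    # There are only len(samples) - 1 positions strictly inside the signal
    n_cuts = min(n_cuts, len(samples) - 1)
    # Distinct cut positions in ascending order
    cuts = sorted(rng.sample(range(1, len(samples)), n_cuts))
    bounds = [0] + cuts + [len(samples)]
    segments = [samples[a:b] for a, b in zip(bounds, bounds[1:])]
    # Re-connect the segments in a random order
    rng.shuffle(segments)
    return [x for segment in segments for x in segment]


signal = list(range(10))  # stand-in for audio samples
spliced = random_splice(signal, n_cuts=3, seed=0)
```

The spliced signal has the same length and contains the same samples as the input; only the order of the segments changes.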

[AUGMENT]
# select the samples to augment: either train, test, or all
sample_selection = train
# select the method(s)
augment = ['traditional', 'random_splice']
# file name to store the augmented data (can then be added to training)
result = augmented.csv

and then call the interface:

python -m nkululeko.augment --config myconfig.ini

or

python -m nkululeko.aug_train --config myconfig.ini

if you want to run a training in the same run.
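
If you prefer to drive this from a script, the following sketch reads the AUGMENT section with Python's configparser and builds the two command lines shown above. How Nkululeko parses the options internally is not covered here; using ast.literal_eval for the list value and the helper build_command are assumptions made for this example.

```python
import ast
import configparser
import sys

config_text = """
[AUGMENT]
sample_selection = train
augment = ['traditional', 'random_splice']
result = augmented.csv
"""

config = configparser.ConfigParser()
config.read_string(config_text)

# The list is stored as a string; literal_eval turns it into a Python list
methods = ast.literal_eval(config["AUGMENT"]["augment"])
result_file = config["AUGMENT"]["result"]


def build_command(module, config_path):
    """Return the argument list for `python -m <module> --config <config_path>`."""
    return [sys.executable, "-m", module, "--config", config_path]


augment_cmd = build_command("nkululeko.augment", "myconfig.ini")
aug_train_cmd = build_command("nkululeko.aug_train", "myconfig.ini")

# With Nkululeko installed and myconfig.ini in place, you could run e.g.:
# import subprocess
# subprocess.run(augment_cmd, check=True)
```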

Currently, apart from random-splicing, Nkululeko simply uses the audiomentations module, i.e.:

[AUGMENT]
augment = ['traditional']
augmentations = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.05),
    Shift(p=0.5),
    BandPassFilter(min_center_freq=100.0, max_center_freq=6000),])

These manipulations are applied randomly to your training set.
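
To give a feeling for what "applied randomly" means, here is a rough sketch in plain Python of two such manipulations: adding Gaussian noise with a randomly drawn amplitude, and shifting the signal with a given probability. This is not the audiomentations code; the helper names, the probability parameters p_noise and p_shift, and the rotation-style shift are assumptions made for this example, and the band-pass filter is left out.

```python
import random


def add_gaussian_noise(samples, min_amplitude, max_amplitude, rng):
    """Add zero-mean Gaussian noise, scaled by an amplitude drawn from the given range."""
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    return [x + amplitude * rng.gauss(0.0, 1.0) for x in samples]


def shift(samples, max_fraction, rng):
    """Rotate the signal by a random offset of at most max_fraction of its length."""
    if not samples:
        return []
    limit = int(len(samples) * max_fraction)
    offset = rng.randint(-limit, limit) % len(samples)
    return samples[-offset:] + samples[:-offset] if offset else list(samples)


def augment(samples, rng, p_noise=0.5, p_shift=0.5):
    """Apply each manipulation with its own probability (illustrative sketch only)."""
    out = list(samples)
    if rng.random() < p_noise:
        out = add_gaussian_noise(out, 0.001, 0.05, rng)
    if rng.random() < p_shift:
        out = shift(out, 0.5, rng)
    return out


rng = random.Random(1)
clean = [0.0] * 100  # stand-in for a silent audio clip
noisy = augment(clean, rng, p_noise=1.0, p_shift=1.0)
```

Because each step is applied only with a certain probability, two augmented versions of the same file will usually differ from each other.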

You should find the augmented files in the storage folder inside the result folder of your experiment, where you can listen to them.

Once your augmentations have been processed, you can add them to the training in a new experiment:

[DATA]
databases = ['original data', 'augment']
augment = augmented.csv
augment.type = csv
augment.split_strategy = train

speechsurfer
