To do data augmentation with Nkululeko, you can use the augment or the aug_train interface.
The difference is that the former only augments samples, whereas the latter augments the training set of a configuration and then immediately performs the training, including the augmented files.
In the AUGMENT section of your configuration file, you specify the augmentation method(s) and the name of the output file that lists the augmented samples. Two methods are available:
- traditional: the classic augmentation, e.g. cropping the data or adding a bit of noise. Nkululeko uses the audiomentations package for this.
- random_splice: a special method, introduced in this paper, that randomly splices and re-connects the audio samples.
[AUGMENT]
# select the samples to augment: either train, test, or all
sample_selection = train
# select the method(s)
augment = ['traditional', 'random_splice']
# file name to store the augmented data (can then be added to training)
result = augmented.csv
and then call the interface:
python -m nkululeko.augment --config myconfig.ini
or
python -m nkululeko.aug_train --config myconfig.ini
if you want to run a training in the same run.
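Before launching a run, it can help to check that the AUGMENT section parses as intended. The following is a minimal sketch using only Python's standard library. The helper names read_augment_section and build_command are made up for illustration and are not part of Nkululeko, and the sketch does not claim to mirror how Nkululeko reads the file internally.

```python
import ast
import configparser
import sys

# The AUGMENT section from the example above, inlined for the sketch.
CONFIG_TEXT = """
[AUGMENT]
# select the samples to augment: either train, test, or all
sample_selection = train
augment = ['traditional', 'random_splice']
result = augmented.csv
"""


def read_augment_section(text):
    """Parse the AUGMENT section and return its three settings (hypothetical helper)."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    section = parser["AUGMENT"]
    return {
        "sample_selection": section["sample_selection"],
        # The method list is written as a Python literal, so evaluate it safely.
        "methods": ast.literal_eval(section["augment"]),
        "result": section["result"],
    }


def build_command(module, config_path):
    """Build the command line for one of the two interfaces (hypothetical helper)."""
    return [sys.executable, "-m", f"nkululeko.{module}", "--config", config_path]


settings = read_augment_section(CONFIG_TEXT)
command = build_command("augment", "myconfig.ini")
print(settings)
print(" ".join(command))
```

Passing command to subprocess.run would start the augmentation; use "aug_train" as the module name to train in the same run.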
Currently, apart from random splicing, Nkululeko simply uses the audiomentations module, i.e.:
[AUGMENT]
augment = ['traditional']
augmentations = Compose([
    AddGaussianNoise(min_amplitude=0.001, max_amplitude=0.05),
    Shift(p=0.5),
    BandPassFilter(min_center_freq=100.0, max_center_freq=6000),])
These manipulations are applied randomly to your training set.
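To illustrate what "applied randomly" means for such a chain, here is a toy stand-in written in plain Python. It is not the audiomentations implementation: the functions add_gaussian_noise, shift and compose are simplified, hypothetical re-creations, the band-pass filter is left out for brevity, and the only assumption carried over is that each transform fires independently with its own probability p (as in Shift(p=0.5)).

```python
import math
import random


def add_gaussian_noise(samples, rng, min_amplitude=0.001, max_amplitude=0.05):
    """Add white noise whose amplitude is drawn once per call."""
    amplitude = rng.uniform(min_amplitude, max_amplitude)
    return [x + amplitude * rng.gauss(0.0, 1.0) for x in samples]


def shift(samples, rng, max_fraction=0.5):
    """Rotate the signal by a random fraction of its length."""
    n = len(samples)
    if n == 0:
        return samples
    k = int(rng.uniform(-max_fraction, max_fraction) * n) % n
    return samples[n - k:] + samples[:n - k]


def compose(transforms, rng):
    """Chain (transform, probability) pairs; each one fires independently."""
    def apply(samples):
        for transform, p in transforms:
            if rng.random() < p:
                samples = transform(samples, rng)
        return samples
    return apply


rng = random.Random(42)  # fixed seed so the sketch is repeatable
augment = compose([(add_gaussian_noise, 0.5), (shift, 0.5)], rng)

# A short sine wave stands in for one training sample.
samples = [math.sin(2 * math.pi * 5 * t / 100) for t in range(100)]
augmented = augment(samples)
print(len(samples), len(augmented))
```

Because every transform is drawn separately, one sample may receive noise only, another a shift only, and a third may pass through unchanged.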
For the random_splice method, you can adjust two parameters:
- p_reverse: probability of some samples to be in reverse order (default: 0.3)
- top_db: top dB level for silence to be recognized (default: 12)
This configuration, for example, would distort the samples much more than the default:
[AUGMENT]
augment = ['random_splice']
p_reverse = .8
top_db = 6
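The role of the two parameters can be sketched in plain Python. This is an illustration under stated assumptions, not Nkululeko's code: it assumes that a frame counts as silent when its level lies more than top_db below the loudest frame, that the signal is cut wherever it switches between sound and silence, and that p_reverse is the chance that a segment is played backwards. The names frame_db, split_segments, random_splice, frame_len and seed are hypothetical.

```python
import math
import random


def frame_db(signal, frame_len):
    """Return the RMS level in dB of each non-overlapping frame."""
    levels = []
    for start in range(0, len(signal), frame_len):
        frame = signal[start:start + frame_len]
        rms = math.sqrt(sum(x * x for x in frame) / len(frame))
        levels.append(20 * math.log10(max(rms, 1e-10)))
    return levels


def split_segments(signal, top_db=12, frame_len=4):
    """Cut the signal wherever it switches between sound and silence."""
    if not signal:
        return []
    levels = frame_db(signal, frame_len)
    threshold = max(levels) - top_db  # silence is relative to the loudest frame
    segments, current, current_is_sound = [], [], None
    for i, level in enumerate(levels):
        is_sound = level >= threshold
        if current and is_sound != current_is_sound:
            segments.append(current)
            current = []
        current.extend(signal[i * frame_len:(i + 1) * frame_len])
        current_is_sound = is_sound
    segments.append(current)
    return segments


def random_splice(signal, p_reverse=0.3, top_db=12, frame_len=4, seed=None):
    """Shuffle the segments and reverse each one with probability p_reverse."""
    rng = random.Random(seed)
    segments = split_segments(signal, top_db, frame_len)
    rng.shuffle(segments)
    out = []
    for segment in segments:
        if rng.random() < p_reverse:
            segment = segment[::-1]
        out.extend(segment)
    return out


# Toy signal: silence, a quiet burst, silence, a loud burst.
signal = [0.0] * 8 + [0.5, -0.5] * 4 + [0.0] * 8 + [0.9, -0.9] * 4
spliced = random_splice(signal, p_reverse=0.8, top_db=6, seed=0)
print(len(split_segments(signal)), len(spliced))
```

The output keeps every input sample and only changes the order, which is why the result still sounds like the original speaker while the content is scrambled.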
You should find the augmented files in the storage folder inside the result folder of your experiment, where you can listen to them.
Once your augmentations have been processed, you can add them to the training in a new experiment (my_augmentations.csv stands for the file name you set as result in the AUGMENT section):
[DATA]
databases = ['original data', 'augment']
augment = my_augmentations.csv
augment.type = csv
augment.split_strategy = train