Augmenting data

Often (in fact almost always) there is too little labeled training data for supervised learning.

One way to tackle this is representation learning, which can be done in a self-supervised fashion.

Another approach is to multiply your labeled training data by adding slightly altered versions of it that do not change the information you want to detect, for example by adding noise to the data or clipping it. This is called augmentation, and here is a post on how to do this with nkululeko.
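To make the idea concrete, here is a minimal sketch in plain NumPy. It is not how nkululeko does it; the function names, the 20 dB signal-to-noise ratio, the 0.3 threshold and the "happy" label are all assumptions for illustration, and "clipping" is read here as limiting the amplitude. The point is that each altered copy keeps the label of the original sample.

```python
import numpy as np


def add_noise(signal, snr_db, rng):
    """Return a copy with white Gaussian noise at the given SNR in dB (hypothetical helper)."""
    signal_power = np.mean(signal ** 2)
    noise_power = signal_power / (10 ** (snr_db / 10))
    noise = rng.normal(0.0, np.sqrt(noise_power), size=signal.shape)
    return signal + noise


def clip_amplitude(signal, threshold):
    """Return a copy whose amplitude is limited to [-threshold, threshold] (hypothetical helper)."""
    return np.clip(signal, -threshold, threshold)


rng = np.random.default_rng(0)

# Stand-in for one labeled recording: one second of a 440 Hz tone at 16 kHz.
sample_rate = 16000
t = np.arange(sample_rate) / sample_rate
clean = 0.5 * np.sin(2 * np.pi * 440 * t)
label = "happy"  # made-up label

# The augmented copies inherit the label of the original sample.
augmented = [
    (add_noise(clean, snr_db=20, rng=rng), label),
    (clip_amplitude(clean, threshold=0.3), label),
]
```

How strong the alteration may be depends on the task: the noise level or clipping threshold should stay low enough that a human would still assign the same label.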

A third way is to synthesize data based on the labeled training data, for example with GANs, VAEs or rule-based simulation. Here one can distinguish whether only a parameterized form of the samples (i.e. the features) or whole audio files are generated.

Sometimes samples are needed only for a rare class. In this case, techniques like random oversampling (ROS), the Synthetic Minority Oversampling Technique (SMOTE) or Adaptive Synthetic sampling (ADASYN) can be used.
Here is a post on how to do this with nkululeko.
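As an illustration of the simplest of these techniques, here is a minimal sketch of random oversampling on a feature matrix, written in plain NumPy. It is an assumption-laden toy, not nkululeko's implementation: the function name and the example data are made up. Rows of the smaller classes are duplicated at random until every class is as large as the biggest one.

```python
import numpy as np


def random_oversample(features, labels, rng):
    """Duplicate randomly chosen rows until all classes have equal size (hypothetical helper)."""
    classes, counts = np.unique(labels, return_counts=True)
    target = counts.max()
    selected = []
    for cls, count in zip(classes, counts):
        cls_idx = np.flatnonzero(labels == cls)
        # Draw the missing number of rows with replacement from this class.
        picks = rng.integers(0, len(cls_idx), size=target - count)
        selected.append(np.concatenate([cls_idx, cls_idx[picks]]))
    idx = np.concatenate(selected)
    return features[idx], labels[idx]


rng = np.random.default_rng(0)

# Toy data: four samples of class 0, two samples of the rare class 1.
X = np.arange(12, dtype=float).reshape(6, 2)
y = np.array([0, 0, 0, 0, 1, 1])

X_balanced, y_balanced = random_oversample(X, y, rng)
```

SMOTE and ADASYN differ from this in that they do not duplicate rows but create new feature vectors by interpolating between neighboring minority samples. Whichever technique is used, it should be applied to the training split only, so that no copies of a sample end up in the test data.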

speechsurfer
