Nkululeko: How to import a database

Nkululeko is a tool to ease machine learning on speech databases.
This tutorial should help you to import databases.
There are two formats upported:
1) csv (comma seperated values)
2) audformat

CSV format

The easiest is CSV, you simply create a table with the following informations:

  • file: the path to the audio file
  • speaker: a speaker identifier
  • sex: the biological sex (has quite an influence on the voice, so sometimes submodeling makes senss)
  • task: is the speaker characteristics value that you want to explore, e.g. age or emotion.

and then fill it with values of your database.
So a file for emotion might look like this

file, speaker, sex, emotion
<path to>/s12343.wav, s1, female, happy
...

You can then specify the data in your initialization file like this:

[DATA]
databases = ['my_db']
my_db.type = csv
my_db = <path to>/my_data_file.csv
...
target = emotion

You can not specify split tables with this format, but would have to simply split the file in several databases.

audformat

audformat allows for many usecases, so the specification might be more complex.
So in the easiest case you have a database with two tables, one called files that contains the speaker informations (id and sex) and one called like your task (aka target), so for example age or emotion.
That's the case for our demo example, the Berlin EmoDB, ando so you can include it simply with.

[DATA]
databases = ['emodb']
emodb = /<path to>/emodb/
target = emotion
...

But if there are more tables and they have special names, you can specifiy them like this:

[DATA]
databases = ['msp']
# path to data
msp = /<path to>/msppodcast/
# tables with speaker information
msp.files_tables =  ['files.test-1', 'files.train']
# tables with task labels
msp.target_tables =  ['emotion.test-1', 'emotion.train']
# train and evaluation splits will be provided
msp.split_strategy = specified
# here are the test/evaluatoin split tables
msp.test_tables = ['emotion.test-1']
# here are the training tables
msp.train_tables = ['emotion.train']
target = emotion

Nkululeko: classifying continuous variables

Nkululeko supports classification and regression.
Classification means predicting a class (or category) from data, regression predicting a continuous value, as for example the speaker age in years.

If you want to use classification with continuous variables, you need to first bin it, which means that you put the values into pre-defined bins. To stay with our age example, you'd declare everyone above 50 years as old and all other as young.

This post shows you how to do that with Nkululeko by setting up your .ini file.

You set up the experiment as classification type:

[EXP]
...
type = classification

But declare the data to be continuous:

[DATA]
...
type = continuous
labels = ['u40', '40ies', '50ies', '60ies', 'ΓΌ70']
bins  = [-1000,  40, 50, 60, 70, 1000]

Then the data will be binned according to the sepecified bins and labeled accordingly.
You need (number of labels) + 1 values for the bins, as they are given lower and upper limit. It makes sense to set the lower and upper absolute limits extreme as you don't know what the classifier will predict.

How to soft-label a database with Nkululeko

Soft-labeling means to annotate data with labels that were predicited by a machine classifier.
As they were not evaluated by a human, you might call them "soft".

Two steps are necessary:
1) save a test/evaluation set as a new database
2) load this new database in a new experiment as training data

Within nkululeko, you would do it like this:

step 1: save a new database

You simply specifify a name in the EXP section to save the test predictions like this:

[EXP]
...
save_test = ./my_test_predictions.csv

You need Model save to be turned on, because it will look for the best performing model:

[MODEL]
...
save = True

This will store the new database to a file called my_test_predictions.csv in the folder where the python was called.

step 2: load as training data

Here is an example configuration how to load this data as additional training (in this case in addition to emodb):

[EXP]
root = ./tests/
name = exp_add_learned
runs = 1
epochs = 1
[DATA]
strategy = cross_data
databases = ['emodb', 'learned', 'test_db']
trains = ['emodb', 'learned']
tests = ['test_db']
emodb = /path-to-emodb/
emodb.split_strategy = speaker_split
emodb.mapping = {'anger':'angry', 'happiness':'happy', 'sadness':'sad', 'neutral':'neutral'}
test_db = /path to test database/
test_db.mapping = <if any mapping to the target categories is needed>
learned = ./my_test_predictions.csv
learned.type = csv
target = emotion
labels = ['angry', 'happy', 'neutral', 'sad']
[FEATS]
type = os
[MODEL]
type = xgb
save = True
[PLOT]

Nkululeko: try out / demo a trained model

This is another Nkululeko post that shows you how to demo model that you trained before.

First you need to train a model, e.g. on emodb as shown here

The difference is, that you need to set the parameters

[EXP]
...
save = True

in the general section and

[MODEL]
...
save = True

in the MODEL section of the configuration file.
Background: we need both the experiment as well as all model files to be saved on disk.

Here is an example script theen how to call the demo mode:

# set the path to the nkululeko sources
import sys
sys.path.append("./src")
# import classes you need from there
import experiment as exp
import configparser
from util import Util

def main(config_file):
    # load the configuration that created the experiment
    config = configparser.ConfigParser()
    config.read(config_file)
    util = Util()
    # set up the experiment
    expr = exp.Experiment(config)
    print(f'running {expr.name}')
    # load the experiment from the name that is suggested
    expr.load(f'{util.get_save_name()}')
    # run the demo
    expr.demo()

if __name__ == "__main__":
    # example call with a configuration file
    main('tests/exp_emodb.ini')

Automatically the best performing model will be used.