Nkululeko: how to perform cross-database experiments

This is one of a series of posts about how to use nkululeko.
If you're unfamiliar with nkululeko, you might want to start here.

This post is about cross-database experiments, i.e. training a classifier on one database and testing it on another, something that happens quite often in real-life situations.

In this post I will only talk about the config file; the Python file can be re-used.

I'll walk you through the sections of the config file (all options here).
The first section deals with the general setup:

[EXP]
# root is the base directory for the experiment relative to the python call
root = ./experiment_1/
# mainly a name for the top folder to store results (inside root)
name = cross_data

Next comes the DATA section, which in this case is more complex than usual:

[DATA]
# list all databases
databases = ['polish', 'emodb']
# use the cross_data strategy (as opposed to train_test)
strategy = cross_data
# state which databases to use for training
trains = ['emodb']
# state which databases to use for testing
tests = ['polish']
# what is the target label?
target = emotion
# what are the category names?
labels = ['neutral', 'happy', 'sad', 'angry', 'fright.']
# for each database:
# where is it?
polish = PATH/polish-emotional-speech
# map the database's categories to a common set
polish.mapping = {'anger':'angry', 'joy':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
# plot the distribution of categories
polish.value_counts = True
# and for the second database:
emodb = PATH/emodb
emodb.mapping = {'anger':'angry', 'happiness':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
emodb.value_counts = True
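
Since both databases have to end up with the same category names, it can be worth checking the mappings before running the experiment. The following is a minimal sketch in plain Python, not part of nkululeko: the dictionaries are copied from the DATA section above, and the helper names unmapped_targets and missing_labels are made up for this illustration.

```python
# Values copied from the [DATA] section above.
labels = ['neutral', 'happy', 'sad', 'angry', 'fright.']
mappings = {
    'polish': {'anger': 'angry', 'joy': 'happy', 'sadness': 'sad',
               'fear': 'fright.', 'neutral': 'neutral'},
    'emodb': {'anger': 'angry', 'happiness': 'happy', 'sadness': 'sad',
              'fear': 'fright.', 'neutral': 'neutral'},
}


def unmapped_targets(mapping, labels):
    """Hypothetical helper: mapped names that are not in the label set."""
    return sorted(set(mapping.values()) - set(labels))


def missing_labels(mapping, labels):
    """Hypothetical helper: labels that no original category maps to."""
    return sorted(set(labels) - set(mapping.values()))


for name, mapping in mappings.items():
    # Two empty lists mean the database covers exactly the common label set.
    print(name, unmapped_targets(mapping, labels), missing_labels(mapping, labels))
```

If either list is non-empty for a database, its mapping and the labels entry disagree, for example because of a typo such as 'fright' instead of 'fright.'.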

The features section, better explained in this post:

[FEATS]
type = os

The classifiers section, better explained in this post:

[MODEL]
type = xgb

Again, you might want to plot the final distribution of categories per train and test set:

[PLOT]
value_counts = True
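
To see how the sections fit together, here is a small sketch that reads a shortened version of the config with Python's standard configparser module and checks that the train and test databases are consistent. This is only an illustration and makes no claim about how nkululeko parses the file internally; the database paths and mappings are left out for brevity, and check_split is a hypothetical helper.

```python
import ast
import configparser

# Shortened copy of the config from this post (paths and mappings omitted).
CONFIG_TEXT = """
[EXP]
root = ./experiment_1/
name = cross_data

[DATA]
databases = ['polish', 'emodb']
strategy = cross_data
trains = ['emodb']
tests = ['polish']
target = emotion
labels = ['neutral', 'happy', 'sad', 'angry', 'fright.']

[FEATS]
type = os

[MODEL]
type = xgb

[PLOT]
value_counts = True
"""

config = configparser.ConfigParser()
config.read_string(CONFIG_TEXT)

# The list values are stored as strings, so evaluate them as Python literals.
databases = ast.literal_eval(config['DATA']['databases'])
trains = ast.literal_eval(config['DATA']['trains'])
tests = ast.literal_eval(config['DATA']['tests'])


def check_split(databases, trains, tests):
    """Hypothetical helper: list problems with the train/test assignment."""
    problems = []
    for name in trains + tests:
        if name not in databases:
            problems.append(f"{name} is not listed in databases")
    for name in sorted(set(trains) & set(tests)):
        problems.append(f"{name} is used for both training and testing")
    return problems


print(config['DATA']['strategy'], trains, tests, check_split(databases, trains, tests))
```

An empty problem list means that every database named in trains and tests is declared in databases and that no database is used on both sides of the split.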
