This is one of a series of posts about how to use nkululeko.
If you're unfamiliar with nkululeko, you might want to start here.
This post is about cross-database experiments, i.e. training a classifier on one database and testing it on another, something that happens quite often in real-life situations.
In this post I will only talk about the config file; the python file can be re-used.
I'll walk you through the sections of the config file (all options here):
The first section deals with general setup:
[EXP]
# root is the base directory for the experiment relative to the python call
root = ./experiment_1/
# mainly a name for the top folder to store results (inside root)
name = cross_data
Next, the DATA section is in this case more complex than usual:
[DATA]
# list all databases
databases = ['polish', 'emodb']
# strategy as opposed to train_test
strategy = cross_data
# state which databases to use for training
trains = ['emodb']
# state which databases to use as a test
tests = ['polish']
# what is the target label?
target = emotion
# what are the category names?
labels = ['neutral', 'happy', 'sad', 'angry', 'fright.']
# for each database:
# where is it?
polish = PATH/polish-emotional-speech
# map each database's categories to a common set
polish.mapping = {'anger':'angry', 'joy':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
# plot the distribution of categories
polish.value_counts = True
# and for the second database:
emodb = PATH/emodb
emodb.mapping = {'anger':'angry', 'happiness':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
emodb.value_counts = True
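Since the mappings have to agree with the labels list, a quick sanity check before running the experiment can save some time. The following is only a minimal sketch, not part of nkululeko: it assumes the config can be read with Python's standard configparser and that the list and dictionary values are valid Python literals. The helper name check_data_section is made up for this illustration.

```python
import ast
import configparser

# The [DATA] section from above, with PATH left as a placeholder.
CONFIG_TEXT = """
[DATA]
databases = ['polish', 'emodb']
strategy = cross_data
trains = ['emodb']
tests = ['polish']
target = emotion
labels = ['neutral', 'happy', 'sad', 'angry', 'fright.']
polish = PATH/polish-emotional-speech
polish.mapping = {'anger':'angry', 'joy':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
emodb = PATH/emodb
emodb.mapping = {'anger':'angry', 'happiness':'happy', 'sadness':'sad', 'fear':'fright.', 'neutral':'neutral'}
"""


def check_data_section(config_text):
    """Hypothetical helper: return a list of problems found in [DATA]."""
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(config_text)
    data = parser["DATA"]

    # List and dict values are written as Python literals in the config.
    databases = ast.literal_eval(data["databases"])
    labels = set(ast.literal_eval(data["labels"]))

    problems = []
    # Every train/test database must be declared in 'databases'.
    for key in ("trains", "tests"):
        for name in ast.literal_eval(data[key]):
            if name not in databases:
                problems.append(f"{key}: '{name}' is not listed in databases")

    # Every mapping must end up in one of the common labels.
    for name in databases:
        mapping = ast.literal_eval(data.get(f"{name}.mapping", "{}"))
        unknown = set(mapping.values()) - labels
        if unknown:
            problems.append(
                f"{name}.mapping: targets {sorted(unknown)} are not in labels"
            )
    return problems


problems = check_data_section(CONFIG_TEXT)
print(problems or "DATA section looks consistent")
```

If a mapping pointed to a category that is missing from labels, or if trains named a database that is not in databases, the returned list would contain one message per problem.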
The features section, explained in more detail in this post:
[FEATS]
type = os
The classifiers section, explained in more detail in this post:
[MODEL]
type = xgb
Again, you might want to plot the final distribution of categories per train and test set:
[PLOT]
value_counts = True