
Nkululeko: how to compare classifiers, features and databases using multiple runs

Since version 0.98, nkululeko offers functionality to compare the outcomes of several runs across experiments.

Say you would like to know whether the difference between using acoustic features (opensmile) and linguistic embeddings (bert) as input to some classifier is significant. You could then use the outcomes of several runs of one MLP (multi-layer perceptron) as samples that represent all possible runs (disclaimer: as far as I know, this approach is disputed by some statisticians).

You would set up your experiment like this:

[EXP]
...
runs = 10
epochs = 100
[FEATS]
type = ['bert']
#type = ['os']
#type = ['os', 'bert']
[MODEL]
type = mlp
...
patience = 5
[EXPL]
# turn on extensive statistical output
print_stats = True
[PLOT]
runs_compare = features

and run it three times, each time changing the feature type that is used (bert, os, or the combination of both), so that in the end your results folder contains three different run_results text files.
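
If you want to inspect the numbers yourself, a few lines of Python are enough to aggregate them. This is just a sketch: the file names and the assumption that each run_results file holds one result value per line are hypothetical.

from pathlib import Path

def load_runs(path):
    # assumption: one result value (e.g. UAR) per line in the run_results file
    return [float(v) for v in Path(path).read_text().split()]

results = {
    "bert": load_runs("results/run_results_bert.txt"),       # hypothetical file names
    "os": load_runs("results/run_results_os.txt"),
    "os+bert": load_runs("results/run_results_os_bert.txt"),
}
for name, vals in results.items():
    print(f"{name}: mean {sum(vals) / len(vals):.3f} over {len(vals)} runs")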

From these run results, nkululeko prints a plot that compares the three feature sets; here's an example (using only 5 runs):

The title states the overall significance for all differences, as well as the largest one among the pair-wise comparisons. If your run number is larger than 30, t-tests are used instead of Mann-Whitney U tests.
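
To make the choice of test concrete, here is a minimal sketch of such a pairwise comparison with scipy; this is not nkululeko's actual code, and the result lists are made up for illustration.

from scipy import stats

def compare_runs(a, b):
    # with more than 30 runs per condition use an independent t-test,
    # otherwise a Mann-Whitney U test
    if min(len(a), len(b)) > 30:
        return "t-test", stats.ttest_ind(a, b).pvalue
    return "Mann-Whitney U", stats.mannwhitneyu(a, b).pvalue

bert = [0.61, 0.63, 0.60, 0.64, 0.62]      # hypothetical per-run results
os_feats = [0.58, 0.57, 0.60, 0.59, 0.58]
test, p = compare_runs(bert, os_feats)
print(f"{test}: p = {p:.4f}")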