Since version 0.27.0, Nkululeko has a concept of a test set, in addition to the train and dev sets.
Let's recap the concept of train/dev/test splits:
- train is used to train a supervised model
- dev is used to evaluate this model during development, i.e. to know when it is a good model (one that doesn't overfit)
- test is a set to be used ONLY once: to measure how the final model performs in real use. If you used the test set as a dev set, you couldn't be sure that you weren't overfitting again, because you would have used it to adjust the meta-parameters of your model.
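To make the three roles concrete, here is a minimal sketch of a three-way split in plain Python. It is not how Nkululeko splits data internally; the function name `split_three_way`, the fractions and the seed are assumptions chosen for illustration.

```python
import random


def split_three_way(items, dev_frac=0.1, test_frac=0.1, seed=42):
    """Shuffle items and cut them into disjoint train, dev and test lists."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_frac)
    n_dev = int(len(shuffled) * dev_frac)
    test = shuffled[:n_test]                 # held back, used only once
    dev = shuffled[n_test:n_test + n_dev]    # used to tune the model
    train = shuffled[n_test + n_dev:]        # used to fit the model
    return train, dev, test


train, dev, test = split_three_way(range(100))
print(len(train), len(dev), len(test))
```

The point is that the three lists never share a sample, so a score on the test list is not influenced by any tuning done on the dev list.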
So, in order to evaluate a third dataset (besides train and dev), you set a label_data entry in the [DATA] section of the configuration like so:
[DATA]
...
label_data = emovo
label_result = my_label_result.csv
and then run the experiment.
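If you want to double-check the two entries before running, a small sketch like the following can help. It assumes the configuration is a standard INI file that Python's `configparser` can read; the helper `read_label_settings` is hypothetical and not part of Nkululeko.

```python
import configparser

# Only the two entries discussed above; a real configuration has more.
CONFIG_TEXT = """
[DATA]
label_data = emovo
label_result = my_label_result.csv
"""


def read_label_settings(text):
    """Return (label_data, label_result) from the [DATA] section, or None if unset."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    data = parser["DATA"]
    return data.get("label_data"), data.get("label_result")


label_data, label_result = read_label_settings(CONFIG_TEXT)
print(label_data, label_result)
```

A missing entry comes back as `None`, which makes it easy to spot a typo in the key name.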