Since version 0.27.0, Nkululeko has the concept of a test set, in addition to the train and dev sets.
Let's recap the concept of train/dev/test splits:
- train is used to train a supervised model
- dev is a set used to evaluate this model, i.e. to know when it is a good model (one that doesn't overfit)
- test is a set to be used ONLY once: for the real use of the model. If you used the test set like a dev set, you couldn't be sure that you aren't overfitting again, because the dev set is what you use to adjust the meta parameters of your model.
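As a rough illustration of the three roles above, here is a minimal sketch that cuts a list of samples into disjoint train, dev and test parts. It uses only the Python standard library and is not how Nkululeko splits data internally; the function name `split_three_ways`, the fractions and the file names are made up for this example.

```python
import random


def split_three_ways(items, dev_fraction=0.2, test_fraction=0.2, seed=42):
    """Shuffle items and cut them into disjoint train, dev and test lists."""
    shuffled = list(items)
    # A fixed seed keeps the split reproducible between runs.
    random.Random(seed).shuffle(shuffled)
    n_test = int(len(shuffled) * test_fraction)
    n_dev = int(len(shuffled) * dev_fraction)
    test = shuffled[:n_test]                # touched only once, at the very end
    dev = shuffled[n_test:n_test + n_dev]   # used to tune the meta parameters
    train = shuffled[n_test + n_dev:]       # used to fit the model
    return train, dev, test


# Hypothetical file names, for illustration only.
samples = [f"sample_{i:03d}.wav" for i in range(100)]
train, dev, test = split_three_ways(samples)
print(len(train), len(dev), len(test))
```

The point of the sketch is that the three parts never overlap, so a result on the test part is not influenced by any tuning done on the dev part.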
So, in order to evaluate on a third dataset (besides train and dev), you set a tests entry in the [DATA] section of the configuration like so:
[DATA]
tests = ['my_testdb']
my_testdb = /mypath/my_testdb
...
and then call Nkululeko's test module:
python -m nkululeko.test --config myconfig.ini --outfile myresults.csv
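If you want to double-check the [DATA] entries before running the test module, something like the following sketch could help. It reads the example values from above with Python's standard `configparser` and resolves each name in `tests` to its path. This is an assumption about how such a check might look, not Nkululeko's own config loading code, and the config text is inlined here only to keep the example self-contained.

```python
import ast
import configparser

# The example [DATA] section from above, inlined for illustration.
CONFIG_TEXT = """
[DATA]
tests = ['my_testdb']
my_testdb = /mypath/my_testdb
"""

parser = configparser.ConfigParser()
parser.read_string(CONFIG_TEXT)

# The tests value is written as a Python-style list, so parse it safely.
test_names = ast.literal_eval(parser["DATA"]["tests"])

# Every name listed in tests should have its own path entry in [DATA].
test_paths = {name: parser["DATA"][name] for name in test_names}
print(test_paths)
```

A missing path entry for a listed test database would raise a `KeyError` here, which is an early hint that the configuration is incomplete.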