Nkululeko is a framework to build machine learning models that recognize speaker characteristics.
This post is meant to help you set up your first experiment, based on the Berlin emodb database.
1) Set up Python
Nkululeko is written in Python, so first you have to set up a Python environment.
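If you prefer to create the environment from Python itself rather than from the shell, the standard library's venv module can do it. The following is only a minimal sketch: the function name `create_environment` and the location you pass to it are my own choices for illustration, not something nkululeko prescribes.

```python
import venv
from pathlib import Path


def create_environment(env_dir, with_pip=True):
    """Create a virtual environment in env_dir and return its path."""
    env_path = Path(env_dir).expanduser()
    # with_pip=True bootstraps pip into the new environment,
    # which is needed later to install the required packages.
    venv.create(env_path, with_pip=with_pip)
    return env_path
```

Calling `create_environment("~/nkululeko_venv")` (a hypothetical location) does the same as running `python -m venv` on that folder. You still have to activate the environment in your shell afterwards.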
2) Get a database
Load the Berlin emodb database to some location on your hard drive, as discussed in this post. I will refer to the location as "emodb root" from now on.
3) Download nkululeko
Navigate with a browser to the nkululeko GitHub page, click on the "Code" button, and download the zip or (better) clone the repository with your git software (step 1).
Unpack it (if you downloaded the zip file) to some location on your hard disk that I will call "nkululeko root" from now on.
4) Install the required Python packages
Make sure you are inside the virtual environment that you created!
Then navigate with a shell to the nkululeko root and install the Python packages needed by nkululeko with
pip install -r requirements.txt
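If you are not sure whether the virtual environment is really active, you can ask the Python interpreter. This is a small sketch with a function name I made up (`in_virtual_environment`); it relies on the fact that, inside an environment created with Python's venv module, `sys.prefix` differs from `sys.base_prefix`.

```python
import sys


def in_virtual_environment():
    """Return True if the running interpreter belongs to a virtual environment."""
    # Inside a venv, sys.prefix points to the environment, while
    # sys.base_prefix points to the Python installation it was created from.
    return sys.prefix != sys.base_prefix
```

If this returns False, the packages would be installed into your base Python installation instead of the environment. Note that the check is meant for venv-style environments; other environment managers may behave differently.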
5) Adapt the ini file
Use your favourite editor, e.g. Visual Studio Code, and open the nkululeko root. If you use Visual Studio Code, set the path to the environment as the Python interpreter path and store this (nkululeko root and Python environment path) as a workspace configuration, so next time you can simply open the workspace and you're set up.
Open the exp_emodb.ini file and put your nkululeko root as the root value, for me this looks like this:
root = /home/felix/data/research/nkululeko/
Put the emodb root folder as the emodb value, for me this looks like this:
emodb = /home/felix/data/audb/emodb
An overview of all nkululeko options should be here
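A typo in one of the two paths is an easy mistake to make, so it can help to check them before running anything. The sketch below is my own helper, not part of nkululeko: it reads the ini file with Python's configparser, looks for the `root` and `emodb` keys in any section (so it makes no assumption about section names), and reports whether each value is an existing folder.

```python
import configparser
from pathlib import Path


def check_ini_paths(ini_file, keys=("root", "emodb")):
    """Return {key: (value, exists)} for each key found in any section of ini_file."""
    # interpolation=None keeps '%' characters in paths from being interpreted
    config = configparser.ConfigParser(interpolation=None)
    with open(ini_file, encoding="utf-8") as fp:
        config.read_file(fp)
    report = {}
    for section in config.sections():
        for key in keys:
            # keep only the first occurrence of each key
            if key in config[section] and key not in report:
                value = config[section][key]
                report[key] = (value, Path(value).expanduser().is_dir())
    return report
```

Calling `check_ini_paths("exp_emodb.ini")` from the nkululeko root returns a dictionary; any entry whose second element is False points to a folder that does not exist on your machine.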
6) Run the experiment
Inside a shell (or in VSC), start the process with
python exp_emodb.py exp_emodb.ini
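If you would rather launch the run from a Python script, the same command can be started with the subprocess module. This is just a sketch under my own assumptions: the function names are made up, and the script is started with the nkululeko root as working directory, using the same interpreter that runs the wrapper (so an active virtual environment stays in effect).

```python
import subprocess
import sys


def build_command(script="exp_emodb.py", ini_file="exp_emodb.ini"):
    """Build the command line, using the interpreter that runs this code."""
    return [sys.executable, script, ini_file]


def run_experiment(nkululeko_root, ini_file="exp_emodb.ini"):
    """Run the experiment with nkululeko_root as working directory; return the exit code."""
    completed = subprocess.run(build_command(ini_file=ini_file), cwd=nkululeko_root)
    return completed.returncode
```

An exit code of 0 from `run_experiment` means the script finished without raising an error; anything else is worth a look at the console output.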
7) Inspect the results
If all goes well, the program should start by extracting openSMILE features and, once it is done, you should be able to inspect the results in the folder named like the experiment:
There should be a subfolder with a confusion matrix named
and a subfolder for the textual results named `