Category Archives: tutorial

Nkululeko: show feature importance

20. February 2023 felix

Since version 0.40, Nkululeko can show the X most important acoustic features according to a given model.

There is a new section called EXPL (short for exploration), where you can state

[EXPL]
model = tree
sample_num = 15

in your config file, and then run the exploration module like this:

python -m nkululeko.explore --config my_config.ini

The resulting list then appears in the result folder, and a bar plot image in the image folder.

Afterwards you could inspect single features as described here

nkululeko, tutorial

Nkululeko: how to plot distributions of feature values

16. February 2023 felix

As shown in this post, with Nkululeko you can select only specific features from your feature sets by specifying them in the [FEATS] section:

[FEATS]
features = ['JitterPCA', 'meanF0Hz', 'hld_sylRate']

You can also plot them per category (only for classification) by specifying in the EXPL section whether you would like that for all samples or only for the test or train samples:

[EXPL]
# turn it on
feature_distributions = True 
# use only training samples
sample_selection = train 
# only plot the 5 most important features 
max_feats = 5

You would have to call nkululeko with the explore interface:

python -m nkululeko.explore --config <myConfig.ini>

The image file is written to the image folder.

nkululeko, tutorial

Nkululeko: how to predict many samples

9. February 2023 felix

There are three ways to predict a number of samples:

If you want to save the predictions of an experiment for later use, you can do so by stating in the EXP section
```
[EXP]
save_test = ./my_saved_test_predictions.csv
```
The output format is CSV (comma-separated values).
Alternatively, you can test an existing database against the best model you trained before, by stating the databases as tests in the DATA section:
```
[DATA]
tests = ['my_testdb']
my_testdb = /mypath/my_testdb
...
```
and then calling Nkululeko's test module
```
python -m nkululeko.test --config myconfig.ini --outfile myresults.csv
```

Finally, you can simply run the demo module for a set of files:

python -m nkululeko.demo --config myconfig.ini --list my_filelist.txt
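
The file list for the demo module can be generated with a few lines of Python. The sketch below assumes that the list is a plain text file with one audio path per line, which the text above does not confirm; the folder name my_audio and the helper write_file_list are hypothetical.

```python
from pathlib import Path


def write_file_list(audio_dir, list_path, pattern="*.wav"):
    """Write all files in audio_dir matching pattern to list_path.

    Assumption: the demo module accepts a plain text file
    with one audio path per line.
    """
    files = sorted(str(p) for p in Path(audio_dir).glob(pattern))
    content = ("\n".join(files) + "\n") if files else ""
    Path(list_path).write_text(content)
    return files
```

Calling write_file_list("my_audio", "my_filelist.txt") would then produce the list file that is passed to --list.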

code, tutorial

Packaging with python

25. January 2023 felix

Write your

setup.cfg
setup.py
pyproject.toml

like in the Nkululeko project.
When you have a new version, you can test it with

pip install .

and, when you are happy with it,

python -m build

generates the dist folder, and

twine upload dist/*

uploads its contents to the PyPI server.

If you want to develop your package, you can run

python setup.py develop
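
After pip install ., one way to confirm which version ended up in the environment is to ask the standard library's importlib.metadata. This is a small sketch: the helper name installed_version is made up for illustration, and nkululeko merely stands in for your own package name.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package_name):
    """Return the installed version string of a package, or None if it is not installed."""
    try:
        return version(package_name)
    except PackageNotFoundError:
        return None


# "nkululeko" is only a placeholder for the package you just installed
print(installed_version("nkululeko"))
```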

General, seminar, tutorial

Transformation architectures

2. November 2022 felix

Machine learning architectures can generally be distinguished by the nature of their input and output.

source

One to one

A typical application would be to classify the main motif of a picture (e.g. cat or dog) or the emotional category expressed in an audio recording. The key point is that the input is represented by a single fixed-length vector of values.
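
To make the fixed-length idea concrete, here is a toy sketch (not taken from any library): a nearest-centroid classifier that maps one feature vector to exactly one label. The feature values and class names are invented for illustration.

```python
import math


def nearest_centroid(vector, centroids):
    """Return the label whose centroid is closest (Euclidean distance) to the vector.

    One fixed-length vector in, one label out.
    """
    for label, centroid in centroids.items():
        if len(centroid) != len(vector):
            raise ValueError(f"centroid for {label!r} has the wrong length")
    return min(centroids, key=lambda label: math.dist(vector, centroids[label]))


# invented two-dimensional "features" per emotion category
centroids = {"angry": [0.9, 0.8], "neutral": [0.1, 0.2]}
print(nearest_centroid([0.8, 0.7], centroids))  # prints "angry"
```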

One to many

Many to one

Sequence to sequence

Many to many

nkululeko, tutorial

How to import features from outside the Nkululeko software

18. October 2022 felix

Since version 0.29.1, it is possible to import acoustic features directly into the Nkululeko framework.

You can specify a file to be imported in the FEATS section:

[FEATS]
type = ['import']
import_file = ['/home/.../my_features_1.csv']

Of course, the features can still be combined with other feature sets and will be assigned to the training and test splits accordingly.

There can be several feature files (e.g. for train and dev separately), and they must be in CSV format (comma-separated values) in audformat with a segmented index.
Here is an example:

file,start,end,voice segments,HNR Mean (dB),F1 Mean (Hz)
/home/.../a42_1.wav,0 days,0 days 00:00:07.815875,4.13,45,7.48,
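
Before importing, it can help to check that a feature file has the expected layout. The sketch below uses only the standard library and assumes, based on the example above, that the first three columns are file, start and end and that all remaining columns are features. The function name feature_columns is hypothetical and not part of Nkululeko.

```python
import csv

# segmented index columns, as in the example above
INDEX_COLUMNS = ["file", "start", "end"]


def feature_columns(csv_path):
    """Return the feature column names of a feature file, or raise ValueError."""
    with open(csv_path, newline="") as f:
        header = next(csv.reader(f), None)
    if header is None or header[:3] != INDEX_COLUMNS:
        raise ValueError(f"{csv_path}: expected the first columns to be {INDEX_COLUMNS}")
    features = header[3:]
    if not features:
        raise ValueError(f"{csv_path}: no feature columns found")
    return features
```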

General, tutorial

Predict emotional states with the audEERING model

29. September 2022 felix

audEERING recently published an emotion prediction model based on a fine-tuned Wav2vec2 transformer model.

Here I'd like to show you how you can use this model to predict your audio samples (it is actually also explained in the GitHub link above).

As usual, you should start by dedicating a folder on your hard disk to this and creating a virtual environment:

virtualenv -p=3 venv

which means we want Python version 3 (and not 2).
Don't forget to activate it!

Then you would need to install the packages that are used:

pandas
numpy
audeer
protobuf == 3.20
audonnx
jupyter
audiofile
audinterface

It is easiest to copy this list into a file called requirements.txt and then run

pip install -r requirements.txt

and start writing a python script that includes the packages:

import audeer
import audonnx
import numpy as np
import audiofile
import audinterface

Then load the model:

# download and load the model
url = 'https://zenodo.org/record/6221127/files/w2v2-L-robust-12.6bc4a7fd-1.1.0.zip'
cache_root = audeer.mkdir('cache')
model_root = audeer.mkdir('model')

archive_path = audeer.download_url(url, cache_root, verbose=True)
audeer.extract_archive(archive_path, model_root)
model = audonnx.load(model_root)

sampling_rate = 16000
signal = np.random.normal(size=sampling_rate).astype(np.float32)

Load a test sentence (in 16 kHz, 16-bit WAV format):

# read in a wave file for testing
signal, sampling_rate = audiofile.read('test.wav')

and print out the results

# print the results in the order arousal, dominance, valence.
print(model(signal, sampling_rate)['logits'].flatten())

You can also use audinterface's magic and process a whole list of files like this:

# define the interface
interface = audinterface.Feature(
    model.labels('logits'),
    process_func=model,
    process_func_args={
        'outputs': 'logits',
    },
    sampling_rate=sampling_rate,
    resample=True,    
    verbose=True,
)
# create a list of audio files
files = ['test.wav']
# and process it
interface.process_files(files).round(2)

which should result in a table with one row per file and one column each for arousal, dominance and valence.

Also check out this great jupyter notebook from audEERING

General, tutorial

Get your speech recognized with Whisper

26. September 2022 felix

OpenAI published new speech recognition models that are very easy to use and work in many languages. They were trained on 680,000 hours of multilingual and multitask supervised data collected from the web.

In my case, all I had to do to recognize a German test file was:

# create a virtual environment
virtualenv venv
# activate it
. venv/bin/activate
# install whisper
pip install git+https://github.com/openai/whisper.git
# run the test
whisper test.wav --language German

My file was recognized correctly, though it took a very long time. The speed of x32 announced for the tiny model is measured relative to the large model, not to the duration of the speech file.
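
To transcribe several files, or to pick a model size explicitly, the same command line can be driven from Python. This is a sketch, assuming the whisper command is on the path of the activated environment; the helper names whisper_command and transcribe_all are made up, and --model is the command line option for choosing the model size.

```python
import subprocess


def whisper_command(audio_file, language="German", model="tiny"):
    """Build the whisper command line for one audio file."""
    return ["whisper", str(audio_file), "--language", language, "--model", model]


def transcribe_all(audio_files, **options):
    """Run whisper on each file in turn; raises CalledProcessError if a run fails."""
    for audio_file in audio_files:
        subprocess.run(whisper_command(audio_file, **options), check=True)
```

For example, transcribe_all(["test.wav"], model="tiny") would run the command from above with the smallest model.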

nkululeko, tutorial

Nkululeko: How to evaluate a test set with a given best model

1. September 2022 felix

Nkululeko has two modules for testing an unknown data set, apart from the train and development/evaluation sets.

Let's recap the concept of train/dev/test splits:

train is used to train a supervised model
dev is a set to evaluate this model, i.e. know when it is a good model (that doesn't overfit)
test is a set to be used ONLY once: for the real use of the model. If you used the test set as a dev set, you could not be sure that you are not overfitting again (because you use the dev set to adjust the meta parameters of your model).
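
As an illustration of the three roles, here is a minimal sketch that shuffles a list of samples and cuts it into train, dev and test parts. The 80/10/10 proportions and the function name split_samples are arbitrary choices for this example, not Nkululeko defaults.

```python
import random


def split_samples(samples, train=0.8, dev=0.1, seed=42):
    """Shuffle samples reproducibly and split them into (train, dev, test) lists.

    The test part gets whatever remains after train and dev.
    """
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    n_train = int(len(shuffled) * train)
    n_dev = int(len(shuffled) * dev)
    return (
        shuffled[:n_train],
        shuffled[n_train:n_train + n_dev],
        shuffled[n_train + n_dev:],
    )


train_set, dev_set, test_set = split_samples(range(100))
print(len(train_set), len(dev_set), len(test_set))  # prints "80 10 10"
```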

So, in order to evaluate a third dataset (besides train and dev), there are two possible situations:
a) you have a labeled test set and want to evaluate it
b) you have an unknown test set (no labels) and want to add predictions (without evaluation)

For a),
you can use the test module, and set a tests entry in the configuration [DATA] section like so:

[DATA]
tests = ['my_testdb']
my_testdb = /mypath/my_testdb
my_testdb.split_strategy = test
...

and then call Nkululeko's test module

python -m nkululeko.test --config myconfig.ini --outfile myresults.csv

For b),
you can use the demo module and state your test set as a list of files like so:

python -m nkululeko.demo --config my_config.ini --list my_testsamples.csv --outfile my_results.csv

Of course, in order to use a model you need to have trained and saved it before, so you first need a run with the nkululeko module:

python -m nkululeko.nkululeko --config my_config.ini

with my_config.ini containing:

[EXP]
save = True
[MODEL]
save = True
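
Whether a config file already enables saving can be checked with Python's standard configparser before calling the test or demo module. This is only a sketch: it assumes the config is a plain INI file as shown above, and the helper name saves_model is hypothetical.

```python
import configparser


def saves_model(config_path):
    """Return True if both [EXP] and [MODEL] set save = True in the config file."""
    config = configparser.ConfigParser()
    config.read(config_path)
    return all(
        config.getboolean(section, "save", fallback=False)
        for section in ("EXP", "MODEL")
    )
```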

tutorial, visualization

How to use Latex for your project documentation

25. August 2022 felix

Using a documentation system that separates content and presentation has many advantages, the biggest one probably being flexibility.
I vote for LaTeX, and since there is now a company that offers a free LaTeX environment, you don't have to set it up yourself (you still can, but it might be tedious).

I've set up a sample project that you should be able to copy and use as a start here:

Overleaf sample project

speechsurfer

blog around speech technology