Nkululeko: how to investigate correlations of specific features

As shown in a previous post, nkululeko can be used to investigate correlations of specific features with a target variable.

Now nkululeko can also be used to check the correlation between two real-valued acoustic features.

With the key regplot you can specify two features and, optionally, a target variable (if omitted, the target from the ini file is used), like so:

[EXPL]
regplot = [['lld_mfcc3_sma3_median', 'lld_mfcc1_sma3_median'],
    ['lld_mfcc3_sma3_median', 'lld_F2frequency_sma3nz_median', 'age']]
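To see how such a multi-line regplot value is read, here is a minimal sketch using Python's standard configparser (which nkululeko's ini handling is based on); the ini_text string is just a stand-in for your configuration file:

```python
import ast
import configparser

# Stand-in for the [EXPL] section of a nkululeko ini file.
ini_text = """
[EXPL]
regplot = [['lld_mfcc3_sma3_median', 'lld_mfcc1_sma3_median'],
    ['lld_mfcc3_sma3_median', 'lld_F2frequency_sma3nz_median', 'age']]
"""

config = configparser.ConfigParser()
config.read_string(ini_text)

# The value is a Python-style nested list; literal_eval turns it into data.
pairs = ast.literal_eval(config["EXPL"]["regplot"])
for spec in pairs:
    features = spec[:2]
    target = spec[2] if len(spec) > 2 else "(target from ini file)"
    print(features, "->", target)
```

Note that continuation lines of an ini value must be indented, otherwise configparser treats them as a new key.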

The first feature pair is related to the emotion target (the default for this example data, emodb) and would produce this plot:

The second entry states age as the target; since age is a continuous variable, its values will be binned into groups for the plot.
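The exact binning nkululeko applies is internal, but the idea of grouping a continuous target can be sketched like this, assuming simple equal-width bins over some example age values:

```python
import numpy as np

# Illustrative only: nkululeko's internal binning may differ.
ages = np.array([21, 25, 34, 41, 48, 55, 62, 70])

# Split the continuous target into, e.g., four equal-width groups
# so it can be used to color/group points in a regression plot.
edges = np.linspace(ages.min(), ages.max(), num=5)  # 4 groups -> 5 edges
groups = np.digitize(ages, edges[1:-1])             # group index 0..3 per sample
print(groups)  # [0 0 1 1 2 2 3 3]
```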

How to set up wav2vec embedding for nkululeko

Since version 0.10, nkululeko supports Facebook's wav2vec 2.0 embeddings as acoustic features.
This post shows you how to set this up.

Set up nkululeko

In your nkululeko configuration (*.ini) file, set the feature extractor to wav2vec2 and specify the path to the model like this:

[FEATS]
type = ['wav2vec2']
wav2vec.model = /my path to the huggingface model/

Alternatively, you can state the Hugging Face model name directly:

[FEATS]
type = ['wav2vec2-base-960h']
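wav2vec 2.0 produces one hidden vector per audio frame, so to obtain a single feature vector per utterance these frame-wise vectors have to be pooled over time (averaging is the usual choice, and to my understanding what nkululeko does). A sketch with random stand-in values instead of real model output:

```python
import numpy as np

# Stand-in for a wav2vec2 output: one 768-dim hidden vector per audio
# frame (real values would come from the model's last hidden layer).
rng = np.random.default_rng(0)
hidden_states = rng.standard_normal((249, 768))  # ~5 s of audio at 20 ms hop

# Pool over the time axis to get one fixed-size embedding per utterance.
embedding = hidden_states.mean(axis=0)
print(embedding.shape)  # (768,)
```

The result is one 768-dimensional vector per utterance, which can then be fed to any of nkululeko's classifiers.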

Out of the box, the last hidden layer is used as the embedding. But the original wav2vec 2.0 model consists of 7 CNN layers followed by up to 24 transformer layers. If you want to use an earlier layer than the last one, you can simply count down from the top:

[FEATS]
type = ['wav2vec2']
wav2vec.layer = 12

This would use the 12th transformer layer of a 24-layer model, but only the CNN layers of a 12-layer model (counting 12 down from the last of 12 transformer layers leaves just the CNN output).
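The count-down arithmetic can be made explicit with a small hypothetical helper (this is only an illustration of the rule described above; nkululeko's actual implementation may differ in detail):

```python
# Hypothetical helper illustrating the count-down layer selection;
# not part of the nkululeko API.
def effective_layer(num_transformer_layers: int, layers_from_top: int) -> int:
    """Return the hidden-state index used: index 0 is the CNN
    (feature-encoder) output, index k the k-th transformer layer."""
    index = num_transformer_layers - layers_from_top
    if index < 0:
        raise ValueError("cannot count down past the CNN output")
    return index

print(effective_layer(24, 12))  # 12 -> 12th transformer layer of a 24-layer model
print(effective_layer(12, 12))  # 0  -> only the CNN layers of a 12-layer model
```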