Sometimes you want to compare two databases that share a similar target variable, say one related to likability, but on different scales: for example, one was rated on a scale from 1 to 10 and the other on a Likert scale from 1 to 7. With nkululeko you can re-name labels, normalize the target values, and even invert the … Continue reading Nkululeko: how to tweak the target variable for database comparison→
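To illustrate what normalizing two differently scaled targets means, here is a minimal sketch in plain Python. It is not nkululeko's own code: the function name rescale and the example ratings are made up for illustration, and a simple linear (min-max) mapping to the range 0 to 1 is assumed.

```python
def rescale(value, src_min, src_max, dst_min=0.0, dst_max=1.0):
    """Linearly map value from [src_min, src_max] to [dst_min, dst_max]."""
    if src_max == src_min:
        raise ValueError("source range must not be empty")
    # Position of the value inside the source range, between 0 and 1
    unit = (value - src_min) / (src_max - src_min)
    return dst_min + unit * (dst_max - dst_min)


# Hypothetical likability ratings from two databases
ratings_10 = [1, 5.5, 10]  # rated on a scale from 1 to 10
likert_7 = [1, 4, 7]       # rated on a Likert scale from 1 to 7

# After rescaling, both targets live on the same 0..1 range
norm_10 = [rescale(v, 1, 10) for v in ratings_10]
norm_7 = [rescale(v, 1, 7) for v in likert_7]
```

Once both targets share a range, the midpoint of each scale maps to the same value, so the databases can be compared directly.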
With nkululeko since version 0.77.8 you can convert all feature values into the discrete classes low, mid and high. Simply state [FEATS] type = ['praat'] scale = bins store_format = csv in your config to use Praat features. With the store format set to csv you will be able to look at … Continue reading Nkululeko: how to bin/discretize your feature values→
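The idea of binning can be sketched in a few lines of plain Python. This is only an illustration and an assumption about how such a discretization might work: the cut points are taken as terciles of the observed values, which may differ from what nkululeko does internally, and the function name bin_values is hypothetical.

```python
import statistics


def bin_values(values):
    """Map each numeric value to 'low', 'mid' or 'high' using tercile cut points."""
    # Two cut points that split the data into three roughly equal groups
    # (assumption: terciles; the real boundaries may be chosen differently)
    low_cut, high_cut = statistics.quantiles(values, n=3)
    labels = []
    for v in values:
        if v < low_cut:
            labels.append("low")
        elif v < high_cut:
            labels.append("mid")
        else:
            labels.append("high")
    return labels


# Hypothetical feature values, e.g. one Praat feature over nine samples
feature = [1, 2, 3, 4, 5, 6, 7, 8, 9]
binned = bin_values(feature)
```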
With nkululeko since version 0.77.7 there is a new interface named multidb which lets you compare several databases. You can state their names in the [EXP] section, and they will then be processed one after the other and against each other. The results are stored in a file called heatmap.png in the experiment folder. !Mind … Continue reading Nkululeko: compare several databases→
Sometimes, with categorically labeled data, the number of samples per class is very unevenly distributed, misleading the model to think that the overwhelming majority class is more important than the others. In this case, two techniques might help: class weighting assigns a higher weight to samples from minority classes, and oversampling "invents" new samples for … Continue reading Nkululeko: oversample the training set→
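Both techniques can be sketched in plain Python. Note the assumptions: the weights use the common "balanced" formula n_samples / (n_classes * class_count), and the oversampling shown here simply duplicates randomly chosen minority samples, which is simpler than methods that synthesize new samples. The function names and the toy data are hypothetical and not part of nkululeko.

```python
import random
from collections import Counter


def class_weights(labels):
    """Give rarer classes a proportionally higher weight."""
    counts = Counter(labels)
    n_samples, n_classes = len(labels), len(counts)
    return {cls: n_samples / (n_classes * cnt) for cls, cnt in counts.items()}


def random_oversample(samples, labels, seed=0):
    """Duplicate random minority samples until all classes are equally large."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_samples, out_labels = list(samples), list(labels)
    for cls, cnt in counts.items():
        # All original samples that belong to this class
        pool = [s for s, l in zip(samples, labels) if l == cls]
        for _ in range(target - cnt):
            out_samples.append(rng.choice(pool))
            out_labels.append(cls)
    return out_samples, out_labels


# Toy training set: six neutral samples, two angry ones
labels = ["neutral"] * 6 + ["angry"] * 2
samples = list(range(8))

weights = class_weights(labels)
bal_samples, bal_labels = random_oversample(samples, labels)
```

Oversampling should only be applied to the training set, never to the test set, so that the evaluation still reflects the original class distribution.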
With nkululeko since version 0.68.1, you can re-name data fields (columns in your data table) by setting the following in your ini-file: [DATA] databases = ['mydata'] mydata.colnames = {'Participant ID':'speaker', 'sex':'gender', 'Age': 'age'} which means that, before further processing, the Participant ID field in your database mydata will be treated as the speaker label and so … Continue reading Nkululeko: re-name data column names→
With nkululeko since version 0.68.0, the selection of test/dev vs. train samples can be done automatically in a stratified manner, i.e. by trying to find splits that are balanced for age or gender. An example of such a configuration is this: [DATA] # the name of the database databases = ['emodb'] # the location of the data … Continue reading Nkululeko: automatically stratify your split sets→
With nkululeko since version 0.67.0, the spotlight software is directly integrated as part of the EXPLORE module. You can simply run your data filters, augmentations, machine learning experiments, segmentations and model predictions as usual, and then call the spotlight software by adding to your configuration file: [EXPL] sample_selection = all # or train or test … Continue reading Nkululeko: inspect your data with Spotlight→
With nkululeko since version 0.66.3, a report document formatted in LaTeX and compiled as a PDF file can be generated automatically, basically as a compilation of the images that are generated. There is a dedicated REPORT section in the config file for this; here is an example: [REPORT] # should the report be shown in … Continue reading Nkululeko: generate a latex/pdf report→
If you use modules, feature extractors or models that rely on torchaudio with Nkululeko, e.g. the Resampler or the Squim model, you need to install the nightly version:
pip uninstall -y torch torchvision torchaudio
pip install --pre torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/nightly/cpu
With nkululeko since version 0.64.0, some statistics are printed as part of the plot titles. With the explore module, you can plot correlations between the target (e.g. emotion or age) and other variables that are in the database, e.g. gender or duration, or anything you might have predicted with the predict module. You need to … Continue reading Nkululeko: get some statistics on correlation and effect size→