How to limit a dataset with Nkululeko

Sometimes you don't want to use a whole dataset for training or testing, but only a filtered part of it. The options below show several ways to do this.
Some are set per database:

[DATA]
databases = ['d1']
...
# require a specific feature to be present, e.g. gender labels (when not all data has gender values)
d1.required = gender

# limit the number of samples per speaker
d1.max_samples_per_speaker = 20

# only use the first 10000 samples
d1.limit = 10000
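
To make the effect of these three per-database options more concrete, here is a minimal Python sketch that applies comparable filters to a toy list of samples. It is not Nkululeko's implementation: the record fields (file, speaker, gender), the helper names, the order in which the filters run and the choice of which samples are kept (simply the first ones) are all assumptions made for illustration.

```python
from collections import defaultdict

# Toy sample records; the field names are hypothetical, not Nkululeko's schema.
samples = [
    {"file": "a1.wav", "speaker": "s1", "gender": "female"},
    {"file": "a2.wav", "speaker": "s1", "gender": "female"},
    {"file": "a3.wav", "speaker": "s1", "gender": "female"},
    {"file": "b1.wav", "speaker": "s2", "gender": None},
    {"file": "c1.wav", "speaker": "s3", "gender": "male"},
    {"file": "c2.wav", "speaker": "s3", "gender": "male"},
]


def require(rows, column):
    """Keep only rows that have a value in `column` (cf. d1.required)."""
    return [r for r in rows if r.get(column) is not None]


def max_per_speaker(rows, n):
    """Keep at most `n` rows per speaker (cf. d1.max_samples_per_speaker).

    Assumption: the first n rows of each speaker are kept.
    """
    counts = defaultdict(int)
    kept = []
    for r in rows:
        if counts[r["speaker"]] < n:
            counts[r["speaker"]] += 1
            kept.append(r)
    return kept


def limit(rows, n):
    """Keep only the first `n` rows (cf. d1.limit)."""
    return rows[:n]


# Assumed order: required label, then per-speaker cap, then overall limit.
filtered = limit(max_per_speaker(require(samples, "gender"), 2), 3)
print([r["file"] for r in filtered])  # ['a1.wav', 'a2.wav', 'c1.wav']
```

In this toy run the sample without a gender label is dropped first, speaker s1 is capped at two samples, and the overall limit then cuts the remaining four samples down to three.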

Others are valid for the whole experiment, i.e. for all databases:

[DATA]
# specify a minimum duration for test samples (in seconds)
min_dur_test = 3.5

# use only samples where gender is female
sex = female
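
The experiment-wide options can be pictured in the same way. The following Python sketch is only an illustration under assumptions: the record fields (db, split, duration, gender) and the helper names are hypothetical, and Nkululeko's actual filtering code may work differently. It shows the intended difference between the two options: the minimum duration is checked for test samples only, while the sex filter applies to every sample regardless of database or split.

```python
# Toy records from two databases; the field names are hypothetical.
samples = [
    {"file": "d1_a.wav", "db": "d1", "split": "train", "duration": 1.2, "gender": "female"},
    {"file": "d1_b.wav", "db": "d1", "split": "test", "duration": 2.0, "gender": "female"},
    {"file": "d2_a.wav", "db": "d2", "split": "test", "duration": 4.1, "gender": "female"},
    {"file": "d2_b.wav", "db": "d2", "split": "test", "duration": 5.0, "gender": "male"},
]


def filter_min_dur_test(rows, min_dur):
    """Drop test samples shorter than `min_dur` seconds (cf. min_dur_test).

    Training samples are left untouched.
    """
    return [r for r in rows if r["split"] != "test" or r["duration"] >= min_dur]


def filter_sex(rows, sex):
    """Keep only samples of the given sex (cf. sex = female)."""
    return [r for r in rows if r["gender"] == sex]


filtered = filter_sex(filter_min_dur_test(samples, 3.5), "female")
print([r["file"] for r in filtered])  # ['d1_a.wav', 'd2_a.wav']
```

The short training sample from d1 survives because the duration threshold only concerns the test split, and the male sample from d2 is removed although it is long enough.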
