The easiest format is CSV: you simply create a table with the following information:
- file: the path to the audio file
- speaker: a speaker identifier
- sex: the biological sex (it has quite an influence on the voice, so sometimes submodeling makes sense)
- task: the speaker characteristic that you want to explore, e.g. age or emotion (the column is named after the task)
and then fill it with the values from your database.
So a file for emotion might look like this:
    file, speaker, sex, emotion
    <path to>/s12343.wav, s1, female, happy
    ...
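As a minimal sketch of how such a table could be produced, the following uses Python's standard `csv` module. The rows, file paths and the helper name `write_table` are made up for illustration and are not part of the tool; note that `csv` writes no space after the commas.

```python
import csv
import io

# Hypothetical rows: paths, speaker ids and labels are invented for illustration.
rows = [
    {"file": "audio/s1_001.wav", "speaker": "s1", "sex": "female", "emotion": "happy"},
    {"file": "audio/s2_001.wav", "speaker": "s2", "sex": "male", "emotion": "sad"},
]


def write_table(rows, handle):
    """Write the rows as CSV, starting with a header line."""
    writer = csv.DictWriter(handle, fieldnames=["file", "speaker", "sex", "emotion"])
    writer.writeheader()
    writer.writerows(rows)


# Write into an in-memory buffer; pass an opened file (newline="") to write to disk.
buffer = io.StringIO()
write_table(rows, buffer)
csv_text = buffer.getvalue()
print(csv_text)
```

To write to disk instead, open the target file with `open(path, "w", newline="")` and pass the handle to `write_table`.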
You can then specify the data in your initialization file like this:
    [DATA]
    databases = ['my_db']
    my_db.type = csv
    my_db = <path to>/my_data_file.csv
    ...
    target = emotion
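To make the structure of these entries concrete, here is a hedged sketch that reads such a `[DATA]` section with Python's standard `configparser`. It is not the tool's own loading code: the function name `read_data_section` is hypothetical, and treating a missing `.type` entry as audformat is an assumption.

```python
import ast
import configparser

# Same keys as the example above, without the elided "..." lines.
CONFIG_TEXT = """
[DATA]
databases = ['my_db']
my_db.type = csv
my_db = <path to>/my_data_file.csv
target = emotion
"""


def read_data_section(text):
    """Return ({database name: {"path", "type"}}, target) from the [DATA] section."""
    parser = configparser.ConfigParser()
    parser.read_string(text)
    data = parser["DATA"]
    # The list of database names is written as a Python literal.
    names = ast.literal_eval(data["databases"])
    databases = {}
    for name in names:
        databases[name] = {
            "path": data[name],
            # Assumption: a database without an explicit type is in audformat.
            "type": data.get(name + ".type", fallback="audformat"),
        }
    return databases, data["target"]


databases, target = read_data_section(CONFIG_TEXT)
print(databases, target)
```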
You cannot specify split tables with this format; instead, you would have to split the file into several databases.
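One way to do such a split is by speaker, so that no speaker appears in both parts. The sketch below uses only the standard `csv` module; the function names, the example rows and the choice of test speakers are hypothetical.

```python
import csv
import io


def split_by_speaker(csv_text, test_speakers):
    """Split CSV rows into train and test rows according to the speaker column."""
    # skipinitialspace tolerates a blank after each comma, as in "file, speaker".
    reader = csv.DictReader(io.StringIO(csv_text), skipinitialspace=True)
    fieldnames = reader.fieldnames
    train_rows, test_rows = [], []
    for row in reader:
        if row["speaker"] in test_speakers:
            test_rows.append(row)
        else:
            train_rows.append(row)
    return fieldnames, train_rows, test_rows


def to_csv(fieldnames, rows):
    """Serialize rows back to CSV text with a header line."""
    out = io.StringIO()
    writer = csv.DictWriter(out, fieldnames=fieldnames, lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return out.getvalue()


# Invented example data.
EXAMPLE = """file,speaker,sex,emotion
a.wav,s1,female,happy
b.wav,s1,female,sad
c.wav,s2,male,happy
"""

fieldnames, train_rows, test_rows = split_by_speaker(EXAMPLE, {"s2"})
train_csv = to_csv(fieldnames, train_rows)
test_csv = to_csv(fieldnames, test_rows)
print(train_csv)
print(test_csv)
```

Each resulting text can be saved as its own CSV file and listed as a separate database entry.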
audformat allows for many use cases, so the specification might be more complex.
In the simplest case, you have a database with two tables: one called files that contains the speaker information (id and sex), and one named after your task (aka target), for example age or emotion.
That's the case for our demo example, the Berlin EmoDB, so you can include it simply with:
    [DATA]
    databases = ['emodb']
    emodb = /<path to>/emodb/
    target = emotion
    ...
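The two-table layout can be pictured as a join on the file path. The following sketch uses plain Python dictionaries as stand-ins for the files table and the target table; it does not use the audformat API, and all names and values are invented for illustration.

```python
# Hypothetical table contents, both keyed by file path.
files_table = {
    "wav/a.wav": {"speaker": "s1", "sex": "female"},
    "wav/b.wav": {"speaker": "s2", "sex": "male"},
}
emotion_table = {
    "wav/a.wav": "happy",
    "wav/b.wav": "sad",
}


def combine(files_table, target_table, target_name):
    """Attach the target label to the speaker information of each file."""
    combined = {}
    for path, label in target_table.items():
        info = files_table.get(path)
        if info is None:
            # Skip files that have a label but no speaker information.
            continue
        combined[path] = {**info, target_name: label}
    return combined


combined = combine(files_table, emotion_table, "emotion")
print(combined)
```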
But if there are more tables and they have special names, you can specify them like this:
    [DATA]
    databases = ['msp']
    # path to data
    msp = /<path to>/msppodcast/
    # tables with speaker information
    msp.files_tables = ['files.test-1', 'files.train']
    # tables with task labels
    msp.target_tables = ['emotion.test-1', 'emotion.train']
    # train and evaluation splits will be provided
    msp.split_strategy = specified
    # here are the test/evaluation split tables
    msp.test_tables = ['emotion.test-1']
    # here are the training tables
    msp.train_tables = ['emotion.train']
    target = emotion