How to compare formant tracks extracted with opensmile vs. Praat

First, some imports:

import pandas as pd
import parselmouth 
from parselmouth import praat
import opensmile
import audiofile

Then, a test file:

testfile = '/home/felix/data/data/audio/testsatz.wav'
signal, sampling_rate = audiofile.read(testfile)
print('length in seconds: {}'.format(len(signal)/sampling_rate))

Get the opensmile formant tracks; the feature names are copied from the official GeMAPS config file:

smile = opensmile.Smile(
    feature_set=opensmile.FeatureSet.GeMAPSv01b,
    feature_level=opensmile.FeatureLevel.LowLevelDescriptors,
)
result_df = smile.process_file(testfile)
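# names of the formant center frequency features in the GeMAPS low-level descriptors
centerformantfreqs = ['F1frequency_sma3nz', 'F2frequency_sma3nz', 'F3frequency_sma3nz']
# copy the selection, so that columns can be added later without a SettingWithCopyWarning
formant_df = result_df[centerformantfreqs].copy()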

Get the Praat tracks (the opensmile configuration computes a value every 10 ms with a frame length of 20 ms, so the same time step and window length are passed to Praat):

sound = parselmouth.Sound(testfile)
# arguments: time step (s), max. number of formants, formant ceiling (Hz),
# window length (s), pre-emphasis from (Hz)
formants = praat.call(sound, "To Formant (burg)", 0.01, 4, 5000, 0.02, 50)
time_step = formants.get_time_step()
f1_list = []
f2_list = []
f3_list = []
# read the tracks at multiples of the time step, starting with the second one
for i in range(2, formants.get_number_of_frames()+1):
    t = time_step * i
    f1_list.append(formants.get_value_at_time(1, t))
    f2_list.append(formants.get_value_at_time(2, t))
    f3_list.append(formants.get_value_at_time(3, t))

To be sure, compare the sizes of the two outputs:

print('{}, {}'.format(result_df.shape[0], len(f1_list)))
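
If the two counts differ, the column assignment in the next step would fail with a length mismatch. Here is a minimal sketch of one way to guard against that, assuming it is acceptable to cut surplus values or to pad missing ones with NaN at the end. The helper name fit_length is hypothetical and not part of pandas, opensmile or parselmouth:

```python
import math


def fit_length(values, n, fill=math.nan):
    """Return a copy of values with exactly n entries (hypothetical helper)."""
    values = list(values)
    if len(values) >= n:
        # too long: cut the surplus at the end
        return values[:n]
    # too short: pad the end with the fill value
    return values + [fill] * (n - len(values))


# toy values, not real measurements
print(fit_length([500.0, 510.0, 520.0], 2))  # [500.0, 510.0]
print(len(fit_length([500.0], 3)))           # 3
```

With the variables from above, this would read f1_list = fit_length(f1_list, result_df.shape[0]). Note that this only equalizes the lengths; it does not check that the frames of both tools refer to the same points in time.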

Combine and inspect the result:

formant_df['F1_praat'] = f1_list
formant_df['F2_praat'] = f2_list
formant_df['F3_praat'] = f3_list
formant_df.head()
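
To go beyond eyeballing the table, the two tracks can be summarized by one number per formant. A minimal sketch, assuming that frames should be skipped when either tool delivers no usable value (Praat returns NaN for undefined values, and to my understanding the opensmile sma3nz formant features are set to 0 in unvoiced frames). The function name mean_abs_diff is hypothetical:

```python
import math


def mean_abs_diff(track_a, track_b):
    """Mean absolute difference (Hz) and number of frames where both tracks are usable."""
    diffs = []
    for a, b in zip(track_a, track_b):
        # skip undefined (NaN) and zero-valued frames in either track
        if math.isnan(a) or math.isnan(b) or a == 0 or b == 0:
            continue
        diffs.append(abs(a - b))
    if not diffs:
        return math.nan, 0
    return sum(diffs) / len(diffs), len(diffs)


# toy values, not real measurements
smile_f1 = [0.0, 510.0, 530.0, 0.0]
praat_f1 = [480.0, 500.0, float('nan'), 450.0]
print(mean_abs_diff(smile_f1, praat_f1))  # (10.0, 1)
```

For the data frame above, a call could look like mean_abs_diff(formant_df['F1frequency_sma3nz'], formant_df['F1_praat']).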

How to extract formant tracks with Praat and Python

This tutorial is adapted from examples by David R. Feinberg.

This tutorial assumes you started a Jupyter notebook. If you don't know what this is, here's a tutorial on how to set one up (first part).

First, you should install the parselmouth package, which interfaces Praat with Python:

!pip install -U praat-parselmouth

which you would then import:

import parselmouth 
from parselmouth import praat

You need some audio input (here: WAV format, 16 kHz sample rate):

testfile = '/home/felix/data/data/audio/testsatz.wav'

You would then read in the sound with parselmouth like this:

sound = parselmouth.Sound(testfile) 

Here's the code to extract the first three formant tracks. I guess it's more or less self-explanatory if you know Praat.

First, compute the time points of the periodic instances (glottal pulses) in the signal:

# pitch search range in Hz
f0min = 75
f0max = 300
pointProcess = praat.call(sound, "To PointProcess (periodic, cc)", f0min, f0max)

Then, compute the formants:

formants = praat.call(sound, "To Formant (burg)", 0.0025, 5, 5000, 0.025, 50)

Finally, read out the formant values at the times where they make sense (the periodic instances):

numPoints = praat.call(pointProcess, "Get number of points")
f1_list = []
f2_list = []
f3_list = []
# Praat indices start at 1
for point in range(1, numPoints + 1):
    t = praat.call(pointProcess, "Get time from index", point)
    # arguments: formant number, time (s), unit, interpolation method
    f1 = praat.call(formants, "Get value at time", 1, t, 'Hertz', 'Linear')
    f2 = praat.call(formants, "Get value at time", 2, t, 'Hertz', 'Linear')
    f3 = praat.call(formants, "Get value at time", 3, t, 'Hertz', 'Linear')
    f1_list.append(f1)
    f2_list.append(f2)
    f3_list.append(f3)
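
The three lists can then be condensed into one value per formant, for example the mean or the median. A minimal sketch using only the standard library, assuming that undefined values (NaN, as Praat reports them where no formant could be determined) should be ignored. The helper name summarize_track is hypothetical:

```python
import math
import statistics


def summarize_track(values):
    """Return mean and median of a formant track, ignoring NaN values."""
    valid = [v for v in values if not math.isnan(v)]
    if not valid:
        # nothing usable in this track
        return math.nan, math.nan
    return statistics.mean(valid), statistics.median(valid)


# toy values, not real measurements
mean_f1, median_f1 = summarize_track([500.0, float('nan'), 520.0, 540.0])
print('mean: {:.1f} Hz, median: {:.1f} Hz'.format(mean_f1, median_f1))
```

With the lists from above, the call would be summarize_track(f1_list), and likewise for f2_list and f3_list.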