The first thing you need is a Google API key:
- you need to register for the Google Speech APIs, i.e. get a Google Cloud Platform account
- you need to provide payment details, but (at the time of writing, I think) the first 60 minutes of processed speech per month are free.
I export the path to my API key file each time I want to use this, like so:
export GOOGLE_APPLICATION_CREDENTIALS="/home/felix/data/research/Google/api_key.json"
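Since a forgotten export is an easy mistake, here is a minimal sketch of a check you could run in the notebook before creating the client. The helper name check_credentials is my own invention and not part of the Google library; it only assumes that the variable should point to a readable JSON file.

```python
import json
import os


def check_credentials(env_var="GOOGLE_APPLICATION_CREDENTIALS"):
    """Return the key file path if the variable is set and points to valid JSON.

    Hypothetical helper for illustration; it does not contact Google at all.
    """
    path = os.environ.get(env_var)
    if not path:
        raise RuntimeError(f"{env_var} is not set")
    if not os.path.isfile(path):
        raise RuntimeError(f"{env_var} points to a missing file: {path}")
    with open(path, "r", encoding="utf-8") as key_file:
        json.load(key_file)  # raises ValueError if the file is not valid JSON
    return path
```

Calling check_credentials() in the first notebook cell should either return the path or fail with a readable message. Note that the export has to happen in the shell that starts Jupyter, otherwise the notebook will not see the variable.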
This tutorial assumes you did that and started a Jupyter notebook. If you don't know what this is, here's a tutorial on how to set one up (first part)
Before you can import the Google Speech API, make sure it's installed:
!pip install google-cloud-speech
Then import the Google Cloud client library
from google.cloud import speech
import io
Instantiate a client
client = speech.SpeechClient()
Then load a recorded speech file; it should be in WAV format with a 16 kHz sample rate
speech_file = '/home/felix/tmp/google_speech_api_test.wav'
with io.open(speech_file, "rb") as audio_file:
    content = audio_file.read()
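If you are not sure whether your recording really is 16 kHz, a rough sketch like the following can tell you, using only Python's standard wave module. The function names describe_wav and matches_linear16_16k_mono are made up for this example, and the mono and 16-bit checks are my assumption about what fits the LINEAR16 setting used in this tutorial.

```python
import wave


def describe_wav(path):
    """Read the header of a WAV file and return its basic properties."""
    with wave.open(path, "rb") as wav_file:
        frame_rate = wav_file.getframerate()
        return {
            "channels": wav_file.getnchannels(),
            "sample_rate_hertz": frame_rate,
            "sample_width_bytes": wav_file.getsampwidth(),
            "duration_seconds": wav_file.getnframes() / frame_rate,
        }


def matches_linear16_16k_mono(info):
    """True if the file is mono, 16-bit, 16 kHz (assumed target format)."""
    return (
        info["channels"] == 1
        and info["sample_rate_hertz"] == 16000
        and info["sample_width_bytes"] == 2
    )
```

For example, describe_wav(speech_file) returns a small dictionary you can print. If the sample rate is not 16000, either resample the file or pass the actual value as sample_rate_hertz when you configure the recognizer.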
Create an audio object
audio = speech.RecognitionAudio(content=content)
Configure the ASR
config = speech.RecognitionConfig(
    encoding=speech.RecognitionConfig.AudioEncoding.LINEAR16,
    sample_rate_hertz=16000,
    language_code="de-DE",
)
Detect speech in the audio file
response = client.recognize(config=config, audio=audio)
and show what you got (in my trial, only the first alternative was filled):
for result in response.results:
    for index, alternative in enumerate(result.alternatives):
        print("Transcript {}: {}".format(index, alternative.transcript))
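If you just want the recognized text as one string, a small helper along these lines might do. This is a sketch, and best_transcript is a name I made up; it assumes the first alternative of each result is the most likely one, which is how I understand the API documentation. As far as I know, the service returns at most one alternative per result unless you set max_alternatives in the RecognitionConfig, which would explain why only the first one was filled for me.

```python
def best_transcript(response):
    """Join the first alternative of every result into a single string.

    Works on any object that has .results, each with .alternatives,
    each with .transcript (the shape of the recognize() response).
    """
    parts = []
    for result in response.results:
        if result.alternatives:  # skip results without any hypothesis
            parts.append(result.alternatives[0].transcript.strip())
    return " ".join(parts)
```

With the response from above, print(best_transcript(response)) prints the whole transcript on one line.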