New parameter and timestamps fix

We’ve introduced a new, optional speech_threshold parameter that lets you transcribe only files containing at least a specified proportion of spoken audio, expressed as a ratio in the range [0, 1].

You can use the speech_threshold parameter with our Python SDK as below:

import os

import assemblyai as aai

# Read the API key from the environment rather than hard-coding it
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]

config = aai.TranscriptionConfig(speech_threshold=0.1)

file_url = "https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3"

transcriber = aai.Transcriber()
transcript = transcriber.transcribe(file_url, config)

print(transcript.text)
Smoke from hundreds of wildfires in Canada is triggering air quality alerts throughout the US. Skylines from ...

If the proportion of speech in the audio file falls below the provided threshold, transcript.text will be None and transcript.error will describe the failure:

if not transcript.text:
    print(transcript.error)
Audio speech threshold 0.9461 is below the requested speech threshold value 1.0
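
To make the comparison concrete, here is a minimal sketch of the "meet or surpass" rule described above. The helper name meets_speech_threshold and its arguments are hypothetical and exist only for illustration; the actual speech ratio is computed by the API, not by client code.

```python
def meets_speech_threshold(speech_ratio: float, speech_threshold: float) -> bool:
    """Return True when the audio contains enough speech to be transcribed.

    Hypothetical helper for illustration only; the real check runs inside the API.
    """
    if not 0.0 <= speech_threshold <= 1.0:
        raise ValueError("speech_threshold must be a ratio in the range [0, 1]")
    # A file passes when its speech ratio meets or surpasses the threshold.
    return speech_ratio >= speech_threshold


# 0.9461 is the speech ratio reported in the error message above.
print(meets_speech_threshold(0.9461, 1.0))  # False: below the requested 1.0
print(meets_speech_threshold(0.9461, 0.1))  # True: well above 0.1
```

Because the comparison is inclusive, a file whose speech ratio exactly equals the threshold would still be transcribed under this reading.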

As usual, you can also include the speech_threshold parameter in the JSON body of raw HTTP requests from any language.
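
As a rough sketch of what such a request might look like, the snippet below builds the JSON body with Python's standard library. The endpoint URL, the authorization header, and the audio_url field are assumptions not stated in this post, so check them against the API reference before relying on them. The function build_transcript_request is a hypothetical helper, and the request is only constructed, not sent.

```python
import json
import urllib.request

# Assumed endpoint for submitting a transcription job (not given in this post).
TRANSCRIPT_ENDPOINT = "https://api.assemblyai.com/v2/transcript"


def build_transcript_request(
    api_key: str, audio_url: str, speech_threshold: float
) -> urllib.request.Request:
    """Build (but do not send) a POST request that includes speech_threshold."""
    # The audio_url field name is an assumption; speech_threshold is from this post.
    payload = {"audio_url": audio_url, "speech_threshold": speech_threshold}
    return urllib.request.Request(
        TRANSCRIPT_ENDPOINT,
        data=json.dumps(payload).encode("utf-8"),
        headers={"authorization": api_key, "content-type": "application/json"},
        method="POST",
    )


request = build_transcript_request(
    api_key="YOUR_API_KEY",  # placeholder, replace with a real key
    audio_url="https://github.com/AssemblyAI-Examples/audio-examples/raw/main/20230607_me_canadian_wildfires.mp3",
    speech_threshold=0.1,
)

# Show the JSON body that would be sent.
print(request.data.decode("utf-8"))
```

To submit the job, the request object could be passed to urllib.request.urlopen; that step is left out here so the sketch runs without network access or a valid key.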

We’ve fixed a bug in which timestamps could sometimes be incorrectly reported for our Topic Detection and Content Safety models.

We’ve made improvements to detect and remove a hallucination that would sometimes occur with specific audio patterns.