CHANGELOG

Product improvements

Check out the AssemblyAI changelog to see weekly accuracy and product improvements our team has been working on.

Universal improvements

Last week we delivered improvements to our October 2024 Universal release across latency, accuracy, and language coverage.

Universal demonstrates the lowest error rate among leading models on the market for English, German, and Spanish:

Average word error rate (WER) across languages for several providers. WER is a canonical metric in speech-to-text that measures typical accuracy (lower is better). Descriptions of our evaluation sets can be found in our October release blog post.
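For reference, WER is the word-level edit distance (substitutions, insertions, and deletions) between a model's output and a reference transcript, divided by the number of reference words. A minimal sketch of how it is computed:

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by reference length (lower is better)."""
    ref, hyp = reference.split(), hypothesis.split()
    # dp[i][j] = edits to turn the first i reference words into the first j hypothesis words
    dp = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dp[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dp[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dp[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dp[i - 1][j] + 1
            insertion = dp[i][j - 1] + 1
            dp[i][j] = min(substitution, deletion, insertion)
    return dp[len(ref)][len(hyp)] / len(ref)

print(word_error_rate("the cat sat on the mat", "the cat sat on a mat"))
# 1 substitution over 6 reference words ≈ 0.167
```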

Additionally, these accuracy improvements are accompanied by a significant increase in processing speed: our latest Universal release achieves a 27.4% speedup in inference time at the 95th percentile, enabling faster transcription at scale.

These changes also build on Universal's already best-in-class English performance to bring significant upgrades on last-mile challenges, meaning that Universal faithfully captures the fine details that make transcripts usable, like proper nouns, alphanumerics, and formatting.

Comparative error rates across speech recognition models, with lower values indicating better performance. Descriptions of our evaluation sets can be found in our October release blog post.

You can read our launch blog to learn more about these Universal updates.

Ukrainian support for Speaker Diarization

Our Speaker Diarization service now supports Ukrainian speech. This update enables automatic speaker labeling for Ukrainian audio files, making transcripts more readable and powering downstream features in multi-speaker contexts.

Here's how you can get started obtaining Ukrainian speaker labels using our Python SDK:

import assemblyai as aai

aai.settings.api_key = "<YOUR_API_KEY>"
audio_file = "/path/to/your/file"

config = aai.TranscriptionConfig(
  speaker_labels=True,
  language_code="uk"
)

transcript = aai.Transcriber().transcribe(audio_file, config)

for utterance in transcript.utterances:
  print(f"Speaker {utterance.speaker}: {utterance.text}")

Check out our Docs for more information.

Claude 2 sunset

As previously announced, we sunset Claude 2 and Claude 2.1 for LeMUR on February 6th.

If you were previously using these models, we recommend switching to Claude 3.5 Sonnet, which is both more performant and less expensive than Claude 2. You can do so via the final_model parameter in LeMUR requests; note that this parameter is now required.

We have also sunset the lemur/v3/generate/action-items endpoint.
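As a sketch, a LeMUR task request body with the replacement model set explicitly looks like the following. The transcript ID is a placeholder, and the model identifier shown reflects the value documented for Claude 3.5 Sonnet; check the LeMUR docs for the current list of supported values.

```python
# JSON body for POST https://api.assemblyai.com/lemur/v3/generate/task
# (a sketch; the transcript ID below is a placeholder)
lemur_request = {
    "transcript_ids": ["<TRANSCRIPT_ID>"],
    "prompt": "Summarize the key points of this call.",
    # final_model is now required; Claude 3.5 Sonnet replaces Claude 2 / 2.1
    "final_model": "anthropic/claude-3-5-sonnet",
}
```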

Reduced hallucination rates; Bugfix

We have reduced Universal-2's hallucination rate for the string "sa" during periods of silence.

We have fixed a rare bug in our Speaker Labels service that would occasionally cause requests to fail and return a server error.

Multichannel audio trim fix

We've fixed an issue that caused the audio_start_from and audio_end_at parameters to be ignored for multichannel audio.
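As a sketch, a transcript request that combines trimming with multichannel audio now behaves as expected. Both trim parameters are in milliseconds; the audio URL below is hypothetical.

```python
# JSON body for POST https://api.assemblyai.com/v2/transcript
# (a sketch; the audio URL is hypothetical)
transcript_request = {
    "audio_url": "https://example.com/stereo-call.wav",
    "multichannel": True,       # transcribe each audio channel separately
    "audio_start_from": 5000,   # start transcription at 5 s (milliseconds)
    "audio_end_at": 15000,      # stop transcription at 15 s (milliseconds)
}
```

With this fix, only the 10-second window between audio_start_from and audio_end_at is transcribed, on every channel.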