CHANGELOG
Product improvements
Check out the AssemblyAI changelog to see weekly accuracy and product improvements our team has been working on.
- This week, our engineering team has been hard at work preparing for the release of exciting new features like:
- Chapter Detection: Automatically summarize audio and video files into segments (aka "chapters").
- Sentiment Analysis: Determine the sentiment of sentences in your transcript as "positive", "negative", or "neutral".
- Disfluencies: Detects filler-words like "um" and "uh".
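As a rough illustration of how features like these might be switched on per request, here is a minimal sketch. The option names (`auto_chapters`, `sentiment_analysis`, `disfluencies`), the `audio_url` field, and the helper `build_transcript_request` are assumptions made for this example; this changelog does not specify the final parameter names, so check the API reference before relying on them.

```python
import json

# Assumed option names: the changelog does not say how these features
# will be enabled, so every key below is hypothetical.
FEATURE_FLAGS = ("auto_chapters", "sentiment_analysis", "disfluencies")


def build_transcript_request(audio_url, **features):
    """Return a JSON-serialisable request body with the chosen features enabled."""
    unknown = set(features) - set(FEATURE_FLAGS)
    if unknown:
        raise ValueError(f"unknown feature flag(s): {sorted(unknown)}")
    body = {"audio_url": audio_url}
    # Only include the flags the caller explicitly asked for.
    body.update({name: bool(value) for name, value in features.items()})
    return body


if __name__ == "__main__":
    body = build_transcript_request(
        "https://example.com/meeting.mp3",  # placeholder URL
        sentiment_analysis=True,
        disfluencies=True,
    )
    print(json.dumps(body, indent=2))
```

Rejecting unknown flags up front keeps a typo in an option name from silently producing a request with the feature turned off.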
- Improved average real-time latency by 2.1% and p99 latency by 0.06%.
- Fixed an edge case where entries in the utterances category for dual-channel audio files would occasionally receive a confidence score greater than 1.0.
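If you want to guard against out-of-range scores on the client side, a small check along these lines may help. This is only a sketch: the response shape (a list of utterance dictionaries with a `confidence` key) and the helper name `out_of_range_confidences` are assumptions for illustration, and the sample data is made up.

```python
def out_of_range_confidences(utterances):
    """Return (index, confidence) pairs whose confidence falls outside [0.0, 1.0]."""
    return [
        (i, u["confidence"])
        for i, u in enumerate(utterances)
        if not 0.0 <= u["confidence"] <= 1.0
    ]


# Made-up sample in an assumed response shape.
sample = [
    {"speaker": "1", "text": "Hello there.", "confidence": 0.97},
    {"speaker": "2", "text": "Hi!", "confidence": 1.04},
]

print(out_of_range_confidences(sample))  # [(1, 1.04)]
```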
- Improved the API's ability to handle audio/video files with a duration over 8 hours.
- Further improved transcription processing times by 12%.
- Fixed an edge case in our responses for dual-channel audio files where, if speaker 2 interrupted speaker 1, the text from speaker 2 would cause the text from speaker 1 to be split into multiple turns, rather than contextually keeping all of speaker 1's text together.
- Today, we're happy to announce the release of our most accurate Speech Recognition model for asynchronous transcription to date—version 8 (v8).
- This new model dramatically improves overall accuracy (up to 19% relative), and proper noun accuracy as well (up to 25% relative).
- You can read more about our v8 model on our blog.
- Fixed an edge case where a small percentage of short (<60 seconds in length) dual-channel audio files, with the same audio on each channel, resulted in repeated words in the transcription.