Spanish Language Support, Automatic Language Detection, and Custom Spelling Released
Spanish transcription is now publicly available. Check out our documentation for more information on Specifying a Language in your POST request.
Automatic Language Detection is now available for our /v2/transcript endpoint. This feature can identify the dominant language that’s spoken in an audio file and route the file to the appropriate model for the detected language.
Our new Custom Spelling feature gives you the ability to specify how words are spelled or formatted in the transcript text. For example, Custom Spelling could be used to change all instances of "CS 50" to "CS50".
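As a rough illustration of how the three features in this release might be combined in a request to the /v2/transcript endpoint, the sketch below builds a JSON request body. The field names (audio_url, language_code, language_detection, custom_spelling with from/to), the "es" code for Spanish, and the helper name are assumptions made for this example and are not confirmed by this changelog; check the documentation before relying on them.

```python
import json


def build_transcript_request(audio_url, language_code=None,
                             detect_language=False, custom_spelling=None):
    """Build a JSON body for a POST to /v2/transcript.

    Hypothetical helper: every field name below is an assumption.
    """
    if language_code and detect_language:
        # Design choice of this sketch: an explicit language makes
        # detection redundant, so the two options are not combined.
        raise ValueError("pass either language_code or detect_language")

    body = {"audio_url": audio_url}
    if language_code:
        body["language_code"] = language_code
    if detect_language:
        body["language_detection"] = True
    if custom_spelling:
        # Input maps the desired spelling to the variants it replaces.
        body["custom_spelling"] = [
            {"from": list(variants), "to": target}
            for target, variants in custom_spelling.items()
        ]
    return body


# Explicit Spanish transcription ("es" is an assumed language code).
spanish = build_transcript_request(
    "https://example.com/audio.mp3", language_code="es"
)

# Automatic Language Detection plus the "CS 50" -> "CS50" rule.
detected = build_transcript_request(
    "https://example.com/audio.mp3",
    detect_language=True,
    custom_spelling={"CS50": ["CS 50"]},
)

print(json.dumps(spanish, indent=2))
print(json.dumps(detected, indent=2))
```

The helper only assembles the payload; sending it (with an HTTP client and your API key) is left out so the example stays self-contained.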
Auto Chapters v6 Released
Released Auto Chapters v6, improving the summarization of longer chapters.
Auto Chapters v5 Released
Auto Chapters v5 released, improving headline and gist generation and quote formatting in the summary key.
Fixed an edge case in Dual-Channel files where initial words in an audio file would occasionally be missed in the transcription.
Regional Spelling Improvements
Region-specific spelling improved for en_uk and en_au language codes.
Improved the formatting of “MP3” in transcripts.
Improved Real-Time transcription error handling for corrupted audio files.
Real-Time v3 Released
Released v3 of our Real-Time Transcription model, improving overall accuracy by 18% and proper noun recognition by 23% relative to the v2 model.
Improved PII Redaction and Entity Detection for CREDIT_CARD_CVV and LOCATION.
Auto Chapters v4 Released, Auto Retry Feature Added
Added an Auto Retry feature, which automatically retries transcripts that fail with a "Server error, developers have been alerted" message. This feature is enabled by default. To disable it, visit the Account tab in your Developer Dashboard.
Auto Chapters v4 released, improving chapter summarization in the summary key.
Added a trailing period for the gist key in the Auto Chapters feature.
Auto Chapters v3 Released
Released v3 of our Auto Chapters model, improving chapter boundary detection by 56.3% and, with it, the model’s ability to segment audio into chapters.
Improved formatting for Auto Chapters summaries. The summary, headline, and gist keys now include better punctuation, casing, and text formatting.
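To show how the summary, headline, and gist keys might be read from a completed transcript, here is a minimal sketch. It assumes the chapters are returned as a list under a top-level "chapters" key, which this changelog does not state; the sample response and the helper name are invented purely for illustration.

```python
def chapter_overview(transcript):
    """Return one "gist | headline" line per chapter.

    Assumes chapters live under a "chapters" key (an assumption);
    a missing or null value is treated as "no chapters".
    """
    lines = []
    for chapter in transcript.get("chapters") or []:
        gist = chapter.get("gist", "")
        headline = chapter.get("headline", "")
        lines.append(f"{gist} | {headline}")
    return lines


# Hypothetical response fragment; the text is made up for this example.
sample_transcript = {
    "chapters": [
        {
            "summary": "The speakers review the release schedule and agree on next steps.",
            "headline": "The team agrees on a release schedule.",
            "gist": "Release schedule agreed.",
        }
    ]
}

for line in chapter_overview(sample_transcript):
    print(line)
```

Using the gist as the leading label and the headline as supporting text is one possible layout; the longer summary is better suited to a detail view.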
Miscellaneous Bug Fixes
Fixed a rare edge case affecting audio duration calculation of a small percentage of multi-channel files that contained no speech.
Miscellaneous bug fixes for Real-Time Transcription.
Webhook Status Codes, Entity Detection Improved
POST requests from the API to webhook URLs will now accept any status code from 200 to 299 as a successful HTTP response. Previously, only a 200 status code was accepted.
Updated the text key in our Entity Detection feature to return the proper noun rather than the possessive noun. For example, Andrew instead of Andrew’s.
Fixed an edge case with Entity Detection where, in certain contexts, a disfluency could be identified as an entity.
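The webhook status-code change in this release amounts to widening an equality check into a range check. The sketch below contrasts the two rules; the function names are hypothetical and the code only illustrates the rule as described here, not the API's actual implementation.

```python
def is_accepted_now(status_code):
    """Current rule: any status from 200 to 299 counts as success."""
    return 200 <= status_code <= 299


def was_accepted_before(status_code):
    """Previous rule: only a 200 status counted as success."""
    return status_code == 200


# Compare both rules for a few common webhook responses.
for code in (200, 202, 204, 301, 500):
    print(code, is_accepted_now(code), was_accepted_before(code))
```

In practice this means a webhook handler that replies with 202 or 204 is now treated as having received the notification.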
Punctuation and Casing Accuracy Improved, Inverse Text Normalization Model Updated
Released v4 of our Punctuation model, increasing punctuation and casing accuracy by ~2%.
Updated our Inverse Text Normalization (ITN) model for our /v2/transcript endpoint, improving web address and email address formatting and fixing occasional number formatting issues.
Fixed an edge case where multi-channel files would return no text when the two channels were out of phase with each other.
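For readers unfamiliar with the term, the sketch below illustrates what "out of phase" means for a two-channel file: when one channel is an inverted copy of the other, averaging them into a single channel cancels the signal completely. This is a general signal-processing illustration with arbitrary sample values; it is not a description of how the API processes audio or of the cause of the bug fixed above.

```python
import math

SAMPLE_RATE = 8000   # Hz, arbitrary value for the illustration
FREQUENCY = 440.0    # Hz, arbitrary test tone
NUM_SAMPLES = 80     # 10 ms of audio

# Left channel: a plain sine tone.
left = [
    math.sin(2 * math.pi * FREQUENCY * n / SAMPLE_RATE)
    for n in range(NUM_SAMPLES)
]
# Right channel: the same tone inverted (180 degrees out of phase).
right = [-sample for sample in left]


def downmix_to_mono(a, b):
    """Average two equal-length channels sample by sample."""
    return [(x + y) / 2 for x, y in zip(a, b)]


mono = downmix_to_mono(left, right)
peak_left = max(abs(s) for s in left)
peak_mono = max(abs(s) for s in mono)

print("peak of left channel:", peak_left)
print("peak after downmix:", peak_mono)
```

Each channel carries a clearly audible tone on its own, yet the downmixed signal is silent, which is why phase relationships between channels matter when audio is combined.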
Support for Non-English Languages Coming Soon
Our Deep Learning team has been hard at work training our new non-English language models. In the coming weeks, we will be adding support for French, German, Italian, and Spanish.
Shorter Summaries Added to Auto Chapters, Improved Filler Word Detection
Added a new gist key to the Auto Chapters feature. This key provides an ultra-short summary, usually 3 to 8 words, of the content spoken during that chapter.
Added profanity filtering to Auto Chapters, which prevents the API from generating a summary, headline, or gist that includes profanity.
Improved Filler Word (a.k.a. disfluency) detection by ~5%.
Improved accuracy for Real-Time Streaming Transcription.
Fixed an edge case where WebSocket connections for Real-Time Transcription sessions would occasionally not close properly after the session was terminated. This resulted in the client receiving a 4031 error code even after sending a session termination message.
Corrected a bug that occasionally attributed disfluencies to the wrong utterance when Speaker Labels or Dual-Channel Transcription was enabled.