New Node/JavaScript SDK works in multiple runtimes
We’ve released v4 of our Node/JavaScript SDK. Previously, the SDK was developed specifically for Node, but the latest version now works in additional runtimes without any extra steps. The SDK can now be used in the browser, Deno, Bun, Cloudflare Workers, and more.
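As a rough sketch of what this looks like in practice, the snippet below creates a client and submits a transcription request; the same code runs unchanged in Node, Deno, Bun, the browser, or a Cloudflare Worker. The method and parameter names shown (`transcripts.create`, `audio_url`) are illustrative, so refer to the SDK reference for the exact API.

```typescript
// The same v4 SDK code runs in Node, Deno, Bun, the browser, or a
// Cloudflare Worker; no runtime-specific setup is required.
import { AssemblyAI } from "assemblyai";

const client = new AssemblyAI({
  // In a Worker or browser, load the key from an env binding or use a temporary token.
  apiKey: "<YOUR_API_KEY>",
});

// Submit an async transcription request (method and parameter names are illustrative).
const transcript = await client.transcripts.create({
  audio_url: "https://example.com/audio.mp3",
});

console.log(transcript.text);
```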
New Punctuation Restoration and Truecasing models, PCM Mu-law support
We’ve released new Punctuation Restoration and Truecasing models, achieving significant improvements for acronyms, mixed-case words, and more.
Below is a visual comparison between our previous Punctuation Restoration and Truecasing models (red) and the new models (green):
Going forward, the new Punctuation Restoration and Truecasing models will automatically be used for async and real-time transcriptions, with no need to upgrade or request special access. Use the punctuate and format_text parameters, respectively, to enable or disable the models in a request (both are enabled by default).
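As an illustration, the request below submits an async transcription with both flags set explicitly; since both default to true, you would normally only pass them to disable a model.

```typescript
// Async transcription request controlling the Punctuation Restoration
// (punctuate) and Truecasing (format_text) models; both default to true.
const response = await fetch("https://api.assemblyai.com/v2/transcript", {
  method: "POST",
  headers: {
    authorization: "<YOUR_API_KEY>",
    "content-type": "application/json",
  },
  body: JSON.stringify({
    audio_url: "https://example.com/audio.mp3",
    punctuate: true,   // Punctuation Restoration on (set false to disable)
    format_text: true, // Truecasing on (set false to disable)
  }),
});

const transcript = await response.json();
console.log(transcript.id, transcript.status);
```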
Our real-time transcription service now supports PCM Mu-law, an encoding used primarily in the telephony industry. Set this encoding with the `encoding` parameter in requests to our API. You can read more about our PCM Mu-law support here.
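For illustration, here is a minimal sketch of opening a real-time session over WebSocket with the encoding set. The `pcm_mulaw` value and the temporary-token query parameter are assumptions; confirm the exact accepted values in the real-time docs.

```typescript
// Open a real-time session streaming 8 kHz Mu-law audio. The encoding value
// (pcm_mulaw) and the token query parameter are assumptions; see the
// real-time docs for the exact accepted values.
const url =
  "wss://api.assemblyai.com/v2/realtime/ws" +
  "?sample_rate=8000" +
  "&encoding=pcm_mulaw" +
  "&token=<TEMPORARY_TOKEN>"; // temporary tokens avoid exposing your API key

const socket = new WebSocket(url);

socket.onmessage = (event) => {
  const message = JSON.parse(event.data);
  if (message.message_type === "FinalTranscript") {
    console.log(message.text);
  }
};
```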
We have improved internal reporting for our transcription service, which will allow us to better monitor traffic.
New LeMUR parameter, reduced hold music hallucinations
Users can now pass custom text inputs directly to LeMUR through the input_text parameter as an alternative to transcript IDs. This gives users the ability to use any information from the async API, formatted however they want, with LeMUR for maximum flexibility.
For example, users can assign action items per user by inputting speaker-labeled transcripts, or pull citations by inputting timestamped transcripts. Learn more about the new input_text parameter in our LeMUR API reference, or check out examples of how to use the input_text parameter in the AssemblyAI Cookbook.
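As a quick sketch, the request below sends a speaker-labeled transcript to LeMUR’s /task endpoint via input_text instead of transcript IDs. The endpoint path and response shape shown here follow the LeMUR API reference, but treat the exact field names as something to verify there.

```typescript
// Call LeMUR's /task endpoint with input_text instead of transcript_ids,
// using a speaker-labeled transcript assembled from async API output.
const speakerLabeledTranscript = `
Speaker A: Let's finalize the launch date by Friday.
Speaker B: I'll draft the announcement and send it for review.
`;

const lemurResponse = await fetch("https://api.assemblyai.com/lemur/v3/generate/task", {
  method: "POST",
  headers: {
    authorization: "<YOUR_API_KEY>",
    "content-type": "application/json",
  },
  body: JSON.stringify({
    prompt: "List the action items for each speaker.",
    input_text: speakerLabeledTranscript,
  }),
});

const result = await lemurResponse.json();
console.log(result.response); // LeMUR's answer to the prompt
```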
We’ve made improvements that reduce the hallucinations that sometimes occurred when transcribing hold music on phone calls. This improvement is effective immediately, with no changes required by users.
We’ve fixed an issue in which the LeMUR /task endpoint would sometimes be unable to fulfill a request when it returned XML.
Reduced latency, improved error messaging
We’ve made improvements to our file downloading pipeline which reduce transcription latency. Latency has been reduced by at least 3 seconds for all audio files, with greater improvements for large audio files provided via external URLs.
We’ve improved error messaging for increased clarity in the case of internal server errors.
New Dashboard features and LeMUR fix
We have released the beta for our new usage dashboard. You can now see a usage summary broken down by async transcription, real-time transcription, Audio Intelligence, and LeMUR. Additionally, you can see charts of usage over time broken down by model.
We have added support for AWS Marketplace on the dashboard/account management pages of our web application.
We have fixed an issue in which LeMUR would sometimes fail when handling extremely short transcripts.