Google AI video transcription

9/22/2023

Auto Speech Adaptation is switched off by default; you’ll find it in the Dialogflow console.

Google recently launched in preview premium speech-to-text models tuned to specific use cases, and in February it made one of them - a phone model optimized for two- to four-person conversations - generally available. The Mountain View company claimed at the time that this model had 62% fewer transcription errors compared with its predecessor’s 54%.

Today, Google revealed that its engineers have further optimized the model for short utterances in U.S. The model is now 15% more accurate relative to the previously announced improvements. “Applying speech adaptation can also provide additional improvements on top of that gain,” wrote Aharon and Misra. “We’re constantly adding more quality improvements to the roadmap - an automatic benefit to any IVR or phone-based virtual agent, without any code changes needed - and will share more about these updates in the future.”

Better transcription and endless streaming

Increased contextual awareness and enhanced speech-to-text aren’t the only natural language understanding improvements coming down the Contact Center AI pipeline. Google debuted in beta today “richer” manual speech adaptation and entity classes, in addition to expanded phrase limits, endless streaming, and more.

There’s a trio of new features within SpeechContext parameters, the collection of Cloud Speech-to-Text settings and toggles that tailor transcriptions to businesses’ and verticals’ vernaculars. SpeechContext classes - prebuilt entities reflecting concepts like digit sequences, addresses, numbers, and money denominations - optimize ASR for a list of words at once. SpeechContext boost helps adjust speech adaptation strength while cutting down on the number of false positives - i.e., cases where a phrase wasn’t mentioned but appears in a transcript. Lastly, SpeechContext now supports up to 5,000 phrase hints per API request (up from 500), increasing the probability that uncommon words or phrases will be captured by ASR.

Perhaps more significantly, Cloud Speech-to-Text, which since launch has only supported streaming audio in one-minute increments, can now process sessions up to five minutes in length and resume streaming where the previous session left off. (Google notes that this effectively makes live automatic transcription infinite in length.) Additionally, Cloud Speech-to-Text now natively supports the MP3 file format; previously, MP3 files had to be expanded into the LINEAR16 format prior to processing.

“We’re excited to see how these improvements to speech recognition improve the customer experience for contact centers of all shapes and sizes - whether you’re working with one of our partners to deploy the Contact Center AI solution or taking a DIY approach using our conversational AI suite,” wrote Aharon and Misra.
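
To make the SpeechContext settings and the five-minute session limit more concrete, here is a minimal sketch in plain Python. It makes no network calls and does not use Google's client library; it only builds a request body as a dictionary and plans how a long stream could be cut into consecutive sessions. The field names (`speechContexts`, `phrases`, `boost`, `encoding`), the `MP3` value, the sample rate, and the `$OOV_CLASS_DIGIT_SEQUENCE` class token are assumptions based on how the public REST API is commonly documented, and the helper functions are hypothetical, so check everything against the current reference before relying on it.

```python
from typing import Dict, Iterator, List, Tuple

MAX_PHRASE_HINTS = 5000        # per-request limit quoted in the article
MAX_SESSION_SECONDS = 5 * 60   # per-session streaming limit quoted in the article


def build_recognition_config(phrases: List[str], boost: float) -> Dict:
    """Return a REST-style recognition config as a plain dict (hypothetical helper)."""
    if len(phrases) > MAX_PHRASE_HINTS:
        raise ValueError(f"{len(phrases)} phrase hints exceeds {MAX_PHRASE_HINTS}")
    return {
        "encoding": "MP3",         # assumed enum name for native MP3 input
        "sampleRateHertz": 16000,  # assumed sample rate for this example
        "languageCode": "en-US",
        # One speech context: phrase hints plus a boost that weights them.
        "speechContexts": [{"phrases": phrases, "boost": boost}],
    }


def plan_sessions(
    total_seconds: float, limit: float = MAX_SESSION_SECONDS
) -> Iterator[Tuple[float, float]]:
    """Yield (start, end) offsets so each session resumes where the last one ended."""
    if limit <= 0:
        raise ValueError("limit must be positive")
    start = 0.0
    while start < total_seconds:
        end = min(start + limit, total_seconds)
        yield (start, end)
        start = end


config = build_recognition_config(
    # The second hint uses an assumed class token standing in for any digit sequence.
    ["Contact Center AI", "my order number is $OOV_CLASS_DIGIT_SEQUENCE"],
    boost=10.0,
)
sessions = list(plan_sessions(12 * 60 + 30))  # a 12.5-minute call
```

With these assumptions, a 12.5-minute call is covered by three back-to-back sessions, which is the idea behind "endless" streaming: no single session exceeds the limit, and each one starts at the offset where the previous one stopped.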
