Automatic Speech Recognition


Release Notes

Last updated: 2024-12-11 18:03:16
Tencent Cloud Automatic Speech Recognition (ASR) provides highly cost-effective speech recognition services. It is widely used by Tencent businesses such as WeChat, Honor of Kings, and Tencent Video, and supports multiple use cases, including recording quality inspection, real-time meeting transcription, and voice input methods.

Features

Real-time speech recognition

Recognizes real-time audio streams and converts speech to text instantly. It is suitable for real-time audio streaming scenarios such as voice input and phone bots.

Recording file recognition

Recognizes recording files and supports asynchronous processing of long audio recordings. It is suitable for long-audio scenarios such as customer service quality inspection and subtitle generation.

Strengths

Massive data accumulation

Drawing on Tencent's vast social data platform, ASR has accumulated hundreds of thousands of hours of annotated voice data in a rich and diverse corpus, laying the data foundation for high recognition accuracy.

Industry-leading algorithms

ASR is built on multiple sequence-modeling neural network structures (LSTM, attention models, and DeepCNN) and is trained with multitask learning. Combined with the T/S approach, it delivers industry-leading recognition accuracy in both general and vertical fields.

Cross-platform support

ASR provides RESTful APIs and SDKs and supports a wide variety of devices and terminals, including smart hardware, mobile applications, websites, desktop clients, and IoT devices.

Support for multiple languages

ASR currently supports speech recognition in Mandarin and English, with more languages to come.

Excellent recognition performance in noisy environments

ASR features robust recognition models, high recognition accuracy, and strong noise resistance. It can recognize speech in noisy environments without the need for noise reduction processing.

Well-proven capabilities

ASR has been thoroughly validated by Tencent's internal businesses such as WeChat, Tencent Video, and Honor of Kings. It also supports many external use cases for customers in the internet, finance, education, and other industries, serving billions of users every day with stable performance.

Application Scenarios

Voice input method

ASR makes smart voice input possible through real-time speech recognition, which saves users time and improves the input experience.

Meeting minutes

Audio from conferences, court trials, and interviews can be converted to text by the real-time speech recognition service, which reduces the cost of manual recording and improves efficiency.

Call quality inspection

Conversations with customer service representatives can be converted to text by the real-time speech recognition service, which allows quality inspection to cover the full content and improves its efficiency.
