Preprocess video files to improve file transcription efficiency (for audio file recognition scenarios)
The Paraformer speech recognition API accepts video files, but they are typically large and slow to transfer. Before transcription, extract the audio track that speech recognition needs and compress it. This significantly reduces file size and improves transcription throughput. Use ffmpeg for this preprocessing.
Prerequisites
Install ffmpeg from ffmpeg.org.
Preprocess video files
Use ffmpeg to extract the audio track, downmix it to mono (-ac 1), downsample it to 16 kHz (-ar 16000), and compress it with the Opus codec (-acodec libopus).
ffmpeg -i input-video-file -ac 1 -ar 16000 -acodec libopus output-audio-file.opus
The output audio file will be significantly smaller than the input video. Submit the audio file (via URL) to the file transcription API for speech recognition results.
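When several videos need the same treatment, the ffmpeg command can be wrapped in a small script. The following is a minimal sketch, not an official tool: the helper names opus_name and extract_audio are hypothetical, and it assumes a POSIX shell, that ffmpeg is on the PATH, and that each input file name has an extension to replace.

```shell
#!/bin/sh
# Sketch: convert each video given on the command line to a 16 kHz mono Opus file.
# opus_name and extract_audio are illustrative helpers, not part of ffmpeg.

# Derive the output name by replacing the last extension with .opus.
# Assumes the file name contains an extension (e.g. talk.mp4 -> talk.opus).
opus_name() {
    printf '%s.opus\n' "${1%.*}"
}

# Run the same ffmpeg command as in the text on a single input file.
extract_audio() {
    ffmpeg -i "$1" -ac 1 -ar 16000 -acodec libopus "$(opus_name "$1")"
}

# Process every argument; report failures and continue with the next file.
for video in "$@"; do
    extract_audio "$video" || echo "failed: $video" >&2
done
```

Saved under a name such as extract_audio.sh (an arbitrary choice), it could be run as: sh extract_audio.sh meeting.mp4 lecture.mkv. Each resulting .opus file would then be uploaded and submitted by URL as described above.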