Basic information:
- Patent title: TRAINED GENERATIVE MODEL SPEECH CODING
- Application number: PCT/US2021/070064 Filing date: 2021-01-22
- Publication number: WO2022159247A1 Publication date: 2022-07-28
- Inventors: KLEIJN, Willem Bastiaan; STORUS, Andrew
- Applicant: GOOGLE LLC
- Applicant address: 1600 Amphitheatre Parkway
- Patentee: GOOGLE LLC
- Current patentee: GOOGLE LLC
- Current patentee address: 1600 Amphitheatre Parkway
- Agent: BELLERMANN, Mark R.W. et al.
- Main classification: G10L21/02
- IPC classification: G10L21/02
Abstract:
A method includes receiving sampled audio data corresponding to utterances and training a machine learning (ML) model, using the sampled audio data, to generate a high-fidelity audio stream from a low bitrate input bitstream. The training of the ML model includes de-emphasizing the influence of low-probability distortion events in the sampled audio data on the trained ML model, where the de-emphasizing of the distortion events is achieved by the inclusion of a term in an objective function of the ML model, which term encourages low-variance predictive distributions of a next sample in the sampled audio data, based on previous samples of the audio data.
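As a rough illustration of the kind of objective the abstract describes, the sketch below adds a variance penalty to a standard negative log-likelihood loss for next-sample prediction. It is not the formulation claimed in the patent: the Gaussian predictive distribution, the function names `gaussian_nll` and `regularized_objective`, the weight `lam`, and the use of the raw variance as the penalty are all assumptions made for illustration.

```python
import math


def gaussian_nll(x, mu, sigma):
    """Negative log-likelihood of sample x under N(mu, sigma^2)."""
    return 0.5 * math.log(2.0 * math.pi * sigma ** 2) + (x - mu) ** 2 / (2.0 * sigma ** 2)


def regularized_objective(samples, mus, sigmas, lam=0.1):
    """Mean NLL plus lam times the mean predictive variance.

    samples: observed next samples
    mus, sigmas: predicted mean and standard deviation for each sample,
                 assumed to come from a model conditioned on previous samples
    lam: hypothetical weight of the variance penalty (not from the source)
    """
    n = len(samples)
    nll = sum(gaussian_nll(x, m, s) for x, m, s in zip(samples, mus, sigmas)) / n
    # Penalty term: larger predictive variance costs more, which pushes the
    # model toward low-variance predictive distributions.
    variance_penalty = sum(s ** 2 for s in sigmas) / n
    return nll + lam * variance_penalty
```

With `lam=0` this reduces to ordinary maximum-likelihood training. Increasing `lam` makes it more costly for the model to widen its predictive distribution in order to accommodate rare outliers, which is one plausible way to reduce the influence of low-probability distortion events.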
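IPC classification hierarchy:
G | Physics |
--G10 | Musical instruments; acoustics |
----G10L | Speech analysis or synthesis; speech recognition |
------G10L21/00 | Processing of the speech signal to produce another audible or non-audible signal, e.g. visual or tactile, in order to modify its quality or its intelligibility |
--------G10L21/02 | .Speech enhancement, e.g. noise reduction or echo cancellation |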