Basic information:
- Patent title: DELIBERATION MODEL-BASED TWO-PASS END-TO-END SPEECH RECOGNITION
- Application number: PCT/US2021/013449; Filing date: 2021-01-14
- Publication number: WO2021150424A1; Publication date: 2021-07-29
- Inventors: HU, Ke; SAINATH, Tara, N.; PANG, Ruoming; PRABHAVALKAR, Rohit, Prakash
- Applicant: GOOGLE LLC
- Applicant address: 1600 Amphitheatre Parkway
- Patentee: GOOGLE LLC
- Current patentee: GOOGLE LLC
- Current patentee address: 1600 Amphitheatre Parkway
- Agent: KRUEGER, Brett, A.
- Priority: US62/963,721 2020-01-21
- Main classification: G10L15/16
- IPC classification: G10L15/16; G10L15/32; G06N3/04; G10L15/02; G06N3/0445; G06N3/049; G06N3/084; G06N3/088; G10L15/063; G10L15/1815; G10L15/187; G10L19/0018; G10L2015/081; G10L2015/085
Abstract:
A method of performing speech recognition using a two-pass deliberation architecture includes receiving a first-pass hypothesis and an encoded acoustic frame and encoding the first-pass hypothesis at a hypothesis encoder. The first-pass hypothesis is generated by a recurrent neural network (RNN) decoder model for the encoded acoustic frame. The method also includes generating, using a first attention mechanism attending to the encoded acoustic frame, a first context vector, and generating, using a second attention mechanism attending to the encoded first-pass hypothesis, a second context vector. The method also includes decoding the first context vector and the second context vector at a context vector decoder to form a second-pass hypothesis.
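The two-attention step described in the abstract can be illustrated with a minimal sketch. It is not the patented implementation: dot-product attention, the use of a decoder state as the query, and concatenation as the way the two context vectors are combined are all assumptions made for illustration, and the names `attend` and `deliberation_step` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attend(query, keys):
    """Dot-product attention (assumed form): returns the weighted
    average of `keys`, which also serve as the values here."""
    scores = [sum(q * k for q, k in zip(query, key)) for key in keys]
    weights = softmax(scores)
    dim = len(keys[0])
    return [sum(w * key[i] for w, key in zip(weights, keys)) for i in range(dim)]


def deliberation_step(decoder_state, encoded_frames, encoded_hypothesis):
    """Builds the input to the context vector decoder for one step.

    The first attention attends to the encoded acoustic frames, the
    second to the encoded first-pass hypothesis. Concatenating the two
    context vectors is an assumption; the abstract only says both are
    decoded together.
    """
    acoustic_context = attend(decoder_state, encoded_frames)
    hypothesis_context = attend(decoder_state, encoded_hypothesis)
    return acoustic_context + hypothesis_context


# Toy inputs: three encoded acoustic frames and two encoded hypothesis tokens.
encoded_frames = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
encoded_hypothesis = [[0.5, 0.5], [0.2, 0.8]]
state = [0.1, 0.3]

combined = deliberation_step(state, encoded_frames, encoded_hypothesis)
print(len(combined))  # 4: two context vectors of dimension 2, concatenated
```

In a full system the combined vector would be fed to a trained decoder that emits the second-pass hypothesis token by token; that decoder is omitted here.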