Quick Patent Search: search global patents quickly, free patent database for commercial use - IPRDB

1. Invention application

US20200227023A1 SYSTEM AND METHOD FOR PROSODICALLY MODIFIED UNIT SELECTION DATABASES (status: pending, published)
Publication No.: US20200227023A1
Publication date: 2020-07-16
Application No.: US16828070
Filing date: 2020-03-24
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
IPC classification: G10L13/06, G11C7/16, G10L15/06, G10L15/02, G10L21/003, G10L25/90
Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.
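
The select, compare, modify, store loop in this abstract can be illustrated with a minimal sketch. It is not the patented implementation: the `SpeechUnit` type, the per-frame pitch tuple used as the "prosodic curve", and both function names are assumptions made for illustration, and the audio resynthesis that a real system would need is replaced by simply swapping the contour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SpeechUnit:
    phone: str
    f0: tuple  # hypothetical prosodic curve: pitch in Hz, one value per frame


def modify_to_target(unit, target_f0):
    """Return a copy of `unit` whose contour matches the desired contour.

    Assumption: only the stored contour is changed; a real system would
    also apply signal-level pitch modification to the audio.
    """
    if len(target_f0) != len(unit.f0):
        raise ValueError("contours must have the same number of frames")
    return SpeechUnit(unit.phone, tuple(target_f0))


def synthesize_and_augment(database, phones, desired_curves):
    """Select one unit per phone, match it to the desired curve, store new units."""
    output = []
    for phone, target in zip(phones, desired_curves):
        # Naive selection: first unit in the database with the right phone.
        unit = next(u for u in database if u.phone == phone)
        if unit.f0 != tuple(target):
            unit = modify_to_target(unit, target)
            if unit not in database:
                database.append(unit)  # increases prosodic coverage for later requests
        output.append(unit)
    return output


db = [SpeechUnit("a", (100.0, 110.0)), SpeechUnit("b", (120.0, 120.0))]
result = synthesize_and_augment(db, ["a", "b"], [(105.0, 115.0), (120.0, 120.0)])
```

After the call, `db` holds the two original units plus the modified variant of "a", so a later request for the same contour would find a matching unit without modification.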

2. Granted invention patent

US09997154B2 System and method for prosodically modified unit selection databases (status: in force)
Publication No.: US09997154B2
Publication date: 2018-06-12
Application No.: US14275349
Filing date: 2014-05-12
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
IPC classification: G10L19/12, G10L13/06, G10L21/003, G11C7/16, G10L25/90, G10L15/06, G10L15/02
CPC classification: G10L13/06, G10L15/02, G10L15/063, G10L21/003, G10L25/90, G10L2015/0635, G11C7/16
Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

3. Granted invention patent

US09218815B2 System and method for dynamic facial features for speaker recognition (status: in force)
Publication No.: US09218815B2
Publication date: 2015-12-22
Application No.: US14551907
Filing date: 2014-11-24
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Ann K. Syrdal, Sumit Chopra, Patrick Haffner, Taniya Mishra, Ilija Zeljkovic, Eric Zavesky
IPC classification: G06K9/00, G10L17/24, G06F21/32
CPC classification: G10L15/25, G06F21/32, G06F2221/2103, G06K9/00255, G06K9/00281, G06K9/00288, G06K9/00315, G06K9/00335, G10L17/24, G10L21/06
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for performing speaker verification. A system configured to practice the method receives a request to verify a speaker, generates a text challenge that is unique to the request, and, in response to the request, prompts the speaker to utter the text challenge. Then the system records a dynamic image feature of the speaker as the speaker utters the text challenge, and performs speaker verification based on the dynamic image feature and the text challenge. Recording the dynamic image feature of the speaker can include recording video of the speaker while speaking the text challenge. The dynamic feature can include a movement pattern of head, lips, mouth, eyes, and/or eyebrows of the speaker. The dynamic image feature can relate to phonetic content of the speaker speaking the challenge, speech prosody, and the speaker's facial expression responding to content of the challenge.

4. Invention application

US20140358540A1 System and Method for Adapting Automatic Speech Recognition Pronunciation by Acoustic Model Restructuring (status: in force)
Publication No.: US20140358540A1
Publication date: 2014-12-04
Application No.: US14459696
Filing date: 2014-08-14
Applicant: AT&T INTELLECTUAL PROPERTY I, L.P.
Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
IPC classification: G10L15/07, G10L15/06
CPC classification: G10L17/14, G10L15/063, G10L15/07, G10L15/14, G10L15/187, G10L15/265, G10L15/30, G10L2015/025
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally, the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.
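
The "weighted sum of acoustic models" idea can be made concrete with a small sketch. Everything here is an assumption for illustration: the base acoustic models are one-dimensional Gaussians, the lattice is reduced to a list of (dictionary phoneme, plausible phoneme) pairs, and the weights are plain relative frequencies. The abstract does not specify how the weights are estimated.

```python
import math
from collections import Counter, defaultdict


def gaussian(mean, var):
    """Hypothetical stand-in for a native-speech acoustic model of one phoneme."""
    def pdf(x):
        return math.exp(-((x - mean) ** 2) / (2 * var)) / math.sqrt(2 * math.pi * var)
    return pdf


def estimate_weights(aligned_pairs):
    """Relative frequency of each plausible phoneme per dictionary phoneme."""
    counts = defaultdict(Counter)
    for dict_phoneme, heard_phoneme in aligned_pairs:
        counts[dict_phoneme][heard_phoneme] += 1
    return {
        d: {h: n / sum(c.values()) for h, n in c.items()}
        for d, c in counts.items()
    }


def custom_likelihood(weights, base_models, dict_phoneme, x):
    """Likelihood under the custom model: a weighted sum of native models.

    The pronouncing dictionary is untouched; only the acoustic model
    behind each dictionary phoneme changes.
    """
    return sum(w * base_models[h](x) for h, w in weights[dict_phoneme].items())


# A speaker who mostly realizes dictionary /th/ as something close to native /t/.
pairs = [("th", "t"), ("th", "t"), ("th", "th"), ("th", "t")]
weights = estimate_weights(pairs)
base_models = {"t": gaussian(0.0, 1.0), "th": gaussian(2.0, 1.0)}
score = custom_likelihood(weights, base_models, "th", 0.0)
```

In this toy setup an observation near the native /t/ model scores higher for dictionary /th/ under the custom model than under the native /th/ model alone, which is the intended effect of the restructuring.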

5. Granted invention patent

US10249290B2 System and method for prosodically modified unit selection databases (status: in force)
Publication No.: US10249290B2
Publication date: 2019-04-02
Application No.: US16004812
Filing date: 2018-06-11
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
IPC classification: G10L19/12, G10L13/06, G10L21/003, G11C7/16, G10L25/90, G10L15/06, G10L15/02
Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

6. Granted invention patent

US09305547B2 System and method for adapting automatic speech recognition pronunciation by acoustic model restructuring (status: in force)
Publication No.: US09305547B2
Publication date: 2016-04-05
Application No.: US14698183
Filing date: 2015-04-28
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej Ljolje, Alistair D. Conkie, Ann K. Syrdal
IPC classification: G10L15/04, G10L15/187, G10L15/07, G10L15/06, G10L15/14
CPC classification: G10L17/14, G10L15/063, G10L15/07, G10L15/14, G10L15/187, G10L15/265, G10L15/30, G10L2015/025
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally, the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

7. Granted invention patent

US10607594B2 System and method for prosodically modified unit selection databases (status: in force)
Publication No.: US10607594B2
Publication date: 2020-03-31
Application No.: US16369882
Filing date: 2019-03-29
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Ladan Golipour, Ann K. Syrdal
IPC classification: G10L13/06, G10L19/12, G10L25/90, G10L19/04, G10L19/18, G11C7/16, G10L15/06, G10L15/02, G10L21/003
Abstract: Systems, methods, and computer-readable storage devices to improve the quality of synthetic speech generation. A system selects speech units from a speech unit database, the speech units corresponding to text to be converted to speech. The system identifies a desired prosodic curve of speech produced from the selected speech units, and also identifies an actual prosodic curve of the speech units. The selected speech units are modified such that a new prosodic curve of the modified speech units matches the desired prosodic curve. The system stores the modified speech units into the speech unit database for use in generating future speech, thereby increasing the prosodic coverage of the database with the expectation of improving the output quality.

8. Invention application

US20150243282A1 System and Method for Adapting Automatic Speech Recognition Pronunciation by Acoustic Model Restructuring (status: in force)
Publication No.: US20150243282A1
Publication date: 2015-08-27
Application No.: US14698183
Filing date: 2015-04-28
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Andrej LJOLJE, Alistair D. CONKIE, Ann K. Syrdal
IPC classification: G10L15/187, G10L15/06, G10L15/14
CPC classification: G10L17/14, G10L15/063, G10L15/07, G10L15/14, G10L15/187, G10L15/265, G10L15/30, G10L2015/025
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for recognizing speech by adapting automatic speech recognition pronunciation by acoustic model restructuring. The method identifies an acoustic model and a matching pronouncing dictionary trained on typical native speech in a target dialect. The method collects speech from a new speaker resulting in collected speech and transcribes the collected speech to generate a lattice of plausible phonemes. Then the method creates a custom speech model for representing each phoneme used in the pronouncing dictionary by a weighted sum of acoustic models for all the plausible phonemes, wherein the pronouncing dictionary does not change, but the model of the acoustic space for each phoneme in the dictionary becomes a weighted sum of the acoustic models of phonemes of the typical native speech. Finally, the method includes recognizing via a processor additional speech from the target speaker using the custom speech model.

9. Granted invention patent

US08965767B2 System and method for synthetic voice generation and modification (status: in force)
Publication No.: US08965767B2
Publication date: 2015-02-24
Application No.: US14282035
Filing date: 2014-05-20
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Ann K. Syrdal
IPC classification: G10L13/00, G10L13/08, G10L13/027, G10L13/06, H04B7/04, H04B7/06, H04W72/04
CPC classification: G10L13/043, G10L13/027, G10L13/047, G10L13/06, G10L25/63, H04B7/0404, H04B7/0697, H04W72/0413
Abstract: Disclosed herein are systems, methods, and non-transitory computer-readable storage media for generating a synthetic voice. A system configured to practice the method combines a first database of a first text-to-speech voice and a second database of a second text-to-speech voice to generate a combined database, selects from the combined database, based on a policy, voice units of a phonetic category for the synthetic voice to yield selected voice units, and synthesizes speech based on the selected voice units. The system can synthesize speech without parameterizing the first text-to-speech voice and the second text-to-speech voice. A policy can define, for a particular phonetic category, from which text-to-speech voice to select voice units. The combined database can include multiple text-to-speech voices from different speakers. The combined database can include voices of a single speaker speaking in different styles. The combined database can include voices of different languages.
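
A policy of the kind described in this abstract can be pictured as a mapping from phonetic category to source voice. The sketch below rests on assumptions: the `VoiceUnit` fields, the two category names, and the first-match selection are all hypothetical, and the waveform concatenation that would follow is left out.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class VoiceUnit:
    voice: str     # which text-to-speech voice the unit was recorded for
    phone: str
    category: str  # hypothetical phonetic category, e.g. "vowel" or "consonant"


def combine(*databases):
    """Merge several voice databases into one combined unit list."""
    return [unit for db in databases for unit in db]


def select_units(combined, policy, phones):
    """For each phone, take the unit from the voice the policy assigns to its category."""
    selected = []
    for phone in phones:
        match = next(
            (u for u in combined
             if u.phone == phone and policy.get(u.category) == u.voice),
            None,
        )
        if match is None:
            raise LookupError(f"no unit for phone {phone!r} allowed by the policy")
        selected.append(match)
    return selected


voice_a = [VoiceUnit("A", "a", "vowel"), VoiceUnit("A", "s", "consonant")]
voice_b = [VoiceUnit("B", "a", "vowel"), VoiceUnit("B", "s", "consonant")]
policy = {"vowel": "A", "consonant": "B"}  # vowels from voice A, consonants from voice B

chosen = select_units(combine(voice_a, voice_b), policy, ["s", "a"])
```

Here `chosen` mixes units from both voices within one utterance, and the units are picked as stored, without converting either voice to a parametric representation.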

10. Granted invention patent

US09564121B2 System and method for generalized preselection for unit selection synthesis (status: in force)
Publication No.: US09564121B2
Publication date: 2017-02-07
Application No.: US14454123
Filing date: 2014-08-07
Applicant: AT&T Intellectual Property I, L.P.
Inventors: Alistair D. Conkie, Mark Beutnagel, Yeon-Jun Kim, Ann K. Syrdal
IPC classification: G10L13/06, G10L13/047, G10L13/00
CPC classification: G10L13/06, G10L13/00, G10L13/047
Abstract: Disclosed herein are systems, computer-implemented methods, and computer-readable storage media for unit selection synthesis. The method causes a computing device to add a supplemental phoneset to a speech synthesizer front end having an existing phoneset, modify a unit preselection process based on the supplemental phoneset, preselect units from the supplemental phoneset and the existing phoneset based on the modified unit preselection process, and generate speech based on the preselected units. The supplemental phoneset can be a variation of the existing phoneset, can include a word boundary feature, can include a cluster feature where initial consonant clusters and some word boundaries are marked with diacritics, can include a function word feature which marks units as originating from a function word or a content word, and/or can include a pre-vocalic or post-vocalic feature. The speech synthesizer front end can incorporate the supplemental phoneset as an extra feature.
