    • 4. Invention application publication
    • A system and method of providing an automated data-collection in spoken dialog systems
    • System und Verfahren zur Bereitstellung einer automatisierten Datensammlung in Sprachdialogsystemen
    • EP1679693A1
    • 2006-07-12
    • EP06100060.0
    • 2006-01-04
    • AT&T Corp.
    • Di Fabbrizio, Giuseppe; Hakkani-Tur, Dilek Z.; Rahim, Mazin G.; Renger, Bernard S.; Tur, Gokhan
    • G10L15/22; G10L15/18
    • G10L15/063; G10L15/183; G10L15/22
    • The invention relates to a system and method for gathering data for use in a spoken dialog system. An aspect of the invention is generally referred to as an automated hidden human that performs data collection automatically at the beginning of a conversation with a user in a spoken dialog system. The method comprises presenting an initial prompt to a user, recognizing a received user utterance using an automatic speech recognition engine and classifying the recognized user utterance using a spoken language understanding module. If the recognized user utterance is not understood or classifiable to a predetermined acceptance threshold, then the method re-prompts the user. If the recognized user utterance is not classifiable to a predetermined rejection threshold, then the method transfers the user to a human as this may imply a task-specific utterance. The received and classified user utterance is then used for training the spoken dialog system.
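The two-threshold routing described in the abstract of EP1679693A1 can be sketched as follows. This is only one possible reading of the abstract, not the patented implementation: the class name `Collector`, the threshold values, and the action names are all hypothetical, and the sketch assumes the spoken language understanding module returns a single label with a confidence score between 0 and 1.

```python
from dataclasses import dataclass, field


@dataclass
class Collector:
    # Hypothetical threshold values; the abstract only says they are predetermined.
    accept_threshold: float = 0.7
    reject_threshold: float = 0.3
    training_log: list = field(default_factory=list)

    def handle(self, text: str, label: str, confidence: float) -> str:
        """Log one recognized, classified utterance and pick the next dialog action."""
        # Every received and classified utterance is kept for later training.
        self.training_log.append((text, label, confidence))
        if confidence >= self.accept_threshold:
            return "accept"
        if confidence < self.reject_threshold:
            # Assumed reading: very low confidence may indicate a
            # task-specific utterance, so hand the caller to a human.
            return "transfer_to_human"
        return "reprompt"
```

For example, `Collector().handle("hello", "greeting", 0.5)` falls between the two assumed thresholds and returns `"reprompt"`.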
    • 5. Invention application publication
    • Spoken language understanding that incorporates prior knowledge into boosting
    • Sprachverständnis mit Vorwissen zur Erhöhung
    • EP1280136A1
    • 2003-01-29
    • EP02254994.3
    • 2002-07-16
    • AT&T Corp.
    • Alshawi, Hiyan; Di Fabrizio, Giuseppe; Gupta, Nagendra K.; Rahim, Mazin G.; Schapire, Robert Elias; Yoram, Singer
    • G10L15/06
    • G10L15/063
    • A system for understanding entries, such as speech, develops a classifier by employing prior knowledge with which a given corpus of training entries is enlarged threefold. The prior knowledge is embodied in a rule, combined from separate rules created for each label outputted by the classifier, each of which includes a weight measure p(x). A first set of created entries for increasing the corpus of training entries is created by attaching all labels to each entry of the original corpus of training entries, with a weight h p(x), or h(1-p(x)), in association with each label that meets, or fails to meet, the condition specified for the label, h being a preselected positive number. The second set is created by not attaching any of the labels to each entry of the original corpus of training entries, with a weight of h(1-p(x)), or h p(x), in association with each label that meets, or fails to meet, the condition specified for the label.
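The threefold corpus enlargement described in the abstract of EP1280136A1 can be illustrated with a small sketch. It follows the weights stated in the abstract but makes several assumptions that are not in the source: the function name `augment_corpus` is hypothetical, each rule is simplified to a (condition, p) pair with a constant p rather than a function p(x), the original entries are given unit weight, and "attached" and "not attached" are encoded as +1 and -1.

```python
def augment_corpus(corpus, rules, h=1.0):
    """Return (entry, label, sign, weight) tuples: three per entry and label.

    corpus: list of (entry, set_of_true_labels)
    rules:  dict mapping label -> (condition, p), where condition(entry) -> bool
            and p is the rule's weight measure (assumed constant here)
    h:      preselected positive number scaling the prior-knowledge weights
    """
    augmented = []
    for entry, true_labels in corpus:
        for label, (condition, p) in rules.items():
            # Original training entry; unit weight is an assumption.
            sign = 1 if label in true_labels else -1
            augmented.append((entry, label, sign, 1.0))

            met = condition(entry)
            # First created set: the label is attached.
            augmented.append((entry, label, 1, h * p if met else h * (1 - p)))
            # Second created set: the label is not attached.
            augmented.append((entry, label, -1, h * (1 - p) if met else h * p))
    return augmented
```

A weighted learner such as a boosting algorithm could then be trained on the enlarged list, so that the rules influence the classifier most where labeled data is sparse.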
    • 7. Invention application publication
    • Learning in automatic speech recognition
    • Lernen zur Spracherkennung
    • EP1696421A2
    • 2006-08-30
    • EP06110328.9
    • 2006-02-23
    • AT&T Corp.
    • Hakkani-Tur, Dilek Z.; Rahim, Mazin G.; Tur, Gokhan; Riccardi, Giuseppe
    • G10L15/06
    • G10L15/063; G10L15/07; G10L15/18; G10L15/26; G10L2015/0638
    • Utterance data that includes at least a small amount of manually transcribed data is provided. Automatic speech recognition is performed on ones of the utterance data not having a corresponding manual transcription to produce automatically transcribed utterances. A model is trained using all of the manually transcribed data and the automatically transcribed utterances. A predetermined number of utterances not having a corresponding manual transcription are intelligently selected and manually transcribed. Ones of the automatically transcribed data as well as ones having a corresponding manual transcription are labeled. In another aspect of the invention, audio data is mined from at least one source, and a language model is trained for call classification from the mined audio data to produce a language model.
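The abstract of EP1696421A2 says a predetermined number of untranscribed utterances are "intelligently selected" for manual transcription but does not state the criterion. The sketch below assumes a common active-learning heuristic, picking the utterances with the lowest recognizer confidence; the function name and the confidence dictionary are hypothetical.

```python
def select_for_transcription(confidences, k):
    """Pick the k untranscribed utterances with the lowest ASR confidence.

    confidences: dict mapping utterance id -> confidence score from the recognizer
    k:           predetermined number of utterances to send for manual transcription
    """
    # Sort by confidence, breaking ties by id so the result is deterministic.
    ranked = sorted(confidences, key=lambda uid: (confidences[uid], uid))
    return ranked[:k]
```

The selected utterances would be transcribed by hand, while the remaining ones keep their automatic transcriptions, and the model is retrained on both.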