    • 42. Granted invention patent
    • Conversational computing via conversational virtual machine
    • US07729916B2
    • 2010-06-01
    • US11551901
    • 2006-10-23
    • Daniel Coffman; Liam D. Comerford; Steven DeGennaro; Edward A. Epstein; Ponani Gopalakrishnan; Stephane H. Maes; David Nahamoo
    • G10L15/22; G10L15/28
    • H04M3/50; G06F17/30899; G10L15/22; G10L15/285; G10L2015/228; H04L67/02; H04M1/72561; H04M3/42204; H04M3/44; H04M3/493; H04M3/4931; H04M3/4936; H04M3/4938; H04M7/00; H04M2201/40; H04M2201/60; H04M2203/355; H04M2250/74
    • A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.
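The routing role that the abstract assigns to the conversational kernel can be illustrated with a toy dispatcher. This is only a sketch under stated assumptions: the class name `ConversationalKernel`, the `register` and `handle` methods, and the command-word matching are hypothetical, and plain text stands in for the speech recognition and synthesis engines.

```python
class ConversationalKernel:
    """Toy kernel: routes a recognized utterance to a registered application.

    All names here are illustrative assumptions, not the patent's API.
    """

    def __init__(self):
        # Maps a command word to (application name, handler callable).
        self._handlers = {}

    def register(self, app_name, commands):
        """Register an application's capabilities as {command word: callable}."""
        for command, handler in commands.items():
            self._handlers[command] = (app_name, handler)

    def handle(self, utterance):
        """Turn an utterance into an application call and a reply string."""
        command, _, argument = utterance.strip().partition(" ")
        if command not in self._handlers:
            return "Sorry, no application handles that request."
        app_name, handler = self._handlers[command]
        return f"{app_name}: {handler(argument)}"


kernel = ConversationalKernel()
kernel.register("clock", {"time": lambda _arg: "it is noon"})
kernel.register("notes", {"note": lambda text: f"saved '{text}'"})
reply = kernel.handle("note buy milk")
print(reply)
```

In a real system the reply string would be passed to a text-to-speech engine; here it is simply printed.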
    • 45. Granted invention patent
    • Reduction of search space in speech recognition using phone boundaries and phone ranking
    • US5729656A
    • 1998-03-17
    • US347013
    • 1994-11-30
    • David Nahamoo; Mukund Padmanabhan
    • G10L15/00; G10L15/02; G10L15/04; G10L15/14; G10L5/06
    • G10L15/04; G10L15/142; G10L2015/085
    • A method for estimating the probability of phone boundaries and the accuracy of the acoustic modelling in reducing a search-space in a speech recognition system. The accuracy of the acoustic modelling is quantified by the rank of the correct phone. The system includes a microphone for converting an utterance into an electrical signal, which is processed by an acoustic processor and label match which finds the best-matched acoustic label prototype. A probability distribution on phone boundaries is produced for every time frame using a first decision tree. These probabilities are compared to a threshold and some time frames are identified as boundaries between phones. An acoustic score is computed for all phones between every given pair of hypothesized boundaries, and the phones are ranked on the basis of this score. A second decision tree is traversed for every time frame to obtain the worst case rank of the correct phone at that time, and a short list of allowed phones is made for every time frame. A fast acoustic word match processor matches the label string from the acoustic processor to produce an utterance signal which includes at least one word. From recognition candidates produced by the fast acoustic match and the language model, the detailed acoustic match matches the label string from the acoustic processor against acoustic word models and outputs a word string corresponding to an utterance.
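The two pruning steps in the abstract above (thresholding per-frame boundary probabilities, then keeping only the best-ranked phones for a segment) can be sketched as follows. This is a minimal illustration, not the patented method: the decision trees are replaced by hard-coded numbers, and the function names, the threshold of 0.5, the phone labels, and the scores are all assumptions.

```python
def find_boundaries(boundary_probs, threshold):
    """Return the frame indices whose boundary probability exceeds the threshold."""
    return [t for t, p in enumerate(boundary_probs) if p > threshold]


def shortlist_phones(segment_scores, worst_rank):
    """Keep the `worst_rank` best phones by acoustic score (higher is better)."""
    ranked = sorted(segment_scores, key=segment_scores.get, reverse=True)
    return ranked[:worst_rank]


# Assumed per-frame boundary probabilities (a first decision tree would supply these).
boundary_probs = [0.05, 0.10, 0.85, 0.20, 0.15, 0.90, 0.10]
boundaries = find_boundaries(boundary_probs, threshold=0.5)

# Assumed acoustic log-scores for the segment between the two boundaries.
segment_scores = {"AA": -12.0, "IY": -3.5, "K": -7.25, "S": -20.0}

# Assumed worst-case rank of the correct phone (a second decision tree would supply it).
allowed = shortlist_phones(segment_scores, worst_rank=2)

print(boundaries, allowed)
```

Only the phones in `allowed` would be passed on to the later word-matching stages, which is where the search-space reduction comes from.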
    • 46. Granted invention patent
    • Continuous parameter hidden Markov model approach to automatic handwriting recognition
    • US5544257A
    • 1996-08-06
    • US818193
    • 1992-01-08
    • Eveline J. Bellegarda; Jerome R. Bellegarda; David Nahamoo; Krishna S. Nathan
    • G06K9/62; G06K9/68; G06K9/70; G06K9/00
    • G06K9/6297
    • A computer-based system and method for recognizing handwriting. The present invention includes a preprocessor, a front end, and a modeling component. The present invention operates as follows. First, the present invention identifies the lexemes for all characters of interest. Second, the present invention performs a training phase in order to generate a hidden Markov model for each of the lexemes. Third, the present invention performs a decoding phase to recognize handwritten text. Hidden Markov models for lexemes are produced during the training phase. The present invention performs the decoding phase as follows. The present invention receives test characters to be decoded (that is, to be recognized). The present invention generates sequences of feature vectors for the test characters by mapping in chirographic space. For each of the test characters, the present invention computes probabilities that the test character can be generated by the hidden Markov models. The present invention decodes the test character as the recognized character associated with the hidden Markov model having the greatest probability.
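The decoding phase described above (score the test character against every lexeme model, then pick the most probable one) can be sketched with the standard forward algorithm. This is a simplified illustration under stated assumptions: each feature vector is reduced to a single number, each state emits from one 1-D Gaussian rather than the mixtures a full continuous-parameter model would use, and the two toy models "a" and "b" with their parameters are invented for the example.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-((x - mean) ** 2) / (2.0 * var)) / math.sqrt(2.0 * math.pi * var)


def forward_likelihood(obs, start, trans, means, variances):
    """Forward algorithm: P(obs | model) for an HMM with 1-D Gaussian emissions."""
    n = len(start)
    alpha = [start[s] * gaussian_pdf(obs[0], means[s], variances[s]) for s in range(n)]
    for x in obs[1:]:
        alpha = [
            sum(alpha[p] * trans[p][s] for p in range(n))
            * gaussian_pdf(x, means[s], variances[s])
            for s in range(n)
        ]
    return sum(alpha)


def decode(obs, models):
    """Return the label of the most probable model, plus all model scores."""
    scores = {label: forward_likelihood(obs, **params) for label, params in models.items()}
    return max(scores, key=scores.get), scores


# Two hypothetical left-to-right lexeme models with two states each.
models = {
    "a": {"start": [1.0, 0.0], "trans": [[0.5, 0.5], [0.0, 1.0]],
          "means": [0.0, 1.0], "variances": [1.0, 1.0]},
    "b": {"start": [1.0, 0.0], "trans": [[0.5, 0.5], [0.0, 1.0]],
          "means": [3.0, 4.0], "variances": [1.0, 1.0]},
}

best, scores = decode([0.1, 0.2, 0.9, 1.1], models)
print(best)
```

For long observation sequences the products underflow, so a practical version would work in log space or rescale `alpha` at each step.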
    • 47. Granted invention patent
    • Speech coding apparatus having speaker dependent prototypes generated from nonuser reference data
    • US5278942A
    • 1994-01-11
    • US802678
    • 1991-12-05
    • Lalit R. Bahl; Jerome R. Bellegarda; Peter V. De Souza; Ponani S. Gopalakrishnan; Arthur J. Nadas; David Nahamoo; Michael A. Picheny
    • G10L19/00; G10L15/02; G10L15/06; G10L15/10; G10L9/02
    • G10L15/063; G10L15/02
    • A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals. The synthesized training vector signals are transformed reference feature vector signals representing the values of features of one or more utterances of one or more speakers in a reference set of speakers. The measured training feature vector signals represent the values of features of one or more utterances of a new speaker/user not in the reference set.
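The coding step in the abstract above (compare each feature vector with every stored prototype and output the identifier of the best match) amounts to nearest-prototype labeling. The sketch below assumes squared Euclidean distance as the match score, which the abstract does not specify; the prototype identifiers and values are invented, and the generation of speaker-dependent prototypes is not covered.

```python
def match_score(feature, prototype):
    """Squared Euclidean distance between two vectors; lower means a closer match."""
    return sum((f - p) ** 2 for f, p in zip(feature, prototype))


def encode(feature_vectors, prototypes):
    """Replace each feature vector by the identifier of its closest prototype."""
    labels = []
    for fv in feature_vectors:
        best_id = min(prototypes, key=lambda pid: match_score(fv, prototypes[pid]))
        labels.append(best_id)
    return labels


# Hypothetical prototype vectors, keyed by their unique identification values.
prototypes = {"P1": [0.0, 0.0], "P2": [1.0, 1.0], "P3": [4.0, 0.0]}

# Hypothetical feature vectors, one per time interval.
features = [[0.1, -0.2], [0.9, 1.2], [3.5, 0.4]]

labels = encode(features, prototypes)
print(labels)
```

The resulting label string is the coded representation that a downstream recognizer would consume.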
    • 48. Granted invention patent
    • Normalization of speech by adaptive labelling
    • US4926488A
    • 1990-05-15
    • US71687
    • 1987-07-09
    • Arthur J. Nadas; David Nahamoo
    • G10L11/00; G10L15/02; G10L15/06; G10L15/12; G10L15/20; G10L19/00; G10L21/02
    • G10L15/07; G10L15/20
    • In a speech processor system in which prototype vectors of speech are generated by an acoustic processor under reference noise and known ambient conditions and in which feature vectors of speech are generated during varying noise and other ambient and recording conditions, normalized vectors are generated to reflect the form the feature vectors would have if generated under the reference conditions. The normalized vectors are generated by: (a) applying an operator function A_i to a set of feature vectors x occurring at or before time interval i to yield a normalized vector y_i = A_i(x); (b) determining a distance error vector E_i by which the normalized vector is projectively moved toward the closest prototype vector to the normalized vector y_i; (c) updating the operator function for the next time interval to correspond to the most recently determined distance error vector; and (d) incrementing i to the next time interval and repeating steps (a) through (d), wherein the feature vector corresponding to the incremented i value has the most recently updated operator function applied thereto. With successive time intervals, successive normalized vectors are generated based on a successively updated operator function. For each normalized vector, the closest prototype thereto is associated therewith. The string of normalized vectors or the string of associated prototypes (or respective label identifiers thereof) or both provide output from the acoustic processor.
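Steps (a) through (d) above can be sketched as a simple online loop. This is an illustration under strong assumptions, not the patented operator: the operator is reduced to an additive offset vector, the update is a fixed-rate step along the error vector, and the names `normalize_stream` and `rate`, the prototypes, and the input values are all hypothetical.

```python
def closest_prototype(vec, prototypes):
    """Return the label of the prototype nearest to vec (squared Euclidean distance)."""
    return min(
        prototypes,
        key=lambda label: sum((v - p) ** 2 for v, p in zip(vec, prototypes[label])),
    )


def normalize_stream(features, prototypes, rate=0.5):
    """Adaptively normalize a stream of feature vectors toward the prototypes."""
    offset = [0.0] * len(features[0])  # assumed form of the operator: y = x + offset
    normalized, labels = [], []
    for x in features:
        # (a) apply the current operator to the incoming feature vector
        y = [xi + oi for xi, oi in zip(x, offset)]
        # (b) error vector pointing from y toward its closest prototype
        label = closest_prototype(y, prototypes)
        error = [p - yi for p, yi in zip(prototypes[label], y)]
        # (c) update the operator for the next time interval
        offset = [oi + rate * ei for oi, ei in zip(offset, error)]
        # (d) record the outputs and move on to the next interval
        normalized.append(y)
        labels.append(label)
    return normalized, labels, offset


# Hypothetical prototypes, and a stream whose vectors are all shifted by +1 from "A".
prototypes = {"A": [0.0, 0.0], "B": [4.0, 4.0]}
features = [[1.0, 1.0]] * 4

normalized, labels, offset = normalize_stream(features, prototypes)
print(labels, normalized[-1], offset)
```

With a constant shift in the input, the offset converges toward the value that cancels it, so successive normalized vectors move closer to the reference-condition prototype.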