Quick Patent Search: fast search of global patents, free patent database for commercial use - IPRDB

41. Invention application

US20120310643A1 METHODS AND APPARATUS FOR PROOFING OF A TEXT INPUT (in force)
Publication No.: US20120310643A1
Publication date: 2012-12-06
Application No.: US13478930
Filing date: 2012-05-23
Applicants: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, Lars Koenig, Holger Quast
Inventors: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, Lars Koenig, Holger Quast
IPC classes: G06F17/28, G10L13/08, G10L15/26
CPC classes: G10L13/08, G06F17/21, G06F17/2785, G10L13/00, G10L15/01, G10L15/02, G10L15/06, G10L15/14, G10L15/1822, G10L15/26, G10L15/265, G10L15/28, G10L15/30, G10L15/32, G10L17/00, G10L21/06
Abstract: Techniques for presenting data input as a plurality of data chunks including a first data chunk and a second data chunk. The techniques include converting the plurality of data chunks to a textual representation comprising a plurality of text chunks including a first text chunk corresponding to the first data chunk and a second text chunk corresponding to the second data chunk, respectively, and providing a presentation of at least part of the textual representation such that the first text chunk is presented differently than the second text chunk to, when presented, assist a user in proofing the textual representation.

42. Granted invention patent

US07729916B2 Conversational computing via conversational virtual machine (in force)
Publication No.: US07729916B2
Publication date: 2010-06-01
Application No.: US11551901
Filing date: 2006-10-23
Applicants: Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo
Inventors: Daniel Coffman, Liam D. Comerford, Steven DeGennaro, Edward A. Epstein, Ponani Gopalakrishnan, Stephane H. Maes, David Nahamoo
IPC classes: G10L15/22, G10L15/28
CPC classes: H04M3/50, G06F17/30899, G10L15/22, G10L15/285, G10L2015/228, H04L67/02, H04M1/72561, H04M3/42204, H04M3/44, H04M3/493, H04M3/4931, H04M3/4936, H04M3/4938, H04M7/00, H04M2201/40, H04M2201/60, H04M2203/355, H04M2250/74
Abstract: A conversational computing system that provides a universal coordinated multi-modal conversational user interface (CUI) (10) across a plurality of conversationally aware applications (11) (i.e., applications that “speak” conversational protocols) and conventional applications (12). The conversationally aware applications (11) communicate with a conversational kernel (14) via conversational application APIs (13). The conversational kernel (14) controls the dialog across applications and devices (local and networked) on the basis of their registered conversational capabilities and requirements and provides a unified conversational user interface and conversational services and behaviors. The conversational computing system may be built on top of a conventional operating system and APIs (15) and conventional device hardware (16). The conversational kernel (14) handles all I/O processing and controls conversational engines (18). The conversational kernel (14) converts voice requests into queries and converts outputs and results into spoken messages using conversational engines (18) and conversational arguments (17). The conversational application API (13) conveys all the information for the conversational kernel (14) to transform queries into application calls and conversely convert output into speech, appropriately sorted before being provided to the user.

43. Invention application

US20090276539A1 Conversational Asyncronous Multichannel Communication through an Inter-Modality Bridge (in force)
Publication No.: US20090276539A1
Publication date: 2009-11-05
Application No.: US12112839
Filing date: 2008-04-30
Applicants: Juan Huerta, David Lubensky, David Nahamoo
Inventors: Juan Huerta, David Lubensky, David Nahamoo
IPC classes: G06F15/16
CPC classes: H04L49/355, G06Q10/06, H04L69/08
Abstract: A communications apparatus is configured to bridge modalities and different communications formats. The apparatus may include a bridge to receive an input through a modality gateway and to deliver an output through an output channel, a communication engine configured to manipulate the input into the output, a router configured to route the configured output to a respective output channel, and a controller configured to control the bridge. The controller may determine a new modality depending on a context of the communications apparatus.

44. Granted invention patent

US07464031B2 Speech recognition utilizing multitude of speech features (lapsed)
Publication No.: US07464031B2
Publication date: 2008-12-09
Application No.: US10724536
Filing date: 2003-11-28
Applicants: Scott E. Axelrod, Sreeram Viswanath Balakrishnan, Stanley F. Chen, Yuging Gao, Ramesh A. Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Alan Picheny, George A. Saon, Geoffrey G. Zweig
Inventors: Scott E. Axelrod, Sreeram Viswanath Balakrishnan, Stanley F. Chen, Yuging Gao, Ramesh A. Gopinath, Hong-Kwang Kuo, Benoit Maison, David Nahamoo, Michael Alan Picheny, George A. Saon, Geoffrey G. Zweig
IPC classes: G10L15/00, G10L15/20
CPC classes: G10L15/063, G10L15/02, G10L15/14, G10L2015/085
Abstract: In a speech recognition system, the combination of a log-linear model with a multitude of speech features is provided to recognize unknown speech utterances. The speech recognition system models the posterior probability of linguistic units relevant to speech recognition using a log-linear model. The posterior model captures the probability of the linguistic unit given the observed speech features and the parameters of the posterior model. The posterior model may be determined using the probability of the word sequence hypotheses given a multitude of speech features. Log-linear models are used with features derived from sparse or incomplete data. The speech features that are utilized may include asynchronous, overlapping, and statistically non-independent speech features. Not all features used in training need to appear in testing/recognition.
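
As a rough illustration of the log-linear posterior mentioned in this abstract, the sketch below computes P(unit | features) as a normalized exponential of weighted feature values. The feature names, the weight values, and the `log_linear_posterior` helper are hypothetical and are not taken from the patent. A feature that is absent at recognition time simply contributes nothing to the score, which loosely mirrors the remark that not all training features need to appear in testing.

```python
import math


def log_linear_posterior(active_features, weights, units):
    """Return P(unit | features) under a simple log-linear model.

    active_features: dict of feature name -> observed value (hypothetical names)
    weights: dict of (unit, feature name) -> weight (hypothetical values)
    units: candidate linguistic units, e.g. words
    """
    scores = {}
    for unit in units:
        # Missing (unit, feature) pairs contribute zero to the score.
        scores[unit] = sum(
            weights.get((unit, name), 0.0) * value
            for name, value in active_features.items()
        )
    # Subtract the maximum score before exponentiating for numerical stability.
    max_score = max(scores.values())
    z = sum(math.exp(s - max_score) for s in scores.values())
    return {u: math.exp(s - max_score) / z for u, s in scores.items()}


weights = {
    ("yes", "energy_high"): 1.2,
    ("no", "energy_high"): -0.3,
    ("yes", "pitch_rise"): 0.5,  # seen in training, absent from the input below
}
posterior = log_linear_posterior({"energy_high": 1.0}, weights, ["yes", "no"])
```

The result is a dictionary of probabilities that sum to one; here "yes" receives the larger share because its weighted score is higher.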

45. Granted invention patent

US5729656A Reduction of search space in speech recognition using phone boundaries and phone ranking (lapsed)
Publication No.: US5729656A
Publication date: 1998-03-17
Application No.: US347013
Filing date: 1994-11-30
Applicants: David Nahamoo, Mukund Padmanabhan
Inventors: David Nahamoo, Mukund Padmanabhan
IPC classes: G10L15/00, G10L15/02, G10L15/04, G10L15/14, G10L5/06
CPC classes: G10L15/04, G10L15/142, G10L2015/085
Abstract: A method for estimating the probability of phone boundaries and the accuracy of the acoustic modelling in reducing a search-space in a speech recognition system. The accuracy of the acoustic modelling is quantified by the rank of the correct phone. The system includes a microphone for converting an utterance into an electrical signal, which is processed by an acoustic processor and label match which finds the best-matched acoustic label prototype. A probability distribution on phone boundaries is produced for every time frame using a first decision tree. These probabilities are compared to a threshold and some time frames are identified as boundaries between phones. An acoustic score is computed for all phones between every given pair of hypothesized boundaries, and the phones are ranked on the basis of this score. A second decision tree is traversed for every time frame to obtain the worst case rank of the correct phone at that time, and a short list of allowed phones is made for every time frame. A fast acoustic word match processor matches the label string from the acoustic processor to produce an utterance signal which includes at least one word. From recognition candidates produced by the fast acoustic match and the language model, the detailed acoustic match matches the label string from the acoustic processor against acoustic word models and outputs a word string corresponding to an utterance.
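
The two pruning steps in this abstract (thresholding boundary probabilities, then keeping only the top-ranked phones) can be sketched as below. This is a minimal illustration under stated assumptions: the boundary probabilities, acoustic scores, and worst-case rank are supplied directly as made-up numbers rather than produced by the decision trees the patent describes, and the helper names `find_boundaries` and `shortlist_phones` are hypothetical.

```python
def find_boundaries(boundary_probs, threshold):
    """Return the time-frame indices whose boundary probability meets the threshold.

    Using >= rather than > is an assumption; the abstract only says the
    probabilities are compared to a threshold.
    """
    return [t for t, p in enumerate(boundary_probs) if p >= threshold]


def shortlist_phones(phone_scores, worst_rank):
    """Rank phones by acoustic score (higher is better) and keep the top worst_rank."""
    ranked = sorted(phone_scores, key=phone_scores.get, reverse=True)
    return ranked[:worst_rank]


# Hypothetical per-frame boundary probabilities (stand-in for the first decision tree).
boundary_probs = [0.9, 0.1, 0.2, 0.8, 0.05, 0.7]
boundaries = find_boundaries(boundary_probs, threshold=0.5)

# Hypothetical log-domain acoustic scores for one segment between two boundaries.
segment_scores = {"AA": -12.0, "IY": -3.5, "S": -7.1, "T": -20.4}
# worst_rank stands in for the output of the second decision tree.
allowed = shortlist_phones(segment_scores, worst_rank=2)
```

With these inputs, frames 0, 3, and 5 are treated as boundaries and only "IY" and "S" remain on the short list, so later matching stages would consider two phones instead of four.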

46. Granted invention patent

US5544257A Continuous parameter hidden Markov model approach to automatic handwriting recognition (lapsed)
Publication No.: US5544257A
Publication date: 1996-08-06
Application No.: US818193
Filing date: 1992-01-08
Applicants: Eveline J. Bellegarda, Jerome R. Bellegarda, David Nahamoo, Krishna S. Nathan
Inventors: Eveline J. Bellegarda, Jerome R. Bellegarda, David Nahamoo, Krishna S. Nathan
IPC classes: G06K9/62, G06K9/68, G06K9/70, G06K9/00
CPC classes: G06K9/6297
Abstract: A computer-based system and method for recognizing handwriting. The present invention includes a preprocessor, a front end, and a modeling component. The present invention operates as follows. First, the present invention identifies the lexemes for all characters of interest. Second, the present invention performs a training phase in order to generate a hidden Markov model for each of the lexemes. Third, the present invention performs a decoding phase to recognize handwritten text. Hidden Markov models for lexemes are produced during the training phase. The present invention performs the decoding phase as follows. The present invention receives test characters to be decoded (that is, to be recognized). The present invention generates sequences of feature vectors for the test characters by mapping in chirographic space. For each of the test characters, the present invention computes probabilities that the test character can be generated by the hidden Markov models. The present invention decodes the test character as the recognized character associated with the hidden Markov model having the greatest probability.

47. Granted invention patent

US5278942A Speech coding apparatus having speaker dependent prototypes generated from nonuser reference data (lapsed)
Publication No.: US5278942A
Publication date: 1994-01-11
Application No.: US802678
Filing date: 1991-12-05
Applicants: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
Inventors: Lalit R. Bahl, Jerome R. Bellegarda, Peter V. De Souza, Ponani S. Gopalakrishnan, Arthur J. Nadas, David Nahamoo, Michael A. Picheny
IPC classes: G10L19/00, G10L15/02, G10L15/06, G10L15/10, G10L9/02
CPC classes: G10L15/063, G10L15/02
Abstract: A speech coding apparatus and method for use in a speech recognition apparatus and method. The value of at least one feature of an utterance is measured during each of a series of successive time intervals to produce a series of feature vector signals representing the feature values. A plurality of prototype vector signals, each having at least one parameter value and a unique identification value are stored. The closeness of the feature vector signal is compared to the parameter values of the prototype vector signals to obtain prototype match scores for the feature value signal and each prototype vector signal. The identification value of the prototype vector signal having the best prototype match score is output as a coded representation signal of the feature vector signal. Speaker-dependent prototype vector signals are generated from both synthesized training vector signals and measured training vector signals. The synthesized training vector signals are transformed reference feature vector signals representing the values of features of one or more utterances of one or more speakers in a reference set of speakers. The measured training feature vector signals represent the values of features of one or more utterances of a new speaker/user not in the reference set.
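
The coding step in this abstract (output the identifier of the best-matching prototype for each feature vector) resembles nearest-prototype labelling. The sketch below is a minimal illustration, assuming Euclidean distance as the match score; the abstract does not specify the score, and the prototype identifiers, vectors, and the `label_feature_vector` helper are hypothetical.

```python
import math


def label_feature_vector(feature, prototypes):
    """Return the identifier of the prototype closest to the feature vector.

    prototypes: dict of identification value -> parameter vector.
    Euclidean distance is an assumed stand-in for the prototype match score.
    """
    def distance(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    return min(prototypes, key=lambda pid: distance(feature, prototypes[pid]))


# Hypothetical two-dimensional prototypes and a short series of feature vectors.
prototypes = {"P1": [0.0, 0.0], "P2": [1.0, 1.0], "P3": [3.0, 0.5]}
features = [[0.1, -0.2], [0.9, 1.2], [2.5, 0.4]]
labels = [label_feature_vector(v, prototypes) for v in features]
```

Each feature vector is replaced by a single identifier, so the series of vectors becomes the label string P1, P2, P3 for these inputs.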

48. Granted invention patent

US4926488A Normalization of speech by adaptive labelling (lapsed)
Publication No.: US4926488A
Publication date: 1990-05-15
Application No.: US71687
Filing date: 1987-07-09
Applicants: Arthur J. Nadas, David Nahamoo
Inventors: Arthur J. Nadas, David Nahamoo
IPC classes: G10L11/00, G10L15/02, G10L15/06, G10L15/12, G10L15/20, G10L19/00, G10L21/02
CPC classes: G10L15/07, G10L15/20
Abstract: In a speech processor system in which prototype vectors of speech are generated by an acoustic processor under reference noise and known ambient conditions and in which feature vectors of speech are generated during varying noise and other ambient and recording conditions, normalized vectors are generated to reflect the form the feature vectors would have if generated under the reference conditions. The normalized vectors are generated by: (a) applying an operator function A_i to a set of feature vectors x occurring at or before time interval i to yield a normalized vector y_i = A_i(x); (b) determining a distance error vector E_i by which the normalized vector is projectively moved toward the closest prototype vector to the normalized vector y_i; (c) updating the operator function for the next time interval to correspond to the most recently determined distance error vector; and (d) incrementing i to the next time interval and repeating steps (a) through (d) wherein the feature vector corresponding to the incremented i value has the most recent updated operator function applied thereto. With successive time intervals, successive normalized vectors are generated based on a successively updated operator function. For each normalized vector, the closest prototype thereto is associated therewith. The string of normalized vectors or the string of associated prototypes (or respective label identifiers thereof) or both provide output from the acoustic processor.
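
Steps (a) through (d) of this abstract can be sketched as a simple adaptive loop. This is an illustration under strong assumptions, not the patented method: the operator is reduced to an additive offset applied to the current feature vector only, the error vector is the plain difference between the closest prototype and the normalized vector, and the update uses a fixed step size `rate`. The function name `normalize_stream` and all numeric values are hypothetical.

```python
def normalize_stream(features, prototypes, rate=0.5):
    """Adaptively normalize feature vectors and label them with the closest prototype.

    features: list of feature vectors, one per time interval
    prototypes: dict of label -> prototype vector
    rate: assumed step size for updating the operator (here, an additive offset)
    """
    offset = [0.0] * len(features[0])  # initial operator: identity
    labels = []
    for x in features:
        # (a) apply the current operator to obtain the normalized vector y
        y = [xi + bi for xi, bi in zip(x, offset)]
        # find the prototype closest to y (squared Euclidean distance, an assumption)
        label = min(
            prototypes,
            key=lambda p: sum((a - b) ** 2 for a, b in zip(y, prototypes[p])),
        )
        # (b) error vector pointing from y toward the closest prototype
        error = [pk - yk for pk, yk in zip(prototypes[label], y)]
        # (c) update the operator for the next interval using the latest error
        offset = [bi + rate * ek for bi, ek in zip(offset, error)]
        # (d) record the label and move on to the next time interval
        labels.append(label)
    return labels, offset


# One-dimensional toy data: every observation carries a constant bias of +1.
prototypes = {"A": [0.0], "B": [4.0]}
features = [[1.0], [1.0], [5.0], [1.0]]
labels, offset = normalize_stream(features, prototypes)
```

In this toy run the offset drifts toward -1, gradually cancelling the constant bias, while the label sequence stays A, A, B, A.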

49. Granted invention patent

US09646001B2 Machine translation (MT) based spoken dialog systems customer/machine dialog (in force)
Publication No.: US09646001B2
Publication date: 2017-05-09
Application No.: US13236016
Filing date: 2011-09-19
Applicants: Ruhi Sarikaya, Vaibhava Goel, David Nahamoo, Rèal Tremblay, Bhuvana Ramabhadran, Osamuyimen Stewart
Inventors: Ruhi Sarikaya, Vaibhava Goel, David Nahamoo, Rèal Tremblay, Bhuvana Ramabhadran, Osamuyimen Stewart
IPC classes: G06F17/28, G10L15/26, G10L15/22, H04M3/42
CPC classes: G06F17/289, G10L15/22, G10L15/26, H04M3/42, H04M3/42391, H04M2242/12
Abstract: Operation of an automated dialog system is described using a source language to conduct a real time human machine dialog process with a human user using a target language. A user query in the target language is received and automatically machine translated into the source language. An automated reply of the dialog process is then delivered to the user in the target language. If the dialog process reaches an initial assistance state, a first human agent using the source language is provided to interact in real time with the user in the target language by machine translation to continue the dialog process. Then if the dialog process reaches a further assistance state, a second human agent using the target language is provided to interact in real time with the user in the target language to continue the dialog process.

50. Granted invention patent

US08954329B2 Methods and apparatus for acoustic disambiguation by insertion of disambiguating textual information (in force)
Publication No.: US08954329B2
Publication date: 2015-02-10
Application No.: US13478978
Filing date: 2012-05-23
Applicants: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, William F. Ganong, III
Inventors: Martin Labsky, Jan Kleindienst, Tomas Macek, David Nahamoo, Jan Curin, William F. Ganong, III
IPC classes: G10L13/00, G06F17/27, G10L13/08, G10L15/14
CPC classes: G10L13/08, G06F17/21, G06F17/2785, G10L13/00, G10L15/01, G10L15/02, G10L15/06, G10L15/14, G10L15/1822, G10L15/26, G10L15/265, G10L15/28, G10L15/30, G10L15/32, G10L17/00, G10L21/06
Abstract: Techniques for disambiguating at least one text segment from at least one acoustically similar word and/or phrase. The techniques include identifying at least one text segment, in a textual representation having a plurality of text segments, having at least one acoustically similar word and/or phrase which has a different spelling, annotating the textual representation with disambiguating information to help disambiguate the at least one text segment from the at least one acoustically similar word and/or phrase, and synthesizing a speech signal, at least in part, by performing text-to-speech synthesis on at least a portion of the textual representation that includes the at least one text segment, wherein the speech signal includes speech corresponding to the disambiguating information located proximate the portion of the speech signal corresponding to the at least one text segment.
