    • 1. Invention application
    • System and method for speech recognition
    • US20040260546A1
    • 2004-12-23
    • US10830458
    • 2004-04-23
    • Hiroshi Seo; Soichi Toyama
    • G10L015/00
    • G10L15/20
    • A system and method include an initial noise model produced based on pre-estimated noise of a service environment and an initial synthesized model of a voice containing noise. The system and method produce an utterance environment noise model from background noise of the service environment upon speech recognition as well as a sequence of feature vectors from noise-superimposed speech including an uttered voice and the background noise. The system and method also produce an adaptive model by adapting the initial synthesized model using the utterance environment noise model, the initial noise model, and a compensation model, so that the adaptive model is checked against the sequence of feature vectors to perform speech recognition. Upon performing the speech recognition, a compensation model is created upon which the signal to noise ratio between the background noise present at the time of actual utterance of a voice and the uttered voice is reflected.
    • 2. Invention application
    • Method for integrating processes with a multi-faceted human centered interface
    • US20040249640A1
    • 2004-12-09
    • US10619204
    • 2003-07-14
    • Richard Grant; Pedro E. McGregor
    • G10L015/00
    • G10L15/193; G10L2015/228
    • According to the present invention, a method for integrating processes with a multi-faceted human centered interface is provided. The interface implements a hands-free, voice-driven environment to control processes and applications. A natural language model is used to parse voice-initiated commands and data, and to route those voice-initiated inputs to the required applications or processes. An intelligent context-based parser allows the system to determine which processes are required to complete a task initiated using natural language. A single-window environment provides an interface that is comfortable to the user by preventing distracting windows from appearing. The single window has a plurality of facets which allow distinct viewing areas. Each facet has an independent process routing its outputs thereto. As other processes are activated, each facet can reshape itself to bring a new process into one of the viewing areas. All activated processes are executed simultaneously to provide true multitasking.
    • 3. Invention application
    • Detecting repeated phrases and inference of dialogue models
    • US20040249637A1
    • 2004-12-09
    • US10857896
    • 2004-06-02
    • Aurilab, LLC
    • James K. Baker
    • G10L015/00
    • G10L15/1822; G10L15/1815
    • A method of speech recognition obtains acoustic data from a plurality of conversations. A plurality of pairs of utterances are selected from the plurality of conversations. At least one portion of the first utterance of the pair of utterances is dynamically aligned with at least one portion of the second utterance of the pair of utterances, and an acoustic similarity is computed. At least one pair that includes a first portion from a first utterance and a second portion from a second utterance is chosen, based on a criterion of acoustic similarity. A common pattern template is created from the first portion and the second portion.
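The dynamic alignment step in this abstract can be illustrated with dynamic time warping (DTW). The abstract does not name a specific alignment algorithm, so DTW is an assumption here, as are the function name `dtw_distance`, the Euclidean frame distance, and the length normalization. A minimal sketch:

```python
import math


def dtw_distance(a, b):
    """Length-normalized DTW cost between two sequences of feature vectors.

    Lower values mean the two portions are acoustically more similar.
    Both the Euclidean frame distance and the normalization by
    len(a) + len(b) are illustrative choices, not taken from the patent.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cumulative cost aligning a[:i] with b[:j]
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j],      # a advances, b frame is repeated
                cost[i][j - 1],      # b advances, a frame is repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m] / (n + m)
```

A pair of portions whose normalized cost falls below some threshold would then be a candidate for a common pattern template; the threshold itself would have to be tuned on real data.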
    • 4. Invention application
    • Automatic assessment of phonological processes
    • US20040230430A1
    • 2004-11-18
    • US10637235
    • 2003-08-08
    • Sunil K. Gupta; Prabhu Raghavan; Chetan Vinchhi
    • G10L015/00
    • G09B19/06; G10L15/02
    • A computer-based system generates alternative phonetic transcriptions for a target word or phrase corresponding to specific phonological processes that replace individual phonemes or clusters of two or more phonemes with replacement phonemes. The system compares a user's speech with a list of possible transcriptions that includes the base (i.e., correct) transcription of the test target as well as the different alternative transcriptions, to identify the transcription that best matches the user's. In a speech therapy application, the system identifies the phonological process(es), if any, associated with the user's speech and generates statistics over multiple test targets that can be used to diagnose the user's specific phonological disorders. The system can also be implemented in other contexts such as foreign language instruction and automated attendant applications to cover a wide variety and range of accents and/or phonological disorders.
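The idea of generating alternative transcriptions from phonological processes can be sketched as rule-based phoneme substitution. Everything named below is hypothetical: the `PROCESSES` table, the phoneme labels, and the helper functions are illustrative only, and the sketch assumes the user's speech has already been decoded into a phoneme sequence, whereas the abstract describes matching against the speech itself.

```python
# Hypothetical rule table: each process maps a phoneme (or cluster) to its replacement.
PROCESSES = {
    "fronting": {("k",): ("t",), ("g",): ("d",)},
    "gliding": {("r",): ("w",), ("l",): ("w",)},
    "cluster_reduction": {("s", "t"): ("t",)},
}


def apply_process(phonemes, rules):
    """Apply one process's replacement rules left to right."""
    out, i = [], 0
    while i < len(phonemes):
        for src, dst in rules.items():
            if tuple(phonemes[i:i + len(src)]) == src:
                out.extend(dst)
                i += len(src)
                break
        else:
            # No rule matched at this position: keep the phoneme unchanged.
            out.append(phonemes[i])
            i += 1
    return tuple(out)


def alternatives(base):
    """Return the base transcription plus one alternative per applicable process."""
    alts = {"correct": tuple(base)}
    for name, rules in PROCESSES.items():
        alt = apply_process(list(base), rules)
        if alt != tuple(base):
            alts[name] = alt
    return alts


def diagnose(base, observed):
    """Name the transcription that exactly matches the observed phonemes, if any."""
    for name, transcription in alternatives(base).items():
        if transcription == tuple(observed):
            return name
    return None
```

Counting the labels returned by `diagnose` over many test targets would give the kind of per-process statistics the abstract mentions.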
    • 5. Invention application
    • Speaker recognition using local models
    • US20040225498A1
    • 2004-11-11
    • US10810232
    • 2004-03-26
    • Ryan Rifkin
    • G10L015/00
    • G10L17/02; G10L17/08
    • A system and method for voice recognition is disclosed. The system enrolls speakers using enrollment voice samples and identification information. An extraction module characterizes enrollment voice samples with high-dimensional feature vectors or speaker data points. A data structuring module organizes data points into a high-dimensional data structure, such as a kd-tree, in which similarity between data points dictates a distance, such as a Euclidean distance, a Minkowski distance, or a Manhattan distance. The system recognizes a speaker using an unidentified voice sample. A data querying module searches the data structure to generate a subset of approximate nearest neighbors based on an extracted high-dimensional feature vector. A data modeling module uses Parzen windows to estimate a probability density function representing how closely characteristics of the unidentified speaker match enrolled speakers, in real time, without extensive training data or parametric assumptions about data distribution. A smoothing parameter controls the relative contributions of close and far speaker data points to the estimated density.
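The nearest-neighbor and Parzen-window steps in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the patented implementation: a brute-force sort stands in for the kd-tree search, the kernel is an unnormalized Gaussian, and the names `parzen_scores`, `recognize`, `k`, and `h` are hypothetical.

```python
import math
from collections import defaultdict


def parzen_scores(enrolled, query, k=5, h=1.0):
    """Score each speaker by a Gaussian Parzen-window estimate over the k nearest points.

    enrolled: list of (speaker_id, feature_vector) pairs.
    h: smoothing parameter; a larger h gives far points more weight.
    """
    # Brute-force k-nearest-neighbor search (a kd-tree would replace this at scale).
    neighbors = sorted(enrolled, key=lambda item: math.dist(item[1], query))[:k]
    scores = defaultdict(float)
    for speaker, vec in neighbors:
        d2 = math.dist(vec, query) ** 2
        scores[speaker] += math.exp(-d2 / (2 * h * h))
    return dict(scores)


def recognize(enrolled, query, k=5, h=1.0):
    """Return the speaker with the highest estimated density at the query point."""
    scores = parzen_scores(enrolled, query, k, h)
    return max(scores, key=scores.get)
```

With a small `h`, only the closest enrolled points matter; with a large `h`, the decision approaches a majority vote among the `k` neighbors.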
    • 9. Invention application
    • Automated decision making using time-varying stream reliability prediction
    • US20040193415A1
    • 2004-09-30
    • US10397762
    • 2003-03-26
    • International Business Machines Corporation
    • Upendra V. Chaudhari; Chalapathy Neti; Gerasimos Potamianos; Ganesh N. Ramaswamy
    • G10L015/00
    • G10L17/06; G10L17/20
    • Automated decision making techniques are provided. For example, a technique for generating a decision associated with an individual or an entity includes the following steps. First, two or more data streams associated with the individual or the entity are captured. Then, at least one time-varying measure is computed in accordance with the two or more data streams. Lastly, a decision is computed based on the at least one time-varying measure. One form of the time-varying measure may include a measure of the coverage of a model associated with previously-obtained training data by at least a portion of the captured data. Another form of the time-varying measure may include a measure of the stability of at least a portion of the captured data. While either measure may be employed alone to compute a decision, preferably both the coverage and stability measures are employed. The technique may be used to authenticate a speaker.