    • 1. Granted invention patent
    • Audio-only backoff in audio-visual speech recognition system
    • US07251603B2
    • 2007-07-31
    • US10601350
    • 2003-06-23
    • Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
    • Jonathan H. Connell, Norman Haas, Etienne Marcheret, Chalapathy Venkata Neti, Gerasimos Potamianos
    • G10L21/00
    • G10L15/25
    • Techniques for performing audio-visual speech recognition, with improved recognition performance, in a degraded visual environment. For example, in one aspect of the invention, a technique for use in accordance with an audio-visual speech recognition system for improving a recognition performance thereof includes the steps/operations of: (i) selecting between an acoustic-only data model and an acoustic-visual data model based on a condition associated with a visual environment; and (ii) decoding at least a portion of an input spoken utterance using the selected data model. Advantageously, during periods of degraded visual conditions, the audio-visual speech recognition system is able to decode (recognize) input speech data using audio-only data, thus avoiding recognition inaccuracies that may result from performing speech recognition based on acoustic-visual data models and degraded visual data.
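The two steps in the abstract above (select a data model from a visual-environment condition, then decode with the selected model) can be sketched as follows. This is a minimal illustration, not the patented implementation: the `visual_quality` score, the 0.5 threshold, and the toy decoders are all assumptions introduced here.

```python
from typing import Callable, Dict, List, Sequence, Tuple

Decoder = Callable[[List[float]], str]
# One segment: (audio features, visual features, visual-quality score).
Segment = Tuple[Sequence[float], Sequence[float], float]


def select_model(visual_quality: float, threshold: float = 0.5) -> str:
    """Step (i): choose a data model from a visual-condition score.

    The score in [0, 1] and the threshold are hypothetical stand-ins for
    the "condition associated with a visual environment".
    """
    return "acoustic_visual" if visual_quality >= threshold else "acoustic_only"


def decode_utterance(
    segments: Sequence[Segment],
    decoders: Dict[str, Decoder],
    threshold: float = 0.5,
) -> List[Tuple[str, str]]:
    """Step (ii): decode each segment with the model selected for it."""
    results = []
    for audio, visual, quality in segments:
        model = select_model(quality, threshold)
        if model == "acoustic_visual":
            features = list(audio) + list(visual)
        else:
            # Back off to audio only: degraded visual data is ignored.
            features = list(audio)
        results.append((model, decoders[model](features)))
    return results


# Toy decoders that only report how many features they received.
toy_decoders: Dict[str, Decoder] = {
    "acoustic_only": lambda feats: f"A{len(feats)}",
    "acoustic_visual": lambda feats: f"AV{len(feats)}",
}

toy_segments = [
    ([0.1, 0.2], [0.9], 0.8),  # good visual conditions
    ([0.3, 0.4], [0.0], 0.2),  # degraded visual conditions
]
print(decode_utterance(toy_segments, toy_decoders))
```

Selecting per segment rather than per utterance reflects the abstract's wording "at least a portion of an input spoken utterance"; a real system would presumably smooth the decision over time.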
    • 2. Granted invention patent
    • Methods and apparatus for audio-visual speech detection and recognition
    • US06594629B1
    • 2003-07-15
    • US09369707
    • 1999-08-06
    • Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
    • Sankar Basu, Philippe Christian de Cuetos, Stephane Herman Maes, Chalapathy Venkata Neti, Andrew William Senior
    • G10L15/00
    • G06K9/00228, G06K9/00335, G10L15/25, G10L25/78
    • In a first aspect of the invention, methods and apparatus for providing speech recognition comprise the steps of processing a video signal associated with an arbitrary content video source, processing an audio signal associated with the video signal, and decoding the processed audio signal in conjunction with the processed video signal to generate a decoded output signal representative of the audio signal. In a second aspect of the invention, methods and apparatus for providing speech detection in accordance with a speech recognition system comprise the steps of processing a video signal associated with a video source to detect whether one or more features associated with the video signal are representative of speech, and processing an audio signal associated with the video signal in accordance with the speech recognition system to generate a decoded output signal representative of the audio signal when the one or more features associated with the video signal are representative of speech. Speech detection may also be performed using information from both the video path and the audio path simultaneously.
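The second aspect of the abstract above (run the recognizer on the audio only when video features are representative of speech) can be sketched as a simple gate. The abstract does not name the video features; the mouth-opening measurements, the activity measure, and the 0.1 threshold below are assumptions chosen for illustration.

```python
from typing import Callable, List, Sequence, Tuple, TypeVar

T = TypeVar("T")
# One segment: (per-frame mouth-opening measurements, audio samples).
Segment = Tuple[Sequence[float], Sequence[float]]


def mouth_activity(openings: Sequence[float]) -> float:
    """Mean absolute frame-to-frame change in mouth opening (hypothetical feature)."""
    if len(openings) < 2:
        return 0.0
    diffs = [abs(b - a) for a, b in zip(openings, openings[1:])]
    return sum(diffs) / len(diffs)


def looks_like_speech(openings: Sequence[float], threshold: float = 0.1) -> bool:
    """Visual speech detection: is the mouth moving enough to suggest speech?"""
    return mouth_activity(openings) > threshold


def gated_recognize(
    segments: Sequence[Segment],
    recognize: Callable[[Sequence[float]], T],
    threshold: float = 0.1,
) -> List[T]:
    """Pass audio to the recognizer only for segments flagged as speech."""
    outputs = []
    for openings, audio in segments:
        if looks_like_speech(openings, threshold):
            outputs.append(recognize(audio))
    return outputs


toy_segments = [
    ([0.0, 0.4, 0.1, 0.5], [1.0, 2.0]),  # moving mouth: decoded
    ([0.2, 0.2, 0.21], [3.0]),           # nearly still mouth: skipped
]
# The stand-in "recognizer" just reports the number of audio samples.
print(gated_recognize(toy_segments, recognize=len))
```

The abstract also mentions detecting speech from the video and audio paths together; that variant would replace `looks_like_speech` with a joint decision and is not shown here.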
    • 3. Granted invention patent
    • Speech recognition models combining gender-dependent and gender-independent phone states and using phonetic-context-dependence
    • US5953701A
    • 1999-09-14
    • US10466
    • 1998-01-22
    • Chalapathy Venkata Neti, Salim Estephan Roukos
    • Chalapathy Venkata Neti, Salim Estephan Roukos
    • G10L5/06
    • G10L15/07, G10L15/142
    • A method of gender dependent speech recognition includes the steps of identifying phone state models common to both genders, identifying gender specific phone state models, identifying a gender of a speaker and recognizing acoustic data from the speaker. A method of constructing a gender-dependent speech recognition model includes the steps of providing training data of a known gender, aligning the training data, tagging the training data with a gender to create gender-tagged data, determining a gender question at a node to determine gender dependence of the gender-tagged data, determining a phonetic context question at the node to determine phonetic context dependence of the gender-tagged data, determining a highest value of an evaluation function between the gender dependence and the phonetic context dependence to determine which dependence is a dominant dependence, splitting the data of the dominant dependence into child nodes according to likelihood criteria, comparing the highest value with a threshold value to determine if additional splitting is necessary, repeating these steps for each child node until the highest value is below the threshold value and counting the nodes having gender dependence to determine an overall gender dependence level. A gender-dependent speech recognition system includes an input device for inputting speech to a preprocessor. The preprocessor converts the speech into acoustic data, and a processor identifies gender-dependent phone state models and phone state models common to both genders. The phone state models are stored in a memory device wherein the processor recognizes the speech in accordance with the phone state models.
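The model-construction loop in the abstract above (evaluate a gender question and a phonetic context question at each node, split on the dominant one, stop when the best value falls below a threshold, then count gender-dependent nodes) can be sketched as a small decision-tree routine. The abstract does not specify the evaluation function; the single-Gaussian log-likelihood gain on a one-dimensional feature, the `Sample` fields, and the example questions are assumptions made for this sketch.

```python
import math
from dataclasses import dataclass
from typing import Callable, Dict, List


@dataclass
class Sample:
    value: float        # toy 1-D acoustic feature
    gender: str         # gender tag, "F" or "M"
    left_context: str   # label of the preceding phone


Question = Callable[[Sample], bool]


def log_likelihood(values: List[float]) -> float:
    """Log-likelihood of the values under their own maximum-likelihood Gaussian."""
    n = len(values)
    mean = sum(values) / n
    var = max(sum((v - mean) ** 2 for v in values) / n, 1e-6)
    return -0.5 * n * (math.log(2 * math.pi * var) + 1.0)


def gain(samples: List[Sample], question: Question) -> float:
    """Assumed evaluation function: likelihood gain from splitting on a question."""
    yes = [s.value for s in samples if question(s)]
    no = [s.value for s in samples if not question(s)]
    if len(yes) < 2 or len(no) < 2:
        return 0.0  # too little data on one side to evaluate the split
    parent = log_likelihood([s.value for s in samples])
    return log_likelihood(yes) + log_likelihood(no) - parent


def count_gender_nodes(
    samples: List[Sample], questions: Dict[str, Question], threshold: float
) -> int:
    """Grow the tree recursively and count nodes split on the gender question."""
    best_name = max(questions, key=lambda name: gain(samples, questions[name]))
    best_gain = gain(samples, questions[best_name])
    if best_gain < threshold:
        return 0  # highest value is below the threshold: stop splitting
    best_q = questions[best_name]
    yes = [s for s in samples if best_q(s)]
    no = [s for s in samples if not best_q(s)]
    here = 1 if best_name == "gender" else 0
    return (
        here
        + count_gender_nodes(yes, questions, threshold)
        + count_gender_nodes(no, questions, threshold)
    )


# Hypothetical questions: one about gender, one about phonetic context.
toy_questions: Dict[str, Question] = {
    "gender": lambda s: s.gender == "F",
    "context": lambda s: s.left_context == "AA",
}

# Toy data in which the feature depends strongly on gender, weakly on context.
toy_data = [
    Sample(0.0, "F", "AA"), Sample(0.1, "F", "B"),
    Sample(-0.1, "F", "AA"), Sample(0.05, "F", "B"),
    Sample(5.0, "M", "AA"), Sample(5.1, "M", "B"),
    Sample(4.9, "M", "AA"), Sample(5.05, "M", "B"),
]
print(count_gender_nodes(toy_data, toy_questions, threshold=5.0))
```

The returned count corresponds to the "overall gender dependence level" in the abstract; dividing it by the total number of split nodes would give a normalized version, which is a design choice the abstract leaves open.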