Quick Patent Search - Fast search of global patents, free patent database for commercial use - IPRDB

1. Granted patent

US08165317B2 Method and system for position detection of a sound source (in force)
Publication No.: US08165317B2
Publication date: 2012-04-24
Application No.: US12498012
Filing date: 2009-07-06
Applicant: Osamu Ichikawa, Tohru Nagano, Masafumi Nishimura
Inventor: Osamu Ichikawa, Tohru Nagano, Masafumi Nishimura
IPC classification: H04R3/00
CPC classification: G01S5/30, H04R3/005
Abstract: A position detection method, system, and computer readable article of manufacture tangibly embodying computer readable instructions for executing the method for detecting the position of a sound source using at least two microphones. The method includes the steps of: emitting a reproduced sound from the sound source; observing the reproduced sound and an observed sound at the microphones; converting the reproduced sound and the observed sound into electrical signals; transforming the signals of the reproduced sound and of the observed sound into frequency spectra by a frequency spectrum transformer apparatus; calculating Crosspower Spectrum Phase (CSP) coefficients of the frequency spectra of the signals by a CSP coefficient calculator apparatus; and calculating distances between the position of the sound source and the positions of the microphones based on the calculated CSP coefficients by a distance calculating apparatus, thereby detecting the position of the sound source.
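The delay-to-distance idea in the abstract above can be illustrated with a minimal Python sketch. It is not the patented implementation: the function names, the naive DFT, the circularly shifted test signal, and the 343 m/s speed of sound are all assumptions made for illustration. The sketch whitens the cross-power spectrum of a reference (reproduced) signal and an observed signal, takes the lag at which the CSP coefficients peak, and converts that lag into a distance.

```python
import cmath
import math
import random


def dft(x):
    """Naive discrete Fourier transform (O(n^2)), adequate for short frames."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft(spec):
    """Naive inverse discrete Fourier transform."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
            for t in range(n)]


def csp_coefficients(ref, obs, eps=1e-12):
    """CSP coefficients: inverse DFT of the phase-only cross-power spectrum."""
    ref_spec = dft(ref)
    obs_spec = dft(obs)
    cross = [o * r.conjugate() for r, o in zip(ref_spec, obs_spec)]
    # Dividing by the magnitude keeps only the phase (eps avoids division by zero).
    whitened = [c / max(abs(c), eps) for c in cross]
    return [v.real for v in idft(whitened)]


def estimate_delay(ref, obs):
    """Lag (in samples) at which the CSP coefficients peak."""
    csp = csp_coefficients(ref, obs)
    return max(range(len(csp)), key=lambda i: csp[i])


def delay_to_distance(delay_samples, sample_rate_hz, speed_of_sound=343.0):
    """Convert a propagation delay in samples to metres (speed of sound assumed)."""
    return delay_samples / sample_rate_hz * speed_of_sound


# Hypothetical test signal: white noise, observed 5 samples later (circular shift).
rng = random.Random(0)
reference = [rng.uniform(-1.0, 1.0) for _ in range(64)]
observed = reference[-5:] + reference[:-5]
lag = estimate_delay(reference, observed)
print(lag, delay_to_distance(lag, 16000))
```

With distances to two or more microphones at known positions, the source position could then be found geometrically; that step is left out of the sketch.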

2. Patent application

US20100008516A1 METHOD AND SYSTEM FOR POSITION DETECTION OF A SOUND SOURCE (in force)
Publication No.: US20100008516A1
Publication date: 2010-01-14
Application No.: US12498012
Filing date: 2009-07-06
Applicant: Osamu Ichikawa, Tohru Nagano, Masafumi Nishimura
Inventor: Osamu Ichikawa, Tohru Nagano, Masafumi Nishimura
IPC classification: H04R3/00
CPC classification: G01S5/30, H04R3/005
Abstract: A position detection method, system, and computer readable article of manufacture tangibly embodying computer readable instructions for executing the method for detecting the position of a sound source using at least two microphones. The method includes the steps of: emitting a reproduced sound from the sound source; observing the reproduced sound and an observed sound at the microphones; converting the reproduced sound and the observed sound into electrical signals; transforming the signals of the reproduced sound and of the observed sound into frequency spectra by a frequency spectrum transformer apparatus; calculating Crosspower Spectrum Phase (CSP) coefficients of the frequency spectra of the signals by a CSP coefficient calculator apparatus; and calculating distances between the position of the sound source and the positions of the microphones based on the calculated CSP coefficients by a distance calculating apparatus, thereby detecting the position of the sound source.

3. Patent application

US20110131044A1 TARGET VOICE EXTRACTION METHOD, APPARATUS AND PROGRAM PRODUCT (lapsed)
Publication No.: US20110131044A1
Publication date: 2011-06-02
Application No.: US12955882
Filing date: 2010-11-29
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L17/00
CPC classification: G10L25/78, G10L15/20, G10L21/028, G10L2021/02166
Abstract: An apparatus, program product and method are provided for separating a target voice from a plurality of other voices having different directions of arrival. The method comprises the steps of disposing a first and a second voice input device at a predetermined distance from one another and, upon receipt of voice signals at said devices, calculating discrete Fourier transforms for the signals, calculating a CSP (cross-power spectrum phase) coefficient by superpositioning multiple frequency-bin components based on correlation of the two spectra signals received, and then calculating a weighted CSP coefficient from said two discrete Fourier-transformed speech signals. A target voice is separated when received by said devices from other voice signals in a spectrum by using the calculated weighted CSP coefficient.

4. Granted patent

US07599836B2 Voice recording system, recording device, voice analysis device, voice recording method and program (in force)
Publication No.: US07599836B2
Publication date: 2009-10-06
Application No.: US11136831
Filing date: 2005-05-25
Applicant: Osamu Ichikawa, Masafumi Nishimura, Tetsuya Takiguchi
Inventor: Osamu Ichikawa, Masafumi Nishimura, Tetsuya Takiguchi
IPC classification: G10L17/00
CPC classification: G10L21/028
Abstract: To provide a method of specifying each of speakers of individual voices, based on recorded voices made by a plurality of speakers, with a simple system configuration, and to provide a system using the method. The system includes: microphones individually provided for each of the speakers; a voice processing unit which gives a unique characteristic to each pair of two-channel voice signals recorded with each of the microphones 10, by executing different kinds of voice processing on the respective pairs of voice signals, and which mixes the voice signals for each channel; and an analysis unit which performs an analysis according to the unique characteristics, given to the voice signals concerning the respective microphones through the processing by the voice processing unit, and which specifies the speaker for each speech segment of the voice signals.

5. Patent application

US20080270131A1 METHOD, PREPROCESSOR, SPEECH RECOGNITION SYSTEM, AND PROGRAM PRODUCT FOR EXTRACTING TARGET SPEECH BY REMOVING NOISE (in force)
Publication No.: US20080270131A1
Publication date: 2008-10-30
Application No.: US12105621
Filing date: 2008-04-18
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L15/00, G10L19/14
CPC classification: G10L15/20, G10L15/02, G10L21/02, G10L21/0272, G10L2021/02161
Abstract: The present invention relates to a method, preprocessor, speech recognition system, and program product for extracting a target speech by removing noise. In an embodiment of the invention, target speech is extracted from two input speeches, which are obtained through at least two speech input devices installed in different places in a space. The invention applies a spectrum subtraction process by using a noise power spectrum (Uω) estimated by one or both of the two speech input devices (Xω(T)) and an arbitrary subtraction constant (α) to obtain a resultant subtracted power spectrum (Yω(T)). The invention further applies a gain control based on the two speech input devices to the resultant subtracted power spectrum to obtain a gain-controlled power spectrum (Dω(T)). The invention further applies a flooring process to said resultant gain-controlled power spectrum on the basis of an arbitrary flooring factor (β) to obtain a power spectrum for speech recognition (Zω(T)).
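The three stages named in the abstract above (spectrum subtraction, gain control, flooring) can be sketched per frame as follows. This is a hedged illustration only: the abstract does not give the formulas, so the subtraction Y = X - αU, the per-bin gain values, and the flooring rule Z = max(D, βU) are common textbook forms assumed here, and all function names are hypothetical.

```python
def spectral_subtraction(x_power, noise_power, alpha=1.0):
    """Y = X - alpha * U for each frequency bin (assumed form)."""
    return [x - alpha * u for x, u in zip(x_power, noise_power)]


def apply_gain(y_power, gain):
    """D = gain * Y; the per-bin gain is a hypothetical input, e.g. derived
    from how well the two input channels agree on the target direction."""
    return [g * y for g, y in zip(gain, y_power)]


def flooring(d_power, noise_power, beta=0.1):
    """Z = max(D, beta * U): keeps each bin at or above a small noise floor
    (assumed rule; the abstract only names the flooring factor beta)."""
    return [max(d, beta * u) for d, u in zip(d_power, noise_power)]


# One frame with three frequency bins (made-up numbers).
x = [10.0, 2.0, 5.0]   # observed power spectrum X
u = [1.0, 3.0, 1.0]    # estimated noise power spectrum U
y = spectral_subtraction(x, u, alpha=2.0)
d = apply_gain(y, gain=[1.0, 1.0, 0.5])
z = flooring(d, u, beta=0.5)
print(z)  # [8.0, 1.5, 1.5]
```

The second bin shows why flooring is needed: subtraction drives it negative, which is not a valid power value, so it is replaced by the floor β·U.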

6. Granted patent

US08930185B2 Speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program (in force)
Publication No.: US08930185B2
Publication date: 2015-01-06
Application No.: US13392901
Filing date: 2010-07-12
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L19/02, G10L15/02, G10L15/20, G10L25/24
CPC classification: G10L15/02, G10L15/20, G10L25/24
Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: a first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each of the frames, where the delta spectrum is a difference of the spectrum within continuous frames for the frequency bin; and a first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of spectra through all frames that are overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.
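As a rough illustration of the normalized delta feature described in the abstract above, the sketch below takes a first difference between consecutive frames and divides it by the per-bin average spectrum. Two choices are assumptions, since the abstract leaves them open: the delta is taken as s[t] - s[t-1], and the "function of an average spectrum" is taken to be the average itself. The function name is hypothetical.

```python
def normalized_delta(spectra):
    """spectra[t][b] is the spectrum of frame t at frequency bin b.

    Returns one normalized delta vector per frame from t = 1 onward:
    (spectra[t][b] - spectra[t-1][b]) / average_over_frames(spectra[:, b]).
    """
    n_frames = len(spectra)
    n_bins = len(spectra[0])
    # Average spectrum per bin over all frames of the utterance.
    average = [sum(frame[b] for frame in spectra) / n_frames for b in range(n_bins)]
    deltas = []
    for t in range(1, n_frames):
        deltas.append([(spectra[t][b] - spectra[t - 1][b]) / average[b]
                       for b in range(n_bins)])
    return deltas


# Three frames, two bins (made-up values); per-bin averages are 2.0 and 4.0.
frames = [[1.0, 2.0], [3.0, 2.0], [2.0, 8.0]]
print(normalized_delta(frames))  # [[1.0, 0.0], [-0.5, 1.5]]
```

Dividing by the utterance-level average makes the feature insensitive to a constant scaling of a bin across all frames, which is the apparent motivation for the normalization.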

7. Granted patent

US07720679B2 Speech recognition apparatus, speech recognition apparatus and program thereof (in force)
Publication No.: US07720679B2
Publication date: 2010-05-18
Application No.: US12236588
Filing date: 2008-09-24
Applicant: Osamu Ichikawa, Tetsuya Takiguchi, Masafumi Nishimura
Inventor: Osamu Ichikawa, Tetsuya Takiguchi, Masafumi Nishimura
IPC classification: G10L15/20
CPC classification: G10L21/0216, G10L21/028, G10L2021/02166
Abstract: Provided is a method for canceling background noise of a sound source other than a target direction sound source in order to realize highly accurate speech recognition, and a system using the same. In terms of directional characteristics of a microphone array, due to a capability of approximating a power distribution of each angle of each of possible various sound source directions by use of a sum of coefficient multiples of a base form angle power distribution of a target sound source measured beforehand by base form angle by using a base form sound, and power distribution of a non-directional background sound by base form, only a component of the target sound source direction is extracted at a noise suppression part. In addition, when the target sound source direction is unknown, at a sound source localization part, a distribution for minimizing the approximate residual is selected from base form angle power distributions of various sound source directions to assume a target sound source direction. Further, maximum likelihood estimation is executed by using voice data of the component of the sound source direction passed through these processes, and a voice model obtained by predetermined modeling of the voice data, and speech recognition is carried out based on an obtained assumption value.

8. Granted patent

US08812312B2 System, method and program for speech processing (in force)
Publication No.: US08812312B2
Publication date: 2014-08-19
Application No.: US12200610
Filing date: 2008-08-28
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L15/00, G10L15/20, G10L17/00
CPC classification: G10L15/20, G10L15/02, G10L25/24
Abstract: The present invention relates to a system, method and program for speech recognition. In an embodiment of the invention a method for processing a speech signal consists of receiving a power spectrum of a speech signal and generating a log power spectrum signal of the power spectrum. The method further consists of performing discrete cosine transformation on the log power spectrum signal and cutting off cepstrum upper and lower terms of the discrete cosine transformed signal. The method further consists of performing inverse discrete cosine transformation on the signal from which the cepstrum upper and lower terms are cut off. The method further consists of converting the inverse discrete cosine transformed signal so as to bring the signal back to a power spectrum domain and filtering the power spectrum of the speech signal by using, as a filter, the signal which is brought back to the power spectrum domain.
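The processing chain in the abstract above (log power spectrum, DCT, cutting cepstrum terms, inverse DCT, back to the power domain, filtering) can be sketched in plain Python. This is an assumption-laden illustration rather than the patented method: the orthonormal DCT-II, the cut-off indices `low_cut` and `high_cut`, the use of `exp` to return to the power domain, and per-bin multiplication as the filtering step are all choices made here, and the function names are hypothetical.

```python
import math


def dct(x):
    """Orthonormal DCT-II."""
    n = len(x)
    out = []
    for k in range(n):
        scale = math.sqrt((1.0 if k == 0 else 2.0) / n)
        out.append(scale * sum(x[i] * math.cos(math.pi * (i + 0.5) * k / n)
                               for i in range(n)))
    return out


def idct(c):
    """Inverse of the orthonormal DCT-II above (a DCT-III)."""
    n = len(c)
    return [sum(math.sqrt((1.0 if k == 0 else 2.0) / n) * c[k]
                * math.cos(math.pi * (i + 0.5) * k / n) for k in range(n))
            for i in range(n)]


def cepstral_filter(power, low_cut, high_cut):
    """Build a filter from the cepstrum terms with index in [low_cut, high_cut).

    power must contain strictly positive values, since the log is taken.
    """
    log_spectrum = [math.log(p) for p in power]
    cepstrum = dct(log_spectrum)
    # Cut off the lower and upper cepstrum terms by zeroing them.
    kept = [c if low_cut <= k < high_cut else 0.0 for k, c in enumerate(cepstrum)]
    # exp() brings the liftered log spectrum back to the power spectrum domain.
    return [math.exp(v) for v in idct(kept)]


def apply_filter(power, filt):
    """Filter the power spectrum by per-bin multiplication (assumed form)."""
    return [p * f for p, f in zip(power, filt)]


# Made-up 8-bin power spectrum; keep cepstrum terms 1..3 only.
spectrum = [4.0, 9.0, 2.0, 7.0, 1.0, 3.0, 8.0, 5.0]
print(apply_filter(spectrum, cepstral_filter(spectrum, low_cut=1, high_cut=4)))
```

For a perfectly flat spectrum, only cepstrum term 0 is non-zero, so cutting it leaves a filter of all ones and the spectrum passes through unchanged.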

9. Patent application

US20120330657A1 SPEECH FEATURE EXTRACTION APPARATUS, SPEECH FEATURE EXTRACTION METHOD, AND SPEECH FEATURE EXTRACTION PROGRAM (in force)
Publication No.: US20120330657A1
Publication date: 2012-12-27
Application No.: US13604721
Filing date: 2012-09-06
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L15/20
CPC classification: G10L15/02, G10L15/20, G10L25/24
Abstract: A speech feature extraction apparatus, speech feature extraction method, and speech feature extraction program. A speech feature extraction apparatus includes: a first difference calculation module to: (i) receive, as an input, a spectrum of a speech signal segmented into frames for each frequency bin; and (ii) calculate a delta spectrum for each of the frames, where the delta spectrum is a difference of the spectrum within continuous frames for the frequency bin; and a first normalization module to normalize the delta spectrum of the frame for the frequency bin by dividing the delta spectrum by a function of an average spectrum; where the average spectrum is an average of spectra through all frames that are overall speech for the frequency bin; and where an output of the first normalization module is defined as a first delta feature.

10. Granted patent

US07856353B2 Method for processing speech signal data with reverberation filtering (in force)
Publication No.: US07856353B2
Publication date: 2010-12-21
Application No.: US11834964
Filing date: 2007-08-07
Applicant: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
Inventor: Takashi Fukuda, Osamu Ichikawa, Masafumi Nishimura
IPC classification: G10L19/02, G10L15/20, G10L13/06, H04B3/20, H03G3/00, A61F11/06
CPC classification: G10L15/20, G10L19/04, G10L2021/02082
Abstract: Method for processing speech signal data. A speech signal is divided into frames. Each frame is characterized by a frame number T representing a unique interval of time. Each speech signal is characterized by a power spectrum with respect to frame T and frequency band ω. A speech segment and a reverberation segment of the speech signal are determined. L filter coefficients W(k) (k = 1, 2, ..., L) respectively corresponding to L frames immediately preceding frame T are computed such that the L filter coefficients minimize a function Φ that is a linear combination of a sum of squares of a residual speech power in the reverberation segment and a sum of squares of a subtracted speech power in the speech segment. The computed L filter coefficients are stored within storage media of the computing apparatus.
