Quick Patent Search: search global patents quickly, free patent database for commercial use - IPRDB

1. Granted invention patent

US06640208B1 Voiced/unvoiced speech classifier (Status: in force)
Publication number: US06640208B1
Publication date: 2003-10-28
Application number: US09659318
Filing date: 2000-09-12
Applicant: Yaxin Zhang, Jianming Song, Anton Madievski
Inventor: Yaxin Zhang, Jianming Song, Anton Madievski
IPC classification: G10L11/06
CPC classification: G10L25/93
Abstract: A voiced/unvoiced speech classifier (30) includes a speech segmentor (34) which segments an input digitized speech waveform into frames of speech and a band-pass filter (36) which filters the frames of speech. A relative energy generator (38) generates a relative energy value for each filtered frame of speech and a decision parameter generator (52) including an autocorrelation calculator (54) and a pitch calculator (56) generates a decision parameter based on an autocorrelation function and a pitch frequency index for the filtered frames of speech. A normalized energy calculator (46) adjusts the threshold and then normalizes the relative energy. A comparator (60) provides a signal indicative of whether a frame of speech is voiced speech or unvoiced speech depending on a comparison of the decision parameter and the normalized relative energy value for each filtered frame of speech.
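
The frame-by-frame flow in this abstract can be illustrated with a minimal Python sketch. It is not the patented method: the band-pass filter and the adaptive threshold are omitted, and the function names, frame length, lag range, and both thresholds are assumptions chosen only for illustration. A frame is called voiced here when its energy relative to the loudest frame and its normalized autocorrelation peak both exceed fixed thresholds.

```python
import math
import random

def split_frames(samples, frame_len):
    """Cut the waveform into consecutive non-overlapping frames."""
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]

def frame_energy(frame):
    """Sum of squared samples."""
    return sum(x * x for x in frame)

def autocorr_peak(frame, min_lag, max_lag):
    """Largest normalized autocorrelation over the candidate pitch lags."""
    r0 = frame_energy(frame)
    if r0 == 0.0:
        return 0.0, 0
    best, best_lag = 0.0, 0
    for lag in range(min_lag, min(max_lag, len(frame) - 1) + 1):
        r = sum(frame[n] * frame[n + lag] for n in range(len(frame) - lag))
        if r / r0 > best:
            best, best_lag = r / r0, lag
    return best, best_lag

def classify_frames(samples, frame_len=240, min_lag=20, max_lag=160,
                    peak_threshold=0.3, energy_floor=0.05):
    """Label each frame 'voiced' or 'unvoiced' (all thresholds are assumed)."""
    frames = split_frames(samples, frame_len)
    energies = [frame_energy(f) for f in frames]
    max_energy = max(energies, default=0.0) or 1.0
    labels = []
    for frame, energy in zip(frames, energies):
        relative_energy = energy / max_energy
        peak, _lag = autocorr_peak(frame, min_lag, max_lag)
        voiced = relative_energy >= energy_floor and peak >= peak_threshold
        labels.append("voiced" if voiced else "unvoiced")
    return labels

# Demo: one frame of a 100 Hz tone sampled at 8 kHz, then one frame of faint noise.
rng = random.Random(0)
tone = [math.sin(2 * math.pi * 100 * n / 8000) for n in range(240)]
noise = [rng.uniform(-0.1, 0.1) for _ in range(240)]
labels = classify_frames(tone + noise)
print(labels)  # ['voiced', 'unvoiced']
```

The tone frame passes both checks because its autocorrelation peaks at the 80-sample pitch period, while the noise frame is rejected on relative energy alone.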

2. Granted invention patent

US06553342B1 Tone based speech recognition (Status: in force)
Publication number: US06553342B1
Publication date: 2003-04-22
Application number: US09496868
Filing date: 2000-02-02
Applicant: Yaxin Zhang, Jianming Song, Anton Madievski
Inventor: Yaxin Zhang, Jianming Song, Anton Madievski
IPC classification: G10L15/02
CPC classification: G10L15/02, G10L25/15
Abstract: A method and apparatus for speech recognition involves classifying (38) a digitized speech segment according to whether the speech segment comprises voiced or unvoiced speech and utilizing that classification to generate tonal feature vectors (41) of the speech segment when the speech is voiced. The tonal feature vectors are then combined (42) with other non-tonal feature vectors (40) to provide speech feature vectors. The speech feature vectors are compared (35) with previously stored models of speech feature vectors (37) for different segments of speech to determine which previously stored model is a most likely match for the segment to be recognized.
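
One way to picture the combining step is the short Python sketch below. It is an illustration under assumptions, not the patent's feature definition: the tonal features (log pitch and its frame-to-frame change), the zero-filling for unvoiced frames, and the function names are all hypothetical.

```python
import math

def tonal_features(pitch_hz, prev_pitch_hz, voiced):
    """Assumed tonal features: log pitch and its change since the last frame."""
    if not voiced or pitch_hz <= 0.0:
        return [0.0, 0.0]  # assumption: unvoiced frames carry no tone information
    log_pitch = math.log(pitch_hz)
    if prev_pitch_hz is not None and prev_pitch_hz > 0.0:
        delta = log_pitch - math.log(prev_pitch_hz)
    else:
        delta = 0.0
    return [log_pitch, delta]

def combine(non_tonal, tonal):
    """Concatenate non-tonal and tonal features into one speech feature vector."""
    return list(non_tonal) + list(tonal)

voiced_vector = combine([1.0, 2.0], tonal_features(200.0, 100.0, True))
unvoiced_vector = combine([1.0, 2.0], tonal_features(0.0, None, False))
```

Keeping the vector length fixed for both frame types lets the same stored models score every frame, which is why the sketch pads unvoiced frames rather than dropping the tonal slots.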

3. Invention patent application

US20060270467A1 Method and apparatus of increasing speech intelligibility in noisy environments (Status: in force)
Publication number: US20060270467A1
Publication date: 2006-11-30
Application number: US11137182
Filing date: 2005-05-25
Applicant: Jianming Song, John Johnson
Inventor: Jianming Song, John Johnson
IPC classification: H04B1/38, H04M1/00
CPC classification: H03G3/3089, G10L21/0208, G10L21/0232, G10L25/15, H04M1/6025
Abstract: A method (400, 600, 700) and apparatus (220) for enhancing the intelligibility of speech emitted into a noisy environment. After filtering (408) ambient noise with a filter (304) that simulates the physical blocking of noise by at least a part of a voice communication device (102), a frequency dependent SNR of received voice audio relative to ambient noise is computed (424) on a perceptual (e.g. Bark) frequency scale. Formants are identified (426, 600, 700) and the SNR in bands including certain formants is modified (508, 510) with formant enhancement gain factors in order to improve intelligibility. A set of high pass filter gains (338) is combined (516) with the formant enhancement gain factors, yielding combined gains which are clipped (518), scaled (520) according to a total SNR, normalized (526), smoothed across time (530) and frequency (532) and used to reconstruct (532, 534) an audio signal.
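
The per-band SNR step can be sketched in Python as follows. This covers only the SNR computation on a Bark scale, not the formant detection or gain stages. The Hz-to-Bark mapping uses Traunmüller's published approximation, which the abstract does not name; the integer band edges, function names, and power floor are assumptions.

```python
import math

def hz_to_bark(freq_hz):
    """Traunmüller's approximation of the Bark scale."""
    return 26.81 * freq_hz / (1960.0 + freq_hz) - 0.53

def band_snr_db(freqs_hz, speech_power, noise_power, floor=1e-12):
    """Sum power per integer Bark band, then return the SNR in dB per band."""
    speech, noise = {}, {}
    for f, s, n in zip(freqs_hz, speech_power, noise_power):
        band = max(0, int(hz_to_bark(f)))
        speech[band] = speech.get(band, 0.0) + s
        noise[band] = noise.get(band, 0.0) + n
    return {b: 10.0 * math.log10(max(speech[b], floor) / max(noise[b], floor))
            for b in sorted(speech)}

# Two spectral bins near 1 kHz fall into the same Bark band and are pooled.
snr = band_snr_db([1000.0, 1050.0], [10.0, 30.0], [1.0, 3.0])
```

Pooling bins into perceptual bands before taking the ratio means a band's SNR reflects how audible the speech is there, rather than the value of any single FFT bin.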

4. Invention patent application

US20130158989A1 APPARATUS AND METHOD FOR NOISE REMOVAL (Status: in force)
Publication number: US20130158989A1
Publication date: 2013-06-20
Application number: US13330235
Filing date: 2011-12-19
Applicant: Jianming Song, David Barron
Inventor: Jianming Song, David Barron
IPC classification: G10L21/02
CPC classification: G10L21/0232, G10L2021/02165
Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.
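
A rough Python sketch of the idea follows. It substitutes two standard techniques that the abstract does not specify: a recursive (exponential) average for the smoothed noise spectrum and power spectral subtraction with a floor for the removal step. The function names and the `alpha` and `floor` parameters are assumptions; each frame is a list of per-bin power values.

```python
def smooth_noise(prev_estimate, noise_frame, alpha=0.9):
    """Recursive average: keep a fraction alpha of the old estimate per bin."""
    if prev_estimate is None:
        return list(noise_frame)
    return [alpha * p + (1.0 - alpha) * n
            for p, n in zip(prev_estimate, noise_frame)]

def subtract_noise(signal_frame, noise_estimate, floor=0.05):
    """Power spectral subtraction, never going below floor * input power."""
    return [max(s - n, floor * s) for s, n in zip(signal_frame, noise_estimate)]

def denoise_stream(signal_frames, noise_frames, alpha=0.9, floor=0.05):
    """Update the noise estimate and clean the signal frame by frame, in step."""
    estimate = None
    for signal_frame, noise_frame in zip(signal_frames, noise_frames):
        estimate = smooth_noise(estimate, noise_frame, alpha)
        yield subtract_noise(signal_frame, estimate, floor)

cleaned = list(denoise_stream([[4.0, 1.0], [4.0, 1.0]],
                              [[1.0, 1.0], [3.0, 1.0]], alpha=0.5))
print(cleaned)  # [[3.0, 0.05], [2.0, 0.05]]
```

Because the generator consumes one noise frame for every signal frame, the estimate and the subtraction stay aligned in time, which mirrors the synchronous processing the abstract describes.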

5. Invention patent application

US20060136205A1 Method of refining statistical pattern recognition models and statistical pattern recognizers (Status: in force)
Publication number: US20060136205A1
Publication date: 2006-06-22
Application number: US11018271
Filing date: 2004-12-21
Applicant: Jianming Song
Inventor: Jianming Song
IPC classification: G10L15/06
CPC classification: G10L15/063, G06K9/6277, G10L2015/0635
Abstract: A device (800) performs statistical pattern recognition using model parameters that are refined by optimizing an objective function that includes a term for many items of training data for which recognition errors occur wherein each term depends on a relative magnitude of a first score for a recognition result for an item of training data and a second score calculated by evaluating a statistical pattern recognition model identified by a transcribed identity of the training data item with feature vectors extracted from the item of training data. The objective function does not include terms for items of training data for which there is a gross discrepancy between a transcribed identity and a recognized identity. Gross discrepancies can be detected by probability score or pattern identity comparisons. Terms of the objective function are weighted based on the type of recognition error and weights can be increased for high priority patterns.

6. Granted invention patent

US06393397B1 Cohort model selection apparatus and method (Status: lapsed)
Publication number: US06393397B1
Publication date: 2002-05-21
Application number: US09332927
Filing date: 1999-06-14
Applicant: Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song
Inventor: Ho Chuen Choi, Xiaoyuan Zhu, Jianming Song
IPC classification: G10L15/06
CPC classification: G10L17/04, G10L17/12
Abstract: An apparatus for selecting a cohort model for use in a speaker verification system includes a model generator (108) for determining a target speaker model (114) from a speech sample collected from the target speaker (106). A cohort selector (110) determines a similarity value between each of a number of predetermined existing speaker models from a model pool (112) and the target speaker model (114) and a dissimilarity value between each of the existing speaker models and any previously selected cohort models (116). An existing speaker model which is most similar to the target speaker model, but most dissimilar to previously chosen cohort models, is then chosen as another cohort model for the target speaker.
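
The selection loop can be illustrated with a small greedy sketch in Python. The abstract does not say how the similarity and dissimilarity values are measured or combined, so everything specific here is an assumption: speaker models are plain vectors, Euclidean distance stands in for both measures, and the `weight` parameter trades closeness to the target against spread from cohorts already chosen.

```python
import math

def distance(a, b):
    """Euclidean distance between two model vectors (an assumed measure)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def select_cohorts(target, pool, count, weight=1.0):
    """Greedily pick models close to the target but far from earlier picks."""
    selected = []
    remaining = dict(pool)
    while remaining and len(selected) < count:
        def score(name):
            closeness = -distance(remaining[name], target)
            if not selected:
                return closeness
            spread = min(distance(remaining[name], pool[s]) for s in selected)
            return closeness + weight * spread
        best = max(sorted(remaining), key=score)  # sorted for stable tie-breaks
        selected.append(best)
        del remaining[best]
    return selected

pool = {"a": (1.0, 0.0), "b": (1.1, 0.0), "c": (-1.2, 0.0)}
cohorts = select_cohorts((0.0, 0.0), pool, count=2)
print(cohorts)  # ['a', 'c']
```

A pure nearest-neighbour choice would return `a` and `b`, which sit almost on top of each other. The spread term picks `c` instead, so the cohort set surrounds the target rather than clustering on one side of it.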

7. Granted invention patent

US09497528B2 Cotalker nulling based on multi super directional beamformer (Status: in force)
Publication number: US09497528B2
Publication date: 2016-11-15
Application number: US14074645
Filing date: 2013-11-07
Applicant: Jianming Song, Mike Reuter
Inventor: Jianming Song, Mike Reuter
IPC classification: A61F11/06, H04R1/08, G10L17/00, H04R3/00, G10L21/0272, H04R1/40, G10L21/0208
CPC classification: H04R1/08, G10L17/00, G10L21/0272, G10L2021/02087, H04R1/406, H04R3/005, H04R2430/20, H04R2499/13
Abstract: Speech from a driver and speech from a passenger in a vehicle is selected directionally using a plurality of directional microphones. Sounds detected as coming from a passenger from a plurality of directional microphones are suppressed from sounds detected as coming from a driver by a second plurality of directional microphones.

8. Invention patent application

US20150124988A1 COTALKER NULLING BASED ON MULTI SUPER DIRECTIONAL BEAMFORMER (Status: in force)
Publication number: US20150124988A1
Publication date: 2015-05-07
Application number: US14074645
Filing date: 2013-11-07
Applicant: Jianming Song, Mike Reuter
Inventor: Jianming Song, Mike Reuter
IPC classification: H04R1/08, G10L17/00
CPC classification: H04R1/08, G10L17/00, G10L21/0272, G10L2021/02087, H04R1/406, H04R3/005, H04R2430/20, H04R2499/13
Abstract: Speech from a driver and speech from a passenger in a vehicle is selected directionally using a plurality of directional microphones. Sounds detected as coming from a passenger from a plurality of directional microphones are suppressed from sounds detected as coming from a driver by a second plurality of directional microphones.

9. Granted invention patent

US08712769B2 Apparatus and method for noise removal by spectral smoothing (Status: in force)
Publication number: US08712769B2
Publication date: 2014-04-29
Application number: US13330235
Filing date: 2011-12-19
Applicant: Jianming Song, David Barron
Inventor: Jianming Song, David Barron
IPC classification: G10L21/0232
CPC classification: G10L21/0232, G10L2021/02165
Abstract: A continuous stream of noise is created from a plurality of input signals. A smoothing spectrum estimate is continuously calculated from the continuous stream of noise. Noise is responsively removed from a selected one of the plurality of input signals using the smoothing spectrum estimate. The removal of the noise from the selected input signal is performed substantially synchronously and in time alignment with the creating of the continuous stream of noise and the calculating of the smoothing spectrum estimate.

10. Invention patent application

US20070078659A1 Wireless communication device for providing reliable voice-based web browsing (Status: under examination, published)
Publication number: US20070078659A1
Publication date: 2007-04-05
Application number: US11241170
Filing date: 2005-09-30
Applicant: Ukrit Visitkitjakarn, John Johnson, Jianming Song
Inventor: Ukrit Visitkitjakarn, John Johnson, Jianming Song
IPC classification: G10L21/00
CPC classification: G10L15/22, G06F16/957, H04M1/72561, H04M2250/74
Abstract: A wireless communication device for voice-based web browsing comprising a processor (204) coupled to a memory (206), a speech input device (224), and a display (216). The memory (206) stores voice commands and sequential values such that each voice command is associated with a sequential value. The speech input device (224) receives a voice input corresponding to a particular sequential value. The display (216) shows web pages one page at a time so that each web page has web links and sequential values assigned to web links. The processor (204) activates a web site associated with a web link corresponding to the particular sequential value in response to each occurrence of receiving the voice input.
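
The link-numbering scheme can be shown with a short Python sketch. It assumes the speech recognizer has already produced a text word; the vocabulary table, function names, and example URLs are hypothetical.

```python
# Assumed stored voice commands, each associated with a sequential value.
SPOKEN_NUMBERS = {"one": 1, "two": 2, "three": 3, "four": 4, "five": 5}

def number_links(links):
    """Assign sequential values 1, 2, 3, ... to the links on one page."""
    return {index: url for index, url in enumerate(links, start=1)}

def resolve_voice_input(recognized_word, numbered_links):
    """Return the link for the spoken number, or None if nothing matches."""
    value = SPOKEN_NUMBERS.get(recognized_word.strip().lower())
    return numbered_links.get(value)

page_links = number_links(["https://example.com/news",
                           "https://example.com/weather"])
chosen = resolve_voice_input("Two", page_links)
print(chosen)  # https://example.com/weather
```

Restricting the recognizer to a small, fixed set of number words is what makes this approach reliable: the device never has to recognize arbitrary link text, only one of a handful of commands.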
