IPRDB Quick Patent Search: fast search of global patents, free commercial-use patent database

1. Patent application

US20150382138A1 LOCATION-BASED AUDIO MESSAGING (pending, published)
Publication No.: US20150382138A1
Publication date: 2015-12-31
Application No.: US14316667
Filing date: 2014-06-26
Applicants: Raja Bose, Hiroshi Horii, Jonathan Lester, Ruchita Bhargava, Kazuhito Koishida, Michelle L. Holtmann, Christina Chen
Inventors: Raja Bose, Hiroshi Horii, Jonathan Lester, Ruchita Bhargava, Kazuhito Koishida, Michelle L. Holtmann, Christina Chen
IPC classification: H04W4/02, H04W4/12
Abstract: Mobile devices provide a variety of techniques for presenting messages from sources to a user. However, when the message pertains to the presence of the user at a location, the available communications techniques may exhibit deficiencies, e.g., reliance on the memory of the source and/or user of the existence and content of a message between its initiation and the user's visit to the location, or reliance on the communication accessibility of the user, the device, and/or the source during the user's location visit. Presented herein are techniques for enabling a mobile device, at a first time, to receive a request to present an audio message during the presence of the user at a location; and, at a second time, detecting the presence of the user at the location, and presenting the audio message to the user, optionally without awaiting a request from the user to present the message.
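The two-phase flow in this abstract (register a message for a location at a first time, then play it when the user is detected there at a second time) can be illustrated with a small geofence sketch. This is only an illustration under stated assumptions: the names `PendingMessage`, `LocationMessenger`, and `distance_m`, the circular-radius presence test, and the haversine distance are all hypothetical and are not taken from the patent.

```python
import math
from dataclasses import dataclass

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, used by the haversine formula


@dataclass
class PendingMessage:
    """Hypothetical record of a request made at the 'first time'."""
    audio: str          # stand-in for the audio payload
    lat: float
    lon: float
    radius_m: float     # assumed circular geofence around the location
    delivered: bool = False


def distance_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in meters between two lat/lon points (haversine)."""
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(math.sqrt(a))


class LocationMessenger:
    def __init__(self, play):
        self.play = play      # callback that presents the audio to the user
        self.pending = []

    def request(self, message):
        """First time: store the request; nothing is played yet."""
        self.pending.append(message)

    def on_location(self, lat, lon):
        """Second time: play each undelivered message whose geofence contains the user."""
        for msg in self.pending:
            if not msg.delivered and distance_m(lat, lon, msg.lat, msg.lon) <= msg.radius_m:
                self.play(msg.audio)  # no user request is awaited
                msg.delivered = True
```

Because the message is stored on the device, this sketch needs neither the source nor a network connection at the moment of the visit, which is the deficiency the abstract describes.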

2. Granted patent

US09435873B2 Sound source localization using phase spectrum (in force)
Publication No.: US09435873B2
Publication date: 2016-09-06
Application No.: US13182449
Filing date: 2011-07-14
Applicants: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri
Inventors: Shankar Regunathan, Kazuhito Koishida, Harshavardhana Narayana Kikkeri
IPC classification: H04R3/00, G01S3/80, G01S3/808, G01S3/82
CPC classification: G01S3/8083, G01S3/8006, G01S3/82, H04R3/005
Abstract: An array of microphones placed on a mobile robot provides multiple channels of audio signals. A received set of audio signals is called an audio segment, which is divided into multiple frames. A phase analysis is performed on a frame of the signals from each pair of microphones. If both microphones are in an active state during the frame, a candidate angle is generated for each such pair of microphones. The result is a list of candidate angles for the frame. This list is processed to select a final candidate angle for the frame. The list of candidate angles is tracked over time to assist in the process of selecting the final candidate angle for an audio segment.
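The per-frame steps in this abstract (one candidate angle per active microphone pair, then a selection over the list) can be sketched as follows. The abstract does not say how the phase analysis or the selection is done, so everything specific here is an assumption: a single-frequency far-field model in which a phase difference maps to a delay and then to an angle, all pairs lying on one shared axis, and the median as the selection rule. The function names are hypothetical.

```python
import math
from statistics import median

SPEED_OF_SOUND_M_S = 343.0  # assumed speed of sound in air


def candidate_angle(phase_diff_rad, freq_hz, spacing_m):
    """Angle of arrival in degrees from broadside implied by one pair, or None.

    Assumed model: delay = phase / (2*pi*f), sin(angle) = c * delay / spacing.
    """
    delay_s = phase_diff_rad / (2.0 * math.pi * freq_hz)
    ratio = SPEED_OF_SOUND_M_S * delay_s / spacing_m
    if abs(ratio) > 1.0:
        return None  # delay is not physically possible for this spacing
    return math.degrees(math.asin(ratio))


def frame_candidates(pairs):
    """Build the candidate list for one frame.

    pairs: iterable of (mic_a_active, mic_b_active, phase_diff_rad, freq_hz, spacing_m).
    A pair contributes a candidate only if both microphones are active.
    """
    angles = []
    for a_active, b_active, phase, freq, spacing in pairs:
        if not (a_active and b_active):
            continue
        angle = candidate_angle(phase, freq, spacing)
        if angle is not None:
            angles.append(angle)
    return angles


def select_angle(candidates):
    """Pick a final angle for the frame; the median is an assumed selection rule."""
    return median(candidates) if candidates else None
```

Tracking the candidate lists across frames, which the abstract mentions for the segment-level decision, is left out of this sketch.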

3. Granted patent

US08880545B2 Query and matching for content recognition (in force)
Publication No.: US08880545B2
Publication date: 2014-11-04
Application No.: US13110185
Filing date: 2011-05-18
Applicants: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher
Inventors: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher
IPC classification: G06F17/30
Abstract: Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, multiple queries are transmitted to the content recognition service. In at least some embodiments, subsequent queries can progressively incorporate previous queries plus additional data that is captured. In one or more embodiments, responsive to receiving the query, the content recognition service can employ a multi-stage matching technique to identify content items responding to the query. This matching technique can be employed as queries are progressively received.
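The idea of progressive queries with multi-stage matching can be illustrated with a toy sketch. The abstract gives no detail on the query format or the matching stages, so the following is assumed throughout: audio is reduced to integer "fingerprints", each query is the cumulative list captured so far, stage 1 is a cheap shortlist by shared fingerprints, and stage 2 is an exact in-order check. All names (`progressive_queries`, `contains_run`, `match`) are hypothetical.

```python
def progressive_queries(chunks):
    """Yield one query per captured chunk; each query holds every fingerprint so far."""
    captured = []
    for chunk in chunks:
        captured.extend(chunk)
        yield tuple(captured)


def contains_run(haystack, needle):
    """True if needle occurs as a contiguous run inside haystack."""
    n = len(needle)
    return any(tuple(haystack[i:i + n]) == tuple(needle)
               for i in range(len(haystack) - n + 1))


def match(query, catalog, shortlist_size=3):
    """Assumed two-stage match over catalog: {title: [fingerprints]}.

    Stage 1 (cheap): shortlist titles by the number of distinct shared fingerprints.
    Stage 2 (precise): keep only titles that contain the query in order.
    """
    wanted = set(query)
    scored = sorted(((len(wanted & set(fps)), title) for title, fps in catalog.items()),
                    reverse=True)
    shortlist = [title for score, title in scored[:shortlist_size] if score > 0]
    return sorted(title for title in shortlist if contains_run(catalog[title], query))
```

Running `match` on each successive query shows the candidate set narrowing as more audio is captured, so the service can answer as soon as one candidate remains.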

4. Granted patent

US08457958B2 Audio transcoder using encoder-generated side information to transcode to target bit-rate (in force)
Publication No.: US08457958B2
Publication date: 2013-06-04
Application No.: US11938194
Filing date: 2007-11-09
Applicants: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
IPC classification: G10L19/02, H04B1/66
CPC classification: G10L19/173
Abstract: An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.

5. Patent application

US20120296938A1 Query and Matching for Content Recognition (in force)
Publication No.: US20120296938A1
Publication date: 2012-11-22
Application No.: US13110185
Filing date: 2011-05-18
Applicants: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher
Inventors: Kazuhito Koishida, David Nister, Ian Simon, Tom Butcher
IPC classification: G06F17/30
CPC classification: G06F17/30743
Abstract: Various embodiments enable audio data, such as music data, to be captured, by a device, from a background environment and processed to formulate a query that can then be transmitted to a content recognition service. In one or more embodiments, multiple queries are transmitted to the content recognition service. In at least some embodiments, subsequent queries can progressively incorporate previous queries plus additional data that is captured. In one or more embodiments, responsive to receiving the query, the content recognition service can employ a multi-stage matching technique to identify content items responding to the query. This matching technique can be employed as queries are progressively received.

6. Granted patent

US07904293B2 Sub-band voice codec with multi-stage codebooks and redundant coding (in force)
Publication No.: US07904293B2
Publication date: 2011-03-08
Application No.: US11973689
Filing date: 2007-10-09
Applicants: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
Inventors: Tian Wang, Kazuhito Koishida, Hosam A. Khalil, Xiaoqin Sun, Wei-Ge Chen
IPC classification: G10L15/00
CPC classification: G10L19/005, G10L19/12, G10L2019/0005
Abstract: Techniques and tools related to coding and decoding of audio information are described. For example, redundant coded information for decoding a current frame includes signal history information associated with only a portion of a previous frame. As another example, redundant coded information for decoding a coded unit includes parameters for a codebook stage to be used in decoding the current coded unit only if the previous coded unit is not available. As yet another example, coded audio units each include a field indicating whether the coded unit includes main encoded information representing a segment of an audio signal, and whether the coded unit includes redundant coded information for use in decoding main encoded information.

7. Patent application

US20100280827A1 NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE (in force)
Publication No.: US20100280827A1
Publication date: 2010-11-04
Application No.: US12433143
Filing date: 2009-04-30
Applicants: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
Inventors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
IPC classification: G10L15/00
CPC classification: G10L15/142, G10L15/16, G10L15/197, G10L15/20, G10L21/0208, G10L21/0216, G10L25/18, G10L25/93
Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

8. Patent application

US20090125315A1 TRANSCODER USING ENCODER GENERATED SIDE INFORMATION (in force)
Publication No.: US20090125315A1
Publication date: 2009-05-14
Application No.: US11938194
Filing date: 2007-11-09
Applicants: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
IPC classification: G10L19/00
CPC classification: G10L19/173
Abstract: An audio encoder encodes side information into a compressed audio bitstream containing encoding parameters used by the encoder for one or more encoding techniques, such as a noise-mask-ratio curve used for rate control. A transcoder uses the encoder generated side information to transcode the audio from the original compressed bitstream having an initial bit-rate into a second bitstream having a new bit-rate. Because the side information is derived from the original audio, the transcoder is able to better maintain audio quality of the transcoding. The side information also allows the transcoder to re-encode from an intermediate decoding/encoding stage for faster and lower complexity transcoding.

9. Patent application

US20080312758A1 CODING OF SPARSE DIGITAL MEDIA SPECTRAL DATA (in force)
Publication No.: US20080312758A1
Publication date: 2008-12-18
Application No.: US11764108
Filing date: 2007-06-15
Applicants: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
Inventors: Kazuhito Koishida, Sanjeev Mehrotra, Wei-Ge Chen
IPC classification: G06F17/00
CPC classification: G10L19/02, G10L19/0212, G10L19/032, G10L19/18
Abstract: An audio encoder/decoder provides efficient compression of spectral transform coefficient data characterized by sparse spectral peaks. The audio encoder/decoder applies a temporal prediction of the frequency position of spectral peaks. The spectral peaks in the transform coefficients that are predicted from those in a preceding transform coding block are encoded as a shift in frequency position from the previous transform coding block and two non-zero coefficient levels. The prediction may avoid coding very large zero-level transform coefficient runs as compared to conventional run length coding. For spectral peaks not predicted from those in a preceding transform coding block, the spectral peaks are encoded as a value trio of a length of a run of zero-level spectral transform coefficients, and two non-zero coefficient levels.
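The "value trio" coding of unpredicted peaks (a zero-run length followed by two coefficient levels) can be sketched in a few lines. This is a simplified illustration, not the patented codec: it assumes every peak occupies exactly two adjacent coefficients, it omits entropy coding and the temporal-prediction mode, and the function names are hypothetical.

```python
def encode_trios(coeffs):
    """Encode sparse coefficients as (zero_run, level0, level1) trios.

    Returns the trios plus the block length, which the decoder needs
    to restore trailing zeros.
    """
    trios = []
    run = 0
    i = 0
    while i < len(coeffs):
        if coeffs[i] == 0:
            run += 1
            i += 1
        else:
            # Assumed: a peak spans two adjacent coefficients. If the neighbor is
            # zero or past the end of the block, the second level is stored as 0.
            second = coeffs[i + 1] if i + 1 < len(coeffs) else 0
            trios.append((run, coeffs[i], second))
            run = 0
            i += 2
    return trios, len(coeffs)


def decode_trios(trios, length):
    """Rebuild the coefficient block from trios and the block length."""
    out = []
    for run, level0, level1 in trios:
        out.extend([0] * run)
        out.extend([level0, level1])
    out = out[:length]                      # drop padding from a peak at the last index
    out.extend([0] * (length - len(out)))   # restore trailing zeros
    return out
```

In the predicted mode described in the abstract, the zero-run field would be replaced by a shift relative to the peak position in the preceding block, which avoids spending bits on very long zero runs.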

10. Granted patent

US07454332B2 Gain constrained noise suppression (in force)
Publication No.: US07454332B2
Publication date: 2008-11-18
Application No.: US10869467
Filing date: 2004-06-15
Applicants: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen
Inventors: Kazuhito Koishida, Feng Zhuge, Hosam A. Khalil, Tian Wang, Wei-ge Chen
IPC classification: G10L21/02, G10L19/14
CPC classification: G10L21/0208, G10L21/0232
Abstract: A gain-constrained noise suppression for speech more precisely estimates noise, including during speech, to reduce musical noise artifacts introduced from noise suppression. The noise suppression operates by applying a spectral gain G(m, k) to each short-time spectrum value S(m, k) of a speech signal, where m is the frame number and k is the spectrum index. The spectrum values are grouped into frequency bins, and a noise characteristic estimated for each bin classified as a “noise bin.” An energy parameter is smoothed in both the time domain and the frequency domain to improve noise estimation per bin. The gain factors G(m, k) are calculated based on the current signal spectrum and the noise estimation, then smoothed before being applied to the signal spectral values S(m, k). First, a noisy factor is computed based on a ratio of the number of noise bins to the total number of bins for the current frame, where a zero-valued noisy factor means only using constant gain for all the spectrum values and noisy factor of one means no smoothing at all. Then, this noisy factor is used to alter the gain factors, such as by cutting off the high frequency components of the gain factors in the frequency domain.
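The gain-smoothing step can be sketched for a single frame. The noisy factor is computed as the abstract states (noise bins divided by total bins). The smoothing itself is a simplified stand-in: instead of cutting off high-frequency components of the gain factors, this sketch linearly blends each gain toward the frame's mean gain, which reproduces the two endpoints the abstract names (factor 0 gives a constant gain, factor 1 leaves the gains unsmoothed). The function names are hypothetical.

```python
def noisy_factor(is_noise_bin):
    """Ratio of bins classified as noise to the total number of bins in the frame."""
    if not is_noise_bin:
        raise ValueError("frame has no bins")
    return sum(1 for flag in is_noise_bin if flag) / len(is_noise_bin)


def constrain_gains(gains, factor):
    """Blend per-bin gains G(m, k) toward their mean (assumed smoothing rule).

    factor == 0.0 -> every bin gets the same (mean) gain
    factor == 1.0 -> gains are returned unchanged (no smoothing)
    """
    mean_gain = sum(gains) / len(gains)
    return [mean_gain + factor * (g - mean_gain) for g in gains]


def suppress(spectrum, gains):
    """Apply the gains to the short-time spectrum values S(m, k) of one frame."""
    return [g * s for g, s in zip(gains, spectrum)]
```

A usage example for one frame: `suppress(spectrum, constrain_gains(raw_gains, noisy_factor(flags)))`, where `raw_gains` come from whatever per-bin gain rule is in use and `flags` mark the noise bins.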
