Quick Patent Search - Search global patents quickly, free patent database for commercial use - IPRDB

1. Invention application

US20190341069A1 Voice Activity Detector for Audio Signals (Under examination, published)
Publication number: US20190341069A1
Publication date: 2019-11-07
Application number: US16516634
Filing date: 2019-07-19
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
IPC classification: G10L25/78, G10L19/018, G10L21/0364, G10L25/93, G10L21/02, G10L19/012
Abstract: According to one aspect, a method for determining voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate, and splitting the audio signal into a plurality of subbands, the plurality of subbands including at least a lowest subband and a highest subband. The method further comprises filtering the lowest subband to reduce an energy of the lowest subband, estimating a noise level for at least some of the plurality of subbands, and computing a signal-to-noise ratio for at least some of the plurality of subbands. The method also includes determining a speech activity level based at least in part on the computed signal-to-noise ratios and an average of an energy of at least some of the plurality of subbands.
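The steps named in this abstract can be illustrated with a minimal sketch. It is not the patented method: the class name `SubbandVad` and every parameter are hypothetical, the filterbank is omitted (per-subband energies are passed in directly), a fixed gain stands in for the unspecified low-band filter, the noise tracker is a simple minimum follower, and the mapping from mean SNR to an activity level is an assumed linear ramp gated by mean energy.

```python
import math


class SubbandVad:
    """Toy subband voice activity detector (illustrative assumptions only)."""

    def __init__(self, num_bands, low_band_gain=0.5, noise_rise=1.05,
                 snr_full_db=15.0, energy_floor=1e-6):
        self.num_bands = num_bands
        self.low_band_gain = low_band_gain  # stand-in for the low-band filter
        self.noise_rise = noise_rise        # how fast the noise estimate may grow
        self.snr_full_db = snr_full_db      # mean SNR that maps to level 1.0
        self.energy_floor = energy_floor    # frames below this are never speech
        self.noise = [None] * num_bands     # per-subband noise estimates

    def process(self, energies):
        """Take one frame of per-subband energies, return a level in [0, 1]."""
        if len(energies) != self.num_bands:
            raise ValueError("unexpected number of subbands")
        energies = [max(e, 1e-12) for e in energies]
        # Reduce the energy of the lowest subband (assumed fixed attenuation).
        energies[0] *= self.low_band_gain
        snrs_db = []
        for i, e in enumerate(energies):
            # Noise estimate: follow minima immediately, rise slowly otherwise.
            if self.noise[i] is None:
                self.noise[i] = e
            else:
                self.noise[i] = min(e, self.noise[i] * self.noise_rise)
            snrs_db.append(10.0 * math.log10(e / self.noise[i]))
        mean_snr_db = sum(snrs_db) / len(snrs_db)
        mean_energy = sum(energies) / len(energies)
        if mean_energy < self.energy_floor:
            return 0.0
        # Assumed mapping: linear ramp from 0 dB up to snr_full_db.
        return min(1.0, max(0.0, mean_snr_db / self.snr_full_db))
```

Feeding a run of steady low-energy frames lets the noise estimate settle, after which a sudden rise in energy across the subbands yields a high activity level.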

2. Granted invention patent

US10362270B2 Multimodal spatial registration of devices for congruent multimedia communications (In force)
Publication number: US10362270B2
Publication date: 2019-07-23
Application number: US15838728
Filing date: 2017-12-12
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Erwin Goesnar, Hannes Muesch, David Gunawan, Michael Eckert, Glenn N. Dickins
IPC classification: H04N7/15, H04M3/56, H04N7/14, H04L12/18, G01S3/80, G06T7/70, G06K9/32, H04R3/12, H04S7/00
Abstract: Systems and methods are described for determining orientation of an external audio device in a video conference, which may be used to provide congruent multimodal representation for a video conference. A camera of a video conferencing system may be used to detect a potential location of an external audio device within a room in which the video conferencing system is providing a video conference. Within the detected potential location, a visual pattern associated with the external audio device may be identified. Using the identified visual pattern, the video conferencing system may estimate an orientation of the external audio device, the orientation being used by the video conferencing system to provide spatial audio video congruence to a far end audience.

3. Granted invention patent

US10141004B2 Hybrid waveform-coded and parametric-coded speech enhancement (In force)
Publication number: US10141004B2
Publication date: 2018-11-27
Application number: US14914572
Filing date: 2014-08-27
Applicants: DOLBY LABORATORIES LICENSING CORPORATION, DOLBY INTERNATIONAL AB
Inventors: Jeroen Koppens, Hannes Muesch
IPC classification: G10L21/0364, G10L19/008, H04R5/04, G10L19/20, G10L19/22, G10L21/0324, H04S3/00
Abstract: A method for hybrid speech enhancement which employs parametric-coded enhancement (or a blend of parametric-coded and waveform-coded enhancement) under some signal conditions and waveform-coded enhancement (or a different blend of parametric-coded and waveform-coded enhancement) under other signal conditions. Other aspects are methods for generating a bitstream indicative of an audio program including speech and other content, such that hybrid speech enhancement can be performed on the program, a decoder including a buffer which stores at least one segment of an encoded audio bitstream generated by any embodiment of the inventive method, and a system or device (e.g., an encoder or decoder) configured (e.g., programmed) to perform any embodiment of the inventive method. At least some of the speech enhancement operations are performed by a recipient audio decoder with Mid/Side speech enhancement metadata generated by an upstream audio encoder.

4. Granted invention patent

US10812759B2 Multimodal spatial registration of devices for congruent multimedia communications (In force)
Publication number: US10812759B2
Publication date: 2020-10-20
Application number: US16518887
Filing date: 2019-07-22
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Erwin Goesnar, Hannes Muesch, David Gunawan, Michael Eckert, Glenn N. Dickins
IPC classification: H04N7/15, H04M3/56, H04N7/14, H04L12/18, G01S3/80, G06T7/70, G01S5/18, H04R3/12, H04S7/00, G06K9/32, G06K9/00
Abstract: Systems and methods are described for determining orientation of an external audio device in a video conference, which may be used to provide congruent multimodal representation for a video conference. A camera of a video conferencing system may be used to detect a potential location of an external audio device within a room in which the video conferencing system is providing a video conference. Within the detected potential location, a visual pattern associated with the external audio device may be identified. Using the identified visual pattern, the video conferencing system may estimate an orientation of the external audio device, the orientation being used by the video conferencing system to provide spatial audio video congruence to a far end audience.

5. Granted invention patent

US09985855B2 Call quality estimation by lost packet classification (In force)
Publication number: US09985855B2
Publication date: 2018-05-29
Application number: US14410753
Filing date: 2013-06-26
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventors: Glenn N. Dickins, Hannes Muesch
IPC classification: H04L1/00, H04L12/26, H04L29/06, H04L12/741, H04L12/851, H04L12/835, H04L12/801
CPC classification: H04L43/0829, H04L45/74, H04L47/2441, H04L47/30, H04L47/34, H04L65/80
Abstract: Described are: a method, an apparatus, and a tangible computer-readable storage medium comprising instructions to instruct one or more processors to carry out a method. One set of methods is for the transmit side of a communication link and another set of methods is for the receive side. A transmit side method includes assigning one of a set of classifications to media, e.g., voice/audio packets transmitted in a sequence, different classifications impacting differently a measure of perceptual quality calculated at the receive side if packets of the respective classifications are lost. A present packet is sent to the receive side containing the classification of a previous packet.
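The idea of carrying the previous packet's classification in the present packet can be sketched as follows. This is an illustration under stated assumptions, not the patented method: the class labels, the `PENALTY` table, and the names `Packet`, `packetize`, and `loss_penalty` are all hypothetical, and the quality measure is reduced to a simple sum of per-class penalties.

```python
from dataclasses import dataclass
from typing import Optional

# Hypothetical perceptual penalty for losing a packet of each class.
PENALTY = {"silence": 0.0, "speech": 1.0, "onset": 2.0}


@dataclass
class Packet:
    seq: int
    payload: bytes
    prev_class: Optional[str]  # classification of the packet sent just before


def packetize(frames):
    """Transmit side: frames is a list of (payload, classification) pairs."""
    packets = []
    prev = None
    for seq, (payload, cls) in enumerate(frames):
        packets.append(Packet(seq, payload, prev))
        prev = cls
    return packets


def loss_penalty(received, total):
    """Receive side: sum penalties for lost packets whose class is known.

    Returns (penalty, unknown), where unknown counts lost packets whose
    successor was also lost, so their classification never arrived.
    """
    by_seq = {p.seq: p for p in received}
    penalty = 0.0
    unknown = 0
    for seq in range(total):
        if seq in by_seq:
            continue
        successor = by_seq.get(seq + 1)
        if successor is not None and successor.prev_class is not None:
            penalty += PENALTY[successor.prev_class]
        else:
            unknown += 1
    return penalty, unknown
```

With this layout the receiver can weigh a lost speech onset more heavily than a lost silence packet, provided the following packet arrives.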

6. Invention application

US20150243300A1 Voice Activity Detector for Audio Signals (In force)
Publication number: US20150243300A1
Publication date: 2015-08-27
Application number: US14701622
Filing date: 2015-05-01
Applicant: DOLBY LABORATORIES LICENSING CORPORATION
Inventor: Hannes Muesch
IPC classification: G10L25/78, G10L19/012
CPC classification: G10L25/78, G10L19/012, G10L19/018, G10L21/02, G10L21/0205, G10L21/0364, G10L25/93, G10L2025/932, G10L2025/937
Abstract: According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.

7. Granted invention patent

US10812401B2 Jitter buffer apparatus and method (In force)
Publication number: US10812401B2
Publication date: 2020-10-20
Application number: US16084932
Filing date: 2017-03-16
Applicant: Dolby Laboratories Licensing Corporation
Inventors: Richard J. Cartwright, Hannes Muesch
IPC classification: H04L12/841, H04L29/06, H04L1/20, H04L1/00, H04L12/26, H04L12/835, H04L12/939
Abstract: Disclosed are an apparatus and a method operative to receive packets of media from a network, including: a receiver unit operative to receive the packets from the network; a jitter buffer data structure for receiving the packets in an ordered queue, the jitter buffer data structure having a tail into which the packets are input; a plurality of heads defining points in the jitter buffer data structure from which the ordered queue of packets is to be played back, the heads comprising an adjustable actual playback head coupled to an actual playback unit and at least one prototype head, each prototype head having associated therewith a target latency; a processor having decision logic operable to determine a cost of achieving the associated target latency for each prototype head, wherein the decision logic compares the costs determined for each prototype head to identify a particular target latency and head location for the actual playback head of the buffer; and a playback unit coupled to the processor for actual playback of the playback head of the buffer, such that the particular target latency of the jitter buffer data structure is determined at playback of the buffer rather than upon input of the packets into the jitter buffer data structure.
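The comparison of costs across prototype heads can be sketched in a few lines. The abstract does not say how the cost is computed, so the cost function here is an assumption: a per-millisecond latency charge plus a charge for the fraction of packets that would have arrived too late for a given target latency. The function name and both weights are hypothetical.

```python
def choose_target_latency(arrival_delays_ms, prototype_latencies_ms,
                          latency_cost_per_ms=0.01, late_packet_cost=1.0):
    """Return (target_latency_ms, cost) for the cheapest prototype head.

    arrival_delays_ms: observed network delay of each recent packet.
    prototype_latencies_ms: candidate target latencies, one per prototype head.
    """
    if not arrival_delays_ms or not prototype_latencies_ms:
        raise ValueError("need at least one delay and one prototype latency")
    best = None
    for target in prototype_latencies_ms:
        # Packets delayed beyond the target would miss their playback time.
        late = sum(1 for d in arrival_delays_ms if d > target)
        late_fraction = late / len(arrival_delays_ms)
        cost = latency_cost_per_ms * target + late_packet_cost * late_fraction
        if best is None or cost < best[1]:
            best = (target, cost)
    return best
```

Because every prototype head is evaluated against the same buffered packets, the choice of latency can be made at playback time, and changing the weights shifts the trade-off between delay and late packets.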

8. Granted invention patent

US10418052B2 Voice activity detector for audio signals (In force)
Publication number: US10418052B2
Publication date: 2019-09-17
Application number: US15730908
Filing date: 2017-10-12
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
IPC classification: G10L25/78, G10L21/02, G10L21/0364, G10L19/012, G10L19/018, G10L25/93
Abstract: According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.

9. Granted invention patent

US09418680B2 Voice activity detector for audio signals (In force)
Publication number: US09418680B2
Publication date: 2016-08-16
Application number: US14701622
Filing date: 2015-05-01
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
IPC classification: G10L25/78, G10L21/02, G10L21/0364, G10L19/012
CPC classification: G10L25/78, G10L19/012, G10L19/018, G10L21/02, G10L21/0205, G10L21/0364, G10L25/93, G10L2025/932, G10L2025/937
Abstract: According to one aspect, a method for detecting voice activity is disclosed, the method including receiving a frame of an input audio signal, the input audio signal having a sample rate; dividing the frame into a plurality of subbands based on the sample rate, the plurality of subbands including at least a lowest subband and a highest subband; filtering the lowest subband with a moving average filter to reduce an energy of the lowest subband; estimating a noise level for each of the plurality of subbands; calculating a signal to noise ratio value for each of the plurality of subbands; and determining a speech activity level of the frame based on an average of the calculated signal to noise ratio values and a weighted average of an energy of each of the plurality of subbands. Other aspects include audio decoders that decode audio that was encoded using the methods described herein.

10. Invention application

US20160071527A1 Method and System for Scaling Ducking of Speech-Relevant Channels in Multi-Channel Audio (In force)
Publication number: US20160071527A1
Publication date: 2016-03-10
Application number: US14942706
Filing date: 2015-11-16
Applicant: Dolby Laboratories Licensing Corporation
Inventor: Hannes Muesch
IPC classification: G10L21/0364, G10L21/034
CPC classification: G10L21/0364, G10L21/0208, G10L21/0232, G10L21/034, H04S3/008, H04S7/30, H04S2400/09, H04S2400/13
Abstract: A method and system for filtering a multi-channel audio signal having a speech channel and at least one non-speech channel, to improve intelligibility of speech determined by the signal. In typical embodiments, the method includes steps of determining at least one attenuation control value indicative of a measure of similarity between speech-related content determined by the speech channel and speech-related content determined by the non-speech channel, and attenuating the non-speech channel in response to the at least one attenuation control value. Typically, the attenuating step includes scaling of a raw attenuation control signal (e.g., a ducking gain control signal) for the non-speech channel in response to the at least one attenuation control value. Some embodiments are a general or special purpose processor programmed with software or firmware and/or otherwise configured to perform filtering in accordance with the invention.
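Scaling a raw ducking gain by a similarity-based control value might look like the sketch below. Several things here are assumptions rather than statements from the abstract: the similarity measure is a clamped cosine similarity over hypothetical per-channel feature vectors, the scaling rule reduces ducking as similarity rises (on the reading that similar content in a non-speech channel should be attenuated less), and all function names are illustrative.

```python
import math


def similarity(speech_features, other_features):
    """Cosine similarity of two feature vectors, clamped to [0, 1]."""
    dot = sum(a * b for a, b in zip(speech_features, other_features))
    norm_a = math.sqrt(sum(a * a for a in speech_features))
    norm_b = math.sqrt(sum(b * b for b in other_features))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return max(0.0, min(1.0, dot / (norm_a * norm_b)))


def scaled_ducking_gain_db(raw_gain_db, control_value):
    """Scale a raw ducking gain (in dB, usually negative) by the control value.

    Assumed rule: control_value 0 keeps the full ducking, 1 removes it.
    """
    return raw_gain_db * (1.0 - control_value)


def apply_gain(samples, gain_db):
    """Apply a gain given in decibels to a list of samples."""
    gain = 10.0 ** (gain_db / 20.0)
    return [s * gain for s in samples]
```

For example, a raw ducking gain of -12 dB stays at -12 dB when the two channels share no speech-related content, and shrinks toward 0 dB as the similarity approaches 1.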
