    • 4. Granted invention patent
    • Method for tone/intonation recognition using auditory attention cues
    • US08676574B2
    • 2014-03-18
    • US12943774
    • 2010-11-10
    • Ozlem Kalinli
    • Ozlem Kalinli
    • G10L11/04
    • G10L15/1807; G10L17/26; G10L25/90
    • In a spoken language processing method for tone/intonation recognition, an auditory spectrum may be determined for an input window of sound and one or more multi-scale features may be extracted from the auditory spectrum. Each multi-scale feature can be extracted using a separate two-dimensional spectro-temporal receptive filter. One or more feature maps corresponding to the one or more multi-scale features can be generated and an auditory gist vector can be extracted from each of the one or more feature maps. A cumulative gist vector may be obtained through augmentation of each auditory gist vector extracted from the one or more feature maps. One or more tonal characteristics corresponding to the input window of sound can be determined by mapping the cumulative gist vector to one or more tonal characteristics using a machine learning algorithm.
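The last two steps of the abstract above (one gist vector per feature map, then a cumulative gist vector) can be sketched as follows. This is a minimal illustration, not the patented method: the abstract does not say how a gist vector is computed, so averaging over a grid of cells is an assumption, "augmentation" is read here as concatenation, and the function names and grid sizes are hypothetical.

```python
def gist_vector(feature_map, rows=2, cols=2):
    """Summarise a 2-D feature map as the mean of each cell in a rows x cols grid.

    Assumption: grid-cell averaging stands in for the gist extraction,
    which the abstract leaves unspecified.
    """
    h, w = len(feature_map), len(feature_map[0])
    gist = []
    for r in range(rows):
        for c in range(cols):
            r0, r1 = r * h // rows, (r + 1) * h // rows
            c0, c1 = c * w // cols, (c + 1) * w // cols
            cell = [feature_map[i][j] for i in range(r0, r1) for j in range(c0, c1)]
            gist.append(sum(cell) / len(cell))
    return gist


def cumulative_gist(feature_maps, rows=2, cols=2):
    """Concatenate the gist vectors of all feature maps into one vector."""
    out = []
    for fm in feature_maps:
        out.extend(gist_vector(fm, rows, cols))
    return out
```

The cumulative vector would then be the input to whatever machine learning model maps it to tonal characteristics; that model is outside this sketch.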
    • 5. Granted invention patent
    • Encoding device, decoding device, and method thereof
    • US08560328B2
    • 2013-10-15
    • US12518371
    • 2007-12-14
    • Tomofumi Yamanashi; Masahiro Oshikiri
    • Tomofumi Yamanashi; Masahiro Oshikiri
    • G10L19/00; G10L11/04; G01L11/06; G10L15/20; G10L21/04
    • G10L19/24; G10L21/038
    • A decoding device is capable of flexibly calculating high-band spectrum data with a high accuracy in accordance with an encoding band selected by an upper-node layer of the encoding side. In this device: a first layer decoder decodes first layer encoded information to generate a first layer decoded signal; a second layer decoder decodes second layer encoded information to generate a second layer decoded signal; a spectrum decoder performs a band extension process by using the second layer decoded signal and the first layer decoded signal up-sampled in an up-sampler so as to generate an all-band decoded signal; and a switch outputs the first layer decoded signal or the all-band decoded signal according to the control information generated in a controller.
    • 6. Granted invention patent
    • Adaptive audio signal source vector quantization device and adaptive audio signal source vector quantization method that search for pitch period based on variable resolution
    • US08521519B2
    • 2013-08-27
    • US12528661
    • 2008-02-29
    • Kaoru Sato; Toshiyuki Morii
    • Kaoru Sato; Toshiyuki Morii
    • G10L11/04
    • G10L25/90; G10L19/038; G10L19/09
    • An adaptive sound source vector quantization device includes a first pitch cycle instructor, a search range calculator, and a second pitch cycle instructor. The first pitch cycle instructor successively instructs pitch cycle search candidates in a predetermined search range having a search resolution which transits over a predetermined pitch cycle candidate for the first sub-frame. The search range calculator calculates a predetermined range before and after the pitch cycle of the first sub-frame as the pitch cycle search range for the second sub-frame, if the predetermined range includes the predetermined pitch cycle search candidate. In the predetermined range, the search resolution transits over a boundary defined by the predetermined pitch cycle. The second pitch cycle instructor successively instructs the pitch cycle search candidates in the search range for the second sub-frame.
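The variable-resolution search described in the abstract above can be illustrated with a small sketch. It assumes half-sample resolution below a boundary pitch period and integer resolution at or above it, and a second sub-frame range of plus or minus a fixed delta around the first sub-frame's pitch period. All numeric values and function names are hypothetical; the abstract does not give them.

```python
from fractions import Fraction


def pitch_candidates(lo, hi, boundary, fine_step=Fraction(1, 2)):
    """List pitch-period candidates in [lo, hi].

    Candidates below `boundary` are spaced by `fine_step`; candidates at or
    above it are spaced by one sample, so the resolution changes at the boundary.
    """
    out = []
    t = Fraction(lo)
    while t <= hi:
        out.append(t)
        t += fine_step if t < boundary else 1
    return out


def second_subframe_range(t1, delta, lo, hi):
    """Search range for the second sub-frame: t1 +/- delta, clipped to [lo, hi]."""
    return max(Fraction(lo), t1 - delta), min(Fraction(hi), t1 + delta)
```

For example, with a boundary of 40 and a first sub-frame pitch period of 39.5, `second_subframe_range(Fraction(79, 2), 2, 20, 143)` gives the range 37.5 to 41.5, and `pitch_candidates` over that range yields half-sample steps up to 40 followed by integer steps.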
    • 7. Granted invention patent
    • Separating speech waveforms into periodic and aperiodic components, using artificial waveform generated from pitch marks
    • US08438014B2
    • 2013-05-07
    • US13358702
    • 2012-01-26
    • Masahiro Morita; Javier Latorre; Takehiko Kagoshima
    • Masahiro Morita; Javier Latorre; Takehiko Kagoshima
    • G10L11/06; G10L11/04
    • G10L25/93; G10L25/90
    • According to one embodiment, in a speech processing device, an extractor windows a part of the speech signal and extracts a partial waveform. A calculator performs frequency analysis of the partial waveform to calculate a frequency spectrum. An estimator generates an artificial waveform that is a waveform according to an interval between the pitch marks for each harmonic component having a frequency that is a predetermined multiple of a fundamental frequency of the speech signal and estimates harmonic spectral features representing characteristics of the frequency spectrum of the harmonic component from each of the artificial waveforms. A separator separates the partial waveform into a periodic component produced from periodic vocal-fold vibration as an acoustic source and an aperiodic component produced from aperiodic acoustic sources other than the vocal-fold vibration by using the respective harmonic spectral features and the frequency spectrum of the partial waveform.
    • 8. Granted invention patent
    • Time warped modified transform coding of audio signals
    • US08412518B2
    • 2013-04-02
    • US12697137
    • 2010-01-29
    • Lars Villemoes
    • Lars Villemoes
    • G10L11/04; G10L21/04; G10L15/00; G10L19/02; G10L19/14; G10L13/06; G10L13/08; G10L19/00
    • G10L19/002; G10L19/0212; G10L19/022
    • A representation of an audio signal having a first frame, a second frame following the first frame, and a third frame following the second frame, is derived by estimating first warp information for the first and the second frame and second warp information for the second frame and the third frame, the warp information describing a pitch information of the audio signal. First spectral coefficients for the first and the second frame are derived using the first warp information and a first weighted representation of the first and the second frame, the first weighted representation derived by applying a first window function to the first and the second frames, wherein the first window function depends on the first warp information. Second spectral coefficients for the second and the third frame are derived using the second warp information and a second weighted representation of the second and the third frame, the second weighted representation derived by applying a second window function to the second and the third frames, wherein the second window function depends on the second warp information. The representation of the audio signal is generated including the first and the second spectral coefficients.
    • 9. Granted invention patent
    • Scalable lossless audio codec and authoring tool
    • US08374858B2
    • 2013-02-12
    • US12720365
    • 2010-03-09
    • Zoran Fejzo
    • Zoran Fejzo
    • G10L19/00; G10L19/14; G10L11/04; G10L19/02; G10L21/04
    • G10L19/0017; G10L19/24
    • An audio codec losslessly encodes audio data into a sequence of analysis windows in a scalable bitstream. This is suitably done by separating the audio data into MSB and LSB portions and encoding each with a different lossless algorithm. An authoring tool compares the buffered payload to an allowed payload for each window and selectively scales the losslessly encoded audio data, suitably the LSB portion, in the non-conforming windows to reduce the encoded payload, hence buffered payload. This approach satisfies the media bit rate and buffer capacity constraints without having to filter the original audio data, reencode or otherwise disrupt the lossless bitstream.
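The MSB/LSB separation mentioned in the abstract above can be sketched on signed integer PCM samples. This shows only the split, the lossless recombination, and one possible way of scaling the LSB portion (zeroing its lowest bits, which is an assumption). The entropy coding of each portion and the authoring tool's payload check are not shown, and the function names and the 8-bit split point are hypothetical.

```python
def split_msb_lsb(samples, lsb_bits=8):
    """Split signed integer samples into an MSB part and an unsigned LSB part."""
    mask = (1 << lsb_bits) - 1
    msb = [s >> lsb_bits for s in samples]  # arithmetic shift keeps the sign
    lsb = [s & mask for s in samples]       # low bits, always in 0..mask
    return msb, lsb


def merge_msb_lsb(msb, lsb, lsb_bits=8):
    """Recombine the two parts; exact inverse of split_msb_lsb."""
    return [(m << lsb_bits) | l for m, l in zip(msb, lsb)]


def scale_lsb(lsb, drop_bits):
    """Zero the lowest drop_bits of each LSB word (assumed form of scaling).

    The window is no longer lossless after this step, but the error per
    sample stays below 2**drop_bits.
    """
    return [(l >> drop_bits) << drop_bits for l in lsb]
```

Applying `scale_lsb` only to the windows that exceed the allowed payload would leave all other windows bit-exact, which matches the selective scaling the abstract describes.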
    • 10. Granted invention patent
    • Voice encoding device and voice encoding method
    • US08364472B2
    • 2013-01-29
    • US12528880
    • 2008-02-29
    • Hiroyuki Ehara
    • Hiroyuki Ehara
    • G10L11/04
    • G10L19/005; G10L19/09; G10L2019/0011
    • Provided is an audio encoding device which can detect an optimal pitch pulse when using pitch pulse information as redundant information. The device includes: a search start decision unit (121) which decides the oldest point among a plurality of points where a pitch pulse may exist; a pitch pulse candidate selection unit (122) which defines a search range as a range between the search start point and the point preceding the point of the head of the current frame by one and selects a decoding sound source vector having a large amplitude in this search range as a pitch pulse position candidate; a selector switch (125) which successively switches a plurality of pitch pulse position candidates inputted from a pitch pulse candidate selection unit (122) for output to a pulse sequence generation unit (123) and an error minimization unit (124); a pulse sequence generation unit (123) which generates as a pulse sequence, a vector generated as an adaptive codebook component from the pitch pulse in the current frame when a pitch pulse is set to be a pitch pulse position candidate inputted from the selector switch (125).
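The candidate selection step in the abstract above (picking large-amplitude positions between the search start point and the sample just before the head of the current frame) might look like the sketch below. Ranking by absolute amplitude and keeping the top `k` positions is an assumption, as are the function and parameter names. The pulse sequence generation and error minimization that choose among the candidates are not shown.

```python
def pitch_pulse_candidates(excitation, search_start, frame_head, k=3):
    """Return up to k positions with the largest absolute amplitude.

    The search range runs from search_start to frame_head - 1 inclusive,
    i.e. it stops one sample before the head of the current frame.
    """
    positions = range(search_start, frame_head)
    ranked = sorted(positions, key=lambda i: abs(excitation[i]), reverse=True)
    return ranked[:k]
```

With `excitation = [0.1, -0.9, 0.2, 0.7, -0.3, 0.05, 0.8, 0.0]`, `search_start=1`, `frame_head=6` and `k=2`, the candidates are positions 1 and 3; position 6 is ignored because it lies in the current frame.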