    • 2. Invention patent application
    • Voice processing apparatus and program
    • US20060004569A1
    • 2006-01-05
    • US11165695
    • 2005-06-24
    • Yasuo Yoshioka, Alex Loscos
    • Yasuo Yoshioka, Alex Loscos
    • G10L19/14
    • G10L13/033; G10L2021/0135
    • An envelope identification section generates input envelope data (DEVin) indicative of a spectral envelope (EVin) of an input voice. A template acquisition section reads out, from a storage section, converting spectrum data (DSPt) indicative of a frequency spectrum (SPt) of a converting voice. On the basis of the input envelope data (DEVin) and the converting spectrum data (DSPt), a data generation section specifies a frequency spectrum (SPnew) that corresponds in shape to the frequency spectrum (SPt) of the converting voice and has substantially the same spectral envelope as the spectral envelope (EVin) of the input voice, and generates new spectrum data (DSPnew) indicative of the frequency spectrum (SPnew). A reverse FFT section and an output processing section generate an output voice signal (Snew) on the basis of the new spectrum data (DSPnew).
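The abstract above does not say how the new spectrum SPnew is computed. The following is a minimal sketch of one plausible reading: estimate the envelope of the converting spectrum SPt, divide it out, and multiply by the input envelope EVin. The function names, the peak-interpolation envelope estimate, and the toy five-bin spectra are all assumptions made for illustration; the inverse FFT and output stages are left out.

```python
def spectral_envelope(mag):
    """Rough envelope: straight lines drawn between local peaks of a magnitude spectrum.

    This estimator is an assumption for illustration, not taken from the patent text.
    """
    n = len(mag)
    if n < 2:
        return list(mag)
    peaks = [i for i in range(1, n - 1)
             if mag[i] >= mag[i - 1] and mag[i] > mag[i + 1]]
    anchors = [0] + peaks + [n - 1]  # pin both ends so every bin is covered
    env = [0.0] * n
    for a, b in zip(anchors, anchors[1:]):
        for i in range(a, b + 1):
            t = (i - a) / (b - a)
            env[i] = mag[a] + t * (mag[b] - mag[a])
    return env


def transfer_envelope(sp_t, ev_in, eps=1e-12):
    """Keep the fine shape of sp_t (SPt) but impose the envelope ev_in (EVin)."""
    ev_t = spectral_envelope(sp_t)
    return [s * e_in / max(e_t, eps) for s, e_in, e_t in zip(sp_t, ev_in, ev_t)]


# Toy data (hypothetical): a converting spectrum with two peaks and a flat input envelope.
sp_t = [1.0, 4.0, 1.0, 2.0, 1.0]
ev_in = [2.0, 2.0, 2.0, 2.0, 2.0]
sp_new = transfer_envelope(sp_t, ev_in)
```

In this toy case the envelope of `sp_new` equals `ev_in`, while the dip between the two peaks of `sp_t` survives in `sp_new`, which is the property the abstract describes.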
    • 5. Granted invention patent
    • Voice processing apparatus and program
    • US08073688B2
    • 2011-12-06
    • US11165695
    • 2005-06-24
    • Yasuo Yoshioka, Alex Loscos
    • Yasuo Yoshioka, Alex Loscos
    • G10L19/14
    • G10L13/033; G10L2021/0135
    • An envelope identification section generates input envelope data (DEVin) indicative of a spectral envelope (EVin) of an input voice. A template acquisition section reads out, from a storage section, converting spectrum data (DSPt) indicative of a frequency spectrum (SPt) of a converting voice. On the basis of the input envelope data (DEVin) and the converting spectrum data (DSPt), a data generation section specifies a frequency spectrum (SPnew) that corresponds in shape to the frequency spectrum (SPt) of the converting voice and has substantially the same spectral envelope as the spectral envelope (EVin) of the input voice, and generates new spectrum data (DSPnew) indicative of the frequency spectrum (SPnew). A reverse FFT section and an output processing section generate an output voice signal (Snew) on the basis of the new spectrum data (DSPnew).
    • 6. Granted invention patent
    • Sound signal expression mode determining apparatus method and program
    • US08013231B2
    • 2011-09-06
    • US11439818
    • 2006-05-24
    • Takuya Fujishima, Alex Loscos, Jordi Bonada, Oscar Mayor
    • Takuya Fujishima, Alex Loscos, Jordi Bonada, Oscar Mayor
    • G10H1/02
    • G10H1/361; G10H2210/061; G10H2210/091; G10H2250/005; G10H2250/015; G10H2250/235; G10L15/142
    • A sound signal processing apparatus is provided that is capable of correctly detecting expression modes and expression transitions of a song or performance from an input sound signal. A sound signal produced by performance or singing of musical tones is input and divided into frames of predetermined time periods. Characteristic parameters of the input sound signal are detected on a frame-by-frame basis. An expression determining process is then carried out in three steps: a plurality of expression modes of a performance or song are modeled as respective states; the probability that a section including a frame or a plurality of continuous frames lies in a specific state is calculated, based on the characteristic parameters, with respect to a predetermined observed section; and the optimum route of state transition in the predetermined observed section is determined based on the calculated probabilities, so as to determine the expression modes of the sound signal and their lengths.
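Finding an optimum route of state transitions from per-frame state probabilities is the kind of problem the Viterbi algorithm solves, so the expression determining process can be sketched that way. This is an illustrative sketch only: the abstract does not name the algorithm, and the two mode names, the transition probabilities, and the per-frame probabilities below are made-up values.

```python
import math
from itertools import groupby


def viterbi(log_emit, log_trans, log_init):
    """Return the most probable state index per frame (log-domain Viterbi)."""
    n_states = len(log_init)
    score = [log_init[s] + log_emit[0][s] for s in range(n_states)]
    back = []  # back[t][s] = best predecessor of state s at frame t + 1
    for frame in log_emit[1:]:
        prev, score, ptr = score, [], []
        for s in range(n_states):
            best = max(range(n_states), key=lambda p: prev[p] + log_trans[p][s])
            ptr.append(best)
            score.append(prev[best] + log_trans[best][s] + frame[s])
        back.append(ptr)
    state = max(range(n_states), key=lambda s: score[s])
    path = [state]
    for ptr in reversed(back):  # walk the back-pointers from the last frame
        state = ptr[state]
        path.append(state)
    path.reverse()
    return path


def segments(path, names):
    """Collapse a frame-wise path into (expression mode, length in frames) pairs."""
    return [(names[s], len(list(run))) for s, run in groupby(path)]


def logs(rows):
    return [[math.log(p) for p in row] for row in rows]


# Hypothetical modes and probabilities, for illustration only.
names = ["plain", "vibrato"]
emit = logs([[0.9, 0.1], [0.8, 0.2], [0.4, 0.6], [0.9, 0.1],
             [0.1, 0.9], [0.2, 0.8], [0.1, 0.9]])
trans = logs([[0.9, 0.1], [0.1, 0.9]])  # states tend to persist between frames
init = [math.log(0.5), math.log(0.5)]

path = viterbi(emit, trans, init)
result = segments(path, names)
```

With these numbers the third frame slightly favors "vibrato" on its own, but the persistence encoded in `trans` keeps it inside the "plain" segment, so `result` lists one "plain" run followed by one "vibrato" run together with their lengths in frames.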
    • 7. Invention patent application
    • Voice converter for assimilation by frame synthesis with temporal alignment
    • US20050049875A1
    • 2005-03-03
    • US10951328
    • 2004-09-27
    • Takahiro Kawashima, Yasuo Yoshioka, Pedro Cano, Alex Loscos, Xavier Serra, Mark Schiementz, Jordi Bonada
    • Takahiro Kawashima, Yasuo Yoshioka, Pedro Cano, Alex Loscos, Xavier Serra, Mark Schiementz, Jordi Bonada
    • G10L13/02; G10L21/00; G10L13/00
    • G10L13/033; G10L2021/0135
    • A voice converting apparatus is constructed for converting an input voice into an output voice according to a target voice. In the apparatus, a storage section provisionally stores source data, which is associated with and extracted from the target voice. An analyzing section analyzes the input voice to extract therefrom a series of input data frames representing the input voice. A producing section produces a series of target data frames representing the target voice based on the source data, while aligning the target data frames with the input data frames to secure synchronization between the target data frames and the input data frames. A synthesizing section synthesizes the output voice according to the target data frames and the input data frames. In the recognizing feature analysis, a characteristic analyzer extracts a characteristic vector from the input voice. A memory stores target behavior data representing a behavior of the target voice. An alignment processor determines a temporal relation between the input data frames and the target data frames according to the characteristic vector and the target behavior data so as to output alignment data. A target decoder produces the target data frames according to the alignment data, the input data frames, and the source data containing phonemes of the target voice.
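The abstract above does not name the method the alignment processor uses to relate input frames to target frames. As a stand-in, the sketch below uses dynamic time warping (DTW) over per-frame feature vectors, which is a common way to obtain such a temporal relation. DTW itself, the function names, and the one-dimensional toy features are assumptions for illustration, not the patent's method.

```python
import math


def dtw_align(inp, tgt):
    """Align two sequences of feature vectors; return a list of (input_idx, target_idx)."""
    n, m = len(inp), len(tgt)
    inf = float("inf")
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = math.dist(inp[i - 1], tgt[j - 1])  # Euclidean frame distance
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor cell.
    i, j = n, m
    path = []
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min((cost[i - 1][j - 1], i - 1, j - 1),
                      (cost[i - 1][j], i - 1, j),
                      (cost[i][j - 1], i, j - 1))
    path.reverse()
    return path


def target_for_each_input(path, n_input):
    """For every input frame, pick the last target frame aligned to it."""
    mapping = [0] * n_input
    for i, j in path:
        mapping[i] = j
    return mapping


# Toy features (hypothetical): the input lingers on the middle sound for two frames.
inp = [[0.0], [1.0], [1.0], [2.0]]
tgt = [[0.0], [1.0], [2.0]]
path = dtw_align(inp, tgt)
mapping = target_for_each_input(path, len(inp))
```

Here `mapping` plays the role of the alignment data: it tells a decoder which target frame to emit for each input frame, so the target frames stay synchronized with the input even when the input holds a sound longer than the target did.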