Quick Patent Search - quickly search global patents, free patent database for commercial use - IPRDB

1. Invention application

US20120239387A1 VOICE TRANSFORMATION WITH ENCODED INFORMATION (in force)
Publication (announcement) number: US20120239387A1
Publication (announcement) date: 2012-09-20
Application number: US13049924
Filing date: 2011-03-17
Applicant: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
Inventor: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
IPC classification: G10L19/02
CPC classification: G10L21/003, G10L19/018
Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.
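The round trip described in this abstract can be illustrated with a minimal sketch. It is not the patented method: the "transformation" here is a plain gain change, and the steganography is least-significant-bit embedding in 16-bit PCM samples. Both choices, and every function name, are assumptions made for illustration, since the abstract specifies neither.

```python
import struct

def transform(samples, gain):
    """Stand-in voice transformation: scale 16-bit samples by a gain factor."""
    return [max(-32768, min(32767, round(s * gain))) for s in samples]

def embed_params(samples, gain):
    """Hide the gain (as a 4-byte float32) in the LSBs of the first 32 samples."""
    payload = struct.pack("<f", gain)
    bits = [(byte >> i) & 1 for byte in payload for i in range(8)]
    if len(bits) > len(samples):
        raise ValueError("signal too short to carry the parameters")
    out = list(samples)
    for k, bit in enumerate(bits):
        out[k] = (out[k] & ~1) | bit  # overwrite the least significant bit
    return out

def extract_params(samples):
    """Read the 32 hidden bits back and decode the gain."""
    payload = bytearray()
    for b in range(4):
        byte = 0
        for i in range(8):
            byte |= (samples[8 * b + i] & 1) << i
        payload.append(byte)
    return struct.unpack("<f", bytes(payload))[0]

def reconstruct(output_speech):
    """Inverse transformation that needs only the output speech."""
    gain = extract_params(output_speech)
    return [round(s / gain) for s in output_speech]

source = [100 * k for k in range(-20, 20)]        # toy "speech" samples
output_speech = embed_params(transform(source, 1.5), 1.5)
approx_source = reconstruct(output_speech)        # approximation of the source
```

Because the embedding alters the lowest bit of some samples, the reconstruction is an approximation of the source rather than an exact copy, which matches the wording of the abstract.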

2. Invention grant

US08930182B2 Voice transformation with encoded information (in force)
Publication (announcement) number: US08930182B2
Publication (announcement) date: 2015-01-06
Application number: US13049924
Filing date: 2011-03-17
Applicant: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
Inventor: Shay Ben-David, Ron Hoory, Zvi Kons, David Nahamoo
IPC classification: G10L21/00, G10L25/90, G10L25/93, G10L21/003, G10L19/018
CPC classification: G10L21/003, G10L19/018
Abstract: Method, system, and computer program product for voice transformation are provided. The method includes transforming a source speech using transformation parameters, and encoding information on the transformation parameters in an output speech using steganography, wherein the source speech can be reconstructed using the output speech and the information on the transformation parameters. A method for reconstructing voice transformation is also provided including: receiving an output speech of a voice transformation system wherein the output speech is transformed speech which has encoded information on the transformation parameters using steganography; extracting the information on the transformation parameters; and carrying out an inverse transformation of the output speech to obtain an approximation of an original source speech.

3. Invention application

US20130325455A1 VOCAL SOURCE EXTRACTION BY MAXIMUM PHASE DETECTION (in force)
Publication (announcement) number: US20130325455A1
Publication (announcement) date: 2013-12-05
Application number: US13487275
Filing date: 2012-06-04
Applicant: Aharon Satt, Zvi Kons, Ron Hoory
Inventor: Aharon Satt, Zvi Kons, Ron Hoory
IPC classification: G10L11/04
CPC classification: G10L25/75, G10L25/03, G10L25/45
Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.

4. Invention grant

US09105272B2 Vocal source extraction by maximum phase detection (in force)
Publication (announcement) number: US09105272B2
Publication (announcement) date: 2015-08-11
Application number: US13487275
Filing date: 2012-06-04
Applicant: Aharon Satt, Zvi Kons, Ron Hoory, Virgilijus Ulozas
Inventor: Aharon Satt, Zvi Kons, Ron Hoory, Virgilijus Ulozas
IPC classification: G10L25/75, G10L25/03, G10L25/45
CPC classification: G10L25/75, G10L25/03, G10L25/45
Abstract: Methods, apparatus and computer program products implement embodiments of the present invention that include receiving a time domain voice signal, and extracting a single pitch cycle from the received signal. The extracted single pitch cycle is transformed to a frequency domain, and the misclassified roots of the frequency domain are identified and corrected. Using the corrected roots, an indication of a maximum phase of the frequency domain is generated.

5. Invention grant

US08886537B2 Method and system for text-to-speech synthesis with personalized voice (in force)
Publication (announcement) number: US08886537B2
Publication (announcement) date: 2014-11-11
Application number: US11688264
Filing date: 2007-03-20
Applicant: Itzhack Goldberg, Ron Hoory, Boaz Mizrachi, Zvi Kons
Inventor: Itzhack Goldberg, Ron Hoory, Boaz Mizrachi, Zvi Kons
IPC classification: G10L13/00, G10L13/033, G10L13/04
CPC classification: G10L13/00, G10L13/033, G10L13/04
Abstract: A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) to synthesized speech including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453) and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker with expressions added from the visual input (455).

6. Invention grant

US08280724B2 Speech synthesis using complex spectral modeling (in force)
Publication (announcement) number: US08280724B2
Publication (announcement) date: 2012-10-02
Application number: US11046911
Filing date: 2005-01-31
Applicant: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
Inventor: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
IPC classification: G10L11/04, G10L19/14, G10L11/06, G10L19/06
CPC classification: G10L13/08, G10L19/02
Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.
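The first step named in this abstract, dividing the signal into a succession of frames, might look like the sketch below. The parameter names `frame_len` and `hop` are hypothetical, and overlapping frames are an assumption; the abstract gives neither a frame size nor the criterion for marking a frame as a click frame, so that step is not sketched.

```python
def split_into_frames(signal, frame_len, hop):
    """Cut a sample sequence into fixed-length frames, `hop` samples apart.

    Trailing samples that do not fill a whole frame are dropped.
    """
    if frame_len <= 0 or hop <= 0:
        raise ValueError("frame_len and hop must be positive")
    last_start = len(signal) - frame_len
    return [signal[start:start + frame_len]
            for start in range(0, last_start + 1, hop)]

# Example: 10 samples, 4-sample frames, 2-sample hop (50% overlap).
frames = split_into_frames(list(range(10)), frame_len=4, hop=2)
```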

7. Invention application

US20080235024A1 METHOD AND SYSTEM FOR TEXT-TO-SPEECH SYNTHESIS WITH PERSONALIZED VOICE (in force)
Publication (announcement) number: US20080235024A1
Publication (announcement) date: 2008-09-25
Application number: US11688264
Filing date: 2007-03-20
Applicant: Itzhack Goldberg, Ron Hoory, Boaz Mizrachi, Zvi Kons
Inventor: Itzhack Goldberg, Ron Hoory, Boaz Mizrachi, Zvi Kons
IPC classification: G10L13/00
CPC classification: G10L13/00, G10L13/033, G10L13/04
Abstract: A method and system are provided for text-to-speech synthesis with personalized voice. The method includes receiving an incidental audio input (403) of speech in the form of an audio communication from an input speaker (401) and generating a voice dataset (404) for the input speaker (401). The method includes receiving a text input (411) at the same device as the audio input (403) and synthesizing (312) the text from the text input (411) to synthesized speech including using the voice dataset (404) to personalize the synthesized speech to sound like the input speaker (401). In addition, the method includes analyzing (316) the text for expression and adding the expression (315) to the synthesized speech. The audio communication may be part of a video communication (453) and the audio input (403) may have an associated visual input (455) of an image of the input speaker. The synthesis from text may include providing a synthesized image personalized to look like the image of the input speaker with expressions added from the visual input (455).

8. Invention application

US20050131680A1 Speech synthesis using complex spectral modeling (in force)
Publication (announcement) number: US20050131680A1
Publication (announcement) date: 2005-06-16
Application number: US11046911
Filing date: 2005-01-31
Applicant: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
Inventor: Dan Chazan, Ron Hoory, Zvi Kons, Slava Shechtman, Alexander Sorin
IPC classification: G10L11/00, G10L11/04, G10L13/08, G10L19/02, G10L19/14
CPC classification: G10L13/08, G10L19/02
Abstract: A method for processing a speech signal includes dividing the speech signal into a succession of frames, identifying one or more of the frames as click frames, and extracting phase information from the click frames. The speech signal is encoded using the phase information. Methods are also provided for modeling phase spectra of voiced frames and click frames.

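9. Invention grant

US07127389B2 Method for encoding and decoding spectral phase data for speech signals (in force)
Publication (announcement) number: US07127389B2
Publication (announcement) date: 2006-10-24
Application number: US10243580
Filing date: 2002-09-13
Applicant: Dan Chazan, Zvi Kons
Inventor: Dan Chazan, Zvi Kons
IPC classification: G10L11/04
CPC classification: G10L25/90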
Abstract: A speech decoder and a segment aligner are provided in the present invention. The speech decoder may include a spectrum reconstructor operative to reconstruct the spectrum of a speech segment from the amplitude envelope of the spectrum of said speech segment and pitch information, a phase combiner operative to reconstruct the complex spectrum of the speech segment from the reconstructed spectrum, phase information describing the speech segment, and pitch information describing the speech segment. The speech decoder may further include a delay operative to store a complex spectrum of a previous speech segment; and a segment aligner operative to determine the relative offset between the complex spectrum of the speech segment and the complex spectrum of the previous speech segment, align the position of the first pitch excitation of the current speech segment to the last pitch excitation of the previous speech segment; and to apply a time shift and a complex Hilbert filter to said complex spectra, wherein the segment aligner is operative to cross-correlate the complex spectra as $C(\tau) = \sum_{n=0}^{N} F_n \overline{G}_m e^{-2\pi i n \tau}$, $m = \lfloor n \frac{p_G}{p_F} + 0.5 \rfloor$, where $F_n$ and $G_m$ are the computed complex magnitude of the pitch harmonics $n$ and $m$ of the current and previous spectra respectively, and $p_F$ and $p_G$ are their corresponding pitch periods.
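The cross-correlation in this abstract can be evaluated directly from its definition. The sketch below assumes that the harmonics are given as lists indexed from 0, that τ is a fraction of one pitch cycle, and that the relative offset is taken as the τ that maximizes the real part of C(τ) over a uniform grid. The function names, the grid search, and the choice of the real part are illustrative assumptions; the abstract only states the formula.

```python
import cmath
import math

def cross_correlation(F, G, p_F, p_G, tau):
    """C(tau) = sum_n F_n * conj(G_m) * exp(-2*pi*i*n*tau),
    with m = floor(n * p_G / p_F + 0.5)."""
    total = 0j
    for n, F_n in enumerate(F):
        m = math.floor(n * p_G / p_F + 0.5)  # nearest harmonic of the previous segment
        if m >= len(G):
            break  # no matching harmonic is available (assumed handling)
        total += F_n * G[m].conjugate() * cmath.exp(-2j * math.pi * n * tau)
    return total

def best_offset(F, G, p_F, p_G, steps=64):
    """Grid-search tau in [0, 1) for the largest real part of C(tau)."""
    candidates = [k / steps for k in range(steps)]
    return max(candidates,
               key=lambda t: cross_correlation(F, G, p_F, p_G, t).real)

# Toy check: F is G with each harmonic n rotated by a quarter-cycle shift.
G = [1 + 0j, 0.8 + 0.2j, 0.5 - 0.3j, 0.3 + 0.1j, 0.2 + 0j]
F = [g * cmath.exp(2j * math.pi * n * 0.25) for n, g in enumerate(G)]
offset = best_offset(F, G, p_F=80, p_G=80)
```

With equal pitch periods the index mapping reduces to m = n, so every term of the sum becomes real and positive at the true shift, which is why the grid search recovers it in this toy case.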
