    • 3. Granted invention patent
    • Title: Phoneme information extracting apparatus
    • Publication number: US4405838A
    • Publication date: 1983-09-20
    • Application number: US273400
    • Filing date: 1981-06-15
    • Inventors: Tsuneo Nitta, Hideki Kasuya
    • IPC: G10L11/00, G10L15/00, G10L15/04, G10L15/10, G10L1/00
    • CPC: G10L15/00
    • Abstract: A phoneme information extracting apparatus includes correlation data generators for successively generating correlation data representing the correlation between the acoustic power spectrum data corresponding to the input voice and the power spectrum data of various reference phonemes; selection circuits for successively transferring these correlation data when they detect that three or more successive correlation data have values greater than a predetermined value; maximum data hold circuits for holding the maximum correlation data among the correlation data transferred from the respective selection circuits; and a phoneme determination circuit for determining the optimum phoneme by detecting which of the data hold circuits is holding the maximum correlation data among those held in the data hold circuits.
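The abstract above describes a concrete selection-and-hold scheme. The Python sketch below illustrates that idea under stated assumptions: the function and variable names are invented for illustration, the correlation measure is a plain Pearson correlation, and the run length of three successive frames follows the abstract; this is not the patented circuit itself.

    import numpy as np

    def extract_phoneme(frame_spectra, reference_spectra, threshold=0.8, min_run=3):
        """Toy version of the selection/hold logic in the abstract (names illustrative).

        frame_spectra:     (T, F) array, power spectrum of the input voice per frame
        reference_spectra: dict mapping phoneme label -> (F,) reference power spectrum
        """
        held = {}  # "maximum data hold circuits": best correlation held per phoneme
        for label, ref in reference_spectra.items():
            corr = np.array([np.corrcoef(frame, ref)[0, 1] for frame in frame_spectra])
            above = np.append(corr > threshold, False)  # sentinel closes a final run
            run_start = None
            for t, flag in enumerate(above):
                if flag and run_start is None:
                    run_start = t
                elif not flag and run_start is not None:
                    # "selection circuit": only runs of >= min_run frames pass through
                    if t - run_start >= min_run:
                        peak = float(corr[run_start:t].max())
                        held[label] = max(held.get(label, peak), peak)
                    run_start = None
        # "phoneme determination circuit": the phoneme holding the largest correlation wins
        return max(held, key=held.get) if held else None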
    • 4. Granted invention patent
    • Title: Text-to-speech synthesis with controllable processing time and speech quality
    • Publication number: US5615300A
    • Publication date: 1997-03-25
    • Application number: US67079
    • Filing date: 1993-05-26
    • Inventors: Yoshiyuki Hara, Tsuneo Nitta
    • IPC: G06F3/16, G06F17/28, G10L13/04, G10L13/06, G10L13/08, G10L9/00
    • CPC: G10L13/047, G10L13/04
    • Abstract: Synthesized speech is generated by a software-implemented system with a programmed central processing unit. Phonetic parameters are generated from a series of phonetic symbols of an input text to be converted into synthesized speech, and prosodic parameters are generated from prosodic information of the input text. The activity ratio of the central processing unit is determined, and the order of the phonetic parameters or the arrangement of the synthesis unit or filter used for speech synthesis is chosen depending on that activity ratio. Synthesized speech sounds are then generated and filtered from the phonetic and prosodic parameters according to the chosen parameter order or filter arrangement.
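As a rough illustration of the control idea in this abstract (picking a cheaper or richer synthesis configuration from the measured CPU activity ratio), here is a hedged Python sketch; the tiers, parameter orders, and filter names are invented for illustration and are not taken from the patent.

    def choose_synthesis_config(cpu_activity_ratio):
        """Map the measured CPU activity ratio to a synthesis configuration.

        The tiers, parameter orders and filter names below are illustrative only.
        """
        if cpu_activity_ratio < 0.5:    # plenty of headroom: richest settings
            return {"parameter_order": 24, "filter": "full_cascade"}
        if cpu_activity_ratio < 0.8:    # moderate load: reduced order
            return {"parameter_order": 16, "filter": "reduced_cascade"}
        return {"parameter_order": 10, "filter": "simplified"}  # heavy load

    # Example: under heavy load the cheapest arrangement is selected.
    print(choose_synthesis_config(0.9))   # {'parameter_order': 10, 'filter': 'simplified'}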
    • 6. Granted invention patent
    • Title: Orthogonalized dictionary speech recognition apparatus and method thereof
    • Publication number: US4979213A
    • Publication date: 1990-12-18
    • Application number: US378780
    • Filing date: 1989-07-12
    • Inventor: Tsuneo Nitta
    • IPC: G10L11/00, G10L15/06
    • CPC: G10L15/063
    • Abstract: Speech pattern data representing the speech of a plurality of speakers are stored in a pattern storage section in advance. Averaged pattern data are obtained by averaging a plurality of speech pattern data of the first of the speakers. Data obtained by blurring and differentiating the averaged pattern data are stored in an orthogonalized dictionary as basic orthogonalized dictionary data of the first and second axes, respectively. Blurred and differentiated data obtained for the second and subsequent speakers are selectively stored in the orthogonalized dictionary as additional dictionary data having new axes. The speech of the speakers is recognized by computing a similarity between the orthogonalized dictionary formed in this manner and the input speech.
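A hedged Python sketch of the dictionary construction described above: the first speaker's averaged pattern is blurred and differentiated to form the first two axes, and later speakers' blurred and differentiated data are added as new axes only when they are not already represented. The blur and derivative operators, the Gram-Schmidt residual test, and the threshold are assumptions made for illustration, not the patented procedure.

    import numpy as np
    from scipy.ndimage import gaussian_filter1d

    def blur(v):
        return gaussian_filter1d(v, sigma=1.0)   # stand-in for the "blurring" operator

    def differentiate(v):
        return np.gradient(v)                    # stand-in for the "differentiating" operator

    def build_orthogonalized_dictionary(speaker_patterns, residual_ratio=0.1):
        """speaker_patterns: list of (N, D) arrays, one array of patterns per speaker."""
        mean0 = speaker_patterns[0].mean(axis=0)
        axes = [v / np.linalg.norm(v) for v in (blur(mean0), differentiate(mean0))]
        for patterns in speaker_patterns[1:]:
            mean = patterns.mean(axis=0)
            for v in (blur(mean), differentiate(mean)):
                # keep only the component not already spanned by the existing axes
                residual = v - sum(np.dot(v, a) * a for a in axes)
                if np.linalg.norm(residual) > residual_ratio * np.linalg.norm(v):
                    axes.append(residual / np.linalg.norm(residual))   # new axis
        return np.stack(axes)

    def similarity(dictionary_axes, x):
        """Subspace-style similarity between an input pattern x and the dictionary."""
        x = x / np.linalg.norm(x)
        return float(np.sum(np.dot(dictionary_axes, x) ** 2))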
    • 7. Granted invention patent
    • Title: Speech recognition system
    • Publication number: US4624011A
    • Publication date: 1986-11-18
    • Application number: US462042
    • Filing date: 1983-01-28
    • Inventors: Sadakazu Watanabe, Hidenori Shinoda, Tsuneo Nitta, Yoichi Takebayashi, Shouichi Hirai, Tomio Sakata, Kensuke Uehara, Yasuo Takahashi, Haruo Asada
    • IPC: G10L15/10, G10L11/00, G10L15/00, G10L15/28, G10L5/00
    • CPC: G10L15/00
    • Abstract: An acoustic signal processing circuit extracts input speech pattern data and subsidiary feature data from an input speech signal. The input speech pattern data comprise frequency spectra, whereas the subsidiary feature data comprise phoneme and acoustic features. These data are stored in a data buffer memory. A similarity computation circuit computes the similarity measures between the input speech pattern data stored in the data buffer memory and the reference speech pattern data stored in a dictionary memory. When the largest similarity measure exceeds a first threshold value and the difference between the largest and the second largest measure exceeds a second threshold value, a control circuit outputs the category data of the reference pattern giving the largest similarity measure as corresponding to the input speech. When recognition cannot be performed this way, the categories of the reference speech patterns giving the largest to the m-th largest similarity measures are compared with the subsidiary feature data, so that subsidiary feature recognition of the input voice is performed by a subsidiary feature recognition section.
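The two-threshold decision and the subsidiary-feature fallback described in this abstract can be summarized in a short Python sketch; the function names, the ranking over the top m candidates, and the reject behaviour are illustrative assumptions rather than the patented circuit.

    def recognize(similarities, theta1, theta2, subsidiary_match, m=3):
        """Decision logic sketched from the abstract; names and m are illustrative.

        similarities:     dict mapping category -> similarity measure
        subsidiary_match: callable(category) -> bool, compares the stored phoneme and
                          acoustic features of the category with those of the input
        """
        ranked = sorted(similarities.items(), key=lambda kv: kv[1], reverse=True)
        (best_cat, best), (_, second) = ranked[0], ranked[1]
        # accept directly when the top score is high enough and well separated
        if best > theta1 and (best - second) > theta2:
            return best_cat
        # otherwise fall back to subsidiary-feature recognition over the top m candidates
        for cat, _ in ranked[:m]:
            if subsidiary_match(cat):
                return cat
        return None  # reject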
    • 8. Granted invention patent
    • Title: Speech search device and speech search method
    • Publication number: US08626508B2
    • Publication date: 2014-01-07
    • Application number: US13203371
    • Filing date: 2010-02-10
    • Inventors: Koichi Katsurada, Tsuneo Nitta, Shigeki Teshima
    • IPC: G10L15/00, G10L17/00, G10L15/06, G10L21/00, G10L25/00, G06F7/00, G06F17/30
    • CPC: G10L15/12, G10L2015/025
    • Abstract: Provided are a speech search device and a speech search method whose search speed is very fast, whose search performance is also excellent, and which perform fuzzy search. In addition to the fuzzy search, the distance between phoneme discrimination features included in the speech data is calculated to determine the similarity to the speech, using both a suffix array and dynamic programming. The object to be searched is narrowed by dividing the search keyword on phoneme boundaries and applying search thresholds to the divided keywords, the search is repeated while increasing the thresholds in order, and whether or not to divide the keyword is decided from the length of the search keyword. This implements a speech search whose search speed is very fast and whose search performance is also excellent.
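A minimal Python sketch of the coarse-to-fine matching loop this abstract describes: a dynamic-programming edit distance over phoneme sequences serves as the similarity, and the search is retried with progressively looser thresholds. A real implementation would first narrow candidates with a suffix array; the linear scan, the threshold values, and the windowed matching below are simplifications made for illustration.

    def phoneme_edit_distance(a, b):
        """Standard dynamic-programming edit distance over phoneme sequences."""
        dp = [[i + j if i * j == 0 else 0 for j in range(len(b) + 1)]
              for i in range(len(a) + 1)]
        for i in range(1, len(a) + 1):
            for j in range(1, len(b) + 1):
                dp[i][j] = min(dp[i - 1][j] + 1,
                               dp[i][j - 1] + 1,
                               dp[i - 1][j - 1] + (a[i - 1] != b[j - 1]))
        return dp[len(a)][len(b)]

    def search(keyword_phonemes, indexed_utterances, thresholds=(0, 1, 2)):
        """Retry the search with progressively looser thresholds.

        indexed_utterances: dict mapping utterance id -> phoneme sequence.
        """
        k = len(keyword_phonemes)
        for t in thresholds:
            hits = []
            for uid, phonemes in indexed_utterances.items():
                # best matching window of the keyword's length (fuzzy substring match)
                best = min(
                    (phoneme_edit_distance(keyword_phonemes, phonemes[i:i + k])
                     for i in range(max(1, len(phonemes) - k + 1))),
                    default=k)
                if best <= t:
                    hits.append((uid, best))
            if hits:
                return sorted(hits, key=lambda x: x[1])
        return []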
    • 9. Granted invention patent
    • Title: Speech recognition using continuous density hidden Markov models and the orthogonalizing Karhunen-Loeve transformation
    • Publication number: US5506933A
    • Publication date: 1996-04-09
    • Application number: US30618
    • Filing date: 1993-03-12
    • Inventor: Tsuneo Nitta
    • IPC: G10L15/10, G10L15/02, G10L15/06, G10L15/14, G10L9/00
    • CPC: G10L15/144
    • Abstract: A recognition system comprises a feature extractor for extracting a feature vector x from an input speech signal, and a recognizing section for defining continuous density hidden Markov models of predetermined categories k as transition network models, each having parameters of transition probabilities p(k,i,j) that a state Si transits to a next state Sj and output probabilities g(k,s) that a feature vector x is output in a transition from the state Si to one of the states Si and Sj, and for recognizing the input signal on the basis of the similarity between a sequence X of feature vectors extracted by the feature extractor and the continuous density HMMs. In particular, the recognizing section includes a memory section for storing a set of orthogonal vectors φ_m(k,s) provided for the continuous density HMMs, and a modified CDHMM processor for obtaining each of the output probabilities g(k,s) for the continuous density HMMs in accordance with the corresponding orthogonal vectors φ_m(k,s).
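One plausible reading of the modified CDHMM described above, sketched in Python: the Gaussian output density of each state is replaced by a squared-projection score onto the stored orthogonal vectors φ_m(k,s), and a small left-to-right Viterbi pass uses that score as the emission term. The projection formula and the Viterbi topology are assumptions made for illustration, not the patented formulation.

    import numpy as np

    def output_probability(x, phi, floor=1e-6):
        """Projection-based stand-in for the output density of one CDHMM state.

        x:   (D,) feature vector
        phi: (M, D) orthogonal vectors stored for this state (k, s)
        """
        x = x / np.linalg.norm(x)
        score = float(np.sum(np.dot(phi, x) ** 2))   # subspace-method style similarity
        return max(score, floor)

    def viterbi_log_score(obs, trans_logp, phis):
        """Tiny left-to-right Viterbi using the projection score as the emission term."""
        n_states = len(phis)
        scores = np.full(n_states, -np.inf)
        scores[0] = np.log(output_probability(obs[0], phis[0]))
        for x in obs[1:]:
            new = np.full(n_states, -np.inf)
            for j in range(n_states):
                for i in (j - 1, j):              # left-to-right: stay or advance by one
                    if 0 <= i and scores[i] > -np.inf:
                        cand = (scores[i] + trans_logp[i][j]
                                + np.log(output_probability(x, phis[j])))
                        new[j] = max(new[j], cand)
            scores = new
        return scores[-1]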