Quick Patent Search - fast search of global patents, free commercial patent database - IPRDB

1. Granted patent

US06535854B2 Speech recognition control of remotely controllable devices in a home network environment [In force]
Publication No.: US06535854B2
Publication date: 2003-03-18
Application No.: US09175382
Filing date: 1998-10-19
Applicant(s): Peter Buchner, Silke Goronzy, Ralf Kompe, Stefan Rapp
Inventor(s): Peter Buchner, Silke Goronzy, Ralf Kompe, Stefan Rapp
IPC classification: G10L15/22
CPC classification: H04L12/40117, G10L15/26, H04L12/2803, H04L12/282
Abstract: Low-cost digital interfaces for home networks are being introduced that integrate entertainment, communication and computing electronics into consumer multimedia. Normally, these are low-cost, easy-to-use systems, since they allow the user to remove or add any kind of network device while the bus is active. To improve the user interface, a speech unit (2) is proposed that enables all devices (11) connected to the bus system (31) to be controlled by a single speech recognition device. The properties of this device, e.g. the vocabulary, can be dynamically and actively extended by the consumer devices (11) connected to the bus system (31). The proposed technology is independent of any specific bus standard, e.g. the IEEE 1394 standard, and is well suited to all kinds of wired or wireless home networks. The speech unit (2) receives data and messages from the devices and recognizes speaker-dependent commands. A speech synthesizer synthesizes messages. A remotely controllable device (11) has access to a medium, which may be a CD-ROM. The device may ask for a logical name or identifier.
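The dynamic vocabulary extension described in this abstract can be illustrated with a minimal sketch. It assumes a plain dictionary in place of a real recognizer, and the names `SpeechUnit`, `register_device`, `unregister_device` and `dispatch` are hypothetical; none of them come from the patent.

```python
class SpeechUnit:
    """Toy speech unit whose vocabulary grows as devices join the bus."""

    def __init__(self):
        # Maps a spoken phrase to (device_id, command). Hypothetical layout.
        self.vocabulary = {}

    def register_device(self, device_id, commands):
        """Add the phrases that a newly attached device sends over the bus."""
        for phrase, command in commands.items():
            self.vocabulary[phrase.lower()] = (device_id, command)

    def unregister_device(self, device_id):
        """Drop every phrase that belongs to a device removed from the bus."""
        self.vocabulary = {
            phrase: target
            for phrase, target in self.vocabulary.items()
            if target[0] != device_id
        }

    def dispatch(self, recognized_text):
        """Return (device_id, command) for a recognized phrase, or None."""
        return self.vocabulary.get(recognized_text.lower())
```

In this sketch a device contributes its own phrases when it is attached, so the speech unit needs no built-in knowledge of any particular device.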

2. Granted patent

US06615177B1 Merging of speech interfaces from concurrent use of devices and applications [Lapsed]
Publication No.: US06615177B1
Publication date: 2003-09-02
Application No.: US09546768
Filing date: 2000-04-11
Applicant(s): Stefan Rapp, Silke Goronzy, Ralf Kompe, Peter Buchner, Franck Giron, Helmut Lucke
Inventor(s): Stefan Rapp, Silke Goronzy, Ralf Kompe, Peter Buchner, Franck Giron, Helmut Lucke
IPC classification: G10L21/00
CPC classification: G10L15/26, H04M2201/40
Abstract: According to the present invention, network devices that can be controlled via a speech unit included in the network can each send a device-document describing their functionality and speech interface to said speech unit. The speech unit combines those documents into a general document that forms the basis for translating recognized user-commands into user-network-commands to control the connected network-devices. A device-document comprises at least the vocabulary and the commands associated therewith for the corresponding device. Furthermore, pronunciation, grammar for word sequences, and rules for speech understanding and dialog can be contained in such documents, as well as the same information for multiple languages or information for dynamic dialogs in speech understanding. It is possible for one device to contain several documents and dynamically send them to the speech unit when they are needed. Furthermore, the present invention enables a device to change its functionality dynamically based on changing content, since a network device sends its specifications regarding its speech capabilities to the speech unit while the speech unit is in use.
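As a rough illustration of combining device-documents into a general document, the sketch below assumes each device-document is a small dictionary holding a device name and a phrase-to-command table. The function name `merge_device_documents` and the document layout are assumptions made for this example, not the format used in the patent.

```python
def merge_device_documents(documents):
    """Combine per-device documents into one general document.

    Each document is assumed to look like:
        {"device": str, "commands": {phrase: network_command}}
    The result maps each phrase to a list of (device, network_command)
    pairs, so a phrase shared by several devices stays distinguishable.
    """
    general = {}
    for doc in documents:
        for phrase, command in doc["commands"].items():
            general.setdefault(phrase, []).append((doc["device"], command))
    return general
```

Keeping a list per phrase is one possible design choice: when two devices both understand "power on", a dialog step could ask the user which device is meant.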

3. Granted patent

US06799162B1 Semi-supervised speaker adaptation [Lapsed]
Publication No.: US06799162B1
Publication date: 2004-09-28
Application No.: US09461981
Filing date: 1999-12-15
Applicant(s): Silke Goronzy, Ralf Kompe, Peter Buchner, Naoto Iwahashi
Inventor(s): Silke Goronzy, Ralf Kompe, Peter Buchner, Naoto Iwahashi
IPC classification: G10L15/06
CPC classification: G10L15/065, G10L15/063, G10L2015/0638
Abstract: To prevent adaptation to misrecognized words in unsupervised or on-line automatic speech recognition systems, confidence measures are used or the user reaction is interpreted to decide whether a recognized phoneme, several phonemes, a word, several words or a whole utterance should be used to adapt the speaker-independent model set to a speaker-adapted model set and, if an adaptation is executed, how strongly the adaptation with this recognized utterance or part of it should be performed. Furthermore, a verification of the speaker adaptation performance is proposed to ensure that the recognition rate never decreases (significantly), but only increases or stays at the same level.
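The idea of gating adaptation on a confidence measure, and scaling its strength, can be sketched as follows. This is a simplified illustration on a single scalar model mean; the function `adapt_mean`, the threshold of 0.7 and the maximum rate of 0.2 are assumptions chosen for the example and do not appear in the patent.

```python
def adapt_mean(model_mean, observed_mean, confidence,
               threshold=0.7, max_rate=0.2):
    """Return an updated model mean, or the old one if confidence is too low.

    confidence is assumed to lie in [0, 1]. Below the threshold the
    utterance is not used at all; above it, the adaptation strength
    grows linearly with the confidence.
    """
    if confidence < threshold:
        return model_mean  # likely misrecognized: skip adaptation
    rate = max_rate * confidence
    return model_mean + rate * (observed_mean - model_mean)
```

A low-confidence result leaves the model untouched, while a high-confidence result moves the mean a bounded step toward the observation.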

4. Granted patent

US07680654B2 Apparatus and method for segmentation of audio data into meta patterns [Lapsed]
Publication No.: US07680654B2
Publication date: 2010-03-16
Application No.: US10985615
Filing date: 2004-11-10
Applicant(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
Inventor(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
IPC classification: G10L15/00, G10L15/20
CPC classification: G10L25/00
Abstract: An audio data segmentation apparatus supplies audio data, divides the supplied audio data into audio clips of a predetermined length, discriminates the audio clips into predetermined audio classes, each audio class identifying the kind of audio data included in the respective audio clip, and segments the audio data into audio meta patterns based on a sequence of audio classes of consecutive audio clips, each meta pattern being allocated to a predetermined type of content of the audio data. Known methods for segmenting audio data into meta patterns rarely achieve good results because the rules for allocating the meta patterns are unsatisfactory. The inventive audio data segmentation apparatus solves this problem by further including a program database comprising program data units that each identify a certain kind of program, a plurality of respective audio meta patterns being allocated to each program data unit; the audio data is segmented into the corresponding audio meta patterns on the basis of the program data units of the program database 5.
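A heavily simplified sketch of the segmentation step is given below. It assumes the clips have already been classified, and it stands in for a program data unit with a plain dictionary that maps one audio class to one meta pattern; the patent describes meta patterns allocated from sequences of classes, which is richer than this. The names `segment_into_meta_patterns` and `program_rules` are hypothetical.

```python
from itertools import groupby


def segment_into_meta_patterns(clip_classes, program_rules):
    """Group consecutive clips and label each run with a meta pattern.

    clip_classes: list of audio class labels, one per fixed-length clip.
    program_rules: dict mapping an audio class to a meta pattern name for
        one kind of program (a stand-in for a program data unit).
    Returns a list of (meta_pattern, start_clip, end_clip_exclusive).
    """
    labels = [program_rules.get(c, "unknown") for c in clip_classes]
    segments = []
    start = 0
    for pattern, run in groupby(labels):
        length = len(list(run))
        segments.append((pattern, start, start + length))
        start += length
    return segments
```

Swapping in a different `program_rules` dictionary for a different kind of program changes how the same class sequence is segmented, which is the role the program database plays in the abstract.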

5. Patent application

US20050102135A1 Apparatus and method for automatic extraction of important events in audio signals [Lapsed]
Publication No.: US20050102135A1
Publication date: 2005-05-12
Application No.: US10985446
Filing date: 2004-11-10
Applicant(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
Inventor(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
IPC classification: G10L15/00, G10L17/26, G10L25/00, H04N5/91
CPC classification: G10L25/00, G10L15/00, G10L17/26
Abstract: The present invention discloses an apparatus for automatic extraction of important events in audio signals comprising: signal input means for supplying audio signals; audio signal fragmenting means for partitioning audio signals supplied by the signal input means into audio fragments of a predetermined length and for allocating a sequence of one or more audio fragments to a respective audio window; feature extracting means for analysing acoustic characteristics of the audio signals comprised in the audio fragments and for analysing acoustic characteristics of the audio signals comprised in the audio windows; and important event extraction means for extracting important events in audio signals supplied by the audio signal fragmenting means based on predetermined important event classifying rules depending on acoustic characteristics of the audio signals comprised in the audio fragments and on acoustic characteristics of the audio signals comprised in the audio windows, wherein each important event extracted by the important event extraction means comprises a discrete sequence of cohesive audio fragments corresponding to an important event included in the audio signals.

6. Granted patent

US07962330B2 Apparatus and method for automatic dissection of segmented audio signals [Lapsed]
Publication No.: US07962330B2
Publication date: 2011-06-14
Application No.: US10985451
Filing date: 2004-11-10
Applicant(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
Inventor(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Hay Lam, Krzysztof Marasek, Raquel Tato
IPC classification: G10L19/00, G10L11/00, G10L15/00, G06F17/00, H04N7/16, H04N7/173
CPC classification: G10L25/00, G10L15/00, Y10S707/913
Abstract: An apparatus for automatic dissection of segmented audio signals, wherein at least one information signal is provided for identifying programs included in said audio signals and for identifying contents included in said programs. A content detection device detects programs, and the contents belonging to the respective programs, in the information signal. A program weighting device weights each program included in the information signal based on the contents of the respective program detected by the content detection device. A program ranking device identifies programs of the same category and ranks said programs based on the weighting result for each program provided by the program weighting device.
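The weighting and ranking steps can be pictured with the short sketch below. It assumes that content detection has already produced a list of content labels per program and that a program's weight is simply the sum of per-label weights; both the scoring rule and the names `rank_programs` and `content_weights` are assumptions for illustration, not the method claimed in the patent.

```python
from collections import defaultdict


def rank_programs(programs, content_weights):
    """Weight each program by its contents, then rank within each category.

    programs: list of dicts with keys "name", "category" and "contents"
        (a list of detected content labels).
    content_weights: dict mapping a content label to a numeric weight.
    Returns {category: [(name, weight), ...]} sorted by descending weight.
    """
    by_category = defaultdict(list)
    for program in programs:
        weight = sum(content_weights.get(c, 0) for c in program["contents"])
        by_category[program["category"]].append((program["name"], weight))
    return {
        category: sorted(items, key=lambda item: item[1], reverse=True)
        for category, items in by_category.items()
    }
```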

7. Patent application

US20060156326A1 Methods to create a user profile and to specify a suggestion for a next selection of a user [Lapsed]
Publication No.: US20060156326A1
Publication date: 2006-07-13
Application No.: US10525665
Filing date: 2003-08-27
Applicant(s): Silke Goronzy, Ralf Kompe, Christian Hying, Zica Valsan, Robert Mencl, Helmut Wais, Thomas Kemp, Sunna Torge, Martin Emele
Inventor(s): Silke Goronzy, Ralf Kompe, Christian Hying, Zica Valsan, Robert Mencl, Helmut Wais, Thomas Kemp, Sunna Torge, Martin Emele
IPC classification: H04N7/16, H04N5/445, H04H9/00, G06F3/00
CPC classification: G06F17/30035, H04H60/46, H04N7/163, H04N21/25891, H04N21/44222, H04N21/6582, H04N21/8106, Y10S707/99933, Y10S707/99934
Abstract: A user profile and/or the suggestions computed based thereon are obtained taking a special set of user features into account. The user features are defined to represent a typical general behaviour of an individual user with respect to the application where the user profile is used. In other words, for each application where a user profile is used, a special set of user features is defined which is able to represent a typical general behaviour of an individual user. Based on these user features, the weights in the list of word-weight pairs or weighted keywords which represents the user profile are computed or influenced during the creation of the user profile, and/or a multi-user profile is split during the creation of an individual user profile from a multi-user profile, and/or, during specification of a suggestion, a user history which is used to create the user profile, and/or the user profile, and/or the suggestion results are filtered.
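The "list of word-weight pairs" that represents a user profile can be illustrated with the sketch below. It only shows the weighted-keyword representation and a naive suggestion step; it does not model the user features that, according to the abstract, influence the weights. The functions `build_profile` and `suggest`, and the use of normalized keyword counts as weights, are assumptions for this example.

```python
from collections import Counter


def build_profile(history):
    """Build a {keyword: weight} profile from a user history.

    history: list of keyword lists, one per past selection.
    Weights are normalized keyword counts, so they sum to 1.
    """
    counts = Counter(k for item in history for k in item)
    total = sum(counts.values())
    return {k: n / total for k, n in counts.items()} if total else {}


def suggest(profile, candidates):
    """Return the candidate title whose keywords score highest.

    candidates: non-empty {title: keyword list}.
    """
    def score(keywords):
        return sum(profile.get(k, 0.0) for k in keywords)

    return max(candidates, key=lambda title: score(candidates[title]))
```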

8. Patent application

US20050131688A1 Apparatus and method for classifying an audio signal [Under examination, published]
Publication No.: US20050131688A1
Publication date: 2005-06-16
Application No.: US10985295
Filing date: 2004-11-10
Applicant(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
Inventor(s): Silke Goronzy, Thomas Kemp, Ralf Kompe, Yin Lam, Krzysztof Marasek, Raquel Tato
IPC classification: G10L15/10, G10L15/26, G10L25/78, G10L15/12
CPC classification: G10L25/78, G10L15/26, G11B2220/20
Abstract: An apparatus for classifying audio signals comprises audio signal clipping means for partitioning audio signals into audio clips, and class discrimination means for discriminating the audio clips provided by the audio signal clipping means into predetermined audio classes based on predetermined audio class classifying rules, by analysing acoustic characteristics of the audio signals comprised in the audio clips, wherein a predetermined audio class classifying rule is provided for each audio class, and each audio class represents a respective kind of audio signals comprised in the corresponding audio clip. In the prior art, the determination process for finding acceptable audio class classifying rules for each audio class depends on both the raw audio signals used and the personal experience of the person conducting the determination process. Thus, the determination process is usually very difficult, time-consuming and subjective. Furthermore, there is a high risk that not all possible peculiarities of the different programmes and the different categories the audio signal can belong to are sufficiently accounted for. This problem is solved in the inventive apparatus for classifying audio signals by class discrimination means calculating an audio class confidence value for each audio class assigned to an audio clip, wherein the audio class confidence value indicates the likelihood that the respective audio class correctly characterises the respective kind of audio signals comprised in the respective audio clip. Furthermore, the class discrimination means use acoustic characteristics of audio clips of audio classes having a high audio class confidence value to train the respective audio class classifying rule.
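The confidence-driven training loop in this abstract can be sketched as a selection step: classify each clip, compute a confidence value, and keep only high-confidence clips as training material. The sketch assumes each clip comes with non-negative per-class scores and defines confidence as the winning score's share of the total; this definition, the 0.8 threshold and the function names are assumptions for illustration.

```python
def classify_with_confidence(scores):
    """Return (best_class, confidence) for one clip.

    scores: non-empty dict mapping an audio class to a non-negative score.
    Confidence is the best score divided by the sum of all scores.
    """
    total = sum(scores.values())
    best = max(scores, key=scores.get)
    confidence = scores[best] / total if total > 0 else 0.0
    return best, confidence


def select_training_clips(clips, threshold=0.8):
    """Keep (features, class) pairs whose confidence reaches the threshold.

    clips: list of (features, scores) pairs.
    """
    selected = []
    for features, scores in clips:
        label, confidence = classify_with_confidence(scores)
        if confidence >= threshold:
            selected.append((features, label))
    return selected
```

The selected pairs could then be fed back to retrain the classifying rule for each class, which removes the hand-tuning the abstract describes as subjective.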

9. Granted patent

US07970762B2 Methods to create a user profile and to specify a suggestion for a next selection of a user [Lapsed]
Publication No.: US07970762B2
Publication date: 2011-06-28
Application No.: US12507574
Filing date: 2009-07-22
Applicant(s): Silke Goronzy, Ralf Kompe, Christian Hying, Zica Valsan, Robert Mencl, Helmut Wais, Thomas Kemp, Sunna Torge, Martin Emele
Inventor(s): Silke Goronzy, Ralf Kompe, Christian Hying, Zica Valsan, Robert Mencl, Helmut Wais, Thomas Kemp, Sunna Torge, Martin Emele
IPC classification: G06F17/30
CPC classification: G06F17/30035, H04H60/46, H04N7/163, H04N21/25891, H04N21/44222, H04N21/6582, H04N21/8106, Y10S707/99933, Y10S707/99934
Abstract: A user profile and/or the suggestions computed based thereon are obtained taking a special set of user features into account. The user features are defined to represent a typical general behaviour of an individual user with respect to the application where the user profile is used. In other words, for each application where a user profile is used, a special set of user features is defined which is able to represent a typical general behaviour of an individual user. Based on these user features, the weights in the list of word-weight pairs or weighted keywords which represents the user profile are computed or influenced during the creation of the user profile, and/or a multi-user profile is split during the creation of an individual user profile from a multi-user profile, and/or, during specification of a suggestion, a user history which is used to create the user profile, and/or the user profile, and/or the suggestion results are filtered.

10. Granted patent

US07113908B2 Method for recognizing speech using eigenpronunciations [Lapsed]
Publication No.: US07113908B2
Publication date: 2006-09-26
Application No.: US10090861
Filing date: 2002-03-05
Applicant(s): Silke Goronzy, Ralf Kompe
Inventor(s): Silke Goronzy, Ralf Kompe
IPC classification: G10L15/06
CPC classification: G10L15/07
Abstract: To increase the recognition rate and quality in a process of recognizing speech, an approximate set of pronunciation rules (APR) for a current pronunciation (CP) of a current speaker is determined in a given pronunciation space (PS) and then applied to a current pronunciation lexicon (CL) so as to perform a speaker-specific adaptation of said current lexicon (CL).
