Quick patent search - fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US07409349B2 Servers for web enabled speech recognition (status: in force)
Publication (announcement) number: US07409349B2
Publication (announcement) date: 2008-08-05
Application number: US09960229
Filing date: 2001-09-20
Applicant: Kuansan Wang, Hsiao-Wuen Hon
Inventor: Kuansan Wang, Hsiao-Wuen Hon
IPC classification: G10L21/00
CPC classification: G10L15/30, G06F3/16, G06F17/218, G10L15/24, H04M1/271, H04M1/72561, H04M3/493, H04M3/4936, H04M2207/40
Abstract: A markup language for execution on a client device in a client/server system includes instructions to unify at least one of recognition-related events, GUI events and telephony events on non-display, voice input based client device and a multimodal based client for a web server interacting with each of the client devices. A recognition server for receiving data indicative of inputted data provided to a client device and an indication of a grammar to use for recognition is also provided.

2. Granted invention patent

US07506022B2 Web enabled recognition architecture (status: in force)
Publication (announcement) number: US07506022B2
Publication (announcement) date: 2009-03-17
Application number: US09960232
Filing date: 2001-09-20
Applicant: Kuansan Wang, Hsiao-Wuen Hon
Inventor: Kuansan Wang, Hsiao-Wuen Hon
IPC classification: G06F15/16, G10L11/00
CPC classification: G10L15/30, G06F3/16, G06F17/218, H04M1/271, H04M1/72561, H04M3/493, H04M3/4936, H04M2207/40
Abstract: A server/client system for processing data includes a network having a web server with information accessible remotely. A client device includes a microphone and a rendering component such as a speaker or display. The client device is configured to obtain the information from the web server and record input data associated with fields contained in the information. The client device is adapted to send the input data to a remote location with an indication of a grammar to use for recognition. A recognition server receives the input data and the indication of the grammar. The recognition server returns data indicative of what was recognized to at least one of the client and the web server.
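
Illustrative note (not part of the patent record): the exchange described in this abstract, where a client sends recorded input together with an indication of the grammar and a recognition server returns what was recognized, might be sketched roughly as follows. All class names, the `grammars/cities` identifier, and the text-matching "decoder" are hypothetical stand-ins; the patent's actual interfaces are not given in this listing.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RecognitionRequest:
    field_name: str    # form field the recorded input belongs to
    input_data: bytes  # recorded input (a stand-in for audio)
    grammar_uri: str   # indication of which grammar to use (hypothetical format)


@dataclass(frozen=True)
class RecognitionResult:
    field_name: str
    text: Optional[str]  # None when nothing in the grammar matched


class ToyRecognitionServer:
    """Hypothetical server: looks up the grammar, then 'recognizes' the input."""

    def __init__(self, grammars: dict[str, set[str]]):
        self._grammars = grammars

    def recognize(self, request: RecognitionRequest) -> RecognitionResult:
        phrases = self._grammars.get(request.grammar_uri, set())
        # Assumption: a real decoder would process audio; here the payload is
        # treated as UTF-8 text and matched against the grammar's phrases.
        candidate = request.input_data.decode("utf-8").strip().lower()
        text = candidate if candidate in phrases else None
        return RecognitionResult(request.field_name, text)


server = ToyRecognitionServer({"grammars/cities": {"seattle", "boston"}})
result = server.recognize(
    RecognitionRequest("destination", b"Seattle", "grammars/cities")
)
```

In this sketch the result carries the field name so that either the client or the web server could place the recognized text into the matching field.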

3. Granted invention patent

US06629073B1 Speech recognition method and apparatus utilizing multi-unit models (status: in force)
Publication (announcement) number: US06629073B1
Publication (announcement) date: 2003-09-30
Application number: US09559505
Filing date: 2000-04-27
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: G01L15/06
CPC classification: G10L15/187, G10L2015/022, G10L2015/025
Abstract: A speech recognition method and system utilize an acoustic model that is capable of providing probabilities for both a large acoustic unit and an acoustic sub-unit. Each of these probabilities describes the likelihood of a set of feature vectors from a series of feature vectors representing a speech signal. The large acoustic unit is formed from a plurality of acoustic sub-units. At least one sub-unit probability and at least one large unit probability from the acoustic model are used by a decoder to generate a score for a sequence of hypothesized words. When combined, the acoustic sub-units associated with all of the sub-unit probabilities used to determine the score span fewer than all of the feature vectors in the series of feature vectors. An overlapping decoding technique is also provided.

4. Invention patent application

US20050166182A1 Distributed semantic schema (status: under examination, published)
Publication (announcement) number: US20050166182A1
Publication (announcement) date: 2005-07-28
Application number: US10847828
Filing date: 2004-05-18
Applicant: Kuansan Wang, Hsiao-Wuen Hon
Inventor: Kuansan Wang, Hsiao-Wuen Hon
IPC classification: G06F9/45, G06F9/44, G06F9/445, G06F9/46, G06F17/22, G06F17/27
CPC classification: G06F8/31, G06F8/41, G06F16/24522, G06F17/2247, G06F17/2785
Abstract: The present invention relates to a computer readable medium having instructions that, when implemented on a computer, cause the computer to process information. The instructions include a declarative logic module adapted to define a semantic object having at least one semantic slot and a procedural logic module adapted to define actions to be performed on the one semantic object with reference to the declarative logic module.

5. Invention patent application

US20050101355A1 Sequential multimodal input (status: lapsed)
Publication (announcement) number: US20050101355A1
Publication (announcement) date: 2005-05-12
Application number: US10705155
Filing date: 2003-11-11
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: G06F3/16, G06F3/038, G06F15/16, G06F17/00, H04M1/725, H04M3/42, H04M3/493, H04M7/00, H04M11/08, H04Q7/38, H04M1/00
CPC classification: G06F3/038, H04M1/72561, H04M3/4938, H04M7/0027, H04M2201/38, H04M2207/18, H04M2250/22, H04M2250/74
Abstract: A method of interacting with a client/server architecture with a 2G mobile phone is provided. The 2G phone includes a data channel for transmitting data and a voice channel for transmitting speech. The method includes receiving a web page from a web server pursuant to an application through the data channel and rendering the web page on the 2G phone. Speech is received from the user corresponding to at least one data field on the web page. A call is established from the 2G phone to a telephony server over the voice channel. The telephony server is remote from the 2G phone and is adapted to process speech. The telephony server obtains a speech-enabled web page from the web server corresponding to the web page provided to the 2G phone. Speech is transmitted from the 2G phone to the telephony server. The speech is processed in accordance with the speech-enabled web page to obtain textual data. The textual data is transmitted to the web server. The 2G phone obtains a new web page through the data channel and renders the new web page having the textual data.

6. Invention patent application

US20050101300A1 Sequential multimodal input (status: in force)
Publication (announcement) number: US20050101300A1
Publication (announcement) date: 2005-05-12
Application number: US10705019
Filing date: 2003-11-11
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: G06F3/16, G06F3/00, G06F13/00, G06F15/16, G06F17/30, H04M3/493, H04M11/00, H04Q7/38, H04M7/00
CPC classification: G06F17/30899, H04M3/4938
Abstract: A method of interacting with a client/server architecture with a 2.5G mobile phone having a data channel for transmitting data and a voice channel for transmitting speech. The method includes receiving a web page from a web server pursuant to an application through the data channel and rendering the web page on the 2.5G phone, where rendering comprises processing the web page to be responsive to speech input. Speech is received from the user corresponding to at least one data field on the web page. A call is established from the 2.5G phone to a telephony server over the voice channel. The telephony server is remote from the 2.5G phone and adapted to process speech. A speech-enabled web page is obtained from the web server corresponding to the web page provided to the 2.5G phone. Speech is transmitted from the 2.5G phone to the telephony server. The speech is processed in accordance with the speech-enabled web page to obtain textual data in accordance with the speech. The textual data is transmitted to the web server. A new web page is obtained on the 2.5G phone through the data channel and rendered having the textual data.

7. Granted invention patent

US06782362B1 Speech recognition method and apparatus utilizing segment models (status: lapsed)
Publication (announcement) number: US06782362B1
Publication (announcement) date: 2004-08-24
Application number: US09559509
Filing date: 2000-04-27
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: G10L15/00
CPC classification: G10L15/14
Abstract: A method and apparatus determine the likelihood of a sequence of words based in part on a segment model. The segment model includes trajectory expressions formed as the product of a polynomial matrix and a generation matrix. The likelihood of the sequence of words is based in part on a segment probability derived by subtracting the trajectory expressions from a feature vector matrix that contains a sequence of feature vectors for a segment of speech. Aspects of the method and apparatus also include training the segment model using such a segment probability.
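
Illustrative note (not part of the patent record): the abstract's core computation, a trajectory formed as the product of a polynomial matrix and a generation matrix and then subtracted from the feature vector matrix, might look roughly like the sketch below. The normalized time axis, the diagonal-Gaussian scoring of the residual, and all function names are assumptions made for illustration; the listing does not state how the patent derives the probability from the residual.

```python
import math


def polynomial_matrix(num_frames, order):
    """Rows are [1, t, t^2, ...] with t spread evenly over [0, 1] (assumed time axis)."""
    denom = max(num_frames - 1, 1)
    return [[(n / denom) ** r for r in range(order + 1)] for n in range(num_frames)]


def matmul(a, b):
    """Plain nested-list matrix product."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def segment_log_prob(features, generation, variances):
    """Log-probability of a segment under an assumed diagonal Gaussian residual model.

    features:   N x D feature vector matrix (one row per frame)
    generation: (R+1) x D generation matrix of polynomial coefficients
    variances:  D per-dimension variances (hypothetical parameter)
    """
    order = len(generation) - 1
    trajectory = matmul(polynomial_matrix(len(features), order), generation)
    log_prob = 0.0
    for frame, mean in zip(features, trajectory):
        for y, m, var in zip(frame, mean, variances):
            residual = y - m  # feature matrix minus trajectory expression
            log_prob += -0.5 * (math.log(2 * math.pi * var) + residual ** 2 / var)
    return log_prob


# A linear trajectory y = 2t fits these three one-dimensional frames exactly.
generation = [[0.0], [2.0]]
good = segment_log_prob([[0.0], [1.0], [2.0]], generation, [1.0])
bad = segment_log_prob([[0.0], [3.0], [2.0]], generation, [1.0])
```

Here `good` exceeds `bad` because the second feature sequence leaves a nonzero residual after the trajectory is subtracted.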

8. Granted invention patent

US07634404B2 Speech recognition method and apparatus utilizing segment models (status: lapsed)
Publication (announcement) number: US07634404B2
Publication (announcement) date: 2009-12-15
Application number: US10866934
Filing date: 2004-06-14
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: G10L15/00
CPC classification: G10L15/14
Abstract: A method and apparatus determine the likelihood of a sequence of words based in part on a segment model. The segment model includes trajectory expressions formed as the product of a polynomial matrix and a generation matrix. The likelihood of the sequence of words is based in part on a segment probability derived by subtracting the trajectory expressions from a feature vector matrix that contains a sequence of feature vectors for a segment of speech. Aspects of the method and apparatus also include training the segment model using such a segment probability.

9. Granted invention patent

US07610547B2 Markup language extensions for web enabled recognition (status: in force)
Publication (announcement) number: US07610547B2
Publication (announcement) date: 2009-10-27
Application number: US10117141
Filing date: 2002-04-05
Applicant: Kuansan Wang, Hsiao-Wuen Hon
Inventor: Kuansan Wang, Hsiao-Wuen Hon
IPC classification: G06F17/00, G10L21/00, H04M1/64
CPC classification: G06F3/167, G06F9/451, G06F17/218, G10L15/24, G10L15/30, H04M1/271, H04M1/72561, H04M3/493, H04M3/4936, H04M2207/40, H04M2250/74
Abstract: A markup language for execution on a client device in a client/server system includes extensions for recognition.

10. Granted invention patent

US07363027B2 Sequential multimodal input (status: in force)
Publication (announcement) number: US07363027B2
Publication (announcement) date: 2008-04-22
Application number: US10705019
Filing date: 2003-11-11
Applicant: Hsiao-Wuen Hon, Kuansan Wang
Inventor: Hsiao-Wuen Hon, Kuansan Wang
IPC classification: H04Q7/22
CPC classification: G06F17/30899, H04M3/4938
Abstract: A method of interacting with a client/server architecture with a 2.5G mobile phone having a data channel for transmitting data and a voice channel for transmitting speech. The method includes receiving a web page from a web server pursuant to an application through the data channel and rendering the web page on the 2.5G phone, where rendering comprises processing the web page to be responsive to speech input. Speech is received from the user corresponding to at least one data field on the web page. A call is established from the 2.5G phone to a telephony server over the voice channel. The telephony server is remote from the 2.5G phone and adapted to process speech. A speech-enabled web page is obtained from the web server corresponding to the web page provided to the 2.5G phone. Speech is transmitted from the 2.5G phone to the telephony server. The speech is processed in accordance with the speech-enabled web page to obtain textual data in accordance with the speech. The textual data is transmitted to the web server. A new web page is obtained on the 2.5G phone through the data channel and rendered having the textual data.
