Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

21. Invention application

US20080208586A1 Enabling Natural Language Understanding In An X+V Page Of A Multimodal Application
Legal status: Pending (published)
Publication No.: US20080208586A1
Publication date: 2008-08-28
Application No.: US11679292
Filing date: 2007-02-27
Applicant: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
Inventor: Soonthorn Ativanichayaphong, Charles W. Cross, Gerald M. McCobb
IPC classification: G10L21/00
CPC classification: G10L2015/228
Abstract: Enabling natural language understanding using an X+V page of a multimodal application implemented with a statistical language model (‘SLM’) grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating in a multimodal browser on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine through a VoiceXML interpreter, including: receiving, in the ASR engine from the multimodal application, a voice utterance; generating, by the ASR engine according to the SLM grammar, at least one recognition result for the voice utterance; determining, by an action classifier for the VoiceXML interpreter, an action identifier in dependence upon the recognition result, the action identifier specifying an action to be performed by the multimodal application; and interpreting, by the VoiceXML interpreter, the multimodal application in dependence upon the action identifier.

22. Invention application

US20080140410A1 ENABLING GRAMMARS IN WEB PAGE FRAME
Legal status: In force
Publication No.: US20080140410A1
Publication date: 2008-06-12
Application No.: US11567235
Filing date: 2006-12-06
Applicant: SOONTHORN ATIVANICHAYAPHONG, Charles W. Cross, Gerald M. McCobb
Inventor: SOONTHORN ATIVANICHAYAPHONG, Charles W. Cross, Gerald M. McCobb
IPC classification: H04M1/64, G10L21/00
CPC classification: H04M3/4938, G06F17/30896, G06F17/30902, G10L15/19, H04M2201/40
Abstract: Enabling grammars in web page frames, including receiving, in a multimodal application on a multimodal device, a frameset document, where the frameset document includes markup defining web page frames; obtaining by the multimodal application content documents for display in each of the web page frames, where the content documents include navigable markup elements; generating by the multimodal application, for each navigable markup element in each content document, a segment of markup defining a speech recognition grammar, including inserting in each such grammar markup identifying content to be displayed when words in the grammar are matched and markup identifying a frame where the content is to be displayed; and enabling by the multimodal application all the generated grammars for speech recognition.

23. Invention application

US20070288241A1 ORAL MODIFICATION OF AN ASR LEXICON OF AN ASR ENGINE
Legal status: In force
Publication No.: US20070288241A1
Publication date: 2007-12-13
Application No.: US11423711
Filing date: 2006-06-13
Applicant: Charles W. Cross, Frank L. Jania, James R. Lewis
Inventor: Charles W. Cross, Frank L. Jania, James R. Lewis
IPC classification: G10L21/00
CPC classification: G10L15/22, G10L15/06, G10L2015/0631
Abstract: Methods, apparatus, and computer program products are described for providing oral modification of an ASR lexicon of an ASR engine that include receiving, in the ASR engine from a user through a multimodal application, speech for recognition, where the ASR engine includes an ASR lexicon of words capable of recognition by the ASR engine, and the ASR lexicon does not contain at least one word of the speech for recognition; indicating by the ASR engine through the multimodal application to the user that the ASR lexicon does not contain the word; receiving by the ASR engine from the user through the multimodal application an oral instruction to add the word to the ASR lexicon, where the oral instruction is accompanied by an oral spelling of the word from the user; and executing the instruction by the ASR engine.
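
The flow in the abstract above (detect a word missing from the lexicon, then add it from an oral spelling) can be sketched roughly as follows. This is only an illustration under simplifying assumptions, not the patented implementation: `AsrLexicon`, `find_unknown_words`, and `add_spelled_word` are hypothetical names, and the speech input is assumed to have already been turned into text tokens and spelled letters.

```python
class AsrLexicon:
    """Hypothetical stand-in for the set of words an ASR engine can recognize."""

    def __init__(self, words):
        self._words = {w.lower() for w in words}

    def __contains__(self, word):
        return word.lower() in self._words

    def add(self, word):
        self._words.add(word.lower())


def find_unknown_words(lexicon, tokens):
    # Words the engine would have to report back to the user as missing.
    return [t for t in tokens if t not in lexicon]


def add_spelled_word(lexicon, spelled_letters):
    # Assumption: the oral spelling arrives as one letter per list item.
    word = "".join(letter.strip().lower() for letter in spelled_letters)
    lexicon.add(word)
    return word


lexicon = AsrLexicon(["call", "doctor"])
unknown = find_unknown_words(lexicon, ["call", "doctor", "nguyen"])
added = add_spelled_word(lexicon, ["N", "G", "U", "Y", "E", "N"])
```

After the spelled word is added, a second pass over the same tokens finds nothing missing.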

24. Invention application

US20120011443A1 ENABLING SPEECH WITHIN A MULTIMODAL PROGRAM USING MARKUP
Legal status: In force
Publication No.: US20120011443A1
Publication date: 2012-01-12
Application No.: US13237270
Filing date: 2011-09-20
Applicant: Charles W. Cross, Leslie R. Wilson, Steven G. Woodward
Inventor: Charles W. Cross, Leslie R. Wilson, Steven G. Woodward
IPC classification: G06F3/16
CPC classification: G10L15/22, G06F3/167, G10L15/26, H04M3/4938
Abstract: A method for speech enabling an application can include the step of specifying a speech input within a speech-enabled markup. The speech-enabled markup can also specify an application operation that is to be executed responsive to the detection of the speech input. After the speech input has been defined within the speech-enabled markup, the application can be instantiated. The specified speech input can then be detected and the application operation can be responsively executed in accordance with the specified speech-enabled markup.

25. Granted invention

US08055504B2 Synchronizing visual and speech events in a multimodal application
Legal status: In force
Publication No.: US08055504B2
Publication date: 2011-11-08
Application No.: US12061750
Filing date: 2008-04-03
Applicant: Charles W. Cross, Michael C. Hollinger, Igor R. Jablokov, David B. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
Inventor: Charles W. Cross, Michael C. Hollinger, Igor R. Jablokov, David B. Lewis, Hilary A. Pike, Daniel M. Smith, David W. Wintermute, Michael A. Zaitzeff
IPC classification: G10L11/00
CPC classification: G10L15/1815, G10L2021/105
Abstract: Exemplary methods, systems, and products are disclosed for synchronizing visual and speech events in a multimodal application, including receiving from a user speech; determining a semantic interpretation of the speech; calling a global application update handler; identifying, by the global application update handler, an additional processing function in dependence upon the semantic interpretation; and executing the additional function. Typical embodiments may include updating a visual element after executing the additional function. Typical embodiments may include updating a voice form after executing the additional function. Typical embodiments also may include updating a state table after updating the voice form. Typical embodiments also may include restarting the voice form after executing the additional function.

26. Invention application

US20080235021A1 Indexing Digitized Speech With Words Represented In The Digitized Speech
Legal status: In force
Publication No.: US20080235021A1
Publication date: 2008-09-25
Application No.: US11688331
Filing date: 2007-03-20
Applicant: Charles W. Cross, Frank L. Jania
Inventor: Charles W. Cross, Frank L. Jania
IPC classification: G10L15/00
CPC classification: G10L15/19, G10L15/183, G10L15/193, G10L15/197, G10L15/22, G10L21/06, G10L2015/228
Abstract: Indexing digitized speech with words represented in the digitized speech, with a multimodal digital audio editor operating on a multimodal device supporting modes of user interaction, the modes of user interaction including a voice mode and one or more non-voice modes, the multimodal digital audio editor operatively coupled to an ASR engine, including providing by the multimodal digital audio editor to the ASR engine digitized speech for recognition; receiving in the multimodal digital audio editor from the ASR engine recognized user speech including a recognized word, also including information indicating where, in the digitized speech, representation of the recognized word begins; and inserting by the multimodal digital audio editor the recognized word, in association with the information indicating where, in the digitized speech, representation of the recognized word begins, into a speech recognition grammar, the speech recognition grammar voice enabling user interface commands of the multimodal digital audio editor.
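
One way to picture the indexing step described in the abstract above is a mapping from each recognized word to the positions where it begins in the audio, which a voice command can later use to seek. The sketch below is a simplified illustration only: the `RecognizedWord` type, the millisecond offsets, and the dictionary standing in for a speech recognition grammar are all assumptions, not details taken from the patent.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RecognizedWord:
    text: str       # word returned by the (hypothetical) ASR engine
    start_ms: int   # where the word begins in the digitized speech


def build_index_grammar(words):
    # Map each word to every offset at which it starts, in input order.
    grammar = {}
    for word in words:
        grammar.setdefault(word.text.lower(), []).append(word.start_ms)
    return grammar


def seek_position(grammar, spoken_word, occurrence=0):
    # Resolve a spoken word to an audio offset, or None if it is not indexed.
    starts = grammar.get(spoken_word.lower())
    if not starts or not 0 <= occurrence < len(starts):
        return None
    return starts[occurrence]


recognized = [
    RecognizedWord("budget", 1200),
    RecognizedWord("meeting", 1850),
    RecognizedWord("budget", 9400),
]
grammar = build_index_grammar(recognized)
```

With this index, `seek_position(grammar, "budget", occurrence=1)` resolves to the second place the word is spoken, which an editor could use to move its playback cursor.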

27. Invention application

US20080208594A1 Effecting Functions On A Multimodal Telephony Device
Legal status: Pending (published)
Publication No.: US20080208594A1
Publication date: 2008-08-28
Application No.: US11679312
Filing date: 2007-02-27
Applicant: Charles W. Cross, Frank L. Jania, Darren M. Shaw
Inventor: Charles W. Cross, Frank L. Jania, Darren M. Shaw
IPC classification: G10L11/00
CPC classification: G10L15/26
Abstract: Methods, apparatus, and computer program products are described for effecting functions on a multimodal telephony device, implemented with the multimodal application operating on a multimodal telephony device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to an automated speech recognition engine. Embodiments include receiving the speech of a telephone call; identifying with the automated speech recognition engine action keywords in the speech of the telephone call; selecting a function of the multimodal telephony device in dependence upon the action keywords; identifying parameters for the function of the multimodal telephony device; and executing the function of the multimodal telephony device using the identified parameters.
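
The keyword-to-function dispatch described in the abstract above might look roughly like this. It is a minimal sketch under stated assumptions: the call speech is assumed to be already transcribed to text, the keyword table and the `dial` and `add_reminder` device functions are hypothetical, and the parameters are naively taken to be the words that follow the keyword.

```python
def dial(params):
    # Hypothetical device function: place a call to the given number.
    return "dialing " + " ".join(params)


def add_reminder(params):
    # Hypothetical device function: store a reminder text.
    return "reminder: " + " ".join(params)


# Assumed mapping from action keywords to device functions.
ACTION_KEYWORDS = {"call": dial, "remind": add_reminder}


def select_function(transcript, keyword_table):
    # Return the function for the first action keyword and the words after it.
    tokens = transcript.lower().split()
    for index, token in enumerate(tokens):
        if token in keyword_table:
            return keyword_table[token], tokens[index + 1:]
    return None, []


def effect_function(transcript, keyword_table=ACTION_KEYWORDS):
    # Select a function from the transcript and execute it with its parameters.
    function, params = select_function(transcript, keyword_table)
    if function is None:
        return None
    return function(params)


result = effect_function("Please call 555 0100")
```

A transcript with no action keyword, such as "nice weather today", selects no function and returns `None`.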

28. Invention application

US20080208593A1 Altering Behavior Of A Multimodal Application Based On Location
Legal status: In force
Publication No.: US20080208593A1
Publication date: 2008-08-28
Application No.: US11679301
Filing date: 2007-02-27
Applicant: Soonthorn Ativanichayaphong, Charles W. Cross, Igor R. Jablokov, Gerald M. McCobb
Inventor: Soonthorn Ativanichayaphong, Charles W. Cross, Igor R. Jablokov, Gerald M. McCobb
IPC classification: G10L21/00
CPC classification: G10L15/22, G10L15/24
Abstract: Methods, apparatus, and products are disclosed for altering behavior of a multimodal application based on location. The multimodal application operates on a multimodal device supporting multiple modes of user interaction with the multimodal application, including a voice mode and one or more non-voice modes. The voice mode of user interaction with the multimodal application is supported by a voice interpreter. Altering behavior of a multimodal application based on location includes: receiving a location change notification in the voice interpreter from a device location manager, the device location manager operatively coupled to a position detection component of the multimodal device, the location change notification specifying a current location of the multimodal device; updating, by the voice interpreter, location-based environment parameters for the voice interpreter in dependence upon the current location of the multimodal device; and interpreting, by the voice interpreter, the multimodal application in dependence upon the location-based environment parameters.

29. Invention application

US20080208589A1 Presenting Supplemental Content For Digital Media Using A Multimodal Application
Legal status: Pending (published)
Publication No.: US20080208589A1
Publication date: 2008-08-28
Application No.: US11679225
Filing date: 2007-02-27
Applicant: Charles W. Cross, Brian D. Goodman, Frank L. Jania, Darren M. Shaw
Inventor: Charles W. Cross, Brian D. Goodman, Frank L. Jania, Darren M. Shaw
IPC classification: G10L21/00
CPC classification: H04N21/8543, G10L15/26, H04N5/45, H04N7/17318, H04N21/2368, H04N21/4126, H04N21/41407, H04N21/4143, H04N21/42203, H04N21/42204, H04N21/4316, H04N21/4341, H04N21/43615, H04N21/47, H04N21/4722, H04N21/8106
Abstract: Presenting supplemental content for digital media using a multimodal application, implemented with a grammar of the multimodal application in an automatic speech recognition (‘ASR’) engine, with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application operatively coupled to the ASR engine, includes: rendering, by the multimodal application, a portion of the digital media; receiving, by the multimodal application, a voice utterance from a user; determining, by the multimodal application using the ASR engine, a recognition result in dependence upon the voice utterance and the grammar; identifying, by the multimodal application, supplemental content for the rendered portion of the digital media in dependence upon the recognition result; and rendering, by the multimodal application, the supplemental content.

30. Invention application

US20080208584A1 Pausing A VoiceXML Dialog Of A Multimodal Application
Legal status: In force
Publication No.: US20080208584A1
Publication date: 2008-08-28
Application No.: US11679236
Filing date: 2007-02-27
Applicant: Soonthorn Ativanichayaphong, Charles W. Cross, David Jaramillo, Gerald M. McCobb
Inventor: Soonthorn Ativanichayaphong, Charles W. Cross, David Jaramillo, Gerald M. McCobb
IPC classification: G10L13/00, G10L11/00
CPC classification: G10L17/10, G10L15/00, G10L15/22, G10L17/06, G10L2021/02168
Abstract: Pausing a VoiceXML dialog of a multimodal application, including generating by the multimodal application a pause event; responsive to the pause event, temporarily pausing the dialog by the VoiceXML interpreter; generating by the multimodal application a resume event; and responsive to the resume event, resuming the dialog. Embodiments are implemented with the multimodal application operating on a multimodal device supporting multiple modes of interaction including a voice mode and one or more non-voice modes, the multimodal application is operatively coupled to a VoiceXML interpreter, and the VoiceXML interpreter is interpreting the VoiceXML dialog to be paused.
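
The pause and resume behavior described in the abstract above can be illustrated with a toy interpreter that stops advancing through its prompts while paused and continues from the same place on resume. This is a hedged sketch only: `DialogInterpreter` and its string event names are hypothetical and do not correspond to any real VoiceXML interpreter API.

```python
class DialogInterpreter:
    """Toy stand-in for an interpreter stepping through dialog prompts."""

    def __init__(self, prompts):
        self._prompts = list(prompts)
        self._position = 0
        self.paused = False

    def handle_event(self, event):
        # Assumed event names; a real interpreter would define its own.
        if event == "pause":
            self.paused = True
        elif event == "resume":
            self.paused = False
        else:
            raise ValueError(f"unknown event: {event}")

    def step(self):
        # Return the next prompt, or None while paused or when finished.
        if self.paused or self._position >= len(self._prompts):
            return None
        prompt = self._prompts[self._position]
        self._position += 1
        return prompt


dialog = DialogInterpreter(["Say your name.", "Say your city."])
first = dialog.step()
dialog.handle_event("pause")
while_paused = dialog.step()
dialog.handle_event("resume")
second = dialog.step()
```

Because the position is kept while paused, the dialog resumes at the second prompt rather than restarting.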
