Quick Patent Search - Fast search of global patents, free commercial patent database - IPRDB

1. Granted invention patent

US08768706B2 Content-based audio playback emphasis (in force)
Publication number: US08768706B2
Publication date: 2014-07-01
Application number: US12859883
Filing date: 2010-08-20
Applicants: Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
Inventors: Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
IPC classification: G10L21/00
CPC classification: G10L15/26, G06F17/273, G06F17/2785, G10L15/1807, G10L15/22, G10L15/265, G10L21/04
Abstract: Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
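
The emphasis idea in this abstract can be illustrated with a minimal sketch. It is not the patented method: the `Region` class, the relevance and confidence scores, and the thresholds and rates are all hypothetical choices made for illustration. Regions that are highly relevant or likely mis-transcribed are simply played back more slowly.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Region:
    start: float       # seconds into the audio stream
    end: float         # seconds into the audio stream
    relevance: float   # assumed score, 0.0 (unimportant) to 1.0 (critical)
    confidence: float  # assumed recognizer confidence, 0.0 to 1.0


def playback_rate(region: Region,
                  slow: float = 0.75,
                  normal: float = 1.0,
                  relevance_threshold: float = 0.7,
                  confidence_threshold: float = 0.6) -> float:
    """Return a speed multiplier; values below 1.0 mean slower playback."""
    emphasized = (region.relevance >= relevance_threshold
                  or region.confidence < confidence_threshold)
    return slow if emphasized else normal


def playback_duration(regions: List[Region]) -> float:
    """Total listening time once each region is played at its own rate."""
    return sum((r.end - r.start) / playback_rate(r) for r in regions)


regions = [
    Region(0.0, 3.0, relevance=0.9, confidence=0.95),  # highly relevant
    Region(3.0, 6.0, relevance=0.2, confidence=0.95),  # routine, well recognized
    Region(6.0, 9.0, relevance=0.2, confidence=0.40),  # likely mis-transcribed
]
rates = [playback_rate(r) for r in regions]
total = playback_duration(regions)
```

With these assumed thresholds, the first and third regions are slowed down while the second plays at normal speed.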

2. Granted invention patent

US08560314B2 Applying service levels to transcripts (in force)
Publication number: US08560314B2
Publication date: 2013-10-15
Application number: US11766784
Filing date: 2007-06-21
Applicants: Detlef Koll, Michael Finke
Inventors: Detlef Koll, Michael Finke
IPC classification: G10L15/26
CPC classification: G10L19/00, G06F17/211, G06F17/2785, G06F19/00, G06Q50/22, G06Q50/24, G10L15/02, G10L15/26, G16H10/20, G16H10/60, G16H15/00
Abstract: Speech is transcribed to produce a draft transcript of the speech. Portions of the transcript having a high priority are identified. For example, particular sections of the transcript may be identified as high-priority sections. As another example, portions of the transcript requiring human verification may be identified as high-priority sections. High-priority portions of the transcript are verified at a first time, without verifying other portions of the transcript. Such other portions may or may not be verified at a later time. Limiting verification, either initially or entirely, to high-priority portions of the transcript limits the time required to perform such verification, thereby making it feasible to verify the most important portions of the transcript at an early stage without introducing an undue delay into the transcription process. Verifying the other portions of the transcript later ensures that early verification of the high-priority portions does not sacrifice overall verification accuracy.
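
The two-stage verification described in this abstract can be sketched as a simple partition of transcript sections. This is only an illustration under stated assumptions: the `Section` class, the `needs_human_review` flag, and the section names are hypothetical and do not come from the patent.

```python
from dataclasses import dataclass
from typing import List, Set, Tuple


@dataclass
class Section:
    name: str
    text: str
    needs_human_review: bool = False  # assumed flag set by an earlier step


def split_by_priority(sections: List[Section],
                      high_priority_names: Set[str]
                      ) -> Tuple[List[Section], List[Section]]:
    """Return (verify_now, verify_later), preserving document order."""
    verify_now, verify_later = [], []
    for section in sections:
        is_high = (section.name in high_priority_names
                   or section.needs_human_review)
        (verify_now if is_high else verify_later).append(section)
    return verify_now, verify_later


draft = [
    Section("Allergies", "penicillin"),
    Section("Social history", "non-smoker"),
    Section("Plan", "follow up in two weeks", needs_human_review=True),
]
now, later = split_by_priority(draft, high_priority_names={"Allergies"})
```

Here `now` holds the sections verified in the first pass, and `later` holds those that may or may not be verified afterwards.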

3. Granted invention patent

US08335688B2 Document transcription system training (in force)
Publication number: US08335688B2
Publication date: 2012-12-18
Application number: US10922513
Filing date: 2004-08-20
Applicants: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
Inventors: Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
IPC classification: G10L15/26, G10L15/18
CPC classification: G10L15/063, G10L15/193, G10L15/26
Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

4. Granted invention patent

US08321199B2 Verification of extracted data (in force)
Publication number: US08321199B2
Publication date: 2012-11-27
Application number: US12771193
Filing date: 2010-04-30
Applicants: Detlef Koll, Michael Finke
Inventors: Detlef Koll, Michael Finke
IPC classification: G06F17/27
CPC classification: G10L19/00, G06F17/211, G06F17/2785, G06F19/00, G06Q50/22, G06Q50/24, G10L15/02, G10L15/26, G16H10/20, G16H10/60, G16H15/00
Abstract: Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
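
The coding structure in this abstract (a code, a datum, a feature that drives rendering, and a stored user indication) can be sketched as follows. The class and field names, and the use of `**...**` to stand in for boldface, are assumptions made for illustration rather than details from the patent.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coding:
    code: str                        # type of the extracted fact
    datum: str                       # value of the extracted fact
    is_allergy: bool = False         # the "specified feature" in this sketch
    verified: Optional[bool] = None  # user's indication, kept in the coding


def render(coding: Coding) -> str:
    """Render the datum; the allergy feature changes how it is displayed."""
    return f"**{coding.datum}**" if coding.is_allergy else coding.datum


def record_verification(coding: Coding, designation_is_correct: bool) -> Coding:
    """Store the user's judgement of whether the feature was set correctly."""
    coding.verified = designation_is_correct
    return coding


allergy = Coding(code="substance", datum="penicillin", is_allergy=True)
other = Coding(code="substance", datum="ibuprofen")
shown = [render(allergy), render(other)]
record_verification(allergy, designation_is_correct=True)
```

Keeping `verified` as `None` until the user responds distinguishes "not yet reviewed" from "reviewed and rejected".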

5. Granted invention patent

US08249878B2 Distributed speech recognition using one way communication (in force)
Publication number: US08249878B2
Publication date: 2012-08-21
Application number: US13196188
Filing date: 2011-08-02
Applicants: Eric Carraux, Detlef Koll
Inventors: Eric Carraux, Detlef Koll
IPC classification: G10L15/22
CPC classification: G10L15/22, G10L15/30, G10L15/32
Abstract: A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.

6. Invention patent application

US20150112677A1 Document Editing Using Anchors (pending, published)
Publication number: US20150112677A1
Publication date: 2015-04-23
Application number: US14585539
Filing date: 2014-12-30
Applicant: MULTIMODAL TECHNOLOGIES, LLC
Inventor: Kjell Schubert
IPC classification: G06F17/24, G10L15/01, G10L15/26
Abstract: A user edits text in a draft document by providing input including left and right “anchor” text and replacement text. In response, a document editing system identifies an instance of the left anchor text followed by the right anchor text in the draft document, and replaces text between these instances with the replacement text specified by the user. For example, the user may type a string containing the left anchor text followed by the replacement text followed by the right anchor text, in response to which the system may perform the replacement just described. As a result, the user may specify both the location of, and a correction for, text in the draft document without using cursor keys or other navigation commands to navigate to the location of the text to be corrected, thereby increasing correction efficiency by avoiding the delay associated with such manual navigation.
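
The anchor-based replacement in this abstract can be illustrated with a short sketch. It assumes the left anchor, replacement, and right anchor have already been separated from the user's input, and that the first matching pair of anchors is the intended one; the function name and these simplifications are hypothetical, not taken from the application.

```python
from typing import Optional


def replace_between_anchors(draft: str, left: str,
                            replacement: str, right: str) -> Optional[str]:
    """Replace the text between the first `left` anchor and the next `right` anchor.

    Returns the edited draft, or None when the anchors are not found in order.
    """
    left_pos = draft.find(left)
    if left_pos == -1:
        return None
    between_start = left_pos + len(left)
    right_pos = draft.find(right, between_start)
    if right_pos == -1:
        return None
    # Keep both anchors in place and swap only the text between them.
    return draft[:between_start] + replacement + draft[right_pos:]


draft = "the patient has a history of diabetes mellitus type one"
edited = replace_between_anchors(draft, "history of ", "hypertension", " type")
missing = replace_between_anchors(draft, "no such anchor", "x", " type")
```

The user never moves a cursor: the anchors alone locate the span to correct. How whitespace around the anchors is treated is a design choice this sketch leaves to the caller.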

7. Invention patent application

US20130166297A1 Discriminative Training of Document Transcription System (in force)
Publication number: US20130166297A1
Publication date: 2013-06-27
Application number: US13773928
Filing date: 2013-02-22
Applicant: MULTIMODAL TECHNOLOGIES, LLC
Inventors: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
IPC classification: G10L15/06
CPC classification: G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633
Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

8. Granted invention patent

US08412521B2 Discriminative training of document transcription system (in force)
Publication number: US08412521B2
Publication date: 2013-04-02
Application number: US11228607
Filing date: 2005-09-16
Applicants: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
Inventors: Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
IPC classification: G10L15/26, G10L15/18
CPC classification: G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633
Abstract: A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.

9. Granted invention patent

US08086458B2 Audio signal de-identification (in force)
Publication number: US08086458B2
Publication date: 2011-12-27
Application number: US12258103
Filing date: 2008-10-24
Applicants: Michael Finke, Detlef Koll
Inventors: Michael Finke, Detlef Koll
IPC classification: G06Q50/00, G10L15/26, G06F17/21
CPC classification: G10L15/1822, G06F19/00, G06Q50/22, G06Q50/24
Abstract: Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
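
The timestamp-driven removal in this abstract can be sketched with audio modelled as a plain list of samples. The `Concept` class, the `is_pii` flag, and the constant filler value are hypothetical simplifications; how personally identifying concepts are detected is outside the scope of this sketch.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Concept:
    text: str
    start: float  # seconds, from the report's timestamps
    end: float    # seconds, from the report's timestamps
    is_pii: bool  # assumed to be decided by an earlier classification step


def deidentify(samples: List[int], sample_rate: int,
               concepts: List[Concept], filler: int = 0) -> List[int]:
    """Return a copy of `samples` with PII spans overwritten by `filler`."""
    out = list(samples)
    for concept in concepts:
        if not concept.is_pii:
            continue
        first = max(0, int(concept.start * sample_rate))
        last = min(len(out), int(concept.end * sample_rate))
        for i in range(first, last):
            out[i] = filler  # stand-in for non-identifying replacement audio
    return out


audio = [1] * 10  # 5 seconds at a toy rate of 2 samples per second
report = [
    Concept("John Smith", start=1.0, end=2.5, is_pii=True),
    Concept("fever", start=3.0, end=4.0, is_pii=False),
]
clean = deidentify(audio, sample_rate=2, concepts=report)
```

Only the span aligned with the personally identifying concept is overwritten; the input list is left unchanged.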

10. Invention patent application

US20150095025A1 Decoding-Time Prediction of Non-Verbalized Tokens (in force)
Publication number: US20150095025A1
Publication date: 2015-04-02
Application number: US14571697
Filing date: 2014-12-16
Applicant: Multimodal Technologies, LLC
Inventors: Juergen Fritsch, Anoop Deoras, Detlef Koll
IPC classification: G10L19/00, G10L15/06
CPC classification: G10L19/0018, G10L15/063, G10L15/197, G10L15/26, G10L15/265
Abstract: Non-verbalized tokens, such as punctuation, are automatically predicted and inserted into a transcription of speech in which the tokens were not explicitly verbalized. Token prediction may be integrated with speech decoding, rather than performed as a post-process to speech decoding.
