    • 1. Granted patent
    • Content-based audio playback emphasis
    • US08768706B2
    • 2014-07-01
    • US12859883
    • 2010-08-20
    • Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
    • Kjell Schubert, Juergen Fritsch, Michael Finke, Detlef Koll
    • G10L21/00
    • G10L15/26, G06F17/273, G06F17/2785, G10L15/1807, G10L15/22, G10L15/265, G10L21/04
    • Techniques are disclosed for facilitating the process of proofreading draft transcripts of spoken audio streams. In general, proofreading of a draft transcript is facilitated by playing back the corresponding spoken audio stream with an emphasis on those regions in the audio stream that are highly relevant or likely to have been transcribed incorrectly. Regions may be emphasized by, for example, playing them back more slowly than regions that are of low relevance and likely to have been transcribed correctly. Emphasizing those regions of the audio stream that are most important to transcribe correctly and those regions that are most likely to have been transcribed incorrectly increases the likelihood that the proofreader will accurately correct any errors in those regions, thereby improving the overall accuracy of the transcript.
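The playback idea in the abstract above can be illustrated with a minimal sketch. It is not the patented method: the `Region` type, the relevance and confidence scores, the 0.5 threshold, and the two speed multipliers are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass
class Region:
    text: str          # draft transcript text for this stretch of audio
    duration_s: float  # length of the stretch at normal speed, in seconds
    relevance: float   # assumed score in [0, 1]; higher means more important
    confidence: float  # assumed recognizer confidence in [0, 1]


def playback_rate(region, threshold=0.5, emphasized=0.5, default=1.0):
    """Return a speed multiplier: slower for regions that deserve emphasis."""
    highly_relevant = region.relevance >= threshold
    likely_wrong = region.confidence < threshold
    return emphasized if (highly_relevant or likely_wrong) else default


def playback_plan(regions):
    """List (text, rate, seconds of playback) for each region, in order."""
    return [(r.text, playback_rate(r), r.duration_s / playback_rate(r))
            for r in regions]


# Hypothetical draft transcript split into three regions.
regions = [
    Region("patient was seen today", 2.0, relevance=0.2, confidence=0.9),
    Region("allergic to penicillin", 3.0, relevance=0.9, confidence=0.95),
    Region("follow up in [unclear]", 1.5, relevance=0.1, confidence=0.3),
]
for text, rate, seconds in playback_plan(regions):
    print(f"{rate:.2f}x  {seconds:.1f}s  {text}")
```

Only the first region plays at the default speed here; the second is slowed because it is highly relevant, and the third because the recognizer is unsure of it.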
    • 2. Granted patent
    • Applying service levels to transcripts
    • US08560314B2
    • 2013-10-15
    • US11766784
    • 2007-06-21
    • Detlef Koll, Michael Finke
    • Detlef Koll, Michael Finke
    • G10L15/26
    • G10L19/00, G06F17/211, G06F17/2785, G06F19/00, G06Q50/22, G06Q50/24, G10L15/02, G10L15/26, G16H10/20, G16H10/60, G16H15/00
    • Speech is transcribed to produce a draft transcript of the speech. Portions of the transcript having a high priority are identified. For example, particular sections of the transcript may be identified as high-priority sections. As another example, portions of the transcript requiring human verification may be identified as high-priority sections. High-priority portions of the transcript are verified at a first time, without verifying other portions of the transcript. Such other portions may or may not be verified at a later time. Limiting verification, either initially or entirely, to high-priority portions of the transcript limits the time required to perform such verification, thereby making it feasible to verify the most important portions of the transcript at an early stage without introducing an undue delay into the transcription process. Verifying the other portions of the transcript later ensures that early verification of the high-priority portions does not sacrifice overall verification accuracy.
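The two-stage verification described in the abstract above might be sketched as follows. The `Section` type, the set of high-priority section names, and the `needs_human_check` flag are hypothetical; the abstract does not say how priority is decided.

```python
from dataclasses import dataclass


@dataclass
class Section:
    name: str
    text: str
    needs_human_check: bool = False  # assumed flag set by an earlier step


# Hypothetical section names treated as high priority.
HIGH_PRIORITY_NAMES = {"impression", "medications", "allergies"}


def is_high_priority(section):
    """High priority: a named section, or any part flagged for human review."""
    return section.name.lower() in HIGH_PRIORITY_NAMES or section.needs_human_check


def split_by_priority(sections):
    """Return (verify_now, verify_later), preserving document order."""
    verify_now = [s for s in sections if is_high_priority(s)]
    verify_later = [s for s in sections if not is_high_priority(s)]
    return verify_now, verify_later


draft = [
    Section("History", "..."),
    Section("Allergies", "..."),
    Section("Plan", "...", needs_human_check=True),
    Section("Social history", "..."),
]
now, later = split_by_priority(draft)
print([s.name for s in now])
print([s.name for s in later])
```

The first list is verified immediately; the second may be verified later or not at all, which is the trade-off the abstract describes.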
    • 3. Granted patent
    • Document transcription system training
    • US08335688B2
    • 2012-12-18
    • US10922513
    • 2004-08-20
    • Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
    • Girija Yegnanarayanan, Michael Finke, Juergen Fritsch, Detlef Koll, Monika Woszczyna
    • G10L15/26, G10L15/18
    • G10L15/063, G10L15/193, G10L15/26
    • A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    • 4. Granted patent
    • Verification of extracted data
    • US08321199B2
    • 2012-11-27
    • US12771193
    • 2010-04-30
    • Detlef Koll, Michael Finke
    • Detlef Koll, Michael Finke
    • G06F17/27
    • G10L19/00, G06F17/211, G06F17/2785, G06F19/00, G06Q50/22, G06Q50/24, G10L15/02, G10L15/26, G16H10/20, G16H10/60, G16H15/00
    • Facts are extracted from speech and recorded in a document using codings. Each coding represents an extracted fact and includes a code and a datum. The code may represent a type of the extracted fact and the datum may represent a value of the extracted fact. The datum in a coding is rendered based on a specified feature of the coding. For example, the datum may be rendered as boldface text to indicate that the coding has been designated as an “allergy.” In this way, the specified feature of the coding (e.g., “allergy”-ness) is used to modify the manner in which the datum is rendered. A user inspects the rendering and provides, based on the rendering, an indication of whether the coding was accurately designated as having the specified feature. A record of the user's indication may be stored, such as within the coding itself.
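As a rough illustration of the coding, rendering, and verification loop in the abstract above, the sketch below assumes a `Coding` record with a `verified` field and uses `**...**` markers as a stand-in for boldface. Neither detail comes from the patent text.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Coding:
    code: str                       # type of the extracted fact, e.g. "allergy"
    datum: str                      # value of the extracted fact, e.g. "penicillin"
    verified: Optional[bool] = None  # user's verdict, stored in the coding itself


def render(coding, feature="allergy"):
    """Render the datum; mark it bold when the coding has the given feature."""
    return f"**{coding.datum}**" if coding.code == feature else coding.datum


def record_verdict(coding, accurate):
    """Store whether the user agreed that the coding has the feature."""
    coding.verified = accurate
    return coding


codings = [Coding("allergy", "penicillin"), Coding("medication", "aspirin")]
print(" ".join(render(c) for c in codings))

# The reviewer sees "penicillin" in bold and confirms it is an allergy.
record_verdict(codings[0], accurate=True)
```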
    • 5. Granted patent
    • Distributed speech recognition using one way communication
    • US08249878B2
    • 2012-08-21
    • US13196188
    • 2011-08-02
    • Eric Carraux, Detlef Koll
    • Eric Carraux, Detlef Koll
    • G10L15/22
    • G10L15/22, G10L15/30, G10L15/32
    • A speech recognition client sends a speech stream and control stream in parallel to a server-side speech recognizer over a network. The network may be an unreliable, low-latency network. The server-side speech recognizer recognizes a first portion of the speech stream and, if a predetermined criterion is satisfied by the speech recognition result, waits until the speech recognizer has been reconfigured before recognizing a second portion of the speech stream. The speech recognition client receives recognition results from the server-side recognizer in response to requests from the client. The client may remotely reconfigure the state of the server-side recognizer during recognition.
    • 6. Patent application
    • Document Editing Using Anchors
    • US20150112677A1
    • 2015-04-23
    • US14585539
    • 2014-12-30
    • MULTIMODAL TECHNOLOGIES, LLC
    • Kjell Schubert
    • G06F17/24, G10L15/01, G10L15/26
    • A user edits text in a draft document by providing input including left and right “anchor” text and replacement text. In response, a document editing system identifies an instance of the left anchor text followed by the right anchor text in the draft document, and replaces text between these instances with the replacement text specified by the user. For example, the user may type a string containing the left anchor text followed by the replacement text followed by the right anchor text, in response to which the system may perform the replacement just described. As a result, the user may specify both the location of, and a correction for, text in the draft document without using cursor keys or other navigation commands to navigate to the location of the text to be corrected, thereby increasing correction efficiency by avoiding the delay associated with such manual navigation.
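The anchor-based replacement in the abstract above can be sketched in a few lines. This version assumes the left anchor, replacement, and right anchor arrive as three separate strings, and it picks the first occurrence of the left anchor; splitting a single typed string into those three parts, as the abstract also mentions, is not attempted here. The function name is hypothetical.

```python
def apply_anchor_edit(document, left, replacement, right):
    """Replace the text between `left` and the next `right` with `replacement`.

    Assumption: the first occurrence of `left` is the intended one.
    """
    left_at = document.find(left)
    if left_at == -1:
        raise ValueError("left anchor not found")
    gap_start = left_at + len(left)
    gap_end = document.find(right, gap_start)
    if gap_end == -1:
        raise ValueError("right anchor not found after left anchor")
    return document[:gap_start] + replacement + document[gap_end:]


draft = "The patient has a history of hypertension and diabetes."
fixed = apply_anchor_edit(draft, "history of ", "hypotension", " and diabetes")
print(fixed)
```

The caller never moves a cursor: the anchors locate the edit, which is the efficiency gain the abstract points to.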
    • 7. Patent application
    • Discriminative Training of Document Transcription System
    • US20130166297A1
    • 2013-06-27
    • US13773928
    • 2013-02-22
    • MULTIMODAL TECHNOLOGIES, LLC
    • Lambert Mathias, Girija Yegnanarayanan, Juergen Fritsch
    • G10L15/06
    • G10L15/063, G06F17/271, G06F17/2775, G06F17/28, G10L15/02, G10L15/183, G10L15/193, G10L15/26, G10L2015/0631, G10L2015/0633
    • A system is provided for training an acoustic model for use in speech recognition. In particular, such a system may be used to perform training based on a spoken audio stream and a non-literal transcript of the spoken audio stream. Such a system may identify text in the non-literal transcript which represents concepts having multiple spoken forms. The system may attempt to identify the actual spoken form in the audio stream which produced the corresponding text in the non-literal transcript, and thereby produce a revised transcript which more accurately represents the spoken audio stream. The revised, and more accurate, transcript may be used to train the acoustic model using discriminative training techniques, thereby producing a better acoustic model than that which would be produced using conventional techniques, which perform training based directly on the original non-literal transcript.
    • 9. Granted patent
    • Audio signal de-identification
    • US08086458B2
    • 2011-12-27
    • US12258103
    • 2008-10-24
    • Michael Finke, Detlef Koll
    • Michael Finke, Detlef Koll
    • G06Q50/00, G10L15/26, G06F17/21
    • G10L15/1822, G06F19/00, G06Q50/22, G06Q50/24
    • Techniques are disclosed for automatically de-identifying spoken audio signals. In particular, techniques are disclosed for automatically removing personally identifying information from spoken audio signals and replacing such information with non-personally identifying information. De-identification of a spoken audio signal may be performed by automatically generating a report based on the spoken audio signal. The report may include concept content (e.g., text) corresponding to one or more concepts represented by the spoken audio signal. The report may also include timestamps indicating temporal positions of speech in the spoken audio signal that corresponds to the concept content. Concept content that represents personally identifying information is identified. Audio corresponding to the personally identifying concept content is removed from the spoken audio signal. The removed audio may be replaced with non-personally identifying audio.
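The de-identification steps in the abstract above can be sketched with plain Python lists standing in for audio samples. The `Concept` type, the `is_identifying` flag, and the use of a constant filler value as the replacement audio are assumptions for illustration; the abstract does not specify how identifying content is detected or what replaces it.

```python
from dataclasses import dataclass


@dataclass
class Concept:
    text: str             # concept content from the generated report
    start_s: float        # timestamp where the matching speech starts
    end_s: float          # timestamp where the matching speech ends
    is_identifying: bool  # assumed to be decided by an earlier step


def deidentify(samples, sample_rate, concepts, filler=0):
    """Return a copy of `samples` with identifying spans overwritten by `filler`."""
    out = list(samples)
    for concept in concepts:
        if not concept.is_identifying:
            continue
        first = max(0, int(concept.start_s * sample_rate))
        last = min(len(out), int(concept.end_s * sample_rate))
        for i in range(first, last):
            out[i] = filler
    return out


# Ten samples at 2 samples per second, i.e. five seconds of toy audio.
audio = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
report = [
    Concept("John Smith", 1.0, 2.0, is_identifying=True),
    Concept("chest pain", 3.0, 4.0, is_identifying=False),
]
print(deidentify(audio, sample_rate=2, concepts=report))
```

Only the span tied to the identifying concept is overwritten; the original list is left unchanged so the raw audio can be handled separately.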