Quick Patent Search - Fast Global Patent Search, Free Commercial Patent Database - IPRDB

1. Granted invention patent

US06704709B1 System and method for improving the accuracy of a speech recognition program (Status: lapsed)
Publication No.: US06704709B1
Publication date: 2004-03-09
Application No.: US09625657
Filing date: 2000-07-26
Applicant: Jonathan Kahn, Thomas P. Flynn, Charles Qin, Nicholas A. Linden
Inventor: Jonathan Kahn, Thomas P. Flynn, Charles Qin, Nicholas A. Linden
IPC classification: G01L15/26
CPC classification: G10L15/26, G10L2015/0631
Abstract: A system and method for improving the accuracy of a speech recognition program. The system is based on a speech recognition program that automatically converts a pre-recorded audio file into a written text. The system parses the written text into segments, each of which can be corrected by the system and saved in a retrievable manner in association with the computer. The standard speech files are saved towards improving accuracy in speech-to-text conversion by the speech recognition program. The system further includes facilities to repetitively establish an independent instance of the written text from the pre-recorded audio file using the speech recognition program. This independent instance can then be broken into segments and each erroneous segment in said independent instance replaced with the corrected segment associated with that segment. In this manner, repetitive instruction of a speech recognition program can be facilitated.
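The segment-replacement step in this abstract can be illustrated with a minimal sketch. It assumes segments are identified by their position in the text and that corrections were saved in a dictionary keyed by that position; the function name, the data layout, and the sample sentences are hypothetical and do not come from the patent.

```python
def apply_saved_corrections(fresh_segments, corrected_segments):
    """Return a copy of fresh_segments with saved corrections substituted.

    corrected_segments maps a segment index to its corrected text. Indices
    without an entry are assumed to have been recognized correctly.
    """
    return [corrected_segments.get(i, seg) for i, seg in enumerate(fresh_segments)]


# A fresh recognition pass over the same audio (hypothetical sample data).
fresh = ["the patient has", "a hart murmur", "and mild fever"]
# Correction saved from an earlier pass for segment 1.
saved = {1: "a heart murmur"}
repaired = apply_saved_corrections(fresh, saved)
```

Keying corrections by segment position is only one possible choice; the abstract says corrected segments are saved "in a retrievable manner" without specifying the key.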

2. Granted invention patent

US06421643B1 Method and apparatus for directing an audio file to a speech recognition program that does not accept such files (Status: lapsed)
Publication No.: US06421643B1
Publication date: 2002-07-16
Application No.: US09430144
Filing date: 1999-10-29
Applicant: Jonathan Kahn, Charles Qin, Nicholas A. Linden, James A. Sells
Inventor: Jonathan Kahn, Charles Qin, Nicholas A. Linden, James A. Sells
IPC classification: G10L21/06
CPC classification: G10L15/26, G10L2015/0631
Abstract: The present invention relates to a method and apparatus for directing a pre-recorded audio file to a speech recognition program that does not normally accept such files, such as IBM Corporation's Via Voice™ speech recognition program. The method includes: (a) launching the speech recognition program to accept speech as if the speech recognition program were receiving live audio from a microphone; (b) finding a mixer utility associated with the sound card; (c) opening the mixer utility, the mixer utility having settings that determine an input source and an output path; (d) changing the settings of the mixer utility to specify a line-in input source and a wave-out output path; (e) activating a microphone input of the speech recognition software; and (f) initiating a media player associated with the computer to play the pre-recorded audio file into the line-in input source. The method may additionally save and restore the original configuration settings of the mixer utility.
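The save, change, and restore handling of the mixer settings (step (d) plus the final sentence of the abstract) can be sketched as a context manager. This is only an illustration: a plain dictionary stands in for the mixer utility, and the key names `input_source` and `output_path` are hypothetical, not an actual sound-card API.

```python
from contextlib import contextmanager


@contextmanager
def line_in_routing(mixer: dict):
    """Temporarily route the (stand-in) mixer to line-in / wave-out."""
    saved = dict(mixer)                  # save the original configuration
    mixer["input_source"] = "line-in"    # step (d): line-in input source
    mixer["output_path"] = "wave-out"    # step (d): wave-out output path
    try:
        yield mixer                      # steps (e) and (f) would run here
    finally:
        mixer.clear()                    # restore the original configuration
        mixer.update(saved)


mixer = {"input_source": "microphone", "output_path": "speakers"}
with line_in_routing(mixer) as active:
    during = dict(active)                # snapshot of settings while routed
```

Restoring in a `finally` clause means the original settings come back even if playback fails partway through.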

3. Granted invention patent

US07006967B1 System and method for automating transcription services (Status: lapsed)
Publication No.: US07006967B1
Publication date: 2006-02-28
Application No.: US09889870
Filing date: 2000-02-04
Applicant: Jonathan Kahn, Charles Qin, Thomas P. Flynn, Robert J. Tippe
Inventor: Jonathan Kahn, Charles Qin, Thomas P. Flynn, Robert J. Tippe
IPC classification: G10L15/26
CPC classification: G10L15/075, G10L15/26, G10L15/30
Abstract: A system for substantially automating transcription services for multiple users (10, 11, 12) including a manual transcription station (50), speech recognition program (40) and a routing program (200). A uniquely identified voice dictation file is generated from a user and, based on the training status, the system routes the voice dictation file to a manual transcription station and speech recognition program. A human transcriptionist creates transcribed files for each voice dictation file. The speech recognition program creates written text for each dictation file if the training status is training or automated. If the training status of the current user is enrollment or training, a verbatim file is manually established and the speech recognition program is trained with an acoustic model using the verbatim and voice dictation files. The transcribed file is returned to the user if the training status is enrollment or training, or the written text is returned if the status is automated.

4. Granted invention patent

US6122614A System and method for automating transcription services (Status: lapsed)
Publication No.: US6122614A
Publication date: 2000-09-19
Application No.: US197313
Filing date: 1998-11-20
Applicant: Jonathan Kahn, Thomas P. Flynn, Charles Qin, Robert J. Tippe
Inventor: Jonathan Kahn, Thomas P. Flynn, Charles Qin, Robert J. Tippe
IPC classification: G10L15/06, G10L15/22, G10L15/26, G01L15/18, G01L15/26
CPC classification: G10L15/063, G10L2015/0631
Abstract: A system for substantially automating transcription services for multiple voice users including a manual transcription station, a speech recognition program and a routing program. The system establishes a profile for each of the voice users containing a training status which is selected from the group of enrollment, training, automated and stop automation. When the system receives a voice dictation file from a current voice user, it routes the voice dictation file, based on the training status, to a manual transcription station and the speech recognition program. A human transcriptionist creates transcribed files for each received voice dictation file. The speech recognition program automatically creates a written text for each received voice dictation file if the training status of the current user is training or automated. A verbatim file is manually established if the training status of the current user is enrollment or training, and the speech recognition program is trained with an acoustic model for the current user using the verbatim file and the voice dictation file if the training status of the current user is enrollment or training. The transcribed file is returned to the current user if the training status of the current user is enrollment or training, or the written text is returned if the training status of the current user is automated. An apparatus and method are also disclosed for simplifying the manual establishment of the verbatim file. A method for substantially automating transcription services is also disclosed.
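The status-dependent behavior described in this abstract can be summarized as a small decision table. The sketch below encodes only the rules the abstract states explicitly; all class, field, and function names are hypothetical. The abstract does not say what happens under "stop automation", so returning the transcribed file in that case is an assumption made here, not something the patent confirms.

```python
from dataclasses import dataclass
from enum import Enum


class TrainingStatus(Enum):
    ENROLLMENT = "enrollment"
    TRAINING = "training"
    AUTOMATED = "automated"
    STOP_AUTOMATION = "stop automation"


@dataclass(frozen=True)
class RoutingPlan:
    run_speech_recognition: bool    # written text is created automatically
    build_verbatim_and_train: bool  # verbatim file made, acoustic model trained
    returned_to_user: str           # "transcribed file" or "written text"


def plan_for(status: TrainingStatus) -> RoutingPlan:
    in_training = status in (TrainingStatus.ENROLLMENT, TrainingStatus.TRAINING)
    if status is TrainingStatus.AUTOMATED:
        returned = "written text"
    else:
        # Enrollment and training follow the abstract. STOP_AUTOMATION is an
        # ASSUMPTION: the abstract does not describe that case.
        returned = "transcribed file"
    return RoutingPlan(
        run_speech_recognition=status
        in (TrainingStatus.TRAINING, TrainingStatus.AUTOMATED),
        build_verbatim_and_train=in_training,
        returned_to_user=returned,
    )
```

For example, `plan_for(TrainingStatus.TRAINING)` runs the recognizer, builds a verbatim file for training, and still returns the human transcription to the user.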

5. Granted invention patent

US06490558B1 System and method for improving the accuracy of a speech recognition program through repetitive training (Status: in force)
Publication No.: US06490558B1
Publication date: 2002-12-03
Application No.: US09362255
Filing date: 1999-07-28
Applicant: Jonathan Kahn, Thomas P. Flynn, Charles Qin
Inventor: Jonathan Kahn, Thomas P. Flynn, Charles Qin
IPC classification: G10L15/26
CPC classification: G10L15/26, G10L2015/0631
Abstract: A system and method for quickly improving the accuracy of a speech recognition program. The system is based on a speech recognition program that automatically converts a pre-recorded audio file into a written text. The system parses the written text into segments, each of which is corrected by the system and saved in an individually retrievable manner in association with the computer. The standard speech files are saved towards improving accuracy in speech-to-text conversion by the speech recognition program. The system further includes facilities to repetitively establish an independent instance of the written text from the pre-recorded audio file using the speech recognition program. This independent instance can then be broken into segments and each segment in said independent instance replaced with an individually retrievable saved corrected segment associated with that segment. In this manner, repetitive instruction of a speech recognition program can be facilitated.

6. Granted invention patent

US06961699B1 Automated transcription system and method using two speech converting instances and computer-assisted correction (Status: in force)
Publication No.: US06961699B1
Publication date: 2005-11-01
Application No.: US09889398
Filing date: 2000-02-18
Applicant: Jonathan Kahn, Charles Qin, Thomas P. Flynn
Inventor: Jonathan Kahn, Charles Qin, Thomas P. Flynn
IPC classification: G06F17/27, G10L15/26, G10L21/06
CPC classification: G06F17/273, G10L15/26
Abstract: A system for automating transcription services for one or more users. This system receives a voice dictation file from a current user, which is automatically converted into a first written text based on a set of conversion variables. The same voice dictation file is automatically converted into a second written text based on a second set of conversion variables. The first and second sets of conversion variables have at least one difference, such as different speech recognition programs, different vocabularies, and the like. The system further includes a program for manually editing a copy of the first and second written text to create a verbatim text of the voice dictation file. This verbatim text can be delivered to the current user as transcribed text. The verbatim text can also be fed back into each speech recognition instance to improve the accuracy of each instance with respect to the human voice in the file.

7. Granted invention patent

US07120581B2 System and method for identifying an identical audio segment using text comparison (Status: in force)
Publication No.: US07120581B2
Publication date: 2006-10-10
Application No.: US10276382
Filing date: 2001-05-31
Applicant: Jonathan Kahn, Thomas P. Flynn
Inventor: Jonathan Kahn, Thomas P. Flynn
IPC classification: G10L13/08
CPC classification: G06F17/2211
Abstract: A method for comparing text in a first file to text in a second file. The method includes segmenting text in the first and second files to one word per line; comparing the segmented versions of the first and second files on a line by line basis; creating a result file using the segmented version of the first file; and augmenting the result file with indication of error using a sandwiching technique. This sandwiching technique includes identifying correct segments that are immediately adjacent to any differences identified by comparing the segmented versions of the first and second files on a line by line basis toward sandwiching the erroneous segments between correct segments. Said method incorporates video monitor (26), keyboard (24), and mouse (23), along with microphone (25) and digital recorder (14) for implementing the invention.
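The word-level comparison and sandwiching idea can be sketched as follows. Note the substitution: the abstract describes a line-by-line comparison of one-word-per-line files, whereas this sketch uses Python's standard `difflib.SequenceMatcher` to align the two word sequences. The function name and the shape of the result are hypothetical.

```python
from difflib import SequenceMatcher


def sandwich_differences(first_text: str, second_text: str):
    """List each differing run of words with its adjacent matching words."""
    a = first_text.split()   # segment to one word per entry
    b = second_text.split()
    matcher = SequenceMatcher(a=a, b=b, autojunk=False)
    results = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            continue
        results.append({
            # matching word just before the difference (None at the start)
            "before": a[i1 - 1] if i1 > 0 else None,
            "first": a[i1:i2],    # differing words in the first text
            "second": b[j1:j2],   # differing words in the second text
            # matching word just after the difference (None at the end)
            "after": a[i2] if i2 < len(a) else None,
        })
    return results


diffs = sandwich_differences(
    "the quick brown fox jumps",
    "the quick brown box jumps",
)
# diffs == [{"before": "brown", "first": ["fox"], "second": ["box"], "after": "jumps"}]
```

Each result places the erroneous words between the nearest agreeing words on either side, which is the "sandwich" the abstract refers to.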

8. Granted invention patent

US07693717B2 Session file modification with annotation using speech recognition or text to speech (Status: lapsed)
Publication No.: US07693717B2
Publication date: 2010-04-06
Application No.: US11279551
Filing date: 2006-04-12
Applicant: Jonathan Kahn, Michael C. Huttinger
Inventor: Jonathan Kahn, Michael C. Huttinger
IPC classification: G10L13/00, G10L11/00
CPC classification: G10L15/22, G10L2015/0631
Abstract: An apparatus comprising a session file, session file editor, annotation window, concatenation software and training software. The session file includes one or more audio files and text associated with each audio file segment. The session file editor displays text and provides text selection capability and plays back audio. The annotation window operably associated with the session file editor supports user modification of the selected text; the annotation window saves modified text corresponding to the selected text from the session file editor and audio associated with the modified text. The concatenation software concatenates modified text and audio associated therewith for two or more instances of the selected text. The training software trains a speech user profile using a concatenated file formed by the concatenating software. The session file may have original audio associated with the selected text, wherein the apparatus further comprises software for substituting the modified text for the selected text. In some embodiments, the concatenation software concatenates modified text and audio associated therewith for two or more instances of the selected text. In some embodiments, the training software trains a speech user profile using a concatenated file formed by the concatenating software.

9. Invention patent application

US20080270437A1 Session File Divide, Scramble, or Both for Manual or Automated Processing by One or More Processing Nodes (Status: under examination, published)
Publication No.: US20080270437A1
Publication date: 2008-10-30
Application No.: US11848148
Filing date: 2007-08-30
Applicant: Jonathan Kahn, Robert Lee Stephen
Inventor: Jonathan Kahn, Robert Lee Stephen
IPC classification: G06F17/30
CPC classification: G06F16/2308
Abstract: An apparatus comprising a session file and session file editor with main window and one or more document windows and annotation window and divide/merge and scramble/unscramble features. The session file may include text, audio, image, and other bounded divisions with source data divided into segments or other bounded divisions and other bounded divisions associated to original data. The session file may be derived from processing third-party application output. The session file editor displays text and other content, provides text selection capability and plays back audio of session files with audio-linked text as embedded content, and supports entry of text and password-protected document lock/unlock. The session file editor supports selection of a parent session file and divide, scramble, or merge of bounded divisions to create one or more child session files that may be processed at one or more nodes to create one or more processed child session files. The one or more processed child session files may undergo merge, unscramble, or both to create a reassembled session file with the same order of bounded divisions as the parent session file. The apparatus further comprises export of phrase-toned audio from a session file for transcription into delimited text for insert/replace into the original session file.
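The divide, scramble, merge, and unscramble cycle can be sketched with plain lists. This assumes each bounded division carries its original position so that order can be restored after processing; the function names, the round-robin split, and the seeded shuffle are illustrative choices, not details from the application.

```python
import random


def divide_and_scramble(divisions, n_children, seed=0):
    """Tag each division with its index, shuffle, and split into children."""
    indexed = list(enumerate(divisions))
    random.Random(seed).shuffle(indexed)          # scramble
    # divide: deal the shuffled divisions round-robin into child files
    return [indexed[i::n_children] for i in range(n_children)]


def merge_and_unscramble(children):
    """Merge processed children and restore the parent's division order."""
    merged = [pair for child in children for pair in child]
    merged.sort(key=lambda pair: pair[0])         # unscramble by saved index
    return [division for _, division in merged]


parent = ["seg-a", "seg-b", "seg-c", "seg-d", "seg-e"]
children = divide_and_scramble(parent, n_children=2)
# Stand-in for processing at separate nodes: upper-case every division.
processed = [[(i, text.upper()) for i, text in child] for child in children]
reassembled = merge_and_unscramble(processed)
```

Because every division keeps its original index, the reassembled file has the same order as the parent no matter how the children were shuffled or which node processed them.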

10. Granted invention patent

US07668718B2 Synchronized pattern recognition source data processed by manual or automatic means for creation of shared speaker-dependent speech user profile (Status: in force)
Publication No.: US07668718B2
Publication date: 2010-02-23
Application No.: US11203671
Filing date: 2005-08-12
Applicant: Jonathan Kahn, Cenk Demiroglu, Michael C. Huttinger
Inventor: Jonathan Kahn, Cenk Demiroglu, Michael C. Huttinger
IPC classification: G10L21/00
CPC classification: G10L15/063, G10L15/07, G10L15/18
Abstract: An apparatus for transforming data input by dividing the data input into a uniform dataset with one or more data divisions, processing the uniform dataset to produce a first processed dataset with one or more data divisions, processing the uniform dataset to produce a second processed dataset with one or more data divisions, wherein the first and second processed datasets have the same number of data divisions, and editing data selectively within each one of the one or more divisions of the first and second processed dataset. This apparatus has particular utility in error-spotting in processed datasets, and toward training a pattern recognition application, such as speech recognition, to produce more accurate processed datasets.
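Because both processed datasets have the same number of divisions, error-spotting can be reduced to comparing them division by division. The sketch below assumes each division's output is a string and treats any disagreement as a candidate error; the function name and the sample data are hypothetical.

```python
def spot_differences(first_processed, second_processed):
    """Return the indices of divisions where the two outputs disagree."""
    if len(first_processed) != len(second_processed):
        raise ValueError("processed datasets must have the same number of divisions")
    return [
        index
        for index, (x, y) in enumerate(zip(first_processed, second_processed))
        if x != y
    ]


# Two hypothetical recognizers run over the same three audio divisions.
engine_one = ["take two tablets", "twice a day", "with food"]
engine_two = ["take two tablets", "twice today", "with food"]
suspect = spot_differences(engine_one, engine_two)
```

A disagreement only marks a division for review; it does not say which of the two outputs is wrong.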