Patent Quick Search: fast search of global patents, free commercial-use patent database - IPRDB

1. Granted patent

US08983954B2 Finding data in connected corpuses using examples (in force)
Publication No.: US08983954B2
Publication date: 2015-03-17
Application No.: US13443681
Filing date: 2012-04-10
Applicants: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
IPC classification: G06F17/30
CPC classification: G06F17/30758, G06F17/30303, G06F17/30395, G06F17/3053, G06F17/30539, G06F17/30595, G06F17/30722, G06F17/30867
Abstract: In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.

2. Patent application

US20130268531A1 Finding Data in Connected Corpuses Using Examples (in force)
Publication No.: US20130268531A1
Publication date: 2013-10-10
Application No.: US13443681
Filing date: 2012-04-10
Applicants: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
Inventors: John C. Platt, Surajit Chaudhuri, Lev Novik, Henricus Johannes Maria Meijer, Efim Hudis, Kunal Mukerjee, Christopher Alan Hays
IPC classification: G06F17/30
CPC classification: G06F17/30758, G06F17/30303, G06F17/30395, G06F17/3053, G06F17/30539, G06F17/30595, G06F17/30722, G06F17/30867
Abstract: In one embodiment, datasets are stored in a catalog. The datasets are enriched by establishing relationships among the domains in different datasets. A user searches for relevant datasets by providing examples of the domains of interest. The system identifies datasets corresponding to the user-provided examples. The system then identifies connected subsets of the datasets that are directly linked or indirectly linked through other domains. The user provides known relationship examples to filter the connected subsets and to identify the connected subsets that are most relevant to the user's query. The selected connected subsets may be further analyzed by business intelligence/analytics to create pivot tables or to process the data.

3. Granted patent

US08687697B2 Coding of motion vector information (in force)
Publication No.: US08687697B2
Publication date: 2014-04-01
Application No.: US13455094
Filing date: 2012-04-24
Applicants: Sridhar Srinivasan, Pohsiang Hsu, Thomas W. Holcomb, Kunal Mukerjee, Bruce Chih-Lung Lin
Inventors: Sridhar Srinivasan, Pohsiang Hsu, Thomas W. Holcomb, Kunal Mukerjee, Bruce Chih-Lung Lin
IPC classification: H04N7/12, H04N11/02, H04N11/04, G06K9/36, G06K9/46
CPC classification: H04N19/137, G06K9/36, G06K9/46, H04N7/52, H04N19/132, H04N19/139, H04N19/176, H04N19/51, H04N19/513, H04N19/61, H04N19/63, H04N19/91
Abstract: Techniques and tools for encoding and decoding motion vector information for video images are described. For example, a video encoder yields an extended motion vector code by jointly coding, for a set of pixels, a switch code, motion vector information, and a terminal symbol indicating whether subsequent data is encoded for the set of pixels. In another aspect, an encoder/decoder selects motion vector predictors for macroblocks. In another aspect, a video encoder/decoder uses hybrid motion vector prediction. In another aspect, a video encoder/decoder signals a motion vector mode for a predicted image. In another aspect, a video decoder decodes a set of pixels by receiving an extended motion vector code, which reflects joint encoding of motion information together with intra/inter-coding information and a terminal symbol. The decoder determines whether subsequent data exists for the set of pixels based on, e.g., the terminal symbol.

4. Patent application

US20120213280A1 CODING OF MOTION VECTOR INFORMATION (in force)
Publication No.: US20120213280A1
Publication date: 2012-08-23
Application No.: US13455094
Filing date: 2012-04-24
Applicants: Sridhar Srinivasan, Pohsiang Hsu, Thomas W. Holcomb, Kunal Mukerjee, Bruce Chih-Lung Lin
Inventors: Sridhar Srinivasan, Pohsiang Hsu, Thomas W. Holcomb, Kunal Mukerjee, Bruce Chih-Lung Lin
IPC classification: H04N7/32
CPC classification: H04N19/137, G06K9/36, G06K9/46, H04N7/52, H04N19/132, H04N19/139, H04N19/176, H04N19/51, H04N19/513, H04N19/61, H04N19/63, H04N19/91
Abstract: Techniques and tools for encoding and decoding motion vector information for video images are described. For example, a video encoder yields an extended motion vector code by jointly coding, for a set of pixels, a switch code, motion vector information, and a terminal symbol indicating whether subsequent data is encoded for the set of pixels. In another aspect, an encoder/decoder selects motion vector predictors for macroblocks. In another aspect, a video encoder/decoder uses hybrid motion vector prediction. In another aspect, a video encoder/decoder signals a motion vector mode for a predicted image. In another aspect, a video decoder decodes a set of pixels by receiving an extended motion vector code, which reflects joint encoding of motion information together with intra/inter-coding information and a terminal symbol. The decoder determines whether subsequent data exists for the set of pixels based on, e.g., the terminal symbol.

5. Patent application

US20110264997A1 Scalable Incremental Semantic Entity and Relatedness Extraction from Unstructured Text (pending, published)
Publication No.: US20110264997A1
Publication date: 2011-10-27
Application No.: US12764107
Filing date: 2010-04-21
Applicants: Kunal Mukerjee, Sorin Gherman
Inventors: Kunal Mukerjee, Sorin Gherman
IPC classification: G06F17/30, G06F17/21
CPC classification: G06F16/3334
Abstract: A search engine for documents containing text may process text using a statistical language model, classify the text based on entropy, and create suffix trees or other mappings of the text for each classification. From the suffix trees or mappings, a graph may be constructed with relationship strengths between different words or text strings. The graph may be used to determine search results, and may be browsed or navigated before viewing search results. As new documents are added, they may be processed and added to the suffix trees, then the graph may be created on demand in response to a search request. The graph may be represented as an adjacency matrix, and a transitive closure algorithm may process the adjacency matrix as a background process.

6. Granted patent

US07945441B2 Quantized feature index trajectory (lapsed)
Publication No.: US07945441B2
Publication date: 2011-05-17
Application No.: US11835389
Filing date: 2007-08-07
Applicants: R. Donald Thompson, Kunal Mukerjee
Inventors: R. Donald Thompson, Kunal Mukerjee
IPC classification: G10L19/00
CPC classification: G10L15/02, G10L19/0018, G10L2015/025
Abstract: Indexing methods are described that may be used by databases, search engines, query and retrieval systems, context sensitive data mining, context mapping, language identification, image recognition, and robotic systems. Raw baseline features from an input signal are aggregated, abstracted and indexed for later retrieval or manipulation. The feature index is the quantization number for the underlying features that are represented by an abstraction. Trajectories are used to signify how the features evolve over time. Feature indexes are linked in an ordered sequence indicative of time quanta, where the sequence represents the underlying input signal. An example indexing system based on the described processes is an inverted index that creates a mapping from features or atoms to the underlying documents, files, or data. A highly optimized set of operations can be used to manipulate the quantized feature indexes, where the operations can be fine-tuned independently of the base feature set.

7. Patent application

US20100280827A1 NOISE ROBUST SPEECH CLASSIFIER ENSEMBLE (in force)
Publication No.: US20100280827A1
Publication date: 2010-11-04
Application No.: US12433143
Filing date: 2009-04-30
Applicants: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
Inventors: Kunal Mukerjee, Kazuhito Koishida, Shankar Regunathan
IPC classification: G10L15/00
CPC classification: G10L15/142, G10L15/16, G10L15/197, G10L15/20, G10L21/0208, G10L21/0216, G10L25/18, G10L25/93
Abstract: Embodiments for implementing a speech recognition system that includes a speech classifier ensemble are disclosed. In accordance with one embodiment, the speech recognition system includes a classifier ensemble to convert feature vectors that represent a speech vector into log probability sets. The classifier ensemble includes a plurality of classifiers. The speech recognition system includes a decoder ensemble to transform the log probability sets into output symbol sequences. The speech recognition system further includes a query component to retrieve one or more speech utterances from a speech database using the output symbol sequences.

8. Patent application

US20070118376A1 Word clustering for input data (in force)
Publication No.: US20070118376A1
Publication date: 2007-05-24
Application No.: US11283149
Filing date: 2005-11-18
Applicant: Kunal Mukerjee
Inventor: Kunal Mukerjee
IPC classification: G10L15/06
CPC classification: G10L15/063, G10L15/183, G10L15/19, G10L2015/0631
Abstract: A clustering tool to generate word clusters. In embodiments described, the clustering tool includes a clustering component that generates word clusters for words or word combinations in input data. In illustrated embodiments, the word clusters are used to modify or update a grammar for a closed vocabulary speech recognition application.

9. Granted patent

US08423546B2 Identifying key phrases within documents (in force)
Publication No.: US08423546B2
Publication date: 2013-04-16
Application No.: US12959840
Filing date: 2010-12-03
Applicants: Sorin Gherman, Kunal Mukerjee
Inventors: Sorin Gherman, Kunal Mukerjee
IPC classification: G06F7/00, G06F17/30
CPC classification: G06F17/3053, G06F17/2715, G06F17/2745, G06F17/30864
Abstract: The present invention extends to methods, systems, and computer program products for identifying key phrases within documents. Embodiments of the invention include using a tag index to determine what a document primarily relates to. For example, an integrated data flow and extract-transform-load pipeline crawls, parses, and word breaks large corpuses of documents in database tables. Documents can be broken into tuples. The tuples can be sent to a heuristically based algorithm that uses statistical language models and weight+cross-entropy threshold functions to summarize the document into its “top N” most statistically significant phrases. Accordingly, embodiments of the invention scale efficiently (e.g., linearly) and (potentially large numbers of) documents can be characterized by salient and relevant key phrases (tags).

10. Granted patent

US08209175B2 Uncertainty interval content sensing within communications (lapsed)
Publication No.: US08209175B2
Publication date: 2012-06-26
Application No.: US11449354
Filing date: 2006-06-08
Applicants: Kunal Mukerjee, Rafael Ballesteros
Inventors: Kunal Mukerjee, Rafael Ballesteros
IPC classification: G10L15/04, G10L21/00
CPC classification: G06Q30/02
Abstract: Repetition of content words in a communication is used to increase the certainty, or, alternatively, reduce the uncertainty, that the content words were actual words from the communication. Reducing the uncertainty of a particular content word of a communication in turn increases the likelihood that the content word is relevant to the communication. Reliable, relevant content words mined from a communication can be used for, e.g., automatic internet searches for documents and/or web sites pertinent to the communication. Reliable, relevant content words mined from a communication can also, or alternatively, be used to automatically generate one or more documents from the communication, e.g., communication summaries, communication outlines, etc.
