Quick Patent Search: fast search of global patents, free patent database for commercial use - IPRDB

1. Granted invention patent

US5056150A Method and apparatus for real time speech recognition with and without speaker dependency (lapsed)
Publication number: US5056150A
Publication date: 1991-10-08
Application number: US433098
Filing date: 1989-11-08
Applicants: Tiecheng Yu, Ning Bi, Meiling Rong, Enyao Zhang
Inventors: Tiecheng Yu, Ning Bi, Meiling Rong, Enyao Zhang
IPC classification: G10L15/08, G10L15/10, G10L25/87
CPC classification: G10L15/10, G10L25/87
Abstract: A method and apparatus for real time speech recognition with and without speaker dependency, which includes the following steps: converting the speech signals into a series of primitive sound spectrum parameter frames; detecting the beginning and ending of speech according to the primitive sound spectrum parameter frame, to determine the sound spectrum parameter frame series; performing non-linear time domain normalization on the sound spectrum parameter frame series using sound stimuli, to obtain speech characteristic parameter frame series with predefined lengths on the time domain; performing amplitude quantization normalization on the speech characteristic parameter frames; comparing the speech characteristic parameter frame series with the reference samples, to determine the reference sample which most closely matches the speech characteristic parameter frame series; and determining the recognition result according to the most closely matched reference sample.

2. Granted invention patent

US09578299B2 Stereoscopic conversion for shader based graphics content (in force)
Publication number: US09578299B2
Publication date: 2017-02-21
Application number: US13350467
Filing date: 2012-01-13
Applicants: Ning Bi, Xuerui Zhang, Jian Wei
Inventors: Ning Bi, Xuerui Zhang, Jian Wei
IPC classification: G06T15/50, H04N13/00, G06T15/00, G06T15/10, G06T15/30, H04N13/02, G06T15/60, G06T15/80, G06T15/06
CPC classification: H04N13/122, G06T15/005, G06T15/06, G06T15/10, G06T15/30, G06T15/50, G06T15/506, G06T15/60, G06T15/80, H04N13/275, H04N13/286
Abstract: The example techniques of this disclosure are directed to generating a stereoscopic view from an application designed to generate a mono view. For example, the techniques may modify source code of a vertex shader to cause the modified vertex shader, when executed, to generate graphics content for the images of the stereoscopic view. As another example, the techniques may modify a command that defines a viewport for the mono view to commands that define the viewports for the images of the stereoscopic view.
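The viewport rewrite mentioned in the abstract can be illustrated with a minimal sketch. It assumes a side-by-side stereo layout in which the mono viewport is split into a left half and a right half; the abstract does not specify the layout, and the `Viewport` type and `split_viewport` function are hypothetical names introduced only for this illustration.

```python
from typing import NamedTuple, Tuple


class Viewport(NamedTuple):
    """Hypothetical viewport rectangle (origin plus size, in pixels)."""
    x: int
    y: int
    width: int
    height: int


def split_viewport(mono: Viewport) -> Tuple[Viewport, Viewport]:
    """Turn one mono viewport into left-eye and right-eye viewports.

    Assumption: side-by-side layout. The right half takes any odd
    leftover pixel so the two halves always tile the mono rectangle.
    """
    half = mono.width // 2
    left = Viewport(mono.x, mono.y, half, mono.height)
    right = Viewport(mono.x + half, mono.y, mono.width - half, mono.height)
    return left, right
```

In a real renderer, each returned rectangle would be passed to the graphics API's viewport command before drawing the corresponding eye image.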

3. Granted invention patent

US08553943B2 Content-adaptive systems, methods and apparatus for determining optical flow (in force)
Publication number: US08553943B2
Publication date: 2013-10-08
Application number: US13160457
Filing date: 2011-06-14
Applicants: Yingyong Qi, Ning Bi, Xuerui Zhang
Inventors: Yingyong Qi, Ning Bi, Xuerui Zhang
IPC classification: G06K9/00, H04N7/18
CPC classification: G06K9/00536, G06T7/20, G06T2207/20012
Abstract: Embodiments include methods and systems which determine pixel displacement between frames based on a respective weighting-value for each pixel or a group of pixels. The weighting-values provide an indication as to which pixels are more pertinent to optical flow computations. Computational resources and effort can be focused on pixels with higher weights, which are generally more pertinent to optical flow determinations.
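The idea of concentrating effort on higher-weight pixels can be sketched as follows. This is only an illustration of the weighting and selection step, not the patented method: it assumes the weight is the absolute intensity gradient (a common proxy for "textured" pixels, not stated in the abstract), and `pixel_weights` and `select_pixels` are hypothetical names. A displacement search would then run only on the selected pixels.

```python
from typing import List, Tuple

Frame = List[List[int]]


def pixel_weights(frame: Frame) -> List[List[int]]:
    """Assumed weighting: |horizontal gradient| + |vertical gradient|.

    Neighbours are clamped at the frame border.
    """
    h, w = len(frame), len(frame[0])
    weights = [[0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            gx = frame[y][min(x + 1, w - 1)] - frame[y][max(x - 1, 0)]
            gy = frame[min(y + 1, h - 1)][x] - frame[max(y - 1, 0)][x]
            weights[y][x] = abs(gx) + abs(gy)
    return weights


def select_pixels(weights: List[List[int]], threshold: int) -> List[Tuple[int, int]]:
    """Return (row, column) of every pixel whose weight reaches the threshold."""
    return [(y, x)
            for y, row in enumerate(weights)
            for x, wt in enumerate(row)
            if wt >= threshold]
```

A flat region yields zero weights and is skipped entirely, while pixels along an edge are kept for the more expensive displacement computation.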

4. Invention patent application

US20120139906A1 HYBRID REALITY FOR 3D HUMAN-MACHINE INTERFACE (under examination, published)
Publication number: US20120139906A1
Publication date: 2012-06-07
Application number: US13234028
Filing date: 2011-09-15
Applicants: Xuerui Zhang, Ning Bi, Yingyong Qi
Inventors: Xuerui Zhang, Ning Bi, Yingyong Qi
IPC classification: G06T15/00
CPC classification: G06T19/006, H04N13/156
Abstract: A three dimensional (3D) mixed reality system combines a real 3D image or video, captured by a 3D camera for example, with a virtual 3D image rendered by a computer or other machine to render a 3D mixed-reality image or video. A 3D camera can acquire two separate images (a left and a right) of a common scene, and superimpose the two separate images to create a real image with a 3D depth effect. The 3D mixed-reality system can determine a distance to a zero disparity plane for the real 3D image, determine one or more parameters for a projection matrix based on the distance to the zero disparity plane, render a virtual 3D object based on the projection matrix, and combine the real image and the virtual 3D object to generate a mixed-reality 3D image.
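One way a projection parameter can depend on the zero-disparity-plane distance is the standard off-axis (asymmetric) stereo frustum, sketched below. This is a textbook construction chosen for illustration; the abstract does not say which projection parameters the application actually derives. The function names and arguments (`eye_offset`, `near`, `zdp_distance`, `half_width`) are assumptions.

```python
from typing import Tuple


def stereo_frustum(eye_offset: float, near: float,
                   zdp_distance: float, half_width: float) -> Tuple[float, float]:
    """Left/right near-plane bounds for an eye displaced by eye_offset along x.

    The frustum is sheared toward the scene centre by an amount
    proportional to near / zdp_distance, so that both eyes frame the
    same window at the zero disparity plane.
    """
    shift = -eye_offset * near / zdp_distance
    return (-half_width + shift, half_width + shift)


def window_at_zdp(eye_offset: float, near: float,
                  zdp_distance: float, half_width: float) -> Tuple[float, float]:
    """World-space x extent that the eye's frustum covers at the ZDP."""
    left, right = stereo_frustum(eye_offset, near, zdp_distance, half_width)
    scale = zdp_distance / near
    return (eye_offset + left * scale, eye_offset + right * scale)
```

With this choice, a virtual object placed at the zero disparity plane projects to the same screen position in both eye images, which is what lets it sit at screen depth alongside the real 3D image.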

5. Invention patent application

US20090237401A1 MULTI-STAGE TESSELLATION FOR GRAPHICS RENDERING (in force)
Publication number: US20090237401A1
Publication date: 2009-09-24
Application number: US12052628
Filing date: 2008-03-20
Applicants: Jian Wei, Guofang Jiao, Ning Bi, Chehui Wu
Inventors: Jian Wei, Guofang Jiao, Ning Bi, Chehui Wu
IPC classification: G06T17/00
CPC classification: G06T11/203
Abstract: This disclosure describes a multi-stage tessellation technique for tessellating a curve during graphics rendering. In particular, a first tessellation stage tessellates the curve into a first set of line segments that each represents a portion of the curve. A second tessellation stage further tessellates the portion of the curve represented by each of the line segments of the first set into additional line segments that more finely represent the shape of the curve. In this manner, each portion of the curve that was represented by only one line segment after the first tessellation stage is represented by more than one line segment after the second tessellation stage. In some instances, more than two tessellation stages may be performed to tessellate the curve.
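The two stages described in the abstract can be sketched for a quadratic Bezier curve. The curve type, the uniform parameter spacing, and all function names are assumptions made for this illustration; the abstract does not say how either stage chooses its subdivision points.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quad_bezier(p0: Point, p1: Point, p2: Point, t: float) -> Point:
    """Evaluate a quadratic Bezier curve at parameter t in [0, 1]."""
    u = 1.0 - t
    x = u * u * p0[0] + 2 * u * t * p1[0] + t * t * p2[0]
    y = u * u * p0[1] + 2 * u * t * p1[1] + t * t * p2[1]
    return (x, y)


def first_stage(n_coarse: int) -> List[Tuple[float, float]]:
    """Stage 1: split [0, 1] into coarse parameter intervals.

    Each interval corresponds to one coarse line segment of the curve.
    """
    return [(i / n_coarse, (i + 1) / n_coarse) for i in range(n_coarse)]


def second_stage(interval: Tuple[float, float], n_fine: int,
                 p0: Point, p1: Point, p2: Point) -> List[Point]:
    """Stage 2: refine one coarse portion into n_fine finer segments."""
    t0, t1 = interval
    return [quad_bezier(p0, p1, p2, t0 + (t1 - t0) * j / n_fine)
            for j in range(n_fine + 1)]


def tessellate(p0: Point, p1: Point, p2: Point,
               n_coarse: int, n_fine: int) -> List[Point]:
    """Run both stages and return the polyline vertices in order."""
    points: List[Point] = []
    for interval in first_stage(n_coarse):
        pts = second_stage(interval, n_fine, p0, p1, p2)
        # Drop the start vertex shared with the previous coarse portion.
        points.extend(pts if not points else pts[1:])
    return points
```

With `n_coarse=2` and `n_fine=4`, each of the two coarse portions ends up represented by four line segments, giving nine vertices in total.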

6. Granted invention patent

US06941265B2 Voice recognition system method and apparatus (in force)
Publication number: US06941265B2
Publication date: 2005-09-06
Application number: US10017270
Filing date: 2001-12-14
Applicants: Ning Bi, Andrew DeJaco, Xin Zhong, Chienchung Chang, Chuck Han, Hari Garudadri, Naren Malayath, Suhail Jalil
Inventors: Ning Bi, Andrew DeJaco, Xin Zhong, Chienchung Chang, Chuck Han, Hari Garudadri, Naren Malayath, Suhail Jalil
IPC classification: G10L15/28, G10L15/00
CPC classification: G10L15/28
Abstract: Generally stated, a method and an accompanying apparatus provide for a voice recognition system (300) with a programmable front end processing unit (400). The front end processing unit (400) requests and receives different configuration files at different times for processing voice data in the voice recognition system (300). The configuration files are communicated to the front end unit via a communication link (310) for configuring the front end processing unit (400). A microprocessor may provide the front end configuration files on the communication link at different times.

7. Granted invention patent

US06449496B1 Voice recognition user interface for telephone handsets (in force)
Publication number: US06449496B1
Publication date: 2002-09-10
Application number: US09246499
Filing date: 1999-02-08
Applicants: Scott D. Beith, Ning Bi, Chienchung Chang, Karthick Chinnaswami, Andrew P. DeJaco, Jason B. Kenagy, Robert Opalsky, George Pan
Inventors: Scott D. Beith, Ning Bi, Chienchung Chang, Karthick Chinnaswami, Andrew P. DeJaco, Jason B. Kenagy, Robert Opalsky, George Pan
IPC classification: H04B1/38
CPC classification: H04M1/271
Abstract: A method and apparatus providing a user interface within a phone that responds to a limited vocabulary of user trained voice commands. The interface allows users to perform all phone handset dialing functions using voice commands. Additionally, users will be able to create and modify entries within a voice recognition phonebook, whereby a number within the voice recognition phonebook can be called by saying the name associated with the number. The user interface provides a combination of voice and LCD displayed user prompts and responses to voice input. The interface responds to user voice commands and performs the command functions based upon matches to previously user trained voice command vocabulary words stored in memory.

8. Granted invention patent

US06381569B1 Noise-compensated speech recognition templates (lapsed)
Publication number: US06381569B1
Publication date: 2002-04-30
Application number: US09018257
Filing date: 1998-02-04
Applicants: Gilbert C. Sih, Ning Bi
Inventors: Gilbert C. Sih, Ning Bi
IPC classification: G10L15/20
CPC classification: G10L15/20, G10L21/0216
Abstract: The speech recognition training unit is modified to store digitized speech samples into a speech database that can be accessed at recognition time. The improved recognition unit comprises a noise analysis, modeling, and synthesis unit which continually analyzes the noise characteristics present in the audio environment and produces an estimated noise signal with similar characteristics. The recognition unit then constructs a noise-compensated template database by adding the estimated noise signal to each of the speech samples in the speech database and performing parameter determination on the resulting sums. This procedure accounts for the presence of noise in the recognition phase by retraining all the templates using an estimated noise signal with similar characteristics as the actual noise signal that corrupted the word to be recognized. This method improves the likelihood of a good template match, which increases the recognition accuracy.
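The template-rebuilding step in the abstract (add the estimated noise to every stored speech sample, then recompute parameters) can be sketched as below. The feature used here, mean energy per frame, is only a stand-in for the unspecified "parameter determination", and the squared-distance matcher and all function names are likewise assumptions for illustration.

```python
from typing import Dict, List, Sequence


def frame_energies(signal: Sequence[float], frame_len: int) -> List[float]:
    """Stand-in for parameter determination: mean energy per frame."""
    return [sum(x * x for x in signal[i:i + frame_len]) / frame_len
            for i in range(0, len(signal) - frame_len + 1, frame_len)]


def build_noisy_templates(speech_db: Dict[str, Sequence[float]],
                          noise: Sequence[float],
                          frame_len: int) -> Dict[str, List[float]]:
    """Add the estimated noise to each stored sample and re-extract features."""
    templates = {}
    for word, samples in speech_db.items():
        # The noise estimate is repeated cyclically to cover the sample.
        noisy = [s + noise[i % len(noise)] for i, s in enumerate(samples)]
        templates[word] = frame_energies(noisy, frame_len)
    return templates


def recognize(features: Sequence[float],
              templates: Dict[str, List[float]]) -> str:
    """Return the word whose template is closest in squared distance.

    Assumes the input and every template have the same number of frames.
    """
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))
    return min(templates, key=lambda w: dist(features, templates[w]))
```

Because the templates are rebuilt from noise-added samples, a noisy input is compared against references that were corrupted in a similar way, which is the effect the abstract attributes to the method.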

9. Granted invention patent

US06278972B1 System and method for segmentation and recognition of speech signals (in force)
Publication number: US06278972B1
Publication date: 2001-08-21
Application number: US09225891
Filing date: 1999-01-04
Applicants: Ning Bi, Chienchung Chang
Inventors: Ning Bi, Chienchung Chang
IPC classification: G01L15/04
CPC classification: G10L15/04
Abstract: A system and method for forming a segmented speech signal from an input speech signal having a plurality of frames. The input speech signal is converted from a time domain signal to a frequency domain signal having a plurality of speech frames, wherein each speech frame in the frequency domain signal is represented by at least one spectral value associated with the speech frame. A spectral difference value is then determined for each pair of adjacent frames in the frequency domain signal, wherein the spectral difference value for each pair of adjacent frames is representative of a difference between the at least one spectral value associated with each frame in the pair of adjacent frames. An initial cluster boundary is set between each pair of adjacent frames in the frequency domain signal, and a variance value is assigned to each cluster in the frequency domain signal, wherein the variance value for each cluster is equal to one of the determined spectral difference values. Next, a plurality of cluster merge parameters is calculated, wherein each of the cluster merge parameters is associated with a pair of adjacent clusters in the frequency domain signal. A minimum cluster merge parameter is selected from the plurality of cluster merge parameters. A merged cluster is then formed by canceling a cluster boundary between the clusters associated with the minimum merge parameter and assigning a merged variance value to the merged cluster, wherein the merged variance value is representative of the variance values assigned to the clusters associated with the minimum merge parameter. The process is repeated in order to form a plurality of merged clusters, and the segmented speech signal is formed in accordance with the plurality of merged clusters.
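The merge loop in the abstract (start with one cluster per frame, repeatedly remove the boundary with the smallest merge parameter) can be sketched as follows. The abstract does not give the formula for the merge parameter or the merged variance, so this sketch substitutes the population variance of the combined frames; it also uses one scalar spectral value per frame and stops at a requested number of segments. All of these are assumptions, and `segment` is a hypothetical name.

```python
from statistics import pvariance
from typing import List


def segment(frames: List[float], n_segments: int) -> List[List[float]]:
    """Greedy merging of adjacent clusters of frames.

    Assumed merge parameter: population variance of the two adjacent
    clusters taken together (a stand-in for the patent's own parameter).
    """
    if not 1 <= n_segments <= len(frames):
        raise ValueError("n_segments must be between 1 and len(frames)")
    # Initial boundaries sit between every pair of adjacent frames.
    clusters = [[f] for f in frames]
    while len(clusters) > n_segments:
        costs = [pvariance(clusters[i] + clusters[i + 1])
                 for i in range(len(clusters) - 1)]
        i = costs.index(min(costs))
        # Cancel the boundary between the cheapest adjacent pair.
        clusters[i:i + 2] = [clusters[i] + clusters[i + 1]]
    return clusters
```

Only adjacent clusters are ever merged, so every resulting segment is a contiguous run of frames, as the segmentation of a speech signal requires.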

10. Granted invention patent

US09049423B2 Zero disparity plane for feedback-based three-dimensional video (in force)
Publication number: US09049423B2
Publication date: 2015-06-02
Application number: US12958107
Filing date: 2010-12-01
Applicants: Ning Bi, Xuerui Zhang, Yingyong Qi, Chienchung Chang
Inventors: Ning Bi, Xuerui Zhang, Yingyong Qi, Chienchung Chang
IPC classification: H04N13/02, H04N13/00, G06T7/40, G06T5/40, G06K9/46, G06K9/62
CPC classification: H04N13/128, G06K9/4642, G06K9/6212, G06T5/40, G06T2207/20228, H04N13/271, H04N2013/0081
Abstract: The techniques of this disclosure are directed to the feedback-based stereoscopic display of three-dimensional images, such as may be used for video telephony (VT) and human-machine interface (HMI) applications. According to one example, a region of interest (ROI) of stereoscopically captured images may be automatically determined based on determining disparity for at least one pixel of the captured images. According to another example, a zero disparity plane (ZDP) for the presentation of a 3D representation of stereoscopically captured images may be determined based on an identified ROI. According to this example, the ROI may be automatically identified, or identified based on receipt of user input identifying the ROI.
