    • 1. Invention application
    • Model Adaptation System and Method for Speaker Recognition
    • US20080208581A1
    • 2008-08-28
    • US10581227
    • 2004-12-03
    • Jason Pelecanos; Subramanian Sridharan; Robert Vogt
    • G10L17/00
    • G10L17/04
    • A system and method for speaker recognition speaker modelling whereby prior speaker information is incorporated into the modelling process, utilising the maximum a posteriori (MAP) algorithm and extending it to contain prior Gaussian component correlation information. Firstly a background model (10) is estimated. Pooled acoustic reference data (11) relating to a specific demographic of speakers (population of interest) from a given total population is then trained via the Expectation Maximization (EM) algorithm (12) to produce a background model (13). The background model (13) is adapted utilising information from a plurality of reference speakers (21) in accordance with the Maximum A Posteriori (MAP) criterion (22). Utilizing MAP estimation technique, the reference speaker data and prior information obtained from the background model parameters are combined to produce a library of adapted speaker models, namely Gaussian Mixture Models (23).
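The adaptation step in the abstract above can be illustrated with a minimal sketch of conventional relevance-factor MAP adaptation of Gaussian mixture means. This is not the method claimed in the application: the abstract describes an extension that also uses prior Gaussian component correlation information, which this sketch leaves out. The function names, the diagonal-covariance assumption, and the `relevance` parameter are all assumptions made for illustration.

```python
import math


def gaussian_diag_logpdf(x, mean, var):
    """Log density of a diagonal-covariance Gaussian evaluated at x."""
    return sum(
        -0.5 * (math.log(2.0 * math.pi * v) + (xi - m) ** 2 / v)
        for xi, m, v in zip(x, mean, var)
    )


def map_adapt_means(frames, weights, means, variances, relevance=16.0):
    """Adapt background-model means toward one speaker's frames.

    Hypothetical helper: only the means are adapted; the weights and
    variances of the background model are reused unchanged.
    """
    n_comp = len(weights)
    dim = len(means[0])
    counts = [0.0] * n_comp                      # soft frame counts per component
    sums = [[0.0] * dim for _ in range(n_comp)]  # posterior-weighted frame sums

    for x in frames:
        log_p = [
            math.log(weights[k]) + gaussian_diag_logpdf(x, means[k], variances[k])
            for k in range(n_comp)
        ]
        top = max(log_p)
        post = [math.exp(lp - top) for lp in log_p]  # shifted for numerical stability
        total = sum(post)
        for k in range(n_comp):
            gamma = post[k] / total
            counts[k] += gamma
            for d in range(dim):
                sums[k][d] += gamma * x[d]

    adapted = []
    for k in range(n_comp):
        # alpha near 1 trusts the speaker data, near 0 keeps the prior mean.
        alpha = counts[k] / (counts[k] + relevance)
        if counts[k] > 0.0:
            data_mean = [s / counts[k] for s in sums[k]]
        else:
            data_mean = list(means[k])
        adapted.append(
            [alpha * dm + (1.0 - alpha) * m for dm, m in zip(data_mean, means[k])]
        )
    return adapted


# Toy usage: a two-component, one-dimensional background model.
speaker_means = map_adapt_means(
    frames=[[2.0]] * 16,
    weights=[0.5, 0.5],
    means=[[0.0], [10.0]],
    variances=[[1.0], [1.0]],
)
```

Components that see little speaker data stay close to the background model, which is how the prior information keeps sparsely trained speaker models stable.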
    • 3. Invention application
    • Dynamic match lattice spotting for indexing speech content
    • US20070179784A1
    • 2007-08-02
    • US11377327
    • 2006-03-16
    • Albert Joseph Thambiratnam; Subramanian Sridharan
    • G10L15/28
    • G10L15/26; G10L2015/025
    • A system for indexing and searching speech content, the system includes two distinct stages, a speech indexing stage (100) and a speech retrieval stage (200). A phone lattice (103) is generated by passing speech content (101) through a speech recogniser (102). The resulting phone lattice is then processed to produce a set of observed sequences Q=(Θ,i) where Θ are the set of observed phone sequences for each node i in the phone lattice. During the retrieval stage (200), a user first inputs a target word (205) into the system, which is then reduced to a target phone sequence P=(p1, p2, . . . , pN) (207). The system then compares target sequence P with the set of observed sequences Q (208), suitably by scoring each observed sequence against the target sequence using a Minimum Edit Distance (MED) calculation to produce a set of matching sequences R (209).
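The retrieval stage in the abstract above scores each observed phone sequence against the target sequence P with a Minimum Edit Distance calculation. The sketch below shows one plausible reading of that step, assuming uniform insertion, deletion, and substitution costs and a simple cost threshold for selecting the matching set R. The cost values, the `max_cost` threshold, and both function names are assumptions; the abstract does not specify them.

```python
def min_edit_distance(target, observed, sub_cost=1.0, ins_cost=1.0, del_cost=1.0):
    """Minimum cost of editing the target phone sequence into the observed one."""
    # prev[j] holds the cost of turning an empty target prefix into observed[:j].
    prev = [j * ins_cost for j in range(len(observed) + 1)]
    for i, t in enumerate(target, start=1):
        curr = [i * del_cost]
        for j, o in enumerate(observed, start=1):
            substitute = prev[j - 1] + (0.0 if t == o else sub_cost)
            delete = prev[j] + del_cost
            insert = curr[j - 1] + ins_cost
            curr.append(min(substitute, delete, insert))
        prev = curr
    return prev[-1]


def match_sequences(target, observed_sequences, max_cost=1.0):
    """Return (sequence, cost) pairs within max_cost, cheapest first."""
    scored = [
        (tuple(seq), min_edit_distance(target, seq)) for seq in observed_sequences
    ]
    kept = [pair for pair in scored if pair[1] <= max_cost]
    return sorted(kept, key=lambda pair: pair[1])


# Toy usage: P is the target phone sequence, Q the observed sequences.
P = ["k", "ae", "t"]
Q = [["k", "ae", "t"], ["k", "ah", "t"], ["d", "ao", "g"], ["k", "ae"]]
R = match_sequences(P, Q, max_cost=1.0)
```

Allowing a non-zero cost lets the search tolerate recogniser errors, such as one substituted or dropped phone, instead of requiring an exact match in the lattice.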