
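Basic information:
- Patent title: 利用词向量结合机器学习的垃圾短信识别方法
- Patent title (English): Junk short message identification method combining word vectors with machine learning
- Application number: CN201910411018.3  Application date: 2019-05-17
- Publication (announcement) number: CN110175221A  Publication (announcement) date: 2019-08-27
- Inventors: 刘发强, 黄远, 高圣翔, 沈亮, 林格平, 万辛, 洪永婷, 吉立妍, 宋东力
- Applicants: 国家计算机网络与信息安全管理中心, 杭州东信北邮信息技术有限公司, 长安通信科技有限责任公司
- Applicant address: No. Jia 3 (甲3号), Yumin Road, Chaoyang District, Beijing
- Patentees: 国家计算机网络与信息安全管理中心, 杭州东信北邮信息技术有限公司, 长安通信科技有限责任公司
- Current patentees: 国家计算机网络与信息安全管理中心, 新讯数字科技(杭州)有限公司, 长安通信科技有限责任公司
- Current patentee address: 100029, No. Jia 3 (甲3号), Yumin Road, Chaoyang District, Beijing
- Main classification: G06F16/33
- IPC classifications: G06F16/33; G06F17/27; G06N3/04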
The invention discloses a junk short message identification method that combines word vectors with machine learning. The method comprises the following steps: (1) performing a first identification pass on short messages according to short message characteristics; (2) performing a second identification pass according to keywords; (3) calculating a text vector for the short message and performing a third identification pass using a support vector machine; (4) calculating a static word vector matrix for the short message and performing a fourth identification pass using a convolutional neural network; and (5) calculating a dynamic word vector for each segmented word of the short message and performing a fifth identification pass using a convolutional neural network. By combining unsupervised and supervised techniques, the method is said to greatly improve the accuracy of junk short message identification.
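The staged structure described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation: the abstract does not say how the five steps are combined, so the sketch assumes a simple cascade in which the first stage that flags a message decides the outcome. The digit-ratio rule, the `KEYWORDS` list, and the names `feature_stage`, `keyword_stage`, and `classify` are all hypothetical, and only stand-ins for steps 1 and 2 are shown. Trained classifiers for steps 3 to 5 (support vector machine and convolutional neural networks) would be added to the stage list in the same way.

```python
from typing import Callable, List, Optional, Tuple

# A stage returns True when it flags the message as junk,
# or None when it is undecided and defers to the next stage.
Stage = Callable[[str], Optional[bool]]

# Hypothetical keyword list; the abstract does not enumerate keywords.
KEYWORDS = ("free prize", "click here")


def feature_stage(msg: str) -> Optional[bool]:
    """Stand-in for step 1: a surface-feature rule (share of digit characters)."""
    if not msg:
        return None
    digit_ratio = sum(ch.isdigit() for ch in msg) / len(msg)
    return True if digit_ratio > 0.5 else None


def keyword_stage(msg: str) -> Optional[bool]:
    """Stand-in for step 2: case-insensitive keyword match."""
    lowered = msg.lower()
    return True if any(k in lowered for k in KEYWORDS) else None


def classify(msg: str, stages: List[Stage]) -> Tuple[bool, Optional[int]]:
    """Run stages in order; return (is_junk, 1-based index of the flagging stage)."""
    for index, stage in enumerate(stages, start=1):
        if stage(msg):
            return True, index
    return False, None


# Assumed cascade order; stages for steps 3 to 5 would be appended here.
STAGES: List[Stage] = [feature_stage, keyword_stage]
```

Calling `classify(message, STAGES)` returns whether the message was flagged and, if so, which stage flagged it, which makes it easy to see how much traffic the cheap rule-based stages filter out before any learned model is consulted.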
Publication/grant documents:
- CN110175221B 利用词向量结合机器学习的垃圾短信识别方法  Publication/grant date: 2021-04-20