调序规则表的深度过滤研究
Research of Deep Filtering Lexical Reordering Table

孔金英 1   李晓 1   王磊 1   杨雅婷 1 *   罗延根 2  
Abstract In statistical machine translation systems, both the lexical reordering table and the phrase table are usually very large. Optimizing and filtering the phrase table has long been a research focus, whereas filtering the lexical reordering table has received almost no attention. This paper treats the filtering of the lexical reordering table as a short-text classification problem and proposes a filtering model based on an autoencoder. The model first scores each reordering rule with an autoencoder-based classifier, then filters the lexical reordering table using a minimal-difference strategy, and finally recomputes the reordering score table from the filtered rules for use in machine translation decoding. Experiments on public English-Chinese and Uyghur-Chinese corpora show that the model reduces the size of the lexical reordering table by 40% while raising BLEU (bilingual evaluation understudy) by 0.19 and 0.26, respectively.
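The three steps named in the abstract (score, filter, recompute) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions: the classifier scores are passed in as plain numbers rather than produced by an autoencoder; a simple keep-the-top-fraction cut stands in for the minimal-difference strategy, which the abstract does not define; and the score table is re-estimated as smoothed relative frequencies over three orientation classes. The names `filter_rules`, `recompute_scores`, `keep_ratio`, and `smoothing` are hypothetical.

```python
from collections import Counter, defaultdict

# Assumed orientation classes for a lexicalized reordering table.
ORIENTATIONS = ("monotone", "swap", "discontinuous")


def filter_rules(scored_rules, keep_ratio=0.6):
    """Keep the highest-scoring fraction of reordering rules.

    scored_rules: list of (rule, score) pairs, where the score stands in
    for the output of the autoencoder-based classifier.
    keep_ratio=0.6 mirrors the 40% reduction reported in the abstract.
    This top-fraction cut is a placeholder, not the minimal-difference strategy.
    """
    ranked = sorted(scored_rules, key=lambda pair: pair[1], reverse=True)
    n_keep = round(len(ranked) * keep_ratio)
    return [rule for rule, _ in ranked[:n_keep]]


def recompute_scores(rules, smoothing=0.5):
    """Re-estimate orientation probabilities from the surviving rules.

    rules: list of (source_phrase, target_phrase, orientation) occurrences.
    Returns {(source, target): {orientation: probability}} using
    additive smoothing (an assumed estimator).
    """
    counts = defaultdict(Counter)
    for source, target, orientation in rules:
        counts[(source, target)][orientation] += 1

    table = {}
    for pair, orientation_counts in counts.items():
        total = sum(orientation_counts.values()) + smoothing * len(ORIENTATIONS)
        table[pair] = {
            o: (orientation_counts[o] + smoothing) / total for o in ORIENTATIONS
        }
    return table
```

A caller would pass the output of `filter_rules` to `recompute_scores` and write the resulting table in whatever format the decoder expects; that serialization step is omitted here.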
Source 计算机科学与探索 (Journal of Frontiers of Computer Science and Technology), 2017, 11(5): 785-793 [Core Collection]
DOI 10.3778/j.issn.1673-9418.1603056
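Keywords autoencoder; filtering model; lexical reordering table; machine translation
Affiliations

1. Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Xinjiang Key Laboratory of Minority Speech and Language Information Processing, Urumqi, 830011

2. Xinjiang Technical Institute of Physics and Chemistry, Chinese Academy of Sciences, Urumqi, 830011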

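Language Chinese
Document type Research article
ISSN 1673-9418
Subject Automation technology, computer technology
Funding National High Technology Research and Development Program of China (863 Program); Strategic Priority Research Program of the Chinese Academy of Sciences; "West Light" Program of the Chinese Academy of Sciences
Accession number CSCD:5978162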

References (17)

1. Koehn P. Moses: open source toolkit for statistical machine translation. Proceedings of the 45th Annual Meeting of the ACL on Interactive Poster and Demonstration Sessions, Prague, Czech Republic, Jun 23-30, 2007: 177-180. Cited by 1
2. Stolcke A. SRILM: an extensible language modeling toolkit. Proceedings of the 2002 International Conference on Spoken Language Processing, 2002: 1409-1412. Cited by 1
3. Brown P F. The mathematics of statistical machine translation: parameter estimation. Computational Linguistics, 1993, 19(2): 263-311. Cited by 93
4. Bengio Y. Neural probabilistic language models. Innovations in Machine Learning, 2006: 137-186. Cited by 23
5. Deng Li. Binary coding of speech spectrograms using a deep auto-encoder. Proceedings of the 11th Annual Conference of the International Speech Communication Association, 2010: 1692-1695. Cited by 1
6. Graves A. Framewise phoneme classification with bidirectional LSTM and other neural network architectures. Neural Networks, 2005, 18(5): 602-610. Cited by 236
7. Roska T. The CNN universal machine: an analogic array computer. IEEE Transactions on Circuits and Systems II: Analog and Digital Signal Processing, 1993, 40(3): 163-173. Cited by 14
8. 殷乐. Phrase table filtering for statistical machine translation based on virtual context. 中文信息学报 (Journal of Chinese Information Processing), 2013, 27(6): 139-144. Cited by 1
9. 狄萍. Phrase table filtering in phrase-based statistical machine translation. 计算机应用与软件 (Computer Applications and Software), 2011, 28(5): 28-30. Cited by 1
10. Zens R. A systematic comparison of phrase table pruning techniques. Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Jeju Island, Korea, Jul 12-14, 2012: 972-983. Cited by 1
11. Koehn P. Statistical phrase-based translation. Proceedings of the 2003 Conference of the North American Chapter of the Association for Computational Linguistics on Human Language Technology-Volume 1, Edmonton, Canada, May 27-Jun 1, 2003: 48-54. Cited by 11
12. Tillmann C. A localized prediction model for statistical machine translation. Proceedings of the 43rd Annual Meeting on Association for Computational Linguistics, Michigan, USA, Jun 25-30, 2005: 557-564. Cited by 1
13. Xiong Deyi. Maximum entropy based phrase reordering model for statistical machine translation. Proceedings of the 21st International Conference on Computational Linguistics and the 44th Annual Meeting of the Association for Computational Linguistics, Sydney, Australia, Jul 17-21, 2006: 521-528. Cited by 5
14. Li Peng. Recursive autoencoders for ITG-based translation. Proceedings of the 2013 Conference on Empirical Methods in Natural Language Processing, Seattle, USA, Oct 18-21, 2013: 567-577. Cited by 1
15. 肖欣延. Research on lexicalized reordering methods for hierarchical phrase-based translation. 中文信息学报 (Journal of Chinese Information Processing), 2012, 26(1): 37-41. Cited by 5
16. Wang Chao. Chinese syntactic reordering for statistical machine translation. Proceedings of the 2007 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, Prague, Czech Republic, Jun 28-30, 2007: 737-745. Cited by 1
17. Papineni K. BLEU: a method for automatic evaluation of machine translation. Proceedings of the 40th Annual Meeting on Association for Computational Linguistics, Philadelphia, USA, Jul 6-12, 2002: 311-318. Cited by 1
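Citing documents (2)

1. 潘一荣. A reordering table reconstruction model for Chinese-Uyghur machine translation. 计算机应用 (Journal of Computer Applications), 2018, 38(5): 1283-1288
Cited by 3

2. 侯强. A survey of research and development of machine translation methods. 计算机工程与应用 (Computer Engineering and Applications), 2019, 55(10): 30-35, 66
Cited by 4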
