UM E-Theses Collection (澳門大學電子學位論文庫)

Title

Translation hypotheses re-ranking for statistical machine translation

English Abstract

TRANSLATION HYPOTHESES RE-RANKING FOR STATISTICAL MACHINE TRANSLATION
by Yan Liu
Thesis Supervisor: Associate Professor Chi Man Vong
Master of Science in Computer Science

In statistical machine translation (SMT), a possibly infinite number of translation hypotheses can be decoded from a source sentence, and re-ranking is applied to select the best translation among them. Re-ranking is therefore an essential component of SMT for efficient and effective translation. Two re-ranking frameworks are proposed: Cascaded Re-ranking Modeling (CRM) and Unsupervised Hypotheses Reranking (UHR). CRM is a supervised re-ranking model that cascades a classification model and a regression model. It efficiently and effectively selects the good but rare hypotheses, alleviating the issues of translation quality and computational cost at the same time. CRM can be paired with any classifier, such as a support vector machine (SVM) or an Extreme Learning Machine (ELM). UHR is a re-ranking model that takes neither labels nor linguistic features into consideration; its re-ranking models are constructed from the hypotheses themselves. Compared with other state-of-the-art methods, experimental results show that CRM paired with ELM (CRM-ELM) raises translation quality by up to 11.6% on the popular benchmark Chinese-English corpus (IWSLT 2014) and the French-English parallel corpus (WMT 2015), with extremely fast training time on huge corpora.
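
The cascade described in the abstract (a classification model followed by a regression model) can be illustrated with a minimal sketch. This is not the thesis's implementation: the function name `cascade_rerank`, the feature vectors, the toy classifier and regressor, and the fallback used when the classifier rejects every hypothesis are all assumptions made for illustration. In practice the two callables would be trained models such as an SVM or ELM classifier and a quality regressor.

```python
from typing import Callable, List, Sequence

Features = Sequence[float]


def cascade_rerank(
    hypotheses: List[str],
    features: List[Features],
    is_promising: Callable[[Features], bool],
    predict_quality: Callable[[Features], float],
) -> List[str]:
    """Return the surviving hypotheses sorted best-first by a two-stage cascade."""
    # Stage 1: the classifier keeps only the candidates it labels as promising.
    kept = [i for i, f in enumerate(features) if is_promising(f)]
    # Assumed fallback: if the classifier rejects everything, score the full list.
    if not kept:
        kept = list(range(len(hypotheses)))
    # Stage 2: the regressor scores only the survivors, so fewer candidates
    # reach the second model.
    scored = [(predict_quality(features[i]), i) for i in kept]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return [hypotheses[i] for _, i in scored]


# Toy stand-ins for trained models (hypothetical weights, illustration only).
def toy_classifier(f: Features) -> bool:
    return f[0] > 0.5


def toy_regressor(f: Features) -> float:
    return 0.7 * f[0] + 0.3 * f[1]


hyps = ["the cat sat", "cat the sat", "a cat sat down"]
feats = [(0.9, 0.2), (0.1, 0.9), (0.8, 0.8)]
ranked = cascade_rerank(hyps, feats, toy_classifier, toy_regressor)
print(ranked[0])
```

Here the second hypothesis is filtered out by the classifier before the regressor is ever called, which is the kind of saving a cascade aims for when the second stage is the more expensive model.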

Issue date

2017

Author

Liu, Yan

Faculty

Faculty of Science and Technology

Department

Department of Computer and Information Science

Degree

M.Sc.

Subject

Machine translating

Supervisor

Vong, Chi Man

Files In This Item

Full-text (Intranet only)

Location

1/F Zone C

Library URL

991005792579706306