UM E-Theses Collection (澳門大學電子學位論文庫)

Full Text
Title

Improving inter-language links between different languages of Wikipedia

English Abstract

Improving Inter-Language Links between Different Languages of Wikipedia
by Kueng-Chon Cheang
Thesis Supervisor: Assistant Professor Robert P. Biuk-Aghai

This thesis describes a method for improving the inter-language links between two language editions of Wikipedia. Inter-language links are used primarily to link a Wikipedia page to the corresponding page in another language edition of Wikipedia [1]. There are two ways to improve inter-language links: increasing the total number of links between two languages, and improving the quality of those links. I chose four sample languages: Chinese, Simple English, Swedish and Norwegian Nynorsk. In the sample from Simple English Wikipedia to Swedish Wikipedia, my application adds 1178 new inter-language links to 12870 existing categories, which means that about 9% of existing categories could gain a new inter-language link. I also tried to eliminate low-quality matches in order to improve the quality of the inter-language links. This work included demonstrating the value of the link reciprocity ratio and of the minimum number of links from a candidate. Moreover, I proposed a workflow for pushing my suggested inter-language links to Wikipedia. The average precision over 12 sampled pairs of Wikipedia language editions is 77.94%.
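The 9% figure quoted in the abstract can be checked with a short calculation. The sketch below is only an illustration of that arithmetic, using the two counts given for the Simple English to Swedish sample; the function name `coverage_percent` is hypothetical and does not come from the thesis.

```python
# Figures quoted in the abstract (Simple English -> Swedish sample).
new_links = 1178
existing_categories = 12870


def coverage_percent(added: int, total: int) -> float:
    """Return the share of `total` that `added` represents, as a percentage.

    Hypothetical helper for illustration; not taken from the thesis.
    """
    if total <= 0:
        raise ValueError("total must be positive")
    return 100.0 * added / total


share = coverage_percent(new_links, existing_categories)
print(f"{share:.2f}% of existing categories gain a new inter-language link")
```

Rounded to a whole number, the result agrees with the 9% stated in the abstract.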

Issue date

2015

Author

Cheang, Kueng Chon

Faculty

Faculty of Science and Technology

Department

Department of Computer and Information Science

Degree

M.Sc.

Subject

Computational linguistics

Multilingual computing

Wikipedia

Supervisor

Biuk-Aghai, Robert P.

Files In This Item

Full-text (Intranet only)

Location
1/F Zone C
Library URL
991000756599706306