Plagiarism detection in Arabic using translation and medical ontology

Khaled Omar, Bassel Alkhatib and Mayssoon Dashash

The rapid growth in the number of documents on the World Wide Web, and the ease with which they can be accessed and downloaded, has led to a serious problem: using others' work without giving them credit.
Although a number of methods have been developed to detect common forms of plagiarism in Arabic, such as changing sentence structure or replacing words with their synonyms, it remains difficult to detect plagiarism when quoted sentences have been deliberately modified.
In this paper, a semantic similarity algorithm is proposed for detecting plagiarism in Arabic medical papers using an Arabic semantic network (Arabic WordNet), automatic translation into English, and international medical ontologies (in English). The algorithm determines the degree of semantic similarity between original and suspected documents by calculating the intersection of the semantic information in the two files. Arabic WordNet is used to identify the concepts in each sentence together with their synonyms, and the medical ontologies are used to expand the sentences of the original and suspected texts before the semantic similarity between them is calculated. Automatic translation from Arabic to English is used so that the international medical ontologies can be exploited, because medical ontologies are lacking in Arabic.
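The intersection-based comparison described above can be illustrated with a minimal sketch. It is not the authors' implementation: the toy synonym table stands in for the Arabic WordNet and medical ontology lookups, the function names are hypothetical, and the Jaccard ratio is an assumed choice of score, since the abstract does not specify how the intersection is normalized.

```python
from typing import Dict, Set

# Hypothetical toy synonym table; a real system would query
# Arabic WordNet and a medical ontology instead.
SYNONYMS: Dict[str, Set[str]] = {
    "heart": {"cardiac"},
    "attack": {"infarction"},
    "drug": {"medication", "medicine"},
}


def build_concept_index(synonyms: Dict[str, Set[str]]) -> Dict[str, str]:
    """Map every word to one canonical concept so synonyms coincide."""
    index: Dict[str, str] = {}
    for head, alternatives in synonyms.items():
        index[head] = head
        for alt in alternatives:
            index[alt] = head
    return index


CONCEPTS = build_concept_index(SYNONYMS)


def concepts_of(sentence: str) -> Set[str]:
    """Return the set of canonical concepts found in a sentence."""
    tokens = sentence.lower().split()
    # Words missing from the table are kept as their own concept.
    return {CONCEPTS.get(token, token) for token in tokens}


def semantic_similarity(original: str, suspected: str) -> float:
    """Jaccard overlap of concept sets (an assumed scoring choice)."""
    a, b = concepts_of(original), concepts_of(suspected)
    if not a or not b:
        return 0.0
    return len(a & b) / len(a | b)


if __name__ == "__main__":
    source = "the drug reduces heart attack risk"
    reworded = "risk of cardiac infarction the medication reduces"
    print(semantic_similarity(source, reworded))
```

Because the comparison works on sets of concepts rather than on word sequences, the score is unaffected by reordering and treats listed synonyms as the same concept, which are the two disguises the abstract mentions.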
The proposed algorithm showed good results in determining the similarity between the original and suspected documents using a semantic detection score: it detected plagiarism even when the user replaced some words with their synonyms or restructured the sentences of the plagiarized text.

