பைத்தான் தானியக்கம்வழி விக்கிமூல மேலடி - கீழடி மேம்பாடு

Python-based Automation for Header-footer Improvement in Wikisource

DOI:

https://doi.org/10.5281/zenodo.10991314

Keywords:

Wikisource, footer, Wikipedia, Wiki projects, technical improvement, header, Python

Abstract

Tamil Wikisource is one of the best projects of the Wikimedia Foundation. It works like an online library: a repository of free content that holds original works. The Tamil Wikisource collection currently contains more than 2,468 books covering a broad range of subjects, including Sangam literature, devotional literature, folklore, history, linguistics, a scientific encyclopedia, mathematics, science, art, creative art, grammar, travelogue, and biography. Contributing to the project is easy. Anyone can upload books of their choice, edit uploaded books, improve pages that have already been edited, and comment on the development of the project.

This project is a valuable record of the Tamil language and culture. It was started in 2003, and as of 2023 the website has over 350,000 pages. A few pages are missing from some of the books. The project is also actively used as an important source for modern computer data processing, and Tamil Wikisource has played a role in the development of present-day Google Optical Character Recognition.

After creating a user account on Tamil Wikisource, a contributor can upload or edit books. To upload a book, one must first download it to one's own computer, convert it to a PDF file, and then upload it to Wikisource, taking care to check the copyright information. Editing an uploaded book involves searching for the book, viewing its pages, making the necessary edits, and saving the changes. The project, which has been improved by many volunteers, includes a wide range of classical Tamil works, such as Tolkappiyam, Nannul, Thirukkural, Silappathikaram, Manimekalai, Kambaramayanam, Sangam literature, and devotional literature. While it contributes to the development of the Tamil language and the spread of Tamil literature, it also functions as a big data repository for the development of artificial intelligence in Tamil. Since all the books in it are open source and free content, they are available to everyone without restriction. However, 286,926 pages are still unverified, and over 200,000 pages do not have headers or footers. The number of pages will increase as more books are uploaded, so the need for improvement will continue to grow. Automating this work will reduce the time it takes, and the Python language is well suited to the task. This research article aims to develop a technique that uses Python programming to automatically improve the headers or footers of pages.
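
As an illustration of the kind of automation the abstract describes, the following is a minimal sketch and not the authors' implementation. It assumes that a page in the Wikisource Page namespace is stored in the usual ProofreadPage-style layout, with the header and footer each wrapped in a `<noinclude>` section around the body. The function name `fill_header_footer`, the `{{rh|...}}` running-header template, and the `<references/>` footer are illustrative assumptions. Fetching and saving pages (for example through a bot framework) is deliberately left out.

```python
import re

# Assumed serialized layout of a Page-namespace page:
# <noinclude><pagequality ... />HEADER</noinclude>BODY<noinclude>FOOTER</noinclude>
PAGE_RE = re.compile(
    r"\A<noinclude>(?P<quality><pagequality[^>]*/>)(?P<header>.*?)</noinclude>"
    r"(?P<body>.*)"
    r"<noinclude>(?P<footer>.*?)</noinclude>\Z",
    re.DOTALL,
)


def fill_header_footer(wikitext, header, footer):
    """Return wikitext with the header/footer filled in only where they are empty."""
    match = PAGE_RE.match(wikitext)
    if match is None:
        # Unknown layout: leave the page untouched rather than guess.
        return wikitext
    # Keep any header or footer that a volunteer has already written.
    new_header = match["header"] if match["header"].strip() else header
    new_footer = match["footer"] if match["footer"].strip() else footer
    return (
        f"<noinclude>{match['quality']}{new_header}</noinclude>"
        f"{match['body']}"
        f"<noinclude>{new_footer}</noinclude>"
    )


# Hypothetical page with an empty header and footer.
page = (
    '<noinclude><pagequality level="1" user="Example" /></noinclude>'
    "Body text"
    "<noinclude></noinclude>"
)
updated = fill_header_footer(page, "{{rh|12|Title|}}", "<references/>")
print(updated)
```

Because existing headers and footers are preserved, running the function twice on the same page leaves it unchanged, which matters when a script is applied across hundreds of thousands of pages.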

Published

01-04-2024

How to Cite

Thangasamy, S., A, V., A, J. P. B., S, S., S, S. S., & Rathinavel, L. (2024). பைத்தான் தானியக்கம்வழி விக்கிமூல மேலடி - கீழடி மேம்பாடு : Python-based Automation for Header-footer Improvement in Wikisource. PULAM : INTERNATIONAL JOURNAL OF TAMILOLOGY STUDIES, 37–46. https://doi.org/10.5281/zenodo.10991314