International Journal of Computer Applications
Foundation of Computer Science (FCS), NY, USA
Volume 186 - Issue 26
Published: July 2024
Authors: Shlok Deshpande, Vineet Shinde, Siddharth Chaudhari, Yashodhara V. Haribhakta
10.5120/ijca2024923738
Shlok Deshpande, Vineet Shinde, Siddharth Chaudhari, Yashodhara V. Haribhakta. Multilingual & Cross-Lingual Text Summarization of Marathi and English using Transformer Based Models and their Systematic Evaluation. International Journal of Computer Applications. 186, 26 (July 2024), 11-17. DOI=10.5120/ijca2024923738
@article{ 10.5120/ijca2024923738,
author = { Shlok Deshpande and Vineet Shinde and Siddharth Chaudhari and Yashodhara V. Haribhakta },
title = { Multilingual \& Cross-Lingual Text Summarization of Marathi and English using Transformer Based Models and their Systematic Evaluation },
journal = { International Journal of Computer Applications },
year = { 2024 },
volume = { 186 },
number = { 26 },
pages = { 11--17 },
doi = { 10.5120/ijca2024923738 },
publisher = { Foundation of Computer Science (FCS), NY, USA }
}
%0 Journal Article
%D 2024
%A Shlok Deshpande
%A Vineet Shinde
%A Siddharth Chaudhari
%A Yashodhara V. Haribhakta
%T Multilingual & Cross-Lingual Text Summarization of Marathi and English using Transformer Based Models and their Systematic Evaluation
%J International Journal of Computer Applications
%V 186
%N 26
%P 11-17
%R 10.5120/ijca2024923738
%I Foundation of Computer Science (FCS), NY, USA
The proposed methodology addresses multilingual and cross-lingual text summarization between Marathi and English through the deployment and specialized optimization of transformer-based models. The research introduces a framework designed to handle the linguistic differences between the two languages. Pegasus, T5, and BART are used for English summarization, and IndicBART, mT5, and mBART for Marathi summarization, with M2M-100 handling translation between the two languages. The core objective is cross-lingual summarization in both directions, Marathi to English and English to Marathi. The models are trained on a combination of multiple large datasets and evaluated with ROUGE, BLEU, and BERT-based metrics to assess summarization quality. In addition, a novel evaluation metric is introduced that combines concept coverage, semantic similarity, and relevance, tailored to assessing multilingual and cross-lingual summarization quality between English and Marathi. Beyond advancing cross-lingual summarization, the project aims to improve accessibility and understanding across linguistic and cultural boundaries.
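
To make the ROUGE evaluation mentioned in the abstract concrete, the following is a minimal sketch of ROUGE-N as clipped n-gram overlap between a candidate summary and a reference summary. It is not the authors' evaluation code: the function names `ngram_counts` and `rouge_n` are hypothetical, and whitespace tokenization is an assumption made for illustration (standard ROUGE tooling typically applies its own tokenization and optional stemming, and Marathi text may need a language-aware tokenizer).

```python
from collections import Counter


def ngram_counts(tokens, n):
    """Count the n-grams (as tuples) in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def rouge_n(candidate, reference, n=1):
    """Return ROUGE-N precision, recall, and F1 for two strings.

    Assumption: tokens are obtained by lowercasing and splitting on whitespace.
    """
    cand = ngram_counts(candidate.lower().split(), n)
    ref = ngram_counts(reference.lower().split(), n)

    # Clipped overlap: each n-gram counts at most as often as it
    # appears in both the candidate and the reference.
    overlap = sum((cand & ref).values())
    cand_total = sum(cand.values())
    ref_total = sum(ref.values())

    precision = overlap / cand_total if cand_total else 0.0
    recall = overlap / ref_total if ref_total else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"precision": precision, "recall": recall, "f1": f1}


if __name__ == "__main__":
    scores = rouge_n("the cat sat", "the cat sat on the mat", n=1)
    print(scores)
```

In this example every candidate unigram appears in the reference, so precision is 1.0, while only three of the six reference unigrams are covered, so recall is 0.5. Because the sketch operates on whitespace-separated tokens, the same function can be applied to Devanagari text, although scores will then depend on how well whitespace splitting matches Marathi word boundaries.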