Language translation remains a challenging task in natural language processing, requiring models that can capture the nuances and context of different languages. The objective of this project was to develop a robust Italian-to-English translation model built on attention mechanisms, with a specific focus on the Luong attention mechanism (Luong et al., 2015).
The central challenge lay in preserving the semantic meaning of the source language (Italian) while ensuring fluency and coherence in the translated output (English). Differences in sentence structure, word order, and idiomatic expressions between the two languages posed an additional challenge the model had to handle to deliver accurate, meaningful translations.
To evaluate the translation model quantitatively, the project used the BLEU (Bilingual Evaluation Understudy) score, which measures n-gram overlap between the model's output and one or more reference translations.
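For illustration, here is a minimal sketch of corpus-level BLEU scoring with NLTK; the tokenized sentence pairs are hypothetical examples, not data from the project.

```python
from nltk.translate.bleu_score import corpus_bleu, SmoothingFunction

# Each entry in `references` is a list of tokenized reference translations
# for one source sentence; each entry in `hypotheses` is one model output.
references = [
    [["the", "cat", "is", "on", "the", "table"]],
    [["i", "would", "like", "a", "coffee"]],
]
hypotheses = [
    ["the", "cat", "is", "on", "the", "table"],
    ["i", "want", "a", "coffee"],
]

# Smoothing avoids zero scores when a higher-order n-gram has no match,
# which is common for short sentences.
smooth = SmoothingFunction().method1
score = corpus_bleu(references, hypotheses, smoothing_function=smooth)
print(f"Corpus BLEU: {score:.4f}")
```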
The crux of the project lay in implementing and fine-tuning the Luong attention mechanism. Unlike an attention-free encoder-decoder, which must compress the entire source sentence into a single fixed-length vector, Luong attention lets the decoder attend to every encoder state at each decoding step, capturing long-range dependencies between words in a sentence. This was particularly crucial for handling complex sentence structures and maintaining context in translations.
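For reference, here is a minimal PyTorch sketch of one global (Luong-style) attention step with dot-product scoring; the function name and tensor shapes are illustrative assumptions, not the project's exact implementation.

```python
import torch
import torch.nn.functional as F

def attention_step(decoder_hidden, encoder_outputs):
    """One global attention step with dot scoring (hypothetical helper).

    decoder_hidden:  (batch, 1, hidden)       -- current decoder state h_t
    encoder_outputs: (batch, src_len, hidden) -- all encoder states h_s
    """
    # Alignment scores over every source position: score(h_t, h_s) = h_t . h_s
    scores = torch.bmm(decoder_hidden, encoder_outputs.transpose(1, 2))  # (batch, 1, src_len)
    # Normalize the scores into attention weights over the source sentence
    weights = F.softmax(scores, dim=-1)
    # Context vector: attention-weighted sum of the encoder states
    context = torch.bmm(weights, encoder_outputs)  # (batch, 1, hidden)
    return context, weights
```

In Luong's formulation, the resulting context vector is then concatenated with the decoder state and passed through a tanh layer to produce the attentional state used to predict the next target word.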
Different scoring variants of the Luong mechanism were experimented with, including the dot-product score, which compares the decoder state and each encoder state directly, and the concatenation ("concat") score, which passes the concatenated pair of states through a learned layer. These variants change how the model computes alignment scores, and thus how it focuses on relevant parts of the input sentence during translation, with the aim of improving the overall quality of the generated translations.
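As a sketch of how those two scoring functions differ, assuming the same tensor shapes as above (the parameter names W_a and v_a follow Luong et al.'s notation but are otherwise illustrative):

```python
import torch
import torch.nn as nn

class LuongScore(nn.Module):
    """Alignment scoring for Luong attention: 'dot' or 'concat' (illustrative)."""

    def __init__(self, hidden_size: int, method: str = "dot"):
        super().__init__()
        self.method = method
        if method == "concat":
            # concat: score(h_t, h_s) = v_a^T tanh(W_a [h_t; h_s])
            self.W_a = nn.Linear(2 * hidden_size, hidden_size, bias=False)
            self.v_a = nn.Linear(hidden_size, 1, bias=False)

    def forward(self, decoder_hidden, encoder_outputs):
        # decoder_hidden: (batch, 1, hidden); encoder_outputs: (batch, src_len, hidden)
        if self.method == "dot":
            # dot: score(h_t, h_s) = h_t . h_s, with no learned parameters
            return torch.bmm(decoder_hidden, encoder_outputs.transpose(1, 2))
        # Repeat the decoder state across source positions, then concatenate
        src_len = encoder_outputs.size(1)
        h_t = decoder_hidden.expand(-1, src_len, -1)  # (batch, src_len, hidden)
        energy = torch.tanh(self.W_a(torch.cat([h_t, encoder_outputs], dim=-1)))
        return self.v_a(energy).transpose(1, 2)       # (batch, 1, src_len)
```

The dot score is parameter-free and cheap to compute, while the concat score adds learned parameters that can help when encoder and decoder representations are not directly comparable.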