Advanced International Journal for Research

E-ISSN: 3048-7641     Impact Factor: 9.11



A Privacy-Preserving Multilingual Neural TTS Framework with Automated Artifact Correction and Email-Based Communication

Author(s) Ms. Keerthana A L, Mr. Anand Biradar, Mr. S. Satheesh Kumar, Mr. Vishal M, Prof. Vishwanath Rajaput
Country India
Abstract Recent advances in neural Text-to-Speech (TTS) have enabled highly natural speech synthesis, yet state-of-the-art models still produce artifacts such as mispronunciations, skipped words, and unnatural prosody. These errors often stem from the model's inability to contextualize rare or complex phoneme sequences. This paper presents a multilingual synthesis framework that addresses this problem by building on an automated artifact correction methodology. At the core of the system is an internal correction algorithm that detects abnormal encoder context vectors by measuring their deviation from a pre-computed "normal manifold" of training data, which allows targeted correction without retraining the model. We extend this foundation into a practical, end-to-end pipeline with two main contributions: (1) a multilingual synthesis capability that takes English text as input and produces high-fidelity, intelligible speech in English, Tamil, and Hindi, translating the text where needed, and (2) a secure communication module for sending the generated audio from one user to another by email. In our evaluation, the system achieves a 25.86% reduction in alignment errors, a Mean Opinion Score (MOS) of 4.6, and a Comparative MOS (CMOS) of +1.34, indicating a clear listener preference over the uncorrected baseline. The multilingual outputs also achieve over 98% intelligibility, indicating that the system is a robust, high-quality, and practical solution for real-world communication.
Keywords Neural Text-to-Speech (TTS), Artifact Correction, Multilingual Synthesis, English, Tamil, Hindi, Encoder Context Vectors, Normal Manifold, Secure Communication, Mean Opinion Score (MOS)
Field Computer > Data / Information
Published In Volume 6, Issue 6, November-December 2025
Published On 2025-12-15
DOI https://doi.org/10.63363/aijfr.2025.v06i06.2514
Short DOI https://doi.org/hbf935
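
The abstract describes detecting abnormal encoder context vectors by their deviation from a pre-computed "normal manifold" and correcting them without retraining. The abstract does not say how the manifold is represented, so the following is only an illustrative sketch under stated assumptions, not the authors' algorithm: it assumes the manifold can be approximated by a low-dimensional linear subspace fitted with PCA, that deviation is the reconstruction error from that subspace, and that correction means projecting a flagged vector back onto it. The function names (fit_manifold, deviation, correct) and the parameters n_components and percentile are hypothetical.

```python
import numpy as np


def fit_manifold(train_vectors, n_components=2, percentile=99.0):
    """Fit a linear stand-in for the "normal manifold" (assumption: PCA subspace).

    Returns the training mean, an orthonormal basis of shape (k, d), and a
    deviation threshold taken as a percentile of the training deviations.
    """
    mean = train_vectors.mean(axis=0)
    centered = train_vectors - mean
    # Rows of vt are orthonormal principal directions, sorted by variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    basis = vt[:n_components]
    residual = centered - centered @ basis.T @ basis
    errors = np.linalg.norm(residual, axis=1)
    threshold = np.percentile(errors, percentile)
    return mean, basis, threshold


def deviation(vectors, mean, basis):
    """Distance of each vector from the fitted subspace (reconstruction error)."""
    centered = vectors - mean
    residual = centered - centered @ basis.T @ basis
    return np.linalg.norm(residual, axis=1)


def correct(vectors, mean, basis, threshold):
    """Project vectors whose deviation exceeds the threshold back onto the subspace.

    Vectors within the threshold are returned unchanged.
    """
    abnormal = deviation(vectors, mean, basis) > threshold
    projected = mean + (vectors - mean) @ basis.T @ basis
    corrected = np.where(abnormal[:, None], projected, vectors)
    return corrected, abnormal


if __name__ == "__main__":
    # Synthetic stand-in for encoder context vectors: 8-D points near a 2-D plane.
    rng = np.random.default_rng(0)
    mixing = rng.normal(size=(2, 8))
    train = rng.normal(size=(500, 2)) @ mixing + 0.01 * rng.normal(size=(500, 8))
    mean, basis, threshold = fit_manifold(train)

    normal_vec = rng.normal(size=(1, 2)) @ mixing
    abnormal_vec = normal_vec + 5.0 * rng.normal(size=(1, 8))
    corrected, flags = correct(np.vstack([normal_vec, abnormal_vec]),
                               mean, basis, threshold)
    print("flagged as abnormal:", flags)
```

In a real TTS model the context vectors would come from the encoder at inference time, and a nonlinear manifold model or a nearest-neighbour distance might fit better than a linear subspace; the percentile threshold is likewise a placeholder for whatever criterion the paper actually uses.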