Enhancing Human-Likeness of Generative AI through Explainable Techniques

Hadi Mohammadi, Anastasia Giachanou, Ayoub Bagheri

Utrecht University

This study uses explainable AI (XAI) techniques to humanize generative models by identifying and replacing influential tokens in AI-generated text, thereby making the text more human-like. First, we use XAI approaches, such as SHAP (SHapley Additive exPlanations), to identify the most influential tokens in texts generated by AI models. We then use a binary classification system to distinguish between human-written and AI-generated content. These steps help us identify which elements of the text most strongly affect the model's output and decision-making processes. Next, we develop a token replacement technique in which the influential tokens identified in AI-generated text are systematically replaced with equivalent tokens drawn from human-written text. This method requires careful selection of replacement tokens to preserve coherence and context while improving human-like qualities. We then examine the effect of these replacements on the predicted class, that is, whether the modified text is classified as more human-like. We employ both quantitative and qualitative assessments to analyze the effectiveness of our approach. Quantitative assessments include comparing changes in the text's perplexity and fluency scores, as well as using machine learning models trained on human and AI-generated text to determine whether the modified text is more human-like. In qualitative assessments, human volunteers rate the text's naturalness, coherence, and overall human-likeness before and after token substitution. The expected findings of this study include a clear identification of the tokens that contribute most to the human-likeness of AI-generated text, as well as a demonstration that strategic token replacement can considerably increase the text's perceived authenticity.
This study aims to improve our understanding of the subtle differences between human and AI text generation, as well as to provide actionable ideas for improving generative models. Furthermore, the findings of this study can be applied in two key ways. First, the insights gained can be used to directly improve generative models, allowing them to produce more human-like text autonomously. Second, the token replacement technique can be used as a recommendation system that suggests changes to AI-generated text to make it more human-like, allowing it to integrate easily into a variety of human-centric applications such as content creation, customer service, and educational tools. This dual application shows the flexibility and practical value of our approach to strengthening the synergy between human and AI text generation.
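
The pipeline described above (score tokens, replace the most AI-indicative ones, re-check the class) can be illustrated with a minimal Python sketch. It is not the authors' implementation: the toy corpora, the `SUBSTITUTES` map, and all function names are hypothetical, and smoothed log-odds token weights from a bag-of-words model stand in for SHAP values. For such an additive linear score, each token's weight is its contribution to the decision, which is the role SHAP attributions play in the study.

```python
import math
from collections import Counter

# Toy corpora, invented purely for illustration.
HUMAN = ["honestly i think the movie was kinda slow but fun",
         "also we got there late and the food was a weird mix"]
AI = ["overall the film delves into a rich tapestry of themes",
      "furthermore the cuisine showcases a rich tapestry of flavors"]

# Hypothetical substitution map: AI-typical token -> human-style alternative.
SUBSTITUTES = {"furthermore": "also", "delves": "gets", "tapestry": "mix"}

def token_weights(human_docs, ai_docs, alpha=1.0):
    """Smoothed log-odds per token; positive values are AI-indicative."""
    h = Counter(t for d in human_docs for t in d.split())
    a = Counter(t for d in ai_docs for t in d.split())
    vocab = set(h) | set(a)
    h_total = sum(h.values()) + alpha * len(vocab)
    a_total = sum(a.values()) + alpha * len(vocab)
    return {t: math.log((a[t] + alpha) / a_total)
               - math.log((h[t] + alpha) / h_total)
            for t in vocab}

def ai_score(text, weights):
    """Additive classifier score; unseen tokens contribute 0."""
    return sum(weights.get(t, 0.0) for t in text.split())

def humanize(text, weights, substitutes, k=2):
    """Replace up to k of the most AI-indicative tokens that have a substitute."""
    tokens = text.split()
    # Rank positions by descending contribution, breaking ties by position.
    ranked = sorted(range(len(tokens)),
                    key=lambda i: (-weights.get(tokens[i], 0.0), i))
    replaced = 0
    for i in ranked:
        if replaced == k:
            break
        if weights.get(tokens[i], 0.0) > 0 and tokens[i] in substitutes:
            tokens[i] = substitutes[tokens[i]]
            replaced += 1
    return " ".join(tokens)

weights = token_weights(HUMAN, AI)
original = "furthermore the film delves into a rich tapestry of themes"
rewritten = humanize(original, weights, SUBSTITUTES, k=2)
print(rewritten)
print(ai_score(original, weights), "->", ai_score(rewritten, weights))
```

In this sketch a lower score after replacement means the text moved toward the human class. A realistic setup would swap in a trained classifier, true SHAP attributions, and context-aware substitute selection, and would also track perplexity and human ratings as the abstract describes.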