Controlled Text Simplification for Dutch using Generative Large Language Models

Wout Sinnaeve, Joni Kruijsbergen, Orphée De Clercq

Ghent University

In today’s globalized society the classroom can be seen as a melting-pot of pupils with varying backgrounds. The same is true in Flanders where, within a single classroom, pupils with diverse Dutch proficiency levels are present (Vlaamse Overheid, 2024). As a result, the Flemish government has pushed schools to take appropriate measures to improve Dutch language skills among pupils (Vanbuel et al., 2017). This places additional pressure on teachers, who are often already dealing with additional workload due to the teacher shortage (Draaisma et al., 2021).

In November 2022, the launch of ChatGPT took the world by storm and the classroom is no exception. Pupils and teacher alike are using this and other tools, often without the necessary background information (Gill et al., 2024). In this respect we wish to zoom in on the teacher side and see how well present-day chatbots, both open-source and proprietary, could be employed to perform the task of automated text simplification. This is particularly useful for teachers who need to deal with different levels of proficiency within a classroom and see whether they can easily offer personalized text material to pupils in their class.

To this purpose we compiled a corpus of 50 Dutch texts, originating from either textbooks or a teacher co-creation platform. This corpus comprises different text types meeting the different Flemish reading attainment targets for the highest level of secondary education. In a next phase these texts have been automatically simplified to four different levels: (i) the fourth and sixth (ii) grade of primary education, the first grade of secondary education (iii) as well as to a level suitable for low-proficient Dutch speakers (iv). This simplification was carried out through a few-shot prompting approach using open-source multilingual and monolingual chat models, such as LLAMA3 and GEITje 7B ultra.

The output was evaluated in two ways. First an Automatic Readability Assessment (ARA) model was developed to classify the simplified outputs into each of the four levels. For this classifier we relied on two existing corpora: an in-house corpus that has been specifically compiled for assessing readability of the levels (i-iii) in the framework of the Flemish centralized tests and the Wablieft corpus (Vandeghinste et al. 2019), which was specifically designed for low-proficient Dutch speakers and thus matches level (iv). By comparing a more traditional feature-based approach to one where the Dutch LLM RobBERT was fine-tuned on the task at hand, we found that the latter was able to attain an F1-score of over 75% when fine-tuned on as few as 1,000 instances.

Preliminary results are promising, suggesting that both GEITje 7B Ultra and LLAMA3 are capable of producing controlled simplified outputs for Dutch. Initial testing suggests a potential edge for the monolingual GEITje model over the multilingual LLAMA3 model in terms of performance. If further testing confirms that LLMs can accurately perform controlled text simplification with few-shot learning, they could serve as valuable tools for teachers. In order to further assess this usefulness, the simplified output will also be evaluated by teachers.

Draaisma, A., Woldman, N., Runhaar, P., den Brok, P., van Woerkom, M., Claessens, L., & Lucas, F. (2021). Anders Organiseren in Primair Onderwijsteams: Een Zoektocht Naar Minder Werkdruk En Meer Werkgeluk. https://doi.org/10.18174/544885
Gill, S. S., Xu, M., Patros, P., Wu, H., Kaur, R., Kaur, K., Fuller, S., Singh, M., Arora, P., Parlikad, A. K., Stankovski, V., Abraham, A., Ghosh, S. K., Lutfiyya, H., Kanhere, S. S., Bahsoon, R., Rana, O., Dustdar, S., Sakellariou, R., … Buyya, R. (2024). Transformative effects of CHATGPT on modern education: Emerging era of AI Chatbots. Internet of Things and Cyber-Physical Systems, 4, 19–23. https://doi.org/10.1016/j.iotcps.2023.06.002
Vanbuel, M., Borderé, A., & Van den Branden, K. (2017). Helpen talenbeleid en taalscreening taalgrenzen verleggen? Een reviewstudie naar effectieve taalstimuleringsmaatregelen. Universiteit Antwerpen. https://hdl.handle.net/10067/1541740151162165141
Vandeghinste, V., Bulté, B., & Augustinus, L. (2019). Wablieft: An Easy-to-Read Newspaper Corpus for Dutch. Proceedings of the CLARIN Annual Conference, 188-191.
Vlaamse Overheid. (2024, February 21). Taal en nationaliteit. Opgroeien.
© 2024 CLIN 34 Organisators. All rights reserved. Contact us via email.