Abstract
The Unlikely Duel: Evaluating Creative Writing in LLMs through a Unique Scenario*
Proceedings of the XX Conference of the Spanish Association for Artificial Intelligence, pp. 225-226
Conference of the Spanish Association for Artificial Intelligence (CAEPIA), 20th (A Coruña, Spain, 19-Jun-2024–21-Jun-2024)
2024
Abstract
This is a summary of the paper "A Confederacy of Models: a Comprehensive Evaluation of LLMs on Creative Writing", which was published in Findings of EMNLP 2023. We evaluate a range of recent state-of-the-art, instruction-tuned large language models (LLMs) on an English creative writing task, and compare them to human writers. For this purpose, we use a specifically-tailored prompt (based on an epic combat between Ignatius J. Reilly, main character of John Kennedy Toole's "A Confederacy of Dunces", and a pterodactyl) to minimize the risk of training data leakage and force the models to be creative rather than reusing existing stories. The same prompt is presented to LLMs and human writers, and evaluation is performed by humans using a detailed rubric including various aspects like fluency, style, originality or humor. Results show that some state-of-the-art commercial LLMs match or slightly outperform our human writers in most of the evaluated dimensions. Open-source LLMs lag behind. Humans keep a close lead in originality, and only the top three LLMs can handle humor at human-like levels.
Details
- Title
- The Unlikely Duel: Evaluating Creative Writing in LLMs through a Unique Scenario*
- Authors
- Carlos Gómez-Rodríguez (Author) - Universidade da Coruña
- Paul Williams (Author) - University of the Sunshine Coast, Queensland, School of Business and Creative Industries
- Additional notes
- Awarded: Best Paper of the Conference.
- Publication details
- Proceedings of the XX Conference of the Spanish Association for Artificial Intelligence, pp. 225-226
- Conference details
- Conference of the Spanish Association for Artificial Intelligence (CAEPIA), 20th (A Coruña, Spain, 19-Jun-2024–21-Jun-2024)
- Publisher
- Spanish Association for Artificial Intelligence
- Date published
- 2024
- Organisation Unit
- Indigenous and Transcultural Research Centre; School of Business and Creative Industries; Sustainability Research Cluster; Healthy Ageing Research Cluster
- Language
- English
- Record Identifier
- 991038297802621
- Output Type
- Abstract