Dialogue-based computer-assisted language learning systems allow learners to practice an L2 meaningfully with an automated agent, whether through an oral interface (spoken dialogue systems) or a written one (chatbots) (Bibauw, François, & Desmet, 2015). While various dialogue-based CALL systems have been developed and tested with learners against different evaluation schemes, individual evaluations have provided limited information on their effectiveness for L2 development (Bibauw, François, & Desmet, 2019).
To better understand their effects on L2 proficiency development, we conducted a meta-analysis of all experimental studies measuring the impact of such systems on language learning outcomes (39 publications). Effect sizes for each variable and group under observation were systematically computed from the available data ($k = 96$), using novel formulas, based on Morris and DeShon (2002), to obtain a single metric across experimental designs. Although most individual studies do not reach statistical significance, often because of small sample sizes and insufficiently sensitive outcome measures, combining all studies in a multilevel linear model reveals a significant medium effect of dialogue-based CALL on general L2 proficiency development ($d = .60$).
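As an illustration of how effect sizes can be placed on a single metric across designs, the sketch below computes a bias-corrected standardized mean gain difference for a pretest–posttest–control design, following the general approach of Morris and DeShon (2002). This is a minimal illustrative implementation, not the exact formulas used in the meta-analysis; the function name and argument names are our own.

```python
def d_ppc(m_pre_t, m_post_t, m_pre_c, m_post_c, sd_pre, n_t, n_c):
    """Standardized effect size for a pretest-posttest-control design.

    Difference of pre-to-post gains between treatment (t) and control (c)
    groups, standardized by the pooled pretest SD and adjusted with a
    small-sample bias correction (illustrative sketch after Morris &
    DeShon, 2002).
    """
    # Small-sample bias correction factor
    c_p = 1 - 3 / (4 * (n_t + n_c - 2) - 1)
    # Gain in each group
    gain_t = m_post_t - m_pre_t
    gain_c = m_post_c - m_pre_c
    # Standardize the gain difference by the pretest SD
    return c_p * (gain_t - gain_c) / sd_pre


# Hypothetical example: treatment gains 10 points, control gains 2,
# pretest SD of 10, with 20 learners per group
d = d_ppc(m_pre_t=50, m_post_t=60, m_pre_c=50, m_post_c=52,
          sd_pre=10, n_t=20, n_c=20)
```

Standardizing by the pretest SD (rather than a post-test or pooled SD) keeps the denominator independent of the treatment, which is what makes effect sizes comparable across within-group and between-group designs.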
By integrating moderator variables into our statistical model, we provide insights into the relative effectiveness of certain design characteristics (spoken vs. written interface, task-oriented vs. open-ended interaction, form-focused vs. meaning-focused design) on different learning outcomes (holistic writing/speaking proficiency, as well as specific complexity, accuracy, and fluency measures), and we model the effect of treatment duration and spacing on these outcomes, in order to better inform future system and research designs.