Answers to open-ended questions are often manually coded into different categories. This is time-consuming. Automated coding instead trains a statistical or machine learning model on a small subset of manually coded text answers. The state of the art in NLP (natural language processing) has shifted: a general language model is first pre-trained on vast amounts of unrelated data, and this model is then adapted to a specific application data set. After reviewing some earlier results, we empirically investigate whether BERT, the currently dominant pre-trained language model, is more effective at automated coding of answers to open-ended questions than non-pre-trained statistical learning approaches. In the second part of the talk, I discuss the hammock plot for visualizing categorical or mixed categorical data.
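To illustrate the pre-train-then-adapt approach mentioned above, the following is a minimal sketch of fine-tuning a pre-trained BERT model as a text classifier, assuming the Hugging Face transformers and datasets libraries; the example answers, category codes, and column names are hypothetical placeholders, not the data or method used in the talk.

# Minimal sketch, assuming the Hugging Face "transformers" and "datasets"
# libraries; answers, codes, and column names are hypothetical placeholders.
from datasets import Dataset
from transformers import (AutoModelForSequenceClassification, AutoTokenizer,
                          Trainer, TrainingArguments)

# A tiny, made-up set of manually coded open-ended answers (placeholder only).
coded = Dataset.from_dict({
    "answer": ["I work part time in a shop.", "I am looking for a new job."],
    "labels": [0, 1],  # numeric category codes assigned by human coders
})

tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")

def tokenize(batch):
    # Convert raw text answers into BERT input IDs and attention masks.
    return tokenizer(batch["answer"], truncation=True,
                     padding="max_length", max_length=64)

coded = coded.map(tokenize, batched=True)

# Pre-trained BERT with a fresh classification head for the category codes.
model = AutoModelForSequenceClassification.from_pretrained(
    "bert-base-uncased", num_labels=2)

# Fine-tune (adapt) the pre-trained model on the manually coded subset.
trainer = Trainer(
    model=model,
    args=TrainingArguments(output_dir="bert_autocoding", num_train_epochs=3,
                           per_device_train_batch_size=8),
    train_dataset=coded,
)
trainer.train()
# The fine-tuned model can then predict codes for the remaining answers.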
Date
22.6.2023, 13:00
Speaker
Prof. Matthias Schonlau, PhD (University of Waterloo)
Venue
Institute for Employment Research
Regensburger Straße 104
90478 Nürnberg
Room Re100 E10
or online via Skype
Registration
Researchers who would like to participate are asked to send an e-mail to IAB.Colloquium@iab.de