heal.abstract |
Among the various approaches for building conversational agents able to entertain humans, open-domain generation-based chatbots are a significant field of research. However, beyond understanding
what is being discussed, human communication requires awareness of how someone is feeling. Following this perspective, in this diploma thesis we study dialog generation, focusing specifically
on the challenging task of building empathetic conversational agents, which are able to understand
any implied feelings and respond accordingly.
First, we provide the reader with a brief theoretical background on machine learning (ML), deep learning (DL) and Natural Language Processing (NLP). Then we study in depth generation-based models
for dialog generation. More specifically, we analyze the vanilla seq2seq architecture, the
vanilla seq2seq with attention and the Hierarchical Recurrent Encoder Decoder (HRED) architecture.
Afterwards, we study transformer-based models that can be used in dialog generation, such as the
Transformer Encoder Decoder, BERT, GPT-2, and T5. After presenting the theoretical background of these architectures, we analyze the most commonly used decoding methods in dialog generation, providing typical examples for better understanding. Finally, we present the most common automatic metrics and human evaluation methods used for ranking dialog systems.
With the aim of creating conversational agents that understand the implied feelings of a conversation and respond accordingly, we focus on the Empathetic Dialogues task, proposed by Facebook. After a brief introduction to the task and related work, we conduct several experiments and discuss the results. We first analyze the datasets used in the experiments (Empathetic Dialogues and ConvAI2) and then present the baseline architectures used by other researchers on the task. Afterwards, we propose new ways to further improve results on the task. Specifically, we experiment with the BERT2BERT and BERT2GPT2 architectures,
achieving results comparable to previously proposed models but short of the state of the art.
Furthermore, we experiment with three versions of the T5 model. In the first approach, we
use the T5 model as is but fine-tune it on the Empathetic Dialogues dataset. In the second and the
third approaches, we extend the T5 baseline architecture with multi-task learning. All of the T5-based
approaches achieve state-of-the-art results on the average BLEU metric, while their
perplexity is close to that of the current state-of-the-art model. Moreover, after presenting the experimental results, we provide various examples that demonstrate the performance of the
proposed models qualitatively. Finally, we suggest promising
extensions and modifications for future study. |
en |