What makes a good conversation? How controllable attributes affect human judgments (2019)

Abigail See, Stephen Roller, Douwe Kiela, Jason Weston

Abstract

A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.
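The abstract names weighted decoding without describing it. As a rough illustration only, the sketch below assumes the common formulation in which, at each decoding step, weighted feature scores are added to the model's log-probabilities before the next token is chosen. The function name, the toy vocabulary, and the question-mark feature are all hypothetical and are not taken from the authors' code.

```python
import math


def weighted_decoding_step(log_probs, features, weights):
    """Choose the next token for one greedy decoding step.

    log_probs: dict mapping candidate token -> model log-probability
    features:  list of functions, each mapping a token to a float score
    weights:   list of floats, one per feature (positive encourages the
               feature, negative discourages it)
    All names here are illustrative assumptions.
    """
    def score(token):
        bonus = sum(w * f(token) for w, f in zip(weights, features))
        return log_probs[token] + bonus

    return max(log_probs, key=score)


# Toy next-token distribution (made-up numbers).
toy_log_probs = {
    "what": math.log(0.5),
    "i": math.log(0.3),
    "?": math.log(0.2),
}


def is_question_mark(token):
    # Hypothetical feature: fires only on the "?" token.
    return 1.0 if token == "?" else 0.0


no_control = weighted_decoding_step(toy_log_probs, [is_question_mark], [0.0])
more_questions = weighted_decoding_step(toy_log_probs, [is_question_mark], [2.0])
```

With a weight of zero the step reduces to ordinary greedy decoding; raising the weight shifts the choice toward tokens on which the feature fires, which is the sense in which the attribute becomes controllable at test time without retraining.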

Export citation

BibTeX

@inproceedings{see-etal-2019-makes,
    title = "What makes a good conversation? How controllable attributes affect human judgments",
    author = "See, Abigail and
      Roller, Stephen and
      Kiela, Douwe and
      Weston, Jason",
    editor = "Burstein, Jill and
      Doran, Christy and
      Solorio, Thamar",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1170",
    doi = "10.18653/v1/N19-1170",
    pages = "1702--1723",
    abstract = "A good conversation requires balance {--} between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.",
}

MODS XML

<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="see-etal-2019-makes">
    <titleInfo>
        <title>What makes a good conversation? How controllable attributes affect human judgments</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Abigail</namePart>
        <namePart type="family">See</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Stephen</namePart>
        <namePart type="family">Roller</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Douwe</namePart>
        <namePart type="family">Kiela</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jason</namePart>
        <namePart type="family">Weston</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Jill</namePart>
            <namePart type="family">Burstein</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Christy</namePart>
            <namePart type="family">Doran</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Thamar</namePart>
            <namePart type="family">Solorio</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Minneapolis, Minnesota</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.</abstract>
    <identifier type="citekey">see-etal-2019-makes</identifier>
    <identifier type="doi">10.18653/v1/N19-1170</identifier>
    <location>
        <url>https://aclanthology.org/N19-1170</url>
    </location>
    <part>
        <date>2019-06</date>
        <extent unit="page">
            <start>1702</start>
            <end>1723</end>
        </extent>
    </part>
</mods>
</modsCollection>

Endnote

%0 Conference Proceedings
%T What makes a good conversation? How controllable attributes affect human judgments
%A See, Abigail
%A Roller, Stephen
%A Kiela, Douwe
%A Weston, Jason
%Y Burstein, Jill
%Y Doran, Christy
%Y Solorio, Thamar
%S Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
%D 2019
%8 June
%I Association for Computational Linguistics
%C Minneapolis, Minnesota
%F see-etal-2019-makes
%X A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.
%R 10.18653/v1/N19-1170
%U https://aclanthology.org/N19-1170
%U https://doi.org/10.18653/v1/N19-1170
%P 1702-1723

Markdown (Informal)

[What makes a good conversation? How controllable attributes affect human judgments](https://aclanthology.org/N19-1170) (See et al., NAACL 2019)

ACL

Abigail See, Stephen Roller, Douwe Kiela, and Jason Weston. 2019. What makes a good conversation? How controllable attributes affect human judgments. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1702–1723, Minneapolis, Minnesota. Association for Computational Linguistics.