What makes a good conversation? How controllable attributes affect human judgments
Abigail See, Stephen Roller, Douwe Kiela, Jason Weston
Abstract
A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.
- Anthology ID: N19-1170
- Volume: Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
- Month: June
- Year: 2019
- Address: Minneapolis, Minnesota
- Editors: Jill Burstein, Christy Doran, Thamar Solorio
- Venue: NAACL
- Publisher: Association for Computational Linguistics
- Pages: 1702–1723
- URL: https://aclanthology.org/N19-1170
- DOI: 10.18653/v1/N19-1170
- Bibkey: see-etal-2019-makes
- Cite (ACL): Abigail See, Stephen Roller, Douwe Kiela, and Jason Weston. 2019. What makes a good conversation? How controllable attributes affect human judgments. In Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers), pages 1702–1723, Minneapolis, Minnesota. Association for Computational Linguistics.
- Cite (Informal): What makes a good conversation? How controllable attributes affect human judgments (See et al., NAACL 2019)
- PDF: https://aclanthology.org/N19-1170.pdf
- Presentation: N19-1170.Presentation.pdf
- Poster: N19-1170.Poster.pdf
- Code: additional community code
- Data: ConvAI2
Export citation
BibTeX
@inproceedings{see-etal-2019-makes,
    title = "What makes a good conversation? How controllable attributes affect human judgments",
    author = "See, Abigail and Roller, Stephen and Kiela, Douwe and Weston, Jason",
    editor = "Burstein, Jill and Doran, Christy and Solorio, Thamar",
    booktitle = "Proceedings of the 2019 Conference of the North {A}merican Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)",
    month = jun,
    year = "2019",
    address = "Minneapolis, Minnesota",
    publisher = "Association for Computational Linguistics",
    url = "https://aclanthology.org/N19-1170",
    doi = "10.18653/v1/N19-1170",
    pages = "1702--1723",
    abstract = "A good conversation requires balance {--} between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.",
}
MODS XML
<?xml version="1.0" encoding="UTF-8"?>
<modsCollection xmlns="http://www.loc.gov/mods/v3">
<mods ID="see-etal-2019-makes">
    <titleInfo>
        <title>What makes a good conversation? How controllable attributes affect human judgments</title>
    </titleInfo>
    <name type="personal">
        <namePart type="given">Abigail</namePart>
        <namePart type="family">See</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Stephen</namePart>
        <namePart type="family">Roller</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Douwe</namePart>
        <namePart type="family">Kiela</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <name type="personal">
        <namePart type="given">Jason</namePart>
        <namePart type="family">Weston</namePart>
        <role>
            <roleTerm authority="marcrelator" type="text">author</roleTerm>
        </role>
    </name>
    <originInfo>
        <dateIssued>2019-06</dateIssued>
    </originInfo>
    <typeOfResource>text</typeOfResource>
    <relatedItem type="host">
        <titleInfo>
            <title>Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)</title>
        </titleInfo>
        <name type="personal">
            <namePart type="given">Jill</namePart>
            <namePart type="family">Burstein</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Christy</namePart>
            <namePart type="family">Doran</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <name type="personal">
            <namePart type="given">Thamar</namePart>
            <namePart type="family">Solorio</namePart>
            <role>
                <roleTerm authority="marcrelator" type="text">editor</roleTerm>
            </role>
        </name>
        <originInfo>
            <publisher>Association for Computational Linguistics</publisher>
            <place>
                <placeTerm type="text">Minneapolis, Minnesota</placeTerm>
            </place>
        </originInfo>
        <genre authority="marcgt">conference publication</genre>
    </relatedItem>
    <abstract>A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.</abstract>
    <identifier type="citekey">see-etal-2019-makes</identifier>
    <identifier type="doi">10.18653/v1/N19-1170</identifier>
    <location>
        <url>https://aclanthology.org/N19-1170</url>
    </location>
    <part>
        <date>2019-06</date>
        <extent unit="page">
            <start>1702</start>
            <end>1723</end>
        </extent>
    </part>
</mods>
</modsCollection>
Endnote
%0 Conference Proceedings
%T What makes a good conversation? How controllable attributes affect human judgments
%A See, Abigail
%A Roller, Stephen
%A Kiela, Douwe
%A Weston, Jason
%Y Burstein, Jill
%Y Doran, Christy
%Y Solorio, Thamar
%S Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Volume 1 (Long and Short Papers)
%D 2019
%8 June
%I Association for Computational Linguistics
%C Minneapolis, Minnesota
%F see-etal-2019-makes
%X A good conversation requires balance – between simplicity and detail; staying on topic and changing it; asking questions and answering them. Although dialogue agents are commonly evaluated via human judgments of overall quality, the relationship between quality and these individual factors is less well-studied. In this work, we examine two controllable neural text generation methods, conditional training and weighted decoding, in order to control four important attributes for chit-chat dialogue: repetition, specificity, response-relatedness and question-asking. We conduct a large-scale human evaluation to measure the effect of these control parameters on multi-turn interactive conversations on the PersonaChat task. We provide a detailed analysis of their relationship to high-level aspects of conversation, and show that by controlling combinations of these variables our models obtain clear improvements in human quality judgments.
%R 10.18653/v1/N19-1170
%U https://aclanthology.org/N19-1170
%U https://doi.org/10.18653/v1/N19-1170
%P 1702-1723
Markdown (Informal)
[What makes a good conversation? How controllable attributes affect human judgments](https://aclanthology.org/N19-1170) (See et al., NAACL 2019)