Decomposed Deep Q-Network for Coherent Task-Oriented Dialogue Policy Learning

Publication type:

Journal article

Authors:

Zhao, Yangyang;Yin, Kai;Wang, Zhenyu;Dastani, Mehdi*;Wang, Shihan

Corresponding authors:

Dastani, Mehdi;Wang, SH

Author affiliations:

[Zhao, Yangyang] Changsha Univ Sci & Technol, Dept Comp & Commun Engn, Changsha 410000, Peoples R China.

[Yin, Kai; Wang, Zhenyu] South China Univ Technol, Dept Software, Guangzhou 510000, Peoples R China.

[Wang, Shihan; Dastani, Mehdi] Univ Utrecht, Dept Informat & Comp Sci, NL-3508 Utrecht, Netherlands.

Corresponding affiliation:

[Wang, SH; Dastani, M] Univ Utrecht, Dept Informat & Comp Sci, NL-3508 Utrecht, Netherlands.

Language:

English

Keywords:

Reinforcement learning;Periodic structures;dialogue policy;action space inflation;incoherence problem

Journal:

IEEE-ACM TRANSACTIONS ON AUDIO SPEECH AND LANGUAGE PROCESSING

ISSN:

2329-9290

Year:

2024

Volume:

Pages:

1380-1391

DOI:

10.1109/TASLP.2024.3357038

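Institutional attribution:

This university is the first-listed institution

Department affiliation:

School of Computer and Communication Engineering

Abstract: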

Reinforcement learning (RL) has emerged as a key technique for designing dialogue policies. However, action space inflation in dialogue tasks has led to a heavy decision burden and incoherence problems for dialogue policies. In this paper, we propose a novel decomposed deep Q-network (D2Q) that exploits the natural structure of dialogue actions to perform decomposition on Q-function, realizing efficient and coherent dialogue policy learning. Instead of directly evaluating the Q-function, it consists of two separate estimators, one for the abstract action-value functions and the other for the s...
