
Decomposed Deep Q-Network for Coherent Task-Oriented Dialogue Policy Learning

Publication type:
Journal article
Authors:
Zhao, Yangyang; Yin, Kai; Wang, Zhenyu; Dastani, Mehdi*; Wang, Shihan
Corresponding authors:
Dastani, Mehdi; Wang, Shihan
Author affiliations:
[Zhao, Yangyang] Changsha Univ Sci & Technol, Dept Comp & Commun Engn, Changsha 410000, Peoples R China.
[Yin, Kai; Wang, Zhenyu] South China Univ Technol, Dept Software, Guangzhou 510000, Peoples R China.
[Wang, Shihan; Dastani, Mehdi] Univ Utrecht, Dept Informat & Comp Sci, NL-3508 Utrecht, Netherlands.
Corresponding institution:
[Wang, Shihan; Dastani, Mehdi] Univ Utrecht, Dept Informat & Comp Sci, NL-3508 Utrecht, Netherlands.
Language:
English
Keywords:
Reinforcement learning; Periodic structures; dialogue policy; action space inflation; incoherence problem
Journal:
IEEE/ACM Transactions on Audio, Speech, and Language Processing
ISSN:
2329-9290
Year:
2024
Volume:
32
Pages:
1380-1391
Institutional attribution:
This university is listed as the first institution
Department:
School of Computer and Communication Engineering
Abstract:
Reinforcement learning (RL) has emerged as a key technique for designing dialogue policies. However, action space inflation in dialogue tasks has led to a heavy decision burden and incoherence problems for dialogue policies. In this paper, we propose a novel decomposed deep Q-network (D2Q) that exploits the natural structure of dialogue actions to perform decomposition on the Q-function, realizing efficient and coherent dialogue policy learning. Instead of directly evaluating the Q-function, it consists of two separate estimators, one for the abstract action-value functions and the other for the s...
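The (truncated) abstract describes a Q-function decomposed across two separate estimators that mirror the structure of dialogue actions. Below is a minimal, illustrative PyTorch sketch of such a decomposition, assuming the two estimators score abstract dialogue acts and slots respectively and that their outputs are combined additively into per-(act, slot) Q-values. The class name, head names, and additive composition are assumptions made for illustration, not the paper's released implementation.

# Illustrative sketch only: a two-estimator Q-value decomposition for
# structured dialogue actions (act type x slot). All names and the additive
# composition rule are assumptions, not the authors' code.
import torch
import torch.nn as nn


class DecomposedQNetwork(nn.Module):
    """Assemble Q(s, a) from two estimators instead of one monolithic head:
    one scores abstract dialogue acts, the other scores slots."""

    def __init__(self, state_dim: int, num_acts: int, num_slots: int, hidden: int = 128):
        super().__init__()
        self.encoder = nn.Sequential(nn.Linear(state_dim, hidden), nn.ReLU())
        # Estimator 1: values for abstract acts (e.g. request, inform, confirm).
        self.act_head = nn.Linear(hidden, num_acts)
        # Estimator 2: values for slots (e.g. price, area, food).
        self.slot_head = nn.Linear(hidden, num_slots)

    def forward(self, state: torch.Tensor) -> torch.Tensor:
        h = self.encoder(state)
        q_act = self.act_head(h)    # (batch, num_acts)
        q_slot = self.slot_head(h)  # (batch, num_slots)
        # Compose a Q-value for every (act, slot) pair by broadcasting;
        # the additive combination here is an assumed composition rule.
        return q_act.unsqueeze(2) + q_slot.unsqueeze(1)  # (batch, num_acts, num_slots)


if __name__ == "__main__":
    net = DecomposedQNetwork(state_dim=30, num_acts=5, num_slots=8)
    q = net(torch.randn(4, 30))
    print(q.shape)  # torch.Size([4, 5, 8])

Because each estimator is only as wide as its own factor of the action space, the composed (act x slot) space never has to be enumerated by a single output layer, which is the intuition behind decomposing the Q-function to counter action space inflation.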
