Reinforcement learning,as an unsupervised learning in machine learning,one of difficulties problem in practical application is how to balance the relation between exploration and exploitation. To solve this problem,a dynamic adjustment strategyof parameter basis of Q learning combined with ε-greedy strategy is presented. This strategy is based on the learning status of agent in various states of environment in the learning process,making parameter self-adaptation,to better balance the relation between exploration and exploitation.Meanwhile,an ...