This paper is a proof of concept that illustrates the potential for deep reinforcement learning to enable flexible and practical assistive systems.
  
https://arxiv.org/abs/1703.06207v5 Cooperating with Machines

In contrast, less attention has been given to developing autonomous machines that establish mutually cooperative relationships with people who may not share the machine's preferences. A main challenge has been that human cooperation does not require sheer computational power, but rather relies on intuition [11], cultural norms [12], emotions and signals [13, 14, 15, 16], and pre-evolved dispositions toward cooperation [17], common-sense mechanisms that are difficult to encode in machines for arbitrary contexts. Here, we combine a state-of-the-art machine-learning algorithm with novel mechanisms for generating and acting on signals to produce a new learning algorithm that cooperates with people and other machines at levels that rival human cooperation in a variety of two-player repeated stochastic games. This is the first general-purpose algorithm that is capable, given a description of a previously unseen game environment, of learning to cooperate with people within short timescales in scenarios previously unanticipated by algorithm designers. This is achieved without complex opponent modeling or higher-order theories of mind, thus showing that flexible, fast, and general human-machine cooperation is computationally achievable using a non-trivial, but ultimately simple, set of algorithmic mechanisms.
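
As a rough intuition for the signaling idea (this is not the paper's actual algorithm; see the arXiv link above for that), here is a minimal Python sketch: a bandit learner picks among a few fixed expert strategies in a repeated prisoner's dilemma and honestly announces its planned action as a cheap-talk signal. Because the partner conditions on that signal, cooperation-announcing experts earn more, so the learner settles into mutual cooperation. The expert set, payoffs, and the trusting-partner model are all illustrative assumptions, not taken from the paper.

<code python>
import random

# Row player's payoffs in a standard prisoner's dilemma;
# 'C' = cooperate, 'D' = defect.
PAYOFF = {('C', 'C'): 3, ('C', 'D'): 0, ('D', 'C'): 5, ('D', 'D'): 1}

# Hypothetical expert strategies; each maps the opponent's last
# action (or None on round one) to a planned action.
EXPERTS = {
    'always_cooperate': lambda last_opp: 'C',
    'always_defect':    lambda last_opp: 'D',
    'tit_for_tat':      lambda last_opp: last_opp or 'C',
}


class SignalingLearner:
    """Epsilon-greedy bandit over expert strategies that announces
    its planned action each round as an honest cheap-talk signal."""

    def __init__(self, epsilon=0.1):
        self.epsilon = epsilon
        self.value = {name: 0.0 for name in EXPERTS}  # mean payoff so far
        self.count = {name: 0 for name in EXPERTS}
        self.last_opp = None

    def choose(self):
        # Explore a random expert occasionally, otherwise exploit.
        if random.random() < self.epsilon:
            name = random.choice(list(EXPERTS))
        else:
            name = max(self.value, key=self.value.get)
        action = EXPERTS[name](self.last_opp)
        return name, action, action  # the signal is the planned action

    def update(self, name, payoff, opp_action):
        # Incremental running mean of the chosen expert's payoff.
        self.count[name] += 1
        self.value[name] += (payoff - self.value[name]) / self.count[name]
        self.last_opp = opp_action


def trusting_partner(signal):
    """Stand-in for the human partner: cooperates exactly when the
    machine signals cooperation."""
    return 'C' if signal == 'C' else 'D'


agent = SignalingLearner()
for _ in range(500):
    name, action, signal = agent.choose()
    opp = trusting_partner(signal)
    agent.update(name, PAYOFF[(action, opp)], opp)

print({k: round(v, 2) for k, v in agent.value.items()})
# Cooperation-announcing experts average ~3 while always_defect
# averages ~1, so the greedy choice converges on cooperation.
</code>

The toy makes the abstract's point concrete: with no signal, defection looks tempting round by round, but once the partner reacts to an honest announcement, the learner discovers that announced cooperation is the higher-payoff policy, without any opponent modeling or theory of mind.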