This shows you the differences between two versions of the page.

human_in_the_loop [2017/04/17 18:14] external edit
human_in_the_loop [2018/03/15 11:23] (current)
Line 25: Line 25:
 http://monajalal.github.io/assets/pdf/cvpr2017_1315.pdf Online Graph Completion: Multivariate Signal Recovery in Computer Vision
 +https://arxiv.org/abs/1802.01744 Shared Autonomy via Deep Reinforcement Learning
 +This paper is a proof of concept that illustrates the potential for deep reinforcement learning to enable flexible and practical assistive systems.
 +https://arxiv.org/abs/1703.06207v5 Cooperating with Machines
 + In contrast, less attention has been given to developing autonomous machines that establish mutually cooperative relationships with people who may not share the machine's preferences. A main challenge has been that human cooperation does not require sheer computational power, but rather relies on intuition [11], cultural norms [12], emotions and signals [13, 14, 15, 16], and pre-evolved dispositions toward cooperation [17], common-sense mechanisms that are difficult to encode in machines for arbitrary contexts. Here, we combine a state-of-the-art machine-learning algorithm with novel mechanisms for generating and acting on signals to produce a new learning algorithm that cooperates with people and other machines at levels that rival human cooperation in a variety of two-player repeated stochastic games. This is the first general-purpose algorithm that is capable, given a description of a previously unseen game environment, of learning to cooperate with people within short timescales in scenarios previously unanticipated by algorithm designers. This is achieved without complex opponent modeling or higher-order theories of mind, thus showing that flexible, fast, and general human-machine cooperation is computationally achievable using a non-trivial, but ultimately simple, set of algorithmic mechanisms.