https://arxiv.org/abs/1701.06972 Deep Network Guided Proof Search

Deep learning techniques lie at the heart of several significant AI advances in recent years, including object recognition and detection, image captioning, machine translation, speech recognition and synthesis, and playing the game of Go. Automated first-order theorem provers can aid in the formalization and verification of mathematical theorems and play a crucial role in program analysis, theory reasoning, security, interpolation, and system verification. Here we suggest deep-learning-based guidance in the proof search of the theorem prover E. We train and compare several deep neural network models on the traces of existing ATP proofs of Mizar statements and use them to select processed clauses during proof search. We give experimental evidence that with a hybrid, two-phase approach, deep-learning-based guidance can significantly reduce the average number of proof search steps while increasing the number of theorems proved. Using a few proof-guidance strategies that leverage deep neural networks, we have found first-order proofs of 7.36% of the first-order logic translations of the Mizar Mathematical Library theorems that did not previously have ATP-generated proofs. This increases the ratio of statements in the corpus with ATP-generated proofs from 56% to 59%.
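
As a toy illustration of learned clause selection, the sketch below runs a given-clause saturation loop over propositional clauses, with a pluggable score_clause function standing in for the trained network. The resolution core, the even/odd alternation mimicking the hybrid two-phase scheme, and the example clause set are all assumptions for illustration, not E's actual interface.

    import heapq
    import itertools

    def resolve(c1, c2):
        # All binary resolvents of two propositional clauses, where a
        # clause is a frozenset of nonzero ints and -x negates x.
        return [(c1 - {lit}) | (c2 - {-lit}) for lit in c1 if -lit in c2]

    def prove(clauses, score_clause, max_steps=10000):
        # Given-clause saturation loop. score_clause maps a clause to a
        # float (smaller = select sooner); even steps trust the learned
        # score, odd steps fall back to a cheap size heuristic, mimicking
        # the hybrid, two-phase scheme. No subsumption or duplicate
        # elimination, so this is only a toy.
        processed = []
        tie = itertools.count()
        unprocessed = [(score_clause(c), next(tie), c) for c in clauses]
        heapq.heapify(unprocessed)
        for step in range(max_steps):
            if not unprocessed:
                return "saturated"
            _, _, given = heapq.heappop(unprocessed)
            if not given:                      # empty clause: proof found
                return "proof"
            for other in processed:
                for res in resolve(given, other):
                    cost = score_clause(res) if step % 2 == 0 else len(res)
                    heapq.heappush(unprocessed, (cost, next(tie), res))
            processed.append(given)
        return "timeout"

    # Toy usage: {p}, {-p, q}, {-q} is unsatisfiable, so a proof exists.
    clauses = [frozenset({1}), frozenset({-1, 2}), frozenset({-2})]
    print(prove(clauses, score_clause=len))    # a trained model replaces len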

https://arxiv.org/abs/1703.00426v1 HolStep: A Machine Learning Dataset for Higher-order Logic Theorem Proving

We introduce a new dataset based on Higher-Order Logic (HOL) proofs, for the purpose of developing new machine-learning-based theorem-proving strategies. We make this dataset publicly available under the BSD license. We propose various machine learning tasks that can be performed on this dataset, and discuss their significance for theorem proving.
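
The dataset's central task is binary: predict whether an intermediate statement is useful in a proof. As a minimal sketch of that classification setup, the following trains a hashed bag-of-tokens logistic regressor on hypothetical (statement, label) pairs; the two example lines and their labels are invented placeholders, and parsing HolStep's actual file format is omitted.

    import numpy as np

    def featurize(tokens, dim=1024):
        # Hashed bag-of-tokens vector: the unconditioned variant of the
        # task predicts usefulness from the statement alone.
        v = np.zeros(dim)
        for t in tokens:
            v[hash(t) % dim] += 1.0
        return v

    def train_logreg(X, y, lr=0.1, epochs=200):
        # Plain batch-gradient-descent logistic regression.
        w, b = np.zeros(X.shape[1]), 0.0
        for _ in range(epochs):
            p = 1.0 / (1.0 + np.exp(-(X @ w + b)))
            w -= lr * (X.T @ (p - y)) / len(y)
            b -= lr * (p - y).mean()
        return w, b

    # Hypothetical (statement tokens, used-in-proof label) pairs.
    data = [("|- ( a + b ) = ( b + a )".split(), 1.0),
            ("|- ! x . x = x".split(), 0.0)]
    X = np.stack([featurize(t) for t, _ in data])
    y = np.array([label for _, label in data])
    w, b = train_logreg(X, y)
    print(X @ w + b)   # higher score = more likely useful in the proof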

https://arxiv.org/abs/1606.04442 DeepMath - Deep Sequence Models for Premise Selection

Our models are relatively shallow networks with inherent limitations on representational power and are incapable of capturing high-level properties of mathematical statements.

https://arxiv.org/abs/1703.03868 Front-to-End Bidirectional Heuristic Search with Near-Optimal Node Expansions

https://arxiv.org/abs/1804.00846 Learning to Search via Self-Imitation

We study the problem of learning a good search policy. To do so, we propose the self-imitation learning setting, which builds upon imitation learning in two ways. First, self-imitation uses feedback provided by retrospective analysis of demonstrated search traces. Second, the policy can learn from its own decisions and mistakes without requiring repeated feedback from an external expert. Combined, these two properties allow our approach to iteratively scale up to problem sizes larger than the initial one for which expert demonstrations were provided. We showcase the effectiveness of our approach on a synthetic maze-solving task and the problem of risk-aware path planning.
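
A minimal sketch of the self-imitation loop under stated assumptions: sample_problem and search are placeholders for the paper's maze or path-planning components, policy.fit is a generic supervised-learning call, and SearchResult.trace holds only the decisions that retrospective analysis kept from a successful search.

    from dataclasses import dataclass, field

    @dataclass
    class SearchResult:
        solved: bool
        trace: list = field(default_factory=list)  # decisions on the solution path

    def self_imitation(policy, sample_problem, search,
                       rounds=10, size=4, grow=2, batch=32):
        # Alternate between policy-guided search and supervised imitation
        # of the retrospectively cleaned traces of the policy's own
        # successes; no repeated feedback from an external expert.
        dataset = []
        for _ in range(rounds):
            for _ in range(batch):
                result = search(sample_problem(size), policy)
                if result.solved:
                    dataset.extend(result.trace)
            policy.fit(dataset)        # imitation step on accumulated traces
            size += grow               # scale beyond the initial problem size
        return policy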

http://web.cs.du.edu/~sturtevant/papers/SturtevantChen16.pdf External Memory Bidirectional Search

https://arxiv.org/abs/1805.08256 Evolving Real-Time Heuristics Search Algorithms with Building Blocks

A recent work (Bulitko 2016) defined the task of finding a combination of heuristic search algorithms as a survival task. In this evolutionary approach, a space of algorithms is defined over a set of building blocks (published algorithms), and a simulated evolution is used to recombine these building blocks to find the best algorithm in that space. In this paper, we extend the set of building blocks by adding one published algorithm, namely the lookahead-based, A*-shaped local search space generation method from LSS-LRTA*, plus a novel, previously unpublished strategy that generates the local search space with Greedy Best-First Search.
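
The recombination itself can be pictured as a small genetic algorithm over algorithm components. In this sketch the three genome slots, their options, and the toy fitness function are all invented placeholders; a faithful fitness would measure solution cost on a batch of real-time pathfinding problems.

    import random

    # Hypothetical building blocks: each genome slot picks one component
    # of a real-time heuristic search algorithm.
    BLOCKS = {
        "lookahead": ["lss_astar", "greedy_bfs"],   # local search space shape
        "learning":  ["dijkstra_backup", "minimin"],
        "movement":  ["to_frontier", "one_step"],
    }

    def random_genome():
        return {slot: random.choice(opts) for slot, opts in BLOCKS.items()}

    def crossover(a, b):
        return {slot: random.choice([a[slot], b[slot]]) for slot in BLOCKS}

    def mutate(g, rate=0.2):
        return {slot: random.choice(BLOCKS[slot]) if random.random() < rate
                else g[slot] for slot in BLOCKS}

    def evolve(fitness, pop_size=20, generations=30):
        # Survival task: keep the fitter half, refill the population with
        # mutated crossovers of the survivors.
        pop = [random_genome() for _ in range(pop_size)]
        for _ in range(generations):
            pop.sort(key=fitness, reverse=True)
            survivors = pop[:pop_size // 2]
            pop = survivors + [mutate(crossover(random.choice(survivors),
                                                random.choice(survivors)))
                               for _ in range(pop_size - len(survivors))]
        return max(pop, key=fitness)

    # Toy fitness: count matches against one target combination.
    target = random_genome()
    print(evolve(lambda g: sum(g[s] == target[s] for s in BLOCKS)))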

https://web.stanford.edu/~jaywhang/soorl.pdf Strategic Object Oriented Reinforcement Learning

We compare different exploration strategies in a model-based setting in which exact planning is impossible. Additionally, we test our approach on Pitfall!, perhaps the hardest Atari game, and achieve significantly improved exploration and performance over prior methods.
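
One common family of exploration strategies in such model-based settings is count-based optimism, where rarely visited states receive a value bonus so the planner steers toward them. The sketch below is a generic illustration of that idea, not the paper's exact bonus; the class name and the two example states are invented.

    from collections import defaultdict

    class OptimisticValues:
        # Count-based exploration: the planner sees value + bonus, so
        # rarely visited states look artificially promising (optimism).
        def __init__(self, beta=1.0):
            self.counts = defaultdict(int)
            self.values = defaultdict(float)
            self.beta = beta

        def visit(self, state):
            self.counts[state] += 1

        def planning_value(self, state):
            bonus = self.beta / (1 + self.counts[state]) ** 0.5
            return self.values[state] + bonus

    agent = OptimisticValues()
    agent.visit("ledge")
    print(agent.planning_value("ledge"), agent.planning_value("vine"))
    # The unvisited "vine" state gets the larger bonus, pulling the planner there.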

https://arxiv.org/abs/1807.07665v1 Multitask Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

Unlike existing multitask RL approaches that explicitly describe what the agent should do, a subtask graph in our problem only describes properties of subtasks and the relationships among them, which requires the agent to perform complex reasoning to find the optimal sequence of subtasks to execute. To tackle this problem, we propose a neural subtask graph solver (NSS) which encodes the subtask graph using a recursive neural network. To overcome the difficulty of training, we propose a novel non-parametric gradient-based policy to pre-train our NSS agent and further fine-tune it with an actor-critic method. The experimental results on two 2D visual domains show that our agent can perform complex reasoning to find a near-optimal way of executing the subtask graph and generalizes well to unseen subtask graphs. In addition, we compare our agent with a Monte-Carlo tree search (MCTS) method, showing that (1) our method is much more efficient than MCTS and (2) combining MCTS with NSS dramatically improves the search performance.
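
A minimal sketch of recursive encoding over a subtask DAG: each subtask's embedding mixes its own features with the sum of its prerequisites' embeddings. The two weight matrices and the toy three-node graph are assumptions standing in for the paper's trained recursive network.

    import numpy as np

    def encode_graph(order, children, feats, W_self, W_child):
        # Bottom-up recursive encoding of a subtask DAG. `order` lists
        # node ids with prerequisites before dependents; each embedding
        # combines the node's own features with its children's embeddings.
        emb = {}
        h = W_child.shape[0]
        for n in order:
            kids = sum((emb[c] for c in children[n]), np.zeros(h))
            emb[n] = np.tanh(W_self @ feats[n] + W_child @ kids)
        return emb

    # Toy graph: subtask 2 has prerequisites 0 and 1.
    d, h = 3, 4
    rng = np.random.default_rng(0)
    W_self, W_child = rng.normal(size=(h, d)), rng.normal(size=(h, h))
    feats = {i: rng.normal(size=d) for i in range(3)}
    emb = encode_graph([0, 1, 2], {0: [], 1: [], 2: [0, 1]}, feats,
                       W_self, W_child)
    print(emb[2])   # embedding of the final subtask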

https://arxiv.org/abs/1807.08058v1 Learning Heuristics for Automated Reasoning through Deep Reinforcement Learning

We demonstrate how to learn efficient heuristics for automated reasoning algorithms through deep reinforcement learning. We consider search algorithms for quantified Boolean formulas, which can already solve formulas of impressive size, up to hundreds of thousands of variables. The main challenge is to find a representation that lends itself to making predictions in a scalable way. The heuristics learned through our approach significantly improve on the handwritten heuristics for several sets of formulas.
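
To make the decision point concrete, here is a DPLL-style backtracking sketch over plain SAT (not full QBF, and without the paper's representation) where the branching heuristic is an injected choose_var function; a learned policy would replace the handwritten frequency-based baseline shown here.

    def dpll(clauses, assign, choose_var):
        # Backtracking search; clauses are sets of signed ints, assign
        # maps variable -> bool. choose_var replaces the handwritten
        # branching heuristic (unit propagation omitted for brevity).
        simplified = []
        for c in clauses:
            if any((lit > 0) == assign[abs(lit)] for lit in c
                   if abs(lit) in assign):
                continue                           # clause already satisfied
            rest = {lit for lit in c if abs(lit) not in assign}
            if not rest:
                return None                        # conflict: backtrack
            simplified.append(rest)
        if not simplified:
            return assign                          # all clauses satisfied
        var = choose_var(simplified, assign)       # <-- learned decision point
        for val in (True, False):
            result = dpll(simplified, {**assign, var: val}, choose_var)
            if result is not None:
                return result
        return None

    def most_frequent(clauses, assign):
        # Handwritten baseline: branch on the most frequent variable.
        counts = {}
        for c in clauses:
            for lit in c:
                counts[abs(lit)] = counts.get(abs(lit), 0) + 1
        return max(counts, key=counts.get)

    print(dpll([{1, 2}, {-1, 2}, {-2, 3}], {}, most_frequent))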

https://arxiv.org/abs/1811.00796v1 Automated Theorem Proving in Intuitionistic Propositional Logic by Deep Reinforcement Learning