This is an old revision of the document! Scalable Linear Causal Inference for Irregularly Sampled Time Series with Long Range Dependencies

Linear causal analysis is central to a wide range of important application spanning finance, the physical sciences, and engineering. Much of the existing literature in linear causal analysis operates in the time domain. Unfortunately, the direct application of time domain linear causal analysis to many real-world time series presents three critical challenges: irregular temporal sampling, long range dependencies, and scale. Moreover, real-world data is often collected at irregular time intervals across vast arrays of decentralized sensors and with long range dependencies which make naive time domain correlation estimators spurious. In this paper we present a frequency domain based estimation framework which naturally handles irregularly sampled data and long range dependencies while enabled memory and communication efficient distributed processing of time series data. By operating in the frequency domain we eliminate the need to interpolate and help mitigate the effects of long range dependencies. We implement and evaluate our new work-flow in the distributed setting using Apache Spark and demonstrate on both Monte Carlo simulations and high-frequency financial trading that we can accurately recover causal structure at scale. When the Map Is Better Than the Territory

Recent research applying information theory to causal analysis has shown that the causal structure of some systems can actually come into focus and be more informative at a macroscale. That is, a macroscale description of a system (a map) can be more informative than a fully detailed microscale description of the system (the territory). This has been called “causal emergence.” Recognising Top-Down Causation

One of the basic assumptions implicit in the way physics is usually done is that all causation flows in a bottom up fashion, from micro to macro scales. However this is wrong in many cases in biology, and in particular in the way the brain functions. Here I make the case that it is also wrong in the case of digital computers - the paradigm of mechanistic algorithmic causation - and in many cases in physics, ranging from the origin of the arrow of time to the process of state vector preparation. I consider some examples from classical physics, as well as the case of digital computers, and then explain why this is possible without contradicting the causal powers of the underlying microphysics. Understanding the emergence of genuine complexity out of the underlying physics depends on recognising this kind of causation. Topological Causality in Dynamical Systems CAUSAL GENERATIVE NEURAL NETWORKS

Unlike previous approaches, CGNN leverages both conditional independences and distributional asymmetries to seamlessly discover bivariate and multivariate causal structures, with or without hidden variables. CGNN does not only estimate the causal structure, but a full and differentiable generative model of the data.

We believe that our approach opens new avenues of research, both from the point of view of leveraging the power of deep learning in causal discovery and from the point of view of building deep networks with better structure interpretability. Once the model is learned, the CGNNs present the advantage to be fully parametrized and may be used to simulate interventions on one or more variables of the model and evaluate their impact on a set of target variables. This usage is relevant in a wide variety of domains, typically among medical and sociological domains. An Algorithmic Information Calculus for Causal Discovery and Reprogramming Systems

This calculus entails finding and applying controlled interventions to an evolving object to estimate how its algorithmic information content is affected in terms of positive or negative shifts towards and away from randomness in connection to causation. The approach is an alternative to statistical approaches for inferring causal relationships and formulating theoretical expectations from perturbation analysis Theoretical Impediments to Machine Learning

Current machine learning systems operate, almost exclusively, in a purely statistical mode, which puts severe theoretical limits on their performance. We consider the feasibility of leveraging counterfactual reasoning in machine learning tasks, and to identify areas where such reasoning could lead to major breakthroughs in machine learning applications. The Blessings of Multiple Causes

We propose the deconfounder, an algorithm that combines unsupervised machine learning and predictive model checking to perform causal inference in multiple-cause settings. The deconfounder infers a latent variable as a substitute for unobserved confounders and then uses that substitute to perform causal inference. We develop theory for when the deconfounder leads to unbiased causal estimates, and show that it requires weaker assumptions than classical causal inference. We analyze its performance in three types of studies: semi-simulated data around smoking and lung cancer, semi-simulated data around genome wide association studies, and a real dataset about actors and movie revenue. The deconfounder provides a checkable approach to estimating close-to-truth causal effects. The Deconfounded Recommender: A Causal Inference Approach to Recommendation Discovering Context Specific Causal Relationships Transfer Learning for Estimating Causal Effects using Neural Networks SAM: Structural Agnostic Model, Causal Discovery and Penalized Adversarial Learning

We present the Structural Agnostic Model (SAM), a framework to estimate end-to-end non-acyclic causal graphs from observational data. In a nutshell, SAM implements an adversarial game in which a separate model generates each variable, given real values from all others. In tandem, a discriminator attempts to distinguish between the joint distributions of real and generated samples. Finally, a sparsity penalty forces each generator to consider only a small subset of the variables, yielding a sparse causal graph. SAM scales easily to hundreds variables. CAUSALGAN: LEARNING CAUSAL IMPLICIT GENERATIVE MODELS WITH ADVERSARIAL TRAINING Discovering Causal Signals in Images Learning Representations for Counterfactual Inference

We propose a new algorithmic framework for counterfactual inference which brings together ideas from domain adaptation and representation learning. Deep IV: A Flexible Approach for Counterfactual Prediction