How to estimate Fisher matrices from simulations [CL]

http://arxiv.org/abs/2305.08994


The Fisher information matrix is a quantity of fundamental importance for information geometry and asymptotic statistics. In practice, it is widely used to quickly estimate the expected information available in a data set and guide experimental design choices. In many modern applications, it is intractable to analytically compute the Fisher information and Monte Carlo methods are used instead. The standard Monte Carlo method produces estimates of the Fisher information that can be biased when the Monte Carlo noise is non-negligible. Most problematic is noise in the derivatives, as this leads to an overestimation of the available constraining power, given by the inverse Fisher information. In this work we find another simple estimate that is oppositely biased and produces an underestimate of the constraining power. This estimator can either be used to give approximate bounds on the parameter constraints or be combined with the standard estimator to give improved, approximately unbiased estimates. Both the alternative and the combined estimators are asymptotically unbiased, so they can also be used as a convergence check of the standard approach. We discuss potential limitations of these estimators and provide methods to assess their reliability. These methods accelerate the convergence of Fisher forecasts, as unbiased estimates can be achieved with fewer Monte Carlo samples, and so can be used to reduce the simulated data set size by several orders of magnitude.
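
As a rough illustration of the estimator whose noise bias the paper addresses, here is a generic sketch of the standard simulation-based Fisher estimate for a Gaussian likelihood with parameter-independent covariance (not the authors' corrected estimators):

    # Standard Monte Carlo Fisher estimate; noisy derivative estimates bias the
    # implied constraining power high, which is the problem the paper tackles.
    import numpy as np

    def standard_fisher(sims_fid, sims_plus, sims_minus, step):
        """sims_fid: (n_sims, n_data) simulations at the fiducial parameters.
        sims_plus / sims_minus: dicts {param: (n_sims, n_data)} at theta +/- step.
        step: dict {param: finite-difference step size}."""
        params = list(step)
        cinv = np.linalg.inv(np.cov(sims_fid, rowvar=False))
        deriv = {p: (sims_plus[p].mean(0) - sims_minus[p].mean(0)) / (2 * step[p])
                 for p in params}
        return np.array([[deriv[a] @ cinv @ deriv[b] for b in params]
                         for a in params])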

Read this paper on arXiv…

W. Coulton and B. Wandelt
Wed, 17 May 23
29/67

Comments: Supporting code available at this https URL

Ejecta cloud distributions for the statistical analysis of impact cratering events onto asteroids' surfaces: a sensitivity analysis [EPA]

http://arxiv.org/abs/2301.04284


This work presents a model of an ejecta cloud distribution to characterise the plume generated by the impact of a projectile onto asteroid surfaces. A continuum distribution based on the combination of probability density functions is developed to describe the size, ejection speed, and ejection angles of the fragments. The ejecta distribution is used to statistically analyse the fate of the ejecta. By combining the ejecta distribution with a space-filling sampling technique, we draw samples from the distribution and assign to each a number of representative fragments so that the evolution in time of a single sample is representative of an ensemble of fragments. Using this methodology, we analyse the fate of the ejecta as a function of different modelling techniques and assumptions. We evaluate the effect of different types of distributions, ejection speed models, coefficients, etc. The results show that some modelling assumptions are more influential than others and, in some cases, they influence different aspects of the ejecta evolution, such as the share of impacting and escaping fragments or the distribution of impacting fragments on the asteroid surface.

Read this paper on arXiv…

M. Trisolini, C. Colombo and Y. Tsuda
Thu, 12 Jan 23
60/68

Comments: N/A

A general method for goodness-of-fit tests for arbitrary multivariate models [CL]

http://arxiv.org/abs/2211.03478


Goodness-of-fit tests are often used in data analysis to test the agreement of a model with a set of data. Out-of-the-box tests that can target any proposed distribution model are only available in the univariate case. In this note I discuss how to build a goodness-of-fit test for arbitrary multivariate distributions or multivariate data generation models. The resulting tests perform an unbinned analysis and do not need any trials factor or look-elsewhere correction, since the multivariate data can be analyzed all at once. The proposed distribution or generative model is used to transform the data to an uncorrelated space where the test is developed. Depending on the complexity of the model, it is possible to perform the transformation analytically or numerically with the help of a Normalizing Flow algorithm.
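
To make the transform-then-test idea concrete, here is a toy sketch for the special case of a multivariate Gaussian model, where the transformation to an uncorrelated space is analytic (the general construction in the note, e.g. via a normalizing flow, is not reproduced):

    # Whiten the data under the assumed Gaussian model, map to uniforms via the
    # probability integral transform, then apply any univariate GoF test.
    import numpy as np
    from scipy import stats

    def gof_gaussian_model(x, mean, cov):
        L = np.linalg.cholesky(cov)
        z = np.linalg.solve(L, (x - mean).T).T   # ~ iid standard normal under H0
        u = stats.norm.cdf(z).ravel()            # ~ iid uniform on [0, 1] under H0
        return stats.kstest(u, "uniform")

    rng = np.random.default_rng(0)
    cov = np.array([[1.0, 0.6], [0.6, 1.0]])
    x = rng.multivariate_normal([0.0, 0.0], cov, size=500)
    print(gof_gaussian_model(x, np.zeros(2), cov))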

Read this paper on arXiv…

L. Shtembari
Tue, 8 Nov 22
79/79

Comments: N/A

Fast and robust Bayesian Inference using Gaussian Processes with GPry [CEA]

http://arxiv.org/abs/2211.02045


We present the GPry algorithm for fast Bayesian inference of general (non-Gaussian) posteriors with a moderate number of parameters. GPry does not need any pre-training or special hardware such as GPUs, and is intended as a drop-in replacement for traditional Monte Carlo methods for Bayesian inference. Our algorithm is based on generating a Gaussian Process surrogate model of the log-posterior, aided by a Support Vector Machine classifier that excludes extreme or non-finite values. An active learning scheme allows us to reduce the number of required posterior evaluations by two orders of magnitude compared to traditional Monte Carlo inference. Our algorithm allows for parallel evaluations of the posterior at optimal locations, further reducing wall-clock times. We significantly improve performance by using properties of the posterior in our active learning scheme and in the definition of the GP prior. In particular, we account for the expected dynamical range of the posterior in different dimensionalities. We test our model against a number of synthetic and cosmological examples. GPry outperforms traditional Monte Carlo methods when the evaluation time of the likelihood (or the calculation of theoretical observables) is of the order of seconds; for evaluation times of over a minute it can perform inference in days that would take months using traditional methods. GPry is distributed as an open source Python package (pip install gpry) and can also be found at https://github.com/jonaselgammal/GPry.
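
A bare-bones cartoon of such a GP-surrogate active-learning loop, using scikit-learn in place of GPry's own machinery (no SVM classifier, and a deliberately simplistic acquisition rule):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, ConstantKernel

    def log_post(x):                          # toy 2D Gaussian log-posterior
        return -0.5 * np.sum(x**2, axis=-1)

    rng = np.random.default_rng(1)
    X = rng.uniform(-4, 4, size=(8, 2))       # small initial design
    y = log_post(X)
    gp = GaussianProcessRegressor(ConstantKernel() * RBF([1.0, 1.0]), normalize_y=True)

    for _ in range(40):                       # active-learning iterations
        gp.fit(X, y)
        cand = rng.uniform(-4, 4, size=(2000, 2))
        mu, sd = gp.predict(cand, return_std=True)
        x_new = cand[np.argmax(mu + 2.0 * sd)]    # high posterior and/or high uncertainty
        X, y = np.vstack([X, x_new]), np.append(y, log_post(x_new))
    # gp now emulates the log-posterior and could be sampled with standard MCMC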

Read this paper on arXiv…

J. Gammal, N. Schöneberg, J. Torrado, et al.
Fri, 4 Nov 22
33/84

Comments: 36 pages, 12 figures. Comments are welcome

Simulation-based inference of Bayesian hierarchical models while checking for model misspecification [CL]

http://arxiv.org/abs/2209.11057


This paper presents recent methodological advances to perform simulation-based inference (SBI) of a general class of Bayesian hierarchical models (BHMs), while checking for model misspecification. Our approach is based on a two-step framework. First, the latent function that appears as the second layer of the BHM is inferred and used to diagnose possible model misspecification. Second, target parameters of the trusted model are inferred via SBI. Simulations used in the first step are recycled for score compression, which is necessary for the second step. As a proof of concept, we apply our framework to a prey-predator model built upon the Lotka-Volterra equations and involving complex observational processes.

Read this paper on arXiv…

F. Leclercq
Fri, 23 Sep 22
24/70

Comments: 6 pages, 2 figures. Accepted for publication as proceedings of MaxEnt’22 (18-22 July 2022, IHP, Paris, France, this https URL). The pySELFI code is publicly available at this http URL and on GitHub (this https URL)

Calibrated Predictive Distributions via Diagnostics for Conditional Coverage [CL]

http://arxiv.org/abs/2205.14568


Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms. A large body of work (including normalizing flows and Bayesian neural networks) has been devoted to describing the entire predictive distribution (PD) of a target variable $Y$ given input features $\mathbf{X}$. However, off-the-shelf PDs are usually far from being conditionally calibrated; i.e., the probability of occurrence of an event given input $\mathbf{X}$ can be significantly different from the predicted probability. Most current research on predictive inference (such as conformal prediction) concerns constructing prediction sets that not only provide correct uncertainties on average over the entire population (that is, averaging over $\mathbf{X}$), but are also approximately conditionally calibrated with accurate uncertainties for individual instances. It is often believed that the problem of obtaining and assessing entire conditionally calibrated PDs is too challenging to approach. In this work, we show that recalibration as well as validation are indeed attainable goals in practice. Our proposed method relies on the idea of regressing probability integral transform (PIT) scores against $\mathbf{X}$. This regression gives full diagnostics of conditional coverage across the entire feature space and can be used to recalibrate misspecified PDs. We benchmark our corrected prediction bands against oracle bands and state-of-the-art predictive inference algorithms for synthetic data, including settings with distributional shift and dependent high-dimensional sequence data. Finally, we demonstrate an application to the physical sciences in which we assess and produce calibrated PDs for measurements of galaxy distances using imaging data (i.e., photometric redshifts).
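
The core diagnostic, regressing PIT values against the features, can be sketched at a single coverage level alpha as follows (the choice of classifier and of a single level are illustrative simplifications, not the paper's full method):

    # Regress the indicator 1{PIT <= alpha} on the features X; under conditional
    # calibration this regression is flat and equal to alpha for every x.
    import numpy as np
    from sklearn.linear_model import LogisticRegression

    def local_coverage(pit, X, alpha=0.5):
        """pit: PIT values F(y_i | x_i) of the candidate predictive distribution."""
        labels = (pit <= alpha).astype(int)
        clf = LogisticRegression().fit(X, labels)
        return clf.predict_proba(X)[:, 1]   # deviations from alpha flag miscalibration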

Read this paper on arXiv…

B. Dey, D. Zhao, J. Newman, et al.
Tue, 31 May 22
87/89

Comments: 10 pages, 6 figures. Under review

MUSE: Marginal Unbiased Score Expansion and Application to CMB Lensing [CEA]

http://arxiv.org/abs/2112.09354


We present the marginal unbiased score expansion (MUSE) method, an algorithm for generic high-dimensional hierarchical Bayesian inference. MUSE performs approximate marginalization over arbitrary non-Gaussian latent parameter spaces, yielding Gaussianized asymptotically unbiased and near-optimal constraints on global parameters of interest. It is computationally much cheaper than exact alternatives like Hamiltonian Monte Carlo (HMC), excelling on funnel problems which challenge HMC, and does not require any problem-specific user supervision like other approximate methods such as Variational Inference or many Simulation-Based Inference methods. MUSE makes possible the first joint Bayesian estimation of the delensed Cosmic Microwave Background (CMB) power spectrum and gravitational lensing potential power spectrum, demonstrated here on a simulated data set as large as the upcoming South Pole Telescope 3G 1500 deg$^2$ survey, corresponding to a latent dimensionality of ${\sim}\,6$ million and of order 100 global bandpower parameters. On a subset of the problem where an exact but more expensive HMC solution is feasible, we verify that MUSE yields nearly optimal results. We also demonstrate that existing spectrum-based forecasting tools which ignore pixel-masking underestimate predicted error bars by only ${\sim}\,10\%$. This method is a promising path forward for fast lensing and delensing analyses which will be necessary for future CMB experiments such as SPT-3G, Simons Observatory, or CMB-S4, and can complement or supersede existing HMC approaches. The success of MUSE on this challenging problem strengthens its case as a generic procedure for a broad class of high-dimensional inference problems.

Read this paper on arXiv…

M. Millea and U. Seljak
Mon, 20 Dec 21
31/59

Comments: 22 pages, 8 figures

Incorporating Measurement Error in Astronomical Object Classification [IMA]

http://arxiv.org/abs/2112.06831


Most general-purpose classification methods, such as support-vector machine (SVM) and random forest (RF), fail to account for an unusual characteristic of astronomical data: known measurement error uncertainties. In astronomical data, this information is often available but discarded because popular machine learning classifiers cannot incorporate it. We propose a simulation-based approach that incorporates heteroscedastic measurement error into any existing classification method to better quantify uncertainty in classification. The proposed method first simulates perturbed realizations of the data from a Bayesian posterior predictive distribution of a Gaussian measurement error model. Then, a chosen classifier is fit to each simulation. The variation across the simulations naturally reflects the uncertainty propagated from the measurement errors in both labeled and unlabeled data sets. We demonstrate the use of this approach via two numerical studies. The first is a thorough simulation study applying the proposed procedure to SVM and RF, which are well-known hard and soft classifiers, respectively. The second study is a realistic classification problem of identifying high-$z$ $(2.9 \leq z \leq 5.1)$ quasar candidates from photometric data. The data were obtained from merged catalogs of the Sloan Digital Sky Survey, the $Spitzer$ IRAC Equatorial Survey, and the $Spitzer$-HETDEX Exploratory Large-Area Survey. The proposed approach reveals that out of 11,847 high-$z$ quasar candidates identified by a random forest without incorporating measurement error, 3,146 are potential misclassifications. Additionally, out of ${\sim}1.85$ million objects not identified as high-$z$ quasars without measurement error, 936 can be considered candidates when measurement error is taken into account.
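
A stripped-down sketch of the simulate-and-refit idea, using independent Gaussian perturbations and a random forest (the paper's posterior-predictive formulation of the measurement-error model is richer than this):

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    def classify_with_errors(X, X_err, y, X_new, X_new_err, n_sim=50, seed=0):
        """Refit a classifier on noisy realizations of the features and report
        the spread of the predicted class probabilities."""
        rng = np.random.default_rng(seed)
        probs = []
        for _ in range(n_sim):
            Xp = X + rng.standard_normal(X.shape) * X_err          # perturb labeled data
            Xn = X_new + rng.standard_normal(X_new.shape) * X_new_err
            clf = RandomForestClassifier(n_estimators=200).fit(Xp, y)
            probs.append(clf.predict_proba(Xn)[:, 1])
        probs = np.array(probs)
        return probs.mean(axis=0), probs.std(axis=0)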

Read this paper on arXiv…

S. Shy, H. Tak, E. Feigelson, et al.
Tue, 14 Dec 21
3/98

Comments: N/A

Machine learning assisted Bayesian model comparison: learnt harmonic mean estimator [CL]

http://arxiv.org/abs/2111.12720


We resurrect the infamous harmonic mean estimator for computing the marginal likelihood (Bayesian evidence) and solve its problematic large variance. The marginal likelihood is a key component of Bayesian model selection since it is required to evaluate model posterior probabilities; however, its computation is challenging. The original harmonic mean estimator, first proposed in 1994 by Newton and Raftery, involves computing the harmonic mean of the likelihood given samples from the posterior. It was immediately realised that the original estimator can fail catastrophically since its variance can become very large and may not be finite. A number of variants of the harmonic mean estimator have been proposed to address this issue, although none have proven fully satisfactory. We present the learnt harmonic mean estimator, a variant of the original estimator that solves its large variance problem. This is achieved by interpreting the harmonic mean estimator as importance sampling and introducing a new target distribution. The new target distribution is learned to approximate the optimal but inaccessible target, while minimising the variance of the resulting estimator. Since the estimator requires only samples of the posterior, it is agnostic to the strategy used to generate posterior samples. We validate the estimator on a variety of numerical experiments, including a number of pathological examples where the original harmonic mean estimator fails catastrophically. In all cases our learnt harmonic mean estimator is shown to be highly accurate. The estimator is computationally scalable and can be applied to problems of dimension $\mathcal{O}(10^3)$ and beyond. Code implementing the learnt harmonic mean estimator is made publicly available.
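
For reference, the original 1994 estimator that the learnt variant repairs is simply the harmonic mean of the likelihood over posterior samples; a log-space sketch, shown only to illustrate what is being fixed (its variance can be huge or even infinite):

    import numpy as np

    def harmonic_mean_log_evidence(log_like):
        """log_like: log-likelihood values evaluated at posterior samples."""
        n = len(log_like)
        # 1/Z is estimated by the average of 1/L over posterior samples
        log_mean_inv_like = np.logaddexp.reduce(-log_like) - np.log(n)
        return -log_mean_inv_like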

Read this paper on arXiv…

J. McEwen, C. Wallis, M. Price, et al.
Mon, 29 Nov 21
68/94

Comments: 37 pages, 8 figures, code available at this https URL

Periodic Variable Stars Modulated by Time-Varying Parameters [CL]

http://arxiv.org/abs/2111.10264


Many astrophysical phenomena are time-varying, in the sense that their brightness changes over time. In the case of periodic stars, previous approaches assumed that changes in period, amplitude, and phase are well described by either parametric or piecewise-constant functions. With this paper, we introduce a new mathematical model for the description of the so-called modulated light curves, as found in periodic variable stars that exhibit smoothly time-varying parameters such as amplitude, frequency, and/or phase. Our model accounts for a smoothly time-varying trend, and a harmonic sum with smoothly time-varying weights. In this sense, our approach is flexible because it avoids restrictive assumptions (parametric or piecewise-constant) about the functional form of the trend and amplitudes. We apply our methodology to the light curve of a pulsating RR Lyrae star characterised by the Blazhko effect. To estimate the time-varying parameters of our model, we develop a semi-parametric method for unequally spaced time series. The estimation of our time-varying curves translates into the estimation of time-invariant parameters that can be performed by ordinary least-squares, with the following two advantages: modeling and forecasting can be implemented in a parametric fashion, and we are able to cope with missing observations. To detect serial correlation in the residuals of our fitted model, we derive the mathematical definition of the spectral density for unequally spaced time series. The proposed method is designed to estimate smoothly time-varying trend and amplitudes, as well as the spectral density function of the errors. We provide simulation results and applications to real data.

Read this paper on arXiv…

G. Motta, D. Soto and M. Catelan
Mon, 22 Nov 21
48/53

Comments: 26 pages, 6 figures, to be published in The Astrophysical Journal

A goodness-of-fit test based on a recursive product of spacings [CL]

http://arxiv.org/abs/2111.02252


We introduce a new statistical test based on the observed spacings of ordered data. The statistic is sensitive to non-uniformity in random samples and to short-lived features in event time series. Under some conditions, this new test can outperform existing ones, such as the well-known Kolmogorov-Smirnov or Anderson-Darling tests, in particular when the number of samples is small and differences occur over a small quantile of the null hypothesis distribution. A detailed description of the test statistic is provided, including an illustration and examples, together with a parameterization of its distribution based on simulation.

Read this paper on arXiv…

P. Eller and L. Shtembari
Thu, 4 Nov 21
22/73

Comments: N/A

Systematic evaluation of variability detection methods for eROSITA [HEAP]

http://arxiv.org/abs/2106.14529


The reliability of detecting source variability in sparsely and irregularly sampled X-ray light curves is investigated. This is motivated by the unprecedented survey capabilities of eROSITA onboard SRG, providing light curves for many thousand sources in its final-depth equatorial deep field survey. Four methods for detecting variability are evaluated: excess variance, amplitude maximum deviations, Bayesian blocks and a new Bayesian formulation of the excess variance. We judge the false detection rate of variability based on simulated Poisson light curves of constant sources, and calibrate significance thresholds. Simulations with flares injected favour the amplitude maximum deviation as most sensitive at low false detections. Simulations with white and red stochastic source variability favour Bayesian methods. The results are applicable also for the million sources expected in eROSITA’s all-sky survey.
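
As a pointer to one of the four statistics compared, here is a minimal sketch of the (normalized) excess variance for a binned light curve; exact conventions vary in the literature, and the paper's Bayesian reformulation is not reproduced here:

    import numpy as np

    def normalized_excess_variance(rate, rate_err):
        """Binned count-rate light curve with per-bin uncertainties."""
        mean = rate.mean()
        var = rate.var(ddof=1)
        return (var - np.mean(rate_err**2)) / mean**2   # > 0 hints at intrinsic variability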

Read this paper on arXiv…

J. Buchner, T. Boller, D. Bogensberger, et al.
Tue, 29 Jun 21
64/101

Comments: Resubmitted version after a positive first referee report. Variability analysis tools available at this https URL. 15 min talk: this https URL. To appear in A&A, Special Issue: The Early Data Release of eROSITA and Mikhail Pavlinsky ART-XC on the SRG Mission

Fasano-Franceschini Test: an Implementation of a 2-Dimensional Kolmogorov-Smirnov test in R [CL]

http://arxiv.org/abs/2106.10539


The univariate Kolmogorov-Smirnov (KS) test is a non-parametric statistical test designed to assess whether a set of data is consistent with a given probability distribution (or, in the two-sample case, whether the two samples come from the same underlying distribution). The versatility of the KS test has made it a cornerstone of statistical analysis, and it is commonly used across the scientific disciplines. However, the test proposed by Kolmogorov and Smirnov does not naturally extend to multidimensional distributions. Here, we present the fasano.franceschini.test package, an R implementation of the 2-D KS two-sample test as defined by Fasano and Franceschini (Fasano and Franceschini 1987). The fasano.franceschini.test package provides three improvements over the current 2-D KS test on the Comprehensive R Archive Network (CRAN): (i) the Fasano and Franceschini test has been shown to run in $O(n^2)$ versus the Peacock implementation, which runs in $O(n^3)$; (ii) the package implements a procedure for handling ties in the data; and (iii) the package implements a parallelized bootstrapping procedure for improved significance testing. Ultimately, the fasano.franceschini.test package presents a robust statistical test for analyzing random samples defined in two dimensions.
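
The R package is the reference implementation; purely as an illustration of the underlying quadrant-counting statistic, here is a naive O(n^2) sketch in Python (no tie handling and no bootstrap p-values):

    import numpy as np

    def quadrant_stat(centers, a, b):
        """Maximum difference between the two samples' quadrant fractions,
        taking each point of `centers` in turn as the quadrant origin."""
        d = 0.0
        for cx, cy in centers:
            for sx in (np.less, np.greater):
                for sy in (np.less, np.greater):
                    fa = np.mean(sx(a[:, 0], cx) & sy(a[:, 1], cy))
                    fb = np.mean(sx(b[:, 0], cx) & sy(b[:, 1], cy))
                    d = max(d, abs(fa - fb))
        return d

    def ks2d_two_sample(a, b):
        # average the statistics obtained by centering on each sample in turn
        return 0.5 * (quadrant_stat(a, a, b) + quadrant_stat(b, a, b))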

Read this paper on arXiv…

E. Ness-Cohn and R. Braun
Tue, 22 Jun 21
71/71

Comments: 8 pages, 4 figures

Maximum Entropy Spectral Analysis: a case study [CL]

http://arxiv.org/abs/2106.09499


The Maximum Entropy Spectral Analysis (MESA) method, developed by Burg, provides a powerful tool to perform spectral estimation of a time series. The method relies on Jaynes' maximum entropy principle and provides the means of inferring the spectrum of a stochastic process in terms of the coefficients of some autoregressive process AR($p$) of order $p$. A closed-form recursive solution provides an estimate of the autoregressive coefficients as well as of the order $p$ of the process. We provide a ready-to-use implementation of the algorithm in the form of a Python package, memspectrum. We characterize our implementation by performing a power spectral density analysis on synthetic data (with known power spectral density) and we compare different criteria for stopping the recursion. Furthermore, we compare the performance of our code with the ubiquitous Welch algorithm, using synthetic data generated from the spectrum released by the LIGO-Virgo collaboration. We find that, when compared to Welch's method, Burg's method provides a power spectral density (PSD) estimation with a systematically lower variance and bias. This is particularly evident in the case of a small number of data points, making Burg's method most suitable to work in this regime.
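
A compact sketch of Burg's recursion and the implied AR power spectral density (order selection, error handling, and the polished interface are what memspectrum adds and are not reproduced here; the one-/two-sided normalization is left loose):

    import numpy as np

    def burg_ar(x, order):
        """Burg estimate of the AR polynomial [1, a_1, ..., a_p] and noise power."""
        x = np.asarray(x, dtype=float)
        ef, eb = x[1:].copy(), x[:-1].copy()     # forward / backward prediction errors
        a = np.array([1.0])
        E = np.mean(x**2)
        for _ in range(order):
            k = -2.0 * np.dot(ef, eb) / (np.dot(ef, ef) + np.dot(eb, eb))
            a = np.concatenate([a, [0.0]])
            a = a + k * a[::-1]                  # Levinson-type coefficient update
            E *= (1.0 - k * k)
            ef, eb = (ef + k * eb)[1:], (eb + k * ef)[:-1]
        return a, E

    def burg_psd(x, order, dt=1.0, nfreq=512):
        a, E = burg_ar(x, order)
        f = np.linspace(0.0, 0.5 / dt, nfreq)
        A = np.exp(-2j * np.pi * np.outer(f, np.arange(len(a))) * dt) @ a
        return f, E * dt / np.abs(A) ** 2        # AR(p) spectral estimate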

Read this paper on arXiv…

A. Martini, S. Schmidt and W. Pozzo
Fri, 18 Jun 21
28/62

Comments: 16 pages, 13 figure, submitted to A&A

High-dimensional Bayesian model selection by proximal nested sampling [CL]

http://arxiv.org/abs/2106.03646


Imaging methods often rely on Bayesian statistical inference strategies to solve difficult imaging problems. Applying Bayesian methodology to imaging requires the specification of a likelihood function and a prior distribution, which define the Bayesian statistical model from which the posterior distribution of the image is derived. Specifying a suitable model for a specific application can be very challenging, particularly when there is no reliable ground truth data available. Bayesian model selection provides a framework for selecting the most appropriate model directly from the observed data, without reference to ground truth data. However, Bayesian model selection requires the computation of the marginal likelihood (Bayesian evidence), which is computationally challenging, prohibiting its use in high-dimensional imaging problems. In this work we present the proximal nested sampling methodology to objectively compare alternative Bayesian imaging models, without reference to ground truth data. The methodology is based on nested sampling, a Monte Carlo approach specialised for model comparison, and exploits proximal Markov chain Monte Carlo techniques to scale efficiently to large problems and to tackle models that are log-concave and not necessarily smooth (e.g., involving L1 or total-variation priors). The proposed approach can be applied computationally to problems of dimension O(10^6) and beyond, making it suitable for high-dimensional inverse imaging problems. It is validated on large Gaussian models, for which the likelihood is available analytically, and subsequently illustrated on a range of imaging problems where it is used to analyse different choices for the sparsifying dictionary and measurement model.

Read this paper on arXiv…

X. Cai, J. McEwen and M. Pereyra
Tue, 8 Jun 21
28/86

Comments: N/A

Uncertainty Quantification of a Computer Model for Binary Black Hole Formation [IMA]

http://arxiv.org/abs/2106.01552


In this paper, a fast and parallelizable method based on Gaussian Processes (GPs) is introduced to emulate computer models that simulate the formation of binary black holes (BBHs) through the evolution of pairs of massive stars. Two obstacles that arise in this application are the a priori unknown conditions of BBH formation and the large scale of the simulation data. We address them by proposing a local emulator which combines a GP classifier and a GP regression model. The resulting emulator can also be utilized in planning future computer simulations through a proposed criterion for sequential design. By propagating uncertainties of simulation input through the emulator, we are able to obtain the distribution of BBH properties under the distribution of physical parameters.
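
The classifier-plus-regressor structure described in the abstract can be sketched with scikit-learn as a generic stand-in (the paper's local emulator and its sequential-design criterion are not reproduced):

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessClassifier, GaussianProcessRegressor

    def fit_bbh_emulator(X, formed, y):
        """X: simulation inputs; formed: boolean BBH-formation flag;
        y: simulated output, defined only where formed is True."""
        clf = GaussianProcessClassifier().fit(X, formed)
        reg = GaussianProcessRegressor(normalize_y=True).fit(X[formed], y[formed])
        def predict(X_new):
            p_form = clf.predict_proba(X_new)[:, 1]          # probability a BBH forms
            mean, std = reg.predict(X_new, return_std=True)  # emulated property if it does
            return p_form, mean, std
        return predict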

Read this paper on arXiv…

L. Lin, D. Bingham, F. Broekgaarden, et al.
Fri, 4 Jun 21
33/71

Comments: 24 pages, 11 figures

Geometric variational inference [CL]

http://arxiv.org/abs/2105.10470


Efficiently accessing the information contained in non-linear and high-dimensional probability distributions remains a core challenge in modern statistics. Traditionally, estimators that go beyond point estimates are either categorized as Variational Inference (VI) or Markov chain Monte Carlo (MCMC) techniques. While MCMC methods that utilize the geometric properties of continuous probability distributions to increase their efficiency have been proposed, VI methods rarely use the geometry. This work aims to fill this gap and proposes geometric Variational Inference (geoVI), a method based on Riemannian geometry and the Fisher information metric. It is used to construct a coordinate transformation that relates the Riemannian manifold associated with the metric to Euclidean space. The distribution, expressed in the coordinate system induced by the transformation, takes a particularly simple form that allows for an accurate variational approximation by a normal distribution. Furthermore, the algorithmic structure allows for an efficient implementation of geoVI, which is demonstrated on multiple examples, ranging from low-dimensional illustrative ones to non-linear, hierarchical Bayesian inverse problems in thousands of dimensions.

Read this paper on arXiv…

P. Frank, R. Leike and T. Enßlin
Mon, 24 May 21
28/41

Comments: 40 pages, 16 figures, submitted to Entropy

Improving exoplanet detection capabilities with the false inclusion probability. Comparison with other detection criteria in the context of radial velocities [EPA]

http://arxiv.org/abs/2105.06995


Context. In exoplanet searches with radial velocity data, the most common statistical significance metrics are the Bayes factor and the false alarm probability (FAP). Both have proved useful, but do not directly address whether an exoplanet detection should be claimed. Furthermore, it is unclear which detection threshold should be taken and how robust the detections are to model misspecification. Aims. The present work aims at defining a detection criterion which conveys as precisely as possible the information needed to claim an exoplanet detection. We compare this new criterion to existing ones in terms of sensitivity and robustness. Methods. We define a significance metric called the false inclusion probability (FIP) based on the posterior probability of the presence of a planet. Posterior distributions are computed with the nested sampling package Polychord. We show that for FIP and Bayes factor calculations, defining priors on linear parameters as Gaussian mixture models allows computations to be sped up significantly. The performance of the FAP, Bayes factor and FIP is studied with simulations as well as analytical arguments. We compare the methods assuming the model is correct, then evaluate their sensitivity to the prior and likelihood choices. Results. Among other properties, the FIP offers ways to test the reliability of the significance levels, is particularly efficient at accounting for aliasing, and allows the presence of planets to be excluded with a certain confidence. We find that, in our simulations, the FIP outperforms existing detection metrics. We show that planet detections are sensitive to priors on period and semi-amplitude and that leaving the noise parameters free offers better performance than fixing a noise model based on a fit to ancillary indicators.

Read this paper on arXiv…

N. Hara, N. Unger, J. Delisle, et al.
Mon, 17 May 21
14/55

Comments: Accepted for publication in Astronomy & Astrophysics

Model-based clustering of partial records [CL]

http://arxiv.org/abs/2103.16336


Partially recorded data are frequently encountered in many applications. In practice, such datasets are usually clustered by removing incomplete cases or features with missing values, or by imputing missing values, followed by application of a clustering algorithm to the resulting altered data set. Here, we develop clustering methodology through a model-based approach using the marginal density for the observed values, based on a finite mixture model of multivariate $t$ distributions. We compare our algorithm to the corresponding full expectation-maximization (EM) approach that considers the missing values in the incomplete data set and makes a missing at random (MAR) assumption, as well as to case deletion and imputation. Since only the observed values are utilized, our approach is computationally more efficient than imputation or full EM. Simulation studies demonstrate that our approach has favorable recovery of the true cluster partition compared to case deletion and imputation under various missingness mechanisms, and is more robust to extreme MAR violations than the full EM approach since it does not use the observed values to inform those that are missing. Our methodology is demonstrated on a problem of clustering gamma-ray bursts and is implemented in the R package available at this https URL.

Read this paper on arXiv…

E. Goren and R. Maitra
Wed, 31 Mar 21
48/62

Comments: 13 pages, 3 figures, 1 table

Change point detection and image segmentation for time series of astrophysical images [IMA]

http://arxiv.org/abs/2101.11202


Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. This paper develops a method for modeling a time series of images. Under the assumption that the arrival times of the photons follow a Poisson process, the data are binned into 4D grids of voxels (time, energy band, and x-y coordinates), and viewed as a time series of non-homogeneous Poisson images. The method assumes that at each time point, the corresponding multi-band image stack is an unknown 3D piecewise constant function including Poisson noise. It also assumes that all image stacks between any two adjacent change points (in time domain) share the same unknown piecewise constant function. The proposed method is designed to estimate the number and the locations of all the change points (in time domain), as well as all the unknown piecewise constant functions between any pairs of the change points. The method applies the minimum description length (MDL) principle to perform this task. A practical algorithm is also developed to solve the corresponding complicated optimization problem. Simulation experiments and applications to real datasets show that the proposed method enjoys very promising empirical properties. Applications to two real datasets, the XMM observation of a flaring star and an emerging solar coronal loop, illustrate the usage of the proposed method and the scientific insight gained from it.

Read this paper on arXiv…

C. Xu, H. Günther, V. Kashyap, et al.
Thu, 28 Jan 21
37/64

Comments: 22 pages, 10 figures

A few brief notes on the equivalence of two expressions for statistical significance in point source detections [CL]

http://arxiv.org/abs/2008.05574


The problem of point source detection in Poisson-limited count maps has been addressed by two recent papers [M. Lampton, ApJ 436, 784 (1994); D. E. Alexandreas, et al., Nucl. Instr. Meth. Phys. Res. A 328, 570 (1993)]. Both papers consider the problem of determining whether there are significantly more counts in a source region than would be expected given the number of counts observed in a background region. The arguments in the two papers are quite different (one takes a Bayesian point of view and the other does not), and the suggested formulas for computing p-values appear to be different as well. It is shown here that the expressions provided by the authors of these two articles are in fact equivalent.

Read this paper on arXiv…

J. Theiler
Fri, 14 Aug 20
-935/70

Comments: 5 pages, no figures; written in 1998, and never published (until now)

Modeling high-dimensional dependence among astronomical data [CL]

http://arxiv.org/abs/2006.06268


Determining the relationship among a set of experimental quantities is a fundamental issue in many scientific disciplines. In the two-dimensional case, the classical approach is to compute the linear correlation coefficient from a scatterplot. This method, however, implicitly assumes a linear relationship between the variables. Such an assumption is not always correct. With the use of the partial correlation coefficients, an extension to the multi-dimensional case is possible. However, the problem of the assumed mutual linear relationship among the variables still remains. A relatively recent approach that avoids this problem is to model the joint probability density function (PDF) of the data with copulas. These are functions which contain all the information on the relationship between two random variables. Although in principle this approach can also work with multi-dimensional data, theoretical as well as computational difficulties often limit its use to the two-dimensional case. In this paper, we consider an approach, based on so-called vine copulas, which overcomes this limitation and at the same time is amenable to a theoretical treatment and feasible from the computational point of view. We apply this method to published data on the near-IR and far-IR luminosities and atomic and molecular masses of the Herschel Reference Sample. We determine the relationship among the luminosities and gas masses and show that the far-IR luminosity can be considered as the key parameter which relates all the other three galaxy properties. Once removed from the 4D relation, the residual relation among the other three is negligible. This may be interpreted as the correlation between the gas masses and near-IR luminosity being driven by the far-IR luminosity, likely through the star-formation activity of the galaxy.

Read this paper on arXiv…

R. Vio, T. Nagler and P. Andreani
Fri, 12 Jun 20
2/69

Comments: submitted to A&A

Modeling Stochastic Variability in Multi-Band Time Series Data [IMA]

http://arxiv.org/abs/2005.08049


In preparation for the era of time-domain astronomy with upcoming large-scale surveys, we propose a state-space representation of a multivariate damped random walk process as a tool to analyze irregularly-spaced multi-filter light curves with heteroscedastic measurement errors. We adopt a computationally efficient and scalable Kalman-filtering approach to evaluate the likelihood function, leading to maximum $O(k^3n)$ complexity, where $k$ is the number of available bands and $n$ is the number of unique observation times across the $k$ bands. This is a significant computational advantage over a commonly used univariate Gaussian process that can stack up all multi-band light curves in one vector with maximum $O(k^3n^3)$ complexity. Using such efficient likelihood computation, we provide both maximum likelihood estimates and Bayesian posterior samples of the model parameters. Three numerical illustrations are presented: (i) analyzing simulated five-band light curves for a comparison with independent single-band fits; (ii) analyzing five-band light curves of a quasar obtained from the Sloan Digital Sky Survey (SDSS) Stripe 82 to estimate the short-term variability and timescale; (iii) analyzing gravitationally lensed $g$- and $r$-band light curves of Q0957+561 to infer the time delay. Two R packages, Rdrw and timedelay, are publicly available to fit the proposed models.
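
For intuition, the single-band special case (a damped random walk, i.e. an OU process, observed at irregular times with heteroscedastic errors) has a short Kalman-filter likelihood; the paper's contribution is the efficient multi-band generalization, which this sketch does not attempt:

    import numpy as np

    def drw_loglike(t, y, yerr, mu, sigma, tau):
        """Log-likelihood of an OU (damped random walk) process with mean mu,
        asymptotic standard deviation sigma and timescale tau, observed at
        irregular times t with per-point errors yerr."""
        ll, m, P = 0.0, mu, sigma**2
        for t_prev, t_i, y_i, e_i in zip(np.r_[t[0], t[:-1]], t, y, yerr):
            a = np.exp(-(t_i - t_prev) / tau)        # decay across the time gap
            m = mu + a * (m - mu)                    # predicted state mean
            P = a**2 * P + sigma**2 * (1 - a**2)     # predicted state variance
            S = P + e_i**2                           # predictive variance of y_i
            ll += -0.5 * (np.log(2 * np.pi * S) + (y_i - m)**2 / S)
            K = P / S                                # Kalman gain; measurement update
            m, P = m + K * (y_i - m), (1 - K) * P
        return ll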

Read this paper on arXiv…

Z. Hu and H. Tak
Tue, 19 May 20
70/92

Comments: N/A

A flexible method of estimating luminosity functions via Kernel Density Estimation [CL]

http://arxiv.org/abs/2003.13373


We propose a flexible method for estimating luminosity functions (LFs) based on kernel density estimation (KDE), the most popular nonparametric density estimation approach developed in modern statistics, to overcome issues surrounding binning of LFs. One challenge in applying KDE to LFs is how to treat the boundary bias problem, since astronomical surveys usually obtain truncated samples predominantly due to the flux-density limits of surveys. We use two solutions, the transformation KDE method ($\hat{\phi}_{\mathrm{t}}$) and the transformation-reflection KDE method ($\hat{\phi}_{\mathrm{tr}}$), to reduce the boundary bias. We develop a new likelihood cross-validation criterion for selecting optimal bandwidths, based on which the posterior probability distributions of the bandwidth and transformation parameters for $\hat{\phi}_{\mathrm{t}}$ and $\hat{\phi}_{\mathrm{tr}}$ are derived within a Markov chain Monte Carlo (MCMC) sampling procedure. The simulation results show that $\hat{\phi}_{\mathrm{t}}$ and $\hat{\phi}_{\mathrm{tr}}$ perform better than the traditional binned method, especially in the sparse data regime around the flux limit of a survey or at the bright end of the LF. To further improve the performance of our KDE methods, we develop the transformation-reflection adaptive KDE approach ($\hat{\phi}_{\mathrm{tra}}$). Monte Carlo simulations suggest that it has good stability and reliability in performance, and is around an order of magnitude more accurate than the binned method. By applying our adaptive KDE method to a quasar sample, we find that it achieves estimates comparable to the rigorous determination of a previous work, while making far fewer assumptions about the LF. The KDE method we develop has the advantages of both parametric and non-parametric methods.
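
A toy version of the transformation idea for a sample truncated at a lower limit L (the paper's estimators add reflection, adaptive bandwidths, and a likelihood cross-validation bandwidth choice, none of which appear here):

    import numpy as np
    from scipy.stats import gaussian_kde

    def transformation_kde(x, limit):
        """Map the truncated sample to an unbounded space, estimate the density
        there, and transform back with the Jacobian 1/(x - limit)."""
        y = np.log(x - limit)
        kde = gaussian_kde(y)
        return lambda x_eval: kde(np.log(x_eval - limit)) / (x_eval - limit)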

Read this paper on arXiv…

Z. Yuan, M. Jarvis and J. Wang
Tue, 31 Mar 20
56/94

Comments: 23 pages, accepted for publication in The Astrophysical Journal Supplement Series

New probability distributions in astrophysics: II. The generalized and double truncated Lindley [CL]

http://arxiv.org/abs/2003.13498


The statistical parameters of five generalizations of the Lindley distribution, such as the average, variance and moments, are reviewed. A new double truncated Lindley distribution with three parameters is derived. The new distributions are applied to model the initial mass function for stars.

Read this paper on arXiv…

L. Zaninetti
Tue, 31 Mar 20
62/94

Comments: 15 pages, 7 figures

Constraining the recent star formation history of galaxies : an Approximate Bayesian Computation approach [GA]

http://arxiv.org/abs/2002.07815


[Abridged] Although galaxies are found to follow a tight relation between their star formation rate and stellar mass, they are expected to exhibit complex star formation histories (SFH), with short-term fluctuations. The goal of this pilot study is to present a method that will identify galaxies that are undergoing a strong variation of star formation activity in the last tens to hundreds of Myr. In other words, the proposed method will determine whether a variation of the SFH in the last few hundred Myr is needed to properly model the SED, rather than a smooth normal SFH. To do so, we analyze a sample of COSMOS galaxies using high signal-to-noise ratio broad band photometry. We apply Approximate Bayesian Computation, a state-of-the-art statistical method to perform model choice, combined with machine learning algorithms to provide the probability that a flexible SFH is preferred based on the observed flux density ratios of galaxies. We present the method and test it on a sample of simulated SEDs. The input information fed to the algorithm is a set of broadband UV to NIR (rest-frame) flux ratios for each galaxy. The method has an error rate of 21% in recovering the right SFH and is sensitive to SFR variations larger than 1 dex. A more traditional SED fitting method using CIGALE is tested to achieve the same goal, based on comparing fits through the Bayesian Information Criterion, but the best error rate obtained is higher, 28%. We apply our new method to the COSMOS galaxies sample. The stellar mass distribution of galaxies with a strong to decisive evidence against the smooth delayed-$\tau$ SFH peaks at lower M* compared to galaxies where the smooth delayed-$\tau$ SFH is preferred. We discuss the fact that this result does not come from any bias due to our training. Finally, we argue that flexible SFHs are needed to be able to cover the largest possible SFR-M* parameter space.

Read this paper on arXiv…

G. Aufort, L. Ciesla, P. Pudlo, et al.
Thu, 20 Feb 20
22/61

Comments: N/A

Discrete Chi-square Method for Detecting Many Signals [IMA]

http://arxiv.org/abs/2002.03890


Unambiguous detection of signals superimposed on unknown trends is difficult for unevenly spaced data. Here, we formulate the Discrete Chi-square Method (DCM) that can determine the best model for many signals superimposed on arbitrary polynomial trends. DCM minimizes the chi-square for the data in the multi-dimensional tested frequency space. The required number of tested frequency combinations remains manageable, because the method's test statistic is symmetric in this tested frequency space. With the known, constant tested frequency grid values, the non-linear DCM model becomes linear, and all results become unambiguous. We test DCM with simulated data containing different mixtures of signals and trends. DCM gives unambiguous results if the signal frequencies are not too close to each other and none of the signals is too weak. It relies on brute computational force, because all possible free parameter combinations for all reasonable linear models are tested. DCM works like winning a lottery by buying all lottery tickets. Anyone can reproduce all our results with the DCM computer code.

Read this paper on arXiv…

L. Jetsu
Tue, 11 Feb 20
50/81

Comments: 18 pages, 12 figures, 8 tables

The n-dimensional Extension of the Lomb-Scargle Method [CL]

http://arxiv.org/abs/2001.10200


Common methods of spectral analysis for n-dimensional time series use the Fourier transform (FT) to decompose discrete data into a set of trigonometric components, i.e. amplitudes and phases. Due to the limitations of the discrete FT, the data set is restricted to equidistant sampling. However, in the general situation of non-equidistant sampling, FT-based methods cause significant errors in the parameter estimation. Therefore, the classical Lomb-Scargle method (LSM) was developed for one-dimensional data to circumvent the incorrect behaviour of the FT in the case of fragmented and irregularly sampled data. The present work derives the LSM for n-dimensional (multivariate) data sets through a redefinition of the shifting parameter $\tau$. An analytical derivation shows that the nD LSM extends the traditional 1D case, preserving all its statistical features. Applications to ideal test data and experimental data illustrate the derived method.
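
For orientation, the classical 1D case is readily available, e.g. in astropy; the n-dimensional generalization derived in the paper is not part of standard libraries:

    import numpy as np
    from astropy.timeseries import LombScargle

    rng = np.random.default_rng(2)
    t = np.sort(rng.uniform(0, 100, 300))                 # unevenly sampled times
    y = np.sin(2 * np.pi * 0.17 * t) + 0.3 * rng.standard_normal(t.size)
    frequency, power = LombScargle(t, y).autopower()
    print(frequency[np.argmax(power)])                    # close to the injected 0.17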

Read this paper on arXiv…

M. Seilmayer, F. Gonzalez and T. Wondrak
Wed, 29 Jan 20
44/46

Comments: to be published

The Widely Linear Complex Ornstein-Uhlenbeck Process with Application to Polar Motion [CL]

http://arxiv.org/abs/2001.05965


Complex-valued and widely linear modelling of time series signals are widespread and found in many applications. However, existing models and analysis techniques are usually restricted to signals observed in discrete time. In this paper we introduce a widely linear version of the complex Ornstein-Uhlenbeck (OU) process. This is a continuous-time process which generalises the standard complex-valued OU process such that signals generated from the process contain elliptical oscillations, as opposed to circular oscillations, when viewed in the complex plane. We determine properties of the widely linear complex OU process, including the conditions for stationarity, and the geometrical structure of the elliptical oscillations. We derive the analytical form of the power spectral density function, which then provides an efficient procedure for parameter inference using the Whittle likelihood. We apply the process to measure periodic and elliptical properties of Earth’s polar motion, including that of the Chandler wobble, for which the standard complex OU process was originally proposed.

Read this paper on arXiv…

A. Sykulski, S. Olhede and H. Sykulska-Lawrence
Fri, 17 Jan 20
48/60

Comments: Submitted for peer-review

Normalizing Constant Estimation with Gaussianized Bridge Sampling [CL]

http://arxiv.org/abs/1912.06073


Computing the normalizing constant (also called the partition function, Bayesian evidence, or marginal likelihood) is one of the central goals of Bayesian inference, yet most of the existing methods are both expensive and inaccurate. Here we develop a new approach, starting from posterior samples obtained with a standard Markov Chain Monte Carlo (MCMC). We apply a novel Normalizing Flow (NF) approach to obtain an analytic density estimator from these samples, followed by Optimal Bridge Sampling (OBS) to obtain the normalizing constant. We compare our method, which we call Gaussianized Bridge Sampling (GBS), to existing methods such as Nested Sampling (NS) and Annealed Importance Sampling (AIS) on several examples, showing that our method is both significantly faster and substantially more accurate than these methods, and comes with a reliable error estimation.
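
A minimal sketch of the optimal bridge sampling iteration that GBS builds on, with a plain Gaussian fit to the posterior samples standing in for the normalizing-flow density estimator used in the paper:

    import numpy as np
    from scipy.stats import multivariate_normal

    def bridge_sampling_logz(post_samples, log_target, n_draws=5000, n_iter=100, seed=0):
        """post_samples: MCMC samples from the unnormalized target density.
        log_target: callable returning the log of the unnormalized target."""
        rng = np.random.default_rng(seed)
        q = multivariate_normal(post_samples.mean(0), np.cov(post_samples, rowvar=False))
        prop = q.rvs(n_draws, random_state=rng)
        l1 = np.exp(log_target(post_samples) - q.logpdf(post_samples))  # ratios at posterior draws
        l2 = np.exp(log_target(prop) - q.logpdf(prop))                  # ratios at proposal draws
        n1, n2 = len(l1), len(l2)
        s1, s2 = n1 / (n1 + n2), n2 / (n1 + n2)
        r = 1.0
        for _ in range(n_iter):   # Meng & Wong fixed-point iteration for the evidence
            r = np.mean(l2 / (s1 * l2 + s2 * r)) / np.mean(1.0 / (s1 * l1 + s2 * r))
        return np.log(r)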

Read this paper on arXiv…

H. Jia and U. Seljak
Fri, 13 Dec 19
70/75

Comments: Accepted by AABI 2019 Proceedings

Deep Learning for space-variant deconvolution in galaxy surveys [IMA]

http://arxiv.org/abs/1911.00443


Deconvolution of large survey images with millions of galaxies requires the development of a new generation of methods that can take into account a space-variant Point Spread Function (PSF) while being both accurate and fast. We investigate in this paper how Deep Learning could be used to perform this task. We employ a U-Net Deep Neural Network architecture to learn, in a supervised setting, parameters adapted for galaxy image processing, and we study two strategies for deconvolution. The first approach is a post-processing of a mere Tikhonov deconvolution with a closed-form solution, and the second one is an iterative deconvolution framework based on the Alternating Direction Method of Multipliers (ADMM). Our numerical results based on GREAT3 simulations with realistic galaxy images and PSFs show that our two approaches outperform standard techniques based on convex optimization, whether assessed in galaxy image reconstruction or shape recovery. The approach based on Tikhonov deconvolution leads to the most accurate results, except for ellipticity errors at high signal-to-noise ratio where the ADMM approach performs slightly better; the Tikhonov approach is also more computation-time efficient for processing a large number of galaxies and is therefore recommended in this scenario.
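
The closed-form Tikhonov step that the first strategy post-processes can be sketched in a few lines (assuming, for simplicity, a single centred PSF of the same size as the image; the U-Net part is not shown):

    import numpy as np

    def tikhonov_deconvolve(image, psf, lam=1e-2):
        """Regularized inverse filter; psf assumed centred and same shape as image."""
        H = np.fft.fft2(np.fft.ifftshift(psf))          # PSF transfer function
        Y = np.fft.fft2(image)
        X = np.conj(H) * Y / (np.abs(H) ** 2 + lam)     # Tikhonov-regularized inverse
        return np.real(np.fft.ifft2(X))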

Read this paper on arXiv…

F. Sureau, A. Lechat and J. Starck
Mon, 4 Nov 19
37/55

Comments: N/A

LEO-Py: Estimating likelihoods for correlated, censored, and uncertain data with given marginal distributions [IMA]

http://arxiv.org/abs/1910.02958


Data with uncertain, missing, censored, and correlated values are commonplace in many research fields including astronomy. Unfortunately, such data are often treated in an ad hoc way in the astronomical literature, potentially resulting in inconsistent parameter estimates. Furthermore, in a realistic setting, the variables of interest or their errors may have non-normal distributions, which complicates the modeling. I present a novel approach to compute the likelihood function for such data sets. This approach employs Gaussian copulas to decouple the correlation structure of variables and their marginal distributions, resulting in a flexible method to compute likelihood functions of data in the presence of measurement uncertainty, censoring, and missing data. I demonstrate its use by determining the slope and intrinsic scatter of the star-forming sequence of nearby galaxies from observational data. The outlined algorithm is implemented as the flexible, easy-to-use, open-source Python package LEO-Py.
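
The basic Gaussian-copula construction can be sketched for fully observed data with known marginals; handling censored, missing, and uncertain values is LEO-Py's actual contribution and is not attempted here:

    import numpy as np
    from scipy import stats

    def gaussian_copula_loglike(x, marginals, corr):
        """x: (n, d) data; marginals: list of frozen scipy distributions, one per
        column; corr: (d, d) correlation matrix of the latent Gaussian."""
        u = np.column_stack([m.cdf(x[:, j]) for j, m in enumerate(marginals)])
        z = stats.norm.ppf(u)                                   # latent Gaussian scores
        mvn = stats.multivariate_normal(np.zeros(corr.shape[0]), corr)
        log_copula = mvn.logpdf(z) - stats.norm.logpdf(z).sum(axis=1)
        log_marg = sum(m.logpdf(x[:, j]) for j, m in enumerate(marginals))
        return np.sum(log_copula + log_marg)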

Read this paper on arXiv…

R. Feldmann
Wed, 9 Oct 19
43/64

Comments: 21 pages, 8 figures, 2 tables, to appear in Astronomy and Computing, LEO-Py is available at github.com/rfeldmann/leopy

Detecting new signals under background mismodelling [CL]

http://arxiv.org/abs/1906.06615


Searches for new astrophysical phenomena often involve several sources of non-random uncertainties which can lead to highly misleading results. Among these, model uncertainty arising from background mismodelling can dramatically compromise the sensitivity of the experiment under study. Specifically, overestimating the background distribution in the signal region increases the chances of missing new physics. Conversely, underestimating the background outside the signal region leads to an artificially enhanced sensitivity and a higher likelihood of claiming false discoveries. The aim of this work is to provide a unified statistical strategy to perform modelling, estimation, inference, and signal characterization under background mismodelling. The proposed method allows one to incorporate the (partial) scientific knowledge available on the background distribution and provides a data-updated version of it in a purely nonparametric fashion, without requiring the specification of prior distributions. Applications in the context of dark matter searches and radio surveys show how the tools presented in this article can be used to incorporate non-stochastic uncertainty due to instrumental noise and to overcome violations of classical distributional assumptions in stacking experiments.

Read this paper on arXiv…

S. Algeri
Tue, 18 Jun 19
59/73

Comments: N/A

Gaussbock: Fast parallel-iterative cosmological parameter estimation with Bayesian nonparametrics [CEA]

http://arxiv.org/abs/1905.09800


We present and apply Gaussbock, a new embarrassingly parallel iterative algorithm for cosmological parameter estimation designed for an era of cheap parallel computing resources. Gaussbock uses Bayesian nonparametrics and truncated importance sampling to accurately draw samples from posterior distributions with an orders-of-magnitude speed-up in wall time over alternative methods. Contemporary problems in this area often suffer from both increased computational costs due to high-dimensional parameter spaces and consequent excessive time requirements, as well as the need for fine tuning of proposal distributions or sampling parameters. Gaussbock is designed specifically with these issues in mind. We explore and validate the performance and convergence of the algorithm on a fast approximation to the Dark Energy Survey Year 1 (DES Y1) posterior, finding reasonable scaling behavior with the number of parameters. We then test on the full DES Y1 posterior using large-scale supercomputing facilities, and recover reasonable agreement with previous chains, although the algorithm can underestimate the tails of poorly-constrained parameters. In addition, we provide the community with a user-friendly software tool for accelerated cosmological parameter estimation based on the methodology described in this paper.

Read this paper on arXiv…

B. Moews and J. Zuntz
Fri, 24 May 19
58/60

Comments: 17 pages, 7 figures, preprint to be submitted to ApJ

TiK-means: $K$-means clustering for skewed groups [CL]

http://arxiv.org/abs/1904.09609


The $K$-means algorithm is extended to allow for partitioning of skewed groups. Our algorithm is called TiK-Means and contributes a $K$-means type algorithm that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformation reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic chooses the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma ray bursts.

Read this paper on arXiv…

N. Berry and R. Maitra
Tue, 23 Apr 19
13/58

Comments: 15 pages, 6 figures, to appear in Statistical Analysis and Data Mining – The ASA Data Science Journal

Metric Gaussian Variational Inference [CL]

http://arxiv.org/abs/1901.11033


A variational Gaussian approximation of the posterior distribution can be an excellent way to infer posterior quantities. However, to capture all posterior correlations the parametrization of the full covariance is required, which scales quadratically with the problem size. This scaling prohibits full-covariance approximations for large-scale problems. As a solution to this limitation, we propose Metric Gaussian Variational Inference (MGVI). This procedure approximates the variational covariance such that it requires no parameters on its own and still provides reliable posterior correlations and uncertainties for all model parameters. We approximate the variational covariance with the inverse Fisher metric, a local estimate of the true posterior uncertainty. This covariance is only stored implicitly and all necessary quantities can be extracted from it by independent samples drawn from the approximating Gaussian. MGVI requires the minimization of a stochastic estimate of the Kullback-Leibler divergence only with respect to the mean of the variational Gaussian, a quantity that only scales linearly with the problem size. We motivate the choice of this covariance from an information geometric perspective. The method is validated against established approaches in a small example and the scaling is demonstrated in a problem with over a million parameters.

Read this paper on arXiv…

J. Knollmüller and T. Enßlin
Fri, 1 Feb 19
45/61

Comments: NIFTy5 release paper, 30 pages, 15 figures, submitted to jmlr, code is part of NIFTy5 release at this https URL

Sampling from manifold-restricted distributions using tangent bundle projections [CL]

http://arxiv.org/abs/1811.05494


A common problem in Bayesian inference is the sampling of target probability distributions at sufficient resolution and accuracy to estimate the probability density, and to compute credible regions. Often by construction, many target distributions can be expressed as some higher-dimensional closed-form distribution with parametrically constrained variables; i.e. one that is restricted to a smooth submanifold of Euclidean space. I propose a derivative-based importance sampling framework for such distributions. A base set of $n$ samples from the target distribution is used to map out the tangent bundle of the manifold, and to seed $nm$ additional points that are projected onto the tangent bundle and weighted appropriately. The method can act as a multiplicative complement to any standard sampling algorithm, and is designed for the efficient production of approximate high-resolution histograms from manifold-restricted Gaussian distributions.

Read this paper on arXiv…

A. Chua
Thu, 15 Nov 18
43/56

Comments: 28 pages, 6 figures

Parameter inference and model comparison using theoretical predictions from noisy simulations [CEA]

http://arxiv.org/abs/1809.08246


When inferring unknown parameters or comparing different models, data must be compared to underlying theory. Even if a model has no closed-form solution to derive summary statistics, it is often still possible to simulate mock data in order to generate theoretical predictions. For realistic simulations of noisy data, this is identical to drawing realisations of the data from a likelihood distribution. Though the estimated summary statistic from simulated data vectors may be unbiased, the estimator has variance which should be accounted for. We show how to correct the likelihood in the presence of an estimated summary statistic by marginalising over the true summary statistic. For Gaussian likelihoods where the covariance must also be estimated from simulations, we present an alteration to the Sellentin-Heavens corrected likelihood. We show that excluding the proposed correction leads to an incorrect estimate of the Bayesian evidence with JLA data. The correction is highly relevant for cosmological inference that relies on simulated data for theory (e.g. weak lensing peak statistics and simulated power spectra) and can reduce the number of simulations required.
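
A minimal sketch of the marginalization in its simplest setting, where the data covariance is known and the theory prediction is the mean of N_s simulations (the paper's full treatment also covers a simulation-estimated covariance via an altered Sellentin-Heavens likelihood, which is not reproduced here):

    import numpy as np

    def loglike_noisy_theory(data, sim_mean, cov_data, cov_sims, n_sims):
        """Gaussian log-likelihood after marginalizing over the true (unknown) mean:
        the covariance is inflated by the uncertainty of the simulated mean."""
        c = cov_data + cov_sims / n_sims
        r = data - sim_mean
        _, logdet = np.linalg.slogdet(2.0 * np.pi * c)
        return -0.5 * (r @ np.linalg.solve(c, r) + logdet)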

Read this paper on arXiv…

N. Jeffrey and F. Abdalla
Tue, 25 Sep 18
57/88

Comments: 7 pages, 5 figures, submitted to MNRAS

Bayesian sparse reconstruction: a brute-force approach to astronomical imaging and machine learning [IMA]

http://arxiv.org/abs/1809.04598


We present a principled Bayesian framework for signal reconstruction, in which the signal is modelled by basis functions whose number (and form, if required) is determined by the data themselves. This approach is based on a Bayesian interpretation of conventional sparse reconstruction and regularisation techniques, in which sparsity is imposed through priors via Bayesian model selection. We demonstrate our method for noisy 1- and 2-dimensional signals, including astronomical images. Furthermore, by using a product-space approach, the number and type of basis functions can be treated as integer parameters and their posterior distributions sampled directly. We show that order-of-magnitude increases in computational efficiency are possible from this technique compared to calculating the Bayesian evidences separately, and that further computational gains are possible using it in combination with dynamic nested sampling. Our approach can be readily applied to neural networks, where it allows the network architecture to be determined by the data in a principled Bayesian manner by treating the number of nodes and hidden layers as parameters.

Read this paper on arXiv…

E. Higson, W. Handley, M. Hobson, et al.
Fri, 14 Sep 18
18/65

Comments: 16 pages + appendix, 19 figures

Separating diffuse from point-like sources – a Bayesian approach [IMA]

http://arxiv.org/abs/1804.05591


We present the starblade algorithm, a method to separate superimposed point sources from auto-correlated, diffuse flux using a Bayesian model. Point sources are assumed to be independent of each other and to follow a power-law brightness distribution. The diffuse emission is modeled by a nonparametric lognormal model with a priori unknown correlation structure. This model enforces positivity of the underlying emission and allows it to vary over orders of magnitude. The correlation structure is recovered non-parametrically together with the diffuse flux and used for the separation of the point sources. Additionally, many measurement artifacts appear as point-like or quasi-point-like effects that are not compatible with superimposed diffuse emission. We demonstrate the capabilities of the derived method on synthetic data and data obtained by the Hubble Space Telescope, emphasizing its handling of instrumental effects as well as physical sources.

Read this paper on arXiv…

J. Knollmuller, P. Frank and T. Ensslin
Tue, 17 Apr 18
41/83

Comments: N/A

Testing One Hypothesis Multiple Times: The Multidimensional Case [CL]

http://arxiv.org/abs/1803.03858


The identification of new rare signals in data, the detection of a sudden change in a trend, and the selection of competing models, are among the most challenging problems in statistical practice. These challenges can be tackled using a test of hypothesis where a nuisance parameter is present only under the alternative, and a computationally efficient solution can be obtained by the “Testing One Hypothesis Multiple times” (TOHM) method. In the one-dimensional setting, a fine discretization of the space of the non-identifiable parameter is specified, and a global p-value is obtained by approximating the distribution of the supremum of the resulting stochastic process. In this paper, we propose a computationally efficient inferential tool to perform TOHM in the multidimensional setting. Here, the approximations of interest typically involve the expected Euler Characteristics (EC) of the excursion set of the underlying random field. We introduce a simple algorithm to compute the EC in multiple dimensions and for arbitrarily large significance levels. This leads to a highly generalizable computational tool to perform inference under non-standard regularity conditions.

Read this paper on arXiv…

S. Algeri and D. van Dyk
Tue, 13 Mar 18
36/61

Comments: N/A

Scalable Bayesian uncertainty quantification in imaging inverse problems via convex optimization [CL]

http://arxiv.org/abs/1803.00889


We propose a Bayesian uncertainty quantification method for large-scale imaging inverse problems. Our method applies to all Bayesian models that are log-concave, where maximum-a-posteriori (MAP) estimation is a convex optimization problem. The method is a framework to analyse the confidence in specific structures observed in MAP estimates (e.g., lesions in medical imaging, celestial sources in astronomical imaging), to enable using them as evidence to inform decisions and conclusions. Precisely, following Bayesian decision theory, we seek to assert the structures under scrutiny by performing a Bayesian hypothesis test that proceeds as follows: firstly, it postulates that the structures are not present in the true image, and then seeks to use the data and prior knowledge to reject this null hypothesis with high probability. Computing such tests for imaging problems is generally very difficult because of the high dimensionality involved. A main feature of this work is to leverage probability concentration phenomena and the underlying convex geometry to formulate the Bayesian hypothesis test as a convex problem, which we then efficiently solve using scalable optimization algorithms. This allows scaling to high-resolution and high-sensitivity imaging problems that are computationally unaffordable for other Bayesian computation approaches. We illustrate our methodology, dubbed BUQO (Bayesian Uncertainty Quantification by Optimization), on a range of challenging Fourier imaging problems arising in astronomy and medicine.

Read this paper on arXiv…

A. Repetti, M. Pereyra and Y. Wiaux
Mon, 5 Mar 18
8/45

Comments: N/A

An efficient $k$-means-type algorithm for clustering datasets with incomplete records [CL]

http://arxiv.org/abs/1802.08363


The $k$-means algorithm is the most popular nonparametric clustering method in use, but cannot generally be applied to data sets with missing observations. The usual practice with such data sets is to either impute the values under an assumption of a missing-at-random mechanism or to ignore the incomplete records, and then to use the desired clustering method. We develop an efficient version of the $k$-means algorithm that allows for clustering cases where not all the features have observations recorded. Our extension is called $k_m$-means and reduces to the $k$-means algorithm when all records are complete. We also provide strategies to initialize our algorithm and to estimate the number of groups in the data set. Illustrations and simulations demonstrate the efficacy of our approach in a variety of settings and patterns of missing data. Our methods are also applied to the clustering of gamma-ray bursts and to the analysis of activation images obtained from a functional Magnetic Resonance Imaging experiment.
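
A rough Python sketch of the core idea (distances and centroid updates computed over observed features only, with NaNs marking missing entries). This is a simplified stand-in with an illustrative function name, not the authors' $k_m$-means; it omits, for instance, the rescaling of distances by the number of observed features and the paper's initialization strategies:

    import numpy as np

    def kmeans_missing(X, k, n_iter=100, rng=None):
        """Toy k-means for data with NaNs marking missing features."""
        rng = np.random.default_rng(rng)
        centers = X[rng.choice(len(X), size=k, replace=False)]
        centers = np.where(np.isnan(centers), np.nanmean(X, axis=0), centers)
        for _ in range(n_iter):
            # squared distance to each centre, summed over observed features only
            d = np.nansum((X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
            labels = d.argmin(axis=1)
            for j in range(k):
                members = X[labels == j]
                if len(members):
                    col_means = np.nanmean(members, axis=0)   # NaN if a feature is entirely unobserved
                    centers[j] = np.where(np.isnan(col_means), centers[j], col_means)
        return labels, centers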

Read this paper on arXiv…

A. Lithio and R. Maitra
Mon, 26 Feb 18
36/49

Comments: 23 pages, 14 figures, 2 tables

Bayesian model checking: A comparison of tests [IMA]

http://arxiv.org/abs/1712.07422


Two procedures for checking Bayesian models are compared using a simple test problem based on the local Hubble expansion. Over four orders of magnitude, p-values derived from a global goodness-of-fit criterion for posterior probability density functions (Lucy 2017) agree closely with posterior predictive p-values. The former can therefore serve as an effective proxy for the difficult-to-calculate posterior predictive p-values.

Read this paper on arXiv…

L. Lucy
Thu, 21 Dec 17
69/76

Comments: 4 pages, 3 figures. Submitted to Astronomy & Astrophysics

A posteriori noise estimation in variable data sets [CL]

http://arxiv.org/abs/1712.02226


Most physical data sets contain a stochastic contribution produced by measurement noise or other random sources along with the signal. Usually, neither the signal nor the noise is accurately known prior to the measurement so that both have to be estimated a posteriori. We have studied a procedure to estimate the standard deviation of the stochastic contribution assuming normality and independence, requiring a sufficiently well-sampled data set to yield reliable results. This procedure is based on estimating the standard deviation in a sample of weighted sums of arbitrarily sampled data points and is identical to the so-called DER_SNR algorithm for specific parameter settings. To demonstrate the applicability of our procedure, we present applications to synthetic data, high-resolution spectra, and a large sample of space-based light curves and, finally, give guidelines to apply the procedure in situations not explicitly considered here to promote its adoption in data analysis.
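
For reference, a compact version of the DER_SNR-style estimate that the procedure reduces to for specific parameter settings (the numerical constant is the standard Gaussian median-to-sigma factor; function name is illustrative, and this is an indicative sketch rather than the paper's general procedure):

    import numpy as np

    def noise_estimate(flux):
        """DER_SNR-style noise estimate for a well-sampled 1D signal,
        assuming independent, approximately Gaussian noise."""
        f = np.asarray(flux, dtype=float)
        f = f[np.isfinite(f)]
        # second-difference filter: suppresses the smooth signal, keeps the noise
        diff = np.abs(2.0 * f[2:-2] - f[:-4] - f[4:])
        return 1.482602 / np.sqrt(6.0) * np.median(diff)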

Read this paper on arXiv…

S. Czesla, T. Molle and J. Schmitt
Thu, 7 Dec 17
23/72

Comments: Accepted for publication in A&A

Inference of signals with unknown correlation structure from non-linear measurements [CL]

http://arxiv.org/abs/1711.02955


We present a method to reconstruct auto-correlated signals together with their auto-correlation structure from non-linear, noisy measurements for arbitrary monotonic non-linearities. In the presented formulation the algorithm provides a significant speedup compared to prior implementations, allowing for a wider range of application. The non-linearity can be used to model instrument characteristics or to enforce properties on the underlying signal, such as positivity. Uncertainties on any posterior quantities can be provided via independent samples from an approximate posterior distribution. We demonstrate the method's applicability via three examples, using different measurement instruments, non-linearities and dimensionalities for both simulated measurements and real data.

Read this paper on arXiv…

J. Knollmuller, T. Steininger and T. Ensslin
Thu, 9 Nov 17
7/54

Comments: N/A

On the use of the Edgeworth expansion in cosmology I: how to foresee and evade its pitfalls [CEA]

http://arxiv.org/abs/1709.03452


Non-linear gravitational collapse introduces non-Gaussian statistics into the matter fields of the late Universe. As the large-scale structure is the target of current and future observational campaigns, one would ideally like to have the full probability density function of these non-Gaussian fields. The only viable way we see to achieve this analytically, at least approximately and in the near future, is via the Edgeworth expansion. We hence rederive this expansion for Fourier modes of non-Gaussian fields and then continue by putting it into a wider statistical context than previously done. We show that in its original form, the Edgeworth expansion only works if the non-Gaussian signal is averaged away. This is counterproductive, since we target the parameter-dependent non-Gaussianities as a signal of interest. We hence alter the analysis at the decisive step and now provide a roadmap towards a controlled and unadulterated analysis of non-Gaussianities in structure formation (with the Edgeworth expansion). Our central result is that, although the Edgeworth expansion has pathological properties, these can be predicted and avoided in a careful manner. We also show that, despite the non-Gaussianity coupling all modes, the Edgeworth series may be applied to any desired subset of modes, since this is equivalent (to the level of the approximation) to marginalising over the excluded modes. In this first paper of a series, we restrict ourselves to the sampling properties of the Edgeworth expansion, i.e. how faithfully it reproduces the distribution of non-Gaussian data. A follow-up paper will detail its Bayesian use, when parameters are to be inferred.
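
For orientation, the expansion in question, written for a standardised variable $x$ with small higher-order cumulants $\kappa_n$ (the standard textbook form; the paper's Fourier-mode version differs in detail):

\[
p(x) \simeq \phi(x)\left[1 + \frac{\kappa_3}{6}\,\mathrm{He}_3(x) + \frac{\kappa_4}{24}\,\mathrm{He}_4(x) + \frac{\kappa_3^2}{72}\,\mathrm{He}_6(x) + \dots\right],
\]

where $\phi$ is the standard normal density and $\mathrm{He}_n$ are the probabilists' Hermite polynomials.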

Read this paper on arXiv…

E. Sellentin, A. Jaffe and A. Heavens
Tue, 12 Sep 17
27/71

Comments: 25 pages, 7 figures

Towards information optimal simulation of partial differential equations [CL]

http://arxiv.org/abs/1709.02859


Most simulation schemes for partial differential equations (PDEs) focus on minimizing a simple error norm of a discretized version of a field. This paper takes a fundamentally different approach; the discretized field is interpreted as data providing information about a real physical field that is unknown. This information is sought to be conserved by the scheme as the field evolves in time. Such an information theoretic approach to simulation was pursued before by information field dynamics (IFD). In this paper we work out the theory of IFD for nonlinear PDEs in a noiseless Gaussian approximation. The result is an action that can be minimized to obtain an informationally optimal simulation scheme. It can be brought into a closed form using field operators to calculate the appearing Gaussian integrals. The resulting simulation schemes are tested numerically in two instances for the Burgers equation. Their accuracy surpasses finite-difference schemes on the same resolution. The IFD scheme, however, has to be correctly informed on the subgrid correlation structure. In certain limiting cases we recover well-known simulation schemes like spectral Fourier Galerkin methods. We discuss implications of the approximations made.

Read this paper on arXiv…

R. Leike and T. Ensslin
Tue, 12 Sep 17
55/71

Comments: N/A

Field dynamics inference via spectral density estimation [CL]

http://arxiv.org/abs/1708.05250


Stochastic differential equations (SDEs) are of utmost importance in various scientific and industrial areas. They are the natural description of dynamical processes whose precise equations of motion are either not known or too expensive to solve, e.g., when modeling Brownian motion. In some cases, the equations governing the dynamics of a physical system on macroscopic scales turn out to be unknown since they typically cannot be deduced from general principles. In this work, we describe how the underlying laws of a stochastic process can be approximated by the spectral density of the corresponding process. Furthermore, we show how the density can be inferred from possibly very noisy and incomplete measurements of the dynamical field. Generally, inverse problems like these can be tackled with the help of Information Field Theory (IFT). For now, we restrict ourselves to linear and autonomous processes, though this is not a conceptual limitation and may be lifted in future work. To demonstrate its applicability we apply our reconstruction algorithm to time-series and spatio-temporal processes.

Read this paper on arXiv…

P. Frank, T. Steininger and T. Ensslin
Fri, 18 Aug 17
47/47

Comments: 12 pages, 9 figures

Massive data compression for parameter-dependent covariance matrices [CEA]

http://arxiv.org/abs/1707.06529


We show how the massive data compression algorithm MOPED can be used to reduce, by orders of magnitude, the number of simulated datasets that are needed to estimate the covariance matrix required for the analysis of Gaussian-distributed data. This is relevant when the covariance matrix cannot be calculated directly. The compression is especially valuable when the covariance matrix varies with the model parameters. In this case, it may be prohibitively expensive to run enough simulations to estimate the full covariance matrix throughout the parameter space. This compression may be particularly valuable for the next generation of weak lensing surveys, such as proposed for Euclid and LSST, for which the number of summary data (such as band power or shear correlation estimates) is very large, $\sim 10^4$, due to the large number of tomographic redshift bins that the data will be divided into. In the pessimistic case where the covariance matrix is estimated separately for all points in an MCMC analysis, this may require an unfeasible $10^9$ simulations. We show here that MOPED can reduce this number by a factor of 1000, or a factor of $\sim 10^6$ if some regularity in the covariance matrix is assumed, reducing the number of simulations required to a manageable $10^3$, making an otherwise intractable analysis feasible.
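
For context, the MOPED weighting vector for a single parameter $\theta_\alpha$, in the schematic case of a parameter-independent covariance $C$ and model mean $\mu(\theta)$ (the construction in the paper, where the covariance itself is the target, differs in detail):

\[
b_\alpha = \frac{C^{-1}\mu_{,\alpha}}{\sqrt{\mu_{,\alpha}^{\mathsf{T}} C^{-1} \mu_{,\alpha}}}, \qquad y_\alpha = b_\alpha^{\mathsf{T}} x,
\]

so each $N$-dimensional data vector $x$ is compressed to one number per parameter, with subsequent vectors Gram-Schmidt orthogonalised against the earlier ones.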

Read this paper on arXiv…

A. Heavens, E. Sellentin, D. Mijolla, et al.
Fri, 21 Jul 17
5/59

Comments: 7 pages. For submission to MNRAS

Noisy independent component analysis of auto-correlated components [CL]

http://arxiv.org/abs/1705.02344


We present a new method for the separation of superimposed, independent, auto-correlated components from noisy multi-channel measurement. The presented method simultaneously reconstructs and separates the components, taking all channels into account and thereby increases the effective signal-to-noise ratio considerably, allowing separations even in the high noise regime. Characteristics of the measurement instruments can be included, allowing for application in complex measurement situations. Independent posterior samples can be provided, permitting error estimates on all desired quantities. Using the concept of information field theory, the algorithm is not restricted to any dimensionality of the underlying space or discretization scheme thereof.

Read this paper on arXiv…

J. Knollmuller and T. Ensslin
Tue, 9 May 17
42/82

Comments: N/A

Maximum Likelihood Estimation based on Random Subspace EDA: Application to Extrasolar Planet Detection [CL]

http://arxiv.org/abs/1704.05761


This paper addresses maximum likelihood (ML) estimation based model fitting in the context of extrasolar planet detection. This problem has the following properties: 1) the candidate models under consideration are highly nonlinear; 2) the likelihood surface has a huge number of peaks; 3) the parameter space ranges in size from a few to dozens of dimensions. These properties make the ML search a very challenging problem, as it lacks any analytical or gradient-based solution to explore the parameter space. A population-based search method, called estimation of distribution algorithm (EDA), is adopted to explore the model parameter space starting from a batch of random locations. EDA is characterized by its ability to reveal and utilize problem structure. This property is desirable for characterizing the detections. However, it is well recognized that EDAs cannot scale well to large-scale problems, as they consist of iterative random sampling and model-fitting procedures, which run into the well-known curse of dimensionality. A novel mechanism to perform EDAs in interactive random subspaces spanned by correlated variables is proposed. This mechanism is fully adaptive and is capable of alleviating the curse of dimensionality for EDAs to a large extent, as the dimension of each subspace is much smaller than that of the full parameter space. The efficiency of the proposed algorithm is verified via both benchmark numerical studies and real data analysis.
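
As a baseline illustration of the EDA loop itself (a plain full-space Gaussian EDA without the random-subspace mechanism proposed in the paper; function and parameter names are illustrative):

    import numpy as np

    def gaussian_eda(neg_log_like, dim, n_pop=200, n_elite=40, n_iter=100, rng=None):
        """Minimise neg_log_like by repeatedly fitting a Gaussian to the elite samples."""
        rng = np.random.default_rng(rng)
        mean, cov = np.zeros(dim), 10.0 * np.eye(dim)        # broad initial search distribution
        for _ in range(n_iter):
            pop = rng.multivariate_normal(mean, cov, size=n_pop)
            scores = np.array([neg_log_like(p) for p in pop])
            elite = pop[np.argsort(scores)[:n_elite]]        # keep the best candidates
            mean = elite.mean(axis=0)                        # refit the search distribution
            cov = np.cov(elite, rowvar=False) + 1e-6 * np.eye(dim)
        return mean

    # toy usage: recover the maximum-likelihood point of a quadratic "likelihood"
    best = gaussian_eda(lambda p: np.sum((p - 3.0) ** 2), dim=5)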

Read this paper on arXiv…

B. Liu and K. Chen
Thu, 20 Apr 17
28/49

Comments: 12 pages, 5 figures, conference

Dynamic nested sampling: an improved algorithm for parameter estimation and evidence calculation [CL]

http://arxiv.org/abs/1704.03459


We introduce dynamic nested sampling: a generalisation of the nested sampling algorithm in which the number of “live points” varies to allocate samples more efficiently. In empirical tests the new method increases accuracy by up to a factor of ~8 for parameter estimation and ~3 for evidence calculation compared to standard nested sampling with the same number of samples – equivalent to speeding up the computation by factors of ~64 and ~9 respectively. In addition, unlike in standard nested sampling, more accurate results can be obtained by continuing the calculation for longer. Dynamic nested sampling can be easily included in existing nested sampling software such as MultiNest and PolyChord.
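
Dynamic nested sampling is now also available in general-purpose packages. Assuming the third-party dynesty package is installed (an independent implementation of the idea, not the authors' code), a minimal run on a toy problem looks like:

    import numpy as np
    import dynesty

    ndim = 3

    def loglike(theta):
        # toy unit-variance Gaussian likelihood centred on the origin
        return -0.5 * np.sum(theta ** 2) - 0.5 * ndim * np.log(2.0 * np.pi)

    def prior_transform(u):
        # map the unit cube to a flat prior on [-10, 10] in each dimension
        return 20.0 * u - 10.0

    sampler = dynesty.DynamicNestedSampler(loglike, prior_transform, ndim)
    sampler.run_nested()          # live points are allocated dynamically
    results = sampler.results     # samples, weights and log-evidence estimates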

Read this paper on arXiv…

E. Higson, W. Handley, M. Hobson, et al.
Thu, 13 Apr 17
36/56

Comments: 16 pages + appendix, 8 figures, submitted to Bayesian Analysis. arXiv admin note: text overlap with arXiv:1703.09701

Sampling errors in nested sampling parameter estimation [CL]

http://arxiv.org/abs/1703.09701


Sampling errors in nested sampling parameter estimation differ from those in Bayesian evidence calculation, but have been little studied in the literature. This paper provides the first explanation of the two main sources of sampling errors in nested sampling parameter estimation, and presents a new diagrammatic representation for the process. We find no current method can accurately measure the parameter estimation errors of a single nested sampling run, and propose a method for doing so using a new algorithm for dividing nested sampling runs. We empirically verify our conclusions and the accuracy of our new method.

Read this paper on arXiv…

E. Higson, W. Handley, M. Hobson, et al.
Thu, 30 Mar 17
66/69

Comments: 22 pages + appendix, 10 figures, submitted to Bayesian Analysis

Bayesian Methods in Cosmology [CEA]

http://arxiv.org/abs/1701.01467


These notes aim at presenting an overview of Bayesian statistics, the underlying concepts and application methodology that will be useful to astronomers seeking to analyse and interpret a wide variety of data about the Universe. The level starts from elementary notions, without assuming any previous knowledge of statistical methods, and then progresses to more advanced, research-level topics. After an introduction to the importance of statistical inference for the physical sciences, elementary notions of probability theory and inference are introduced and explained. Bayesian methods are then presented, starting from the meaning of Bayes' theorem and its use as an inferential engine, including a discussion on priors and posterior distributions. Numerical methods for generating samples from arbitrary posteriors (including Markov Chain Monte Carlo and Nested Sampling) are then covered. The last section deals with the topic of Bayesian model selection and how it is used to assess the performance of models, and contrasts it with the classical p-value approach. A series of exercises of various levels of difficulty is designed to further the understanding of the theoretical material, including fully worked out solutions for most of them.
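
To accompany the material on numerical posterior sampling, a bare-bones random-walk Metropolis sampler of the kind such notes build up to (a generic textbook sketch with illustrative names, not code from the notes themselves):

    import numpy as np

    def metropolis(log_post, theta0, n_steps=10000, step=0.5, rng=None):
        """Random-walk Metropolis: propose Gaussian jumps, accept with probability min(1, ratio)."""
        rng = np.random.default_rng(rng)
        theta = np.atleast_1d(np.asarray(theta0, dtype=float))
        logp = log_post(theta)
        chain = np.empty((n_steps, theta.size))
        for i in range(n_steps):
            prop = theta + step * rng.standard_normal(theta.size)
            logp_prop = log_post(prop)
            if np.log(rng.uniform()) < logp_prop - logp:   # Metropolis acceptance rule
                theta, logp = prop, logp_prop
            chain[i] = theta
        return chain

    # toy usage: sample a two-dimensional standard normal "posterior"
    samples = metropolis(lambda t: -0.5 * np.sum(t ** 2), theta0=[0.0, 0.0])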

Read this paper on arXiv…

R. Trotta
Mon, 9 Jan 17
41/52

Comments: 86 pages, 16 figures. Lecture notes for the 44th Saas Fee Advanced Course on Astronomy and Astrophysics, “Cosmology with wide-field surveys” (March 2014), to be published by Springer. Comments welcome

Accelerating cross-validation with total variation and its application to super-resolution imaging [CL]

http://arxiv.org/abs/1611.07197


We develop an approximation formula for the cross-validation error (CVE) of a sparse linear regression penalized by $\ell_1$-norm and total variation terms, which is based on a perturbative expansion utilizing the largeness of both the data dimensionality and the model. The developed formula allows us to reduce the necessary computational cost of the CVE evaluation significantly. The practicality of the formula is tested through application to simulated black-hole image reconstruction on the event-horizon scale with super resolution. The results demonstrate that our approximation reproduces the CVE values obtained via literally conducted cross-validation with reasonably good precision.

Read this paper on arXiv…

T. Obuchi, S. Ikeda, K. Akiyama, et al.
Wed, 23 Nov 16
13/68

Comments: 5 pages, 1 figure

Filling the gaps: Gaussian mixture models from noisy, truncated or incomplete samples [IMA]

http://arxiv.org/abs/1611.05806


We extend the common mixtures-of-Gaussians density estimation approach to account for a known sample incompleteness by simultaneous imputation from the current model. The method, called GMMis, generalizes existing Expectation-Maximization techniques for truncated data to arbitrary truncation geometries and probabilistic rejection. It can incorporate a uniform background distribution as well as independent multivariate normal measurement errors for each of the observed samples, and recovers an estimate of the error-free distribution from which both observed and unobserved samples are drawn. We compare GMMis to the standard Gaussian mixture model for simple test cases with different types of incompleteness, and apply it to observational data from the NASA Chandra X-ray telescope. The Python code is capable of performing density estimation with millions of samples and thousands of model components and is released as an open-source package at https://github.com/pmelchior/pyGMMis

Read this paper on arXiv…

P. Melchior and A. Goulding
Fri, 18 Nov 16
49/60

Comments: 12 pages, 6 figures, submitted to Computational Statistics & Data Analysis

Bayes Factors via Savage-Dickey Supermodels [IMA]

http://arxiv.org/abs/1609.02186


We outline a new method to compute the Bayes Factor for model selection which bypasses the Bayesian Evidence. Our method combines multiple models into a single, nested, Supermodel using one or more hyperparameters. Since the models are now nested, the Bayes Factors between the models can be efficiently computed using the Savage-Dickey Density Ratio (SDDR). In this way model selection becomes a problem of parameter estimation. We consider two ways of constructing the supermodel in detail: one based on combined models, and a second based on combined likelihoods. We report on these two approaches for a Gaussian linear model for which the Bayesian evidence can be calculated analytically and a toy nonlinear problem. Unlike the combined model approach, where a standard Markov Chain Monte Carlo (MCMC) sampler struggles, the combined-likelihood approach fares much better in providing a reliable estimate of the log-Bayes Factor. This scheme potentially opens the way to computationally efficient methods for computing Bayes Factors in high dimensions that exploit the good scaling properties of MCMC, as compared to methods such as nested sampling that fail in high dimensions.
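
For reference, the Savage-Dickey density ratio that the supermodel construction exploits: for a model $M_1$ with parameters $(\theta,\phi)$ that nests $M_0$ at $\phi=\phi_0$ (with separable priors), the Bayes factor is

\[
B_{01} = \frac{p(\phi=\phi_0 \mid d, M_1)}{\pi(\phi=\phi_0 \mid M_1)},
\]

the marginal posterior of the hyperparameter divided by its prior, both evaluated at the nesting value, so it can be read off an ordinary MCMC run over the supermodel.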

Read this paper on arXiv…

A. Mootoovaloo, B. Bassett and M. Kunz
Fri, 9 Sep 16
4/70

Comments: 24 pages, 11 Figures

Generalisations of Fisher Matrices [CEA]

http://arxiv.org/abs/1606.06455


Fisher matrices play an important role in experimental design and in data analysis. Their primary role is to make predictions for the inference of model parameters – both their errors and covariances. In this short review, I outline a number of extensions to the simple Fisher matrix formalism, covering a number of recent developments in the field. These are: (a) situations where the data (in the form of (x,y) pairs) have errors in both x and y; (b) modifications to parameter inference in the presence of systematic errors, or through fixing the values of some model parameters; (c) Derivative Approximation for LIkelihoods (DALI) – higher-order expansions of the likelihood surface, going beyond the Gaussian shape approximation; (d) extensions of the Fisher-like formalism, to treat model selection problems with Bayesian evidence.
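
As a reminder of the baseline object these extensions build on: for Gaussian-distributed data with parameter-dependent mean $\mu(\theta)$ and covariance $C(\theta)$, the Fisher matrix takes the standard form

\[
F_{\alpha\beta} = \mu_{,\alpha}^{\mathsf{T}} C^{-1} \mu_{,\beta} + \tfrac{1}{2}\,\mathrm{Tr}\!\left[C^{-1} C_{,\alpha}\, C^{-1} C_{,\beta}\right],
\]

with forecast marginal errors $\sigma(\theta_\alpha) \geq \sqrt{(F^{-1})_{\alpha\alpha}}$ (the Cramér-Rao bound).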

Read this paper on arXiv…

A. Heavens
Wed, 22 Jun 16
29/50

Comments: Invited review article for Entropy special issue on ‘Applications of Fisher Information in Sciences’. Accepted version

Looking for a Needle in a Haystack? Look Elsewhere! A statistical comparison of approximate global p-values [CL]

http://arxiv.org/abs/1602.03765


The search for new significant peaks over an energy spectrum often involves a statistical multiple hypothesis testing problem. Separate tests of hypothesis are conducted at different locations, producing an ensemble of local p-values, the smallest of which is reported as evidence for the new resonance. Unfortunately, controlling the false detection rate (type I error rate) of such procedures may lead to excessively stringent acceptance criteria. In the recent physics literature, two promising statistical tools have been proposed to overcome these limitations. In 2005, a method to “find needles in haystacks” was introduced by Pilla et al. [1], and a second method was later proposed by Gross and Vitells [2] in the context of the “look elsewhere effect” and trial factors. We show that, for relatively small sample sizes, the former leads to an artificial inflation of statistical power that stems from an increase in the false detection rate, whereas the two methods exhibit similar performance for large sample sizes. Finally, we provide general guidelines to select between statistical procedures for signal detection with respect to the specifics of the physics problem under investigation.

Read this paper on arXiv…

S. Algeri, J. Conrad, D. Dyk, et al.
Fri, 12 Feb 16
6/48

Comments: Submitted to EPJ C

Preprocessing Solar Images while Preserving their Latent Structure [IMA]

http://arxiv.org/abs/1512.04273


Telescopes such as the Atmospheric Imaging Assembly aboard the Solar Dynamics Observatory, a NASA satellite, collect massive streams of high resolution images of the Sun through multiple wavelength filters. Reconstructing pixel-by-pixel thermal properties based on these images can be framed as an ill-posed inverse problem with Poisson noise, but this reconstruction is computationally expensive and there is disagreement among researchers about what regularization or prior assumptions are most appropriate. This article presents an image segmentation framework for preprocessing such images in order to reduce the data volume while preserving as much thermal information as possible for later downstream analyses. The resulting segmented images reflect thermal properties but do not depend on solving the ill-posed inverse problem. This allows users to avoid the Poisson inverse problem altogether or to tackle it on each of $\sim$10 segments rather than on each of $\sim$10$^7$ pixels, reducing computing time by a factor of $\sim$10$^6$. We employ a parametric class of dissimilarities that can be expressed as cosine dissimilarity functions or Hellinger distances between nonlinearly transformed vectors of multi-passband observations in each pixel. We develop a decision theoretic framework for choosing the dissimilarity that minimizes the expected loss that arises when estimating identifiable thermal properties based on segmented images rather than on a pixel-by-pixel basis. We also examine the efficacy of different dissimilarities for recovering clusters in the underlying thermal properties. The expected losses are computed under scientifically motivated prior distributions. Two simulation studies guide our choices of dissimilarity function. We illustrate our method by segmenting images of a coronal hole observed on 26 February 2015.

Read this paper on arXiv…

N. Stein, D. Dyk and V. Kashyap
Tue, 15 Dec 15
48/87

Comments: N/A

Estimating sparse precision matrices [IMA]

http://arxiv.org/abs/1512.01241


We apply a method recently introduced to the statistical literature to directly estimate the precision matrix from an ensemble of samples drawn from a corresponding Gaussian distribution. Motivated by the observation that cosmological precision matrices are often approximately sparse, the method allows one to exploit this sparsity of the precision matrix to more quickly converge to an asymptotic $1/\sqrt{N_{\rm sim}}$ rate while simultaneously providing an error model for all of the terms. Such an estimate can be used as the starting point for further regularization efforts which can improve upon the $1/\sqrt{N_{\rm sim}}$ limit above, and incorporating such additional steps is straightforward within this framework. We demonstrate the technique with toy models and with an example motivated by large-scale structure two-point analysis, showing significant improvements in the rate of convergence. For the large-scale structure example we find errors on the precision matrix which are factors of 5 smaller than for the sample precision matrix for thousands of simulations or, alternatively, convergence to the same error level with more than an order of magnitude fewer simulations.
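
The estimator used in the paper is specific, but the general strategy of exploiting sparsity in the precision matrix can be prototyped with off-the-shelf tools, for example scikit-learn's graphical lasso (shown here purely as a stand-in, not as the method of the paper):

    import numpy as np
    from sklearn.covariance import GraphicalLasso

    # mock ensemble: n_sim realisations of an n_bin-dimensional Gaussian data vector
    rng = np.random.default_rng(0)
    n_sim, n_bin = 500, 40
    sims = rng.standard_normal((n_sim, n_bin))

    model = GraphicalLasso(alpha=0.05).fit(sims)   # alpha sets the strength of the sparsity penalty
    precision = model.precision_                   # sparse estimate of the inverse covariance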

Read this paper on arXiv…

N. Padmanabhan, M. White, H. Zhou, et al.
Mon, 7 Dec 15
43/46

Comments: 11 pages, 14 figures, submitted to MNRAS

Parameter inference with estimated covariance matrices [CEA]

http://arxiv.org/abs/1511.05969


When inferring parameters from a Gaussian-distributed data set by computing a likelihood, a covariance matrix is needed that describes the data errors and their correlations. If the covariance matrix is not known a priori, it may be estimated and thereby becomes a random object with some intrinsic uncertainty itself. We show how to infer parameters in the presence of such an estimated covariance matrix, by marginalising over the true covariance matrix, conditioned on its estimated value. This leads to a likelihood function that is no longer Gaussian, but rather an adapted version of a multivariate $t$-distribution, which has the same numerical complexity as the multivariate Gaussian. As expected, marginalisation over the true covariance matrix improves inference when compared with Hartlap et al.’s method, which uses an unbiased estimate of the inverse covariance matrix but still assumes that the likelihood is Gaussian.
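
Schematically, and up to the parameter-independent normalisation, the resulting likelihood for a data vector $x$, model mean $\mu(\theta)$ and covariance $\hat{C}$ estimated from $N_s$ simulations is

\[
p(x \mid \theta, \hat{C}) \propto \frac{1}{\sqrt{\det \hat{C}}}\left[1 + \frac{(x-\mu)^{\mathsf{T}} \hat{C}^{-1} (x-\mu)}{N_s - 1}\right]^{-N_s/2},
\]

which reverts to the usual Gaussian likelihood in the limit $N_s \to \infty$.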

Read this paper on arXiv…

E. Sellentin and A. Heavens
Fri, 20 Nov 15
1/55

Comments: To be published in MNRAS letters

Frequentist tests for Bayesian models [IMA]

http://arxiv.org/abs/1511.02363


Analogues of the frequentist chi-square and $F$ tests are proposed for testing goodness-of-fit and consistency for Bayesian models. Simple examples exhibit these tests’ detection of inconsistency between consecutive experiments with identical parameters, when the first experiment provides the prior for the second. In a related analysis, a quantitative measure is derived for judging the degree of tension between two different experiments with partially overlapping parameter vectors.

Read this paper on arXiv…

L. Lucy
Tue, 10 Nov 15
27/62

Comments: 8 pages, 4 figures

Detecting Unspecified Structure in Low-Count Images [IMA]

http://arxiv.org/abs/1510.04662


Unexpected structure in images of astronomical sources often presents itself upon visual inspection of the image, but such apparent structure may either correspond to true features in the source or be due to noise in the data. This paper presents a method for testing whether inferred structure in an image with Poisson noise represents a significant departure from a baseline (null) model of the image. To infer image structure, we conduct a Bayesian analysis of a full model that uses a multiscale component to allow flexible departures from the posited null model. As a test statistic, we use a tail probability of the posterior distribution under the full model. This choice of test statistic allows us to estimate a computationally efficient upper bound on a p-value that enables us to draw strong conclusions even when there are limited computational resources that can be devoted to simulations under the null model. We demonstrate the statistical performance of our method on simulated images. Applying our method to an X-ray image of the quasar 0730+257, we find significant evidence against the null model of a single point source and uniform background, lending support to the claim of an X-ray jet.

Read this paper on arXiv…

N. Stein, D. Dyk, V. Kashyap, et al.
Fri, 16 Oct 15
22/67

Comments: N/A

Comparing non-nested models in the search for new physics [CL]

http://arxiv.org/abs/1509.01010


Searches for unknown physics and deciding between competing physical models to explain data rely on statistical hypothesis testing. A common approach, used for example in the discovery of the Brout-Englert-Higgs boson, is based on the statistical Likelihood Ratio Test (LRT) and its asymptotic properties. In the common situation when neither of the two models under comparison is a special case of the other, i.e., when the hypotheses are non-nested, this test is not applicable, and so far no efficient solution exists. In physics, this problem occurs when two models that reside in different parameter spaces are to be compared. An important example is the recently reported excess emission in astrophysical $\gamma$-rays and the question whether its origin is known astrophysics or dark matter. We develop and study a new, generally applicable, frequentist method and validate its statistical properties using a suite of simulation studies. We exemplify it on realistic simulated data of the Fermi-LAT $\gamma$-ray satellite, where non-nested hypothesis testing appears in the search for particle dark matter.

Read this paper on arXiv…

S. Algeri, J. Conrad and D. Dyk
Fri, 4 Sep 15
53/58

Comments: We welcome examples of non-nested models testing problems

A Gibbs Sampler for Multivariate Linear Regression [IMA]

http://arxiv.org/abs/1509.00908


Kelly (2007, hereafter K07) described an efficient algorithm, using Gibbs sampling, for performing linear regression in the fairly general case where non-zero measurement errors exist for both the covariates and response variables, where these measurements may be correlated (for the same data point), where the response variable is affected by intrinsic scatter in addition to measurement error, and where the prior distribution of covariates is modeled by a flexible mixture of Gaussians rather than assumed to be uniform. Here I extend the K07 algorithm in two ways. First, the procedure is generalized to the case of multiple response variables. Second, I describe how to model the prior distribution of covariates using a Dirichlet process, which can be thought of as a Gaussian mixture where the number of mixture components is learned from the data. I present an example of multivariate regression using the extended algorithm, namely fitting scaling relations of the gas mass, temperature, and luminosity of dynamically relaxed galaxy clusters as a function of their mass and redshift. An implementation of the Gibbs sampler in the R language, called LRGS, is provided.

Read this paper on arXiv…

A. Mantz
Fri, 4 Sep 15
54/58

Comments: 9 pages, 5 figures, 2 tables

Stochastic determination of matrix determinants [CL]

http://arxiv.org/abs/1504.02661


Matrix determinants play an important role in data analysis, in particular when Gaussian processes are involved. Due to currently exploding data volumes, linear operations – matrices – acting on the data are often not accessible directly, but are only represented indirectly in the form of a computer routine. Such a routine implements the transformation a data vector undergoes under matrix multiplication. While efficient probing routines to estimate a matrix’s diagonal or trace, based solely on such computationally affordable matrix-vector multiplications, are well known and frequently used in signal inference, a stochastic estimate of the determinant has been lacking. In this work a probing method for the logarithm of the determinant of a linear operator is introduced. This method rests upon a reformulation of the log-determinant as an integral representation and the transformation of the involved terms into stochastic expressions. This stochastic determinant determination enables large-size applications in Bayesian inference, in particular evidence calculations, model comparison, and posterior determination.
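
For concreteness, the probing primitive that such methods build on is the stochastic estimation of a trace from matrix-vector products alone (standard Hutchinson probing, with illustrative names; the paper's contribution, the integral representation for the log-determinant, sits on top of this):

    import numpy as np

    def probe_trace(matvec, dim, n_probes=200, rng=None):
        """Stochastically estimate tr(A) using only the routine v -> A @ v."""
        rng = np.random.default_rng(rng)
        total = 0.0
        for _ in range(n_probes):
            z = rng.choice([-1.0, 1.0], size=dim)   # Rademacher probe vector
            total += z @ matvec(z)                  # E[z^T A z] = tr(A)
        return total / n_probes

    # toy usage: the operator is only available as a routine, here multiplication by a hidden matrix
    A = np.diag(np.arange(1.0, 101.0))
    print(probe_trace(lambda v: A @ v, dim=100), np.trace(A))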

Read this paper on arXiv…

S. Dorn and T. Ensslin
Mon, 13 Apr 15
49/54

Comments: 8 pages, 5 figures

Weighted principal component analysis: a weighted covariance eigendecomposition approach [IMA]

http://arxiv.org/abs/1412.4533


We present a new straightforward principal component analysis (PCA) method based on the diagonalization of the weighted variance-covariance matrix through two spectral decomposition methods: power iteration and Rayleigh quotient iteration. This method allows one to retrieve a given number of orthogonal principal components amongst the most meaningful ones for the case of problems with weighted and/or missing data. Principal coefficients are then retrieved by fitting principal components to the data while providing the final decomposition. Tests performed on real and simulated cases show that our method is optimal in the identification of the most significant patterns within data sets. We illustrate the usefulness of this method by assessing its quality on the extrapolation of Sloan Digital Sky Survey quasar spectra from measured wavelengths to shorter and longer wavelengths. Our new algorithm also benefits from a fast and flexible implementation.
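
A minimal sketch of the central step, power iteration on a weighted variance-covariance matrix (one common convention for the weighting, with an illustrative function name; the full scheme in the paper also fits principal coefficients and handles missing data explicitly):

    import numpy as np

    def leading_weighted_pc(X, W, n_iter=200, rng=None):
        """Leading principal component of data X (n_obs x n_var) with weights W of the same shape."""
        rng = np.random.default_rng(rng)
        mean = np.average(X, axis=0, weights=W)          # weighted mean of each variable
        Xc = X - mean
        cov = (W * Xc).T @ (W * Xc) / (W.T @ W)          # weighted variance-covariance matrix
        v = rng.standard_normal(X.shape[1])
        for _ in range(n_iter):                          # power iteration
            v = cov @ v
            v /= np.linalg.norm(v)
        return v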

Read this paper on arXiv…

L. Delchambre
Tue, 16 Dec 14
46/78

Comments: 12 pages, 9 figures

Monte Carlo error analyses of Spearman's rank test [IMA]

http://arxiv.org/abs/1411.3816


Spearman’s rank correlation test is commonly used in astronomy to discern whether two variables are correlated or not. Unlike most other quantities quoted in the astronomical literature, Spearman’s rank correlation coefficient is generally quoted with no attempt to estimate the errors on its value. This is a practice that would not be accepted for those other quantities, as it is often regarded that an estimate of a quantity without an estimate of its associated uncertainties is meaningless. This manuscript describes a number of easily implemented, Monte Carlo based methods to estimate the uncertainty on Spearman’s rank correlation coefficient, or more precisely to estimate its probability distribution.
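
One of the simplest such Monte Carlo estimates is a bootstrap over the data pairs (a generic sketch with illustrative names; the manuscript discusses several variants, including perturbing the data by their measurement errors):

    import numpy as np
    from scipy.stats import spearmanr

    def spearman_bootstrap(x, y, n_boot=10000, rng=None):
        """Bootstrap distribution of Spearman's rank correlation coefficient."""
        rng = np.random.default_rng(rng)
        x, y = np.asarray(x), np.asarray(y)
        rho = np.empty(n_boot)
        for i in range(n_boot):
            idx = rng.integers(0, len(x), size=len(x))   # resample pairs with replacement
            r, _ = spearmanr(x[idx], y[idx])             # (coefficient, p-value)
            rho[i] = r
        return rho   # e.g. np.percentile(rho, [16, 50, 84]) gives a central 68% interval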

Read this paper on arXiv…

P. Curran
Mon, 17 Nov 14
38/52

Comments: Unsubmitted manuscript (comments welcome); 5 pages; Code available at this https URL

Bayesian Evidence and Model Selection [CL]

http://arxiv.org/abs/1411.3013


In this paper we review the concept of the Bayesian evidence and its application to model selection. The theory is presented along with a discussion of analytic, approximate and numerical techniques. Application to several practical examples within the context of signal processing are discussed.

Read this paper on arXiv…

K. Knuth, M. Habeck, N. Malakar, et al.
Thu, 13 Nov 14
38/49

Comments: 39 pages, 8 figures. Submitted to DSP. Features theory, numerical methods and four applications

Finding the Most Distant Quasars Using Bayesian Selection Methods [IMA]

http://arxiv.org/abs/1405.4701


Quasars, the brightly glowing disks of material that can form around the super-massive black holes at the centres of large galaxies, are amongst the most luminous astronomical objects known and so can be seen at great distances. The most distant known quasars are seen as they were when the Universe was less than a billion years old (i.e., $\sim\!7\%$ of its current age). Such distant quasars are, however, very rare, and so are difficult to distinguish from the billions of other comparably-bright sources in the night sky. In searching for the most distant quasars in a recent astronomical sky survey (the UKIRT Infrared Deep Sky Survey, UKIDSS), there were $\sim\!10^3$ apparently plausible candidates for each expected quasar, far too many to reobserve with other telescopes. The solution to this problem was to apply Bayesian model comparison, making models of the quasar population and the dominant contaminating population (Galactic stars) to utilise the information content in the survey measurements. The result was an extremely efficient selection procedure that was used to quickly identify the most promising UKIDSS candidates, one of which was subsequently confirmed as the most distant quasar known to date.

Read this paper on arXiv…

D. Mortlock
Tue, 20 May 14
36/62

Comments: Published in at this http URL the Statistical Science (this http URL) by the Institute of Mathematical Statistics (this http URL)

Functional Regression for Quasar Spectra [CL]

http://arxiv.org/abs/1404.3168


The Lyman-alpha forest is a portion of the observed light spectrum of distant galactic nuclei which allows us to probe remote regions of the Universe that are otherwise inaccessible. The observed Lyman-alpha forest of a quasar light spectrum can be modeled as a noisy realization of a smooth curve that is affected by a ‘damping effect’ which occurs whenever the light emitted by the quasar travels through regions of the Universe with higher matter concentration. To decode the information conveyed by the Lyman-alpha forest about the matter distribution, we must be able to separate the smooth ‘continuum’ from the noise and the contribution of the damping effect in the quasar light spectra. To predict the continuum in the Lyman-alpha forest, we use a nonparametric functional regression model in which both the response and the predictor variable (the smooth part of the damping-free portion of the spectrum) are function-valued random variables. We demonstrate that the proposed method accurately predicts the unobservable continuum in the Lyman-alpha forest both on simulated spectra and real spectra. Also, we introduce distribution-free prediction bands for the nonparametric functional regression model that have finite sample guarantees. These prediction bands, together with bootstrap-based confidence bands for the projection of the mean continuum on a fixed number of principal components, allow us to assess the degree of uncertainty in the model predictions.

Read this paper on arXiv…

M. Ciollaro, J. Cisewski, P. Freeman, et al.
Mon, 14 Apr 14
4/41

Inverse Bayesian Estimation of Gravitational Mass Density in Galaxies from Missing Kinematic Data [CL]

http://arxiv.org/abs/1401.1052


In this paper we focus on a type of inverse problem in which the data is expressed as an unknown function of the sought and unknown model function (or its discretised representation as a model parameter vector). In particular, we deal with situations in which training data is not available. Then we cannot model the unknown functional relationship between data and the unknown model function (or parameter vector) with a Gaussian Process of appropriate dimensionality. A Bayesian method based on state space modelling is advanced instead. Within this framework, the likelihood is expressed in terms of the probability density function ($pdf$) of the state space variable, and the sought model parameter vector is embedded within the domain of this $pdf$. As the measurable vector lives only inside an identified sub-volume of the system state space, the $pdf$ of the state space variable is projected onto the space of the measurables, and it is in terms of the projected state space density that the likelihood is written; the final form of the likelihood is achieved after convolution with the distribution of measurement errors. Application-motivated vague priors are invoked and the posterior probability density of the model parameter vectors, given the data, is computed. Inference is performed by taking posterior samples with adaptive MCMC. The method is illustrated on synthetic as well as real galactic data.

Read this paper on arXiv…

Wed, 8 Jan 14
52/62

A Generalized Savage-Dickey Ratio [CL]

http://arxiv.org/abs/1311.1292


In this brief research note I present a generalized version of the Savage-Dickey Density Ratio for representation of the Bayes factor (or marginal likelihood ratio) of nested statistical models; the new version takes the form of a Radon-Nikodym derivative and is thus applicable to a wider family of probability spaces than the original (restricted to those admitting an ordinary Lebesgue density). A derivation is given following the measure-theoretic construction of Marin & Robert (2010), and the equivalent estimator is demonstrated in application to a distributional modeling problem.

Read this paper on arXiv…

Thu, 7 Nov 13
16/60