Avoiding methane emission rate underestimates when using the divergence method [CL]

http://arxiv.org/abs/2304.10303


Methane is a powerful greenhouse gas and a primary target for near-term climate change mitigation, owing to its relatively short atmospheric lifetime and its greater ability to trap heat in Earth’s atmosphere compared to carbon dioxide. Top-down observations of atmospheric methane are possible via drone and aircraft surveys as well as satellites such as the TROPOspheric Monitoring Instrument (TROPOMI). Recent work has begun to apply the divergence method to produce regional methane emission rate estimates. Here we show that spatially incomplete observations of methane can produce negatively biased time-averaged regional emission rate estimates via the divergence method, but that this effect can be counteracted by adopting a procedure in which daily advective fluxes of methane are time-averaged before the divergence method is applied. Using such a procedure with TROPOMI methane observations, we calculate yearly Permian Basin emission rates of 3.1, 2.4 and 2.7 million tonnes per year for the years 2019 through 2021. We also show that the divergence method can yield negatively biased emission rate estimates for highly resolved methane plumes, due to turbulent diffusion within the plume, although this is unlikely to affect regional methane emission budgets constructed from TROPOMI observations of methane. The results from this work are expected to provide useful guidance for future implementations of the divergence method for emission rate estimation from satellite data, whether for methane or for other gaseous species in the atmosphere.
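
The paper's central procedural point, time-averaging the daily advective fluxes before applying the divergence operator rather than averaging daily divergence maps, can be illustrated with a small numpy sketch. The grid spacing, fluxes and coverage masks below are synthetic assumptions, not the paper's TROPOMI processing chain, and the sketch only shows the two orders of operations, not the bias mechanism itself:

```python
import numpy as np

rng = np.random.default_rng(0)
ny, nx, ndays = 50, 50, 100
dx = 5e3  # grid spacing in metres (illustrative)

# Synthetic daily advective fluxes Fx, Fy with noise, observed through
# random daily masks (spatially incomplete scenes).
Fx = 1.0 + 0.5 * rng.standard_normal((ndays, ny, nx))
Fy = 0.2 * rng.standard_normal((ndays, ny, nx))
mask = rng.random((ndays, ny, nx)) < 0.6   # ~60% coverage per day
Fx[~mask] = np.nan
Fy[~mask] = np.nan

def divergence(fx, fy, dx):
    """2-D divergence via centred finite differences."""
    dfx_dx = np.gradient(fx, dx, axis=-1)
    dfy_dy = np.gradient(fy, dx, axis=-2)
    return dfx_dx + dfy_dy

# Order 1: divergence of each daily (incomplete) flux map, then time-average.
# NaNs propagate through the finite differences, so coverage is lost
# around every daily data gap.
div_daily = np.stack([divergence(Fx[t], Fy[t], dx) for t in range(ndays)])
est_divergence_first = np.nanmean(div_daily, axis=0)

# Order 2 (the paper's recommendation): time-average the fluxes first,
# then apply the divergence operator once to the averaged field.
est_average_first = divergence(np.nanmean(Fx, axis=0),
                               np.nanmean(Fy, axis=0), dx)

print(np.nanmean(est_divergence_first), np.nanmean(est_average_first))
```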

Read this paper on arXiv…

C. Roberts, R. IJzermans, D. Randell, et al.
Fri, 21 Apr 23
1/60

Comments: 17 pages, 10 figures, submitted to Environmental Research Letters

A statistical model of stellar variability. I. FENRIR: a physics-based model of stellar activity, and its fast Gaussian process approximation [SSA]

http://arxiv.org/abs/2304.08489


The detection of terrestrial planets by radial velocity and photometry is hindered by the presence of stellar signals. These are often modeled as stationary Gaussian processes whose kernels are based on qualitative considerations, which do not fully leverage the existing physical understanding of stars. Our aim is to build a formalism that allows knowledge of stellar activity to be transferred into practical data analysis methods; in particular, we aim to obtain kernels with physical parameters. This has two purposes: better modelling of signals of stellar origin to find smaller exoplanets, and extracting information about the star from the statistical properties of the data. We consider several observational channels, such as photometry, radial velocity and activity indicators, and build a model called FENRIR to represent their stochastic variations due to stellar surface inhomogeneities. We compute analytically the covariance of this multi-channel stochastic process, and implement it in the S+LEAF framework to reduce the cost of likelihood evaluations from $O(N^3)$ to $O(N)$. We also compute analytically higher-order cumulants of our FENRIR model, which quantify its non-Gaussianity. We obtain a fast Gaussian process framework with physical parameters, which we apply to the HARPS-N and SORCE observations of the Sun, and constrain a solar inclination compatible with the viewing geometry. We then discuss the application of our formalism to granulation. We exhibit non-Gaussianity in solar HARPS radial velocities, and argue that information is lost when stellar activity signals are assumed to be Gaussian. We finally discuss the origin of phase shifts between RVs and indicators, and how to build relevant activity indicators. We provide an open-source implementation of the FENRIR Gaussian process model with a Python interface.
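
FENRIR itself is not reproduced here, but the computational bottleneck it addresses, evaluating a Gaussian process likelihood whose kernel carries physical parameters, can be sketched with a generic quasi-periodic activity kernel and a direct Cholesky solve. This baseline costs $O(N^3)$; the paper's S+LEAF implementation brings the same evaluation down to $O(N)$. The kernel form and all parameter values below are illustrative assumptions:

```python
import numpy as np

def quasi_periodic_kernel(t1, t2, amp, p_rot, tau, gamma):
    """Quasi-periodic covariance often used for stellar activity:
    exponential decay over the spot lifetime tau, modulated at the
    rotation period p_rot."""
    dt = t1[:, None] - t2[None, :]
    return amp**2 * np.exp(-0.5 * (dt / tau)**2
                           - gamma * np.sin(np.pi * dt / p_rot)**2)

def gp_log_likelihood(t, y, yerr, **kpar):
    K = quasi_periodic_kernel(t, t, **kpar) + np.diag(yerr**2)
    L = np.linalg.cholesky(K)           # O(N^3); S+LEAF-style solvers are O(N)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return (-0.5 * y @ alpha - np.log(np.diag(L)).sum()
            - 0.5 * len(y) * np.log(2 * np.pi))

rng = np.random.default_rng(1)
t = np.sort(rng.uniform(0, 100, 200))     # unevenly sampled times (days)
yerr = 0.5 * np.ones_like(t)
y = rng.standard_normal(t.size) * yerr    # placeholder RV series (m/s)
print(gp_log_likelihood(t, y, yerr, amp=2.0, p_rot=25.0, tau=30.0, gamma=2.0))
```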

Read this paper on arXiv…

N. Hara and J. Delisle
Tue, 18 Apr 23
79/80

Comments: Submitted to Astronomy & Astrophysics

JWST MIRI Imaging Data Post-Processing Preliminary Study with Fourier Transformation to uncover potentially celestial-origin signals [IMA]

http://arxiv.org/abs/2304.00728


This manuscript reports part of a dedicated study aiming to disentangle the sources of signals in James Webb Space Telescope (JWST) Mid-Infrared Instrument (MIRI) imaging-mode data. It opens with an introduction to the MIRI instrument and its characteristics, and then discusses a Fast Fourier Transform-based filtering approach and its results.
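
As a rough illustration of the kind of Fourier-domain filtering the manuscript discusses, the sketch below applies a radial band-pass mask in the 2-D FFT of an image. The synthetic image, mask radii, and the use of a single annular mask are all simplifying assumptions:

```python
import numpy as np

def fft_bandpass_filter(image, r_min, r_max):
    """Keep only spatial frequencies with radius in [r_min, r_max]
    (in FFT pixel units): a crude separation of smooth background,
    intermediate-scale signal, and pixel-scale noise."""
    F = np.fft.fftshift(np.fft.fft2(image))
    ny, nx = image.shape
    yy, xx = np.ogrid[:ny, :nx]
    r = np.hypot(yy - ny / 2, xx - nx / 2)
    mask = (r >= r_min) & (r <= r_max)
    return np.real(np.fft.ifft2(np.fft.ifftshift(F * mask)))

rng = np.random.default_rng(2)
img = rng.normal(100.0, 5.0, (256, 256))   # stand-in for a MIRI frame
filtered = fft_bandpass_filter(img, r_min=5, r_max=60)
```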

Read this paper on arXiv…

G. Hatipoğlu
Tue, 4 Apr 23
4/111

Comments: 16 pages, 18 figures

PCA-based Data Reduction and Signal Separation Techniques for James-Webb Space Telescope Data Processing [IMA]

http://arxiv.org/abs/2301.00415


Principal Component Analysis (PCA)-based techniques can separate data into uncorrelated components and facilitate statistical analysis as a pre-processing step. Independent Component Analysis (ICA) separates statistically independent signal sources through a non-parametric, iterative algorithm. Non-negative matrix factorization (NMF) is another PCA-like approach that groups dimensions into physically interpretable categories. Singular spectrum analysis (SSA) is a PCA-like algorithm for time series. After an introduction and a literature review on processing JWST data from the Near-Infrared Camera (NIRCam) and Mid-Infrared Instrument (MIRI), potential points of intervention in the James Webb Space Telescope imaging data reduction pipeline are discussed.
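
A minimal scikit-learn sketch of three of the four technique families mentioned (SSA is omitted since scikit-learn has no built-in implementation); the synthetic data matrix is an assumption standing in for calibrated JWST pixel spectra:

```python
import numpy as np
from sklearn.decomposition import PCA, FastICA, NMF

rng = np.random.default_rng(3)
X = np.abs(rng.normal(1.0, 0.3, (500, 64)))   # stand-in: 500 pixels x 64 channels

pca = PCA(n_components=5).fit(X)              # orthogonal, variance-ranked components
ica = FastICA(n_components=5, random_state=0).fit(X)   # independent sources
nmf = NMF(n_components=5, init="nndsvda", max_iter=500).fit(X)  # non-negative parts

print(pca.explained_variance_ratio_)
codes = pca.transform(X)                      # reduced representation for later analysis
```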

Read this paper on arXiv…

G. Hatipoğlu
Tue, 3 Jan 23
24/49

Comments: 12 pages

Utility of PCA and Other Data Transformation Techniques in Exoplanet Research [EPA]

http://arxiv.org/abs/2211.14683


This paper focuses on the utility in exoplanet research of various data transformation techniques that fall broadly under the principal component analysis (PCA) umbrella. The first section introduces the methodological background of PCA and related techniques. The second section reviews the studies that have applied these techniques in exoplanet research and organizes the main themes of the literature into an overview, closing with recommendations for future research directions.

Read this paper on arXiv…

G. Hatipoğlu
Tue, 29 Nov 22
64/80

Comments: 15 pages

Scalable Bayesian Inference for Finding Strong Gravitational Lenses [IMA]

http://arxiv.org/abs/2211.10479


Finding strong gravitational lenses in astronomical images allows us to assess cosmological theories and understand the large-scale structure of the universe. Previous works on lens detection do not quantify uncertainties in lens parameter estimates or scale to modern surveys. We present a fully amortized Bayesian procedure for lens detection that overcomes these limitations. Unlike traditional variational inference, in which training minimizes the reverse Kullback-Leibler (KL) divergence, our method is trained with an expected forward KL divergence. Using synthetic GalSim images and real Sloan Digital Sky Survey (SDSS) images, we demonstrate that amortized inference trained with the forward KL produces well-calibrated uncertainties in both lens detection and parameter estimation.
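
A minimal sketch of forward-KL amortized inference (often called neural posterior estimation) in PyTorch, with a one-parameter toy simulator standing in for GalSim lens rendering and a Gaussian posterior head standing in for the paper's richer detection-plus-parameters model; all names and settings are illustrative assumptions:

```python
import torch
import torch.nn as nn

# Toy simulator standing in for lens-image rendering: theta is the latent
# parameter, x the observation.
def simulate(n):
    theta = torch.randn(n, 1)
    x = theta + 0.3 * torch.randn(n, 1)
    return theta, x

# Amortized Gaussian posterior q(theta | x) with learned mean and log-std.
net = nn.Sequential(nn.Linear(1, 64), nn.ReLU(), nn.Linear(64, 2))
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for step in range(2000):
    theta, x = simulate(256)
    mu, log_sigma = net(x).chunk(2, dim=-1)
    # Minimizing E_{p(theta, x)}[-log q(theta | x)] is, up to a constant,
    # the expected forward KL from the true posterior to q: no likelihood
    # evaluations are needed, only simulated (theta, x) pairs.
    loss = (0.5 * ((theta - mu) / log_sigma.exp())**2 + log_sigma).mean()
    opt.zero_grad()
    loss.backward()
    opt.step()
```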

Read this paper on arXiv…

Y. Patel and J. Regier
Tue, 22 Nov 22
3/83

Comments: Accepted to the NeurIPS 2022 Workshop on Machine Learning and the Physical Sciences

Statistical Inference for Coadded Astronomical Images [IMA]

http://arxiv.org/abs/2211.09300


Coadded astronomical images are created by stacking multiple single-exposure images. Because coadded images are smaller in terms of data size than the single-exposure images they summarize, loading and processing them is less computationally expensive. However, image coaddition introduces additional dependence among pixels, which complicates principled statistical analysis of them. We present a principled Bayesian approach for performing light source parameter inference with coadded astronomical images. Our method implicitly marginalizes over the single-exposure pixel intensities that contribute to the coadded images, giving it the computational efficiency necessary to scale to next-generation astronomical surveys. As a proof of concept, we show that our method for estimating the locations and fluxes of stars using simulated coadds outperforms a method trained on single-exposure images.

Read this paper on arXiv…

M. Wang, I. Mendoza, C. Wang, et al.
Fri, 18 Nov 22
50/70

Comments: Accepted to the NeurIPS 2022 Machine Learning and the Physical Sciences workshop. 6 pages, 2 figures

Research on the impact of asteroid mining on global equity [CL]

http://arxiv.org/abs/2211.02023


In a future in which humanity, seeking more resources, sets out toward the stars, an era of large-scale interstellar exploration begins. According to the Outer Space Treaty, any exploration of celestial bodies should promote global equality and benefit all nations. First, we define global equity and construct a Unified Equity Index (UEI) model to measure it. We merge strongly correlated factors into six elements and use the entropy method (TEM) to find the dispersion of these elements across countries. We then apply principal component analysis (PCA) to reduce the dimensionality of the dispersion and use a standardized index to obtain global equity. Second, we simulate a future with asteroid mining and evaluate its impact on the Unified Equity Index (UEI). We divide the mineable asteroids into three classes with different mining difficulties and values, and identify 28 mining entities, including private companies and national and international organizations. We consider changes in asteroid classes, mining capabilities and mining scales to determine the changes in the value of minerals mined between 2025 and 2085, and convert mining output value into mineral transaction value through an allocation matrix based on grey relational analysis (GRA). Finally, we present three possible futures of asteroid mining by changing these conditions, and propose two sets of corresponding policies for future trends in global equity under asteroid mining. We test the separate and combined effects of these policies and find them positive, strongly supporting the effectiveness of our model.
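
The entropy-weighting step described above can be sketched generically; the country-by-element matrix below is synthetic and the paper's exact normalization conventions are not known, so this is only a textbook entropy weight method:

```python
import numpy as np

def entropy_weights(X):
    """Entropy weight method: criteria whose values are spread more evenly
    across countries carry less information and receive lower weight."""
    P = X / X.sum(axis=0)                      # column-normalized proportions
    n = X.shape[0]
    with np.errstate(divide="ignore", invalid="ignore"):
        logp = np.where(P > 0, np.log(P), 0.0)
    e = -(P * logp).sum(axis=0) / np.log(n)    # entropy per criterion, in [0, 1]
    d = 1 - e                                  # degree of diversification
    return d / d.sum()

rng = np.random.default_rng(13)
X = rng.uniform(0.1, 1.0, (50, 6))             # 50 countries x 6 equity elements
print(entropy_weights(X))
```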

Read this paper on arXiv…

H. Sun, J. Zhu and Y. Xu
Fri, 4 Nov 22
19/84

Comments: 19 pages

Towards Improved Heliosphere Sky Map Estimation with Theseus [IMA]

http://arxiv.org/abs/2210.12005


The Interstellar Boundary Explorer (IBEX) satellite has been in orbit since 2008 and detects energy-resolved energetic neutral atoms (ENAs) originating from the heliosphere. Different regions of the heliosphere generate ENAs at different rates. It is of scientific interest to take the data collected by IBEX and estimate spatial maps of heliospheric ENA rates (referred to as sky maps) at higher resolutions than before. These sky maps will subsequently be used to discern between competing theories of heliosphere properties in ways that are not currently possible. The data IBEX collects present challenges to sky map estimation. The two primary challenges are noisy and irregularly spaced data collection and the IBEX instrumentation’s point spread function (PSF). In essence, the data collected by IBEX are both noisy and biased for the underlying sky map of inferential interest. In this paper, we present a two-stage sky map estimation procedure called Theseus. In Stage 1, Theseus estimates a blurred sky map from the noisy and irregularly spaced data using an ensemble approach that leverages projection pursuit regression and generalized additive models. In Stage 2, Theseus deblurs the sky map by deconvolving the PSF from the blurred map using regularization. Unblurred sky map uncertainties are computed via bootstrapping. We compare Theseus to a method closely related to the one operationally used today by the IBEX Science Operation Center (ISOC) on both simulated and real data. Theseus outperforms the ISOC method in nearly every considered metric on simulated data, indicating that Theseus is an improvement over the current state of the art.
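
Stage 2, deblurring by deconvolving the PSF with regularization, can be illustrated with a simple Tikhonov-regularized Fourier deconvolution; the paper's actual regularization scheme, PSF, and maps differ, so treat this purely as a sketch of the idea:

```python
import numpy as np

def gaussian_psf(shape, sigma):
    """Centered 2-D Gaussian point spread function on the full grid."""
    ny, nx = shape
    y = np.arange(ny) - ny // 2
    x = np.arange(nx) - nx // 2
    psf = np.exp(-(y[:, None]**2 + x[None, :]**2) / (2 * sigma**2))
    return psf / psf.sum()

def regularized_deconvolve(blurred, psf, lam=0.01):
    """Tikhonov-regularized Fourier deconvolution: divide out the PSF's
    transfer function while damping the frequencies it suppressed."""
    H = np.fft.fft2(np.fft.ifftshift(psf))
    B = np.fft.fft2(blurred)
    X = np.conj(H) * B / (np.abs(H)**2 + lam)
    return np.real(np.fft.ifft2(X))

rng = np.random.default_rng(16)
truth = rng.gamma(2.0, 1.0, (128, 128))        # stand-in ENA-rate sky map
psf = gaussian_psf(truth.shape, sigma=3.0)
blurred = np.real(np.fft.ifft2(np.fft.fft2(truth) *
                               np.fft.fft2(np.fft.ifftshift(psf))))
recovered = regularized_deconvolve(blurred, psf)
```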

Read this paper on arXiv…

D. Osthus, B. Weaver, L. Beesley, et al.
Mon, 24 Oct 22
44/56

Comments: 38 pages, 18 figures, 3 tables

Inferring changes to the global carbon cycle with WOMBAT v2.0, a hierarchical flux-inversion framework [CL]

http://arxiv.org/abs/2210.10479


The natural cycles of the surface-to-atmosphere fluxes of carbon dioxide (CO$_2$) and other important greenhouse gases are changing in response to human influences. These changes need to be quantified to understand climate change and its impacts, but this is difficult to do because natural fluxes occur over large spatial and temporal scales. To infer trends in fluxes and identify phase shifts and amplitude changes in flux seasonal cycles, we construct a flux-inversion system that uses a novel spatially varying time-series decomposition of the fluxes, while also accommodating physical constraints on the fluxes. We incorporate these features into the Wollongong Methodology for Bayesian Assimilation of Trace-gases (WOMBAT, Zammit-Mangion et al., Geosci. Model Dev., 15, 2022), a hierarchical flux-inversion framework that yields posterior distributions for all unknowns in the underlying model. We apply the new method, which we call WOMBAT v2.0, to a mix of satellite observations of CO$_2$ mole fraction from the Orbiting Carbon Observatory-2 (OCO-2) satellite and direct measurements of CO$_2$ mole fraction from a variety of sources. We estimate the changes to CO$_2$ fluxes that occurred from January 2015 to December 2020, and compare our posterior estimates to those from an alternative method based on a bottom-up understanding of the physical processes involved. We find substantial trends in the fluxes, including that tropical ecosystems trended from being a net source to a net sink of CO$_2$ over the study period. We also find that the amplitude of the global seasonal cycle of ecosystem CO$_2$ fluxes increased over the study period by 0.11 PgC/month (an increase of 8%), and that the seasonal cycle of ecosystem CO$_2$ fluxes in the northern temperate and northern boreal regions shifted earlier in the year by 0.4-0.7 and 0.4-0.9 days, respectively (2.5th to 97.5th posterior percentiles).

Read this paper on arXiv…

M. Bertolacci, A. Zammit-Mangion, A. Schuh, et al.
Thu, 20 Oct 22
71/74

Comments: N/A

Improving Power Spectral Estimation using Multitapering: Precise asteroseismic modeling of stars, exoplanets, and beyond [IMA]

http://arxiv.org/abs/2209.15027


Asteroseismic time-series data have imprints of stellar oscillation modes, whose detection and characterization through time-series analysis allows us to probe stellar interiors physics. Such analyses usually occur in the Fourier domain by computing the Lomb-Scargle (LS) periodogram, an estimator of the power spectrum underlying unevenly-sampled time-series data. However, the LS periodogram suffers from the statistical problems of (1) inconsistency (or noise) and (2) bias due to high spectral leakage. In addition, it is designed to detect strictly periodic signals and is unsuitable for non-sinusoidal periodic or quasi-periodic signals. Here, we develop a multitaper spectral estimation method that tackles the inconsistency and bias problems of the LS periodogram. We combine this multitaper method with the Non-Uniform Fast Fourier Transform (mtNUFFT) to more precisely estimate the frequencies of asteroseismic signals that are non-sinusoidal periodic (e.g., exoplanet transits) or quasi-periodic (e.g., pressure modes). We illustrate this using a simulated light curve and the Kepler-91 red giant light curve. In particular, we detect the Kepler-91b exoplanet and precisely estimate its period, $6.246 \pm 0.002$ days, in the frequency domain using the multitaper F-test alone. We also integrate mtNUFFT into the PBjam package to obtain a Kepler-91 age estimate of $3.96 \pm 0.48$ Gyr. This 36% improvement in age precision relative to the $4.27 \pm 0.75$ Gyr APOKASC-2 (uncorrected) estimate illustrates that mtNUFFT has promising implications for Galactic archaeology, in addition to stellar interiors and exoplanet studies. Our frequency analysis method generally applies to time-domain astronomy and is implemented in the public Python package tapify, available at https://github.com/aaryapatil/tapify.
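
The core multitaper idea, averaging several DPSS-tapered eigenspectra to reduce the variance and leakage of a single periodogram, can be sketched for evenly sampled data with scipy (mtNUFFT generalizes this to uneven sampling; the bandwidth, taper count, and test signal below are illustrative assumptions):

```python
import numpy as np
from scipy.signal.windows import dpss

def multitaper_spectrum(y, dt, nw=4.0, k=7):
    """Average of k eigenspectra from DPSS (Slepian) tapers; this reduces
    the variance ('inconsistency') and spectral leakage of a single
    periodogram estimate."""
    n = len(y)
    tapers = dpss(n, nw, Kmax=k)                  # shape (k, n)
    eig = np.abs(np.fft.rfft(tapers * y, axis=1))**2
    freq = np.fft.rfftfreq(n, dt)
    return freq, dt * eig.mean(axis=0)

rng = np.random.default_rng(4)
t = np.arange(4000) * 0.02                        # days
y = 0.8 * np.sin(2 * np.pi * t / 6.25) + rng.standard_normal(t.size)
freq, S = multitaper_spectrum(y, dt=0.02)
print(freq[np.argmax(S[1:]) + 1])                 # ~1/6.25 cycles per day
```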

Read this paper on arXiv…

A. Patil, G. Eadie, J. Speagle, et al.
Mon, 3 Oct 22
27/55

Comments: 32 pages (3 pages in the Appendix), 14 figures, 2 tables, Submitted to AJ

Automatic detection of long-duration transients in Fermi-GBM data [IMA]

http://arxiv.org/abs/2205.13649


In the era of time-domain, multi-messenger astronomy, the detection of transient events on the high-energy electromagnetic sky has become more important than ever. Previous attempts to systematically search for onboard-untriggered events in the data of Fermi-GBM have been limited to short-duration signals with variability timescales below ~1 min, due to the dominance of background variations on longer timescales. In this study, we aim to detect slowly rising or long-duration transient events with high sensitivity and full coverage of the GBM spectrum. We make use of our previously developed physical background model and propose a novel trigger algorithm with a fully automatic data analysis pipeline. The results from extensive simulations demonstrate that the developed trigger algorithm is sensitive down to sub-Crab intensities and has near-optimal detection performance. During a two-month test run on real Fermi-GBM data, the pipeline detected more than 300 untriggered transient signals. For one of these detections we verify that it originated from a known astrophysical source, namely the Vela X-1 pulsar, showing pulsed emission for more than seven hours. More generally, this method enables a systematic search for weak and/or long-duration transients.

Read this paper on arXiv…

F. Kunzweiler, B. Biltzinger, J. Greiner, et al.
Mon, 30 May 22
30/47

Comments: N/A

Accounting for stellar activity signals in radial-velocity data by using Change Point Detection techniques [EPA]

http://arxiv.org/abs/2205.11136


Active regions on the photosphere of a star have been the major obstacle to detecting Earth-like exoplanets with the radial velocity (RV) method. A commonly employed solution for addressing stellar activity is to assume a linear relationship between the RV observations and the activity indicators along the entire time series, and then remove the estimated contribution of activity from the variation in the RV data (the overall correction method). However, since active regions evolve on the photosphere over time, correlations between the RV observations and the activity indicators will correspondingly be anisotropic. We present an approach that identifies the locations in the RV time series where the correlations between the RV and the activity indicators significantly change, in order to better account for variations in RV caused by stellar activity. The proposed approach uses a general family of statistical breakpoint methods, often referred to as Change-Point Detection (CPD) algorithms. A thorough comparison is made between the breakpoint-based approach and the overall correction method. To ensure broad representativeness, we use measurements from real stars that have different levels of stellar activity and whose spectra have different signal-to-noise ratios. When the corrections for stellar activity are applied separately to each temporal segment identified by the breakpoint method, the corresponding residuals in the RV time series are typically much smaller than those obtained with the overall correction method. Consequently, the Generalized Lomb-Scargle periodogram contains a smaller number of peaks caused by active regions. The CPD algorithm is particularly effective when focusing on active stars with long time series, such as Alpha Cen B. In that case we demonstrate that the breakpoint method improves the detection limit of exoplanets by 74% on average with respect to the overall correction method.
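
A hand-rolled sketch of the idea: a single least-squares change point in the RV-versus-indicator relation, followed by segment-wise activity correction. The paper uses established CPD algorithms rather than this toy exhaustive search, and all simulated quantities below are assumptions:

```python
import numpy as np

def best_breakpoint(rv, indicator, min_seg=10):
    """Single change point minimising the total residual sum of squares of
    separate linear RV-vs-indicator fits on the two segments."""
    n = len(rv)
    def rss(lo, hi):
        A = np.column_stack([indicator[lo:hi], np.ones(hi - lo)])
        _, res, *_ = np.linalg.lstsq(A, rv[lo:hi], rcond=None)
        return res[0] if res.size else 0.0
    costs = [rss(0, k) + rss(k, n) for k in range(min_seg, n - min_seg)]
    return min_seg + int(np.argmin(costs))

def segmented_correction(rv, indicator, breakpoints):
    """Remove the fitted activity contribution separately on each segment,
    instead of one overall linear correction."""
    resid = np.empty_like(rv)
    edges = [0, *breakpoints, len(rv)]
    for lo, hi in zip(edges[:-1], edges[1:]):
        A = np.column_stack([indicator[lo:hi], np.ones(hi - lo)])
        coef, *_ = np.linalg.lstsq(A, rv[lo:hi], rcond=None)
        resid[lo:hi] = rv[lo:hi] - A @ coef
    return resid

rng = np.random.default_rng(18)
n = 200
indicator = rng.standard_normal(n)
# The RV-indicator slope flips at index 120 (an evolving active region).
rv = np.where(np.arange(n) < 120, 3.0, -1.0) * indicator \
     + 0.5 * rng.standard_normal(n)
k = best_breakpoint(rv, indicator)
clean_rv = segmented_correction(rv, indicator, [k])
print(k)   # recovered change point, close to 120
```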

Read this paper on arXiv…

U. Simola, A. Bonfanti, X. Dumusque, et al.
Tue, 24 May 22
48/92

Comments: 31 pages, 18 Figures

A statistical primer on exoplanet detection methods [EPA]

http://arxiv.org/abs/2205.10417


Historically, a lack of cross-disciplinary communication has led to the development of statistical methods for detecting exoplanets by astronomers, independent of the contemporary statistical literature. The aim of our paper is to investigate the properties of such methods. Many of these methods (both transit- and radial velocity-based) have not been discussed by statisticians despite their use in thousands of astronomical papers. Transit methods aim to detect a planet by determining whether observations of a star contain a periodic component. These methods tend to be overly rudimentary for starlight data and lack robustness to model misspecification. Conversely, radial velocity methods aim to detect planets by estimating the Doppler shift induced by an orbiting companion on the spectrum of a star. Many such methods are unable to detect Doppler shifts on the order of magnitude consistent with Earth-sized planets around Sun-like stars. Modern radial velocity approaches attempt to address this deficiency by adapting tools from contemporary statistical research in functional data analysis, but more work is needed to develop the statistical theory supporting the use of these models, to expand these models for multiplanet systems, and to develop methods for detecting ever smaller Doppler shifts in the presence of stellar activity.

Read this paper on arXiv…

N. Giertych, J. Williams and P. Haravu
Tue, 24 May 22
56/92

Comments: N/A

Wavelet Moments for Cosmological Parameter Estimation [CEA]

http://arxiv.org/abs/2204.07646


Extracting non-Gaussian information from the non-linear regime of structure formation is key to fully exploiting the rich data from upcoming cosmological surveys probing the large-scale structure of the universe. However, due to theoretical and computational complexities, this remains one of the main challenges in analyzing observational data. We present a set of summary statistics for cosmological matter fields based on 3D wavelets to tackle this challenge. These statistics are computed as the spatial average of the complex modulus of the 3D wavelet transform raised to a power $q$ and are therefore known as invariant wavelet moments. The 3D wavelets are constructed to be radially band-limited and separable on a spherical polar grid and come in three types: isotropic, oriented, and harmonic. In the Fisher forecast framework, we evaluate the performance of these summary statistics on matter fields from the Quijote suite, where they are shown to reach state-of-the-art parameter constraints on the base $\Lambda$CDM parameters, as well as the sum of neutrino masses. We show that we can improve constraints by a factor of 5 to 10 in all parameters with respect to the power spectrum baseline.
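
The definition of the summary statistic is simple enough to sketch directly: filter the field with a band-limited kernel, take the complex modulus, raise it to a power q, and average over space. Below, a sharp isotropic Fourier shell stands in for the paper's radially band-limited 3D wavelets (which also come in oriented and harmonic variants), so this is only a simplified illustration:

```python
import numpy as np

def invariant_wavelet_moment(field, k_lo, k_hi, q):
    """Spatial average of |band-limited filtered field|^q; a top-hat
    isotropic Fourier shell plays the role of one wavelet scale."""
    grids = np.meshgrid(*[np.fft.fftfreq(s) for s in field.shape],
                        indexing="ij")
    kmag = np.sqrt(sum(g**2 for g in grids))
    band = (kmag >= k_lo) & (kmag < k_hi)
    filtered = np.fft.ifftn(np.fft.fftn(field) * band)
    return np.mean(np.abs(filtered)**q)

rng = np.random.default_rng(17)
delta = rng.standard_normal((64, 64, 64))   # stand-in matter overdensity field
moments = [invariant_wavelet_moment(delta, 0.05, 0.15, q)
           for q in (0.5, 1.0, 2.0)]
print(moments)   # q != 2 probes information beyond the power spectrum
```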

Read this paper on arXiv…

M. Eickenberg, E. Allys, A. Dizgah, et al.
Tue, 19 Apr 22
47/52

Comments: N/A

Light from the Darkness: Detecting Ultra-Diffuse Galaxies in the Perseus Cluster through Over-densities of Globular Clusters with a Log-Gaussian Cox Process [GA]

http://arxiv.org/abs/2204.05487


We introduce a new method for detecting ultra-diffuse galaxies by searching for over-densities in intergalactic globular cluster populations. Our approach is based on an application of the log-Gaussian Cox process, which is a commonly used model in the spatial statistics literature but rarely used in astronomy. This method is applied to the globular cluster data obtained from the PIPER survey, a Hubble Space Telescope imaging program targeting the Perseus cluster. We successfully detect all confirmed ultra-diffuse galaxies with known globular cluster populations in the survey. We also identify a potential galaxy that has no detected diffuse stellar content. Preliminary analysis shows that it is unlikely to be merely an accidental clump of globular clusters or other objects. If confirmed, this system would be the first of its kind. Simulations are used to assess how the physical parameters of the globular cluster systems within ultra-diffuse galaxies affect their detectability using our method. We quantify the correlation of the detection probability with the total number of globular clusters in the galaxy and the anti-correlation with increasing half-number radius of the globular cluster system. The Sérsic index of the globular cluster distribution has little impact on detectability.

Read this paper on arXiv…

D. Dayi, G. Eadie, R. Abraham, et al.
Wed, 13 Apr 22
50/73

Comments: 35 pages, 17 figures

Differentiating small-scale subhalo distributions in CDM and WDM models using persistent homology [IMA]

http://arxiv.org/abs/2204.00443


The spatial distribution of galaxies at sufficiently small scales will encode information about the identity of the dark matter. We develop a novel description of the halo distribution using persistent homology summaries, in which collections of points are decomposed into clusters, loops and voids. We apply these methods, together with a set of hypothesis tests, to dark matter haloes in MW-analog environment regions of the cold dark matter (CDM) and warm dark matter (WDM) Copernicus Complexio $N$-body cosmological simulations. The hypothesis tests find statistically significant differences (p-values $\leq$ 0.001) between the CDM and WDM structures, and the functional summaries of persistence diagrams detect differences at scales distinct from those probed by the comparison spatial point-process functional summaries considered (including the two-point correlation function). The differences between the models are driven most strongly at filtration scales of ~100 kpc, where CDM generates larger numbers of unconnected halo clusters while WDM instead generates loops. This study was conducted on dark matter haloes generally; future work will involve applying the same methods to realistic galaxy catalogues.
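
A sketch of a persistent-homology comparison of two point clouds using the ripser package (the paper does not specify its software stack, and the point sets below are synthetic stand-ins for halo positions):

```python
import numpy as np
from ripser import ripser   # pip install ripser

rng = np.random.default_rng(5)
# Stand-ins for halo positions (kpc): one clumpy set, one loop-like set.
clumps = np.concatenate([c + 20 * rng.standard_normal((50, 3))
                         for c in rng.uniform(0, 1000, (8, 3))])
theta = rng.uniform(0, 2 * np.pi, 400)
loop = np.column_stack([500 + 100 * np.cos(theta),
                        500 + 100 * np.sin(theta),
                        20 * rng.standard_normal(400)])

# Persistence diagrams up to dimension 1 (H0 = clusters, H1 = loops).
dgms_clumps = ripser(clumps, maxdim=1)["dgms"]
dgms_loop = ripser(loop, maxdim=1)["dgms"]

def max_persistence(dgm):
    """Lifetime (death - birth) of the longest-lived feature."""
    return (dgm[:, 1] - dgm[:, 0]).max() if len(dgm) else 0.0

# The loop-like set carries a long-lived H1 feature; the clumpy set does not.
print(max_persistence(dgms_clumps[1]), max_persistence(dgms_loop[1]))
```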

Read this paper on arXiv…

J. Cisewski-Kehe, B. Fasy, W. Hellwing, et al.
Mon, 4 Apr 22
5/50

Comments: 17 pages, 11 figures

A continuous multiple hypothesis testing framework for optimal exoplanet detection [IMA]

http://arxiv.org/abs/2203.04957


The detection of exoplanets is hindered by the presence of complex astrophysical and instrumental noises. Given the difficulty of the task, it is important to ensure that the data are exploited to their fullest potential. In the present work, we search for an optimal exoplanet detection criterion. We adopt a general Bayesian multiple hypothesis testing framework, where the hypotheses are indexed by continuous variables. This framework is adaptable to the different observational methods used to detect exoplanets as well as other data analysis problems. We describe the data as a combination of several parametrized patterns and nuisance signals. We wish to determine which patterns are present, and for a detection to be valid, the parameters of the claimed pattern have to correspond to a true one with a certain accuracy. We search for a detection criterion minimizing false and missed detections, either as a function of their relative cost, or when the expected number of false detections is bounded. We find that if the patterns can be separated in a technical sense, the two approaches lead to the same optimal procedure. We apply it to the retrieval of periodic signals in unevenly sampled time series, emulating the search for exoplanets in radial velocity data. We show on a simulation that, for a given tolerance to false detections, the new criterion leads to 15 to 30% more true detections than other criteria, including the Bayes factor.

Read this paper on arXiv…

N. Hara, T. Poyferré, J. Delisle, et al.
Thu, 10 Mar 22
1/60

Comments: Submitted to Annals of Applied Statistics

High Dimensional Statistical Analysis and its Application to ALMA Map of NGC 253 [IMA]

http://arxiv.org/abs/2203.04535


In astronomy, if we denote the dimension of the data as $d$ and the number of samples as $n$, we often encounter the case $n \ll d$. Traditionally, such a situation is regarded as ill-posed, and there was no choice but to discard most of the information in the data dimension to achieve $d < n$. Data with $n \ll d$ are referred to as high-dimensional low sample size (HDLSS). To deal with HDLSS problems, a methodology called high-dimensional statistics has developed rapidly in the last decade. In this work, we first introduce high-dimensional statistical analysis to the astronomical community. We apply two representative methods, noise-reduction principal component analysis (NRPCA) and regularized principal component analysis (RPCA), to a spectroscopic map of the nearby archetypal starburst galaxy NGC 253 taken by the Atacama Large Millimeter/submillimeter Array (ALMA). The ALMA map is a typical HDLSS dataset. First, we analyze the original data, including the Doppler shift due to the systemic rotation; the high-dimensional PCA describes the spatial structure of the rotation precisely. We then apply the methods to the Doppler-shift-corrected data to analyze more subtle spectral features. The NRPCA and RPCA quantify the very complicated characteristics of the ALMA spectra; in particular, we can extract information on the global outflow from the center of NGC 253. This method can be applied not only to spectroscopic survey data but also to any type of data with small sample size and large dimension.

Read this paper on arXiv…

T. Takeuchi, K. Yata, M. Aoshima, et al.
Thu, 10 Mar 22
33/60

Comments: 25 pages, 21 figures, submitted

GIGA-Lens: Fast Bayesian Inference for Strong Gravitational Lens Modeling [IMA]

http://arxiv.org/abs/2202.07663


We present GIGA-Lens: a gradient-informed, GPU-accelerated Bayesian framework for modeling strong gravitational lensing systems, implemented in TensorFlow and JAX. The three components, optimization using multi-start gradient descent, posterior covariance estimation with variational inference, and sampling via Hamiltonian Monte Carlo, all take advantage of gradient information through automatic differentiation and massive parallelization on graphics processing units (GPUs). We test our pipeline on a large set of simulated systems and demonstrate in detail its high level of performance. The average time to model a single system on four Nvidia A100 GPUs is 105 seconds. The robustness, speed, and scalability offered by this framework make it possible to model the large number of strong lenses found in current surveys and present a very promising prospect for the modeling of $\mathcal{O}(10^5)$ lensing systems expected to be discovered in the era of the Vera C. Rubin Observatory, Euclid, and the Nancy Grace Roman Space Telescope.
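
The first of the three components, multi-start gradient descent, amounts to batching many optimizations into one vectorized loop; a numpy toy version is below. The paper obtains gradients by automatic differentiation in TensorFlow/JAX and parallelizes on GPUs, neither of which is reproduced here, and the objective is an illustrative stand-in:

```python
import numpy as np

CENTER = np.array([3.0, -2.0])   # global minimum of the toy objective

def loss(x):
    """Toy multi-modal negative log-posterior."""
    return np.sum((x - CENTER)**2 - 0.2 * np.cos(5.0 * (x - CENTER)), axis=-1)

def grad(x):
    return 2.0 * (x - CENTER) + np.sin(5.0 * (x - CENTER))

def multi_start_gd(x0, lr=0.02, steps=1000):
    """Gradient descent from many starts at once: the batch dimension plays
    the role of GPU parallelization over initializations."""
    x = x0.copy()
    for _ in range(steps):
        x -= lr * grad(x)
    return x

rng = np.random.default_rng(15)
starts = rng.uniform(-10.0, 10.0, (256, 2))   # 256 random initializations
solutions = multi_start_gd(starts)
best = solutions[np.argmin(loss(solutions))]  # keep the best mode found
print(best)
```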

Read this paper on arXiv…

A. Gu, X. Huang, W. Sheu, et al.
Thu, 17 Feb 22
45/60

Comments: 23 pages, 13 figures, 2 tables. Submitted to ApJ

Mapping Interstellar Dust with Gaussian Processes [GA]

http://arxiv.org/abs/2202.06797


Interstellar dust corrupts nearly every stellar observation, and accounting for it is crucial to measuring physical properties of stars. We model the dust distribution as a spatially varying latent field with a Gaussian process (GP) and develop a likelihood model and inference method that scales to millions of astronomical observations. Modeling interstellar dust is complicated by two factors. The first is integrated observations. The data come from a vantage point on Earth and each observation is an integral of the unobserved function along our line of sight, resulting in a complex likelihood and a more difficult inference problem than in classical GP inference. The second complication is scale; stellar catalogs have millions of observations. To address these challenges we develop ziggy, a scalable approach to GP inference with integrated observations based on stochastic variational inference. We study ziggy on synthetic data and the Ananke dataset, a high-fidelity mechanistic model of the Milky Way with millions of stars. ziggy reliably infers the spatial dust map with well-calibrated posterior uncertainties.

Read this paper on arXiv…

A. Miller, L. Anderson, B. Leistedt, et al.
Tue, 15 Feb 22
75/75

Comments: N/A

Statistical Tools for Imaging Atmospheric Cherenkov Telescopes [IMA]

http://arxiv.org/abs/2202.04590


The development of Imaging Atmospheric Cherenkov Telescopes (IACTs) unveiled the sky in the teraelectronvolt regime at the beginning of the new millennium, initiating the so-called “TeV revolution”. This revolution was also facilitated by the implementation and adaptation of statistical tools for analyzing the shower images collected by these telescopes and inferring the properties of the astrophysical sources that produce such events. Image reconstruction techniques, background discrimination, and signal-detection analyses are just a few of the pioneering methods applied in recent decades to the analysis of IACT data. This succinct review summarizes the most common statistical tools used for analyzing data collected with IACTs, focusing on their application across the full analysis chain and providing references to the existing literature for deeper examination.
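
As one concrete example of the signal-detection analyses such a review covers, a staple of IACT data analysis is the Li & Ma (1983) significance of a gamma-ray excess in on/off observations; whether the review treats this exact estimator is not confirmed here, so take it as a representative illustration:

```python
import numpy as np

def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983, ApJ 272, 317), eq. 17: significance of an excess
    given on-source counts, off-source counts, and the ratio of on/off
    exposures alpha."""
    term_on = n_on * np.log((1 + alpha) / alpha * n_on / (n_on + n_off))
    term_off = n_off * np.log((1 + alpha) * n_off / (n_on + n_off))
    return np.sqrt(2.0 * (term_on + term_off))

# 130 on-source counts against an expected background of 0.2 * 500 = 100.
print(li_ma_significance(n_on=130, n_off=500, alpha=0.2))
```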

Read this paper on arXiv…

G. D’Amico
Thu, 10 Feb 22
23/66

Comments: 25 pages, 10 figures, published in MDPI – Universe Special Issue “High-Energy Gamma-Ray Astronomy: Results on Fundamental Questions after 30 Years of Ground-Based Observations”, 29 January 2022

Upward lightning at tall structures: Atmospheric drivers for trigger mechanisms and flash type [CL]

http://arxiv.org/abs/2201.05663


Despite its scarcity, upward lightning initiated from tall structures causes more damage than common downward lightning. One particular subtype, carrying only a continuous current, is not detectable by conventional lightning location systems (LLS), significantly reducing the detection efficiency. Upward lightning has become a major concern due to the recent push in the field of renewable wind energy generation: the growing number of tall wind turbines has increased lightning-related damage. Upward lightning may be initiated by the tall structure triggering the flash itself (self-triggered) or by a flash striking close by (other-triggered).
The major objective of this study is to find the atmospheric conditions that drive whether an upward flash is self-triggered or other-triggered and whether it is of the undetectable subtype. We analyze upward flashes directly measured at the Gaisberg Tower in Salzburg (Austria) between 2000 and 2015. These upward flashes are combined with atmospheric reanalysis data stratified into five main meteorological groups: cloud physics, mass field, moisture field, surface exchange, and wind field. We use classification methods based on tree-structured ensembles in the form of conditional random forests. From these random forests we assess the meteorological influence and find the most important atmospheric drivers for each type of event.
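
The paper uses conditional random forests; as an analogous Python sketch, an ordinary random forest with permutation importance conveys the same workflow of ranking atmospheric drivers. The covariates, labels, and group names below are synthetic assumptions:

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(6)
n = 2000
# Stand-in covariates for the five meteorological groups (illustrative).
X = rng.standard_normal((n, 5))
logit = 1.5 * X[:, 0] - 1.0 * X[:, 4] + 0.5 * rng.standard_normal(n)
y = (logit > 0).astype(int)          # 1 = self-triggered, 0 = other-triggered

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
rf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_tr, y_tr)
imp = permutation_importance(rf, X_te, y_te, n_repeats=20, random_state=0)
for name, score in zip(["cloud", "mass", "moisture", "surface", "wind"],
                       imp.importances_mean):
    print(f"{name:8s} {score:.3f}")   # drivers ranked by held-out importance
```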

Read this paper on arXiv…

I. Stucke, D. Morgenstern, A. Zeileis, et al.
Wed, 19 Jan 22
54/121

Comments: N/A

Real-time Detection of Anomalies in Multivariate Time Series of Astronomical Data [CL]

http://arxiv.org/abs/2112.08415


Astronomical transients are stellar objects that become temporarily brighter on various timescales and have led to some of the most significant discoveries in cosmology and astronomy. Some of these transients are the explosive deaths of stars known as supernovae while others are rare, exotic, or entirely new kinds of exciting stellar explosions. New astronomical sky surveys are observing unprecedented numbers of multi-wavelength transients, making standard approaches of visually identifying new and interesting transients infeasible. To meet this demand, we present two novel methods that aim to quickly and automatically detect anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model.

Read this paper on arXiv…

D. Muthukrishna, K. Mandel, M. Lochner, et al.
Fri, 17 Dec 21
43/72

Comments: 9 pages, 5 figures, Accepted at the NeurIPS 2021 workshop on Machine Learning and the Physical Sciences

Euclid: Covariance of weak lensing pseudo-$C_\ell$ estimates. Calculation, comparison to simulations, and dependence on survey geometry [CEA]

http://arxiv.org/abs/2112.07341


An accurate covariance matrix is essential for obtaining reliable cosmological results when using a Gaussian likelihood. In this paper we study the covariance of pseudo-$C_\ell$ estimates of tomographic cosmic shear power spectra. Using two existing publicly available codes in combination, we calculate the full covariance matrix, including mode-coupling contributions arising from both partial sky coverage and non-linear structure growth. For three different sky masks, we compare the theoretical covariance matrix to that estimated from publicly available N-body weak lensing simulations, finding good agreement. We find that as a more extreme sky cut is applied, a corresponding increase in both Gaussian off-diagonal covariance and non-Gaussian super-sample covariance is observed in both theory and simulations, in accordance with expectations. Studying the different contributions to the covariance in detail, we find that the Gaussian covariance dominates along the main diagonal and the closest off-diagonals, but further away from the main diagonal the super-sample covariance is dominant. Forming mock constraints in parameters describing matter clustering and dark energy, we find that neglecting non-Gaussian contributions to the covariance can lead to underestimating the true size of confidence regions by up to 70 per cent. The dominant non-Gaussian covariance component is the super-sample covariance, but neglecting the smaller connected non-Gaussian covariance can still lead to the underestimation of uncertainties by 10–20 per cent. A real cosmological analysis will require marginalisation over many nuisance parameters, which will decrease the relative importance of all cosmological contributions to the covariance, so these values should be taken as upper limits on the importance of each component.
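
For orientation, the Gaussian contribution that dominates the main diagonal reduces, in the common $f_{\mathrm{sky}}$ approximation for bandpowers of width $\Delta\ell$, to the standard expression below; the paper computes the full mode-coupled version rather than this approximation:

$$ \mathrm{Cov}\big(\hat{C}_\ell^{ij}, \hat{C}_{\ell'}^{mn}\big) \approx \frac{\delta_{\ell\ell'}}{(2\ell+1)\, f_{\mathrm{sky}}\, \Delta\ell} \left( C_\ell^{im} C_\ell^{jn} + C_\ell^{in} C_\ell^{jm} \right), $$

where $i, j, m, n$ label tomographic bins and the $C_\ell$ include noise. The $\delta_{\ell\ell'}$ makes this strictly diagonal; the mode-coupling, super-sample, and connected non-Gaussian terms quantified in the paper add off-diagonal structure on top of it.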

Read this paper on arXiv…

R. Upham, M. Brown, L. Whittaker, et al.
Wed, 15 Dec 21
72/85

Comments: 15 pages, 8 figures; submitted to A&A; code available at this https URL

Amended Gibbs samplers for Cosmic Microwave Background power spectrum estimation [CEA]

http://arxiv.org/abs/2111.07664


We study different variants of the Gibbs sampler algorithm from the perspective of their applicability to the estimation of power spectra of the cosmic microwave background (CMB) anisotropies. We focus on approaches that aim to reduce the cost of the computationally heavy constrained-realization step and that capitalize on the interweaving strategy to ensure that our algorithms mix well for both high and low signal-to-noise ratio components. In particular, we propose an approach, which we refer to as Centered overrelax, that avoids the constrained-realization step completely at the cost of additional auxiliary variables and the need for overrelaxation. We demonstrate these variants and compare their merits on full- and cut-sky simulations, quantifying their performance in terms of effective sample size (ESS) per second.
We find that on nearly full-sky, satellite-like data, the proposed Gibbs sampler with overrelaxation performs between one and two orders of magnitude better than the usual Gibbs sampler, potentially providing an interesting alternative to the currently favored approaches.

Read this paper on arXiv…

G. Ducrocq, N. Chopin, J. Errard, et al.
Tue, 16 Nov 21
22/97

Comments: N/A

Reproducing size distributions of swarms of barchan dunes on Mars and Earth using a mean-field model [CL]

http://arxiv.org/abs/2110.15850


We apply a mean-field model of interactions between migrating barchan dunes, the CAFE model, which includes calving, aggregation, fragmentation, and mass exchange, yielding a steady-state size distribution that can be resolved for different choices of interaction parameters. The CAFE model is applied to empirically measured distributions of dune sizes in two barchan swarms on Mars, three swarms in Morocco, and one in Mauritania, each containing ~1000 bedforms, comparing the observed size distributions to the steady states of the CAFE model. We find that the distributions in the Martian swarms are very similar to the swarm measured in Mauritania, suggesting that the two very different planetary environments nevertheless share similar dune interaction dynamics. Optimisation of the model parameters of three specific configurations of the CAFE model shows that the fit of the theoretical steady state is often superior to the typically assumed log-normal. In all cases, the optimised parameters indicate that mass exchange is the most frequent type of interaction. Calving is found to occur rarely in most of the swarms, with a highest rate of only 9% of events, showing that interactions between multiple dunes, rather than spontaneous calving, are the driver of barchan size distributions. Finally, the implementation of interaction parameters derived from 3D simulations of dune-pair collisions indicates that sand flux between dunes is more important in producing the size distributions of the Moroccan swarms than of those in Mauritania and on Mars.

Read this paper on arXiv…

D. Robson, A. Annibale and A. Baas
Mon, 1 Nov 21
21/58

Comments: 30 Pages, 10 figures, Submitted to Physica A: Statistical Mechanics and its Applications

Clearing the hurdle: The mass of globular cluster systems as a function of host galaxy mass [GA]

http://arxiv.org/abs/2110.15376


Current observational evidence suggests that all large galaxies contain globular clusters (GCs), while the smallest galaxies do not. Over what galaxy mass range does the transition from GCs to no GCs occur? We investigate this question using galaxies in the Local Group, nearby dwarf galaxies, and galaxies in the Virgo Cluster Survey. We consider four types of statistical models: (1) logistic regression to model the probability that a galaxy of stellar mass $M_{\star}$ has any number of GCs; (2) Poisson regression to model the number of GCs versus $M_{\star}$; (3) linear regression to model the relation between GC system mass ($\log{M_{gcs}}$) and host galaxy mass ($\log{M_{\star}}$); and (4) a Bayesian lognormal hurdle model of the GC system mass as a function of galaxy stellar mass for the entire data sample. From the logistic regression, we find that the 50% probability point for a galaxy to contain GCs is $M_{\star}=10^{6.8}M_{\odot}$. From post-fit diagnostics, we find that Poisson regression is an inappropriate description of the data. Ultimately, we find that the Bayesian lognormal hurdle model, which is able to describe how the mass of the GC system varies with $M_{\star}$ even in the presence of many galaxies with no GCs, is the most appropriate model over the range of our data. In an Appendix, we also present photometry for the little-known GC in the Local Group dwarf Ursa Major II.
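
A stripped-down, non-Bayesian sketch of the hurdle idea: a logistic model for whether a galaxy hosts any GCs at all, and a linear model for the log GC system mass where GCs are present. The paper fits a Bayesian lognormal hurdle model instead, and the data and coefficients below are synthetic:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression, LinearRegression

rng = np.random.default_rng(7)
n = 1000
log_mstar = rng.uniform(5, 11, n)                     # log10 galaxy stellar mass
p_has_gc = 1 / (1 + np.exp(-2 * (log_mstar - 6.8)))   # 50% point near 10^6.8 Msun
has_gc = rng.random(n) < p_has_gc
log_mgcs = log_mstar - 4 + 0.4 * rng.standard_normal(n)  # lognormal part

Xm = log_mstar.reshape(-1, 1)
# Hurdle part 1: probability that a galaxy hosts any GCs at all.
clf = LogisticRegression().fit(Xm, has_gc)
# Hurdle part 2: GC system mass, conditional on hosting GCs.
reg = LinearRegression().fit(Xm[has_gc], log_mgcs[has_gc])

m50 = -clf.intercept_[0] / clf.coef_[0, 0]   # mass at 50% occupation probability
print(f"50% occupation at log10 M* = {m50:.2f}")
```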

Read this paper on arXiv…

G. Eadie, W. Harris and A. Springford
Mon, 1 Nov 21
40/58

Comments: accepted to ApJ; 25 pages, 12 figures, 6 tables

Spatiotemporal Characterization of VIIRS Night Light [CL]

http://arxiv.org/abs/2109.06913


The VIIRS Day Night Band sensor on the Suomi NPP satellite provides almost a decade of observations of night light. The daily frequency of sampling, without the temporal averaging of annual composites, requires the distinction between apparent changes of imaged night light related to the imaging process and actual changes in the underlying sources of the light being imaged. This study characterizes night light variability over a range of spatial and temporal scales to provide a context for interpretation of changes on both subannual and interannual time scales. The analysis uses a combination of temporal moments, spatial correlation and Empirical Orthogonal Function (EOF) analysis. A key result is the pervasive heteroskedasticity of VIIRS monthly mean night light: temporal variability decreases monotonically with increasing mean brightness. Anthropogenic night light is remarkably stable on subannual time scales. The overall variance partition derived from the eigenvalues of the spatiotemporal covariance matrix is 88%, 2% and 2% for spatial, seasonal and interannual variance in the most diverse geographic region on Earth (Eurasia). Heteroskedasticity is present in all areas for all months, suggesting that much, if not most, of the observed month-to-month variability may result from luminance of otherwise stable sources subjected to multiple aspects of the imaging process varying in time. Given the skewed distribution of all night light arising from radial peripheral dimming of bright sources, even aggregate metrics using thresholds must be interpreted in light of the fact that much larger numbers of more variable low-luminance pixels may statistically overwhelm smaller numbers of stable higher-luminance pixels, causing apparent changes related to the imaging process to be interpreted as actual changes in the light sources.
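
EOF analysis, as used here, is PCA of the space-time data matrix, with the eigenvalue spectrum supplying the variance partition. A generic numpy sketch on synthetic monthly composites (not VIIRS data) is:

```python
import numpy as np

rng = np.random.default_rng(14)
nt, npix = 120, 4000                     # 120 monthly composites, 4000 pixels
data = rng.gamma(2.0, 1.0, (nt, npix))   # stand-in monthly mean radiances

anom = data - data.mean(axis=0)          # remove the temporal mean per pixel
U, s, Vt = np.linalg.svd(anom, full_matrices=False)
variance_fraction = s**2 / (s**2).sum()  # eigenvalue-based variance partition
eofs = Vt                                # spatial patterns (EOFs)
pcs = U * s                              # associated temporal amplitudes
print(variance_fraction[:3])
```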

Read this paper on arXiv…

C. Small
Thu, 16 Sep 21
24/54

Comments: 18 pages, 6 figures

Self-Calibrating the Look-Elsewhere Effect: Fast Evaluation of the Statistical Significance Using Peak Heights [IMA]

http://arxiv.org/abs/2108.06333


In experiments where one searches a large parameter space for an anomaly, one often finds many spurious noise-induced peaks in the likelihood. This is known as the look-elsewhere effect, and it must be corrected for when performing statistical analysis. This paper introduces a method to calibrate the false alarm probability (FAP), or $p$-value, for a given dataset by considering the heights of the highest peaks in the likelihood. In the simplest form of self-calibration, the look-elsewhere-corrected $\chi^2$ of a physical peak is approximated by the $\chi^2$ of the peak minus the $\chi^2$ of the highest noise-induced peak. Generalizing this concept to consider lower peaks provides a fast method to quantify the statistical significance with improved accuracy. In contrast to alternative methods, this approach has negligible computational cost, as peaks in the likelihood are a byproduct of every peak-search analysis. We apply the method to examples from astronomy, including planet detection, periodograms, and cosmology.
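
The simplest form of the proposed self-calibration is easy to sketch: take the peak heights already produced by the search, and subtract the height of the next-highest peak as an in-situ estimate of the largest noise-induced peak. The normalization and the use of a plain FFT periodogram below are illustrative assumptions:

```python
import numpy as np
from scipy.signal import find_peaks

rng = np.random.default_rng(8)
n = 2048
t = np.arange(n)
y = 0.25 * np.sin(2 * np.pi * t / 37.0) + rng.standard_normal(n)

# For unit-variance white noise, the chi^2 improvement of a sinusoid fit at
# each Fourier frequency is proportional to the periodogram power.
power = 2.0 * np.abs(np.fft.rfft(y - y.mean()))**2 / n
peaks, _ = find_peaks(power)
heights = np.sort(power[peaks])[::-1]

# Simplest self-calibration: corrected significance of the top peak uses
# the next-highest peak as the estimate of the highest noise-induced peak.
delta_chi2 = heights[0] - heights[1]
print(f"raw peak chi2 ~ {heights[0]:.1f}, corrected ~ {delta_chi2:.1f}")
```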

Read this paper on arXiv…

A. Bayer, U. Seljak and J. Robnik
Mon, 16 Aug 21
1/34

Comments: 12 pages, 7 figures

A novel approach to asteroid impact monitoring and hazard assessment [EPA]

http://arxiv.org/abs/2108.03201


Orbit-determination programs find the orbit solution that best fits a set of observations by minimizing the RMS of the residuals of the fit. For near-Earth asteroids, the uncertainty of the orbit solution may be compatible with trajectories that impact Earth. This paper shows how incorporating the impact condition as an observation in the orbit-determination process results in a robust technique for finding the regions in parameter space leading to impacts. The impact pseudo-observation residuals are the b-plane coordinates at the time of close approach and the uncertainty is set to a fraction of the Earth radius. The extended orbit-determination filter converges naturally to an impacting solution if allowed by the observations. The uncertainty of the resulting orbit provides an excellent geometric representation of the virtual impactor. As a result, the impact probability can be efficiently estimated by exploring this region in parameter space using importance sampling. The proposed technique can systematically handle a large number of estimated parameters, account for nongravitational forces, deal with nonlinearities, and correct for non-Gaussian initial uncertainty distributions. The algorithm has been implemented into a new impact monitoring system at JPL called Sentry-II, which is undergoing extensive testing. The main advantages of Sentry-II over JPL’s currently operating impact monitoring system Sentry are that Sentry-II can systematically process orbits perturbed by nongravitational forces and that it is generally more robust when dealing with pathological cases. The runtimes and completeness of both systems are comparable, with the impact probability of Sentry-II for 99% completeness being $3\times10^{-7}$.

Read this paper on arXiv…

J. Roa, D. Farnocchia and S. Chesley
Mon, 9 Aug 21
34/51

Comments: 19 pages, 13 figures. To be published in the Astronomical Journal

Impact of Scene-Specific Enhancement Spectra on Matched Filter Greenhouse Gas Retrievals from Imaging Spectroscopy [CL]

http://arxiv.org/abs/2107.05578


Matched filter (MF) techniques have been widely used for the retrieval of greenhouse gas enhancements from imaging spectroscopy datasets. While multiple algorithmic techniques and refinements have been proposed, the greenhouse gas target spectrum used for concentration enhancement estimation has remained largely unaltered since the introduction of quantitative MF retrievals. The magnitude of retrieved methane and carbon dioxide enhancements, and thereby the integrated mass enhancement (IME) and estimated flux of point-source emitters, is heavily dependent on this target spectrum. The current standard use of molecular absorption coefficients to create unit enhancement target spectra does not account for absorption by background concentrations of greenhouse gases, solar and sensor geometry, or atmospheric water vapor absorption. We introduce geometric and atmospheric parameters into the generation of scene-specific (SS) unit enhancement spectra to provide target spectra that are compatible with all greenhouse gas retrieval MF techniques. For methane plumes, the IME resulting from use of standard, generic enhancement spectra varied from -22 to +28.7% compared to SS enhancement spectra. Due to differences in spectral shape between the generic and SS enhancement spectra, differences in methane plume IME were linked to surface spectral characteristics in addition to geometric and atmospheric parameters. IME differences were larger for carbon dioxide plumes, with generic enhancement spectra producing integrated mass enhancements of -76.1 to -48.1% compared to SS enhancement spectra. Fluxes calculated from these integrated enhancements would vary by the same percentages, assuming equivalent wind conditions. Methane and carbon dioxide IME were most sensitive to changes in solar zenith angle and ground elevation. SS target spectra can improve confidence in greenhouse gas retrievals and flux estimates across collections of scenes with diverse geometric and atmospheric conditions.
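
The classical matched filter shared by these retrievals estimates a per-pixel enhancement by projecting the background-subtracted radiance onto the target spectrum, whitened by the background covariance. A generic numpy sketch follows; the target spectrum is exactly where the paper's scene-specific information enters, and the one used here is purely illustrative:

```python
import numpy as np

def matched_filter_enhancement(X, target):
    """Classical matched filter: per-pixel enhancement estimate
    alpha = t^T S^-1 (x - mu) / (t^T S^-1 t), with mu and S the background
    mean spectrum and covariance estimated from the scene."""
    mu = X.mean(axis=0)
    S = np.cov(X, rowvar=False) + 1e-6 * np.eye(X.shape[1])  # regularized
    Sinv_t = np.linalg.solve(S, target)
    return (X - mu) @ Sinv_t / (target @ Sinv_t)

rng = np.random.default_rng(9)
n_pix, n_bands = 5000, 40
background = rng.multivariate_normal(np.ones(n_bands),
                                     0.01 * np.eye(n_bands), n_pix)
# Illustrative unit-enhancement target spectrum; the paper's point is that
# this should fold in geometry, background gas, and water vapour rather
# than laboratory absorption coefficients alone.
target = -np.linspace(0.5, 1.5, n_bands)
alpha = matched_filter_enhancement(background, target)
print(alpha.mean(), alpha.std())   # ~0 mean for a pure-background scene
```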

Read this paper on arXiv…

M. Foote, P. Dennison, P. Sullivan, et al.
Tue, 13 Jul 21
43/79

Comments: 14 pages, 5 figures, 3 tables

Nonparametric monitoring of sunspot number observations: a case study [SSA]

http://arxiv.org/abs/2106.13535


Solar activity is an important driver of long-term climate trends and must be accounted for in climate models. Unfortunately, direct measurements of this quantity over long periods do not exist. Sunspots are the only observations related to solar activity whose records reach back to the seventeenth century. Surprisingly, determining the number of sunspots consistently over time has remained a challenging statistical problem to this day. It arises from the need to consolidate data from multiple observing stations around the world in a context of low signal-to-noise ratios, non-stationarity, missing data, non-standard distributions and many kinds of errors; the data from some stations therefore experience severe and varied deviations over time. In this paper, we propose the first systematic and thorough statistical approach for monitoring these complex and important series. It consists of three steps essential for the successful treatment of the data: smoothing on multiple timescales, monitoring using block-bootstrap-calibrated CUSUM charts, and classification of out-of-control situations by support vector techniques. This approach allows us to detect a wide range of anomalies (such as sudden jumps or more progressive drifts) unseen in previous analyses. It helps us identify the causes of major deviations, which are often observer- or equipment-related. Their detection and identification will help improve future observations; their elimination or correction in past data will lead to a more precise reconstruction of the world reference index for solar activity: the International Sunspot Number.
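
The monitoring step, CUSUM charts with a block-bootstrap-calibrated alarm threshold, can be sketched generically; the reference value, block length, and synthetic (standardized) series below are assumptions, not the paper's settings:

```python
import numpy as np

def cusum(x, k=0.5):
    """One-sided CUSUM statistic with reference value k (in sigma units)."""
    s = np.zeros(len(x) + 1)
    for i, xi in enumerate(x):
        s[i + 1] = max(0.0, s[i] + xi - k)
    return s[1:]

def block_bootstrap_threshold(x_in_control, n_boot=500, block=30, alpha=0.01):
    """Calibrate the alarm threshold on in-control data by moving-block
    bootstrap, which preserves the autocorrelation of the series."""
    rng = np.random.default_rng(0)
    n = len(x_in_control)
    maxima = np.empty(n_boot)
    for b in range(n_boot):
        starts = rng.integers(0, n - block, size=n // block + 1)
        resampled = np.concatenate(
            [x_in_control[s:s + block] for s in starts])[:n]
        maxima[b] = cusum(resampled).max()
    return np.quantile(maxima, 1 - alpha)

rng = np.random.default_rng(10)
series = rng.standard_normal(1000)
series[600:] += 0.8                      # a drift, e.g. equipment degradation
h = block_bootstrap_threshold(series[:500])
alarm = np.argmax(cusum(series) > h)     # first index exceeding the threshold
print(f"threshold {h:.2f}, alarm at t = {alarm}")
```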

Read this paper on arXiv…

S. Mathieu, L. Lefèvre, R. Sachs, et al.
Mon, 28 Jun 21
27/51

Comments: 27 pages (without appendices), 6 figures

Uncertainty Quantification of a Computer Model for Binary Black Hole Formation [IMA]

http://arxiv.org/abs/2106.01552


In this paper, a fast and parallelizable method based on Gaussian Processes (GPs) is introduced to emulate computer models that simulate the formation of binary black holes (BBHs) through the evolution of pairs of massive stars. Two obstacles that arise in this application are the a priori unknown conditions of BBH formation and the large scale of the simulation data. We address them by proposing a local emulator which combines a GP classifier and a GP regression model. The resulting emulator can also be utilized in planning future computer simulations through a proposed criterion for sequential design. By propagating uncertainties of simulation input through the emulator, we are able to obtain the distribution of BBH properties under the distribution of physical parameters.
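
A minimal scikit-learn sketch of the emulator's structure: a GP classifier for whether a binary black hole forms at all, combined with a GP regressor for the outcome where it does. The paper's emulator is local and custom-built, so this global toy version only illustrates the two-model composition; all data and kernels are assumptions:

```python
import numpy as np
from sklearn.gaussian_process import (GaussianProcessClassifier,
                                      GaussianProcessRegressor)
from sklearn.gaussian_process.kernels import RBF, ConstantKernel

rng = np.random.default_rng(11)
# Stand-in simulator inputs (e.g. scaled initial masses, metallicity).
X = rng.uniform(0, 1, (300, 2))
forms_bbh = X[:, 0] + 0.5 * X[:, 1] > 0.7            # a priori unknown region
chirp_mass = 20 + 10 * X[:, 0] + rng.normal(0, 0.5, 300)

# Part 1: GP classifier for whether a BBH forms at all.
clf = GaussianProcessClassifier(kernel=1.0 * RBF(0.2)).fit(X, forms_bbh)
# Part 2: GP regressor for the outcome, trained only where a BBH forms.
reg = GaussianProcessRegressor(kernel=ConstantKernel() * RBF(0.2),
                               alpha=0.25, normalize_y=True)
reg.fit(X[forms_bbh], chirp_mass[forms_bbh])

x_new = np.array([[0.8, 0.4]])
p_form = clf.predict_proba(x_new)[0, 1]
m_pred, m_sd = reg.predict(x_new, return_std=True)
print(f"P(BBH) = {p_form:.2f}, chirp mass = {m_pred[0]:.1f} +/- {m_sd[0]:.1f}")
```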

Read this paper on arXiv…

L. Lin, D. Bingham, F. Broekgaarden, et al.
Fri, 4 Jun 21
33/71

Comments: 24 pages, 11 figures

Quantifying the Similarity of Planetary System Architectures [EPA]

http://arxiv.org/abs/2106.00688


The planetary systems detected so far already exhibit a wide diversity of architectures, and various methods have been proposed to quantitatively study this diversity. Straightforward ways to quantify the difference between two systems, and more generally between two sets of multiplanetary systems, are useful tools in the study of this diversity. In this work we present a novel approach, using a Weighted extension of the Energy Distance (WED) metric, to quantify the difference between planetary systems on the logarithmic period-radius plane. We demonstrate the use of this metric and its relation to previously introduced descriptive measures to characterize the arrangements of Kepler planetary systems. By applying exploratory machine learning tools, we attempt to find whether there is some order that can be ascribed to the set of Kepler multiplanet system architectures. Based on WED, the ‘Sequencer’, an automatic tool of this kind, identifies a progression from small, compact planetary systems to systems with distant giant planets. It is reassuring to see that a WED-based tool indeed identifies this progression. Next, we extend WED to define the Inter-Catalogue Energy Distance (ICED), a distance metric between sets of multiplanetary systems. We have made the specific implementation presented in the paper available to the community through a public repository. We suggest using these metrics as complementary tools for comparing the architectures of planetary systems and, in general, catalogues of planetary systems.
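
The energy distance at the heart of WED is straightforward to compute between two weighted point sets; the sketch below uses the textbook definition with user-supplied weights and does not reproduce the paper's specific weighting scheme:

```python
import numpy as np
from scipy.spatial.distance import cdist

def weighted_energy_distance(X, Y, wx=None, wy=None):
    """Energy distance between two weighted point sets (here: the planets
    of two systems on the log period-radius plane). Uniform weights recover
    the standard energy distance; WED chooses non-uniform weights."""
    wx = np.full(len(X), 1 / len(X)) if wx is None else wx / wx.sum()
    wy = np.full(len(Y), 1 / len(Y)) if wy is None else wy / wy.sum()
    exy = wx @ cdist(X, Y) @ wy
    exx = wx @ cdist(X, X) @ wx
    eyy = wy @ cdist(Y, Y) @ wy
    return np.sqrt(2 * exy - exx - eyy)

# Two toy systems: (log10 period [d], log10 radius [R_earth]) per planet.
sys_a = np.log10([[3.0, 1.1], [7.2, 1.4], [15.1, 2.0]])   # compact multi
sys_b = np.log10([[1.9, 1.0], [300.0, 11.0]])             # inner planet + distant giant
print(weighted_energy_distance(sys_a, sys_b))
```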

Read this paper on arXiv…

D. Bashi and S. Zucker
Thu, 3 Jun 21
34/55

Comments: 8 pages, 5 figures, accepted for publication in A&A, usage examples at: this https URL

The Nanohertz Gravitational Wave Astronomer [HEAP]

http://arxiv.org/abs/2105.13270


Gravitational waves are a radically new way to peer into the darkest depths of the cosmos. Pulsars can be used to make direct detections of gravitational waves through precision timing. When a gravitational wave passes between a pulsar and the Earth, it stretches and squeezes the intermediate space-time, leading to deviations of the measured pulse arrival times away from model expectations. Combining the data from many Galactic pulsars can corroborate such a signal, and enhance its detection significance. This technique is known as a Pulsar Timing Array (PTA). Here I provide an overview of PTAs as a precision gravitational-wave detection instrument, then review the types of signal and noise processes that we encounter in typical pulsar data analysis. I take a pragmatic approach, illustrating how searches are performed in real life, and where possible directing the reader to codes or techniques that they can explore for themselves. The goal is to provide theoretical background and practical recipes for data exploration that allow the reader to join in the exciting hunt for very low frequency gravitational waves.

Read this paper on arXiv…

S. Taylor
Fri, 28 May 21
47/56

Comments: Draft of a short technical book to be published later this year by Taylor & Francis. 156 pages. Comments and errata are welcome

eBASCS: Disentangling Overlapping Astronomical Sources II, using Spatial, Spectral, and Temporal Information [IMA]

http://arxiv.org/abs/2105.08606


The analysis of individual X-ray sources that appear in a crowded field can easily be compromised by the misallocation of recorded events to their originating sources. Even with a small number of sources that nonetheless have overlapping point spread functions, the allocation of events to sources is a complex task that is subject to uncertainty. We develop a Bayesian method designed to sift high-energy photon events from multiple sources with overlapping point spread functions, leveraging the differences in their spatial, spectral, and temporal signatures. The method probabilistically assigns each event to a given source. Such a disentanglement allows more detailed spectral or temporal analysis to focus on the individual component in isolation, free of contamination from other sources or the background. We are also able to compute source parameters of interest, such as their locations, relative brightness, and background contamination, while accounting for the uncertainty in event assignments. Simulation studies that include event arrival time information demonstrate that the temporal component improves event disambiguation beyond using only spatial and spectral information. The proposed methods correctly allocate up to 65% more events than the corresponding algorithms that ignore event arrival time information. We apply our methods to two stellar X-ray binaries, UV Cet and HBC515 A, observed with Chandra. We demonstrate that our methods are capable of removing the contamination due to a strong flare on UV Cet B from its companion, which was approximately 40 times weaker during that event, and that evidence for spectral variability on timescales of a few kiloseconds can be established in HBC515 Aa and HBC515 Ab.
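
A minimal sketch of the core assignment step: posterior membership probabilities for events under two Gaussian PSFs plus a flat background. The positions, widths and weights are toy assumptions, and the spectral and temporal terms the paper uses are omitted.

    import numpy as np
    from scipy.stats import multivariate_normal

    # Toy photon event positions and two candidate source locations.
    rng = np.random.default_rng(1)
    events = np.vstack([rng.normal([0, 0], 0.5, (80, 2)),
                        rng.normal([1, 0], 0.5, (40, 2))])

    sources = [([0.0, 0.0], 0.5), ([1.0, 0.0], 0.5)]   # (position, PSF sigma)
    weights = np.array([0.6, 0.35, 0.05])              # two sources + background

    dens = np.column_stack(
        [multivariate_normal.pdf(events, mean=m, cov=s ** 2) for m, s in sources]
        + [np.full(len(events), 1.0 / 16.0)])          # flat background over a 4x4 field
    resp = weights * dens
    resp /= resp.sum(axis=1, keepdims=True)            # P(event i came from component k)
    print(resp[:3].round(3))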

Read this paper on arXiv…

A. Meyer, D. Dyk, V. Kashyap, et. al.
Wed, 19 May 21
31/64

Comments: N/A

Improving exoplanet detection capabilities with the false inclusion probability. Comparison with other detection criteria in the context of radial velocities [EPA]

http://arxiv.org/abs/2105.06995


Context. In exoplanet searches with radial velocity data, the most common statistical significance metrics are the Bayes factor and the false alarm probability (FAP). Both have proved useful, but neither directly addresses whether an exoplanet detection should be claimed. Furthermore, it is unclear which detection threshold should be taken and how robust the detections are to model misspecification. Aims. The present work aims at defining a detection criterion which conveys as precisely as possible the information needed to claim an exoplanet detection. We compare this new criterion to existing ones in terms of sensitivity and robustness. Methods. We define a significance metric called the false inclusion probability (FIP) based on the posterior probability of the presence of a planet. Posterior distributions are computed with the nested sampling package Polychord. We show that for FIP and Bayes factor calculations, defining priors on linear parameters as Gaussian mixture models significantly speeds up computations. The performance of the FAP, Bayes factor and FIP is studied with simulations as well as analytical arguments. We compare the methods assuming the model is correct, then evaluate their sensitivity to the prior and likelihood choices. Results. Among other properties, the FIP offers ways to test the reliability of the significance levels, is particularly efficient at accounting for aliasing, and allows the presence of planets to be excluded with a given confidence. We find that, in our simulations, the FIP outperforms existing detection metrics. We show that planet detections are sensitive to priors on period and semi-amplitude and that leaving the noise parameters free offers better performance than fixing a noise model based on a fit to ancillary indicators.
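
One way to picture a FIP computed from posterior samples over models; the sample format below is an assumption for illustration.

    import numpy as np

    # Hypothetical posterior samples: each draw lists the periods (days) of the
    # planets present in that model draw (possibly none).
    samples = [[3.1], [3.2, 45.0], [], [3.0], [3.1, 44.2], [2.9]]

    def fip(samples, p_lo, p_hi):
        """False inclusion probability for [p_lo, p_hi]: posterior probability
        that no planet has a period in the interval."""
        return np.mean([not any(p_lo <= p <= p_hi for p in s) for s in samples])

    print(fip(samples, 2.8, 3.4))   # 1/6 here: most posterior mass contains such a planet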

Read this paper on arXiv…

N. Hara, N. Unger, J. Delisle, et. al.
Mon, 17 May 21
14/55

Comments: Accepted for publication in Astronomy & Astrophysics

Identification of high-energy astrophysical point sources via hierarchical Bayesian nonparametric clustering [CL]

http://arxiv.org/abs/2104.11492


The light we receive from distant astrophysical objects carries information about their origins and the physical mechanisms that power them. The study of these signals, however, is complicated by the fact that observations are often a mixture of the light emitted by multiple localized sources situated in a spatially-varying background. A general algorithm to achieve robust and accurate source identification in this case remains an open question in astrophysics.
This paper focuses on high-energy light (such as X-rays and gamma-rays), for which observatories can detect individual photons (quanta of light), measuring their incoming direction, arrival time, and energy. Our proposed Bayesian methodology uses both the spatial and energy information to identify point sources, that is, separate them from the spatially-varying background, to estimate their number, and to compute the posterior probabilities that each photon originated from each identified source. This is accomplished via a Dirichlet process mixture, while the background is simultaneously reconstructed via a flexible Bayesian nonparametric model based on B-splines. Our proposed method is validated with a suite of simulation studies and illustrated with an application to a complex region of the sky observed by the Fermi Gamma-ray Space Telescope.
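
scikit-learn's truncated Dirichlet-process mixture gives the flavour of the source-finding step, though it lacks the paper's simultaneous B-spline background model; all numbers below are toys.

    import numpy as np
    from sklearn.mixture import BayesianGaussianMixture

    rng = np.random.default_rng(2)
    photons = np.vstack([rng.normal([0, 0], 0.1, (150, 2)),     # point source 1
                         rng.normal([0.8, 0.3], 0.1, (70, 2)),  # point source 2
                         rng.uniform(-1, 1, (200, 2))])         # crude stand-in background

    dpmm = BayesianGaussianMixture(
        n_components=10,                       # truncation level; extra components get ~0 weight
        weight_concentration_prior_type="dirichlet_process",
        covariance_type="full", max_iter=500, random_state=0).fit(photons)

    resp = dpmm.predict_proba(photons)         # posterior origin probabilities per photon
    print(np.round(dpmm.weights_, 3))          # effective number of occupied components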

Read this paper on arXiv…

A. Sottosanti, M. Bernardi, A. Brazzale, et. al.
Mon, 26 Apr 21
29/45

Comments: N/A

On the accuracy and precision of correlation functions and field-level inference in cosmology [CEA]

http://arxiv.org/abs/2103.04158


We present a comparative study of the accuracy and precision of correlation function methods and full-field inference in cosmological data analysis. To do so, we examine a Bayesian hierarchical model that predicts log-normal fields and their two-point correlation function. Although a simplified analytic model, the log-normal model produces fields that share many of the essential characteristics of the present-day non-Gaussian cosmological density fields. We use three different statistical techniques: (i) a standard likelihood-based analysis of the two-point correlation function; (ii) a likelihood-free (simulation-based) analysis of the two-point correlation function; (iii) a field-level analysis, made possible by a more sophisticated data assimilation technique. We find that (a) standard assumptions made to write down a likelihood for correlation functions can cause significant biases, a problem that is alleviated with simulation-based inference; and (b) analysing the entire field offers considerable advantages over correlation functions, through higher accuracy, higher precision, or both. The gains depend on the degree of non-Gaussianity, but in all cases, including for weak non-Gaussianity, the advantage of analysing the full field is substantial.
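
A minimal sketch of the forward step, generating a log-normal field from a Gaussian random field; the power-law spectrum is an arbitrary choice for illustration.

    import numpy as np

    n = 128
    rng = np.random.default_rng(3)
    k2 = np.fft.fftfreq(n)[:, None] ** 2 + np.fft.fftfreq(n)[None, :] ** 2
    with np.errstate(divide="ignore"):
        amp = np.where(k2 > 0, k2 ** -0.375, 0.0)      # sqrt of an assumed P(k) ~ |k|^-1.5

    g = np.real(np.fft.ifft2(np.fft.fft2(rng.normal(size=(n, n))) * amp))
    g = (g - g.mean()) / g.std()                       # unit-variance Gaussian field
    delta = np.exp(g - 0.5) - 1.0                      # log-normal contrast with mean ~ 0
    print(round(delta.mean(), 3), round(delta.min(), 3))   # mean near 0, bounded below by -1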

Read this paper on arXiv…

F. Leclercq and A. Heavens
Tue, 9 Mar 21
60/68

Comments: 6 pages, 4 figures. Our code and data are publicly available at this https URL

Mesospheric nitric oxide model from SCIAMACHY data [CL]

http://arxiv.org/abs/2102.08455


We present an empirical model for nitric oxide NO in the mesosphere ($\approx$60–90 km) derived from SCIAMACHY (SCanning Imaging Absorption spectroMeter for Atmospheric CHartographY) limb scan data. This work complements and extends the NOEM (Nitric Oxide Empirical Model; Marsh et al., 2004) and SANOMA (SMR Acquired Nitric Oxide Model Atmosphere; Kiviranta et al., 2018) empirical models in the lower thermosphere. The regression ansatz builds on the heritage of studies by Hendrickx et al. (2017) and the superposed epoch analysis by Sinnhuber et al. (2016) which estimate NO production from particle precipitation.
Our model relates the daily (longitudinally) averaged NO number densities from SCIAMACHY (Bender et al., 2017a, b) as a function of geomagnetic latitude to the solar Lyman-alpha and the geomagnetic AE (auroral electrojet) indices. We use a non-linear regression model, incorporating a finite and seasonally varying lifetime for the geomagnetically induced NO. We estimate the parameters by finding the maximum posterior probability and calculate the parameter uncertainties using Markov chain Monte Carlo sampling. In addition to providing an estimate of the NO content in the mesosphere, the regression coefficients indicate regions where certain processes dominate.

Read this paper on arXiv…

S. Bender, M. Sinnhuber, P. Espy, et. al.
Thu, 18 Feb 21
42/66

Comments: 13 pages, 6 figures; published in Atmos. Chem. Phys

WOMBAT: A fully Bayesian global flux-inversion framework [CL]

http://arxiv.org/abs/2102.04004


WOMBAT (the WOllongong Methodology for Bayesian Assimilation of Trace-gases) is a fully Bayesian hierarchical statistical framework for flux inversion of trace gases from flask, in situ, and remotely sensed data. WOMBAT extends the conventional Bayesian-synthesis framework through the consideration of a correlated error term, the capacity for online bias correction, and the provision of uncertainty quantification on all unknowns that appear in the Bayesian statistical model. We show, in an observing system simulation experiment (OSSE), that these extensions are crucial when the data are indeed biased and have errors that are correlated. Using the GEOS-Chem atmospheric transport model, we show that WOMBAT is able to obtain posterior means and uncertainties on non-fossil-fuel CO$_2$ fluxes from Orbiting Carbon Observatory-2 (OCO-2) data that are comparable to those from the Model Intercomparison Project (MIP) reported in Crowell et al. (2019, Atmos. Chem. Phys., vol. 19). We also find that our predictions of out-of-sample retrievals from the Total Column Carbon Observing Network are, for the most part, more accurate than those made by the MIP participants. Subsequent versions of the OCO-2 datasets will be ingested into WOMBAT as they become available.

Read this paper on arXiv…

A. Zammit-Mangion, M. Bertolacci, J. Fisher, et. al.
Tue, 9 Feb 21
7/87

Comments: 46 pages, 13 figures

Variational Inference for Deblending Crowded Starfields [IMA]

http://arxiv.org/abs/2102.02409


In the image data collected by astronomical surveys, stars and galaxies often overlap. Deblending is the task of distinguishing and characterizing individual light sources from survey images. We propose StarNet, a fully Bayesian method to deblend sources in astronomical images of crowded star fields. StarNet leverages recent advances in variational inference, including amortized variational distributions and the wake-sleep algorithm. Wake-sleep, which minimizes forward KL divergence, has significant benefits compared to traditional variational inference, which minimizes a reverse KL divergence. In our experiments with SDSS images of the M2 globular cluster, StarNet is substantially more accurate than two competing methods: Probabilistic Cataloging (PCAT), a method that uses MCMC for inference, and a software pipeline employed by SDSS for deblending (DAOPHOT). In addition, StarNet is as much as $100,000$ times faster than PCAT, exhibiting the scaling characteristics necessary to perform fully Bayesian inference on modern astronomical surveys.

Read this paper on arXiv…

R. Liu, J. McAuliffe and J. Regier
Fri, 5 Feb 21
54/66

Comments: 37 pages; 20 figures; 3 tables. Submitted to the Journal of the American Statistical Association

A Fast Template Periodogram for Detecting Non-sinusoidal Fixed-shape Signals in Irregularly Sampled Time Series [IMA]

http://arxiv.org/abs/2101.12348


Astrophysical time series often contain periodic signals. The large and growing volume of time series data from photometric surveys demands computationally efficient methods for detecting and characterizing such signals. The most efficient algorithms available for this purpose are those that exploit the $\mathcal{O}(N\log N)$ scaling of the Fast Fourier Transform (FFT). However, these methods are not optimal for non-sinusoidal signal shapes. Template fits (or periodic matched filters) optimize sensitivity for a priori known signal shapes but at a significant computational cost. Current implementations of template periodograms scale as $\mathcal{O}(N_f N_{obs})$, where $N_f$ is the number of trial frequencies and $N_{obs}$ is the number of lightcurve observations, and due to non-convexity, they do not guarantee the best fit at each trial frequency, which can lead to spurious results. In this work, we present a non-linear extension of the Lomb-Scargle periodogram to obtain a template-fitting algorithm that is both accurate (globally optimal solutions are obtained except in pathological cases) and computationally efficient (scaling as $\mathcal{O}(N_f\log N_f)$ for a given template). The non-linear optimization of the template fit at each frequency is recast as a polynomial zero-finding problem, where the coefficients of the polynomial can be computed efficiently with the non-equispaced fast Fourier transform. We show that our method, which uses truncated Fourier series to approximate templates, is an order of magnitude faster than existing algorithms for small problems ($N_{obs}\lesssim 10$ observations) and two orders of magnitude faster for long-baseline time series with $N_{obs} \gtrsim 10^4$ observations. An open-source implementation of the fast template periodogram is available at https://www.github.com/PrincetonUniversity/FastTemplatePeriodogram.
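
For intuition, the per-frequency building block being accelerated is a template fit at each trial frequency. The naive multi-harmonic least-squares version below is only illustrative (amplitudes left free, so it is closer to a generalized Lomb-Scargle with harmonics) and is not the paper's fast, globally optimal algorithm.

    import numpy as np

    def harmonic_fit_chi2(t, y, dy, freq, H=3):
        """Weighted least-squares fit of a truncated Fourier series at one frequency."""
        w = 1.0 / dy ** 2
        cols = [np.ones_like(t)]
        for h in range(1, H + 1):
            cols += [np.cos(2 * np.pi * h * freq * t), np.sin(2 * np.pi * h * freq * t)]
        A = np.column_stack(cols)
        Aw = A * w[:, None]
        coef = np.linalg.solve(A.T @ Aw, Aw.T @ y)
        resid = y - A @ coef
        return np.sum(w * resid ** 2)

    rng = np.random.default_rng(4)
    t = np.sort(rng.uniform(0, 100, 300))              # irregular sampling
    y = 1.0 - 0.8 * np.exp(np.cos(2 * np.pi * 0.137 * t) - 1) + 0.05 * rng.normal(size=300)
    dy = 0.05 * np.ones_like(t)
    freqs = np.linspace(0.05, 0.5, 2000)
    chi2 = [harmonic_fit_chi2(t, y, dy, f) for f in freqs]
    print(freqs[np.argmin(chi2)])                      # recovers ~0.137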

Read this paper on arXiv…

J. Hoffman, J. Vanderplas, J. Hartman, et. al.
Mon, 1 Feb 21
18/69

Comments: 11 pages, 5 figures

Change point detection and image segmentation for time series of astrophysical images [IMA]

http://arxiv.org/abs/2101.11202


Many astrophysical phenomena are time-varying, in the sense that their intensity, energy spectrum, and/or the spatial distribution of the emission suddenly change. This paper develops a method for modeling a time series of images. Under the assumption that the arrival times of the photons follow a Poisson process, the data are binned into 4D grids of voxels (time, energy band, and x-y coordinates), and viewed as a time series of non-homogeneous Poisson images. The method assumes that at each time point, the corresponding multi-band image stack is an unknown 3D piecewise constant function including Poisson noise. It also assumes that all image stacks between any two adjacent change points (in time domain) share the same unknown piecewise constant function. The proposed method is designed to estimate the number and the locations of all the change points (in time domain), as well as all the unknown piecewise constant functions between any pairs of the change points. The method applies the minimum description length (MDL) principle to perform this task. A practical algorithm is also developed to solve the corresponding complicated optimization problem. Simulation experiments and applications to real datasets show that the proposed method enjoys very promising empirical properties. Applications to two real datasets, the XMM observation of a flaring star and an emerging solar coronal loop, illustrate the usage of the proposed method and the scientific insight gained from it.
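
A 1D toy conveys the time-domain ingredient: scan candidate change points in Poisson counts and score each split by likelihood minus a crude MDL-style penalty. The paper performs this jointly with spatial segmentation over 4D voxel grids; the penalty below is a simplification.

    import numpy as np

    def best_changepoint(counts):
        """Best single change point in Poisson counts, or None if unsupported."""
        n = len(counts)
        seg_ll = lambda c: np.sum(c) * np.log(max(c.mean(), 1e-12)) - np.sum(c)
        best_k, best_score = None, seg_ll(counts)          # score of "no change point"
        for k in range(1, n):
            score = (seg_ll(counts[:k]) + seg_ll(counts[k:])
                     - 0.5 * np.log(n))                    # crude MDL-style penalty
            if score > best_score:
                best_k, best_score = k, score
        return best_k

    rng = np.random.default_rng(14)
    counts = np.concatenate([rng.poisson(5.0, 60), rng.poisson(12.0, 40)])
    print(best_changepoint(counts))                        # close to the true index, 60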

Read this paper on arXiv…

C. Xu, H. Günther, V. Kashyap, et. al.
Thu, 28 Jan 21
37/64

Comments: 22 pages, 10 figures

Distinction of groups of gamma-ray bursts in the BATSE catalog through fuzzy clustering [CL]

http://arxiv.org/abs/2101.03536


In the search for the possible astrophysical sources behind the origin of the diverse gamma-ray bursts, cluster analyses are performed to find homogeneous groups, and these have uncovered an intermediate group in addition to the conventional short and long bursts. Very recently, however, a few studies have indicated the possible existence of more than three (namely five) groups. In this paper, therefore, fuzzy clustering is conducted on the gamma-ray bursts from the final ‘Burst and Transient Source Experiment’ catalog to cross-check the significance of these new groups. A meticulous study of individual bursts, based on their memberships in the fuzzy clusters, confirms the previously well-known three groups against the newly found five.
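
A compact hand-rolled fuzzy c-means illustrates the soft memberships involved; the 1D toy data stand in for the BATSE burst variables, and the fuzzifier m=2 is a common default.

    import numpy as np

    def fuzzy_cmeans(X, c, m=2.0, iters=100, seed=0):
        """Return cluster centres and soft membership matrix u (n_points x c)."""
        rng = np.random.default_rng(seed)
        u = rng.dirichlet(np.ones(c), size=len(X))           # random initial memberships
        for _ in range(iters):
            um = u ** m
            centres = (um.T @ X) / um.sum(axis=0)[:, None]
            d = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
            u = 1.0 / np.sum((d[:, :, None] / d[:, None, :]) ** (2.0 / (m - 1.0)), axis=2)
        return centres, u

    # Toy 1D log-duration data with two overlapping burst populations.
    rng = np.random.default_rng(5)
    X = np.concatenate([rng.normal(-0.5, 0.4, 200), rng.normal(1.5, 0.45, 300)])[:, None]
    centres, u = fuzzy_cmeans(X, c=2)
    print(centres.ravel(), u[:3].round(2))                   # soft memberships, not hard labels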

Read this paper on arXiv…

S. Modak
Tue, 12 Jan 21
41/90

Comments: 26 pages; 6 figures

Simultaneous inference of periods and period-luminosity relations for Mira variable stars [CL]

http://arxiv.org/abs/2101.02938


The Period–Luminosity relation (PLR) of Mira variable stars is an important tool to determine astronomical distances. The common approach of estimating the PLR is a two-step procedure that first estimates the Mira periods and then runs a linear regression of magnitude on log period. When the light curves are sparse and noisy, the accuracy of period estimation decreases and can suffer from aliasing effects. Some methods improve accuracy by incorporating complex model structures at the expense of significant computational costs. Another drawback of existing methods is that they only provide point estimation without proper estimation of uncertainty. To overcome these challenges, we develop a hierarchical Bayesian model that simultaneously models the quasi-periodic variations for a collection of Mira light curves while estimating their common PLR. By borrowing strengths through the PLR, our method automatically reduces the aliasing effect, improves the accuracy of period estimation, and is capable of characterizing the estimation uncertainty. We develop a scalable stochastic variational inference algorithm for computation that can effectively deal with the multimodal posterior of period. The effectiveness of the proposed method is demonstrated through simulations, and an application to observations of Miras in the Local Group galaxy M33. Without using ad-hoc period correction tricks, our method achieves a distance estimate of M33 that is consistent with published work. Our method also shows superior robustness to downsampling of the light curves.

Read this paper on arXiv…

S. He, Z. Lin, W. Yuan, et. al.
Mon, 11 Jan 21
39/65

Comments: N/A

Sufficiency of a Gaussian power spectrum likelihood for accurate cosmology from upcoming weak lensing surveys [CEA]

http://arxiv.org/abs/2012.06267


We investigate whether a Gaussian likelihood is sufficient to obtain accurate parameter constraints from a Euclid-like combined tomographic power spectrum analysis of weak lensing, galaxy clustering and their cross-correlation. Testing its performance on the full sky against the Wishart distribution, which is the exact likelihood under the assumption of Gaussian fields, we find that the Gaussian likelihood returns accurate parameter constraints. This accuracy is robust to the choices made in the likelihood analysis, including the choice of fiducial cosmology, the range of scales included, and the random noise level. We extend our results to the cut sky by evaluating the additional non-Gaussianity of the joint cut-sky likelihood in both its marginal distributions and dependence structure. We find that the cut-sky likelihood is more non-Gaussian than the full-sky likelihood, but at a level insufficient to introduce significant inaccuracy into parameter constraints obtained using the Gaussian likelihood. Our results should not be affected by the assumption of Gaussian fields, as this approximation only becomes inaccurate on small scales, which in turn corresponds to the limit in which any non-Gaussianity of the likelihood becomes negligible. We nevertheless compare against N-body weak lensing simulations and find no evidence of significant additional non-Gaussianity in the likelihood. Our results indicate that a Gaussian likelihood will be sufficient for robust parameter constraints with power spectra from Stage IV weak lensing surveys.

Read this paper on arXiv…

R. Upham, M. Brown and L. Whittaker
Mon, 14 Dec 20
18/74

Comments: 14 pages, 19 figures, to be submitted to MNRAS

Gibbs Point Process Model for Young Star Clusters in M33 [GA]

http://arxiv.org/abs/2012.05938


We demonstrate the power of Gibbs point process models from the spatial statistics literature when applied to studies of resolved galaxies. We conduct a rigorous analysis of the spatial distributions of objects in the star formation complexes of M33, including giant molecular clouds (GMCs) and young stellar cluster candidates (YSCCs). We choose a hierarchical model structure from GMCs to YSCCs based on the natural formation hierarchy between them. This approach circumvents the limitations of the empirical two-point correlation function analysis by naturally accounting for the inhomogeneity present in the distribution of YSCCs. We also investigate the effects of GMC properties on their spatial distributions. We confirm that the distributions of GMCs and YSCCs are highly correlated. We find that the spatial distribution of YSCCs reaches a peak of clustering, relative to a Poisson process, at the ~250 pc scale. This clustering mainly occurs in regions where the galactocentric distance is >~4.5 kpc. Furthermore, the galactocentric distance of GMCs and their mass have strong positive effects on the correlation strength between GMCs and YSCCs. We outline some possible implications of these findings for our understanding of the cluster formation process.

Read this paper on arXiv…

D. Li and P. Barmby
Mon, 14 Dec 20
55/74

Comments: MNRAS in press; 22 pages

Hybrid analytic and machine-learned baryonic property insertion into galactic dark matter haloes [GA]

http://arxiv.org/abs/2012.05820


While cosmological dark matter-only simulations relying solely on gravitational effects are comparatively fast to compute, baryonic properties in simulated galaxies require complex hydrodynamic simulations that are computationally costly to run. We explore the merging of an extended version of the equilibrium model, an analytic formalism describing the evolution of the stellar, gas, and metal content of galaxies, into a machine learning framework. In doing so, we are able to recover more properties than the analytic formalism alone can provide, creating a high-speed hydrodynamic simulation emulator that populates galactic dark matter haloes in N-body simulations with baryonic properties. While there is a trade-off between the accuracy reached and the speed advantage this approach offers, our results outperform an approach using only machine learning for a subset of baryonic properties. We demonstrate that this novel hybrid system enables the fast completion of dark matter-only information by mimicking the properties of a full hydrodynamic suite to a reasonable degree, and we discuss the advantages and disadvantages of hybrid versus machine learning-only frameworks. In doing so, we offer an acceleration of commonly deployed simulations in cosmology.

Read this paper on arXiv…

B. Moews, R. Davé, S. Mitra, et. al.
Fri, 11 Dec 20
56/75

Comments: 13 pages, 8 figures, preprint submitted to MNRAS

Cluster analysis of presolar silicon carbide grains: evaluation of their classification and astrophysical implications [SSA]

http://arxiv.org/abs/2012.04009


Cluster analysis of presolar silicon carbide grains based on literature data for 12C/13C, 14N/15N, δ30Si/28Si, and δ29Si/28Si, including or excluding inferred initial 26Al/27Al data, reveals nine clusters agreeing with previously defined grain types but also highlighting new divisions. Mainstream grains reside in three clusters, probably representing different parent star metallicities. One of these clusters has a compact core, with a narrow range of composition, pointing to an enhanced production of SiC grains in asymptotic giant branch (AGB) stars with a narrow range of masses and metallicities. The addition of 26Al/27Al data highlights a cluster of mainstream grains, enriched in 15N and 26Al, which cannot be explained by current AGB models. We defined two AB grain clusters, one with 15N and 26Al excesses, and the other with 14N and smaller 26Al excesses, in agreement with recent studies. Their definition does not use the solar N isotopic ratio as a divider, and the contour of the 26Al-rich AB cluster identified in this study is in better agreement with core-collapse supernova models. We also found a cluster with a mixture of putative nova and AB grains, which may have formed in supernova or nova environments. X grains make up two clusters, having either strongly correlated Si isotopic ratios or deviating from the 2/3-slope line in the Si three-isotope plot. Finally, most Y and Z grains are jointly clustered, suggesting that the previous use of 12C/13C = 100 as a divider for Y grains was arbitrary. Our results show that cluster analysis is a powerful tool to interpret the data in light of stellar evolution and nucleosynthesis modelling, and they highlight the need for more multi-element isotopic data for better classification.

Read this paper on arXiv…

A. Boujibar, S. Howell, S. Zhang, et. al.
Wed, 9 Dec 20
74/80

Comments: 24 pages, 10 figures and 1 table

First optical reconstruction of dust in the region of SNR RX~J1713.7-3946 from astrometric Gaia data [HEAP]

http://arxiv.org/abs/2011.14383


The origin of the radiation observed in the region of the supernova remnant (SNR) RX J1713.7-3946, one of the brightest TeV emitters, has been debated since its discovery. The existence of atomic and molecular clouds in this object supports the idea that part of the GeV gamma rays in this region originate from proton-proton collisions. However, the observed column density of gas cannot explain the whole emission. Here we present the results of a novel technique that uses the ESA/Gaia DR2 data to reveal faint gas and dust structures in the region of RX J1713.7-3946 by making use of both astrometric and photometric data. These new structures could be an additional target for cosmic-ray protons from the SNR. Our distance-resolved reconstruction of dust extinction towards the SNR indicates the presence of only one faint structure in the vicinity of RX J1713.7-3946. Considering that the SNR is located in a dusty environment, we set the most precise constraint on the SNR distance to date, at ($1.12 \pm 0.01$) kpc.

Read this paper on arXiv…

R. Leike, S. Celli, A. Krone-Martins, et. al.
Tue, 1 Dec 20
25/108

Comments: N/A

Evaluation of investigational paradigms for the discovery of non-canonical astrophysical phenomena [IMA]

http://arxiv.org/abs/2011.10086


Non-canonical phenomena – defined here as observables which are either insufficiently characterized by existing theory, or otherwise represent inconsistencies with prior observations – are of burgeoning interest in the field of astrophysics, particularly due to their relevance as potential signs of past and/or extant life in the universe (e.g. off-nominal spectroscopic data from exoplanets). However, an inherent challenge in investigating such phenomena is that, by definition, they do not conform to existing predictions, thereby making it difficult to constrain search parameters and develop an associated falsifiable hypothesis.
In this Expert Recommendation, the authors evaluate the suitability of two different approaches – conventional parameterized investigation (wherein experimental design is tailored to optimally test a focused, explicitly parameterized hypothesis of interest) and the alternative approach of anomaly searches (wherein broad-spectrum observational data is collected with the aim of searching for potential anomalies across a wide array of metrics) – in terms of their efficacy in achieving scientific objectives in this context. The authors provide guidelines on the appropriate use-cases for each paradigm, and contextualize the discussion through its applications to the interdisciplinary field of technosignatures (a discipline at the intersection of astrophysics and astrobiology), which essentially specializes in searching for non-canonical astrophysical phenomena.

Read this paper on arXiv…

C. Singam, J. Haqq-Misra, A. Balbi, et. al.
Mon, 23 Nov 20
61/63

Comments: A product of the TechnoClimes 2020 conference

MatDRAM: A pure-MATLAB Delayed-Rejection Adaptive Metropolis-Hastings Markov Chain Monte Carlo Sampler [CL]

http://arxiv.org/abs/2010.04190


Markov Chain Monte Carlo (MCMC) algorithms are widely used for stochastic optimization, sampling, and integration of mathematical objective functions, in particular in the context of Bayesian inverse problems and parameter estimation. For decades, the algorithm of choice in MCMC simulations has been the Metropolis-Hastings (MH) algorithm. An advancement over the traditional MH-MCMC sampler is the Delayed-Rejection Adaptive Metropolis (DRAM) algorithm. In this paper, we present MatDRAM, a stochastic optimization, sampling, and Monte Carlo integration toolbox in MATLAB which implements a variant of the DRAM algorithm for exploring mathematical objective functions of arbitrary dimensions, in particular the posterior distributions of Bayesian models in data science, machine learning, and scientific inference. The design goals of MatDRAM include nearly full automation of MCMC simulations, user-friendliness, fully deterministic reproducibility, and restart functionality for simulations. We also discuss the implementation details of a technique to automatically monitor and ensure the diminishing adaptation of the proposal distribution of the DRAM algorithm, as well as a method of efficiently storing the resulting simulated Markov chains. The MatDRAM library is open-source, MIT-licensed, and permanently located and maintained as part of the ParaMonte library at https://github.com/cdslaborg/paramonte.
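
A bare-bones adaptive Metropolis sampler conveys the 'AM' half of DRAM; the delayed-rejection stage and MatDRAM's automation and monitoring are omitted here, and the banana-shaped target is a toy choice.

    import numpy as np

    def adaptive_metropolis(logpdf, x0, n=20000, adapt_start=500, eps=1e-8, seed=0):
        """Random-walk Metropolis with Haario-style covariance adaptation."""
        rng = np.random.default_rng(seed)
        d = len(x0)
        chain = np.empty((n, d)); chain[0] = x0
        lp = logpdf(x0)
        cov = np.eye(d)
        for i in range(1, n):
            if i > adapt_start:                        # adapt proposal to past samples
                cov = np.cov(chain[:i].T) * (2.38 ** 2 / d) + eps * np.eye(d)
            prop = rng.multivariate_normal(chain[i - 1], cov)
            lp_prop = logpdf(prop)
            if np.log(rng.uniform()) < lp_prop - lp:
                chain[i], lp = prop, lp_prop
            else:
                chain[i] = chain[i - 1]
        return chain

    banana = lambda x: -0.5 * (x[0] ** 2 / 10 + (x[1] + 0.1 * x[0] ** 2 - 1) ** 2)
    chain = adaptive_metropolis(banana, np.zeros(2))
    print(chain[10000:].mean(axis=0))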

Read this paper on arXiv…

S. Kumbhare and A. Shahmoradi
Mon, 12 Oct 20
22/59

Comments: N/A

How unbiased statistical methods lead to biased scientific discoveries: A case study of the Efron-Petrosian statistic applied to the luminosity-redshift evolution of Gamma-Ray Bursts [HEAP]

http://arxiv.org/abs/2010.02935


Statistical methods are frequently built upon assumptions that limit their applicability to certain problems and conditions. Failure to recognize these limitations can lead to conclusions that are inaccurate or biased. An example of such methods is the non-parametric Efron-Petrosian test statistic used in the study of truncated data. We argue and show how the inappropriate use of this statistical method can lead to biased conclusions when the assumptions under which the method is valid do not hold. We do so by reinvestigating the evidence recently provided by multiple independent reports on the evolution of the luminosity/energetics distribution of cosmological Long-duration Gamma-Ray Bursts (LGRBs) with redshift. We show that the effect of the detection threshold has likely been significantly underestimated in the majority of previous studies. This underestimation of the detection threshold leads to severely incomplete LGRB samples that exhibit strong apparent luminosity-redshift or energetics-redshift correlations. We further confirm our findings by performing extensive Monte Carlo simulations of the cosmic rates and the luminosity/energy distributions of LGRBs and their detection process.
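
A sketch of the Efron-Petrosian rank statistic for a flux-truncated sample, following the common associated-set recipe for one-sided truncation; the threshold curve and data below are assumed toys.

    import numpy as np

    def efron_petrosian_tau(z, L, lum_lim):
        """Rank statistic for independence of L and z under truncation L >= lum_lim(z)."""
        z, L = np.asarray(z), np.asarray(L)
        num, var = 0.0, 0.0
        for zi, Li in zip(z, L):
            zmax = np.max(z[lum_lim(z) <= Li])         # crude z_max at which L_i is detectable
            assoc = (L >= lum_lim(zi)) & (z <= zmax)   # associated (comparable) set
            n = assoc.sum()
            if n < 2:
                continue
            rank = (z[assoc] < zi).sum() + 1           # rank of z_i within the set
            num += rank - (n + 1) / 2.0
            var += (n * n - 1) / 12.0
        return num / np.sqrt(var)                      # |tau| >~ 2 suggests L-z dependence

    rng = np.random.default_rng(6)
    z = rng.uniform(0.1, 5.0, 2000)
    L = 10 ** rng.normal(51.0, 0.8, 2000)              # toy luminosities, independent of z
    lim = lambda zz: 1e50 * (1 + zz) ** 2              # assumed detection-threshold curve
    keep = L >= lim(z)
    print(efron_petrosian_tau(z[keep], L[keep], lim))  # near 0 when the threshold is adequate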

Read this paper on arXiv…

C. Bryant, J. Osborne and A. Shahmoradi
Thu, 8 Oct 20
24/54

Comments: N/A

Comparison of classical and Bayesian imaging in radio interferometry [IMA]

http://arxiv.org/abs/2008.11435


CLEAN, the commonly employed imaging algorithm in radio interferometry, suffers from the following shortcomings: in its basic version it does not have the concept of diffuse flux, and the common practice of convolving the CLEAN components with the CLEAN beam erases the potential for super-resolution; it does not output uncertainty information; it produces images with unphysical negative flux regions; and its results are highly dependent on the so-called weighting scheme as well as on any human choice of CLEAN masks to guide the imaging. Here, we present the Bayesian imaging algorithm resolve, which solves the above problems and naturally leads to super-resolution. In this publication we take a VLA observation of Cygnus A at four different frequencies and image it with single-scale CLEAN, multi-scale CLEAN and resolve. Alongside the sky brightness distribution, resolve estimates a baseline-dependent correction function for the noise budget, the Bayesian equivalent of weighting schemes. We report noise correction factors between 0.3 and 340. The enhancements achieved by resolve come at the cost of higher computational effort.

Read this paper on arXiv…

P. Arras, R. Perley, H. Bester, et. al.
Thu, 27 Aug 20
51/52

Comments: 22 pages, 14 figures, 4 tables, data published at this https URL

Polarization-based online interference mitigation in radio interferometry [IMA]

http://arxiv.org/abs/2006.00062


Mitigation of radio frequency interference (RFI) is essential to deliver science-ready radio interferometric data to astronomers. In this paper, using dual polarized radio interferometers, we propose to use the polarization information of post-correlation interference signals to detect and mitigate them. We use the directional statistics of the polarized signals as the detection criteria and formulate a distributed, wideband spectrum sensing problem. Using consensus optimization, we solve this in an online manner, working with mini-batches of data. We present extensive results based on simulations to demonstrate the feasibility of our method.

Read this paper on arXiv…

S. Yatawatta
Tue, 2 Jun 20
76/90

Comments: EUSIPCO 2020 accepted

A Hermite-Gaussian Based Radial Velocity Estimation Method [EPA]

http://arxiv.org/abs/2005.14083


As the first successful technique used to detect exoplanets orbiting distant stars, the Radial Velocity Method aims to detect a periodic Doppler shift in a star’s spectrum. We introduce a new, mathematically rigorous, approach to detect such a signal that accounts for functional relationships of neighboring wavelengths, minimizes the role of wavelength interpolation, accounts for heteroskedastic noise, and easily allows for statistical inference. Using Hermite-Gaussian functions, we show that the problem of detecting a Doppler shift in the spectrum can be reduced to linear regression in many settings. A simulation study demonstrates that the proposed method is able to accurately estimate an individual spectrum’s radial velocity with precision below 0.3 m/s. Furthermore, the new method outperforms the traditional Cross-Correlation Function approach by reducing the root mean squared error up to 15 cm/s. The proposed method is also demonstrated on a new set of observations from the EXtreme PREcision Spectrometer (EXPRES) for the star 51 Pegasi, and successfully recovers estimates that agree well with previous studies of this planetary system. Data and Python3 code associated with this work can be found at https://github.com/parkerholzer/hgrv_method. The method is also implemented in the open source R package rvmethod.
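
The reduction to linear regression can be sketched with a first-order Doppler expansion; this toy comb of Gaussian lines is a simplified stand-in for the paper's Hermite-Gaussian construction.

    import numpy as np

    c = 299792458.0                                    # speed of light, m/s
    lam = np.linspace(5000.0, 5100.0, 40000)           # wavelength grid (Angstrom)
    line_centres = np.arange(5002.0, 5100.0, 2.0)      # a comb of 49 absorption lines
    template = 1.0 - 0.6 * np.exp(
        -0.5 * ((lam[:, None] - line_centres) / 0.05) ** 2).sum(axis=1)

    v_true = 3.0                                       # injected radial velocity, m/s
    rng = np.random.default_rng(7)
    observed = np.interp(lam * (1.0 - v_true / c), lam, template)
    observed = observed + 0.002 * rng.normal(size=lam.size)

    # First-order expansion: observed - template ~ v * (-lambda f'(lambda) / c),
    # so the RV estimate is a single ordinary least-squares coefficient.
    x = -lam * np.gradient(template, lam) / c
    v_hat = np.sum(x * (observed - template)) / np.sum(x * x)
    print(v_hat)                                       # recovers ~3 m/s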

Read this paper on arXiv…

P. Holzer, J. Cisewski-Kehe, D. Fischer, et. al.
Fri, 29 May 20
56/75

Comments: 48 pages, 19 figures

Ridges in the Dark Energy Survey for cosmic trough identification [CEA]

http://arxiv.org/abs/2005.08583


Cosmic voids and their corresponding redshift-aggregated projections of mass densities, known as troughs, play an important role in our attempt to model the large-scale structure of the Universe. Understanding these structures leads to tests comparing the standard model with alternative cosmologies, constraints on the dark energy equation of state, and provides evidence to differentiate among gravitational theories. In this paper, we extend the subspace-constrained mean shift algorithm, a recently introduced method to estimate density ridges, and apply it to 2D weak-lensing mass density maps from the Dark Energy Survey Y1 data release to identify curvilinear filamentary structures. We compare the obtained ridges with previous approaches to extract trough structure in the same data, and apply curvelets as an alternative wavelet-based method to constrain densities. We then invoke the Wasserstein distance between noisy and noiseless simulations to validate the denoising capabilities of our method. Our results demonstrate the viability of ridge estimation as a precursor for denoising weak lensing quantities to recover the large-scale structure, paving the way for a more versatile and effective search for troughs.

Read this paper on arXiv…

B. Moews, M. Schmitz, A. Lawler, et. al.
Tue, 19 May 20
87/92

Comments: 12 pages, 5 figures, preprint submitted to MNRAS

Parametric unfolding. Method and restrictions [CL]

http://arxiv.org/abs/2004.12766


Parametric unfolding of a true distribution distorted by finite resolution and limited efficiency in the registration of individual events is discussed. Details of the computational algorithm of the unfolding procedure are presented.

Read this paper on arXiv…

N. Gagunashvili
Tue, 28 Apr 20
68/81

Comments: 14 pages, 9 figures

Comparison of the shape and temporal evolution of even and odd solar cycles [SSA]

http://arxiv.org/abs/2004.03855


Results. The PCA confirms the existence of the Gnevyshev gap (GG) for solar cycles at about 40% from the start of the cycle. The temporal evolution of sunspot area data for even cycles shows that the GG exists at least at the 95% confidence level for all sizes of sunspots. On the other hand, the GG is shorter and statistically insignificant for the odd cycles in the areal sunspot data. Furthermore, the analysis of sunspot area sizes for even and odd cycles of SC12-SC23 shows that the greatest difference is at 4.2-4.6 years, where even cycles have a far smaller total area than odd cycles. The average area of the individual sunspots of even cycles is also smaller in this interval. The statistical analysis of the temporal evolution shows that northern sunspot groups maximise earlier than southern groups for even cycles, but are concurrent for odd cycles. Furthermore, the temporal distributions of odd cycles are slightly more leptokurtic than the distributions of even cycles. The skewnesses are 0.37 and 0.49 and the kurtoses 2.79 and 2.94 for even and odd cycles, respectively. The correlation coefficient between skewness and kurtosis for even cycles is 0.69, and for odd cycles it is 0.90. Conclusions. The separate PCAs for even and odd sunspot cycles show that odd cycles are more inhomogeneous than even cycles, especially in the GSN data. Even cycles, however, contain two anomalous cycles: SC4 and SC6. According to the analysis of the sunspot area size data, the GG is more distinct in even than in odd cycles. We also present another Waldmeier-type rule: we find a correlation between the skewness and kurtosis of the sunspot group cycles.

Read this paper on arXiv…

J. Takalo and K. Mursula
Thu, 9 Apr 20
4/54

Comments: 10 pages, 13 figures

Comparison of Latitude Distribution and Evolution of Even and Odd Sunspot Cycles [SSA]

http://arxiv.org/abs/2003.14262


We study the latitudinal distribution and evolution of sunspot areas from Solar Cycle 12 to Solar Cycle 23 (SC12-SC23), and of sunspot groups from Solar Cycle 8 to Solar Cycle 23 (SC8-SC23), for even and odd cycles. The Rician distribution is the best-fit function for the latitudinal occurrence of sunspot groups in both even and odd cycles. The mean and variance for even northern/southern butterfly-wing sunspots are 14.94/14.76 and 58.62/56.08, respectively, and the mean and variance for odd northern/southern wing sunspots are 15.52/15.58 and 61.77/58.00, respectively. Sunspot groups of even-cycle wings are thus on average at somewhat lower latitudes than sunspot groups of odd-cycle wings, by about 0.6 degrees for northern-hemisphere wings and 0.8 degrees for southern-hemisphere wings. The spatial analysis of sunspot areas for SC12-SC23 shows that small sunspots lie at lower solar latitudes than large sunspots for both odd and even cycles, and for both hemispheres. The temporal evolution of sunspot areas shows a lack of large sunspots after four years (more precisely, between 4.2-4.5 years), i.e., at about 40% after the start of the cycle, especially for even cycles. This is related to the Gnevyshev gap and occurs at the time when the evolution of the average sunspot latitudes crosses about 15 degrees. The gap is, however, clearer for even cycles than for odd ones. The Gnevyshev gap divides the cycle into two disparate parts: the ascending phase/cycle maximum and the declining phase of the sunspot cycle.

Read this paper on arXiv…

J. Takalo
Wed, 1 Apr 20
36/83

Comments: 11 pages, 5 figures

Clustering of Local Extrema in Planck CMB maps [CEA]

http://arxiv.org/abs/2003.07364


The clustering of local extrema, including peaks and troughs, is exploited to assess Gaussianity, asymmetry and the footprint of the cosmic-string network on the CMB random field observed by the Planck satellite. The number density of local extrema reveals some non-resolved shot noise in the Planck maps. The SEVEM data has the maximum number density of peaks, $n_{pk}$, and troughs, $n_{tr}$, compared to other observed maps. The cumulative densities of $n_{pk}$ and $n_{tr}$ above and below a threshold, $\vartheta$, for all Planck maps except for the 100 GHz band are compatible with a Gaussian random field. The unweighted two-point correlation function (TPCF), $\Psi(\theta;\vartheta)$, of the local extrema illustrates significant non-Gaussianity for angular separations $\theta\le 15'$ for all available thresholds. Our results show that to put a feasible constraint on the amplitude of the mass function based on the value of $\Psi$ around the Doppler peak ($\theta\approx 70'-75'$), we should consider $\vartheta\gtrsim +1.0$. The scale-independent bias factors for peaks above a threshold, for large separation angles and high threshold levels, are in agreement with those expected for a pure Gaussian CMB. The unweighted TPCF of local extrema demonstrates a level of rejection of the Gaussian hypothesis in SMICA. Genus topology also confirms the Gaussian hypothesis for different component-separation maps. Tessellating the CMB map with disks of size $6^{\circ}$, based on $n_{pk}$ and $\Psi_{pk-pk}$, demonstrates statistical symmetry in the Planck maps. Combining all maps and applying the $\Psi_{pk-pk}$ puts an upper bound on the cosmic-string tension: $G\mu^{(up)} \lesssim 5.00\times 10^{-7}$.

Read this paper on arXiv…

A. Sadr and S. Movahed
Wed, 18 Mar 20
19/46

Comments: 16 pages, 19 figures

Characterization of hot stellar systems with confidence [GA]

http://arxiv.org/abs/2003.05777


Hot stellar systems (HSS) are collections of stars bound together by gravitational attraction. These systems hold clues to many mysteries of outer space, so understanding their origin, evolution and physical properties is important but remains a huge challenge. We used multivariate $t$-mixture model-based clustering to analyze 13456 hot stellar systems from Misgeld & Hilker (2011) that included 12763 candidate globular clusters, and found eight homogeneous groups using the Bayesian Information Criterion (BIC). A nonparametric bootstrap procedure was used to estimate the confidence of each of our clustering assignments. The eight obtained groups can be characterized in terms of the correlation, mass, effective radius and surface density. Using conventional correlation-mass-effective radius-surface density notation, the largest group, Group 1, can be described as having positive-low-low-moderate characteristics. The other groups, numbered in decreasing size, are similarly characterized, with Group 2 having positive-low-low-high characteristics, Group 3 displaying positive-low-low-moderate characteristics, Group 4 having positive-low-low-high characteristics, Group 5 displaying positive-low-moderate-moderate characteristics and Group 6 showing positive-moderate-low-high characteristics. The smallest group (Group 8) shows negative-low-moderate-moderate characteristics. Group 7 has no candidate clusters and so cannot be similarly labeled, but the mass-effective radius correlation for these non-candidates indicates that they are larger than typical globular clusters. Assertions drawn for each group are ambiguous for a few HSS having low confidence in classification. Our analysis identifies distinct kinds of HSS with varying confidence and provides novel insight into their physical and evolutionary properties.

Read this paper on arXiv…

S. Chattopadhyay and R. Maitra
Fri, 13 Mar 20
11/53

Comments: 9 pages; 9 figures

Revisiting the distributions of the physical characteristics of Jupiter's irregular moons [EPA]

http://arxiv.org/abs/2003.04810


As the number of identified moons of Jupiter has skyrocketed to 79, some of them have been regrouped. In this work, we continue to identify the potential distributions of the physical characteristics of Jupiter’s irregular moons. Using nonparametric Kolmogorov-Smirnov tests, we examined more than 20 commonly used distributions and found that, surprisingly, almost all the physical characteristics (i.e., the equatorial radius, equatorial circumference, circumference, volume, mass, surface gravity and escape velocity) of the moons in the Ananke and Carme groups follow log-logistic distributions. Additionally, more than half of the physical characteristics of the moons in the Pasiphae group are theoretically subject to this type of distribution. Combined with strict analytical derivations, it is increasingly plausible that the physical characteristics of most irregular moons will prove to follow log-logistic distributions as more of Jupiter’s irregular moons are discovered.
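
Such checks are straightforward to reproduce with scipy, where the log-logistic distribution is available as fisk; the data below are simulated stand-ins, not the actual moon measurements.

    import numpy as np
    from scipy import stats

    rng = np.random.default_rng(8)
    radii = stats.fisk.rvs(c=2.5, scale=3.0, size=30, random_state=rng)  # toy radii (km)

    params = stats.fisk.fit(radii, floc=0)             # fit log-logistic, location fixed at 0
    ks = stats.kstest(radii, "fisk", args=params)
    print(params, ks.pvalue)                           # large p-value: no evidence against the fit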

Read this paper on arXiv…

F. Gao and X. Liu
Wed, 11 Mar 20
34/65

Comments: 35 pages, 27 figures, 17 tables

Revisiting distribution laws for orbital characteristics of Jupiter's moons [EPA]

http://arxiv.org/abs/2003.04851


From a statistical point of view, this paper mainly examines the orbital distribution laws of Jupiter’s moons, most of which are located in the Ananke, Carme and Pasiphae groups. By comparing 19 known continuous distributions, it is verified that there are suitable distribution functions to describe the distribution of these natural satellites. For each distribution type, interval estimation is used to estimate the corresponding parameter values. At a given significance level, a one-sample Kolmogorov-Smirnov nonparametric test is applied to verify the specified distribution, and we select the one with the largest $p$-value. The results show that the semi-major axis, mean inclination and orbital period of the moons in the Ananke and Carme groups all follow stable distributions. In addition, according to Kepler’s third law of planetary motion, and by comparing the theoretically calculated best-fit cumulative distribution function (CDF) with the observed CDF, we demonstrate that the theoretical distribution is in good agreement with the empirical distribution. Therefore, these characteristics of Jupiter’s moons are indeed very likely to follow specific distribution laws, and it will be possible to use these laws to help study certain features of poorly investigated moons or even predict undiscovered ones.

Read this paper on arXiv…

F. Gao and X. Liu
Wed, 11 Mar 20
58/65

Comments: 30 pages, 8 figures, 19 tables

Characterizing the spatial pattern of solar supergranulation using the bispectrum [SSA]

http://arxiv.org/abs/2002.08262


Context. The spatial power spectrum of supergranulation does not fully characterize the underlying physics of turbulent convection. For example, it does not describe the non-Gaussianity in the horizontal flow divergence.
Aims. Our aim is to statistically characterize the spatial pattern of solar supergranulation beyond the power spectrum. The next-order statistic is the bispectrum. It measures correlations of three Fourier components and is related to the nonlinearities in the underlying physics.
Methods. We estimated the bispectrum of supergranular horizontal surface divergence maps that were obtained using local correlation tracking (LCT) and time-distance helioseismology (TD) from one year of data from the Helioseismic and Magnetic Imager on-board the Solar Dynamics Observatory starting in May 2010.
Results. We find significantly nonzero and consistent estimates for the bispectrum. The strongest nonlinearity is present when the three coupling wave vectors are at the supergranular scale. These are the same wave vectors that are present in regular hexagons, which were used in analytical studies of solar convection. At these Fourier components, the bispectrum is positive, consistent with the positive skewness in the data and with supergranules preferentially consisting of outflows surrounded by a network of inflows. Using the bispectrum, we generate synthetic divergence maps that are very similar to the data with a model that consists of a Gaussian term and a weaker quadratic nonlinear component. Thereby, we estimate the fraction of the variance in the divergence maps from the nonlinear component to be of the order of 4-6%.
Conclusions. We propose that bispectral analysis is useful for understanding solar turbulent convection, for example for comparing observations and numerical models of supergranular flows. This analysis may also be useful to generate synthetic flow fields.
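
A 1D toy version of the bispectrum estimator for one triangle of wave numbers; the paper works with 2D divergence maps and a careful normalisation and error analysis, all omitted here.

    import numpy as np

    def bispectrum(signals, k1, k2):
        """Average B(k1,k2) = <F(k1) F(k2) conj(F(k1+k2))> over rows of `signals`."""
        F = np.fft.fft(signals - signals.mean(axis=1, keepdims=True), axis=1)
        return np.mean(F[:, k1] * F[:, k2] * np.conj(F[:, k1 + k2]))

    rng = np.random.default_rng(9)
    g = rng.normal(size=(5000, 256))                   # Gaussian realisations
    skewed = g + 0.3 * (g ** 2 - 1.0)                  # weak quadratic nonlinearity
    print(bispectrum(g, 5, 9).real)                    # consistent with 0 for a Gaussian field
    print(bispectrum(skewed, 5, 9).real)               # clearly positive: nonlinear mode coupling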

Read this paper on arXiv…

V. Böning, A. Birch, L. Gizon, et. al.
Thu, 20 Feb 20
51/61

Comments: 16 pages, 12 figures, accepted for publication by A&A

The Widely Linear Complex Ornstein-Uhlenbeck Process with Application to Polar Motion [CL]

http://arxiv.org/abs/2001.05965


Complex-valued and widely linear modelling of time series signals are widespread and found in many applications. However, existing models and analysis techniques are usually restricted to signals observed in discrete time. In this paper we introduce a widely linear version of the complex Ornstein-Uhlenbeck (OU) process. This is a continuous-time process which generalises the standard complex-valued OU process such that signals generated from the process contain elliptical oscillations, as opposed to circular oscillations, when viewed in the complex plane. We determine properties of the widely linear complex OU process, including the conditions for stationarity, and the geometrical structure of the elliptical oscillations. We derive the analytical form of the power spectral density function, which then provides an efficient procedure for parameter inference using the Whittle likelihood. We apply the process to measure periodic and elliptical properties of Earth’s polar motion, including that of the Chandler wobble, for which the standard complex OU process was originally proposed.
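
The Whittle step is compact to sketch: compare the periodogram to the model spectral density at the Fourier frequencies. Below is a generic fit with an ordinary OU-type Lorentzian spectrum, not the widely linear spectrum derived in the paper; normalisation constants are absorbed into the variance parameter.

    import numpy as np
    from scipy.optimize import minimize

    def whittle_nll(params, y, dt=1.0):
        """Negative Whittle log-likelihood for an assumed Lorentzian (OU-type) spectrum."""
        lam, s2 = np.exp(params)                       # damping and variance scale, kept > 0
        n = len(y)
        f = np.fft.rfftfreq(n, d=dt)[1:]               # positive Fourier frequencies
        I = (np.abs(np.fft.rfft(y)) ** 2 * dt / n)[1:] # periodogram
        S = s2 / (lam ** 2 + (2 * np.pi * f) ** 2)     # model power spectral density
        return np.sum(np.log(S) + I / S)

    rng = np.random.default_rng(13)
    y = np.zeros(4096)                                 # AR(1) series ~ discretely sampled OU
    for i in range(1, y.size):
        y[i] = 0.95 * y[i - 1] + rng.normal()
    res = minimize(whittle_nll, x0=np.log([0.05, 1.0]), args=(y,), method="Nelder-Mead")
    print(np.exp(res.x))                               # lam should be near -ln(0.95) ~ 0.05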

Read this paper on arXiv…

A. Sykulski, S. Olhede and H. Sykulska-Lawrence
Fri, 17 Jan 20
48/60

Comments: Submitted for peer-review

Trend Filtering — II. Denoising Astronomical Signals with Varying Degrees of Smoothness [IMA]

http://arxiv.org/abs/2001.03552


Trend filtering—first introduced into the astronomical literature in Paper I of this series—is a state-of-the-art statistical tool for denoising one-dimensional signals that possess varying degrees of smoothness. In this work, we demonstrate the broad utility of trend filtering to observational astronomy by discussing how it can contribute to a variety of spectroscopic and time-domain studies. The observations we discuss are (1) the Lyman-$\alpha$ forest of quasar spectra; (2) more general spectroscopy of quasars, galaxies, and stars; (3) stellar light curves with planetary transits; (4) eclipsing binary light curves; and (5) supernova light curves. We study the Lyman-$\alpha$ forest in the greatest detail—using trend filtering to map the large-scale structure of the intergalactic medium along quasar-observer lines of sight. The remaining studies share broad themes of: (1) estimating observable parameters of light curves and spectra; and (2) constructing observational spectral/light-curve templates. We also briefly discuss the utility of trend filtering as a tool for one-dimensional data reduction and compression.
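
Trend filtering itself can be prototyped in a few lines as a convex problem, here with an l1 penalty on discrete second differences, which yields piecewise-linear fits; this generic sketch is not the specialised solvers used in practice, and the penalty weight would normally be tuned.

    import numpy as np
    import cvxpy as cp

    rng = np.random.default_rng(10)
    t = np.linspace(0, 1, 300)
    y = np.abs(t - 0.3) - 2 * np.abs(t - 0.7) + 0.05 * rng.normal(size=t.size)

    D = np.diff(np.eye(t.size), n=2, axis=0)           # discrete second-difference operator
    beta = cp.Variable(t.size)
    lam = 1.0                                          # roughness penalty (cross-validate in practice)
    cp.Problem(cp.Minimize(0.5 * cp.sum_squares(y - beta)
                           + lam * cp.norm1(D @ beta))).solve()
    fit = beta.value                                   # piecewise-linear denoised signal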

Read this paper on arXiv…

C. Politsch, J. Cisewski-Kehe, R. Croft, et. al.
Mon, 13 Jan 20
22/61

Comments: Part 2 of 2, Link to Part 1: arXiv:1908.07151; 15 pages, 7 figures

Applying Information Theory to Design Optimal Filters for Photometric Redshifts [IMA]

http://arxiv.org/abs/2001.01372


In this paper we apply ideas from information theory to create a method for the design of optimal filters for photometric redshift estimation. We show the method applied to a series of simple example filters in order to motivate an intuition for how photometric redshift estimators respond to the properties of photometric passbands. We then design a realistic set of six filters covering optical wavelengths that optimize photometric redshifts for $z <= 2.3$ and $i < 25.3$. We create a simulated catalog for these optimal filters and use our filters with a photometric redshift estimation code to show that we can improve the standard deviation of the photometric redshift error by 7.1% overall and improve outliers 9.9% over the standard filters proposed for the Large Synoptic Survey Telescope (LSST). We compare features of our optimal filters to LSST and find that the LSST filters incorporate key features for optimal photometric redshift estimation. Finally, we describe how information theory can be applied to a range of optimization problems in astronomy.

Read this paper on arXiv…

J. Kalmbach, J. VanderPlas and A. Connolly
Tue, 7 Jan 20
69/71

Comments: 29 pages, 17 figures, accepted to ApJ

Multilevel and hierarchical Bayesian modeling of cosmic populations [IMA]

http://arxiv.org/abs/1911.12337


Demographic studies of cosmic populations must contend with measurement errors and selection effects. We survey some of the key ideas astronomers have developed to deal with these complications, in the context of galaxy surveys and the literature on corrections for Malmquist and Eddington bias. From the perspective of modern statistics, such corrections arise naturally in the context of multilevel models, particularly in Bayesian treatments of such models: hierarchical Bayesian models. We survey some key lessons from hierarchical Bayesian modeling, including shrinkage estimation, which is closely related to traditional corrections devised by astronomers. We describe a framework for hierarchical Bayesian modeling of cosmic populations, tailored to features of astronomical surveys that are not typical of surveys in other disciplines. This thinned latent marked point process framework accounts for the tie between selection (detection) and measurement in astronomical surveys, treating selection and measurement error effects in a self-consistent manner.
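
The shrinkage idea in miniature: with Gaussian measurement errors and a Gaussian population, each object's posterior mean is pulled toward the population mean in proportion to its noise. This is a textbook normal-normal example, not the full thinned point-process framework of the paper.

    import numpy as np

    mu_pop, tau = 10.0, 2.0                  # population mean and intrinsic scatter
    y = np.array([14.0, 9.0, 6.5])           # noisy measurements of three objects
    sigma = np.array([3.0, 0.5, 1.0])        # per-object measurement errors

    w = (1 / sigma ** 2) / (1 / sigma ** 2 + 1 / tau ** 2)
    post_mean = w * y + (1 - w) * mu_pop     # noisier objects shrink more toward mu_pop
    print(post_mean)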

Read this paper on arXiv…

T. Loredo and M. Hendry
Thu, 28 Nov 19
23/70

Comments: 33 pages, 5 figures

Searching for new physics with profile likelihoods: Wilks and beyond [CL]

http://arxiv.org/abs/1911.10237


Particle physics experiments use likelihood ratio tests extensively to compare hypotheses and to construct confidence intervals. Often, the null distribution of the likelihood ratio test statistic is approximated by a $\chi^2$ distribution, following a theorem due to Wilks. However, many circumstances relevant to modern experiments can cause this theorem to fail. In this paper, we review how to identify these situations and construct valid inference.
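
A standard example of Wilks failing is a signal strength constrained to be non-negative: under the null, the likelihood-ratio statistic follows a 50:50 mixture of zero and chi-squared with one degree of freedom, not chi-squared itself. The single-bin Poisson numbers below are toys.

    import numpy as np
    from scipy import stats

    b = 10.0                                      # known expected background counts
    rng = np.random.default_rng(11)
    n = rng.poisson(b, size=100_000)              # background-only pseudo-experiments

    mu_hat = np.clip(n - b, 0.0, None)            # MLE of the signal strength, mu >= 0
    q0 = 2 * (n * np.log((mu_hat + b) / b) - mu_hat)   # likelihood-ratio statistic

    # Tail probability vs the naive chi2(1) approximation: off by a factor ~2.
    print(np.mean(q0 > 2.71),                      # empirical P(q0 > 2.71)
          stats.chi2.sf(2.71, df=1),               # what Wilks' theorem would predict
          0.5 * stats.chi2.sf(2.71, df=1))         # the correct mixture prediction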

Read this paper on arXiv…

S. Algeri, J. Aalbers, K. Morå, et. al.
Tue, 26 Nov 19
18/66

Comments: Submitted to Nature Expert Recommendations

LRP2020: Astrostatistics in Canada [IMA]

http://arxiv.org/abs/1910.08857


(Abridged from Executive Summary) This white paper focuses on the interdisciplinary fields of astrostatistics and astroinformatics, in which modern statistical and computational methods are applied to and developed for astronomical data. Astrostatistics and astroinformatics have grown dramatically in the past ten years, with international organizations, societies, conferences, workshops, and summer schools becoming the norm. Canada’s formal role in astrostatistics and astroinformatics has been relatively limited, but there is a great opportunity and necessity for growth in this area. We conducted a survey of astronomers in Canada to gain information on the training mechanisms through which we learn statistical methods and to identify areas for improvement. In general, the results of our survey indicate that while astronomers see statistical methods as critically important for their research, they lack focused training in this area and wish they had received more formal training during all stages of education and professional development. These findings inform our recommendations for the LRP2020 on how to increase interdisciplinary connections between astronomy and statistics at the institutional, national, and international levels over the next ten years. We recommend specific, actionable ways to increase these connections, and discuss how interdisciplinary work can benefit not only research but also astronomy’s role in training Highly Qualified Personnel (HQP) in Canada.

Read this paper on arXiv…

G. Eadie, A. Bahramian, P. Barmby, et. al.
Tue, 22 Oct 19
21/91

Comments: White paper E017 submitted to the Canadian Long Range Plan LRP2020

LEO-Py: Estimating likelihoods for correlated, censored, and uncertain data with given marginal distributions [IMA]

http://arxiv.org/abs/1910.02958


Data with uncertain, missing, censored, and correlated values are commonplace in many research fields including astronomy. Unfortunately, such data are often treated in an ad hoc way in the astronomical literature potentially resulting in inconsistent parameter estimates. Furthermore, in a realistic setting, the variables of interest or their errors may have non-normal distributions which complicates the modeling. I present a novel approach to compute the likelihood function for such data sets. This approach employs Gaussian copulas to decouple the correlation structure of variables and their marginal distributions resulting in a flexible method to compute likelihood functions of data in the presence of measurement uncertainty, censoring, and missing data. I demonstrate its use by determining the slope and intrinsic scatter of the star forming sequence of nearby galaxies from observational data. The outlined algorithm is implemented as the flexible, easy-to-use, open-source Python package LEO-Py.
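
The Gaussian-copula construction at the heart of the approach can be sketched in a few lines of scipy; this illustrates the general technique, not LEO-Py's interface, and the marginal choices below are arbitrary.

    import numpy as np
    from scipy import stats

    def gaussian_copula_loglike(x1, x2, rho, marg1, marg2):
        # Joint log-likelihood of (x1, x2) with the given marginal
        # distributions coupled by a Gaussian copula with correlation rho.
        z1 = stats.norm.ppf(marg1.cdf(x1))
        z2 = stats.norm.ppf(marg2.cdf(x2))
        log_c = (-0.5 * np.log(1 - rho**2)
                 - (rho**2 * (z1**2 + z2**2) - 2 * rho * z1 * z2)
                 / (2 * (1 - rho**2)))
        return np.sum(log_c + marg1.logpdf(x1) + marg2.logpdf(x2))

    # Example: a log-normal SFR-like variable coupled to a normal mass-like one.
    rng = np.random.default_rng(3)
    z = rng.multivariate_normal([0, 0], [[1, 0.6], [0.6, 1]], 500)
    m1, m2 = stats.lognorm(s=0.8), stats.norm(10, 2)
    x1 = m1.ppf(stats.norm.cdf(z[:, 0]))
    x2 = m2.ppf(stats.norm.cdf(z[:, 1]))
    print(gaussian_copula_loglike(x1, x2, 0.6, m1, m2))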

Read this paper on arXiv…

R. Feldmann
Wed, 9 Oct 19
43/64

Comments: 21 pages, 8 figures, 2 tables, to appear in Astronomy and Computing, LEO-Py is available at github.com/rfeldmann/leopy

Realizing the potential of astrostatistics and astroinformatics [IMA]

http://arxiv.org/abs/1909.11714


This Astro2020 State of the Profession Consideration White Paper highlights the growth of astrostatistics and astroinformatics in astronomy, identifies key issues hampering the maturation of these new subfields, and makes recommendations for structural improvements at different levels that, if acted upon, will make significant positive impacts across astronomy.

Read this paper on arXiv…

G. Eadie, T. Loredo, A. Mahabal, et. al.
Fri, 27 Sep 19
20/64

Comments: 14 pages, 1 figure; submitted to the Decadal Survey on Astronomy and Astrophysics (Astro2020) on 10 July 2019; see this https URL

Exact joint likelihood of pseudo-$C_\ell$ estimates from correlated Gaussian cosmological fields [CEA]

http://arxiv.org/abs/1908.00795


We present the exact joint likelihood of pseudo-$C_\ell$ power spectrum estimates measured from an arbitrary number of Gaussian cosmological fields. Our method is applicable to both spin-0 fields and spin-2 fields, including a mixture of the two, and is relevant to Cosmic Microwave Background, weak lensing and galaxy clustering analyses. We show that Gaussian cosmological fields are mixed by a mask in a way that retains their Gaussianity, without making any assumptions about the mask geometry. We then show that each auto- or cross-pseudo-$C_\ell$ estimator can be written as a quadratic form, and apply the known joint distribution of quadratic forms to obtain the exact joint likelihood of a set of pseudo-$C_\ell$ estimates in the presence of an arbitrary mask. Considering the polarisation of the Cosmic Microwave Background as an example, we show using simulations that our likelihood recovers the full, exact multivariate distribution of $EE$, $BB$ and $EB$ pseudo-$C_\ell$ power spectra. Our method provides a route to robust cosmological constraints from future Cosmic Microwave Background and large-scale structure surveys in an era of ever-increasing statistical precision.

Read this paper on arXiv…

R. Upham, L. Whittaker and M. Brown
Mon, 5 Aug 19
13/53

Comments: 17 pages, 7 figures. Submitted to MNRAS

Bias and robustness of eccentricity estimates from radial velocity data [EPA]

http://arxiv.org/abs/1907.02048


Eccentricity is a parameter of particular interest as it is an informative indicator of the past of planetary systems. It is however not always clear whether the eccentricity fitted on radial velocity data is real or an artefact of inappropriate modelling. In this work, we address this question in two steps: we first assume that the model used for inference is correct and present interesting features of classical estimators. Secondly, we study whether the eccentricity estimates are to be trusted when the data contain incorrectly modelled signals, such as missed planetary companions, non-Gaussian noise, or correlated noise with unknown covariance. Our main conclusion is that data analysis via posterior distributions, with a model including a free error term, gives reliable results provided two conditions are met. First, convergence of the numerical methods needs to be ascertained. Secondly, the noise power spectrum should not have a particularly strong peak at the semi-period of the planet of interest. As a consequence, it is difficult to determine whether the signal of an apparently eccentric planet might be due to another inner companion in 2:1 mean motion resonance. We study the use of Bayes factors to disentangle these cases. Finally, we suggest methods to check if there are hints of an incorrect model in the residuals. We show the performance of our methods on simulated data and comment on the eccentricities of Proxima b and 55 Cnc f.
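
The "free error term" recommended here is commonly implemented as a jitter added in quadrature to the quoted uncertainties. A minimal sketch of such a likelihood, using a circular-orbit model for brevity and purely illustrative parameter values:

    import numpy as np

    def loglike_with_jitter(params, t, rv, rv_err):
        # Gaussian log-likelihood for a circular-orbit RV model with a free
        # "jitter" term s added in quadrature to the quoted uncertainties.
        K, P, phi, v0, log_s = params
        model = v0 + K * np.sin(2 * np.pi * t / P + phi)
        var = rv_err**2 + np.exp(2 * log_s)          # the free error term
        r = rv - model
        return -0.5 * np.sum(r**2 / var + np.log(2 * np.pi * var))

    # Synthetic check: data with extra scatter beyond the quoted error bars.
    rng = np.random.default_rng(5)
    t = np.sort(rng.uniform(0, 100, 60))
    rv_err = np.full(60, 1.0)
    rv = 5.0 * np.sin(2 * np.pi * t / 12.3) + rng.normal(0, 2.0, 60)
    params = [5.0, 12.3, 0.0, 0.0, np.log(np.sqrt(3.0))]
    print(loglike_with_jitter(params, t, rv, rv_err))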

Read this paper on arXiv…

N. Hara, G. Boué, J. Laskar, et. al.
Thu, 4 Jul 19
23/46

Comments: Accepted for publication in MNRAS

Discrete-time autoregressive model for unequally spaced time-series observations [IMA]

http://arxiv.org/abs/1906.11158


Most time-series models assume that the data come from observations that are equally spaced in time. However, this assumption does not hold in many diverse scientific fields, such as astronomy, finance, and climatology, among others. There are some techniques that fit unequally spaced time series, such as the continuous-time autoregressive moving average (CARMA) processes. These models are defined as the solution of a stochastic differential equation. It is not uncommon in astronomical time series that the time gaps between observations are large. Therefore, an alternative suitable approach to modeling astronomical time series with large gaps between observations should be based on the solution of a difference equation of a discrete process. In this work we propose a novel model to fit irregular time series, called the complex irregular autoregressive (CIAR) model, which is represented directly as a discrete-time process. We show that the model is weakly stationary and that it can be represented as a state-space system, allowing efficient maximum likelihood estimation based on the Kalman recursions. Furthermore, we show via Monte Carlo simulations that the finite-sample performance of the parameter estimation is accurate. The proposed methodology is applied to light curves from periodic variable stars, illustrating how the model can be implemented to detect poor adjustment of the harmonic model. This can occur when the period has not been accurately estimated or when the variable stars are multiperiodic. Finally, we show how the CIAR model, through its state-space representation, allows unobserved measurements to be forecast.

Read this paper on arXiv…

F. Elorrieta, S. Eyheramendy and W. Palma
Thu, 27 Jun 19
35/62

Comments: 12 pages, 8 figures, 1 table. Accepted for publication in Astronomy & Astrophysics

Introducing Bayesian Analysis with m&m's®: an active-learning exercise for undergraduates [CL]

http://arxiv.org/abs/1904.11006


We present an active-learning strategy for undergraduates that applies Bayesian analysis to candy-covered chocolate m&m's®. The exercise is best suited for small class sizes and tutorial settings, after students have been introduced to the concepts of Bayesian statistics. The exercise takes advantage of the non-uniform distribution of m&m's® colours, and the difference in distributions made at two different factories. In this paper, we provide the intended learning outcomes, lesson plan and step-by-step guide for instruction, and open-source teaching materials. We also suggest an extension to the exercise for the graduate level, which incorporates hierarchical Bayesian analysis.
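
The core computation of the exercise, a Bayes update over the two candidate factories given the colour counts in one bag, fits in a few lines. The colour proportions below are placeholders, not the true factory values:

    import numpy as np

    # Hypothetical per-factory colour proportions
    # (ordered: blue, orange, green, yellow, red, brown).
    p_A = np.array([0.25, 0.25, 0.125, 0.125, 0.125, 0.125])
    p_B = np.array([0.207, 0.205, 0.198, 0.135, 0.131, 0.124])

    counts = np.array([9, 6, 5, 2, 3, 3])   # colours tallied from one bag

    # Multinomial log-likelihoods (up to a shared constant), flat prior.
    logL_A = np.sum(counts * np.log(p_A))
    logL_B = np.sum(counts * np.log(p_B))
    post_A = 1.0 / (1.0 + np.exp(logL_B - logL_A))
    print(f"P(factory A | counts) = {post_A:.3f}")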

Read this paper on arXiv…

G. Eadie, D. Huppenkothen, A. Springford, et. al.
Fri, 26 Apr 19
2/69

Comments: Accepted to the Journal of Statistics Education (in press); 15 pages, 7 figures

A Preferential Attachment Model for the Stellar Initial Mass Function [IMA]

http://arxiv.org/abs/1904.11306


Accurate specification of a likelihood function is becoming increasingly difficult in many inference problems in astronomy. As sample sizes resulting from astronomical surveys continue to grow, deficiencies in the likelihood function lead to larger biases in key parameter estimates. These deficiencies result from the oversimplification of the physical processes that generated the data, and from the failure to account for observational limitations. Unfortunately, realistic models often do not yield an analytical form for the likelihood. The estimation of a stellar initial mass function (IMF) is an important example. The stellar IMF is the mass distribution of stars initially formed in a given cluster of stars, a population which is not directly observable due to stellar evolution and other disruptions and observational limitations of the cluster. There are several difficulties with specifying a likelihood in this setting since the physical processes and observational challenges result in measurable masses that cannot legitimately be considered independent draws from an IMF. This work improves inference of the IMF by using an approximate Bayesian computation approach that both accounts for observational and astrophysical effects and incorporates a physically-motivated model for star cluster formation. The methodology is illustrated via a simulation study, demonstrating that the proposed approach can recover the true posterior in realistic situations, and applied to observations from astrophysical simulation data.
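
A toy rejection-ABC loop in the spirit of this approach might look as follows; the single power-law IMF, hard completeness cut, and summary statistics are simplified stand-ins for the paper's physically motivated cluster-formation model.

    import numpy as np

    rng = np.random.default_rng(4)
    m_lo, m_hi = 0.5, 100.0

    def sample_imf(alpha, n):
        # Inverse-CDF draws from a single power law dN/dm ~ m^-alpha.
        u = rng.uniform(size=n)
        a = 1.0 - alpha
        return (m_lo**a + u * (m_hi**a - m_lo**a)) ** (1.0 / a)

    def observe(m):
        # Crude stand-in for observational effects: a completeness cut.
        return m[m > 1.0]

    def summary(m):
        logm = np.log10(m)
        return np.array([logm.mean(), logm.std()])

    # "Observed" cluster with a Salpeter-like true slope.
    s_obs = summary(observe(sample_imf(2.35, 3000)))

    # Rejection ABC: simulate under proposed slopes, keep the closest ones.
    proposals = rng.uniform(1.5, 3.5, 3000)
    dists = np.array([np.linalg.norm(summary(observe(sample_imf(a, 3000))) - s_obs)
                      for a in proposals])
    kept = proposals[np.argsort(dists)[:150]]        # accept the best 5%
    print(f"ABC posterior for alpha: {kept.mean():.2f} +/- {kept.std():.2f}")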

Read this paper on arXiv…

J. Cisewski-Kehe, G. Weller and C. Schafer
Fri, 26 Apr 19
38/69

Comments: N/A

TiK-means: $K$-means clustering for skewed groups [CL]

http://arxiv.org/abs/1904.09609


The $K$-means algorithm is extended to allow for partitioning of skewed groups. Our algorithm, called TiK-means, is a $K$-means-type algorithm that assigns observations to groups while estimating their skewness-transformation parameters. The resulting groups and transformations reveal general-structured clusters that can be explained by inverting the estimated transformation. Further, a modification of the jump statistic is used to choose the number of groups. Our algorithm is evaluated on simulated and real-life datasets and then applied to a long-standing astronomical dispute regarding the distinct kinds of gamma-ray bursts.

Read this paper on arXiv…

N. Berry and R. Maitra
Tue, 23 Apr 19
13/58

Comments: 15 pages, 6 figures, to appear in Statistical Analysis and Data Mining – The ASA Data Science Journal

Damping of Propagating Kink Waves in the Solar Corona [SSA]

http://arxiv.org/abs/1904.08834


Alfvénic waves have gained renewed interest since ubiquitous propagating kink waves were discovered in the corona. It has long been suggested that Alfvénic waves play an important role in coronal heating and the acceleration of the solar wind. To this effect, it is imperative to understand the mechanisms that enable their energy to be transferred to the plasma. Mode conversion via resonant absorption is believed to be one of the main mechanisms for kink wave damping, and is considered to play a key role in the process of energy transfer. This study examines the damping of propagating kink waves in quiescent coronal loops using the Coronal Multi-channel Polarimeter (CoMP). A coherence-based method is used to track the Doppler velocity signal of the waves, enabling us to investigate the spatial evolution of velocity perturbations. The power ratio of outward to inward propagating waves is used to estimate the associated damping lengths and quality factors. To enable accurate estimates of these quantities, we provide the first derivation of a likelihood function suitable for fitting models to the ratio of two power spectra obtained from discrete Fourier transforms. Maximum likelihood estimation is used to fit an exponential damping model to the observed variation in power ratio as a function of frequency. We confirm earlier indications that propagating kink waves are undergoing frequency-dependent damping. Additionally, we find that the rate of damping decreases, or equivalently the damping length increases, for longer coronal loops that reach higher in the corona.
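
The paper should be consulted for the exact likelihood, but the underlying distributional fact can be sketched: if each periodogram ordinate of a Gaussian signal is the true spectral density times an independent chi-squared(2)/2 variate, the ratio r of two independent ordinates is distributed as rho*F(2,2), where rho is the true power ratio, so p(r|rho) = rho/(rho+r)^2. A toy maximum-likelihood fit of an exponential damping model under that assumption:

    import numpy as np
    from scipy.optimize import minimize

    rng = np.random.default_rng(5)

    # Model the true outward/inward power ratio as rho(f) = a*exp(-b*f).
    f = np.linspace(0.05, 2.0, 200)
    rho_true = 2.0 * np.exp(-1.5 * f)
    r = rho_true * rng.f(2, 2, size=f.size)       # synthetic observed ratios

    def nll(params):
        a, b = params
        if a <= 0:
            return np.inf
        rho = a * np.exp(-b * f)
        # per-frequency density p(r|rho) = rho / (rho + r)^2
        return -np.sum(np.log(rho) - 2.0 * np.log(rho + r))

    fit = minimize(nll, x0=[1.0, 1.0], method="Nelder-Mead")
    print("MLE (a, b):", fit.x)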

Read this paper on arXiv…

A. Tiwari, R. Morton, S. Régnier, et. al.
Fri, 19 Apr 19
25/50

Comments: Accepted for publication in The Astrophysical Journal

Statistical discrimination of RFI and astronomical transients in 2-bit digitized time domain signals [IMA]

http://arxiv.org/abs/1903.00588


We investigate the performance of the generalized Spectral Kurtosis (SK) estimator in detecting and discriminating natural and artificial very short duration transients in 2-bit sampled time domain Very-Long-Baseline Interferometry (VLBI) data. We demonstrate that, while both types of transients may be efficiently detected, their natural or artificial nature cannot be distinguished if only a time domain SK analysis is performed. However, these two types of transients become distinguishable from each other in the spectral domain, after a 32-bit FFT operation is performed on the 2-bit time domain voltages. We discuss the implications of these findings for the ability of the Spectral Kurtosis estimator to automatically detect bright astronomical transient signals of interest, such as pulsars or fast radio bursts (FRBs), in VLBI data streams that have been severely contaminated by unwanted radio frequency interference.
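
For context, one commonly quoted form of the SK estimator from M accumulated power spectra is SK = ((M+1)/(M-1)) * (M*S2/S1^2 - 1), which averages to ~1 for Gaussian noise. A numpy sketch on synthetic data (the paper's 2-bit quantization and detection thresholds are omitted):

    import numpy as np

    rng = np.random.default_rng(6)

    def spectral_kurtosis(power, M):
        # SK estimator from M power spectra of shape (M, nchan); ~1 for
        # pure Gaussian noise, deviating in RFI/transient channels.
        S1 = power.sum(axis=0)
        S2 = (power**2).sum(axis=0)
        return ((M + 1.0) / (M - 1.0)) * (M * S2 / S1**2 - 1.0)

    M, nchan = 128, 64
    volts = rng.normal(size=(M, 2 * nchan))              # time-domain voltages
    power = np.abs(np.fft.rfft(volts, axis=1))[:, 1:nchan + 1] ** 2

    print("mean SK, Gaussian noise (expect ~1):",
          spectral_kurtosis(power, M).mean().round(3))

    # A persistent narrow-band tone (continuous-duty-cycle RFI) drives SK
    # below 1 in the affected channel.
    power[:, 10] += 50.0
    print("SK in the RFI channel:", spectral_kurtosis(power, M)[10].round(3))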

Read this paper on arXiv…

G. Nita, A. Keimpema and Z. Paragi
Tue, 5 Mar 19
55/73

Comments: 16 pages, 9 figures

Stress testing the dark energy equation of state imprint on supernova data [CEA]

http://arxiv.org/abs/1812.09786


This work determines the degree to which a standard Lambda-CDM analysis based on type Ia supernovae can identify deviations from a cosmological constant in the form of a redshift-dependent dark energy equation of state w(z). We introduce and apply a novel random curve generator to simulate instances of w(z) from constraint families with increasing distinction from a cosmological constant. After producing a series of mock catalogs of binned type Ia supernovae corresponding to each w(z) curve, we perform a standard Lambda-CDM analysis to estimate the corresponding posterior densities of the absolute magnitude of type Ia supernovae, the present-day matter density, and the equation of state parameter. Using the Kullback-Leibler divergence between posterior densities as a difference measure, we demonstrate that a standard type Ia supernova cosmology analysis has limited sensitivity to extensive redshift dependencies of the dark energy equation of state. In addition, we report that larger redshift-dependent departures from a cosmological constant do not necessarily manifest more easily detectable incompatibilities with the Lambda-CDM model. Our results suggest that physics beyond the standard model may simply be hidden in plain sight.
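
A histogram-based version of the difference measure used here, applied to synthetic posterior samples standing in for the MCMC output of two analyses:

    import numpy as np
    from scipy.stats import entropy

    rng = np.random.default_rng(7)

    # Posterior samples for, e.g., the matter density under the baseline
    # Lambda-CDM fit and under a fit to a mock with an evolving w(z).
    samples_lcdm = rng.normal(0.30, 0.020, 100000)
    samples_wz = rng.normal(0.31, 0.025, 100000)

    # Histogram both onto a common grid, then KL(p || q) in nats.
    bins = np.linspace(0.2, 0.4, 101)
    p, _ = np.histogram(samples_lcdm, bins=bins, density=True)
    q, _ = np.histogram(samples_wz, bins=bins, density=True)
    eps = 1e-12                          # avoid log(0) in empty bins
    print("KL divergence:", entropy(p + eps, q + eps))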

Read this paper on arXiv…

B. Moews, R. Souza, E. Ishida, et. al.
Thu, 27 Dec 18
37/80

Comments: 13 pages, 9 figures, preprint submitted to PRD

Astronomical observations: a guide for allied researchers [CL]

http://arxiv.org/abs/1812.07963


Observational astrophysics uses sophisticated technology to collect and measure electromagnetic and other radiation from beyond the Earth. Modern observatories produce large, complex datasets and extracting the maximum possible information from them requires the expertise of specialists in many fields beyond physics and astronomy, from civil engineers to statisticians and software engineers. This article introduces the essentials of professional astronomical observations to colleagues in allied fields, to provide context and relevant background for both facility construction and data analysis. It covers the path of electromagnetic radiation through telescopes, optics, detectors, and instruments, its transformation through processing into measurements and information, and the use of that information to improve our understanding of the physics of the cosmos and its history.

Read this paper on arXiv…

P. Barmby
Thu, 20 Dec 18
38/62

Comments: Review for non-astronomers; comments welcome

Measuring precise radial velocities and cross-correlation function line-profile variations using a Skew Normal density [EPA]

http://arxiv.org/abs/1811.12718


Stellar activity is one of the primary limitations to the detection of low-mass exoplanets using the radial-velocity (RV) technique. We propose to estimate the variations in shape of the CCF by fitting a Skew Normal (SN) density which, unlike the commonly employed Normal density, includes a skewness parameter to capture the asymmetry of the CCF induced by stellar activity and the convective blueshift. The performance of the proposed method is compared to that of the commonly employed Normal density using both simulations and real observations, with different levels of activity and signal-to-noise ratio. When considering real observations, the correlations between the RV and the asymmetry of the CCF and between the RV and the width of the CCF are stronger when using the parameters estimated with the SN density rather than those obtained with the Normal density. Using the proposed SN approach, the uncertainties estimated on the RV, defined as the median of the SN, are on average 10% smaller than the uncertainties calculated on the mean of the Normal. The uncertainties estimated on the asymmetry parameter of the SN are on average 15% smaller than the uncertainties measured on the Bisector Inverse Slope Span (BIS SPAN), the parameter commonly used to evaluate the asymmetry of the CCF. We also propose a new model to account for stellar activity when fitting a planetary signal to RV data. Based on simple simulations, we demonstrate that this new model improves planetary detection limits by 12% compared to the model commonly used to account for stellar activity. The SN density is thus a better model than the Normal density for characterizing the CCF, since the correlations used to probe stellar activity are stronger and the uncertainties of both the RV estimate and the CCF asymmetry are smaller.
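
A minimal sketch of the central fit: a skew-normal profile fitted to a synthetic asymmetric CCF with scipy. The use of the SN median as the RV proxy follows the abstract; the parametrization and all numbers are illustrative.

    import numpy as np
    from scipy import stats, optimize

    rng = np.random.default_rng(8)

    def ccf_model(v, cont, depth, loc, scale, a):
        # Continuum minus a skew-normal dip; 'a' is the skewness parameter.
        return cont - depth * scale * stats.skewnorm.pdf(v, a, loc, scale)

    # Synthetic, slightly asymmetric CCF on a velocity grid (km/s).
    v = np.linspace(-20, 20, 161)
    truth = (1.0, 0.3, 0.2, 4.0, -3.0)
    ccf = ccf_model(v, *truth) + rng.normal(0.0, 0.002, v.size)

    popt, _ = optimize.curve_fit(ccf_model, v, ccf, p0=[1, 0.2, 0, 3, 0])
    cont, depth, loc, scale, a = popt
    rv = stats.skewnorm.median(a, loc=loc, scale=scale)  # RV proxy: SN median
    print(f"fitted skewness a = {a:.2f}, RV = {rv:.3f} km/s")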

Read this paper on arXiv…

U. Simola, X. Dumusque and J. Cisewski-Kehe
Mon, 3 Dec 18
33/63

Comments: N/A

Labeling Bias in Galaxy Morphologies [GA]

http://arxiv.org/abs/1811.03577


We present a metric to quantify systematic labeling bias in galaxy morphology data sets stemming from the quality of the labeled data. This labeling bias is independent of labeling errors and requires knowledge about the intrinsic properties of the data with respect to the observed properties. We conduct a relative comparison of labeling bias for different low-redshift galaxy morphology data sets. We show that our metric recovers previous de-biasing procedures based on redshift as the biasing parameter. By using the image resolution instead, we find biases that have not previously been addressed. We find that morphologies based on supervised machine learning trained on features such as colors, shape, and concentration show significantly less bias than morphologies based on expert or citizen-science classifiers. This result holds even when there is underlying bias present in the training sets used in the supervised machine-learning process. We use catalog simulations to validate our bias metric, and show how to bin the multidimensional intrinsic and observed galaxy properties used in the bias quantification. Our approach is designed to work on any other labeled multidimensional data sets and the code is publicly available.

Read this paper on arXiv…

G. Cabrera-Vives, C. Miller and J. Schneider
Fri, 9 Nov 18
50/64

Comments: N/A

The cumulative mass profile of the Milky Way as determined by globular cluster kinematics from Gaia DR2 [GA]

http://arxiv.org/abs/1810.10036


We present new mass estimates and cumulative mass profiles (CMPs) with Bayesian credible regions for the Milky Way (MW) Galaxy, given the kinematic data of globular clusters as provided by (1) the Gaia DR2 collaboration and the HSTPROMO team, and (2) the new catalog in Vasiliev (2018). We use globular clusters beyond 15 kpc to estimate the CMP of the MW, assuming a total gravitational potential model $\Phi(r) = \Phi_{\circ}r^{-\gamma}$, which approximates an NFW-type potential at large distances when $\gamma=0.5$. We compare the resulting CMPs given data sets (1) and (2), and find the results to be nearly identical. The median estimate for the total mass is $M_{200}= 0.71 \times 10^{12} M_{\odot}$ and the 50% Bayesian credible region bounds are $(0.63, 0.81) \times 10^{12} M_{\odot}$. However, because the Vasiliev catalog contains more complete data at large $r$, the MW total mass is better constrained by these data. In this work, we also supply instructions for how to create a CMP for the MW with Bayesian credible regions, given a model for $M(<r)$ and samples drawn from a posterior distribution. With the CMP, we can report median estimates and 50% Bayesian credible regions for the MW mass within any distance (e.g. $M(r=25\text{ kpc})= 0.26~(0.24, 0.29)\times 10^{12} M_{\odot}$, $M(r=50\text{ kpc})= 0.37~(0.34, 0.41) \times 10^{12} M_{\odot}$, $M(r=100\text{ kpc}) = 0.53~(0.49, 0.58) \times10^{12} M_{\odot}$, etc.), making it easy to compare our results directly to other studies.
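
The recipe advertised here reduces to evaluating M(<r) for every posterior sample and taking quantiles at each radius. A sketch with hypothetical posterior samples, taking M(r) = gamma*Phi_o*r^(1-gamma)/G as the enclosed mass implied by this power-law potential:

    import numpy as np

    rng = np.random.default_rng(9)

    # Hypothetical posterior samples for (Phi_o, gamma); in a real analysis
    # these come straight from the MCMC chain.
    phi0 = rng.normal(4.3e5, 3.0e4, 4000)   # (km/s)^2 kpc^gamma, illustrative
    gamma = rng.normal(0.5, 0.04, 4000)

    G = 4.3e-6                              # kpc (km/s)^2 / Msun

    def M_enclosed(r, phi0, gamma):
        # Enclosed mass implied by Phi(r) = Phi_o * r^-gamma.
        return gamma * phi0 * r ** (1.0 - gamma) / G

    # Evaluate M(<r) per posterior sample, then take quantiles per radius.
    r_grid = np.linspace(5.0, 150.0, 59)    # kpc
    M = M_enclosed(r_grid[None, :], phi0[:, None], gamma[:, None])
    lo, med, hi = np.quantile(M, [0.25, 0.50, 0.75], axis=0)

    i = np.searchsorted(r_grid, 100.0)
    print(f"M(<100 kpc) = {med[i]:.2e} Msun, 50% CR ({lo[i]:.2e}, {hi[i]:.2e})")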

Read this paper on arXiv…

G. Eadie and M. Jurić
Thu, 25 Oct 18
28/65

Comments: submitted to ApJ, 11 pages, 8 figures

Bayesian cosmic density field inference from redshift space dark matter maps [CEA]

http://arxiv.org/abs/1810.05189


We present a self-consistent Bayesian formalism to sample the primordial density fields compatible with a set of dark matter density tracers after cosmic evolution observed in redshift space. Previous works on density reconstruction considered redshift space distortions as noise or included an additional iterative distortion correction step. We present here the analytic solution of coherent flows within a Hamiltonian Monte Carlo posterior sampling of the primordial density field. We test our method within the Zel'dovich approximation, presenting also an analytic solution including tidal fields and spherical collapse on small scales using augmented Lagrangian perturbation theory. Our resulting reconstructed fields are isotropic and their power spectra are unbiased relative to the true spectrum defined by our mock observations. Novel algorithmic implementations are introduced for the mass assignment kernels used when defining the dark matter density field and for the optimization of the time step in the Hamiltonian equations of motion. Our algorithm, dubbed barcode, promises to be especially suited for analysis of the dark matter cosmic web implied by the observed spatial distribution of galaxy clusters (such as obtained from X-ray, SZ or weak lensing surveys) as well as that of the intergalactic medium sampled by the Lyman alpha forest or perhaps even by deep hydrogen intensity mapping. In these cases, virialized motions are negligible, and the tracers cannot be modeled as point-like objects. It could be used in all of these contexts as a baryon acoustic oscillation reconstruction algorithm.
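
The sampling engine here is standard Hamiltonian Monte Carlo. In miniature, on a toy Gaussian target standing in for the high-dimensional posterior over primordial density fields:

    import numpy as np

    rng = np.random.default_rng(10)

    d = 10
    log_p = lambda x: -0.5 * float(x @ x)    # standard normal target
    grad_log_p = lambda x: -x

    def hmc_step(x, eps=0.1, n_leap=20):
        # One HMC step: leapfrog integration plus Metropolis accept/reject.
        p0 = rng.normal(size=d)
        xn, pn = x.copy(), p0.copy()
        pn += 0.5 * eps * grad_log_p(xn)                 # initial half kick
        for i in range(n_leap):
            xn += eps * pn                               # drift
            pn += (1.0 if i < n_leap - 1 else 0.5) * eps * grad_log_p(xn)
        dH = (log_p(x) - log_p(xn)) + 0.5 * (pn @ pn - p0 @ p0)
        return xn if np.log(rng.uniform()) < -dH else x

    x = np.zeros(d)
    draws = []
    for _ in range(2000):
        x = hmc_step(x)
        draws.append(x)
    print("per-dimension sample variance (expect ~1):",
          np.var(draws, axis=0).mean().round(2))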

Read this paper on arXiv…

E. Bos, F. Kitaura and R. Weygaert
Mon, 15 Oct 18
41/56

Comments: 33 pages, 20 figures, 1 table. Submitted to MNRAS. Accompanying code at this https URL

An irregular discrete time series model to identify residuals with autocorrelation in astronomical light curves [IMA]

http://arxiv.org/abs/1809.04131


Time series observations are ubiquitous in astronomy, and are generated to distinguish between different types of supernovae, to detect and characterize extrasolar planets and to classify variable stars. These time series are usually modeled using a parametric and/or physical model that assumes independent and homoscedastic errors, but in many cases these assumptions are not accurate and there remains a temporal dependency structure in the errors. This can occur, for example, when the proposed model cannot explain all the variability of the data or when the parameters of the model are not properly estimated. In this work we define an autoregressive model for irregular discrete-time series, based on the discrete-time representation of the continuous autoregressive model of order 1. We show that the model is ergodic and stationary. We further propose a maximum likelihood estimation procedure and assess the finite-sample performance by Monte Carlo simulations. We implement the model on real and simulated data from Gaussian as well as other distributions, showing that the model can flexibly adapt to different data distributions. We apply the irregular autoregressive model to the residuals of a transit of an extrasolar planet to illustrate errors that retain a temporal structure. We also apply this model to residuals of a harmonic fit of light curves from variable stars to illustrate how the model can be used to detect incorrect parameter estimation.
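
The model and its exact Gaussian likelihood are compact enough to sketch: each observation is the previous one damped by phi raised to the elapsed time, plus noise scaled so the process stays stationary. All numbers below are synthetic.

    import numpy as np
    from scipy.optimize import minimize_scalar

    rng = np.random.default_rng(11)

    # Irregular AR(1):
    # y_j = phi^dt_j * y_{j-1} + sigma * sqrt(1 - phi^(2*dt_j)) * eps_j
    t = np.sort(rng.uniform(0, 200, 300))    # irregular observation times
    phi_true, sigma = 0.9, 1.0
    y = np.empty_like(t)
    y[0] = rng.normal(0, sigma)
    for j in range(1, t.size):
        a = phi_true ** (t[j] - t[j - 1])
        y[j] = a * y[j - 1] + rng.normal(0, sigma * np.sqrt(1 - a**2))

    def neg_loglike(phi):
        a = phi ** np.diff(t)                # per-step autocorrelation
        mu = a * y[:-1]                      # one-step predictions
        var = sigma**2 * (1 - a**2)
        ll = -0.5 * np.sum((y[1:] - mu)**2 / var + np.log(2 * np.pi * var))
        ll += -0.5 * (y[0]**2 / sigma**2 + np.log(2 * np.pi * sigma**2))
        return -ll

    fit = minimize_scalar(neg_loglike, bounds=(0.01, 0.999), method="bounded")
    print(f"phi MLE = {fit.x:.3f} (true {phi_true})")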

Read this paper on arXiv…

S. Eyheramendy, F. Elorrieta and W. Palma
Thu, 13 Sep 18
65/68

Comments: 14 pages, 7 figures, 6 tables; Monthly Notices of the Royal Astronomical Society (MNRAS), in press

Robust distributed calibration of radio interferometers with direction dependent distortions [IMA]

http://arxiv.org/abs/1807.11738


In radio astronomy, accurate calibration is of crucial importance for the new generation of radio interferometers. More specifically, because of the potential presence of outliers which affect the measured data, robustness needs to be ensured. On the other hand, calibration is improved by taking advantage of these new instruments and exploiting the known structure of parameters of interest across frequency. Therefore, we propose in this paper an iterative robust multi-frequency calibration algorithm based on a distributed and consensus optimization scheme which aims to estimate the complex gains of the receivers and the directional perturbations caused by the ionosphere. Numerical simulations reveal that the proposed distributed calibration technique outperforms the conventional non-robust algorithm and per-channel calibration.

Read this paper on arXiv…

V. Ollier, M. Korso, A. Ferrari, et. al.
Wed, 1 Aug 18
40/65

Comments: N/A

The Efficiency of Geometric Samplers for Exoplanet Transit Timing Variation Models [IMA]

http://arxiv.org/abs/1807.11591


Transit timing variations (TTVs) are a valuable tool to determine the masses and orbits of transiting planets in multi-planet systems. TTVs can be readily modeled given knowledge of the interacting planets' orbital configurations and planet-star mass ratios, but such models are highly nonlinear and difficult to invert. Markov chain Monte Carlo (MCMC) methods are often used to explore the posterior distribution for model parameters, but, due to the high correlations between parameters, nonlinearity, and potential multi-modality in the posterior, many samplers perform very inefficiently. Therefore, we assess the performance of several MCMC samplers that use varying degrees of geometric information about the target distribution. We generate synthetic datasets from multiple models, including the TTVFaster model and a simple sinusoidal model, and test the efficiencies of various MCMC samplers. We find that sampling efficiency can be greatly improved for all models by sampling from a parameter space transformed using an estimate of the covariance and means of the target distribution. No single sampler performs best for all datasets, but several, such as Differential Evolution Monte Carlo and Geometric adaptive Monte Carlo, deliver consistently efficient performance. For datasets with near-Gaussian posteriors, Hamiltonian Monte Carlo samplers with 2 or 3 leapfrog steps obtained the highest efficiencies. Based on differences in effective sample size per unit time, we show that the right choice of sampler can improve sampling efficiency by several orders of magnitude.
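
The comparison metric here, effective sample size per unit time, can be estimated from a chain's integrated autocorrelation time. A sketch with an AR(1) chain standing in for sampler output:

    import numpy as np

    def ess(chain):
        # Effective sample size from the integrated autocorrelation time,
        # truncating the autocorrelation sum at its first non-positive lag.
        x = np.asarray(chain, float)
        x = x - x.mean()
        n = x.size
        f = np.fft.rfft(x, n=2 * n)              # FFT-based autocovariance
        acov = np.fft.irfft(f * np.conj(f))[:n]
        acf = acov / (np.arange(n, 0, -1) * x.var())
        cut = np.argmax(acf[1:] <= 0) + 1        # first non-positive lag
        tau = 1.0 + 2.0 * acf[1:cut].sum()       # integrated autocorr. time
        return n / tau

    # Stand-in for MCMC output: a strongly autocorrelated AR(1) chain.
    rng = np.random.default_rng(12)
    chain = np.zeros(50000)
    for i in range(1, chain.size):
        chain[i] = 0.95 * chain[i - 1] + rng.normal()

    # Divide the value below by each sampler's wall-clock run time to get
    # the ESS-per-time figure of merit used for the comparison.
    print(f"ESS = {ess(chain):.0f} out of {chain.size} draws")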

Read this paper on arXiv…

N. Tuchow, E. Ford, T. Papamarkou, et. al.
Wed, 1 Aug 18
44/65

Comments: N/A

Bayesian Calibration using Different Prior Distributions: an Iterative Maximum A Posteriori Approach for Radio Interferometers [CL]

http://arxiv.org/abs/1807.11382


In this paper, we aim to design robust estimation techniques based on the compound-Gaussian (CG) process, adapted for calibration of radio interferometers. The motivation behind this is the presence of outliers, which makes the traditional Gaussian noise assumption unrealistic. Consequently, to achieve robustness, we adopt a maximum a posteriori (MAP) approach which exploits Bayesian statistics and follows a sequential updating procedure. The proposed algorithm is applied in a multi-frequency scenario in order to enhance the estimation and correction of perturbation effects. Numerical simulations assess the performance of the proposed algorithm for different noise models (Student's t, K, Laplace, Cauchy and inverse-Gaussian compound-Gaussian distributions) with respect to the classical non-robust Gaussian noise assumption.

Read this paper on arXiv…

V. Ollier, M. Korso, A. Ferrari, et. al.
Tue, 31 Jul 18
30/69

Comments: N/A

Sparse Bayesian Imaging of Solar Flares [IMA]

http://arxiv.org/abs/1807.11287


We consider imaging of solar flares from NASA RHESSI data as a parametric imaging problem, where flares are represented as a finite collection of geometric shapes. We set up a Bayesian model in which the number of objects forming the image is a priori unknown, as are their shapes. We use a Sequential Monte Carlo algorithm to explore the corresponding posterior distribution. We apply the method to synthetic data and to experimental datasets well known in the RHESSI community. The method reconstructs improved images of solar flares, with the additional advantage of providing uncertainty quantification for the estimated parameters.

Read this paper on arXiv…

F. Sciacchitano, S. Lugaro and A. Sorrentino
Tue, 31 Jul 18
41/69

Comments: submitted

Robust Calibration of Radio Interferometers in Multi-Frequency Scenario [CL]

http://arxiv.org/abs/1807.11314


This paper investigates the calibration of sensor arrays in the radio astronomy context. Current and future radio telescopes require computationally efficient algorithms to overcome new technical challenges such as large collecting areas, wide fields of view and huge data volumes. Specifically, we study the calibration of radio interferometry stations with significant direction-dependent distortions. We propose an iterative robust calibration algorithm based on a relaxed maximum likelihood estimator for a specific context: i) observations are affected by the presence of outliers and ii) parameters of interest have a specific structure depending on frequency. Variation of parameters across frequency is addressed through a distributed procedure, consistent with the new radio synthesis arrays in which the full observing bandwidth is divided into multiple frequency channels. Numerical simulations reveal that the proposed robust distributed calibration estimator outperforms the conventional non-robust algorithm and/or the mono-frequency case.

Read this paper on arXiv…

V. Ollier, M. Korso, A. Ferrari, et. al.
Tue, 31 Jul 18
55/69

Comments: N/A

Near-infrared Mira Period-Luminosity Relations in M33 [SSA]

http://arxiv.org/abs/1807.03544


We analyze sparsely-sampled near-infrared (JHKs) light curves of a sample of 1781 Mira variable candidates in M33, originally discovered using I-band time-series observations. We extend our single-band semi-parametric Gaussian process modeling of Mira light curves to a multi-band version and obtain improved period determinations. We use our previous results on near-infrared properties of candidate Miras in the LMC to classify the majority of the M33 sample into Oxygen- or Carbon-rich subsets. We derive Period-Luminosity relations for O-rich Miras and determine a distance modulus for M33 of 24.80 ± 0.06 mag.

Read this paper on arXiv…

W. Yuan, L. Macri, A. Javadi, et. al.
Wed, 11 Jul 18
11/64

Comments: N/A

A case study of hurdle and generalized additive models in astronomy: the escape of ionizing radiation [IMA]

http://arxiv.org/abs/1805.07435


The dark ages of the Universe end with the formation of the first generation of stars residing in primeval galaxies. These objects were the first to produce ultraviolet ionizing photons in a period when the cosmic gas changed from a neutral state to an ionized one, known as the Epoch of Reionization (EoR). A pivotal aspect of comprehending the EoR is probing the intertwined relationship between the fraction of ionizing photons capable of escaping dark haloes, also known as the escape fraction ($f_{esc}$), and the physical properties of the galaxy. This work develops a statistical model suited to account for such non-linear relationships and the non-Gaussian nature of $f_{esc}$. This model simultaneously estimates the probability that a given primordial galaxy starts the ionizing photon production and the mean level of $f_{esc}$ once it is triggered. We show that the baryonic fraction and the rate of ionizing photons appear to have a larger impact on $f_{esc}$ than previously thought. A naive univariate analysis of the same problem would suggest smaller effects for these properties and a much larger impact for the specific star formation rate, which is lessened after accounting for other galaxy properties and non-linearities in the statistical model.
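
The two-part ("hurdle") structure can be sketched with linear components standing in for the paper's generalized additive (smooth) terms; the galaxy properties and coefficients below are entirely synthetic.

    import numpy as np
    from sklearn.linear_model import LinearRegression, LogisticRegression

    rng = np.random.default_rng(13)

    # Synthetic galaxy properties: baryon fraction, log ionizing-photon rate.
    n = 2000
    X = np.column_stack([rng.uniform(0.05, 0.2, n), rng.normal(50.0, 1.0, n)])

    # Hurdle data-generating process: a gate for "any escape at all", then
    # a positive escape fraction only when the gate is open.
    p_on = 1.0 / (1.0 + np.exp(-(-40.0 + 20.0 * X[:, 0] + 0.75 * X[:, 1])))
    on = rng.uniform(size=n) < p_on
    latent = rng.normal(-2.0 + 8.0 * X[:, 0], 0.5, n)
    fesc = np.where(on, 1.0 / (1.0 + np.exp(-latent)), 0.0)

    # Part 1: probability that escape is triggered at all.
    gate = LogisticRegression(max_iter=1000).fit(X, on)
    # Part 2: mean level of f_esc given it is non-zero (logit scale).
    pos = fesc > 0
    level = LinearRegression().fit(X[pos], np.log(fesc[pos] / (1 - fesc[pos])))

    # Plug-in prediction: P(triggered) times the back-transformed mean level.
    x_new = np.array([[0.15, 50.0]])
    f_hat = (gate.predict_proba(x_new)[0, 1]
             / (1.0 + np.exp(-level.predict(x_new)[0])))
    print(f"E[f_esc | x_new] ~ {f_hat:.3f}")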

Read this paper on arXiv…

M. Hattab, R. Souza, B. Ciardi, et. al.
Tue, 22 May 18
33/69

Comments: Comments are very welcome

Bayesian optimisation for likelihood-free cosmological inference [CEA]

http://arxiv.org/abs/1805.07152


Many cosmological models have only a finite number of parameters of interest, but a very expensive data-generating process and an intractable likelihood function. We address the problem of performing likelihood-free Bayesian inference from such black-box simulation-based models, under the constraint of a very limited simulation budget (typically a few thousand). To do so, we propose an approach based on the likelihood of an alternative parametric model. Conventional approaches to Approximate Bayesian computation such as likelihood-free rejection sampling are impractical for the considered problem, due to the lack of knowledge about how the parameters affect the discrepancy between observed and simulated data. As a response, our strategy combines Gaussian process regression of the discrepancy to build a surrogate surface with Bayesian optimisation to actively acquire training data. We derive and make use of an acquisition function tailored for the purpose of minimising the expected uncertainty in the approximate posterior density. The resulting algorithm (Bayesian optimisation for likelihood-free inference, BOLFI) is applied to the problems of summarising Gaussian signals and inferring cosmological parameters from the Joint Lightcurve Analysis supernovae data. We show that the number of required simulations is reduced by several orders of magnitude, and that the proposed acquisition function produces more accurate posterior approximations, as compared to common strategies.
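
A sketch of the surrogate-plus-acquisition loop, with a generic lower-confidence-bound rule in place of the tailored expected-uncertainty acquisition derived in the paper, and a trivial toy simulator:

    import numpy as np
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import RBF, WhiteKernel

    rng = np.random.default_rng(14)

    def discrepancy(theta):
        # Black-box simulator distance to the observed data (toy stand-in
        # for an expensive cosmological simulation).
        sim = rng.normal(theta, 1.0, 50)
        return (sim.mean() - 3.0) ** 2

    # Small initial design, then actively acquire points where the surrogate
    # is both promising (low predicted discrepancy) and uncertain.
    theta = list(rng.uniform(-10, 10, 5))
    d = [discrepancy(t) for t in theta]
    grid = np.linspace(-10, 10, 400)[:, None]

    gp = GaussianProcessRegressor(RBF(2.0) + WhiteKernel(0.1), normalize_y=True)
    for _ in range(20):
        gp.fit(np.array(theta)[:, None], d)
        mu, sd = gp.predict(grid, return_std=True)
        acq = mu - 2.0 * sd                      # lower confidence bound
        theta.append(float(grid[np.argmin(acq), 0]))
        d.append(discrepancy(theta[-1]))

    print(f"best theta found: {theta[int(np.argmin(d))]:.2f} (true 3)")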

Read this paper on arXiv…

F. Leclercq
Mon, 21 May 18
62/71

Comments: 15+9 pages, 12 figures