Application of Common Spatial Patterns in Gravitational Waves Detection [CL]

http://arxiv.org/abs/2201.04086


Common Spatial Patterns (CSP) is a feature extraction algorithm widely used in Brain-Computer Interface (BCI) systems for detecting Event-Related Potentials (ERPs) in multi-channel magneto/electroencephalography (MEG/EEG) time series data. In this article, we develop and apply a CSP algorithm to the problem of identifying whether a given epoch of multi-detector Gravitational Wave (GW) strains contains coalescences. Paired with signal processing techniques and a Logistic Regression classifier, we find that our pipeline correctly detects 76 out of 82 confident events from the Gravitational Wave Transient Catalog, using H1 and L1 strains, with a classification score of $93.72 \pm 0.04\%$ using $10 \times 5$ cross-validation. The false negatives were: GW170817-v3, GW191219 163120-v1, GW200115 042309-v2, GW200210 092254-v1, GW200220 061928-v1, and GW200322 091133-v1.
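The core of such a pipeline can be sketched compactly: CSP filters come from a generalized eigenproblem on the two class-covariance matrices, log-variance features are extracted from the filtered epochs, and a logistic regression separates the classes. A minimal toy sketch (assuming NumPy, SciPy, and scikit-learn; the synthetic two-channel "strain" data and the particular covariance normalization are illustrative, not the paper's actual pipeline):

```python
import numpy as np
from scipy.linalg import eigh
from sklearn.linear_model import LogisticRegression

def csp_filters(X1, X2, n_filters=2):
    """CSP spatial filters from two classes of epochs.
    X1, X2: arrays of shape (n_epochs, n_channels, n_samples)."""
    cov = lambda X: np.mean([e @ e.T / np.trace(e @ e.T) for e in X], axis=0)
    C1, C2 = cov(X1), cov(X2)
    # Generalized eigenproblem: C1 w = lambda (C1 + C2) w
    vals, vecs = eigh(C1, C1 + C2)
    order = np.argsort(vals)
    # Keep filters from both ends of the eigenvalue spectrum
    pick = np.r_[order[:n_filters // 2], order[-(n_filters // 2):]]
    return vecs[:, pick].T  # (n_filters, n_channels)

def csp_features(W, X):
    """Normalized log-variance features of spatially filtered epochs."""
    P = np.einsum('fc,ecs->efs', W, X)
    v = P.var(axis=2)
    return np.log(v / v.sum(axis=1, keepdims=True))

rng = np.random.default_rng(0)
# Toy two-channel epochs: class 1 has extra variance in channel 0
X0 = rng.normal(size=(40, 2, 256))
X1 = rng.normal(size=(40, 2, 256)); X1[:, 0, :] *= 3.0
W = csp_filters(X0, X1)
F = np.vstack([csp_features(W, X0), csp_features(W, X1)])
y = np.r_[np.zeros(40), np.ones(40)]
clf = LogisticRegression().fit(F, y)
print(clf.score(F, y))
```

The log-variance features make the two classes nearly linearly separable, so the classifier fits the toy data almost perfectly.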

Read this paper on arXiv…

D. Dahal
Wed, 12 Jan 22
66/89

Comments: N/A

SpectraNet: Learned Recognition of Artificial Satellites From High Contrast Spectroscopic Imagery [CL]

http://arxiv.org/abs/2201.03614


Effective space traffic management requires positive identification of artificial satellites. Current methods for extracting object identification from observed data require spatially resolved imagery, which limits identification to objects in low Earth orbits. Most artificial satellites, however, operate in geostationary orbits at distances which prohibit ground-based observatories from resolving spatial information. This paper demonstrates an object identification solution leveraging modified residual convolutional neural networks to map distance-invariant spectroscopic data to object identity. We report classification accuracies exceeding 80% for a simulated 64-class satellite problem, even in the case of satellites undergoing constant, random re-orientation. An astronomical observing campaign driven by these results returned accuracies of 72% for a nine-class problem with an average of 100 examples per class, performing as expected from simulation. We demonstrate the application of variational Bayesian inference by dropout, stochastic weight averaging (SWA), and SWA-focused deep ensembling to measure classification uncertainties, critical components in space traffic management where routine decisions risk expensive space assets and carry geopolitical consequences.
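Variational Bayesian inference by dropout (often called Monte Carlo dropout) can be illustrated in a few lines: keep dropout active at test time and read the spread of predictions over many stochastic forward passes as uncertainty. A toy NumPy sketch with made-up weights, not the paper's network:

```python
import numpy as np

rng = np.random.default_rng(1)

# Made-up "trained" 2-layer MLP weights standing in for a classifier head
W1, b1 = rng.normal(size=(16, 8)), np.zeros(8)
W2, b2 = rng.normal(size=(8, 3)), np.zeros(3)

def predict_with_dropout(x, p=0.5, T=200):
    """Monte Carlo dropout: sample T stochastic forward passes and
    return the mean class probabilities and their spread."""
    probs = []
    for _ in range(T):
        h = np.maximum(x @ W1 + b1, 0.0)
        mask = rng.random(h.shape) > p      # random dropout mask
        h = h * mask / (1.0 - p)            # inverted-dropout scaling
        z = h @ W2 + b2
        e = np.exp(z - z.max())             # stable softmax
        probs.append(e / e.sum())
    probs = np.array(probs)
    return probs.mean(axis=0), probs.std(axis=0)

x = rng.normal(size=16)
mean, std = predict_with_dropout(x)
print(mean, std)
```

The per-class standard deviation is the uncertainty signal that, in the paper's setting, feeds risk-aware space traffic decisions.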

Read this paper on arXiv…

J. Gazak, I. McQuaid, R. Swindle, et al.
Wed, 12 Jan 22
77/89

Comments: 8 pages, 8 figures, 5 tables. Published at WACV 2022

Systematic biases when using deep neural networks for annotating large catalogs of astronomical images [GA]

http://arxiv.org/abs/2201.03131


Deep convolutional neural networks (DCNNs) have become the most common solution for automatic image annotation due to their non-parametric nature, good performance, and accessibility through libraries such as TensorFlow. Among other fields, DCNNs are also a common approach to the annotation of large astronomical image databases acquired by digital sky surveys. One of the main downsides of DCNNs is the complex, non-intuitive rules that make them act as a “black box”, providing annotations in a manner that is unclear to the user, who is therefore often unable to tell what information the DCNN uses for classification. Here we demonstrate that the training of a DCNN is sensitive to the context of the training data, such as the location of the objects in the sky. We show that for basic classification of elliptical and spiral galaxies, the sky location of the galaxies used for training affects the behavior of the algorithm and leads to a small but consistent and statistically significant bias, which exhibits itself as cosmological-scale anisotropy in the distribution of basic galaxy morphology. Therefore, while DCNNs are powerful tools for annotating images of extended sources, the construction of training sets for galaxy morphology should take into consideration more aspects than the visual appearance of the object. Catalogs created with deep neural networks that exhibit signs of cosmological anisotropy should be interpreted with the possibility of such a consistent bias in mind.

Read this paper on arXiv…

S. Dhar and L. Shamir
Tue, 11 Jan 22
59/95

Comments: A&C, accepted

Unsupervised Machine Learning for Exploratory Data Analysis of Exoplanet Transmission Spectra [EPA]

http://arxiv.org/abs/2201.02696


Transit spectroscopy is a powerful tool to decode the chemical composition of the atmospheres of extrasolar planets. In this paper we focus on unsupervised techniques for analyzing spectral data from transiting exoplanets. We demonstrate methods for i) cleaning and validating the data, ii) initial exploratory data analysis based on summary statistics (estimates of location and variability), iii) exploring and quantifying the existing correlations in the data, iv) pre-processing and linearly transforming the data to its principal components, v) dimensionality reduction and manifold learning, vi) clustering and anomaly detection, vii) visualization and interpretation of the data. To illustrate the proposed unsupervised methodology, we use a well-known public benchmark data set of synthetic transit spectra. We show that there is a high degree of correlation in the spectral data, which calls for appropriate low-dimensional representations. We explore a number of different techniques for such dimensionality reduction and identify several suitable options in terms of summary statistics, principal components, etc. We uncover interesting structures in the principal component basis, namely, well-defined branches corresponding to different chemical regimes of the underlying atmospheres. We demonstrate that those branches can be successfully recovered with a K-means clustering algorithm in a fully unsupervised fashion. We advocate for a three-dimensional representation of the spectroscopic data in terms of the first three principal components, in order to reveal the existing structure in the data and quickly characterize the chemical class of a planet.
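Steps iv)-vi) of such a workflow can be sketched with standard tools: project the spectra onto a few principal components, then cluster in that low-dimensional space. A minimal sketch assuming scikit-learn, with synthetic two-regime "spectra" standing in for the benchmark data set:

```python
import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(0)
# Toy "transit spectra": two chemical regimes as noisy scaled templates
wl = np.linspace(0.5, 5.0, 100)
t1, t2 = np.sin(wl), np.exp(-wl)
spectra = np.vstack(
    [a * t1 + 0.02 * rng.normal(size=wl.size) for a in rng.uniform(1, 2, 60)]
    + [a * t2 + 0.02 * rng.normal(size=wl.size) for a in rng.uniform(1, 2, 60)]
)
labels_true = np.r_[np.zeros(60), np.ones(60)]

# iv)-vi): project onto the first three principal components, then cluster
pcs = PCA(n_components=3).fit_transform(spectra)
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(pcs)

# Unsupervised cluster labels match the regimes up to a label permutation
agree = max(np.mean(labels == labels_true), np.mean(labels != labels_true))
print(f"cluster/regime agreement: {agree:.2f}")
```

On the toy data the two regimes form separate branches in PC space, so K-means recovers them without labels, mirroring the paper's result on the benchmark spectra.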

Read this paper on arXiv…

K. Matchev, K. Matcheva and A. Roman
Tue, 11 Jan 22
76/95

Comments: 10 pages, 11 figures, submitted to MNRAS

The CAMELS project: public data release [CEA]

http://arxiv.org/abs/2201.01300


The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations: 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations that sample a vast volume in parameter space. In this paper we present the CAMELS public data release, describing the characteristics of the CAMELS simulations and a variety of data products generated from them, including halo, subhalo, galaxy, and void catalogues, power spectra, bispectra, Lyman-$\alpha$ spectra, probability distribution functions, halo radial profiles, and X-ray photon lists. We also release over one thousand catalogues that contain billions of galaxies from CAMELS-SAM: a large collection of N-body simulations that have been combined with the Santa Cruz Semi-Analytic Model. We release all the data, comprising more than 350 terabytes and containing 143,922 snapshots, millions of halos, galaxies and summary statistics. We provide further technical details on how to access, download, read, and process the data at \url{https://camels.readthedocs.io}.

Read this paper on arXiv…

F. Villaescusa-Navarro, S. Genel, D. Anglés-Alcázar, et al.
Thu, 6 Jan 22
8/56

Comments: 18 pages, 3 figures. More than 350 Tb of data from thousands of simulations publicly available at this https URL

Augmenting astrophysical scaling relations with machine learning : application to reducing the SZ flux-mass scatter [CEA]

http://arxiv.org/abs/2201.01305


Complex systems (stars, supernovae, galaxies, and clusters) often exhibit low scatter relations between observable properties (e.g., luminosity, velocity dispersion, oscillation period, temperature). These scaling relations can illuminate the underlying physics and can provide observational tools for estimating masses and distances. Machine learning can provide a systematic way to search for new scaling relations (or for simple extensions to existing relations) in abstract high-dimensional parameter spaces. We use a machine learning tool called symbolic regression (SR), which models the patterns in a given dataset in the form of analytic equations. We focus on the Sunyaev-Zeldovich flux$-$cluster mass relation ($Y_\mathrm{SZ}-M$), the scatter in which affects inference of cosmological parameters from cluster abundance data. Using SR on the data from the IllustrisTNG hydrodynamical simulation, we find a new proxy for cluster mass which combines $Y_\mathrm{SZ}$ and concentration of ionized gas ($c_\mathrm{gas}$): $M \propto Y_\mathrm{conc}^{3/5} \equiv Y_\mathrm{SZ}^{3/5} (1-A\, c_\mathrm{gas})$. $Y_\mathrm{conc}$ reduces the scatter in the predicted $M$ by $\sim 20-30$% for large clusters ($M\gtrsim 10^{14}\, h^{-1} \, M_\odot$) at both high and low redshifts, as compared to using just $Y_\mathrm{SZ}$. We show that the dependence on $c_\mathrm{gas}$ is linked to cores of clusters exhibiting larger scatter than their outskirts. Finally, we test $Y_\mathrm{conc}$ on clusters from simulations of the CAMELS project and show that $Y_\mathrm{conc}$ is robust against variations in cosmology, astrophysics, subgrid physics, and cosmic variance. Our results and methodology can be useful for accurate multiwavelength cluster mass estimation from current and upcoming CMB and X-ray surveys like ACT, SO, SPT, eROSITA and CMB-S4.
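The effect of the proposed proxy can be illustrated on mock data: if cluster mass depends on both $Y_\mathrm{SZ}$ and gas concentration, correcting by $(1 - A\, c_\mathrm{gas})$ shrinks the residual scatter. A NumPy sketch with an invented toy relation (the value of $A$ and the mock distributions are arbitrary, not the paper's fit):

```python
import numpy as np

rng = np.random.default_rng(0)
A, n = 0.5, 2000
logM = rng.uniform(14, 15, n)            # mock cluster masses, log10(M)
c_gas = rng.uniform(0.1, 0.9, n)         # mock gas concentration
# Build Y_SZ so that M = Y^(3/5) (1 - A c_gas) up to lognormal noise
noise = rng.normal(0, 0.05, n)
Y = (10**logM / (1 - A * c_gas) / np.exp(noise))**(5 / 3)

def scatter(pred):
    """Std of the log-mass residuals for a given mass proxy."""
    return (logM - np.log10(pred)).std()

s_plain = scatter(Y**0.6)                      # Y_SZ^(3/5) alone
s_conc = scatter(Y**0.6 * (1 - A * c_gas))     # Y_conc proxy
print(s_plain, s_conc)
```

By construction the concentration-corrected proxy removes the $c_\mathrm{gas}$-driven part of the scatter, leaving only the lognormal noise floor, which is the qualitative behaviour the paper reports for IllustrisTNG clusters.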

Read this paper on arXiv…

D. Wadekar, L. Thiele, F. Villaescusa-Navarro, et al.
Thu, 6 Jan 22
18/56

Comments: 11+6 pages, 8+3 figures. The code and data associated with this paper will be uploaded upon the acceptance of this paper

Quantifying Uncertainty in Deep Learning Approaches to Radio Galaxy Classification [CEA]

http://arxiv.org/abs/2201.01203


In this work we use variational inference to quantify the degree of uncertainty in deep learning model predictions of radio galaxy classification. We show that the level of model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore the model performance and uncertainty calibration for a variety of different weight priors and suggest that a sparse prior produces more well-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we show that we can prune 30% of the fully-connected layer weights without significant loss of performance by removing the weights with the lowest signal-to-noise ratio (SNR). We demonstrate that a larger degree of pruning can be achieved using a Fisher information based ranking, but we note that both pruning methods affect the uncertainty calibration for Fanaroff-Riley type I and type II radio galaxies differently. Finally we show that, like other work in this field, we experience a cold posterior effect, whereby the posterior must be down-weighted to achieve good predictive performance. We examine whether adapting the cost function to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that this improves upon the baseline but also does not compensate for the observed effect. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample leading to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.
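The SNR-based pruning step is straightforward to sketch: rank each weight by $|\mu|/\sigma$ from its posterior and zero out the lowest 30%. A NumPy sketch with made-up posterior moments:

```python
import numpy as np

rng = np.random.default_rng(0)
# Made-up posterior mean and std for each fully-connected weight
mu = rng.normal(0, 1, 1000)
sigma = rng.uniform(0.1, 2.0, 1000)

snr = np.abs(mu) / sigma                 # signal-to-noise ratio per weight
k = int(0.30 * mu.size)                  # prune the 30% lowest-SNR weights
keep = snr > np.sort(snr)[k - 1]
mu_pruned = np.where(keep, mu, 0.0)
print(keep.mean())                       # fraction of weights retained
```

In the paper this criterion removes 30% of the fully-connected weights without a significant performance loss; a Fisher-information ranking allows even heavier pruning.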

Read this paper on arXiv…

D. Mohan, A. Scaife, F. Porter, et al.
Wed, 5 Jan 22
43/54

Comments: submitted to MNRAS

Processing Images from Multiple IACTs in the TAIGA Experiment with Convolutional Neural Networks [IMA]

http://arxiv.org/abs/2112.15382


Extensive air showers created by high-energy particles interacting with the Earth's atmosphere can be detected using imaging atmospheric Cherenkov telescopes (IACTs). The IACT images can be analyzed to distinguish between events caused by gamma rays and by hadrons and to infer the parameters of the event, such as the energy of the primary particle. We use convolutional neural networks (CNNs) to analyze Monte Carlo-simulated images from the telescopes of the TAIGA experiment. The analysis includes selection of the images corresponding to showers caused by gamma rays and estimation of the energy of the gamma rays. We compare the performance of CNNs using images from a single telescope with that of CNNs using images from two telescopes as inputs.

Read this paper on arXiv…

S. Polyakov, A. Demichev, A. Kryukov, et al.
Mon, 3 Jan 22
14/49

Comments: In Proceedings of 5th International Workshop on Deep Learning in Computational Physics (DLCP2021), 28-29 June, 2021, Moscow, Russia

Digital Rock Typing DRT Algorithm Formulation with Optimal Supervised Semantic Segmentation [CL]

http://arxiv.org/abs/2112.15068


Each grid block in a 3D geological model requires a rock type that represents all physical and chemical properties of that block. The properties that classify rock types are lithology, permeability, and capillary pressure. Scientists and engineers have determined these properties using conventional laboratory measurements, which apply destructive methods to the sample or alter some of its properties (i.e., wettability, permeability, and porosity), because the measurement process involves sample crushing, fluid flow, or fluid saturation. Lately, Digital Rock Physics (DRP) has emerged to quantify these properties from micro-Computerized Tomography (uCT) and Magnetic Resonance Imaging (MRI) images. However, the literature has not attempted rock typing in a wholly digital context. We propose performing Digital Rock Typing (DRT) by: (1) integrating the latest DRP advances in a novel process that honors digital rock properties determination; (2) digitalizing the latest rock typing approaches in carbonate; and (3) introducing a novel carbonate rock typing process that utilizes computer vision capabilities to provide more insight into the heterogeneous carbonate rock texture.

Read this paper on arXiv…

O. Alfarisi, D. Ouzzane, M. Sassi, et al.
Mon, 3 Jan 22
30/49

Comments: N/A

Astronomical Image Colorization and upscaling with Generative Adversarial Networks [CL]

http://arxiv.org/abs/2112.13865


Automatic colorization of images without human intervention has been a subject of interest in the machine learning community for a brief period of time. Assigning color to an image is a highly ill-posed problem because of its innately high degrees of freedom; given an image, there is often no single correct color combination. Besides colorization, another problem in the reconstruction of images is Single Image Super Resolution, which aims at transforming low-resolution images to a higher resolution. This research aims to provide an automated approach to these problems by focusing on a very specific domain of images, namely astronomical images, and processing them using Generative Adversarial Networks (GANs). We explore the usage of various models in two different color spaces, RGB and Lab. We use transfer learning owing to a small data set, with a pre-trained ResNet-18 as a backbone, i.e. encoder for the U-net, and fine-tune it further. The model produces visually appealing images which hallucinate high-resolution, colorized data not present in the original image. We present our results by evaluating the GANs quantitatively using distance metrics such as L1 distance and L2 distance in each of the color spaces across all channels to provide a comparative analysis. We use Fréchet inception distance (FID) to compare the distribution of the generated images with the distribution of the real images to assess the model's performance.
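The per-channel L1/L2 evaluation can be sketched directly (FID requires an Inception embedding and is omitted here). A NumPy sketch on random stand-in images, not the paper's data:

```python
import numpy as np

def channel_distances(img_a, img_b):
    """Per-channel L1 and L2 distances between two images of shape (H, W, C)."""
    diff = img_a.astype(float) - img_b.astype(float)
    l1 = np.abs(diff).mean(axis=(0, 1))            # mean absolute error
    l2 = np.sqrt((diff**2).mean(axis=(0, 1)))      # root mean squared error
    return l1, l2

rng = np.random.default_rng(0)
real = rng.uniform(0, 1, (64, 64, 3))              # stand-in ground truth
fake = real + rng.normal(0, 0.05, real.shape)      # stand-in GAN output
l1, l2 = channel_distances(real, fake)
print(l1, l2)
```

The same function applies unchanged in RGB or Lab space, since both are three-channel arrays; only the channel interpretation differs.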

Read this paper on arXiv…

S. Kalvankar, H. Pandit, P. Parwate, et al.
Thu, 30 Dec 21
6/71

Comments: 14 pages, 10 figures, 7 tables

DeepAdversaries: Examining the Robustness of Deep Learning Models for Galaxy Morphology Classification [CL]

http://arxiv.org/abs/2112.14299


Data processing and analysis pipelines in cosmological survey experiments introduce data perturbations that can significantly degrade the performance of deep learning-based models. Given the increased adoption of supervised deep learning methods for processing and analysis of cosmological survey data, the assessment of data perturbation effects and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise as represented by higher levels of Poisson noise and 2) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that pixel-level processing errors easily flip the classification to an incorrect class and that higher observational noise makes a model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases the latent space distance between the baseline and the incorrectly classified one-pixel-perturbed image by a factor of ~2.3, making the model more robust to inadvertent perturbations.
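The two perturbation sources are easy to reproduce on toy data: Poisson resampling for observational noise and a single extreme pixel for the adversarial case. A NumPy sketch (the image and the pixel value are arbitrary stand-ins):

```python
import numpy as np

rng = np.random.default_rng(0)
img = rng.uniform(0, 1, (32, 32)) * 100    # toy "galaxy" image, in counts

def add_poisson_noise(img, scale=1.0):
    """Higher observational noise: resample each pixel as a Poisson count."""
    return rng.poisson(img * scale) / scale

def one_pixel_attack(img, value=255.0):
    """Data-processing error: set one random pixel to an extreme value."""
    out = img.copy()
    i, j = rng.integers(0, img.shape[0]), rng.integers(0, img.shape[1])
    out[i, j] = value
    return out

noisy = add_poisson_noise(img)
attacked = one_pixel_attack(img)
print(np.abs(noisy - img).mean(), (attacked != img).sum())
```

A robustness study in the paper's spirit would then compare a classifier's accuracy on `img`, `noisy`, and `attacked` inputs, with and without domain adaptation.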

Read this paper on arXiv…

A. Ćiprijanović, D. Kafkes, G. Snyder, et al.
Thu, 30 Dec 21
53/71

Comments: 19 pages, 7 figures, 5 tables, submitted to Astronomy & Computing

Constraining cosmological parameters from N-body simulations with Bayesian Neural Networks [CEA]

http://arxiv.org/abs/2112.11865


In this paper, we use the Quijote simulations to extract cosmological parameters with Bayesian Neural Networks (BNNs). This kind of model has a remarkable ability to estimate the associated uncertainty, which is one of the ultimate goals in the era of precision cosmology. We demonstrate the advantages of BNNs for extracting more complex output distributions and information on non-Gaussianities from the simulations.

Read this paper on arXiv…

H. Hortua
Thu, 23 Dec 21
8/63

Comments: Published at NeurIPS 2021 workshop: Bayesian Deep Learning

Analytical Modelling of Exoplanet Transit Spectroscopy with Dimensional Analysis and Symbolic Regression [EPA]

http://arxiv.org/abs/2112.11600


The physical characteristics and atmospheric chemical composition of newly discovered exoplanets are often inferred from their transit spectra which are obtained from complex numerical models of radiative transfer. Alternatively, simple analytical expressions provide insightful physical intuition into the relevant atmospheric processes. The deep learning revolution has opened the door for deriving such analytical results directly with a computer algorithm fitting to the data. As a proof of concept, we successfully demonstrate the use of symbolic regression on synthetic data for the transit radii of generic hot Jupiter exoplanets to derive a corresponding analytical formula. As a preprocessing step, we use dimensional analysis to identify the relevant dimensionless combinations of variables and reduce the number of independent inputs, which improves the performance of the symbolic regression. The dimensional analysis also allowed us to mathematically derive and properly parametrize the most general family of degeneracies among the input atmospheric parameters which affect the characterization of an exoplanet atmosphere through transit spectroscopy.
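The dimensional analysis preprocessing step can be sketched with linear algebra: dimensionless groups are nullspace vectors of the matrix of dimension exponents (Buckingham-pi style). A NumPy sketch with an illustrative variable set (scale height H, reference radius R0, reference pressure P0, surface gravity g), not necessarily the paper's exact inputs:

```python
import numpy as np

# Columns: variables H, R0, P0, g; rows: exponents of [length, mass, time]
# H ~ L, R0 ~ L, P0 ~ M L^-1 T^-2, g ~ L T^-2
dims = np.array([
    [1, 1, -1,  1],   # length
    [0, 0,  1,  0],   # mass
    [0, 0, -2, -2],   # time
])

# Dimensionless combinations span the nullspace of the dimension matrix
u, s, vt = np.linalg.svd(dims.astype(float))
null = vt[np.sum(s > 1e-10):].T    # nullspace basis, one column per pi-group
print(null.shape)                   # number of independent dimensionless groups
```

Here the nullspace is one-dimensional (spanned by the exponent vector of H/R0 up to scaling), so four dimensional inputs collapse to a single dimensionless input, exactly the kind of reduction that makes the downstream symbolic regression easier.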

Read this paper on arXiv…

K. Matchev, K. Matcheva and A. Roman
Thu, 23 Dec 21
36/63

Comments: Submitted to AAS Journals, 24 pages, 7 figures

The Preliminary Results on Analysis of TAIGA-IACT Images Using Convolutional Neural Networks [IMA]

http://arxiv.org/abs/2112.10168


The imaging Cherenkov telescopes of TAIGA-IACT, located in the Tunka valley of the Republic of Buryatia, accumulate large amounts of data in a short period of time, which must be analyzed efficiently and quickly. One method for such analysis is machine learning, which has proven effective in many technological and scientific fields in recent years. The aim of this work is to study the applicability of machine learning to the tasks set for TAIGA-IACT: identifying the primary particle of cosmic rays and reconstructing their physical parameters. We apply Convolutional Neural Networks (CNNs) to process and analyze Monte Carlo events simulated with CORSIKA, and consider various CNN architectures for the processing. We demonstrate that this method gives good results in determining the type of primary particle of an Extensive Air Shower (EAS) and in reconstructing gamma-ray energies. The results improve significantly in the case of stereoscopic observations.

Read this paper on arXiv…

E. Gres and A. Kryukov
Tue, 21 Dec 21
14/86

Comments: In Proceedings of 5th International Workshop on Deep Learning in Computational Physics (DLCP2021), 28-29 June, 2021, Moscow, Russia. 9 pages, 3 figures, 2 tables

Analysis of the HiSCORE Simulated Events in TAIGA Experiment Using Convolutional Neural Networks [IMA]

http://arxiv.org/abs/2112.10170


TAIGA is a hybrid observatory for gamma-ray astronomy at high energies, in the range from 10 TeV to several EeV. It consists of instruments such as TAIGA-IACT, TAIGA-HiSCORE, and others. TAIGA-HiSCORE, in particular, is an array of wide-angle timing Cherenkov light stations. TAIGA-HiSCORE data enable reconstruction of air shower characteristics, such as the air shower energy, arrival direction, and axis coordinates. In this report, we consider the use of convolutional neural networks for determining air shower characteristics. We use Convolutional Neural Networks (CNNs) to analyze HiSCORE events, treating them like images built from the times and amplitudes recorded at the HiSCORE stations. The work discusses a simple convolutional neural network and its training. In addition, we present some preliminary results on the determination of air shower parameters, such as the direction and position of the shower axis and the energy of the primary particle, and compare them with the results obtained by the traditional method.

Read this paper on arXiv…

A. Vlaskina and A. Kryukov
Tue, 21 Dec 21
71/86

Comments: In Proceedings of 5th International Workshop on Deep Learning in Computational Physics (DLCP2021), 28-29 June, 2021, Moscow, Russia. 8 pages, 5 figures, 1 table

Real-time Detection of Anomalies in Multivariate Time Series of Astronomical Data [CL]

http://arxiv.org/abs/2112.08415


Astronomical transients are stellar objects that become temporarily brighter on various timescales and have led to some of the most significant discoveries in cosmology and astronomy. Some of these transients are the explosive deaths of stars known as supernovae while others are rare, exotic, or entirely new kinds of exciting stellar explosions. New astronomical sky surveys are observing unprecedented numbers of multi-wavelength transients, making standard approaches of visually identifying new and interesting transients infeasible. To meet this demand, we present two novel methods that aim to quickly and automatically detect anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model.
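The "model the population, flag deviations" idea behind both methods can be sketched with a parametric fit and a chi-square-like score; this toy version uses a single-Gaussian light-curve model in place of the paper's TCN or Bayesian transient model (assuming NumPy and SciPy):

```python
import numpy as np
from scipy.optimize import curve_fit

rng = np.random.default_rng(0)
t = np.linspace(0, 10, 200)

def model(t, A, t0, tau):
    """Toy parametric transient: a Gaussian-shaped light curve."""
    return A * np.exp(-(t - t0)**2 / (2 * tau**2))

# "Known population": single-peaked transients; anomaly: a double-peaked event
normal = model(t, 1.0, 5.0, 1.0) + rng.normal(0, 0.02, t.size)
anomaly = (model(t, 1.0, 3.5, 0.7) + model(t, 0.8, 6.5, 0.7)
           + rng.normal(0, 0.02, t.size))

def anomaly_score(flux, sigma=0.02):
    """Fit the population model; large residuals flag an anomaly."""
    popt, _ = curve_fit(model, t, flux, p0=[1.0, 5.0, 1.0])
    return np.mean((flux - model(t, *popt))**2 / sigma**2)  # reduced chi-square

print(anomaly_score(normal), anomaly_score(anomaly))
```

The normal light curve scores near 1 (consistent with noise), while the double-peaked event yields a much larger score because no single-peak fit can absorb its structure.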

Read this paper on arXiv…

D. Muthukrishna, K. Mandel, M. Lochner, et al.
Fri, 17 Dec 21
43/72

Comments: 9 pages, 5 figures, Accepted at the NeurIPS 2021 workshop on Machine Learning and the Physical Sciences

Simultaneous Multivariate Forecast of Space Weather Indices using Deep Neural Network Ensembles [SSA]

http://arxiv.org/abs/2112.09051


Solar radio flux along with geomagnetic indices are important indicators of solar activity and its effects. Extreme solar events such as flares and geomagnetic storms can negatively affect the space environment, including satellites in low-Earth orbit. Therefore, forecasting these space weather indices is of great importance for space operations and science. In this study, we propose a model based on long short-term memory neural networks to learn the distribution of time series data, with the capability to provide a simultaneous multivariate 27-day forecast of the space weather indices using time series as well as solar image data. We show a 30-40% improvement in root mean-square error when including solar image data with time series data, compared to using time series data alone. Simple baselines such as persistence and running-average forecasts are also compared with the trained deep neural network models. We also quantify the uncertainty in our predictions using a model ensemble.
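The persistence and running-average baselines are simple to state exactly. A NumPy sketch on a synthetic 27-day-periodic "F10.7-like" series (the series is invented for illustration):

```python
import numpy as np

def persistence_forecast(series, horizon=27):
    """Repeat the last observed value across the forecast horizon."""
    return np.full(horizon, series[-1])

def running_average_forecast(series, horizon=27, window=27):
    """Repeat the mean of the trailing window across the horizon."""
    return np.full(horizon, series[-window:].mean())

rng = np.random.default_rng(0)
# Synthetic daily index with a 27-day solar-rotation-like cycle plus noise
f107 = 100 + 10 * np.sin(np.arange(365) / 27 * 2 * np.pi) + rng.normal(0, 2, 365)
train, truth = f107[:-27], f107[-27:]

p = persistence_forecast(train)
r = running_average_forecast(train)
rmse = lambda a, b: np.sqrt(np.mean((a - b)**2))
print(rmse(p, truth), rmse(r, truth))
```

Any learned forecaster only earns its keep by beating these two RMSE numbers on held-out data, which is why the paper reports them alongside the LSTM models.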

Read this paper on arXiv…

B. Benson, E. Brown, S. Bonasera, et al.
Fri, 17 Dec 21
53/72

Comments: Fourth Workshop on Machine Learning and the Physical Sciences (NeurIPS 2021)

Unrolling PALM for sparse semi-blind source separation [IMA]

http://arxiv.org/abs/2112.05694


Sparse Blind Source Separation (BSS) has become a well-established tool for a wide range of applications, for instance in astrophysics and remote sensing. Classical sparse BSS methods, such as the Proximal Alternating Linearized Minimization (PALM) algorithm, nevertheless often suffer from a difficult hyperparameter choice, which undermines their results. To bypass this pitfall, we propose in this work to build on the thriving field of algorithm unfolding/unrolling. Unrolling PALM makes it possible to leverage the data-driven knowledge stemming from realistic simulations or ground-truth data by learning both PALM hyperparameters and variables. In contrast to most existing unrolled algorithms, which assume a fixed known dictionary during the training and testing phases, this article further emphasizes the ability to deal with variable mixing matrices (a.k.a. dictionaries). The proposed Learned PALM (LPALM) algorithm thus enables semi-blind source separation, which is key to increasing the generalization of the learnt model in real-world applications. We illustrate the relevance of LPALM in astrophysical multispectral imaging: the algorithm not only needs up to $10^4-10^5$ times fewer iterations than PALM, but also improves the separation quality, while avoiding the cumbersome hyperparameter and initialization choices of PALM. We further show that LPALM outperforms other unrolled source separation methods in the semi-blind setting.
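For reference, one iteration of the classical PALM loop that LPALM unrolls alternates a soft-thresholding (L1-prox) step on the sources with a projected gradient step on the mixing matrix. A NumPy sketch on a toy two-source mixture (the step sizes and the unit-ball constraint on the columns of A are one common choice, not necessarily the paper's exact variant):

```python
import numpy as np

def soft_threshold(x, t):
    """Proximal operator of the L1 norm."""
    return np.sign(x) * np.maximum(np.abs(x) - t, 0.0)

def palm_sparse_bss(X, n_sources, lam=0.1, n_iter=500, seed=0):
    """Classical PALM for sparse BSS: X ≈ A S with S sparse and
    the columns of A constrained to the unit ball."""
    rng = np.random.default_rng(seed)
    m, t = X.shape
    A = rng.normal(size=(m, n_sources))
    A /= np.linalg.norm(A, axis=0)
    S = np.zeros((n_sources, t))
    for _ in range(n_iter):
        # Proximal gradient step in S (Lipschitz step size, then L1 prox)
        L_S = np.linalg.norm(A.T @ A, 2) + 1e-12
        S = soft_threshold(S - (A.T @ (A @ S - X)) / L_S, lam / L_S)
        # Proximal gradient step in A (projection onto unit-norm columns)
        L_A = np.linalg.norm(S @ S.T, 2) + 1e-12
        A = A - ((A @ S - X) @ S.T) / L_A
        A /= np.maximum(np.linalg.norm(A, axis=0), 1.0)
    return A, S

rng = np.random.default_rng(1)
S_true = np.where(rng.random((2, 300)) < 0.05, rng.normal(0, 5, (2, 300)), 0.0)
A_true = np.array([[1.0, 0.5], [0.3, 1.0]])
X = A_true @ S_true + 0.01 * rng.normal(size=(2, 300))
A, S = palm_sparse_bss(X, 2)
print(np.linalg.norm(X - A @ S) / np.linalg.norm(X))
```

The hyperparameter `lam` is exactly the kind of quantity that is painful to tune by hand and that LPALM learns from data, along with the step sizes, by unrolling this loop into a trainable network.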

Read this paper on arXiv…

M. Fahes, C. Kervazo, J. Bobin, et al.
Mon, 13 Dec 21
45/70

Comments: N/A

Neural Symplectic Integrator with Hamiltonian Inductive Bias for the Gravitational $N$-body Problem [CL]

http://arxiv.org/abs/2111.15631


The gravitational $N$-body problem, which is fundamentally important in astrophysics for predicting the motion of $N$ celestial bodies under their mutual gravity, is usually solved numerically because there is no known general analytical solution for $N>2$. Can an $N$-body problem be solved accurately by a neural network (NN)? Can an NN observe long-term conservation of energy and orbital angular momentum? Inspired by Wisdom & Holman (1991)’s symplectic map, we present a neural $N$-body integrator that splits the Hamiltonian into a two-body part, solvable analytically, and an interaction part that we approximate with an NN. Our neural symplectic $N$-body code integrates a general three-body system for $10^{5}$ steps without diverging from the ground-truth dynamics obtained from a traditional $N$-body integrator. Moreover, it exhibits good inductive bias by successfully predicting the evolution of $N$-body systems that are not part of the training set.
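The Wisdom-Holman-style splitting can be illustrated with a generic kick-drift-kick (leapfrog) integrator in which the acceleration routine is pluggable: in the paper's method the interaction part would be a trained network, while here an analytic harmonic force stands in so the energy behaviour is easy to check:

```python
import numpy as np

def leapfrog_step(q, p, dt, accel):
    """One kick-drift-kick step of a symplectic (leapfrog) integrator,
    assuming unit masses. `accel` is pluggable: in the neural variant it
    would be the NN approximating the interaction Hamiltonian's force."""
    p = p + 0.5 * dt * accel(q)   # half kick
    q = q + dt * p                # drift
    p = p + 0.5 * dt * accel(q)   # half kick
    return q, p

# Harmonic-oscillator stand-in: a symplectic scheme keeps the energy
# error bounded over very many steps instead of drifting secularly.
accel = lambda q: -q
q, p = np.array([1.0]), np.array([0.0])
energy0 = 0.5 * p[0] ** 2 + 0.5 * q[0] ** 2
for _ in range(10_000):
    q, p = leapfrog_step(q, p, 0.01, accel)
energy = 0.5 * p[0] ** 2 + 0.5 * q[0] ** 2
```

The bounded energy error is the "Hamiltonian inductive bias" of the title: it comes from the integrator's structure, not from the accuracy of the force model.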

Read this paper on arXiv…

M. Cai, S. Zwart and D. Podareanu
Fri, 10 Dec 21
37/94

Comments: 7 pages, 2 figures, accepted for publication at the NeurIPS 2021 workshop “Machine Learning and the Physical Sciences”

COSMIC: fast closed-form identification from large-scale data for LTV systems [CL]

http://arxiv.org/abs/2112.04355


We introduce a closed-form method for the identification of discrete-time linear time-variant systems from data, formulating the learning problem as a regularized least squares problem whose regularizer favors smooth solutions within a trajectory. We develop a closed-form algorithm with guarantees of optimality and with a complexity that increases linearly with the number of instants considered per trajectory. The COSMIC algorithm achieves the desired result even in the presence of large volumes of data. Our method solved the problem using two orders of magnitude less computational power than a general-purpose convex solver and was about 3 times faster than a specially designed Stochastic Block Coordinate Descent method. Computational times of our method remained on the order of seconds even for 10k and 100k time instants, where the general-purpose solver crashed. To prove its applicability to real-world systems, we test it on a spring-mass-damper system and use the estimated model to find the optimal control path. Our algorithm was applied to both Low Fidelity and Functional Engineering Simulators for the Comet Interceptor mission, which requires precise pointing of the on-board cameras in a fast-dynamics environment. Thus, this paper provides a fast alternative to classical system identification techniques for linear time-variant systems, while proving to be a solid base for applications in the space industry and a step forward in the incorporation of algorithms that leverage data in such a safety-critical environment.
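The flavour of the estimator, least squares with a smoothness regularizer admitting a closed-form solution via the normal equations, can be sketched as follows. The first-difference regularizer and toy dimensions are illustrative, not the paper's exact formulation:

```python
import numpy as np

def smooth_lsq(A, b, lam):
    """Closed-form solution of  min_x ||Ax - b||^2 + lam * ||Dx||^2,
    where D penalizes differences between consecutive entries of x
    ('smooth solutions within a trajectory', in spirit)."""
    n = A.shape[1]
    D = np.diff(np.eye(n), axis=0)   # first-difference operator, (n-1) x n
    return np.linalg.solve(A.T @ A + lam * D.T @ D, A.T @ b)

rng = np.random.default_rng(1)
A = rng.standard_normal((50, 10))
x_true = np.cumsum(0.1 * rng.standard_normal(10))   # slowly varying target
b = A @ x_true + 0.01 * rng.standard_normal(50)
x_hat = smooth_lsq(A, b, lam=1.0)
```

Because the regularized normal matrix is banded when unknowns are ordered in time, such systems can be solved with cost linear in the number of time instants, which is the property the abstract emphasizes.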

Read this paper on arXiv…

M. Carvalho, C. Soares, P. Lourenço, et al.
Thu, 9 Dec 21
12/63

Comments: N/A

Automation Of Transiting Exoplanet Detection, Identification and Habitability Assessment Using Machine Learning Approaches [EPA]

http://arxiv.org/abs/2112.03298


We are at a unique timeline in the history of human evolution where we may be able to discover earth-like planets around stars outside our solar system where conditions can support life, or even find evidence of life on those planets. With the launch of several satellites in recent years by NASA, ESA, and other major space agencies, ample datasets are at our disposal that can be used to train machine learning models to automate the arduous tasks of exoplanet detection, identification, and habitability determination. Automating these tasks can save a considerable amount of time and minimize human errors due to manual intervention. To achieve this aim, we first analyze light intensity curves from stars captured by the Kepler telescope to detect the potential curves that exhibit the characteristics of a possible planetary system. For this detection, along with training conventional models, we propose a stacked GBDT model that can be trained on multiple representations of the light signals simultaneously. Subsequently, we address the automation of exoplanet identification and habitability determination by leveraging several state-of-the-art machine learning and ensemble approaches. The identification of exoplanets aims to distinguish false positive instances from actual instances of exoplanets, whereas the habitability assessment groups the exoplanet instances into different clusters based on their habitable characteristics. Additionally, we propose a new metric called the Adequate Thermal Adequacy (ATA) score to establish a potential linear relationship between habitable and non-habitable instances. Experimental results suggest that the proposed stacked GBDT model outperformed the conventional models in detecting transiting exoplanets. Furthermore, the incorporation of ATA scores in habitability classification enhanced the performance of models.

Read this paper on arXiv…

P. Pratyush and A. Gangrade
Wed, 8 Dec 21
29/77

Comments: 19 pages, 21 figures

Quantum Machine Learning for Radio Astronomy [CL]

http://arxiv.org/abs/2112.02655


In this work we introduce a novel approach to the pulsar classification problem in time-domain radio astronomy using a Born machine, often referred to as a “quantum neural network”. Using a single-qubit architecture, we show that the pulsar classification problem maps well to the Bloch sphere and that accuracies comparable to those of more classical machine learning approaches are achievable. We introduce a novel single-qubit encoding for the pulsar data used in this work and show that it performs comparably to a multi-qubit QAOA encoding.
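A minimal single-qubit Born-machine readout needs only 2x2 linear algebra: angle-encode features as a point on the Bloch sphere, apply a trainable rotation, and read out the Born probability of measuring |0>. The encoding and rotation below are generic stand-ins, not the paper's specific encoding:

```python
import numpy as np

def ry(w):
    """Single-qubit rotation about the y axis (trainable gate)."""
    c, s = np.cos(w / 2), np.sin(w / 2)
    return np.array([[c, -s], [s, c]], dtype=complex)

def encode(theta, phi):
    """Angle-encode two features as a pure state on the Bloch sphere."""
    return np.array([np.cos(theta / 2),
                     np.exp(1j * phi) * np.sin(theta / 2)])

def predict_proba(theta, phi, w):
    """Born-rule readout: probability of measuring |0> after the
    trainable rotation; thresholding this gives a binary classifier."""
    amp = ry(w) @ encode(theta, phi)
    return float(np.abs(amp[0]) ** 2)

p_north = predict_proba(0.0, 0.0, 0.0)    # state at |0>: P(|0>) = 1
p_south = predict_proba(np.pi, 0.0, 0.0)  # state at |1>: P(|0>) = 0
```

Training then amounts to adjusting `w` (and the encoding angles' scaling) so the two pulsar classes land near opposite poles of the sphere.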

Read this paper on arXiv…

M. Kordzanganeh, A. Utting and A. Scaife
Tue, 7 Dec 21
36/91

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version

Identifying mass composition of ultra-high-energy cosmic rays using deep learning [IMA]

http://arxiv.org/abs/2112.02072


We introduce a novel method for identifying the mass composition of ultra-high-energy cosmic rays using deep learning. The key idea of the method is to use a chain of two neural networks. The first network predicts the type of primary particle for individual events, while the second infers the mass composition of an ensemble of events. We apply this method to Monte-Carlo data for the Telescope Array Surface Detector readings, on which it yields an unprecedentedly low error of 7% for a 4-component approximation. The statistical error is shown to be smaller than the systematic one related to the choice of the hadronic interaction model used for the simulations.
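The second stage of such a chain can be illustrated with a simplified stand-in: if the first network's confusion matrix is known, the ensemble composition follows from unfolding the observed fractions of predicted labels. The paper uses a second neural network for this step; the two-class confusion matrix below is invented for illustration:

```python
import numpy as np

def infer_composition(pred_fractions, C):
    """Unfold observed predicted-label fractions into true-class fractions,
    given confusion matrix C with C[i, j] = P(predicted j | true class i)."""
    frac, *_ = np.linalg.lstsq(C.T, pred_fractions, rcond=None)
    frac = np.clip(frac, 0.0, None)      # fractions must be non-negative
    return frac / frac.sum()             # ... and sum to one

C = np.array([[0.8, 0.2],                # assumed per-event confusion matrix
              [0.3, 0.7]])
true_mix = np.array([0.6, 0.4])          # e.g. proton / iron fractions
observed = true_mix @ C                  # fractions of predicted labels
est = infer_composition(observed, C)
```

A learned second network can improve on this linear unfolding when per-event errors are correlated with event properties, which is the motivation for chaining two networks rather than unfolding analytically.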

Read this paper on arXiv…

O. Kalashev, I. Kharuk, M. Kuznetsov, et al.
Mon, 6 Dec 21
33/61

Comments: 18 pages, 5 figures

How to quantify fields or textures? A guide to the scattering transform [IMA]

http://arxiv.org/abs/2112.01288


Extracting information from stochastic fields or textures is a ubiquitous task in science, from exploratory data analysis to classification and parameter estimation. From physics to biology, it tends to be done either through a power spectrum analysis, which is often too limited, or the use of convolutional neural networks (CNNs), which require large training sets and lack interpretability. In this paper, we advocate for the use of the scattering transform (Mallat 2012), a powerful statistic which borrows mathematical ideas from CNNs but does not require any training, and is interpretable. We show that it provides a relatively compact set of summary statistics with visual interpretation and which carries most of the relevant information in a wide range of scientific applications. We present a non-technical introduction to this estimator and we argue that it can benefit data analysis, comparison to models and parameter inference in many fields of science. Interestingly, understanding the core operations of the scattering transform allows one to decipher many key aspects of the inner workings of CNNs.
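First-order scattering coefficients, averages of wavelet-filtered moduli, can be sketched in a few lines. The crude Gaussian band-pass bank below stands in for the proper Morlet wavelets of a real scattering implementation:

```python
import numpy as np

def bandpass_bank(n, scales):
    """Crude bank of Gaussian band-pass filters in Fourier space,
    a stand-in for a proper Morlet wavelet bank."""
    freqs = np.fft.fftfreq(n)
    return [np.exp(-((np.abs(freqs) - 1.0 / s) ** 2) * (s ** 2) * 8.0)
            for s in scales]

def scattering_first_order(x, scales):
    """S1[j] = mean(|x * psi_j|): a translation-invariant summary that,
    unlike the power spectrum, is stable to deformations."""
    Xf = np.fft.fft(x)
    return np.array([np.abs(np.fft.ifft(Xf * h)).mean()
                     for h in bandpass_bank(len(x), scales)])

rng = np.random.default_rng(2)
x = rng.standard_normal(256)
s1 = scattering_first_order(x, scales=[4, 8, 16, 32])
s1_shifted = scattering_first_order(np.roll(x, 17), scales=[4, 8, 16, 32])
```

The second-order coefficients of the full transform repeat the same filter-modulus-average pattern on each first-order modulus, which is what captures the interactions between scales that a power spectrum misses.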

Read this paper on arXiv…

S. Cheng and B. Ménard
Fri, 3 Dec 21
12/81

Comments: 18 pages, 16 figures

PGNets: Planet mass prediction using convolutional neural networks for radio continuum observations of protoplanetary disks [EPA]

http://arxiv.org/abs/2111.15196


We developed Convolutional Neural Networks (CNNs) to rapidly and directly infer the planet mass from radio dust continuum images. Substructures induced by young planets in protoplanetary disks can be used to infer the potential young planets’ properties. Hydrodynamical simulations have been used to study the relationships between the planet’s properties and these disk features. However, these attempts either fine-tuned numerical simulations to fit one protoplanetary disk at a time, which was time-consuming, or azimuthally averaged simulation results to derive some linear relationships between the gap width/depth and the planet mass, which lost information on asymmetric features in disks. To cope with these disadvantages, we developed Planet Gap neural Networks (PGNets) to infer the planet mass from 2D images. We first fit the gridded data in Zhang et al. (2018) as a classification problem. Then, we quadrupled the data set by running additional simulations with near-randomly sampled parameters, and derived the planet mass and disk viscosity together as a regression problem. The classification approach can reach an accuracy of 92%, whereas the regression approach can reach $1\sigma$ uncertainties of 0.16 dex for planet mass and 0.23 dex for disk viscosity. We can reproduce the degeneracy scaling $\alpha \propto M_p^3$ found in the linear fitting method, which means that the CNN method can even be used to find degeneracy relationships. The gradient-weighted class activation mapping effectively confirms that PGNets use proper disk features to constrain the planet mass. We provide programs for PGNets and the traditional fitting method from Zhang et al. (2018), and discuss each method’s advantages and disadvantages.

Read this paper on arXiv…

S. Zhang, Z. Zhu and M. Kang
Wed, 1 Dec 21
36/110

Comments: 12 pages, 7 figures, accepted to MNRAS

Weighing the Milky Way and Andromeda with Artificial Intelligence [GA]

http://arxiv.org/abs/2111.14874


We present new constraints on the masses of the halos hosting the Milky Way and Andromeda galaxies derived using graph neural networks. Our models, trained on thousands of state-of-the-art hydrodynamic simulations of the CAMELS project, only make use of the positions, velocities and stellar masses of the galaxies belonging to the halos, and are able to perform likelihood-free inference on halo masses while accounting for both cosmological and astrophysical uncertainties. Our constraints are in agreement with estimates from other traditional methods.

Read this paper on arXiv…

P. Villanueva-Domingo, F. Villaescusa-Navarro, S. Genel, et al.
Wed, 1 Dec 21
107/110

Comments: 2 figures, 2 tables, 7 pages. Code publicly available at this https URL

A Ubiquitous Unifying Degeneracy in 2-body Microlensing Systems [EPA]

http://arxiv.org/abs/2111.13696


While gravitational microlensing by planetary systems can provide unique vistas on the properties of exoplanets, observations of such 2-body microlensing events can often be explained with multiple distinct physical configurations, so-called model degeneracies. An understanding of the intrinsic and exogenous origins of different classes of degeneracy provides a foundation for phenomenological interpretation. Here, leveraging a fast machine-learning-based inference framework, we present the discovery of a new regime of degeneracy, the offset degeneracy, which unifies the previously known close-wide and inner-outer degeneracies, generalises to resonant caustics, and, upon reanalysis, is ubiquitous in previously published planetary events with 2-fold degenerate solutions. Importantly, our discovery suggests that the commonly reported close-wide degeneracy essentially never arises in actual events and should instead be more suitably viewed as a transition point of the offset degeneracy. While previous studies of microlensing degeneracies are largely studies of degenerate caustics, our discovery demonstrates that degenerate caustics do not necessarily result in degenerate events, for which it is more relevant to study magnifications at the location of the source. This discovery fundamentally changes the way in which degeneracies in planetary microlensing events should be interpreted, suggests a deeper symmetry in the mathematics of 2-body lenses than has previously been recognised, and will increasingly manifest itself in data from new generations of microlensing surveys.

Read this paper on arXiv…

K. Zhang, B. Gaudi and J. Bloom
Tue, 30 Nov 21
30/105

Comments: 21 pages, 8 figures, submitted

Group equivariant neural posterior estimation [CL]

http://arxiv.org/abs/2111.13139


Simulation-based inference with conditional neural density estimators is a powerful approach to solving inverse problems in science. However, these methods typically treat the underlying forward model as a black box, with no way to exploit geometric properties such as equivariances. Equivariances are common in scientific models; however, integrating them directly into expressive inference networks (such as normalizing flows) is not straightforward. We here describe an alternative method to incorporate equivariances under joint transformations of parameters and data. Our method, called group equivariant neural posterior estimation (GNPE), is based on self-consistently standardizing the “pose” of the data while estimating the posterior over parameters. It is architecture-independent, and applies to both exact and approximate equivariances. As a real-world application, we use GNPE for amortized inference of astrophysical binary black hole systems from gravitational-wave observations. We show that GNPE achieves state-of-the-art accuracy while reducing inference times by three orders of magnitude.

Read this paper on arXiv…

M. Dax, S. Green, J. Gair, et al.
Mon, 29 Nov 21
14/94

Comments: 13+11 pages, 5+8 figures

Asteroid Flyby Cycler Trajectory Design Using Deep Neural Networks [IMA]

http://arxiv.org/abs/2111.11858


Asteroid exploration has been attracting more attention in recent years. Nevertheless, we have visited only tens of asteroids, while we have discovered more than one million bodies. As our current observations and knowledge are likely biased, it is essential to explore multiple asteroids directly to better understand the remains of planetary building materials. One of the mission design solutions is utilizing asteroid flyby cycler trajectories with multiple Earth gravity assists. An asteroid flyby cycler trajectory design problem is a subclass of global trajectory optimization problems with multiple flybys, involving a trajectory optimization problem for a given flyby sequence and a combinatorial optimization problem to decide the sequence of the flybys. As the number of flyby bodies grows, the computation time of this optimization problem grows rapidly. This paper presents a new method to design asteroid flyby cycler trajectories utilizing a surrogate model constructed by deep neural networks that approximate trajectory optimization results. Since one of the bottlenecks of machine learning approaches is generating massive trajectory databases, we propose an efficient database generation strategy by introducing pseudo-asteroids satisfying the Karush-Kuhn-Tucker conditions. The numerical result applied to JAXA’s DESTINY+ mission shows that the proposed method can significantly reduce the computational time for searching asteroid flyby sequences.

Read this paper on arXiv…

N. Ozaki, K. Yanagida, T. Chikazawa, et al.
Wed, 24 Nov 21
22/61

Comments: N/A

Machine Learning for Mars Exploration [EPA]

http://arxiv.org/abs/2111.11537


Risk to human astronauts and the slow, limited communication imposed by interplanetary distances drive scientists to pursue an autonomous approach to exploring distant planets such as Mars. A portion of the exploration of Mars has been conducted through the autonomous collection and analysis of Martian data by spacecraft such as the Mars rovers and the Mars Express Orbiter. The autonomy used on these Mars exploration spacecraft, and on Earth to analyze the data they collect, mainly consists of machine learning, a field of artificial intelligence in which algorithms collect data and self-improve with the data. Additional applications of machine learning techniques for Mars exploration have the potential to resolve communication limitations and human risks of interplanetary exploration. In addition, analyzing Mars data with machine learning has the potential to provide a greater understanding of Mars in numerous domains such as its climate, atmosphere, and potential future habitation. To explore further utilizations of machine learning techniques for Mars exploration, this paper first summarizes the general features and phenomena of Mars to provide a general overview of the planet, elaborates upon uncertainties of Mars that would be beneficial to explore and understand, summarizes every current or previous usage of machine learning techniques in the exploration of Mars, explores implementations of machine learning that will be utilized in future Mars exploration missions, and explores machine learning techniques used in Earthly domains to provide solutions to the previously described uncertainties of Mars.

Read this paper on arXiv…

A. Momennasab
Wed, 24 Nov 21
28/61

Comments: 16 pages, 0 figures

Fink: early supernovae Ia classification using active learning [IMA]

http://arxiv.org/abs/2111.11438


We describe how the Fink broker early supernova Ia classifier optimizes its ML classifications by employing an active learning (AL) strategy. We demonstrate the feasibility of implementing such strategies in the current Zwicky Transient Facility (ZTF) public alert data stream. We compare the performance of two AL strategies: uncertainty sampling and random sampling. Our pipeline consists of 3 stages: feature extraction, classification, and learning strategy. Starting from an initial sample of 10 alerts (5 SN Ia and 5 non-Ia), we let the algorithm identify which alert should be added to the training sample. The system is allowed to evolve through 300 iterations. Our data set consists of 23 840 alerts from the ZTF with confirmed classifications via cross-match with the SIMBAD database and the Transient Name Server (TNS), 1 600 of which were SNe Ia (1 021 unique objects). The data configuration, after the learning cycle was completed, consists of 310 alerts for training and 23 530 for testing. Averaging over 100 realizations, the classifier achieved 89% purity and 54% efficiency. From 01/November/2020 to 31/October/2021, Fink applied its early supernova Ia module to the ZTF stream and communicated promising SN Ia candidates to the TNS. Of the 535 spectroscopically classified Fink candidates, 459 (86%) were proven to be SNe Ia. Our results confirm the effectiveness of active learning strategies for guiding the construction of optimal training samples for astronomical classifiers. They demonstrate, on real data, that the performance of learning algorithms can be substantially improved without the need for extra computational resources or overwhelmingly large training samples. This is, to our knowledge, the first application of AL to real alert data.
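The uncertainty-sampling loop at the heart of such an AL strategy can be sketched with a toy one-dimensional "classifier". Everything here (the threshold model, the pool, the query budget) is an illustrative stand-in, not Fink's actual pipeline:

```python
import numpy as np

def uncertainty_sampling(probas, labelled):
    """Query the pool item whose predicted probability is closest to 0.5,
    i.e. the alert the classifier is least sure about."""
    scores = np.abs(probas - 0.5)
    scores[list(labelled)] = np.inf      # never re-query a labelled alert
    return int(np.argmin(scores))

def fit_threshold(xs, ys):
    """Toy stand-in classifier: a decision boundary halfway between the
    largest labelled negative and the smallest labelled positive."""
    return (xs[ys == 0].max() + xs[ys == 1].min()) / 2

def proba(xs, thr, scale=20.0):
    return 1.0 / (1.0 + np.exp(-scale * (xs - thr)))

rng = np.random.default_rng(3)
pool_x = rng.uniform(0, 1, 200)                  # unlabelled 'alerts'
pool_y = (pool_x > 0.5).astype(int)              # hidden true labels
labelled = [int(np.argmin(pool_x)), int(np.argmax(pool_x))]  # one per class

for _ in range(15):                              # the learning loop
    thr = fit_threshold(pool_x[labelled], pool_y[labelled])
    labelled.append(uncertainty_sampling(proba(pool_x, thr), labelled))

accuracy = float(np.mean((proba(pool_x, thr) > 0.5) == pool_y))
```

The queries concentrate near the decision boundary, which is why a handful of well-chosen labels can match the performance of a much larger random training sample.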

Read this paper on arXiv…

M. Leoni, E. Ishida, J. Peloton, et al.
Wed, 24 Nov 21
41/61

Comments: 8 pages, 7 figures – submitted to Astronomy and Astrophysics. Comments are welcome

Weight Pruning and Uncertainty in Radio Galaxy Classification [IMA]

http://arxiv.org/abs/2111.11654


In this work we use variational inference to quantify the degree of epistemic uncertainty in model predictions of radio galaxy classification and show that the level of model posterior variance for individual test samples is correlated with human uncertainty when labelling radio galaxies. We explore the model performance and uncertainty calibration for a variety of different weight priors and suggest that a sparse prior produces more well-calibrated uncertainty estimates. Using the posterior distributions for individual weights, we show that signal-to-noise ratio (SNR) ranking allows pruning of the fully-connected layers to the level of 30\% without significant loss of performance, and that this pruning increases the predictive uncertainty in the model. Finally we show that, like other work in this field, we experience a cold posterior effect. We examine whether adapting the cost function in our model to accommodate model misspecification can compensate for this effect, but find that it does not make a significant difference. We also examine the effect of principled data augmentation and find that it improves upon the baseline but does not compensate for the observed effect fully. We interpret this as the cold posterior effect being due to the overly effective curation of our training sample leading to likelihood misspecification, and raise this as a potential issue for Bayesian deep learning approaches to radio galaxy classification in future.
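SNR-based pruning of a variational weight posterior reduces to ranking weights by |mu|/sigma and zeroing the lowest-ranked fraction. The sketch below uses synthetic posterior means and standard deviations rather than a trained network:

```python
import numpy as np

def snr_prune(mu, sigma, frac=0.3):
    """Zero the fraction of weights with the lowest posterior
    signal-to-noise ratio |mu| / sigma; returns pruned means and the
    keep-mask."""
    snr = np.abs(mu) / sigma
    cutoff = np.quantile(snr, frac)
    mask = snr > cutoff
    return mu * mask, mask

# Synthetic posterior: means and standard deviations for 1000 weights.
rng = np.random.default_rng(4)
mu = rng.standard_normal(1000)
sigma = rng.uniform(0.1, 1.0, 1000)
pruned_mu, keep = snr_prune(mu, sigma, frac=0.3)
```

Weights with low SNR are those the posterior cannot distinguish from zero, so removing them costs little accuracy while, as the abstract notes, increasing the model's predictive uncertainty.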

Read this paper on arXiv…

D. Mohan and A. Scaife
Wed, 24 Nov 21
58/61

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version

Accelerating non-LTE synthesis and inversions with graph networks [SSA]

http://arxiv.org/abs/2111.10552


Context: The computational cost of fast non-LTE synthesis is one of the challenges that limits the development of 2D and 3D inversion codes. It also makes the interpretation of observations of lines formed in the chromosphere and transition region a slow and computationally costly process, which limits the inference of physical properties to rather small fields of view. Having access to a fast way of computing the deviation from the LTE regime through the departure coefficients could largely alleviate this problem. Aims: We propose to build and train a graph network that quickly predicts the atomic level populations without solving the non-LTE problem. Methods: We find an optimal architecture for the graph network for predicting the departure coefficients of the levels of an atom from the physical conditions of a model atmosphere. A suitable dataset with a representative sample of potential model atmospheres is used for training. This dataset has been computed using existing non-LTE synthesis codes. Results: The graph network has been integrated into existing synthesis and inversion codes for the particular case of Ca II. We demonstrate orders-of-magnitude gains in computing speed. We analyze the generalization capabilities of the graph network and demonstrate that it produces good predicted departure coefficients for unseen models. We implement this approach in Hazel and show how the inversions compare well with those obtained with standard non-LTE inversion codes. Our approximate method opens up the possibility of extracting physical information from the chromosphere on large fields of view with time evolution. This allows us to better understand this region of the Sun, where large spatial and temporal scales are crucial.

Read this paper on arXiv…

A. Arévalo, A. Ramos and S. Pozuelo
Tue, 23 Nov 21
33/84

Comments: 12 pages, 10 figures, Submitted to A&A

ExoMiner: A Highly Accurate and Explainable Deep Learning Classifier to Mine Exoplanets [EPA]

http://arxiv.org/abs/2111.10009


The Kepler and TESS missions have generated over 100,000 potential transit signals that must be processed in order to create a catalog of planet candidates. During the last few years, there has been a growing interest in using machine learning to analyze these data in search of new exoplanets. Different from the existing machine learning works, ExoMiner, the deep learning classifier proposed in this work, mimics how domain experts examine diagnostic tests to vet a transit signal. ExoMiner is a highly accurate, explainable, and robust classifier that 1) allows us to validate 301 new exoplanets from the MAST Kepler Archive and 2) is general enough to be applied across missions such as the ongoing TESS mission. We perform an extensive experimental study to verify that ExoMiner is more reliable and accurate than existing transit signal classifiers in terms of different classification and ranking metrics. For example, for a fixed precision value of 99%, ExoMiner retrieves 93.6% of all exoplanets in the test set (i.e., recall = 0.936), while this rate is 76.3% for the best existing classifier. Furthermore, the modular design of ExoMiner favors its explainability. We introduce a simple explainability framework that provides experts with feedback on why ExoMiner classifies a transit signal into a specific class label (e.g., planet candidate or not planet candidate).

Read this paper on arXiv…

H. Valizadegan, M. Martinho, L. Wilkens, et al.
Mon, 22 Nov 21
22/53

Comments: Accepted for publication in the Astrophysical Journal, November 2021

Unsupervised Spectral Unmixing For Telluric Correction Using A Neural Network Autoencoder [IMA]

http://arxiv.org/abs/2111.09081


The absorption of light by molecules in the atmosphere of Earth is a complication for ground-based observations of astrophysical objects. Comprehensive information on various molecular species is required to correct for this so-called telluric absorption. We present a neural network autoencoder approach for extracting a telluric transmission spectrum from a large set of high-precision observed solar spectra from the HARPS-N radial velocity spectrograph. We accomplish this by reducing the data into a compressed representation, which allows us to unveil the underlying solar spectrum and simultaneously uncover the different modes of variation in the observed spectra relating to the absorption of $\mathrm{H_2O}$ and $\mathrm{O_2}$ in the atmosphere of Earth. We demonstrate how the extracted components can be used to remove $\mathrm{H_2O}$ and $\mathrm{O_2}$ tellurics in a validation observation with similar accuracy and at less computational expense than a synthetic approach with molecfit.

Read this paper on arXiv…

R. Kjærsgaard, A. Bello-Arufe, A. Rathcke, et al.
Thu, 18 Nov 21
90/92

Comments: Presented at Workshop on Machine Learning and the Physical Sciences (NeurIPS 2021)

Automatically detecting anomalous exoplanet transits [CL]

http://arxiv.org/abs/2111.08679


Raw light curve data from exoplanet transits is too complex to naively apply traditional outlier detection methods to. We propose an architecture which estimates a latent representation of both the main transit and residual deviations with a pair of variational autoencoders. We show, using two fabricated datasets, that our latent representations of anomalous transit residuals are significantly more amenable to outlier detection than raw data or the latent representation of a traditional variational autoencoder. We then apply our method to real exoplanet transit data. Our study is the first to automatically identify anomalous exoplanet transit light curves. We additionally release three first-of-their-kind datasets to enable further research.
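The underlying idea, score outliers by where they sit in a learned latent space, can be sketched with a linear (PCA-style) stand-in for the variational autoencoder; the data and anomaly here are synthetic:

```python
import numpy as np

def latent_codes(X, k=2):
    """Project data onto its top-k principal directions: a linear
    stand-in for the latent representation a (variational) autoencoder
    would learn from light curves."""
    Xc = X - X.mean(axis=0)
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:k].T

def outlier_scores(Z):
    """Distance from the latent-space centroid, normalised by its spread;
    large scores flag candidate anomalies."""
    d = np.linalg.norm(Z - Z.mean(axis=0), axis=1)
    return d / d.std()

rng = np.random.default_rng(6)
normal = rng.standard_normal((200, 10))          # 'ordinary' residuals
anomaly = rng.standard_normal((1, 10)) + 8.0     # one clearly shifted case
X = np.vstack([normal, anomaly])
scores = outlier_scores(latent_codes(X))
```

The paper's contribution is that a VAE latent space built specifically for transit residuals makes this kind of distance-based scoring far more reliable than applying it to raw flux values.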

Read this paper on arXiv…

C. Hönes, B. Miller, A. Heras, et al.
Wed, 17 Nov 21
4/64

Comments: 12 pages, 4 figures, 4 tables, Accepted at NeurIPS 2021 (Workshop for Machine Learning and the Physical Sciences)

Fast and Credible Likelihood-Free Cosmology with Truncated Marginal Neural Ratio Estimation [CEA]

http://arxiv.org/abs/2111.08030


Sampling-based inference techniques are central to modern cosmological data analysis; these methods, however, scale poorly with dimensionality and typically require approximate or intractable likelihoods. In this paper we describe how Truncated Marginal Neural Ratio Estimation (TMNRE), a new approach in so-called simulation-based inference, naturally evades these issues, improving the $(i)$ efficiency, $(ii)$ scalability, and $(iii)$ trustworthiness of the inferred posteriors. Using measurements of the Cosmic Microwave Background (CMB), we show that TMNRE can achieve converged posteriors using orders of magnitude fewer simulator calls than conventional Markov Chain Monte Carlo (MCMC) methods. Remarkably, the required number of samples is effectively independent of the number of nuisance parameters. In addition, a property called “local amortization” allows the performance of rigorous statistical consistency checks that are not accessible to sampling-based methods. TMNRE promises to become a powerful tool for cosmological data analysis, particularly in the context of extended cosmologies, where the timescale required for conventional sampling-based inference methods to converge can greatly exceed that of simple cosmological models such as $\Lambda$CDM. To perform these computations, we use an implementation of TMNRE via the open-source code swyft.

Read this paper on arXiv…

A. Cole, B. Miller, S. Witte, et al.
Wed, 17 Nov 21
25/64

Comments: 37 pages, 13 figures. swyft is available at this https URL, and demonstration code for cosmological examples is available at this https URL

Inferring halo masses with Graph Neural Networks [CEA]

http://arxiv.org/abs/2111.08683


Understanding the halo-galaxy connection is fundamental in order to improve our knowledge on the nature and properties of dark matter. In this work we build a model that infers the mass of a halo given the positions, velocities, stellar masses, and radii of the galaxies it hosts. In order to capture information from correlations among galaxy properties and their phase-space, we use Graph Neural Networks (GNNs), which are designed to work with irregular and sparse data. We train our models on galaxies from more than 2,000 state-of-the-art simulations from the Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project. Our model, which accounts for cosmological and astrophysical uncertainties, is able to constrain the masses of the halos with a $\sim$0.2 dex accuracy. Furthermore, a GNN trained on a suite of simulations is able to preserve part of its accuracy when tested on simulations run with a different code that utilizes a distinct subgrid physics model, showing the robustness of our method. The PyTorch Geometric implementation of the GNN is publicly available on GitHub at https://github.com/PabloVD/HaloGraphNet
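One message-passing layer of the kind such GNNs apply to a galaxy graph can be sketched in plain NumPy, followed by permutation-invariant pooling into a halo-level embedding. The weights, features, and graph below are random stand-ins, not the paper's trained model:

```python
import numpy as np

def message_pass(node_feats, edges, W_self, W_nbr):
    """One graph layer: each galaxy (node) updates its features from its
    own state plus the sum over its neighbours' states (undirected edges)."""
    agg = np.zeros_like(node_feats)
    for i, j in edges:
        agg[i] += node_feats[j]
        agg[j] += node_feats[i]
    return np.tanh(node_feats @ W_self + agg @ W_nbr)

rng = np.random.default_rng(5)
feats = rng.standard_normal((5, 3))       # 5 galaxies, 3 features each
edges = [(0, 1), (1, 2), (2, 3), (3, 4)]  # toy neighbour graph
W_self = rng.standard_normal((3, 3))
W_nbr = rng.standard_normal((3, 3))
out = message_pass(feats, edges, W_self, W_nbr)

# Halo-level readout: sum-pooling is invariant to galaxy ordering, so the
# prediction cannot depend on how the catalogue happens to be sorted.
halo_embedding = out.sum(axis=0)
```

In practice the halo mass (or a posterior over it) is read off from `halo_embedding` with a small MLP head, and several such layers are stacked.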

Read this paper on arXiv…

P. Villanueva-Domingo, F. Villaescusa-Navarro, D. Anglés-Alcázar, et al.
Wed, 17 Nov 21
26/64

Comments: 18 pages, 8 figures, code publicly available at this https URL

Alleviating the transit timing variation bias in transit surveys. I. RIVERS: Method and detection of a pair of resonant super-Earths around Kepler-1705 [EPA]

http://arxiv.org/abs/2111.06825


Transit timing variations (TTVs) can provide useful information for systems observed by transit, as they allow us to put constraints on the masses and eccentricities of the observed planets, or even to constrain the existence of non-transiting companions. However, TTVs can also act as a detection bias that can prevent the detection of small planets in transit surveys that would otherwise be detected by standard algorithms such as the Box Least Squares algorithm (BLS) if their orbit was not perturbed. This bias is especially present for surveys with a long baseline, such as Kepler, some of the TESS sectors, and the upcoming PLATO mission. Here we introduce a detection method that is robust to large TTVs, and illustrate its use by recovering and confirming a pair of resonant super-Earths with ten-hour TTVs around Kepler-1705. The method is based on a neural network trained to recover the tracks of low-signal-to-noise-ratio (S/N) perturbed planets in river diagrams. We recover the transit parameters of these candidates by fitting the light curve. The individual transit S/N of Kepler-1705b and c are about three times lower than that of all previously known planets with TTVs of 3 hours or more, pushing the boundaries in the recovery of these small, dynamically active planets. Recovering this type of object is essential for obtaining a complete picture of the observed planetary systems, and for addressing a bias that is not often taken into account in statistical studies of exoplanet populations. In addition, TTVs are a means of obtaining mass estimates, which can be essential for studying the internal structure of planets discovered by transit surveys. Finally, we show that due to the strong orbital perturbations, it is possible that the spin of the outer resonant planet of Kepler-1705 is trapped in a sub- or super-synchronous spin-orbit resonance.
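A river diagram itself is just a light curve folded at the orbital period into an epoch-by-phase array, in which an unperturbed planet appears as a straight vertical track and TTVs bend the track. The toy light curve below, with a sinusoidally drifting transit, is purely illustrative:

```python
import numpy as np

def river_diagram(flux, period_pts):
    """Fold a light curve into a 2D (epoch x phase) array; each row is
    one orbital period, so a transit forms a track down the columns."""
    n_epochs = len(flux) // period_pts
    return flux[: n_epochs * period_pts].reshape(n_epochs, period_pts)

# Toy light curve: flat flux with one transit dip per period whose
# phase drifts sinusoidally, mimicking a TTV signal.
period, n_epochs = 50, 20
flux = np.ones(period * n_epochs)
for e in range(n_epochs):
    centre = 25 + int(3 * np.sin(2 * np.pi * e / n_epochs))  # drifting phase
    flux[e * period + centre] = 0.99
river = river_diagram(flux, period)
```

A phase-folding detector like BLS smears this bent track across many phase bins and loses it, whereas a network looking at the 2D river image can still follow it, which is the core idea of the RIVERS method.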

Read this paper on arXiv…

A. Leleu, G. Chatel, S. Udry, et. al.
Mon, 15 Nov 21
1/52

Comments: N/A

Super-resolving Dark Matter Halos using Generative Deep Learning [CEA]

http://arxiv.org/abs/2111.06393


Generative deep learning methods built upon Convolutional Neural Networks (CNNs) provide a great tool for predicting non-linear structure in cosmology. In this work we predict high resolution dark matter halos from large scale, low resolution dark matter only simulations. This is achieved by mapping lower resolution to higher resolution density fields of simulations sharing the same cosmology, initial conditions and box-sizes. To resolve structure down to a factor of 8 increase in mass resolution, we use a variation of U-Net with a conditional GAN, generating output that visually and statistically matches the high resolution target extremely well. This suggests that our method can be used to create high resolution density output over Gpc/h box-sizes from low resolution simulations with negligible computational effort.
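A standard way to verify that a generated field "statistically matches" its target is to compare summary statistics such as the isotropically averaged power spectrum. A hedged NumPy sketch (the `generated` field below is a noisy stand-in for illustration, not actual GAN output):

```python
import numpy as np

def power_spectrum_2d(field, nbins=16):
    """Isotropically averaged power spectrum of a square 2D field."""
    n = field.shape[0]
    fk = np.fft.fftn(field)
    power = np.abs(fk) ** 2 / n**2
    kx = np.fft.fftfreq(n) * n
    kmag = np.sqrt(kx[:, None] ** 2 + kx[None, :] ** 2)
    bins = np.linspace(0.5, n // 2, nbins + 1)
    which = np.digitize(kmag.ravel(), bins)
    return np.array([power.ravel()[which == i].mean()
                     for i in range(1, nbins + 1)])

rng = np.random.default_rng(1)
target = rng.standard_normal((64, 64))                   # toy density field
generated = target + 0.05 * rng.standard_normal((64, 64))  # stand-in output

pk_t = power_spectrum_2d(target)
pk_g = power_spectrum_2d(generated)
ratio = pk_g / pk_t        # close to 1 when the statistics match
```

A bin-wise ratio near unity across scales is the kind of statistical agreement the paper reports between generated and target high-resolution fields.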

Read this paper on arXiv…

D. Schaurecker, Y. Li, J. Tinker, et. al.
Fri, 12 Nov 21
47/53

Comments: 9 pages, 8 figures

Can semi-supervised learning reduce the amount of manual labelling required for effective radio galaxy morphology classification? [GA]

http://arxiv.org/abs/2111.04357


In this work, we examine the robustness of state-of-the-art semi-supervised learning (SSL) algorithms when applied to morphological classification in modern radio astronomy. We test whether SSL can achieve performance comparable to the current supervised state of the art when using many fewer labelled data points and if these results generalise to using truly unlabelled data. We find that although SSL provides additional regularisation, its performance degrades rapidly when using very few labels, and that using truly unlabelled data leads to a significant drop in performance.

Read this paper on arXiv…

I. Slijepcevic and A. Scaife
Tue, 9 Nov 21
50/102

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version

A Comparison of Deep Learning Architectures for Optical Galaxy Morphology Classification [CL]

http://arxiv.org/abs/2111.04353


The classification of galaxy morphology plays a crucial role in understanding galaxy formation and evolution. Traditionally, this process is done manually. The emergence of deep learning techniques has given room for the automation of this process. As such, this paper offers a comparison of deep learning architectures to determine which is best suited for optical galaxy morphology classification. Adapting the model training method proposed by Walmsley et al. (2021), the Zoobot Python library is used to train models to predict Galaxy Zoo DECaLS decision tree responses, made by volunteers, using EfficientNet B0, DenseNet121 and ResNet50 as core model architectures. The predicted results are then used to generate accuracy metrics per decision tree question to determine architecture performance. DenseNet121 was found to produce the best results in terms of accuracy, with a reasonable training time. In the future, further testing with more deep learning architectures could prove beneficial.

Read this paper on arXiv…

E. Fielding, C. Nyirenda and M. Vaccari
Tue, 9 Nov 21
84/102

Comments: N/A

A deep ensemble approach to X-ray polarimetry [IMA]

http://arxiv.org/abs/2111.03047


X-ray polarimetry will soon open a new window on the high energy universe with the launch of NASA’s Imaging X-ray Polarimetry Explorer (IXPE). Polarimeters are currently limited by their track reconstruction algorithms, which typically use linear estimators and do not consider individual event quality. We present a modern deep learning method for maximizing the sensitivity of X-ray telescopic observations with imaging polarimeters, with a focus on the gas pixel detectors (GPDs) to be flown on IXPE. We use a weighted maximum likelihood combination of predictions from a deep ensemble of ResNets, trained on Monte Carlo event simulations. We derive and apply the optimal event weighting for maximizing the polarization signal-to-noise ratio (SNR) in track reconstruction algorithms. For typical power-law source spectra, our method improves on the current state of the art, providing a ~40% decrease in required exposure times for a given SNR.
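The weighted Stokes-parameter combination underlying such event-weighting schemes can be sketched in a few lines. In this toy version the per-event weights are random placeholders rather than ensemble-predicted track qualities, and all numbers are illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

# Draw photoelectron emission angles from a modulated distribution with
# polarization degree p_true and modulation factor mu0 (values illustrative).
p_true, mu0, n = 0.3, 0.5, 40000
phi = rng.uniform(-np.pi, np.pi, n)
keep = rng.uniform(0, 1, n) < (1 + p_true * mu0 * np.cos(2 * phi)) / 2
phi = phi[keep]

# Per-event quality weights.  Here they are random placeholders; in the
# paper they derive from the deep ensemble's predicted track uncertainty.
w = rng.uniform(0.5, 1.0, phi.size)

# Weighted Stokes estimators and the recovered polarization degree.
Q = 2 * np.sum(w * np.cos(2 * phi)) / np.sum(w)
U = 2 * np.sum(w * np.sin(2 * phi)) / np.sum(w)
p_hat = np.sqrt(Q**2 + U**2) / mu0     # ~ p_true
```

With informative weights (high weight for well-reconstructed tracks), this estimator gains the SNR improvement the paper quantifies as a ~40% exposure-time reduction.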

Read this paper on arXiv…

A. L. Peirson and R. W. Romani
Fri, 5 Nov 21
3/72

Comments: Fourth Workshop on Machine Learning and the Physical Sciences (NeurIPS 2021)

Photometric Search for Exomoons by using Convolutional Neural Networks [EPA]

http://arxiv.org/abs/2111.02293


To date, no moon beyond our solar system (exomoon) has been confirmed. Exomoons offer us new potentially habitable places, some of which may lie outside the classical habitable zone. However, the search for exomoons requires substantial computational power because classical statistical methods are employed. We show that exomoon signatures can be found using deep learning with Convolutional Neural Networks (CNNs) trained on synthetic light curves combined with real light curves containing no transits. We find that CNNs trained on combined synthetic and observed light curves can detect moons of roughly 2-3 Earth radii or larger in the Kepler data set or comparable data sets. Using neural networks in future missions such as Planetary Transits and Oscillations of stars (PLATO) might enable the detection of exomoons.

Read this paper on arXiv…

L. Weghs
Thu, 4 Nov 21
2/73

Comments: 11 pages, 4 figures

Realistic galaxy image simulation via score-based generative models [IMA]

http://arxiv.org/abs/2111.01713


We show that a Denoising Diffusion Probabilistic Model (DDPM), a class of score-based generative model, can be used to produce realistic yet fake images that mimic observations of galaxies. Our method is tested with Dark Energy Spectroscopic Instrument grz imaging of galaxies from the Photometry and Rotation curve OBservations from Extragalactic Surveys (PROBES) sample and galaxies selected from the Sloan Digital Sky Survey. Subjectively, the generated galaxies are highly realistic when compared with samples from the real dataset. We quantify the similarity by borrowing from the deep generative learning literature, using the 'Fréchet Inception Distance' to test for subjective and morphological similarity. We also introduce the 'Synthetic Galaxy Distance' metric to compare the emergent physical properties (such as total magnitude, colour and half light radius) of a ground truth parent and synthesised child dataset. We argue that the DDPM approach produces sharper and more realistic images than other generative methods such as Adversarial Networks (with the downside of more costly inference), and could be used to produce large samples of synthetic observations tailored to a specific imaging survey. We demonstrate two potential uses of the DDPM: (1) accurate in-painting of occluded data, such as satellite trails, and (2) domain transfer, where new input images can be processed to mimic the properties of the DDPM training set. Here we 'DESI-fy' cartoon images as a proof of concept for domain transfer. Finally, we suggest potential applications for score-based approaches that could motivate further research on this topic within the astronomical community.
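For orientation, the DDPM forward (noising) process that the score network learns to invert can be written down directly. This sketch uses the common linear beta schedule, which may differ from the paper's actual configuration:

```python
import numpy as np

# Minimal sketch of the DDPM forward (noising) process on a toy "image".
# The beta schedule and shapes are illustrative, not the paper's settings.
T = 1000
beta = np.linspace(1e-4, 0.02, T)
alpha_bar = np.cumprod(1.0 - beta)

rng = np.random.default_rng(3)
x0 = rng.uniform(0, 1, (16, 16))        # toy galaxy image in [0, 1]

def q_sample(x0, t, rng):
    """Draw x_t ~ q(x_t | x_0) = N(sqrt(abar_t) * x0, (1 - abar_t) * I)."""
    eps = rng.standard_normal(x0.shape)
    return np.sqrt(alpha_bar[t]) * x0 + np.sqrt(1.0 - alpha_bar[t]) * eps, eps

x_mid, _ = q_sample(x0, 500, rng)       # partially noised
x_end, _ = q_sample(x0, T - 1, rng)     # essentially pure noise

# A denoising network is trained to predict eps from (x_t, t); sampling
# then runs the learned reverse process starting from pure noise.
print(alpha_bar[-1])                    # ~ 4e-5: x_T retains almost no signal
```

Generation amounts to learning and iterating the reverse of `q_sample`, which is what makes inference costlier than a single GAN forward pass, as the abstract notes.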

Read this paper on arXiv…

M. Smith, J. Geach, R. Jackson, et. al.
Wed, 3 Nov 21
91/106

Comments: 10 pages, 8 figures. Code: this https URL . Follow the Twitter bot @ThisIsNotAnApod for DDPM-generated APODs

Time Series Comparisons in Deep Space Network [CL]

http://arxiv.org/abs/2111.01393


The Deep Space Network (DSN) is NASA’s international array of antennas that support interplanetary spacecraft missions. A track is a block of multi-dimensional time series from the beginning to end of DSN communication with the target spacecraft, containing thousands of monitor data items lasting several hours at a frequency of 0.2-1 Hz. Monitor data on each track reports on the performance of specific spacecraft operations and of the DSN itself. The DSN receives signals from 32 spacecraft across the solar system, and is under pressure to reduce costs while maintaining the quality of support for DSN mission users. DSN Link Control Operators need to simultaneously monitor multiple tracks and identify anomalies in real time. As the number of missions increases, the volume of data that needs to be processed grows over time. In this project, we look at the last 8 years of data for analysis. Any anomaly in a track indicates a problem with the spacecraft, the DSN equipment, or weather conditions, and DSN operators typically write Discrepancy Reports for further analysis. Identifying the 10 most similar historical tracks from this large database would help operators quickly find and match anomalies. Our tool has three functions: (1) identification of the top 10 similar historical tracks, (2) detection of anomalies compared to a reference normal track, and (3) comparison of statistical differences between two given tracks. The requirements for these features were confirmed by survey responses from 21 DSN operators and engineers. The preliminary machine learning model has shown promising performance (AUC=0.92). We plan to increase the number of data sets and perform additional testing to improve performance further before its planned integration into the track visualizer interface to assist DSN field operators and engineers.
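The first function, retrieving the top-10 most similar historical tracks, can be illustrated with a simple nearest-neighbour baseline (z-scored Euclidean distance); this is a stand-in sketch, not the paper's actual model:

```python
import numpy as np

def top_k_similar(query, archive, k=10):
    """Indices of the k archived tracks closest to `query` under z-scored
    Euclidean distance (a simple stand-in for the paper's learned model)."""
    def z(x):
        return (x - x.mean(axis=-1, keepdims=True)) / \
               (x.std(axis=-1, keepdims=True) + 1e-9)
    d = np.linalg.norm(z(archive) - z(query), axis=1)
    return np.argsort(d)[:k]

rng = np.random.default_rng(4)
archive = rng.standard_normal((500, 300))             # 500 historical tracks
query = archive[42] + 0.1 * rng.standard_normal(300)  # near-duplicate of 42

nearest = top_k_similar(query, archive)
print(nearest[0])                                     # 42
```

Real tracks are multi-dimensional and unevenly sampled, so a learned or elastic distance (e.g. dynamic time warping) would replace the plain Euclidean metric in practice.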

Read this paper on arXiv…

K. Yun, R. Verma and U. Rebbapragada
Wed, 3 Nov 21
94/106

Comments: 7 pages, 8 figures, AIAA-ASCEND 2021

Robustness of deep learning algorithms in astronomy — galaxy morphology studies [GA]

http://arxiv.org/abs/2111.00961


Deep learning models are being increasingly adopted in a wide array of scientific domains, especially to handle the high dimensionality and volume of scientific data. However, these models tend to be brittle due to their complexity and overparametrization, especially to inadvertent adversarial perturbations that can arise from common image processing operations, such as compression or blurring, that are often seen with real scientific data. It is crucial to understand this brittleness and to develop models that are robust to these adversarial perturbations. To this end, we study the effect of observational noise from the exposure time, as well as the worst-case scenario of a one-pixel attack as a proxy for compression or telescope errors, on the performance of a ResNet18 trained to distinguish between galaxies of different morphologies in LSST mock data. We also explore how domain adaptation techniques can help improve model robustness in the case of this type of naturally occurring attack, helping scientists build more trustworthy and stable models.
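A one-pixel attack is simply a search over single-pixel perturbations for one that flips a classifier's decision. A toy sketch against a stand-in linear scorer (the paper attacks a trained ResNet18; everything here is illustrative):

```python
import numpy as np

rng = np.random.default_rng(5)

# Stand-in classifier: a fixed linear score over a 16x16 image.  The paper
# attacks a trained ResNet18; this toy only illustrates the attack itself.
w = rng.standard_normal((16, 16))
b = -0.5

def predict(img):
    return float(np.sum(w * img) + b)

img = np.zeros((16, 16))                  # scores b = -0.5: class "negative"

# One-pixel attack: exhaustively try single-pixel perturbations that
# flip the sign of the classifier score.
best = None
for i in range(16):
    for j in range(16):
        for v in (-3.0, 3.0):             # bounded pixel perturbation
            adv = img.copy()
            adv[i, j] = v
            if predict(adv) > 0:
                best = (i, j, v)

flipped = best is not None                # True: one pixel flips the label
```

Against a deep network the search is typically done with differential evolution rather than exhaustively, but the threat model (change exactly one pixel) is the same.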

Read this paper on arXiv…

A. Ćiprijanović, D. Kafkes, G. Perdue, et. al.
Tue, 2 Nov 21
31/93

Comments: Accepted in: Fourth Workshop on Machine Learning and the Physical Sciences (35th Conference on Neural Information Processing Systems; NeurIPS2021); final version

Swift sky localization of gravitational waves using deep learning seeded importance sampling [CL]

http://arxiv.org/abs/2111.00833


Fast, highly accurate, and reliable inference of the sky origin of gravitational waves would enable real-time multi-messenger astronomy. Current Bayesian inference methodologies, although highly accurate and reliable, are slow. Deep learning models have shown themselves to be accurate and extremely fast for inference tasks on gravitational waves, but their output is inherently questionable due to the blackbox nature of neural networks. In this work, we join Bayesian inference and deep learning by applying importance sampling on an approximate posterior generated by a multi-headed convolutional neural network. The neural network parametrizes von Mises-Fisher and Gaussian distributions for the sky coordinates and the two masses of simulated gravitational wave injections in the LIGO and Virgo detectors. Within a few minutes, we generate skymaps for unseen gravitational-wave events that closely resemble predictions generated using Bayesian inference. Furthermore, we can detect poor predictions from the neural network and quickly flag them.
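The importance-sampling correction applied to the network's approximate posterior follows the standard recipe: sample from the network's proposal, then reweight by target over proposal. A minimal 1-D sketch with illustrative Gaussian stand-ins for both distributions (not GW posteriors):

```python
import numpy as np

rng = np.random.default_rng(6)

# Stand-in 1-D problem: the "network" proposes q = N(0.2, 1.2^2) while the
# true posterior is p = N(0, 1); both are illustrative, not GW posteriors.
mu_q, sig_q = 0.2, 1.2
theta = rng.normal(mu_q, sig_q, 50000)         # samples from the proposal

log_p = -0.5 * theta**2                        # unnormalized target density
log_q = -0.5 * ((theta - mu_q) / sig_q) ** 2 - np.log(sig_q)
log_w = log_p - log_q
w = np.exp(log_w - log_w.max())                # numerically stable weights
w /= w.sum()

post_mean = np.sum(w * theta)                  # ~ 0, the true posterior mean
ess = 1.0 / np.sum(w**2)                       # effective sample size
```

A collapsed effective sample size is exactly the kind of diagnostic that lets a poor network prediction be detected and flagged, as the abstract describes.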

Read this paper on arXiv…

A. Kolmus, G. Baltus, J. Janquart, et. al.
Tue, 2 Nov 21
58/93

Comments: 12 pages, 9 figures, 1 table

Real-time detection of anomalies in large-scale transient surveys [IMA]

http://arxiv.org/abs/2111.00036


New time-domain surveys, such as the Rubin Observatory Legacy Survey of Space and Time (LSST), will observe millions of transient alerts each night, making standard approaches of visually identifying new and interesting transients infeasible. We present two novel methods of automatically detecting anomalous transient light curves in real-time. Both methods are based on the simple idea that if the light curves from a known population of transients can be accurately modelled, any deviations from model predictions are likely anomalies. The first modelling approach is a probabilistic neural network built using Temporal Convolutional Networks (TCNs) and the second is an interpretable Bayesian parametric model of a transient. We demonstrate our methods’ ability to provide anomaly scores as a function of time on light curves from the Zwicky Transient Facility. We show that the flexibility of neural networks, the attribute that makes them such a powerful tool for many regression tasks, is what makes them less suitable for anomaly detection when compared with our parametric model. The parametric model is able to identify anomalies with respect to common supernova classes with low false anomaly rates and high true anomaly rates, achieving Area Under the Receiver Operating Characteristic (ROC) Curve (AUC) scores above 0.8 for most rare classes such as kilonovae, tidal disruption events, intermediate luminosity transients, and pair-instability supernovae. Our ability to identify anomalies improves over the lifetime of the light curves. Our framework, used in conjunction with transient classifiers, will enable fast and prioritised follow-up of unusual transients from new large-scale surveys.
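The idea that "deviations from model predictions are likely anomalies" can be made concrete with a running chi-like score. A hedged sketch using a toy transient model (not the paper's TCN or Bayesian parametric model):

```python
import numpy as np

def anomaly_score(flux, flux_err, model_flux):
    """Chi-like deviation of observed flux from the model prediction,
    accumulated up to each epoch (a sketch of the paper's simple idea)."""
    chi2 = ((flux - model_flux) / flux_err) ** 2
    n = np.arange(1, len(flux) + 1)
    return np.sqrt(np.cumsum(chi2) / n)     # running reduced-chi score

rng = np.random.default_rng(7)
t = np.linspace(0, 50, 200)
model = np.exp(-t / 20)                     # a well-fit "normal" transient
normal = model + 0.05 * rng.standard_normal(200)
anomalous = model + 0.05 * rng.standard_normal(200)
anomalous[120:] += 0.5                      # late-time bump the model misses

s_norm = anomaly_score(normal, 0.05, model)
s_anom = anomaly_score(anomalous, 0.05, model)
# s_norm stays near 1 while s_anom grows after the unmodelled bump,
# mirroring how the paper's scores improve over a light curve's lifetime.
```

The score stream is available in real time: each new photometric point only appends one term to the cumulative sum.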

Read this paper on arXiv…

D. Muthukrishna, K. Mandel, M. Lochner, et. al.
Tue, 2 Nov 21
80/93

Comments: 25 pages, 21 figures, submitted to MNRAS

Convolutional Deep Denoising Autoencoders for Radio Astronomical Images [IMA]

http://arxiv.org/abs/2110.08618


We apply a Machine Learning technique known as Convolutional Denoising Autoencoder to denoise synthetic images of state-of-the-art radio telescopes, with the goal of detecting the faint, diffused radio sources predicted to characterise the radio cosmic web. In our application, denoising is intended to address both the reduction of random instrumental noise and the minimisation of additional spurious artefacts like the sidelobes, resulting from the aperture synthesis technique. The effectiveness and the accuracy of the method are analysed for different kinds of corrupted input images, together with its computational performance. Specific attention has been devoted to create realistic mock observations for the training, exploiting the outcomes of cosmological numerical simulations, to generate images corresponding to LOFAR HBA 8 hours observations at 150 MHz. Our autoencoder can effectively denoise complex images identifying and extracting faint objects at the limits of the instrumental sensitivity. The method can efficiently scale on large datasets, exploiting high performance computing solutions, in a fully automated way (i.e. no human supervision is required after training). It can accurately perform image segmentation, identifying low brightness outskirts of diffused sources, proving to be a viable solution for detecting challenging extended objects hidden in noisy radio observations.

Read this paper on arXiv…

C. Gheller and F. Vazza
Tue, 19 Oct 21
22/98

Comments: 21 pages, 14 figures, Accepted for publication by MNRAS

Predicting Solar Flares with Remote Sensing and Machine Learning [CL]

http://arxiv.org/abs/2110.07658


High energy solar flares and coronal mass ejections have the potential to destroy Earth’s ground and satellite infrastructures, causing trillions of dollars in damage and mass human suffering. Destruction of these critical systems would disable power grids and satellites, crippling communications and transportation. This would lead to food shortages and an inability to respond to emergencies. A solution to this impending problem is proposed herein, using satellites in solar orbit that continuously monitor the Sun, use artificial intelligence and machine learning to calculate the probability of massive solar explosions from the sensed data, and then signal defense mechanisms that will mitigate the threat. With modern technology, such safeguards can be implemented only with sufficient warning, which is why the best algorithm must be identified and continuously trained with existing and new data to maximize true positive rates while minimizing false negatives. This paper conducts a survey of current machine learning models using open source solar flare prediction data. The rise of edge computing allows machine learning hardware to be placed on the same satellites as the sensor arrays, saving critical time by not having to transmit remote sensing data across the vast distances of space. A system-of-systems approach will allow enough warning for safety measures to be put into place, mitigating the risk of disaster.

Read this paper on arXiv…

E. Larsen
Mon, 18 Oct 21
8/68

Comments: 16 pages, 10 figures, 3 tables

Astronomical source finding services for the CIRASA visual analytic platform [IMA]

http://arxiv.org/abs/2110.08211


Innovative developments in data processing, archiving, analysis, and visualization are nowadays unavoidable to deal with the data deluge expected in next-generation facilities for radio astronomy, such as the Square Kilometre Array (SKA) and its precursors. In this context, the integration of source extraction and analysis algorithms into data visualization tools could significantly improve and speed up the cataloguing process of large area surveys, boosting astronomer productivity and shortening publication time. To this aim, we are developing a visual analytic platform (CIRASA) for advanced source finding and classification, integrating state-of-the-art tools, such as the CAESAR source finder, the ViaLactea Visual Analytic (VLVA) and Knowledge Base (VLKB). In this work, we present the project objectives and the platform architecture, focusing on the implemented source finding services.

Read this paper on arXiv…

S. Riggi, C. Bordiu, F. Vitello, et. al.
Mon, 18 Oct 21
52/68

Comments: 16 pages, 6 figures

A neural simulation-based inference approach for characterizing the Galactic Center $γ$-ray excess [HEAP]

http://arxiv.org/abs/2110.06931


The nature of the Fermi gamma-ray Galactic Center Excess (GCE) has remained a persistent mystery for over a decade. Although the excess is broadly compatible with emission expected due to dark matter annihilation, an explanation in terms of a population of unresolved astrophysical point sources, e.g., millisecond pulsars, remains viable. The effort to uncover the origin of the GCE is hampered in particular by an incomplete understanding of diffuse emission of Galactic origin. This can lead to spurious features that make it difficult to robustly differentiate smooth emission, as expected for a dark matter origin, from more “clumpy” emission expected for a population of relatively bright, unresolved point sources. We use recent advancements in the field of simulation-based inference, in particular density estimation techniques using normalizing flows, in order to characterize the contribution of modeled components, including unresolved point source populations, to the GCE. Compared to traditional techniques based on the statistical distribution of photon counts, our machine learning-based method is able to utilize more of the information contained in a given model of the Galactic Center emission, and in particular can perform posterior parameter estimation while accounting for pixel-to-pixel spatial correlations in the gamma-ray map. This makes the method demonstrably more resilient to certain forms of model misspecification. On application to Fermi data, the method generically attributes a smaller fraction of the GCE flux to unresolved point sources when compared to traditional approaches. We nevertheless infer such a contribution to make up a non-negligible fraction of the GCE across all analysis variations considered, with at least $38^{+9}_{-19}\%$ of the excess attributed to unresolved point sources in our baseline analysis.

Read this paper on arXiv…

S. Mishra-Sharma and K. Cranmer
Fri, 15 Oct 21
38/56

Comments: 20+3 pages, 10+4 figures

VLBInet: Radio Interferometry Data Classification for EHT with Neural Networks [HEAP]

http://arxiv.org/abs/2110.07185


The Event Horizon Telescope (EHT) recently released the first horizon-scale images of the black hole in M87. Combined with other astronomical data, these images constrain the mass and spin of the hole as well as the accretion rate and magnetic flux trapped on the hole. An important question for the EHT is how well key parameters, such as trapped magnetic flux and the associated disk models, can be extracted from present and future EHT VLBI data products. The process of modeling visibilities and analyzing them is complicated by the fact that the data are sparsely sampled in the Fourier domain while most of the theory/simulation is constructed in the image domain. Here we propose a data-driven approach to analyze complex visibilities and closure quantities for radio interferometric data with neural networks. Using mock interferometric data, we show that our neural networks are able to infer the accretion state as either high magnetic flux (MAD) or low magnetic flux (SANE), suggesting that it is possible to perform parameter extraction directly in the visibility domain without image reconstruction. We have applied VLBInet to real M87 EHT data taken on four different days in 2017 (April 5, 6, 10, 11), and our neural networks give score predictions of 0.52, 0.40, 0.43, and 0.76 for the four days, with an average score of 0.53, which shows no significant indication that the data lean toward either the MAD or SANE state.
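Closure quantities make attractive classifier inputs because station-based phase errors cancel exactly in the sum of visibility phases around a triangle of baselines. A minimal sketch with illustrative phases:

```python
import numpy as np

rng = np.random.default_rng(8)

# True visibility phases on the baselines of a telescope triangle (i, j, k).
phi_ij, phi_jk, phi_ki = 0.3, -1.1, 0.5

# Station-based phase errors (e.g. atmosphere) corrupt each baseline as
# phi_ij -> phi_ij + e_i - e_j, and cyclically for the other two.
e_i, e_j, e_k = rng.normal(0, 2.0, 3)
obs_ij = phi_ij + e_i - e_j
obs_jk = phi_jk + e_j - e_k
obs_ki = phi_ki + e_k - e_i

# The closure phase: the station terms cancel identically, leaving only
# source structure, which is why closure quantities are robust inputs
# for a visibility-domain classifier like VLBInet.
closure = obs_ij + obs_jk + obs_ki
print(closure)    # -0.3 (= phi_ij + phi_jk + phi_ki, up to float rounding)
```

Feeding such calibration-robust quantities to the network sidesteps both image reconstruction and per-station gain calibration.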

Read this paper on arXiv…

J. Lin, D. Pesce, G. Wong, et. al.
Fri, 15 Oct 21
51/56

Comments: 10 pages, 7 figures

Satellite galaxy abundance dependency on cosmology in Magneticum simulations [CEA]

http://arxiv.org/abs/2110.05498


Context: Modelling satellite galaxy abundance $N_s$ in Galaxy Clusters (GCs) is a key element in modelling the Halo Occupation Distribution (HOD), which itself is a powerful tool to connect observational studies with numerical simulations. Aims: To study the impact of cosmological parameters on satellite abundance both in cosmological simulations and in mock observations. Methods: We build an emulator (HODEmu, \url{https://github.com/aragagnin/HODEmu/}) of satellite abundance based on the cosmological parameters $\Omega_m, \Omega_b, \sigma_8, h_0$ and redshift $z.$ We train our emulator using Magneticum hydrodynamic simulations that span 15 different cosmologies, each over $4$ redshift slices between $0<z<0.5,$ and for each setup we fit the normalisation $A$, log-slope $\beta$ and Gaussian fractional scatter $\sigma$ of the $N_s-M$ relation. The emulator is based on multi-variate output Gaussian Process Regression (GPR). Results: We find that $A$ and $\beta$ depend on cosmological parameters, even if weakly, especially on $\Omega_m$ and $\Omega_b.$ This dependency can explain some discrepancies found in the literature between the satellite HODs of different cosmological simulations (Magneticum, Illustris, BAHAMAS). We also show that the cosmology dependency of satellite abundance differs between full-physics (FP), dark-matter-only (DMO), and non-radiative simulations. Conclusions: This work provides a preliminary calibration of the cosmological dependency of the satellite abundance of high mass halos, showing that modelling the HOD with cosmological parameters is necessary to interpret satellite abundance, and that FP simulations are important in modelling this dependency.
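An emulator of this kind interpolates fitted quantities across cosmological parameters with Gaussian Process Regression. A minimal 1-D NumPy sketch (toy function, kernel, and hyperparameters are illustrative, not HODEmu's multivariate GPR):

```python
import numpy as np

def rbf(a, b, length=0.15):
    """Squared-exponential kernel between two 1-D point sets."""
    d2 = (a[:, None] - b[None, :]) ** 2
    return np.exp(-0.5 * d2 / length**2)

# Toy 1-D emulator: train on a handful of "simulations" of A(Omega_m).
# The functional form sin(6x) is a stand-in for the fitted amplitude A.
x_train = np.linspace(0.1, 0.5, 8)           # e.g. Omega_m of each sim
y_train = np.sin(6 * x_train)                # stand-in fitted values
noise = 1e-4                                 # jitter for numerical stability

K = rbf(x_train, x_train) + noise * np.eye(8)
alpha = np.linalg.solve(K, y_train)

x_test = np.array([0.3])
y_pred = rbf(x_test, x_train) @ alpha        # GPR posterior mean at x_test
# y_pred interpolates the training "simulations" to an unsimulated cosmology
```

The real emulator extends this to several inputs ($\Omega_m, \Omega_b, \sigma_8, h_0, z$) and several correlated outputs ($A$, $\beta$, $\sigma$), but the solve-then-project structure is the same.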

Read this paper on arXiv…

A. Ragagnin, A. Fumagalli, T. Castro, et. al.
Wed, 13 Oct 21
11/80

Comments: 15 pages, 13 figures, submitted to A&A

Measuring chemical likeness of stars with RSCA [GA]

http://arxiv.org/abs/2110.02250


Identification of chemically similar stars using elemental abundances is core to many pursuits within Galactic archaeology. However, measuring the chemical likeness of stars using abundances directly is limited by systematic imprints of imperfect synthetic spectra in abundance derivation. We present a novel data-driven model that is capable of identifying chemically similar stars from spectra alone. We call this Relevant Scaled Component Analysis (RSCA). RSCA finds a mapping from stellar spectra to a representation that optimizes recovery of known open clusters. By design, RSCA amplifies factors of chemical abundance variation and minimizes those of non-chemical parameters, such as instrument systematics. The resultant representation of stellar spectra can therefore be used for precise measurements of chemical similarity between stars. We validate RSCA using 185 cluster stars in 22 open clusters in the APOGEE survey. We quantify our performance in measuring chemical similarity using a reference set of 151,145 field stars. We find that our representation identifies known stellar siblings more effectively than stellar abundance measurements. Using RSCA, 1.8% of pairs of field stars are as similar as birth siblings, compared to 2.3% when using stellar abundance labels. We find that almost all of the information within spectra leveraged by RSCA fits into a two-dimensional basis, which we link to [Fe/H] and alpha-element abundances. We conclude that chemical tagging of stars to their birth clusters remains prohibitive. However, using the spectra directly yields a noticeable gain, and our approach is poised to benefit from larger datasets and improved algorithm designs.

Read this paper on arXiv…

D. Mijolla and M. Ness
Thu, 7 Oct 21
39/51

Comments: submitted to ApJ, 16 pages, code:this https URL

Inferring dark matter substructure with astrometric lensing beyond the power spectrum [CEA]

http://arxiv.org/abs/2110.01620


Astrometry — the precise measurement of positions and motions of celestial objects — has emerged as a promising avenue for characterizing the dark matter population in our Galaxy. By leveraging recent advances in simulation-based inference and neural network architectures, we introduce a novel method to search for global dark matter-induced gravitational lensing signatures in astrometric datasets. Our method based on neural likelihood-ratio estimation shows significantly enhanced sensitivity to a cold dark matter population and more favorable scaling with measurement noise compared to existing approaches based on two-point correlation statistics, establishing machine learning as a powerful tool for characterizing dark matter using astrometric data.

Read this paper on arXiv…

S. Mishra-Sharma
Wed, 6 Oct 21
8/56

Comments: 10 pages, 3 figures, extended version of paper submitted to the Machine Learning and the Physical Sciences workshop at NeurIPS 2021

Arbitrary Marginal Neural Ratio Estimation for Simulation-based Inference [CL]

http://arxiv.org/abs/2110.00449


In many areas of science, complex phenomena are modeled by stochastic parametric simulators, often featuring high-dimensional parameter spaces and intractable likelihoods. In this context, performing Bayesian inference can be challenging. In this work, we present a novel method that enables amortized inference over arbitrary subsets of the parameters, without resorting to numerical integration, which makes interpretation of the posterior more convenient. Our method is efficient and can be implemented with arbitrary neural network architectures. We demonstrate the applicability of the method on parameter inference of binary black hole systems from gravitational waves observations.

Read this paper on arXiv…

F. Rozet and G. Louppe
Mon, 4 Oct 21
29/76

Comments: 4 pages, 3 figures, submitted to the Machine Learning and the Physical Sciences workshop at NeurIPS 2021

Feature Selection on a Flare Forecasting Testbed: A Comparative Study of 24 Methods [SSA]

http://arxiv.org/abs/2109.14770


The Space-Weather ANalytics for Solar Flares (SWAN-SF) is a multivariate time series benchmark dataset recently created to serve the heliophysics community as a testbed for solar flare forecasting models. SWAN-SF contains 54 unique features, with 24 quantitative features computed from the photospheric magnetic field maps of active regions, describing their precedent flare activity. In this study, for the first time, we systematically attacked the problem of quantifying the relevance of these features to the ambitious task of flare forecasting. We implemented an end-to-end pipeline for the preprocessing, feature selection, and evaluation phases. We incorporated 24 Feature Subset Selection (FSS) algorithms, including multivariate and univariate, supervised and unsupervised, wrappers and filters. We methodically compared the results of the different FSS algorithms, both on the multivariate time series and vectorized formats, and tested their correlation and reliability, to the extent possible, by using the selected features for flare forecasting on unseen data, in univariate and multivariate fashions. We concluded our investigation with a report of the best FSS methods in terms of their top-k features, and an analysis of the findings. We hope that the reproducibility of our study and the availability of the data will allow future attempts to be compared with our findings and with one another.
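A univariate filter, one of the simplest of the FSS families compared, scores each feature independently of the others. A minimal Fisher-score sketch on toy flare data (labels, feature count, and the informative feature are all illustrative):

```python
import numpy as np

def fisher_scores(X, y):
    """Univariate filter: between-class over within-class variance per
    feature (one simple FSS filter of the kind benchmarked on SWAN-SF)."""
    scores = []
    for f in range(X.shape[1]):
        g0, g1 = X[y == 0, f], X[y == 1, f]
        between = (g0.mean() - g1.mean()) ** 2
        within = g0.var() + g1.var() + 1e-12
        scores.append(between / within)
    return np.array(scores)

rng = np.random.default_rng(9)
n = 400
y = rng.integers(0, 2, n)                  # toy flare / no-flare labels
X = rng.standard_normal((n, 24))           # 24 candidate features
X[:, 3] += 2.0 * y                         # feature 3 carries the signal

top_k = np.argsort(fisher_scores(X, y))[::-1][:5]
print(top_k[0])    # 3: the informative feature ranks first
```

Wrappers, by contrast, score feature subsets by retraining the forecasting model, which is far costlier but can capture feature interactions that filters like this one miss.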

Read this paper on arXiv…

A. Yeoleka, S. Patel, S. Talla, et. al.
Fri, 1 Oct 21
63/65

Comments: 10 pages, 7 figures, 1 table, IEEE ICDM 2021, SFE-TSDM Workshop

Graph Neural Network-based Resource Allocation Strategies for Multi-Object Spectroscopy [IMA]

http://arxiv.org/abs/2109.13361


Resource allocation problems are often approached with linear programming techniques. But many concrete allocation problems in the experimental and observational sciences cannot or should not be expressed in the form of linear objective functions. Even if the objective is linear, its parameters may not be known beforehand because they depend on the results of the experiment for which the allocation is to be determined. To address these challenges, we present a bipartite Graph Neural Network architecture for trainable resource allocation strategies. Items of value and constraints form the two sets of graph nodes, which are connected by edges corresponding to possible allocations. The GNN is trained on simulations or past problem occurrences to maximize any user-supplied, scientifically motivated objective function, augmented by an infeasibility penalty. The amount of feasibility violation can be tuned in relation to any available slack in the system. We apply this method to optimize the astronomical target selection strategy for the highly multiplexed Subaru Prime Focus Spectrograph instrument, where it shows superior results to direct gradient descent optimization and extends the capabilities of the currently employed solver which uses linear objective functions. The development of this method enables fast adjustment and deployment of allocation strategies, statistical analyses of allocation patterns, and fully differentiable, science-driven solutions for resource allocation problems.

Read this paper on arXiv…

T. Wang and P. Melchior
Wed, 29 Sep 21
43/78

Comments: N/A

Multifield Cosmology with Artificial Intelligence [CEA]

http://arxiv.org/abs/2109.09747


Astrophysical processes such as feedback from supernovae and active galactic nuclei modify the properties and spatial distribution of dark matter, gas, and galaxies in a poorly understood way. This uncertainty is one of the main theoretical obstacles to extracting information from cosmological surveys. We use 2,000 state-of-the-art hydrodynamic simulations from the CAMELS project spanning a wide variety of cosmological and astrophysical models and generate hundreds of thousands of 2-dimensional maps for 13 different fields: from dark matter to gas and stellar properties. We use these maps to train convolutional neural networks to extract the maximum amount of cosmological information while marginalizing over astrophysical effects at the field level. Although our maps only cover a small area of $(25~h^{-1}{\rm Mpc})^2$, and the different fields are contaminated by astrophysical effects in very different ways, our networks can infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a few percent level precision for most of the fields. We find that the marginalization performed by the network retains a wealth of cosmological information compared to a model trained on maps from gravity-only N-body simulations that are not contaminated by astrophysical effects. Finally, we train our networks on multifields — 2D maps that contain several fields as different colors or channels — and find that not only can they infer the value of all parameters with higher accuracy than networks trained on individual fields, but they can also constrain the value of $\Omega_{\rm m}$ with higher accuracy than with the maps from the N-body simulations.

Read this paper on arXiv…

F. Villaescusa-Navarro, D. Anglés-Alcázar, S. Genel, et. al.
Wed, 22 Sep 21
10/57

Comments: 11 pages, 7 figures. First paper of a series of four. All 2D maps, codes, and networks weights publicly available at this https URL

Analyzing the Habitable Zones of Circumbinary Planets Using Machine Learning [EPA]

http://arxiv.org/abs/2109.08735


Exoplanet detection in the past decade by efforts including NASA’s Kepler and TESS missions has discovered many worlds that differ substantially from planets in our own Solar System, including more than 150 exoplanets orbiting binary or multi-star systems. This not only broadens our understanding of the diversity of exoplanets, but also promotes the study of exoplanets in complex binary systems and provides motivation to explore their habitability. In this study, we investigate the Habitable Zones of circumbinary planets based on planetary trajectory and dynamically informed habitable zones. Our results indicate that the mass ratio and orbital eccentricity of binary stars are important factors affecting the orbital stability and habitability of planetary systems. Moreover, planetary trajectory and dynamically informed habitable zones divide planetary habitability into three categories: habitable, part-habitable and uninhabitable. Therefore, we train a machine learning model to quickly and efficiently classify these planetary systems.

Read this paper on arXiv…

Z. Kong, J. Jiang, R. Burn, et. al.
Tue, 21 Sep 21
76/85

Comments: arXiv admin note: text overlap with arXiv:2101.02316

DECORAS: detection and characterization of radio-astronomical sources using deep learning [IMA]

http://arxiv.org/abs/2109.09077


We present DECORAS, a deep learning based approach to detect both point and extended sources from Very Long Baseline Interferometry (VLBI) observations. Our approach is based on an encoder-decoder neural network architecture that uses a low number of convolutional layers to provide a scalable solution for source detection. In addition, DECORAS performs source characterization in terms of the position, effective radius and peak brightness of the detected sources. We have trained and tested the network with images that are based on realistic Very Long Baseline Array (VLBA) observations at 20 cm. Also, these images have not gone through any prior de-convolution step and are directly related to the visibility data via a Fourier transform. We find that the source catalog generated by DECORAS has a better overall completeness and purity, when compared to a traditional source detection algorithm. DECORAS is complete at the 7.5$\sigma$ level, and has an almost factor of two improvement in reliability at 5.5$\sigma$. We find that DECORAS can recover the position of the detected sources to within 0.61 $\pm$ 0.69 mas, and the effective radius and peak surface brightness are recovered to within 20 per cent for 98 and 94 per cent of the sources, respectively. Overall, we find that DECORAS provides a reliable source detection and characterization solution for future wide-field VLBI surveys.
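Completeness and purity, the two catalog metrics quoted above, reduce to counting cross-matched sources between a detected catalog and the ground truth. A minimal 1D sketch (the greedy matcher, positions, and tolerance are all illustrative, not DECORAS internals):

```python
import numpy as np

def match_catalogs(detected, truth, tol):
    """Greedy positional cross-match: each detection matches the nearest
    unmatched true source within `tol`. Positions are 1D here for brevity."""
    matched_truth = set()
    n_matched = 0
    for d in detected:
        dists = np.abs(truth - d)
        for i in np.argsort(dists):
            if dists[i] > tol:
                break                      # nothing close enough
            if i not in matched_truth:
                matched_truth.add(i)
                n_matched += 1
                break
    return n_matched

truth = np.array([1.0, 5.0, 9.0, 14.0])
detected = np.array([1.1, 5.2, 20.0])      # two real detections, one spurious
n = match_catalogs(detected, truth, tol=0.5)
completeness = n / len(truth)              # fraction of true sources recovered
purity = n / len(detected)                 # fraction of detections that are real
```

Here completeness is 2/4 and purity is 2/3; quoting both at a given significance threshold (e.g. 5.5$\sigma$) is exactly the comparison made in the abstract.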

Read this paper on arXiv…

S. Rezaei, J. P. McKean, M. Biehl, et. al.
Tue, 21 Sep 21
77/85

Comments: N/A

Reconstructing Cosmic Polarization Rotation with ResUNet-CMB [CEA]

http://arxiv.org/abs/2109.09715


Cosmic polarization rotation, which may result from parity-violating new physics or the presence of primordial magnetic fields, converts $E$-mode polarization of the cosmic microwave background (CMB) into $B$-mode polarization. Anisotropic cosmic polarization rotation leads to statistical anisotropy in CMB polarization and can be reconstructed with quadratic estimator techniques similar to those designed for gravitational lensing of the CMB. At the sensitivity of upcoming CMB surveys, lensing-induced $B$-mode polarization will act as a limiting factor in the search for anisotropic cosmic polarization rotation, meaning that an analysis which incorporates some form of delensing will be required to improve constraints on the effect with future surveys. In this paper we extend the ResUNet-CMB convolutional neural network to reconstruct anisotropic cosmic polarization rotation in the presence of gravitational lensing and patchy reionization, and we show that the network simultaneously reconstructs all three effects with variance that is lower than that from the standard quadratic estimator, nearly matching the performance of an iterative reconstruction method.

Read this paper on arXiv…

E. Guzman and J. Meyers
Tue, 21 Sep 21
82/85

Comments: 11 pages, 7 figures. Code available from this https URL

Testing Self-Organized Criticality Across the Main Sequence using Stellar Flares from TESS [SSA]

http://arxiv.org/abs/2109.07011


Stars produce explosive flares, which are believed to be powered by the release of energy stored in coronal magnetic field configurations. It has been shown that solar flares exhibit energy distributions typical of self-organized critical systems. This study applies a novel flare detection technique to data obtained by NASA’s TESS mission and identifies $\sim10^6$ flaring events on $\sim10^5$ stars across spectral types. Our results suggest that magnetic reconnection events that maintain the topology of the magnetic field in a self-organized critical state are ubiquitous among stellar coronae.
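Power-law energy distributions, the signature of self-organized criticality discussed here, are typically quantified with the standard continuous power-law maximum-likelihood slope estimator. A sketch on synthetic flare energies (the index 1.8 is an arbitrary test value, not the paper's result):

```python
import numpy as np

def powerlaw_alpha_mle(energies, e_min):
    """Maximum-likelihood index for p(E) proportional to E^-alpha above e_min
    (the standard continuous power-law estimator)."""
    e = np.asarray(energies, dtype=float)
    e = e[e >= e_min]
    return 1.0 + len(e) / np.sum(np.log(e / e_min))

# Draw synthetic flare energies from a known power law via inverse-CDF sampling:
# for a Pareto law with index alpha, E = e_min * (1 - u)^(-1 / (alpha - 1)).
rng = np.random.default_rng(1)
alpha_true = 1.8
u = rng.uniform(size=50_000)
energies = 1.0 * (1.0 - u) ** (-1.0 / (alpha_true - 1.0))
alpha_hat = powerlaw_alpha_mle(energies, e_min=1.0)
```

With $\sim10^6$ detected flares, the statistical uncertainty on such a slope becomes tiny, which is what makes a population-level test of self-organized criticality across spectral types feasible.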

Read this paper on arXiv…

A. Feinstein, D. Seligman, M. Günther, et. al.
Thu, 16 Sep 21
53/54

Comments: 6 pages, 3 figures, Submitted to journal

AstronomicAL: An interactive dashboard for visualisation, integration and classification of data using Active Learning [IMA]

http://arxiv.org/abs/2109.05207


AstronomicAL is a human-in-the-loop interactive labelling and training dashboard that allows users to create reliable datasets and robust classifiers using active learning. This technique prioritises data that offer high information gain, leading to improved performance using substantially less data. The system allows users to visualise and integrate data from different sources and deal with incorrect or missing labels and imbalanced class sizes. AstronomicAL enables experts to visualise domain-specific plots and key information relating both to broader context and details of a point of interest drawn from a variety of data sources, ensuring reliable labels. In addition, AstronomicAL provides functionality to explore all aspects of the training process, including custom models and query strategies. This makes the software a tool for experimenting with both domain-specific classifications and more general-purpose machine learning strategies. We illustrate using the system with an astronomical dataset due to the field’s immediate need; however, AstronomicAL has been designed for datasets from any discipline. Finally, by exporting a simple configuration file, entire layouts, models, and assigned labels can be shared with the community. This allows for complete transparency and ensures that the process of reproducing results is effortless.

Read this paper on arXiv…

G. Stevens, S. Fotopoulou, M. Bremer, et. al.
Tue, 14 Sep 21
49/88

Comments: 7 pages, 4 figures, Journal of Open Source Software

Unsupervised classification of simulated magnetospheric regions [CL]

http://arxiv.org/abs/2109.04916


In magnetospheric missions, burst mode data sampling should be triggered in the presence of processes of scientific or operational interest. We present an unsupervised classification method for magnetospheric regions that could constitute the first step of a multi-step method for the automatic identification of magnetospheric processes of interest. Our method is based on Self-Organizing Maps (SOMs), and we test it preliminarily on data points from global magnetospheric simulations obtained with the OpenGGCM-CTIM-RCM code. The dimensionality of the data is reduced with Principal Component Analysis before classification. The classification relies exclusively on local plasma properties at the selected data points, without information on their neighborhood or on their temporal evolution. We classify the SOM nodes into an automatically selected number of classes, and we obtain clusters that map to well-defined magnetospheric regions. We validate our classification results by plotting the classified data in the simulated space and by comparing with K-means classification. For the sake of result interpretability, we examine the SOM feature maps (magnetospheric variables are called features in the context of classification), and we use them to unlock information on the clusters. We repeat the classification experiments using different sets of features, we quantitatively compare the different classification results, and we obtain insights on which magnetospheric variables make more effective features for unsupervised classification.
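The PCA-then-SOM pipeline can be sketched compactly. The grid size, learning-rate schedule, and synthetic three-region data below are illustrative, not the paper's configuration:

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "plasma" data: three regions with distinct local properties in 4D.
X = np.vstack([rng.normal(m, 0.3, size=(200, 4)) for m in (0.0, 3.0, 6.0)])

# Step 1: PCA via SVD, keeping 2 components (dimensionality reduction).
Xc = X - X.mean(axis=0)
_, _, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:2].T

# Step 2: a minimal 1D SOM. Each sample pulls its best-matching unit (BMU)
# toward it, dragging grid neighbors along with a decaying Gaussian kernel.
nodes = rng.normal(size=(9, 2))
for t in range(2000):
    z = Z[rng.integers(len(Z))]
    bmu = np.argmin(((nodes - z) ** 2).sum(axis=1))
    lr = 0.5 * (1 - t / 2000)
    sigma = 2.0 * (1 - t / 2000) + 0.5
    h = np.exp(-((np.arange(9) - bmu) ** 2) / (2 * sigma ** 2))
    nodes += lr * h[:, None] * (z - nodes)

# Assign each (reduced) data point to its BMU; these node labels would then
# be grouped into a small number of classes, as the paper describes.
labels = np.array([np.argmin(((nodes - z) ** 2).sum(axis=1)) for z in Z])
```

The grouping of SOM nodes into final classes (and the comparison against K-means) sits on top of exactly this kind of BMU assignment.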

Read this paper on arXiv…

M. Innocenti, J. Amaya, J. Raeder, et. al.
Mon, 13 Sep 21
24/52

Comments: N/A

Postulating Exoplanetary Habitability via a Novel Anomaly Detection Method [EPA]

http://arxiv.org/abs/2109.02273


A profound shift in the study of cosmology came with the discovery of thousands of exoplanets and the possibility of the existence of billions of them in our Galaxy. The biggest goal in these searches is whether there are other life-harbouring planets. However, the question of which of these detected planets are habitable, potentially habitable, or maybe even inhabited, is still not answered. Some potentially habitable exoplanets have been hypothesized, but since Earth is the only known habitable planet, measures of habitability are necessarily determined with Earth as the reference. Several recent works introduced new habitability metrics based on optimization methods. Classification of potentially habitable exoplanets using supervised learning is another emerging area of study. However, both modeling and supervised learning approaches suffer from drawbacks. We propose an anomaly detection method, the Multi-Stage Memetic Algorithm (MSMA), and extend it to an unsupervised clustering algorithm, MSMVMCA, that detects potentially habitable exoplanets as anomalies. The algorithm is based on the postulate that Earth is an anomaly, with the possibility that a few other anomalies exist among thousands of data points. We describe an MSMA-based clustering approach with a novel distance function to detect habitable candidates as anomalies (including Earth). The results are cross-matched with the habitable exoplanet catalog (PHL-HEC) of the Planetary Habitability Laboratory (PHL) with both optimistic and conservative lists of potentially habitable exoplanets.
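The "Earth as an anomaly" postulate can be conveyed with any distance-based anomaly scorer; the k-nearest-neighbour version below is a generic stand-in for MSMA/MSMVMCA, not the authors' algorithm, and the data are synthetic:

```python
import numpy as np

def knn_anomaly_scores(X, k=5):
    """Score each point by its mean distance to its k nearest neighbours;
    rare points far from the bulk (an Earth among exoplanets, in the paper's
    postulate) receive high scores."""
    d = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=-1)
    d_sorted = np.sort(d, axis=1)[:, 1:k + 1]   # drop the zero self-distance
    return d_sorted.mean(axis=1)

rng = np.random.default_rng(0)
X = rng.normal(0.0, 1.0, size=(300, 6))   # bulk of "ordinary" planets
X[0] = 8.0                                # one clear anomaly in all features
scores = knn_anomaly_scores(X)            # index 0 should score highest
```

The paper's contribution is a clustering-based detector with a novel distance function; the point of the sketch is only that habitability candidates are sought as high-score outliers rather than as members of a labeled class.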

Read this paper on arXiv…

J. Sarkar, K. Bhatia, S. Saha, et. al.
Tue, 7 Sep 21
87/89

Comments: 12 pages, 3 figures, submitted to MNRAS

Segmentation of turbulent computational fluid dynamics simulations with unsupervised ensemble learning [CL]

http://arxiv.org/abs/2109.01381


Computer vision and machine learning tools offer an exciting new way for automatically analyzing and categorizing information from complex computer simulations. Here we design an ensemble machine learning framework that can independently and robustly categorize and dissect simulation data output contents of turbulent flow patterns into distinct structure catalogues. The segmentation is performed using an unsupervised clustering algorithm, which segments physical structures by grouping together similar pixels in simulation images. The accuracy and robustness of the resulting segment region boundaries are enhanced by combining information from multiple simultaneously-evaluated clustering operations. The stacking of object segmentation evaluations is performed using image mask combination operations. This statistically-combined ensemble (SCE) of different cluster masks allows us to construct cluster reliability metrics for each pixel and for the associated segments without any prior user input. By comparing the similarity of different cluster occurrences in the ensemble, we can also assess the optimal number of clusters needed to describe the data. Furthermore, by relying on ensemble-averaged spatial segment region boundaries, the SCE method enables reconstruction of more accurate and robust region of interest (ROI) boundaries for the different image data clusters. We apply the SCE algorithm to 2-dimensional simulation data snapshots of magnetically-dominated fully-kinetic turbulent plasma flows where accurate ROI boundaries are needed for geometrical measurements of intermittent flow structures known as current sheets.
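The mask-combination step at the heart of the SCE approach can be sketched directly: stack boolean segment masks from independent clustering runs, read off per-pixel agreement as a reliability metric, and threshold it. The toy masks and majority-vote threshold below are illustrative choices:

```python
import numpy as np

# Boolean segmentation masks of the same 2x3 image from four independent
# clustering runs (e.g. different random seeds).
masks = np.array([
    [[1, 1, 0], [0, 1, 0]],
    [[1, 1, 0], [0, 0, 0]],
    [[1, 0, 0], [0, 1, 1]],
    [[1, 1, 0], [0, 1, 0]],
], dtype=bool)

reliability = masks.mean(axis=0)   # per-pixel agreement fraction in [0, 1]
sce_mask = reliability >= 0.5      # statistically-combined (majority) segment
```

Pixels where the runs disagree get intermediate reliability, which is what lets the ensemble both sharpen region-of-interest boundaries and expose unstable ones without user input.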

Read this paper on arXiv…

M. Bussov and J. Nättilä
Mon, 6 Sep 21
46/48

Comments: 15 pages, 8 figures. Accepted to Signal Processing: Image Communication. Code available from a repository: this https URL

Noisy Labels for Weakly Supervised Gamma Hadron Classification [CL]

http://arxiv.org/abs/2108.13396


Gamma hadron classification, a central machine learning task in gamma ray astronomy, is conventionally tackled with supervised learning. However, the supervised approach requires annotated training data to be produced in sophisticated and costly simulations. We propose to instead solve gamma hadron classification with a noisy label approach that only uses unlabeled data recorded by the real telescope. To this end, we employ the significance of detection as a learning criterion which addresses this form of weak supervision. We show that models which are based on the significance of detection deliver state-of-the-art results, despite being exclusively trained with noisy labels; put differently, our models do not require the costly simulated ground-truth labels that astronomers otherwise employ for classifier training. Our weakly supervised models exhibit competitive performances also on imbalanced data sets that stem from a variety of other application domains. In contrast to existing work on class-conditional label noise, we assume that only one of the class-wise noise rates is known.
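The standard detection-significance statistic in gamma-ray astronomy is the Li & Ma (1983) formula for on/off counting; a minimal implementation is below (that the paper's learning criterion takes exactly this form is an assumption on our part):

```python
import math

def li_ma_significance(n_on, n_off, alpha):
    """Li & Ma (1983, Eq. 17) significance of a gamma-ray excess: n_on counts
    in the on-source region, n_off in the off-source region, alpha the
    on/off exposure ratio. Requires n_on > 0 and n_off > 0."""
    f = n_on / (n_on + n_off)
    t_on = n_on * math.log((1 + alpha) / alpha * f)
    t_off = n_off * math.log((1 + alpha) * (1 - f))
    return math.sqrt(2.0 * (t_on + t_off))
```

Because the statistic needs only raw on/off counts, never per-event labels, maximizing it gives a training signal from real telescope data alone, which is the sense in which the supervision here is weak.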

Read this paper on arXiv…

L. Pfahler, M. Bunse and K. Morik
Tue, 31 Aug 21
20/73

Comments: N/A

Machine Learning for Discovering Effective Interaction Kernels between Celestial Bodies from Ephemerides [EPA]

http://arxiv.org/abs/2108.11894


Building accurate and predictive models of the underlying mechanisms of celestial motion has inspired fundamental developments in theoretical physics. Candidate theories seek to explain observations and predict future positions of planets, stars, and other astronomical bodies as faithfully as possible. We use a data-driven learning approach, developed in Lu et al. (2019) and extended in Zhong et al. (2020), to derive a stable and accurate model for the motion of celestial bodies in our Solar System. Our model is based on a collective dynamics framework, and is learned from the NASA Jet Propulsion Lab’s development ephemerides. By modeling the major astronomical bodies in the Solar System as pairwise interacting agents, our learned model generates extremely accurate dynamics that preserve not only intrinsic geometric properties of the orbits, but also highly sensitive features of the dynamics, such as perihelion precession rates. Our learned model can provide a unified explanation of the observation data, especially in terms of reproducing the perihelion precession of Mars, Mercury, and the Moon. Moreover, our model outperforms Newton’s Law of Universal Gravitation in all cases and performs similarly to, and exceeds on the Moon, the Einstein-Infeld-Hoffman equations derived from Einstein’s theory of general relativity.

Read this paper on arXiv…

M. Zhong, J. Miller and M. Maggioni
Fri, 27 Aug 21
52/67

Comments: N/A

Self-optimizing adaptive optics control with Reinforcement Learning for high-contrast imaging [IMA]

http://arxiv.org/abs/2108.11332


Current and future high-contrast imaging instruments require extreme adaptive optics (XAO) systems to reach contrasts necessary to directly image exoplanets. Telescope vibrations and the temporal error induced by the latency of the control loop limit the performance of these systems. One way to reduce these effects is to use predictive control. We describe how model-free Reinforcement Learning can be used to optimize a Recurrent Neural Network controller for closed-loop predictive control. First, we verify our proposed approach for tip-tilt control in simulations and a lab setup. The results show that this algorithm can effectively learn to mitigate vibrations and reduce the residuals for power-law input turbulence as compared to an optimal gain integrator. We also show that the controller can learn to minimize random vibrations without requiring online updating of the control law. Next, we show in simulations that our algorithm can also be applied to the control of a high-order deformable mirror. We demonstrate that our controller can provide two orders of magnitude improvement in contrast at small separations under stationary turbulence. Furthermore, we show more than an order of magnitude improvement in contrast for different wind velocities and directions without requiring online updating of the control law.

Read this paper on arXiv…

R. Landman, S. Haffert, V. Radhakrishnan, et. al.
Thu, 26 Aug 21
32/52

Comments: Accepted for publication in JATIS. arXiv admin note: substantial text overlap with arXiv:2012.01997

Deep learning for surrogate modelling of 2D mantle convection [EPA]

http://arxiv.org/abs/2108.10105


Traditionally, 1D models based on scaling laws have been used to parameterize convective heat transfer in the rocky interiors of terrestrial planets like Earth, Mars, Mercury and Venus, in order to tackle the computational bottleneck of high-fidelity forward runs in 2D or 3D. However, these are limited in the amount of physics they can model (e.g. depth-dependent material properties) and predict only mean quantities such as the mean mantle temperature. We recently showed that feedforward neural networks (FNN) trained using a large number of 2D simulations can overcome this limitation and reliably predict the evolution of the entire 1D laterally-averaged temperature profile in time for complex models [Agarwal et al. 2020]. We now extend that approach to predict the full 2D temperature field, which contains more information in the form of convection structures such as hot plumes and cold downwellings. Using a dataset of 10,525 two-dimensional simulations of the thermal evolution of the mantle of a Mars-like planet, we show that deep learning techniques can produce reliable parameterized surrogates (i.e. surrogates that predict state variables such as temperature based only on parameters) of the underlying partial differential equations. We first use convolutional autoencoders to compress the temperature fields by a factor of 142 and then use FNN and long short-term memory networks (LSTM) to predict the compressed fields. On average, the FNN predictions are 99.30% and the LSTM predictions are 99.22% accurate with respect to unseen simulations. Proper orthogonal decomposition (POD) of the LSTM and FNN predictions shows that despite a lower mean absolute relative accuracy, LSTMs capture the flow dynamics better than FNNs. When summed, the POD coefficients from FNN predictions and from LSTM predictions amount to 96.51% and 97.66% relative to the coefficients of the original simulations, respectively.

Read this paper on arXiv…

S. Agarwal, N. Tosi, P. Kessel, et. al.
Tue, 24 Aug 21
66/76

Comments: N/A

AGNet: Weighing Black Holes with Deep Learning [GA]

http://arxiv.org/abs/2108.07749


Supermassive black holes (SMBHs) are ubiquitously found at the centers of most massive galaxies. Measuring SMBH mass is important for understanding the origin and evolution of SMBHs. However, traditional methods require spectroscopic data, which are expensive to gather. We present an algorithm that weighs SMBHs using quasar light time series, circumventing the need for expensive spectra. We train, validate, and test neural networks that directly learn from the Sloan Digital Sky Survey (SDSS) Stripe 82 light curves for a sample of $38,939$ spectroscopically confirmed quasars to map out the nonlinear encoding between SMBH mass and multi-color optical light curves. We find a 1$\sigma$ scatter of 0.37 dex between the predicted SMBH mass and the fiducial virial mass estimate based on SDSS single-epoch spectra, which is comparable to the systematic uncertainty in the virial mass estimate. Our results have direct implications for more efficient applications with future observations from the Vera C. Rubin Observatory. Our code, AGNet, is publicly available at https://github.com/snehjp2/AGNet.

Read this paper on arXiv…

J. Lin, S. Pandya, D. Pratap, et. al.
Wed, 18 Aug 21
13/70

Comments: 8 pages, 7 figures, 1 table, submitting to MNRAS

A Machine-Learning-Ready Dataset Prepared from the Solar and Heliospheric Observatory Mission [SSA]

http://arxiv.org/abs/2108.06394


We present a Python tool to generate a standard dataset from solar images that allows for user-defined selection criteria and a range of pre-processing steps. Our Python tool works with all image products from both the Solar and Heliospheric Observatory (SoHO) and Solar Dynamics Observatory (SDO) missions. We discuss a dataset produced from the SoHO mission’s multi-spectral images that is free of missing or corrupt data as well as planetary transits in coronagraph images, and is temporally synced, making it ready for input to a machine learning system. Machine-learning-ready images are a valuable resource for the community because they can be used, for example, for forecasting space weather parameters. We illustrate the use of this data with a 3-5 day-ahead forecast of the north-south component of the interplanetary magnetic field (IMF) observed at Lagrange point one (L1). For this use case, we apply a deep convolutional neural network (CNN) to a subset of the full SoHO dataset and compare with baseline results from a Gaussian Naive Bayes classifier.

Read this paper on arXiv…

C. Shneider, A. Hu, A. Tiwari, et. al.
Tue, 17 Aug 21
16/56

Comments: under review

Discovering outliers in the Mars Express thermal power consumption patterns [IMA]

http://arxiv.org/abs/2108.02067


The Mars Express (MEX) spacecraft has been orbiting Mars since 2004. The operators need to constantly monitor its behavior and handle sporadic deviations (outliers) from the expected patterns of measurements of quantities that the satellite is sending to Earth. In this paper, we analyze the patterns of the electrical power consumption of MEX’s thermal subsystem, which maintains the spacecraft’s temperature at the desired level. The consumption is not constant, but should be roughly periodic in the short term, with a period that corresponds to one orbit around Mars. By using long short-term memory neural networks, we show that the consumption pattern is more irregular than expected, and successfully detect such irregularities, opening the possibility of automatic outlier detection on MEX in the future.
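A rough non-neural baseline for the same task: subtract a per-orbital-phase template from the roughly periodic consumption signal and flag large robust-scaled residuals. The period, threshold, and injected anomaly below are all illustrative, not MEX telemetry:

```python
import numpy as np

rng = np.random.default_rng(0)

# Roughly periodic "power consumption" (one orbit = 50 samples) plus noise,
# with one sporadic deviation injected at sample 777.
period = 50
t = np.arange(20 * period)
signal = 10 + 2 * np.sin(2 * np.pi * t / period) + rng.normal(0, 0.1, t.size)
signal[777] += 3.0

# Per-phase median template captures the expected periodic pattern.
phase = t % period
template = np.array([np.median(signal[phase == p]) for p in range(period)])

# Flag samples whose residual exceeds 5 robust standard deviations
# (MAD scaled by 1.4826 approximates the Gaussian sigma).
residual = signal - template[phase]
mad = np.median(np.abs(residual - np.median(residual)))
outliers = np.flatnonzero(np.abs(residual) > 5 * 1.4826 * mad)
```

An LSTM, as used in the paper, replaces the fixed per-phase template with a learned predictor, which is what lets it cope with patterns that are "more irregular than expected."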

Read this paper on arXiv…

M. Petković, L. Lucas, T. Stepišnik, et. al.
Thu, 5 Aug 21
55/57

Comments: Presented at the SMC-IT 2021 conference

Automatic classification of eclipsing binary stars using deep learning methods [SSA]

http://arxiv.org/abs/2108.01640


In the last couple of decades, tremendous progress has been achieved in developing robotic telescopes and, as a result, sky surveys (both terrestrial and space-based) have become the source of a substantial amount of new observational data. These data contain a lot of information about binary stars, hidden in their light curves. With the huge amount of astronomical data gathered, it is not reasonable to expect all the data to be manually processed and analyzed. Therefore, in this paper, we focus on the automatic classification of eclipsing binary stars using deep learning methods. Our classifier provides a tool for the categorization of light curves of binary stars into two classes: detached and over-contact. We used the ELISa software to obtain synthetic data, which we then used for the training of the classifier. For evaluation, we collected 100 light curves of observed binary stars and used them to assess a number of classifiers; semi-detached eclipsing binary stars were treated as detached. The best-performing classifier combines bidirectional Long Short-Term Memory (LSTM) and a one-dimensional convolutional neural network, which achieved 98% accuracy on the evaluation set. Omitting semi-detached eclipsing binary stars, we could obtain 100% accuracy in classification.

Read this paper on arXiv…

M. Čokina, V. Maslej-Krešňáková, P. Butka, et. al.
Wed, 4 Aug 21
5/66

Comments: N/A

A Machine-Learning-Based Direction-of-Origin Filter for the Identification of Radio Frequency Interference in the Search for Technosignatures [IMA]

http://arxiv.org/abs/2108.00559


Radio frequency interference (RFI) mitigation remains a major challenge in the search for radio technosignatures. Typical mitigation strategies include a direction-of-origin (DoO) filter, where a signal is classified as RFI if it is detected in multiple directions on the sky. These classifications generally rely on estimates of signal properties, such as frequency and frequency drift rate. Convolutional neural networks (CNNs) offer a promising complement to existing filters because they can be trained to analyze dynamic spectra directly, instead of relying on inferred signal properties. In this work, we compiled several data sets consisting of labeled pairs of images of dynamic spectra, and we designed and trained a CNN that can determine whether or not a signal detected in one scan is also present in another scan. This CNN-based DoO filter outperforms both a baseline 2D correlation model as well as existing DoO filters over a range of metrics, with precision and recall values of 99.15% and 97.81%, respectively. We found that the CNN reduces the number of signals requiring visual inspection after the application of traditional DoO filters by a factor of 6-16 in nominal situations.
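The 2D-correlation baseline that the CNN is compared against can be approximated by a zero-lag normalized correlation between two dynamic-spectrum patches: high correlation suggests the same signal appears in both scans (and is therefore RFI under a direction-of-origin criterion). The synthetic scans below are illustrative:

```python
import numpy as np

def normalized_correlation(a, b):
    """Zero-lag normalized correlation of two equal-size dynamic-spectrum
    patches; near 1 when the same structure is present in both."""
    a = (a - a.mean()) / (a.std() + 1e-12)
    b = (b - b.mean()) / (b.std() + 1e-12)
    return float((a * b).mean())

rng = np.random.default_rng(0)
scan_a = rng.normal(size=(64, 64))
scan_a[30:34, :] += 5.0                                   # a bright tone band
scan_b = scan_a + rng.normal(scale=0.5, size=(64, 64))    # same signal, new noise
scan_c = rng.normal(size=(64, 64))                        # signal absent

same = normalized_correlation(scan_a, scan_b)   # high: signal in both scans
diff = normalized_correlation(scan_a, scan_c)   # near zero: signal absent
```

A full baseline would also scan over time-frequency lags to allow for drift; the CNN's advantage reported in the abstract is precisely that it learns such invariances from labeled pairs instead of relying on inferred drift rates.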

Read this paper on arXiv…

P. Pinchuk and J. Margot
Tue, 3 Aug 21
4/90

Comments: 26 pages, 14 figures, submitted for publication (submitted on July 28, 2021)

Source-Agnostic Gravitational-Wave Detection with Recurrent Autoencoders [CL]

http://arxiv.org/abs/2107.12698


We present an application of anomaly detection techniques based on deep recurrent autoencoders to the problem of detecting gravitational wave signals in laser interferometers. Trained on noise data, this class of algorithms could detect signals using an unsupervised strategy, i.e., without targeting a specific kind of source. We develop a custom architecture to analyze the data from two interferometers. We compare the obtained performance to that obtained with other autoencoder architectures and with a convolutional classifier. The unsupervised nature of the proposed strategy comes with a cost in terms of accuracy, when compared to more traditional supervised techniques. On the other hand, there is a qualitative gain in generalizing the experimental sensitivity beyond the ensemble of pre-computed signal templates. The recurrent autoencoder outperforms other autoencoders based on different architectures. The class of recurrent autoencoders presented in this paper could complement the search strategy employed for gravitational wave detection and extend the reach of the ongoing detection campaigns.

Read this paper on arXiv…

E. Moreno, J. Vlimant, M. Spiropulu, et. al.
Wed, 28 Jul 21
17/68

Comments: 16 pages, 6 figures

Constraining dark matter annihilation with cosmic ray antiprotons using neural networks [HEAP]

http://arxiv.org/abs/2107.12395


The interpretation of data from indirect detection experiments searching for dark matter annihilations requires computationally expensive simulations of cosmic-ray propagation. In this work we present a new method based on Recurrent Neural Networks that significantly accelerates simulations of secondary and dark matter Galactic cosmic ray antiprotons while achieving excellent accuracy. This approach allows for an efficient profiling or marginalisation over the nuisance parameters of a cosmic ray propagation model in order to perform parameter scans for a wide range of dark matter models. We identify importance sampling as particularly suitable for ensuring that the network is only evaluated in well-trained parameter regions. We present resulting constraints using the most recent AMS-02 antiproton data on several models of Weakly Interacting Massive Particles. The fully trained networks are released as DarkRayNet together with this work and achieve a speed-up of the runtime by at least two orders of magnitude compared to conventional approaches.

Read this paper on arXiv…

F. Kahlhoefer, M. Korsmeier, M. Krämer, et. al.
Wed, 28 Jul 21
51/68

Comments: N/A

Combining Maximum-Likelihood with Deep Learning for Event Reconstruction in IceCube [HEAP]

http://arxiv.org/abs/2107.12110


The field of deep learning has become increasingly important for particle physics experiments, yielding a multitude of advances, predominantly in event classification and reconstruction tasks. Many of these applications have been adopted from other domains. However, data in the field of physics are unique in the context of machine learning, insofar as their generation process and the laws and symmetries they abide by are usually well understood. Most commonly used deep learning architectures fail at utilizing this available information. In contrast, more traditional likelihood-based methods are capable of exploiting domain knowledge, but they are often limited by computational complexity. In this contribution, a hybrid approach is presented that utilizes generative neural networks to approximate the likelihood, which may then be used in a traditional maximum-likelihood setting. Domain knowledge, such as invariances and detector characteristics, can easily be incorporated in this approach. The hybrid approach is illustrated by the example of event reconstruction in IceCube.

Read this paper on arXiv…

M. Hünnefeld
Tue, 27 Jul 21
71/97

Comments: Presented at the 37th International Cosmic Ray Conference (ICRC 2021). See arXiv:2107.06966 for all IceCube contributions

Deep Learning Based Reconstruction of Total Solar Irradiance [SSA]

http://arxiv.org/abs/2107.11042


The Earth’s primary source of energy is the radiant energy generated by the Sun, which is referred to as solar irradiance, or total solar irradiance (TSI) when all of the radiation is measured. A minor change in the solar irradiance can have a significant impact on the Earth’s climate and atmosphere. As a result, studying and measuring solar irradiance is crucial in understanding climate changes and solar variability. Several methods have been developed to reconstruct total solar irradiance for long and short periods of time; however, they are physics-based and rely on the availability of data, which does not go beyond 9,000 years. In this paper we propose a new method, called TSInet, to reconstruct total solar irradiance by deep learning for short and long periods of time that span beyond the physical models’ data availability. On the data that are available, our method agrees well with the state-of-the-art physics-based reconstruction models. To our knowledge, this is the first time that deep learning has been used to reconstruct total solar irradiance for more than 9,000 years.

Read this paper on arXiv…

Y. Abduallah, J. Wang, Y. Shen, et al.

Mon, 26 Jul 21
2/62

Comments: 8 pages, 11 figures

Dim but not entirely dark: Extracting the Galactic Center Excess' source-count distribution with neural nets [HEAP]

http://arxiv.org/abs/2107.09070


The two leading hypotheses for the Galactic Center Excess (GCE) in the $\textit{Fermi}$ data are an unresolved population of faint millisecond pulsars (MSPs) and dark-matter (DM) annihilation. The dichotomy between these explanations is typically reflected by modeling them as two separate emission components. However, point-sources (PSs) such as MSPs become statistically degenerate with smooth Poisson emission in the ultra-faint limit (formally where each source is expected to contribute much less than one photon on average), leading to an ambiguity that can render questions such as whether the emission is PS-like or Poissonian in nature ill-defined. We present a conceptually new approach that describes the PS and Poisson emission in a unified manner and only afterwards derives constraints on the Poissonian component from the so obtained results. For the implementation of this approach, we leverage deep learning techniques, centered around a neural network-based method for histogram regression that expresses uncertainties in terms of quantiles. We demonstrate that our method is robust against a number of systematics that have plagued previous approaches, in particular DM / PS misattribution. In the $\textit{Fermi}$ data, we find a faint GCE described by a median source-count distribution (SCD) peaked at a flux of $\sim4 \times 10^{-11} \ \text{counts} \ \text{cm}^{-2} \ \text{s}^{-1}$ (corresponding to $\sim3 – 4$ expected counts per PS), which would require $N \sim \mathcal{O}(10^4)$ sources to explain the entire excess (median value $N = \text{29,300}$ across the sky). Although faint, this SCD allows us to derive the constraint $\eta_P \leq 66\%$ for the Poissonian fraction of the GCE flux $\eta_P$ at 95% confidence, suggesting that a substantial amount of the GCE flux is due to PSs.
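The quantile-based histogram regression rests on the pinball (quantile) loss, which is minimised by the requested quantile of the target distribution. A minimal sketch (illustrative, not the paper's network):

```python
import numpy as np

def pinball_loss(y_true, y_pred, tau):
    """Quantile (pinball) loss: minimised when y_pred equals the
    tau-quantile of the distribution of y_true."""
    diff = y_true - y_pred
    return np.mean(np.maximum(tau * diff, (tau - 1.0) * diff))

# The optimal constant prediction under this loss is the empirical quantile:
rng = np.random.default_rng(0)
y = rng.exponential(1.0, size=20000)          # toy "per-pixel flux" samples
candidates = np.linspace(0.0, 3.0, 301)
best = candidates[np.argmin([pinball_loss(y, c, 0.5) for c in candidates])]
# best lands near the empirical median (ln 2 for a unit exponential)
```

Training one network output per tau value yields the quantile bands on the source-count histogram that the paper reports.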

Read this paper on arXiv…

F. List, N. Rodd and G. Lewis
Wed, 21 Jul 21
3/83

Comments: 36+8 pages, 15+6 figures, main results in Figs. 8 and 12

Reconstruction of the Density Power Spectrum from Quasar Spectra using Machine Learning [CEA]

http://arxiv.org/abs/2107.09082


We describe a novel end-to-end approach using Machine Learning to reconstruct the power spectrum of cosmological density perturbations at high redshift from observed quasar spectra. State-of-the-art cosmological simulations of structure formation are used to generate a large synthetic dataset of line-of-sight absorption spectra paired with 1-dimensional fluid quantities along the same line-of-sight, such as the total density of matter and the density of neutral atomic hydrogen. With this dataset, we build a series of data-driven models to predict the power spectrum of total matter density. We are able to produce models whose reconstructions are accurate to about 1% for wavenumbers $k \leq 2\,h\,\mathrm{Mpc}^{-1}$, while the error increases at larger $k$. We show the size of the data sample required to reach a particular error rate, giving a sense of how much data is necessary to reach a desired accuracy. This work provides a foundation for developing methods to analyse very large upcoming datasets with the next-generation observational facilities.
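For reference, the target quantity, a 1-D power spectrum along the line of sight, can be computed from a sampled overdensity field with a short FFT routine. The conventions below (box size, continuum normalisation) are one common choice, not necessarily the paper's.

```python
import numpy as np

def power_spectrum_1d(delta, box_size):
    """Power spectrum P(k) of a 1-D overdensity field on a regular grid,
    using the continuum Fourier-transform convention."""
    n = delta.size
    dk = np.fft.rfft(delta) * (box_size / n)
    k = 2.0 * np.pi * np.fft.rfftfreq(n, d=box_size / n)
    pk = np.abs(dk) ** 2 / box_size
    return k, pk

# Sanity check: a single sine mode puts all power at its own wavenumber.
n, L = 1024, 100.0                      # grid points, box length (arbitrary units)
x = np.linspace(0.0, L, n, endpoint=False)
k0 = 2.0 * np.pi * 5 / L
delta = np.sin(k0 * x)
k, pk = power_spectrum_1d(delta, L)
peak_k = k[np.argmax(pk)]
```

A learned model is then trained to map an absorption spectrum to this P(k), bypassing the explicit density reconstruction.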

Read this paper on arXiv…

M. Veiga, X. Meng, O. Gnedin, et al.

Wed, 21 Jul 21
5/83

Comments: 10 pages, 9 figures

DPNNet-2.0 Part I: Finding hidden planets from simulated images of protoplanetary disk gaps [EPA]

http://arxiv.org/abs/2107.09086


The observed sub-structures, like annular gaps, in dust emission from protoplanetary disks are often interpreted as signatures of embedded planets. Fitting a model of planetary gaps to these observed features using customized simulations or empirical relations can reveal the characteristics of the hidden planets. However, customized fitting is often impractical owing to the increasing sample size and the complexity of disk-planet interaction. In this paper we introduce the architecture of DPNNet-2.0, second in the series after DPNNet \citep{aud20}, designed using a Convolutional Neural Network (CNN; here specifically ResNet50) for predicting exoplanet masses directly from simulated images of protoplanetary disks hosting a single planet. DPNNet-2.0 additionally consists of a multi-input framework that uses both a CNN and a multi-layer perceptron (a class of artificial neural network) for processing images and disk parameters simultaneously. This enables DPNNet-2.0 to be trained using images directly, with the added option of considering disk parameters (disk viscosities, disk temperatures, disk surface density profiles, dust abundances, and particle Stokes numbers) generated from disk-planet hydrodynamic simulations as inputs. This work provides the required framework and is the first step towards the use of computer vision (implementing CNN) to directly extract the mass of an exoplanet from planetary gaps observed in dust-surface density maps by telescopes such as the Atacama Large (sub-)Millimeter Array.
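The multi-input design, image features from a CNN branch concatenated with scalar disk parameters before a final regression head, can be sketched in miniature. The average-pooling "CNN branch" and all weights below are toy stand-ins, not DPNNet-2.0 itself.

```python
import numpy as np

rng = np.random.default_rng(0)

def image_branch(img):
    """Stand-in for the CNN branch: 2x2 average pooling, then flatten."""
    h, w = img.shape
    pooled = img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return pooled.ravel()

def fused_prediction(img, disk_params, w_img, w_par, bias):
    """Concatenate image features with disk parameters; a linear head then
    maps the joint feature vector to a planet-mass estimate."""
    feats = np.concatenate([image_branch(img), disk_params])
    weights = np.concatenate([w_img, w_par])
    return feats @ weights + bias

img = rng.random((8, 8))                 # toy disk-gap image
params = np.array([1e-3, 0.05])          # e.g. viscosity, aspect ratio (illustrative)
w_img = rng.normal(size=16) * 0.01
w_par = rng.normal(size=2)
mass = fused_prediction(img, params, w_img, w_par, bias=0.0)
```

The point of the fusion is that the head can learn how the image-to-mass mapping shifts with disk physics, rather than marginalising those parameters away.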

Read this paper on arXiv…

S. Auddy, R. Dey, M. Lin, et al.

Wed, 21 Jul 21
63/83

Comments: 15 pages, 10 figures, to appear in ApJ

Tracing Halpha Fibrils through Bayesian Deep Learning [SSA]

http://arxiv.org/abs/2107.07886


We present a new deep learning method, dubbed FibrilNet, for tracing chromospheric fibrils in Halpha images of solar observations. Our method consists of a data pre-processing component that prepares training data from a threshold-based tool, a deep learning model implemented as a Bayesian convolutional neural network for probabilistic image segmentation with uncertainty quantification to predict fibrils, and a post-processing component containing a fibril-fitting algorithm to determine fibril orientations. The FibrilNet tool is applied to high-resolution Halpha images from an active region (AR 12665) collected by the 1.6 m Goode Solar Telescope (GST) equipped with high-order adaptive optics at the Big Bear Solar Observatory (BBSO). We quantitatively assess the FibrilNet tool, comparing its image segmentation algorithm and fibril-fitting algorithm with those employed by the threshold-based tool. Our experimental results and major findings are summarized as follows. First, the image segmentation results (i.e., detected fibrils) of the two tools are quite similar, demonstrating the good learning capability of FibrilNet. Second, FibrilNet finds more accurate and smoother fibril orientation angles than the threshold-based tool. Third, FibrilNet is faster than the threshold-based tool and the uncertainty maps produced by FibrilNet not only provide a quantitative way to measure the confidence on each detected fibril, but also help identify fibril structures that are not detected by the threshold-based tool but are inferred through machine learning. Finally, we apply FibrilNet to full-disk Halpha images from other solar observatories and additional high-resolution Halpha images collected by BBSO/GST, demonstrating the tool’s usability in diverse datasets.
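The Bayesian ingredient in networks of this kind is typically Monte-Carlo dropout: keep dropout active at prediction time, repeat the stochastic forward pass, and read the spread of the outputs as the uncertainty map. A toy single-layer sketch (not FibrilNet's architecture; the weights are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(0)

def stochastic_forward(x, w, drop_p=0.5):
    """One forward pass with dropout left ON at prediction time."""
    mask = rng.random(w.shape) >= drop_p
    return x @ (w * mask) / (1.0 - drop_p)   # inverted-dropout rescaling

def mc_dropout_predict(x, w, n_samples=500):
    """Monte-Carlo dropout: repeat stochastic passes and report the mean
    prediction with a per-output uncertainty (standard deviation)."""
    draws = np.array([stochastic_forward(x, w) for _ in range(n_samples)])
    return draws.mean(axis=0), draws.std(axis=0)

x = np.array([1.0, 2.0, 3.0])            # toy feature vector (one pixel)
w = np.array([[0.5], [0.1], [-0.2]])     # toy weights
mean, std = mc_dropout_predict(x, w)
```

For image segmentation the same recipe runs per pixel, producing both the fibril probability map and the confidence map the paper uses to flag structures missed by the threshold-based tool.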

Read this paper on arXiv…

H. Jiang, J. Jing, J. Wang, et al.

Mon, 19 Jul 21
30/70

Comments: 20 pages, 12 figures

Autoencoder-driven Spiral Representation Learning for Gravitational Wave Surrogate Modelling [CL]

http://arxiv.org/abs/2107.04312


Recently, artificial neural networks have been gaining momentum in the field of gravitational wave astronomy, for example in surrogate modelling of computationally expensive waveform models for binary black hole inspiral and merger. Surrogate modelling yields fast and accurate approximations of gravitational waves and neural networks have been used in the final step of interpolating the coefficients of the surrogate model for arbitrary waveforms outside the training sample. We investigate the existence of underlying structures in the empirical interpolation coefficients using autoencoders. We demonstrate that when the coefficient space is compressed to only two dimensions, a spiral structure appears, wherein the spiral angle is linearly related to the mass ratio. Based on this finding, we design a spiral module with learnable parameters, that is used as the first layer in a neural network, which learns to map the input space to the coefficients. The spiral module is evaluated on multiple neural network architectures and consistently achieves better speed-accuracy trade-off than baseline models. A thorough experimental study is conducted and the final result is a surrogate model which can evaluate millions of input parameters in a single forward pass in under 1ms on a desktop GPU, while the mismatch between the corresponding generated waveforms and the ground-truth waveforms is better than the compared baseline methods. We anticipate the existence of analogous underlying structures and corresponding computational gains also in the case of spinning black hole binaries.
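The spiral module can be pictured as a parametric map from mass ratio to the 2-D coefficient plane, with the spiral angle linear in the input, as the autoencoder analysis found. The Archimedean radius law and the parameter values below are illustrative guesses, not the learned layer.

```python
import numpy as np

def spiral_layer(q, a, b, c, d):
    """Map a scalar input (e.g. mass ratio q) onto a 2-D spiral:
    the angle is linear in q and the radius grows with the angle."""
    theta = a * q + b            # spiral angle, linear in mass ratio
    r = c + d * theta            # Archimedean radius (illustrative choice)
    return np.stack([r * np.cos(theta), r * np.sin(theta)], axis=-1)

# In the network, a, b, c, d would be learnable parameters of the first layer.
q = np.linspace(1.0, 8.0, 50)
pts = spiral_layer(q, a=1.2, b=0.0, c=0.1, d=0.3)
radii = np.linalg.norm(pts, axis=1)      # monotonically unwinding spiral
```

Building this inductive bias into the first layer is what buys the reported speed-accuracy gains over unstructured baselines.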

Read this paper on arXiv…

P. Nousi, S. Fragkouli, N. Passalis, et al.

Tue, 13 Jul 21
74/79

Comments: N/A

SpecGrav — Detection of Gravitational Waves using Deep Learning [IMA]

http://arxiv.org/abs/2107.03607


Gravitational waves are ripples in the fabric of space-time that travel at the speed of light. The detection of gravitational waves by LIGO is a major breakthrough in the field of astronomy. Deep Learning has revolutionized many industries, including health care, finance, and education, and Deep Learning techniques have also been explored for the detection of gravitational waves, to overcome the drawbacks of the traditional matched-filtering method. However, in several studies the training phase of the neural network is very time consuming, and hardware devices with large memory are required for the task. In order to reduce the extensive hardware resources and time required to train a neural network for detecting gravitational waves, we made SpecGrav. We use a 2D Convolutional Neural Network and spectrograms of gravitational waves embedded in noise to detect gravitational waves from binary black hole and binary neutron star mergers. The training phase of our neural network took just about 19 minutes on a 2GB GPU.
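The spectrogram front end amounts to a short-time Fourier transform of the strain. A minimal sketch on a toy chirp (window length, hop, sampling rate, and chirp parameters are arbitrary choices, not SpecGrav's):

```python
import numpy as np

def spectrogram(x, win_len=64, hop=32):
    """Magnitude spectrogram from a Hann-windowed short-time Fourier transform."""
    window = np.hanning(win_len)
    frames = [x[i:i + win_len] * window
              for i in range(0, len(x) - win_len + 1, hop)]
    return np.abs(np.fft.rfft(np.array(frames), axis=1)).T  # (freq, time)

# Toy chirp: frequency rises with time, as in a compact-binary inspiral.
fs = 1024
t = np.arange(0, 2.0, 1.0 / fs)
chirp = np.sin(2 * np.pi * (20.0 + 40.0 * t) * t)   # instantaneous freq 20 + 80 t
spec = spectrogram(chirp)
```

The rising track in such an image is what the 2D CNN learns to separate from noise-only spectrograms.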

Read this paper on arXiv…

H. Dodia, H. Tandel and L. D’Mello
Fri, 9 Jul 21
56/62

Comments: N/A

Truncated Marginal Neural Ratio Estimation [CL]

http://arxiv.org/abs/2107.01214


Parametric stochastic simulators are ubiquitous in science, often featuring high-dimensional input parameters and/or an intractable likelihood. Performing Bayesian parameter inference in this context can be challenging. We present a neural simulator-based inference algorithm which simultaneously offers simulation efficiency and fast empirical posterior testability, which is unique among modern algorithms. Our approach is simulation efficient by simultaneously estimating low-dimensional marginal posteriors instead of the joint posterior and by proposing simulations targeted to an observation of interest via a prior suitably truncated by an indicator function. Furthermore, by estimating a locally amortized posterior our algorithm enables efficient empirical tests of the robustness of the inference results. Such tests are important for sanity-checking inference in real-world applications, which do not feature a known ground truth. We perform experiments on a marginalized version of the simulation-based inference benchmark and two complex and narrow posteriors, highlighting the simulator efficiency of our algorithm as well as the quality of the estimated marginal posteriors. Implementation on GitHub.
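The truncation step, proposing simulations only where an indicator function built from a previous inference round is nonzero, reduces to rejection sampling from the prior. The uniform prior and the |theta| < 1 indicator below are hypothetical placeholders for a constraint learned from data.

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_truncated_prior(sample_prior, indicator, n):
    """Rejection-sample a prior truncated by an indicator function, so that
    simulations are only proposed where the indicator equals 1."""
    out = []
    while len(out) < n:
        theta = sample_prior(4 * n)
        out.extend(theta[indicator(theta)])
    return np.array(out[:n])

prior = lambda m: rng.uniform(-5.0, 5.0, size=m)
# Hypothetical constraint from a previous round: keep |theta| < 1.
indicator = lambda t: np.abs(t) < 1.0
draws = sample_truncated_prior(prior, indicator, 1000)
```

Concentrating the simulation budget this way is what gives the algorithm its simulation efficiency for a given observation of interest.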

Read this paper on arXiv…

B. Miller, A. Cole, P. Forré, et al.

Tue, 6 Jul 21
14/74

Comments: 9 pages. 23 pages with references and supplemental material. Code available at this http URL Underlying library this http URL

On the Efficiency of Various Deep Transfer Learning Models in Glitch Waveform Detection in Gravitational-Wave Data [CL]

http://arxiv.org/abs/2107.01863


LIGO is considered the most sensitive and complicated gravitational-wave experiment ever built. Its main objective is to detect gravitational waves from the strongest events in the universe by observing whether the length of its 4-kilometer arms changes by a distance 10,000 times smaller than the diameter of a proton. Owing to this sensitivity, LIGO is prone to disturbances from external noise that affect the data being collected to detect gravitational waves. These noise artifacts are commonly called glitches by the LIGO community. The objective of this study is to evaluate the efficiency of various deep transfer learning models, namely VGG19, ResNet50V2, VGG16, and ResNet101, in detecting glitch waveforms in gravitational-wave data. The accuracies achieved by these models are 98.98%, 98.35%, 97.56%, and 94.73%, respectively. Even though the models achieved fairly high accuracy, all of them suffered from a lack of data for certain classes, which is the main concern found in the experiment.

Read this paper on arXiv…

R. Mesuga and B. Bayanay
Tue, 6 Jul 21
71/74

Comments: 13 pages, 8 figures

Shared Data and Algorithms for Deep Learning in Fundamental Physics [CL]

http://arxiv.org/abs/2107.00656


We introduce a collection of datasets from fundamental physics research — including particle physics, astroparticle physics, and hadron- and nuclear physics — for supervised machine learning studies. These datasets, containing hadronic top quarks, cosmic-ray induced air showers, phase transitions in hadronic matter, and generator-level histories, are made public to simplify future work on cross-disciplinary machine learning and transfer learning in fundamental physics. Based on these data, we present a simple yet flexible graph-based neural network architecture that can easily be applied to a wide range of supervised learning tasks in these domains. We show that our approach reaches performance close to state-of-the-art dedicated methods on all datasets. To simplify adaptation for various problems, we provide easy-to-follow instructions on how graph-based representations of data structures, relevant for fundamental physics, can be constructed and provide code implementations for several of them. Implementations are also provided for our proposed method and all reference algorithms.

Read this paper on arXiv…

L. Benato, E. Buhmann, M. Erdmann, et al.

Mon, 5 Jul 21
18/52

Comments: 13 pages, 5 figures, 5 tables

Morphological classification of compact and extended radio galaxies using convolutional neural networks and data augmentation techniques [GA]

http://arxiv.org/abs/2107.00385


Machine learning techniques have been increasingly used in astronomical applications and have proven to successfully classify objects in image data with high accuracy. The current work uses archival data from the Faint Images of the Radio Sky at Twenty Centimeters (FIRST) to classify radio galaxies into four classes: Fanaroff-Riley Class I (FRI), Fanaroff-Riley Class II (FRII), Bent-Tailed (BENT), and Compact (COMPT). The model presented in this work is based on Convolutional Neural Networks (CNNs). The proposed architecture comprises three parallel blocks of convolutional layers, combined and processed for final classification by two feed-forward layers. Our model classified the selected classes of radio galaxy sources on an independent testing subset with an average of 96% for precision, recall, and F1 score. The augmentation techniques that performed best were rotations, horizontal or vertical flips, and increases in brightness, whereas shifts, zooms, and decreases in brightness worsened the performance of the model. The current results show that the model developed in this work is able to identify different morphological classes of radio galaxies with high efficiency and performance.
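The augmentations reported as helpful (right-angle rotations, flips, brightness increases) are simple array operations. A sketch with numpy (the brightness range and probabilities are illustrative, not the paper's settings):

```python
import numpy as np

rng = np.random.default_rng(0)

def augment(img, rotations=True, flips=True, brightness=0.2):
    """Random 90-degree rotation, random flip, and a random brightness
    increase; shifts, zooms, and dimming are deliberately omitted."""
    out = img.copy()
    if rotations:
        out = np.rot90(out, k=rng.integers(0, 4))
    if flips and rng.random() < 0.5:
        out = np.flip(out, axis=rng.integers(0, 2))
    out = out * (1.0 + brightness * rng.random())   # brighten only
    return out

img = rng.random((32, 32))     # toy radio-galaxy cutout
aug = augment(img)
```

Restricting the pool to transformations that preserve source morphology is what made these augmentations help rather than hurt.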

Read this paper on arXiv…

V. Maslej-Krešňáková, K. Bouchefry and P. Butka

Fri, 2 Jul 21
63/67

Comments: 12 pages, 7 figures, 9 tables, published in Monthly Notices of the Royal Astronomical Society

Primordial non-Gaussianity from the Completed SDSS-IV extended Baryon Oscillation Spectroscopic Survey I: Catalogue Preparation and Systematic Mitigation [CEA]

http://arxiv.org/abs/2106.13724


We investigate the large-scale clustering of the final spectroscopic sample of quasars from the recently completed extended Baryon Oscillation Spectroscopic Survey (eBOSS). The sample contains $343708$ objects in the redshift range $0.8<z<2.2$ and $72667$ objects with redshifts $2.2<z<3.5$, covering an effective area of $4699~{\rm deg}^{2}$. We develop a neural network-based approach to mitigate spurious fluctuations in the density field caused by spatial variations in the quality of the imaging data used to select targets for follow-up spectroscopy. Simulations are used with the same angular and radial distributions as the real data to estimate covariance matrices, perform error analyses, and assess residual systematic uncertainties. We measure the mean density contrast and cross-correlations of the eBOSS quasars against maps of potential sources of imaging systematics to assess algorithm effectiveness, finding that the neural network-based approach outperforms standard linear regression. Stellar density is one of the most important sources of spurious fluctuations, and a new template constructed using data from the Gaia spacecraft provides the best match to the observed quasar clustering. The end-product from this work is a new value-added quasar catalogue with the improved weights to correct for nonlinear imaging systematic effects, which will be made public. Our quasar catalogue is used to measure the local-type primordial non-Gaussianity in our companion paper, Mueller et al. in preparation.

Read this paper on arXiv…

M. Rezaie, A. Ross, H. Seo, et al.

Mon, 28 Jun 21
12/51

Comments: 17 pages, 13 figures, 2 tables. Accepted for publication in MNRAS. For the associated code and value-added catalogs see this https URL and this https URL

Real-time gravitational-wave science with neural posterior estimation [CL]

http://arxiv.org/abs/2106.12594


We demonstrate unprecedented accuracy for rapid gravitational-wave parameter estimation with deep learning. Using neural networks as surrogates for Bayesian posterior distributions, we analyze eight gravitational-wave events from the first LIGO-Virgo Gravitational-Wave Transient Catalog and find very close quantitative agreement with standard inference codes, but with inference times reduced from O(day) to a minute per event. Our networks are trained using simulated data, including an estimate of the detector-noise characteristics near the event. This encodes the signal and noise models within millions of neural-network parameters, and enables inference for any observed data consistent with the training distribution, accounting for noise nonstationarity from event to event. Our algorithm — called “DINGO” — sets a new standard in fast-and-accurate inference of physical parameters of detected gravitational-wave events, which should enable real-time data analysis without sacrificing accuracy.

Read this paper on arXiv…

M. Dax, S. Green, J. Gair, et al.

Fri, 25 Jun 21
56/62

Comments: 7+12 pages, 4+11 figures

Stratified Learning: a general-purpose statistical method for improved learning under Covariate Shift [CL]

http://arxiv.org/abs/2106.11211


Covariate shift arises when the labelled training (source) data is not representative of the unlabelled (target) data due to systematic differences in the covariate distributions. A supervised model trained on the source data subject to covariate shift may suffer from poor generalization on the target data. We propose a novel, statistically principled and theoretically justified method to improve learning under covariate shift conditions, based on propensity score stratification, a well-established methodology in causal inference. We show that the effects of covariate shift can be reduced or altogether eliminated by conditioning on propensity scores. In practice, this is achieved by fitting learners on subgroups (“strata”) constructed by partitioning the data based on the estimated propensity scores, leading to balanced covariates and much-improved target prediction. We demonstrate the effectiveness of our general-purpose method on contemporary research questions in observational cosmology, and on additional benchmark examples, matching or outperforming state-of-the-art importance weighting methods, widely studied in the covariate shift literature. We obtain the best reported AUC (0.958) on the updated “Supernovae photometric classification challenge” and improve upon existing conditional density estimation of galaxy redshift from Sloan Digital Sky Survey (SDSS) data.
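Propensity-score stratification can be sketched end to end: fit a logistic regression for the probability that an example belongs to the target set, then partition examples by score quantiles and fit one learner per stratum. The toy shifted data and hyperparameters below are illustrative, not the paper's setup.

```python
import numpy as np

rng = np.random.default_rng(0)

def fit_propensity(X, s, lr=0.1, n_iter=2000):
    """Logistic regression of the source/target indicator s on covariates X;
    the fitted probability is the propensity score."""
    Xb = np.hstack([np.ones((len(X), 1)), X])
    w = np.zeros(Xb.shape[1])
    for _ in range(n_iter):
        p = 1.0 / (1.0 + np.exp(-Xb @ w))
        w += lr * Xb.T @ (s - p) / len(s)     # gradient ascent on log-likelihood
    return 1.0 / (1.0 + np.exp(-Xb @ w))

def stratify(scores, n_strata=5):
    """Partition examples into roughly equal-size strata by score quantile;
    a learner is then fit within each stratum."""
    edges = np.quantile(scores, np.linspace(0, 1, n_strata + 1))
    return np.clip(np.searchsorted(edges, scores, side="right") - 1, 0, n_strata - 1)

# Covariate-shifted toy data: source and target differ in the mean of x.
X = np.vstack([rng.normal(0, 1, (500, 1)), rng.normal(1, 1, (500, 1))])
s = np.repeat([0, 1], 500)                    # 0 = source, 1 = target
scores = fit_propensity(X, s)
strata = stratify(scores)
```

Within each stratum the covariate distributions of source and target are approximately balanced, which is the property the method exploits.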

Read this paper on arXiv…

M. Autenrieth, D. Dyk, R. Trotta, et al.

Tue, 22 Jun 21
31/71

Comments: N/A

Unsupervised Resource Allocation with Graph Neural Networks [CL]

http://arxiv.org/abs/2106.09761


We present an approach for maximizing a global utility function by learning how to allocate resources in an unsupervised way. We expect interactions between allocation targets to be important and therefore propose to learn the reward structure for near-optimal allocation policies with a GNN. By relaxing the resource constraint, we can employ gradient-based optimization in contrast to more standard evolutionary algorithms. Our algorithm is motivated by a problem in modern astronomy, where one needs to select, based on limited initial information, among $10^9$ galaxies those whose detailed measurement will lead to optimal inference of the composition of the universe. Our technique presents a way of flexibly learning an allocation strategy by only requiring forward simulators for the physics of interest and the measurement process. We anticipate that our technique will also find applications in a range of resource allocation problems.
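The constraint-relaxation trick can be sketched with a softmax parameterisation of the allocation, which keeps the budget satisfied exactly while leaving the optimisation unconstrained and differentiable. The diminishing-returns utility below is a stand-in for the learned GNN reward, not the paper's objective.

```python
import numpy as np

def softmax(z):
    e = np.exp(z - z.max())
    return e / e.sum()

def allocate(w_util, budget, n_steps=3000, lr=0.2):
    """Gradient ascent on U(a) = sum_i w_i * log(1 + a_i), with the
    allocation a = budget * softmax(z) so the budget holds by construction."""
    z = np.zeros_like(w_util)
    for _ in range(n_steps):
        p = softmax(z)
        a = budget * p
        grad_a = w_util / (1.0 + a)                 # dU/da
        grad_z = budget * p * (grad_a - p @ grad_a)  # chain rule through softmax
        z += lr * grad_z
    return budget * softmax(z)

# Three targets, one three times as valuable; total budget of 6 units.
alloc = allocate(np.array([3.0, 1.0, 1.0]), budget=6.0)
```

For this concave utility the optimum is analytic (water-filling gives roughly 4.4, 0.8, 0.8), which makes the sketch easy to verify.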

Read this paper on arXiv…

M. Cranmer, P. Melchior and B. Nord
Mon, 21 Jun 21
29/54

Comments: Accepted to PMLR/contributed oral at NeurIPS 2020 Pre-registration Workshop. Code at this https URL

Using Convolutional Neural Networks for the Helicity Classification of Magnetic Fields [HEAP]

http://arxiv.org/abs/2106.06718


The presence of non-zero helicity in intergalactic magnetic fields is a smoking gun for their primordial origin since they have to be generated by processes that break CP invariance. As an experimental signature for the presence of helical magnetic fields, an estimator $Q$ based on the triple scalar product of the wave-vectors of photons generated in electromagnetic cascades from, e.g., TeV blazars, has been suggested previously. We propose to apply deep learning to helicity classification employing Convolutional Neural Networks and show that this method outperforms the $Q$ estimator.
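The baseline $Q$ estimator is, in essence, the mean triple scalar product over photon wave-vector triplets. A sketch with a toy helical signal (the 0.1 mixing amplitude and Gaussian wave-vectors are arbitrary):

```python
import numpy as np

def triple_product_estimator(k1, k2, k3):
    """Q estimator: mean of k1 . (k2 x k3) over wave-vector triplets;
    a nonzero mean signals net helicity (parity violation)."""
    return np.mean(np.einsum('ij,ij->i', k1, np.cross(k2, k3)))

rng = np.random.default_rng(0)
n = 20000
k1 = rng.normal(size=(n, 3))
k2 = rng.normal(size=(n, 3))

# Helical toy signal: k3 gains a component along k1 x k2, giving Q > 0.
k3 = rng.normal(size=(n, 3)) + 0.1 * np.cross(k1, k2)
q_signal = triple_product_estimator(k1, k2, k3)

# Non-helical null case: independent wave-vectors, Q consistent with zero.
k3_null = rng.normal(size=(n, 3))
q_null = triple_product_estimator(k1, k2, k3_null)
```

The CNN classifier works from the same arrival-direction data but is free to learn statistics beyond this single moment, which is where its advantage over $Q$ comes from.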

Read this paper on arXiv…

N. Vago, I. Hameed and M. Kachelriess
Tue, 15 Jun 21
1/67

Comments: 14 pages, extended version of a contribution to the proceedings of the 37th ICRC 2021

Recovery of Meteorites Using an Autonomous Drone and Machine Learning [EPA]

http://arxiv.org/abs/2106.06523


The recovery of freshly fallen meteorites from tracked and triangulated meteors is critical to determining their source asteroid families. However, locating meteorite fragments in strewn fields remains a challenge, with very few meteorites being recovered from the meteors triangulated by past and ongoing meteor camera networks. We examined whether locating meteorites can be automated using machine learning and an autonomous drone. Drones can be programmed to fly a grid search pattern and take systematic pictures of the ground over a large survey area. Those images can be analyzed using a machine learning classifier to identify meteorites in the field among many other features. Here, we describe a proof-of-concept meteorite classifier that deploys off-line a combination of different convolutional neural networks to recognize meteorites from images taken by drones in the field. The system was implemented in a conceptual drone setup and tested in the suspected strewn field of a recent meteorite fall near Walker Lake, Nevada.

Read this paper on arXiv…

R. Citron, P. Jenniskens, C. Watkins, et al.

Mon, 14 Jun 21
15/58

Comments: 16 pages, 9 Figures