GraphNeT: Graph neural networks for neutrino telescope event reconstruction [IMA]

http://arxiv.org/abs/2210.12194


GraphNeT is an open-source Python framework aimed at providing high-quality, user-friendly, end-to-end functionality to perform reconstruction tasks at neutrino telescopes using graph neural networks (GNNs). GraphNeT makes it fast and easy to train complex models that can provide event reconstruction with state-of-the-art performance, for arbitrary detector configurations, with inference times that are orders of magnitude faster than traditional reconstruction techniques. GNNs from GraphNeT are flexible enough to be applied to data from all neutrino telescopes, including future projects such as IceCube extensions or P-ONE. This means that GNN-based reconstruction can be used to provide state-of-the-art performance on most reconstruction tasks in neutrino telescopes, at real-time event rates, across experiments and physics analyses, with vast potential impact for neutrino and astro-particle physics.

Read this paper on arXiv…

A. Søgaard, R. Ørsøe, L. Bozianu, et. al.
Tue, 25 Oct 22
98/111

Comments: 6 pages, 1 figure. Code can be found at this https URL . Submitted to the Journal of Open Source Software (JOSS)

Machine-Learning Love: classifying the equation of state of neutron stars with Transformers [IMA]

http://arxiv.org/abs/2210.08382


The use of the Audio Spectrogram Transformer (AST) model for gravitational-wave data analysis is investigated. The AST machine-learning model is a convolution-free classifier that captures long-range global dependencies through a purely attention-based mechanism. In this paper, the model is applied to a simulated dataset of inspiral gravitational wave signals from binary neutron star coalescences, built from five distinct, cold equations of state (EOS) of nuclear matter. From the analysis of the mass dependence of the tidal deformability parameter for each EOS class, it is shown that the AST model achieves a promising performance in correctly classifying the EOS purely from the gravitational wave signals, especially when the component masses of the binary system are in the range $[1,1.5]M_{\odot}$. Furthermore, the generalization ability of the model is investigated by using gravitational-wave signals from a new EOS not used during the training of the model, achieving fairly satisfactory results. Overall, the results, obtained using the simplified setup of noise-free waveforms, show that the AST model, once trained, might allow for the instantaneous inference of the cold nuclear matter EOS directly from the inspiral gravitational-wave signals produced in binary neutron star coalescences.

Read this paper on arXiv…

G. Gonçalves, M. Ferreira, J. Aveiro, et. al.
Tue, 18 Oct 22
86/99

Comments: 11 pages, 11 figures

Neural Importance Sampling for Rapid and Reliable Gravitational-Wave Inference [CL]

http://arxiv.org/abs/2210.05686


We combine amortized neural posterior estimation with importance sampling for fast and accurate gravitational-wave inference. We first generate a rapid proposal for the Bayesian posterior using neural networks, and then attach importance weights based on the underlying likelihood and prior. This provides (1) a corrected posterior free from network inaccuracies, (2) a performance diagnostic (the sample efficiency) for assessing the proposal and identifying failure cases, and (3) an unbiased estimate of the Bayesian evidence. By establishing this independent verification and correction mechanism we address some of the most frequent criticisms against deep learning for scientific inference. We carry out a large study analyzing 42 binary black hole mergers observed by LIGO and Virgo with the SEOBNRv4PHM and IMRPhenomXPHM waveform models. This shows a median sample efficiency of $\approx 10\%$ (two orders-of-magnitude better than standard samplers) as well as a ten-fold reduction in the statistical uncertainty in the log evidence. Given these advantages, we expect a significant impact on gravitational-wave inference, and for this approach to serve as a paradigm for harnessing deep learning methods in scientific applications.
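The weighting step described above reduces to a few lines. Below is a minimal numpy sketch, assuming the three log-density arrays stand in for the paper's waveform likelihood, prior, and neural proposal evaluated at N proposal samples; it computes the importance weights, the sample-efficiency diagnostic, and the log-evidence estimate.

```python
import numpy as np

def importance_diagnostics(log_likelihood, log_prior, log_proposal):
    """Given log p(d|theta), log p(theta) and log q(theta|d) evaluated at N
    proposal samples, return (sample efficiency, log-evidence estimate)."""
    log_w = log_likelihood + log_prior - log_proposal   # unnormalised log weights
    shift = log_w.max()                                  # stabilise the exponential
    w = np.exp(log_w - shift)
    n = len(w)
    efficiency = w.sum() ** 2 / (n * (w ** 2).sum())     # effective-sample fraction
    log_evidence = shift + np.log(w.mean())              # log of the mean weight
    return efficiency, log_evidence

# Toy usage with random numbers standing in for real density evaluations:
rng = np.random.default_rng(0)
eff, log_z = importance_diagnostics(rng.normal(size=10_000),
                                    rng.normal(size=10_000),
                                    rng.normal(size=10_000))
```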

Read this paper on arXiv…

M. Dax, S. Green, J. Gair, et. al.
Thu, 13 Oct 22
13/68

Comments: 7+7 pages, 1+5 figures

Contrastive Neural Ratio Estimation [CL]

http://arxiv.org/abs/2210.06170


Likelihood-to-evidence ratio estimation is usually cast as either a binary (NRE-A) or a multiclass (NRE-B) classification task. In contrast to the binary classification framework, the current formulation of the multiclass version has an intrinsic and unknown bias term, making otherwise informative diagnostics unreliable. We propose a multiclass framework free from the bias inherent to NRE-B at optimum, leaving us in the position to run diagnostics that practitioners depend on. It also recovers NRE-A in one corner case and NRE-B in the limiting case. For fair comparison, we benchmark the behavior of all algorithms in both familiar and novel training regimes: when jointly drawn data is unlimited, when data is fixed but prior draws are unlimited, and in the commonplace fixed data and parameters setting. Our investigations reveal that the highest performing models are distant from the competitors (NRE-A, NRE-B) in hyperparameter space. We make a recommendation for hyperparameters distinct from the previous models. We suggest a bound on the mutual information as a performance metric for simulation-based inference methods, without the need for posterior samples, and provide experimental results.
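As a reference point for the binary formulation (NRE-A) that this paper generalizes, here is a schematic PyTorch loss; the classifier, the batch tensors, and the layer sizes are placeholders, and this is not the authors' code.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def nre_a_loss(net, theta, x):
    """Binary likelihood-to-evidence ratio estimation (NRE-A) loss.
    `net` maps a (theta, x) pair to one logit approximating log r(theta, x)."""
    theta_marginal = theta[torch.randperm(theta.shape[0])]   # break the joint pairing
    logit_joint = net(theta, x)                              # label 1: jointly drawn pair
    logit_marginal = net(theta_marginal, x)                  # label 0: independent pair
    loss = F.binary_cross_entropy_with_logits(logit_joint, torch.ones_like(logit_joint)) \
         + F.binary_cross_entropy_with_logits(logit_marginal, torch.zeros_like(logit_marginal))
    return 0.5 * loss

# Placeholder classifier and simulated batch (dimensions are illustrative).
class RatioNet(nn.Module):
    def __init__(self, dim_theta=2, dim_x=4):
        super().__init__()
        self.mlp = nn.Sequential(nn.Linear(dim_theta + dim_x, 64), nn.ReLU(),
                                 nn.Linear(64, 1))
    def forward(self, theta, x):
        return self.mlp(torch.cat([theta, x], dim=-1)).squeeze(-1)

net = RatioNet()
theta, x = torch.randn(128, 2), torch.randn(128, 4)
loss = nre_a_loss(net, theta, x)
```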

Read this paper on arXiv…

B. Miller, C. Weniger and P. Forré
Thu, 13 Oct 22
53/68

Comments: 10 pages. 32 pages with references and supplemental material. Accepted at NeurIPS 2022. Code at this https URL

Inferring Line-of-Sight Velocities and Doppler Widths from Stokes Profiles of GST/NIRIS Using Stacked Deep Neural Networks [SSA]

http://arxiv.org/abs/2210.04122


Obtaining high-quality magnetic and velocity fields through Stokes inversion is crucial in solar physics. In this paper, we present a new deep learning method, named Stacked Deep Neural Networks (SDNN), for inferring line-of-sight (LOS) velocities and Doppler widths from Stokes profiles collected by the Near InfraRed Imaging Spectropolarimeter (NIRIS) on the 1.6 m Goode Solar Telescope (GST) at the Big Bear Solar Observatory (BBSO). The training data of SDNN is prepared by a Milne-Eddington (ME) inversion code used by BBSO. We quantitatively assess SDNN, comparing its inversion results with those obtained by the ME inversion code and related machine learning (ML) algorithms such as multiple support vector regression, multilayer perceptrons and a pixel-level convolutional neural network. Major findings from our experimental study are summarized as follows. First, the SDNN-inferred LOS velocities are highly correlated to the ME-calculated ones with the Pearson product-moment correlation coefficient being close to 0.9 on average. Second, SDNN is faster, while producing smoother and cleaner LOS velocity and Doppler width maps, than the ME inversion code. Third, the maps produced by SDNN are closer to ME’s maps than those from the related ML algorithms, demonstrating the better learning capability of SDNN than the ML algorithms. Finally, comparison between the inversion results of ME and SDNN based on GST/NIRIS and those from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory in flare-prolific active region NOAA 12673 is presented. We also discuss extensions of SDNN for inferring vector magnetic fields with empirical evaluation.

Read this paper on arXiv…

H. Jiang, Q. Li, Y. Xu, et. al.
Tue, 11 Oct 22
8/92

Comments: 16 pages, 8 figures

Rejecting noise in Baikal-GVD data with neural networks [IMA]

http://arxiv.org/abs/2210.04653


Baikal-GVD is a large ($\sim$ 1 km$^3$) underwater neutrino telescope installed in the fresh waters of Lake Baikal. The deep lake water environment is pervaded by background light, which produces detectable signals in the Baikal-GVD photosensors. We introduce a neural network for an efficient separation of these noise hits from the signal ones, which stem from the propagation of relativistic particles through the detector. The neural network has a U-net-like architecture and employs the temporal (causal) structure of events. On Monte-Carlo simulated data, it reaches 99% signal purity (precision) and 98% survival efficiency (recall). The benefits of using neural networks for data analysis are discussed, and other possible architectures of neural networks, including graph-based ones, are examined.

Read this paper on arXiv…

I. Kharuk, G. Rubtsov and G. Safronov
Tue, 11 Oct 22
19/92

Comments: N/A

Residual Neural Networks for the Prediction of Planetary Collision Outcomes [EPA]

http://arxiv.org/abs/2210.04248


Fast and accurate treatment of collisions in the context of modern N-body planet formation simulations remains a challenging task due to inherently complex collision processes. We aim to tackle this problem with machine learning (ML), in particular via residual neural networks. Our model is motivated by the underlying physical processes of the data-generating process and allows for flexible prediction of post-collision states. We demonstrate that our model outperforms commonly used collision handling methods such as perfect inelastic merging and feed-forward neural networks in both prediction accuracy and out-of-distribution generalization. Our model outperforms the current state of the art in 20/24 experiments. We provide a dataset that consists of 10164 Smooth Particle Hydrodynamics (SPH) simulations of pairwise planetary collisions. The dataset is specifically suited for ML research to improve computational aspects for collision treatment and for studying planetary collisions in general. We formulate the ML task as a multi-task regression problem, allowing simple, yet efficient training of ML models for collision treatment in an end-to-end manner. Our models can be easily integrated into existing N-body frameworks and can be used within our chosen parameter space of initial conditions, i.e. where similar-sized collisions during late-stage terrestrial planet formation typically occur.

Read this paper on arXiv…

P. Winter, C. Burger, S. Lehner, et. al.
Tue, 11 Oct 22
57/92

Comments: 13 pages, 7 figures, 7 tables

Galaxy Spin Classification I: Z-wise vs S-wise Spirals With Chirality Equivariant Residual Network [CEA]

http://arxiv.org/abs/2210.04168


The angular momentum of galaxies (galaxy spin) contains rich information about the initial condition of the Universe, yet it is challenging to efficiently measure the spin direction for the tremendous number of galaxies that are being mapped by the ongoing and forthcoming cosmological surveys. We present a machine learning based classifier for the Z-wise vs S-wise spirals, which can help to break the degeneracy in the galaxy spin direction measurement. The proposed Chirality Equivariant Residual Network (CE-ResNet) is manifestly equivariant under a reflection of the input image, which guarantees that there is no inherent asymmetry between the Z-wise and S-wise probability estimators. We train the model with Sloan Digital Sky Survey (SDSS) images, with the training labels given by the Galaxy Zoo 1 (GZ1) project. A combination of data augmentation tricks is used during the training, making the model more robust when applied to other surveys. We find a $\sim\!30\%$ increase in both types of spirals when Dark Energy Spectroscopic Instrument (DESI) images are used for classification, due to the better imaging quality of DESI. We verify that the $\sim\!7\sigma$ difference between the numbers of Z-wise and S-wise spirals is due to human bias, since the discrepancy drops to $<\!1.8\sigma$ with our CE-ResNet classification results. We discuss the potential systematics that are relevant to the future cosmological applications.
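One simple way to obtain the reflection property described above is to symmetrise any backbone over the mirrored input; the PyTorch sketch below, with `backbone` a placeholder two-class network, illustrates the constraint rather than the CE-ResNet architecture itself, which builds the equivariance into its layers.

```python
import torch
import torch.nn as nn

def chirality_equivariant_logits(backbone, images):
    """Return two-class (Z-wise, S-wise) logits that are exactly equivariant
    under a left-right reflection of the input: reflecting the image swaps
    the two class scores by construction."""
    logits = backbone(images)                                  # (B, 2)
    logits_mirror = backbone(torch.flip(images, dims=[-1]))    # reflect the width axis
    # A reflection turns Z-wise into S-wise, so swap the mirrored class scores.
    return 0.5 * (logits + logits_mirror.flip(dims=[-1]))

# Placeholder backbone and a batch of fake galaxy cutouts.
backbone = nn.Sequential(nn.Flatten(), nn.Linear(3 * 64 * 64, 2))
images = torch.randn(8, 3, 64, 64)
logits = chirality_equivariant_logits(backbone, images)
```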

Read this paper on arXiv…

H. Jia, H. Zhu and U. Pen
Tue, 11 Oct 22
64/92

Comments: 13+4 pages, 11 figures, 2 tables, to be submitted to ApJ

Strong Gravitational Lensing Parameter Estimation with Vision Transformer [CEA]

http://arxiv.org/abs/2210.04143


Quantifying the parameters and corresponding uncertainties of hundreds of strongly lensed quasar systems holds the key to resolving one of the most important scientific questions: the Hubble constant ($H_{0}$) tension. The commonly used Markov chain Monte Carlo (MCMC) method has been too time-consuming to achieve this goal, yet recent work has shown that convolutional neural networks (CNNs) can be an alternative with seven orders of magnitude improvement in speed. With 31,200 simulated strongly lensed quasar images, we explore the use of the Vision Transformer (ViT) for simulated strong gravitational lensing for the first time. We show that ViT can reach competitive results compared with CNNs, and is particularly good at some lensing parameters, including the most important mass-related parameters such as the center of the lens $\theta_{1}$ and $\theta_{2}$, the ellipticities $e_1$ and $e_2$, and the radial power-law slope $\gamma'$. With this promising preliminary result, we believe the ViT (or attention-based) network architecture can be an important tool for strong lensing science for the next generation of surveys. Our code and data are publicly available at \url{https://github.com/kuanweih/strong_lensing_vit_resnet}.

Read this paper on arXiv…

K. Huang, G. Chen, P. Chang, et. al.
Tue, 11 Oct 22
91/92

Comments: Accepted by ECCV 2022 AI for Space Workshop

Particle clustering in turbulence: Prediction of spatial and statistical properties with deep learning [EPA]

http://arxiv.org/abs/2210.02339


We demonstrate the utility of deep learning for modeling the clustering of particles that are aerodynamically coupled to turbulent fluids. Using a Lagrangian particle module within the ATHENA++ hydrodynamics code, we simulate the dynamics of particles in the Epstein drag regime within a periodic domain of isotropic forced hydrodynamic turbulence. This setup is an idealized model relevant to the collisional growth of micron- to mm-sized dust particles in early-stage planet formation. The simulation data is used to train a U-Net deep learning model to predict gridded three-dimensional representations of the particle density and velocity fields, given as input the corresponding fluid fields. The trained model qualitatively captures the filamentary structure of clustered particles in a highly non-linear regime. We assess model fidelity by calculating metrics of the density structure (the radial distribution function) and of the velocity field (the relative velocity and the relative radial velocity between particles). Although trained only on the spatial fields, the model predicts these statistical quantities with errors that are typically < 10%. Our results suggest that, given appropriately expanded training data, deep learning could be used to accelerate calculations of particle clustering and collision outcomes both in protoplanetary disks, and in related two-fluid turbulence problems that arise in other disciplines.

Read this paper on arXiv…

Y. Chan, N. Manger, Y. Li, et. al.
Thu, 6 Oct 22
59/77

Comments: 19 pages, 13 figures, submitted to ApJ

Neural network for determining an asteroid mineral composition from reflectance spectra [EPA]

http://arxiv.org/abs/2210.01006


Chemical and mineral compositions of asteroids reflect the formation and history of our Solar System. This knowledge is also important for planetary defence and in-space resource utilisation. We aim to develop a fast and robust neural-network-based method for deriving the mineral modal and chemical compositions of silicate materials from their visible and near-infrared spectra. The method should be able to process raw spectra without significant pre-processing. We designed a convolutional neural network with two hidden layers for the analysis of the spectra, and trained it using labelled reflectance spectra. For the training, we used a dataset that consisted of reflectance spectra of real silicate samples stored in the RELAB and C-Tape databases, namely olivine, orthopyroxene, clinopyroxene, their mixtures, and olivine-pyroxene-rich meteorites. We used the model on two datasets. First, we evaluated the model reliability on a test dataset where we compared the model classification with known compositional reference values. The individual classification results are mostly within 10 percentage-point intervals around the correct values. Second, we classified the reflectance spectra of S-complex (Q-type and V-type, also including A-type) asteroids with known Bus-DeMeo taxonomy classes. The predicted mineral chemical composition of S-type and Q-type asteroids agrees with the chemical composition of ordinary chondrites. The modal abundances of V-type and A-type asteroids show a dominant contribution of orthopyroxene and olivine, respectively. Additionally, our predictions of the mineral modal composition of S-type and Q-type asteroids show an apparent depletion of olivine related to the attenuation of its diagnostic absorptions with space weathering. This trend is consistent with previous results of the slower pyroxene response to space weathering relative to olivine.

Read this paper on arXiv…

D. Korda, A. Penttilä, A. Klami, et. al.
Tue, 4 Oct 22
2/71

Comments: main text: 12 pages, 12 figures, 10 tables; appendix: 8 pages, 20 figures, 6 tables

Explainable classification of astronomical uncertain time series [CL]

http://arxiv.org/abs/2210.00869


Exploring the expansion history of the universe, understanding its evolutionary stages, and predicting its future evolution are important goals in astrophysics. Today, machine learning tools are used to help achieve these goals by analyzing transient sources, which are modeled as uncertain time series. Although black-box methods achieve appreciable performance, existing interpretable time series methods have failed to obtain acceptable performance for this type of data. Furthermore, data uncertainty is rarely taken into account in these methods. In this work, we propose an uncertainty-aware subsequence-based model which achieves a classification performance comparable to that of state-of-the-art methods. Unlike conformal learning, which estimates model uncertainty on predictions, our method takes data uncertainty as additional input. Moreover, our approach is explainable-by-design, giving domain experts the ability to inspect the model and explain its predictions. The explainability of the proposed method also has the potential to inspire new developments in theoretical astrophysics modeling by suggesting important subsequences which depict details of light curve shapes. The dataset, the source code of our experiment, and the results are made available in a public repository.

Read this paper on arXiv…

M. Mbouopda, E. Ishida, E. Nguifo, et. al.
Tue, 4 Oct 22
58/71

Comments: N/A

Solar Flare Index Prediction Using SDO/HMI Vector Magnetic Data Products with Statistical and Machine Learning Methods [SSA]

http://arxiv.org/abs/2209.13779


Solar flares, especially the M- and X-class flares, are often associated with coronal mass ejections (CMEs). They are the most important sources of space weather effects that can severely impact the near-Earth environment. Thus it is essential to forecast flares (especially the M- and X-class ones) to mitigate their destructive and hazardous consequences. Here, we introduce several statistical and machine learning approaches to the prediction of an AR's Flare Index (FI), which quantifies the flare productivity of an AR by taking into account the numbers of different class flares within a certain time interval. Specifically, our sample includes 563 ARs that appeared on the solar disk from May 2010 to Dec 2017. The 25 magnetic parameters, provided by the Space-weather HMI Active Region Patches (SHARP) from the Helioseismic and Magnetic Imager (HMI) on board the Solar Dynamics Observatory (SDO), characterize coronal magnetic energy stored in ARs by proxy and are used as the predictors. We investigate the relationship between these SHARP parameters and the FI of ARs with a machine-learning algorithm (spline regression) and a resampling method (Synthetic Minority Over-Sampling Technique for Regression with Gaussian Noise, SMOGN for short). Based on the established relationship, we are able to predict the value of FI for a given AR within the next 1-day period. Compared with 4 other popular machine learning algorithms, our methods improve the accuracy of FI prediction, especially for large FI. In addition, we rank the importance of the SHARP parameters with the Borda Count method, calculated from the ranks rendered by 9 different machine learning methods.

Read this paper on arXiv…

H. Zhang, Q. Li, Y. Yang, et. al.
Thu, 29 Sep 22
17/70

Comments: N/A

Scalable and Equivariant Spherical CNNs by Discrete-Continuous (DISCO) Convolutions [CL]

http://arxiv.org/abs/2209.13603


No existing spherical convolutional neural network (CNN) framework is both computationally scalable and rotationally equivariant. Continuous approaches capture rotational equivariance but are often prohibitively computationally demanding. Discrete approaches offer more favorable computational performance but at the cost of equivariance. We develop a hybrid discrete-continuous (DISCO) group convolution that is simultaneously equivariant and computationally scalable to high-resolution. While our framework can be applied to any compact group, we specialize to the sphere. Our DISCO spherical convolutions not only exhibit $\text{SO}(3)$ rotational equivariance but also a form of asymptotic $\text{SO}(3)/\text{SO}(2)$ rotational equivariance, which is more desirable for many applications (where $\text{SO}(n)$ is the special orthogonal group representing rotations in $n$-dimensions). Through a sparse tensor implementation we achieve linear scaling in number of pixels on the sphere for both computational cost and memory usage. For 4k spherical images we realize a saving of $10^9$ in computational cost and $10^4$ in memory usage when compared to the most efficient alternative equivariant spherical convolution. We apply the DISCO spherical CNN framework to a number of benchmark dense-prediction problems on the sphere, such as semantic segmentation and depth estimation, on all of which we achieve the state-of-the-art performance.

Read this paper on arXiv…

J. Ocampo, M. Price and J. McEwen
Thu, 29 Sep 22
29/70

Comments: 17 pages, 6 figures

DVGAN: Stabilize Wasserstein GAN training for time-domain Gravitational Wave physics [IMA]

http://arxiv.org/abs/2209.13592


Simulating time-domain observations of gravitational wave (GW) detector environments will allow for a better understanding of GW sources, augment datasets for GW signal detection and help in characterizing the noise of the detectors, leading to better physics. This paper presents a novel approach to simulating fixed-length time-domain signals using a three-player Wasserstein Generative Adversarial Network (WGAN), called DVGAN, that includes an auxiliary discriminator that discriminates on the derivatives of input signals. An ablation study is used to compare the effects of including adversarial feedback from an auxiliary derivative discriminator with a vanilla two-player WGAN. We show that discriminating on derivatives can stabilize the learning of GAN components on 1D continuous signals during their training phase. This results in smoother generated signals that are less distinguishable from real samples and better capture the distributions of the training data. DVGAN is also used to simulate real transient noise events captured in the advanced LIGO GW detector.

Read this paper on arXiv…

T. Dooney, S. Bromuri and L. Curier
Thu, 29 Sep 22
45/70

Comments: 10 pages, 6 figures, 3 tables

Predicting Swarm Equatorial Plasma Bubbles Via Supervised Machine Learning [CL]

http://arxiv.org/abs/2209.13482


Equatorial Plasma Bubbles (EPBs) are plumes of low-density plasma that rise up from the bottomside of the F layer towards the exosphere. EPBs are known causes of radio wave scintillations, which can degrade communications with spacecraft. We build a random forest regressor to predict and forecast the probability of an EPB [0-1] detected by the IBI processor on board the Swarm spacecraft. We use 8 years of Swarm data from 2014 to 2021 and transform the data from a time series into a 5-dimensional space consisting of latitude, longitude, magnetic local time (MLT), year, and day-of-the-year. We also add Kp, F10.7cm and solar wind speed. The observations of EPBs with respect to geolocation, local time, season and solar activity mostly agree with existing work, whilst the link to geomagnetic activity is less clear. The prediction has an accuracy of 88% and performs well across the EPB-specific spatiotemporal scales. This shows that the XGBoost method is able to successfully capture the climatological and daily variability of Swarm EPBs. Capturing the daily variance has long evaded researchers because of local and stochastic features within the ionosphere. We take advantage of Shapley Values to explain the model and to gain insight into the physics of EPBs. We find that as the solar wind speed increases, the probability of an EPB decreases. We also identify a spike in EPB probability around the Earth-Sun perihelion. Both of these insights were derived directly from the XGBoost and Shapley technique.
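The abstract mentions both a random forest regressor and XGBoost; the sketch below uses XGBoost with SHAP's TreeExplainer as one plausible realization, with the feature table and labels as synthetic placeholders for the Swarm/IBI data rather than the authors' pipeline.

```python
import numpy as np
import pandas as pd
import xgboost as xgb
import shap

# Placeholder stand-in for the Swarm-derived feature table described above.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "lat": rng.uniform(-60, 60, 2000),
    "lon": rng.uniform(-180, 180, 2000),
    "mlt": rng.uniform(0, 24, 2000),
    "doy": rng.integers(1, 366, 2000),
    "Kp": rng.uniform(0, 9, 2000),
    "F10.7": rng.uniform(65, 250, 2000),
    "v_sw": rng.uniform(250, 800, 2000),
})
y = rng.uniform(0, 1, 2000)               # EPB probability label from the IBI processor

model = xgb.XGBRegressor(n_estimators=300, max_depth=6, learning_rate=0.05)
model.fit(X, y)

explainer = shap.TreeExplainer(model)     # Shapley values for tree ensembles
shap_values = explainer.shap_values(X)    # per-sample feature attributions
# e.g. shap.summary_plot(shap_values, X) visualises the solar-wind-speed dependence.
```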

Read this paper on arXiv…

S. Reddy, C. Forsyth, A. Aruliah, et. al.
Wed, 28 Sep 22
32/89

Comments: 26 Pages, 18 Figures

Machine learning-accelerated chemistry modeling of protoplanetary disks [EPA]

http://arxiv.org/abs/2209.13336


Aims. With the large amount of molecular emission data from (sub)millimeter observatories and incoming James Webb Space Telescope infrared spectroscopy, access to fast forward models of the chemical composition of protoplanetary disks is of paramount importance.
Methods. We used a thermo-chemical modeling code to generate a diverse population of protoplanetary disk models. We trained a K-nearest neighbors (KNN) regressor to instantly predict the chemistry of other disk models.
Results. We show that it is possible to accurately reproduce chemistry using just a small subset of physical conditions, thanks to correlations between the local physical conditions in adopted protoplanetary disk models. We discuss the uncertainties and limitations of this method.
Conclusions. The proposed method can be used for Bayesian fitting of the line emission data to retrieve disk properties from observations. We present a pipeline for reproducing the same approach on other disk chemical model sets.
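The KNN emulation step in the Methods above is conceptually simple; here is a minimal scikit-learn sketch, with random arrays standing in for the thermo-chemical model grid (local physical conditions in, molecular abundances out).

```python
import numpy as np
from sklearn.neighbors import KNeighborsRegressor
from sklearn.preprocessing import StandardScaler
from sklearn.pipeline import make_pipeline

# Placeholder arrays standing in for the thermo-chemical model grid:
# each row of `conditions` holds a small set of local physical conditions
# (e.g. density, temperature, UV field), and `abundances` holds the
# corresponding log molecular abundances computed by the full code.
rng = np.random.default_rng(1)
conditions = rng.normal(size=(5000, 4))
abundances = rng.normal(size=(5000, 20))

emulator = make_pipeline(StandardScaler(),
                         KNeighborsRegressor(n_neighbors=5, weights="distance"))
emulator.fit(conditions, abundances)

# Instant prediction of the chemistry for new disk models:
new_conditions = rng.normal(size=(10, 4))
predicted_abundances = emulator.predict(new_conditions)
```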

Read this paper on arXiv…

G. Smirnov-Pinchukov, T. Molyarova, D. Semenov, et. al.
Wed, 28 Sep 22
71/89

Comments: 11 pages, 5 figures

Multi-Hour Ahead Dst Index Prediction Using Multi-Fidelity Boosted Neural Networks [CL]

http://arxiv.org/abs/2209.12571


The Disturbance storm time (Dst) index has been widely used as a proxy for the ring current intensity, and therefore as a measure of geomagnetic activity. It is derived by measurements from four ground magnetometers in the geomagnetic equatorial regions.
We present a new model for predicting $Dst$ with a lead time between 1 and 6 hours. The model is first developed using a Gated Recurrent Unit (GRU) network that is trained using solar wind parameters. The uncertainty of the $Dst$ model is then estimated by using the ACCRUE method [Camporeale et al. 2021]. Finally, a multi-fidelity boosting method is developed in order to enhance the accuracy of the model and reduce its associated uncertainty. It is shown that the developed model can predict $Dst$ 6 hours ahead with a root-mean-square-error (RMSE) of 13.54 $\mathrm{nT}$. This is significantly better than the persistence model and a simple GRU model.
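A minimal PyTorch sketch of the GRU stage described above: a window of solar wind parameters in, a single Dst value several hours ahead out. The layer sizes, window length, and feature count are illustrative assumptions, not the paper's configuration, and the multi-fidelity boosting and uncertainty steps are not shown.

```python
import torch
import torch.nn as nn

class DstGRU(nn.Module):
    """Minimal GRU regressor: a window of solar-wind parameters in,
    the Dst value a few hours ahead out."""

    def __init__(self, n_features=5, hidden=64):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, 1)

    def forward(self, x):                 # x: (batch, time, n_features)
        _, h = self.gru(x)                # h: (1, batch, hidden)
        return self.head(h[-1]).squeeze(-1)

model = DstGRU()
window = torch.randn(8, 48, 5)            # 8 samples, 48 h of solar-wind inputs
dst_prediction = model(window)            # predicted Dst several hours ahead (nT)
```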

Read this paper on arXiv…

A. Hu, E. Camporeale and B. Swiger
Tue, 27 Sep 22
34/89

Comments: arXiv admin note: text overlap with arXiv:2203.11001

MLGWSC-1: The first Machine Learning Gravitational-Wave Search Mock Data Challenge [IMA]

http://arxiv.org/abs/2209.11146


We present the results of the first Machine Learning Gravitational-Wave Search Mock Data Challenge (MLGWSC-1). For this challenge, participating groups had to identify gravitational-wave signals from binary black hole mergers of increasing complexity and duration embedded in progressively more realistic noise. The final of the 4 provided datasets contained real noise from the O3a observing run and signals up to a duration of 20 seconds with the inclusion of precession effects and higher order modes. We present the average sensitivity distance and runtime for the 6 entered algorithms derived from 1 month of test data unknown to the participants prior to submission. Of these, 4 are machine learning algorithms. We find that the best machine learning based algorithms are able to achieve up to 95% of the sensitive distance of matched-filtering based production analyses for simulated Gaussian noise at a false-alarm rate (FAR) of one per month. In contrast, for real noise, the leading machine learning search achieved 70%. For higher FARs the differences in sensitive distance shrink to the point where select machine learning submissions outperform traditional search algorithms at FARs $\geq 200$ per month on some datasets. Our results show that current machine learning search algorithms may already be sensitive enough in limited parameter regions to be useful for some production settings. To improve the state-of-the-art, machine learning algorithms need to reduce the false-alarm rates at which they are capable of detecting signals and extend their validity to regions of parameter space where modeled searches are computationally expensive to run. Based on our findings we compile a list of research areas that we believe are the most important to elevate machine learning searches to an invaluable tool in gravitational-wave signal detection.

Read this paper on arXiv…

M. Schäfer, O. Zelenka, A. Nitz, et. al.
Fri, 23 Sep 22
6/70

Comments: 25 pages, 6 figures, 4 tables, additional material available at this https URL

Simulation-based inference of Bayesian hierarchical models while checking for model misspecification [CL]

http://arxiv.org/abs/2209.11057


This paper presents recent methodological advances to perform simulation-based inference (SBI) of a general class of Bayesian hierarchical models (BHMs), while checking for model misspecification. Our approach is based on a two-step framework. First, the latent function that appears as second layer of the BHM is inferred and used to diagnose possible model misspecification. Second, target parameters of the trusted model are inferred via SBI. Simulations used in the first step are recycled for score compression, which is necessary to the second step. As a proof of concept, we apply our framework to a prey-predator model built upon the Lotka-Volterra equations and involving complex observational processes.

Read this paper on arXiv…

F. Leclercq
Fri, 23 Sep 22
24/70

Comments: 6 pages, 2 figures. Accepted for publication as proceedings of MaxEnt’22 (18-22 July 2022, IHP, Paris, France, this https URL). The pySELFI code is publicly available at this http URL and on GitHub (this https URL)

Probabilistic Dalek — Emulator framework with probabilistic prediction for supernova tomography [CL]

http://arxiv.org/abs/2209.09453


Supernova spectral time series can be used to reconstruct a spatially resolved explosion model, known as supernova tomography. In addition to an observed spectral time series, a supernova tomography requires a radiative transfer model to perform the inverse problem with uncertainty quantification for a reconstruction. The smallest parametrizations of supernova tomography models have roughly a dozen parameters, while realistic ones require more than 100. Realistic radiative transfer models require tens of CPU minutes for a single evaluation, making the problem computationally intractable with traditional means, which require millions of MCMC samples. Surrogate models, or emulators, built with machine learning techniques offer a way to accelerate such simulations and to understand progenitors/explosions from spectral time series. Emulators exist for the TARDIS supernova radiative transfer code, but they only perform well on simplistic low-dimensional models (roughly a dozen parameters), with a small number of applications for knowledge gain in the supernova field. In this work, we present a new emulator for the radiative transfer code TARDIS that not only outperforms existing emulators but also provides uncertainties in its prediction. It offers the foundation for a future active-learning-based machinery that will be able to emulate very high dimensional spaces of hundreds of parameters crucial for unraveling urgent questions in supernovae and related fields.

Read this paper on arXiv…

W. Kerzendorf, N. Chen, J. O’Brien, et. al.
Wed, 21 Sep 22
9/68

Comments: 7 pages, accepted at ICML 2022 Workshop on Machine Learning for Astrophysics

Toward an understanding of the properties of neural network approaches for supernovae light curve approximation [IMA]

http://arxiv.org/abs/2209.07542


The modern time-domain photometric surveys collect a lot of observations of various astronomical objects, and the coming era of large-scale surveys will provide even more information. Most of the objects have never received a spectroscopic follow-up, which is especially crucial for transients, e.g. supernovae. In such cases, observed light curves could present an affordable alternative. Time series are actively used for photometric classification and characterization, such as peak and luminosity decline estimation. However, the collected time series are multidimensional, irregularly sampled, contain outliers, and do not have well-defined systematic uncertainties. Machine learning methods help extract useful information from available data in the most efficient way. We consider several light curve approximation methods based on neural networks: Multilayer Perceptrons, Bayesian Neural Networks, and Normalizing Flows, to approximate observations of a single light curve. Tests using both the simulated PLAsTiCC and real Zwicky Transient Facility data samples demonstrate that even a few observations are enough to fit the networks and achieve better approximation quality than other state-of-the-art methods. We show that the methods described in this work have better computational complexity and work faster than Gaussian Processes. We analyze the performance of the approximation techniques, aiming to fill the gaps in the observations of the light curves, and show that the use of an appropriate technique increases the accuracy of peak finding and supernova classification. In addition, the study results are organized in the Fulu Python library, available on GitHub, which can be easily used by the community.
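As a toy illustration of neural light-curve approximation, the sketch below fits a small scikit-learn MLP to a single synthetic light curve and predicts on a dense grid to fill observational gaps; it stands in for the Multilayer Perceptron variant mentioned above and is not the Fulu library's API.

```python
import numpy as np
from sklearn.neural_network import MLPRegressor

# Placeholder single light curve: observation times, integer passband codes,
# and fluxes with noise. Purely synthetic, for illustration only.
rng = np.random.default_rng(4)
t = np.sort(rng.uniform(0, 100, 60))
band = rng.integers(0, 3, 60)
flux = np.exp(-0.5 * ((t - 50) / 10) ** 2) + 0.05 * rng.normal(size=60)

X = np.column_stack([t, band])
model = MLPRegressor(hidden_layer_sizes=(32, 32), max_iter=5000)
model.fit(X, flux)

# Fill the gaps: predict on a dense time grid in one passband.
t_grid = np.linspace(0, 100, 500)
flux_grid = model.predict(np.column_stack([t_grid, np.zeros_like(t_grid)]))
```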

Read this paper on arXiv…

M. Demianenko, K. Malanchev, E. Samorodova, et. al.
Mon, 19 Sep 22
35/50

Comments: Submitted to MNRAS. 14 pages, 6 figures, 9 tables

Trustworthy modelling of atmospheric formaldehyde powered by deep learning [CL]

http://arxiv.org/abs/2209.07414


Formaldehyde (HCHO) is one of the most important trace gases in the atmosphere: it is a pollutant causing respiratory and other diseases, and a precursor of tropospheric ozone, which damages crops and deteriorates human health. Studying HCHO chemistry and long-term monitoring using satellite data is important from the perspective of human health, food security and air pollution. Dynamic atmospheric chemistry models struggle to simulate atmospheric formaldehyde and often overestimate it by up to a factor of two relative to satellite observations and reanalysis. The spatial distribution of modelled HCHO also fails to match satellite observations. Here, we present a deep learning approach using a simple super-resolution-based convolutional neural network for fast and reliable simulation of atmospheric HCHO. Our approach is an indirect method of HCHO estimation that does not require chemical equations. We find that deep learning outperforms dynamical model simulations, which involve complicated representations of atmospheric chemistry. Causal, nonlinear relationships between different variables and the target formaldehyde are established in our approach by using a variety of precursors from meteorology and chemical reanalysis to predict OMI AURA satellite-based HCHO. We choose South Asia for testing our implementation, as it does not have in situ measurements of formaldehyde and there is a need for improved-quality data over the region. Moreover, there are spatial and temporal gaps in the satellite product, which can be removed by trustworthy modelling of atmospheric formaldehyde. This study is a novel attempt at using computer vision for trustworthy modelling of formaldehyde from remote sensing, which can lead to cascading societal benefits.

Read this paper on arXiv…

M. Biswas and M. Singh
Fri, 16 Sep 22
30/84

Comments: N/A

Robust field-level inference with dark matter halos [CEA]

http://arxiv.org/abs/2209.06843


We train graph neural networks on halo catalogues from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogues contain $\lesssim$5,000 halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalogue is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be permutationally, translationally, and rotationally invariant, do not impose a minimum scale on which to extract information and are able to infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a mean relative error of $\sim6\%$, when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the value of $\Omega_{\rm m}$ and $\sigma_8$ when tested using halo catalogues from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $\Omega_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of breaking the robustness of the models. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these parameters.

Read this paper on arXiv…

H. Shao, F. Villaescusa-Navarro, P. Villanueva-Domingo, et. al.
Fri, 16 Sep 22
42/84

Comments: 25 pages, 11 figures, summary video: this https URL

Towards Coupling Full-disk and Active Region-based Flare Prediction for Operational Space Weather Forecasting [CL]

http://arxiv.org/abs/2209.07406


Solar flare prediction is a central problem in space weather forecasting and has captivated the attention of a wide spectrum of researchers due to recent advances in both remote sensing as well as machine learning and deep learning approaches. The experimental findings based on both machine and deep learning models reveal significant performance improvements for task-specific datasets. Along with building models, the practice of deploying such models to production environments under operational settings is a more complex and often time-consuming process which is often not addressed directly in research settings. We present a set of new heuristic approaches to train and deploy an operational solar flare prediction system for $\geq$M1.0-class flares with two prediction modes: full-disk and active region-based. In full-disk mode, predictions are performed on full-disk line-of-sight magnetograms using deep learning models, whereas in active region-based models, predictions are issued for each active region individually using multivariate time series data instances. The outputs from individual active region forecasts and full-disk predictors are combined into a final full-disk prediction result with a meta-model. We utilized an equal-weighted average ensemble of the two base learners’ flare probabilities as our baseline meta learner and improved the capabilities of our two base learners by training a logistic regression model. The major findings of this study are: (i) we successfully coupled two heterogeneous flare prediction models, trained with different datasets and model architectures, to predict a full-disk flare probability for the next 24 hours; (ii) our proposed ensembling model, i.e., logistic regression, improves on the predictive performance of the two base learners and the baseline meta learner measured in terms of two widely used metrics, the True Skill Statistic (TSS) and the Heidke Skill Score (HSS); and (iii) our result analysis suggests that the logistic regression-based ensemble (Meta-FP) improves on the full-disk model (base learner) by $\sim9\%$ in terms of TSS and $\sim10\%$ in terms of HSS. Similarly, it improves on the AR-based model (base learner) by $\sim17\%$ and $\sim20\%$ in terms of TSS and HSS respectively. Finally, when compared to the baseline meta model, it improves on TSS by $\sim10\%$ and HSS by $\sim15\%$.
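The meta-learning step described above is a standard stacking pattern; the scikit-learn sketch below shows the idea with random placeholder probabilities from the two base learners, not the authors' trained models or data.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Placeholder outputs of the two base learners: p_fd from the full-disk deep
# learning model, p_ar from the aggregated active-region model, and y the
# observed >=M1.0 flare label. All synthetic, for illustration only.
rng = np.random.default_rng(2)
p_fd = rng.uniform(0, 1, 1000)
p_ar = rng.uniform(0, 1, 1000)
y = rng.integers(0, 2, 1000)

# Baseline meta learner: equal-weight average of the two probabilities.
p_baseline = 0.5 * (p_fd + p_ar)

# Trained meta learner (Meta-FP-style): logistic regression on the stacked outputs.
meta = LogisticRegression()
meta.fit(np.column_stack([p_fd, p_ar]), y)
p_meta = meta.predict_proba(np.column_stack([p_fd, p_ar]))[:, 1]
```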

Read this paper on arXiv…

C. Pandey, A. Ji, R. Angryk, et. al.
Fri, 16 Sep 22
44/84

Comments: N/A

Data-driven, multi-moment fluid modeling of Landau damping [CL]

http://arxiv.org/abs/2209.04726


Deriving governing equations of complex physical systems based on first principles can be quite challenging when there are certain unknown terms and hidden physical mechanisms in the systems. In this work, we apply a deep learning architecture to learn fluid partial differential equations (PDEs) of a plasma system based on the data acquired from a fully kinetic model. The learned multi-moment fluid PDEs are demonstrated to incorporate kinetic effects such as Landau damping. Based on the learned fluid closure, the data-driven, multi-moment fluid modeling can well reproduce all the physical quantities derived from the fully kinetic model. The calculated damping rate of Landau damping is consistent with both the fully kinetic simulation and the linear theory. The data-driven fluid modeling of PDEs for complex physical systems may be applied to improve fluid closure and reduce the computational cost of multi-scale modeling of global systems.

Read this paper on arXiv…

W. Cheng, H. Fu, L. Wang, et. al.
Tue, 13 Sep 22
10/85

Comments: 10 pages, 8 figures. Computer Physics Communications, in press

Symbolic Knowledge Extraction from Opaque Predictors Applied to Cosmic-Ray Data Gathered with LISA Pathfinder [HEAP]

http://arxiv.org/abs/2209.04697


Machine learning models are nowadays ubiquitous in space missions, performing a wide variety of tasks ranging from the prediction of multivariate time series to the detection of specific patterns in the input data. The adopted models are usually deep neural networks or other complex machine learning algorithms providing predictions that are opaque, i.e., human users are not able to understand the rationale behind the provided predictions. Several techniques exist in the literature to combine the impressive predictive performance of opaque machine learning models with human-intelligible prediction explanations, for instance the application of symbolic knowledge extraction procedures. In this paper we report the results of different knowledge extractors applied to an ensemble predictor capable of reproducing cosmic-ray data gathered on board the LISA Pathfinder space mission. A discussion of the readability/fidelity trade-off of the extracted knowledge is also presented.

Read this paper on arXiv…

F. Sabbatini and C. Grimani
Tue, 13 Sep 22
44/85

Comments: N/A

Investigation of a Machine learning methodology for the SKA pulsar search pipeline [IMA]

http://arxiv.org/abs/2209.04430


The SKA pulsar search pipeline will be used for real-time detection of pulsars. Modern radio telescopes such as the SKA will be generating petabytes of data in their full scale of operation. Hence experience-based and data-driven algorithms become indispensable for applications such as candidate detection. Here we describe our findings from testing a state-of-the-art object detection algorithm called Mask R-CNN to detect candidate signatures in the SKA pulsar search pipeline. We have trained the Mask R-CNN model to detect candidate images. A custom annotation tool was developed to mark the regions of interest in large datasets efficiently. We have successfully demonstrated this algorithm by detecting candidate signatures on a simulation dataset. The paper presents details of this work with a highlight on future prospects.

Read this paper on arXiv…

S. Bhat, P. Thiagaraj, B. Stappers, et. al.
Mon, 12 Sep 22
51/54

Comments: N/A

Graph Neural Networks for Low-Energy Event Classification & Reconstruction in IceCube [CL]

http://arxiv.org/abs/2209.03042


IceCube, a cubic-kilometer array of optical sensors built to detect atmospheric and astrophysical neutrinos between 1 GeV and 1 PeV, is deployed 1.45 km to 2.45 km below the surface of the ice sheet at the South Pole. The classification and reconstruction of events from the in-ice detectors play a central role in the analysis of data from IceCube. Reconstructing and classifying events is a challenge due to the irregular detector geometry, inhomogeneous scattering and absorption of light in the ice and, below 100 GeV, the relatively low number of signal photons produced per event. To address this challenge, it is possible to represent IceCube events as point cloud graphs and use a Graph Neural Network (GNN) as the classification and reconstruction method. The GNN is capable of distinguishing neutrino events from cosmic-ray backgrounds, classifying different neutrino event types, and reconstructing the deposited energy, direction and interaction vertex. Based on simulation, we provide a comparison in the 1-100 GeV energy range to the state-of-the-art maximum likelihood techniques used in current IceCube analyses, including the effects of known systematic uncertainties. For neutrino event classification, the GNN increases the signal efficiency by 18% at a fixed false positive rate (FPR), compared to current IceCube methods. Alternatively, the GNN offers a reduction of the FPR by over a factor of 8 (to below half a percent) at a fixed signal efficiency. For the reconstruction of energy, direction, and interaction vertex, the resolution improves by an average of 13%-20% compared to current maximum likelihood techniques in the energy range of 1-30 GeV. The GNN, when run on a GPU, is capable of processing IceCube events at a rate nearly double the median IceCube trigger rate of 2.7 kHz, which opens the possibility of using low energy neutrinos in online searches for transient events.
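The point-cloud-graph representation mentioned above can be sketched generically with PyTorch Geometric; the snippet below (random hit features, a k-nearest-neighbour graph, one EdgeConv layer, a pooled regression head) is a stand-in illustration of the idea, not the collaboration's network or the GraphNeT implementation.

```python
import torch
from torch_geometric.data import Data
from torch_geometric.nn import knn_graph, EdgeConv, global_mean_pool

# One event: each row is a DOM hit (x, y, z, time, charge). Placeholder values.
hits = torch.randn(60, 5)
edge_index = knn_graph(hits[:, :3], k=8)         # connect spatially nearby hits
event = Data(x=hits, edge_index=edge_index)

conv = EdgeConv(torch.nn.Sequential(
    torch.nn.Linear(2 * 5, 64), torch.nn.ReLU(), torch.nn.Linear(64, 64)))
head = torch.nn.Linear(64, 1)                    # e.g. deposited-energy regression

node_features = conv(event.x, event.edge_index)
batch = torch.zeros(hits.shape[0], dtype=torch.long)   # single event in the batch
prediction = head(global_mean_pool(node_features, batch))
```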

Read this paper on arXiv…

R. Abbasi, M. Ackermann, J. Adams, et. al.
Thu, 8 Sep 22
49/77

Comments: Prepared for submission to JINST

Predicting the Stability of Hierarchical Triple Systems with Convolutional Neural Networks [SSA]

http://arxiv.org/abs/2206.12402


Understanding the long-term evolution of hierarchical triple systems is challenging due to its inherent chaotic nature, and it requires computationally expensive simulations. Here we propose a convolutional neural network model to predict the stability of hierarchical triples by looking at their evolution during the first $5 \times 10^5$ inner binary orbits. We employ the regularized few-body code TSUNAMI to simulate $5\times 10^6$ hierarchical triples, from which we generate a large training and test dataset. We develop twelve different network configurations that use different combinations of the triples’ orbital elements and compare their performances. Our best model uses 6 time-series, namely, the semimajor axes ratio, the inner and outer eccentricities, the mutual inclination and the arguments of pericenter. This model achieves an area under the curve of over $95\%$ and informs of the relevant parameters to study triple systems stability. All trained models are made publicly available, allowing to predict the stability of hierarchical triple systems $200$ times faster than pure $N$-body methods.

Read this paper on arXiv…

F. Lalande and A. Trani
Mon, 5 Sep 22
27/53

Comments: 12 pages, 6 figures, accepted for publication in ApJ

Light curve completion and forecasting using fast and scalable Gaussian processes (MuyGPs) [IMA]

http://arxiv.org/abs/2208.14592


Temporal variations of apparent magnitude, called light curves, are observational statistics of interest captured by telescopes over long periods of time. Light curves afford the exploration of Space Domain Awareness (SDA) objectives such as object identification or pose estimation as latent variable inference problems. Ground-based observations from commercial off-the-shelf (COTS) cameras remain inexpensive compared to higher precision instruments; however, limited sensor availability combined with noisier observations can produce gappy time-series data that can be difficult to model. These external factors confound the automated exploitation of light curves, which makes light curve prediction and extrapolation a crucial problem for applications. Traditionally, image or time-series completion problems have been approached with diffusion-based or exemplar-based methods. More recently, Deep Neural Networks (DNNs) have become the tool of choice due to their empirical success at learning complex nonlinear embeddings. However, DNNs often require large training data that are not necessarily available when looking at unique features of a light curve of a single satellite.
In this paper, we present a novel approach to predicting missing and future data points of light curves using Gaussian Processes (GPs). GPs are non-linear probabilistic models that infer posterior distributions over functions and naturally quantify uncertainty. However, the cubic scaling of GP inference and training is a major barrier to their adoption in applications. In particular, a single light curve can feature hundreds of thousands of observations, which is well beyond the practical realization limits of a conventional GP on a single machine. Consequently, we employ MuyGPs, a scalable framework for hyperparameter estimation of GP models that uses nearest neighbors sparsification and local cross-validation. MuyGPs…
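The nearest-neighbour sparsification idea described above can be illustrated with a plain-numpy local GP prediction: each query time is predicted from a GP conditioned only on its k nearest training observations. This is a conceptual sketch under assumed choices (RBF kernel, fixed hyperparameters) and is not the MuyGPs library API or its hyperparameter-estimation machinery.

```python
import numpy as np

def local_gp_predict(t_train, y_train, t_query, k=50, length=1.0, noise=1e-2):
    """GP mean prediction at each query time, conditioned only on the k
    nearest training observations (nearest-neighbour sparsification)."""
    def rbf(a, b):
        return np.exp(-0.5 * (a[:, None] - b[None, :]) ** 2 / length ** 2)

    preds = np.empty_like(t_query)
    for i, t in enumerate(t_query):
        nn = np.argsort(np.abs(t_train - t))[:k]              # k nearest in time
        K = rbf(t_train[nn], t_train[nn]) + noise * np.eye(len(nn))
        k_star = rbf(np.array([t]), t_train[nn])[0]
        preds[i] = k_star @ np.linalg.solve(K, y_train[nn])
    return preds

# Toy gappy light curve: observations with a gap between t = 40 and t = 60.
rng = np.random.default_rng(0)
t_train = np.sort(np.concatenate([rng.uniform(0, 40, 200), rng.uniform(60, 100, 200)]))
y_train = np.sin(0.2 * t_train) + 0.05 * rng.normal(size=t_train.size)
t_query = np.linspace(0.0, 100.0, 300)
y_pred = local_gp_predict(t_train, y_train, t_query)
```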

Read this paper on arXiv…

I. Goumiri, A. Dunton, A. Muyskens, et. al.
Thu, 1 Sep 22
52/68

Comments: 14 pages, 7 figures, accepted to AMOS 2022 conference

Inferring subhalo effective density slopes from strong lensing observations with neural likelihood-ratio estimation [CEA]

http://arxiv.org/abs/2208.13796


Strong gravitational lensing has emerged as a promising approach for probing dark matter models on sub-galactic scales. Recent work has proposed the subhalo effective density slope as a more reliable observable than the commonly used subhalo mass function. The subhalo effective density slope is a measurement independent of assumptions about the underlying density profile and can be inferred for individual subhalos through traditional sampling methods. To go beyond individual subhalo measurements, we leverage recent advances in machine learning and introduce a neural likelihood-ratio estimator to infer an effective density slope for populations of subhalos. We demonstrate that our method is capable of harnessing the statistical power of multiple subhalos (within and across multiple images) to distinguish between characteristics of different subhalo populations. The computational efficiency warranted by the neural likelihood-ratio estimator over traditional sampling enables statistical studies of dark matter perturbers and is particularly useful as we expect an influx of strong lensing systems from upcoming surveys.

Read this paper on arXiv…

G. Zhang, S. Mishra-Sharma and C. Dvorkin
Wed, 31 Aug 22
33/86

Comments: 11 pages, 5 figures

Leap-frog neural network for learning the symplectic evolution from partitioned data [EPA]

http://arxiv.org/abs/2208.14148


For the Hamiltonian system, this work considers the learning and prediction of the position ($q$) and momentum ($p$) variables generated by a symplectic evolution map. Similar to Chen & Tao (2021), the symplectic map is represented by the generating function. In addition, we develop a new learning scheme by splitting the time series $(q_i, p_i)$ into several partitions, and then train a leap-frog neural network (LFNN) to approximate the generating function between the first partition (i.e. the initial condition) and one of the remaining partitions. For predicting the system evolution on a short timescale, the LFNN can effectively avoid the issue of accumulative error. The LFNN is then applied to learn the behavior of the 2:3 resonant Kuiper belt objects over a much longer time period, with two significant improvements over the neural network constructed in our previous work (Li et al. 2022): (1) conservation of the Jacobi integral; (2) highly accurate prediction of the orbital evolution. We propose that the LFNN may be useful for predicting the long-term evolution of Hamiltonian systems.
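For orientation, here is the classical kick-drift-kick leapfrog update whose structure the LFNN mirrors; in the paper the map between partitions is a learned generating function rather than an explicit force, so the numpy sketch below is only the textbook template, not the network.

```python
import numpy as np

def leapfrog_step(q, p, grad_V, dt):
    """One kick-drift-kick leapfrog update for a separable Hamiltonian
    H(q, p) = p^2/2 + V(q). Symplectic, like the evolution map the LFNN
    is trained to approximate."""
    p_half = p - 0.5 * dt * grad_V(q)            # half kick
    q_new = q + dt * p_half                      # drift
    p_new = p_half - 0.5 * dt * grad_V(q_new)    # half kick
    return q_new, p_new

# Example: harmonic oscillator, V(q) = q^2/2, so grad_V(q) = q.
q, p = np.array([1.0]), np.array([0.0])
for _ in range(1000):
    q, p = leapfrog_step(q, p, lambda x: x, dt=0.01)
```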

Read this paper on arXiv…

X. Li, J. Li and Z. Xia
Wed, 31 Aug 22
60/86

Comments: 10 pages, 5 figures, comments welcome

Uncovering dark matter density profiles in dwarf galaxies with graph neural networks [CEA]

http://arxiv.org/abs/2208.12825


Dwarf galaxies are small, dark matter-dominated galaxies, some of which are embedded within the Milky Way. Their lack of baryonic matter (e.g., stars and gas) makes them perfect test beds for probing the properties of dark matter — understanding the spatial dark matter distribution in these systems can be used to constrain microphysical dark matter interactions that influence the formation and evolution of structures in our Universe. We introduce a new method that leverages simulation-based inference and graph-based machine learning in order to infer the dark matter density profiles of dwarf galaxies from observable kinematics of stars gravitationally bound to these systems. Our approach aims to address some of the limitations of established methods based on dynamical Jeans modeling. We show that this novel method can place stronger constraints on dark matter profiles and, consequently, has the potential to weigh in on some of the ongoing puzzles associated with the small-scale structure of dark matter halos, such as the core-cusp discrepancy.

Read this paper on arXiv…

T. Nguyen, S. Mishra-Sharma, R. Williams, et. al.
Tue, 30 Aug 22
62/76

Comments: 9 + 11 pages, 4 + 9 figures

Exploring the Limits of Synthetic Creation of Solar EUV Images via Image-to-Image Translation [SSA]

http://arxiv.org/abs/2208.09512


The Solar Dynamics Observatory (SDO), a NASA multi-spectral decade-long mission that has been daily producing terabytes of observational data from the Sun, has been recently used as a use-case to demonstrate the potential of machine learning methodologies and to pave the way for future deep-space mission planning. In particular, the idea of using image-to-image translation to virtually produce extreme ultra-violet channels has been proposed in several recent studies, as a way to both enhance missions with fewer available channels and to alleviate the challenges due to the low downlink rate in deep space. This paper investigates the potential and the limitations of such a deep learning approach by focusing on the permutation of four channels and an encoder–decoder based architecture, with particular attention to how morphological traits and brightness of the solar surface affect the neural network predictions. In this work we want to answer the question: can synthetic images of the solar corona produced via image-to-image translation be used for scientific studies of the Sun? The analysis highlights that the neural network produces high-quality images over three orders of magnitude in count rate (pixel intensity) and can generally reproduce the covariance across channels within a 1% error. However, the model performance drastically diminishes for extremely energetic events like flares, and we argue that the reason is related to the rareness of such events posing a challenge to model training.

Read this paper on arXiv…

V. Salvatelli, L. Santos, S. Bose, et. al.
Tue, 23 Aug 22
12/79

Comments: 16 pages, 8 figures. To be published on ApJ (submitted on Feb 21st, accepted on July 28th)

Discovering Faint and High Apparent Motion Rate Near-Earth Asteroids Using A Deep Learning Program [IMA]

http://arxiv.org/abs/2208.09098


Although many near-Earth objects have been found by ground-based telescopes, some fast-moving ones, especially those near detection limits, have been missed by observatories. We developed a convolutional neural network for detecting faint fast-moving near-Earth objects. It was trained with artificial streaks generated from simulations and was able to find these asteroid streaks with an accuracy of 98.7% and a false positive rate of 0.02% on simulated data. This program was used to search image data from the Zwicky Transient Facility (ZTF) in four nights in 2019, and it identified six previously undiscovered asteroids. The visual magnitudes of our detections range from ~19.0 – 20.3 and motion rates range from ~6.8 – 24 deg/day, which is very faint compared to other ZTF detections moving at similar motion rates. Our asteroids are also ~1 – 51 m diameter in size and ~5 – 60 lunar distances away at close approach, assuming their albedo values follow the albedo distribution function of known asteroids. The use of a purely simulated dataset to train our model enables the program to gain sensitivity in detecting faint and fast-moving objects while still being able to recover nearly all discoveries made by previously designed neural networks which used real detections to train neural networks. Our approach can be adopted by any observatory for detecting fast-moving asteroid streaks.
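
A toy numpy sketch of how an artificial streak can be injected into a background image to build such a training set; the streak model (a blurred constant-flux line segment) and all parameters are illustrative assumptions.

import numpy as np
from scipy.ndimage import gaussian_filter

def inject_streak(image, x0, y0, length, angle, flux, psf_sigma=1.5):
    # Draw a line segment of constant flux, then blur it with a Gaussian PSF.
    streak = np.zeros_like(image, dtype=float)
    for t in np.linspace(0.0, 1.0, 10 * int(length)):
        x = int(round(x0 + t * length * np.cos(angle)))
        y = int(round(y0 + t * length * np.sin(angle)))
        if 0 <= y < image.shape[0] and 0 <= x < image.shape[1]:
            streak[y, x] = flux
    return image + gaussian_filter(streak, psf_sigma)

rng = np.random.default_rng(1)
background = rng.normal(100.0, 5.0, size=(256, 256))     # noisy sky background
training_image = inject_streak(background, 60, 80, length=40, angle=0.3, flux=50.0)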

Read this paper on arXiv…

F. Wang, J. Ge and K. Willis
Mon, 22 Aug 22
3/53

Comments: 14 pages, 22 Figures, 4 Tables; To be published in the Monthly Notices of the Royal Astronomical Society (MNRAS)

SNGuess: A method for the selection of young extragalactic transients [IMA]

http://arxiv.org/abs/2208.06534


With a rapidly rising number of transients detected in astronomy, classification methods based on machine learning are increasingly being employed. Their goals are typically to obtain a definitive classification of transients, and for good performance they usually require the presence of a large set of observations. However, well-designed, targeted models can reach their classification goals with fewer computing resources. This paper presents SNGuess, a model designed to find young extragalactic nearby transients with high purity. SNGuess works with a set of features that can be efficiently calculated from astronomical alert data. Some of these features are static and associated with the alert metadata, while others must be calculated from the photometric observations contained in the alert. Most of the features are simple enough to be obtained or to be calculated already at the early stages in the lifetime of a transient after its detection. We calculate these features for a set of labeled public alert data obtained over a time span of 15 months from the Zwicky Transient Facility (ZTF). The core model of SNGuess consists of an ensemble of decision trees, which are trained via gradient boosting. Approximately 88% of the candidates suggested by SNGuess from a set of alerts from ZTF spanning from April 2020 to August 2021 were found to be true relevant supernovae (SNe). For alerts with bright detections, this number ranges between 92% and 98%. Since April 2020, transients identified by SNGuess as potential young SNe in the ZTF alert stream are being published to the Transient Name Server (TNS) under the AMPEL_ZTF_NEW group identifier. SNGuess scores for any transient observed by ZTF can be accessed via a web service. The source code of SNGuess is publicly available.
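
A minimal scikit-learn sketch of the kind of gradient-boosted tree ensemble that SNGuess is built around; the synthetic features and labels below are placeholders, not the actual SNGuess alert features.

import numpy as np
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(2)
# Placeholder alert features, e.g. peak magnitude, rise rate, colour, number of detections.
X = rng.normal(size=(5000, 4))
y = (X[:, 0] + 0.5 * X[:, 1] + rng.normal(scale=0.5, size=5000) > 0).astype(int)

X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)
model = GradientBoostingClassifier(n_estimators=200, learning_rate=0.05, max_depth=3)
model.fit(X_train, y_train)
print("held-out accuracy:", model.score(X_test, y_test))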

Read this paper on arXiv…

N. Miranda, J. Freytag, J. Nordin, et. al.
Tue, 16 Aug 22
38/74

Comments: 14 pages, 10 figures, Astronomy & Astrophysics (A&A), Forthcoming article, source code this https URL

Virgo: Scalable Unsupervised Classification of Cosmological Shock Waves [IMA]

http://arxiv.org/abs/2208.06859


Cosmological shock waves are essential to understanding the formation of cosmological structures. To study them, scientists run computationally expensive high-resolution 3D hydrodynamic simulations. Interpreting the simulation results is challenging because the resulting data sets are enormous, and the shock wave surfaces are hard to separate and classify due to their complex morphologies and the intersection of multiple shock fronts. We introduce a novel pipeline, Virgo, combining physical motivation, scalability, and probabilistic robustness to tackle this unsolved unsupervised classification problem. To this end, we employ kernel principal component analysis with low-rank matrix approximations to denoise data sets of shocked particles and create labeled subsets. We perform supervised classification to recover full data resolution with stochastic variational deep kernel learning. We evaluate on three state-of-the-art data sets with varying complexity and achieve good results. The proposed pipeline runs automatically, has only a few hyperparameters, and performs well on all tested data sets. Our results are promising for large-scale applications, and we highlight the future scientific work that is now enabled.
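
A small scikit-learn sketch of the kernel-PCA denoising idea such a pipeline can start from: project noisy particle features onto a few kernel principal components and map them back to the input space. The kernel, component count and toy data are illustrative assumptions.

import numpy as np
from sklearn.decomposition import KernelPCA

rng = np.random.default_rng(11)
clean = np.vstack([rng.normal(c, 0.05, size=(300, 3)) for c in ([0, 0, 0], [1, 1, 0])])
noisy = clean + 0.2 * rng.normal(size=clean.shape)       # stand-in for shocked-particle features

kpca = KernelPCA(n_components=4, kernel="rbf", gamma=2.0, fit_inverse_transform=True, alpha=0.1)
low_rank = kpca.fit_transform(noisy)                     # low-rank representation used for labelling
denoised = kpca.inverse_transform(low_rank)              # back-projection into feature space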

Read this paper on arXiv…

M. Lamparth, L. Böss, U. Steinwandel, et. al.
Tue, 16 Aug 22
46/74

Comments: N/A

A deep learning approach to halo merger tree construction [GA]

http://arxiv.org/abs/2205.15988


A key ingredient for semi-analytic models (SAMs) of galaxy formation is the mass assembly history of haloes, encoded in a tree structure. The most commonly used method to construct halo merger histories is based on the outcomes of high-resolution, computationally intensive N-body simulations. We show that machine learning (ML) techniques, in particular Generative Adversarial Networks (GANs), are a promising new tool to tackle this problem with a modest computational cost while retaining the best features of merger trees from simulations. We train our GAN model with a limited sample of merger trees from the EAGLE simulation suite, constructed using two halo finder and tree builder combinations: SUBFIND with D-TREES and ROCKSTAR with ConsistentTrees. Our GAN model successfully learns to generate well-constructed merger tree structures with high temporal resolution, and to reproduce the statistical features of the sample of merger trees used for training, when considering up to three variables in the training process. These inputs, whose representations are also learned by our GAN model, are the mass of the halo progenitors and the final descendant, the progenitor type (main halo or satellite), and the distance of a progenitor to the one in the main branch. The inclusion of the latter two inputs greatly improves the final learned representation of the halo mass growth history, especially for SUBFIND-like ML trees. When comparing equally sized samples of ML merger trees with those of the EAGLE simulation, we find better agreement for SUBFIND-like ML trees. Finally, our GAN-based framework can be utilised to construct merger histories of low and intermediate mass haloes, the most abundant in cosmological simulations.

Read this paper on arXiv…

S. Robles, J. Gómez, A. Rivera, et. al.
Wed, 1 Jun 22
48/65

Comments: 17 pages, 12 figures, 3 tables, 2 appendices

Calibrated Predictive Distributions via Diagnostics for Conditional Coverage [CL]

http://arxiv.org/abs/2205.14568


Uncertainty quantification is crucial for assessing the predictive ability of AI algorithms. A large body of work (including normalizing flows and Bayesian neural networks) has been devoted to describing the entire predictive distribution (PD) of a target variable $Y$ given input features $\mathbf{X}$. However, off-the-shelf PDs are usually far from being conditionally calibrated; i.e., the probability of occurrence of an event given input $\mathbf{X}$ can be significantly different from the predicted probability. Most current research on predictive inference (such as conformal prediction) concerns constructing prediction sets that not only provide correct uncertainties on average over the entire population (that is, averaging over $\mathbf{X}$), but are also approximately conditionally calibrated with accurate uncertainties for individual instances. It is often believed that the problem of obtaining and assessing entire conditionally calibrated PDs is too challenging to approach. In this work, we show that recalibration as well as validation are indeed attainable goals in practice. Our proposed method relies on the idea of regressing probability integral transform (PIT) scores against $\mathbf{X}$. This regression gives full diagnostics of conditional coverage across the entire feature space and can be used to recalibrate misspecified PDs. We benchmark our corrected prediction bands against oracle bands and state-of-the-art predictive inference algorithms for synthetic data, including settings with distributional shift and dependent high-dimensional sequence data. Finally, we demonstrate an application to the physical sciences in which we assess and produce calibrated PDs for measurements of galaxy distances using imaging data (i.e., photometric redshifts).
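
A small sketch of the central idea of regressing PIT values on features: for each instance, compute the PIT of the observed target under the predicted CDF, then check whether the event {PIT <= alpha} is predictable from $\mathbf{X}$. The logistic-regression diagnostic below is an illustrative stand-in, not the authors' regression method.

import numpy as np
from scipy.stats import norm
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(3)
X = rng.uniform(-2, 2, size=(5000, 1))
y = rng.normal(loc=X[:, 0], scale=1.0 + 0.5 * np.abs(X[:, 0]))   # heteroscedastic ground truth

# A misspecified predictive distribution that ignores the X-dependent scale.
pit = norm.cdf(y, loc=X[:, 0], scale=1.0)

alpha = 0.1
indicator = (pit <= alpha).astype(int)
diagnostic = LogisticRegression().fit(X, indicator)
# Under conditional calibration the predicted probability would be ~alpha for every X.
print(diagnostic.predict_proba(np.array([[-2.0], [0.0], [2.0]]))[:, 1])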

Read this paper on arXiv…

B. Dey, D. Zhao, J. Newman, et. al.
Tue, 31 May 22
87/89

Comments: 10 pages, 6 figures. Under review

Global geomagnetic perturbation forecasting using Deep Learning [CL]

http://arxiv.org/abs/2205.12734


Geomagnetically Induced Currents (GICs) arise from spatio-temporal changes to Earth’s magnetic field driven by the interaction of the solar wind with Earth’s magnetosphere, and can cause catastrophic damage to our technologically dependent society. Hence, computational models that forecast GICs globally with a large forecast horizon, high spatial resolution and high temporal cadence are of increasing importance for prompt mitigation. Since GIC data are proprietary, the time variability of the horizontal component of the magnetic field perturbation (dB/dt) is used as a proxy for GICs. In this work, we develop a fast, global dB/dt forecasting model, which forecasts 30 minutes into the future using only solar wind measurements as input. The model summarizes 2 hours of solar wind measurements using a Gated Recurrent Unit, and generates forecasts of coefficients which are folded with a spherical harmonic basis to enable global forecasts. When deployed, our model produces results in under a second, and generates global forecasts for the horizontal magnetic perturbation components at 1-minute cadence. We evaluate our model against models in the literature for two specific storms, on 5 August 2011 and 17 March 2015, while maintaining a self-consistent benchmark model set. Our model outperforms, or has performance consistent with, state-of-the-practice high-cadence local and low-cadence global models, while also outperforming or matching the benchmark models. Such quick inferences at high temporal cadence and arbitrary spatial resolutions may ultimately enable accurate forewarning of dB/dt for any place on Earth, allowing precautionary measures to be taken in an informed manner.
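
A minimal PyTorch sketch of the general architecture described (a GRU summarising a window of solar-wind measurements into spherical-harmonic coefficients that are folded with a basis to give a global map); the layer sizes, maximum degree and basis construction are assumptions for illustration.

import numpy as np
import torch
import torch.nn as nn
from scipy.special import sph_harm

class GlobalForecaster(nn.Module):
    def __init__(self, n_features=5, hidden=64, n_coeffs=16):
        super().__init__()
        self.gru = nn.GRU(n_features, hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_coeffs)

    def forward(self, solar_wind):            # (batch, time, features)
        _, h = self.gru(solar_wind)
        return self.head(h[-1])               # (batch, n_coeffs)

# Real spherical-harmonic basis on a coarse lat-lon grid (degrees 0..3 give 16 terms).
theta, phi = np.meshgrid(np.linspace(0, np.pi, 19), np.linspace(0, 2 * np.pi, 37), indexing="ij")
terms = []
for l in range(4):
    for m in range(-l, l + 1):
        ylm = sph_harm(m, l, phi, theta)      # scipy convention: (order, degree, azimuth, colatitude)
        terms.append(ylm.real if m >= 0 else ylm.imag)
basis = torch.tensor(np.stack(terms), dtype=torch.float32)     # (16, 19, 37)

model = GlobalForecaster()
coeffs = model(torch.randn(2, 120, 5))                         # two hours of 1-minute solar wind data
global_map = torch.einsum("bc,cij->bij", coeffs, basis)        # forecast over the grid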

Read this paper on arXiv…

V. Upendran, P. Tigas, B. Ferdousi, et. al.
Fri, 27 May 22
11/61

Comments: 23 pages, 8 figures, 5 tables; accepted for publication in AGU: Spaceweather

Removing the fat from your posterior samples with margarine [IMA]

http://arxiv.org/abs/2205.12841


Bayesian workflows often require the introduction of nuisance parameters, yet for core science modelling one needs access to a marginal posterior density. In this work we use masked autoregressive flows and kernel density estimators to encapsulate the marginal posterior, allowing us to compute marginal Kullback-Leibler divergences and marginal Bayesian model dimensionalities in addition to generating samples and computing marginal log probabilities. We demonstrate this in application to topical cosmological examples of the Dark Energy Survey, and global 21cm signal experiments. In addition to the computation of marginal Bayesian statistics, this work is important for further applications in Bayesian experimental design, complex prior modelling and likelihood emulation. This technique is made publicly available in the pip-installable code margarine.
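
margarine itself wraps masked autoregressive flows and KDEs; purely to illustrate the KDE half of the idea, the sketch below density-estimates a toy marginal posterior and computes a marginal KL divergence by Monte Carlo. It is not the margarine API.

import numpy as np
from scipy.stats import gaussian_kde

rng = np.random.default_rng(4)
# Toy joint posterior over (core parameter, nuisance parameter); keep only the first column.
samples = rng.multivariate_normal([0.0, 1.0], [[1.0, 0.6], [0.6, 2.0]], size=20000)
marginal_samples = samples[:, 0]

posterior_kde = gaussian_kde(marginal_samples)            # density of the marginal posterior
prior_kde = gaussian_kde(rng.normal(0.0, 5.0, 20000))     # density of the marginal prior

# Marginal KL divergence D(posterior || prior), estimated over posterior samples.
log_ratio = posterior_kde.logpdf(marginal_samples) - prior_kde.logpdf(marginal_samples)
print("marginal KL estimate:", log_ratio.mean())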

Read this paper on arXiv…

H. Bevins, W. Handley, P. Lemos, et. al.
Thu, 26 May 22
20/56

Comments: Submitted to NeurIPS

The Fellowship of the Dyson Ring: ACT\&Friends' Results and Methods for GTOC 11 [CL]

http://arxiv.org/abs/2205.10124


Dyson spheres are hypothetical megastructures encircling stars in order to harvest most of their energy output. During the 11th edition of the GTOC challenge, participants were tasked with a complex trajectory planning related to the construction of a precursor Dyson structure, a heliocentric ring made of twelve stations. To this purpose, we developed several new approaches that synthesize techniques from machine learning, combinatorial optimization, planning and scheduling, and evolutionary optimization effectively integrated into a fully automated pipeline. These include a machine learned transfer time estimator, improving the established Edelbaum approximation and thus better informing a Lazy Race Tree Search to identify and collect asteroids with high arrival mass for the stations; a series of optimally-phased low-thrust transfers to all stations computed by indirect optimization techniques, exploiting the synodic periodicity of the system; and a modified Hungarian scheduling algorithm, which utilizes evolutionary techniques to arrange a mass-balanced arrival schedule out of all transfer possibilities. We describe the steps of our pipeline in detail with a special focus on how our approaches mutually benefit from each other. Lastly, we outline and analyze the final solution of our team, ACT&Friends, which ranked second at the GTOC 11 challenge.

Read this paper on arXiv…

M. Märtens, D. Izzo, E. Blazquez, et. al.
Mon, 23 May 22
36/50

Comments: N/A

Fast and realistic large-scale structure from machine-learning-augmented random field simulations [CEA]

http://arxiv.org/abs/2205.07898


Producing thousands of simulations of the dark matter distribution in the Universe with increasing precision is a challenging but critical task to facilitate the exploitation of current and forthcoming cosmological surveys. Many inexpensive substitutes to full $N$-body simulations have been proposed, even though they often fail to reproduce the statistics of the smaller, non-linear scales. Among these alternatives, a common approximation is represented by the lognormal distribution, which comes with its own limitations as well, while being extremely fast to compute even for high-resolution density fields. In this work, we train a machine learning model to transform projected lognormal dark matter density fields to more realistic dark matter maps, as obtained from full $N$-body simulations. We detail the procedure that we follow to generate highly correlated pairs of lognormal and simulated maps, which we use as our training data, exploiting the information of the Fourier phases. We demonstrate the performance of our model comparing various statistical tests with different field resolutions, redshifts and cosmological parameters, proving its robustness and explaining its current limitations. The augmented lognormal random fields reproduce the power spectrum up to wavenumbers of $1 \ h \ \rm{Mpc}^{-1}$, the bispectrum and the peak counts within 10%, and always within the error bars, of the fiducial target simulations. Finally, we describe how we plan to integrate our proposed model with existing tools to yield more accurate spherical random fields for weak lensing analysis, going beyond the lognormal approximation.
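
A compact numpy sketch of the starting point of such a pipeline: draw a Gaussian random field with a chosen power spectrum via an FFT and exponentiate it into a zero-mean lognormal overdensity field. The power-law spectrum and grid size are illustrative assumptions.

import numpy as np

def gaussian_random_field(n=256, spectral_index=-2.0, seed=0):
    rng = np.random.default_rng(seed)
    kx = np.fft.fftfreq(n)[:, None]
    ky = np.fft.fftfreq(n)[None, :]
    k = np.sqrt(kx**2 + ky**2)
    k[0, 0] = 1.0                                    # avoid dividing by zero at the zero mode
    amplitude = k ** (spectral_index / 2.0)
    amplitude[0, 0] = 0.0
    noise = rng.normal(size=(n, n)) + 1j * rng.normal(size=(n, n))
    field = np.fft.ifft2(amplitude * noise).real
    return field / field.std()

gaussian = gaussian_random_field()
# delta = exp(g - sigma^2/2) - 1 has zero mean when g is Gaussian with variance sigma^2.
lognormal_delta = np.exp(gaussian - 0.5 * gaussian.var()) - 1.0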

Read this paper on arXiv…

D. Piras, B. Joachimi and F. Villaescusa-Navarro
Wed, 18 May 22
37/66

Comments: 13 pages, 7 figures, comments welcome

High-Resolution CMB Lensing Reconstruction with Deep Learning [CEA]

http://arxiv.org/abs/2205.07368


Next-generation cosmic microwave background (CMB) surveys are expected to provide valuable information about the primordial universe by creating maps of the mass along the line of sight. Traditional tools for creating these lensing convergence maps include the quadratic estimator and the maximum likelihood based iterative estimator. Here, we apply a generative adversarial network (GAN) to reconstruct the lensing convergence field. We compare our results with a previous deep learning approach — Residual-UNet — and discuss the pros and cons of each. In the process, we use training sets generated by a variety of power spectra, rather than the one used in testing the methods.

Read this paper on arXiv…

P. Li, I. Onur, S. Dodelson, et. al.
Tue, 17 May 22
55/95

Comments: 11 pages, 9 figures

Towards on-sky adaptive optics control using reinforcement learning [IMA]

http://arxiv.org/abs/2205.07554


The direct imaging of potentially habitable Exoplanets is one prime science case for the next generation of high contrast imaging instruments on ground-based extremely large telescopes. To reach this demanding science goal, the instruments are equipped with eXtreme Adaptive Optics (XAO) systems which will control thousands of actuators at framerates of one to several kilohertz. Most of the habitable exoplanets are located at small angular separations from their host stars, where the current XAO systems’ control laws leave strong residuals. Current AO control strategies like static matrix-based wavefront reconstruction and integrator control suffer from temporal delay error and are sensitive to mis-registration, i.e., to dynamic variations of the control system geometry. We aim to produce control methods that cope with these limitations, provide a significantly improved AO correction and, therefore, reduce the residual flux in the coronagraphic point spread function.
We extend previous work in Reinforcement Learning for AO. The improved method, called PO4AO, learns a dynamics model and optimizes a control neural network, called a policy. We introduce the method and study it through numerical simulations of XAO with Pyramid wavefront sensing for the 8-m and 40-m telescope aperture cases. We further implemented PO4AO and carried out experiments in a laboratory environment using MagAO-X at the Steward laboratory. PO4AO provides the desired performance by improving the coronagraphic contrast in numerical simulations by factors 3-5 within the control region of DM and Pyramid WFS, in simulation and in the laboratory. The presented method is also quick to train, i.e., on timescales of typically 5-10 seconds, and the inference time is sufficiently small (< ms) to be used in real-time control for XAO with currently available hardware even for extremely large telescopes.

Read this paper on arXiv…

J. Nousiainen, C. Rajani, M. Kasper, et. al.
Tue, 17 May 22
81/95

Comments: N/A

Improving Astronomical Time-series Classification via Data Augmentation with Generative Adversarial Networks [IMA]

http://arxiv.org/abs/2205.06758


Due to the latest advances in technology, telescopes with significant sky coverage will produce millions of astronomical alerts per night that must be classified both rapidly and automatically. Currently, classification consists of supervised machine learning algorithms whose performance is limited by the number of existing annotations of astronomical objects and their highly imbalanced class distributions. In this work, we propose a data augmentation methodology based on Generative Adversarial Networks (GANs) to generate a variety of synthetic light curves from variable stars. Our novel contributions, consisting of a resampling technique and an evaluation metric, can assess the quality of generative models in unbalanced datasets and identify GAN-overfitting cases that the Fréchet Inception Distance does not reveal. We applied our proposed model to two datasets taken from the Catalina and Zwicky Transient Facility surveys. The classification accuracy of variable stars is improved significantly when training with synthetic data and testing with real data with respect to the case of using only real data.

Read this paper on arXiv…

G. García-Jara, P. Protopapas and P. Estévez
Mon, 16 May 22
8/42

Comments: Accepted to ApJ on May 11, 2022

Insights into the origin of halo mass profiles from machine learning [CEA]

http://arxiv.org/abs/2205.04474


The mass distribution of dark matter haloes is the result of the hierarchical growth of initial density perturbations through mass accretion and mergers. We use an interpretable machine-learning framework to provide physical insights into the origin of the spherically-averaged mass profile of dark matter haloes. We train a gradient-boosted-trees algorithm to predict the final mass profiles of cluster-sized haloes, and measure the importance of the different inputs provided to the algorithm. We find two primary scales in the initial conditions (ICs) that impact the final mass profile: the density at approximately the scale of the haloes’ Lagrangian patch $R_L$ ($R\sim 0.7\, R_L$) and that in the large-scale environment ($R\sim 1.7~R_L$). The model also identifies three primary time-scales in the halo assembly history that affect the final profile: (i) the formation time of the virialized, collapsed material inside the halo, (ii) the dynamical time, which captures the dynamically unrelaxed, infalling component of the halo over its first orbit, (iii) a third, most recent time-scale, which captures the impact on the outer profile of recent massive merger events. While the inner profile retains memory of the ICs, this information alone is insufficient to yield accurate predictions for the outer profile. As we add information about the haloes’ mass accretion history, we find a significant improvement in the predicted profiles at all radii. Our machine-learning framework provides novel insights into the role of the ICs and the mass assembly history in determining the final mass profile of cluster-sized haloes.
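
A minimal scikit-learn sketch of the general setup: a gradient-boosted-trees regressor mapping early-time halo properties to the mass enclosed at one radius, with impurity-based feature importances read off afterwards. The features and data are synthetic placeholders, not the quantities measured in the paper.

import numpy as np
from sklearn.ensemble import GradientBoostingRegressor

rng = np.random.default_rng(5)
feature_names = ["delta(0.7 R_L)", "delta(1.7 R_L)", "t_formation", "t_dynamical"]
X = rng.normal(size=(3000, 4))
y = 0.8 * X[:, 0] + 0.3 * X[:, 2] + 0.1 * rng.normal(size=3000)   # toy "mass at a fixed radius"

model = GradientBoostingRegressor(n_estimators=300, max_depth=3, learning_rate=0.05)
model.fit(X, y)
for name, importance in zip(feature_names, model.feature_importances_):
    print(f"{name:16s} {importance:.3f}")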

Read this paper on arXiv…

L. Lucie-Smith, S. Adhikari and R. Wechsler
Wed, 11 May 22
23/60

Comments: 14 pages, 11 figures, comments welcome

Automatic Detection of Interplanetary Coronal Mass Ejections in Solar Wind In Situ Data [SSA]

http://arxiv.org/abs/2205.03578


Interplanetary coronal mass ejections (ICMEs) are one of the main drivers for space weather disturbances. In the past, different approaches have been used to automatically detect events in existing time series resulting from solar wind in situ observations. However, accurate and fast detection still remains a challenge when facing the large amount of data from different instruments. For the automatic detection of ICMEs we propose a pipeline using a method that has recently proven successful in medical image segmentation. Comparing it to an existing method, we find that while achieving similar results, our model outperforms the baseline regarding training time by a factor of approximately 20, thus making it more applicable for other datasets. The method has been tested on in situ data from the Wind spacecraft between 1997 and 2015 with a True Skill Statistic (TSS) of 0.64. Out of the 640 ICMEs, 466 were detected correctly by our algorithm, producing a total of 254 False Positives. Additionally, it produced reasonable results on datasets with fewer features and smaller training sets from Wind, STEREO-A and STEREO-B with True Skill Statistics of 0.56, 0.57 and 0.53, respectively. Our pipeline manages to find the start of an ICME with a mean absolute error (MAE) of around 2 hours and 56 minutes, and the end time with a MAE of 3 hours and 20 minutes. The relatively fast training allows straightforward tuning of hyperparameters and could therefore easily be used to detect other structures and phenomena in solar wind data, such as corotating interaction regions.

Read this paper on arXiv…

H. Rüdisser, A. Windisch, U. Amerstorfer, et. al.
Tue, 10 May 22
12/70

Comments: N/A

ASTROMER: A transformer-based embedding for the representation of light curves [IMA]

http://arxiv.org/abs/2205.01677


Taking inspiration from natural language embeddings, we present ASTROMER, a transformer-based model to create representations of light curves. ASTROMER was trained on millions of MACHO R-band samples, and it can be easily fine-tuned to match specific domains associated with downstream tasks. As an example, this paper shows the benefits of using pre-trained representations to classify variable stars. In addition, we provide a python library including all functionalities employed in this work. Our library includes the pre-trained models that can be used to enhance the performance of deep learning models, decreasing computational resources while achieving state-of-the-art results.
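
ASTROMER ships its own models and weights; purely as a generic illustration of a transformer encoder producing light-curve embeddings, the PyTorch sketch below embeds (time, magnitude) pairs and mean-pools the encoder output. It is not the ASTROMER API.

import torch
import torch.nn as nn

class LightCurveEncoder(nn.Module):
    def __init__(self, d_model=64, n_heads=4, n_layers=2):
        super().__init__()
        self.input_proj = nn.Linear(2, d_model)      # each observation is a (time, magnitude) pair
        layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=n_heads, batch_first=True)
        self.encoder = nn.TransformerEncoder(layer, num_layers=n_layers)

    def forward(self, light_curves):                 # (batch, n_observations, 2)
        tokens = self.input_proj(light_curves)
        return self.encoder(tokens).mean(dim=1)      # one embedding vector per light curve

model = LightCurveEncoder()
embeddings = model(torch.randn(8, 200, 2))           # eight light curves, 200 observations each
# The embeddings could feed a small classification head, e.g. for variable-star classes.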

Read this paper on arXiv…

C. Donoso-Oliva, I. Becker, P. Protopapas, et. al.
Thu, 5 May 22
35/51

Comments: N/A

DeepGraviLens: a Multi-Modal Architecture for Classifying Gravitational Lensing Data [IMA]

http://arxiv.org/abs/2205.00701


Gravitational lensing is the relativistic effect generated by massive bodies, which bend the space-time surrounding them. It is a deeply investigated topic in astrophysics and allows validating theoretical relativistic results and studying faint astrophysical objects that would not be visible otherwise. In recent years, machine learning methods have been applied to support the analysis of gravitational lensing phenomena by detecting lensing effects in data sets consisting of images associated with brightness variation time series. However, the state-of-the-art approaches either consider only images and neglect time-series data or achieve relatively low accuracy on the most difficult data sets. This paper introduces DeepGraviLens, a novel multi-modal network that classifies spatio-temporal data belonging to one non-lensed system type and three lensed system types. It surpasses the current state-of-the-art accuracy results by $\approx$ 19% to $\approx$ 43%, depending on the considered data set. Such an improvement will enable the acceleration of the analysis of lensed objects in upcoming astrophysical surveys, which will exploit the petabytes of data collected, e.g., from the Vera C. Rubin Observatory.

Read this paper on arXiv…

N. Vago and P. Fraternali
Tue, 3 May 22
56/82

Comments: N/A

Learning cosmology and clustering with cosmic graphs [CEA]

http://arxiv.org/abs/2204.13713


We train deep learning models on thousands of galaxy catalogues from the state-of-the-art hydrodynamic simulations of the CAMELS project to perform regression and inference. We employ Graph Neural Networks (GNNs), architectures designed to work with irregular and sparse data, like the distribution of galaxies in the Universe. We first show that GNNs can learn to compute the power spectrum of galaxy catalogues with a few percent accuracy. We then train GNNs to perform likelihood-free inference at the galaxy-field level. Our models are able to infer the value of $\Omega_{\rm m}$ with a $\sim12\%-13\%$ accuracy just from the positions of $\sim1000$ galaxies in a volume of $(25~h^{-1}{\rm Mpc})^3$ at $z=0$ while accounting for astrophysical uncertainties as modelled in CAMELS. Incorporating information from galaxy properties, such as stellar mass, stellar metallicity, and stellar radius, increases the accuracy to $4\%-8\%$. Our models are built to be translational and rotational invariant, and they can extract information from any scale larger than the minimum distance between two galaxies. However, our models are not completely robust: testing on simulations run with a different subgrid physics than the ones used for training does not yield as accurate results.

Read this paper on arXiv…

P. Villanueva-Domingo and F. Villaescusa-Navarro
Mon, 2 May 22
11/52

Comments: 21 pages, 8 figures, code publicly available at this https URL

The Galactic 3D large-scale dust distribution via Gaussian process regression on spherical coordinates [GA]

http://arxiv.org/abs/2204.11715


Knowing the Galactic 3D dust distribution is relevant for understanding many processes in the interstellar medium and for correcting many astronomical observations for dust absorption and emission. Here, we aim for a 3D reconstruction of the Galactic dust distribution with an increase in the number of meaningful resolution elements by orders of magnitude with respect to previous reconstructions, while taking advantage of the dust’s spatial correlations to inform the dust map. We use iterative grid refinement to define a log-normal process in spherical coordinates. This log-normal process assumes a fixed correlation structure, which was inferred in an earlier reconstruction of Galactic dust. Our map is informed by 111 million data points, combining data from PANSTARRS, 2MASS, Gaia DR2 and ALLWISE. The log-normal process is discretized to 122 billion degrees of freedom, a factor of 400 more than our previous map. We derive the most probable posterior map and an uncertainty estimate using natural gradient descent and the Fisher-Laplace approximation. The dust reconstruction covers a quarter of the volume of our Galaxy, with a maximum coordinate distance of $16\,\text{kpc}$, and meaningful information can be found out to distances of $4\,$kpc, still improving upon our earlier map by a factor of 5 in maximal distance, of $900$ in volume, and of about eighteen in angular grid resolution. Unfortunately, the maximum posterior approach chosen to make the reconstruction computationally affordable introduces artifacts and reduces the accuracy of our uncertainty estimate. Despite the apparent limitations of the presented 3D dust map, a good part of the reconstructed structures are confirmed by independent maser observations. Thus, the map is a step towards reliable 3D Galactic cartography and can already serve for a number of tasks, if used with care.

Read this paper on arXiv…

R. Leike, G. Edenhofer, J. Knollmüller, et. al.
Tue, 26 Apr 22
40/74

Comments: N/A

Exoplanet Cartography using Convolutional Neural Networks [EPA]

http://arxiv.org/abs/2204.11821


In the near future, dedicated telescopes will observe Earth-like exoplanets in reflected light, allowing their characterization. Because of the huge distances, every exoplanet will be a single pixel, but temporal variations in its spectral flux hold information about the planet’s surface and atmosphere. We test convolutional neural networks for retrieving a planet’s rotation axis, surface and cloud map from simulated single-pixel flux and polarization observations. We investigate the effect of assuming Lambertian reflection in the retrieval while the actual reflection is bidirectional, and of including polarization in the retrievals. We simulate observations along a planet’s orbit using a radiative transfer algorithm that includes polarization and bidirectional reflection by vegetation, desert, oceans, water clouds, and Rayleigh scattering in 6 spectral bands from 400 to 800 nm, at various photon noise levels. The surface types and cloud patterns of the facets covering a model planet are based on probability distributions. Our networks are trained with simulated observations of millions of planets before retrieving maps of test planets. The neural networks can constrain rotation axes with a mean squared error (MSE) as small as 0.0097, depending on the orbital inclination. On a bidirectionally reflecting planet, 92% of ocean and 85% of vegetation, desert, and cloud facets are correctly retrieved, in the absence of noise. With realistic noise, it should still be possible to retrieve the main map features with a dedicated telescope. Except for face-on orbits, a network trained with Lambertian reflecting planets yields significant retrieval errors when given observations of bidirectionally reflecting planets, in particular, brightness artefacts around a planet’s pole. Including polarization improves the retrieval of the rotation axis and the accuracy of the retrieval of ocean and cloud facets.

Read this paper on arXiv…

K. Meinke, D. Stam and P. Visser
Tue, 26 Apr 22
53/74

Comments: 38 pages, 25 figures, 5 tables, accepted for publication in Astron. Astrophys

Hephaestus: A large scale multitask dataset towards InSAR understanding [CL]

http://arxiv.org/abs/2204.09435


Synthetic Aperture Radar (SAR) data, and Interferometric SAR (InSAR) products in particular, are one of the largest sources of Earth Observation data. InSAR provides unique information on diverse geophysical processes and geology, and on the geotechnical properties of man-made structures. However, there are only a limited number of applications that exploit the abundance of InSAR data and deep learning methods to extract such knowledge. The main barrier has been the lack of a large curated and annotated InSAR dataset, which would be costly to create and would require an interdisciplinary team of experts experienced in InSAR data interpretation. In this work, we have put in the effort to create and make available the first of its kind, manually annotated dataset, consisting of 19,919 individual Sentinel-1 interferograms acquired over 44 different volcanoes globally, which are split into 216,106 InSAR patches. The annotated dataset is designed to address different computer vision problems, including volcano state classification, semantic segmentation of ground deformation, detection and classification of atmospheric signals in InSAR imagery, interferogram captioning, text to InSAR generation, and InSAR image quality assessment.

Read this paper on arXiv…

N. Bountos, I. Papoutsis, D. Michail, et. al.
Thu, 21 Apr 22
18/73

Comments: This work has been accepted for publication in EARTHVISION 2022, in conjuction with the Computer Vision and Pattern Recognition (CVPR) 2022 Conference

A stochastic Stein Variational Newton method [CL]

http://arxiv.org/abs/2204.09039


Stein variational gradient descent (SVGD) is a general-purpose optimization-based sampling algorithm that has recently exploded in popularity, but is limited by two issues: it is known to produce biased samples, and it can be slow to converge on complicated distributions. A recently proposed stochastic variant of SVGD (sSVGD) addresses the first issue, producing unbiased samples by incorporating a special noise into the SVGD dynamics such that asymptotic convergence is guaranteed. Meanwhile, Stein variational Newton (SVN), a Newton-like extension of SVGD, dramatically accelerates the convergence of SVGD by incorporating Hessian information into the dynamics, but also produces biased samples. In this paper we derive, and provide a practical implementation of, a stochastic variant of SVN (sSVN) which is both asymptotically correct and converges rapidly. We demonstrate the effectiveness of our algorithm on a difficult class of test problems — the Hybrid Rosenbrock density — and show that sSVN converges using three orders of magnitude fewer gradient evaluations of the log likelihood than its stochastic SVGD counterpart. Our results show that sSVN is a promising approach to accelerating high-precision Bayesian inference tasks with modest-dimension, $d\sim\mathcal{O}(10)$.
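
For context, a minimal numpy implementation of a single deterministic SVGD update with an RBF kernel, the baseline that sSVGD and sSVN build on; the step size and median-heuristic bandwidth are standard choices, not settings from this paper.

import numpy as np

def svgd_step(particles, grad_log_p, step_size=0.1):
    # particles: (n, d); grad_log_p returns the (n, d) gradients of the log target density.
    n = particles.shape[0]
    diffs = particles[:, None, :] - particles[None, :, :]       # x_i - x_j, shape (n, n, d)
    sq_dists = (diffs ** 2).sum(-1)
    h = np.median(sq_dists) / np.log(n + 1)                     # median-heuristic bandwidth
    kernel = np.exp(-sq_dists / h)                              # RBF kernel matrix
    repulsion = (2.0 / h) * (kernel[:, :, None] * diffs).sum(axis=1)   # sum_j grad_{x_j} k(x_j, x_i)
    phi = (kernel @ grad_log_p(particles) + repulsion) / n
    return particles + step_size * phi

# Toy target: a standard 2D Gaussian, so grad log p(x) = -x.
rng = np.random.default_rng(6)
x = rng.normal(loc=3.0, size=(100, 2))
for _ in range(500):
    x = svgd_step(x, lambda p: -p)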

Read this paper on arXiv…

A. Leviyev, J. Chen, Y. Wang, et. al.
Wed, 20 Apr 22
49/62

Comments: N/A

Radio Galaxy Zoo: Using semi-supervised learning to leverage large unlabelled data-sets for radio galaxy classification under data-set shift [GA]

http://arxiv.org/abs/2204.08816


In this work we examine the classification accuracy and robustness of a state-of-the-art semi-supervised learning (SSL) algorithm applied to the morphological classification of radio galaxies. We test if SSL with fewer labels can achieve test accuracies comparable to the supervised state-of-the-art and whether this holds when incorporating previously unseen data. We find that for the radio galaxy classification problem considered, SSL provides additional regularisation and outperforms the baseline test accuracy. However, in contrast to model performance metrics reported on computer science benchmarking data-sets, we find that improvement is limited to a narrow range of label volumes, with performance falling off rapidly at low label volumes. Additionally, we show that SSL does not improve model calibration, regardless of whether classification is improved. Moreover, we find that when different underlying catalogues drawn from the same radio survey are used to provide the labelled and unlabelled data-sets required for SSL, a significant drop in classification performance is observed, highlighting the difficulty of applying SSL techniques under data-set shift. We show that a class-imbalanced unlabelled data pool negatively affects performance through prior probability shift, which we suggest may explain this performance drop. We also show that using the Fréchet Distance between labelled and unlabelled data-sets as a measure of data-set shift can provide a prediction of model performance; however, for typical radio galaxy data-sets with labelled sample volumes of O(1000), the sample variance associated with this technique is high and the technique is in general not sufficiently robust to replace a train-test cycle.

Read this paper on arXiv…

I. Slijepcevic, A. Scaife, M. Walmsley, et. al.
Wed, 20 Apr 22
59/62

Comments: Accepted to MNRAS. 14 pages

Estimation of stellar atmospheric parameters from LAMOST DR8 low-resolution spectra with 20$\leq$SNR$<$30 [GA]

http://arxiv.org/abs/2204.06301


The accuracy of estimated stellar atmospheric parameters decreases markedly with decreasing spectral signal-to-noise ratio (SNR), and there is a huge number of observations of this kind, especially with SNR$<$30. It is therefore valuable to improve parameter estimation performance for these spectra, and this work studies the ($T_\texttt{eff}, \log~g$, [Fe/H]) estimation problem for LAMOST DR8 low-resolution spectra with 20$\leq$SNR$<$30. We propose a data-driven method based on machine learning techniques. Firstly, this scheme detects stellar atmospheric parameter-sensitive features in the spectra using the Least Absolute Shrinkage and Selection Operator (LASSO), rejecting ineffective data components and irrelevant data. Secondly, a Multi-layer Perceptron (MLP) is used to estimate the stellar atmospheric parameters from the LASSO features. Finally, the performance of LASSO-MLP is evaluated by computing and analyzing the consistency between its estimates and the reference values from APOGEE (Apache Point Observatory Galactic Evolution Experiment) high-resolution spectra. Experiments show that the Mean Absolute Errors (MAE) of $T_\texttt{eff}, \log~g$, [Fe/H] are reduced from the LASP values (137.6 K, 0.195 dex, 0.091 dex) to the LASSO-MLP values (84.32 K, 0.137 dex, 0.063 dex), which indicates a clear improvement in stellar atmospheric parameter estimation. In addition, this work estimates the stellar atmospheric parameters for 1,162,760 low-resolution spectra with 20$\leq$SNR$<$30 from LAMOST DR8 using LASSO-MLP, and releases the estimation catalog, learned model, experimental code, trained model, training data and test data for scientific exploration and algorithm study.
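
A schematic scikit-learn version of the two-stage idea (a LASSO feature-selection step followed by an MLP regressor); the synthetic "spectra", the single label and the hyperparameters are placeholders for illustration.

import numpy as np
from sklearn.linear_model import Lasso
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(7)
n_spectra, n_pixels = 2000, 500
X = rng.normal(size=(n_spectra, n_pixels))               # stand-in for flux values per pixel
teff = 5000 + 300 * X[:, 10] - 200 * X[:, 123] + rng.normal(scale=50, size=n_spectra)

# Stage 1: LASSO keeps only the parameter-sensitive pixels.
selector = Lasso(alpha=5.0).fit(X, teff)
selected = np.flatnonzero(selector.coef_)

# Stage 2: an MLP regresses the label from the selected features only.
mlp = MLPRegressor(hidden_layer_sizes=(64, 32), max_iter=500, random_state=0)
mlp.fit(X[:, selected], teff)
print("selected pixels:", selected, "train R^2:", round(mlp.score(X[:, selected], teff), 3))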

Read this paper on arXiv…

X. Li, Z. Wang, S. Zeng, et. al.
Thu, 14 Apr 22
35/62

Comments: 16 pages, 6 figures, 4 tables

A Machine Learning and Computer Vision Approach to Geomagnetic Storm Forecasting [CL]

http://arxiv.org/abs/2204.05780


Geomagnetic storms, disturbances of Earth’s magnetosphere caused by masses of charged particles being emitted from the Sun, are an uncontrollable threat to modern technology. Notably, they have the potential to damage satellites and cause instability in power grids on Earth, among other disasters. They result from periods of high solar activity, which are associated with cool areas on the Sun known as sunspots. Forecasting the storms to prevent disasters requires an understanding of how and when they will occur. However, current prediction methods at the National Oceanic and Atmospheric Administration (NOAA) are limited in that they depend on expensive solar wind spacecraft and a global-scale magnetometer sensor network. In this paper, we introduce a novel machine learning and computer vision approach to accurately forecast geomagnetic storms without the need for such costly physical measurements. Our approach extracts features from images of the Sun to establish correlations between sunspots and geomagnetic storm classification and is competitive with NOAA’s predictions. Indeed, our prediction achieves a 76% storm classification accuracy. This paper serves as an existence proof that machine learning and computer vision techniques provide an effective means for augmenting and improving existing geomagnetic storm forecasting methods.

Read this paper on arXiv…

K. Domico, R. Sheatsley, Y. Beugin, et. al.
Wed, 13 Apr 22
27/73

Comments: Presented at ML-Helio 2022

Zero-phase angle asteroid taxonomy classification using unsupervised machine learning algorithms [EPA]

http://arxiv.org/abs/2204.05075


We are in an era of large catalogs and, thus, statistical analysis tools for large data sets, such as machine learning, play a fundamental role. One example of such a survey is the Sloan Moving Object Catalog (MOC), which lists the astrometric and photometric information of all moving objects captured by the Sloan field of view. One great advantage of this telescope is its set of five filters, allowing for taxonomic analysis of asteroids by studying their colors. However, until now, the color variation produced by the change of phase angle of the object has not been taken into account. In this paper, we address this issue by using absolute magnitudes for classification. We aim to produce a new taxonomic classification of asteroids based on their magnitudes that is unaffected by variations caused by the change in phase angle. We selected 9481 asteroids with absolute magnitudes of Hg, Hi and Hz, computed from the Sloan Moving Objects Catalog using the HG12 system, and calculated the absolute colors from them. To perform the taxonomic classification, we applied an unsupervised machine learning algorithm known as fuzzy C-means. This is a useful soft clustering tool for working with data sets where the different groups are not completely separated and there are regions of overlap between them. We have chosen to work with the four main taxonomic complexes, C, S, X, and V, as they comprise most of the known spectral characteristics. We classified a total of 6329 asteroids with more than 60% probability of belonging to the assigned taxonomic class, with 162 of these objects having been characterized by an ambiguous classification in the past. By analyzing the sample obtained in the semimajor axis versus inclination plane, we identified 15 new V-type asteroid candidates outside the Vesta family region.
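
A compact numpy implementation of the standard fuzzy C-means iteration (membership update followed by centre update), shown on toy colour data; the number of clusters and the fuzzifier m = 2 mirror common defaults rather than the exact settings of the paper.

import numpy as np

def fuzzy_c_means(X, n_clusters=4, m=2.0, n_iter=100, seed=0):
    rng = np.random.default_rng(seed)
    u = rng.random((len(X), n_clusters))
    u /= u.sum(axis=1, keepdims=True)                    # random initial memberships
    for _ in range(n_iter):
        w = u ** m
        centres = (w.T @ X) / w.sum(axis=0)[:, None]     # membership-weighted cluster centres
        dist = np.linalg.norm(X[:, None, :] - centres[None, :, :], axis=2) + 1e-12
        inv = dist ** (-2.0 / (m - 1.0))
        u = inv / inv.sum(axis=1, keepdims=True)         # updated soft membership matrix
    return centres, u

rng = np.random.default_rng(8)
colours = np.vstack([rng.normal(c, 0.1, size=(200, 2)) for c in ([0, 0], [1, 0], [0, 1], [1, 1])])
centres, memberships = fuzzy_c_means(colours)
confident = memberships.max(axis=1) > 0.6                # objects assigned with >60% membership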

Read this paper on arXiv…

M. Colazo, A. Alvarez-Candal and R. Duffard
Tue, 12 Apr 22
16/87

Comments: N/A

Half-sibling regression meets exoplanet imaging: PSF modeling and subtraction using a flexible, domain knowledge-driven, causal framework [IMA]

http://arxiv.org/abs/2204.03439


High-contrast imaging of exoplanets hinges on powerful post-processing methods to denoise the data and separate the signal of a companion from its host star, which is typically orders of magnitude brighter. Existing post-processing algorithms do not use all prior domain knowledge that is available about the problem. We propose a new method that builds on our understanding of the systematic noise and the causal structure of the data-generating process. Our algorithm is based on a modified version of half-sibling regression (HSR), a flexible denoising framework that combines ideas from the fields of machine learning and causality. We adapt the method to address the specific requirements of high-contrast exoplanet imaging data obtained in pupil tracking mode. The key idea is to estimate the systematic noise in a pixel by regressing the time series of this pixel onto a set of causally independent, signal-free predictor pixels. We use regularized linear models in this work; however, other (non-linear) models are also possible. In a second step, we demonstrate how the HSR framework allows us to incorporate observing conditions such as wind speed or air temperature as additional predictors. When we apply our method to four data sets from the VLT/NACO instrument, our algorithm provides a better false-positive fraction than PCA-based PSF subtraction, a popular baseline method in the field. Additionally, we find that the HSR-based method provides direct and accurate estimates for the contrast of the exoplanets without the need to insert artificial companions for calibration in the data sets. Finally, we present first evidence that using the observing conditions as additional predictors can improve the results. Our HSR-based method provides an alternative, flexible and promising approach to the challenge of modeling and subtracting the stellar PSF and systematic noise in exoplanet imaging data.
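
A toy numpy/scikit-learn sketch of the basic half-sibling regression step: predict a target pixel's time series from signal-free predictor pixels with a regularised linear model and subtract the prediction to remove the shared systematics. The "pixels far from the target" predictor selection is a simplification of the causal selection described in the paper.

import numpy as np
from sklearn.linear_model import Ridge

rng = np.random.default_rng(9)
n_frames, n_pixels = 500, 200
systematics = rng.normal(size=(n_frames, 3)) @ rng.normal(size=(3, n_pixels))   # shared low-rank noise
cube = systematics + 0.1 * rng.normal(size=(n_frames, n_pixels))

target = 0                                        # pixel whose time series we want to clean
predictors = np.arange(100, 200)                  # pixels assumed to be free of companion signal

model = Ridge(alpha=1.0).fit(cube[:, predictors], cube[:, target])
noise_estimate = model.predict(cube[:, predictors])
residual = cube[:, target] - noise_estimate       # systematics removed; a planet signal would remain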

Read this paper on arXiv…

T. Gebhard, M. Bonse, S. Quanz, et. al.
Fri, 8 Apr 22
5/65

Comments: Accepted for publication in Astronomy & Astrophysics

Identifying Exoplanets with Machine Learning Methods: A Preliminary Study [EPA]

http://arxiv.org/abs/2204.00721


The discovery of habitable exoplanets has long been a heated topic in astronomy. Traditional methods for exoplanet identification include the wobble method, direct imaging, gravitational microlensing, etc., which not only require a considerable investment of manpower, time, and money, but are also limited by the performance of astronomical telescopes. In this study, we proposed the idea of using machine learning methods to identify exoplanets. We used the Kepler dataset collected by NASA from the Kepler Space Observatory to conduct supervised learning, which predicts the existence of exoplanet candidates as a three-category classification task, using decision trees, random forest, naïve Bayes, and a neural network; we used another NASA dataset consisting of confirmed exoplanet data to conduct unsupervised learning, which divides the confirmed exoplanets into different clusters, using k-means clustering. As a result, our models achieved accuracies of 99.06%, 92.11%, 88.50%, and 99.79%, respectively, in the supervised learning task and successfully obtained reasonable clusters in the unsupervised learning task.

Read this paper on arXiv…

Y. Jin, L. Yang and C. Chiang
Tue, 5 Apr 22
51/83

Comments: 12 pages with 9 figures and 2 tables

Active Learning for Computationally Efficient Distribution of Binary Evolution Simulations [SSA]

http://arxiv.org/abs/2203.16683


Binary stars undergo a variety of interactions and evolutionary phases, critical for predicting and explaining observed properties. Binary population synthesis with full stellar-structure and evolution simulations is computationally expensive, requiring a large number of mass-transfer sequences. The recently developed binary population synthesis code POSYDON incorporates grids of MESA binary star simulations which are then interpolated to model large-scale populations of massive binaries. The traditional method of computing a high-density rectilinear grid of simulations is not scalable to higher-dimensional grids that account for a range of metallicities, rotation, and eccentricity. We present a new active learning algorithm, psy-cris, which uses machine learning in the data-gathering process to adaptively and iteratively select targeted simulations to run, resulting in a custom, high-performance training set. We test psy-cris on a toy problem and find the resulting training sets require fewer simulations for accurate classification and regression than either regular or randomly sampled grids. We further apply psy-cris to the target problem of building a dynamic grid of MESA simulations, and we demonstrate that, even without fine tuning, a simulation set of only $\sim 1/4$ the size of a rectilinear grid is sufficient to achieve the same classification accuracy. We anticipate further gains when algorithmic parameters are optimized for the targeted application. We find that optimizing for classification only may lead to performance losses in regression, and vice versa. Lowering the computational cost of producing grids will enable future versions of POSYDON to cover more input parameters while preserving interpolation accuracies.
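
psy-cris uses a more elaborate acquisition scheme, but the basic active-learning loop that replaces a rectilinear grid can be sketched as uncertainty sampling with any probabilistic classifier; the toy oracle below stands in for running an expensive binary-evolution simulation.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

def oracle(points):
    # Stand-in for an expensive simulation that returns an outcome class per initial condition.
    return (points[:, 0] ** 2 + points[:, 1] ** 2 > 1.0).astype(int)

rng = np.random.default_rng(10)
pool = rng.uniform(-2, 2, size=(5000, 2))                  # candidate initial conditions
labelled = list(rng.choice(len(pool), size=20, replace=False))

for _ in range(10):                                        # ten acquisition rounds
    clf = RandomForestClassifier(n_estimators=100, random_state=0)
    clf.fit(pool[labelled], oracle(pool[labelled]))
    proba = clf.predict_proba(pool)
    uncertainty = 1.0 - proba.max(axis=1)                  # least-confident acquisition score
    uncertainty[labelled] = -1.0                           # never re-query labelled points
    labelled.append(int(uncertainty.argmax()))

print("simulations run:", len(labelled))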

Read this paper on arXiv…

K. Rocha, J. Andrews, C. Berry, et. al.
Fri, 1 Apr 22
16/85

Comments: 20 pages (16 main text), 10 figures, submitted to ApJ

Predicting Winners of the Reality TV Dating Show $\textit{The Bachelor}$ Using Machine Learning Algorithms [CL]

http://arxiv.org/abs/2203.16648


$\textit{The Bachelor}$ is a reality TV dating show in which a single bachelor selects his wife from a pool of approximately 30 female contestants over eight weeks of filming (American Broadcasting Company 2002). We collected the following data on all 422 contestants that participated in seasons 11 through 25: their Age, Hometown, Career, Race, Week they got their first 1-on-1 date, whether they got the first impression rose, and what “place” they ended up getting. We then trained three machine learning models to predict the ideal characteristics of a successful contestant on $\textit{The Bachelor}$. The three algorithms that we tested were: random forest classification, neural networks, and linear regression. We found consistency across all three models, although the neural network performed the best overall. Our models found that a woman has the highest probability of progressing far on $\textit{The Bachelor}$ if she is: 26 years old, white, from the Northwest, works as a dancer, received a 1-on-1 in week 6, and did not receive the First Impression Rose. Our methodology is broadly applicable to all romantic reality television, and our results will inform future $\textit{The Bachelor}$ production and contestant strategies. While our models were relatively successful, we still encountered high misclassification rates. This may be because: (1) Our training dataset had fewer than 400 points or (2) Our models were too simple to parameterize the complex romantic connections contestants forge over the course of a season.

Read this paper on arXiv…

A. Lee, G. Chesmore, K. Rocha, et. al.
Fri, 1 Apr 22
60/85

Comments: 6 Pages, 5 Figures. Submitted to Acta Prima Aprila. Code used in this work available at this http URL

Neural representation of a time optimal, constant acceleration rendezvous [EPA]

http://arxiv.org/abs/2203.15490


We train neural models to represent both the optimal policy (i.e. the optimal thrust direction) and the value function (i.e. the time of flight) for a time optimal, constant acceleration low-thrust rendezvous. In both cases we develop and make use of the data augmentation technique we call backward generation of optimal examples. We are thus able to produce and work with large datasets and to fully exploit the benefits of employing a deep learning framework. We achieve, in all cases, accuracies resulting in successful rendezvous (simulated following the learned policy) and time of flight predictions (using the learned value function). We find that residuals as small as a few m/s, thus well within the capability of a spacecraft navigation $\Delta V$ budget, are achievable for the velocity at rendezvous. We also find that, on average, the absolute error in predicting the optimal time of flight to rendezvous from any orbit in the asteroid belt to an Earth-like orbit is small (less than 4\%) and thus also of interest for practical uses, for example, during preliminary mission design phases.

Read this paper on arXiv…

D. Izzo and S. Origer
Wed, 30 Mar 22
59/77

Comments: N/A

Using Multiple Instance Learning for Explainable Solar Flare Prediction [SSA]

http://arxiv.org/abs/2203.13896


In this work we leverage a weakly-labeled dataset of spectral data from NASA’s IRIS satellite for the prediction of solar flares using the Multiple Instance Learning (MIL) paradigm. While standard supervised learning models expect a label for every instance, MIL relaxes this and only considers bags of instances to be labeled. This is ideally suited for flare prediction with IRIS data, which consist of time series of bags of UV spectra measured along the instrument slit. In particular, we consider the readout window around the Mg II h&k lines that encodes information on the dynamics of the solar chromosphere. Our MIL models are not only able to predict whether flares occur within the next $\sim$25 minutes with accuracies of around 90%, but are also able to explain which spectral profiles were particularly important for their bag-level prediction. This information can be used to highlight regions of interest in ongoing IRIS observations in real-time and to identify candidates for typical flare precursor spectral profiles. We use k-means clustering to extract groups of spectral profiles that appear relevant for flare prediction. The recovered groups show high intensity, triplet red wing emission and single-peaked h and k lines, as found by previous works. They seem to be related to small-scale explosive events that have been reported to occur tens of minutes before a flare.
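
A minimal PyTorch sketch of attention-based multiple instance learning for bags of spectra: each spectrum gets an embedding and an attention weight, the bag representation is their weighted sum, and the attention weights indicate which spectra drove the bag-level prediction. Dimensions are illustrative, not those of the IRIS data.

import torch
import torch.nn as nn

class AttentionMIL(nn.Module):
    def __init__(self, n_wavelengths=240, embed_dim=64):
        super().__init__()
        self.embed = nn.Sequential(nn.Linear(n_wavelengths, embed_dim), nn.ReLU())
        self.attention = nn.Sequential(nn.Linear(embed_dim, 32), nn.Tanh(), nn.Linear(32, 1))
        self.classifier = nn.Linear(embed_dim, 1)

    def forward(self, bag):                               # bag: (n_instances, n_wavelengths)
        h = self.embed(bag)                               # per-spectrum embeddings
        a = torch.softmax(self.attention(h), dim=0)       # (n_instances, 1) attention weights
        z = (a * h).sum(dim=0)                            # bag-level representation
        return torch.sigmoid(self.classifier(z)), a.squeeze(-1)

model = AttentionMIL()
bag_of_spectra = torch.randn(50, 240)                     # 50 spectra along the slit at one time step
flare_probability, weights = model(bag_of_spectra)
# 'weights' highlights the spectral profiles most responsible for the bag prediction.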

Read this paper on arXiv…

C. Huwyler and M. Melchior
Tue, 29 Mar 22
35/73

Comments: N/A

Predicting Solar Energetic Particles Using SDO/HMI Vector Magnetic Data Products and a Bidirectional LSTM Network [SSA]

http://arxiv.org/abs/2203.14393


Solar energetic particles (SEPs) are an essential source of space radiation, which are hazards for humans in space, spacecraft, and technology in general. In this paper we propose a deep learning method, specifically a bidirectional long short-term memory (biLSTM) network, to predict if an active region (AR) would produce an SEP event given that (i) the AR will produce an M- or X-class flare and a coronal mass ejection (CME) associated with the flare, or (ii) the AR will produce an M- or X-class flare regardless of whether or not the flare is associated with a CME. The data samples used in this study are collected from the Geostationary Operational Environmental Satellite’s X-ray flare catalogs provided by the National Centers for Environmental Information. We select M- and X-class flares with identified ARs in the catalogs for the period between 2010 and 2021, and find the associations of flares, CMEs and SEPs in the Space Weather Database of Notifications, Knowledge, Information during the same period. Each data sample contains physical parameters collected from the Helioseismic and Magnetic Imager on board the Solar Dynamics Observatory. Experimental results based on different performance metrics demonstrate that the proposed biLSTM network is better than related machine learning algorithms for the two SEP prediction tasks studied here. We also discuss extensions of our approach for probabilistic forecasting and calibration with empirical evaluation.

Read this paper on arXiv…

Y. Abduallah, V. Jordanova, H. Liu, et. al.
Tue, 29 Mar 22
44/73

Comments: 22 pages, 6 figures, 8 tables

Applications of physics informed neural operators [CL]

http://arxiv.org/abs/2203.12634


We present an end-to-end framework to learn partial differential equations that brings together initial data production, selection of boundary conditions, and the use of physics-informed neural operators to solve partial differential equations that are ubiquitous in the study and modeling of physics phenomena. We first demonstrate that our methods reproduce the accuracy and performance of other neural operators published elsewhere in the literature to learn the 1D wave equation and the 1D Burgers equation. Thereafter, we apply our physics-informed neural operators to learn new types of equations, including the 2D Burgers equation in the scalar, inviscid and vector types. Finally, we show that our approach is also applicable to learn the physics of the 2D linear and nonlinear shallow water equations, which involve three coupled partial differential equations. We release our artificial intelligence surrogates and scientific software to produce initial data and boundary conditions to study a broad range of physically motivated scenarios. We provide the source code, an interactive website to visualize the predictions of our physics informed neural operators, and a tutorial for their use at the Data and Learning Hub for Science.
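
As a sketch of the physics-informed ingredient for one of the simplest cases mentioned above (the 1D Burgers equation), the loss below combines a data term with a finite-difference PDE residual evaluated on the predicted space-time grid. The model, viscosity and discretization are placeholders, and a full physics-informed neural operator would additionally parametrize the solution operator itself, e.g. with Fourier layers.

    import torch

    nu = 0.01                          # viscosity; value is illustrative only

    def burgers_residual(u, dt, dx):
        """Finite-difference residual of u_t + u u_x - nu u_xx on a (time, space) grid."""
        u_t  = (u[1:, 1:-1] - u[:-1, 1:-1]) / dt
        u_x  = (u[:-1, 2:] - u[:-1, :-2]) / (2 * dx)
        u_xx = (u[:-1, 2:] - 2 * u[:-1, 1:-1] + u[:-1, :-2]) / dx ** 2
        return u_t + u[:-1, 1:-1] * u_x - nu * u_xx

    def physics_informed_loss(model, u0, u_true, dt, dx):
        # model maps an initial condition to the full space-time solution grid.
        u_pred = model(u0)
        data_loss = ((u_pred - u_true) ** 2).mean()
        pde_loss = (burgers_residual(u_pred, dt, dx) ** 2).mean()
        return data_loss + pde_loss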

Read this paper on arXiv…

S. Rosofsky and E. Huerta
Fri, 25 Mar 22
11/46

Comments: 15 pages, 10 figures

AI Poincaré 2.0: Machine Learning Conservation Laws from Differential Equations [CL]

http://arxiv.org/abs/2203.12610


We present a machine learning algorithm that discovers conservation laws from differential equations, both numerically (parametrized as neural networks) and symbolically, ensuring their functional independence (a non-linear generalization of linear independence). Our independence module can be viewed as a nonlinear generalization of singular value decomposition. Our method can readily handle inductive biases for conservation laws. We validate it with examples including the 3-body problem, the KdV equation and the nonlinear Schrödinger equation.
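
One way to phrase the numerical part of such a search, as a hedged sketch only: parametrize a candidate conserved quantity $H_\theta(x)$ by a neural network and require its time derivative along the flow, $\nabla H_\theta \cdot \dot{x}$, to vanish at sampled trajectory points, normalizing the gradient to exclude the trivial constant solution. The details below (input dimension, normalization) are assumptions, not the authors' exact algorithm.

    import torch
    import torch.nn as nn

    # Candidate conserved quantity H_theta(x), parametrized by a neural network.
    H = nn.Sequential(nn.Linear(4, 64), nn.Tanh(), nn.Linear(64, 64), nn.Tanh(),
                      nn.Linear(64, 1))

    def conservation_loss(x, xdot):
        """Penalize dH/dt = grad H . xdot at sampled phase-space points, with the
        gradient normalized so the trivial constant solution is excluded."""
        x = x.requires_grad_(True)
        grad_H = torch.autograd.grad(H(x).sum(), x, create_graph=True)[0]
        grad_H = grad_H / (grad_H.norm(dim=-1, keepdim=True) + 1e-8)
        return (torch.einsum('bi,bi->b', grad_H, xdot) ** 2).mean()

    # x, xdot: points and their time derivatives supplied by the differential
    # equations, e.g. (q, p) and (dq/dt, dp/dt); the input dimension 4 is arbitrary.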

Read this paper on arXiv…

Z. Liu, V. Madhavan and M. Tegmark
Thu, 24 Mar 22
21/56

Comments: 17 pages, 10 figures

3D Adapted Random Forest Vision (3DARFV) for Untangling Heterogeneous-Fabric Exceeding Deep Learning Semantic Segmentation Efficiency at the Utmost Accuracy [CL]

http://arxiv.org/abs/2203.12469


Planetary exploration depends heavily on 3D image data to characterize the static and dynamic properties of the rock and environment. Analyzing 3D images requires many computations, leading to lengthy processing times and large energy consumption. High-Performance Computing (HPC) provides apparent efficiency at the expense of energy consumption. However, for remote exploration, surveillance and robotized sensing require faster data analysis with the utmost accuracy to make real-time decisions. In such environments, access to HPC and energy is limited. Reducing the number of computations to the necessary minimum while maintaining the desired accuracy therefore leads to higher efficiency. This paper demonstrates the semantic segmentation capability of a probabilistic decision tree algorithm, 3D Adapted Random Forest Vision (3DARFV), exceeding deep learning algorithm efficiency at the utmost accuracy.
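
For context, a generic per-voxel random forest segmentation in scikit-learn looks roughly like the sketch below; the feature set, array sizes and hyperparameters are placeholders, and the authors' 3DARFV adapts the approach specifically to 3D imagery rather than using this off-the-shelf form.

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    # features: one row of descriptors per voxel (intensity, local gradients,
    # neighbourhood statistics, ...); labels: the voxel's fabric/phase class.
    # Both arrays are placeholders for whatever the 3D imaging pipeline produces.
    features = np.random.rand(100_000, 12)
    labels = np.random.randint(0, 4, size=100_000)

    clf = RandomForestClassifier(n_estimators=100, max_depth=20, n_jobs=-1)
    clf.fit(features, labels)
    voxel_classes = clf.predict(features)      # per-voxel semantic segmentation labels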

Read this paper on arXiv…

O. Alfarisi, Z. Aung, Q. Huang, et. al.
Thu, 24 Mar 22
37/56

Comments: N/A

DeepLSS: breaking parameter degeneracies in large scale structure with deep learning analysis of combined probes [CEA]

http://arxiv.org/abs/2203.09616


In classical cosmological analysis of large scale structure surveys with 2-pt functions, the parameter measurement precision is limited by several key degeneracies within the cosmology and astrophysics sectors. For cosmic shear, clustering amplitude $\sigma_8$ and matter density $\Omega_m$ roughly follow the $S_8=\sigma_8(\Omega_m/0.3)^{0.5}$ relation. In turn, $S_8$ is highly correlated with the intrinsic galaxy alignment amplitude $A_{\rm{IA}}$. For galaxy clustering, the bias $b_g$ is degenerate with both $\sigma_8$ and $\Omega_m$, as well as the stochasticity $r_g$. Moreover, the redshift evolution of IA and bias can cause further parameter confusion. A tomographic 2-pt probe combination can partially lift these degeneracies. In this work we demonstrate that a deep learning analysis of combined probes of weak gravitational lensing and galaxy clustering, which we call DeepLSS, can effectively break these degeneracies and yield significantly more precise constraints on $\sigma_8$, $\Omega_m$, $A_{\rm{IA}}$, $b_g$, $r_g$, and IA redshift evolution parameter $\eta_{\rm{IA}}$. The most significant gains are in the IA sector: the precision of $A_{\rm{IA}}$ is increased by approximately 8x and is almost perfectly decorrelated from $S_8$. Galaxy bias $b_g$ is improved by 1.5x, stochasticity $r_g$ by 3x, and the redshift evolution $\eta_{\rm{IA}}$ and $\eta_b$ by 1.6x. Breaking these degeneracies leads to a significant gain in constraining power for $\sigma_8$ and $\Omega_m$, with the figure of merit improved by 15x. We give an intuitive explanation for the origin of this information gain using sensitivity maps. These results indicate that the fully numerical, map-based forward modeling approach to cosmological inference with machine learning may play an important role in upcoming LSS surveys. We discuss perspectives and challenges in its practical deployment for a full survey analysis.

Read this paper on arXiv…

T. Kacprzak and J. Fluri
Mon, 21 Mar 22
26/60

Comments: 18 pages, 10 figures, 2 tables, submitted to Physical Review

Identifying Transients in the Dark Energy Survey using Convolutional Neural Networks [IMA]

http://arxiv.org/abs/2203.09908


The ability to discover new transients via image differencing without direct human intervention is an important task in observational astronomy. For this kind of image classification problem, machine learning techniques such as Convolutional Neural Networks (CNNs) have shown remarkable success. In this work, we present the results of automated transient identification on images with CNNs for an extant dataset from the Dark Energy Survey Supernova program (DES-SN), whose main focus was on using Type Ia supernovae for cosmology. By performing an architecture search of CNNs, we identify networks that efficiently select non-artifacts (e.g. supernovae, variable stars, AGN, etc.) from artifacts (image defects, mis-subtractions, etc.), achieving the efficiency of previous work performed with random forests, without the need to expend any effort in feature identification. The CNNs also help us identify a subset of mislabeled images. After relabeling the images in this subset, the resulting classification with CNNs is significantly better than previous results.

Read this paper on arXiv…

V. Ayyar, R. Jr., A. Awbrey, et. al.
Mon, 21 Mar 22
43/60

Comments: 14 pages, 13 figures

Deep Residual Error and Bag-of-Tricks Learning for Gravitational Wave Surrogate Modeling [IMA]

http://arxiv.org/abs/2203.08434


Deep learning methods have been employed in gravitational-wave astronomy to accelerate the construction of surrogate waveforms for the inspiral of spin-aligned black hole binaries, among other applications. We demonstrate that the residual error of an artificial neural network that models the coefficients of the surrogate waveform expansion (especially those of the phase of the waveform) has sufficient structure to be learnable by a second network. Adding this second network, we were able to reduce the maximum mismatch for waveforms in a validation set by more than an order of magnitude. We also explored several other ideas for improving the accuracy of the surrogate model, such as the exploitation of similarities between waveforms, the augmentation of the training set, the dissection of the input space, the use of dedicated networks per output coefficient, and output augmentation. In several cases, small improvements can be observed, but the most significant improvement still comes from the addition of a second network that models the residual error. Since the residual error for more general surrogate waveform models (when e.g. eccentricity is included) may also have a specific structure, one can expect our method to be applicable to cases where the gain in accuracy could lead to significant gains in computational time.
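
The core trick, learning the residual error with a second network, can be sketched as follows in PyTorch; the input/output dimensions (e.g. mass ratio and two aligned spins mapping to a vector of expansion coefficients) and layer sizes are assumptions.

    import torch
    import torch.nn as nn

    primary = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 20))
    residual_net = nn.Sequential(nn.Linear(3, 128), nn.ReLU(), nn.Linear(128, 20))

    # Stage 1: train `primary` on (source parameters -> expansion coefficients).
    # Stage 2: freeze it and train a second network on what it gets wrong.
    def residual_step(params, coeffs, opt):
        with torch.no_grad():
            base = primary(params)
        loss = ((residual_net(params) - (coeffs - base)) ** 2).mean()
        opt.zero_grad(); loss.backward(); opt.step()
        return loss.item()

    def predict(params):
        # Final surrogate: first-network prediction plus the learned correction.
        return primary(params) + residual_net(params)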

Read this paper on arXiv…

S. Fragkouli, P. Nousi, N. Passalis, et. al.
Thu, 17 Mar 22
50/66

Comments: N/A

A comparative study of non-deep learning, deep learning, and ensemble learning methods for sunspot number prediction [SSA]

http://arxiv.org/abs/2203.05757


Solar activity has significant impacts on human activities and health. One of the most commonly used measures of solar activity is the sunspot number. This paper compares three important non-deep learning models, four popular deep learning models, and their five ensemble models in forecasting sunspot numbers. Our proposed ensemble model XGBoost-DL, which uses XGBoost as a two-level nonlinear ensemble method to combine the deep learning models, achieves the best forecasting performance among all considered models as well as NASA's forecast. Our XGBoost-DL forecasts a peak sunspot number of 133.47 in May 2025 for Solar Cycle 25 and 164.62 in November 2035 for Solar Cycle 26, similar to but later than NASA's predictions of 137.7 in October 2024 and 161.2 in December 2034.
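
A two-level stacking ensemble of the kind described can be sketched with the xgboost package roughly as below; the number of base models, array shapes and hyperparameters are placeholders rather than the paper's configuration.

    import numpy as np
    from xgboost import XGBRegressor

    # base_forecasts: out-of-sample sunspot-number predictions from the individual
    # models (deep and non-deep), stacked column-wise; y: the observed values.
    base_forecasts = np.random.rand(1200, 7)
    y = np.random.rand(1200)

    stacker = XGBRegressor(n_estimators=300, max_depth=3, learning_rate=0.05)
    stacker.fit(base_forecasts, y)              # second-level nonlinear ensemble
    ensemble_forecast = stacker.predict(base_forecasts[-12:])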

Read this paper on arXiv…

Y. Dang, Z. Chen, H. Li, et. al.
Mon, 14 Mar 22
11/67

Comments: N/A

Symmetry Group Equivariant Architectures for Physics [CL]

http://arxiv.org/abs/2203.06153


Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.

Read this paper on arXiv…

A. Bogatskiy, S. Ganguly, T. Kipf, et. al.
Mon, 14 Mar 22
31/67

Comments: Contribution to Snowmass 2021

Detecting and Diagnosing Terrestrial Gravitational-Wave Mimics Through Feature Learning [IMA]

http://arxiv.org/abs/2203.05086


As engineered systems grow in complexity, there is an increasing need for automatic methods that can detect, diagnose, and even correct transient anomalies that inevitably arise and can be difficult or impossible to diagnose and fix manually. Among the most sensitive and complex systems of our civilization are the detectors that search for incredibly small variations in distance caused by gravitational waves — phenomena originally predicted by Albert Einstein to emerge and propagate through the universe as the result of collisions between black holes and other massive objects in deep space. The extreme complexity and precision of such detectors causes them to be subject to transient noise issues that can significantly limit their sensitivity and effectiveness.
In this work, we present a demonstration of a method that can detect and characterize emergent transient anomalies of such massively complex systems. We illustrate the performance, precision, and adaptability of the automated solution via one of the prevalent issues limiting gravitational-wave discoveries: noise artifacts of terrestrial origin that contaminate gravitational wave observatories’ highly sensitive measurements and can obscure or even mimic the faint astrophysical signals for which they are listening. Specifically, we demonstrate how a highly interpretable convolutional classifier can automatically learn to detect transient anomalies from auxiliary detector data without needing to observe the anomalies themselves. We also illustrate several other useful features of the model, including how it performs automatic variable selection to reduce tens of thousands of auxiliary data channels to only a few relevant ones; how it identifies behavioral signatures predictive of anomalies in those channels; and how it can be used to investigate individual anomalies and the channels associated with them.

Read this paper on arXiv…

R. Colgan, Z. Márka, J. Yan, et. al.
Fri, 11 Mar 22
55/59

Comments: N/A

Follow the Water: Finding Water, Snow and Clouds on Terrestrial Exoplanets with Photometry and Machine Learning [EPA]

http://arxiv.org/abs/2203.04201


All life on Earth needs water. NASA's quest to follow the water links water to the search for life in the cosmos. Telescopes like JWST and mission concepts like HabEx, LUVOIR and Origins are designed to characterise rocky exoplanets spectroscopically. However, spectroscopy remains time-intensive and, therefore, initial characterisation is critical for the prioritisation of targets.
Here, we study machine learning as a tool to assess the existence of water in three forms (seawater, water clouds and snow) from broadband-filter reflected photometric flux of Earth-like exoplanets, based on 53,130 spectra of cold, Earth-like planets with 6 major surfaces. XGBoost, a well-known machine learning algorithm, achieves over 90\% balanced accuracy in detecting the existence of snow or clouds for S/N$\gtrsim 20$, and 70\% for liquid seawater for S/N $\gtrsim 30$. Finally, we perform a mock Bayesian analysis with Markov-chain Monte Carlo, using five identified filters, to derive exact surface compositions and test retrieval feasibility.
The results show that the use of machine learning to identify water on the surface of exoplanets from broadband-filter photometry provides a promising initial characterisation tool of water in different forms. Planned small and large telescope missions could use this to aid their prioritisation of targets for time-intense follow-up observations.
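
A rough sketch of the photometric classification step with XGBoost and the balanced-accuracy metric is given below; the five-filter feature matrix and labels are placeholders, and the actual pipeline (noise levels, surface mixes, per-species classifiers) is more involved.

    import numpy as np
    from sklearn.metrics import balanced_accuracy_score
    from sklearn.model_selection import train_test_split
    from xgboost import XGBClassifier

    # X: broadband-filter reflected fluxes per simulated spectrum (placeholder);
    # y: 1 if the surface mix contains the target species (e.g. snow), else 0.
    X = np.random.rand(53_130, 5)
    y = np.random.randint(0, 2, size=53_130)

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.2, random_state=0)
    clf = XGBClassifier(n_estimators=400, max_depth=6, eval_metric='logloss')
    clf.fit(X_tr, y_tr)
    print(balanced_accuracy_score(y_te, clf.predict(X_te)))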

Read this paper on arXiv…

D. Pham and L. Kaltenegger
Wed, 9 Mar 22
25/68

Comments: 6 pages, 5 figures. Accepted for publications in MNRAS Letters

Successful Recovery of an Observed Meteorite Fall Using Drones and Machine Learning [EPA]

http://arxiv.org/abs/2203.01466


We report the first-time recovery of a fresh meteorite fall using a drone and a machine learning algorithm. A fireball on 1 April 2021 was observed over Western Australia by the Desert Fireball Network, for which a fall area was calculated for the predicted surviving mass. A search team arrived on site and surveyed a 5.1 km$^2$ area over a 4-day period. A convolutional neural network, trained on previously recovered meteorites with fusion crusts, processed the images on our field computer after each flight. Meteorite candidates identified by the algorithm were sorted by team members using two user interfaces to eliminate false positives. Surviving candidates were revisited with a smaller drone and imaged in higher resolution, before being eliminated or finally being visited in person. The 70 g meteorite was recovered within 50 m of the calculated fall line, demonstrating the effectiveness of this methodology, which will facilitate the efficient collection of many more observed meteorite falls.

Read this paper on arXiv…

S. Anderson, M. Towner, J. Fairweather, et. al.
Fri, 4 Mar 22
45/63

Comments: 4 Figures, 1 Table, 10 pages

Convolutional neural networks as an alternative to Bayesian retrievals [EPA]

http://arxiv.org/abs/2203.01236


Exoplanet observations are currently analysed with Bayesian retrieval techniques. Due to the computational load of the models used, a compromise is needed between model complexity and computing time. Analysis of data from future facilities will require more complex models, which will increase the computational load of retrievals, prompting the search for a faster approach to interpreting exoplanet observations. Our goal is to compare machine learning retrievals of exoplanet transmission spectra with nested sampling, and to understand whether machine learning can be as reliable as Bayesian retrievals for a statistically significant sample of spectra while being orders of magnitude faster. We generate grids of synthetic transmission spectra and their corresponding planetary and atmospheric parameters, one using free chemistry models, and the other using equilibrium chemistry models. Each grid is subsequently rebinned to simulate both HST/WFC3 and JWST/NIRSpec observations, yielding four datasets in total. Convolutional neural networks (CNNs) are trained with each of the datasets. We perform retrievals on 1,000 simulated observations for each combination of model type and instrument with nested sampling and machine learning. We also use both methods to perform retrievals on real WFC3 transmission spectra. Finally, we test how robust machine learning and nested sampling are against incorrect assumptions in our models. CNNs reach a lower coefficient of determination between predicted and true values of the parameters. Nested sampling underestimates the uncertainty in ~8% of retrievals, whereas CNNs estimate them correctly. For real WFC3 observations, nested sampling and machine learning agree within $2\sigma$ for ~86% of spectra. When doing retrievals with incorrect assumptions, nested sampling underestimates the uncertainty in ~12% to ~41% of cases, whereas this is always below ~10% for the CNN.

Read this paper on arXiv…

F. Martinez, M. Min, I. Kamp, et. al.
Thu, 3 Mar 22
55/55

Comments: Accepted for publication in A&A

Architectural Optimization and Feature Learning for High-Dimensional Time Series Datasets [CL]

http://arxiv.org/abs/2202.13486


As our ability to sense increases, we are experiencing a transition from data-poor problems, in which the central issue is a lack of relevant data, to data-rich problems, in which the central issue is to identify a few relevant features in a sea of observations. Motivated by applications in gravitational-wave astrophysics, we study the problem of predicting the presence of transient noise artifacts in a gravitational wave detector from a rich collection of measurements from the detector and its environment.
We argue that feature learning, in which relevant features are optimized from data, is critical to achieving high accuracy. We introduce models that reduce the error rate by over 60\% compared to the previous state of the art, which used fixed, hand-crafted features. Feature learning is useful not only because it improves performance on prediction tasks; the results provide valuable information about patterns associated with phenomena of interest that would otherwise be undiscoverable. In our application, features found to be associated with transient noise provide diagnostic information about its origin and suggest mitigation strategies.
Learning in high-dimensional settings is challenging. Through experiments with a variety of architectures, we identify two key factors in successful models: sparsity, for selecting relevant variables within the high-dimensional observations; and depth, which confers flexibility for handling complex interactions and robustness with respect to temporal variations. We illustrate their significance through systematic experiments on real detector data. Our results provide experimental corroboration of common assumptions in the machine-learning community and have direct applicability to improving our ability to sense gravitational waves, as well as to many other problem settings with similarly high-dimensional, noisy, or partly irrelevant data.
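
The two ingredients highlighted above, sparsity for variable selection and depth for flexibility, can be illustrated with a simple PyTorch sketch in which an L1 penalty on the input layer's weights prunes irrelevant channels; the channel count, depth and penalty form are assumptions, not the paper's architecture.

    import torch
    import torch.nn as nn

    class SparseDeepNet(nn.Module):
        """Deep classifier whose first layer is encouraged to use few input channels."""
        def __init__(self, n_channels=40_000, hidden=256, depth=4):
            super().__init__()
            self.input_layer = nn.Linear(n_channels, hidden)
            blocks = []
            for _ in range(depth):
                blocks += [nn.Linear(hidden, hidden), nn.ReLU()]
            self.body = nn.Sequential(*blocks, nn.Linear(hidden, 1))

        def forward(self, x):
            return self.body(torch.relu(self.input_layer(x)))

        def sparsity_penalty(self):
            # L1 on the input weights drives most channel weights toward zero,
            # effectively performing variable selection.
            return self.input_layer.weight.abs().sum()

    # Assumed training objective: binary cross-entropy on the transient-noise label
    # plus lam * model.sparsity_penalty() for some regularization strength lam.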

Read this paper on arXiv…

R. Colgan, J. Yan, Z. Márka, et. al.
Tue, 1 Mar 22
76/80

Comments: N/A

A duality connecting neural network and cosmological dynamics [CL]

http://arxiv.org/abs/2202.11104


We demonstrate that the dynamics of neural networks trained with gradient descent and the dynamics of scalar fields in a flat, vacuum energy dominated Universe are deeply related in structure. This duality provides a framework for synergies between these systems: it offers new ways to understand and explain neural network dynamics, and to simulate and describe early Universe models. Working in the continuous-time limit of neural networks, we analytically match the dynamics of the mean background and the dynamics of small perturbations around the mean field, highlighting potential differences in separate limits. We perform empirical tests of this analytic description and quantitatively show the dependence of the effective field theory parameters on hyperparameters of the neural network. As a result of this duality, the cosmological constant is matched inversely to the learning rate in the gradient descent update.

Read this paper on arXiv…

S. Krippendorf and M. Spannowsky
Thu, 24 Feb 22
24/52

Comments: 17 pages, 6 figures

Translation and Rotation Equivariant Normalizing Flow (TRENF) for Optimal Cosmological Analysis [CEA]

http://arxiv.org/abs/2202.05282


Our universe is homogeneous and isotropic, and its perturbations obey translation and rotation symmetry. In this work we develop Translation and Rotation Equivariant Normalizing Flow (TRENF), a generative Normalizing Flow (NF) model which explicitly incorporates these symmetries, defining the data likelihood via a sequence of Fourier space-based convolutions and pixel-wise nonlinear transforms. TRENF gives direct access to the high dimensional data likelihood p(x|y) as a function of the labels y, such as cosmological parameters. In contrast to traditional analyses based on summary statistics, the NF approach has no loss of information since it preserves the full dimensionality of the data. On Gaussian random fields, the TRENF likelihood agrees well with the analytical expression and saturates the Fisher information content in the labels y. On nonlinear cosmological overdensity fields from N-body simulations, TRENF leads to significant improvements in constraining power over the standard power spectrum summary statistic. TRENF is also a generative model of the data, and we show that TRENF samples agree well with the N-body simulations it trained on, and that the inverse mapping of the data agrees well with a Gaussian white noise both visually and on various summary statistics: when this is perfectly achieved the resulting p(x|y) likelihood analysis becomes optimal. Finally, we develop a generalization of this model that can handle effects that break the symmetry of the data, such as the survey mask, which enables likelihood analysis on data without periodic boundaries.
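
A toy version of one Fourier-space building block is sketched below in NumPy: convolving a 2D field with a learned isotropic kernel is translation equivariant by construction and rotation equivariant because the kernel depends only on $|k|$. The binning and shapes are illustrative; a real TRENF layer also applies pixel-wise nonlinear transforms and keeps the map invertible with a tractable Jacobian so it can serve as a normalizing flow.

    import numpy as np

    def isotropic_fourier_layer(field, radial_kernel, bins=16):
        """Convolve a 2D field with a learned isotropic kernel in Fourier space."""
        n = field.shape[0]
        kx = np.fft.fftfreq(n)[:, None]
        ky = np.fft.fftfreq(n)[None, :]
        k = np.sqrt(kx ** 2 + ky ** 2)
        idx = np.minimum((k / k.max() * (bins - 1)).astype(int), bins - 1)
        transfer = radial_kernel[idx]           # one learned value per |k| bin
        return np.fft.ifft2(np.fft.fft2(field) * transfer).real

    field = np.random.randn(64, 64)
    out = isotropic_fourier_layer(field, radial_kernel=np.random.randn(16))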

Read this paper on arXiv…

B. Dai and U. Seljak
Mon, 14 Feb 22
2/55

Comments: 11 pages, 10 figures. Submitted to MNRAS. Comments welcome

Feasible Low-thrust Trajectory Identification via a Deep Neural Network Classifier [CL]

http://arxiv.org/abs/2202.04962


In recent years, deep learning techniques have been introduced into the field of trajectory optimization to improve convergence and speed. Training such models requires large trajectory datasets. However, the convergence of low-thrust (LT) optimizations is unpredictable before the optimization process ends. For randomly initialized low-thrust transfer data generation, most of the computation power will be wasted on optimizing infeasible low-thrust transfers, which leads to an inefficient data generation process. This work proposes a deep neural network (DNN) classifier to accurately identify feasible LT transfers prior to the optimization process. The DNN classifier achieves an overall accuracy of 97.9%, the best performance among the tested algorithms. Accurate low-thrust trajectory feasibility identification avoids optimization on undesired samples, so that the majority of the optimized samples are LT trajectories that converge. This technique enables efficient dataset generation for different mission scenarios with different spacecraft configurations.

Read this paper on arXiv…

R. Xie and A. Dempster
Fri, 11 Feb 22
32/71

Comments: 18 Pages; 10 figures; Presented at 2021 AAS/AIAA Astrodynamics Specialist Conference, Big Sky, Virtual

SUPA: A Lightweight Diagnostic Simulator for Machine Learning in Particle Physics [CL]

http://arxiv.org/abs/2202.05012


Deep learning methods have gained popularity in high energy physics for fast modeling of particle showers in detectors. Detailed simulation frameworks such as the gold standard Geant4 are computationally intensive, and current deep generative architectures work on discretized, lower resolution versions of the detailed simulation. The development of models that work at higher spatial resolutions is currently hindered by the complexity of the full simulation data, and by the lack of simpler, more interpretable benchmarks. Our contribution is SUPA, the SUrrogate PArticle propagation simulator, an algorithm and software package for generating data by simulating simplified particle propagation, scattering and shower development in matter. The generation is extremely fast and easy to use compared to Geant4, but still exhibits the key characteristics and challenges of the detailed simulation. We support this claim experimentally by showing that the performance of generative models on data from our simulator reflects the performance on a dataset generated with Geant4. The proposed simulator generates thousands of particle showers per second on a desktop machine, a speed-up of up to 6 orders of magnitude over Geant4, and stores detailed geometric information about the shower propagation. SUPA provides much greater flexibility for setting initial conditions and defining multiple benchmarks for the development of models. Moreover, interpreting particle showers as point clouds creates a connection to geometric machine learning and provides challenging and fundamentally new datasets for the field.
The code for SUPA is available at https://github.com/itsdaniele/SUPA.

Read this paper on arXiv…

A. Sinha, D. Paliotta, B. Máté, et. al.
Fri, 11 Feb 22
61/71

Comments: N/A

Rediscovering orbital mechanics with machine learning [EPA]

http://arxiv.org/abs/2202.02306


We present an approach for using machine learning to automatically discover the governing equations and hidden properties of real physical systems from observations. We train a “graph neural network” to simulate the dynamics of our solar system’s Sun, planets, and large moons from 30 years of trajectory data. We then use symbolic regression to discover an analytical expression for the force law implicitly learned by the neural network, which our results showed is equivalent to Newton’s law of gravitation. The key assumptions that were required were translational and rotational equivariance, and Newton’s second and third laws of motion. Our approach correctly discovered the form of the symbolic force law. Furthermore, our approach did not require any assumptions about the masses of planets and moons or physical constants. They, too, were accurately inferred through our methods. Though, of course, the classical law of gravitation has been known since Isaac Newton, our result serves as a validation that our method can discover unknown laws and hidden properties from observed data. More broadly this work represents a key step toward realizing the potential of machine learning for accelerating scientific discovery.

Read this paper on arXiv…

P. Lemos, N. Jeffrey, M. Cranmer, et. al.
Mon, 7 Feb 22
32/46

Comments: 12 pages, 6 figures, under review

Machine Learning Solar Wind Driving Magnetospheric Convection in Tail Lobes [SSA]

http://arxiv.org/abs/2202.01383


To quantitatively study the driving mechanisms of magnetospheric convection in the magnetotail lobes on a global scale, we utilize data from the ARTEMIS spacecraft in the deep tail and the Cluster spacecraft in the near tail. Previous work demonstrated that, in the lobes near the Moon, we can estimate the convection by utilizing ARTEMIS measurements of lunar ions velocity. In this paper, we analyze these datasets with machine learning models to determine what upstream factors drive the lobe convection in different magnetotail regions and thereby understand the mechanisms that control the dynamics of the tail lobes. Our results show that the correlations between the predicted and test convection velocities for the machine learning models (> 0.75) are much better than those of the multiple linear regression model (~ 0.23 – 0.43). The systematic analysis reveals that the IMF and magnetospheric activity play an important role in influencing plasma convection in the global magnetotail lobes.

Read this paper on arXiv…

X. Cao, J. Halekas, S. Haaland, et. al.
Fri, 4 Feb 22
10/65

Comments: N/A

YOUNG Star detrending for Transiting Exoplanet Recovery (YOUNGSTER) II: Using Self-Organising Maps to explore young star variability in Sectors 1-13 of TESS data [EPA]

http://arxiv.org/abs/2202.00031


Young exoplanets and their corresponding host stars are fascinating laboratories for constraining the timescale of planetary evolution and planet-star interactions. However, because young stars are typically much more active than the older population, greater knowledge of the wide array of young star variability is needed in order to discover more young exoplanets. Here Kohonen Self-Organising Maps (SOMs) are used to explore the young star variability present in the first year of observations from the Transiting Exoplanet Survey Satellite (TESS), knowledge that will be valuable for performing targeted detrending of young stars in the future. This technique was found to be particularly effective at separating the signals of young eclipsing binaries and potential transiting objects from stellar variability, a list of which is provided in this paper. The effect of pre-training the Self-Organising Maps on known variability classes was tested, but found to be challenging without a significant training set from TESS. SOMs were also found to provide an intuitive and informative overview of leftover systematics in the TESS data, providing an important new way to characterise troublesome systematics in photometric datasets. This paper represents the first stage of the wider YOUNGSTER program, which will use a machine-learning-based approach to classification and targeted detrending of young stars in order to improve the recovery of smaller young exoplanets.
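
For readers who want to experiment, a Kohonen SOM over light-curve feature vectors can be set up in a few lines with the third-party minisom package (an assumption here, not necessarily the implementation used by the authors); the feature construction and map size below are placeholders.

    import numpy as np
    from minisom import MiniSom     # third-party package, assumed available

    # features: one row per light curve, e.g. a phase-binned, normalised shape
    # vector or a set of variability statistics (placeholder data here).
    features = np.random.rand(5000, 64)

    som = MiniSom(20, 20, input_len=64, sigma=1.5, learning_rate=0.5)
    som.random_weights_init(features)
    som.train_random(features, num_iteration=50_000)

    # Each light curve maps to its best-matching unit; nearby cells end up grouping
    # similar variability types (eclipsing binaries, transit candidates, systematics).
    cells = np.array([som.winner(x) for x in features])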

Read this paper on arXiv…

M. Battley, D. Armstrong and D. Pollacco
Wed, 2 Feb 22
12/60

Comments: 21 pages, 33 figures, accepted for publication at MNRAS

Exoplanet Characterization using Conditional Invertible Neural Networks [EPA]

http://arxiv.org/abs/2202.00027


The characterization of an exoplanet's interior is an inverse problem, which requires statistical methods such as Bayesian inference in order to be solved. Current methods employ Markov Chain Monte Carlo (MCMC) sampling to infer the posterior probability of planetary structure parameters for a given exoplanet. These methods are time-consuming since they require the calculation of a large number of planetary structure models. To speed up the inference process when characterizing an exoplanet, we propose to use conditional invertible neural networks (cINNs) to calculate the posterior probability of the internal structure parameters. cINNs are a special type of neural network which excel at solving inverse problems. We constructed a cINN using FrEIA, which was then trained on a database of $5.6\cdot 10^6$ internal structure models to recover the inverse mapping between internal structure parameters and observable features (i.e., planetary mass, planetary radius and composition of the host star). The cINN method was compared to a Metropolis-Hastings MCMC. To that end, we repeated the characterization of the exoplanet K2-111 b using both the MCMC method and the trained cINN. We show that the inferred posterior probabilities of the internal structure parameters from both methods are very similar, with the biggest differences seen in the exoplanet's water content. Thus cINNs are a possible alternative to the standard time-consuming sampling methods. Indeed, using cINNs allows for orders of magnitude faster inference of an exoplanet's composition than is possible using an MCMC method; however, it still requires the computation of a large database of internal structures to train the cINN. Since this database is only computed once, we found that using a cINN is more efficient than an MCMC when more than 10 exoplanets are characterized using the same cINN.
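
The paper builds its cINN with the FrEIA library; purely as an illustration of the mechanism, the sketch below implements a single conditional affine coupling block in plain PyTorch, where the observables (e.g. planetary mass, radius, stellar composition) enter as the conditioning vector. Dimensions and network sizes are hypothetical.

    import torch
    import torch.nn as nn

    class ConditionalCoupling(nn.Module):
        """One affine coupling block conditioned on the observables."""
        def __init__(self, dim, cond_dim, hidden=128):
            super().__init__()
            self.half = dim // 2
            self.net = nn.Sequential(
                nn.Linear(self.half + cond_dim, hidden), nn.ReLU(),
                nn.Linear(hidden, 2 * (dim - self.half)))

        def forward(self, x, cond):
            x1, x2 = x[:, :self.half], x[:, self.half:]
            s, t = self.net(torch.cat([x1, cond], dim=1)).chunk(2, dim=1)
            s = torch.tanh(s)                    # keep the scale factor well behaved
            y2 = x2 * torch.exp(s) + t           # invertible given x1 and the condition
            log_det = s.sum(dim=1)               # contribution to the log-Jacobian
            return torch.cat([x1, y2], dim=1), log_det

    # 6 internal-structure parameters conditioned on 3 observables; sizes illustrative.
    block = ConditionalCoupling(dim=6, cond_dim=3)
    z, log_det = block(torch.randn(32, 6), torch.randn(32, 3))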

Read this paper on arXiv…

J. Haldemann, V. Ksoll, D. Walter, et. al.
Wed, 2 Feb 22
32/60

Comments: 15 pages, 13 figures, submitted to Astronomy & Astrophysics

Inference-optimized AI and high performance computing for gravitational wave detection at scale [CL]

http://arxiv.org/abs/2201.11133


We introduce an ensemble of artificial intelligence models for gravitational wave detection that we trained on the Summit supercomputer using 32 nodes, equivalent to 192 NVIDIA V100 GPUs, within 2 hours. Once fully trained, we optimized these models for accelerated inference using NVIDIA TensorRT. We deployed our inference-optimized AI ensemble on the ThetaGPU supercomputer at the Argonne Leadership Computing Facility to conduct distributed inference. Using the entire ThetaGPU supercomputer, consisting of 20 nodes each of which has 8 NVIDIA A100 Tensor Core GPUs and 2 AMD Rome CPUs, our NVIDIA TensorRT-optimized AI ensemble processed an entire month of advanced LIGO data (including Hanford and Livingston data streams) within 50 seconds. Our inference-optimized AI ensemble retains the same sensitivity as traditional AI models, namely, it identifies all known binary black hole mergers previously identified in this advanced LIGO dataset and reports no misclassifications, while also providing a 3X inference speedup compared to traditional artificial intelligence models. We used time slides to quantify the performance of our AI ensemble on up to 5 years' worth of advanced LIGO data. In this synthetically enhanced dataset, our AI ensemble reports an average of one misclassification for every month of searched advanced LIGO data. We also present the receiver operating characteristic curve of our AI ensemble using this 5 year long advanced LIGO dataset. This approach provides the required tools to conduct accelerated, AI-driven gravitational wave detection at scale.

Read this paper on arXiv…

P. Chaturvedi, A. Khan, M. Tian, et. al.
Fri, 28 Jan 22
3/64

Comments: 19 pages, 8 figure

Born-Infeld (BI) for AI: Energy-Conserving Descent (ECD) for Optimization [CL]

http://arxiv.org/abs/2201.11137


We introduce a novel framework for optimization based on energy-conserving Hamiltonian dynamics in a strongly mixing (chaotic) regime and establish its key properties analytically and numerically. The prototype is a discretization of Born-Infeld dynamics, with a squared relativistic speed limit depending on the objective function. This class of frictionless, energy-conserving optimizers proceeds unobstructed until slowing naturally near the minimal loss, which dominates the phase space volume of the system. Building from studies of chaotic systems such as dynamical billiards, we formulate a specific algorithm with good performance on machine learning and PDE-solving tasks, including generalization. It cannot stop at a high local minimum and cannot overshoot the global minimum, yielding an advantage in non-convex loss functions, and proceeds faster than GD+momentum in shallow valleys.

Read this paper on arXiv…

G. Luca and E. Silverstein
Fri, 28 Jan 22
17/64

Comments: 9 pages + Appendix, 8 figures. Code available online

Alleviating the Transit Timing Variations bias in transit surveys. II. RIVERS: Twin resonant Earth-sized planets around Kepler-1972 recovered from Kepler's false positive [EPA]

http://arxiv.org/abs/2201.11459


Transit Timing Variations (TTVs) can provide useful information for systems observed by transit, by putting constraints on the masses and eccentricities of the observed planets, or even constraining the existence of non-transiting companions. However, TTVs can also prevent the detection of small planets in transit surveys, or bias the recovered planetary and transit parameters. Here we show that Kepler-1972 c, initially the “not transit-like” false positive KOI-3184.02, is an Earth-sized planet whose orbit is perturbed by Kepler-1972 b (initially KOI-3184.01). The pair is locked in a 3:2 mean-motion resonance, each planet displaying TTVs of more than 6 hours of amplitude over the duration of the Kepler mission. The two planets have similar masses $m_b/m_c =0.956_{-0.051}^{+0.056}$ and radii $R_b=0.802_{-0.041}^{+0.042}R_{Earth}$, $R_c=0.868_{-0.050}^{+0.051}R_{Earth}$, and the whole system, including the inner candidate KOI-3184.03, appears to be coplanar. Despite the faintness of the signals (SNR of 1.35 for each transit of Kepler-1972 b and 1.10 for Kepler-1972 c), we recovered the transits of the planets using the RIVERS method, based on the recognition of the tracks of planets in river diagrams using machine learning, and a photo-dynamic fit of the lightcurve. Recovering the correct ephemerides of the planets is essential to have a complete picture of the observed planetary systems. In particular, we show that in Kepler-1972, not taking into account planet-planet interactions yields an error of $\sim 30\%$ on the radii of planets b and c, in addition to generating in-transit scatter, which led to KOI-3184.02 being mistaken for a false positive. Alleviating this bias is essential for an unbiased view of Kepler systems, some of the TESS stars, and the upcoming PLATO mission.

Read this paper on arXiv…

A. Leleu, J. Delisle, R. Mardling, et. al.
Fri, 28 Jan 22
19/64

Comments: arXiv admin note: text overlap with arXiv:2111.06825

Deep Attention-Based Supernovae Classification of Multi-Band Light-Curves [IMA]

http://arxiv.org/abs/2201.08482


In astronomical surveys, such as the Zwicky Transient Facility (ZTF), supernovae (SNe) are relatively uncommon objects compared to other classes of variable events. Along with this scarcity, the processing of multi-band light-curves is a challenging task due to the highly irregular cadence, long time gaps, missing-values, low number of observations, etc. These issues are particularly detrimental for the analysis of transient events with SN-like light-curves. In this work, we offer three main contributions. First, based on temporal modulation and attention mechanisms, we propose a Deep Attention model called TimeModAttn to classify multi-band light-curves of different SN types, avoiding photometric or hand-crafted feature computations, missing-values assumptions, and explicit imputation and interpolation methods. Second, we propose a model for the synthetic generation of SN multi-band light-curves based on the Supernova Parametric Model (SPM). This allows us to increase the number of samples and the diversity of the cadence. The TimeModAttn model is first pre-trained using synthetic light-curves in a semi-supervised learning scheme. Then, a fine-tuning process is performed for domain adaptation. The proposed TimeModAttn model outperformed a Random Forest classifier, increasing the balanced-$F_1$ score from $\approx.525$ to $\approx.596$. The TimeModAttn model also outperformed other Deep Learning models, based on Recurrent Neural Networks (RNNs), in two scenarios: late-classification and early-classification. Finally, we conduct interpretability experiments. High attention scores are obtained for observations earlier than and close to the SN brightness peaks, which are supported by an early and highly expressive learned temporal modulation.

Read this paper on arXiv…

Ó. Pimentel, P. Estévez and F. Förster
Mon, 24 Jan 22
28/59

Comments: Submitted to AJ on 14-Jan-2022

alpha-Deep Probabilistic Inference (alpha-DPI): efficient uncertainty quantification from exoplanet astrometry to black hole feature extraction [IMA]

http://arxiv.org/abs/2201.08506


Inference is crucial in modern astronomical research, where hidden astrophysical features and patterns are often estimated from indirect and noisy measurements. Inferring the posterior of hidden features, conditioned on the observed measurements, is essential for understanding the uncertainty of results and downstream scientific interpretations. Traditional approaches for posterior estimation include sampling-based methods and variational inference. However, sampling-based methods are typically slow for high-dimensional inverse problems, while variational inference often lacks estimation accuracy. In this paper, we propose alpha-DPI, a deep learning framework that first learns an approximate posterior using alpha-divergence variational inference paired with a generative neural network, and then produces more accurate posterior samples through importance re-weighting of the network samples. It inherits strengths from both sampling and variational inference methods: it is fast, accurate, and scalable to high-dimensional problems. We apply our approach to two high-impact astronomical inference problems using real data: exoplanet astrometry and black hole feature extraction.
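
The second stage, importance re-weighting of the network's approximate posterior samples, can be sketched in NumPy as below; log_q, log_prior and log_likelihood are placeholders for the learned proposal density and the problem's exact densities.

    import numpy as np

    def importance_reweight(samples, log_q, log_prior, log_likelihood):
        """Re-weight approximate-posterior samples by the exact (unnormalised)
        posterior and report the sample efficiency of the proposal."""
        log_w = log_likelihood(samples) + log_prior(samples) - log_q
        w = np.exp(log_w - log_w.max())          # stabilise before normalising
        w /= w.sum()
        n_eff = 1.0 / np.sum(w ** 2)             # effective sample size
        return w, n_eff / len(w)                 # weights and sample efficiency

    # samples and log_q come from the trained variational/generative network;
    # log_prior and log_likelihood are the problem's exact densities.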

Read this paper on arXiv…

H. Sun, K. Bouman, P. Tiede, et. al.
Mon, 24 Jan 22
42/59

Comments: N/A

Using machine learning to parametrize postmerger signals from binary neutron stars [CL]

http://arxiv.org/abs/2201.06461


There is growing interest in the detection and characterization of gravitational waves from postmerger oscillations of binary neutron stars. These signals contain information about the nature of the remnant and the high-density and out-of-equilibrium physics of the postmerger processes, which would complement any electromagnetic signal. However, the construction of binary neutron star postmerger waveforms is much more complicated than for binary black holes: (i) there are theoretical uncertainties in the neutron-star equation of state and other aspects of the high-density physics, (ii) numerical simulations are expensive and available ones only cover a small fraction of the parameter space with limited numerical accuracy, and (iii) it is unclear how to parametrize the theoretical uncertainties and interpolate across parameter space. In this work, we describe the use of a machine-learning method called a conditional variational autoencoder (CVAE) to construct postmerger models for hyper/massive neutron star remnant signals based on numerical-relativity simulations. The CVAE provides a probabilistic model, which encodes uncertainties in the training data within a set of latent parameters. We estimate that training such a model will ultimately require $\sim 10^4$ waveforms. However, using synthetic training waveforms as a proof-of-principle, we show that the CVAE can be used as an accurate generative model and that it encodes the equation of state in a useful latent representation.
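
A minimal conditional VAE for this kind of task might look like the PyTorch sketch below, with the waveform (or its expansion coefficients) conditioned on source parameters; the dimensions, architecture and loss weighting are assumptions and much simpler than what training on numerical-relativity data would require.

    import torch
    import torch.nn as nn

    class PostmergerCVAE(nn.Module):
        """Conditional VAE: waveform conditioned on source parameters (masses, EOS, ...)."""
        def __init__(self, wave_dim=1024, cond_dim=4, latent=8, hidden=256):
            super().__init__()
            self.latent = latent
            self.enc = nn.Sequential(nn.Linear(wave_dim + cond_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, 2 * latent))
            self.dec = nn.Sequential(nn.Linear(latent + cond_dim, hidden), nn.ReLU(),
                                     nn.Linear(hidden, wave_dim))

        def loss(self, wave, cond):
            mu, log_var = self.enc(torch.cat([wave, cond], 1)).chunk(2, 1)
            z = mu + torch.randn_like(mu) * torch.exp(0.5 * log_var)   # reparameterize
            recon = self.dec(torch.cat([z, cond], 1))
            kl = -0.5 * (1 + log_var - mu ** 2 - log_var.exp()).sum(1).mean()
            return ((recon - wave) ** 2).mean() + kl    # ELBO-style training objective

        def generate(self, cond):
            z = torch.randn(cond.shape[0], self.latent)  # sample the latent space
            return self.dec(torch.cat([z, cond], 1))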

Read this paper on arXiv…

T. Whittaker, W. East, S. Green, et. al.
Thu, 20 Jan 22
33/77

Comments: N/A

A Novel Approach to Topological Graph Theory with R-K Diagrams and Gravitational Wave Analysis [HEAP]

http://arxiv.org/abs/2201.06923


Graph Theory and Topological Data Analytics (TDA), while powerful, have many drawbacks related to the sensitivity and consistency of TDA and graph network analytics. In this paper, we propose a novel approach for encoding vectorized associations between data points in order to enable smooth transitions between graph and topological data analytics. We reveal effective ways of converting such vectorized associations to simplicial complexes representing micro-states in a phase space, resulting in filter-specific, homotopic, self-expressive, event-driven, unique topological signatures with persistent homology, which we refer to as Roy-Kesselman (R-K) Diagrams and which emerge from filter-based encodings of R-K Models. The validity and impact of this approach were tested on high-dimensional raw and derived measures of gravitational-wave data from the latest LIGO datasets published by the LIGO Open Science Centre; a generalized approach for a non-scientific use case was also demonstrated using the Tableau Superstore Sales dataset. We believe the findings of our work will lay the foundation for many future scientific and engineering applications of stable, high-dimensional data analysis combining the effectiveness of topological and graph-theoretic transformations.

Read this paper on arXiv…

A. Roy and A. Kesselman
Wed, 19 Jan 22
87/121

Comments: N/A

Machine learning prediction for mean motion resonance behaviour — The planar case [EPA]

http://arxiv.org/abs/2201.06743


Most recently, machine learning has been used to study the dynamics of integrable Hamiltonian systems and the chaotic 3-body problem. In this work, we consider an intermediate case of regular motion in a non-integrable system: the behaviour of objects in the 2:3 mean motion resonance with Neptune. We show that, given initial data from a short 6250 yr numerical integration, the best-trained artificial neural network (ANN) can predict the trajectories of the 2:3 resonators over the subsequent 18750 yr evolution, covering a full libration cycle over the combined time period. By comparing our ANN's prediction of the resonant angle to the outcome of numerical integrations, we find that the former can predict the resonant angle to within a few degrees, while considerably saving computational time. More specifically, the trained ANN can effectively measure the resonant amplitudes of the 2:3 resonators, and thus provides a fast approach for identifying resonant candidates. This may be helpful in classifying the huge population of KBOs to be discovered in future surveys.

Read this paper on arXiv…

X. Li, J. Li, Z. Xia, et. al.
Wed, 19 Jan 22
88/121

Comments: 12 pages, 9 figures, accepted for publication in Monthly Notices of the Royal Astronomical Society

Probabilistic Mass Mapping with Neural Score Estimation [CEA]

http://arxiv.org/abs/2201.05561


Weak lensing mass-mapping is a useful tool to access the full distribution of dark matter on the sky, but because of intrinsic galaxy ellipticities and finite fields/missing data, the recovery of dark matter maps constitutes a challenging ill-posed inverse problem. We introduce a novel methodology allowing for efficient sampling of the high-dimensional Bayesian posterior of the weak lensing mass-mapping problem, relying on simulations to define a fully non-Gaussian prior. We aim to demonstrate the accuracy of the method on simulations, and then proceed to apply it to the mass reconstruction of the HST/ACS COSMOS field. The proposed methodology combines elements of Bayesian statistics, analytic theory, and a recent class of Deep Generative Models based on Neural Score Matching. This approach allows us to do the following: 1) Make full use of analytic cosmological theory to constrain the 2pt statistics of the solution. 2) Learn from cosmological simulations any differences between this analytic prior and full simulations. 3) Obtain samples from the full Bayesian posterior of the problem for robust Uncertainty Quantification. We demonstrate the method on the $\kappa$TNG simulations and find that the posterior mean significantly outperforms previous methods (Kaiser-Squires, Wiener filter, Sparsity priors) both on root-mean-square error and in terms of the Pearson correlation. We further illustrate the interpretability of the recovered posterior by establishing a close correlation between posterior convergence values and the SNR of clusters artificially introduced into a field. Finally, we apply the method to the reconstruction of the HST/ACS COSMOS field, yielding the highest quality convergence map of this field to date.
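
Once a neural score estimator of the (smoothed) posterior is available, drawing samples typically proceeds via annealed Langevin dynamics, sketched below in NumPy; the score function, noise schedule and step sizes are placeholders.

    import numpy as np

    def annealed_langevin(score_fn, shape, sigmas, n_steps=100, eps=2e-5):
        """Annealed Langevin dynamics: draw approximate posterior samples given a
        learned score function of the smoothed posterior, score_fn(x, sigma)."""
        x = np.random.randn(*shape)
        for sigma in sigmas:                     # anneal from large to small noise
            step = eps * (sigma / sigmas[-1]) ** 2
            for _ in range(n_steps):
                noise = np.random.randn(*shape)
                x = x + step * score_fn(x, sigma) + np.sqrt(2 * step) * noise
        return x

    # score_fn would be the neural score estimator conditioned on the shear data;
    # sigmas is a decreasing noise schedule, e.g. np.geomspace(1.0, 0.01, 10).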

Read this paper on arXiv…

B. Remy, F. Lanusse, N. Jeffrey, et. al.
Mon, 17 Jan 22
2/59

Comments: Submitted to A&A, 20 pages, 15 figures, comments are welcome
