Advances on the classification of radio image cubes [IMA]

http://arxiv.org/abs/2305.03435


Modern radio telescopes such as the Square Kilometre Array (SKA) will generate data sets on the scale of exabytes daily. Massive data sets are a source of unknown and rare astrophysical phenomena that lead to discoveries. Nonetheless, exploiting them is only plausible with intensive machine intelligence to complement human-aided and traditional statistical techniques. Recently, there has been a surge in scientific publications focusing on the use of artificial intelligence in radio astronomy, addressing challenges such as source extraction, morphological classification, and anomaly detection. This study presents a succinct but comprehensive review of the application of machine intelligence techniques to radio images, with emphasis on the morphological classification of radio galaxies. It aims to present a detailed synthesis of the relevant papers, summarizing the literature based on data complexity, data pre-processing, and methodological novelty in radio astronomy. The rapid advancement and application of computer intelligence in radio astronomy have resulted in a revolution and a new paradigm shift in the automation of daunting data processes. However, the optimal exploitation of artificial intelligence in radio astronomy calls for continued collaborative efforts in the creation of annotated data sets. Additionally, in order to quickly locate radio galaxies with similar or dissimilar physical characteristics, it is necessary to index the identified radio sources. This issue has not been adequately addressed in the literature, making it an open area for further study.

Read this paper on arXiv…

S. Ndung’u, T. Grobler, S. Wijnholds, et. al.
Mon, 8 May 23
56/63

Comments: 21-page review paper submitted to the New Astronomy Reviews journal for review

Multiplicity Boost Of Transit Signal Classifiers: Validation of 69 New Exoplanets Using The Multiplicity Boost of ExoMiner [EPA]

http://arxiv.org/abs/2305.02470


Most existing exoplanets are discovered using validation techniques rather than being confirmed by complementary observations. These techniques generate a score that is typically the probability of the transit signal being an exoplanet (y(x)=exoplanet) given some information related to that signal (represented by x). Except for the validation technique in Rowe et al. (2014) that uses multiplicity information to generate these probability scores, the existing validation techniques ignore the multiplicity boost information. In this work, we introduce a framework with the following premise: given an existing transit signal vetter (classifier), improve its performance using multiplicity information. We apply this framework to several existing classifiers, which include vespa (Morton et al. 2016), Robovetter (Coughlin et al. 2017), AstroNet (Shallue & Vanderburg 2018), ExoNet (Ansdell et al. 2018), GPC and RFC (Armstrong et al. 2020), and ExoMiner (Valizadegan et al. 2022), to support our claim that this framework is able to improve the performance of a given classifier. We then use the proposed multiplicity boost framework for ExoMiner V1.2, which addresses some of the shortcomings of the original ExoMiner classifier (Valizadegan et al. 2022), and validate 69 new exoplanets for systems with multiple KOIs from the Kepler catalog.
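
As a purely schematic illustration of how a multiplicity prior can boost an existing vetter's score (this is not the statistical framework of the paper), the Python sketch below re-weights a base classifier probability with Bayes' rule; the prior values and the function name are invented for the example:

    def multiplicity_boost(p_base, n_other_candidates, prior_single=0.3, prior_multi=0.7):
        """Schematic Bayesian re-weighting of a vetter score using multiplicity.

        p_base is the base classifier's planet probability, assumed calibrated
        against the prior `prior_single`; `prior_multi` is a made-up boosted
        prior for systems with additional transit signals, not a paper value.
        """
        prior = prior_multi if n_other_candidates > 0 else prior_single
        odds_base = p_base / (1.0 - p_base)                # posterior odds under the old prior
        odds_old_prior = prior_single / (1.0 - prior_single)
        odds_new_prior = prior / (1.0 - prior)
        odds_boosted = odds_base / odds_old_prior * odds_new_prior
        return odds_boosted / (1.0 + odds_boosted)

    # A marginal score of 0.80 in a system with two other candidates gets boosted.
    print(multiplicity_boost(0.80, n_other_candidates=2))

In the paper the boost comes from the observed statistics of candidates in multi-planet systems; here the priors are simply placeholders.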

Read this paper on arXiv…

H. Valizadegan, M. Martinho, J. Jenkins, et. al.
Fri, 5 May 23
24/67

Comments: N/A

Distinguishing a planetary transit from false positives: a Transformer-based classification for planetary transit signals [EPA]

http://arxiv.org/abs/2304.14283


Current space-based missions, such as the Transiting Exoplanet Survey Satellite (TESS), provide a large database of light curves that must be analysed efficiently and systematically. In recent years, deep learning (DL) methods, particularly convolutional neural networks (CNN), have been used to classify transit signals of candidate exoplanets automatically. However, CNNs have some drawbacks; for example, they require many layers to capture dependencies on sequential data, such as light curves, making the network so large that it eventually becomes impractical. The self-attention mechanism is a DL technique that attempts to mimic the action of selectively focusing on some relevant things while ignoring others. Models, such as the Transformer architecture, were recently proposed for sequential data with successful results. Based on these successful models, we present a new architecture for the automatic classification of transit signals. Our proposed architecture is designed to capture the most significant features of a transit signal and stellar parameters through the self-attention mechanism. In addition to model prediction, we take advantage of attention map inspection, obtaining a more interpretable DL approach. Thus, we can identify the relevance of each element to differentiate a transit signal from false positives, simplifying the manual examination of candidates. We show that our architecture achieves competitive results concerning the CNNs applied for recognizing exoplanetary transit signals in data from the TESS telescope. Based on these results, we demonstrate that applying this state-of-the-art DL model to light curves can be a powerful technique for transit signal detection while offering a level of interpretability.
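
For readers wanting a concrete feel for the self-attention approach, here is a minimal, hypothetical PyTorch sketch of a Transformer-encoder classifier over a phase-folded light curve plus a few stellar parameters; the sequence length, embedding size and pooling choice are illustrative and do not reproduce the architecture of the paper:

    import torch
    import torch.nn as nn

    class TransitTransformer(nn.Module):
        """Minimal self-attention classifier for binned light curves."""
        def __init__(self, seq_len=200, d_model=64, nhead=4, n_stellar=4):
            super().__init__()
            self.embed = nn.Linear(1, d_model)              # per-bin flux -> embedding
            self.pos = nn.Parameter(torch.zeros(1, seq_len, d_model))
            layer = nn.TransformerEncoderLayer(d_model=d_model, nhead=nhead,
                                               batch_first=True)
            self.encoder = nn.TransformerEncoder(layer, num_layers=2)
            self.head = nn.Sequential(
                nn.Linear(d_model + n_stellar, 64), nn.ReLU(), nn.Linear(64, 1))

        def forward(self, flux, stellar):
            # flux: (batch, seq_len), stellar: (batch, n_stellar)
            x = self.embed(flux.unsqueeze(-1)) + self.pos
            x = self.encoder(x).mean(dim=1)                 # average-pool over time
            return torch.sigmoid(self.head(torch.cat([x, stellar], dim=1)))

    model = TransitTransformer()
    scores = model(torch.randn(8, 200), torch.randn(8, 4))  # dummy batch
    print(scores.shape)                                     # torch.Size([8, 1])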

Read this paper on arXiv…

H. Salinas, K. Pichara, R. Brahm, et. al.
Fri, 28 Apr 23
34/68

Comments: N/A

Onboard Science Instrument Autonomy for the Detection of Microscopy Biosignatures on the Ocean Worlds Life Surveyor [IMA]

http://arxiv.org/abs/2304.13189


The quest to find extraterrestrial life is a critical scientific endeavor with civilization-level implications. Icy moons in our solar system are promising targets for exploration because their liquid oceans make them potential habitats for microscopic life. However, the lack of a precise definition of life poses a fundamental challenge to formulating detection strategies. To increase the chances of unambiguous detection, a suite of complementary instruments must sample multiple independent biosignatures (e.g., composition, motility/behavior, and visible structure). Such an instrument suite could generate 10,000x more raw data than is possible to transmit from distant ocean worlds like Enceladus or Europa. To address this bandwidth limitation, Onboard Science Instrument Autonomy (OSIA) is an emerging discipline of flight systems capable of evaluating, summarizing, and prioritizing observational instrument data to maximize science return. We describe two OSIA implementations developed as part of the Ocean Worlds Life Surveyor (OWLS) prototype instrument suite at the Jet Propulsion Laboratory. The first identifies life-like motion in digital holographic microscopy videos, and the second identifies cellular structure and composition via innate and dye-induced fluorescence. Flight-like requirements and computational constraints, similar to those available on the Mars helicopter, “Ingenuity,” were used to lower barriers to infusion. We evaluated the OSIA’s performance using simulated and laboratory data and conducted a live field test at the hypersaline Mono Lake planetary analog site. Our study demonstrates the potential of OSIA for enabling biosignature detection and provides insights and lessons learned for future mission concepts aimed at exploring the outer solar system.

Read this paper on arXiv…

M. Wronkiewicz, J. Lee, L. Mandrake, et. al.
Thu, 27 Apr 23
69/78

Comments: 49 pages, 18 figures, submitted to The Planetary Science Journal on 2023-04-20

Lossy Compression of Large-Scale Radio Interferometric Data [IMA]

http://arxiv.org/abs/2304.07050


This work proposes to reduce visibility data volume using a baseline-dependent lossy compression technique that preserves smearing at the edges of the field-of-view. We exploit the rank of the visibility matrix and the fact that a low-rank approximation can describe the raw visibility data as a sum of basic components, where each basic component corresponds to a specific Fourier component of the sky distribution. As such, the entire visibility data is represented as a collection of data matrices from baselines, instead of a single tensor. The proposed methods are formulated as follows: given the entire visibility data set, the first algorithm, named $simple~SVD$, projects the data into a regular sampling space of rank-$r$ data matrices. In this space, the data for all the baselines has the same rank, which makes the compression factor equal across all baselines. The second algorithm, named $BDSVD$, projects the data into an irregular sampling space of rank-$r_{pq}$ data matrices. The subscript $pq$ indicates that the rank of the data matrix varies across baselines $pq$, which makes the compression factor baseline-dependent. MeerKAT and the European Very Long Baseline Interferometry Network are used as reference telescopes to evaluate and compare the performance of the proposed methods against traditional methods, such as traditional averaging and baseline-dependent averaging (BDA). For the same spatial resolution threshold, both $simple~SVD$ and $BDSVD$ show effective compression two orders of magnitude higher than traditional averaging and BDA. At the same space-saving rate, there is no decrease in spatial resolution and there is a reduction in the noise variance in the data, which improves the S/N to over $1.5$ dB at the edges of the field-of-view.
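
A rough, hypothetical Python sketch of the underlying low-rank idea (not the authors' simple SVD or BDSVD implementation): each baseline's time-frequency visibility matrix is truncated to its leading singular components, and the retained rank r controls the compression factor; the matrix sizes, toy fringes and rank below are invented:

    import numpy as np

    def compress_baseline(vis, r):
        """Keep only the leading r singular components of one baseline's
        time x frequency visibility matrix (the lossy, low-rank representation)."""
        U, s, Vh = np.linalg.svd(vis, full_matrices=False)
        return U[:, :r], s[:r], Vh[:r, :]

    def decompress(U, s, Vh):
        return (U * s) @ Vh

    # Toy visibilities for one baseline: a few fringe-like components plus noise.
    rng = np.random.default_rng(0)
    t = np.linspace(0.0, 1.0, 500)           # 500 time samples
    nu = np.linspace(0.0, 1.0, 64)           # 64 frequency channels
    vis = sum(np.outer(np.exp(2j * np.pi * k * t), np.exp(-2j * np.pi * k * nu))
              for k in range(4))
    vis += 0.01 * (rng.normal(size=vis.shape) + 1j * rng.normal(size=vis.shape))

    U, s, Vh = compress_baseline(vis, r=8)    # r could vary per baseline (BDSVD idea)
    approx = decompress(U, s, Vh)
    stored = U.size + s.size + Vh.size
    print(f"compression factor: {vis.size / stored:.1f}")
    print(f"relative error: {np.linalg.norm(vis - approx) / np.linalg.norm(vis):.4f}")

In the baseline-dependent variant described in the abstract, the rank would be chosen per baseline pq rather than globally.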

Read this paper on arXiv…

M. Atemkeng, S. Perkins, E. Seck, et. al.
Mon, 17 Apr 23
19/51

Comments: N/A

Constraining cosmological parameters from N-body simulations with Variational Bayesian Neural Networks [IMA]

http://arxiv.org/abs/2301.03991


Methods based on Deep Learning have recently been applied to astrophysical parameter recovery thanks to their ability to capture information from complex data. One of these methods is approximate Bayesian Neural Networks (BNNs), which have been demonstrated to yield consistent posterior distributions over the parameter space, helpful for uncertainty quantification. However, like any modern neural network, they tend to produce overly confident uncertainty estimates and can introduce bias when applied to data. In this work, we implement multiplicative normalizing flows (MNFs), a family of approximate posteriors for the parameters of BNNs, with the purpose of enhancing the flexibility of the variational posterior distribution, to extract $\Omega_m$, $h$, and $\sigma_8$ from the QUIJOTE simulations. We have compared this method with standard BNNs and the flipout estimator. We found that MNFs combined with BNNs outperform the other models, obtaining predictive performance almost one order of magnitude better than that of standard BNNs, $\sigma_8$ extracted with high accuracy ($r^2=0.99$), and precise uncertainty estimates. The latter implies that MNFs provide a more realistic predictive distribution, closer to the true posterior, mitigating the bias introduced by the variational approximation and allowing us to work with well-calibrated networks.

Read this paper on arXiv…

H. Hortúa, L. García and L. C
Wed, 11 Jan 23
52/80

Comments: 15 pages, 4 figures, 3 tables, submitted. Comments welcome

Modeling the Central Supermassive Black Holes Mass of Quasars via LSTM Approach [GA]

http://arxiv.org/abs/2301.01459


One of the fundamental questions about quasars is related to their central supermassive black holes. The reason for the existence of these black holes with such huge masses is still unclear, and various models have been proposed to explain them. However, there is still no comprehensive explanation that is accepted by the community. The only thing we are sure of is that these black holes were not created by the collapse of giant stars, nor by the accretion of matter around them. Moreover, another important question is the mass distribution of these black holes over time. Observations have shown that, going back in redshift, we see black holes with larger masses, and beyond the redshift of the peak of star formation, the masses decrease. Nevertheless, the exact redshift of this peak is still controversial. In this paper, with the help of deep learning and the LSTM algorithm, we tried to find a suitable model for the mass of the central black holes of quasars over time by considering QuasarNET data. Our model was built with data reported from redshift 3 to 7, and for the two redshift intervals 0 to 3 and 7 to 10 it predicted the mass of the quasars’ central supermassive black holes. We have also tested our model for the specified intervals with observed data from central black holes and discussed the results.
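
As a minimal sketch of the kind of sequence model involved (not the authors' network or data), the following Keras snippet trains a small LSTM on a toy log-mass series defined on a redshift grid and extrapolates one step beyond the fitted range; the grid, the toy trend and all hyperparameters are invented:

    import numpy as np
    import tensorflow as tf

    # Hypothetical training pairs: windows of log-mass sampled on a redshift grid
    # (stand-ins for QuasarNET measurements), predicting the next grid point.
    rng = np.random.default_rng(1)
    z_grid = np.linspace(3.0, 7.0, 200)
    log_m = 9.0 + 0.3 * np.sin(z_grid) + 0.05 * rng.normal(size=z_grid.size)  # toy trend

    window = 16
    X = np.stack([log_m[i:i + window] for i in range(len(log_m) - window)])[..., None]
    y = log_m[window:]

    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(window, 1)),
        tf.keras.layers.LSTM(32),
        tf.keras.layers.Dense(1),
    ])
    model.compile(optimizer="adam", loss="mse")
    model.fit(X, y, epochs=5, batch_size=32, verbose=0)

    # Extrapolate one step beyond the training range, echoing the idea of
    # predicting masses outside the observed redshift interval.
    print(model.predict(log_m[-window:].reshape(1, window, 1), verbose=0))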

Read this paper on arXiv…

S. Tabasi, R. Salmani, P. Khaliliyan, et. al.
Thu, 5 Jan 23
43/51

Comments: 15 pages, 5 figures

Heliophysics Discovery Tools for the 21st Century: Data Science and Machine Learning Structures and Recommendations for 2020-2050 [IMA]

http://arxiv.org/abs/2212.13325


Three main points: 1. Data Science (DS) will be increasingly important to heliophysics; 2. Methods of heliophysics science discovery will continually evolve, requiring the use of learning technologies [e.g., machine learning (ML)] that are applied rigorously and that are capable of supporting discovery; and 3. To grow with the pace of data, technology, and workforce changes, heliophysics requires a new approach to the representation of knowledge.

Read this paper on arXiv…

R. McGranaghan, B. Thompson, E. Camporeale, et. al.
Thu, 29 Dec 22
35/47

Comments: 4 pages; Heliophysics 2050 White Paper

Applications of AI in Astronomy [IMA]

http://arxiv.org/abs/2212.01493


We provide a brief, and inevitably incomplete, overview of the use of Machine Learning (ML) and other AI methods in astronomy, astrophysics, and cosmology. Astronomy entered the big data era with the first digital sky surveys in the early 1990s and the resulting Terascale data sets, which required the automation of many data processing and analysis tasks, for example star-galaxy separation, with billions of feature vectors in hundreds of dimensions. The exponential data growth continued with the rise of synoptic sky surveys and Time Domain Astronomy, with the resulting Petascale data streams and the need for real-time processing, classification, and decision making. A broad variety of classification and clustering methods have been applied for these tasks, and this remains a very active area of research. Over the past decade we have seen an exponential growth of the astronomical literature involving a variety of ML/AI applications of an ever increasing complexity and sophistication. ML and AI are now a standard part of the astronomical toolkit. As the data complexity continues to increase, we anticipate further advances leading towards a collaborative human-AI discovery.

Read this paper on arXiv…

S. Djorgovski, A. Mahabal, M. Graham, et. al.
Tue, 6 Dec 22
79/87

Comments: 12 pages, 1 figure, an invited review chapter, to appear in: Artificial Intelligence for Science, eds. A. Choudhary, G. Fox and T. Hey, Singapore: World Scientific, in press (2023)

A comparative study of source-finding techniques in HI emission line cubes using SoFiA, MTObjects, and supervised deep learning [IMA]

http://arxiv.org/abs/2211.12809


The 21 cm spectral line emission of atomic neutral hydrogen (HI) is one of the primary wavelengths observed in radio astronomy. However, the signal is intrinsically faint and the HI content of galaxies depends on the cosmic environment, requiring large survey volumes and survey depth to investigate the HI Universe. As the amount of data coming from these surveys continues to increase with technological improvements, so does the need for automatic techniques for identifying and characterising HI sources while considering the tradeoff between completeness and purity. This study aimed to find the optimal pipeline for finding and masking the most sources with the best mask quality and the fewest artefacts in 3D neutral hydrogen cubes. Various existing methods were explored in an attempt to create a pipeline to optimally identify and mask the sources in 3D neutral hydrogen 21 cm spectral line data cubes. Two traditional source-finding methods were tested, SoFiA and MTObjects, as well as a new supervised deep learning approach, in which a 3D convolutional neural network architecture, known as V-Net was used. These three source-finding methods were further improved by adding a classical machine learning classifier as a post-processing step to remove false positive detections. The pipelines were tested on HI data cubes from the Westerbork Synthesis Radio Telescope with additional inserted mock galaxies. SoFiA combined with a random forest classifier provided the best results, with the V-Net-random forest combination a close second. We suspect this is due to the fact that there are many more mock sources in the training set than real sources. There is, therefore, room to improve the quality of the V-Net network with better-labelled data such that it can potentially outperform SoFiA.
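
The random-forest post-processing step can be pictured with a small, hypothetical scikit-learn sketch: each detection from a source finder is described by a few summary features and the classifier prunes likely false positives. The features, labels and threshold below are invented and are not those of the paper:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.model_selection import train_test_split

    # Hypothetical per-candidate features measured from a source finder's masks
    # (total flux, voxel count, peak S/N, filling factor), with labels saying
    # whether each detection matches an inserted mock galaxy.
    rng = np.random.default_rng(0)
    n = 2000
    features = np.column_stack([
        rng.lognormal(0.0, 1.0, n),     # total flux
        rng.integers(5, 500, n),        # voxel count
        rng.normal(5.0, 2.0, n),        # peak S/N
        rng.uniform(0.0, 1.0, n),       # filling factor of the bounding box
    ])
    is_real = (features[:, 2] + rng.normal(0, 1, n)) > 5.0   # toy ground truth

    X_tr, X_te, y_tr, y_te = train_test_split(features, is_real, random_state=0)
    clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)

    # Keep only detections the classifier considers likely to be real sources.
    keep = clf.predict_proba(X_te)[:, 1] > 0.5
    print(f"kept {keep.sum()} of {len(keep)} candidates; "
          f"test accuracy {clf.score(X_te, y_te):.2f}")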

Read this paper on arXiv…

J. Barkai, M. Verheijen, E. Martínez, et. al.
Thu, 24 Nov 22
51/71

Comments: N/A

Semi-Supervised Domain Adaptation for Cross-Survey Galaxy Morphology Classification and Anomaly Detection [GA]

http://arxiv.org/abs/2211.00677


In the era of big astronomical surveys, our ability to leverage artificial intelligence algorithms simultaneously for multiple datasets will open new avenues for scientific discovery. Unfortunately, simply training a deep neural network on images from one data domain often leads to very poor performance on any other dataset. Here we develop a Universal Domain Adaptation method, DeepAstroUDA, capable of performing semi-supervised domain alignment that can be applied to datasets with different types of class overlap. Extra classes can be present in either of the two datasets, and the method can even be used in the presence of unknown classes. For the first time, we demonstrate the successful use of domain adaptation on two very different observational datasets (from SDSS and DECaLS). We show that our method is capable of bridging the gap between two astronomical surveys, and also performs well for anomaly detection and clustering of unknown data in the unlabeled dataset. We apply our model to two examples of galaxy morphology classification tasks with anomaly detection: 1) classifying spiral and elliptical galaxies with detection of merging galaxies (three classes including one unknown anomaly class); 2) a more granular problem where the classes describe more detailed morphological properties of galaxies, with the detection of gravitational lenses (ten classes including one unknown anomaly class).

Read this paper on arXiv…

A. Ćiprijanović, A. Lewis, K. Pedro, et. al.
Thu, 3 Nov 22
30/59

Comments: 3 figures, 1 table; accepted to Machine Learning and the Physical Sciences – Workshop at the 36th conference on Neural Information Processing Systems (NeurIPS)

Learning to Detect Interesting Anomalies [CL]

http://arxiv.org/abs/2210.16334


Anomaly detection algorithms are typically applied to static, unchanging, data features hand-crafted by the user. But how does a user systematically craft good features for anomalies that have never been seen? Here we couple deep learning with active learning — in which an Oracle iteratively labels small amounts of data selected algorithmically over a series of rounds — to automatically and dynamically improve the data features for efficient outlier detection. This approach, AHUNT, shows excellent performance on MNIST, CIFAR10, and Galaxy-DESI data, significantly outperforming both standard anomaly detection and active learning algorithms with static feature spaces. Beyond improved performance, AHUNT also allows the number of anomaly classes to grow organically in response to Oracle’s evaluations. Extensive ablation studies explore the impact of Oracle question selection strategy and loss function on performance. We illustrate how the dynamic anomaly class taxonomy represents another step towards fully personalized rankings of different anomaly classes that reflect a user’s interests, allowing the algorithm to learn to ignore statistically significant but uninteresting outliers (e.g., noise). This should prove useful in the era of massive astronomical datasets serving diverse sets of users who can only review a tiny subset of the incoming data.
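
A generic pool-based active-learning loop with uncertainty sampling gives the flavour of the Oracle interaction, although it omits AHUNT's key ingredient of learning the feature space itself; everything in this sketch (data, model, query budget) is invented:

    import numpy as np
    from sklearn.ensemble import RandomForestClassifier

    rng = np.random.default_rng(0)
    X = rng.normal(size=(5000, 10))
    y = (X[:, 0] + 0.5 * X[:, 1] > 1.5).astype(int)       # rare "interesting" class

    # Seed the labelled pool with a few examples of each class, then let the
    # Oracle label the points the current model is least certain about.
    labelled = list(np.concatenate([np.where(y == 1)[0][:5], np.where(y == 0)[0][:15]]))
    for round_id in range(5):
        clf = RandomForestClassifier(n_estimators=100, random_state=0)
        clf.fit(X[labelled], y[labelled])
        proba = clf.predict_proba(X)[:, 1]
        uncertainty = np.abs(proba - 0.5)
        uncertainty[labelled] = np.inf                    # never re-query the Oracle
        queries = np.argsort(uncertainty)[:20]            # 20 questions per round
        labelled.extend(queries.tolist())
        print(f"round {round_id}: {len(labelled)} labels, "
              f"pool accuracy {clf.score(X, y):.3f}")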

Read this paper on arXiv…

A. Sadr, B. Bassett and E. Sekyi
Tue, 1 Nov 22
39/100

Comments: 10 pages, 7 figures

Interstellar Object Accessibility and Mission Design [EPA]

http://arxiv.org/abs/2210.14980


Interstellar objects (ISOs) are fascinating and under-explored celestial objects, providing physical laboratories to understand the formation of our solar system and probe the composition and properties of material formed in exoplanetary systems. This paper will discuss the accessibility of and mission design to ISOs with varying characteristics, including a discussion of state covariance estimation over the course of a cruise, handoffs from traditional navigation approaches to novel autonomous navigation for fast flyby regimes, and overall recommendations about preparing for the future in situ exploration of these targets. The lessons learned also apply to the fast flyby of other small bodies, including long-period comets and potentially hazardous asteroids, which also require a tactical response with similar characteristics.

Read this paper on arXiv…

B. Donitz, D. Mages, H. Tsukamoto, et. al.
Fri, 28 Oct 22
47/56

Comments: Accepted at IEEE Aerospace Conference

Explainable classification of astronomical uncertain time series [CL]

http://arxiv.org/abs/2210.00869


Exploring the expansion history of the universe, understanding its evolutionary stages, and predicting its future evolution are important goals in astrophysics. Today, machine learning tools are used to help achieve these goals by analyzing transient sources, which are modeled as uncertain time series. Although black-box methods achieve appreciable performance, existing interpretable time series methods failed to obtain acceptable performance for this type of data. Furthermore, data uncertainty is rarely taken into account in these methods. In this work, we propose an uncertainty-aware subsequence-based model which achieves a classification comparable to that of state-of-the-art methods. Unlike conformal learning which estimates model uncertainty on predictions, our method takes data uncertainty as additional input. Moreover, our approach is explainable-by-design, giving domain experts the ability to inspect the model and explain its predictions. The explainability of the proposed method also has the potential to inspire new developments in theoretical astrophysics modeling by suggesting important subsequences which depict details of light curve shapes. The dataset, the source code of our experiment, and the results are made available on a public repository.

Read this paper on arXiv…

M. Mbouopda, E. Ishida, E. Nguifo, et. al.
Tue, 4 Oct 22
58/71

Comments: N/A

Robust field-level inference with dark matter halos [CEA]

http://arxiv.org/abs/2209.06843


We train graph neural networks on halo catalogues from Gadget N-body simulations to perform field-level likelihood-free inference of cosmological parameters. The catalogues contain $\lesssim$5,000 halos with masses $\gtrsim 10^{10}~h^{-1}M_\odot$ in a periodic volume of $(25~h^{-1}{\rm Mpc})^3$; every halo in the catalogue is characterized by several properties such as position, mass, velocity, concentration, and maximum circular velocity. Our models, built to be permutationally, translationally, and rotationally invariant, do not impose a minimum scale on which to extract information and are able to infer the values of $\Omega_{\rm m}$ and $\sigma_8$ with a mean relative error of $\sim6\%$, when using positions plus velocities and positions plus masses, respectively. More importantly, we find that our models are very robust: they can infer the value of $\Omega_{\rm m}$ and $\sigma_8$ when tested using halo catalogues from thousands of N-body simulations run with five different N-body codes: Abacus, CUBEP$^3$M, Enzo, PKDGrav3, and Ramses. Surprisingly, the model trained to infer $\Omega_{\rm m}$ also works when tested on thousands of state-of-the-art CAMELS hydrodynamic simulations run with four different codes and subgrid physics implementations. Using halo properties such as concentration and maximum circular velocity allows our models to extract more information, at the expense of breaking the robustness of the models. This may happen because the different N-body codes are not converged on the relevant scales corresponding to these parameters.

Read this paper on arXiv…

H. Shao, F. Villaescusa-Navarro, P. Villanueva-Domingo, et. al.
Fri, 16 Sep 22
42/84

Comments: 25 pages, 11 figures, summary video: this https URL

Towards Coupling Full-disk and Active Region-based Flare Prediction for Operational Space Weather Forecasting [CL]

http://arxiv.org/abs/2209.07406


Solar flare prediction is a central problem in space weather forecasting and has captivated the attention of a wide spectrum of researchers due to recent advances in both remote sensing as well as machine learning and deep learning approaches. The experimental findings based on both machine and deep learning models reveal significant performance improvements for task-specific datasets. Along with building models, the practice of deploying such models to production environments under operational settings is a more complex and often time-consuming process which is often not addressed directly in research settings. We present a set of new heuristic approaches to train and deploy an operational solar flare prediction system for $\geq$M1.0-class flares with two prediction modes: full-disk and active region-based. In full-disk mode, predictions are performed on full-disk line-of-sight magnetograms using deep learning models whereas in active region-based models, predictions are issued for each active region individually using multivariate time series data instances. The outputs from individual active region forecasts and full-disk predictors are combined into a final full-disk prediction result with a meta-model. We utilized an equally weighted average ensemble of the two base learners’ flare probabilities as our baseline meta learner and improved the capabilities of our two base learners by training a logistic regression model. The major findings of this study are: (i) We successfully coupled two heterogeneous flare prediction models trained with different datasets and model architectures to predict a full-disk flare probability for the next 24 hours, (ii) Our proposed ensembling model, i.e., logistic regression, improves on the predictive performance of the two base learners and the baseline meta learner, measured in terms of two widely used metrics, the True Skill Statistic (TSS) and the Heidke Skill Score (HSS), and (iii) Our result analysis suggests that the logistic regression-based ensemble (Meta-FP) improves on the full-disk model (base learner) by $\sim9\%$ in terms of TSS and $\sim10\%$ in terms of HSS. Similarly, it improves on the AR-based model (base learner) by $\sim17\%$ and $\sim20\%$ in terms of TSS and HSS respectively. Finally, when compared to the baseline meta model, it improves on TSS by $\sim10\%$ and HSS by $\sim15\%$.
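
A hypothetical scikit-learn sketch of the stacking idea: a logistic-regression meta-learner is fit on the two base learners' flare probabilities and compared, via the TSS, against the equally weighted average baseline. The synthetic scores below stand in for the real out-of-sample forecasts, and a real evaluation would fit the meta-model on held-out predictions:

    import numpy as np
    from sklearn.linear_model import LogisticRegression
    from sklearn.metrics import confusion_matrix

    def tss(y_true, y_pred):
        """True Skill Statistic = sensitivity + specificity - 1."""
        tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
        return tp / (tp + fn) + tn / (tn + fp) - 1.0

    # Invented out-of-sample flare probabilities from the two base learners
    # (full-disk and active-region models), plus toy >=M1.0 flare labels.
    rng = np.random.default_rng(0)
    y = rng.integers(0, 2, 1000)
    p_fulldisk = np.clip(0.6 * y + 0.2 + 0.2 * rng.random(1000), 0, 1)
    p_ar = np.clip(0.5 * y + 0.25 + 0.25 * rng.random(1000), 0, 1)
    X_meta = np.column_stack([p_fulldisk, p_ar])

    # Baseline meta-learner: equally weighted average of the two probabilities.
    p_avg = X_meta.mean(axis=1)
    # Trained meta-learner: logistic regression on the base learners' outputs.
    meta = LogisticRegression().fit(X_meta, y)
    p_lr = meta.predict_proba(X_meta)[:, 1]

    print("TSS average ensemble :", round(tss(y, (p_avg > 0.5).astype(int)), 3))
    print("TSS logistic ensemble:", round(tss(y, (p_lr > 0.5).astype(int)), 3))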

Read this paper on arXiv…

C. Pandey, A. Ji, R. Angryk, et. al.
Fri, 16 Sep 22
44/84

Comments: N/A

Operational solar flare forecasting via video-based deep learning [SSA]

http://arxiv.org/abs/2209.05128


Operational flare forecasting aims at providing predictions that can be used to make decisions, typically at a daily scale, about the space weather impacts of flare occurrence. This study shows that video-based deep learning can be used for operational purposes when the training and validation sets used for the network optimization are generated while accounting for the periodicity of the solar cycle. Specifically, the paper describes an algorithm that can be applied to build up sets of active regions that are balanced according to the flare class rates associated with a specific cycle phase. These sets are used to train and validate a Long-term Recurrent Convolutional Network made of a combination of a convolutional neural network and a Long Short-Term Memory network. The reliability of this approach is assessed in the case of two prediction windows containing the solar storm of March 2015 and September 2017, respectively.
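
A minimal Keras sketch of a Long-term Recurrent Convolutional Network of this general type: a small CNN is applied per video frame via TimeDistributed, an LSTM aggregates over time, and a sigmoid issues the flare/no-flare probability. Frame size, depths and the number of frames are illustrative, not the paper's configuration:

    import tensorflow as tf

    frames, height, width = 10, 64, 64
    model = tf.keras.Sequential([
        tf.keras.layers.Input(shape=(frames, height, width, 1)),
        tf.keras.layers.TimeDistributed(tf.keras.layers.Conv2D(16, 3, activation="relu")),
        tf.keras.layers.TimeDistributed(tf.keras.layers.MaxPooling2D()),
        tf.keras.layers.TimeDistributed(tf.keras.layers.Conv2D(32, 3, activation="relu")),
        tf.keras.layers.TimeDistributed(tf.keras.layers.GlobalAveragePooling2D()),
        tf.keras.layers.LSTM(64),
        tf.keras.layers.Dense(1, activation="sigmoid"),
    ])
    model.compile(optimizer="adam", loss="binary_crossentropy", metrics=["accuracy"])
    model.summary()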

Read this paper on arXiv…

S. Guastavino, F. Marchetti, F. Benvenuto, et. al.
Tue, 13 Sep 22
36/85

Comments: N/A

Investigation of a Machine learning methodology for the SKA pulsar search pipeline [IMA]

http://arxiv.org/abs/2209.04430


The SKA pulsar search pipeline will be used for real-time detection of pulsars. Modern radio telescopes such as SKA will be generating petabytes of data in their full scale of operation. Hence experience-based and data-driven algorithms become indispensable for applications such as candidate detection. Here we describe our findings from testing a state-of-the-art object detection algorithm called Mask R-CNN to detect candidate signatures in the SKA pulsar search pipeline. We have trained the Mask R-CNN model to detect candidate images. A custom annotation tool was developed to mark the regions of interest in large datasets efficiently. We have successfully demonstrated this algorithm by detecting candidate signatures on a simulation dataset. The paper presents details of this work with a highlight on future prospects.
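
For orientation, instantiating Mask R-CNN for a two-class problem (background vs. candidate signature) takes only a few lines with torchvision (version 0.13 or later assumed); the annotated candidate plots, the custom annotation tool and the training loop from the paper are not reproduced here:

    import torch
    import torchvision

    # Mask R-CNN configured for two classes: background plus "candidate signature".
    model = torchvision.models.detection.maskrcnn_resnet50_fpn(
        weights=None, num_classes=2)
    model.eval()

    # Dummy single-channel candidate plot replicated to 3 channels, since the
    # model expects RGB-like input of shape (channels, height, width) in [0, 1].
    image = torch.rand(1, 256, 256).repeat(3, 1, 1)
    with torch.no_grad():
        prediction = model([image])[0]
    print(prediction["boxes"].shape, prediction["scores"].shape)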

Read this paper on arXiv…

S. Bhat, P. Thiagaraj, B. Stappers, et. al.
Mon, 12 Sep 22
51/54

Comments: N/A

Leap-frog neural network for learning the symplectic evolution from partitioned data [EPA]

http://arxiv.org/abs/2208.14148


For the Hamiltonian system, this work considers the learning and prediction of the position (q) and momentum (p) variables generated by a symplectic evolution map. Similar to Chen & Tao (2021), the symplectic map is represented by the generating function. In addition, we develop a new learning scheme by splitting the time series (q_i, p_i) into several partitions, and then train a leap-frog neural network (LFNN) to approximate the generating function between the first partition (i.e. the initial condition) and one of the remaining partitions. For predicting the system evolution on a short timescale, the LFNN could effectively avoid the issue of accumulative error. The LFNN is then applied to learn the behavior of the 2:3 resonant Kuiper belt objects over a much longer time period, and there are two significant improvements over the neural network constructed in our previous work (Li et al. 2022): (1) conservation of the Jacobi integral; (2) highly accurate prediction of the orbital evolution. We propose that the LFNN may be useful for predicting the long-term evolution of the Hamiltonian system.
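
For context, the classical kick-drift-kick leap-frog scheme that inspires the network's structure looks as follows in plain NumPy; the learned generating functions and the partitioned training data of the LFNN itself are not reproduced:

    import numpy as np

    def leapfrog(q, p, grad_potential, dt, n_steps):
        """Kick-drift-kick leap-frog for H = p^2/2 + V(q), a symplectic integrator."""
        for _ in range(n_steps):
            p = p - 0.5 * dt * grad_potential(q)   # half kick
            q = q + dt * p                         # drift
            p = p - 0.5 * dt * grad_potential(q)   # half kick
        return q, p

    # Example: a 2D Kepler-like orbit, checking that the energy drift stays bounded.
    grad_V = lambda q: q / np.linalg.norm(q) ** 3          # gradient of V = -1/|q|
    q0, p0 = np.array([1.0, 0.0]), np.array([0.0, 1.0])
    q1, p1 = leapfrog(q0, p0, grad_V, dt=1e-3, n_steps=20000)
    energy = lambda q, p: 0.5 * p @ p - 1.0 / np.linalg.norm(q)
    print(energy(q0, p0), energy(q1, p1))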

Read this paper on arXiv…

X. Li, J. Li and Z. Xia
Wed, 31 Aug 22
60/86

Comments: 10 pages, 5 figures, comments welcome

The Fellowship of the Dyson Ring: ACT&Friends’ Results and Methods for GTOC 11 [CL]

http://arxiv.org/abs/2205.10124


Dyson spheres are hypothetical megastructures encircling stars in order to harvest most of their energy output. During the 11th edition of the GTOC challenge, participants were tasked with a complex trajectory planning related to the construction of a precursor Dyson structure, a heliocentric ring made of twelve stations. To this purpose, we developed several new approaches that synthesize techniques from machine learning, combinatorial optimization, planning and scheduling, and evolutionary optimization effectively integrated into a fully automated pipeline. These include a machine learned transfer time estimator, improving the established Edelbaum approximation and thus better informing a Lazy Race Tree Search to identify and collect asteroids with high arrival mass for the stations; a series of optimally-phased low-thrust transfers to all stations computed by indirect optimization techniques, exploiting the synodic periodicity of the system; and a modified Hungarian scheduling algorithm, which utilizes evolutionary techniques to arrange a mass-balanced arrival schedule out of all transfer possibilities. We describe the steps of our pipeline in detail with a special focus on how our approaches mutually benefit from each other. Lastly, we outline and analyze the final solution of our team, ACT&Friends, which ranked second at the GTOC 11 challenge.

Read this paper on arXiv…

M. Märtens, D. Izzo, E. Blazquez, et. al.
Mon, 23 May 22
36/50

Comments: N/A

Insights into the origin of halo mass profiles from machine learning [CEA]

http://arxiv.org/abs/2205.04474


The mass distribution of dark matter haloes is the result of the hierarchical growth of initial density perturbations through mass accretion and mergers. We use an interpretable machine-learning framework to provide physical insights into the origin of the spherically-averaged mass profile of dark matter haloes. We train a gradient-boosted-trees algorithm to predict the final mass profiles of cluster-sized haloes, and measure the importance of the different inputs provided to the algorithm. We find two primary scales in the initial conditions (ICs) that impact the final mass profile: the density at approximately the scale of the haloes’ Lagrangian patch $R_L$ ($R\sim 0.7\, R_L$) and that in the large-scale environment ($R\sim 1.7~R_L$). The model also identifies three primary time-scales in the halo assembly history that affect the final profile: (i) the formation time of the virialized, collapsed material inside the halo, (ii) the dynamical time, which captures the dynamically unrelaxed, infalling component of the halo over its first orbit, (iii) a third, most recent time-scale, which captures the impact on the outer profile of recent massive merger events. While the inner profile retains memory of the ICs, this information alone is insufficient to yield accurate predictions for the outer profile. As we add information about the haloes’ mass accretion history, we find a significant improvement in the predicted profiles at all radii. Our machine-learning framework provides novel insights into the role of the ICs and the mass assembly history in determining the final mass profile of cluster-sized haloes.
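
The workflow of training a gradient-boosted-trees regressor and reading off feature importances can be sketched with scikit-learn on an invented toy problem; the feature names echo the quantities discussed in the abstract, but the data-generating model below is entirely made up:

    import numpy as np
    from sklearn.ensemble import GradientBoostingRegressor

    # Toy stand-in: predict a (log) enclosed halo mass from a few scalar inputs
    # describing the initial conditions and the assembly history.
    rng = np.random.default_rng(0)
    n = 3000
    X = np.column_stack([
        rng.normal(size=n),      # IC density smoothed on ~0.7 R_L
        rng.normal(size=n),      # IC density smoothed on ~1.7 R_L
        rng.uniform(0, 1, n),    # formation time
        rng.uniform(0, 1, n),    # dynamical time proxy
    ])
    y = 1.0 * X[:, 0] + 0.3 * X[:, 1] + 0.5 * X[:, 2] + 0.05 * rng.normal(size=n)

    model = GradientBoostingRegressor(n_estimators=300, max_depth=3).fit(X, y)
    for name, imp in zip(["delta(0.7 R_L)", "delta(1.7 R_L)", "t_form", "t_dyn"],
                         model.feature_importances_):
        print(f"{name:>15s}: {imp:.2f}")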

Read this paper on arXiv…

L. Lucie-Smith, S. Adhikari and R. Wechsler
Wed, 11 May 22
23/60

Comments: 14 pages, 11 figures, comments welcome

DeepGraviLens: a Multi-Modal Architecture for Classifying Gravitational Lensing Data [IMA]

http://arxiv.org/abs/2205.00701


Gravitational lensing is the relativistic effect generated by massive bodies, which bend the space-time surrounding them. It is a deeply investigated topic in astrophysics and allows validating theoretical relativistic results and studying faint astrophysical objects that would not be visible otherwise. In recent years Machine Learning methods have been applied to support the analysis of the gravitational lensing phenomena by detecting lensing effects in data sets consisting of images associated with brightness variation time series. However, the state-of-the-art approaches either consider only images and neglect time-series data or achieve relatively low accuracy on the most difficult data sets. This paper introduces DeepGraviLens, a novel multi-modal network that classifies spatio-temporal data belonging to one non-lensed system type and three lensed system types. It surpasses the current state-of-the-art accuracy results by $\approx$ 19% to $\approx$ 43%, depending on the considered data set. Such an improvement will enable the acceleration of the analysis of lensed objects in upcoming astrophysical surveys, which will exploit the petabytes of data collected, e.g., from the Vera C. Rubin Observatory.

Read this paper on arXiv…

N. Vago and P. Fraternali
Tue, 3 May 22
56/82

Comments: N/A

Predicting Solar Flares Using CNN and LSTM on Two Solar Cycles of Active Region Data [SSA]

http://arxiv.org/abs/2204.03710


We consider the flare prediction problem that distinguishes flare-imminent active regions that produce an M- or X-class flare in the future 24 hours, from quiet active regions that do not produce any flare within $\pm 24$ hours. Using line-of-sight magnetograms and parameters of active regions in two data products covering Solar Cycle 23 and 24, we train and evaluate two deep learning algorithms — CNN and LSTM — and their stacking ensembles. The decisions of CNN are explained using visual attribution methods. We have the following three main findings. (1) LSTM trained on data from two solar cycles achieves significantly higher True Skill Scores (TSS) than that trained on data from a single solar cycle with a confidence level of at least 0.95. (2) On data from Solar Cycle 23, a stacking ensemble that combines predictions from LSTM and CNN using the TSS criterion achieves significantly higher TSS than the “select-best” strategy with a confidence level of at least 0.95. (3) A visual attribution method called Integrated Gradients is able to attribute the CNN’s predictions of flares to the emerging magnetic flux in the active region. It also reveals a limitation of CNN as a flare prediction method using line-of-sight magnetograms: it treats the polarity artifact of line-of-sight magnetograms as positive evidence of flares.

Read this paper on arXiv…

Z. Sun, M. Bobra, X. Wang, et. al.
Mon, 11 Apr 22
11/61

Comments: 31 pages, 16 figures, accepted in the ApJ

Monte Carlo Physarum Machine: Characteristics of Pattern Formation in Continuous Stochastic Transport Networks [CEA]

http://arxiv.org/abs/2204.01256


We present Monte Carlo Physarum Machine: a computational model suitable for reconstructing continuous transport networks from sparse 2D and 3D data. MCPM is a probabilistic generalization of Jones’s 2010 agent-based model for simulating the growth of Physarum polycephalum slime mold. We compare MCPM to Jones’s work on theoretical grounds, and describe a task-specific variant designed for reconstructing the large-scale distribution of gas and dark matter in the Universe known as the Cosmic web. To analyze the new model, we first explore MCPM’s self-patterning behavior, showing a wide range of continuous network-like morphologies — called “polyphorms” — that the model produces from geometrically intuitive parameters. Applying MCPM to both simulated and observational cosmological datasets, we then evaluate its ability to produce consistent 3D density maps of the Cosmic web. Finally, we examine other possible tasks where MCPM could be useful, along with several examples of fitting to domain-specific data as proofs of concept.

Read this paper on arXiv…

O. Elek, J. Burchett, J. Prochaska, et. al.
Tue, 5 Apr 22
54/83

Comments: N/A

Neural representation of a time optimal, constant acceleration rendezvous [EPA]

http://arxiv.org/abs/2203.15490


We train neural models to represent both the optimal policy (i.e. the optimal thrust direction) and the value function (i.e. the time of flight) for a time optimal, constant acceleration low-thrust rendezvous. In both cases we develop and make use of the data augmentation technique we call backward generation of optimal examples. We are thus able to produce and work with large datasets and to fully exploit the benefit of employing a deep learning framework. We achieve, in all cases, accuracies resulting in successful rendezvous (simulated following the learned policy) and time of flight predictions (using the learned value function). We find that residuals as small as a few m/s, thus well within the possibility of a spacecraft navigation $\Delta V$ budget, are achievable for the velocity at rendezvous. We also find that, on average, the absolute error to predict the optimal time of flight to rendezvous from any orbit in the asteroid belt to an Earth-like orbit is small (less than 4%) and thus also of interest for practical uses, for example, during preliminary mission design phases.

Read this paper on arXiv…

D. Izzo and S. Origer
Wed, 30 Mar 22
59/77

Comments: N/A

Applications of physics informed neural operators [CL]

http://arxiv.org/abs/2203.12634


We present an end-to-end framework to learn partial differential equations that brings together initial data production, selection of boundary conditions, and the use of physics-informed neural operators to solve partial differential equations that are ubiquitous in the study and modeling of physics phenomena. We first demonstrate that our methods reproduce the accuracy and performance of other neural operators published elsewhere in the literature to learn the 1D wave equation and the 1D Burgers equation. Thereafter, we apply our physics-informed neural operators to learn new types of equations, including the 2D Burgers equation in the scalar, inviscid and vector types. Finally, we show that our approach is also applicable to learn the physics of the 2D linear and nonlinear shallow water equations, which involve three coupled partial differential equations. We release our artificial intelligence surrogates and scientific software to produce initial data and boundary conditions to study a broad range of physically motivated scenarios. We provide the source code, an interactive website to visualize the predictions of our physics informed neural operators, and a tutorial for their use at the Data and Learning Hub for Science.

Read this paper on arXiv…

S. Rosofsky and E. Huerta
Fri, 25 Mar 22
11/46

Comments: 15 pages, 10 figures

A comparative study of non-deep learning, deep learning, and ensemble learning methods for sunspot number prediction [SSA]

http://arxiv.org/abs/2203.05757


Solar activity has significant impacts on human activities and health. One of the most commonly used measures of solar activity is the sunspot number. This paper compares three important non-deep learning models, four popular deep learning models, and their five ensemble models in forecasting sunspot numbers. Our proposed ensemble model XGBoost-DL, which uses XGBoost as a two-level nonlinear ensemble method to combine the deep learning models, achieves the best forecasting performance among all considered models and NASA’s forecast. Our XGBoost-DL forecasts a peak sunspot number of 133.47 in May 2025 for Solar Cycle 25 and 164.62 in November 2035 for Solar Cycle 26, similar to but later than NASA’s forecasts of 137.7 in October 2024 and 161.2 in December 2034.
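
A hypothetical sketch of the two-level idea: base forecasts (standing in for the deep learning models) are combined by an XGBoost regressor trained on their outputs. The toy cycle, the noise levels and the hyperparameters are invented and unrelated to the paper's tuned XGBoost-DL:

    import numpy as np
    from xgboost import XGBRegressor

    rng = np.random.default_rng(0)
    months = np.arange(1200)
    truth = 80 + 70 * np.sin(2 * np.pi * months / 132.0)      # toy ~11-year cycle
    base_forecasts = np.column_stack([
        truth + rng.normal(0, 15, months.size),   # e.g. an LSTM's forecast
        truth + rng.normal(0, 20, months.size),   # e.g. a CNN's forecast
        truth + rng.normal(0, 25, months.size),   # e.g. an ARIMA-style forecast
    ])

    split = 1000
    meta = XGBRegressor(n_estimators=200, max_depth=3, learning_rate=0.1)
    meta.fit(base_forecasts[:split], truth[:split])
    pred = meta.predict(base_forecasts[split:])
    print("RMSE of stacked forecast:", np.sqrt(np.mean((pred - truth[split:]) ** 2)))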

Read this paper on arXiv…

Y. Dang, Z. Chen, H. Li, et. al.
Mon, 14 Mar 22
11/67

Comments: N/A

Symmetry Group Equivariant Architectures for Physics [CL]

http://arxiv.org/abs/2203.06153


Physical theories grounded in mathematical symmetries are an essential component of our understanding of a wide range of properties of the universe. Similarly, in the domain of machine learning, an awareness of symmetries such as rotation or permutation invariance has driven impressive performance breakthroughs in computer vision, natural language processing, and other important applications. In this report, we argue that both the physics community and the broader machine learning community have much to understand and potentially to gain from a deeper investment in research concerning symmetry group equivariant machine learning architectures. For some applications, the introduction of symmetries into the fundamental structural design can yield models that are more economical (i.e. contain fewer, but more expressive, learned parameters), interpretable (i.e. more explainable or directly mappable to physical quantities), and/or trainable (i.e. more efficient in both data and computational requirements). We discuss various figures of merit for evaluating these models as well as some potential benefits and limitations of these methods for a variety of physics applications. Research and investment into these approaches will lay the foundation for future architectures that are potentially more robust under new computational paradigms and will provide a richer description of the physical systems to which they are applied.

Read this paper on arXiv…

A. Bogatskiy, S. Ganguly, T. Kipf, et. al.
Mon, 14 Mar 22
31/67

Comments: Contribution to Snowmass 2021

Interpreting a Machine Learning Model for Detecting Gravitational Waves [CL]

http://arxiv.org/abs/2202.07399


We describe a case study of translational research, applying interpretability techniques developed for computer vision to machine learning models used to search for and find gravitational waves. The models we study are trained to detect black hole merger events in non-Gaussian and non-stationary advanced Laser Interferometer Gravitational-wave Observatory (LIGO) data. We produced visualizations of the response of machine learning models when they process advanced LIGO data that contains real gravitational wave signals, noise anomalies, and pure advanced LIGO noise. Our findings shed light on the responses of individual neurons in these machine learning models. Further analysis suggests that different parts of the network appear to specialize in local versus global features, and that this difference appears to be rooted in the branched architecture of the network as well as noise characteristics of the LIGO detectors. We believe efforts to whiten these “black box” models can suggest future avenues for research and help inform the design of interpretable machine learning models for gravitational wave astrophysics.

Read this paper on arXiv…

M. Safarzadeh, A. Khan, E. Huerta, et. al.
Wed, 16 Feb 22
55/69

Comments: 19 pages, to be submitted, comments are welcome. Movies based on this work can be accessed via: this https URL this https URL

Inference-optimized AI and high performance computing for gravitational wave detection at scale [CL]

http://arxiv.org/abs/2201.11133


We introduce an ensemble of artificial intelligence models for gravitational wave detection that we trained in the Summit supercomputer using 32 nodes, equivalent to 192 NVIDIA V100 GPUs, within 2 hours. Once fully trained, we optimized these models for accelerated inference using NVIDIA TensorRT. We deployed our inference-optimized AI ensemble in the ThetaGPU supercomputer at Argonne Leadership Computing Facility to conduct distributed inference. Using the entire ThetaGPU supercomputer, consisting of 20 nodes, each of which has 8 NVIDIA A100 Tensor Core GPUs and 2 AMD Rome CPUs, our NVIDIA TensorRT-optimized AI ensemble processed an entire month of advanced LIGO data (including Hanford and Livingston data streams) within 50 seconds. Our inference-optimized AI ensemble retains the same sensitivity as traditional AI models, namely, it identifies all known binary black hole mergers previously identified in this advanced LIGO dataset and reports no misclassifications, while also providing a 3X inference speedup compared to traditional artificial intelligence models. We used time slides to quantify the performance of our AI ensemble to process up to 5 years’ worth of advanced LIGO data. In this synthetically enhanced dataset, our AI ensemble reports an average of one misclassification for every month of searched advanced LIGO data. We also present the receiver operating characteristic curve of our AI ensemble using this 5-year-long advanced LIGO dataset. This approach provides the required tools to conduct accelerated, AI-driven gravitational wave detection at scale.

Read this paper on arXiv…

P. Chaturvedi, A. Khan, M. Tian, et. al.
Fri, 28 Jan 22
3/64

Comments: 19 pages, 8 figure

The CAMELS project: public data release [CEA]

http://arxiv.org/abs/2201.01300


The Cosmology and Astrophysics with MachinE Learning Simulations (CAMELS) project was developed to combine cosmology with astrophysics through thousands of cosmological hydrodynamic simulations and machine learning. CAMELS contains 4,233 cosmological simulations, 2,049 N-body and 2,184 state-of-the-art hydrodynamic simulations that sample a vast volume in parameter space. In this paper we present the CAMELS public data release, describing the characteristics of the CAMELS simulations and a variety of data products generated from them, including halo, subhalo, galaxy, and void catalogues, power spectra, bispectra, Lyman-$\alpha$ spectra, probability distribution functions, halo radial profiles, and X-rays photon lists. We also release over one thousand catalogues that contain billions of galaxies from CAMELS-SAM: a large collection of N-body simulations that have been combined with the Santa Cruz Semi-Analytic Model. We release all the data, comprising more than 350 terabytes and containing 143,922 snapshots, millions of halos, galaxies and summary statistics. We provide further technical details on how to access, download, read, and process the data at \url{https://camels.readthedocs.io}.

Read this paper on arXiv…

F. Villaescusa-Navarro, S. Genel, D. Anglés-Alcázar, et. al.
Thu, 6 Jan 22
8/56

Comments: 18 pages, 3 figures. More than 350 Tb of data from thousands of simulations publicly available at this https URL

Augmenting astrophysical scaling relations with machine learning : application to reducing the SZ flux-mass scatter [CEA]

http://arxiv.org/abs/2201.01305


Complex systems (stars, supernovae, galaxies, and clusters) often exhibit low scatter relations between observable properties (e.g., luminosity, velocity dispersion, oscillation period, temperature). These scaling relations can illuminate the underlying physics and can provide observational tools for estimating masses and distances. Machine learning can provide a systematic way to search for new scaling relations (or for simple extensions to existing relations) in abstract high-dimensional parameter spaces. We use a machine learning tool called symbolic regression (SR), which models the patterns in a given dataset in the form of analytic equations. We focus on the Sunyaev-Zeldovich flux$-$cluster mass relation ($Y_\mathrm{SZ}-M$), the scatter in which affects inference of cosmological parameters from cluster abundance data. Using SR on the data from the IllustrisTNG hydrodynamical simulation, we find a new proxy for cluster mass which combines $Y_\mathrm{SZ}$ and concentration of ionized gas ($c_\mathrm{gas}$): $M \propto Y_\mathrm{conc}^{3/5} \equiv Y_\mathrm{SZ}^{3/5} (1-A\, c_\mathrm{gas})$. $Y_\mathrm{conc}$ reduces the scatter in the predicted $M$ by $\sim 20-30$% for large clusters ($M\gtrsim 10^{14}\, h^{-1} \, M_\odot$) at both high and low redshifts, as compared to using just $Y_\mathrm{SZ}$. We show that the dependence on $c_\mathrm{gas}$ is linked to cores of clusters exhibiting larger scatter than their outskirts. Finally, we test $Y_\mathrm{conc}$ on clusters from simulations of the CAMELS project and show that $Y_\mathrm{conc}$ is robust against variations in cosmology, astrophysics, subgrid physics, and cosmic variance. Our results and methodology can be useful for accurate multiwavelength cluster mass estimation from current and upcoming CMB and X-ray surveys like ACT, SO, SPT, eROSITA and CMB-S4.
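
The proxy quoted in the abstract, $M \propto Y_\mathrm{SZ}^{3/5} (1-A\, c_\mathrm{gas})$, can be exercised on a toy catalogue to see how the concentration term removes correlated scatter; the value of A, the scatter model and the normalisation below are made up, not the fitted values from IllustrisTNG:

    import numpy as np

    rng = np.random.default_rng(0)
    a_true = 0.5
    log_m = rng.uniform(14.0, 15.0, 5000)        # log10 true cluster mass
    c_gas = rng.uniform(0.0, 0.8, 5000)          # gas concentration
    noise = 0.02 * rng.normal(size=5000)         # intrinsic log-scatter
    # Build a fake Y_SZ whose deviation from the self-similar Y-M relation
    # correlates with c_gas, so that the concentration term can remove it.
    log_y = (5.0 / 3.0) * (log_m - np.log10(1.0 - a_true * c_gas) + noise)

    log_proxy_plain = 0.6 * log_y                                   # Y_SZ^(3/5) alone
    log_proxy_conc = 0.6 * log_y + np.log10(1.0 - a_true * c_gas)   # Y_conc
    for name, proxy in [("Y_SZ^(3/5)", log_proxy_plain), ("Y_conc", log_proxy_conc)]:
        resid = log_m - proxy
        resid -= resid.mean()
        print(f"{name:10s} scatter in log10(M): {resid.std():.3f} dex")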

Read this paper on arXiv…

D. Wadekar, L. Thiele, F. Villaescusa-Navarro, et. al.
Thu, 6 Jan 22
18/56

Comments: 11+6 pages, 8+3 figures. The code and data associated with this paper will be uploaded upon the acceptance of this paper

Detection of extragalactic Ultra-Compact Dwarfs and Globular Clusters using Explainable AI techniques [GA]

http://arxiv.org/abs/2201.01604


Compact stellar systems such as Ultra-compact dwarfs (UCDs) and Globular Clusters (GCs) around galaxies are known to be tracers of the merger events that have been forming these galaxies. Therefore, identifying such systems allows us to study galaxies’ mass assembly, formation and evolution. However, in the absence of spectroscopic information, detecting UCDs/GCs using imaging data is very uncertain. Here, we aim to train a machine learning model to separate these objects from the foreground stars and background galaxies using the multi-wavelength imaging data of the Fornax galaxy cluster in 6 filters, namely u, g, r, i, J and Ks. The classes of objects are highly imbalanced, which is problematic for many automatic classification techniques. Hence, we employ Synthetic Minority Over-sampling to handle the imbalance of the training data. Then, we compare two classifiers, namely Localized Generalized Matrix Learning Vector Quantization (LGMLVQ) and Random Forest (RF). Both methods are able to identify UCDs/GCs with a precision and a recall of >93 percent and provide relevances that reflect the importance of each feature dimension (colors and angular sizes) for the classification. Both methods detect angular sizes as important markers for this classification problem. While the astronomical expectation is that the u-i and i-Ks color indices are the most important colors, our analysis shows that colors such as g-r are more informative, potentially because of a higher signal-to-noise ratio. Besides the excellent performance, the LGMLVQ method allows further interpretability by providing the feature importance for each individual class, class-wise representative samples and the possibility for non-linear visualization of the data, as demonstrated in this contribution. We conclude that employing machine learning techniques to identify UCDs/GCs can lead to promising results.
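
The imbalance-handling step can be sketched with imbalanced-learn and scikit-learn: SMOTE oversamples the rare UCD/GC-like class before a random forest is trained. The six toy features merely stand in for the u, g, r, i, J, Ks measurements, and the class ratio is invented:

    import numpy as np
    from imblearn.over_sampling import SMOTE
    from sklearn.ensemble import RandomForestClassifier
    from sklearn.metrics import classification_report
    from sklearn.model_selection import train_test_split

    rng = np.random.default_rng(0)
    n_major, n_minor = 5000, 150
    X = np.vstack([rng.normal(0.0, 1.0, (n_major, 6)),
                   rng.normal(1.5, 1.0, (n_minor, 6))])
    y = np.r_[np.zeros(n_major), np.ones(n_minor)]          # 1 = UCD/GC-like

    X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=0)
    X_bal, y_bal = SMOTE(random_state=0).fit_resample(X_tr, y_tr)  # oversample minority

    clf = RandomForestClassifier(n_estimators=300, random_state=0).fit(X_bal, y_bal)
    print(classification_report(y_te, clf.predict(X_te), digits=3))
    print("feature importances:", np.round(clf.feature_importances_, 3))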

Read this paper on arXiv…

M. Mohammadi, J. Mutatiina, T. Saifollahi, et. al.
Thu, 6 Jan 22
29/56

Comments: N/A

Unsupervised Domain Adaptation for Constraining Star Formation Histories [GA]

http://arxiv.org/abs/2112.14072


The prevalent paradigm of machine learning today is to use past observations to predict future ones. What if, however, we are interested in knowing the past given the present? This situation is indeed one that astronomers must contend with often. To understand the formation of our universe, we must derive the time evolution of the visible mass content of galaxies. However, to observe a complete star life, one would need to wait for one billion years! To overcome this difficulty, astrophysicists leverage supercomputers and evolve simulated models of galaxies till the current age of the universe, thus establishing a mapping between observed radiation and star formation histories (SFHs). Such ground-truth SFHs are lacking for actual galaxy observations, where they are usually inferred — with often poor confidence — from spectral energy distributions (SEDs) using Bayesian fitting methods. In this investigation, we discuss the ability of unsupervised domain adaptation to derive accurate SFHs for galaxies with simulated data as a necessary first step in developing a technique that can ultimately be applied to observational data.

Read this paper on arXiv…

S. Gilda, A. Mathelin, S. Bellstedt, et. al.
Thu, 30 Dec 21
48/71

Comments: Accepted for oral presentation at the 1st Annual AAAI Workshop on AI to Accelerate Science and Engineering (AI2ASE). Journal article to follow

DeepAdversaries: Examining the Robustness of Deep Learning Models for Galaxy Morphology Classification [CL]

http://arxiv.org/abs/2112.14299


Data processing and analysis pipelines in cosmological survey experiments introduce data perturbations that can significantly degrade the performance of deep learning-based models. Given the increased adoption of supervised deep learning methods for processing and analysis of cosmological survey data, the assessment of data perturbation effects and the development of methods that increase model robustness are increasingly important. In the context of morphological classification of galaxies, we study the effects of perturbations in imaging data. In particular, we examine the consequences of using neural networks when training on baseline data and testing on perturbed data. We consider perturbations associated with two primary sources: 1) increased observational noise as represented by higher levels of Poisson noise and 2) data processing noise incurred by steps such as image compression or telescope errors as represented by one-pixel adversarial attacks. We also test the efficacy of domain adaptation techniques in mitigating the perturbation-driven errors. We use classification accuracy, latent space visualizations, and latent space distance to assess model robustness. Without domain adaptation, we find that processing pixel-level errors easily flip the classification into an incorrect class and that higher observational noise makes the model trained on low-noise data unable to classify galaxy morphologies. On the other hand, we show that training with domain adaptation improves model robustness and mitigates the effects of these perturbations, improving the classification accuracy by 23% on data with higher observational noise. Domain adaptation also increases by a factor of ~2.3 the latent space distance between the baseline and the incorrectly classified one-pixel perturbed image, making the model more robust to inadvertent perturbations.
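
A hedged sketch of the two perturbation types studied above (higher Poisson noise and a one-pixel change), applied to a toy galaxy image with NumPy; the noise scaling and the attacked pixel value are illustrative assumptions:

import numpy as np

rng = np.random.default_rng(1)
image = rng.random((64, 64)) * 100.0        # stand-in for a calibrated galaxy cutout

def add_poisson_noise(img, exposure_scale=1.0):
    """Resample counts from a Poisson law; smaller exposure_scale -> noisier image."""
    return rng.poisson(img * exposure_scale) / exposure_scale

def one_pixel_attack(img, row, col, value):
    """Flip a single pixel to an extreme value, mimicking compression or telescope errors."""
    perturbed = img.copy()
    perturbed[row, col] = value
    return perturbed

noisy = add_poisson_noise(image, exposure_scale=0.1)        # "higher observational noise"
attacked = one_pixel_attack(image, 32, 32, image.max() * 10)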

Read this paper on arXiv…

A. Ćiprijanović, D. Kafkes, G. Snyder, et. al.
Thu, 30 Dec 21
53/71

Comments: 19 pages, 7 figures, 5 tables, submitted to Astronomy & Computing

AI and extreme scale computing to learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers [IMA]

http://arxiv.org/abs/2112.07669


We use artificial intelligence (AI) to learn and infer the physics of higher order gravitational wave modes of quasi-circular, spinning, non-precessing binary black hole mergers. We trained AI models using 14 million waveforms, produced with the surrogate model NRHybSur3dq8, that include modes up to $\ell \leq 4$ and $(5,5)$, except for $(4,0)$ and $(4,1)$, that describe binaries with mass-ratios $q\leq8$ and individual spins $s^z_{{1,2}}\in[-0.8, 0.8]$. We use our AI models to obtain deterministic and probabilistic estimates of the mass-ratio, individual spins, effective spin, and inclination angle of numerical relativity waveforms that describe this signal manifold. Our studies indicate that AI provides informative estimates for these physical parameters. This work marks the first time AI is capable of characterizing this high-dimensional signal manifold. Our AI models were trained within 3.4 hours using distributed training on 256 nodes (1,536 NVIDIA V100 GPUs) on the Summit supercomputer.

Read this paper on arXiv…

A. Khan and E. Huerta
Thu, 16 Dec 21
44/83

Comments: 21 pages, 12 figures

Weighing the Milky Way and Andromeda with Artificial Intelligence [GA]

http://arxiv.org/abs/2111.14874


We present new constraints on the masses of the halos hosting the Milky Way and Andromeda galaxies derived using graph neural networks. Our models, trained on thousands of state-of-the-art hydrodynamic simulations of the CAMELS project, only make use of the positions, velocities and stellar masses of the galaxies belonging to the halos, and are able to perform likelihood-free inference on halo masses while accounting for both cosmological and astrophysical uncertainties. Our constraints are in agreement with estimates from other traditional methods.
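
A minimal sketch of the idea above: a permutation-invariant graph network that maps galaxy positions, velocities and stellar masses to a halo-mass estimate. Plain PyTorch message passing is used as a stand-in for the paper's actual architecture; all layer sizes and the fully connected toy graph are illustrative assumptions.

import torch
import torch.nn as nn

class SimpleHaloGNN(nn.Module):
    def __init__(self, n_feat=7, hidden=64):
        super().__init__()
        self.edge_mlp = nn.Sequential(nn.Linear(2 * n_feat, hidden), nn.ReLU(),
                                      nn.Linear(hidden, hidden))
        self.node_mlp = nn.Sequential(nn.Linear(n_feat + hidden, hidden), nn.ReLU())
        self.readout = nn.Linear(hidden, 1)   # predicted (log) halo mass

    def forward(self, x, edge_index):
        # x: (n_galaxies, n_feat); edge_index: (2, n_edges) source/target indices
        src, dst = edge_index
        messages = self.edge_mlp(torch.cat([x[src], x[dst]], dim=-1))
        agg = torch.zeros(x.size(0), messages.size(-1)).index_add_(0, dst, messages)
        h = self.node_mlp(torch.cat([x, agg], dim=-1))
        return self.readout(h.sum(dim=0))     # sum-pool over galaxies -> graph-level output

# Toy usage: 10 galaxies, fully connected graph (self-edges excluded).
x = torch.randn(10, 7)
idx = torch.tensor([(i, j) for i in range(10) for j in range(10) if i != j]).T
print(SimpleHaloGNN()(x, idx))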

Read this paper on arXiv…

P. Villanueva-Domingo, F. Villaescusa-Navarro, S. Genel, et. al.
Wed, 1 Dec 21
107/110

Comments: 2 figures, 2 tables, 7 pages. Code publicly available at this https URL

Automatically detecting anomalous exoplanet transits [CL]

http://arxiv.org/abs/2111.08679


Raw light curve data from exoplanet transits is too complex to naively apply traditional outlier detection methods. We propose an architecture which estimates a latent representation of both the main transit and residual deviations with a pair of variational autoencoders. We show, using two fabricated datasets, that our latent representations of anomalous transit residuals are significantly more amenable to outlier detection than raw data or the latent representation of a traditional variational autoencoder. We then apply our method to real exoplanet transit data. Our study is the first which automatically identifies anomalous exoplanet transit light curves. We additionally release three first-of-their-kind datasets to enable further research.

Read this paper on arXiv…

C. Hönes, B. Miller, A. Heras, et. al.
Wed, 17 Nov 21
4/64

Comments: 12 pages, 4 figures, 4 tables, Accepted at NeurIPS 2021 (Workshop for Machine Learning and the Physical Sciences)

Interpretable AI forecasting for numerical relativity waveforms of quasi-circular, spinning, non-precessing binary black hole mergers [CL]

http://arxiv.org/abs/2110.06968


We present a deep-learning artificial intelligence model that is capable of learning and forecasting the late-inspiral, merger and ringdown of numerical relativity waveforms that describe quasi-circular, spinning, non-precessing binary black hole mergers. We used the NRHybSur3dq8 surrogate model to produce train, validation and test sets of $\ell=|m|=2$ waveforms that cover the parameter space of binary black hole mergers with mass-ratios $q\leq8$ and individual spins $|s^z_{{1,2}}| \leq 0.8$. These waveforms cover the time range $t\in[-5000\textrm{M}, 130\textrm{M}]$, where $t=0M$ marks the merger event, defined as the maximum value of the waveform amplitude. We harnessed the ThetaGPU supercomputer at the Argonne Leadership Computing Facility to train our AI model using a training set of 1.5 million waveforms. We used 16 NVIDIA DGX A100 nodes, each consisting of 8 NVIDIA A100 Tensor Core GPUs and 2 AMD Rome CPUs, to fully train our model within 3.5 hours. Our findings show that artificial intelligence can accurately forecast the dynamical evolution of numerical relativity waveforms in the time range $t\in[-100\textrm{M}, 130\textrm{M}]$. Sampling a test set of 190,000 waveforms, we find that the average overlap between target and predicted waveforms is $\gtrsim99\%$ over the entire parameter space under consideration. We also combined scientific visualization and accelerated computing to identify what components of our model take in knowledge from the early and late-time waveform evolution to accurately forecast the latter part of numerical relativity waveforms. This work aims to accelerate the creation of scalable, computationally efficient and interpretable artificial intelligence models for gravitational wave astrophysics.

Read this paper on arXiv…

A. Khan, E. Huerta and H. Zheng
Fri, 15 Oct 21
32/56

Comments: 17 pages, 7 figures, 1 appendix

Machine Learning for Discovering Effective Interaction Kernels between Celestial Bodies from Ephemerides [EPA]

http://arxiv.org/abs/2108.11894


Building accurate and predictive models of the underlying mechanisms of celestial motion has inspired fundamental developments in theoretical physics. Candidate theories seek to explain observations and predict future positions of planets, stars, and other astronomical bodies as faithfully as possible. We use a data-driven learning approach, extending that developed in Lu et al. (2019) and extended in Zhong et al. (2020), to derive a stable and accurate model for the motion of celestial bodies in our Solar System. Our model is based on a collective dynamics framework, and is learned from the NASA Jet Propulsion Lab’s development ephemerides. By modeling the major astronomical bodies in the Solar System as pairwise interacting agents, our learned model generates extremely accurate dynamics that preserve not only intrinsic geometric properties of the orbits, but also highly sensitive features of the dynamics, such as perihelion precession rates. Our learned model can provide a unified explanation of the observation data, especially in terms of reproducing the perihelion precession of Mars, Mercury, and the Moon. Moreover, our model outperforms Newton’s Law of Universal Gravitation in all cases and performs similarly to, and on the Moon exceeds, the Einstein-Infeld-Hoffman equations derived from Einstein’s theory of general relativity.

Read this paper on arXiv…

M. Zhong, J. Miller and M. Maggioni
Fri, 27 Aug 21
52/67

Comments: N/A

Uncertainty-Aware Learning for Improvements in Image Quality of the Canada-France-Hawaii Telescope [IMA]

http://arxiv.org/abs/2107.00048


We leverage state-of-the-art machine learning methods and a decade’s worth of archival data from the Canada-France-Hawaii Telescope (CFHT) to predict observatory image quality (IQ) from environmental conditions and observatory operating parameters. Specifically, we develop accurate and interpretable models of the complex dependence between data features and observed IQ for CFHT’s wide field camera, MegaCam. Our contributions are several-fold. First, we collect, collate and reprocess several disparate data sets gathered by CFHT scientists. Second, we predict probability distribution functions (PDFs) of IQ, and achieve a mean absolute error of $\sim0.07”$ for the predicted medians. Third, we explore data-driven actuation of the 12 dome “vents”, installed in 2013-14 to accelerate the flushing of hot air from the dome. We leverage epistemic and aleatoric uncertainties in conjunction with probabilistic generative modeling to identify candidate vent adjustments that are in-distribution (ID) and, for the optimal configuration for each ID sample, we predict the reduction in required observing time to achieve a fixed SNR. On average, the reduction is $\sim15\%$. Finally, we rank sensor data features by Shapley values to identify the most predictive variables for each observation. Our long-term goal is to construct reliable and real-time models that can forecast optimal observatory operating parameters for optimization of IQ. Such forecasts can then be fed into scheduling protocols and predictive maintenance routines. We anticipate that such approaches will become standard in automating observatory operations and maintenance by the time CFHT’s successor, the Maunakea Spectroscopic Explorer (MSE), is installed in the next decade.

Read this paper on arXiv…

S. Gilda, S. Draper, S. Fabbro, et. al.
Fri, 2 Jul 21
62/67

Comments: 25 pages, 1 appendix, 12 figures. To be submitted to MNRAS. Comments and feedback welcome

Fast, high-fidelity Lyman $α$ forests with convolutional neural networks [CEA]

http://arxiv.org/abs/2106.12662


Full-physics cosmological simulations are powerful tools for studying the formation and evolution of structure in the universe but require extreme computational resources. Here, we train a convolutional neural network to use a cheaper N-body-only simulation to reconstruct the baryon hydrodynamic variables (density, temperature, and velocity) on scales relevant to the Lyman-$\alpha$ (Ly$\alpha$) forest, using data from Nyx simulations. We show that our method enables rapid estimation of these fields at a resolution of $\sim$20kpc, and captures the statistics of the Ly$\alpha$ forest with much greater accuracy than existing approximations. Because our model is fully-convolutional, we can train on smaller simulation boxes and deploy on much larger ones, enabling substantial computational savings. Furthermore, as our method produces an approximation for the hydrodynamic fields instead of Ly$\alpha$ flux directly, it is not limited to a particular choice of ionizing background or mean transmitted flux.
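
A minimal, hedged sketch of the fully-convolutional idea described above: a small 3D CNN that maps an N-body density cube to three baryon fields (density, temperature, velocity). Layer counts and channel widths are illustrative assumptions, not the architecture used in the paper.

import torch
import torch.nn as nn

class BaryonPainter(nn.Module):
    def __init__(self, hidden=16):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv3d(1, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, hidden, kernel_size=3, padding=1), nn.ReLU(),
            nn.Conv3d(hidden, 3, kernel_size=3, padding=1),   # rho_b, T, v_z output channels
        )

    def forward(self, nbody_density):
        return self.net(nbody_density)

# Because the model is fully convolutional, it can be trained on small boxes
# and applied to much larger volumes at inference time.
model = BaryonPainter()
small_box = torch.randn(1, 1, 32, 32, 32)   # training-size crop
large_box = torch.randn(1, 1, 96, 96, 96)   # deployment-size volume
print(model(small_box).shape, model(large_box).shape)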

Read this paper on arXiv…

P. Harrington, M. Mustafa, M. Dornfest, et. al.
Fri, 25 Jun 21
31/62

Comments: 10 pages, 6 figures

Image simulation for space applications with the SurRender software [EPA]

http://arxiv.org/abs/2106.11322


Image Processing algorithms for vision-based navigation require reliable image simulation capacities. In this paper we explain why traditional rendering engines may present limitations that are potentially critical for space applications. We introduce Airbus SurRender software v7 and provide details on features that make it a very powerful space image simulator. We show how SurRender is at the heart of the development processes of our computer vision solutions and we provide a series of illustrations of rendered images for various use cases ranging from Moon and Solar System exploration, to in-orbit rendezvous and planetary robotics.

Read this paper on arXiv…

J. Lebreton, R. Brochard, M. Baudry, et. al.
Wed, 23 Jun 21
19/48

Comments: 11th International ESA Conference on Guidance, Navigation & Control Systems, 22 – 25 June 2021 16 pages, 8 figures

Unsupervised Resource Allocation with Graph Neural Networks [CL]

http://arxiv.org/abs/2106.09761


We present an approach for maximizing a global utility function by learning how to allocate resources in an unsupervised way. We expect interactions between allocation targets to be important and therefore propose to learn the reward structure for near-optimal allocation policies with a GNN. By relaxing the resource constraint, we can employ gradient-based optimization in contrast to more standard evolutionary algorithms. Our algorithm is motivated by a problem in modern astronomy, where one needs to select, based on limited initial information, among $10^9$ galaxies those whose detailed measurement will lead to optimal inference of the composition of the universe. Our technique presents a way of flexibly learning an allocation strategy by only requiring forward simulators for the physics of interest and the measurement process. We anticipate that our technique will also find applications in a range of resource allocation problems.

Read this paper on arXiv…

M. Cranmer, P. Melchior and B. Nord
Mon, 21 Jun 21
29/54

Comments: Accepted to PMLR/contributed oral at NeurIPS 2020 Pre-registration Workshop. Code at this https URL

Using Convolutional Neural Networks for the Helicity Classification of Magnetic Fields [HEAP]

http://arxiv.org/abs/2106.06718


The presence of non-zero helicity in intergalactic magnetic fields is a smoking gun for their primordial origin since they have to be generated by processes that break CP invariance. As an experimental signature for the presence of helical magnetic fields, an estimator $Q$ based on the triple scalar product of the wave-vectors of photons generated in electromagnetic cascades from, e.g., TeV blazars, has been suggested previously. We propose to apply deep learning to helicity classification employing Convolutional Neural Networks and show that this method outperforms the $Q$ estimator.
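
A schematic sketch of a $Q$-type statistic of the kind mentioned above: the triple scalar product of three photon arrival directions (wave-vectors), averaged over many triplets. The energy-ordering convention and normalisation are assumptions made purely for illustration.

import numpy as np

rng = np.random.default_rng(2)

def triple_product_estimator(k_low, k_mid, k_high):
    """Mean of k_low . (k_mid x k_high) over matched photon triplets."""
    return np.mean(np.einsum('ij,ij->i', k_low, np.cross(k_mid, k_high)))

def random_unit_vectors(n):
    """Toy isotropic unit wave-vectors (one per photon)."""
    v = rng.normal(size=(n, 3))
    return v / np.linalg.norm(v, axis=1, keepdims=True)

Q = triple_product_estimator(*(random_unit_vectors(10_000) for _ in range(3)))
print(f"Q = {Q:+.4f}  (consistent with zero for this non-helical toy data)")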

Read this paper on arXiv…

N. Vago, I. Hameed and M. Kachelriess
Tue, 15 Jun 21
1/67

Comments: 14 pages, extended version of a contribution to the proceedings of the 37.th ICRC 2021

The effect of phased recurrent units in the classification of multiple catalogs of astronomical lightcurves [IMA]

http://arxiv.org/abs/2106.03736


In the new era of very large telescopes, where data is crucial to expand scientific knowledge, we have witnessed many deep learning applications for the automatic classification of lightcurves. Recurrent neural networks (RNNs) are one of the models used for these applications, and the LSTM unit stands out for being an excellent choice for the representation of long time series. In general, RNNs assume observations at discrete times, which may not suit the irregular sampling of lightcurves. A traditional technique to address irregular sequences consists of adding the sampling time to the network’s input, but this is not guaranteed to capture sampling irregularities during training. Alternatively, the Phased LSTM unit has been created to address this problem by updating its state using the sampling times explicitly. In this work, we study the effectiveness of the LSTM and Phased LSTM based architectures for the classification of astronomical lightcurves. We use seven catalogs containing periodic and nonperiodic astronomical objects. Our findings show that LSTM outperformed PLSTM on 6/7 datasets. However, the combination of both units enhances the results in all datasets.
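
A hedged sketch of the "traditional technique" mentioned above for irregular sampling: concatenating the time increments to the photometric measurements before feeding an LSTM. Dimensions and the single-band light curves are illustrative assumptions.

import torch
import torch.nn as nn

class DeltaTimeLSTM(nn.Module):
    def __init__(self, n_bands=1, hidden=64, n_classes=5):
        super().__init__()
        self.lstm = nn.LSTM(input_size=n_bands + 1, hidden_size=hidden, batch_first=True)
        self.head = nn.Linear(hidden, n_classes)

    def forward(self, magnitudes, times):
        # magnitudes: (batch, length, n_bands); times: (batch, length), sorted per object
        dt = torch.diff(times, dim=1, prepend=times[:, :1])   # sampling gaps
        x = torch.cat([magnitudes, dt.unsqueeze(-1)], dim=-1)
        out, _ = self.lstm(x)
        return self.head(out[:, -1])                          # classify from the last hidden state

mags = torch.randn(8, 200, 1)                                 # toy light curves
times = torch.sort(torch.rand(8, 200), dim=1).values
print(DeltaTimeLSTM()(mags, times).shape)                     # -> (8, 5) class logits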

Read this paper on arXiv…

C. Donoso-Oliva, G. Cabrera-Vives, P. Protopapas, et. al.
Tue, 8 Jun 21
82/86

Comments: N/A

Geodesy of irregular small bodies via neural density fields: geodesyNets [EPA]

http://arxiv.org/abs/2105.13031


We present a novel approach based on artificial neural networks, so-called geodesyNets, and present compelling evidence of their ability to serve as accurate geodetic models of highly irregular bodies using minimal prior information on the body. The approach does not rely on the body shape information but, if available, can harness it. GeodesyNets learn a three-dimensional, differentiable, function representing the body density, which we call neural density field. The body shape, as well as other geodetic properties, can easily be recovered. We investigate six different shapes including the bodies 101955 Bennu, 67P Churyumov-Gerasimenko, 433 Eros and 25143 Itokawa for which shape models developed during close proximity surveys are available. Both heterogeneous and homogeneous mass distributions are considered. The gravitational acceleration computed from the trained geodesyNets models, as well as the inferred body shape, show great accuracy in all cases with a relative error on the predicted acceleration smaller than 1\% even close to the asteroid surface. When the body shape information is available, geodesyNets can seamlessly exploit it and be trained to represent a high-fidelity neural density field able to give insights into the internal structure of the body. This work introduces a new unexplored approach to geodesy, adding a powerful tool to consolidated ones based on spherical harmonics, mascon models and polyhedral gravity.

Read this paper on arXiv…

D. Izzo and P. Gómez
Fri, 28 May 21
37/56

Comments: N/A

Informative Bayesian model selection for RR Lyrae star classifiers [IMA]

http://arxiv.org/abs/2105.11531


Machine learning has achieved an important role in the automatic classification of variable stars, and several classifiers have been proposed over the last decade. These classifiers have achieved impressive performance in several astronomical catalogues. However, some scientific articles have also shown that the training data therein contain multiple sources of bias. Hence, the performance of those classifiers on objects not belonging to the training data is uncertain, potentially resulting in the selection of incorrect models. Besides, it gives rise to the deployment of misleading classifiers. An example of the latter is the creation of open-source labelled catalogues with biased predictions. In this paper, we develop a method based on an informative marginal likelihood to evaluate variable star classifiers. We collect deterministic rules that are based on physical descriptors of RR Lyrae stars, and then, to mitigate the biases, we introduce those rules into the marginal likelihood estimation. We perform experiments with a set of Bayesian Logistic Regressions, which are trained to classify RR Lyraes, and we found that our method outperforms traditional non-informative cross-validation strategies, even when penalized models are assessed. Our methodology provides a more rigorous alternative to assess machine learning models using astronomical knowledge. From this approach, applications to other classes of variable stars and algorithmic improvements can be developed.

Read this paper on arXiv…

F. Pérez-Galarce, K. Pichara, P. Huijse, et. al.
Wed, 26 May 21
8/66

Comments: N/A

Advances in Machine and Deep Learning for Modeling and Real-time Detection of Multi-Messenger Sources [IMA]

http://arxiv.org/abs/2105.06479


We live in momentous times. The science community is empowered with an arsenal of cosmic messengers to study the Universe in unprecedented detail. Gravitational waves, electromagnetic waves, neutrinos and cosmic rays cover a wide range of wavelengths and time scales. Combining and processing these datasets that vary in volume, speed and dimensionality requires new modes of instrument coordination, funding and international collaboration with a specialized human and technological infrastructure. In tandem with the advent of large-scale scientific facilities, the last decade has experienced an unprecedented transformation in computing and signal processing algorithms. The combination of graphics processing units, deep learning, and the availability of open source, high-quality datasets, have powered the rise of artificial intelligence. This digital revolution now powers a multi-billion dollar industry, with far-reaching implications in technology and society. In this chapter we describe pioneering efforts to adapt artificial intelligence algorithms to address computational grand challenges in Multi-Messenger Astrophysics. We review the rapid evolution of these disruptive algorithms, from the first class of algorithms introduced in early 2017, to the sophisticated algorithms that now incorporate domain expertise in their architectural design and optimization schemes. We discuss the importance of scientific visualization and extreme-scale computing in reducing time-to-insight and obtaining new knowledge from the interplay between models and data.

Read this paper on arXiv…

E. Huerta and Z. Zhao
Mon, 17 May 21
32/55

Comments: 30 pages, 11 figures. Invited chapter to be published in “Handbook of Gravitational Wave Astronomy”

MeerCRAB: MeerLICHT Classification of Real and Bogus Transients using Deep Learning [IMA]

http://arxiv.org/abs/2104.13950


Astronomers require efficient automated detection and classification pipelines when conducting large-scale surveys of the (optical) sky for variable and transient sources. Such pipelines are fundamentally important, as they permit rapid follow-up and analysis of those detections most likely to be of scientific value. We therefore present a deep learning pipeline based on the convolutional neural network architecture called $\texttt{MeerCRAB}$. It is designed to filter out the so-called ‘bogus’ detections from true astrophysical sources in the transient detection pipeline of the MeerLICHT telescope. Optical candidates are described using a variety of 2D images and numerical features extracted from those images. The relationship between the input images and the target classes is unclear, since the ground truth is poorly defined and often the subject of debate. This makes it difficult to determine which source of information should be used to train a classification algorithm. We therefore used two methods for labelling our data: (i) thresholding and (ii) latent class model approaches. We deployed variants of $\texttt{MeerCRAB}$ that employed different network architectures trained using different combinations of input images and training set choices, based on classification labels provided by volunteers. The deepest network worked best with an accuracy of 99.5$\%$ and Matthews correlation coefficient (MCC) value of 0.989. The best model was integrated into the MeerLICHT transient vetting pipeline, enabling the accurate and efficient classification of detected transients that allows researchers to select the most promising candidates for their research goals.

Read this paper on arXiv…

Z. Hosenie, S. Bloemen, P. Groot, et. al.
Fri, 30 Apr 21
31/59

Comments: 15 pages, 13 figures, Accepted for publication in Experimental Astronomy and appeared in the 3rd Workshop on Machine Learning and the Physical Sciences, NeurIPS 2020

Using Artificial Intelligence to Shed Light on the Star of Biscuits: The Jaffa Cake [IMA]

http://arxiv.org/abs/2103.16575


Before Brexit, one of the greatest causes of arguments amongst British families was the question of the nature of Jaffa Cakes. Some argue that their size and host environment (the biscuit aisle) should make them a biscuit in their own right. Others consider that their physical properties (e.g. they harden rather than soften on becoming stale) suggest that they are in fact cake. In order to finally put this debate to rest, we re-purpose technologies used to classify transient events. We train two classifiers (a Random Forest and a Support Vector Machine) on 100 recipes of traditional cakes and biscuits. Our classifiers have 95 percent and 91 percent accuracy respectively. Finally we feed two Jaffa Cake recipes to the algorithms and find that Jaffa Cakes are, without a doubt, cakes. Finally, we suggest a new theory as to why some believe Jaffa Cakes are biscuits.

Read this paper on arXiv…

H. Stevance
Thu, 1 Apr 21
18/71

Comments: April fools astro-ph submission – The topic is a joke but the numbers and the analysis are real

DeepMerge II: Building Robust Deep Learning Algorithms for Merging Galaxy Identification Across Domains [IMA]

http://arxiv.org/abs/2103.01373


In astronomy, neural networks are often trained on simulation data with the prospect of being used on telescope observations. Unfortunately, training a model on simulation data and then applying it to instrument data leads to a substantial and potentially even detrimental decrease in model accuracy on the new target dataset. Simulated and instrument data represent different data domains, and for an algorithm to work in both, domain-invariant learning is necessary. Here we employ domain adaptation techniques, Maximum Mean Discrepancy (MMD) as an additional transfer loss and Domain Adversarial Neural Networks (DANNs), and demonstrate their viability to extract domain-invariant features within the astronomical context of classifying merging and non-merging galaxies. Additionally, we explore the use of Fisher loss and entropy minimization to enforce better in-domain class discriminability. We show that the addition of each domain adaptation technique improves the performance of a classifier when compared to conventional deep learning algorithms. We demonstrate this on two examples: between two Illustris-1 simulated datasets of distant merging galaxies, and between Illustris-1 simulated data of nearby merging galaxies and observed data from the Sloan Digital Sky Survey. The use of domain adaptation techniques in our experiments leads to an increase of target domain classification accuracy of up to ${\sim}20\%$. With further development, these techniques will allow astronomers to successfully implement neural network models trained on simulation data to efficiently detect and study astrophysical objects in current and future large-scale astronomical surveys.
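
A hedged sketch of an RBF-kernel Maximum Mean Discrepancy (MMD) transfer loss, one of the domain adaptation terms named above. The kernel bandwidth and the way the loss is combined with the classification loss are illustrative assumptions.

import torch

def rbf_mmd2(source_feats, target_feats, bandwidth=1.0):
    """Squared MMD between two batches of latent features under a Gaussian (RBF) kernel."""
    def kernel(a, b):
        d2 = torch.cdist(a, b) ** 2
        return torch.exp(-d2 / (2.0 * bandwidth ** 2))
    k_ss = kernel(source_feats, source_feats).mean()
    k_tt = kernel(target_feats, target_feats).mean()
    k_st = kernel(source_feats, target_feats).mean()
    return k_ss + k_tt - 2.0 * k_st

# Typical use inside a training step (the loss weight is an assumed hyperparameter):
#   loss = cross_entropy(logits_source, labels_source) + 1.0 * rbf_mmd2(z_source, z_target)
z_src, z_tgt = torch.randn(128, 32), torch.randn(128, 32) + 0.5
print(rbf_mmd2(z_src, z_tgt))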

Read this paper on arXiv…

A. Ćiprijanović, D. Kafkes, K. Downey, et. al.
Wed, 3 Mar 21
8/82

Comments: Submitted to MNRAS; 21 pages, 9 figures, 9 tables

Deep reinforcement learning for smart calibration of radio telescopes [IMA]

http://arxiv.org/abs/2102.03200


Modern radio telescopes produce unprecedented amounts of data, which are passed through many processing pipelines before the delivery of scientific results. Hyperparameters of these pipelines need to be tuned by hand to produce optimal results. Because many thousands of observations are taken during a lifetime of a telescope and because each observation will have its unique settings, the fine tuning of pipelines is a tedious task. In order to automate this process of hyperparameter selection in data calibration pipelines, we introduce the use of reinforcement learning. We use a reinforcement learning technique called twin delayed deep deterministic policy gradient (TD3) to train an autonomous agent to perform this fine tuning. For the sake of generalization, we consider the pipeline to be a black-box system where only an interpreted state of the pipeline is used by the agent. The autonomous agent trained in this manner is able to determine optimal settings for diverse observations and is therefore able to perform ‘smart’ calibration, minimizing the need for human intervention.

Read this paper on arXiv…

S. Yatawatta and I. Avruch
Mon, 8 Feb 21
17/46

Comments: N/A

Automatic Detection of Occulted Hard X-ray Flares Using Deep-Learning Methods [SSA]

http://arxiv.org/abs/2101.11550


We present a concept for a machine-learning classification of hard X-ray (HXR) emissions from solar flares observed by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), identifying flares that are either occulted by the solar limb or located on the solar disk. Although HXR observations of occulted flares are important for particle-acceleration studies, HXR data analyses for past observations were time consuming and required specialized expertise. Machine-learning techniques are promising for this situation, and we constructed a sample model to demonstrate the concept using a deep-learning technique. Input data to the model are HXR spectrograms that are easily produced from RHESSI data. The model can detect occulted flares without the need for image reconstruction or visual inspection by experts. A technique of convolutional neural networks was used in this model by regarding the input data as images. Our model achieved a classification accuracy better than 90 %, and the applicability of the method to either event screening or event alerts for occulted flares was successfully demonstrated.

Read this paper on arXiv…

S. Ishikawa, H. Matsumura, Y. Uchiyama, et. al.
Thu, 28 Jan 21
26/64

Comments: 11 pages, 3 figures, accepted for publication in Solar Physics

A Bayesian neural network predicts the dissolution of compact planetary systems [EPA]

http://arxiv.org/abs/2101.04117


Despite over three hundred years of effort, no solutions exist for predicting when a general planetary configuration will become unstable. We introduce a deep learning architecture to push forward this problem for compact systems. While current machine learning algorithms in this area rely on scientist-derived instability metrics, our new technique learns its own metrics from scratch, enabled by a novel internal structure inspired from dynamics theory. Our Bayesian neural network model can accurately predict not only if, but also when a compact planetary system with three or more planets will go unstable. Our model, trained directly from short N-body time series of raw orbital elements, is more than two orders of magnitude more accurate at predicting instability times than analytical estimators, while also reducing the bias of existing machine learning algorithms by nearly a factor of three. Despite being trained on compact resonant and near-resonant three-planet configurations, the model demonstrates robust generalization to both non-resonant and higher multiplicity configurations, in the latter case outperforming models fit to that specific set of integrations. The model computes instability estimates up to five orders of magnitude faster than a numerical integrator, and unlike previous efforts provides confidence intervals on its predictions. Our inference model is publicly available in the SPOCK package, with training code open-sourced.

Read this paper on arXiv…

M. Cranmer, D. Tamayo, H. Rein, et. al.
Wed, 13 Jan 21
2/70

Comments: 8 content pages, 7 appendix and references. 8 figures. Source code at: this https URL; inference code at this https URL

Estimating Galactic Distances From Images Using Self-supervised Representation Learning [IMA]

http://arxiv.org/abs/2101.04293


We use a contrastive self-supervised learning framework to estimate distances to galaxies from their photometric images. We incorporate data augmentations from computer vision as well as an application-specific augmentation accounting for galactic dust. We find that the resulting visual representations of galaxy images are semantically useful and allow for fast similarity searches, and can be successfully fine-tuned for the task of redshift estimation. We show that (1) pretraining on a large corpus of unlabeled data followed by fine-tuning on some labels can attain the accuracy of a fully-supervised model which requires 2-4x more labeled data, and (2) that by fine-tuning our self-supervised representations using all available data labels in the Main Galaxy Sample of the Sloan Digital Sky Survey (SDSS), we outperform the state-of-the-art supervised learning method.

Read this paper on arXiv…

M. Hayat, P. Harrington, G. Stein, et. al.
Wed, 13 Jan 21
41/70

Comments: N/A

Visibility Interpolation in Solar Hard X-ray Imaging: Application to RHESSI and STIX [IMA]

http://arxiv.org/abs/2012.14007


Space telescopes for solar hard X-ray imaging provide observations made of sampled Fourier components of the incoming photon flux. The aim of this study is to design an image reconstruction method relying on enhanced visibility interpolation in the Fourier domain. The interpolation-based method is applied on synthetic visibilities generated by means of the simulation software implemented within the framework of the Spectrometer/Telescope for Imaging X-rays (STIX) mission on board Solar Orbiter. An application to experimental visibilities observed by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI) is also considered. In order to interpolate these visibility data we have utilized an approach based on Variably Scaled Kernels (VSKs), which are able to realize feature augmentation by exploiting prior information on the flaring source and which are used here, for the first time, for image reconstruction purposes. When compared to an interpolation-based reconstruction algorithm previously introduced for RHESSI, VSKs offer significantly better performance, particularly in the case of STIX imaging, which is characterized by a notably sparse sampling of the Fourier domain. In the case of RHESSI data, this novel approach is particularly reliable when either the flaring sources are characterized by narrow, ribbon-like shapes or high-resolution detectors are utilized for observations. The use of VSKs for interpolating hard X-ray visibilities allows a notable image reconstruction accuracy when the information on the flaring source is encoded by a small set of scattered Fourier data and when the visibility surface is affected by significant oscillations in the frequency domain.
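
A hedged sketch of the Variably Scaled Kernel (VSK) idea described above: the (u, v) frequency points are augmented with an extra coordinate given by a scaling function psi that encodes prior information on the source, and a standard kernel interpolant is then built in the augmented space. The choice of psi and of the Gaussian kernel here are illustrative assumptions made with SciPy, not the paper's actual implementation.

import numpy as np
from scipy.interpolate import RBFInterpolator

rng = np.random.default_rng(3)
uv = rng.uniform(-1, 1, size=(60, 2))            # sparse sampled visibility points
vis = np.exp(-10 * (uv ** 2).sum(axis=1))        # toy visibility amplitudes

def psi(points):
    """Assumed scaling function: emphasises low spatial frequencies, a stand-in
    for prior knowledge about a compact flaring source."""
    return np.exp(-(points ** 2).sum(axis=1, keepdims=True))

uv_aug = np.hstack([uv, psi(uv)])                # feature-augmented interpolation nodes
interp = RBFInterpolator(uv_aug, vis, kernel='gaussian', epsilon=3.0)

grid = np.stack(np.meshgrid(np.linspace(-1, 1, 64), np.linspace(-1, 1, 64)), axis=-1).reshape(-1, 2)
vis_dense = interp(np.hstack([grid, psi(grid)])) # dense visibility surface for imaging
print(vis_dense.shape)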

Read this paper on arXiv…

E. Perracchione, P. Massa, A. Massone, et. al.
Tue, 29 Dec 20
43/66

Comments: N/A

Self-Supervised Representation Learning for Astronomical Images [IMA]

http://arxiv.org/abs/2012.13083


Sky surveys are the largest data generators in astronomy, making automated tools for extracting meaningful scientific information an absolute necessity. We show that, without the need for labels, self-supervised learning recovers representations of sky survey images that are semantically useful for a variety of scientific tasks. These representations can be directly used as features, or fine-tuned, to outperform supervised methods trained only on labeled data. We apply a contrastive learning framework on multi-band galaxy photometry from the Sloan Digital Sky Survey (SDSS) to learn image representations. We then use them for galaxy morphology classification, and fine-tune them for photometric redshift estimation, using labels from the Galaxy Zoo 2 dataset and SDSS spectroscopy. In both downstream tasks, using the same learned representations, we outperform the supervised state-of-the-art results, and we show that our approach can achieve the accuracy of supervised models while using 2-4 times fewer labels for training.
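
A hedged sketch of a SimCLR-style contrastive objective of the kind used for the self-supervised pretraining described above: two augmented views of each galaxy image are embedded and pulled together, while the other images in the batch act as negatives. The temperature and embedding size are illustrative assumptions.

import torch
import torch.nn.functional as F

def nt_xent_loss(z1, z2, temperature=0.1):
    """z1, z2: (batch, dim) embeddings of two augmentations of the same images."""
    z1, z2 = F.normalize(z1, dim=1), F.normalize(z2, dim=1)
    z = torch.cat([z1, z2], dim=0)                        # (2B, dim)
    sim = z @ z.T / temperature                           # scaled cosine similarities
    sim.fill_diagonal_(float('-inf'))                     # exclude self-pairs
    batch = z1.size(0)
    targets = torch.cat([torch.arange(batch, 2 * batch), torch.arange(batch)])
    return F.cross_entropy(sim, targets)                  # positive = the other view of each image

z_a, z_b = torch.randn(256, 128), torch.randn(256, 128)
print(nt_xent_loss(z_a, z_b))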

Read this paper on arXiv…

M. Hayat, G. Stein, P. Harrington, et. al.
Fri, 25 Dec 20
32/51

Comments: N/A

Confluence of Artificial Intelligence and High Performance Computing for Accelerated, Scalable and Reproducible Gravitational Wave Detection [CL]

http://arxiv.org/abs/2012.08545


Finding new ways to use artificial intelligence (AI) to accelerate the analysis of gravitational wave data, and ensuring the developed models are easily reusable promises to unlock new opportunities in multi-messenger astrophysics (MMA), and to enable wider use, rigorous validation, and sharing of developed models by the community. In this work, we demonstrate how connecting recently deployed DOE and NSF-sponsored cyberinfrastructure allows for new ways to publish models, and to subsequently deploy these models into applications using computing platforms ranging from laptops to high performance computing clusters. We develop a workflow that connects the Data and Learning Hub for Science (DLHub), a repository for publishing machine learning models, with the Hardware Accelerated Learning (HAL) deep learning computing cluster, using funcX as a universal distributed computing service. We then use this workflow to search for binary black hole gravitational wave signals in open source advanced LIGO data. We find that using this workflow, an ensemble of four openly available deep learning models can be run on HAL and process the entire month of August 2017 of advanced LIGO data in just seven minutes, identifying all four binary black hole mergers previously identified in this dataset, and reporting no misclassifications. This approach, which combines advances in AI, distributed computing, and scientific data infrastructure opens new pathways to conduct reproducible, accelerated, data-driven gravitational wave detection.

Read this paper on arXiv…

E. Huerta, A. Khan, X. Huang, et. al.
Thu, 17 Dec 20
60/85

Comments: 17 pages, 5 figures

Attention-gating for improved radio galaxy classification [GA]

http://arxiv.org/abs/2012.01248


In this work we introduce attention as a state-of-the-art mechanism for classification of radio galaxies using convolutional neural networks. We present an attention-based model that performs on par with previous classifiers while using more than 50\% fewer parameters than the next smallest classic CNN application in this field. We demonstrate quantitatively how the selection of normalisation and aggregation methods used in attention-gating can affect the output of individual models, and show that the resulting attention maps can be used to interpret the classification choices made by the model. We observe that the salient regions identified by our model align well with the regions an expert human classifier would attend to when making equivalent classifications. We show that while the selection of normalisation and aggregation may only minimally affect the performance of individual models, it can significantly affect the interpretability of the respective attention maps, and that by selecting a model which aligns well with how astronomers classify radio sources by eye, a user can employ the model in a more effective manner.
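
A hedged sketch of an attention-gating block of the general kind discussed above: a 1x1 convolution produces a saliency map over the feature grid, a normalisation turns it into attention weights, and the gated features are aggregated into a single vector for classification. The softmax normalisation and sum aggregation chosen here are only one of the design options such a model could compare; they are assumptions, not the paper's exact module.

import torch
import torch.nn as nn

class AttentionGate(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.score = nn.Conv2d(channels, 1, kernel_size=1)

    def forward(self, feats):
        # feats: (batch, channels, H, W) from a CNN backbone
        b, c, h, w = feats.shape
        attn = torch.softmax(self.score(feats).view(b, 1, h * w), dim=-1)  # normalisation
        gated = (feats.view(b, c, h * w) * attn).sum(dim=-1)               # aggregation
        return gated, attn.view(b, 1, h, w)   # the attention map is what a user can inspect

feats = torch.randn(4, 64, 16, 16)
vector, attn_map = AttentionGate(64)(feats)
print(vector.shape, attn_map.shape)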

Read this paper on arXiv…

M. Bowles, A. Scaife, F. Porter, et. al.
Thu, 3 Dec 20
66/81

Comments: 18 pages, 16 figures, submitted to MNRAS

Accelerating MCMC algorithms through Bayesian Deep Networks [CEA]

http://arxiv.org/abs/2011.14276


Markov Chain Monte Carlo (MCMC) algorithms are commonly used for their versatility in sampling from complicated probability distributions. However, as the dimension of the distribution gets larger, the computational costs for a satisfactory exploration of the sampling space become challenging. Adaptive MCMC methods employing a choice of proposal distribution can address this issue, speeding up the convergence. In this paper we show an alternative way of performing adaptive MCMC, by using the outcome of Bayesian Neural Networks as the initial proposal for the Markov Chain. This combined approach increases the acceptance rate in the Metropolis-Hastings algorithm and accelerates the convergence of the MCMC while reaching the same final accuracy. Finally, we demonstrate the main advantages of this approach by constraining the cosmological parameters directly from Cosmic Microwave Background maps.
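
A hedged sketch of the idea above: a random-walk Metropolis-Hastings sampler whose starting point and proposal scale come from a (stand-in) Bayesian-network estimate of the parameters. The Gaussian toy target and the hard-coded "network output" are assumptions for illustration only.

import numpy as np

rng = np.random.default_rng(4)

def log_target(theta):
    return -0.5 * np.sum((theta - 2.0) ** 2)      # toy posterior: N(2, 1) in each dimension

# Stand-in for the Bayesian network output: a predicted mean and per-parameter uncertainty.
bnn_mean, bnn_std = np.array([1.8, 2.3]), np.array([0.4, 0.5])

def metropolis_hastings(n_steps, start, step_scale):
    theta, logp, chain = start.copy(), log_target(start), []
    for _ in range(n_steps):
        proposal = theta + step_scale * rng.normal(size=theta.shape)
        logp_new = log_target(proposal)
        if np.log(rng.random()) < logp_new - logp:   # accept/reject
            theta, logp = proposal, logp_new
        chain.append(theta.copy())
    return np.array(chain)

chain = metropolis_hastings(5000, start=bnn_mean, step_scale=bnn_std)
print(chain.mean(axis=0))                            # should converge near (2, 2)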

Read this paper on arXiv…

H. Hortua, R. Volpi, D. Marinelli, et. al.
Tue, 1 Dec 20
63/108

Comments: Accepted in the Third Workshop on Machine Learning and the Physical Sciences, NeurIPS 2020, Vancouver, Canada. Text overlap with arXiv:1911.08508v3

Deep learning insights into cosmological structure formation [CEA]

http://arxiv.org/abs/2011.10577


While the evolution of linear initial conditions present in the early universe into extended halos of dark matter at late times can be computed using cosmological simulations, a theoretical understanding of this complex process remains elusive. Here, we build a deep learning framework to learn this non-linear relationship, and develop techniques to physically interpret the learnt mapping. A three-dimensional convolutional neural network (CNN) is trained to predict the mass of dark matter halos from the initial conditions. We find no change in the predictive accuracy of the model if we retrain the model removing anisotropic information from the inputs. This suggests that the features learnt by the CNN are equivalent to spherical averages over the initial conditions. Our results indicate that interpretable deep learning frameworks can provide a powerful tool for extracting insight into cosmological structure formation.

Read this paper on arXiv…

L. Lucie-Smith, H. Peiris, A. Pontzen, et. al.
Tue, 24 Nov 20
31/83

Comments: 15 pages, 6 figures, to be submitted to Nature Communications, comments welcome

Physics-inspired deep learning to characterize the signal manifold of quasi-circular, spinning, non-precessing binary black hole mergers [CL]

http://arxiv.org/abs/2004.09524


The spin distribution of binary black hole mergers contains key information concerning the formation channels of these objects, and the astrophysical environments where they form, evolve and coalesce. To quantify the suitability of deep learning to characterize the signal manifold of quasi-circular, spinning, non-precessing binary black hole mergers, we introduce a modified version of WaveNet trained with a novel optimization scheme that incorporates general relativistic constraints of the spin properties of astrophysical black holes. The neural network model is trained, validated and tested with 1.5 million $\ell=|m|=2$ waveforms generated within the regime of validity of NRHybSur3dq8, i.e., mass-ratios $q\leq8$ and individual black hole spins $ | s^z_{{1,\,2}} | \leq 0.8$. Using this neural network model, we quantify how accurately we can infer the astrophysical parameters of black hole mergers in the absence of noise. We do this by computing the overlap between waveforms in the testing data set and the corresponding signals whose mass-ratio and individual spins are predicted by our neural network. We find that the convergence of high performance computing and physics-inspired optimization algorithms enable an accurate reconstruction of the mass-ratio and individual spins of binary black hole mergers across the parameter space under consideration. This is a significant step towards an informed utilization of physics-inspired deep learning models to reconstruct the spin distribution of binary black hole mergers in realistic detection scenarios.

Read this paper on arXiv…

A. Khan, E. Huerta and A. Das
Wed, 22 Apr 20
16/74

Comments: 21 pages, 10 figures, 1 appendix, 1 Interactive visualization at this https URL

Space Debris Ontology for ADR Capture Methods Selection [CL]

http://arxiv.org/abs/2004.08866


Studies have concluded that active debris removal (ADR) of the existing in-orbit mass is necessary. However, the quest for an optimal solution does not have a unique answer and the available data often lacks coherence. To improve this situation, modern knowledge representation techniques, which have been shaping the World Wide Web, medicine and pharmacy, should be employed. Prior efforts in the domain of space debris have only focused on space situational awareness, neglecting ADR. To bridge this gap we present a domain ontology of intact derelict objects, i.e. payloads and rocket bodies, for the selection of ADR capture methods. The ontology is defined on a minimal set of physical, dynamical and statistical parameters of a target object. The practicality and validity of the ontology are demonstrated by applying it to a database of 30 representative objects, built by combining structured and unstructured data from publicly available sources. The analysis of results proves the ontology capable of inferring the most suited ADR capture methods for the considered objects. Furthermore, it confirms its ability to handle input data from different sources transparently, minimizing user input. The developed ontology provides an initial step towards a more comprehensive knowledge representation framework meant to improve data management and knowledge discovery in the domain of space debris. Furthermore, it provides a tool that should make the initial planning of future ADR missions simpler yet more systematic.

Read this paper on arXiv…

M. Jankovic, M. Yüksel, M. Babr, et. al.
Tue, 21 Apr 20
64/90

Comments: 32 pages, 7 figures and 6 tables

A computational theoretical approach for mining data on transient events from databases of high energy astrophysics experiments [IMA]

http://arxiv.org/abs/2004.04131


Data on transient events, like GRBs, are often contained in large databases of unstructured data from space experiments, merged with a potentially large amount of background or simply undesired information. We present a computational formal model to apply techniques of modern computer science, such as Data Mining (DM) and Knowledge Discovery in Databases (KDD), to a generic, large database derived from a high energy astrophysics experiment. This method is aimed at searching for, identifying and extracting expected information, and possibly at discovering unexpected information.

Read this paper on arXiv…

F. Lazzarotto, M. Feroci and M. Pazienza
Thu, 9 Apr 20
5/54

Comments: 9 pages, 6 figures (in colors). Conference poster at Third Rome Workshop on Gamma-Ray Bursts in the Afterglow Era, Held 27-30 September 2002 at CNR Headquarters, Rome, Italy

Agile Earth observation satellite scheduling over 20 years: formulations, methods and future directions [IMA]

http://arxiv.org/abs/2003.06169


Agile satellites with advanced attitude maneuvering capability are the new generation of Earth observation satellites (EOSs). The continuous improvement in satellite technology and decrease in launch cost have boosted the development of agile EOSs (AEOSs). To efficiently employ the increasing orbiting AEOSs, the AEOS scheduling problem (AEOSSP) aiming to maximize the entire observation profit while satisfying all complex operational constraints, has received much attention over the past 20 years. The objectives of this paper are thus to summarize current research on AEOSSP, identify main accomplishments and highlight potential future research directions. To this end, general definitions of AEOSSP with operational constraints are described initially, followed by its three typical variations including different definitions of observation profit, multi-objective function and autonomous model. A detailed literature review from 1997 up to 2019 is then presented in line with four different solution methods, i.e., exact method, heuristic, metaheuristic and machine learning. Finally, we discuss a number of topics worth pursuing in the future.

Read this paper on arXiv…

X. Wang, G. Wu, L. Xing, et. al.
Mon, 16 Mar 20
48/57

Comments: N/A

Stochastic Calibration of Radio Interferometers [IMA]

http://arxiv.org/abs/2003.00986


With ever increasing data rates produced by modern radio telescopes like LOFAR and future telescopes like the SKA, many data processing steps are overwhelmed by the amount of data that needs to be handled using limited compute resources. Calibration is one such operation that dominates the overall data processing computational cost, nonetheless, it is an essential operation to reach many science goals. Calibration algorithms do exist that scale well with the number of stations of an array and the number of directions being calibrated. However, the remaining bottleneck is the raw data volume, which scales with the number of baselines, and which is proportional to the square of the number of stations. We propose a ‘stochastic’ calibration strategy where we only read in a mini-batch of data for obtaining calibration solutions, as opposed to reading the full batch of data being calibrated. Nonetheless, we obtain solutions that are valid for the full batch of data. Normally, data need to be averaged before calibration is performed to accommodate the data in size-limited compute memory. Stochastic calibration overcomes the need for data averaging before any calibration can be performed, and offers many advantages including: enabling the mitigation of faint radio frequency interference; better removal of strong celestial sources from the data; and better detection and spatial localization of fast radio transients.
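The paper's algorithm targets full direction-dependent calibration; the numpy sketch below only illustrates the mini-batch idea in its simplest form: solving for per-station complex gains by stochastic gradient steps on random subsets of baselines, on simulated data for a 1 Jy point source at the phase centre. Array size, learning rate and batch size are made-up values.

import numpy as np

rng = np.random.default_rng(0)
n_stations = 40
p_idx, q_idx = np.triu_indices(n_stations, k=1)        # all baselines p < q

# Simulated "observed" visibilities: unknown per-station gains plus noise.
g_true = 1.0 + 0.3 * (rng.standard_normal(n_stations)
                      + 1j * rng.standard_normal(n_stations))
model = np.ones(p_idx.size, dtype=complex)              # point-source model visibilities
vis = g_true[p_idx] * np.conj(g_true[q_idx]) * model
vis += 0.05 * (rng.standard_normal(vis.size) + 1j * rng.standard_normal(vis.size))

# Stochastic gradient descent on the least-squares cost,
# reading only a mini-batch of baselines per iteration.
g = np.ones(n_stations, dtype=complex)
lr, batch = 0.02, 200
for step in range(500):
    sel = rng.choice(vis.size, size=batch, replace=False)
    p, q, m, v = p_idx[sel], q_idx[sel], model[sel], vis[sel]
    r = v - g[p] * np.conj(g[q]) * m                    # residuals on the mini-batch
    grad = np.zeros(n_stations, dtype=complex)
    np.add.at(grad, p, -r * g[q] * np.conj(m))          # dL/d conj(g_p)
    np.add.at(grad, q, -np.conj(r) * g[p] * m)          # dL/d conj(g_q)
    g -= lr * grad

# Gains are only determined up to a common phase; align before comparing.
g *= np.exp(-1j * np.angle(g[0])) * np.exp(1j * np.angle(g_true[0]))
print("max |g - g_true| =", np.max(np.abs(g - g_true)))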

Read this paper on arXiv…

S. Yatawatta
Tue, 3 Mar 20
45/68

Comments: N/A

MEM_GE: a new maximum entropy method for image reconstruction from solar X-ray visibilities [SSA]

http://arxiv.org/abs/2002.07921


Maximum Entropy is an image reconstruction method conceived to image a sparsely occupied field of view and therefore particularly appropriate to achieve super-resolution effects. Although widely used in image deconvolution, this method has been formulated in radio astronomy for the analysis of observations in the spatial frequency domain, and an Interactive Data Language (IDL) code has been implemented for image reconstruction from solar X-ray Fourier data. However, this code relies on a non-convex formulation of the constrained optimization problem addressed by the Maximum Entropy approach and this sometimes results in unreliable reconstructions characterized by unphysical shrinking effects.
This paper introduces a new approach to Maximum Entropy based on the constrained minimization of a convex functional. In the case of observations recorded by the Reuven Ramaty High Energy Solar Spectroscopic Imager (RHESSI), the resulting code provides the same super-resolution effects as the previous algorithm, while also working properly in cases where that code produces unphysical reconstructions. Results are also provided from testing the algorithm with synthetic data simulating observations of the Spectrometer/Telescope for Imaging X-rays (STIX) on Solar Orbiter. The new code is available in the HESSI folder of the Solar SoftWare (SSW) tree.
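MEM_GE itself is an IDL code with an optimization scheme tailored to RHESSI/STIX visibilities; the sketch below only illustrates the generic ingredient named in the abstract, namely minimizing a convex chi-square-plus-entropy objective over a non-negative image, here with exponentiated-gradient iterations on a toy 1-D Fourier-sampling problem. The entropy weight, default-flux prior and operator are invented for the example.

import numpy as np

rng = np.random.default_rng(1)
n_pix, n_vis = 64, 40

# Toy "visibility" operator: a random subset of Fourier modes of a 1-D image.
freqs = rng.choice(n_pix, size=n_vis, replace=False)
F = np.exp(-2j * np.pi * np.outer(freqs, np.arange(n_pix)) / n_pix)

# Sparse, non-negative ground truth and noisy visibilities.
x_true = np.zeros(n_pix)
x_true[[20, 21, 45]] = [3.0, 2.0, 1.5]
vis = F @ x_true + 0.05 * (rng.standard_normal(n_vis) + 1j * rng.standard_normal(n_vis))

lam, m = 0.05, 0.1        # entropy weight and default-flux prior (made-up values)

def grad(x):
    """Gradient of the convex objective 0.5*||F x - vis||^2 + lam*sum x*log(x/m)."""
    return (np.real(F.conj().T @ (F @ x - vis))
            + lam * (np.log(x / m) + 1.0))

# Exponentiated-gradient (mirror-descent) iterations: the multiplicative update
# keeps the image strictly positive, which suits the entropy term naturally.
x = np.full(n_pix, 0.1)
eta = 0.005
for _ in range(2000):
    x = x * np.exp(-eta * grad(x))

print("brightest reconstructed pixels:", np.sort(np.argsort(x)[-3:]))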

Read this paper on arXiv…

P. Massa, R. Schwartz, A. Tolbert, et. al.
Thu, 20 Feb 20
39/61

Comments: N/A

Two-dimensional Multi-fiber Spectrum Image Correction Based on Machine Learning Techniques [IMA]

http://arxiv.org/abs/2002.06600


Due to the limited size and imperfections of the optical components in a spectrometer, aberration is inevitably introduced into the two-dimensional multi-fiber spectrum images of LAMOST, which leads to obvious spatial variation of the point spread functions (PSFs). Consequently, if spatially variant PSFs are estimated directly, the huge storage and intensive computation requirements make deconvolution-based spectral extraction intractable. In this paper, we propose a novel method that addresses the problem of spatially variant PSFs through image aberration correction. Once the CCD image aberration is corrected, the PSF, i.e. the convolution kernel, can be approximated by a single spatially invariant PSF. Specifically, machine learning techniques are adopted to calibrate the distorted spectral image, including a Total Least Squares (TLS) algorithm, an intelligent sampling method, and multi-layer feed-forward neural networks. Calibration experiments on LAMOST CCD images show that the proposed method is effective. At the same time, the spectrum extraction results before and after calibration are compared; the results show that the characteristics of the extracted one-dimensional waveform are closer to those of an ideal optical system, and that the PSF of the corrected object spectrum image estimated by blind deconvolution is nearly centrally symmetric, which indicates that our proposed method can significantly reduce the complexity of spectrum extraction and improve extraction accuracy.
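The paper combines TLS, intelligent sampling and feed-forward networks on real LAMOST images; the snippet below only sketches the core supervised step, learning a smooth mapping from distorted CCD coordinates to ideal coordinates from matched control points, using scikit-learn's MLPRegressor. The distortion model and noise level are synthetic assumptions.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(2)

# Synthetic control points: ideal (x, y) positions of arc-lamp lines on the CCD.
ideal = rng.uniform(0, 4000, size=(3000, 2))

def distort(xy):
    """Hypothetical smooth optical aberration applied to ideal coordinates."""
    x, y = xy[:, 0], xy[:, 1]
    dx = 3e-4 * (y - 2000) ** 2 / 2000 + 0.5 * np.sin(x / 800)
    dy = 2e-4 * (x - 2000) ** 2 / 2000
    return np.column_stack([x + dx, y + dy])

measured = distort(ideal) + rng.normal(0, 0.05, size=ideal.shape)  # centroiding noise

# Feed-forward network learns the correction: measured coordinates -> offsets to ideal.
net = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=3000, random_state=0)
net.fit(measured / 4000.0, ideal - measured)

pred = measured + net.predict(measured / 4000.0)
print("median residual [pixel]:", np.median(np.linalg.norm(pred - ideal, axis=1)))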

Read this paper on arXiv…

J. Xu, Q. Yin, P. Guo, et. al.
Tue, 18 Feb 20
54/72

Comments: 10 pages, 14 figures

Response to NITRD, NCO, NSF Request for Information on "Update to the 2016 National Artificial Intelligence Research and Development Strategic Plan" [IMA]

http://arxiv.org/abs/1911.05796


We present a response to the 2018 Request for Information (RFI) from the NITRD, NCO, NSF regarding the “Update to the 2016 National Artificial Intelligence Research and Development Strategic Plan.” Through this document, we provide a response to the question of whether and how the National Artificial Intelligence Research and Development Strategic Plan (NAIRDSP) should be updated from the perspective of Fermilab, America’s premier national laboratory for High Energy Physics (HEP). We believe the NAIRDSP should be extended in light of the rapid pace of development and innovation in the field of Artificial Intelligence (AI) since 2016, and present our recommendations below. AI has profoundly impacted many areas of human life, promising to dramatically reshape society — e.g., economy, education, science — in the coming years. We are still early in this process. It is critical to invest now in this technology to ensure it is safe and deployed ethically. Science and society both have a strong need for accuracy, efficiency, transparency, and accountability in algorithms, making investments in scientific AI particularly valuable. Thus far the US has been a leader in AI technologies, and we believe as a national Laboratory it is crucial to help maintain and extend this leadership. Moreover, investments in AI will be important for maintaining US leadership in the physical sciences.

Read this paper on arXiv…

J. Amundson, J. Annis, C. Avestruz, et. al.
Fri, 15 Nov 19
30/73

Comments: N/A

Algorithms and Statistical Models for Scientific Discovery in the Petabyte Era [IMA]

http://arxiv.org/abs/1911.02479


The field of astronomy has arrived at a turning point in terms of size and complexity of both datasets and scientific collaboration. Commensurately, algorithms and statistical models have begun to adapt — e.g., via the onset of artificial intelligence — which itself presents new challenges and opportunities for growth. This white paper aims to offer guidance and ideas for how we can evolve our technical and collaborative frameworks to promote efficient algorithmic development and take advantage of opportunities for scientific discovery in the petabyte era. We discuss challenges for discovery in large and complex data sets; challenges and requirements for the next stage of development of statistical methodologies and algorithmic tool sets; how we might change our paradigms of collaboration and education; and the ethical implications of scientists’ contributions to widely applicable algorithms and computational modeling. We start with six distinct recommendations that are supported by the commentary following them. This white paper is related to a larger corpus of effort that has taken place within and around the Petabytes to Science Workshops (https://petabytestoscience.github.io/).

Read this paper on arXiv…

B. Nord, A. Connolly, J. Kinney, et. al.
Thu, 7 Nov 19
15/50

Comments: arXiv admin note: substantial text overlap with arXiv:1905.05116

Automated Multidisciplinary Design and Control of Hopping Robots for Exploration of Extreme Environments on the Moon and Mars [CL]

http://arxiv.org/abs/1910.03827


The next frontier in solar system exploration will be missions targeting extreme and rugged environments such as caves, canyons, cliffs and crater rims of the Moon, Mars and icy moons. These environments are time capsules into the early formation of the solar system and will provide vital clues about how our early solar system gave way to the current planets and moons. These sites will also provide vital clues to the past and present habitability of these environments. Current landers and rovers are unable to access these areas of high interest due to limitations in precision landing techniques, the need for large and sophisticated science instruments, and a mission assurance and operations culture in which risks are minimized at all costs. Our past work has shown the advantages of using multiple spherical hopping robots called SphereX for exploring these extreme environments. Our previous work was based on performing exploration with a human-designed baseline design of a SphereX robot. However, the design of SphereX is a complex task that involves a large number of design variables and multiple engineering disciplines. In this work we propose to use Automated Multidisciplinary Design and Control Optimization (AMDCO) techniques to find near-optimal design solutions in terms of mass, volume, power, and control for SphereX for different mission scenarios.

Read this paper on arXiv…

H. Kalita and J. Thangavelautham
Thu, 10 Oct 19
34/63

Comments: 11 pages, 13 figures, International Astronautical Congress 2019

A Flexible Framework for Anomaly Detection via Dimensionality Reduction [CL]

http://arxiv.org/abs/1909.04060


Anomaly detection is challenging, especially for large datasets in high dimensions. Here we explore a general anomaly detection framework based on dimensionality reduction and unsupervised clustering. We release DRAMA, a general python package that implements the general framework with a wide range of built-in options. We test DRAMA on a wide variety of simulated and real datasets, in up to 3000 dimensions, and find it robust and highly competitive with commonly-used anomaly detection algorithms, especially in high dimensions. The flexibility of the DRAMA framework allows for significant optimization once some examples of anomalies are available, making it ideal for online anomaly detection, active learning and highly unbalanced datasets.
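DRAMA exposes many reduction and clustering back-ends through its own API; the sketch below is not that API, just a minimal illustration of the framework the abstract describes: reduce dimensionality, cluster a reference set assumed to be mostly normal, and score new samples by their distance to the nearest cluster prototype. The data are synthetic.

import numpy as np
from sklearn.decomposition import PCA
from sklearn.cluster import KMeans

rng = np.random.default_rng(3)

# Reference set dominated by normal behaviour, and a test set with injected anomalies.
X_ref = rng.standard_normal((2000, 300))
X_test = np.vstack([rng.standard_normal((200, 300)),
                    rng.standard_normal((20, 300)) + 4.0])   # 20 injected anomalies

# 1) Dimensionality reduction fitted on the reference data.
pca = PCA(n_components=10, random_state=0).fit(X_ref)

# 2) Unsupervised clustering of the reduced reference data.
km = KMeans(n_clusters=5, n_init=10, random_state=0).fit(pca.transform(X_ref))

# 3) Anomaly score: distance to the nearest cluster centroid in the reduced space.
Z = pca.transform(X_test)
dists = np.linalg.norm(Z[:, None, :] - km.cluster_centers_[None, :, :], axis=2)
score = dists.min(axis=1)

top = np.argsort(score)[-20:]                         # most anomalous test samples
print("injected anomalies recovered in top 20:", int(np.sum(top >= 200)), "/ 20")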

Read this paper on arXiv…

A. Sadr, B. Bassett and M. Kunz
Thu, 19 Sep 19
45/71

Comments: 6 pages

Genetic Algorithms for Starshade Retargeting in Space-Based Telescopes [IMA]

http://arxiv.org/abs/1907.09789


Future space-based telescopes will leverage starshades as components that can be independently positioned. Starshades will adjust the light coming in from exoplanet host stars and enhance the direct imaging of exoplanets and other phenomena. In this context, scheduling of space-based telescope observations is subject to a large number of dynamic constraints, including target observability, fuel, and target priorities. We present an application of genetic algorithm (GA) scheduling on this problem that not only takes physical constraints into account, but also considers direct human suggestions on schedules. By allowing direct suggestions on schedules, this type of heuristic can capture the scheduling preferences and expertise of stakeholders without the need to always formally codify such objectives. Additionally, this approach allows schedules to be constructed from existing ones when scenarios change; for example, this capability allows for optimization without the need to recompute schedules from scratch after changes such as new discoveries or new targets of opportunity. We developed a specific graph-traversal-based framework upon which to apply GA for telescope scheduling, and use it to demonstrate the convergence behavior of a particular implementation of GA. From this work, difficulties with regards to assigning values to observational targets are also noted, and recommendations are made for different scenarios.
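The paper's GA operates on a purpose-built graph-traversal framework; the toy sketch below only shows the generic mechanics it relies on: a genetic algorithm over fixed-length target sequences with a priority-weighted fitness, seeded with human-suggested schedules. Targets, priorities and operators are invented for the example.

import random
random.seed(0)

targets = list(range(12))                        # candidate observation targets
priority = {t: random.uniform(1.0, 10.0) for t in targets}
slots = 6                                        # schedule length (observing slots)

def fitness(schedule):
    """Total priority of the distinct targets actually observed."""
    return sum(priority[t] for t in set(schedule))

def crossover(a, b):
    cut = random.randrange(1, slots)
    return a[:cut] + b[cut:]

def mutate(schedule, rate=0.2):
    return [random.choice(targets) if random.random() < rate else t for t in schedule]

def evolve(seed_schedules, generations=200, pop_size=60):
    # Seed the population with human-suggested schedules, pad with random ones.
    pop = list(seed_schedules)
    while len(pop) < pop_size:
        pop.append(random.sample(targets, slots))
    for _ in range(generations):
        pop.sort(key=fitness, reverse=True)
        parents = pop[: pop_size // 2]                       # truncation selection
        children = [mutate(crossover(random.choice(parents), random.choice(parents)))
                    for _ in range(pop_size - len(parents))]
        pop = parents + children
    return max(pop, key=fitness)

suggested = [[0, 1, 2, 3, 4, 5]]                 # e.g. an operator's draft schedule
best = evolve(suggested)
print("best schedule:", best, "fitness:", round(fitness(best), 2))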

Read this paper on arXiv…

H. Siu and V. Pankratius
Wed, 24 Jul 19
10/60

Comments: N/A

Desaturating EUV observations of solar flaring storms [SSA]

http://arxiv.org/abs/1904.04211


Image saturation has been an issue for several instruments in solar astronomy, mainly at EUV wavelengths. However, with the launch of the Atmospheric Imaging Assembly (AIA) as part of the payload of the Solar Dynamics Observatory (SDO), image saturation has become a big-data issue, involving a huge number of frames of the impressive dataset this beautiful telescope has been providing every year since February 2010. This paper introduces a novel desaturation method, which is able to recover the signal in the saturated region of any AIA image by exploiting no other information but the one contained in the image itself. This peculiar methodological property, jointly with the unprecedented statistical reliability of the desaturated images, could make this algorithm the perfect tool for the realization of a reconstruction pipeline for AIA data, able to work properly even in the case of long-lasting, very energetic flaring events.

Read this paper on arXiv…

S. Guastavino, M. Piana, A. Massone, et. al.
Tue, 9 Apr 19
23/105

Comments: N/A

A Machine Learning Dataset Prepared From the NASA Solar Dynamics Observatory Mission [SSA]

http://arxiv.org/abs/1903.04538


In this paper we present a curated dataset from the NASA Solar Dynamics Observatory (SDO) mission in a format suitable for machine learning research. Beginning from level 1 scientific products we have processed various instrumental corrections, downsampled to manageable spatial and temporal resolutions, and synchronized observations spatially and temporally. We illustrate the use of this dataset with two example applications: forecasting future EVE irradiance from present EVE irradiance and translating HMI observations into AIA observations. For each application we provide metrics and baselines for future model comparison. We anticipate this curated dataset will facilitate machine learning research in heliophysics and the physical sciences generally, increasing the scientific return of the SDO mission. This work is a direct result of the 2018 NASA Frontier Development Laboratory Program. Please see the appendix for access to the dataset.

Read this paper on arXiv…

R. Galvez, D. Fouhey, M. Jin, et. al.
Wed, 13 Mar 19
79/125

Comments: Accepted to The Astrophysical Journal Supplement Series; 11 pages, 8 figures

Deep Learning at Scale for Gravitational Wave Parameter Estimation of Binary Black Hole Mergers [CL]

http://arxiv.org/abs/1903.01998


We present the first application of deep learning at scale to do gravitational wave parameter estimation of binary black hole mergers that describe a 4-D signal manifold, i.e., black holes whose spins are aligned or anti-aligned, and which evolve on quasi-circular orbits. We densely sample this 4-D signal manifold using over three hundred thousand simulated waveforms. In order to cover a broad range of astrophysically motivated scenarios, we synthetically enhance this waveform dataset to ensure that our deep learning algorithms can process waveforms located at any point in the data stream of gravitational wave detectors (time invariance) for a broad range of signal-to-noise ratios (scale invariance), which in turn means that our neural network models are trained with over $10^{7}$ waveform signals. We then apply these neural network models to estimate the astrophysical parameters of black hole mergers, and their corresponding black hole remnants, including the final spin and the gravitational wave quasi-normal frequencies. These neural network models represent the first time deep learning is used to provide point-parameter estimation calculations endowed with statistical errors. For each binary black hole merger that ground-based gravitational wave detectors have observed, our deep learning algorithms can reconstruct its parameters within 2 milliseconds using a single Tesla V100 GPU. We show that this new approach produces parameter estimation results that are consistent with Bayesian analyses that have been used to reconstruct the parameters of the catalog of binary black hole mergers observed by the advanced LIGO and Virgo detectors.

Read this paper on arXiv…

H. Shen, E. Huerta and Z. Zhao
Thu, 7 Mar 19
28/82

Comments: 8 pages, 4 figures and 4 tables

Learning to Predict the Cosmological Structure Formation [CEA]

http://arxiv.org/abs/1811.06533


Matter evolved under the influence of gravity from minuscule density fluctuations. Non-perturbative structure formed hierarchically over all scales and developed non-Gaussian features in the Universe, known as the Cosmic Web. Fully understanding the structure formation of the Universe is one of the holy grails of modern astrophysics. Astrophysicists survey large volumes of the Universe and employ a large ensemble of computer simulations to compare with the observed data in order to extract the full information of our own Universe. However, to evolve trillions of galaxies over billions of years even with the simplest physics is a daunting task. We build a deep neural network, the Deep Density Displacement Model (hereafter D$^3$M), to predict the non-linear structure formation of the Universe from simple linear perturbation theory. Our extensive analysis demonstrates that D$^3$M outperforms the second order perturbation theory (hereafter 2LPT), the commonly used fast approximate simulation method, in point-wise comparison, 2-point correlation, and 3-point correlation. We also show that D$^3$M is able to accurately extrapolate far beyond its training data, and predict structure formation for significantly different cosmological parameters. Our study proves, for the first time, that deep learning is a practical and accurate alternative to approximate simulations of the gravitational structure formation of the Universe.

Read this paper on arXiv…

S. He, Y. Li, Y. Feng, et. al.
Fri, 16 Nov 18
11/78

Comments: 7 pages, 5 figures, 1 table

DeepSphere: Efficient spherical Convolutional Neural Network with HEALPix sampling for cosmological applications [CEA]

http://arxiv.org/abs/1810.12186


Convolutional Neural Networks (CNNs) are a cornerstone of the Deep Learning toolbox and have led to many breakthroughs in Artificial Intelligence. These networks have mostly been developed for regular Euclidean domains such as those supporting images, audio, or video. Because of their success, CNN-based methods are becoming increasingly popular in Cosmology. Cosmological data often comes as spherical maps, which make the use of the traditional CNNs more complicated. The commonly used pixelization scheme for spherical maps is the Hierarchical Equal Area isoLatitude Pixelisation (HEALPix). We present a spherical CNN for analysis of full and partial HEALPix maps, which we call DeepSphere. The spherical CNN is constructed by representing the sphere as a graph. Graphs are versatile data structures that can act as a discrete representation of a continuous manifold. Using the graph-based representation, we define many of the standard CNN operations, such as convolution and pooling. With filters restricted to being radial, our convolutions are equivariant to rotation on the sphere, and DeepSphere can be made invariant or equivariant to rotation. This way, DeepSphere is a special case of a graph CNN, tailored to the HEALPix sampling of the sphere. This approach is computationally more efficient than using spherical harmonics to perform convolutions. We demonstrate the method on a classification problem of weak lensing mass maps from two cosmological models and compare the performance of the CNN with that of two baseline classifiers. The results show that the performance of DeepSphere is always superior or equal to both of these baselines. For high noise levels and for data covering only a smaller fraction of the sphere, DeepSphere achieves typically 10% better classification accuracy than those baselines. Finally, we show how learned filters can be visualized to introspect the neural network.
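DeepSphere builds a graph from HEALPix pixels and applies Chebyshev-polynomial graph convolutions; the snippet below only sketches the graph-construction step with healpy's neighbour lookup and a single, simple normalized graph convolution in scipy.sparse. It is not the DeepSphere API, and the single-layer filter stands in for the full spherical CNN.

import numpy as np
import healpy as hp
from scipy import sparse

nside = 16
npix = hp.nside2npix(nside)

# Pixel adjacency graph from the 8 HEALPix neighbours of every pixel.
neighbours = hp.get_all_neighbours(nside, np.arange(npix))     # shape (8, npix)
rows = np.concatenate([np.arange(npix)[neighbours[k] >= 0] for k in range(8)])
cols = np.concatenate([neighbours[k][neighbours[k] >= 0] for k in range(8)])
A = sparse.coo_matrix((np.ones_like(rows, dtype=float), (rows, cols)),
                      shape=(npix, npix)).tocsr()

# Symmetrically normalized adjacency, as used in simple graph convolutions.
deg = np.asarray(A.sum(axis=1)).ravel()
D_inv_sqrt = sparse.diags(1.0 / np.sqrt(deg))
A_norm = D_inv_sqrt @ A @ D_inv_sqrt

# One graph-convolution layer applied to a toy full-sky map: y = w0*x + w1*A_norm x.
x = np.random.default_rng(4).standard_normal(npix)
w0, w1 = 0.5, 0.5
y = w0 * x + w1 * (A_norm @ x)
print("input/output std:", x.std(), y.std())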

Read this paper on arXiv…

N. Perraudin, M. Defferrard, T. Kacprzak, et. al.
Tue, 30 Oct 18
70/73

Comments: N/A

Real-time regression analysis with deep convolutional neural networks [CL]

http://arxiv.org/abs/1805.02716


We discuss the development of novel deep learning algorithms to enable real-time regression analysis for time series data. We showcase the application of this new method with a timely case study, and then discuss the applicability of this approach to tackle similar challenges across science domains.

Read this paper on arXiv…

E. Huerta, D. George, Z. Zhao, et. al.
Wed, 9 May 18
19/55

Comments: 3 pages. Position Paper accepted to SciML2018: DOE ASCR Workshop on Scientific Machine Learning. North Bethesda, MD, United States, January 30-February 1, 2018

Advice from the Oracle: Really Intelligent Information Retrieval [CL]

http://arxiv.org/abs/1801.00815


What is “intelligent” information retrieval? Essentially this is asking what intelligence is; in this article I will attempt to show some of the aspects of human intelligence as related to information retrieval. I will do this by the device of a semi-imaginary Oracle. Every Observatory has an oracle: someone who is a distinguished scientist, has great administrative responsibilities, acts as mentor to a number of less senior people and as trusted advisor to even the most accomplished scientists, and knows essentially everyone in the field. In an appendix I will present a brief summary of the Statistical Factor Space method for text indexing and retrieval, and indicate how it will be used in the Astrophysics Data System Abstract Service. 2018 Keywords: Personal Digital Assistant; Supervised Topic Models
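The Statistical Factor Space method is described only in the paper's appendix; assuming it is close in spirit to latent semantic indexing, the sketch below shows the related recipe of a TF-IDF term-document matrix, truncated SVD factors, and cosine-similarity retrieval. The toy corpus and the equivalence to the paper's method are assumptions on the editor's part.

from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.decomposition import TruncatedSVD
from sklearn.metrics.pairwise import cosine_similarity

abstracts = [
    "gravitational wave parameter estimation with deep learning",
    "maximum entropy image reconstruction from x-ray visibilities",
    "anomaly detection with dimensionality reduction and clustering",
    "calibration of radio interferometers with stochastic optimization",
]

# Term-document matrix, then projection onto a small number of latent factors.
vec = TfidfVectorizer()
tfidf = vec.fit_transform(abstracts)
svd = TruncatedSVD(n_components=2, random_state=0)
doc_factors = svd.fit_transform(tfidf)

# Retrieval: project the query into factor space and rank by cosine similarity.
query = "deep learning for gravitational waves"
q_factors = svd.transform(vec.transform([query]))
scores = cosine_similarity(q_factors, doc_factors).ravel()
for rank, idx in enumerate(scores.argsort()[::-1], start=1):
    print(rank, round(scores[idx], 3), abstracts[idx])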

Read this paper on arXiv…

M. Kurtz
Thu, 4 Jan 18
20/44

Comments: Author copy; published 25 years ago at the beginning of the Astrophysics Data System; 2018 keywords added

Crossmatching variable objects with the Gaia data [IMA]

http://arxiv.org/abs/1702.04165


Tens of millions of new variable objects are expected to be identified in over a billion time series from the Gaia mission. Crossmatching known variable sources with those from Gaia is crucial to incorporate current knowledge, understand how these objects appear in the Gaia data, train supervised classifiers to recognise known classes, and validate the results of the Variability Processing and Analysis Coordination Unit (CU7) within the Gaia Data Analysis and Processing Consortium (DPAC). The method employed by CU7 to crossmatch variables for the first Gaia data release includes a binary classifier to take into account positional uncertainties, proper motion, targeted variability signals, and artefacts present in the early calibration of the Gaia data. Crossmatching with a classifier makes it possible to automate all those decisions which are typically made during visual inspection. The classifier can be trained with objects characterized by a variety of attributes to ensure similarity in multiple dimensions (astrometry, photometry, time-series features), with no need for a-priori transformations to compare different photometric bands, or of predictive models of the motion of objects to compare positions. Other advantages as well as some disadvantages of the method are discussed. Implementation steps from the training to the assessment of the crossmatch classifier and selection of results are described.
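The CU7 classifier and its exact attribute set are not spelled out at the level of this abstract; the sketch below only illustrates the general idea of recasting crossmatching as binary classification over candidate pairs described by positional and photometric features, using a random forest on synthetic pairs. Feature names and distributions are made up.

import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(5)
n = 4000

# Synthetic candidate pairs: true matches have small separations and consistent magnitudes.
is_match = rng.integers(0, 2, size=n)
separation_arcsec = np.where(is_match, np.abs(rng.normal(0.0, 0.3, n)),
                             rng.uniform(0.0, 5.0, n))
mag_difference = np.where(is_match, rng.normal(0.0, 0.1, n), rng.normal(0.0, 1.0, n))
proper_motion_masyr = rng.exponential(5.0, n)          # nuisance attribute

X = np.column_stack([separation_arcsec, mag_difference, proper_motion_masyr])
X_tr, X_te, y_tr, y_te = train_test_split(X, is_match, test_size=0.3, random_state=0)

clf = RandomForestClassifier(n_estimators=200, random_state=0).fit(X_tr, y_tr)
print("held-out accuracy:", round(clf.score(X_te, y_te), 3))
print("match probability for a 0.2'' pair:",
      round(clf.predict_proba([[0.2, 0.05, 3.0]])[0, 1], 3))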

Read this paper on arXiv…

L. Rimoldini, K. Nienartowicz, M. Suveges, et. al.
Wed, 15 Feb 17
25/47

Comments: 4 pages, 1 figure, in Astronomical Data Analysis Software and Systems XXVI, Astronomical Society of the Pacific Conference Series

Enabling Dark Energy Science with Deep Generative Models of Galaxy Images [IMA]

http://arxiv.org/abs/1609.05796


Understanding the nature of dark energy, the mysterious force driving the accelerated expansion of the Universe, is a major challenge of modern cosmology. The next generation of cosmological surveys, specifically designed to address this issue, relies on accurate measurements of the apparent shapes of distant galaxies. However, shape measurement methods suffer from various unavoidable biases and therefore will rely on a precise calibration to meet the accuracy requirements of the science analysis. This calibration process remains an open challenge as it requires large sets of high quality galaxy images. To this end, we study the application of deep conditional generative models to generating realistic galaxy images. In particular we consider variations on the conditional variational autoencoder and introduce a new adversarial objective for training conditional generative networks. Our results suggest a reliable alternative to the acquisition of expensive high quality observations for generating the calibration data needed by the next generation of cosmological surveys.

Read this paper on arXiv…

S. Ravanbakhsh, F. Lanusse, R. Mandelbaum, et. al.
Tue, 20 Sep 16
10/74

Comments: N/A

On the estimation of stellar parameters with uncertainty prediction from Generative Artificial Neural Networks: application to Gaia RVS simulated spectra [IMA]

http://arxiv.org/abs/1607.05954


Aims. We present an innovative artificial neural network (ANN) architecture, called Generative ANN (GANN), that computes the forward model, that is, it learns the function that relates the unknown outputs (stellar atmospheric parameters, in this case) to the given inputs (spectra). Such a model can be integrated in a Bayesian framework to estimate the posterior distribution of the outputs. Methods. The architecture of the GANN follows the same scheme as a normal ANN, but with the inputs and outputs inverted. We train the network with the set of atmospheric parameters (Teff, logg, [Fe/H] and [alpha/Fe]), obtaining the stellar spectra for such inputs. The residuals between the spectra in the grid and the estimated spectra are minimized using a validation dataset to keep solutions as general as possible. Results. The performance of both conventional ANNs and GANNs in estimating the stellar parameters as a function of star brightness is presented and compared for different Galactic populations. GANNs provide significantly improved parameterizations for early and intermediate spectral types with rich and intermediate metallicities. The behaviour of both algorithms is very similar for our sample of late-type stars, obtaining residuals in the derivation of [Fe/H] and [alpha/Fe] below 0.1 dex for stars with Gaia magnitude Grvs<12, which corresponds to on the order of four million stars to be observed by the Radial Velocity Spectrograph of the Gaia satellite. Conclusions. Uncertainty estimation of computed astrophysical parameters is crucial for the validation of the parameterization itself and for the subsequent exploitation by the astronomical community. GANNs produce not only the parameters for a given spectrum, but also a goodness-of-fit between the observed spectrum and the predicted one for a given set of parameters. Moreover, they allow us to obtain the full posterior distribution…
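As a toy illustration of the generative/forward-model idea (not the GANN architecture itself), the sketch below trains a small network to map a single atmospheric parameter to a synthetic spectrum and then scores a grid of candidate parameters against an observed spectrum with a Gaussian likelihood. The network size, parameter range and the fake "spectra" are all invented for the example.

import numpy as np
from sklearn.neural_network import MLPRegressor

rng = np.random.default_rng(6)

def toy_spectrum(teff):
    """Stand-in 'spectrum' depending on one parameter (effective temperature)."""
    wave = np.linspace(0, 1, 50)
    return np.exp(-((wave - teff / 10000.0) ** 2) / 0.02) + 0.1 * wave

# Train the forward model: parameters -> spectrum (the generative direction).
teff_grid = rng.uniform(4000, 8000, size=2000)
spectra = np.array([toy_spectrum(t) for t in teff_grid])
forward = MLPRegressor(hidden_layer_sizes=(64, 64), max_iter=2000,
                       random_state=0).fit(teff_grid[:, None] / 8000.0, spectra)

# "Observed" spectrum with noise, and a grid-based posterior over Teff.
teff_true, sigma = 5600.0, 0.02
observed = toy_spectrum(teff_true) + rng.normal(0, sigma, 50)
candidates = np.linspace(4000, 8000, 400)
predicted = forward.predict(candidates[:, None] / 8000.0)
chi2 = np.sum((predicted - observed) ** 2, axis=1) / sigma**2
posterior = np.exp(-0.5 * (chi2 - chi2.min()))
posterior /= posterior.sum()
print("posterior mean Teff:", round(float(np.sum(candidates * posterior)), 1))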

Read this paper on arXiv…

C. Dafonte, D. Fustes, M. Manteiga, et. al.
Thu, 21 Jul 16
14/48

Comments: N/A

Development of a VO Registry Subject Ontology using Automated Methods [IMA]

http://arxiv.org/abs/1502.05974


We report on our initial work to automate the generation of a domain ontology using subject fields of resources held in the Virtual Observatory registry. Preliminary results are comparable to more generalized ontology learning software currently in use. We expect to be able to refine our solution to improve both the depth and breadth of the generated ontology.

Read this paper on arXiv…

B. Thomas
Mon, 23 Feb 15
35/43

Comments: N/A

Renewal Strings for Cleaning Astronomical Databases [CL]

http://arxiv.org/abs/1408.1489


Large astronomical databases obtained from sky surveys such as the SuperCOSMOS Sky Surveys (SSS) invariably suffer from a small number of spurious records coming from artefactual effects of the telescope, satellites and junk objects in orbit around the Earth, and physical defects on the photographic plate or CCD. Though relatively small in number, these spurious records present a significant problem in many situations, where they can become a large proportion of the records potentially of interest to a given astronomer. In this paper we focus on the four most common causes of unwanted records in the SSS: satellite or aeroplane tracks; scratches, fibres and other linear phenomena introduced to the plate; circular halos around bright stars due to internal reflections within the telescope; and diffraction spikes near bright stars. Accurate and robust techniques are needed for locating and flagging such spurious objects. We have developed renewal strings, a probabilistic technique combining the Hough transform, renewal processes and hidden Markov models, which has proven highly effective in this context. The methods are applied to the SSS data to develop a dataset of spurious object detections, along with confidence measures, which can allow this unwanted data to be removed from consideration. These methods are general and can be adapted to any future astronomical survey data.
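Renewal strings combine the Hough transform with renewal processes and hidden Markov models; the snippet below only shows the first ingredient, using scikit-image's probabilistic Hough transform to flag detections lying along linear features such as satellite trails in a binary detection mask. The mask and thresholds are synthetic stand-ins for the survey data.

import numpy as np
from skimage.transform import probabilistic_hough_line

rng = np.random.default_rng(7)

# Synthetic detection mask: random "stars" plus a linear satellite-like trail.
mask = np.zeros((512, 512), dtype=bool)
stars = rng.integers(0, 512, size=(300, 2))
mask[stars[:, 0], stars[:, 1]] = True
rr = np.arange(50, 450)
mask[rr, (0.6 * rr + 40).astype(int)] = True          # the trail

# Probabilistic Hough transform returns line segments as ((x0, y0), (x1, y1)).
segments = probabilistic_hough_line(mask, threshold=10, line_length=100, line_gap=5)
print("linear features found:", len(segments))
for (x0, y0), (x1, y1) in segments[:3]:
    print(f"segment from ({x0}, {y0}) to ({x1}, {y1})")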

Read this paper on arXiv…

A. Storkey, N. Hambly, C. Williams, et. al.
Fri, 8 Aug 14
11/52

Comments: Appears in Proceedings of the Nineteenth Conference on Uncertainty in Artificial Intelligence (UAI2003)

A practical approach to ontology-enabled control systems for astronomical instrumentation [IMA]

http://arxiv.org/abs/1310.5488


Even though modern service-oriented and data-oriented architectures promise to deliver loosely coupled control systems, they are inherently brittle as they commonly depend on a priori agreed interfaces and data models. At the same time, the Semantic Web and a whole set of accompanying standards and tools are emerging, advocating ontologies as the basis for knowledge exchange. In this paper we aim to identify a number of key ideas from the myriad of knowledge-based practices that can readily be implemented by control systems today. We demonstrate with a practical example (a three-channel imager for the Mercator Telescope) how ontologies developed in the Web Ontology Language (OWL) can serve as a meta-model for our instrument, covering as many engineering aspects of the project as needed. We show how a concrete system model can be built on top of this meta-model via a set of Domain Specific Languages (DSLs), supporting both formal verification and the generation of software and documentation artifacts. Finally we reason how the available semantics can be exposed at run-time by adding a “semantic layer” that can be browsed, queried, monitored etc. by any OPC UA-enabled client.

Read this paper on arXiv…

Date added: Tue, 22 Oct 13