Machine Learning based search for Cataclysmic Variables within Gaia Science Alerts [SSA]

http://arxiv.org/abs/2210.01431


Wide-field time-domain facilities detect transient events in large numbers through difference imaging. For example, the Zwicky Transient Facility produces alerts for hundreds of thousands of transient events per night, a rate set to be dwarfed by the upcoming Vera Rubin Observatory. The automation provided by Machine Learning (ML) is therefore necessary to classify these events and select the most interesting sources for follow-up observations. Cataclysmic Variables (CVs) are a class of transients that are numerous, bright, and nearby, providing excellent laboratories for the study of accretion and binary evolution. Here we focus on our use of ML to identify CVs from photometric data of transient sources published by the Gaia Science Alerts program (GSA), a large, easily accessible resource that has not been fully explored with ML. The use of light curve feature extraction techniques and source metadata from the Gaia survey resulted in a Random Forest model capable of distinguishing CVs from supernovae, Active Galactic Nuclei, and Young Stellar Objects with a 92% precision score and an 85% hit rate. Of 13,280 sources within GSA without an assigned transient classification, our model predicts the CV class for ~2800. Spectroscopic observations are underway to classify a statistically significant sample of these targets to validate the performance of the model. This work puts us on a path towards the classification of rare CV subtypes from future wide-field surveys such as the Legacy Survey of Space and Time.
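As a reading aid, the two scores quoted in the abstract can be illustrated with a minimal sketch. It assumes the standard definitions: precision is the fraction of sources predicted as CVs that really are CVs, and hit rate (also called recall) is the fraction of true CVs that the model recovers. The function name and the label lists are hypothetical and made up for illustration; they are not the authors' code or data.

```python
def precision_and_hit_rate(true_labels, predicted_labels, positive="CV"):
    """Score one class against all others (one-vs-rest)."""
    pairs = list(zip(true_labels, predicted_labels))
    # True positives: CVs correctly predicted as CVs.
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    # False positives: other classes wrongly predicted as CVs.
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    # False negatives: CVs the classifier missed.
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    hit_rate = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, hit_rate


# Made-up labels for illustration only; not results from the paper.
true_labels = ["CV", "CV", "CV", "CV", "SN", "SN", "AGN", "YSO"]
predicted_labels = ["CV", "CV", "CV", "SN", "CV", "SN", "AGN", "YSO"]

precision, hit_rate = precision_and_hit_rate(true_labels, predicted_labels)
print(f"precision={precision:.2f}, hit rate={hit_rate:.2f}")
```

In this toy sample one supernova is mislabelled as a CV and one CV is missed, so the two scores happen to coincide; in general they differ, as they do in the abstract.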

D. Mistry, C. Copperwheat, M. Darnley, et al.
Wed, 5 Oct 22

Comments: 16 pages, 8 figures, 8 tables