Quasar photometric redshifts from incomplete data using Deep Learning [CEA]

Forthcoming astronomical surveys are expected to detect new sources in such large numbers that measuring their spectroscopic redshifts will not be practical. Thus, there is much interest in using machine learning to yield the redshift from the photometry of each object. We are particularly interested in radio sources (quasars) detected with the Square Kilometre Array and have found Deep Learning, trained upon a large optically-selected sample of quasi-stellar objects, to be effective in predicting the redshifts in three external samples of radio-selected sources. However, the requirement of nine different magnitudes, from the near-infrared, optical and ultra-violet bands, significantly reduces the number of sources for which redshifts can be predicted. Here we explore the possibility of using machine learning to impute the missing features. We find that for the training sample, simple imputation is sufficient, particularly replacing the missing magnitude with the maximum for that band, thus presuming that the non-detection is at the sensitivity limit. For the test samples, however, this does not perform as well as multivariate imputation, which suggests that many of the missing magnitudes are not limits, but have indeed not been observed. From extensive testing of the models, we suggest that the imputation is best restricted to two missing values per source. Where the sources overlap on the sky, in the worst case, this increases the fraction of sources for which redshifts can be estimated from 46% to 80%, with >90% being reached for the other samples.
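
The simple imputation described above can be illustrated with a minimal sketch. It is not the author's code: the function name `impute_band_maximum`, the `max_missing` parameter, and the four-band toy catalogue are all assumptions made for illustration (the paper uses nine magnitudes). Each missing magnitude is replaced with the largest, i.e. faintest, observed value in its band, and sources with more than two gaps are dropped rather than imputed.

```python
import math


def impute_band_maximum(magnitudes, max_missing=2):
    """Fill missing magnitudes (NaN) with the faintest observed value per band.

    magnitudes: list of rows, one per source, each a list of per-band values.
    Sources with more than `max_missing` gaps are dropped, not imputed.
    Returns (kept_indices, imputed_rows).
    """
    n_bands = len(magnitudes[0])

    # Largest magnitude observed in each band (larger magnitude = fainter),
    # used here as a stand-in for the sensitivity limit of that band.
    band_max = []
    for b in range(n_bands):
        observed = [row[b] for row in magnitudes if not math.isnan(row[b])]
        band_max.append(max(observed) if observed else math.nan)

    kept, imputed = [], []
    for i, row in enumerate(magnitudes):
        n_missing = sum(math.isnan(m) for m in row)
        if n_missing > max_missing:
            continue  # too many gaps: leave this source without a prediction
        kept.append(i)
        imputed.append(
            [band_max[b] if math.isnan(m) else m for b, m in enumerate(row)]
        )
    return kept, imputed


# Hypothetical toy catalogue: 4 sources x 4 bands, NaN marks a missing value.
nan = math.nan
sample = [
    [18.2, 17.9, 17.5, 17.1],
    [21.0, nan, 20.1, nan],
    [nan, nan, nan, 19.0],
    [19.5, 19.9, nan, 18.8],
]
kept, filled = impute_band_maximum(sample, max_missing=2)
```

In this toy case the third source, with three missing bands, is excluded, while the second and fourth are completed from the band maxima. The multivariate imputation that the abstract finds better for the test samples would instead predict each missing magnitude from the other bands of the same source; a library routine such as scikit-learn's `IterativeImputer` is one possible way to try that, though the abstract does not say which implementation was used.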

S. Curran
Wed, 9 Mar 22
22/68

Comments: MNRAS, pending minor revision