The Cavendish Computors: The women working in scientific computing for Radio Astronomy [IMA]

http://arxiv.org/abs/2205.07267


A discussion of the history of scientific computing for Radio Astronomy in the Cavendish Laboratory of the University of Cambridge in the decades after the Second World War. This covers the development of the aperture synthesis technique for Radio Astronomy and how that required using the new computing technology developed by the University’s Mathematical Laboratory: the EDSAC, EDSAC 2 and TITAN computers. It looks at the scientific advances made by the Radio Astronomy group, particularly the assembling of evidence which contradicted the Steady State Hypothesis. It also examines the software advances that allowed bigger telescopes to be built: the Fast Fourier Transform (FFT) and the degridding algorithm. Throughout, the contribution of women is uncovered, from the diagrams they drew for scientific publications, through programming and operating computers, to writing scientific papers.

Read this paper on arXiv…

V. Allan
Tue, 17 May 22
29/95

Comments: First presented at the Joint BSHM CSHPM/SCHPM Conference People, Places, Practices at St Andrews, July 2021

A Sonification of the zCOSMOS Galaxy Dataset [CL]

http://arxiv.org/abs/2202.05539


Sonification is the transformation of data into acoustic signals, achievable through different techniques. Sonification can be defined as a way to represent data values and relations as perceivable sounds, aiming at facilitating their communication and interpretation. Just as data visualization provides meaning via images, sonification conveys meaning via sound. Sonification approaches are useful in a number of scenarios. A first case is the possibility of receiving information while keeping other sensory channels free, as in medical environments, while driving, etc. Another scenario is easier recognition of patterns when data present high dimensionality and cardinality. Finally, sonification can be applied to presentation and dissemination initiatives, also with artistic goals. The zCOSMOS dataset contains detailed data about almost 20000 galaxies, describing the evolution of a relatively small portion of the universe in the last 10 million years in terms of galaxy mass, absolute luminosity, redshift, distance, age, and star formation rate. The present paper proposes a sonification for the mentioned dataset, with the following goals: i) providing a general description of the dataset, accessible via sound, which could also make unnoticed patterns emerge; ii) realizing an artistic but scientifically accurate sonic portrait of a portion of the universe, thus filling the gap between art and science in the context of scientific dissemination and so-called “edutainment”; iii) adding value to the dataset, since scientific data and achievements must also be considered a cultural heritage that needs to be preserved and enhanced. Both scientific and technological aspects of the sonification are addressed.
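
To make the idea of representing data values as sounds concrete, here is a minimal sketch of one common technique, parameter mapping, in which each data value is rescaled onto a pitch. This is not the mapping used in the paper, which the abstract does not specify; the function names, the MIDI note range, and the sample values are all assumptions chosen for illustration.

```python
def scale_to_midi(values, low_note=48, high_note=84):
    """Linearly rescale data values onto a MIDI note range.

    The range 48-84 (three octaves) is an arbitrary, hypothetical choice.
    """
    lo, hi = min(values), max(values)
    if hi == lo:
        # Constant data carries no contrast: place every point mid-range.
        mid = round((low_note + high_note) / 2)
        return [mid for _ in values]
    span = high_note - low_note
    return [round(low_note + (v - lo) / (hi - lo) * span) for v in values]


def midi_to_hz(note):
    """Convert a MIDI note number to frequency (equal temperament, A4 = 440 Hz)."""
    return 440.0 * 2.0 ** ((note - 69) / 12)


# Hypothetical redshift-like values, not taken from the zCOSMOS dataset.
sample = [0.0, 0.5, 1.0]
notes = scale_to_midi(sample)
frequencies = [midi_to_hz(n) for n in notes]
```

A linear mapping is the simplest option; a real sonification would also have to decide how other quantities, such as mass or luminosity, map onto loudness, timbre, or duration.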

Read this paper on arXiv…

S. Bardelli, C. Ferretti, L. Ludovico, et al.
Mon, 14 Feb 22
4/55

Comments: 18 pages, 6 figures

Scientific Computing in the Cavendish Laboratory and the pioneering women Computors [CL]

http://arxiv.org/abs/2106.00365


The use of computers and the role of women in radio astronomy and X-ray crystallography research at the Cavendish Laboratory between 1949 and 1975 have been investigated. We recorded examples of when computers were used, what they were used for and who used them from hundreds of papers published during these years. The use of the EDSAC, EDSAC 2 and TITAN computers was found to increase considerably over this time-scale and they were used for a diverse range of applications. The majority of references to computer operators and programmers referred to women, 57% for astronomy and 62% for crystallography, in contrast to a very small proportion, 4% and 13% respectively, of female authors of papers.

Read this paper on arXiv…

V. Allan and C. Leedham
Wed, 2 Jun 21
40/48

Comments: 11 pages, 8 figures, submitted to IEEE Annals in the History of Computing, (C) IEEE 2021

Gender Imbalance and Spatiotemporal Patterns of Contributions to Citizen Science Projects: the case of Zooniverse [CL]

http://arxiv.org/abs/2101.02695


Citizen Science is research undertaken by professional scientists and members of the public collaboratively. Despite numerous benefits of citizen science for both the advancement of science and the community of the citizen scientists, there is still no comprehensive knowledge of patterns of contributions, and the demography of contributors to citizen science projects. In this paper we provide a first overview of the spatiotemporal and gender distribution of the citizen science workforce by analyzing 54 million classifications contributed by more than 340 thousand citizen science volunteers from 198 countries to one of the largest citizen science platforms, Zooniverse. First we report on the uneven geographical distribution of the citizen scientists and model the variations among countries based on the socio-economic conditions as well as the level of research investment in each country. Analyzing the temporal features of contributions, we report on high “burstiness” of participation instances as well as the leisurely nature of participation suggested by the time of the day that the citizen scientists were the most active. Finally, we discuss the gender imbalance among citizen scientists (about 30% female) and compare it with other collaborative projects as well as the gender distribution in more formal scientific activities. Citizen science projects need further attention from outside of the academic community, and our findings can help attract the attention of public and private stakeholders, as well as inform the design of the platforms and science policy making processes.
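
The abstract does not say how “burstiness” is measured. One widely used definition is the coefficient B = (σ − μ) / (σ + μ), where μ and σ are the mean and standard deviation of the gaps between consecutive events; whether the paper uses exactly this measure is an assumption here. A minimal sketch, with a hypothetical function name and made-up timestamps:

```python
from statistics import mean, pstdev


def burstiness(event_times):
    """Burstiness coefficient B = (sigma - mu) / (sigma + mu) of inter-event gaps.

    B is -1 for perfectly regular events, near 0 for a Poisson-like
    process, and approaches +1 for highly bursty activity.
    """
    times = sorted(event_times)
    gaps = [later - earlier for earlier, later in zip(times, times[1:])]
    if len(gaps) < 2:
        raise ValueError("need at least three events")
    mu = mean(gaps)
    sigma = pstdev(gaps)  # population standard deviation of the gaps
    if mu + sigma == 0:
        raise ValueError("all events share the same timestamp")
    return (sigma - mu) / (sigma + mu)


# Made-up timestamps (arbitrary units), not data from the study.
regular = burstiness([0, 1, 2, 3, 4])    # evenly spaced events
bursty = burstiness([0, 1, 2, 3, 100])   # a burst followed by a long pause
```

Evenly spaced events give the minimum value of -1, while a cluster of events followed by a long silence pushes the coefficient above zero.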

Read this paper on arXiv…

K. Ibrahim, S. Khodursky and T. Yasseri
Fri, 8 Jan 21
22/48

Comments: Under Review

Towards Inclusive Practices with Indigenous Knowledge [CL]

http://arxiv.org/abs/2009.12425


Astronomy across world cultures is rooted in Indigenous Knowledge. We share models of partnering with indigenous communities involving Collaboration with Integrity to co-create an inclusive scientific enterprise on Earth and in space.

Read this paper on arXiv…

A. Venkatesan, D. Begay, A. Burgasser, et al.
Tue, 29 Sep 20
49/98

Comments: 3 pages formatted in Nature style, published as a Comment in DEI focus issue in Nature Astronomy

Recommendations for Planning Inclusive Astronomy Conferences [CL]

http://arxiv.org/abs/2007.10970


The Inclusive Astronomy (IA) conference series aims to create a safe space where community members can listen to the experiences of marginalized individuals in astronomy, discuss actions being taken to address inequities, and give recommendations to the community for how to improve diversity, equity, and inclusion in astronomy. The first IA was held in Nashville, TN, USA, 17-19 June, 2015. The Inclusive Astronomy 2 (IA2) conference was held in Baltimore, MD, USA, 14-15 October, 2019. The Inclusive Astronomy 2 (IA2) Local Organizing Committee (LOC) has put together a comprehensive document of recommendations for planning future Inclusive Astronomy conferences based on feedback received and lessons learned. While these are specific to the IA series, many parts will be applicable to other conferences as well. Please find the recommendations and accompanying letter to the community here: https://outerspace.stsci.edu/display/IA2/LOC+Recommendations.

Read this paper on arXiv…

Inclusive Astronomy 2 Local Organizing Committee, B. Brooks, K. Brooks, et al.
Wed, 22 Jul 20
-448/67

Comments: 41 pages. An editable version of the document and contact information available here: this https URL

When Scientists Become Social Scientists: How Citizen Science Projects Learn About Volunteers [IMA]

http://arxiv.org/abs/1802.00362


Online citizen science projects involve recruitment of volunteers to assist researchers with the creation, curation, and analysis of large datasets. Enhancing the quality of these data products is a fundamental concern for teams running citizen science projects. Decisions about a project’s design and operations have a critical effect both on whether the project recruits and retains enough volunteers, and on the quality of volunteers’ work. The processes by which the team running a project learn about their volunteers play a critical role in these decisions. Improving these processes will enhance decision-making, resulting in better quality datasets, and more successful outcomes for citizen science projects. This paper presents a qualitative case study, involving interviews and long-term observation, of how the team running Galaxy Zoo, a major citizen science project in astronomy, came to know their volunteers and how this knowledge shaped their decision-making processes. This paper presents three instances that played significant roles in shaping Galaxy Zoo team members’ understandings of volunteers. Team members integrated heterogeneous sources of information to derive new insights into the volunteers. Project metrics and formal studies of volunteers combined with tacit understandings gained through on- and offline interactions with volunteers. This paper presents a number of recommendations for practice. These recommendations include strategies for improving how citizen science project team members learn about volunteers, and how teams can more effectively circulate among themselves what they learn.

Read this paper on arXiv…

P. Darch
Fri, 2 Feb 18
6/48

Comments: 15 pages

Software metadata: How much is enough? [IMA]

http://arxiv.org/abs/1712.02398


Broad efforts are underway to capture metadata about research software and retain it across services; notable in this regard is the CodeMeta project. What metadata are important to have about (research) software? What metadata are useful for searching for codes? What would you like to learn about astronomy software? This BoF sought to gather information on metadata most desired by researchers and users of astro software and others interested in registering, indexing, capturing, and doing research on this software. Information from this BoF could conceivably result in changes to the Astrophysics Source Code Library (ASCL) or other resources for the benefit of the community or provide input into other projects concerned with software metadata.

Read this paper on arXiv…

A. Allen, P. Teuben, G. Berriman, et al.
Fri, 8 Dec 17
15/70

Comments: 4 pages; to be published in ADASS XXVII (held Oct 22-26, 2017 in Santiago, Chile) proceedings

Hack Weeks as a model for Data Science Education and Collaboration [CL]

http://arxiv.org/abs/1711.00028


Across almost all scientific disciplines, the instruments that record our experimental data and the methods required for storage and data analysis are rapidly increasing in complexity. This gives rise to the need for scientific communities to adapt on shorter time scales than traditional university curricula allow for, and therefore requires new modes of knowledge transfer. The universal applicability of data science tools to a broad range of problems has generated new opportunities to foster exchange of ideas and computational workflows across disciplines. In recent years, hack weeks have emerged as an effective tool for fostering these exchanges by providing training in modern data analysis workflows. While there are variations in hack week implementation, all events consist of a common core of three components: tutorials in state-of-the-art methodology, peer-learning and project work in a collaborative environment. In this paper, we present the concept of a hack week in the larger context of scientific meetings and point out similarities and differences to traditional conferences. We motivate the need for such an event and present in detail its strengths and challenges. We find that hack weeks are successful at cultivating collaboration and the exchange of knowledge. Participants self-report that these events help them both in their day-to-day research and in their careers. Based on our results, we conclude that hack weeks present an effective, easy-to-implement, fairly low-cost tool to positively impact data analysis literacy in academic disciplines, foster collaboration and cultivate best practices.

Read this paper on arXiv…

D. Huppenkothen, A. Arendt, D. Hogg, et al.
Thu, 2 Nov 17
24/71

Comments: 15 pages, 2 figures, submitted to PNAS, all relevant code available at this https URL

Usage Bibliometrics as a Tool to Measure Research Activity [CL]

http://arxiv.org/abs/1706.02153


Measures for research activity and impact have become an integral ingredient in the assessment of a wide range of entities (individual researchers, organizations, instruments, regions, disciplines). Traditional bibliometric indicators, like publication and citation based indicators, provide an essential part of this picture, but cannot describe the complete picture. Since reading scholarly publications is an essential part of the research life cycle, it is only natural to introduce measures for this activity in attempts to quantify the efficiency, productivity and impact of an entity. Citations and reads are significantly different signals, so taken together, they provide a more complete picture of research activity. Most scholarly publications are now accessed online, making the study of reads and their patterns possible. Click-stream logs allow us to follow information access by the entire research community in real time. Publication and citation datasets reflect only activity by authors. In addition, download statistics will help us identify publications with significant impact, but which do not attract many citations. Click-stream signals are arguably more complex than, say, citation signals. For one, they are a superposition of different classes of readers. Systematic downloads by crawlers also contaminate the signal, as does browsing behavior. We discuss the complexities associated with click-stream data and how, with proper filtering, statistically significant relations and conclusions can be inferred from download statistics. We describe how download statistics can be used to describe research activity at different levels of aggregation, ranging from organizations to countries. These statistics show a correlation with socio-economic indicators. A comparison will be made with traditional bibliometric indicators. We will argue that astronomy is representative of more general trends.

Read this paper on arXiv…

E. Henneken and M. Kurtz
Thu, 8 Jun 17
7/69

Comments: 25 pages, 11 figures, accepted for publication in Handbook of Quantitative Science and Technology Research, Springer

The Durability and Fragility of Knowledge Infrastructures: Lessons Learned from Astronomy [CL]

http://arxiv.org/abs/1611.00055


Infrastructures are not inherently durable or fragile, yet all are fragile over the long term. Durability requires care and maintenance of individual components and the links between them. Astronomy is an ideal domain in which to study knowledge infrastructures, due to its long history, transparency, and accumulation of observational data over a period of centuries. Research reported here draws upon a long-term study of scientific data practices to ask questions about the durability and fragility of infrastructures for data in astronomy. Methods include interviews, ethnography, and document analysis. As astronomy has become a digital science, the community has invested in shared instruments, data standards, digital archives, metadata and discovery services, and other relatively durable infrastructure components. Several features of data practices in astronomy contribute to the fragility of that infrastructure. These include different archiving practices between ground- and space-based missions, between sky surveys and investigator-led projects, and between observational and simulated data. Infrastructure components are tightly coupled, based on international agreements. However, the durability of these infrastructures relies on much invisible work – cataloging, metadata, and other labor conducted by information professionals. Continual investments in care and maintenance of the human and technical components of these infrastructures are necessary for sustainability.

Read this paper on arXiv…

C. Borgman, P. Darch, A. Sands, et al.
Wed, 2 Nov 16
27/55

Comments: Paper presented at the 2016 Annual Meeting of the Association for Information Science and Technology, October 14-18, 2016, Copenhagen, Denmark. 10 pages; this https URL

Science Learning via Participation in Online Citizen Science [IMA]

http://arxiv.org/abs/1601.05973


We investigate the development of scientific content knowledge of volunteers participating in online citizen science projects in the Zooniverse (www.zooniverse.org), including the astronomy projects Galaxy Zoo (www.galaxyzoo.org) and Planet Hunters (www.planethunters.org). We use econometric methods to test how measures of project participation relate to success in a science quiz, controlling for factors known to correlate with scientific knowledge. Citizen scientists believe they are learning about both the content and processes of science through their participation. We don’t directly test the latter, but we find evidence to support the former – that more actively engaged participants perform better in a project-specific science knowledge quiz, even after controlling for their general science knowledge. We interpret this as evidence of learning of science content inspired by participation in online citizen science.

Read this paper on arXiv…

K. Masters, E. Oh, J. Cox, et al.
Mon, 25 Jan 16
24/56

Comments: 32 pages (9 pages of Appendix material). Accepted for publication in the Journal of Science Communication (JCOM; this http URL)

From Stars to Patients: Lessons from Space Science and Astrophysics for Health Care Informatics [IMA]

http://arxiv.org/abs/1512.05272


Big Data are revolutionizing nearly every aspect of modern society. One area where this can have a profound positive societal impact is the field of Health Care Informatics (HCI), which faces many challenges. The key question behind this study is: can we use some of the experience and the technical and methodological solutions from fields that have successfully adapted to the Big Data era, namely astronomy and space science, to help accelerate the progress of HCI? We illustrate this with examples from the Virtual Observatory framework, and the NCI EDRN project. An effective sharing and reuse of tools, methods, and experiences from different fields can save a lot of effort, time, and expense. HCI can thus benefit from the proven solutions to big data challenges from other domains.

Read this paper on arXiv…

S. Djorgovski, A. Mahabal, D. Crichton, et al.
Thu, 17 Dec 15
11/55

Comments: 3 pages, to appear in refereed Proc. IEEE Big Data 2015, IEEE press

From Thread to Transcontinental Computer: Disturbing Lessons in Distributed Supercomputing [IMA]

http://arxiv.org/abs/1507.01138


We describe the political and technical complications encountered during the astronomical CosmoGrid project. CosmoGrid is a numerical study on the formation of large scale structure in the universe. The simulations are challenging due to the enormous dynamic range in spatial and temporal coordinates, as well as the enormous computer resources required. In CosmoGrid we dealt with the computational requirements by connecting up to four supercomputers via an optical network and making them operate as a single machine. This was challenging, if only for the fact that the supercomputers of our choice are separated by half the planet, as three of them are scattered across Europe and the fourth one is in Tokyo. The co-scheduling of multiple computers and the ‘gridification’ of the code enabled us to achieve an efficiency of up to 93% for this distributed intercontinental supercomputer. In this work, we find that high-performance computing on a grid can be done much more effectively if the sites involved are willing to be flexible about their user policies, and that having facilities to provide such flexibility could be key to strengthening the position of the HPC community in an increasingly Cloud-dominated computing landscape. Given that smaller computer clusters owned by research groups or university departments usually have flexible user policies, we argue that it could be easier to instead realize distributed supercomputing by combining tens, hundreds or even thousands of these resources.

Read this paper on arXiv…

D. Groen and S. Zwart
Tue, 7 Jul 15
26/65

Comments: Accepted for publication in IEEE conference on ERRORs

Crowdfunding Astronomy Outreach Projects: Lessons Learned from the UNAWE Crowdfunding Campaign [CL]

http://arxiv.org/abs/1412.2115


In recent years, crowdfunding has become a popular method of funding new technology or entertainment products, or artistic projects. The idea is that people or projects ask for many small donations from individuals who support the proposed work, rather than a large amount from a single source. Crowdfunding is usually done via an online portal or platform which handles the financial transactions involved. The Universe Awareness (UNAWE) programme decided to undertake a Kickstarter crowdfunding campaign centring on the resource Universe in a Box. In this article we present the lessons learned and best practices from that campaign.

Read this paper on arXiv…

A. Ashton, P. Russo and T. Heenatigala
Mon, 8 Dec 14
12/61

Comments: Published – Communicating Astronomy with the Public journal #16 (4 pages) (2014)

The first SPIE software Hack Day [IMA]

http://arxiv.org/abs/1408.1278


We report here on the software Hack Day organised at the 2014 SPIE conference on Astronomical Telescopes and Instrumentation in Montreal. The first ever Hack Day to take place at an SPIE event, the aim of the day was to bring together developers to collaborate on innovative solutions to problems of their choice. Such events have proliferated in the technology community, providing opportunities to showcase, share and learn skills. In academic environments, these events are often also instrumental in building community beyond the limits of national borders, institutions and projects. We show examples of projects the participants worked on, and provide some lessons learned for future events.

Read this paper on arXiv…

S. Kendrew, C. Deen, N. Radziwill, et al.
Thu, 7 Aug 14
22/46

Comments: To be published in Proc. SPIE volume 9152; paper will be available in the SPIE Digital Library via Open Access

10 Simple Rules for the Care and Feeding of Scientific Data [CL]

http://arxiv.org/abs/1401.2134


This article offers a short guide to the steps scientists can take to ensure that their data and associated analyses continue to be of value and to be recognized. In just the past few years, hundreds of scholarly papers and reports have been written on questions of data sharing, data provenance, research reproducibility, licensing, attribution, privacy, and more, but our goal here is not to review that literature. Instead, we present a short guide intended for researchers who want to know why it is important to “care for and feed” data, with some practical advice on how to do that.

Read this paper on arXiv…

Fri, 10 Jan 14
58/69