Mozilla Science Sprint 2015 – Open Data Formats for Cosmic Rays

What is it?

For the last 2 years Mozilla has organised the Mozilla Science Lab Global Sprint, an opportunity for anyone with the enthusiasm and time to work with scientists or on their own project for a weekend anywhere around the world. It’s a great opportunity to talk about your own projects, see what open science is out there and get down and do some sciencing.


This year CERN was one of the hosting locations, with about 15 participants locally and a number internationally bodging, trying and sciencing over a couple of days in early June. There were some nice projects, both related to CERN and otherwise (more information, including links to our notes and work, is here):

  • Geotag-X – A way of crowd sourcing photo analysis for environmental disasters and conflict zones (i.e. take photos from disaster areas and figure out what’s going on in them)
  • Github Science Badges – Developing a badge system to visualise how open a science project is (i.e. if you make a science experiment, how open is it)
  • E3 Extreme Energy Events – Designing a monitoring system for conflict zones to designate the origin and type of explosions/impacts (i.e. if someone fires a gun, figure out where, when and how big)
  • Event displays for particle physics experiments – Letting people view particle physics events in detectors using 3D viewers
  • Open data for cosmic ray experiments – Let the many cosmic ray experiments around the world share their data with each other and the world at large. <- We go here

What were we doing?

Something we’re committed to in the CosmicPi team is making our project and the things it produces as open as possible (open to look at, free to play with as you want to). For software and hardware the way to do this is obvious: release the designs and so on. For the data it produces, this is less clear. Taking a bit of inspiration from the CERN open data portal (particle physics data you can look at yourself, rediscover the Higgs!), we thought to do something similar with cosmic ray events.

Ideally we want the data to be open, what we do to the data to get information from it to be open, and what we find out to be open. Even better, we want to be able to put in other experiments’ data anywhere along that stream! So we started with the foundation of everything: the data!

Many Experiments, Much Data

We aren’t the first people to want to build a cosmic ray experiment. They range from old ones (fly balloons and look for tracks on camera films like these students did), to big ones buried under Antarctica in a giant IceCube, to ones made by schools, like the HiSPARC project in the Netherlands. All of them make their own data in their own way – is there a way for this data to be put together so that anyone can access it? This would mean more data, better statistics, and a better understanding of how each other’s experiments work.

After some planning, discussing and experimenting (complete with the obligatory dead ends and head scratching) we came up with a standard format for cosmic ray data, summarised nicely in this mindmap, courtesy of F. Berghaus.

Putting it into Practice

From this start we had to get something functional, which is where the real fun started! We had a great team working with us, both locally and remotely, coding at various times, teaching those of us not so skilled and generally having a good time. We learnt a few things:

  • Github works great for both local and remote team development. The issues and comments system was really productive
  • Gitter is a great tool for plugging into Github. It’s essentially a chatroom wrapped into Github’s user system
  • Sprints are a great way to make rapid and focused progress on projects you don’t work on all the time

The final result of our work was a solid specification for an open data format for cosmic ray events and an example implementation in Python using JSON as the storage format. Whether this stays the final implementation remains open; we’re looking for more input from anyone with an opinion! Our code output is available on Github at https://github.com/OpenCosmics/fits-evaluation. If you want to fork it, play with it or ask about it, please feel free.
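To give a rough feel for what storing cosmic ray events as JSON from Python can look like, here is a minimal sketch. It is not the code from the repository and it does not follow the actual specification: the field names (experiment, detector_id, timestamp_utc, location) and the helper functions are hypothetical, picked only for illustration, and the values are made up.

```python
import json


def make_event(experiment, detector_id, timestamp_utc, latitude_deg, longitude_deg):
    """Build one event record as a plain dictionary.

    All field names here are hypothetical, not taken from the real spec.
    """
    return {
        "experiment": experiment,
        "detector_id": detector_id,
        "timestamp_utc": timestamp_utc,
        "location": {
            "latitude_deg": latitude_deg,
            "longitude_deg": longitude_deg,
        },
    }


def dump_events(events):
    """Serialise a list of event records to a JSON string."""
    return json.dumps({"events": events}, indent=2, sort_keys=True)


def load_events(text):
    """Read the list of event records back from a JSON string."""
    return json.loads(text)["events"]


# Made-up example values, only here to show the round trip.
events = [
    make_event("CosmicPi", "example-001", "2015-06-05T12:00:00Z", 46.23, 6.05),
    make_event("CosmicPi", "example-001", "2015-06-05T12:00:03Z", 46.23, 6.05),
]

text = dump_events(events)
restored = load_events(text)
```

The appeal of plain dictionaries and JSON is that any experiment can read and write the same records with nothing more than a standard library, whatever language they use.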

Where now?

The next step: for now we’ve only had serious input from one cosmic ray experiment, us! We’ve had some discussion with other experiments, but more serious discussions are needed to ensure this is an open data format that people would use and find useful. If you’re interested in helping or discussing, either join us on Github or drop a line to the CosmicPi team. There is some discussion within IPPOG (International Particle Physics Outreach Group) and APPEC (Astroparticle Physics European Consortium) to push forwards on open data and open analysis frameworks, so watch this space for more developments.

A massive thanks to the folks that worked on this project over the couple of days: Frank Berghaus, Sophie Redford, Achintya Rao, Sahal Yaqoob, Anirudha Bose, Bruno Kinoshita, Patricia Herterich and Arne de Laat. Really this wouldn’t have gone as far or as fast without your contributions, and we hope you enjoyed it and would like to keep contributing in future.

Thanks also to Mozilla for organising the science sprint, and to IdeaSquare for hosting us.
