Data in Particle Physics and How It Leads to Analytics Engineer

January 19, 2024 (updated March 19, 2024)
Curiosity has always been a defining characteristic of humans. Over the years, people have discovered smaller and smaller particles that build the matter surrounding us. To answer the most intriguing question we now have, "How was the universe created?", we have to look back into the past, to a time when matter was very dense and hot.
This is where experiments like those at CERN come in (especially the ALICE experiment; the others investigate a wide range of new particle physics). Their aim is to reproduce conditions similar to those of the early universe, when everything was very hot. This demands accelerating particle beams to close to the speed of light and colliding them at very high energies. After the collision, a better understanding of the structure and properties of the created particles requires detecting them very precisely.
In general, we would like to have information about:
  • the particle’s path – for this purpose, we have different types of tracking detectors,
  • the particle’s energy – which we get from the calorimeters,
  • the particle’s type – which is not that easy to find out. There is no dedicated detector; identification is done by knowing how the particles interact, their mass, and sometimes their charge.
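To make the last point concrete, here is a toy sketch (my own illustration, not actual experiment code) of one common identification technique: with a time-of-flight measurement giving the particle's velocity β = v/c, and a tracker giving its momentum p, the mass follows from relativistic kinematics as m = p·√(1/β² − 1) in natural units (c = 1).

```python
import math

def mass_from_p_beta(p_gev, beta):
    """Reconstruct a particle's mass (GeV/c^2) from its momentum
    p (GeV/c) and its velocity beta = v/c, using m = p * sqrt(1/beta^2 - 1)."""
    return p_gev * math.sqrt(1.0 / beta**2 - 1.0)

# Example: a particle with p = 1.0 GeV/c whose measured velocity matches
# beta = p / E for a proton should reconstruct to the proton mass.
p = 1.0
m_proton = 0.938  # GeV/c^2
beta = p / math.sqrt(p**2 + m_proton**2)  # the beta a TOF detector would see
print(round(mass_from_p_beta(p, beta), 3))  # -> 0.938
```

Comparing the reconstructed mass against known particle masses (pion, kaon, proton, …) is what lets physicists tell otherwise similar-looking tracks apart.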
Some detectors have up to millions of independent channels placed very densely (some tracking detectors have pixel sizes of hundreds of μm). The readout electronics responsible for reading the signal from the detector has to face several challenges: not only the number and dense placement of channels, but also operation under high radiation. For these reasons, particle physics experiments need dedicated electronic circuits called Application-Specific Integrated Circuits (ASICs). I was responsible for designing them during the last few years of my PhD at AGH University of Krakow.
But where do data processing and analysis fit into all this? The answer is: everywhere!
It is worth mentioning that a single detector can produce up to petabytes of data per second. This is why real-time processing is commonly used to reduce the data stream to a few percent of its original volume. You may ask: why so much data anyway? Some particles are very rare, so to find this needle in a haystack, you need a lot of data. The first part of data management starts here, ensuring that only significant data is saved. Some of this processing is performed at the level of the readout electronics, inside the ASICs. In some detectors, scientists use machine-learning models, for example, to predict a particle's path and provide better track reconstruction.
Finally, hundreds of physicists analyze the collected data. Physicists at CERN have even created their own program libraries, such as ROOT (in C++) or, more recently, scikit-hep (in Python).
And last but not least: as mentioned above, a single detector may need thousands of ASICs. One has to ensure that they are all tested, verified, and well parameterized. This requires not only preparing the measurement setup for mass tests, but also processing and analyzing the obtained results. This was the second, and very important, part of my work at the university in recent years.
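The real-time data reduction described above can be sketched in miniature. This is a deliberately simplified toy (my own illustration, not real trigger code, and the event structure is invented): a "trigger" keeps only events whose total deposited energy exceeds a threshold, discarding the rest and shrinking the stream to a small fraction of its original size.

```python
import random

def trigger(events, threshold_gev):
    """Keep only events whose total deposited energy exceeds the threshold."""
    return [e for e in events if sum(e["energies"]) > threshold_gev]

# Simulate 10,000 events, each with five random energy deposits (GeV).
random.seed(42)
events = [{"energies": [random.expovariate(1.0) for _ in range(5)]}
          for _ in range(10_000)]

selected = trigger(events, threshold_gev=10.0)
print(f"kept {len(selected)} of {len(events)} events "
      f"({100 * len(selected) / len(events):.1f}%)")
```

Real triggers are far more sophisticated (multi-level, partly implemented in hardware and firmware), but the principle is the same: decide quickly which collisions are interesting enough to store.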
Why Nimbus Intelligence Academy?
My previous university experience increased my interest in data management and in how it can be done better. At this point, I had the opportunity to join the Nimbus Intelligence Academy. The Information Lab Netherlands and The Information Lab Italy founded the academy to train the next generation of Analytics Engineers, who will become experts in designing and establishing databases. That’s why I immediately decided to move to Milan and start my adventure with Nimbus Intelligence. Let’s see what this brings!

