
What is this tool?

Let's be frank: ROOT sucks, and the ROOT file format is horrible. It's among humanity's worst pieces of software. With this small tool I hope to undo at least a little of the damage by converting ROOT files into native Python formats.

It uses NumPy and a library called Uproot to read and process these damn ROOT files. So far it is specialized for a single task, extracting PXD data from Belle II data files, and I will have to work on it to make it viable for more use cases.

How to use this?

This is a single class that needs to be instantiated; its constructor doesn't take any arguments. Just import it like this:

from rootable import Rootable

Then you can create an instance:

loadFromRoot = Rootable()

and load the root file and all the data:

loadFromRoot.loadData('/root-files/slow_pions_2.root')
loadFromRoot.getClusters()
loadFromRoot.getCoordinates()
loadFromRoot.getLayers()
loadFromRoot.getMatrices()
loadFromRoot.getMCData()

You can specify how many events should be loaded from the ROOT file. Keep in mind that this is different from 'entries': each event consists of an irregular number of entries. You can also choose whether events should be selected randomly:

loadFromRoot.loadData('/root-files/slow_pions_2.root', events = 50)
loadFromRoot.loadData('/root-files/slow_pions_2.root', events = 50, selection = 'random')
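To illustrate the difference between events and entries, here is a minimal sketch in plain NumPy. It does not show how Rootable implements this internally; the helper name `select_events`, its parameters and the data are assumptions made up for illustration.

```python
import numpy as np

def select_events(event_numbers, events, selection="first", seed=None):
    """Return a boolean mask over entries that keeps `events` whole events.

    Hypothetical helper, not part of Rootable.
    """
    unique_events = np.unique(event_numbers)
    if selection == "random":
        rng = np.random.default_rng(seed)
        chosen = rng.choice(unique_events, size=events, replace=False)
    else:
        chosen = unique_events[:events]
    # An entry is kept if its event number is among the chosen events.
    return np.isin(event_numbers, chosen)

# Six entries spread unevenly over three events (made-up data).
event_numbers = np.array([0, 0, 0, 1, 2, 2])
mask = select_events(event_numbers, events=2)
print(event_numbers[mask])  # entries belonging to events 0 and 1
```

Asking for two events here yields four entries, which is why the two counts must not be confused.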

The 'get' commands don't have a return value; they work in place. All data is then stored inside the object as a dict:

loadFromRoot.data

Here follows a list of keywords contained in the dict:

  • cluster data:
    • 'eventNumber'
    • 'clsCharge'
    • 'seedCharge'
    • 'clsSize'
    • 'uSize'
    • 'vSize'
    • 'uPosition'
    • 'vPosition'
    • 'sensorID'
  • coordinates:
    • 'xPosition'
    • 'yPosition'
    • 'zPosition'
  • layers:
    • 'layer'
    • 'ladder'
  • matrices:
    • 'cluster'
  • Monte Carlo data:
    • 'momentumX'
    • 'momentumY'
    • 'momentumZ'
    • 'pdg'
    • 'clsNumber'

Since the class is subscriptable one can access every element directly using the keywords like this:

loadFromRoot['eventNumber']

or

loadFromRoot[0]

will return either the array containing the event numbers or the first entry of every array contained in the class's dict.
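The two kinds of access can be sketched with a small stand-in class. This is not Rootable's actual code: the class name `Table` is hypothetical, and returning a dict for integer access is an assumption.

```python
import numpy as np

class Table:
    """Hypothetical stand-in for a dict-backed, subscriptable container."""

    def __init__(self, data):
        self.data = data  # dict mapping keyword -> NumPy array

    def __getitem__(self, key):
        if isinstance(key, str):
            # Keyword access: return the whole column.
            return self.data[key]
        # Integer access: return entry `key` of every column
        # (assumed here to come back as a dict).
        return {name: column[key] for name, column in self.data.items()}

# Made-up data with two of the keywords listed above.
table = Table({
    "eventNumber": np.array([0, 0, 1]),
    "clsSize": np.array([1, 3, 2]),
})
column = table["eventNumber"]  # all event numbers
first_row = table[0]           # first entry of every column
```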

It is possible to filter through the data:

loadFromRoot.where('clsSize == 1')
loadFromRoot.where('clsSize > 1')

or even:

loadFromRoot.where('eventNumber in [0,1,2]')
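In plain NumPy, filters like these correspond to boolean masks applied to every column. The sketch below shows only that equivalent masking, not how `where` parses its string argument; the helper `apply_mask` and the data are hypothetical.

```python
import numpy as np

# Made-up data with two of the keywords listed above.
data = {
    "eventNumber": np.array([0, 1, 2, 3, 3]),
    "clsSize": np.array([1, 2, 1, 4, 1]),
}

def apply_mask(data, mask):
    """Keep the same entries in every column (hypothetical helper)."""
    return {name: column[mask] for name, column in data.items()}

# Counterpart of 'clsSize == 1'.
single_pixel = apply_mask(data, data["clsSize"] == 1)

# Counterpart of a membership test on the event number.
first_events = apply_mask(data, np.isin(data["eventNumber"], [0, 1, 2]))
```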

And finally you can convert the dict into a structured NumPy array by simply writing:

loadFromRoot.getStructuredArray()

This last command returns a NumPy array. From there you can save it using NumPy's built-in functions, convert it to Pandas or use it in any other way that is compatible with NumPy.
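As a rough sketch of what saving could look like, the example below builds a small structured array by hand and round-trips it through NumPy's `.npy` format. The array is a made-up stand-in; the fields and dtypes that getStructuredArray() really produces may differ.

```python
import io
import numpy as np

# Made-up stand-in for the array returned by getStructuredArray().
array = np.zeros(3, dtype=[("eventNumber", "i8"), ("clsCharge", "f8")])
array["eventNumber"] = [0, 0, 1]
array["clsCharge"] = [12.5, 40.0, 7.25]

# Round trip through NumPy's binary .npy format (in memory here;
# pass a file name instead of the buffer to write to disk).
buffer = io.BytesIO()
np.save(buffer, array)
buffer.seek(0)
restored = np.load(buffer)
```

For one-dimensional fields like these, `pandas.DataFrame(array)` accepts a structured array directly and uses the field names as column names.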

The class itself is iterable, but it behaves a bit differently from a typical Python dict: iterating goes over rows and returns each row as a dict. I'm not sure whether that's actually useful.
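Row-wise iteration of this kind can be sketched as follows. The class `RowIterable` is a hypothetical stand-in, not the Rootable implementation, and it assumes all columns have the same length.

```python
import numpy as np

class RowIterable:
    """Hypothetical container that yields one dict per entry."""

    def __init__(self, data):
        self.data = data  # dict mapping keyword -> NumPy array

    def __len__(self):
        # All columns are assumed to have the same length.
        return len(next(iter(self.data.values())))

    def __iter__(self):
        for index in range(len(self)):
            yield {name: column[index] for name, column in self.data.items()}

# Made-up data with the two layer keywords listed above.
rows = list(RowIterable({
    "layer": np.array([1, 2]),
    "ladder": np.array([4, 7]),
}))
```

Iterating over a plain dict would yield its keys instead, which is the difference mentioned above.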

In certain cases it can be very useful to stack columns together, for example when one wants to calculate the distance from the origin. Then one can stack the positions:

loadFromRoot.stack('xPosition', 'yPosition', 'zPosition', toKey = 'position')
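The distance calculation mentioned above might look like this in plain NumPy. The sketch works on a made-up dict rather than a Rootable instance, and it assumes the stacked column has one row per entry; whether `stack` keeps or removes the original columns is not shown here.

```python
import numpy as np

# Made-up coordinates for two entries.
data = {
    "xPosition": np.array([3.0, 0.0]),
    "yPosition": np.array([4.0, 0.0]),
    "zPosition": np.array([0.0, 2.0]),
}

# Stack the three columns into one (n_entries, 3) array.
data["position"] = np.column_stack(
    [data["xPosition"], data["yPosition"], data["zPosition"]]
)

# Euclidean distance of every entry from the origin.
distance = np.linalg.norm(data["position"], axis=1)
print(distance)  # [5. 2.]
```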

Installation

You will need the wheel and setuptools packages for Python in order to install this. Download the repo, navigate to its folder in the terminal and run the following command:

python3 setup.py sdist

and then:

pip3 install .