The DZERO Luminosity Calculation

 
Some details of the luminosity calculation at DZERO.

Invariants

These are things I'm pretty sure are true all the time and are at the core of our ability to easily calculate luminosity.

  • All events are stamped with a Luminosity Block Number (LBN).
  • The LBN is strictly time ordered. If an event is in LBN #4 then it came after any event in LBN #3, and before any event in LBN #5.
  • Events from a single LBN all appear in a single file. A single file may contain events from more than one LBN, but an LBN's events are never split across files.
    • There is a caveat for a special class of events, called late events. These are very rare (fewer than one a month) and are not handled correctly.
  • An LBN does not span a run transition.
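
These invariants are easy to check mechanically. Below is a minimal sketch in Python, assuming a hypothetical list of (file name, run number, LBN) records, one per event. The record layout and the function name `check_invariants` are illustrative only and are not DZERO tools.

```python
from collections import defaultdict

def check_invariants(events):
    """Return a list of violations found in (file_name, run, lbn) event records.

    The record layout is hypothetical; it only illustrates the invariants.
    """
    files_by_lbn = defaultdict(set)
    runs_by_lbn = defaultdict(set)
    for file_name, run, lbn in events:
        files_by_lbn[lbn].add(file_name)
        runs_by_lbn[lbn].add(run)

    problems = []
    for lbn in sorted(files_by_lbn):
        # Invariant: all events of one LBN live in a single file.
        if len(files_by_lbn[lbn]) > 1:
            problems.append(f"LBN {lbn} is split across files {sorted(files_by_lbn[lbn])}")
        # Invariant: an LBN never spans a run transition.
        if len(runs_by_lbn[lbn]) > 1:
            problems.append(f"LBN {lbn} spans runs {sorted(runs_by_lbn[lbn])}")
    return problems
```

A file holding several LBNs passes the check; an LBN that shows up in two files (or two runs) is reported.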

How the Luminosity Is Calculated

I'm skipping the physics description of how it is done (I can get references for those interested) and sticking to the procedural details.

Luminosity is calculated on an LBN basis. We have a large database that contains the integrated luminosity for each LBN and for each trigger. We also express data quality (DQ) in terms of LBNs, which means our analyses reject events for DQ by LBN. When we calculate the luminosity we also end up with a good LBN list that includes data quality. Using that list keeps us from accidentally including bad events, and also good events for which we haven't calculated the luminosity.

A flat 6% uncertainty is assigned to the luminosity.
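
As a rough illustration of the bookkeeping, here is a minimal Python sketch that sums per-LBN luminosity over a good LBN list and attaches the flat 6%. The dictionary layout, the units, and the name `integrated_luminosity` are assumptions made for the example. It also assumes the 6% applies to the total, and it is not the real database interface.

```python
def integrated_luminosity(lumi_by_lbn, good_lbns, relative_error=0.06):
    """Sum per-LBN luminosity over the good LBNs; return (total, uncertainty).

    lumi_by_lbn: hypothetical dict {lbn: integrated luminosity for one trigger}
    good_lbns:   iterable of LBNs that passed data quality
    """
    good_lbns = list(good_lbns)
    missing = [lbn for lbn in good_lbns if lbn not in lumi_by_lbn]
    if missing:
        # Never silently count an LBN whose luminosity was not calculated.
        raise KeyError(f"no luminosity recorded for LBNs {missing}")
    total = sum(lumi_by_lbn[lbn] for lbn in good_lbns)
    # Flat relative uncertainty applied to the total (assumption).
    return total, relative_error * total
```

For example, with per-LBN values of 10, 20 and 30 (in whatever unit the database uses) and a good list containing the first and last LBN, the total is 40 with an uncertainty of 2.4.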

  1. Each beam crossing is stamped with a time and an LBN. If the event passes all triggers this information is present in the final data stream.
  2. The Level 1 Trigger manager has scalers for every single L1 trigger and an overall scaler counting the beam crossings. From these two counts it is possible to calculate the live time on a trigger-by-trigger basis. This factor thus includes the prescale factor.
  3. At each LBN transition the trigger system is stopped briefly, events are allowed to empty out of L1, and a snapshot of all scalers is taken. This is read out by a DAQ separate from the main DAQ system (the lumi DAQ!). The stop lasts microseconds. The LBN is incremented about once every 1-2 minutes (or whenever a problem is found in the DAQ system). This part of the lumi DAQ is ironclad: if it fails, the whole experiment's DAQ system stops as well (this hasn't happened in several years now).
  4. Effort is made to track event loss throughout the system (Level 2 node or Level 3 farm node crashes, for example). If the event loss is too large, the LBN is marked questionable or bad (actually, it is just marked; quality cuts have been studied and applied, but I'm not sure exactly what they are off the top of my head).
  5. The DAQ system writes events to files, and each file contains an integral number of LBNs.
  6. The DAQ system inserts the RAW data file into the tape robot and SAM. The metadata put into SAM for the file includes a list of the LBNs contained in the file.
  7. The DZERO luminosity group calculates the luminosity for each LBN. This is recorded in the database along with the live time for each trigger and whether the LBN passed the quality cuts.
  8. The calorimeter group, the muon group, etc., all inspect the data to determine when their detector was bad. These are entered into a separate database (or sometimes @*&#&@ flat file).
  9. SAM is used to track all files as they move through the reconstruction process; provenance is key here.
  10. An analyzer decides to run on a certain period of data. This is often dictated by the trigger list version. This gives them a range of run numbers, which they then use to extract a rather lengthy list of AOD files (we call them thumbnails).
  11. The analyzer then determines the triggers they wish to analyze and the data quality requirements for their analysis (e.g. if they don't care about muons, they don't ask for LBNs where the muon detector is good).
  12. Scripts read the lumi database, the trigger list, the data quality requirements, and the list of files extracted from SAM. From this input they produce a luminosity number and a list of good luminosity blocks. Currently this calculation can take a long time (several hours), so the results are cached.
  13. The analyzer can then run on their data files and check that every event is a member of one of these good LBNs.
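
Steps 2 and 10 through 13 can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the per-trigger scaler is assumed to count the crossings in which that trigger was live, and every name and data layout here (`live_fraction`, `select_good_lbns`, `keep_event`, the dictionaries) is hypothetical rather than taken from the actual DZERO scripts.

```python
def live_fraction(trigger_count, crossing_count):
    """Step 2: live fraction for one trigger from two scaler readings.

    Assumes trigger_count counts crossings in which the trigger was live,
    so the prescale is already folded in.
    """
    if crossing_count <= 0:
        raise ValueError("crossing scaler must be positive")
    return trigger_count / crossing_count

def select_good_lbns(file_lbns, lumi_db, trigger, bad_lbns, required_detectors):
    """Step 12: return (luminosity, good LBN set) for one trigger.

    file_lbns: {file name: LBNs listed in the file metadata}
    lumi_db:   {(lbn, trigger): (luminosity, passed_quality_cuts)}
    bad_lbns:  {detector name: LBNs in which that detector was bad}
    """
    candidates = set()
    for lbns in file_lbns.values():
        candidates.update(lbns)
    vetoed = set()
    for detector in required_detectors:
        vetoed.update(bad_lbns.get(detector, ()))

    good, total = set(), 0.0
    for lbn in sorted(candidates - vetoed):
        record = lumi_db.get((lbn, trigger))
        if record is None:
            continue  # luminosity never calculated: exclude the LBN
        lumi, passed = record
        if passed:
            good.add(lbn)
            total += lumi
    return total, good

def keep_event(event_lbn, good_lbns):
    """Step 13: an event is kept only if its LBN is on the good list."""
    return event_lbn in good_lbns
```

Because the luminosity and the good LBN list come out of the same loop, an event can only be kept if its LBN contributed to the quoted luminosity. Since the real calculation takes hours, the returned pair is the natural thing to cache.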

DZERO Comments

Our system is far from perfect!

  • We don't really understand how to properly calculate the luminosity when we have two prescaled triggers that aren't prescaled on the same bit. This is a statistics problem.
  • After the lumi database is generated very little is done by universal tools. This is unfortunate, as it means every analysis or analysis group must re-create the infrastructure to do what is essentially the same task. Now that we are in full swing, no one has taken the time to fix this up.

I've used to do the lumi calculation for all of the top group. You can see a "pretty" version of the output here for the data sets that were used in the set of top analyses that are just being made public now (click on links at bottom of page for more details on each data sample).

ATLAS Comments

  • The LBN is similar to a coarse time stamp. If events are stamped at Level 1, in a strictly time ordered way, one could use the time stamp as an LBN.
  • ATLAS cannot afford to write all events from a single time slice (LBN) to a single file. The fundamental design of the DAQ means that events from a single time period will be spread out over at least 4 files.

Things To Address at ATLAS

If the ATLAS luminosity calculation were to be done in a way similar to DZERO's, there are several things that we would need to address fairly quickly.

  • DAQ System Architecture
    • Current plans are to close a raw data file when it reaches a certain size. This means each file will cover a basically random segment of time, as events will not leave the HLT time ordered. This makes it very difficult to properly account in the luminosity calculation for lost files, and for files that crashed the reconstruction and didn't make it into the final dataset.
    • I have mentioned this to Hans-Peter Beck, from Bern, who is working on the output sub farm processors. He hadn't thought this through but agreed that it required some thinking. I've not talked to him since. I also had a five-minute conversation with Chris Bee, but we were doing too many other things at the time.
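
To make the lost-file problem concrete, here is a small Python sketch. It assumes, hypothetically, that each file's metadata lists the LBNs it contains (as SAM does at DZERO) and takes the conservative option of discarding every LBN that touches a lost file. The function name `usable_lbns` and the file layouts are invented for the example.

```python
def usable_lbns(file_lbns, lost_files):
    """Return the LBNs that have no events in any lost file.

    file_lbns:  {file name: LBNs with at least one event in that file}
    lost_files: names of files that were lost or crashed reconstruction
    """
    all_lbns, affected = set(), set()
    for name, lbns in file_lbns.items():
        all_lbns.update(lbns)
        if name in lost_files:
            # Any LBN touching a lost file is only partially available.
            affected.update(lbns)
    return all_lbns - affected

# DZERO-style layout: each LBN lives in exactly one file.
dzero_style = {"a": {1, 2}, "b": {3, 4}, "c": {5}}
# Size-closed files: events of one LBN are spread over several files.
size_closed = {"a": {1, 2, 3}, "b": {2, 3, 4}, "c": {3, 4, 5}}

print(sorted(usable_lbns(dzero_style, {"b"})))   # [1, 2, 5]
print(sorted(usable_lbns(size_closed, {"b"})))   # [1, 5]
```

In the first layout losing one file costs exactly the LBNs in that file. In the second, the same loss also invalidates LBNs whose remaining events sit in files that survived.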

Other Notes

In OPAL the luminosity was done by including luminosity events in the data stream. You could then count and extract information from these to determine the luminosity of your sample. The downside is that a highly selected subsample of data will contain many more luminosity events than actual data. Further, extra code and handling are needed for these events, and one must ensure they are always written out if read in.