Optimizing STAR Event Reconstruction

Dmitri Smirnov

Updated

Motivation

  • Speed up the existing event reconstruction
  • The road map
    • Investigate bottlenecks in the code \( \Longrightarrow \) Refactor individual routines
    • Consider changes in algorithms to eliminate unnecessary calculations
    • Benefit from "automatic" compiler optimization

Alternative Implementations of Individual Routines

  • In reconstruction jobs most of the time is spent in StiMaker::Make()
  • Matrix operation routines
  • Verified that all implementations give identical output for input based on real data
  • Tests with the Eigen implementation of matrix operations give up to 40% speed-up for these routines
    May translate to up to 10% additional speed-up in full reconstruction jobs
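
The consistency check mentioned above can be sketched as a small harness. This is only an illustration under stated assumptions: the routine is a generic 6x6 product F C F^T, not the actual `errPropag6()` or `joinTwo()` code, and the names `makeMatrix`, `tripleProductRef`, `tripleProductAlt`, and `identicalOutput` are hypothetical. The alternative version only restructures the loops while keeping the per-element accumulation order; an Eigen-based variant could be compared against the reference in the same way.

```cpp
#include <array>
#include <cassert>
#include <cstddef>

constexpr std::size_t kDim = 6;
using Mat6 = std::array<std::array<double, kDim>, kDim>;

// Deterministic stand-in for "input based on real data" (hypothetical helper).
Mat6 makeMatrix(unsigned seed) {
  Mat6 m{};
  unsigned state = seed;
  for (auto& row : m) {
    for (double& v : row) {
      state = state * 1664525u + 1013904223u;                 // simple LCG step
      v = static_cast<double>(state % 2001u) / 1000.0 - 1.0;  // value in [-1, 1]
    }
  }
  return m;
}

// Reference: R = F * C * F^T using a full temporary T = F * C.
Mat6 tripleProductRef(const Mat6& F, const Mat6& C) {
  Mat6 T{}, R{};
  for (std::size_t i = 0; i < kDim; ++i)
    for (std::size_t j = 0; j < kDim; ++j)
      for (std::size_t k = 0; k < kDim; ++k) T[i][j] += F[i][k] * C[k][j];
  for (std::size_t i = 0; i < kDim; ++i)
    for (std::size_t j = 0; j < kDim; ++j)
      for (std::size_t k = 0; k < kDim; ++k) R[i][j] += T[i][k] * F[j][k];
  return R;
}

// Alternative: same sums in the same order, but only one row of F * C
// is kept at a time instead of the full temporary matrix.
Mat6 tripleProductAlt(const Mat6& F, const Mat6& C) {
  Mat6 R{};
  for (std::size_t i = 0; i < kDim; ++i) {
    std::array<double, kDim> t{};  // row i of F * C
    for (std::size_t j = 0; j < kDim; ++j)
      for (std::size_t k = 0; k < kDim; ++k) t[j] += F[i][k] * C[k][j];
    for (std::size_t j = 0; j < kDim; ++j)
      for (std::size_t k = 0; k < kDim; ++k) R[i][j] += t[k] * F[j][k];
  }
  return R;
}

// Element-wise exact comparison; no tolerance is applied on purpose.
bool identicalOutput(const Mat6& a, const Mat6& b) {
  for (std::size_t i = 0; i < kDim; ++i)
    for (std::size_t j = 0; j < kDim; ++j)
      if (a[i][j] != b[i][j]) return false;
  return true;
}
```

Calling `identicalOutput(tripleProductRef(F, C), tripleProductAlt(F, C))` on the same inputs is the kind of check meant here. Comparing without a tolerance is deliberate: an implementation that reorders the arithmetic would typically fail it.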

Alternative Implementations: errPropag6()

Alternative Implementations: joinTwo()

Comparison of Optimization Options


    set1  = "-O2 -m32 -msse -mno-avx -ftree-vectorize"
    set2  = "-O2 -m32 -msse -mno-avx -fno-tree-vectorize"
    set3  = "-O2 -m32 -msse -mno-avx -ftree-vectorize -D EIGEN_DONT_VECTORIZE"
    set4  = "-O2 -m32 -msse -mno-avx -fno-tree-vectorize -D EIGEN_DONT_VECTORIZE"
    set5  = "-O3 -m32 -msse -mno-avx -ftree-vectorize"
    set6  = "-O3 -m32 -msse -mno-avx -fno-tree-vectorize"
    set7  = "-O3 -m32 -msse -mno-avx -ftree-vectorize -D EIGEN_DONT_VECTORIZE"
    set8  = "-O3 -m32 -msse -mno-avx -fno-tree-vectorize -D EIGEN_DONT_VECTORIZE"
    set9  = "-O2 -m64 -msse -mno-avx -ftree-vectorize"
    set10 = "-O2 -m64 -msse -mno-avx -fno-tree-vectorize"
    set11 = "-O2 -m64 -msse -mno-avx -ftree-vectorize -D EIGEN_DONT_VECTORIZE"
    set12 = "-O2 -m64 -msse -mno-avx -fno-tree-vectorize -D EIGEN_DONT_VECTORIZE"
    set13 = "-O3 -m64 -msse -mno-avx -ftree-vectorize"
    set14 = "-O3 -m64 -msse -mno-avx -fno-tree-vectorize"
    set15 = "-O3 -m64 -msse -mno-avx -ftree-vectorize -D EIGEN_DONT_VECTORIZE"
    set16 = "-O3 -m64 -msse -mno-avx -fno-tree-vectorize -D EIGEN_DONT_VECTORIZE"
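
The 16 sets above are the full combination of four binary choices: `-O2` or `-O3`, `-m32` or `-m64`, GCC auto-vectorization on or off, and Eigen's explicit vectorization on or off. A minimal sketch that regenerates the same strings in the same order is shown below; the function name `makeFlagSets` is hypothetical and not part of any build script.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Returns the flag strings in the order set1 ... set16 listed above.
std::vector<std::string> makeFlagSets() {
  const char* arch[] = {"-m32", "-m64"};
  const char* opt[] = {"-O2", "-O3"};
  const char* eigen[] = {"", " -D EIGEN_DONT_VECTORIZE"};
  const char* vec[] = {"-ftree-vectorize", "-fno-tree-vectorize"};

  std::vector<std::string> sets;
  for (const char* a : arch)          // slowest-varying choice
    for (const char* o : opt)
      for (const char* e : eigen)
        for (const char* v : vec)     // fastest-varying choice
          sets.push_back(std::string(o) + " " + a + " -msse -mno-avx " + v + e);
  return sets;
}
```

Element `i` of the returned vector corresponds to set `i + 1`, so the four-set groups differ only in architecture and optimization level.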

Changes in Sti Tracking Algorithm

  • Attempted to speed up Kalman smoother by skipping calculation of internal track states
  • Expected no change in reconstructed track parameters
  • Unfortunately the change in Kalman smoother affects reconstructed tracks
  • Giving up? ... Need to revisit this result and quantify the change more precisely
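
For orientation, the sketch below shows a textbook scalar Kalman filter followed by a Rauch-Tung-Striebel backward pass. It is not the Sti implementation: the random-walk model, the noise parameters `q` and `r`, and the names `Smoothed` and `filterAndSmooth` are assumptions made for illustration only. In this textbook form each smoothed state is computed from the smoothed state of its neighbor, so intermediate states enter the recursion; whether that is the mechanism behind the change observed in Sti is not established here.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Smoothed {
  std::vector<double> x;  // smoothed state estimates
  std::vector<double> P;  // smoothed variances
};

// Scalar random-walk model: x[k+1] = x[k] + w (variance q), z[k] = x[k] + v (variance r).
Smoothed filterAndSmooth(const std::vector<double>& z, double x0, double P0,
                         double q, double r) {
  const std::size_t n = z.size();
  if (n == 0) return {};

  std::vector<double> xp(n), Pp(n), xf(n), Pf(n);
  double x = x0, P = P0;
  for (std::size_t k = 0; k < n; ++k) {  // forward filter
    xp[k] = x;                           // predicted state
    Pp[k] = P + q;                       // predicted variance
    const double K = Pp[k] / (Pp[k] + r);  // Kalman gain
    xf[k] = xp[k] + K * (z[k] - xp[k]);
    Pf[k] = (1.0 - K) * Pp[k];
    x = xf[k];
    P = Pf[k];
  }

  Smoothed s{xf, Pf};  // last node: smoothed == filtered
  for (std::size_t k = n - 1; k-- > 0;) {  // backward pass, k = n-2 ... 0
    const double C = Pf[k] / Pp[k + 1];    // smoother gain
    s.x[k] = xf[k] + C * (s.x[k + 1] - xp[k + 1]);
    s.P[k] = Pf[k] + C * C * (s.P[k + 1] - Pp[k + 1]);
  }
  return s;
}
```

In this toy model, changing only the last measurement changes the smoothed estimate at the first node, because the correction is passed back through every node in between.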

Event Reconstruction with 64 bit Libraries

  • Base release is SL17h (... dNdx)
    • 64-bit and 32-bit libraries are available for tagged releases; DEV has only 32-bit
    • Not sure what options are used for the builds, but the 64-bit case may be unoptimized...
    • I use cmake to compile locally (most of the libraries used in "reco" are built locally)
  • Use regular root to run bfc.C macro
  • Currently remove btof option to avoid the need for geant libraries
  • StBFChain distinguishes root from root4star and sets "VMC" options internally for root. Got rid of them
  • Seems safe to remove unused options/libraries ctf, St_ctf, St_ctf_Make invoked by Sti, VFPPV, VFPPVEv, genvtx, and tables
    • globT and sim_T removed for this study but need to confirm later

Tests for Modified Libraries

Compare the following reconstruction chains

  • Test 1. 32-bit root4star, SL17h vs 32-bit root4star, ds-tof-tgeo
  • Test 2. 32-bit root4star, ds-tof-tgeo vs 32-bit root, ds-tof-tgeo, ds-root-reco
  • Test 3. 32-bit root4star, ds-tof-tgeo vs 64-bit root, ds-tof-tgeo, ds-root-reco
    • `ds-tof-tgeo` is the branch with modified code `StBTofUtil/StBTofGeometry`, `StiMaker`
    • Reconstruction with `root` assumes elimination of `St_geant_Maker` and `VMCMaker` through removal of special treatment of `root`. Changes on `ds-root-reco` branch modifying `StBFChain`, `bfc.C`
  • Test 4. 64-bit root, ds-tof-tgeo, ds-root-reco vs
    64-bit root, ds-tof-tgeo, ds-root-reco, ds-optim-eigen
    • `ds-optim-eigen` contains new implementations of `joinTwo()` and `errPropag6()`

Summary for Tests

  • Test 1. Identical results
  • Test 2. Identical results
  • Test 3.
    • A difference of 1–2 tracks is observed in many events (~0.5‰), in the total track count and/or after vertex cuts
    • Similar level of discrepancy (~10 out of 40–50k hits) is already seen in the hits
      The difference does not appear to be systematic
    • Not worth tracking down but may try a special compiler flag to guarantee the order of arithmetic operations
    • Confirmed gain in total BFC speed 10–15%
  • Test 4. Eigen library is used for matrix operations
    • Numerical difference is smaller than in Test 3 (1 track in ~5% of events)
    • Gain additional 5% in total BFC speed
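
The remark under Test 3 about the order of arithmetic operations refers to the fact that floating-point addition is not associative, so two builds that evaluate the same expression in a different order can produce slightly different numbers. The sketch below illustrates this with strict IEEE 754 double arithmetic; the function names are hypothetical and the values are chosen only to make the effect visible.

```cpp
#include <cassert>

// Same three terms, two evaluation orders.
double sumLeftToRight(double a, double b, double c) { return (a + b) + c; }
double sumRightToLeft(double a, double b, double c) { return a + (b + c); }

// With a = 1e16, b = -1e16, c = 1.0 and IEEE 754 doubles:
//   (a + b) + c : a + b is exactly 0, so the sum is 1.0
//   a + (b + c) : b + c rounds back to -1e16 (spacing of doubles near 1e16 is 2),
//                 so the sum is 0.0
```

Differences of this kind are tiny per operation, but they can flip a decision near a cut, which would be consistent with the small non-systematic discrepancies in hit and track counts noted above.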

Preliminary Results

  • Based on 100 events
  • Results are for local builds unless noted
    cmake -D CMAKE_CXX_FLAGS="-O2 -msse -mno-avx -fno-tree-vectorize"
  • Notes:
    [1] pp 510GeV run 2017
    [2] auau 200GeV run 2016
    [3] pp 200GeV run 2015

Preliminary Results: Sti

Data                                 Notes  Build  bfc, cpu     StiMaker, %
st_physics_18060106_raw_0000008.daq  [1]    -m32   1366         57.7
                                            -m64   1208 (-11%)  57.5
st_physics_17072001_raw_3500002.daq  [2]    -m32   2575         43.9
                                            -m64   2284 (-11%)  42.0
st_physics_16067017_raw_1000020.daq  [3]    -m32   1258         34.1
                                            -m64   1130 (-10%)  32.0
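
The percentages in parentheses are read here as the relative change in bfc CPU time of the `-m64` build with respect to the `-m32` build. A minimal sketch of that arithmetic, with the hypothetical helper name `relativeChangePercent`:

```cpp
#include <cassert>
#include <cmath>

// Relative change of `after` with respect to `before`, in percent.
// Negative values mean that `after` is smaller (faster) than `before`.
double relativeChangePercent(double before, double after) {
  return 100.0 * (after - before) / before;
}
```

For data set [2], `relativeChangePercent(2575, 2284)` gives about -11.3, and for data set [3], `relativeChangePercent(1258, 1130)` gives about -10.2.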

Preliminary Results: StiCA

Data                                 Notes  Build  bfc, cpu     StiMaker, %
st_physics_18060106_raw_0000008.daq  [1]    -m32   1606         63.7
                                            -m64   1430 (-11%)  45.7
st_physics_17072001_raw_3500002.daq  [2]    -m32   2932         43.9
                                            -m64   2505 (-15%)  45.9

Preliminary Results: Discussion

  • Consistent gain in speed (32 vs 64 bit) of at least 10% for various data sets
  • StiCA is slower than Sti, possibly because auto-vectorization is turned off

Summary

Back-Up Slides

Comparison of Optimization Options

  • Consider different RHIC Runs: 2017, 2016, and 2015

File                                 Notes  StiMaker, % (total mins)  joinTwo(), %  errPropag6(), %
st_physics_18060106_raw_0000008.daq  Sti    57 (83)                   28            9
                                     StiCA  63 (97)                   23            7
st_physics_17072001_raw_3500002.daq  Sti    44 (155)                  22            7
                                     StiCA  46 (177)                  19            6.4
st_physics_16067017_raw_1000020.daq  Sti    34 (67)                   20.4          7.1

64-bit Libraries for Event Reconstruction

  • Need to understand library dependencies and compile 64-bit libraries for full event reconstruction
    • Test with real data reco jobs using already defined set of options
  • Already established that StiMaker depends on St_geant_Maker via BTOF geometry
    • St_geant_Maker calls TGiant3 in root4star
    • TGiant3 is an adapter for Geant3 library. Some Fortran components are not ready for 64-bit...
  • Removing dependence on Geant geometry will warrant uniform use of TGeoManager
  • Effective calls to St_geant_Maker made via code:
    TVolume* starHall = static_cast<TVolume*>( GetDataSet("HALL") );

    Need to refactor by using TGeoManager instead
  • An internal BTOF geometry (tray/module/sensor structures) is built in StBTofGeometry and is used in track to cell matching

Understanding Geant Role in Event Reconstruction

32-bit case:

  • Geant not required for event reconstruction
  • Removed "btof" option but still see St_geant_Maker
  • Using root instead of root4star completely removes St_geant_Maker
  • But some side effects observed
  • StMtdHitMaker:INFO  - Process data from year -1
    • geant3/pythia/... libraries still loaded (perhaps not used)
    • StVMCMaker is invoked and it seems the data is treated as simulation
    • Yes, confirmed and the fix is to remove "VMC" options
  • Reconstruction works as expected with root

64-bit case: There is no 64-bit root4star, 64-bit libraries are not built daily

Proposed Changes for Reco Chain

  • Remove "VMC" options internally set in StBFChain.cxx. This also removes geant3/pythia/... libraries
  • Remove unused options/libraries such as ctf, St_ctf, St_ctf_Make invoked by genvtx
  • Most of the libraries used in 'reco' chain can be built with cmake

--- a/StBFChain/StBFChain.cxx
+++ b/StBFChain/StBFChain.cxx
@@ -1475,14 +1475,6 @@ void StBFChain::SetFlags(const Char_t *Chain)
       SetOption("-geantL","Default,-TGiant3");
       SetOption("-geometry","Default,-TGiant3");
       SetOption("-geomNoField","Default,-TGiant3");
-      if (! (GetOption("Stv"))) {
-       if (! (GetOption("VMC") || GetOption("VMCPassive"))) {
-         SetOption("VMCPassive","Default,-TGiant3");
-       }
-       SetOption("pgf77","Default,-TGiant3");
-       SetOption("mysql","Default,-TGiant3");
-       SetOption("StarMiniCern","Default,-TGiant3");
-      }
     }
     if (GetOption("ITTF") && ! (GetOption("Sti") || GetOption("StiCA")  || GetOption("Stv") || GetOption("StiVMC"))) {
       TString STAR_LEVEL(gSystem->Getenv("STAR_LEVEL"));