Exploration of Optimization Techniques
Dmitri Smirnov
Updated
Motivation
- Reconstruction of particle tracks and vertices is intrinsically an
optimization problem searching for best solution in a multidimensional
parameter space
- In the field of particle physics we are constantly searching for
fast and efficient reconstruction techniques to cope with
growing number of particles produced in high energy collisions
Requirements
- Applicable to the following class of problems
- A set of measurements coming from a number of underlying processes
that can be (approximately) described by (roughly) known models
- Minimal tuning of initial parameters, i.e. defaults should satisfy
many problems
- Perform for a wide range of the number of input measurements (orders of magnitude)
- Auto seeding. Components should be tried (identified) during the fit
automatically controlled by a minimal set of parameters
- Robustness. Converge to global optimum from different initial conditions
EM Algorithm
- After reviewing the literature we chose to explore the EM algorithm
(and its variations) due to the following:
- Simple implementation
- Guaranteed to converge
- Extensible and adoptable, with
- References:
- "The Expectation-Maximization (EM) algorithm (Dempster, Laird, and
Rubin, 1977) is an iterative statistical technique for computing maximum
likelihood parameter estimates from incomplete data", (DA Variant of the EM
Algorithm, Ueda, Nakano, 1995)
- wiki article
Software Stack
- For prototyping and quick tests of new ideas we chose to work within
Python ecosystem due to its reach codebase for data analysis
Anticipated dependencies
- numpy for fast vector and matrix calculations
- pandas for easy data manipulation
- plotly for visualization
Release Schedule
- 0.0.x
- Initial implementation of EM
- Layout controls and visualization
- Sporadic updates while optimizing software stack
- Tests with toy data samples
- 0.1.0
- Automatic seed creation
- Support multidimensional data $(N>2)$ with arbitrary 2D/(1D?) projections
- Read user data from files
- Tests with realistic data
Release Schedule
- 0.1.x
- ...
- Automatic seed merging?
- 1.x.x