As defined in our clustering primitives, Clusters belong to one plane only. Events are clusters
matched across two planes. AbstractMatcher defines the interface for cluster matching strategies.
It also provides some common parts of the implementation for various matching strategies.
Since hits and clusters for different planes may arrive from different readout units
(expectation for Gadolinium GEM at the very least), it is important to ensure that all relevant
data is considered before the total event is reassembled. This class helps ensure this
by enforcing the maximum latency requirement across the planes. The expectation for
the client pipeline is that the clusters it supplies are chronologically consecutive for each plane.
If something like GapClusterer was used to produce the clusters, then this is already taken care of
by meeting the assumptions for that class.
The ready_to_be_matched method provides the necessary logic for various matcher implementations.
Recall that each cluster has a plane identity, which tells us that all hits in the cluster belong
to that particular plane. In order to not discard relevant data, clusters may only be matched and
released if we are sure that no more data will arrive that is potentially coincident with some
cluster being considered. And since we are considering multiple planes from which data may arrive
at different times, we must guarantee that each cluster considered for matching satisfies the
maximum latency criterion for all relevant planes in question, not just the plane that the
cluster belongs to.
Let us compare the time endpoint of the cluster with the latest time point encountered so far on that plane. We say that the cluster is safe to process if this difference is greater than the maximum latency guaranteed by the readout system. We consider the time endpoint rather than the start point because the difference with the end point is the smaller of the two, which makes this the more conservative criterion. We must then check the same condition against the other plane, the one this cluster does not belong to.
Consider a failed comparison for one plane. If the difference between cluster end-time and the latest time-point observed on a plane is less than the maximum latency, then it means we have not seen events far enough in time on that particular plane to be sure that there is nothing in coincidence with our cluster, and therefore we cannot release it.
The way ready_to_be_matched approaches this question is by selecting the earliest (least advanced
in time) of the latest time points observed on the relevant planes, and then using it as the
comparison point for the latency check.
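The check described above can be sketched as follows. This is a minimal illustration in Python, not the actual implementation: the `Cluster` fields and the names `latest_plane1` and `latest_plane2` are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical stand-in for the real Cluster class.
    plane: int
    time_start: int
    time_end: int


def ready_to_be_matched(cluster, latest_plane1, latest_plane2, maximum_latency):
    # The least advanced of the two plane horizons is the comparison point.
    horizon = min(latest_plane1, latest_plane2)
    # Safe only if the horizon lies more than maximum_latency past the cluster end.
    return horizon - cluster.time_end > maximum_latency
```

Because the less advanced horizon is used, a plane that has fallen behind holds back clusters from both planes until its data catches up.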
Recall that the Event class can merge any Clusters so long as they belong to one of the relevant
planes. This simplifies matcher implementation, particularly when at this stage we are mostly
interested in merging all clusters that are in some way time-coincident, for the moment ignoring
the complexities of disambiguating multiple time-coincident clusters.
In a similar way, feeding clusters to a Matcher can be simplified if the matcher is already aware of the relevant planes. Unmatched clusters are kept in a single chronological queue, and when considering each cluster for matching, we only care about the maximum latency condition versus the latest time point in each plane. When receiving a new cluster (or container of clusters), the matcher can identify each cluster's plane and update the appropriate "time horizon".
For this reason, it is reasonable that a Matcher must be initialized with the following
parameters already known:
- maximum latency of readout system
- plane ID for the first relevant plane
- plane ID for the second relevant plane
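As a rough sketch of how these three parameters might be held together with the per-plane time horizons, consider the following. The class name `MatcherSketch`, its attributes, and the choice to skip clusters from other planes are assumptions of this example, not the real interface.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical stand-in for the real Cluster class.
    plane: int
    time_start: int
    time_end: int


class MatcherSketch:
    def __init__(self, maximum_latency, plane1, plane2):
        self.maximum_latency = maximum_latency
        self.plane1 = plane1
        self.plane2 = plane2
        # Latest time point seen so far on each relevant plane.
        self.latest = {plane1: 0, plane2: 0}
        # Single queue of clusters awaiting matching.
        self.unmatched = deque()

    def insert(self, cluster):
        # Assumption of this sketch: clusters from other planes are skipped.
        if cluster.plane not in self.latest:
            return
        # Advance the time horizon of the plane this cluster belongs to.
        self.latest[cluster.plane] = max(self.latest[cluster.plane],
                                         cluster.time_start)
        self.unmatched.append(cluster)
```

With the plane IDs known up front, each inserted cluster only needs a lookup of its own plane to update the right horizon.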
Since clustering on a single plane is expected to have happened in one of the Clusterer
implementations, it is also very likely that clusters from a single plane will arrive in bulk
and already chronologically sorted.
For this reason, a convenience (and performance-advantageous) function is provided which
allows for this assumption. If you indicate which plane a container of clusters belongs to, only
one time comparison will be needed to establish the time horizon and the entire container
can be spliced into the queue.
Otherwise, clusters can be inserted individually, or in bulk with no assumption about their planes or origin, allowing the matcher to evaluate them individually.
When establishing the time horizon for each plane, the time_start of the cluster is used,
i.e. the earliest point of the cluster, again to be on the conservative side for the
latency comparison.
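The bulk insertion for a known plane could look roughly like this. The function name `insert_sorted` and its arguments are hypothetical, and the sketch assumes the container is sorted by start time, so that only the last cluster needs to be compared against the horizon.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical stand-in for the real Cluster class.
    plane: int
    time_start: int
    time_end: int


def insert_sorted(unmatched, latest, plane, clusters):
    """Append a chronologically sorted, single-plane batch in one step."""
    if not clusters:
        return
    # One comparison: the last cluster has the most advanced time_start.
    latest[plane] = max(latest[plane], clusters[-1].time_start)
    # The whole batch joins the queue without per-cluster inspection.
    unmatched.extend(clusters)
```

The horizon is taken from time_start rather than time_end, which keeps it as far back as possible and therefore delays release rather than risking an early one.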
The pathological case of "endless clusters" described in the Clusterer documentation could have
unintended consequences for the matcher implementations. The matching pipeline might be blocked
from proceeding until the relevant clusterer releases its data.
(TODO: this section is to be moved elsewhere) The presence of pulse-time complicates things, and the need to keep track of it in the matcher makes implementations more complex than they would otherwise have to be. It is highly likely that this class is not the right place to keep track of the pulse time. Particle events and pulse events have to at some point be queued up on the same track, because of the expectations in the ev42 buffer definition. All particle events have to come after and in relation to the most recent pulse event. If there is ever an external mechanism that can ensure this (an additional queue), then it would be very welcome to remove this complexity from the Matcher class.
OverlapMatcher is the simplest and most strict of the matcher implementations. It requires that
clusters overlap in time for them to be merged into a single Event. There are no threshold-like
criteria. Only the match function needs to be implemented.
The first thing that happens in the implementation is that the unmatched clusters are sorted in time. This is because multiple batches of clusters may have been added from different planes prior to this matching step, only noting the changing time horizons for each plane. To begin releasing the clusters, we need to sort them only once prior to performing the actual matching.
The matching loop will continue so long as there are unmatched clusters that satisfy the
ready_to_be_matched criterion evaluated by the parent class method.
Any event is considered ready to release, even if it only has a cluster in one dimension. It must
not be empty, and it must not overlap (as defined in the Event class) with a subsequent cluster.
Whatever has so far been accumulated into a candidate event will be released onto the "out" queue.
Whether the candidate event has anything or not, the current cluster satisfying the
ready_to_be_matched criterion will be merged into the candidate event. By the above logic,
if there was an actual overlap, the candidate will not have been released. So, true merging
of multiple clusters only happens here in case of time overlap.
At the end of the loop, it may be that there was something accumulated in the candidate event, but it has not met the criteria for release. In that case it is disassembled and its constituent clusters are put back on the front of the queue of unmatched clusters.
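The loop described above can be sketched as follows. This is an illustrative reconstruction from the prose, not the actual code: the `Event` methods, the overlap definition (closed time intervals), and the function name `match_overlapping` are assumptions of this example.

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    # Hypothetical stand-in for the real Cluster class.
    plane: int
    time_start: int
    time_end: int


class Event:
    # Hypothetical stand-in: simply collects clusters.
    def __init__(self):
        self.clusters = []

    def empty(self):
        return not self.clusters

    def merge(self, cluster):
        self.clusters.append(cluster)

    def overlaps(self, cluster):
        # Assumed definition: closed time intervals intersect.
        start = min(c.time_start for c in self.clusters)
        end = max(c.time_end for c in self.clusters)
        return cluster.time_start <= end and start <= cluster.time_end


def match_overlapping(unmatched, latest, maximum_latency):
    # Sort once, since batches from different planes may be interleaved.
    unmatched.sort(key=lambda c: c.time_start)
    horizon = min(latest.values())
    out, candidate = [], Event()
    # Continue while the front cluster satisfies the latency criterion.
    while unmatched and horizon - unmatched[0].time_end > maximum_latency:
        cluster = unmatched.pop(0)
        # Release a non-empty candidate that the next cluster does not overlap.
        if not candidate.empty() and not candidate.overlaps(cluster):
            out.append(candidate)
            candidate = Event()
        candidate.merge(cluster)
    # The leftover candidate has not met the release criteria: requeue it.
    if not candidate.empty():
        unmatched[:0] = candidate.clusters
    return out
```

Note that the last accumulated candidate is always requeued here, because no subsequent cluster has yet confirmed that it is complete.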
GapMatcher merges clusters without requiring a strict time overlap. So long as two clusters are within a certain small time-gap of each other, they are considered to be merge-worthy.
A minimum_time_gap parameter is provided that defines the threshold for cluster adjacency.
The implementation is almost a carbon copy of the one for OverlapMatcher, except that the
condition for releasing the candidate event is not the absence of time overlap but the event vs.
cluster time-gap being outside the requested threshold.
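A minimal sketch of the gap-based release condition, under the assumption that the gap is measured between the nearest edges of the candidate event and the cluster, and is zero when they overlap. The function names are hypothetical.

```python
def time_gap(event_start, event_end, cluster_start, cluster_end):
    # Distance between the nearest edges; zero if the intervals overlap.
    if cluster_start > event_end:
        return cluster_start - event_end
    if event_start > cluster_end:
        return event_start - cluster_end
    return 0


def should_release(event_start, event_end, cluster_start, cluster_end,
                   minimum_time_gap):
    # Release the candidate when the next cluster is too far away to merge.
    gap = time_gap(event_start, event_end, cluster_start, cluster_end)
    return gap > minimum_time_gap
```

With a threshold of zero this reduces to a pure overlap test, which is why the two matchers can share nearly the same loop.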
This implementation has a similar criterion to GapMatcher above, except that the time difference
is not the gap between clusters, but the difference between their end-times. This implementation
is a specific adaptation for the Gadolinium GEM pipeline, where cluster end-points are
of particular interest.
Implementation-wise it is a similar story to the above. Some convenience functions have been provided to encapsulate the necessary time-end comparisons.
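The end-time comparisons might be wrapped in helpers along these lines. Both function names and the threshold parameter `max_delta_time` are hypothetical, since the source does not name them.

```python
def delta_end(event_time_end, cluster_time_end):
    # Absolute difference between the two end-times.
    return abs(event_time_end - cluster_time_end)


def belongs_end(event_time_end, cluster_time_end, max_delta_time):
    # The cluster is merge-worthy if its end-time is close to the event's.
    return delta_end(event_time_end, cluster_time_end) <= max_delta_time
```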