Idea: Measure processed samples on MQE #10138

Open
tinitiuset opened this issue Dec 5, 2024 · 2 comments · May be fixed by #10232
Labels
enhancement New feature or request

Comments

@tinitiuset
Contributor

What is the problem you are trying to solve?

While working to measure throughput in Mimir, we've realized that MQE does not expose processed-sample data the way the Prometheus PromQL engine does.

Which solution do you envision (roughly)?

Make MQE count the samples loaded from storage and expose that count to mimir-stats, so it can be returned in a response header for each query.

Have you considered any alternatives?

I'm happy to hear any alternatives. The objective is to measure throughput in the best possible way.

Any additional context to share?

Related work on measuring throughput in Mimir has been done in #10103, #9985, #7966, and here.

How long do you think this would take to be developed?

Small (<= 1 month dev)

What are the documentation dependencies?

No response

Proposer?

No response

@jhesketh
Contributor

jhesketh commented Dec 5, 2024

The place to measure instant vector selectors would be here:

You should be able to use len(data.Floats) and len(data.Histograms) to count each point type as it is loaded.
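
As a minimal sketch (assuming a hypothetical per-query counter, here called `samplesProcessed`, threaded through to the selector; the names are illustrative, not existing MQE identifiers):

```go
// Hypothetical accounting inside the instant vector selector, run once per
// series after its data has been loaded from storage. `data` is the loaded
// series data with Floats and Histograms slices, as described above.
samplesLoaded := int64(len(data.Floats) + len(data.Histograms))

// Accumulate onto whatever per-query tracker ends up being chosen
// (see the discussion below about where this counter should live).
samplesProcessed += samplesLoaded
```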

Passing it back through the stack is slightly more complex. My suggestion, though, would be to store the counts on the limiting.MemoryConsumptionTracker. They can then be retrieved in pkg/streamingpromql/query.go and emitted as a metric, similar to:

q.engine.estimatedPeakMemoryConsumption.Observe(float64(q.memoryConsumptionTracker.PeakEstimatedMemoryConsumptionBytes))

estimatedPeakMemoryConsumption: promauto.With(opts.CommonOpts.Reg).NewHistogram(prometheus.HistogramOpts{

Range vector selectors are a bit more complex because points can be reused from a buffer, so it depends on what you count as a "processed" sample there.

You'll need to do something similar around here, depending on how you want to count samples at each step:

m.stepData.Floats = m.floats.ViewUntilSearchingBackwards(rangeEnd, m.stepData.Floats)
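
A hedged sketch of per-step accounting at that point, assuming the step's views expose a point count (the Count() method and the `samplesProcessed` counter are illustrative, not confirmed MQE API):

```go
// Hypothetical: after the step's views into the ring buffers have been built,
// count the points visible in this step. Because points are reused from the
// buffer across steps, the same underlying sample may be counted once per
// step it falls into; whether that matches the intended definition of
// "processed" is the open question above.
stepSamples := int64(m.stepData.Floats.Count() + m.stepData.Histograms.Count()) // Count() assumed to exist
samplesProcessed += stepSamples
```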

@charleskorn
Contributor

> Passing it back through the stack is slightly more complex. My suggestion, though, would be to store the counts on the limiting.MemoryConsumptionTracker. They can then be retrieved in pkg/streamingpromql/query.go and emitted as a metric.

I don't think we should add this to MemoryConsumptionTracker - its purpose is to track memory consumption, and the number of samples processed isn't that.

I also don't think we should emit this as a metric when there's already a mechanism to pass this data back to the caller of the query.

Instead, we should implement something specific to tracking query stats, and use that to populate a stats.Statistics and return it from the Query.Stats() method. This will then expose the information in the same way that Prometheus' engine does, and means #10103 should work as-is with MQE.

There are a bunch of fields on stats.Statistics that we'll have to ignore (e.g. all of Timers, Samples.PeakSamples, Samples.TotalSamplesPerStep, etc.), but it looks like all we need for this is Samples.TotalSamples.
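
A rough sketch of what that could look like, using Prometheus' util/stats types; the Query type and sampleCount field below are stand-ins for whatever MQE ends up using, not existing code:

```go
package streamingpromql

import "github.com/prometheus/prometheus/util/stats"

// Query stands in for MQE's query type; sampleCount is a hypothetical
// per-query counter incremented by the selectors as samples are loaded.
type Query struct {
	sampleCount int64
}

// Stats returns query statistics in the same shape Prometheus' engine uses,
// so callers (e.g. #10103) can consume it unchanged. Only TotalSamples is
// populated here; Timers is left nil, and real code may need to populate
// more of the struct than this sketch does.
func (q *Query) Stats() *stats.Statistics {
	return &stats.Statistics{
		Samples: &stats.QuerySamples{
			TotalSamples: q.sampleCount,
		},
	}
}
```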

> Range vector selectors are a bit more complex because points can be reused from a buffer, so it depends on what you count as a "processed" sample there.

If we can, we should do whatever Prometheus' engine does, unless it doesn't make sense in the context of MQE.

charleskorn linked a pull request (#10232) Dec 13, 2024 that will close this issue