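Why Prometheus's increase() function returns fractional values
Today I noticed that Prometheus's increase function returned a fractional value, so I dug into the source code. Below is the function that implements rate/increase/delta: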
https://github.com/prometheus/prometheus/blob/d77b56e88e3d554a499e22d2073812b59191256c/promql/functions.go#L55
// extrapolatedRate is a utility function for rate/increase/delta.
// It calculates the rate (allowing for counter resets if isCounter is true),
// extrapolates if the first/last sample is close to the boundary, and returns
// the result as either per-second (if isRate is true) or overall.
func extrapolatedRate(vals []parser.Value, args parser.Expressions, enh *EvalNodeHelper, isCounter bool, isRate bool) Vector {
	ms := args[0].(*parser.MatrixSelector)
	vs := ms.VectorSelector.(*parser.VectorSelector)

	var (
		// The series (struct holding the sample points).
		samples = vals[0].(Matrix)[0]
		// Start and end of the query range.
		rangeStart = enh.ts - durationMilliseconds(ms.Range+vs.Offset)
		rangeEnd   = enh.ts - durationMilliseconds(vs.Offset)
	)

	// No sense in trying to compute a rate without at least two points. Drop
	// this Vector element.
	// (With only 0 or 1 samples there is no increase to compute.)
	if len(samples.Points) < 2 {
		return enh.out
	}
	var (
		counterCorrection float64
		lastValue         float64
	)
	for _, sample := range samples.Points {
		if isCounter && sample.V < lastValue {
			// Counter reset handling (see the function comment above): a
			// counter normally never decreases, so sample.V < lastValue
			// means it was reset. Adding the value reached before the
			// reset keeps the total increase correct.
			counterCorrection += lastValue
		}
		lastValue = sample.V
	}
	// Difference between the last and the first sample value (the raw result).
	resultValue := lastValue - samples.Points[0].V + counterCorrection

	// Duration between first/last samples and boundary of range.
	// Time from the start of the range to the first sample.
	durationToStart := float64(samples.Points[0].T-rangeStart) / 1000
	// Time from the last sample to the end of the range.
	durationToEnd := float64(rangeEnd-samples.Points[len(samples.Points)-1].T) / 1000

	// Time between the first and the last sample.
	sampledInterval := float64(samples.Points[len(samples.Points)-1].T-samples.Points[0].T) / 1000
	// Average time between two adjacent samples.
	averageDurationBetweenSamples := sampledInterval / float64(len(samples.Points)-1)

	if isCounter && resultValue > 0 && samples.Points[0].V >= 0 {
		// Counters cannot be negative. If we have any slope at
		// all (i.e. resultValue went up), we can extrapolate
		// the zero point of the counter. If the duration to the
		// zero point is shorter than the durationToStart, we
		// take the zero point as the start of the series,
		// thereby avoiding extrapolation to negative counter
		// values.
		// durationToZero is the estimated time between the first sample
		// and the moment the counter would have been zero, following the
		// observed slope backwards. If it is shorter than durationToStart,
		// durationToStart is capped to it, for the reason given above.
		durationToZero := sampledInterval * (samples.Points[0].V / resultValue)
		if durationToZero < durationToStart {
			durationToStart = durationToZero
		}
	}

	// If the first/last samples are close to the boundaries of the range,
	// extrapolate the result. This is as we expect that another sample
	// will exist given the spacing between samples we've seen thus far,
	// with an allowance for noise.
	extrapolationThreshold := averageDurationBetweenSamples * 1.1
	extrapolateToInterval := sampledInterval

	// This branch is normally taken: extrapolationThreshold is larger than
	// averageDurationBetweenSamples, and with regular scrapes
	// durationToStart <= averageDurationBetweenSamples holds.
	if durationToStart < extrapolationThreshold {
		extrapolateToInterval += durationToStart
	} else {
		extrapolateToInterval += averageDurationBetweenSamples / 2
	}
	// Same reasoning as for durationToStart.
	if durationToEnd < extrapolationThreshold {
		extrapolateToInterval += durationToEnd
	} else {
		extrapolateToInterval += averageDurationBetweenSamples / 2
	}
	// Extrapolate the raw difference from the sampled interval to the
	// extrapolated interval. This is where the fraction comes from:
	// resultValue starts out as an integer, and the scaling gives it a
	// fractional part.
	resultValue = resultValue * (extrapolateToInterval / sampledInterval)
	if isRate {
		resultValue = resultValue / ms.Range.Seconds()
	}

	return append(enh.out, Sample{
		Point: Point{V: resultValue},
	})
}
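To experiment with this logic outside Prometheus, here is a minimal sketch of the increase() path (isCounter=true, isRate=false) on plain types. The names point and extrapolatedIncrease are hypothetical and not part of Prometheus; timestamps are in milliseconds as in the original.

```go
package main

import "fmt"

// point is a hypothetical stand-in for a Prometheus sample:
// T is a timestamp in milliseconds, V is the counter value.
type point struct {
	T int64
	V float64
}

// extrapolatedIncrease follows the same steps as extrapolatedRate for
// increase(). ok is false when fewer than two points are available.
func extrapolatedIncrease(pts []point, rangeStart, rangeEnd int64) (result float64, ok bool) {
	if len(pts) < 2 {
		return 0, false
	}
	first, last := pts[0], pts[len(pts)-1]

	// Counter-reset handling: a drop means the counter restarted, so add
	// back the value it had reached before the reset.
	var correction, prev float64
	for _, p := range pts {
		if p.V < prev {
			correction += prev
		}
		prev = p.V
	}
	result = last.V - first.V + correction

	durationToStart := float64(first.T-rangeStart) / 1000
	durationToEnd := float64(rangeEnd-last.T) / 1000
	sampledInterval := float64(last.T-first.T) / 1000
	avgInterval := sampledInterval / float64(len(pts)-1)

	// Do not extrapolate back past the point where the counter would be zero.
	if result > 0 && first.V >= 0 {
		durationToZero := sampledInterval * (first.V / result)
		if durationToZero < durationToStart {
			durationToStart = durationToZero
		}
	}

	threshold := avgInterval * 1.1
	extrapolateTo := sampledInterval
	if durationToStart < threshold {
		extrapolateTo += durationToStart
	} else {
		extrapolateTo += avgInterval / 2
	}
	if durationToEnd < threshold {
		extrapolateTo += durationToEnd
	} else {
		extrapolateTo += avgInterval / 2
	}
	return result * (extrapolateTo / sampledInterval), true
}

func main() {
	// Four samples 15s apart inside a 60s range [0s, 60s]; the counter
	// goes from 100 to 101 once. The raw increase 1 is scaled by 60/45.
	pts := []point{{5000, 100}, {20000, 100}, {35000, 101}, {50000, 101}}
	v, _ := extrapolatedIncrease(pts, 0, 60000)
	fmt.Printf("%.4f\n", v) // prints 1.3333
}
```

The sample data in main is made up for illustration; the first value is deliberately far from zero so that the durationToZero cap does not change the result.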
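The source shows that the difference between the first and the last sample is extrapolated before it becomes the final result. In general, with samples 15s apart and a 60s range, each evaluation picks up 4 samples (not 5), which gives only three intervals, or 45s of sampled time. Because the samples do not land exactly on the start and end of the range, there is a durationToStart and a durationToEnd, and durationToStart + durationToEnd = 60s - 45s = 15s, as shown in the figure.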
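The sample count can be checked with a small sketch. It assumes scrapes happen at a fixed phase inside each 15s period and treats the range as a closed window; both the helper samplesInWindow and the phase value of 4s are hypothetical, chosen only for illustration.

```go
package main

import "fmt"

// samplesInWindow lists the scrape timestamps (in seconds) that fall inside
// the window [start, start+window], assuming scrapes happen at
// phase, phase+scrape, phase+2*scrape, and so on.
func samplesInWindow(phase, scrape, start, window int) []int {
	var ts []int
	for t := phase; t <= start+window; t += scrape {
		if t >= start {
			ts = append(ts, t)
		}
	}
	return ts
}

func main() {
	const start, window = 0, 60
	ts := samplesInWindow(4, 15, start, window) // samples at 4s, 19s, 34s, 49s
	toStart := ts[0] - start
	toEnd := start + window - ts[len(ts)-1]
	fmt.Println(len(ts), toStart, toEnd, toStart+toEnd) // prints 4 4 11 15
}
```

Any phase strictly between 0s and 15s gives the same count of 4 and the same 15s total gap; only the split between the two gaps changes.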

Back to my problem. During a certain period my table gained exactly one record, so the counter rose by 1. With a 15s scrape interval:
1. With a 30s range, each evaluation covers only two samples (one interval), and the increment falls exactly between them. So initially resultValue=1, extrapolateToInterval=30s and sampledInterval=15s; after resultValue = resultValue * (extrapolateToInterval / sampledInterval), resultValue=2.
2. With a 1min range, each evaluation covers four samples (three intervals). So initially resultValue=1, extrapolateToInterval=60s and sampledInterval=45s; after resultValue = resultValue * (extrapolateToInterval / sampledInterval), resultValue=1.33, that is, 4/3.
3. With a 2min range, each evaluation covers 8 samples (7 intervals). So initially resultValue=1, extrapolateToInterval=120s and sampledInterval=105s; after resultValue = resultValue * (extrapolateToInterval / sampledInterval), resultValue=1.143, that is, 8/7.
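The three cases follow one pattern: with n evenly spaced samples in the range, the scaling factor is n/(n-1). The sketch below reproduces the numbers under the same assumptions as the list above: regular 15s scrapes, no sample exactly on a range boundary, and a counter far enough from zero that the durationToZero cap does not apply. The helper extrapolationFactor is hypothetical.

```go
package main

import "fmt"

// extrapolationFactor returns extrapolateToInterval / sampledInterval for a
// range that holds windowSec/scrapeSec evenly spaced samples. It assumes
// both gaps to the range boundaries are below the 1.1x threshold, so the
// result is extrapolated to the full range.
func extrapolationFactor(windowSec, scrapeSec float64) float64 {
	n := windowSec / scrapeSec // number of samples in the range
	sampledInterval := (n - 1) * scrapeSec
	return windowSec / sampledInterval
}

func main() {
	const rawIncrease = 1.0 // the counter rose by exactly 1
	for _, w := range []float64{30, 60, 120} {
		fmt.Printf("range=%vs increase=%.3f\n", w, rawIncrease*extrapolationFactor(w, 15))
	}
	// prints:
	// range=30s increase=2.000
	// range=60s increase=1.333
	// range=120s increase=1.143
}
```

The factor approaches 1 as the range grows relative to the scrape interval, which is why longer ranges give values closer to the true integer increase.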