Improve HPA explainability with a machine-readable scaling decision trace #6107

@mattsu2020

Enhancement Description

Please keep this description up to date. This will help the Enhancement Team to track the evolution of the enhancement efficiently.

/sig autoscaling

Summary

HorizontalPodAutoscaler currently exposes status fields such as currentReplicas, desiredReplicas, currentMetrics, and conditions.

These fields are useful, but when multiple metrics are configured, it can still be difficult to understand why a particular desiredReplicas value was selected.

This enhancement proposes improving HPA explainability by providing a machine-readable view of the most recent scaling decision. The goal is to improve troubleshooting and observability without changing the HPA scaling algorithm.

One possible API surface is HPA status, but this issue does not assume that status is the only or final design. Other options, such as conditions, events, or another mechanism, should be discussed with SIG Autoscaling.

The initial Alpha scope would cover only the most recent reconciliation decision.

Information exposed by the decision trace could include:

  • selected or driving metric
  • per-metric proposed replicas
  • invalid metric reasons
  • final decision reason
  • whether the final decision was affected by limits, stabilization, tolerance, or invalid metrics

Open questions:

  • What is the appropriate API surface for this information?
  • Should this be guarded by a feature gate?
  • What is the minimal useful Alpha scope?
  • How can we avoid exposing controller implementation details that would make the API hard to evolve?
  • How should this relate to existing HPA status fields, conditions, events, and controller logs?
