Question on active health checking to an internal service using a BackendTrafficPolicy #4863

Open
mckornfield opened this issue Dec 7, 2024 · 5 comments
@mckornfield

mckornfield commented Dec 7, 2024

Description:
Let's say I have an HTTPRoute and a BackendTrafficPolicy that look like the following:

apiVersion: gateway.envoyproxy.io/v1alpha1
kind: BackendTrafficPolicy
metadata:
  name: my-app
  namespace: my-app
spec:
  hostnames:
  - api-external.cloud
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: cloud
    namespace: cloud
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: my-app
      port: 8000
      weight: 1
    matches:
    - path:
        type: Exact
        value: /healthcheck
---
apiVersion: gateway.networking.k8s.io/v1
kind: HTTPRoute
metadata:
  annotations:
    meta.helm.sh/release-name: my-app
    meta.helm.sh/release-namespace: my-app
  name: my-app
  namespace: my-app
spec:
  hostnames:
  - api-external.cloud
  parentRefs:
  - group: gateway.networking.k8s.io
    kind: Gateway
    name: cloud
    namespace: cloud
  rules:
  - backendRefs:
    - group: ""
      kind: Service
      name: my-app
      port: 8000
      weight: 1
    matches:
    - path:
        type: PathPrefix
        value: /v1/my-app

When I debug the generated config within Envoy Gateway, I see the following:

"health_checks": [
       {
        "timeout": "5s",
        "interval": "5s",
        "unhealthy_threshold": 1,
        "healthy_threshold": 2,
        "http_health_check": {
         "host": "api-external.cloud",
         "path": "/healthcheck",
         "expected_statuses": [
          {
           "start": "200",
           "end": "201"
          }
         ],
         "method": "GET"
        }
       }
      ],

This leads to an error, since the path /healthcheck is not exposed externally. I have a couple of questions:

  1. Shouldn't Envoy be able to look up this cluster locally instead of having to health check through the host?
  2. If this is expected, is the intended workaround to expose /healthcheck publicly through an HTTPRoute so that Envoy can reach it at this host?

Relevant Links:
Trying to follow: https://www.envoyproxy.io/docs/envoy/latest/intro/arch_overview/upstream/health_checking
And: https://gateway.envoyproxy.io/v0.6/design/backend-traffic-policy/

@arkodg
Contributor

arkodg commented Dec 7, 2024

Here's an example of what the config looks like:

- apiVersion: gateway.envoyproxy.io/v1alpha1

Docs are here: https://gateway.envoyproxy.io/docs/api/extension_types/#backendtrafficpolicyspec

@mckornfield
Author

Thanks for those references! I think I had found the API spec but not the first link.

To elaborate: if I have three services behind Envoy Gateway, and I follow that example and use /healthz for each, the check ends up going to whichever service "wins" the binding for that path, correct?

So each service would need its own path, such as /v2/healthz or /v3/healthz, for the checks to actually reach independent services.

In this snippet, for example:

- apiVersion: gateway.envoyproxy.io/v1alpha1
  kind: BackendTrafficPolicy
  metadata:
    namespace: default
    name: policy-for-route-1
  spec:
    targetRef:
      group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: httproute-1
    healthCheck:
      active:
        timeout: "1s"
        interval: "5s"
        unhealthyThreshold: 3
        healthyThreshold: 3
        type: HTTP
        http:
          path: "/healthz"
          method: "GET"
          expectedStatuses:
          - 200
          - 201
          expectedResponse:
            type: Text
            text: pong
      passive:
        baseEjectionTime: 150s
        interval: 1s
        maxEjectionPercent: 100
        consecutive5XxErrors: 5
        consecutiveGatewayErrors: 0
        consecutiveLocalOriginFailures: 5
        splitExternalLocalOriginErrors: false
- apiVersion: gateway.envoyproxy.io/v1alpha1
  kind: BackendTrafficPolicy
  metadata:
    namespace: default
    name: policy-for-route-4
  spec:
    targetRef:
      group: gateway.networking.k8s.io
      kind: HTTPRoute
      name: httproute-4
    healthCheck:
      active:
        timeout: "1s"
        interval: "5s"
        unhealthyThreshold: 3
        healthyThreshold: 3
        type: HTTP
        http:
          path: "/healthz"
          method: "GET"
          expectedResponse:
            type: Text
            text: pong

Are both checks not simply going to the same path once they're resolved in the Envoy config?

@arkodg
Contributor

arkodg commented Dec 7, 2024

The /healthz path will be called asynchronously on the my-app endpoint, irrespective of the path prefix matcher (/v1/my-app) you have on the route rule.
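For context, here is a rough sketch of how Envoy's active health checking addresses its probes, as I understand the Envoy documentation: each probe is sent straight to every endpoint of the upstream cluster, and the `host` value in `http_health_check` only fills in the Host header. The `Probe` type, the `build_probes` helper, and the pod IPs are hypothetical names for illustration; they are not part of Envoy or Envoy Gateway.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass(frozen=True)
class Probe:
    """One health check request as it would be sent to a single endpoint."""
    connect_to: str   # endpoint address the connection is opened to
    method: str
    path: str
    host_header: str  # fills the Host header only; it does not pick the destination


def build_probes(endpoints: List[Tuple[str, int]], host_header: str,
                 path: str, method: str = "GET") -> List[Probe]:
    """Return one probe per endpoint; the route's path match plays no part."""
    return [
        Probe(connect_to=f"{ip}:{port}", method=method, path=path,
              host_header=host_header)
        for ip, port in endpoints
    ]


# Hypothetical pod addresses behind the my-app Service on port 8000.
probes = build_probes([("10.0.0.11", 8000), ("10.0.0.12", 8000)],
                      host_header="api-external.cloud", path="/healthcheck")
for p in probes:
    print(f"{p.method} http://{p.connect_to}{p.path}  Host: {p.host_header}")
```

Under that reading, the health check path only needs to be served by the backend pods themselves; it would not need a matching HTTPRoute rule.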

@mckornfield
Author

So if I understand correctly, the health check config for the above examples, policy-for-route-1 and policy-for-route-4, will read:

"health_checks": [
       {
        "timeout": "1s",
        "interval": "5s",
        "unhealthy_threshold": 3,
        "healthy_threshold": 3,
        "http_health_check": {
         "host": "gateway.envoyproxy.io",
         "path": "/healthz",
         "expected_statuses": [
          {
           "start": "200",
           "end": "202"
          }
         ],
         "method": "GET"
        }
       }
      ],

Meaning they'll both go to gateway.envoyproxy.io/healthz, is that correct?
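A side note on the `expected_statuses` blocks in the two dumps: Envoy's range type is half-open (`start` inclusive, `end` exclusive), so `200`/`201` matches only 200, while `200`/`202` matches 200 and 201. The sketch below shows one way a status list could be collapsed into such ranges. The function name `to_half_open_ranges` is hypothetical, and whether Envoy Gateway merges adjacent codes exactly like this is an assumption.

```python
from typing import Iterable, List, Tuple


def to_half_open_ranges(statuses: Iterable[int]) -> List[Tuple[int, int]]:
    """Collapse status codes into sorted [start, end) ranges."""
    ranges: List[Tuple[int, int]] = []
    for code in sorted(set(statuses)):
        if ranges and ranges[-1][1] == code:
            # code directly follows the previous range: extend that range
            ranges[-1] = (ranges[-1][0], code + 1)
        else:
            # start a new single-code range
            ranges.append((code, code + 1))
    return ranges


print(to_half_open_ranges([200]))       # [(200, 201)]
print(to_half_open_ranges([200, 201]))  # [(200, 202)]
```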

@arkodg
Contributor

arkodg commented Dec 7, 2024

Yes, they'll go to the same server, but the decision on the backendRef may differ (if the settings are different), i.e. it is unique per policy, per route rule.
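To illustrate how the same response could lead to different decisions, here is a minimal sketch that evaluates one probe response against the two policies from the snippet above. `ActiveHTTPCheck` and `is_healthy` are hypothetical names; the default of "200 only" when `expectedStatuses` is omitted, and the substring match on the response body, are assumptions rather than confirmed Envoy Gateway behavior.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional


@dataclass(frozen=True)
class ActiveHTTPCheck:
    # Assumed default: only 200 counts as healthy when no statuses are listed.
    expected_statuses: FrozenSet[int] = frozenset({200})
    expected_text: Optional[str] = None


def is_healthy(check: ActiveHTTPCheck, status: int, body: str) -> bool:
    """Apply one policy's settings to a single probe response."""
    if status not in check.expected_statuses:
        return False
    return check.expected_text is None or check.expected_text in body


# Settings taken from policy-for-route-1 and policy-for-route-4 above.
route_1 = ActiveHTTPCheck(frozenset({200, 201}), "pong")
route_4 = ActiveHTTPCheck(expected_text="pong")

# The same backend answers 201 "pong" to both checks.
verdicts = {
    "policy-for-route-1": is_healthy(route_1, 201, "pong"),
    "policy-for-route-4": is_healthy(route_4, 201, "pong"),
}
print(verdicts)  # {'policy-for-route-1': True, 'policy-for-route-4': False}
```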
