Configuration validation segfaults with cluster_type configured for Redis #37623

Open
L30Bola opened this issue Dec 11, 2024 · 4 comments

L30Bola commented Dec 11, 2024

Title: Configuration validation segfaults with cluster_type configured for Redis

Description:
Validating a configuration with the CLI (envoy -c /etc/envoy/redis.yaml --mode validate) segfaults. Even though validation segfaults, Envoy works as expected when run without the --mode validate flag.

Repro steps:
My current configuration is like this:

redis.yaml
static_resources:
  listeners:
  - name: redis_listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 6379
    filter_chains:
    - filters:
      - name: envoy.filters.network.redis_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.redis_proxy.v3.RedisProxy
          stat_prefix: redis_stats
          settings:
            op_timeout: 5s
            enable_hashtagging: true
            enable_redirection: true
            enable_command_stats: true
            dns_cache_config:
              name: redis_cache
              dns_lookup_family: V4_ONLY
              dns_refresh_rate: 300s
              dns_min_refresh_rate: 30s
          prefix_routes:
            catch_all_route:
              cluster: redis_cluster

  clusters:
  - name: redis_cluster
    cluster_type:
      name: envoy.clusters.redis
      typed_config:
        "@type": type.googleapis.com/google.protobuf.Struct
    connect_timeout: 5s
    lb_policy: CLUSTER_PROVIDED
    load_assignment:
      cluster_name: redis_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: elasticache-endpoint
                port_value: 6379
    typed_extension_protocol_options:
      envoy.filters.network.redis_proxy:
        "@type": type.googleapis.com/envoy.extensions.filters.network.redis_proxy.v3.RedisProtocolOptions
        auth_password: 
          inline_string: "12345678"
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext

I searched the documentation and some closed issues to come up with the above configuration, but when I try to validate it with the CLI, it segfaults. I'm using the Docker image envoyproxy/envoy:contrib-v1.32-latest for my current setup: I copy the above configuration into the container and try to start Envoy with the CLI.

If I replace cluster_type with type: STRICT_DNS and change lb_policy to ROUND_ROBIN, everything continues to work and the CLI can validate the configuration.

redis.yaml - with type: STRICT_DNS and lb_policy: ROUND_ROBIN
static_resources:
  listeners:
  - name: redis_listener
    address:
      socket_address:
        address: 0.0.0.0
        port_value: 6379
    filter_chains:
    - filters:
      - name: envoy.filters.network.redis_proxy
        typed_config:
          "@type": type.googleapis.com/envoy.extensions.filters.network.redis_proxy.v3.RedisProxy
          stat_prefix: redis_stats
          settings:
            op_timeout: 5s
            enable_hashtagging: true
            enable_redirection: true
            enable_command_stats: true
            dns_cache_config:
              name: redis_cache
              dns_lookup_family: V4_ONLY
              dns_refresh_rate: 300s
              dns_min_refresh_rate: 30s
          prefix_routes:
            catch_all_route:
              cluster: redis_cluster

  clusters:
  - name: redis_cluster
    connect_timeout: 5s
    type: STRICT_DNS
    lb_policy: ROUND_ROBIN
    load_assignment:
      cluster_name: redis_cluster
      endpoints:
      - lb_endpoints:
        - endpoint:
            address:
              socket_address:
                address: elasticache-endpoint
                port_value: 6379
    typed_extension_protocol_options:
      envoy.filters.network.redis_proxy:
        "@type": type.googleapis.com/envoy.extensions.filters.network.redis_proxy.v3.RedisProtocolOptions
        auth_password: 
          inline_string: "12345678"
    transport_socket:
      name: envoy.transport_sockets.tls
      typed_config:
        "@type": type.googleapis.com/envoy.extensions.transport_sockets.tls.v3.UpstreamTlsContext
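
To make the difference between the two variants explicit, here is a minimal sketch that compares only the cluster-level fields that changed. The dictionaries are hand-copied from the two redis.yaml files above; the names crashing, validating, and diff_fields are illustrative and not part of Envoy.

```python
# Cluster fields copied by hand from the two redis.yaml variants above.
# Fields that are identical in both (load_assignment, Redis auth options,
# transport_socket) are omitted for brevity.
crashing = {
    "cluster_type": {
        "name": "envoy.clusters.redis",
        "typed_config": {"@type": "type.googleapis.com/google.protobuf.Struct"},
    },
    "connect_timeout": "5s",
    "lb_policy": "CLUSTER_PROVIDED",
}
validating = {
    "type": "STRICT_DNS",
    "connect_timeout": "5s",
    "lb_policy": "ROUND_ROBIN",
}


def diff_fields(a, b):
    """Return {field: (value_in_a, value_in_b)} for fields that differ.

    None marks a field that is absent on that side.
    """
    return {
        key: (a.get(key), b.get(key))
        for key in sorted(a.keys() | b.keys())
        if a.get(key) != b.get(key)
    }


for field, (before, after) in diff_fields(crashing, validating).items():
    print(f"{field}: {before!r} -> {after!r}")
```

Only cluster_type, type, and lb_policy differ, which suggests the crash is tied to the envoy.clusters.redis cluster type (or its CLUSTER_PROVIDED load balancer) rather than to the listener or TLS settings.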

Admin and Stats Output:
Envoy hasn't actually started when this strange behavior happens, so the only output is from the CLI validate run:

Envoy CLI validate
root@envoy-cc48545bb-pmnl9:/# vim /etc/envoy/redis.yaml
root@envoy-cc48545bb-pmnl9:/# envoy -c /etc/envoy/redis.yaml --mode validate
[2024-12-11 23:16:46.436][11677][info][main] [source/server/server.cc:879] runtime: {}
[2024-12-11 23:16:46.437][11677][info][config] [source/server/configuration_impl.cc:168] loading tracing configuration
[2024-12-11 23:16:46.437][11677][info][config] [source/server/configuration_impl.cc:124] loading 0 static secret(s)
[2024-12-11 23:16:46.437][11677][info][config] [source/server/configuration_impl.cc:130] loading 1 cluster(s)
[2024-12-11 23:16:46.438][11677][info][config] [source/server/configuration_impl.cc:138] loading 1 listener(s)
[2024-12-11 23:16:46.440][11677][info][config] [source/server/configuration_impl.cc:154] loading stats configuration
configuration '/etc/envoy/redis.yaml' OK
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:127] Caught Segmentation fault, suspect faulting address 0x220
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:111] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:112] Envoy version: a0504e87c5a246cb097b37049b1e4dc7706c2a90/1.32.2/Clean/RELEASE/BoringSSL
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:114] Address mapping: 5d3e70c4b000-5d3e747c5000 /usr/local/bin/envoy
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #0: [0x7ce643148520]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #1: [0x5d3e725d135e]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #2: [0x5d3e72f903ee]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #3: [0x5d3e72f90219]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #4: [0x5d3e733d035c]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #5: [0x5d3e733cf550]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #6: [0x5d3e72f8febf]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #7: [0x5d3e70cd8b16]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #8: [0x5d3e7236c490]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #9: [0x5d3e7236b1fa]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #10: [0x5d3e7237803c]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #11: [0x5d3e72380073]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #12: [0x5d3e723767e1]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #13: [0x5d3e72368ea2]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #14: [0x5d3e70cd290c]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #15: [0x5d3e70cd294e]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #16: [0x5d3e725da280]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #17: [0x5d3e7255ad4d]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #18: [0x5d3e7255a5d9]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #19: [0x5d3e72547d95]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #20: [0x5d3e7251929b]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #21: [0x5d3e72519a3e]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #22: [0x5d3e70c4b14c]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #23: [0x7ce64312fd90]
Segmentation fault (core dumped)
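
The backtrace above contains only raw addresses. The log itself points to tools/stack_decode.py for symbolization; as a rough illustration of the first step, the sketch below converts each frame address into an offset relative to the start of the "Address mapping" range. It assumes a position-independent binary whose mapping base corresponds to offset zero, which is not confirmed in this report, and the resulting offsets are only meaningful against symbols from the exact same build. The helper name frame_offsets is hypothetical.

```python
import re

MAPPING_RE = re.compile(r"Address mapping: ([0-9a-f]+)-([0-9a-f]+) (\S+)")
FRAME_RE = re.compile(r"#(\d+): \[0x([0-9a-f]+)\]")


def frame_offsets(log_text):
    """Return (binary_path, [(frame_no, offset), ...]).

    Only frames whose address falls inside the binary's mapping are kept;
    frames outside it belong to other objects (for example libc).
    """
    mapping = MAPPING_RE.search(log_text)
    if mapping is None:
        raise ValueError("no 'Address mapping' line found")
    start, end = int(mapping.group(1), 16), int(mapping.group(2), 16)
    offsets = []
    for number, address in FRAME_RE.findall(log_text):
        addr = int(address, 16)
        if start <= addr < end:
            # Assumption: offset = runtime address - mapping base.
            offsets.append((int(number), addr - start))
    return mapping.group(3), offsets


# A few lines copied from the backtrace above.
sample = """\
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:114] Address mapping: 5d3e70c4b000-5d3e747c5000 /usr/local/bin/envoy
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #0: [0x7ce643148520]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #1: [0x5d3e725d135e]
[2024-12-11 23:16:46.441][11677][critical][backtrace] [./source/server/backtrace.h:121] #2: [0x5d3e72f903ee]
"""

binary, frames = frame_offsets(sample)
for number, offset in frames:
    print(f"{binary} frame #{number}: offset {offset:#x}")
```

Frame #0 is dropped because its address lies outside the envoy mapping. The offsets could then be fed to a symbolizer together with the debug build of the same commit.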

The -dev Docker image has the same problem:

Envoy CLI validate -dev version
root@envoy-d9c55694c-crsgh:/# envoy -c /etc/envoy/redis.yaml --mode validate
[2024-12-11 23:29:14.911][796][info][main] [source/server/server.cc:889] runtime: {}
[2024-12-11 23:29:14.912][796][info][config] [source/server/configuration_impl.cc:173] loading tracing configuration
[2024-12-11 23:29:14.912][796][info][config] [source/server/configuration_impl.cc:124] loading 0 static secret(s)
[2024-12-11 23:29:14.912][796][info][config] [source/server/configuration_impl.cc:130] loading 1 cluster(s)
[2024-12-11 23:29:14.913][796][info][config] [source/server/configuration_impl.cc:140] loading 1 listener(s)
[2024-12-11 23:29:14.915][796][info][config] [source/server/configuration_impl.cc:156] loading stats configuration
configuration '/etc/envoy/redis.yaml' OK
[2024-12-11 23:29:14.915][796][critical][backtrace] [./source/server/backtrace.h:127] Caught Segmentation fault, suspect faulting address 0x220
[2024-12-11 23:29:14.915][796][critical][backtrace] [./source/server/backtrace.h:111] Backtrace (use tools/stack_decode.py to get line numbers):
[2024-12-11 23:29:14.915][796][critical][backtrace] [./source/server/backtrace.h:112] Envoy version: 602a2b97a54435c38cb5bacf133b9984d2212498/1.33.0-dev/Clean/RELEASE/BoringSSL
[2024-12-11 23:29:14.915][796][critical][backtrace] [./source/server/backtrace.h:114] Address mapping: 5dce9d1ac000-5dcea0d92000 /usr/local/bin/envoy
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #0: [0x7b2633dfb520]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #1: [0x5dce9eb2e6b2]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #2: [0x5dce9eb2dc43]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #3: [0x5dce9f54d764]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #4: [0x5dce9f54d589]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #5: [0x5dce9f95d58c]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #6: [0x5dce9f95c780]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #7: [0x5dce9f54d22f]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #8: [0x5dce9d236f56]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #9: [0x5dce9e8c24a0]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #10: [0x5dce9e8c120a]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #11: [0x5dce9e8ce05c]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #12: [0x5dce9e8d6093]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #13: [0x5dce9e8cc801]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #14: [0x5dce9e8beeb2]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #15: [0x5dce9d230d3c]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #16: [0x5dce9d230d7e]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #17: [0x5dce9eb36960]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #18: [0x5dce9eab784d]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #19: [0x5dce9eab70d9]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #20: [0x5dce9eaa4815]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #21: [0x5dce9ea768ab]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #22: [0x5dce9ea7704e]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #23: [0x5dce9d1ac11c]
[2024-12-11 23:29:14.916][796][critical][backtrace] [./source/server/backtrace.h:121] #24: [0x7b2633de2d90]
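
In both runs the validator prints "configuration '/etc/envoy/redis.yaml' OK" before the crash, so a script that only looks at the output text could miss the failure. A minimal sketch, assuming a POSIX system, that classifies a run by its exit status instead; run_and_classify is a hypothetical helper, and the demo uses a stand-in child process so that it runs without Envoy installed.

```python
import signal
import subprocess
import sys


def run_and_classify(argv):
    """Run argv and return 'ok', 'failed', or 'crashed:<SIGNAL>'."""
    result = subprocess.run(argv, capture_output=True, text=True)
    if result.returncode < 0:
        # On POSIX, subprocess reports termination by signal N as -N.
        return f"crashed:{signal.Signals(-result.returncode).name}"
    return "ok" if result.returncode == 0 else "failed"


# Stand-in child that kills itself with SIGSEGV, so the sketch is runnable
# without Envoy.
demo = [sys.executable, "-c", "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
print(run_and_classify(demo))
```

For the case in this report the argument list would be ["envoy", "-c", "/etc/envoy/redis.yaml", "--mode", "validate"], using the command line and path shown above.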

Config:
Included in the Repro steps.

Logs:
Same as above.

Call Stack:
The backtrace shown above is all I have.

L30Bola added the bug and triage (Issue requires triage) labels on Dec 11, 2024
wbpcode added the area/redis label and removed the triage (Issue requires triage) label on Dec 13, 2024
Member

wbpcode commented Dec 13, 2024

cc @mattklein123

Member

wbpcode commented Dec 13, 2024

cc @phlax do we provide an image with an unstripped binary or with a debug file? Then @L30Bola could produce a more helpful stack.

Member

phlax commented Dec 13, 2024

There is the debug container: https://hub.docker.com/r/envoyproxy/envoy/tags?name=debug

Author

L30Bola commented Dec 13, 2024

With that debug image, should I just re-run the same tests (and maybe save some files or even the entire output), or do I need to add any additional flags to the CLI?

3 participants