
How to set up regional fail-over for Google Cloud serverless NEGs

source link: https://xebia.com/blog/how-to-setup-regional-fail-over-for-serverless-negs/

21 Dec, 2023

In this blog post, we show how to build a highly available, regional fail-over setup with serverless network endpoint groups (NEGs). Traditionally, backends behind a load balancer rely on health checks (such as HTTP or gRPC probes) to determine whether they are available and ready to receive traffic. However, health checks are not supported for serverless NEGs. Without them, the load balancer keeps routing traffic to a backend even when it is unhealthy, and those requests are dropped. So how do we redirect traffic away from an unhealthy NEG?

To solve this, Google introduced outlier detection. Outlier detection analyzes HTTP responses and ejects unhealthy backends once certain thresholds are met. It is available on global backend services with serverless NEGs.

Let's see how to enable it on our backend service Terraform resource:

resource "google_compute_backend_service" "backend" {
  ...

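  # outlier_detection is a nested block, not an attribute, so no "=" here.
  outlier_detection {
    # Base ejection duration; a nested block with a "seconds" field.
    base_ejection_time {
      seconds = 30
    }
    # Number of consecutive 5xx responses that triggers ejection.
    consecutive_errors = 5
  }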
}

With this configuration, the backend service ejects a backend after it returns five consecutive 5xx HTTP responses. An ejected backend receives no traffic for the ejection period (the 30-second base time multiplied by the number of times the backend has been ejected), after which it is tried again.
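
Beyond these two settings, the outlier_detection block of the Terraform Google provider accepts further tuning fields. The sketch below is illustrative only: the resource name "tuned" and all values are hypothetical, not recommendations from this post. The load_balancing_scheme line reflects our understanding that outlier detection is tied to the EXTERNAL_MANAGED scheme, so verify it against the provider documentation before relying on it.

```hcl
# Hypothetical sketch: resource name and all values are assumptions for illustration.
resource "google_compute_backend_service" "tuned" {
  name     = "backend-tuned"
  protocol = "HTTP"

  # Assumption: outlier detection requires the EXTERNAL_MANAGED scheme.
  # The forwarding rule in front of this service would need the same scheme.
  load_balancing_scheme = "EXTERNAL_MANAGED"

  outlier_detection {
    base_ejection_time {
      seconds = 30
    }
    consecutive_errors = 5

    # Time between ejection analysis sweeps; a sweep can eject backends
    # or return previously ejected backends to service.
    interval {
      seconds = 10
    }

    # Percentage chance (0-100) that a backend detected through
    # consecutive 5xx responses is actually ejected.
    enforcing_consecutive_errors = 100

    # Upper bound on the percentage of backends ejected at the same time.
    max_ejection_percent = 50
  }
}
```

A lower consecutive_errors value reacts faster to a regional outage but is more likely to eject a backend because of a short burst of transient errors, so these thresholds are a trade-off rather than fixed values.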

Now let's take a look at a full example:

locals {
  locations = toset(["europe-west1", "europe-west4"])
}

resource "google_compute_global_address" "ip" {
  name = "service-ip"
}

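# One serverless NEG per region, each pointing at that region's Cloud Run service.
resource "google_compute_region_network_endpoint_group" "neg" {
  for_each = local.locations # already a set, no extra toset() needed

  name                  = "neg-${each.key}"
  network_endpoint_type = "SERVERLESS"
  region                = each.key

  cloud_run {
    # google_cloud_run_service.service is defined elsewhere, one instance per region.
    service = google_cloud_run_service.service[each.key].name
  }
}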

resource "google_compute_backend_service" "backend" {
  name     = "backend"
  protocol = "HTTP"

  dynamic "backend" {
    for_each = local.locations

    content {
      group = google_compute_region_network_endpoint_group.neg[backend.key].id
    }
  }

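  # outlier_detection is a nested block, not an attribute, so no "=" here.
  outlier_detection {
    # Base ejection duration; a nested block with a "seconds" field.
    base_ejection_time {
      seconds = 30
    }
    # Number of consecutive 5xx responses that triggers ejection.
    consecutive_errors = 5
  }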
}

resource "google_compute_url_map" "url_map" {
  name            = "url-map"
  default_service = google_compute_backend_service.backend.id
}

resource "google_compute_target_http_proxy" "http_proxy" {
  name    = "http-proxy"
  url_map = google_compute_url_map.url_map.id
}

resource "google_compute_global_forwarding_rule" "frontend" {
  name       = "frontend"
  target     = google_compute_target_http_proxy.http_proxy.id
  port_range = "80"
  ip_address = google_compute_global_address.ip.address
}
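
The example references google_cloud_run_service.service without defining it. A minimal sketch of what that resource could look like follows. The service name, the sample container image, the public invoker binding and the output are assumptions for illustration, not part of the original setup.

```hcl
# Hypothetical: one Cloud Run service per region in local.locations.
resource "google_cloud_run_service" "service" {
  for_each = local.locations

  name     = "service-${each.key}"
  location = each.key

  template {
    spec {
      containers {
        # Assumption: Google's public sample image; replace with your own.
        image = "us-docker.pkg.dev/cloudrun/container/hello"
      }
    }
  }

  # Route all traffic to the latest revision.
  traffic {
    percent         = 100
    latest_revision = true
  }
}

# Assumption: the service is meant to be public, so unauthenticated
# requests arriving through the load balancer are allowed to invoke it.
resource "google_cloud_run_service_iam_member" "public" {
  for_each = local.locations

  location = google_cloud_run_service.service[each.key].location
  service  = google_cloud_run_service.service[each.key].name
  role     = "roles/run.invoker"
  member   = "allUsers"
}

# Expose the load balancer IP so the setup can be tested after apply.
output "load_balancer_ip" {
  value = google_compute_global_address.ip.address
}
```

Keying every resource on the same local.locations set means that adding a region to that set creates the service, its NEG and the backend entry together.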

Conclusion

In the absence of health check support for serverless NEGs, always configure outlier detection on backend services that use serverless NEGs to stay available during regional failures. Outlier detection ejects unhealthy backends and prevents traffic from being sent to services in unavailable regions.



