How to set up regional fail-over for Google Cloud serverless NEGs

21 Dec, 2023

In this blog, we would like to show you how to set up a highly available, regional fail-over configuration when using serverless network endpoint groups (NEGs). Traditionally, network endpoint groups rely on health checks to determine whether a backend is available and ready to receive traffic. However, these health checks are not supported for serverless network endpoint groups. Without health checks, the load balancer keeps sending traffic to a region even when its backend is unhealthy, and those requests are dropped. So how do we redirect traffic away from an unhealthy NEG instead?

To solve this issue, Google introduced outlier detection. Outlier detection works by analyzing HTTP responses and ejecting unhealthy backends when certain thresholds are met. This functionality is available on global backend services with serverless NEGs.

Let's see how we can enable this on our backend service Terraform resource:

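resource "google_compute_backend_service" "backend" {
  ...

  # outlier_detection is a nested block, not an argument, so it takes no "=".
  outlier_detection {
    # Number of consecutive 5xx responses before a backend is ejected.
    consecutive_errors = 5

    # Base duration of an ejection; this is a nested block as well.
    base_ejection_time {
      seconds = 30
    }
  }
}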

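With the configuration above, the backend service ejects a backend after it returns five consecutive 5xx HTTP responses. An ejected backend receives no traffic for 30 seconds, after which it is returned to the pool and evaluated again. The 30 seconds is a base value: the actual ejection time is the base ejection time multiplied by the number of times the backend has been ejected, so a backend that keeps failing stays out of rotation for longer each time.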
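The same block accepts further tuning arguments. The following is a minimal sketch, not a recommendation: the argument names are those of the google_compute_backend_service resource, but every value is a placeholder chosen for illustration, and which arguments take effect for serverless NEGs should be verified against the load balancer documentation.

```hcl
resource "google_compute_backend_service" "backend" {
  # Other arguments omitted for brevity; only the name is shown.
  name = "backend"

  outlier_detection {
    consecutive_errors = 5

    # Percentage chance (0-100) that a backend is actually ejected once
    # consecutive_errors is reached. Set explicitly here (assumed value).
    enforcing_consecutive_errors = 100

    # Upper bound on the share of backends that may be ejected at the
    # same time (assumed value).
    max_ejection_percent = 50

    # How often the ejection analysis runs (assumed value).
    interval {
      seconds = 1
    }

    base_ejection_time {
      seconds = 30
    }
  }
}
```

A shorter interval makes the load balancer react faster to a failing region, while a longer base ejection time keeps a failing region out of rotation for longer before it is tried again.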

Now let's take a look at a full example:

locals {
  locations = toset(["europe-west1", "europe-west4"])
}

resource "google_compute_global_address" "ip" {
  name = "service-ip"
}

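resource "google_compute_region_network_endpoint_group" "neg" {
  # local.locations is already a set, so no extra toset() is needed.
  for_each = local.locations

  name                  = "neg-${each.key}"
  network_endpoint_type = "SERVERLESS"
  region                = each.key

  # Assumes a google_cloud_run_service.service resource, defined elsewhere
  # with the same for_each, exists in every region.
  cloud_run {
    service = google_cloud_run_service.service[each.key].name
  }
}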

resource "google_compute_backend_service" "backend" {
  name     = "backend"
  protocol = "HTTP"

  dynamic "backend" {
    for_each = local.locations

    content {
      group = google_compute_region_network_endpoint_group.neg[backend.key].id
    }
  }

  outlier_detection {
    consecutive_errors = 5

    base_ejection_time {
      seconds = 30
    }
  }
}

resource "google_compute_url_map" "url_map" {
  name            = "url-map"
  default_service = google_compute_backend_service.backend.id
}

resource "google_compute_target_http_proxy" "http_proxy" {
  name    = "http-proxy"
  url_map = google_compute_url_map.url_map.id
}

resource "google_compute_global_forwarding_rule" "frontend" {
  name       = "frontend"
  target     = google_compute_target_http_proxy.http_proxy.id
  port_range = "80"
  ip_address = google_compute_global_address.ip.address
}
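
The example refers to google_cloud_run_service.service without defining it. A minimal sketch of what that resource could look like is shown below. The service name, the container image, the public IAM binding, and the output are all assumptions made for illustration and are not part of the original setup.

```hcl
# Hypothetical Cloud Run service, deployed once per region in local.locations.
resource "google_cloud_run_service" "service" {
  for_each = local.locations

  name     = "service-${each.key}" # hypothetical name
  location = each.key

  template {
    spec {
      containers {
        # Placeholder image; replace it with your own.
        image = "us-docker.pkg.dev/cloudrun/container/hello"
      }
    }
  }

  traffic {
    percent         = 100
    latest_revision = true
  }
}

# Allows unauthenticated invocations. Drop this binding if callers
# authenticate themselves.
resource "google_cloud_run_service_iam_member" "public" {
  for_each = local.locations

  location = google_cloud_run_service.service[each.key].location
  service  = google_cloud_run_service.service[each.key].name
  role     = "roles/run.invoker"
  member   = "allUsers"
}

# Prints the load balancer address after apply.
output "load_balancer_ip" {
  value = google_compute_global_address.ip.address
}
```

Because the service, the NEG, and the backend all iterate over local.locations, adding a region to that set is enough to extend the fail-over setup to it.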

Conclusion

In the absence of health check support on serverless NEGs, always configure outlier detection on backend services that use serverless NEGs to ensure availability during regional failures. Outlier detection ejects unhealthy backends and keeps traffic from being sent to services in unavailable regions.
