Rate limit service

Rate limiting is a powerful technique to improve the availability and resilience of your services. In Emissary, each request can have one or more labels. These labels are exposed to a third-party service via a gRPC API. The third-party service can then rate limit requests based on the request labels.

Note that RateLimitService is only applicable to Emissary, and not Ambassador Edge Stack, as Ambassador Edge Stack includes a built-in rate limit service.

Request labels

See Attaching labels to requests for how to configure the labels that are attached to a request.

Domains

In Emissary, each engineer (or team) can be assigned its own domain. A domain is a separate namespace for labels. By creating individual domains, each team can assign their own labels to a given request, and independently set the rate limits based on their own labels.

See Attaching labels to requests for how to attach labels under different domains.
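To illustrate how a domain acts as a separate namespace, here is a minimal Go sketch of how a rate limit service might look up limits. The domain names (team-a, team-b), the label string, the limit values, and the names limitKey, limitsPerMinute, and lookupLimit are all hypothetical and chosen for illustration; they are not part of Emissary or its API.

```go
package main

import "fmt"

// limitKey identifies a limit by domain and label.
// Hypothetical type: not part of Emissary.
type limitKey struct {
	Domain string
	Label  string
}

// limitsPerMinute is illustrative data only. The same label appears under
// two domains, and each domain carries its own independent limit.
var limitsPerMinute = map[limitKey]int{
	{Domain: "team-a", Label: "generic_key=checkout"}: 100,
	{Domain: "team-b", Label: "generic_key=checkout"}: 10,
}

// lookupLimit returns the limit configured for a label within a domain,
// and whether any limit is configured there at all.
func lookupLimit(domain, label string) (int, bool) {
	n, ok := limitsPerMinute[limitKey{Domain: domain, Label: label}]
	return n, ok
}

func main() {
	for _, d := range []string{"team-a", "team-b", "team-c"} {
		n, ok := lookupLimit(d, "generic_key=checkout")
		fmt.Println(d, n, ok)
	}
}
```

Because the domain is part of the key, one team can change or remove its limit without affecting another team that uses the same label.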

External rate limit service

For Emissary to rate limit requests, you need to implement a gRPC RateLimitService, as defined in Envoy’s v1/rls.proto interface. If you do not have the time or resources to implement your own rate limit service, Ambassador Edge Stack integrates a high-performance rate limiting service.

Note: In a future version of Emissary, Emissary will change the gRPC service name used to communicate with RateLimitServices from the one defined in v1/rls.proto (pb.lyft.ratelimit.RateLimitService) to the one defined in v2/rls.proto (envoy.service.ratelimit.v2.RateLimitService):

  • In some future version of Emissary, there will be a setting to control which name is used. The default will be the current name, so using the new name will be opt-in.

  • In some future version of Emissary after that, no sooner than Emissary 1.7.0, the default value of that setting will change, so the new name will be used unless you opt out.

  • In some future version of Emissary after that, no sooner than Emissary 1.8.0, the setting will go away, and Emissary will always use the new name.

In the meantime, implementations of RateLimitService are encouraged to respond to both names. The two names are simply aliases of each other, and registering the service under both is usually a simple 1-or-2-line addition. For example, in Go the change to support both names is:

 import (
 	envoy_ratelimit_v1 "github.com/emissary-ingress/emissary/pkg/api/pb/lyft/ratelimit"
+	envoy_ratelimit_v2 "github.com/emissary-ingress/emissary/pkg/api/envoy/service/ratelimit/v2"
 )
...
 	envoy_ratelimit_v1.RegisterRateLimitServiceServer(myGRPCServer, myRateLimitImplementation)
+	envoy_ratelimit_v2.RegisterRateLimitServiceServer(myGRPCServer, myRateLimitImplementation)

Emissary generates a gRPC request to the external rate limit service and provides a list of labels on which the rate limit service can base its decision to accept or reject the request:

[
  {"source_cluster", "<local service cluster>"},
  {"destination_cluster", "<routed target cluster>"},
  {"remote_address", "<trusted address from x-forwarded-for>"},
  {"generic_key", "<descriptor_value>"},
  {"<some_request_header>", "<header_value_queried_from_header>"}
]

If Emissary cannot contact the rate limit service, it will allow the request to be processed as if there were no rate limit service configuration.

It is the external rate limit service’s responsibility to determine whether rate limiting should take place, depending on custom business logic. The rate limit service must simply respond to the request with an OK or OVER_LIMIT code:

  • If Envoy receives an OK response from the rate limit service, then Emissary allows the client request to resume being processed by the normal flow.
  • If Envoy receives an OVER_LIMIT response, then Emissary will return an HTTP 429 response to the client and will end the transaction flow, preventing the request from reaching the backing service.

The headers injected by the AuthService can also be passed to the rate limit service since the AuthService is invoked before the RateLimitService.

Configuring the rate limit service

A RateLimitService manifest configures Emissary to use an external service to check and enforce rate limits for incoming requests:

---
apiVersion: getambassador.io/v2
kind:  RateLimitService
metadata:
  name:  ratelimit
spec:
  service: "example-rate-limit.default:5000"
  protocol_version: v2    # optional; one of v2 or v3; default is v2

  • service gives the URL of the rate limit service. If using a Kubernetes service, this should be the namespace-qualified DNS name of that service.
  • protocol_version (optional) selects the gRPC service name used to communicate with the RateLimitService. Allowed values are v2, which uses the envoy.service.ratelimit.v2.RateLimitService service name, and v3, which uses the envoy.service.ratelimit.v3.RateLimitService service name. Note that v3 requires Emissary to run in Envoy v3 mode, which is enabled by setting the AMBASSADOR_ENVOY_API_VERSION=V3 environment variable.

You may only use a single RateLimitService manifest.

Rate limit service and TLS

You can tell Emissary to use TLS to talk to your service by giving the service of the RateLimitService an https:// prefix. You may also provide a tls attribute: if tls is present and true, Emissary will originate TLS even if the service does not have the https:// prefix.

If tls is present with a value that is not true, the value is assumed to be the name of a defined TLS context, which will determine the certificate presented to the upstream service.
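The rules above can be read as a small decision procedure, sketched here in Go. The tlsDecision function is hypothetical and is not Emissary's code. The sketch represents an absent tls attribute as the empty string, follows the text literally in treating any value other than true as a context name, and assumes that naming a TLS context also implies originating TLS.

```go
package main

import (
	"fmt"
	"strings"
)

// tlsDecision reports whether TLS would be originated to the rate limit
// service and, if a TLS context was named, which one.
// service is the value of the service field; tls is the value of the tls
// attribute written as a string, with "" meaning the attribute is absent.
func tlsDecision(service, tls string) (originate bool, context string) {
	switch {
	case tls == "true":
		// tls: true originates TLS regardless of the service prefix.
		return true, ""
	case tls != "":
		// Any other value is taken to be the name of a TLS context
		// (assumption: this also implies originating TLS).
		return true, tls
	default:
		// No tls attribute: only the https:// prefix enables TLS.
		return strings.HasPrefix(service, "https://"), ""
	}
}

func main() {
	fmt.Println(tlsDecision("example-rate-limit.default:5000", ""))
	fmt.Println(tlsDecision("https://example-rate-limit.default:5000", ""))
	fmt.Println(tlsDecision("example-rate-limit.default:5000", "true"))
	fmt.Println(tlsDecision("example-rate-limit.default:5000", "my-context"))
}
```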

Example

The Emissary Rate Limiting Tutorial has a simple rate limiting example. For a more advanced example, read the advanced rate limiting tutorial, which uses the rate limit service that is integrated with Ambassador Edge Stack.

Further reading


Last modified September 9, 2024: Update all 1.14 metadata to fix navigation (c0afada)