Istio rate limits for egress traffic

Recently we got blocked by one of our external service providers because of the high number of calls made to their service by our apps deployed on a k8s cluster. To avoid being blocked again we had to follow a simple rule: don’t make more than 80k requests per day.
Since our stack contains apps written in different languages, such as Java and Python, it seemed reasonable to introduce the rate limit at the infrastructure level instead of adding it to the code of each individual microservice. What’s more, rate limiting was one of the reasons why we decided to use Istio as a service mesh: it supports rate limiting out of the box. At least that was what we read in the docs, but it turned out that it’s not as easy to implement as we initially thought, mainly because Istio has deprecated rate limiter configuration via mixer policy since version 1.5 (see the docs):

The mixer policy is deprecated in Istio 1.5 and not recommended for production usage.
Consider using Envoy native rate limiting instead of mixer rate limiting. Istio will add support for native rate limiting API through the Istio extensions API.

I don’t mind using the “Envoy native rate limiting” approach, but I failed to find any examples of how to use it for egress traffic. There were a few docs, blog posts and examples, but all of them covered limiting ingress traffic, not egress. What’s more, the Envoy configs were hard to follow for a person like me who hasn’t worked much with Envoy before, so modifying and adjusting them to my needs took me a while.

Hopefully this article will shed some light on how Envoy works and how to configure it from Istio to limit egress traffic.


In the following examples I used Kubernetes 1.17 and Istio 1.6.5.

Istio / Envoy rate limit architecture

Something that is not obvious at first glance is that Envoy doesn’t have its own rate limiting mechanism. It relies on an external service to do the rate limiting; all Envoy does is forward the particular HTTP requests to that service and wait for a response. If the rate limit service responds with HTTP 200, Envoy forwards the traffic; if the response is HTTP 429, Envoy blocks the request and returns the same HTTP code (429) to the sender.
Using an external rate limit service gives a lot of flexibility: you can use different implementations of the rate limit service, including one you write yourself to suit your specific needs.
The Envoy team did a great job in providing a rate limit service that works with the Envoy proxy. RateLimit counts the HTTP calls and compares the count with a configured quota (max value) to decide whether a request should be forwarded or blocked. Because the counters are stored in a cache engine (Redis), it’s possible to run more than one instance of RateLimit in the stack.
The following diagram presents the architecture that we will use in this article.

  1. The application container tries to establish a connection to an external service. I will use one external domain as an example for the rate limit configuration. It’s selected on purpose, as it serves both HTTP and HTTPS and is used in the Istio docs that I will refer to in a later chapter, where I’ll try to explain HTTPS rate limiting.
  2. Traffic is intercepted by istio-sidecar, and the Envoy configuration points it to the RateLimit service. This service is also deployed in k8s and has 2 pods associated with it.
  3. Traffic gets to one of the RateLimit pods.
  4. The number of calls to particular endpoints is stored in Redis. RateLimit informs Redis about each new call and also fetches the total number of calls to the specific endpoint.
  5. The number of calls is received from Redis, and the RateLimit pod can decide whether it’s within the configured limit or above it.
  6. Based on that decision, RateLimit sends a response to Envoy: either HTTP 200 (blue line) or HTTP 429 (red line).
  7. Only if HTTP 200 is returned from RateLimit does Envoy forward the traffic to the external service.
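
To make steps 4 to 6 more concrete, here is a minimal Python sketch of the counting decision. It is only an illustration under my own assumptions: the class name, the fixed one-hour window and the in-memory dict standing in for Redis are all hypothetical, and the real RateLimit service may count differently.

```python
import time


class FixedWindowLimiter:
    """Toy stand-in for RateLimit + Redis: counts calls per key in fixed time windows."""

    def __init__(self, limit, window_seconds, clock=time.time):
        self.limit = limit                    # configured quota (max calls per window)
        self.window_seconds = window_seconds  # length of one counting window
        self.clock = clock                    # injectable clock, handy for testing
        self.counters = {}                    # plays the role of Redis: (key, window) -> count

    def check(self, key):
        """Register one call and return 200 if it fits the quota, 429 otherwise."""
        window = int(self.clock() // self.window_seconds)
        bucket = (key, window)
        self.counters[bucket] = self.counters.get(bucket, 0) + 1
        return 200 if self.counters[bucket] <= self.limit else 429


def demo():
    """Four calls in the first hour, then one call after the window has rolled over."""
    now = [0.0]  # fake clock so the example does not have to wait an hour
    limiter = FixedWindowLimiter(limit=3, window_seconds=3600, clock=lambda: now[0])
    first_hour = [limiter.check("external-service") for _ in range(4)]
    now[0] = 3600.0  # jump to the next window
    next_hour = limiter.check("external-service")
    return first_hour, next_hour
```

In this sketch the fourth call inside the same window is rejected, and the counter effectively starts from zero once the next window begins.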

First things first — understanding Envoy

My first attempt to configure the rate limiter was to take the examples provided by folks in one of the GitHub issues. However, for a person who has never worked with Envoy before, those examples are difficult to understand. So I started to read the Envoy docs to get at least a basic idea of how it works. Going through all the docs is a pain, so let me just point out the most relevant terminology Envoy uses:

Downstream: A downstream host connects to Envoy, sends requests, and receives responses.

Upstream: An upstream host receives connections and requests from Envoy and returns responses.

Listener: A listener is a named network location (e.g., port, unix domain socket, etc.) that can be connected to by downstream clients. Envoy exposes one or more listeners that downstream hosts connect to.

Cluster: A cluster is a group of logically similar upstream hosts that Envoy connects to. Envoy discovers the members of a cluster via service discovery. It optionally determines the health of cluster members via active health checking. The cluster member that Envoy routes a request to is determined by the load balancing policy.

At the beginning I was confused about what an Envoy cluster is. To make it simpler, we can assume that in Istio each k8s service is added to the Envoy configuration as a cluster. If Envoy wants to reach one of your services configured in k8s, it does so through the cluster that refers to that service.
Let’s see what these terms look like in our Istio mesh. I’ve deployed a testing pod called “bastion-box” that will simulate my source application, the one that we will try to put a limit on. The bastion-box pod has istio-sidecar enabled, so let’s display the Envoy listeners of that istio-sidecar:

istioctl proxy-config listener bastion-box-7449bfcbb9-59r4c
ADDRESS PORT  TYPE
        443   TCP
        53    TCP
        6379  TCP
        15012 TCP
        15443 TCP
        10250 TCP
        10250 TCP
        27017 TCP
        443   TCP
        443   TCP
        10250 TCP
        3306  TCP
        10250 TCP
        10250 TCP
        10250 TCP
        443   TCP
        10250 TCP
        10255 HTTP+TCP
        8181  HTTP+TCP
        9200  HTTP+TCP
        9092  HTTP+TCP
        80    HTTP+TCP
        5672  HTTP+TCP
        10252 HTTP+TCP
        4194  HTTP+TCP
        443   HTTP+TCP
        443   HTTP+TCP
        15672 HTTP+TCP
        9090  HTTP+TCP
        8085  HTTP+TCP
        8444  HTTP+TCP
        4369  HTTP+TCP
        8080  HTTP+TCP
        9091  HTTP+TCP
        9153  HTTP+TCP
        10249 HTTP+TCP
        2181  HTTP+TCP
        389   HTTP+TCP
        25672 HTTP+TCP
        8444  HTTP+TCP

My first thought was that Envoy, like any other service, listens on one particular port, so I expected to see a single listener. So why are there so many listeners configured? To explain this, let’s refer to the Istio docs:

If you query the listener summary on a pod you will notice Istio generates the following listeners:

A listener on that receives all inbound traffic to the pod and a listener on that receives all outbound traffic to the pod, then hands the request over to a virtual listener.

A virtual listener per service IP, per each non-HTTP for outbound TCP/HTTPS traffic.

A virtual listener on the pod IP for each exposed port for inbound traffic.

A virtual listener on per each HTTP port for outbound HTTP traffic.

That should explain why there are so many listeners. However, since we are trying to limit egress traffic, let’s focus on the last sentence: we will be connecting to our external domain on port 80, so our traffic will go through the virtual listener for that port. Let’s see how the configuration of that listener looks (the output is shortened for clarity).

There are 2 important things to notice:

  • line 22; envoy.http_connection_manager: you may read a full description of it here. This is the part of Envoy that describes how to handle HTTP traffic. The HTTP traffic handling rules are called httpFilters; examples of httpFilters are external authentication, CORS handling, fault injection and rate limiting. By default the rate limiting HTTP filter is not enabled on our listener, so one of our tasks will be to add it there. The full list of httpFilters may be found here.
  • line 30; routeConfigName: the name of the Envoy routing configuration, which is the set of rules that Envoy should follow when forwarding traffic. In other words, should Envoy use some specific httpFilter when it forwards traffic to a particular domain? We would like Envoy to use the rateLimit filter, right? So the route is the place where we’ll say that we want to limit the traffic going to our external domain.

Let’s see how that route config looks for our bastion-box pod:

The full output is pretty long, so I took a single route as an example; it points to my grafana service in the monitoring namespace. The most important lines are:

  • line 8: the list of domains that will match this particular route
  • line 27; cluster: when the domain is matched, where should Envoy forward the traffic to?

We see that traffic hitting domain “prometheus-operator-grafana.monitoring.svc.cluster.local” will be forwarded to cluster “outbound|80||prometheus-operator-grafana.monitoring.svc.cluster.local”.
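
The lookup described above can be sketched in a few lines of Python. This is a simplified illustration and not how Envoy is implemented: the function names and the “external.example” host are hypothetical, and real Envoy domain matching also supports wildcards and other options.

```python
GRAFANA = "prometheus-operator-grafana.monitoring.svc.cluster.local"


def outbound_cluster_name(host, port, subset=""):
    """Build a cluster name in the outbound|port|subset|host form seen in the route output."""
    return f"outbound|{port}|{subset}|{host}"


def pick_cluster(virtual_hosts, authority):
    """Return the cluster of the first route whose domain list contains the requested host."""
    for virtual_host in virtual_hosts:
        if authority in virtual_host["domains"]:
            return virtual_host["cluster"]
    return None  # no matching route in this simplified table


# One entry of a toy route table, modelled on the grafana route from the output.
virtual_hosts = [
    {
        "domains": [GRAFANA, f"{GRAFANA}:80"],
        "cluster": outbound_cluster_name(GRAFANA, 80),
    },
]

print(pick_cluster(virtual_hosts, GRAFANA))
print(pick_cluster(virtual_hosts, "external.example"))
```

The first lookup resolves to the grafana cluster, while the hypothetical external host has no entry in this toy table.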

Let’s see if that cluster even exists:

Cool, it’s there. Do you remember when I mentioned that “clusters” in Envoy are created automatically based on k8s services? That’s why I see my grafana cluster on the list: it’s one of the services I have deployed in the monitoring namespace. But if you take a look at the route config one more time, you will notice there are no routes to domains external to k8s. What’s more, there are no such clusters on the cluster list, so configuring a route to a non-existing cluster won’t be possible. That’s why we need to add a new cluster for our testing domain later on.

RateLimit service setup

To be able to configure rate limiting, we first need a rateLimit service. I will use the one provided by the Envoy team, available here. Unfortunately I haven’t found any reasonable Helm chart that helps in deploying this service to k8s. I decided to take this one as an example and adjust it to my needs (the original one didn’t start rateLimit properly and doesn’t have the statsd sidecar which is needed to export rateLimit metrics to Prometheus). You can find the chart I used on my GitHub.
Before you launch the chart you need to set the Redis settings in values.yaml. For testing purposes I deployed Redis in the k8s cluster, in the same namespace where I’ll deploy the rateLimit service. Adjusting the configuration of the rateLimit app is also mandatory. You may check out the documentation here. I find those docs really good, but let me highlight one item that may be confusing. The top level of the rateLimit configuration is a “domain”, which has nothing to do with the domain you are trying to rate limit. A rateLimit “domain” is just a name for the set of rules that will be applied. It can be anything, and I will name it “my-rules”. Remember that name though, as we need to configure Istio to refer to this particular set of rules. The config part of my Helm chart looks like below. We will get to the key-value part of it later on.

config: |-
  domain: my-rules
  descriptors:
    - key: generic_key
      rate_limit:
        unit: hour
        requests_per_unit: 3

After chart deployment we can see that each pod has 2 containers; one with the rateLimit app and one with statsd exposing the metrics if you want to scrape them later on.

Let’s configure it — istio configuration

In the previous chapters we identified a few problems that need to be solved to configure rate limiting in Istio:

  • there is no Envoy cluster for services/domains external to our service mesh
  • there is no Envoy route to the external domains that we can apply some rules to (i.e. rate limiting)
  • even if we had a cluster and a route, we need to add an additional HTTP filter to Envoy to tell it how to modify HTTP traffic for us
  • since Envoy uses an external service to handle rate limiting, we need to point it to that external service, so it knows where to send traffic for rate limiting assessment

Let’s start from the bottom and tell Envoy where our rateLimit service is (by adding a new Envoy cluster) and add an HTTP rate limit filter that points to it.

Istio can manipulate Envoy configuration by adding EnvoyFilter objects to the k8s cluster — full spec of EnvoyFilter can be found here. Let’s apply the following EnvoyFilter manifest to the same namespace where my source pod “bastion-box” is.

At the beginning of the file we can see the “workloadSelector”, which allows us to select which source pods will be affected by our change. You may leave it blank if you want to apply the change to every Istio-enabled pod, but I decided to affect a single testing pod: bastion-box. Then we added 2 Envoy configuration patches: a cluster named “rate_limit_cluster” that points to “ratelimit.default.svc.cluster.local:8081” and an HTTP filter that points to this cluster. Each part has a matching section that selects the appropriate pieces of Envoy config and a patching section that defines what we would like to change or add there.
Keep in mind that “domain” in the HTTP filter config is not the destination HTTP domain, but the domain defined in the rateLimit service; see the previous chapter for the rateLimit domain definition.
Also take a look at the “failure_mode_deny” parameter. It defines what Envoy should do with traffic that is pointed to the rateLimit service when that service is not available for any reason. I prefer to deny such traffic and return HTTP 500 to the source, because I can easily catch and follow those errors. Otherwise, traffic that is supposed to be rate limited may be forwarded to the destination without us knowing that the rateLimit service was broken in the first place.
Let’s see the result of applying the above.
First we added a new filter to our listener:

As we can see, a new filter called “envoy.rate_limit” has been added to the listener configuration. This filter defines how to handle the traffic: all HTTP requests that this filter is applied to should be sent to “rate_limit_cluster” and matched against the set of rules (rate limit domain) “my-rules”. Moreover, when “rate_limit_cluster” is not available, traffic should be blocked.
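
The effect of “failure_mode_deny” can be sketched roughly like this in Python. It is a simplified model under my own assumptions, not Envoy code: the function names are hypothetical, and a raised ConnectionError stands in for “the rateLimit service is not available”.

```python
def handle_request(ask_rate_limit_service, failure_mode_deny):
    """Decide what happens to one request that goes through the rate limit filter.

    ask_rate_limit_service is a callable returning True when the caller is over
    the limit; it raises ConnectionError when the service cannot be reached.
    """
    try:
        over_limit = ask_rate_limit_service()
    except ConnectionError:
        # The rateLimit service is down: either fail closed or let traffic through.
        return "reject 500" if failure_mode_deny else "forward"
    return "reject 429" if over_limit else "forward"


def unavailable():
    """Simulates a rateLimit service that cannot be reached."""
    raise ConnectionError("rate limit service unreachable")


print(handle_request(lambda: False, failure_mode_deny=True))  # within the limit
print(handle_request(lambda: True, failure_mode_deny=True))   # over the limit
print(handle_request(unavailable, failure_mode_deny=True))    # service down, fail closed
print(handle_request(unavailable, failure_mode_deny=False))   # service down, fail open
```

With the flag disabled, an outage of the rateLimit service silently lets all traffic through, which is exactly the situation I wanted to avoid.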
So far so good, but how does Envoy know what “rate_limit_cluster” is? That’s the second part of our manifest: the cluster definition. Let’s see how it looks in Envoy:

It’s nothing more than pointing it to the k8s service of our rate limit app. Actually this step could be omitted as Istio will detect rateLimit service automatically and configure clusters for us, but I prefer to select my own name for the cluster, plus I think it helps in understanding the whole concept.

Good, now our Envoy knows how to reach the rateLimit service, so let’s configure which traffic should actually be rate limited. First, we need our external domain to show up as an Envoy cluster. Istio implements a special type of object that allows us to add external services to the Istio mesh: ServiceEntry. Adding a ServiceEntry has multiple benefits, and one of them is that destinations defined in a ServiceEntry are treated like internal k8s services, which means Envoy will automatically add a cluster definition for that service.
Let’s try to add the following ServiceEntry:

The file is straightforward and doesn’t require much explanation, but let me highlight the fact that the domain works over both HTTP and HTTPS and we want both protocols to work. First we focus on HTTP, as it’s easier to configure rate limiting for it; in the last chapter I will present a method to handle HTTPS traffic as well.

Let’s verify that the Envoy cluster has been added to our sidecar proxy configuration as expected.

So far so good. Let’s verify if the corresponding route has been added as well:

It looks like almost all the problems we defined at the beginning are solved by now: we added the rateLimit service to the cluster and pointed Envoy to it, we applied an HTTP filter that is able to forward traffic to the rateLimit service and its proper domain, and we added the external HTTP/HTTPS domain to our Istio mesh, which caused Envoy to configure the proper cluster and route for us. The last piece of the puzzle is to tell Envoy that every request which goes from the bastion-box to our external domain must be rate limited. The Envoy route is the place where we can apply HTTP filters and modify the traffic. So let’s try to apply our HTTP filter (envoy.rate_limit) to the route pointing to that domain:

To describe the above YAML, let me break it down into two pieces. The first part, “applyTo”, works like a selector: it selects the Envoy object you want to modify. The second part, “patch:”, adds the rate limit configuration. You can see below how the manifest is mapped to the Envoy route:

As you may read in the rateLimit service docs, rateLimit uses key-value pairs to apply rate limiting rules. The role of Istio/Envoy is to assign the correct keys and corresponding values to the traffic being sent to the rateLimit service. To see all the options for configuring those keys (descriptors) you may visit the Envoy docs. To make things simpler, I just assigned a single key-value pair to all the traffic that goes to our external domain.


Using the generic_key descriptor sets the name of the key to “generic_key” automatically, and as far as I know it can’t be modified. The value of that key may be anything you want. I simply used the domain name, but it could also be “my-super-value”; you just need to make sure your rateLimit service configuration matches that key and value. In my case the rateLimit config looks like below.

domain: my-rules
descriptors:
  - key: generic_key
    rate_limit:
      unit: hour
      requests_per_unit: 3
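
To show how the key-value pair sent by Envoy meets the rateLimit config, here is a small Python sketch of the matching step. It is a simplification under my own assumptions: the dict mirrors the config above, the function name and the sample descriptor values are hypothetical, and the real service has richer matching rules (for example nested descriptors).

```python
# The rateLimit config from above, written as a Python dict.
RATE_LIMIT_CONFIG = {
    "domain": "my-rules",
    "descriptors": [
        {"key": "generic_key", "rate_limit": {"unit": "hour", "requests_per_unit": 3}},
    ],
}


def find_limit(config, domain, descriptor):
    """Return the rate_limit rule matching a (key, value) descriptor, or None."""
    if domain != config["domain"]:
        return None  # the request refers to a different set of rules
    key, value = descriptor
    for rule in config["descriptors"]:
        if rule["key"] != key:
            continue
        # A rule that also defines a value only matches that exact value.
        if "value" in rule and rule["value"] != value:
            continue
        return rule["rate_limit"]
    return None


print(find_limit(RATE_LIMIT_CONFIG, "my-rules", ("generic_key", "my-super-value")))
print(find_limit(RATE_LIMIT_CONFIG, "other-rules", ("generic_key", "my-super-value")))
```

A request that carries an unknown domain or an unknown key finds no rule in this sketch, which is why the names on the Istio side and in the rateLimit config have to agree.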

At this point we should be all set and ready to test the rate limiting.


To test it I will exec into my source pod, bastion-box, and access the domain a few times.

I successfully accessed the domain 3 times, and after that my request was blocked with an HTTP 429 response. After the configured time window expires (we set it to 1 hour), bastion-box will be able to access the domain again (3 more times).

Istio ≥ 1.7

Istio 1.7 changes the syntax of EnvoyFilter, so the provided manifests may need some minor adjustments. You may find the required changes in the release notes or in this post.

How to rate limit HTTPS traffic?

In the previous chapters we applied a rate limit to a domain that works over both HTTP and HTTPS, which makes it a perfect example to highlight a problem and reproduce it.
When Envoy receives HTTP traffic, it reads the HTTP headers and maps them to certain key-value pairs (descriptors). However, when the source app connects to the destination domain over HTTPS, all the HTTP headers are encrypted by TLS. For that reason Envoy can’t read the HTTP headers and apply a rate limit filter to the traffic.
A solution to that problem could be to configure the source app to use plain HTTP to connect to the destination, apply rate limiting to that traffic on the Envoy side, and configure Envoy to convert it to HTTPS before sending it outside the cluster. We covered the first steps in this article, and the last part can be achieved with an Istio feature called Egress TLS Origination. Hopefully I will find more time to write a separate post about limiting HTTPS traffic where I can cover all the details, but the Istio docs for Egress TLS Origination are very useful and can be a good start.


A while ago Istio deprecated configuring rate limiting directly from Istio, leaving its users with the option to implement it directly in Envoy. This may be achieved by leveraging the EnvoyFilter and ServiceEntry CRDs defined by Istio; however, using EnvoyFilter objects to manipulate the Envoy config doesn’t feel right and may introduce some risks. The ability to integrate rate limiting nicely with an Istio profile and to simplify the whole concept is on the Istio development roadmap. Unfortunately, at the time of writing this article it has not yet been announced which Istio release will bring that functionality.

Hopefully this article helped a few of you and shed some light on the rate limiting concept in Envoy proxy.
