Control Metered Costs with Feature Flags

A few weeks ago I switched Box Out and Flipper Cloud to AppSignal. I've been loving it so far! The only snag I hit was cost for Cloud (not for Box Out). Previously, Cloud was using New Relic and paying around $180/mo. After a few weeks on AppSignal I noticed our plan was going to be $580/mo.

I ❤️ AppSignal and the team behind it, so generally I'm happy paying whatever good software costs. But in this case, Cloud gets a lot of repetitive poll traffic from customers' servers, which hit us on repeat to keep their flags in sync locally with their application.

AppSignal's billing is plan-based but also kind of metered. You use them for a month for free (at the time of this writing) and then they estimate what your plan will be. They are then very generous when your metered usage goes over your plan. It's a neat approach, as it kind of gets the best of both worlds – metered and fixed monthly cost.

But since Cloud has a ton of repetitive poll requests and we are still a scrappy bootstrapped startup, I looked into reducing the requests we send to AppSignal.

The Docs

As I've come to expect, AppSignal had very helpful docs on customizing data collection. My first option was to ignore entire actions. This is great for low-value load balancer health checks, but not for an action that is the heart of an application and 99.8% of all Ruby time. So I moved on.

Next, I checked out ignoring entire namespaces. Thanks to a great tip from Thijs (CTO at AppSignal), I set the namespace for our API to "api" soon after installing. But again, I couldn't ignore the entire namespace, as then I'd have no insight into the API, which is > 99% of all our time spent processing stuff.

Or could I... that's when it hit me. I can change the namespace at runtime and I can ignore namespaces. What if I just used a feature flag to change the namespace on the fly and ignored one of those namespaces? Turns out it worked!

The Middleware

I whipped together a middleware that runs on each API request and sets the namespace based on the request.

class AppsignalNamespaceMiddleware
  def initialize(app, options = {})
    @app = app
  end

  def call(env)
    request = Rack::Request.new(env)
    api_request = ApiRequest.from_request(request)
    Appsignal.set_namespace(api_request.namespace)
    @app.call(env)
  end
end

The ApiRequest

You'll notice in the middleware I referenced an ApiRequest class. It's a very simple class I use to wrap a Rack::Request and add extra information about the request.

class ApiRequest
  def self.from_request(request)
    new(request)
  end

  attr_reader :request

  def initialize(request)
    @request = request
  end

  def read?
    @request.get? || @request.head?
  end

  def telemetry?
    @request.path.start_with?("/adapter/telemetry")
  end

  def poll?
    # Same pattern as before, with string anchors and a boolean result.
    @request.path.match?(%r{\A/adapter/features(/|\?.*)?\z}) && read?
  end

  def namespace
    if poll?
      return "untraced_api" unless Flipper.enabled?(:appsignal_tracing_poll)
    elsif telemetry?
      return "untraced_api" unless Flipper.enabled?(:appsignal_tracing_telemetry)
    end

    "api"
  end
end

Nothing too fancy there. The namespace method picks which flag to check based on whether the request is a poll or telemetry. If that flag is enabled, the namespace is "api"; if not, the method returns early with "untraced_api". Any other request always gets "api".
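To sanity-check that routing without booting the app, here's a minimal sketch. It is not the production code: FakeRequest is a hypothetical stand-in for Rack::Request, and namespace_for is a hypothetical helper that mirrors the namespace logic but takes the flag states as a plain hash instead of calling Flipper, so it runs without any gems.

```ruby
# Hypothetical stand-in for Rack::Request: only the methods the logic needs.
FakeRequest = Struct.new(:request_method, :path) do
  def get?
    request_method == "GET"
  end

  def head?
    request_method == "HEAD"
  end
end

# Mirrors ApiRequest#namespace, but reads flag states from a plain hash
# (an assumption made so the sketch runs without the Flipper gem).
def namespace_for(request, flags)
  read = request.get? || request.head?
  poll = read && request.path.match?(/^\/adapter\/features(\/|\?.*)?$/)
  telemetry = request.path.start_with?("/adapter/telemetry")

  if poll
    return "untraced_api" unless flags[:appsignal_tracing_poll]
  elsif telemetry
    return "untraced_api" unless flags[:appsignal_tracing_telemetry]
  end

  "api"
end

# Example flag states: poll tracing off, telemetry tracing on.
flags = { appsignal_tracing_poll: false, appsignal_tracing_telemetry: true }

puts namespace_for(FakeRequest.new("GET", "/adapter/features"), flags)
puts namespace_for(FakeRequest.new("POST", "/adapter/telemetry"), flags)
puts namespace_for(FakeRequest.new("POST", "/adapter/features/search"), flags)
```

With these flag states, only the poll request should land in "untraced_api". The telemetry request stays in "api" because its flag is on, and the third request is neither a poll nor telemetry, so it is always traced.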

The Config

Now that I had my flags in there, I deployed and enabled them both for 25% of the time. This caused an appropriate split in requests based on the namespace. All that was left was to configure AppSignal to ignore the "untraced_api" namespace. I chose to do so via an ENV var:

APPSIGNAL_IGNORE_NAMESPACES=untraced_api

As you can see in the graph below, traced requests immediately dropped.
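For a rough feel of how the percentage maps to traced volume, here's a small simulation. It is a sketch under one assumption: a percentage-of-time flag acts like an independent coin flip on every check. The traced? helper is hypothetical and only stands in for the flag check so the snippet runs without Flipper.

```ruby
# With the real gem, the flag would be set with something like:
#   Flipper.enable_percentage_of_time(:appsignal_tracing_poll, 25)
# Below, a hypothetical traced? helper simulates that gate as a coin flip.
def traced?(percentage, rng)
  rng.rand < (percentage / 100.0)
end

rng = Random.new(42) # seeded so repeated runs give the same numbers
requests = 100_000

# For each percentage, count how many simulated requests would be traced.
shares = [10, 25, 50].to_h do |percentage|
  traced = requests.times.count { traced?(percentage, rng) }
  [percentage, traced.fdiv(requests)]
end

shares.each do |percentage, share|
  puts format("%d%% of time -> %.1f%% of requests traced", percentage, share * 100)
end
```

The traced share should track the configured percentage closely at this volume, which is why moving the percentage up or down changes the number of requests reported in roughly the same proportion.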

Conclusion

The best part about all of this is that I'm now in control of my costs via the click of a button, regardless of how many requests my app is serving. I can increase the % of time enabled to gather more performance data and decrease it to gather less.

That is the point of feature flags (and this post) – control at runtime. You don't need to deploy. You don't need to change code. Just clickity-click and profit.

I should also point out again that my goal isn't to cut costs because I don't want to pay for good software. I do want to pay. But the goal of paying for anything is that it's beneficial for both sides. By tracing every single request, many of which were duplicates and low value, the value I got wasn't keeping up with the price.

Everything is about trade-offs. Sampling like this is an easy way to get rid of low-value requests while retaining the massive value that AppSignal gives me for an amount that is entirely worth it to me.

I'd be remiss not to mention that Roy (co-founder) even replied to my tweet about this:

Such a great move. I ❤️ proactive founders like this. That said, I'm perfectly fine with how they bill (it's fair) and I think what I ended up with is the best of both worlds. They get another paying customer (two actually and probably more soon as I'm a spreader) and I get a great APM for a price I'm good with.

So go switch to AppSignal! I have received nothing but good vibes from them.