Skip to content

Reduce memory bloating in FileMappedDict when reading metrics.#160

Merged
dmagliola merged 1 commit intoprometheus:masterfrom
cristiangreco:cristian-spike-memory-usage
Oct 23, 2019
Merged

Reduce memory bloating in FileMappedDict when reading metrics.#160
dmagliola merged 1 commit intoprometheus:masterfrom
cristiangreco:cristian-spike-memory-usage

Conversation

@cristiangreco
Copy link

(This patch is adapted from gocardless#5)

During the read path (metrics export) we open all existing metric files
and read through them in order to aggregate metrics across different
processes. When reading non-empty files we end up parsing file content
twice: first during FileMappedDict initialisation, then again in the
caller site (all_values).

This commit refactors FileMappedDict so that while the @positions map
is populated at creation time, the actual metric values are read only
when explicitly requested. This avoids the memory bloat of unpacking
file content twice.


The attached synthetic benchmark simulates two concurrent threads that
write and read from a shared direct_file_store while memory profiling
is enabled. I observed locally a reduction in peak memory usage (best of 3 runs
in each case):

master
---
Total allocated: 127.28 MB (2220463 objects)
Total retained:  139.40 kB (609 objects)

this branch
---
Total allocated: 86.06 MB (1331191 objects)
Total retained:  139.74 kB (611 objects)

# frozen_string_literal: true

require "prometheus/client"
require "prometheus/client/data_stores/direct_file_store"
require "prometheus/client/formats/text"

require "memory_profiler"

Prometheus::Client.config.data_store =
    Prometheus::Client::DataStores::DirectFileStore.new(dir: ".")

report = MemoryProfiler.report(allow_files: "direct_file_store.rb") do
  HISTOGRAM = Prometheus::Client.registry.histogram(
    :histogram_metric,
    docstring: "an histogram",
    buckets: [0.1,0.2,0.3,0.4,0.5,0.6,0.7,0.8,0.9,1],
    labels: %i[label1 label2 label3],
  )

  COUNTER = Prometheus::Client.registry.counter(
    :counter_metric,
    docstring: "a counter",
    labels: %i[label4 label5 label6],
  )

  REGISTRY = Prometheus::Client.registry

  def run
    100.times do
      HISTOGRAM.observe(
        rand(1024),
        labels: {
          label1: "foo" + rand(16).to_s,
          label2: rand(512),
          label3: [:a,"b"].sample,
        },
      )

      COUNTER.increment(
        by: rand(1024),
        labels: {
          label4: "foo" + rand(16).to_s,
          label5: rand(512),
          label6: [:a,"b"].sample,
        },
      )

      Prometheus::Client::Formats::Text.marshal(REGISTRY)
    end
  end

  t1 = Thread.new { run }
  t2 = Thread.new { run }

  t1.join
  t2.join
end

report.pretty_print(scale_bytes: true, detailed_report: true)

@coveralls
Copy link

Coverage Status

Coverage remained the same at 100.0% when pulling 8a8c92b on cristiangreco:cristian-spike-memory-usage into af29066 on prometheus:master.

@coveralls
Copy link

coveralls commented Oct 9, 2019

Coverage Status

Coverage remained the same at 100.0% when pulling c0260ff on cristiangreco:cristian-spike-memory-usage into af29066 on prometheus:master.

Copy link
Collaborator

@dmagliola dmagliola left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this!
Something tells me there's a lot more of these gains to be made, so if you find another one, please send it over! :D

Could you please follow the instructions from the "DCO" failed CI step, to signoff your commit?
Once you push that, i'll merge this in.

@Sinjo I think it'd be worth doing another pre-alpha / release candidate version with this upgrade in. I'd love to see whether this impacts the memory pressure in our main service (and also to make sure it's happy in production)

During the read path (metrics export) we open all existing metric files
and read through them in order to aggregate metrics across different
processes. When reading non-empty files we end up parsing file content
twice: first during FileMappedDict initialisation, then again in the
caller site (`all_values`).

This commit refactors FileMappedDict so that while the `@positions` map
is populated at creation time, the actual metric values are read only
when explicitly requested. This avoids the memory bloat of unpacking
file content twice.

Signed-off-by: Cristian Greco <cristian@gocardless.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants