[HN Gopher] More than 70% of prometheus executable are unused by...
       ___________________________________________________________________
        
       More than 70% of prometheus executable are unused by most people
        
       Author : wejick
       Score  : 62 points
       Date   : 2022-01-28 17:29 UTC (5 hours ago)
        
 (HTM) web link (wejick.wordpress.com)
 (TXT) w3m dump (wejick.wordpress.com)
        
       | etcet wrote:
        
       | [deleted]
        
       | tobyjsullivan wrote:
       | Great work on the part of the author. Pareto principle holds.
       | Often all it takes is one person motivated enough to look for
       | efficiency opportunities.
       | 
       | As for next steps, I can't imagine the Prometheus crew would
       | object to a proposal + PR to make the Service Discovery an
       | optional add-on in the next major version. It does open a can of
       | worms around how such an add-on would be distributed if not built
       | into the binary. (Caveat: I have no familiarity with this
       | particular project or its unique constraints or goals.)
        
         | rob74 wrote:
         | Easy: they can provide an option to remove the SD functionality
         | at compile time, and if you really care about the executable
         | size, you can compile the code with this option (and
         | `-ldflags="-s -w"`). The standard build would still be the "all
         | batteries included" one to avoid support issues (people
         | downloading the smaller binary and then asking why SD isn't
         | working).
        
       | rektide wrote:
       | i'd like to see memory usage differences, load time & runtime
       | performance impacts. i expect most of these to be small but i
       | expect some impact.
       | 
       | also just worth noting that the memory impact of statically
       | compiling in general is probably massive. most systems probably
       | would have a good percent of these libraries in memory already if
       | prometheus were using dynamic linking.
        
       | jdalsgaard wrote:
       | > Prometheus alternative
       | 
       | Well... if size of the executable is really a concern, perhaps
       | Victoria Metrics is worth considering; my amd64 executable is
       | about 17MiB in size.
        
       | pphysch wrote:
        
       | [deleted]
        
       | akireu wrote:
       | On a side note, Prometheus seems to be built for bloat. AFAIK, it
       | isn't even designed to consume metrics other than from apps
       | linked to its client library. It's like a microservice, but with
       | the footprint of an operating system.
        
         | momothereal wrote:
         | > AFAIK, it isn't even designed to consume metrics other than
         | from apps linked to its client library.
         | 
         | Could you elaborate? I use Prometheus to scrape from an HTTP
         | endpoint in various Pods in Kubernetes, so the service
         | discovery is pretty useful to me.
         | 
         | I could see the Kubernetes & the other SDs split out of the
         | core binary if default size is really an issue. Or are you
         | talking about something else?
        
           | akireu wrote:
           | I'm talking about the way I'm expected to provide metrics for
           | my apps. Rather than exporting free-form JSON and then
           | scripting Prometheus to understand it, I'm expected to use a
           | custom client library to export the metrics. As for
           | Kubernetes, you can only use it with Prometheus because of
           | a not insignificant amount of work on both sides. Basically,
           | the latter is designed for vendor lock-in.
        
             | NikolaeVarius wrote:
             | The prometheus format is literally just a text page. It's
             | dead simple to implement.
        
               | morelisp wrote:
               | There are a frustrating number of fundamental corner
               | cases due to variance in floating point text formats, and
               | slightly more in the descriptor if you also need that.
               | It's simple to implement an expositor for a limited set
               | of cases. As usual, it's much more difficult to parse
               | what you actually find in the world.
        
             | gempir wrote:
             | Prometheus follows the OpenMetrics standard. I'm not sure
             | what you find proprietary about that or specific to
             | Prometheus.
             | 
             | https://github.com/OpenObservability/OpenMetrics/blob/main/
             | s...
        
               | momothereal wrote:
               | To be precise it was the other way around. OpenMetrics is
               | a standardization effort for the format Prometheus made
               | up.
               | 
               | However Prometheus was designed before JSON was
               | standardized itself, so I'm just glad they didn't choose
               | XML!
        
             | momothereal wrote:
             | Ah ok, I see what you mean.
             | 
             | The other commenters have pointed out that it _is_ based on
             | another open standard, but admittedly one less common than
             | say, JSON. So you'll generally have to implement your own
             | metrics producer or use a client library, that's true.
             | 
             | However it's also a dead simple format and you can probably
             | implement it with a for-loop or a shell script.
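As a rough illustration of the "for-loop" claim above, here is a minimal Python sketch of a producer for the text exposition format. The metric name, help text, and the helper names (`format_value`, `escape_label`, `render`) are made up for this example. It covers only simple samples: no timestamps, no histogram or summary families, and no escaping of the HELP text.

```python
import math


def format_value(value: float) -> str:
    """Render a sample value, spelling special floats as NaN, +Inf, -Inf."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "+Inf" if value > 0 else "-Inf"
    return repr(float(value))


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(name, help_text, metric_type, samples):
    """Build one metric family; samples is a list of (labels_dict, value)."""
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {format_value(value)}")
        else:
            lines.append(f"{name} {format_value(value)}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Hypothetical metric, purely for illustration.
    print(render(
        "room_temperature_celsius",
        "Hypothetical sensor reading.",
        "gauge",
        [({"room": "lab"}, 21.5), ({"room": "attic"}, float("nan"))],
    ), end="")
```

Even this toy needs explicit handling for special float values and label escaping, which is roughly where the corner cases mentioned elsewhere in the thread start to show up.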
        
             | slimsag wrote:
             | What a bizarre claim.
             | 
             | Prometheus scrapes the same text format as OpenMetrics 1.0
             | and over 700 public exporters use this format, and there
             | are TONS of other non-Prometheus software that consume the
             | exact same text format. Prometheus's biggest competitor,
             | Datadog (which is not open source mind you), consumes it
             | too. I think even Grafana consumes it directly. It's
             | becoming an IETF standard[0].
             | 
             | Would I have preferred JSON over a custom text format like
             | this? Yeah. But to claim an open source project like
             | Prometheus with effectively no business at all is using a
             | text format like this to have vendor lock-in? That's quite
             | a stretch.
             | 
             | [0] https://github.com/OpenObservability/OpenMetrics/blob/m
             | ain/s...
        
               | morelisp wrote:
               | > Prometheus scrapes the same text format as OpenMetrics
               | 1.0
               | 
               | I find the GP's claims weird - I've written a relative
               | ton of collectors, exporters, and translators and the
               | format is pretty OK, not worse than most that came before
               | it and better than _lots_ - but I think this relationship
               | is backwards. Prometheus "scrapes OpenMetrics" because
               | OpenMetrics was formal documentation of what Prometheus
               | was already doing for years.
               | 
               | I would not have preferred JSON. That an exposed metric
               | is also a query is also pretty close to a schematic
               | definition is nice.
        
               | akireu wrote:
               | I apologize for my mistake, then. My understanding was
               | based on reading the Prometheus docs on making exporters
               | alone - something I needed urgently for a job.
        
         | hughrr wrote:
         | It's a database engine not a microservice and needs to be
         | treated along the lines of postgresql etc.
        
       | hughrr wrote:
       | Yeah that's only unused until you need it at which point it
       | doesn't involve futzing with anything.
       | 
       | One of the things that kills me is running fluentd because you
       | have to fuck around with ruby gems in containers every two
       | minutes to get it to do something reasonable.
       | 
       | This is pain. Prom is not.
        
       | whateveracct wrote:
       | Clutching pearls about binary size is and always will be
       | hilarious to me.
        
         | sigmonsays wrote:
         | I also find this comical.
         | 
         | I'd love to know why 100MB is that big of a deal. If network is
         | slow, cache locally. Seems like nothing here to worry about.
        
           | [deleted]
        
         | jeppesen-io wrote:
         | This one does not even make sense - 100 megs for a binary for
         | centralized metrics? Who would even notice next to the OS and
         | metrics storage.
         | 
         | By design you should not install Prometheus on every server you
         | monitor - it's designed to scrape metrics.
         | 
         | It's a database and web UI with support for email, webhooks,
         | Slack, PagerDuty, the AWS API and many others. 100 megs does not
         | sound like a lot for all Prometheus provides.
        
         | 0xbadcafebee wrote:
         | It correlates to performance, speed to iterate, security, and
         | design complexity, but ok
        
         | akireu wrote:
         | It's all fun and games until you're stuck for an hour
         | downloading 600MB of updated packages over a metered LTE. The
         | same is with RAM usage: 512MB was enough for a phone back in
         | 2014, now a smart TV with 2GB is barely capable of
         | multitasking. Sure, binary sizes don't matter in most contexts.
         | But when they do, it's a PITA.
        
           | jayd16 wrote:
           | Galaxy S5 from 2014 had 2GB and that was 1080p vs 4k texture
           | sizes for today. Seems on par.
        
             | akireu wrote:
             | What you're kind of missing is that the S5 was a flagship
             | phone. Generally, one has to save for more than a month to
             | afford a purchase like that. The idea of working an extra
             | month so that some FAANG prick meets their KPI by cutting
             | corners on optimization doesn't even look like feudalism.
             | It looks like idiocracy. Paying the lip service of fat
             | shaming code bloat is the cost-effective option by
             | comparison :)
        
               | asiachick wrote:
               | What does FAANG have to do with this?
               | 
               | Don't FAANG people obsess over bloat because they're
               | trying to reach billions of customers? It might not seem
               | that way since their pages are bigger but I'd be
               | surprised if they were happy to leave 10s of millions of
               | customers on the table.
        
               | akireu wrote:
               | They're just poster children for the particular brand of
               | disdain $100k+/year "tech workers" bear for their users:
               | they make enough for the shiniest of toys, so they're too
               | far above spending their valuable time to make their
               | software run smooth on our $100 crap phones. Nevermind
               | that each Fb client update likely produces hundreds of
               | tons of toxic trash called gadgets. Sure, sometimes they
               | do optimizations. Generally, though, both Fb and Google
               | keep exploring the physical limits to code bloat.
               | Remember that one time that Fb hit the JVM class count
               | limit?
        
               | ysleepy wrote:
               | FAANG are the worst offenders. Didn't Facebook employ
               | ungodly hacks to unload/load parts of the Android app to
               | navigate around the 65k method limit of dex? Have you
               | looked at the JS monstrosity of the Google hardware shop
               | website?
        
               | jayd16 wrote:
               | You misunderstand the point. You're comparing a 1080p
               | phone to a 4k television when texture memory is what will
               | take up the vast majority of ram. Code footprint is
               | pretty irrelevant.
               | 
               | Still the TV does fine with 2GB. Doesn't seem fair to
               | complain.
        
               | akireu wrote:
               | I wasn't speaking of a 4k TV, but still, this doesn't
               | check out. A single 2160p framebuffer is 8MPix, or 32MiB.
               | Not counting the original FB size, the extra 1.5GiB are
               | enough for 48 whole framebuffers. You don't need that
               | much image data all at once, the number is ridiculous.
               | No, I believe it's just that the code became that much
               | less efficient.
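The framebuffer arithmetic in the comment above can be checked with a few lines of Python. This is only a sketch of the calculation: it assumes 4 bytes per pixel (32-bit RGBA) and takes the "extra" memory to be 1.5 GiB, as the comment does.

```python
# Assumptions: a 3840x2160 ("2160p") framebuffer at 4 bytes per pixel.
WIDTH, HEIGHT = 3840, 2160
BYTES_PER_PIXEL = 4
MIB = 1024 ** 2
GIB = 1024 ** 3

pixels = WIDTH * HEIGHT
framebuffer_bytes = pixels * BYTES_PER_PIXEL

# The 1.5 GiB left over when going from 512 MiB to 2 GiB of RAM.
extra_bytes = int(1.5 * GIB)
whole_framebuffers = extra_bytes // framebuffer_bytes

print(f"{pixels / 1e6:.1f} MPix, {framebuffer_bytes / MIB:.1f} MiB per framebuffer")
print(f"{whole_framebuffers} whole framebuffers fit in 1.5 GiB")
# prints:
# 8.3 MPix, 31.6 MiB per framebuffer
# 48 whole framebuffers fit in 1.5 GiB
```

So the "8MPix, or 32MiB" and "48 whole framebuffers" figures hold up as round numbers under these assumptions. Whether texture memory or code accounts for the rest is a separate question that this arithmetic does not settle.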
        
           | LimaBearz wrote:
           | Sure, but we're talking about an application written for a
           | cloud/hosted environment in a datacenter somewhere. Nitpicking
           | at the size of a statically linked binary meant for
           | production-grade environments with fast computers and fat
           | pipes feels overly pedantic, no? Especially when we're talking
           | about a mere 100MB.
        
         | [deleted]
        
         | [deleted]
        
         | jeffbee wrote:
         | Author doesn't even say why they object to the size. Are they
         | aware that file-backed executables are paged on demand and only
         | the active parts of the program will be resident?
        
           | Topgamer7 wrote:
           | Granted these days everyone is used to applications consuming
           | massive amounts of drive space. But perhaps they're using
           | legacy hardware for a home lab, or an IoT device with limited
           | disk space.
           | 
           | From a security standpoint, reduced application code
           | decreases risk. It was service discovery code he removed;
           | what if it reached out to discover services on application
           | startup? That's a potential attack vector.
        
             | shoo wrote:
             | > From a security standpoint, reduced application code
             | decreases risk. It was service discovery code he removed;
             | what if it reached out to discover services on application
             | startup? That's a potential attack vector.
             | 
             | Agreed. I've seen a similar pattern with certain open source
             | libraries.
             | 
             | The first example I think of is the spf13/viper [1]
             | library, used to load configuration into Go applications.
             | Viper is equipped with code for reading config from various
             | file formats, environment variables, as well as remote
             | config sources such as etcd, consul. If you introduce the
             | viper library as a dependency of your application to merely
             | read config from environment variables and YAML files in
             | the local filesystem, then your Go application suddenly
             | gains a bunch of transitive dependencies on modules related
             | to remote config loading for various species of remote
             | config provider. It's not uncommon for these kinds of remote
             | config loading dependencies to have security
             | vulnerabilities.
             | 
             | Besides the potentially increased attack surface if a
             | bunch of unnecessary code to load application configuration
             | from all manner of remote config providers ends up in your
             | application binary [2], there is a maintenance cost. If you
             | work in an environment that monitors for vulnerabilities in
             | open source dependencies, and you depend on an open source
             | library that drags in dozens of transitive dependencies you
             | don't really need, it adds a fair bit of additional overhead
             | re: detecting, investigating and patching the potential
             | vulnerabilities.
             | 
             | I guess there's arguably a "Hickean" simple-vs-easy
             | tradeoff in how such libraries are designed. The "easy"
             | design, that makes it quick for developers to get started
             | and achieve immediate success with a config loading
             | library, is to include code to load config from all popular
             | supported config sources into the default configuration of
             | the library, reducing the number of steps a new user has to
             | do to get the library to work for their use case. A less
             | easy but arguably "simpler" design might be to only include
             | a common config-provider interface in the core module and
             | push all config-provider-specific client/adaptor code into
             | separate modules, and force the user to think about which
             | config sources they want to read from and then manually add
             | and integrate the dependencies for the corresponding
             | modules that contain the additional code they want.
             | 
             | edit: there has indeed been some discussion about the
             | proliferation of dependencies, and what to do about them,
             | in viper's issue tracker [3] [4]
             | 
             | [1] https://github.com/spf13/viper
             | [2] this may or may not actually happen, depending on which
             | function calls you actually use and what the compiler figures
             | out. If your application doesn't call any
             | remote-config-provider library functions then you shouldn't
             | expect to find any in your resulting application binary, even
             | if the dependency is there at the coarser-grain module
             | dependency level
             | [3] https://github.com/spf13/viper/issues/887
             | [4] https://github.com/spf13/viper/issues/707
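The "simpler" design described in the comment above (a small core that only knows a provider interface, with each provider shipped separately) might look roughly like the following Python sketch. This is not viper's API, and it is Python rather than Go; `ConfigProvider`, `register`, `load_config` and `EnvProvider` are hypothetical names chosen for illustration.

```python
import os
from typing import Callable, Dict, Mapping, Protocol


class ConfigProvider(Protocol):
    """The only thing the core module knows about a config source."""

    def load(self) -> Mapping[str, str]: ...


# Core registry: starts empty, so the core pulls in no provider code itself.
_REGISTRY: Dict[str, Callable[[], ConfigProvider]] = {}


def register(name: str, factory: Callable[[], ConfigProvider]) -> None:
    """Called by whichever provider modules the application chooses to import."""
    _REGISTRY[name] = factory


def load_config(*names: str) -> Dict[str, str]:
    """Merge config from the named providers; later names override earlier ones."""
    merged: Dict[str, str] = {}
    for name in names:
        if name not in _REGISTRY:
            raise KeyError(f"config provider {name!r} is not installed")
        merged.update(_REGISTRY[name]().load())
    return merged


class EnvProvider:
    """A local provider: reads variables with a (hypothetical) APP_ prefix."""

    def __init__(self, prefix: str = "APP_") -> None:
        self.prefix = prefix

    def load(self) -> Mapping[str, str]:
        return {
            key[len(self.prefix):].lower(): value
            for key, value in os.environ.items()
            if key.startswith(self.prefix)
        }


# In a real layout this line would live in the provider's own module, and an
# etcd or consul provider would sit in a separate package with its own deps.
register("env", EnvProvider)
```

With this shape, an application that never imports a remote provider never acquires that provider's transitive dependencies. The cost is the extra step the comment describes: `load_config("etcd")` fails until the user installs and imports the matching module.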
        
           | cosmotic wrote:
           | Image pull size for a container is likely the concern. It
           | could shave a few seconds off a regularly-run integration
           | test. If it's run via on-demand build agents, then there's no
           | image cache.
        
             | jeffbee wrote:
             | If it takes multiple seconds to pull ~35MB of compressible
             | text into your CI environment, there may be other, larger
             | problems to solve.
        
           | wejick wrote:
           | Sorry for not making it obvious in the article: I'm planning
           | to run it locally on a small Pi-based IoT device. So having
           | something small and fast is preferable; however, runtime
           | performance is the more important thing, and I haven't touched
           | that yet.
        
             | [deleted]
        
             | jeppesen-io wrote:
             | Prometheus is rather efficient, but its focus is a little
             | different than yours. It's designed for large-scale
             | collection of metrics, scraped from many remote endpoints.
             | 
             | You can run it locally, but the "prometheus" way for an IoT
             | environment would be a central prometheus server that scrapes
             | the IoT devices, each running a prometheus exporter, which
             | tend to be very lightweight.
        
               | wejick wrote:
               | Totally agree. Another part of it is just feeding
               | curiosities.
        
             | ts4z wrote:
             | I'm curious -- is it the binary size that's a problem, or
             | the resident size in memory? Demand paging should help,
             | although you'd be stuck with carrying the enlarged binary.
        
               | wejick wrote:
               | My gut feeling tells me it will be both memory and CPU
               | utilization. I can't be sure until I find a good way to
               | measure it.
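One rough way to start measuring the "binary size vs. resident size" question raised above is to compare the on-disk size of the executable with the resident set size the kernel reports. The Python sketch below assumes a Linux system exposing `/proc/<pid>/status` with a `VmRSS` line; the helper names are hypothetical, and RSS also counts heap and stacks, so it is only a coarse upper bound on how much of the binary is actually paged in.

```python
import os
import re
from typing import Optional, Tuple


def parse_vmrss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS figure (as reported in kB) from /proc/<pid>/status text."""
    match = re.search(r"^VmRSS:\s+(\d+)\s+kB$", status_text, re.MULTILINE)
    return int(match.group(1)) if match else None


def resident_vs_binary(pid: int, binary_path: str) -> Tuple[Optional[int], int]:
    """Return (resident kB of the process, size of the binary in units of 1024 bytes)."""
    with open(f"/proc/{pid}/status") as status_file:
        rss_kb = parse_vmrss_kb(status_file.read())
    return rss_kb, os.path.getsize(binary_path) // 1024
```

Calling `resident_vs_binary(pid, path)` on a stock build and on a trimmed build under the same scrape load would give a first, approximate comparison. A per-mapping breakdown (for example from `/proc/<pid>/smaps`) would be needed to separate code pages from everything else.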
        
           | wahern wrote:
           | That only helps if the code is well segregated by usage.
           | Looking at the ELF symbol table for
           | prometheus-2.33.0-rc.1.linux-amd64, it's not clear to me this
           | is the case. Not sure how it's ordered. Lexical import order?
           | Anyhow, without profiling how could the compiler know how to
           | order things optimally?
           | 
           | I think this is one of those cases where, in the absence of
           | profiling or some other hack (e.g. ensuring all routines
           | within a library are cleanly segregated across page
           | boundaries within the static binary _and_ the I/O scheduler
           | doesn't foil your intent), dynamic linking would prove
           | superior, at least for such large amounts of code.
        
         | xuhu wrote:
         | Auditd_2.8-amd64.deb is 194kb on debian, rsyslog_8.32-amd64 is
         | 411kb, and they both support centralized auditing and log
         | collection from multiple hosts.
        
           | djbusby wrote:
           | Do they do the metrics like Prometheus does? And include the
           | central collector and basic graph builder?
        
           | dtech wrote:
           | This doesn't seem a fair comparison. Prometheus is statically
           | linked like all Go applications, and those packages are not.
           | You can debate the merits of that, but if you compare an "only
           | rsyslog" server vs an "only prometheus" server the 2 will be
           | much closer in size.
        
         | foxfluff wrote:
         | It's kind of ironic reading this comment given that at the time
         | you posted it, I was screaming at gcc's stupid code generator
         | for wasting _bytes_ recreating constants that were already
         | there in that very register! That code needs to fit in a couple
         | hundred bytes..
         | 
         | And half an hour ago I was (once again) checking out hosting
         | providers and lamenting the fact that most don't seem to offer
         | support for loading custom ISOs so I could install a 30
         | megabyte distro and make the most out of the cheap plans that
         | only offer something like 10 gigabytes of storage. Half of it
         | is wasted after you install one of these obese mainstream
         | distros.
        
           | mitjam wrote:
           | Hetzner Cloud can start instances from ISOs - here is an
           | example for ipfire :
           | https://wiki.ipfire.org/installation/hetzner-cloud
        
         | TillE wrote:
         | I work in a lot of situations with hard or soft resource limits
         | where I actually do need to count bytes and/or CPU cycles, so
         | it's bizarre to see anyone shrugging about distributing tens of
         | megabytes of fat for literally no reason.
         | 
         | One thing Microsoft got right a long time ago was separating
         | out debug symbols into their own file by default. I think
         | that's still awkward on Linux.
        
         | yodon wrote:
         | And oh what a beautiful bike shed it will be...
        
       | mpolun wrote:
        
       | rob_c wrote:
       | I suspect that 70%+ of all features of all tools remain
       | undiscovered by their users.
        
       | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-01-28 23:01 UTC)