[HN Gopher] More than 70% of prometheus executable are unused by...
___________________________________________________________________
More than 70% of prometheus executable are unused by most people

Author : wejick
Score  : 62 points
Date   : 2022-01-28 17:29 UTC (5 hours ago)

(HTM) web link (wejick.wordpress.com)
(TXT) w3m dump (wejick.wordpress.com)

| etcet wrote:
| [deleted]
| tobyjsullivan wrote:
| Great work on the part of the author. Pareto principle holds.
| Often all it takes is one person motivated enough to look for
| efficiency opportunities.
|
| As for next steps, I can't imagine the Prometheus crew would
| object to a proposal + PR to make the Service Discovery an
| optional add-on in the next major version. It does open a can of
| worms around how such an add-on would be distributed if not built
| into the binary. (Caveat: I have no familiarity with this
| particular project or its unique constraints or goals.)
| rob74 wrote:
| Easy: they can provide an option to remove the SD functionality
| at compile time, and if you really care about the executable
| size, you can compile the code with this option (and
| `-ldflags="-s -w"`). The standard build would still be the "all
| batteries included" one to avoid support issues (people
| downloading the smaller binary and then asking why SD isn't
| working).
| rektide wrote:
| i'd like to see memory usage differences, load time & runtime
| performance impacts. i expect most of these to be small but i
| expect some impact.
|
| also just worth noting that the memory impact of statically
| compiling in general is probably massive. most systems probably
| would have a good percent of these libraries in memory already if
| prometheus were using dynamic linking.
| jdalsgaard wrote:
| > Prometheus alternative
|
| Well... if size of the executable is really a concern, perhaps
| Victoria Metrics is worth considering; my amd64 executable is
| about 17MiB in size.
| pphysch wrote:
| [deleted]
| akireu wrote:
| On a side note, Prometheus seems to be built for bloat. AFAIK, it
| isn't even designed to consume metrics other than from apps
| linked to its client library. It's like a microservice, but with
| the footprint of an operating system.
| momothereal wrote:
| > AFAIK, it isn't even designed to consume metrics other than
| from apps linked to its client library.
|
| Could you elaborate? I use Prometheus to scrape from an HTTP
| endpoint in various Pods in Kubernetes, so the service
| discovery is pretty useful to me.
|
| I could see the Kubernetes & the other SDs split out of the
| core binary if default size is really an issue. Or are you
| talking about something else?
| akireu wrote:
| I'm talking about the way I'm expected to provide metrics for
| my apps. Rather than exporting free-form JSON and then
| scripting Prometheus to understand it, I'm expected to use a
| custom client library to export the metrics. As for
| Kubernetes, you can only use it with Prometheus because of
| a not insignificant amount of work on both sides. Basically,
| the latter is designed for vendor lock-in.
| NikolaeVarius wrote:
| The prometheus format is literally just a text page. It's
| dead simple to implement
| morelisp wrote:
| There are a frustrating number of fundamental corner
| cases due to variance in floating point text formats, and
| slightly more in the descriptor if you also need that.
| It's simple to implement an expositor for a limited set
| of cases. As usual, it's much more difficult to parse
| what you actually find in the world.
| gempir wrote:
| Prometheus follows the OpenMetrics standard; I'm not sure
| what you find proprietary about that or specific to
| prometheus.
|
| https://github.com/OpenObservability/OpenMetrics/blob/main/s...
| momothereal wrote:
| To be precise it was the other way around. OpenMetrics is
| a standardization effort for the format Prometheus made
| up.
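To make the "just a text page" point concrete, here is a minimal sketch in Python of rendering samples in the Prometheus text exposition format. The function names (`render`, `format_value`, `escape_label`) are invented for illustration and are not part of any Prometheus client library. The sketch assumes the help text contains no backslashes or newlines and does not validate metric or label names. It does handle two of the corner cases mentioned above: non-finite float values and escaping inside label values.

```python
import math


def format_value(value: float) -> str:
    """Render a sample value; non-finite values are spelled NaN, +Inf, -Inf."""
    if math.isnan(value):
        return "NaN"
    if math.isinf(value):
        return "+Inf" if value > 0 else "-Inf"
    return repr(float(value))


def escape_label(value: str) -> str:
    """Escape backslash, double quote and newline inside a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render(name, metric_type, help_text, samples):
    """Return one metric family as text.

    samples is a list of (labels_dict, value) pairs; labels are sorted by
    name only to make the output deterministic.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} {metric_type}"]
    for labels, value in samples:
        if labels:
            pairs = ",".join(
                f'{key}="{escape_label(val)}"' for key, val in sorted(labels.items())
            )
            lines.append(f"{name}{{{pairs}}} {format_value(value)}")
        else:
            lines.append(f"{name} {format_value(value)}")
    return "\n".join(lines) + "\n"
```

Serving the returned string over HTTP at a scrape path is all an exporter for simple cases needs to do; parsing arbitrary real-world output is the harder direction, as noted in the thread.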
|
| However Prometheus was designed before JSON was
| standardized itself, so I'm just glad they didn't choose
| XML!
| momothereal wrote:
| Ah ok, I see what you mean.
|
| The other commenters have pointed out that it _is_ based on
| another open standard, but admittedly one less common than
| say, JSON. So you'll generally have to implement your own
| metrics producer or use a client library, that's true.
|
| However it's also a dead simple format and you can probably
| implement it with a for-loop or a shell script.
| slimsag wrote:
| What a bizarre claim.
|
| Prometheus scrapes the same text format as OpenMetrics 1.0
| and over 700 public exporters use this format, and there
| are TONS of other non-Prometheus software that consume the
| exact same text format. Prometheus's biggest competitor,
| Datadog (which is not open source mind you), consumes it
| too. I think even Grafana consumes it directly. It's
| becoming an IETF standard[0].
|
| Would I have preferred JSON over a custom text format like
| this? Yeah. But to claim an open source project like
| Prometheus with effectively no business at all is using a
| text format like this to have vendor lock-in? That's quite
| a stretch.
|
| [0] https://github.com/OpenObservability/OpenMetrics/blob/main/s...
| morelisp wrote:
| > Prometheus scrapes the same text format as OpenMetrics
| 1.0
|
| I find the GP's claims weird - I've written a relative
| ton of collectors, exporters, and translators and the
| format is pretty OK, not worse than most that came before
| it and better than _lots_ - but I think this relationship
| is backwards. Prometheus "scrapes OpenMetrics" because
| OpenMetrics was formal documentation of what Prometheus
| was already doing for years.
|
| I would not have preferred JSON. That an exposed metric
| is also a query is also pretty close to a schematic
| definition is nice.
| akireu wrote:
| I apologize for my mistake, then.
| My understanding was
| based on reading the Prometheus docs on making exporters
| alone - something I needed urgently for a job.
| hughrr wrote:
| It's a database engine not a microservice and needs to be
| treated along the lines of postgresql etc.
| hughrr wrote:
| Yeah that's only unused until you need it at which point it
| doesn't involve futzing with anything.
|
| One of the things that kills me is running fluentd because you
| have to fuck around with ruby gems in containers every two
| minutes to get it to do something reasonable.
|
| This is pain. Prom is not.
| whateveracct wrote:
| Clutching pearls about binary size is and always will be
| hilarious to me.
| sigmonsays wrote:
| I also find this comical.
|
| I'd love to know why 100MB is that big of a deal. If network is
| slow, cache locally. Seems like nothing here to worry about.
| [deleted]
| jeppesen-io wrote:
| This one does not even make sense - 100 megs for a binary for
| centralized metrics? Who would even notice next to the OS and
| metrics storage.
|
| By design you should not install prometheus on every server you
| monitor - it's designed to scrape metrics
|
| It's a database, webui with support for email, webhooks, slack,
| pagerduty, aws api and many others. 100megs does not sound like
| a lot for all Prometheus provides
| 0xbadcafebee wrote:
| It correlates to performance, speed to iterate, security, and
| design complexity, but ok
| akireu wrote:
| It's all fun and games until you're stuck for an hour
| downloading 600MB of updated packages over a metered LTE. The
| same is with RAM usage: 512MB was enough for a phone back in
| 2014, now a smart TV with 2GB is barely capable of
| multitasking. Sure, binary sizes don't matter in most contexts.
| But when they do, it's a PITA.
| jayd16 wrote:
| Galaxy S5 from 2014 had 2GB and that was 1080p vs 4k texture
| sizes for today. Seems on par.
| akireu wrote:
| What you're kind of missing is that the S5 was a flagship
| phone.
| Generally, one has to save for more than a month to
| afford a purchase like that. The idea of working an extra
| month so that some FAANG prick meets their KPI by cutting
| corners on optimization doesn't even look like feudalism.
| It looks like idiocracy. Paying the lip service of fat
| shaming code bloat is the cost-effective option by
| comparison :)
| asiachick wrote:
| What does FAANG have to do with this?
|
| Don't FAANG people obsess over bloat because they're
| trying to reach billions of customers? It might not seem
| that way since their pages are bigger but I'd be
| surprised if they were happy to leave 10s of millions of
| customers on the table.
| akireu wrote:
| They're just poster children for the particular brand of
| disdain $100k+/year "tech workers" bear for their users:
| they make enough for the shiniest of toys, so they're too
| far above spending their valuable time to make their
| software run smooth on our $100 crap phones. Never mind
| that each Fb client update likely produces hundreds of
| tons of toxic trash called gadgets. Sure, sometimes they
| do optimizations. Generally, though, both Fb and Google
| keep exploring the physical limits to code bloat.
| Remember that one time that Fb hit the JVM class count
| limit?
| ysleepy wrote:
| FAANG are the worst offenders. Didn't facebook employ
| ungodly hacks to unload/load parts of the android app to
| navigate around the 65k method limit of dex? Have you
| looked at the js monstrosity of the Google hardware shop
| website?
| jayd16 wrote:
| You misunderstand the point. You're comparing a 1080p
| phone to a 4k television when texture memory is what will
| take up the vast majority of ram. Code footprint is
| pretty irrelevant.
|
| Still the TV does fine with 2GB. Doesn't seem fair to
| complain.
| akireu wrote:
| I wasn't speaking of a 4k TV, but still, this doesn't
| check out. A single 2160p framebuffer is 8MPix, or 32MiB.
| Not counting the original FB size, the extra 1.5GiB are
| enough for 48 whole framebuffers. You don't need that
| much image data all at once, the number is ridiculous.
| No, I believe it's just that the code became that much
| less efficient.
| LimaBearz wrote:
| Sure, but we're talking about an application written for a
| cloud/hosted environment in a datacenter somewhere. Nitpicking
| at the size of a statically linked binary meant for
| production grade environments with fast computers and fat
| pipes feels overly pedantic no? Especially when we're talking
| about a mere 100MB
| [deleted]
| [deleted]
| jeffbee wrote:
| Author doesn't even say why they object to the size. Are they
| aware that file-backed executables are paged on demand and only
| the active parts of the program will be resident?
| Topgamer7 wrote:
| Granted these days everyone is used to applications consuming
| massive amounts of drive space. But perhaps they're using
| legacy hardware for a home lab, or an IoT device with limited
| disk space.
|
| From a security standpoint, reduced application code
| decreases risk. It was service discovery code he removed,
| what if it reached out to discover services on application
| start up, that's a potential attack vector.
| shoo wrote:
| > From a security standpoint, reduced application code
| decreases risk. It was service discovery code he removed,
| what if it reached out to discover services on application
| start up, that's a potential attack vector.
|
| Agreed. I've seen a similar pattern with certain open source
| libraries.
|
| The first example I think of is the spf13/viper [1]
| library, used to load configuration into go applications.
| Viper is equipped with code for reading config from various
| file formats, environment variables, as well as remote
| config sources such as etcd, consul.
| If you introduce the
| viper library as a dependency of your application to merely
| read config from environment variables and YAML files in
| the local filesystem, then your go application suddenly
| gains a bunch of transitive dependencies on modules related
| to remote config loading for various species of remote
| config provider. It's not uncommon for these kinds of remote
| config loading dependencies to have security
| vulnerabilities.
|
| As well as the potential increased attack surface if a
| bunch of unnecessary code to load application configuration
| from all manner of remote config providers ends up in your
| application binary [2], there is a maintenance cost: if you
| work in an environment that monitors for vulnerabilities in
| open source dependencies and you depend on an open source
| library that drags in dozens of transitive dependencies you
| don't really need, it adds a fair bit of additional overhead
| re: detecting, investigating and patching the potential
| vulnerabilities.
|
| I guess there's arguably a "Hickean" simple-vs-easy
| tradeoff in how such libraries are designed. The "easy"
| design, that makes it quick for developers to get started
| and achieve immediate success with a config loading
| library, is to include code to load config from all popular
| supported config sources into the default configuration of
| the library, reducing the number of steps a new user has to
| do to get the library to work for their use case. A less
| easy but arguably "simpler" design might be to only include
| a common config-provider interface in the core module and
| push all config-provider-specific client/adaptor code into
| separate modules, and force the user to think about which
| config sources they want to read from and then manually add
| and integrate the dependencies for the corresponding
| modules that contain the additional code they want.
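The "simpler" design described in the paragraph above can be sketched in a few lines. This is a language-neutral illustration written in Python, not viper's actual API (viper is a Go library); `ConfigProvider`, `EnvProvider` and `load_config` are hypothetical names. The core only knows the interface, and each provider could live in its own separately installed module, so a user who reads only environment variables never pulls in etcd or consul clients.

```python
import os
from typing import Dict, Iterable, Mapping, Optional, Protocol


class ConfigProvider(Protocol):
    """The only thing the hypothetical core module defines."""

    def load(self) -> Mapping[str, str]:
        ...


class EnvProvider:
    """A provider that would ship in its own small module.

    Reads variables starting with `prefix` and exposes them with the prefix
    stripped and the key lower-cased.
    """

    def __init__(self, prefix: str, environ: Optional[Mapping[str, str]] = None):
        self.prefix = prefix
        # Allow injecting a mapping so the provider is easy to test.
        self.environ = os.environ if environ is None else environ

    def load(self) -> Mapping[str, str]:
        return {
            key[len(self.prefix):].lower(): value
            for key, value in self.environ.items()
            if key.startswith(self.prefix)
        }


def load_config(providers: Iterable[ConfigProvider]) -> Dict[str, str]:
    """Merge providers in order; later providers override earlier ones."""
    merged: Dict[str, str] = {}
    for provider in providers:
        merged.update(provider.load())
    return merged
```

The cost is the one named above: the user has to pick and wire up providers explicitly, for example `load_config([EnvProvider("APP_")])`, instead of getting every source by default.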
|
| edit: there has indeed been some discussion about the
| proliferation of dependencies, and what to do about them,
| in viper's issue tracker [3] [4]
|
| [1] https://github.com/spf13/viper
| [2] this may or may not actually happen, depending on which
| function calls you actually use and what the compiler figures
| out. If your application doesn't call any remote-config-provider
| library functions then you shouldn't expect to find any in your
| resulting application binary, even if the dependency is there at
| the coarser-grain module dependency level
| [3] https://github.com/spf13/viper/issues/887
| [4] https://github.com/spf13/viper/issues/707
| cosmotic wrote:
| Image pull size for a container is likely the concern. It
| could shave a few seconds off a regularly-run integration
| test. If it's run via on-demand build agents, then there's no
| image cache.
| jeffbee wrote:
| If it takes multiple seconds to pull ~35MB of compressible
| text into your CI environment, there may be other, larger
| problems to solve.
| wejick wrote:
| sorry not to make it obvious in the article, I'm planning to
| run it in small iot pi based device locally. So having
| something small and fast is preferable, however the runtime
| performance is a more important thing I haven't touched.
| [deleted]
| jeppesen-io wrote:
| Prometheus is rather efficient, but its focus is a little
| different than yours. It's designed for large scale
| collection of metrics, scraped from many remote endpoints
|
| You can run it locally but the "prometheus" way for iot env
| would be a central prometheus server that scrapes the iot
| devices running a prometheus exporter, which tend to be
| very light weight
| wejick wrote:
| Totally agree. Another part of it is just feeding
| curiosities.
| ts4z wrote:
| I'm curious -- is it the binary size that's a problem, or
| the resident size in memory? Demand paging should help,
| although you'd be stuck with carrying the enlarged binary.
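One rough way to look at the file-size versus resident-size question on Linux is to compare the size of the executable on disk with the `VmRSS` field of `/proc/<pid>/status` for the running process. The sketch below is an assumption-laden illustration, not a measurement from the article: `parse_vm_rss_kb` and `compare` are hypothetical helper names, the binary path and PID are supplied by the caller, and `VmRSS` covers the whole process (heap and stacks included), so it is only an upper bound on how much of the binary is paged in.

```python
import os
from typing import Dict, Optional


def parse_vm_rss_kb(status_text: str) -> Optional[int]:
    """Extract the VmRSS value (in kB) from the text of /proc/<pid>/status."""
    for line in status_text.splitlines():
        if line.startswith("VmRSS:"):
            # The line looks like "VmRSS:     52340 kB".
            return int(line.split()[1])
    return None


def compare(binary_path: str, pid: int) -> Dict[str, Optional[int]]:
    """Return on-disk size and resident set size, both in kB (Linux only)."""
    with open(f"/proc/{pid}/status") as handle:
        rss_kb = parse_vm_rss_kb(handle.read())
    return {"file_kb": os.path.getsize(binary_path) // 1024, "rss_kb": rss_kb}
```

Running `compare` against the full build and a trimmed build under the same workload would show whether the removed code was ever resident in the first place.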
| wejick wrote:
| my gut feeling tells me it will be both memory and cpu
| utilization. can't be sure, until I can find a good way to
| measure it.
| wahern wrote:
| That only helps if the code is well segregated by usage.
| Looking at the ELF symbol table for
| prometheus-2.33.0-rc.1.linux-amd64, it's not clear to me this
| is the case. Not sure how it's ordered. Lexical import order?
| Anyhow, without profiling how could the compiler know how to
| order things optimally?
|
| I think this is one of those cases where, in the absence of
| profiling or some other hack (e.g. ensuring all routines
| within a library are cleanly segregated across page
| boundaries within the static binary _and_ the I/O scheduler
| doesn't foil your intent), dynamic linking would prove
| superior, at least for such large amounts of code.
| xuhu wrote:
| Auditd_2.8-amd64.deb is 194kb on debian, rsyslog_8.32-amd64 is
| 411kb, and they both support centralized auditing and log
| collection from multiple hosts.
| djbusby wrote:
| Do they do the metrics like Prometheus does? And include the
| central collector and basic graph builder?
| dtech wrote:
| This doesn't seem a fair comparison. Prometheus is statically
| linked like all Go applications, and those packages are not.
| You can debate the merits of that, but if you compare an "only
| rsyslog" server vs an "only prometheus" server the 2 will be
| much closer in size.
| foxfluff wrote:
| It's kind of ironic reading this comment given that at the time
| you posted it, I was screaming at gcc's stupid code generator
| for wasting _bytes_ recreating constants that were already
| there in that very register! That code needs to fit in a couple
| hundred bytes..
|
| And half an hour ago I was (once again) checking out hosting
| providers and lamenting the fact that most don't seem to offer
| support for loading custom ISOs so I could install a 30
| megabyte distro and make the most out of the cheap plans that
| only offer something like 10 gigabytes of storage. Half of it
| is wasted after you install one of these obese mainstream
| distros.
| mitjam wrote:
| Hetzner Cloud can start instances from ISOs - here is an
| example for ipfire :
| https://wiki.ipfire.org/installation/hetzner-cloud
| TillE wrote:
| I work in a lot of situations with hard or soft resource limits
| where I actually do need to count bytes and/or CPU cycles, so
| it's bizarre to see anyone shrugging about distributing tens of
| megabytes of fat for literally no reason.
|
| One thing Microsoft got right a long time ago was separating
| out debug symbols into their own file by default. I think
| that's still awkward on Linux.
| yodon wrote:
| And oh what a beautiful bike shed it will be...
| mpolun wrote:
| rob_c wrote:
| I suspect 70%+ of all features of all tools remain
| undiscovered by the users.
| [deleted]
___________________________________________________________________
(page generated 2022-01-28 23:01 UTC)