[HN Gopher] The dark side of GraphQL: performance
___________________________________________________________________

Author : kamranahmedse
Score  : 84 points
Date   : 2020-01-01 19:07 UTC (3 hours ago)

(HTM) web link (twitter.com)

| jensneuse wrote:
| The title is misleading. The post doesn't discover any dark sides of GraphQL. It is about a potential performance problem with a library that implements the GraphQL spec. There might be a problem with the library itself, or with how the library is used. The author states that it takes 19ms to fetch 20 recipes from a Postgres database. This looks really suspicious. Why does it take so long to fetch 20 indexed rows? Maybe there's some general performance problem with the application?
|
| [deleted]
|
| ctvo wrote:
| You focus on, and assume, that these are indexed rows and that a ~20ms average for a database call is "suspicious", but you're not concerned about the 400ms flamechart for graphql-js doing validation shown in the thread?
|
| graphql-js is the reference implementation of GraphQL, so it's not any random library.
|
| jensneuse wrote:
| The graphql-js library focuses on correctness, not on performance. Facebook doesn't invoke it at runtime, only at build time. They use persisted queries only. If you want a high performance server runtime I wouldn't use Node.JS. Especially for a complicated task like validating and resolving a GraphQL query, Node.JS is the wrong tool. It's too high level to tweak hot paths and optimize the garbage collector. So no, I don't think the flame graph is suspicious. In my language of choice (Go) I could drill down into memory and CPU consumption for each line of code to find the bottleneck. Maybe this is possible for Node.JS too, I don't know the tooling so well.
| I would suggest that, if such a tool exists, a detailed flame graph of the Node.JS application might help understand the issue.
|
| hn_throwaway_99 wrote:
| OP is using Apollo Server, which is by far the most common server implementation for GraphQL. It may well be that there are issues specific to Apollo, but it's definitely worth getting to the bottom of, given how widely used Apollo is.
|
| There is nothing in the posts that identifies NodeJS as the culprit, and based on the info I'd be very surprised if it was. It seems most likely that the type validation is what is taking so much time. But then again, strong types are one of the main benefits of GraphQL. If anything, I've found Node to be one of the easiest and most "natural" server languages for GraphQL, and I have implemented GraphQL servers in Node, Java and Python.
|
| [deleted]
|
| foota wrote:
| This comment neglects the fact that people can and do use Node in situations where performance is important, and 400ms is especially egregious. Having the obvious path for using GraphQL in server-side JS perform terribly is problematic.
|
| andrewingram wrote:
| For a point of comparison, when I was using graphql-js around 3 years ago, I was benchmarking things pretty carefully and the main bottleneck wasn't graphql-js -- I had comparable (or faster) response times to equivalent existing hand-crafted JSON endpoints.
|
| But if you're fetching a lot more data than you need for a typical UI, you might run into bottlenecks.
|
| sciolistse wrote:
| Regarding profiling in Node.JS (in case anyone who is interested is unaware), if you start your application with the "--inspect" argument and then open devtools in Chrome/Chromium, there's a little Node icon that shows up in the top left corner.
|
| If you click on that you can get performance flame graphs/tables, memory profiling, and there's also a REPL for the process, as well as a list of loaded source files so you can set breakpoints through there if you like, and modify the files on the fly if you need something more for debugging.
|
| It can be very useful, and works pretty much the same as the normal web devtools.
|
| city41 wrote:
| There is also ndb for a very similar effect. What I really like about ndb is that it works in front of just about anything, for example `ndb yarn test`
|
| https://github.com/GoogleChromeLabs/ndb
|
| swyx wrote:
| iirc fb doesn't even use graphql-js much? the whole thing was embedded in PHP Ents. graphql-js was written purely for the opensourcing. ofc things may have changed somewhat in recent years but i doubt if
|
| m_ke wrote:
| Sounds to me like an issue that comes with coupling validation with serialization. A lot of these API frameworks combine the two, with the goal of automating validation when receiving data from clients, but then also do that validation when serializing response data, which should already be validated if it's sitting in your database.
|
| I've run into similar issues with FastAPI and DRF when dealing with really large payloads.
|
| hamandcheese wrote:
| I've seen similar issues in graphql-ruby. Even if I hardcode the data in my resolvers, it takes hundreds to thousands of ms to render a list with some moderate nesting.
|
| dclowd9901 wrote:
| Forgive me if this sounds a bit "hindsight 20/20", but I feel like performance was always a lower consideration when it came to utilizing GraphQL. The win is in reducing overhead around providing new endpoints.
|
| Like React, it eschews performance for the sake of enterprise level scaling. This shouldn't come as a surprise to anyone, seeing as both of these came from one of the largest dev organizations in the world.
|
| toomim wrote:
| > performance was always a lower consideration when it came to utilizing graphql
|
| That's strange, because I thought the main selling point was to consume only the data you need. The client specifies exactly which fields it wants, so it doesn't over-fetch, and that is supposed to make things faster.
|
| wolfgang42 wrote:
| GraphQL the protocol/language was designed for performance, but (when I tried GraphQL, which was several years ago) the server-side implementations seem to have had much less of a focus on it.
|
| It's true that the _client_ doesn't over-fetch (and also doesn't need multiple round-trips), but at least when I tried the gql-js library it required the _server_ to over-fetch: it would ask for individual records, and then do the field plucking/record joining itself; there was no way to intercept the query along the way to find out which fields it needed so you could fetch only those.
|
| I get the impression that the server libraries were designed to work with a document store or "fat" REST API that is only capable of taking a single ID and returning the entire record. In this situation it makes sense to have a separate middleware server to keep the big fetches and round-trips inside the datacenter and only give the client exactly what it needs, and needing a little more server power isn't a big deal. But if you need to do something more sophisticated (even something as simple as only fetching certain fields from the datastore), they were no help whatsoever; when I was looking into it there wasn't even a way to parse the query into an AST and do the rest of the query planning yourself.
|
| gavinray wrote:
| Echoing this, GraphQL is just a specification, and it is up to library authors how that spec is implemented.
|
| I think there might be a disconnect or misunderstanding on the developer's part about this. GraphQL is sort of like the Flux pattern for MVVM architecture.
| It isn't so much a thing as an idea.
|
| spamizbad wrote:
| While it's certainly true performance can be a trade-off... 400ms+ response times are annoyingly slow. I'm not sure the trade-off is worth it unless it's some really exotic endpoint you've created.
|
| nbardy wrote:
| "Eschews performance" isn't the right way to put it. React allows you to do 90% of UI work in performant ways. It has good, predictable performance for the majority of work and lets you move through a lot of simple UI tasks quickly, so you can spend time focusing on performance in the parts of your app that matter. The situations where you really need performance tuning are going to be unique to your specific app and data.
|
| picardo wrote:
| I'm not sure if this is mentioned in the thread, but one of the reasons it takes so long for the requests to return is that GQL initializes the entire record in memory and then reduces it back to only the fields you wanted. This can be a big problem if you have a deeply nested data model, and potentially many results. The memory consumption can hit the roof. I find that the best approach in those cases is to create a one-off REST endpoint (or to create a field higher up the GQL hierarchy) and handroll the SQL query.
|
| benawad wrote:
| I thought it could have been a memory problem too, but the VPS didn't show any signs of anything spiking
| https://twitter.com/benawad/status/1212404379371917313
|
| but I do think it's related to my nested object
| https://twitter.com/benawad/status/1212407236284338176
|
| viraptor wrote:
| > when GQL initializes the entire record in memory
|
| GQL is an idea, not an implementation. I don't believe there's anything preventing actual software from optimising this case. Or am I missing something here? The query defines what you're asking for, so extra data does not necessarily need to be fetched.
|
| xtagon wrote:
| You're correct.
| It's common for GraphQL API implementations to batch all the parent record's fields up front, but it's not the only way. One alternative method is to traverse the whole query object and generate one big query to your database (SQL, graph database, what have you) instead of batching queries per table/object. This has trade-offs. Sometimes it's more performant, especially for smaller queries, but for larger queries it can actually be slower because joining lots of tables into one query causes some duplicated data and transfer overhead (assuming you're using SQL). I have a feeling that this method would perform very well if your GraphQL data was backed by an actual graph database though.
|
| np_tedious wrote:
| How deep do you mean by "entire record"? I am pretty sure that this concern only applies to those fields which have the same resolver.
|
| greenpizza13 wrote:
| Things have matured quite a bit. With Apollo Server it's possible to fully understand which fields are being requested before creating and running, for example, an SQL query. Fetching only the requested data for a given query reduces the in-memory footprint. Most people get the whole data object and then allow GQL to select the subset of fields the user asked for, but for cases where performance is a problem there is another solution.
|
| picardo wrote:
| I haven't used Apollo Server lately. But the way you describe it doesn't address the core issue, which is the initialization of the intermediate objects in memory. So just to give an example, if I wanted to query for the projects of listings of my company, I can write it this way in GQL:
|
|     me { company { listings { projects { id name } } } }
|
| This will initialize: a User, a Company, Listings and Projects of all listings.
|
| I can also write this in SQL using a couple of joins and return an array. The memory consumption is trivial in comparison to the original request.
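
The idea raised in the comments above, reading the requested fields out of the query and selecting only those columns in one joined SQL statement, can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not how Apollo Server or graphql-js do it internally. The `FieldNode` interface is a cut-down stand-in for the AST node that graphql-js passes to resolvers via `info.fieldNodes`; it ignores fragments, aliases and arguments. The table names, column names, join keys, and the helpers `requestedFields` and `buildProjectsQuery` are all hypothetical.

```typescript
// Cut-down stand-in for graphql-js's FieldNode (assumption: only the parts
// used here; real selections may also contain fragment spreads).
interface FieldNode {
  kind: "Field";
  name: { value: string };
  selectionSet?: { selections: FieldNode[] };
}

// Hypothetical mapping from GraphQL field names to SQL columns.
const PROJECT_COLUMNS: Record<string, string> = {
  id: "p.id",
  name: "p.name",
  description: "p.description",
};

// Collect the names of the leaf fields selected directly under `node`.
function requestedFields(node: FieldNode): string[] {
  const selections = node.selectionSet?.selections ?? [];
  return selections
    .filter((s) => s.selectionSet === undefined)
    .map((s) => s.name.value);
}

// Build one joined query that selects only the requested, known columns.
// Hypothetical schema: projects -> listings -> companies.
function buildProjectsQuery(projectsNode: FieldNode): string {
  const columns = requestedFields(projectsNode)
    .filter((f) => f in PROJECT_COLUMNS)
    .map((f) => PROJECT_COLUMNS[f]);
  if (columns.length === 0) {
    throw new Error("no known fields requested");
  }
  return (
    `SELECT ${columns.join(", ")} FROM projects p ` +
    "JOIN listings l ON l.id = p.listing_id " +
    "JOIN companies c ON c.id = l.company_id " +
    "WHERE c.id = $1"
  );
}

// Selection equivalent to `projects { id name }` in the query above.
const projectsNode: FieldNode = {
  kind: "Field",
  name: { value: "projects" },
  selectionSet: {
    selections: [
      { kind: "Field", name: { value: "id" } },
      { kind: "Field", name: { value: "name" } },
    ],
  },
};

const sql = buildProjectsQuery(projectsNode);
console.log(sql);
```

Whitelisting field names through a fixed column map, rather than interpolating them directly, keeps client-supplied names out of the SQL text. Whether a single joined query beats per-object batching still depends on the trade-offs xtagon describes.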
|
| andrewem wrote:
| You say "problem there is another solution" - what is the other solution? (I'm guessing it involves somehow telling Apollo Server which fields/related objects you will need?)
|
| sergiotapia wrote:
| I'm positive there's a way to only load the relevant data in your stack.
|
| In Elixir with Absinthe we can resolve to the specific fields we need, and we don't load the entire records and then slim them down.
|
| picardo wrote:
| I never used Absinthe, but if you're initializing an ORM in your resolver, loading the entire record into memory is unavoidable. How does Absinthe get around that? (Sounds like it generates the SQL?)
|
| city41 wrote:
| It's also possible to look at what fields were requested in the GraphQL query and use them to guide what gets fetched.
|
| CharlesW wrote:
| Reading the thread, this isn't a "dark side of GraphQL" but a "dark side of not understanding how to debug/improve performance in my software dependency".
|
| lasdfas wrote:
| Not sure why you are getting downvoted. The person actually states that they don't know how to debug: "honestly, I'm not 100% sure the best way to debug from here." They are just looking at Datadog stats and not finding the root cause. They could do some basic JS debugging of the open source library to figure out the issue. Blaming Apollo would be a stretch (it may not even be the issue, since they haven't done any debugging), but blaming the GraphQL protocol goes way too far.
|
| jmull wrote:
| > Not sure why you are getting downvoted.
|
| I don't know why anyone downvotes as they do, but the previous post is an irrelevant argument about semantics, so in my opinion it deserves to be downvoted.
|
| Actually, now that I think of it, it's a little worse than that. The OP is being criticized for not understanding how to debug or improve the performance of their dependency _while actively engaged in figuring out how to debug and improve the performance of their dependency_.
| (People respond with questions, OP provides substantive answers, there's a back-and-forth and OP forms an idea that it's related to a deeply nested schema, and so on.)
|
| ex3ndr wrote:
| Quite strange. The GQL server source code literally just walks the fields and resolves promises; it is very simple and straightforward.
|
| We had something like this in our backend, but times this long usually mean that something is hogging the event loop and blocking everything else from executing.
|
| It could be anything. For example, it could be async hooks, which make things ~1000 times slower if you use a lot of promises (resolved fields are often just promises), since the overhead is per promise. In general, in the latest Node.js you can create a huge number of promises with little to no overhead, so again: something is wrong with the Node.js setup, some library is clogging the event loop, or it is something deeper in the Node.js internals. It is not an issue with GQL itself: if you have GQL performance issues, it means your server is super slow at processing just about anything. Our team was shocked by the performance, and it turned out that Node.js is super fast and it is some libraries (like sequelize) that kill the performance, but GQL is not one of them.
|
| coding123 wrote:
| Hmmmm... if anything, a GraphQL query should generally outshine REST in nearly any category of performance. From the sound of things, the performance issue doesn't make any sense. He's using DataLoader, and he is certain it's not related to DataLoader anyway. So maybe some dependency he's using is the wrong version.
___________________________________________________________________
(page generated 2020-01-01 23:00 UTC)