[HN Gopher] Show HN: Octo - Generate a serverless API from an SQ...
       ___________________________________________________________________
        
       Show HN: Octo - Generate a serverless API from an SQL query
        
       Author : khalidlafi
       Score  : 192 points
       Date   : 2020-09-11 13:02 UTC (9 hours ago)
        
 (HTM) web link (octoproject.github.io)
 (TXT) w3m dump (octoproject.github.io)
        
       | bobbywilson0 wrote:
       | My main purpose of tools like these has always been prototypes,
       | or hobby one-off type stuff. For SPAs, or a sketch with a Jupyter
       | notebook. They're great for this sort of thing because in my
       | experience, this used to require building some sort of API just
       | to get a simple json interface to the database. It was my
       | understand that the purpose of these types of tools was mostly
       | that.
       | 
       | Are folks using these kind of things for non-trivial production
       | applications?
        
       | amani92 wrote:
       | Very impressive, Great job.
        
       | rimkms wrote:
       | Logo is looking good , gj
        
       | Glyptodon wrote:
       | One of the things that's not obvious to me about things like this
       | (and other similar tools) is where/how
       | scopes/limitations/permissions are handled. I assume they either
       | are or can be, I just never see it spelled out clearly. What am I
       | missing?
        
         | eatonphil wrote:
         | I tackled this by using SQL as the filter language where the
         | generated code will fill in some context-specific variables
         | like current session user id or current request object id.
         | 
         | This is a little limited in this current form and I'm working
         | to expand SQL filters to match up to HTTP codes so you can say
         | this request needs to have a session otherwise it is a 401,
         | then it also needs to match another filter otherwise it is a
         | 403. But this other endpoint is ok to show without a session if
         | the object being requested is marked public in the db.
         | 
         | There's a lot to think through especially when extending these
         | filters to bulk methods.
         | 
         | https://github.com/eatonphil/dbcore/blob/master/examples/not...
        
         | lukeramsden wrote:
         | I can't speak for this project specifically but for some
         | context, Postgraphile's way of solving this, as it only
         | supports Postgres, is to use Postgres's Row Level Security
         | feature, whereby you enforce scopes and permissions at the data
         | layer, as well as using table grants to roles specified in JWTs
         | and such. (https://www.graphile.org/postgraphile/postgresql-
         | schema-desi...)
         | 
         | This project doesn't seem to have any inbuilt AuthZ
         | functionality, so unless your database has that built in like
         | Postgres, or you need something that's not possible in-
         | database, I guess you just... can't.
        
           | ecf wrote:
           | I recently started a side project on top of Postgraphile and
           | I had to end up scrapping it in favor of something I was more
           | familiar with.
           | 
           | The biggest problem is that unless the main language you're
           | familiar with is PL/pgSQL, you'll eventually run into the
           | roadblock that is having everything reside in the DB.
           | 
           | In my case, I simply could not figure out how to use the
           | Users table without having the password returned in all
           | queries. I could turn off the automatically generated queries
           | that Postgraphile made, but at that point why bother with it?
           | 
           | Other problems are that there isn't a mature and well
           | established method of maintaining development, staging, and
           | production database schemas on projects with more than one
           | developer.
           | 
           | Graphile makes https://github.com/graphile/migrate, however,
           | it seems incredibly brittle and is built on a workflow that
           | will absolutely break production if it is slightly deviated
           | from.
           | 
           | I really liked using it! I just hope some of these developer
           | ergonomics issues could be better handled.
        
             | lukeramsden wrote:
             | > In my case, I simply could not figure out how to use the
             | Users table without having the password returned in all
             | queries
             | 
             | The recommended way of doing this is to store anything you
             | don't want public in a separate table with a one-to-one
             | relationship, and then controlling access to that table
             | through computed columns and such
             | (https://www.graphile.org/postgraphile/postgresql-schema-
             | desi...)
             | 
             | What I can't recommend enough to people starting with
             | Postgraphile is to check out graphile-starter, and
             | especially it's first migration: https://github.com/graphil
             | e/starter/blob/main/%40app/db/migr...
             | 
             | That really helped me to understand better how to structure
             | my schema, as separation of concerns within a model by
             | using different tables is not something that is necessary
             | when doing a regular application-in-front-of-datastore type
             | app.
             | 
             | > Other problems are that there isn't a mature and well
             | established method of maintaining development, staging, and
             | production database schemas on projects with more than one
             | developer.
             | 
             | Yes, I don't have to grapple with this issue so I don't
             | really have much to say about this. Graphile-migrate works
             | great for me, but I can see how it could be an issue for
             | larger projects, although I would think that database
             | migrations are simply tricky on their own, regardless of
             | Postgraphile.
        
               | ecf wrote:
               | You're completely right about the starter and I feel if I
               | had discovered it first instead of trying to roll my own
               | migration system with Knex and stuff, I would have stuck
               | with Postgraphile.
               | 
               | But instead of rebasing my project with the starter I
               | decided to go back to what I was comfortable with and do
               | a Rails gql api.
               | 
               | I'll definitely be trying it out in the future though! I
               | felt so incredibly productive with rolling stuff out
               | before auth became a concern.
        
               | lukeramsden wrote:
               | Yes it's been incredibly quick for me to iterate and
               | build my current project. You are still right about
               | migrations though, I'm not sure how that would work in a
               | bigger team with feature PRs etc.
        
           | awb wrote:
           | I tied this "no backend" approach but ended up writing all of
           | my business logic in the DB schema instead of in the app,
           | which seemed to defeat the purpose.
        
             | lukeramsden wrote:
             | Well if your application has business logic at all you're
             | going to have to write business logic at some point, and I
             | personally like the idea of keeping as much of it co-
             | located with the data - it makes your schema the SSOT for
             | your data structure and it's constraints and
             | transformations.
             | 
             | The complication is as always external services, like
             | sending email, validating IAP receipts, things like that.
             | For that I liked to take some of the principles from the
             | 3factor app devised by the Hasura team, and make them
             | asynchronous, and performed by worker processes (in my
             | case, with graphile-worker).
        
               | awb wrote:
               | Agreed, I really liked having everything so close and
               | tightly coupled with the DB. The integrations and
               | business logic were the part I had to build traditionally
               | and at that point it wasn't much more to query the DB
               | from the backend rather than trying to learn how to
               | script inside Postgres.
        
       | WrtCdEvrydy wrote:
       | Your timing is perfect.
        
       | [deleted]
        
       | Trisell wrote:
       | The idea is interesting. But it looks like you end up with a yaml
       | file that enumerates each of your tables/endpoints and the
       | queries that back them. So are we exchanging the "complexities"
       | of code, where we have control and testing, for the "lack of
       | complexity" of yaml that becomes unwieldy and untestable in the
       | name of "simplicity?"
        
         | stingraycharles wrote:
         | Don't forget that at some point, you'll want to generate the
         | yaml from code, because otherwise it becomes impossible to
         | maintain. And quickly you'll find yourself back at square 1. :)
        
       | o1lab wrote:
       | Interesting concept and quite liked the playful logo. Can we pass
       | in env variables to db connection ?
       | 
       | We are in similar space, we take input params of db and generate
       | CRUD apis with Auth+ACL and then APIs are packed into a single
       | lambda function. There is support for serverless framework as
       | well.
       | 
       | [1]: https://github.com/xgenecloud/xgenecloud
        
         | amal_kh5 wrote:
         | Yes, for example:-
         | 
         | ? Enter the database password: ${DB_PASSWORD}
        
       | mitjam wrote:
       | Reminds me of the venerable Datasette by Simon Willison:
       | https://github.com/simonw/datasette
        
       | ForHackernews wrote:
       | Looks vaguely similar to http://postgrest.org/
        
         | lukeramsden wrote:
         | And Hasura, Postgraphile et al too. These, as well as PostgREST
         | also give you much more flexibility in the form of plugins in
         | library mode and other such things - they also generate the
         | actual queries for you, via introspection, as opposed to this
         | which requires you to write the query yourself.
         | 
         | I think there's certainly space for this project, i.e. hand-
         | written queries, on any database (Postg[REST|raphile] both only
         | work with Postgres of course, not sure about Hasura). Not sure
         | it will succeed without support for more forms of Serverless
         | deployment, primarily Lambda.
        
       | alexzender wrote:
       | Interesting, I've built a similar project that generates GraphQL
       | API based on your database schema - https://okdb.io
        
       | cube2222 wrote:
       | Looks great!
       | 
       | If you like this, check out OctoSQL[0]... Also in Go... Though
       | OctoSQL lets you query multiple databases / files / event streams
       | like kafka using SQL from your command line, not as a server, so
       | a fairly different use case, but you should check it out
       | nevertheless!
       | 
       | The naming clash is funny.
       | 
       | [0]: https://github.com/cube2222/octosql
        
         | jhoechtl wrote:
         | I realy like your tool. In fact I am slowly integrating it into
         | a solution which will expose a REST API and workspaces
         | identified by a UUID. In our organisation it is so common to
         | receive an Excel or csv which you have to join with the
         | database. Octosql is great for that.
         | 
         | I am wondering what future role badger will play in the future?
         | It would also make a great additional KV backend btw.
        
           | cube2222 wrote:
           | That's really great to hear!
           | 
           | We're considering moving to a more in-memory model, as we're
           | not sure if the badger storage idea was a good one and worth
           | it.
           | 
           | TBH we're still not quite sure in what direction we'll be
           | continuing. Though we're surely gonna be developing it
           | further.
           | 
           | But currently we're considering a rewrite with multiple
           | assumptions changed (column oriented).
        
       | alexellisuk wrote:
       | Nice to see openfaas featured here and thanks for your PRs to
       | Arkade. I do wonder what your strategy is on connection pooling
       | and authentication?
       | 
       | Also not keen on the passwords being kept in a plaintext file -
       | someone will check that into git. OpenFaaS has secret support
       | which you can use Amal. So does Knative.
        
       | [deleted]
        
       | ahmadbana wrote:
       | Very interesting projects and can be scalable Keep up the good
       | work
        
       | akie wrote:
       | Perhaps I'm old, but who needs an API for an SQL query? I'm not
       | sure I understand the use case, or the advantage of something
       | like this over a regular API call to a backend which would also
       | allow you to do e.g. authentication. Enlighten me?
        
         | sixdimensional wrote:
         | In my case, I have a simple obvious use case.
         | 
         | I work for a large corporation. They want to implement the
         | Bezos mandate [1]. No direct database access between teams, API
         | abstraction for everything.
         | 
         | OK, now let's think in onion layers (or hexagonal/clean
         | architecture if you like). Think of layers of services with
         | different purposes - data services (containing no
         | application/business logic), application services (for business
         | logic, orchestration, process), and UI/UX services (to power
         | differing end user experiences).
         | 
         | Data services don't have to do much - be the data/repository
         | layer, expose productive interfaces for CRUD. Need to read
         | across data sources? Think of federated data services that can
         | combine data on the fly, perhaps like GraphQL.
         | 
         | These kinds of tools are perfect for the first layer of
         | services that abstract the database world from the application
         | world. Just simple services, even ones that effectively let you
         | mimic what SQL queries can do (filter, sort, page, etc.).
         | Individual record oriented interfaces, and bulk oriented
         | interfaces. The query side of CQRS (command query separation).
         | 
         | Many will say, "I don't need all this complexity and layers" -
         | and sure, for smaller or simpler applications, probably not!
         | 
         | But, if you have to operate on any kind of larger scale, with
         | multiple data sources, systems, etc., you end up needing the
         | layers. And these types of tools automate some of the lower
         | layers.
         | 
         | Perhaps when we talk about this, we shouldn't be focusing on
         | "oh it's too complicated", and instead building frameworks or
         | reference architectures that automate away the complexity - so
         | it looks easy again, but now it is more flexible, perhaps
         | easier to scale.
         | 
         | I believe that we are on the cusp still, of an almost fully
         | defined, service based architecture (microservices and server
         | less were just one part of the continuing development of that
         | story). Federation is another part of that story oft ignored.
         | Thinking of the onion as service layers is another part.
         | Erasing the network boundary as a concern through much higher
         | speed internetworking is another part.
         | 
         | Eventually we may come to see, that it is all a big "system",
         | some parts just aren't connected to each other directly.
         | 
         | Sorry, got a bit rant-y at the end there :) Just passionate
         | about sharing this world view with others - as I continue to
         | see this architecture developing!
         | 
         | [EDIT] I wanted to add, it's not just that the use case for
         | this is in a data service layer for automation - from a logical
         | perspective I mean. In big companies such as the one I work
         | for, we never get the resources we need, ironically. We are
         | overwhelmed with demands, and must operate under the Bezos
         | mandate rules. Tools such as Octo are not a panacea, but, they
         | are a good compromise if you have to move fast, they are time-
         | and-cost-savers. And they can get you surprisingly far.
         | 
         | [1] https://www.calnewport.com/blog/2018/09/18/the-human-api-
         | man...
        
         | mritchie712 wrote:
         | As part of our product (https://seekwell.io/), we let people
         | access SQL results with an API key and unique endpoints per
         | query. There's also an option to add parameters.
         | 
         | The main use case is giving a data scientist or another
         | application access to the results of a few arbitrary queries
         | without giving them full access to the database. So it's a bit
         | like giving them access to a SQL view, but without them needing
         | to set up a driver, etc. to connect.
        
         | chadhutchins10 wrote:
         | One of the best use cases for this is say you have a
         | backend/internal system and you want other things to start
         | interacting with it. Instead of having to write the api to
         | interface with it, you can just use something like this and
         | with little effort you have an api and can talk with the
         | database.
        
           | alangibson wrote:
           | I think the point he was making is: why the API if you just
           | want to talk to the db? You can connect to a SQL db over the
           | network and protect the data with views and stored
           | procedures.
        
             | RussianCow wrote:
             | Because you can't (easily) connect to an SQL DB from any
             | arbitrary client, like a web browser or mobile app.
        
               | chatmasta wrote:
               | That's actually exactly what we're trying to build at
               | Splitgraph [0]. :) We're building a "data delivery
               | network" (DDN), which is like a CDN, except instead of
               | forwarding HTTP requests to upstream web servers, we
               | forward SQL queries to upstream databases.
               | 
               | The premise of the idea is that we can cut out the
               | middle-man for a lot of data distribution use cases. We
               | give you a way to deliver your data in native SQL, using
               | the Postgres wire protocol. We've decoupled
               | authentication from the database, so we can do it in a
               | gateway / LB layer using PgBouncer + Lua/Python
               | scripting. Any SQL client can connect to the public
               | Splitgraph endpoint (as far as a client is concerned,
               | Splitgraph is just a really big Postgres database). You
               | can write queries referencing and joining across any of
               | the 40k datasets on the platform.
               | 
               | In fact, just this week we've been working on v0.0.0 of
               | our web client. This lets you do things like share and
               | embed SQL queries on Splitgraph, e.g. [1] (this query
               | actually joins across two live data portals at
               | data.cityofchicago.org and data.cambridgema.gov).
               | 
               | There's also an example here of using an Observable
               | notebook with the Splitgraph REST API [2]. It also works
               | with the Splitgraph DDN configured as a Postgres
               | database, but that's only supported in private notebooks
               | for now (since normally it's a bad idea to expose your DB
               | to the public!)
               | 
               | In general, we like the idea of adding more logic to the
               | database. Tools like OP's are useful in this regard. In
               | fact, at Splitgraph we use Postgraphile internally (along
               | with graphql-codegen for autogenerated types) and we have
               | nothing but good things to say about it.
               | 
               | [0] https://www.splitgraph.com/
               | 
               | [1] https://www.splitgraph.com/workspace/ddn?layout=hspli
               | t&query...
               | 
               | [2] https://observablehq.com/@mbostock/hello-splitgraph
        
       | eatonphil wrote:
       | I've got a similar project that reads your db schema and
       | generates a Go REST API and a TypeScript/React web interface.
       | (The code-generation is language agnostic so at some point I'd
       | like to add at least a Java REST API as well.) It supports
       | PostgreSQL, MySQL, and SQLite.
       | 
       | Unlike PostgREST/Hasura and some other dynamic tools you can
       | "eject" at this point if you'd like and continue on development
       | without the generator in a language you already know. But I'm
       | working on exposing Lua-based hooks you could carry across
       | whatever backend language you choose to generate and avoid the
       | need to eject.
       | 
       | It has builtin support for paginated bulk GET requests with
       | filtering, sorting, limiting. Built-in support for bcrypt-
       | password authentication and optional SQL filters specified in
       | configuration for authorization of particular endpoints based on
       | session and request metadata.
       | 
       | Still very much a work in progress but the goal is to push the
       | envelope on application boilerplate.
       | 
       | Screenshots are of the example/notes project in the repo.
       | 
       | https://www.dbcore.org/
       | 
       | https://github.com/eatonphil/dbcore
        
         | MuffinFlavored wrote:
         | I feel like projects like this work for simple stuff but as
         | soon as you need analytics/insights or actual business logic,
         | you almost always need to just "roll your own" API. Am I wrong?
         | Do other people feel this way? Can anybody think of a few
         | projects they've worked on that would be too complex/a ton of
         | work to make work with these kind of simple template
         | generators?
        
           | CuriouslyC wrote:
           | With postgrest you can add stored procedures as rest rpc
           | calls, and you can always roll a microservice for more
           | advanced stuff. In practice these sorts of auto-api tools
           | make a good starting point as long as they support your
           | authentication and authorization needs.
        
           | robmccoll wrote:
           | It seems like when working with generators, the trick is to
           | have the right boundaries between generated code, points
           | where you can extend the generated code, and the API through
           | which you use the generated code. If successful, you should
           | never feel the need to hand edit the generated code itself,
           | and you shouldn't need to worry too much about re-running the
           | generator breaking things or stomping on your code.
        
             | [deleted]
        
             | xgenecloud wrote:
             | >> you shouldn't need to worry too much about re-running
             | the generator breaking things or stomping on your code.
             | 
             | Can't agree more! There are code parsers which give out a
             | DOM and then code can be manipulated. But is a bit of work.
             | We are trying this concept in our framework.
        
           | jon-wood wrote:
           | The sweet spot for things like this is for them to generate
           | the sort of code you'd want anyway, in a way which allows you
           | to selectively replace bits where you need additional
           | business logic or UI customisation.
           | 
           | Rails' generators are a pretty good starting point here, if a
           | little bit more verbose than I'd like. They're great for
           | getting the boilerplatey bits off the ground, and focusing on
           | the bits that are unique to what you're doing.
        
           | eatonphil wrote:
           | Long-term maintainability is definitely my concern. I don't
           | see projects like this so much as products themselves (maybe
           | I'm myopic) but as core infrastructure. I don't want to ever
           | write the boilerplate again, but I should be able to extend
           | it maintainably over time (hence ejecting or Lua hooks).
           | 
           | My goal in building this is to allow myself to more rapidly
           | prototype real, complex applications. It's not there yet but
           | I've got such an application in mind, building toward support
           | for it as I'm developing this.
        
           | chadhutchins10 wrote:
           | DreamFactory is basically a paid service for this sort of
           | thing. They support something like 20 types of databases
           | (among many other data sources). They have a lot of features
           | that make the exposed api be good enough long-term.
           | https://www.dreamfactory.com
        
           | BlackCherry wrote:
           | I think on average you're right. As the project grows, these
           | generators lose some of their initial value and speed.
           | 
           | If I were to use something like this, it'd be for rapid
           | development initially for prototyping purposes. Then I'd
           | transition to something more bespoke as needed.
           | 
           | If you know all your requirements up front, then starting
           | bespoke from the beginning may be the better route.
        
           | lukeramsden wrote:
           | Postgraphile allows you to write plugins to arbitrarily
           | extend (GraphQL) schema and wrap resolvers, which is useful
           | for things like interfacing with external APIs etc. It works
           | just fine for me at my current project.
        
           | FinalDestiny wrote:
           | I'm using Prisma right now and they allow you to manually
           | expose your fields, and they allow you to "resolve" any field
           | as you see fit. If you want to run the original function, you
           | can provide "originalResolve" and call it later. I think
           | Prisma has a great (albeit in progress) way of doing what
           | you're saying.
           | 
           | It also integrates with graphql-codegen so you can generate
           | code for apollo/others
        
         | jdc wrote:
         | In case anyone else is wondering what's doing the templating in
         | this project, it's _Scriban_.
         | 
         | https://github.com/lunet-io/scriban
        
           | eatonphil wrote:
           | A port of Ruby's liquid templates to .NET, yep.
        
         | henryfjordan wrote:
         | This is not dis-similar to what Strapi.io does, although I
         | don't think they realize that's a big selling point from their
         | marketing materials.
         | 
         | With strapi you configure your DB and get code generated in JS
         | that supports a standard CRUD REST API. If you want to add
         | business logic, you can override any particular endpoint you
         | want. Their docs even come with the default implementation for
         | easy copy/paste.
         | 
         | I would love to see research in this space continue, I think
         | it's the future of bringing non-technical people into the
         | product development process (if you can understand building a
         | workflow with Excel/Google Sheets/Airtable, you can understand
         | building an API). I'm excited to check out your project.
        
       ___________________________________________________________________
       (page generated 2020-09-11 23:00 UTC)