[HN Gopher] Launch HN: Paigo (YC S22) - Measure and bill SaaS cu...
       ___________________________________________________________________
        
       Launch HN: Paigo (YC S22) - Measure and bill SaaS customers based
       on usage
        
       Hey HN! Daniel here, I'm a software engineer and hobbyist hacker.
       I'm joined by my cofounder Matt. We're building Paigo
       (https://paigo.tech). We make it easy for SaaS businesses to bill
       customers based on usage.  To get your hands dirty a bit we have a
       stateless and signupless demo you can try out:
       https://hn.paigo.tech/ and a video of me walking through the system
       in a bit more detail: https://youtu.be/T6J1Yh8GhdU.  The idea of
       our platform is fairly straightforward: You give us read-only
       access to your SaaS backend and based on tenant metadata for your
       infrastructure, we measure, persist, and aggregate SaaS tenant
       usage data to give a clear picture of per-client usage. We can
        measure metrics like API requests, compute time, data storage,
        transaction volume, and more. Some common scenarios: an ML
        platform could use Paigo to track processed input files per
        customer, a data platform could use it to determine how much
        data each customer has consumed, and an API company could use
        it to track customers' API requests. We also help you
        understand your cost to serve each client's usage, and this
        data allows us to provide your SaaS with usage-based billing.
        What's
       the problem we are solving? Many SaaS products need to measure
        their customers' usage in some form, and many want to incorporate
       it into their billing plans. It's fairly annoying to either build
       the entire system in house or to build a measurement system in
       house and then connect to a billing provider. It takes months to
       get a usage based billing system up and running and usually
       requires several engineers (if not more) to maintain and operate.
       Also, when Sales wants to offer specific discounts or deals to
       major enterprises, it's typically handled outside of the in-house
       system in Excel spreadsheets with some good guesses. This is how a
       lot of money gets lost for major deals.  With Paigo we handle 100%
       of the measurement and collection of SaaS customers' usage for the
        business. A SaaS business can see its customers' usage within
        10 minutes, because all it needs to do is give us read access
        to its cloud account. Since we pull lower-level infrastructure
        data, we can additionally surface information like per-tenant
        cost and profit margin.  Matt and I came to this project after
        we built similar
       internal billing systems at previous jobs and we realized how
       error-prone these systems can be--one incident might have even
       undercharged a client by a few million dollars! We also realized
       there was no solution which integrated directly to a backend system
       and handled the measurement and gathering of usage data as well as
        providing the end billing integration to platforms like
        Stripe, AWS Marketplace, or through ACH.  To get into the
        technical details:
        Paigo has a few measurement systems for different forms of
        usage data: infrastructure-based, where we connect directly to
        cloud APIs to slice and dice per-tenant usage data;
        agent-based, where our agent is deployed into a runtime to
        gather usage like pod CPU time, memory, and file reads/writes,
        along with any exported metrics that are Prometheus-compatible;
        and datastore-based, where we connect directly to datastores
        like S3, Kinesis, or log files. The datastore-based approach
        requires the data to adhere to a standard format so we can
        process it; in exchange, it lets us pull any custom metrics
        and dimensions directly from your datastore. All of this data
        is then processed and sent to our
       backend usage journal, where we store it in an append-only ledger
        pattern.  To search and aggregate that data into an end bill,
        or to slice and dice their clients' cost and usage, customers
        can use our API. We're an API-first company, which is why our
        demo can work with Retool--the demo is just a very thin skin
        over our API. The API is a NestJS application, currently
        running in AWS Lambda behind API Gateway.  We bill based on
        invoiced revenue (surprise surprise, it's usage-based) plus a
        platform fee; roughly, it breaks down to 1% of invoiced
        revenue on Paigo.
       Note that pricing is not currently transparent on our website. Our
       typical customers are mid-sized enterprises where an initial sales
       call is typically expected. However, we will be updating our main
        webpage soon to have some self-service options.  For a bit of
        a deeper dive on the measurement engine, we have docs here:
        https://docs.paigo.tech/  Thanks for taking the time to read!
        Let us know what you hate and maybe what you love :P. We'd
        also love to hear your thoughts and experiences with measuring
        customer usage and usage-based billing!
        
       Author : twosdai
       Score  : 58 points
       Date   : 2022-10-25 17:27 UTC (5 hours ago)
        
       | wizwit999 wrote:
       | Heads up, you forgot to handle confused deputy in your IAM Role
       | policy: (In 'Configuring IAM role' at https://docs.paigo.tech/
       | can't link to a page), which means anyone can pass a role (e.g.
       | for another user) and you'll assume it.
       | 
       | Check out
       | https://docs.aws.amazon.com/IAM/latest/UserGuide/confused-de...
       | for how to handle it. You need to require and use an
       | 'ExternalId'.
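        | 
        | Roughly, the cross-account trust policy should require a
        | per-customer external ID in its condition, e.g. (the account
        | ID and external ID below are placeholders):

```json
{
  "Version": "2012-10-17",
  "Statement": [{
    "Effect": "Allow",
    "Principal": { "AWS": "arn:aws:iam::111122223333:root" },
    "Action": "sts:AssumeRole",
    "Condition": {
      "StringEquals": { "sts:ExternalId": "customer-unique-id" }
    }
  }]
}
```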
        
         | twosdai wrote:
          | Drat, we even have the option in the API to pass it in and
          | use it. It just didn't get propagated everywhere else.
         | 
         | Thanks for that callout, I'll update everything ASAP.
        
       | ignoramous wrote:
       | Much needed! Congrats on the launch, Daniel and Matt.
       | 
       | > _Matt and I came to this project after we built similar
       | internal billing systems at previous jobs and we realized how
       | error-prone these systems can be--one incident might have even
       | undercharged a client by a few million dollars!_
       | 
        | A BigCloud provider (no points for guessing) I worked for
        | found out that they were undercharging customers due to a
        | bug, and so,
       | they fixed the bug for new customers, but continued to
       | undercharge customers grandfathered in.
       | 
       | > _However this allows us to Pull, any custom metrics and
       | dimensions directly from your Datastore._
       | 
        | Most SaaS providers would rather _push_ data than have it
        | _pulled_, is what I'd imagine. Are you hearing otherwise
        | from folks
       | you've been speaking with? For instance, in _serverless_
       | environments (which is the poison of choice for me, at least),
       | _pull_ is much harder to accomplish, even where possible.
       | 
       | > _All of this data is then processed and sent to our backend
       | usage journal, where we store it in an append-only ledger
       | pattern._
       | 
       | Apparently, a BigCloud, in perhaps a case of NIH, ended up
       | creating a highly-parallel event-queue as a direct result of the
       | scale it was dealing with: https://archive.is/IUKvT Curious to
       | hear how you deal with the barrage of multi-dimensional events?
       | 
       | > _Additionally, we also help you understand your cost to serve
       | your clients' usage, and this data allows us to provide your SaaS
       | with usage based billing._
       | 
       | 2 cents: Fly.io _Machines_ is a tremendous platform atop which I
       | fully expect businesses to build multiple successful SaaS
        | products; maybe that's one niche for you folks to focus on and
       | own.
       | 
       | > _We bill based on invoiced revenue (surprise surprise its usage
       | based) and we have a platform fee, roughly it breaks down to 1%
       | of invoiced revenue on Paigo._
       | 
       | This sounds a bit _steep_. I know for a fact that _togai.com_ are
       | also in private beta (their choice of datastore is TimescaleDB,
       | and event-store is NATS), but unsure what their pricing model is;
        | I'd be surprised if it is the same as _paigo_'s.
        
         | twosdai wrote:
         | Thanks so much!
         | 
         | > Most SaaS providers would rather push data than have it
         | pulled, is what I'd imagine. Are you hearing otherwise from
         | folks you've been speaking with? For instance, in serverless
         | environments (which is the poison of choice for me, at least),
         | pull is much harder to accomplish, even where possible.
         | 
          | We totally offer push-based as well; all of our workers
          | use the same API endpoint to push the data we collect.
          | It's just not the headline, since other providers already
          | offer push.
         | 
          | We went down the pull path because, during our discovery
          | process about 6 months ago, we chatted with some DB and
          | infra companies who had just built out measurement and
          | integrated with a billing provider. All of them mentioned
          | being annoyed by the amount of engineering commitment it
          | took to measure, persist, and then transmit the usage data
          | to the provider, which handled the rest. So we wanted to
          | offer a pull-based solution to address this need.
         | 
          | You're totally right that it's architecture-dependent, and we
         | don't want to cause a huge load (and cost) on a serverless
         | platform. So for some dimensions, push is definitely an option.
         | 
         | > Apparently, a BigCloud, in perhaps a case of NIH, ended up
         | creating a highly-parallel event-queue as a direct result of
         | the scale it was dealing with: https://archive.is/IUKvT Curious
         | to hear how you deal with the barrage of multi-dimensional
         | events?
         | 
          | So to dive into more technical detail: we have an event
          | queue that our workers drop data off to, and workers
          | reading from the queue persist it into our ledger. The
          | queue is a managed service hosted by a major cloud
          | platform, similar to Kinesis.
         | 
          | For the different dimensions, we have a standard data
          | format they need to be in before we can persist them. This
          | transformation typically occurs on the client side, though
          | in some cases we can transform the data from an open
          | standard format (Prometheus:
          | https://prometheus.io/docs/concepts/data_model/) to our
          | backend format.
         | 
         | At a criminally high level, this data format consists of a
         | measurement, a value, a field, and a set of metadata tags. Our
         | ledger is built on a schema-less Timeseries DB, so it doesn't
         | matter if the same measurement has a different set of metadata
         | from another. This gives us a boat ton of flexibility when it
         | comes to how we want to query data.
         | 
          | For the different types of dimensions and their different
          | data types, it becomes an issue when you want to aggregate
          | on them. For instance, you may want the total of one
          | dimension, while average or count wouldn't make any sense
          | for it.
         | 
          | To get around this, clients need to tell us what
          | aggregation method to use per dimension.
         | 
          | This really isn't present in the demo, since it's a fairly
          | simplistic version of the whole app, but it's a
          | requirement we
         | have implemented into the API.
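          | 
          | As a very rough sketch of what that looks like (the field
          | names here are illustrative, not our actual schema):

```python
from collections import defaultdict

# Hypothetical record shape: a measurement, a value, a field, and a
# set of metadata tags. Names are illustrative, not Paigo's schema.
records = [
    {"measurement": "api_requests", "field": "count", "value": 120,
     "tags": {"tenant": "acme", "region": "us-east-1"}},
    {"measurement": "api_requests", "field": "count", "value": 80,
     "tags": {"tenant": "acme"}},  # tag sets may differ per record
    {"measurement": "pod_cpu_seconds", "field": "cpu", "value": 3.5,
     "tags": {"tenant": "acme", "pod": "worker-1"}},
]

# Each dimension declares which aggregation method makes sense for it.
AGGREGATIONS = {
    "api_requests": "sum",      # total requests billed
    "pod_cpu_seconds": "sum",   # total CPU time billed
}

def aggregate(records, methods):
    """Group records by measurement, then apply the configured method."""
    grouped = defaultdict(list)
    for r in records:
        grouped[r["measurement"]].append(r["value"])
    out = {}
    for measurement, values in grouped.items():
        method = methods.get(measurement, "sum")
        if method == "sum":
            out[measurement] = sum(values)
        elif method == "average":
            out[measurement] = sum(values) / len(values)
        elif method == "count":
            out[measurement] = len(values)
        else:
            raise ValueError(f"unknown aggregation: {method}")
    return out

print(aggregate(records, AGGREGATIONS))
# {'api_requests': 200, 'pod_cpu_seconds': 3.5}
```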
         | 
         | > 2 cents: Fly.io Machines is a tremendous platform
         | 
         | Thanks for the hot tip, looks awesome! I'll definitely check
         | this out and how we can sneak it into our product.
        
       | lukeqsee wrote:
       | Congrats on launching a product of this complexity! Best of luck.
       | 
       | I'm staring down the barrel of a potential usage pricing
       | implementation, and I'm glad the majority of the foundational
       | work is already done. It'd be no cakewalk to implement from
       | scratch.
       | 
       | How do you generally address the risks of read access, GDPR, and
       | other similar security and privacy concerns related to your
       | technical model?
        
         | twosdai wrote:
          | Thanks so much! Yeah, it's a lot broader than we initially
          | thought it would be, but we love backend programming, so
          | it's a joy to work on (most of the time).
         | 
          | For the security concerns, at a policy level we're
          | currently going through SOC 2 compliance. In the future
          | we'll pursue FedRAMP compliance as well, but we haven't
          | started on that.
          | 
          | Regarding read-access risks, the client has complete
          | control in their account over exactly what access they
          | give us via IAM. Additionally, right now our application
          | is multi-tenant behind the scenes; however, early on Matt
          | and I saw that we'd need to offer a single-tenant solution
          | for increased data privacy, so we've designed the system
          | with that in mind. It's not a major lift for us to provide
          | a single-tenant, isolated environment within any region
          | the data needs to reside in. It just hasn't come up as a
          | concern for us yet.
         | 
         | As an aside, we had some background working with government
         | agencies where this was a major concern for them and single
         | tenant region localized storage was frequently table stakes for
         | deals.
         | 
          | For GDPR, we don't store or process PII, which sounds kind
          | of insane saying it out loud, but it's true. We integrate
          | with end payment providers like Stripe and AWS
          | Marketplace, where all we report is a UUID that their
          | platform associates with the end customer's billing info,
          | which we never need to see or touch.
         | 
         | Now it is possible that someone could manually enter client PII
         | into the platform in which case we would need to deal with
         | that, but it has yet to come up. If it did, we have API
         | endpoints which can delete all data pertaining to specific
         | clients by request.
         | 
         | I suspect in the future that this will change (we may start
         | persisting PII), and we will need to have a more cohesive
         | strategy for dealing with right to be forgotten, but in the
         | near term it hasn't come up.
        
           | lukeqsee wrote:
           | Great coverage on those answers. Thanks!
           | 
           | One followup: have you considered handling the actual usage
           | calculation and aggregation? One use case that comes to mind
           | is accepting API request logs from
           | (CloudFlare|CloudFront|Logstash), processing them, and
           | directly deriving billing from those. That moves the entire
           | process outside of a system your customer has to touch (in
           | cases where complex application-layer knowledge isn't
           | needed). One less thing for a potential customer to worry
           | about (and _removes_ one of the reasons to need read access
           | to a database in many cases).
           | 
           | Again, all the best! Happy hacking. :)
        
             | twosdai wrote:
             | > have you considered handling the actual usage calculation
             | and aggregation?
             | 
              | So for aggregation, we definitely do some of that work
              | already: we enable clients to aggregate their raw data
              | with a few different methods, like a running total,
              | average, or count, over any arbitrary time frame.
             | 
             | For us doing the actual calculation of the dimensions at a
             | specific time, we haven't thought about it, but we're
             | always interested in building more so we might be able to
             | prototype something.
             | 
              | Your specific example, though, sounds like another way
              | for metrics to be pushed into our platform. That
              | really wouldn't be a big problem for us to implement,
              | and we've even floated the idea of exposing a stream
              | that clients could push data to, which we would
              | process.
             | 
             | Given that the logs would be in an open standard format, we
             | could definitely do that and it sounds like a good idea. :)
             | Thanks for the suggestion.
        
       | nedwin wrote:
       | FYI your "stateless and signupless demo" link isn't linking and
       | copying/pasting the link didn't work.
        
         | twosdai wrote:
         | Thanks for the callout, that sucks. I shouldn't use URL
         | shorteners :/
         | 
         | https://paigo.retool.com/embedded/public/99bd5e9a-af3c-4e9c-...
         | 
         | Here is the expanded link.
         | 
         | And here is a fun HN specific one linking to the same thing:
         | https://hn.paigo.tech/
        
           | mritchie712 wrote:
           | there's nothing in there, all the tables are empty
        
             | twosdai wrote:
              | Yeah, so to utilize the platform, during the same
              | session you'd need to create an offering and a
              | service, during which you'd provide an IAM role for
              | your account. Paigo can then go read usage data for
              | you and give you your usage and cost.
             | 
              | I walk through it all during the first 3 minutes of
              | the YouTube video if you want to follow along.
             | 
             | https://www.youtube.com/watch?v=T6J1Yh8GhdU
             | 
             | For creating the IAM role, we have some information under
             | https://docs.paigo.tech/
             | 
             | If you get stuck or have issues feel free to ping me!
        
       | pumpitup wrote:
       | Congrats Daniel and Matt - great to see you guys hit this
       | milestone :)
        
       | saurik wrote:
       | > Matt and I came to this project after we built similar internal
       | billing systems at previous jobs and we realized how error-prone
       | these systems can be--one incident might have even undercharged a
       | client by a few million dollars!
       | 
       | This honestly doesn't lend me the confidence in your solution
       | that I assume you think it does ;P. "We've done this before, and
       | dude... we failed at it SO BAD, you just don't understand: this
       | is HARD. So, instead of doing it yourself, you'd be much better
       | off out-sourcing that effort... to us." :( It is a subtle
       | difference, I admit, between versions of this pitch that work and
       | the ones that don't, but this one just didn't work for me.
        
         | dang wrote:
         | I edited that sentence when I was helping these guys with their
         | text and it's possible I introduced a misleading connotation.
         | The way I understood what they originally wrote is that they
         | had observed such a billing lapse inside some organization, but
         | not that they were responsible for it.
         | 
         | On the other hand, even if so, there's the classic proverb
         | about 'expensive training' etc...
        
           | twosdai wrote:
           | No worries :)
           | 
            | Yeah, to add more color to this: we just observed the
            | failure of a process and system we weren't a part of.
            | Basically, there were manual elements in the billing and
            | aggregation of usage data on a SaaS platform, and those
            | were forgotten about over the course of many months,
            | which led to the company underbilling.
        
       | cdolan wrote:
       | Sounds useful. Not sure I'd trade a 1% reduction in revenue
       | across the board for this though.
        
         | twosdai wrote:
          | For transparency, we're at an early stage, and it's much
          | more useful for us to have more clients using and working
          | with the software--reporting bugs, telling us what they
          | need, etc. We're definitely open to negotiating on price,
          | and we don't want it to be a blocker for people to use the
          | platform.
          | 
          | That being said, we still want to charge something to
          | indicate a level of seriousness of use.
          | 
          | For some context on how we arrived at the figure: ~1% is
          | at the low end of the ballpark for what we saw companies
          | spend internally on billing systems built in house, and
          | it's slightly higher than, but still in the same range as,
          | other billing providers.
        
       | the_duke wrote:
       | A technical question:
       | 
       | I'm curious how you ship, aggregate and store usage data in a
       | resilient (network partition tolerant), scalable and cost
       | efficient way.
       | 
        | Your documentation doesn't seem to peek under the hood.
       | 
       | (Disclaimer: I'm currently building such a system, but using a
       | third party wouldn't be viable in that context)
        
         | twosdai wrote:
         | Great question.
         | 
          | For the resiliency part, our workers/agents use a
          | write-ahead log (WAL,
          | https://en.wikipedia.org/wiki/Write-ahead_logging) to
          | track what data has been collected and sent. When agents
          | or workers fail and need to be restarted, they read from
          | the log to make sure the data was sent appropriately. For
          | a good starting point, I recommend looking at the
          | Prometheus agent for agent design and construction;
          | they've implemented a lot of resiliency into their agents,
          | and if you're familiar with Go, forking their work might
          | be a good starting point.
         | 
          | Additionally, all the off-the-shelf products we use to
          | process and transmit the data guarantee "at least once"
          | delivery, which realistically, under the hood, means a WAL
          | as well.
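          | 
          | A toy illustration of the pattern (not our actual code):
          | append each batch to the log before sending, append an ack
          | after a successful send, and on restart resend anything
          | unacked.

```python
import json
import os

# Toy write-ahead log for a metrics agent -- an illustration of the
# pattern described above, not Paigo's actual implementation.
# Batches are appended to the log before sending; an "ack" entry is
# appended after a successful send. On restart, any batch without an
# ack is resent, which gives at-least-once delivery.
class AgentWAL:
    def __init__(self, path):
        self.path = path

    def append(self, batch_id, batch):
        # Record the batch *before* attempting delivery.
        with open(self.path, "a") as f:
            f.write(json.dumps({"op": "send", "id": batch_id,
                                "batch": batch}) + "\n")

    def ack(self, batch_id):
        # Record that the batch was delivered successfully.
        with open(self.path, "a") as f:
            f.write(json.dumps({"op": "ack", "id": batch_id}) + "\n")

    def unacked(self):
        # Replay the log; whatever was sent but never acked must be
        # retransmitted after a crash.
        pending = {}
        if not os.path.exists(self.path):
            return []
        with open(self.path) as f:
            for line in f:
                entry = json.loads(line)
                if entry["op"] == "send":
                    pending[entry["id"]] = entry["batch"]
                else:
                    pending.pop(entry["id"], None)
        return list(pending.values())

path = "/tmp/agent-wal-demo.jsonl"
if os.path.exists(path):
    os.remove(path)  # start from a clean log for the demo
wal = AgentWAL(path)
wal.append(1, [{"metric": "cpu", "value": 0.5}])
wal.ack(1)  # batch 1 delivered
wal.append(2, [{"metric": "mem", "value": 128}])
print(wal.unacked())  # only batch 2 is pending and must be resent
```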
         | 
          | For scalability, our agents are deployed into serverless
          | and managed components from some SaaS providers--Confluent
          | and AWS right now. These components have some autoscaling
          | built in, like AWS Lambda.
         | 
          | But for some of our measurement components, we're
          | utilizing a Kubernetes cluster with our custom workers.
          | Right now we just statically provision them; however, we
          | have the ability to autoscale infra based on node resource
          | utilization (using AWS ASGs), and for our pods we
          | currently scale horizontally based on resource
          | availability, like CPU and memory consumption, using the
          | k8s horizontal pod autoscaler:
          | https://kubernetes.io/docs/tasks/run-application/horizontal-...
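          | 
          | For reference, a minimal HPA manifest for that kind of
          | worker deployment might look like this (names and
          | thresholds are placeholders, not our actual config):

```yaml
apiVersion: autoscaling/v2
kind: HorizontalPodAutoscaler
metadata:
  name: usage-worker        # placeholder name
spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: usage-worker      # the worker Deployment to scale
  minReplicas: 2
  maxReplicas: 10
  metrics:
    - type: Resource
      resource:
        name: cpu
        target:
          type: Utilization
          averageUtilization: 70
    - type: Resource
      resource:
        name: memory
        target:
          type: Utilization
          averageUtilization: 80
```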
        
           | [deleted]
        
       ___________________________________________________________________
       (page generated 2022-10-25 23:00 UTC)