[HN Gopher] Reliability: It's Not Great ___________________________________________________________________ Reliability: It's Not Great Author : bishopsmother Score : 518 points Date : 2023-03-06 17:47 UTC (5 hours ago) (HTM) web link (community.fly.io) (TXT) w3m dump (community.fly.io) | revskill wrote: | Fly.io seems like Vercel 1.0 (where you can just deploy docker | image and done), but it's more than that, with configurable | volumes, secrets,... | | I'm bullish on fly.io. | [deleted] | skywhopper wrote: | Interesting issues. Nothing surprising for anyone who's run a | global SaaS before, especially if growth has been incredibly | fast. I find the gripes about Consul, Nomad, and Vault | interesting since it sounds like the problems are mainly due to | poor architectural decisions. Fly is rewriting those tools rather | than invest in deploying them properly and in the process are | running into new issues that those tools have already solved, | which doesn't give me confidence that the path forward will be | any less bumpy. | victorbjorklund wrote: | Great post. Love how transparent Fly is. Im a customer but | without any life and death important apps. And yea they had some | issues lately so good they are addressing it. | tebbers wrote: | I really feel for Fly, as a potential customer. They are trying | really hard. I would still love to use them one day and this post | is definitely a step in the right direction. Growing is painful | but they have smart people working there so fingers crossed that | they sort this ASAP and it doesn't become existential. | ChrisMarshallNY wrote: | Well, I feel for them. Scaling up is a bitch. | | I've been lucky, in the past, but a lot of that, is because I | have "overengineered," and the tools/frameworks have advanced to | meet the new demand. | | I am in the middle of a complete, bottom-to-top rewrite of the | app we've been developing for the last couple of years. 
It's | going great, but making this leap was a fraught decision. | | It's mainly, so I wouldn't have to write a post like that, in a | year or two. | | We spent all the time refining it, until we had what we wanted, | and it worked great on our small test team. | | Then, I loaded up a test server with 10,000 fake users, and | tossed the app at that. To be fair, we don't think we'll have | even that many users for quite a while. It's a very specialized | demographic. | | * SOB * | | It no do so well. | | At that point, I had to decide whether to fix the issues (they | were quite fixable), or revisit the architecture. | | The main issue with the architecture, was that it was an | "accreted" app, with changes gradually being factored in, as we | progressed. The main reason for this, is because no one really | knew what they wanted, until we ran it up the flagpole (sound | familiar?). | | The business logic was distributed throughout the app. That was | ... _ugly_. | | I envisioned myself, a year or two down the road, sucking on a | magnum, because the app had turned into a Cruftosaurus, and was | coming for me in my nightmares. | | So I decided to rewrite, as we hadn't done any kind of MVP or | public beta, so we actually had the runway to do this. | | I refined the entire business logic of the app into a single, | platform-agnostic SPM module, which took just over a month, and | have started to develop the app around that. It's pretty much a | rewrite, but I am recycling a lot of the app code. We also | brought in our designer, and he's looking at every screen I make. | It's working well for him. | | Like I said, it's going great. Better than I expected. | | I know that I have a huge luxury, and I'm grateful. I can credit | a lot of that, to doing some stress-testing before we got to a | point where we had a bunch of users to support. I was able to go | in, and go all Victor Frankenstein on the model. 
| | The result, so far, is that this thing screams, and you don't | really even notice that there's that many users on it. The model | has already been proven (that SPM module), and all we're doing, | is chrome (which is a _ton_ of work). | bbkane wrote: | Hope this isn't a dumb question, but what's an "SPM module"? | [deleted] | soperj wrote: | For django, they should really contribute to 2 scoops django | cookie cutter program, so that you can get an out of the box | django instance that can just deploy to Fly.io. | nilsbunger wrote: | Their problem right now isn't adding more customers - they seem | to have more than they can handle! | | If I were them I'd focus as many resources as possible on | making the stack rock-solid, and away from acquiring more | customers or adding more capabilities. | | In fact I'd try to down-scope some features if at all possible, | like the example they give of disabling app deploys while | they're doing platform updates. | | We use fly.io at a small scale and it's worked really well for | us, but the money is in customers at a larger scale who must | have 100% reliability. | llimllib wrote: | https://github.com/ehmatthes/django-simple-deploy will deploy | to fly (or platform.sh) out of the box, should be pretty much | the experience you're describing | soperj wrote: | thank you! | mixmastamyk wrote: | May be wrong, but they seem to be very focused on interesting | stacks like Elixir and RoR while building on Go/Rust. The | corollary being neglect of the bread/butter stacks with high | market share, like Python and Java/JavaScript. Don't think I've | seen a blog post discussing those three beyond a passing | mention? | | Not the end of the world, but mildly disappointing. At least | they are all in with Postgres and Linux, a great foundation. | jjtheblunt wrote: | I thought the only thing their tech stack was all in on was | docker images being rewritten to run on firecracker as a | substrate. 
| | is it not agnostic about things like Elixir etc, at the tech | level, though they've got super nice documentation for those | tools you mentioned? | ilaksh wrote: | What's necessary to change for them to run on Firecracker? | ign0ramus wrote: | Here's a post explaining how they do it | https://fly.io/blog/docker-without-docker/ | ilaksh wrote: | Yeah, saw that before. I thought he was saying you have | to change your Dockerfiles to use them with fly.io. Just | misinterpreted that sentence. | [deleted] | tomwojcik wrote: | I don't disagree with you, but they tweeted | | > We'll readily admit our docs still have a Django-shaped | hole in them. | | https://twitter.com/flydotio/status/1578039196618575874?t=nu. | .. | ilaksh wrote: | You can use any Docker image. | tomwojcik wrote: | I needed a cookie cutter for my side projects so I've created | one for sqlite that I actively use and a similar one for | postgres. Both are very basic, contributions welcome. | | Sqlite https://github.com/tomwojcik/django-fly-sqlite-template | | Psql https://github.com/tomwojcik/django-fly-postgres-template | nu11ptr wrote: | Not a client of fly.io, but dang impressive for the company to be | this open and honest. Definite respect - wish more companies were | like this. It puts them on my short list almost immediately for | future needs. | nathell wrote: | Came here to say pretty much this. Technical issues come and | go; open communication is a core part of the company culture, | and builds up trust. | samwillis wrote: | Fundamentally I think some of the problems come down to the | difference between what Fly set out to build and what the market | currently want. | | Fly (to my understanding) at its core is about _edge_ compute. | That is where they started and what the team are most excited | about developing. It 's a brilliant idea, they have the skills | and expertise. They are going to be successful at it. 
| | However, at the same time the market is looking for a successor | to Heroku. A zero-devops PaaS with instant deployment, dirt | simple managed Postgres, generous free level of service, lower | cost as you scale, and a few regions around the world. That isn't | what Fly set out to do... exactly, but is sort of the market they | find themselves in when Heroku then basically told its low-value | customers to go away. | | It's that slight misalignment of strategy and market fit that | may result in decisions being made that benefit the original | vision, but not necessarily the immediate influx of customers. | | I don't envy the stress the Fly team are under, but what an | exciting set of problems they are trying to solve, I do envy | that! | decidertm wrote: | I'm a co-founder at Northflank. This is what we've spent 3+ | years building. https://northflank.com. | | I am sympathetic to much of Kurt's post. We spent a long time | building solutions to several of the areas highlighted (managed | PG, persistent volumes, secret management and service | discovery). | | Making radical changes to architecture on a live cloud platform | is always a challenge. | | On the front-end Northflank is a next-gen PaaS built for high | DX, speed, and powerful capability (real-time UI, API, CLI, | GitOps, IaC). | | Our backend is built using Kubernetes as an OS: providing a | huge amount of flexibility on service discovery, load- | balancing, persistence/volumes and scale. | | The benefit of using Kubernetes is a universal API across all | major cloud providers. We can scale clusters and regions across | EKS, GKE and AKS in seconds, either in our managed PaaS or | inside our customer's own cloud account. | | Our managed dataservices: MySQL, Postgres, Redis, Mongo, Minio | are all built using Kubernetes Operators with a small but | mighty team.
| | From a generous free tier to autoscaling to managed postgres | and other advanced PaaS/DevOps automation workflows Northflank | offers something unique. | trilobyte wrote: | > generous free level of service, | | This is likely the biggest culprit for a lot of these | companies. Too many of us have grown up in the culture of | getting hosting and platform for "free", but at some point the | companies providing it still have to pay the bills. There has | to be a better pricing model that let's someone deploy their | relatively small, low-traffic app for $10s/month or even $200 - | $300 / year for the basics (e.g. - Heroku free tier type | capabilities). It's not going to save these companies but it | would limit excessive growth of their own costs from a free | tier while at the same time still being affordable for 1 - 2 | person teams who are trying to get something in front of users. | karmelapple wrote: | I agree. And I know this is unpopular, but I think none of | these companies should be expected to have a free tier. A | low-cost tier? Certainly. Perhaps even a free trial with a | credit card? Great. | | But our team, who has used Heroku for over a decade, got bit | multiple times by Heroku having a free tier. | | Why were we impacted by other apps? Because Heroku's load | balancers are shared amongst all their apps. That includes | all the sketchy apps running on the platform. | | If Heroku could somehow isolate us from everyone else? Great | - and they offered that for awhile with a reasonably-priced | Add-On supported by them called SSL Endpoint. It cost about | $15/month and put us into a pool that was shared with other | folks willing to spend that much per month to run their app. | | I understand that's not great for a hobby project. 
But for | those of us trying to run a large product on Heroku and not | have to spend multiple extra thousands of dollars every month | for a Heroku Private Space, this was a great way of pooling: | put a small fee in place for one pool of resources. Not many | malware writers or other misbehaving app creators will | probably want to spend that much per month. | | But they axed that a few years ago. Only a couple months | after when we were thrown back into the load balancer pool | with all the other free apps, one of the IPs was marked as | spam and we had to figure out a kind of janky solution. | | Additionally, Heroku seemingly spent a ton of resources on | free tier support, malware fighting, etc. I hope to see more | features on Heroku since they've dropped that support... but | I haven't seen much evidence of that in roughly six months | since they did that. But we'll see. | trilobyte wrote: | Nice write up! | | I wish I shared your enthusiasm for where Heroku could go | but I have a few friends at Salesforce I've asked about how | they see Heroku internally and it really doesn't seem like | it is going to get much love. Hope to be wrong though. | karmelapple wrote: | Thanks! I have talked with two Heroku folks who say (to | me, a paying customer of Heroku Enterprise) that Heroku | is absolutely in active development. | | I let them know they need to demonstrate that to me. They | have a roadmap [1], but it seems to have barely anything | moving forward, including some really important concepts | like http/2 support. | | [1] https://github.com/orgs/heroku/projects/130 | ignoramous wrote: | > _And I know this is unpopular, but I think none of these | companies should be expected to have a free tier._ | | Free tier is a GTM motion which makes sense for novel tech | products like Fly because: https://en.wikipedia.org/wiki/Te | chnology_adoption_life_cycle | mike_hearn wrote: | If you don't need all that much, Oracle Cloud does offer a | free tier for VMs. 
You get 2 AMD VMs or 4 ARM VMs and even a | free Oracle DB, object storage, load balancing and | monitoring. | | https://www.oracle.com/cloud/free/ | | It's still just a free tier so you can't expect good support, | but, it's there. | dcow wrote: | Check out DO app platform. It's literally exactly what you | describe. | ec109685 wrote: | The CloudFlare folks wrote a good blog post on how they are | seeing their customers use Edge compute -- latency is far down | on the list: https://blog.cloudflare.com/cloudflare-workers- | serverless-we... | everybodyknows wrote: | Hmm, that post is almost three years old -- still accurate? | prdonahue wrote: | Yes, especially as compliance and regulatory frameworks | continue to evolve and become more difficult to adhere to | as mentioned elsewhere in the comments. | | We're inherently faster than other "serverless" platforms | due to the scale and homogeneous design of our network, and | that network has presence in nearly 50% more cities than it | did just 3 years ago. We were plenty fast enough then and | we're even faster now. | | Other things that customers (still) really care about: | developer experience, ease of use, and cost. Nobody likes | paying the AWS tax to move data around--they just want to | use the best solution from the best cloud provider. Workers | and the associated storage primitives allow them to pick | and choose from the best that AWS, Azure, Cloudflare, GCP, | et al. have to offer. | | (Disclaimer: I'm a long time Cloudflare employee focused on | App Sec, and I speak to customers regularly who look to | Workers largely for compliance reasons, but I don't work on | the Developer Platform business. Am sure my Dev Platform | peers will chime in with more nuanced answers!) | fmajid wrote: | The US CLOUD Act means a EU customer cannot use a US cloud | provider to host PII, even if the server itself is physically | in the EU, because US law will still compel the provider to | yield the data to US authorities. 
The European Commission is | trying to paper over the cracks with a fig leaf of judicial | review, but it's only a matter of time until a Schrems III | decision from the CJEU invalidates that polite fiction. | LunaSea wrote: | The amount of EU companies following this law is exactly 0. | speedgoose wrote: | It's not true. I know people who lost contracts because | they were using Azure and the customer wanted to respect | the law. | LunaSea wrote: | I've talked with companies like that as well and they | start with strict rules and end up allowing clouds | because no solution is compliant anyway. | pjmlp wrote: | I can attest that there are a lot more than zero in | Germany. | LunaSea wrote: | I would be glad to be shown a company with AWS, Google | Chrome, Google Search, Slack and all the usual suspects. | e12e wrote: | This simply isn't true. At least not for EEC(Norway). | LunaSea wrote: | I have never seen a company without Google Search, Google | Chrome, AWS, Microsoft 360 and the lot. | | Which alternatives are they based on? | [deleted] | huijzer wrote: | Please tell the legal department of our uni. I'm stuck | with a home-made Kubernetes cluster where I have to mail | the admins for provisioning, SSL and domain management. | Would love to switch to Fly or Render | exac wrote: | I know I've personally spent a large portion of my time | updating systems to be compliant in the last few years, | in North American companies. | mro_name wrote: | might well have been yak shaving. If a company is under | US jurisdiction it simply cannot comply to EU data | protection. | mananaysiempre wrote: | ... Are those North American companies prepared to | willingly break EU laws then? Because in my (amateur) | understanding it's logically impossible to satisfy both | CLOUD Act requirements and EU data protection ones (not | just GDPR, but general due-process rights the CJEU | considers required for privacy violations and US courts | deny noncitizens). 
| mrkurt wrote: | This is, indeed, the exciting part. As Heroku fans, we never | really felt like it needed a replacement. And if it did, it | seemed like Render was the natural Heroku v.next. | | One thing we've noticed, though, is that people do actually | want Heroku but close to users. It's not exactly edge compute. | In some cases, it's "Heroku in Tokyo". In others it's "Heroku, | but running in all the english speaking regions". | | I think the thing that ate up most of our energy is also the | thing that might actually make this business work. We built on | top of our own hardware. That's the thing that made it | difficult to build managed Postgres. We put way more energy | into the underlying infrastructure than most nu-Heroku | companies. The cost was extreme, but I'm like 63% sure this was | the right choice. | jrochkind1 wrote: | > As Heroku fans, we never really felt like it needed a | replacement. | | If Salesforce kept investing in heroku, it might not. But | there is a huge loss of confidence in heroku's future going | on among heroku's customers right now, which is part of what | you're seeing, as I'm sure you know. (Also I think to some | extent you are being political/kind towards heroku... if | heroku's owners were still investing in heroku for real, | adding 'edge' functionality like fly.io is focusing on is | what one would probably expect...) | | And frankly... your tool seems more mature and... not to be | rude to competitors but seems to have more of that certain | `je ne sais quoi` of Developer Experience Happiness that | heroku _used_ to have and other competitors including the one | you mention don't really quite seem to have yet. Does what | you expect/need in a polished and consistent way. | | That work you put into the underlying infrastructure | definitely shows, and was the right choice. | | So I understand why people are looking to you as a heroku | replacement. I am too! 
(And I don't really need the edge | compute stuff; although I could potentially see using it in | the future, and it shows you folks are on top of things). | | And while I kept reading you saying on HN comments that you | _didn't_ want to be a heroku replacement, so were | unconcerned with the few places people were mentioning where | you still fell short of it -- when I saw your investment in | Rails documentation and tools (and contribs back to Rails), I | thought, aha, I think they've realized this is a market | _looking for them_, which they are only a couple steps from | and it would make sense to meet. | | When you mention in OP a "heroku exodus" to you... I'm | curious if that was all people who left when heroku ended | _free_ tier stuff, and they've all come to you for your free | tier stuff... because that does seem dangerous, such a giant | spike in users who are not paying and don't bring revenue | with them! I don't personally use very much heroku free tier | stuff. I hope if that's a challenge, it's one you can get | over. I don't think you are under any obligation to offer | free stuff that can be used for real production workloads | indefinitely -- although, as I'm sure you know, free stuff is | huge for allowing people to try _before_ they buy, and | whatever limits you put on it to try to prevent indefinite | production use get in the way of someone's "try before you | buy" too... and at this point, _reducing_ your free offerings | is a lot harder PR-wise than having started out with less in | the first place. :( | KRAKRISMOTT wrote: | Just partner with Neon or other similar companies in this | space. Scale-to-zero replicated databases are well-understood | technology. | | https://neon.tech/ | nikita wrote: | Yes, we are partnering with many companies like Hasura and | Replit to help with managed Postgres. Since Neon scales to | 0 and also autoscales to your workload it is very economical | for the long tail of low usage customers.
| mattbillenstein wrote: | Yeah, distributed systems at the global scale are very very | difficult - at least with the Heroku style problem, you'd be | looking at scaling in a single datacenter I think - deployments | to multiple datacenters wouldn't share dependencies. | | I do wonder however if they'd be better off using less l33t | tech - do almost everything on Postgres vs consul and vault, | etc. Scaling, failover, consistency, etc. are better-known | problems and there are a lot more people who've run other DBs at | tremendous scale than the alternatives. | | Simplicity is the key to reliability, but this isn't a simple | product, so idk. | ghiculescu wrote: | I coincidentally tweeted the exact same thing earlier today. | | I selfishly hope Fly puts all their focus toward becoming Heroku | 2.0. I'm sure some people care about all the edge latency stuff | but I don't know many of them. | hiAndrewQuinn wrote: | God, I'm glad someone else sees it as clearly as I do. I | learned about Fly from them acquiring freaking Litestream! The | SQLite replicator! The canonical database at the edge of the | network! Of course that's what they want to do. | bostik wrote: | There's a wonderfully blunt saying that applies here (too): you | are not in the business you think you are, you are in the | business your customers think you are. | | If you offer data volumes, the _low water mark_ is how EBS | behaves. If you offer a really simple way to spin up Postgres | databases, you are implicitly promising a fully managed | experience. | | And $deity forbid, if you want global CRUD with read-your-own- | writes semantics, the yardstick people measure you against is | Google's Spanner. | quickthrower2 wrote: | In a nutshell if you offer cloud services you need to be | better than the MAG clan, Digital Ocean too. And people will | want it dirt cheap.
It's still hard to be a profitable web | host as it always was (MAG has the advantage that none of | them were web hosts at first base) | blowski wrote: | MAG? | jerrygenser wrote: | microsoft apple google | gizmo wrote: | Microsoft (azure) Amazon (aws) Google (gcloud) | adw wrote: | I'm assuming Azure, AWS, Google Cloud, but it's new to me | too | avidal wrote: | From context, I'm assuming Microsoft / Amazon / Google, | referring to Azure / AWS / Google Cloud respectively. | [deleted] | zamnos wrote: | Where does the misalignment between what the customer thinks | they want, and what they actually want fit in to your | philosophy? Google Spanner is a great example of this because | who _doesn 't_ want instantaneous global writes? It's just | that, y'know, there's a ton of businesses, especially smaller | ones, that don't actually need that. The smarter customers | realize this themselves, and can judge the premium they'd pay | for Spanner over something far less complex. What I'm getting | to is that sales is a critical company function to bridge the | gap between what customers want, and what customers actually | need, and for you to make money. | | The first releases of EBS weren't very good and took a while | to get to where we are. Some places still avoid using EBS due | to bad experience back in 2011 when it was first released. | azurelake wrote: | > who doesn't want instantaneous global writes | | I want to gently note since I see a lot of misunderstanding | around Spanner and global writes: Global writes need at | least one round trip to each data center, and so they're | still subject to the speed of light. | mota7 wrote: | Like most things, it's more complex than that, and as a | result it can be either faster or slower than 'median(RTT | to each DC in quorum)'. | | It's a delicate balance based on the locations that rows | are being read and written. 
In the case where a row being | repeatedly written from only one location and not being | read from different location, the writes can be | significantly faster than would be naively expected. | richardhod wrote: | What are the limitations to heroku that people are going to Fly | for? Maybe there's a standard article that would be useful to | read about it? | zamnos wrote: | It's more about Heroku dropping free and low-cost plans, | which is them demonstrating that they don't currently care | about three low end of the market, more than any specfic | feature. | VWWHFSfQ wrote: | > dirt simple managed Postgres | | Heroku PostgreSQL is very simple, yes. But once you need non- | trivial scale it's expensive and extremely non-performant. Even | a medium-sized RDS will outperform Heroku's most expensive | database offering by 20x in my experience. My company doesn't | even run PG on Heroku anymore. We have a VPC/Private Space | connection to AWS Aurora because the cost/performance | difference is so extreme. | snacktaster wrote: | I don't know the details of how Heroku implements their | hosted postgres service, but I'm _guessing_ that it's just a | bunch of PG servers running on EC2 instances. There's | probably a lot of CPU stealing "noisy neighbors" going on. | But yeah, I've also experienced Heroku's PG databases being | dog-slow compared to RDS for the same workloads. | mixmastamyk wrote: | They're probably using older or cheaper instance types. By | not upgrading while charging the same or more over time, | one can skim more profit. | karmelapple wrote: | I have run the experiment, and Crunchy Data's Postgres | servers are 4X more bang for the buck than Heroku's. | | I let some folks at Heroku know this who are product | managers, and they are investigating it... but I would be | shocked if Heroku gets a big performance improvement anytime | in, say, 2023. | | 20X seems like a lot for RDS, though I'd be curious to learn | more! 
We are switching to Crunchy because of that clear | cost/performance difference you mention. | satvikpendem wrote: | I'm going to plug Coolify, an open source Heroku alternative | (with Docker support too) that I'm using on a cheap $5 Hetzner | server which is a lot cheaper than the equivalent Fly or Render | etc service, and it really doesn't have much upkeep from me | even if you add in the time setting up the server initially, | which is like an hour, and afterwards, it Just Works(tm). | | https://coolify.io | notpushkin wrote: | Dokku is also nice and battle-tested: https://dokku.com/ | | And may I also plug Lunni, a self-hosted Docker Swarm-based | PaaS I'm working on right now: https://lunni.dev/ | | Both work pretty well on $5 servers. | satvikpendem wrote: | Lunni looks really interesting! Looks like a Coolify | competitor, I'll definitely check it out. Do you have a | Discord to join? Coolify has one and I found it great to | discuss the project and talk directly to the creator. | | I used to use Dokku but I personally liked the GUI from | Coolify so I've been using that. Nice to see that you have | a GUI as well, makes configuring apps a lot easier. | arjvik wrote: | No experience with either, but how does Coolify compare to | Dokku, the OSS Heroku alternative I've been hearing about | until now? | leishman wrote: | This is spot on. I found myself using Fly for a project because | it was super easy, not because I needed edge compute. TBH it's | still actually unclear to me who needs edge compute? What apps | require this sort of infra? It's not 99% of web apps right? | hinkley wrote: | I still think that in the next pendulum swing we'll end up | with edge computing and (smaller) self-hosted backends. | Everything old is new again, and we haven't entirely | recreated Akamai from first principles yet. | davnicwil wrote: | Personally I see this as a 'why not, if it works' type thing. 
| | Sure you don't _need_ it for 99% of use cases, but if it just | works using familiar architectures then it _is_ also strictly | better for 99% of use cases so you might as well, and people | will naturally want it. | | That 'familiar architectures' part is the hard bit, though. | kevincox wrote: | But it isn't better in 99% of use cases. Lots of use cases | are rendering an API response or HTML page that involves | multiple database requests. Therefore the distance between | database and app server is more important than the distance | between the client and the app server. | | Edge compute can be helpful for static or quite cacheable | content. But often this is handled as well or nearly as | well by a caching CDN. | | So that leaves a few cases where edge compute is useful: | where you are globally distributing the data itself (and | ideally moving the data around as your users travel or | move), which is incredibly rare and expensive to build; and | when you need pure computation that needs no request to | your backend. And if 50ms of latency is important for a pure | computation, most of the time you can just move it to the | client. In my experience these tend to be rare. I would | estimate that edge compute is actually helpful for 1-5% of | projects, not 99%. | vineyardmike wrote: | One of the big benefits of edge compute is that it's | geographically distributed. Doesn't make a big impact across | the US, but globally a lot of nations have specific data | laws, so it's important to host data in the required nation. | Keep customer data in its nation of origin, but have a single | control plane and platform for every data center. | vendiddy wrote: | I can second this. We were evaluating moving off Heroku and to | Fly.io, but we didn't need all of the edge compute stuff. We | just want a better Heroku without having to think about | infrastructure and having to think about edge compute just got | in our way. | | I feel like Next.js is in a similar position.
While their main | vision is SSR, I wonder if they are missing out on a chunk of | the market that simply doesn't want to think about infra. We | use them because we just don't have to worry about webpack or | fiddling with deployment and hosting. We could care less about | SSR and in fact we disabled it app-wide. | alexgrover wrote: | Why would they be missing out? Vercel can host static sites | just fine, whether that's one generated by Next or any other | framework or written by hand | leerob wrote: | One of the key design choices of Next.js was to enable | granularity on the runtime (Node.js or Edge[1]) and the | rendering method (static or dynamic[2]) on a per-route basis. | If you want a full SSR site, that's okay. If you want a full | static site, that's also okay. | | We often see folks wanting a mix of both. For example, maybe | the /about page is static, but the home page is dynamic and | personalized based on the visitor. You can do all of this | with Next.js. Our future direction is adding even further | granularity, enabling this decision at the data fetch level, | allowing you to cache results across deployments[3]. | | [1]: https://beta.nextjs.org/docs/rendering/edge-and-nodejs- | runti... | | [2]: https://beta.nextjs.org/docs/rendering/static-and- | dynamic-re... | | [3]: https://vercel.com/blog/vercel-cache-api-nextjs-cache | sirsinsalot wrote: | Digital Ocean gave me the PaaS replacement and managed PG and I | couldn't be happier. | | If anyone else is looking. | erebe__ wrote: | You can take a look at www.qovery.com It provides an Heroku | like experience but runs on your cloud account (aws, scaleway | or digital ocean). | | They build on existing tech that is already working, so it is | more stable. | ignoramous wrote: | Save for a few "in-preview" features, Fly was stable too but | then they started growing faster than they could keep up (a | good problem to have!). Stability isn't a permanent state. 
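kevincox's point earlier in the thread (that app-to-database distance matters more than client-to-app distance once a request makes several database queries) can be illustrated with a toy latency model. This is only a sketch: the function name and every round-trip figure below are made-up assumptions for illustration, the queries are assumed to run one after another, and server processing time is ignored.

```python
def request_latency_ms(client_app_rtt_ms: float,
                       app_db_rtt_ms: float,
                       db_queries: int) -> float:
    """Toy model: one client<->app round trip plus one app<->db
    round trip per query, with queries issued sequentially."""
    return client_app_rtt_ms + db_queries * app_db_rtt_ms


# Hypothetical numbers, not measurements.
# App server sits next to the database, far from the user.
central = request_latency_ms(client_app_rtt_ms=80, app_db_rtt_ms=1, db_queries=5)

# App server sits at the edge near the user, far from the database.
edge = request_latency_ms(client_app_rtt_ms=10, app_db_rtt_ms=75, db_queries=5)

print(f"app next to database: {central} ms")
print(f"app at the edge:      {edge} ms")
```

Under these assumed numbers the edge placement comes out slower because the long hop is paid once per query rather than once per request. With zero database queries (the static or pure-computation cases mentioned in the thread) the same model favours the edge.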
| vineyardmike wrote: | I agree - fly is so easy to use (when it works) that it's hard | not to be impressed. BUT what I've found is that we don't need | edge compute, since our customers aren't that latency | sensitive, so it's lost on us. It's only a few more | milliseconds to us-east-1. | | I've heard (on HN) of a dozen different companies vying for the | heroku replacement spots and yet Fly seemed to capture the | attention. I couldn't name another one off hand. | | What I truly want and probably lots of other people too is | Flyctl (and workflow) for AWS. The same simplicity to run as | fly, but give me something cheap in Virginia or the Dalles. | latchkey wrote: | > What I truly want and probably lots of other people too is | Flyctl for AWS. The same simplicity to run as fly, but give | me something cheap in Virginia or the Dalles. | | Google Cloud. It is painfully easy to spin up managed | postgres, super easy to deploy gcp cloud functions or gcp | cloud run. It isn't expensive either and just works. | 0cf8612b2e1e wrote: | If someone is not already using the holy trinity | (AWS/Azure/GCP) there is probably a reason. | monsieurbanana wrote: | I'm not using gcp anymore because it's not worth risk | losing access to my personal gmail account just to play | around with pet projects. | | I might be paranoid, but I just don't feel comfortable | when there's so much in play. | guhcampos wrote: | Create a new Gmail account? | vineyardmike wrote: | Google associates different accounts that are from the | same owner when handling issues FYI. So if they think | your account is doing something wrong on GCP, be wary of | associated accounts. | namaria wrote: | Separating concerns, isolating things that are not | related, these are some basic tenets of good engineering. | Yet we all keep rolling the ball of mud downhill and act | shocked it keeps growing and swallowing everything. | 0cf8612b2e1e wrote: | Totally agree with this mindset. 
My digital life is on | the line because Google refuses to separate services. | amluto wrote: | Egress pricing, for one. | | fly.io charges an outrageous 2 cents/GB. Google is over | 4x that. | | At fly.io rates, 1Gbps average over a month is $6400/mo. | Google is tiered and you're looking at over $10k/mo. | | For comparison, a cheap managed switch that can handle | 1Gbps cost about $100, maybe a bit more if you want a | nice one. A nice router is more. You can rent _an entire | rack_ , including power, cooling, and an unmetered 1Gbps | for $300-$1k/mo (with maybe some wiggle room on both | ends). You can buy a pretty nice server, amortize the | price over a week or two, and still come out ahead. | | You certainly get considerable value from a major cloud | provider, and a lot of their other services are | reasonably priced, but, depending on your workload, the | egress prices and the corresponding Hotel California | factor may make using a major cloud provider a poor | proposition. | mattbillenstein wrote: | Do you have a guide in mind? | | If it's sorting and sifting and clicking a bunch of stuff | in the console, that's not painfully simple. If it's some | easy cli commands, I think that's in the ballpark... | cldellow wrote: | Render.com is another spiritual successor of Heroku. I'd love | a world where Fly and Render are both very successful | companies. | vorticalbox wrote: | Render has some great features like making a new sub domain | for when a PR is opened so you can test it as a fully | working API before you merge | alexgrover wrote: | That's supported on most PAAS these days, including | Heroku. | vorticalbox wrote: | On their free tiers though? | [deleted] | alexgrover wrote: | Well, no longer free on Heroku, but it was | rychco wrote: | Yeah I like them both a lot, having tried deploying small | projects on each. 
However, I've defaulted to render at the | moment because I've found it painless for my current | project, and edge compute is low on my list of priorities. | | Though to be fair, even if render collapsed overnight, I | think I'd still be equally satisfied after moving to fly. | te_chris wrote: | Not gonna happen. Both will get acquired because that's how | things work now | jamil7 wrote: | Not sure why this is downvoted, it's a valid point. | anurag wrote: | (Render founder) I'd love to understand why you think | this is the only outcome. Render has positive gross | margin and a clear path to profitability based on both | our growth so far and the tailwinds in this space. I'm | also aware of other companies like ours that have grown | all the way to IPO or are well on their way. | | I'm very explicit both internally and externally that an | acquisition is a failure mode for Render. We're building | this for the very long term and plan to keep it that way. | te_chris wrote: | I guess I'm just default cynical these days seeing how | much money's still floating around and the scale of the | cloud big 3. Apologies, it wasn't personal. I admire your | vision and hope it can work, money always seems to talk | eventually though. We need more companies that have the | nerve to hold on and develop on their own. | jstummbillig wrote: | Unless a company is very explicit about this not being in | the books, I tend to share this outlook. | | From the perspective of a recent founder, it's downright | spooky to build around any SaaS, considering how few of | them have been around for 10+ years, when that is | certainly what our business is aiming for. | | I know (and share the feels): Devs tend to get excited | about the new thing - but if Google Workspace shut down | next month, we would be in so much operational trouble. 
| When other peoples fancies stand in the way of the entire | operation you are responsible for, it actually begs the | question how much closed source SaaS you can allow before | it starts to be quite frankly irresponsible. | | We are not imagining things. SaaS of all sizes shut down | all the time, and when you are heavily relying on them | and building software around them to run a business the | prospect is spooky as hell. | zamnos wrote: | The difference between (free) Gmail and Google workspace | is that workspace is a paid product. If you're big enough | to warrant an AM, you can get terms which include | continuity of business planning if Google _does_ happen | to shut down Workspace. (They won 't.) | manmal wrote: | Is your argument that Workspace is a paid product and | therefore won't be shut down? If yes, let's keep in mind | that Stadia was paid-for too. My trust in the longevity | of Google products has been damaged beyond repair. | giovannibonetti wrote: | The difference is that Stadia was definitely losing | money, whereas Google Workspace might be profitable. | sethammons wrote: | I'm guessing that downvotes come from those who see the | macro environment changing. With increased rates, | borrowing to purchase companies may make less sense. | te_chris wrote: | Macro makes it harder to raise funding too though - VC no | longer as attractive given the risks and higher interest | rates available | morelisp wrote: | These threads from mrkurt a few months ago seem relevant | here - | | https://news.ycombinator.com/item?id=32955520 | | If they are a multiplier for a whole portfolio, there's | not much reason for any particular branch to purchase | them. | | (This post seems like some evidence they might actually | be building the wrong thing, though.) | trunnell wrote: | > Flyctl for AWS | | Have you tried AWS Copilot? I'm having good success with it. | Probably not quite as simple as flyctl, but still it's only | one command to deploy a container. 
| | I would really like fly.io to overcome these hurdles. I bet | they will. | manmal wrote: | I can second that I've seen render.com mentioned very often, | maybe even more so than fly. | gen220 wrote: | > What I truly want and probably lots of other people too is | Flyctl for AWS. The same simplicity to run as fly, but give | me something cheap in Virginia or the Dallas. | | Pardon the ignorance, is this not the Amplify CLI [1] ? | | [1]: https://docs.amplify.aws/cli/ | pid-1 wrote: | No | scubbo wrote: | Can you elaborate? | ctvo wrote: | Things that just work and are delight to use and the AWS | Amplify CLI are not often mentioned together. The Amplify | CLI is a growing collection of poorly thought out, poorly | implemented functionality that looks good in demos, but | falls apart under any close inspection. | zoomzoom wrote: | I think this whole category is interesting, from the next-gen | PaaS to the cloud-native ecosystem. Totally empathize with | how hard what fly is doing in terms of scale and reliability | is. | | At Coherence (withcoherence.com) we're focused on a developer | experience layer on top of AWS/GCP. You might describe it as | flyctl for AWS. | jrochkind1 wrote: | I remain kind of amazed about how heroku managed to pull off what | they pulled off, in the first case. | | Also: | | > The Heroku exodus broke our assumptions. Pre-Heroku, most of | the apps we were running were spread across regions. And: we were | growing about 15% per month. But post-Heroku, we got a huge | influx of apps in just a few hot spots -- and at 30% per month. | | I hadn't before seen anyone with a big picture view confirm a | heroku exodus was happening, although a lot of people _suspected_ | it or had anecdotes. | | But if fly is seeing a pretty enormous number of customers moving | from heroku to fly... oh wait, now I'm wondering, is this mainly | a result of heroku ending _free_ services, and those are free | customers coming to fly for free services? | | If so... 
that's a pretty big burden to take on without revenue to | match, it does seem kind of dangerous for fly. | tiffanyh wrote: | Would the simple solve be that Fly.io just mark any new service | of theirs "beta" for x-months post launch? | samwillis wrote: | None of the services that they have had issues with in the | recent past are new, they have been running for at least a | couple of years. They would need to put a "beta" sticker on the | whole platform for that suggestion to work. | | But the post makes it clear that the issue isn't that they had | problems with new services. It was rapid customer growth before | they had time to scale up the infrastructure as they had | planned to do. | ec109685 wrote: | I wonder what types of RPS they are seeing that required a gossip | based protocol to broadcast state around versus a more | traditional data store. | | I take it that it's far more important that the local region know | about changes than a remote region, which makes a mastered store | in one location as the source of truth problematic. | | I also wonder why these companies don't backstop themselves on | the public cloud? Failing into an AWS seems better than running | out of capacity and some its services could be used in | circumstances where an open source technology isn't ready. | likecarter wrote: | Yeah, reading the post made it seem like they followed "best | practices" without really thinking things through. KISS. | [deleted] | lll-o-lll wrote: | At first I was all like "Ha ha, losers can't scale" | | And then I was "Huh, these technical challenges are actually | pretty difficult" | | And _then_ I was all "crap, these are a bunch of technologies I | was about to add to our stack" | | Thanks heaps fly.io people; having the humility to honestly talk | about the challenges and failures massively helps people such as | myself as we navigate new unfamiliar technologies. 
If more | companies were willing to do this, it'd be a lot easier to avoid | common pitfalls. | chucky_z wrote: | The tech in their stack is still pretty good. Unless you're | supporting tens of thousands of customers and trying to make | the promises that fly makes today. Look at the fly engineer | replies in this thread. | | Also they basically only use OSS versions, they could go give | HashiCorp some money to solve their Vault problems. They could | probably partner with 2ndQuadrant for PG as two examples. | That might not make sense for their business though. | | Hard problems are hard no matter the choice. | lll-o-lll wrote: | Sure, I was going for a little humour there. A little riff on | the whole "we always judge others until we walk in their | shoes". | | The takeaway I was hoping for is "providing insights into | how we struggle helps others" | lopatin wrote: | It sounds like they need more money to scale the shared stack | e1g wrote: | This reads like a mea culpa from an indie hacker, but Fly.io had | 5+ years and raised $40M to get these basic _fundamentals_ right. | And we get promises of a new status page. | sph wrote: | Big companies fail spectacularly as well, so it is refreshing | to read an indie hacker-style mea culpa rather than a pile of nonsense | PR one would expect from a company that raised $40M. | | Honesty pays off in the long run, but it's something businesses | quickly forget past a certain stage. | ignoramous wrote: | Well, that's one way to look at it. | | Fly's been many things over the course of its lifetime [0], but | I believe their latest pivot (on what they call "Machines") is | pretty darn good. I've been using Machines since Oct last year, | and things have gotten better week-over-week. Like with any | platform, Fly has its own idiosyncrasies, which don't take much | to get the hang of. That said, I am the only person in my tech shop | that deals with Fly. 
Some orgs with larger teams and heavier | apps that deploy frequently or run DBs / disks on Fly (I don't) | have had a rough few months; so that's there too. | | [0] Ex A: https://news.ycombinator.com/item?id=13985940 | mbStavola wrote: | Fly Machines, if I understand them correctly, feels like a | step backwards. Sure they might work better than "standard" | Fly apps, but one of the motivating cases for Fly is being | able to effortlessly scale across the world without having a | Ph.D. in CS and a fistful of certs for Cloud engineering. | That vision for Fly is awesome, game changing even, ignoring | their current stability issues. | | Machines isn't that. From the documentation, it appears as | though it's "just" a VM pinned to a single region and none of | the "magic" of Fly really applies. If the server your VM is | hosted on goes down, Fly won't redeploy your container. It's | just downtime. Spinning up in other regions is something you | have to think about and actually _do_. It seems closer to | Heroku than it does Fly. | | Maybe I am totally misunderstanding Fly Machines and their | use-case, maybe they're aiming to close the gap between | Machines and Fly apps. It's just a bit of a bummer to see | something that looks like walking back the original "promise" | of Fly and makes me question whether or not Fly is going to | just become like every other PaaS (even if it's a really good | one). | ignoramous wrote: | Agree. Kurt's mentioned on the forums that _autoscale_ is | coming to Fly Machines. They haven 't implemented it just | yet. | | Even without _autoscale_ , spinning up Machine clones in | any of the 30+ Fly regions is as easy an instant scale-out | you'll likely come across on any of the _NewCloud_ | platforms. | phreack wrote: | It is concerning that they feel notifying a problem on the | status page hurts their ego. It is under no circumstances | something personal, and ideally it should even be automated. 
| [deleted] | sergiotapia wrote: | It's been almost a year since I gave Fly a review | (https://news.ycombinator.com/item?id=31391116) and it's a bummer | that they're still struggling to get things right. Double bummer | because I love Phoenix and Elixir and they employ Chris McCord | there. | | Maybe they were _too_ ambitious at the start? They have a hard | road ahead of them, and competition like Render.com and | Northflank have provided me with solutions to all of my problems. | Great dev ux, great prices and predictable solutions. They also | keep pushing out very useful features. A third competitor has also | sprung up: Railway! There's certainly blood in the water. | | Will they catch up to others before the competition solves the | "global mesh" unique value proposition Fly.io currently has? | That's the $1MM question. | zomglings wrote: | I read your review, and had a question so I thought I'd follow | up here. You mentioned render.com as a competitor - does render | host its own infrastructure or do they act as a go-between | between their users and AWS/GCP/whatever? | cldellow wrote: | They act as a go-between in that they ultimately host on | AWS/GCP. They host their own infrastructure in that they | appear to run Kubernetes and have built out their own | deployment and service fabric, so they're just using the | underlying machines as dumb compute, they're not, eg, | building on RDS. | | In March 2021, someone asked a question about carbon | emissions of their data centres. They said they hosted on | both GCP and AWS, but mentioned they were interested in | moving to their own bare metal [1]. | | In April 2021, I asked a question about egress fees to | Google, and they walked back a bit the comment about moving | to bare metal [2]. | | As of March 2022, they're still in AWS/GCP [3]. | | As of September 2022, workloads for new users deploy into | AWS, even in regions that were previously served by GCP [4]. 
| | [1]: https://community.render.com/t/does-render-use-green- | energy/... | | [2]: https://community.render.com/t/is-render-com-hosted-in- | googl... | | [3]: https://community.render.com/t/are-your-servers-owned- | by-you... | | [4]: https://community.render.com/t/which-render-regions-map- | to-w... | anurag wrote: | (Render founder) We're still on public clouds because even | if it doesn't help with margins, it helps us move faster on | features our customers want. It's all one big | prioritization problem (and lots of little ones too!). | zomglings wrote: | I'm curious how significant a risk products like AWS | Lightsail are to your business - it seems you are | competing in the same market, but: | | 1. They have vastly different ongoing capital and | cashflow requirements than you do. | | 2. They have all the leverage when it comes to the | question of your continued operations on their cloud. | | I'm also curious if they have already offered to just buy | you out since you're clearly succeeding where they seem | to just be treading water. (But not expecting you to | answer this question. :) ) | iamdbtoo wrote: | I'm a big fan of fly.io. From their hiring process to the product | itself it's all carried out in a thoughtful manner. I hope they | can weather this rough time. | emschwartz wrote: | One of my colleagues keeps repeating "reliability is our number | one feature". | | I'm not sure it is for 100% of early stage startups, but I guess | it is once you exceed some minimum usage threshold. | | That said, definitely appreciate the detailed explanation. | jpdb wrote: | > One of my colleagues keeps repeating "reliability is our | number one feature". | | I think reliability is the #1 feature at any stage because if | you're unavailable, you're at best useless and more than likely | you are actively harmful because your users have an | expectation. 
| | However, if you're unavailable only at times when customers don't | expect you to be there, then you're not actually unavailable. | This is more likely for an early stage start-up, but you don't | typically choose or know when you're expected to be available | nor do you always get to choose when you're unavailable. | sa46 wrote: | In terms of confidentiality, availability, and integrity: | I'll bet LastPass would gladly trade availability right now | to regain confidentiality. | ignoramous wrote: | Our team at AWS had a poster up on the wall that more or less | went: | | 1. Security | | 2. Durability | | 3. Availability | | 4. Speed | | Similar: | https://twitter.com/colmmacc/status/1071088017190711296 | cschep wrote: | TL;DR -- It's very domain specific if reliability is your | number one feature. | | For a startup that is hosting other people's production | application/data then this is absolutely true. Less than 100% | always needs to be addressed. | | For a startup that is selling bingo cards then reliability | probably isn't nearly as important. I'm guessing there were | certain holidays that were more important than others as far as | reliability goes though? Maybe patio11 can chime in :) | jrochkind1 wrote: | > One of my colleagues keeps repeating "reliability is our | number one feature". | | > I'm not sure it is for 100% of early stage startups, | | I mean, it probably depends on the nature of the startup? | Platform-as-service seems particularly sensitive to reliability | (whether or not it's "#1 feature"), in a way that might not be | true of startups in other spaces. | yamrzou wrote: | I'm not a user of Fly.io. I can't help but notice how remarkable | the effect of open communication is on potential end users like me. | I remember reading about their reliability problems on HN some | time ago. That biased my view of the company. 
After reading this, | the open communication and transparency restored my trust in | them, and would make them again a potential candidate for future | projects. Because now I know that they acknowledge the problem | and that they are trying to improve things. | willio58 wrote: | Agreed, this is how company communication should be. | | I don't use Fly but would consider them in the future even | given their recent issues. | | I look at this in contrast to Twitter who had/has? an outage | today. Their leadership is opaque and doesn't take | responsibility for the issues they are causing. | alfalfasprout wrote: | In fairness, a CEO who has basically been Kanye-ing himself | and his company into irrelevance is a low bar. | newaccount2021 wrote: | [dead] | alfalfasprout wrote: | This is huge. Even as a member of a larger company, this stuff | matters. If you have a vendor that doesn't bullshit you when | things go wrong, you can actually trust. This is how you avoid | companies having the "hmmm they seem to be having lots of | issues recently, let's consider moving off them" conversation. | snapetom wrote: | This is probably therapy, but your message and fly.io's post | resonates a lot with what I'm going through. I took a product | owner role about 6 months ago, my first, with a company that | has turned out to be just a mired mess, and a product | universally hated both internally and externally. | | Long story short, it's completely over-engineered by a bunch of | intellectual engineers with no focus, no discipline, and no | oversight. It ended up not delivering on any promises it made, | and there were a lot of them. | | I was warned left and right before presentations and meetings, | "this customer hates your product because of ...." I started | off every meeting with saying, "we're rearchitecting the | product, this is how we're doing it, this is the tech we are | using." 
Immediately there was a sense of relief from customers, | followed by questions like, "why can't <current product> | deliver <feature> that was promised?" I'm completely honest | with bad decisions that were made and how it impacted the | feature. Sure, there is skepticism on what we are doing, and I | tell them they should absolutely be skeptical based on our | track record. The result has been customers who have hated my | product now offering to work with us on development. | | I've also been completely forthcoming on configuration, | security, resources, and setup issues I am finding, many of | them are absolutely freakin' insane. I've flat out told | customers it's frankly embarrassing and never let us do | something like this in the future. The best feedback on this | was, "At least you're telling us something. We usually get | silence from this team." | | God, this is the most depressing job ever. | zamnos wrote: | Can you help me in a detailed sense - what did you tell | customers? did you literally say there's product is | "completely over-engineered by a bunch of intellectual | engineers with no focus, no discipline, and no oversight"? | That seems a little over-honest to me but of course I wasn't | there. | hinkley wrote: | Architectural astronauts. | mrkurt wrote: | I feel this. I hope you get over the hump and your job gets | fun. We've had flashes, at least, but I do think what we're | doing (and probably what you're doing) require some | irrational behavior. | ndneighbor wrote: | Part of what I hated about Product Management at my last role | was the consistent helplessness I felt when I was on calls | with our customers. I could tell our product wasn't meeting | their needs but all I could do was try my best to give the | engineers context on how best to eventually meet them. | | I remember my first few days on the job just being ripped to | shreds by our customers who (understandably) were slighted. | Don't miss those days at all. 
| claytonjy wrote: | Your job sure does sound depressing, and it's not one I would | succeed at, but if you can power through and turn this | product around that's a hell of an accomplishment you'll have | to be proud of. | | I'm curious what you'd like to do next. You could probably | have a great career doing these sorts of turnarounds | repeatedly across companies, maybe even as a consultant, but | would you want to? | leetrout wrote: | mrkurt[1] is also active here and has been very transparent in | his comments about scaling issues. | | Similar to this post he commented a week ago: | | > In a year we'll either be ahead of those, or not growing | anymore due to ongoing capacity issues. I'm hoping for the | former. | | I am rooting for Fly! Great team. The company reminds me of | early HashiCorp. | | [1] https://news.ycombinator.com/user?id=mrkurt | gizmo wrote: | This post is carefully worded corporate messaging, but because | they write for their developer audience it has an informal "oh | shucks we messed up bad y'all" vibe to it. But make no mistake, | this is 100% corporate messaging. | | I get that growing is super hard. And maybe fly will grow up to | be a good platform some day. But that's the future. Today, | they're flying by the seat of their pants and I mostly feel | sorry for people who were tricked into thinking this platform | is ready for production use. | spoiler wrote: | I'm not sure why the cynicism around their candor. Do you | think it's not genuine just because it was posted by a | company employee? | | Your post implies corporate messaging is bad. And anything | posted by a company--or at least I don't know where you draw | the line--can be considered corporate messaging. Am I just | reading too much into your phrasing? | gizmo wrote: | It's _strategic_ messaging. It can 't be genuine, because | of what it is. 
The benefit they get is publicity and damage | control, and as you can tell by the many responses here, it | buys them time because many developers are willing to give | them the benefit of the doubt. | | Companies that engage in this kind of candor are careful | not to disclose those things that would really hurt their | business. Those things are still kept secret. If the CEO | accidentally sexually harassed an employee that's not | getting disclosed. A mea culpa is only offered for the | issues that are already known regarding scaling, downtime, | and missing features. Struggles they have because they're | choosing to grow so fast. | dadrian wrote: | Sorry, what? Do you expect that no company can think | about what to write before they post it, or that any post | about anything internal must cover all internal issues? | Posts must be either all roses or a no-thought laundry | list of everything bad? | ignoramous wrote: | > _Do you expect that no company can think about what to | write before they post it..._ | | I guess, you and GP are in agreement for the _strategic_ | part of the argument at least, if not the _genuine_ part | of it. | | As someone who's been active on Fly's community forums | for close to 18 months now, I think Fly employs some of | the most genuine and helpful engs you'll see, so I'll | give them the benefit of the doubt. | skrtskrt wrote: | professionals don't get tricked into thinking a platform is | ready for production use | | If you don't have SLOs and SLAs, then you get what you get, | essentially. Even a company with a great reputation can | completely reverse course with a single bad incident, and you | get nothing in return if there's not a contract. | AtlasBarfed wrote: | Honestly, if you are a small fish to AWS... what is an SLA? | | They can trot out a low level person to stall you with | questions, or an AI question generator that maximizes the | amount of time you waste on your end, and call that "SLA | met". 
| | And even if they DON'T meet the SLA on occasion, you built | your stack on AWS. You are lying in the bed you made. | | SO, what, AWS throws some free credits (that their 30-40% | margin easily absorbs)? | | The only big stick in these types of things is having dual-cloud | capability, where you can move your service quickly | from one cloud to the other. Stateless API servers? Maybe. | Database servers? ouch. Cassandra could reliably span two | clouds, but man would AWS kill you on their ludicrously | overpriced network costs. | | Has anyone done Postgres replication across providers as a | useful production system? Doubt it. | Trufa wrote: | I don't get what you're saying, this isn't a brag disguised | as a confession, they are actually admitting to poor | performance, of course it's to eventually make users trust | them, but a) I don't see anything bad with that b) they are | choosing the hard route. | | They are being open and transparent (afaik) even if carefully | worded, which I also don't blame them for. | mbesto wrote: | Me too. | | However, this is a double-edged sword. Their key value | proposition _is_ scale / speed which makes it concerning that | they haven't "solved" that yet. | bodecker wrote: | Open communication is great when there are incidents, but even | better is having no incidents. (of course there are nuances | depending on specific context) | pier25 wrote: | I've been using Fly for over two years or so. The sentiment of | this post doesn't align with my personal (anecdotal) experience. | | The PG issues hit me two times in the previous weeks but other | than that it's been working great for me. | | With the move to v2 apps (using their new machines infra) things | are actually faster and smoother than ever. | | About a year or so ago their CLI was quite buggy but I haven't | really hit any bugs in months. | | I will remain with Fly for the time being. Hopefully they don't | close shop! 
| tptacek wrote: | We're nowhere even within the line of sight of closing up shop. | We just haven't been doing a good job of aggressively | communicating (a) when things go wrong and (b) what we're doing | to account for it. | | The Fly.io of 2023 looks almost nothing like that of 2021 (all | for the better), and it's not obvious to our users what's | changed. We've been doing a shitty job of communicating, and | we're taking our licks for it now. | chris_st wrote: | A lot can happen in 11 years :-) | | And thanks a lot for fly.io -- it's working great for my | (rather small) use cases. | mrkurt wrote: | Oh my god that's a great callback. | russellthehippo wrote: | Agree - V2 apps on machines are incredibly slick to launch | (create/start/stop), get info on with graphql, and scale up and | out. Magic. When the PG administration experience is that good | I'll move it all over. | claytonjy wrote: | Very interesting to see Kurt assert they're going to "solve | managed Postgres", and I'm super curious to know what that means. | Does it mean something like RDS, or more like CrunchyData? | | I could see them building something RDS-like on their own, but if | they're trying to go further than that I wonder if they'll buy or | partner with other companies rather than doing it themselves. | Neon strikes me as a Postgres-as-a-service that could pair well | with Fly. | mmcclure wrote: | That comment jumped out to me too, my recollection was that | they've been pretty vocal about that not being something they | wanted to solve themselves as a core competency. I'm not quite | parsing if these two blurbs should still be taken together or | if the second sentence is refuting the first. | | > The second problem we have with Postgres was a poor choice on | my part. We decided to ship "unmanaged Postgres" to buy | ourselves time to wait for the right managed Postgres provider | to show up. | | > We're going to solve managed Postgres. 
It's going to take a | while to get there, but it's a core component of the | infrastructure stack and we can't afford to pretend otherwise. | | +1 to Neon seeming like a good fit, but it's also very much a | beta (alpha?) both as a product and company (at least from my | impression). I'm not sure that's a bet they'd want to make | right now given the context of this post. | nikita wrote: | (Neon CEO) | | We are launching our paid tier March 15th and will be | production ready shortly after. We are running 20K+ databases | and measuring reliability and uptime. | | Generally reliability is a function of architecture (we are | solid there), good SRE practices, and a long tail of events | you live through, fix, and make sure they never happen again. | The bigger the fleet the faster the learning. | craigkerstiens wrote: | Craig here from Crunchy Data. Not sure if you mean Crunchy Data | is like RDS or isn't, in some cases we're very similar as a | managed service provider. But we are focused on a better developer | experience and quality support. | | We've had a number of customers that use us for the database | and fly for the app. We had a user benchmark a number of heroku | alternatives with various database providers and we actually had | better response times than the unmanaged instances on | fly themselves in addition to all other providers they tested - | https://webstack.dancroak.com/ | | I won't speak for Fly, but we're big fans of them and think we | pair quite well together. | mrkurt wrote: | Yes we think they pair well together too. I believe the ball | is in your court though. ;) | winslett wrote: | <3 | claytonjy wrote: | I haven't used CrunchyData for work, but I see you as | offering what RDS does plus plenty more. RDS does a lot, but | after using Timescale Cloud professionally I saw how much RDS | _doesn't_ do, like actually-simple upgrades, one-click | forks, etc. and Crunchy looks similar in going beyond RDS.
| | I think the community would really love to see a direct | Fly+Crunchy integration! | aeyes wrote: | If I was in their shoes I'd probably aim for a "serverless" | Postgres experience where you get a connection string and you | know nothing else. | | I think RDS, Crunchy, Aiven and others aren't quite there yet. | chime wrote: | They kind of offer that with their Redis (via Upstash). But | for our use-case, we needed it to be managed PG and Redis. | Going out of the LAN introduces too much latency. | chronark wrote: | Upstash Redis for Fly runs on Fly infrastructure and we | observe latencies in the low single digit milliseconds. | jrochkind1 wrote: | I don't even understand what you mean as the difference between | "something like RDS" and "something like CrunchyData" -- they | seem like similar products to me? | claytonjy wrote: | I see RDS as the absolute bare minimum for a managed | database; providers like Timescale or Crunchy tend to add | some pretty useful stuff on top. | jrochkind1 wrote: | For my own curiosity, I am interested in hearing what | features Crunchy adds on top that RDS doesn't have, that | folks find pretty useful! | | (Timescale -- I think I know, it adds features specifically | about storing time series? But I don't think crunchy has | additional domain-specific stuff like this?) | pyentropy wrote: | Almost half of the issues are caused by their use of HashiCorp | products. | | As someone that has started tons of Consul clusters, analyzed | tons of Terraform states, developed providers and wrote a HCL | parser, I must say this: | | HashiCorp built a brand of consistent design & docs, security, | strict configuration, distributed-algos-made-approachable... but | at its core, it's a _very_ fragile ecosystem.
The only benefit of | HashiCorp headaches is that you will quickly learn Golang while | reading some obscure github.com/hashicorp/blah/blah/file.go :) | tptacek wrote: | We are asking HashiCorp products to do things they were not | designed to do, in configurations that they don't expect to be | deployed in. Take a step back, and the idea of a single global | namespace bound up with Raft consistency for a fleet deployed | in dozens of regions, providing near-real-time state | propagation, is just not at all reasonable. Our state | propagation needs are much closer to those of a routing | protocol than a distributed key-value database. | | I have only positive things to say about every HashiCorp | product I've worked with since I got here. | pyentropy wrote: | I respect that. Can you elaborate a bit on the routing | protocol thing? I assume you used WAN gossip? | | I love the simplicity of fly.io & wish you all the best | improving Fly's reliability! | tptacek wrote: | If you've ever implemented IS-IS or OSPF before, like 80% | of the work is "LSP flooding", which is just the process | that gets updates about available links from one end of the | network to another as fast as possible without drowning the | links themselves in update messages. Flooding algorithms | don't build consensus, unlike Raft quorums, which | intrinsically have a centralized set of authorities that | keep a single source of truth for all the valid updates. | | An OSPF router uses those updates to build a forwarding | table with a single-point shortest path first routine, but | there's nothing to say that you couldn't instead use the | same notion of publishing weighted advertisements of | connectivity to, for instance, build a table to map | incoming HTTP requests to backends that can field them.
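To make the flooding idea above concrete, here is a minimal sketch in Python. It is not Fly.io's implementation and it is not real OSPF: the names Advert, Node, link, and backends_for are hypothetical, links are in-memory references instead of network connections, and the weights on advertisements are left out. It only illustrates the core rule, that each node is the sole authority for its own state, stamps it with a sequence number, and every other node keeps the newest advertisement per origin and passes it on without any quorum.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Advert:
    """One node's statement about itself (hypothetical message shape)."""
    origin: str       # the node that owns this state
    seq: int          # per-origin sequence number; higher means newer
    apps: frozenset   # apps the origin says it is running


class Node:
    def __init__(self, name):
        self.name = name
        self.neighbors = []   # directly linked nodes
        self.db = {}          # origin name -> newest Advert seen

    def receive(self, advert, sender=None):
        known = self.db.get(advert.origin)
        if known is not None and known.seq >= advert.seq:
            return  # stale or duplicate: drop it, which also stops the flood
        self.db[advert.origin] = advert
        # Re-flood to every neighbor except the one it came from.
        for peer in self.neighbors:
            if peer is not sender:
                peer.receive(advert, sender=self)

    def announce(self, seq, apps):
        """Publish this node's own state; nobody else may speak for it."""
        self.receive(Advert(self.name, seq, frozenset(apps)))

    def backends_for(self, app):
        """Derive a routing table entry from the flooded advertisements."""
        return sorted(o for o, a in self.db.items() if app in a.apps)


def link(a, b):
    """Create a bidirectional link between two nodes."""
    a.neighbors.append(b)
    b.neighbors.append(a)
```

Linking three nodes in a ring, having two of them announce the same app, and then calling backends_for on the third returns both origins. No node ever votes on another node's state; an older sequence number is simply ignored.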
| | The point is, if you're going to do distributed consensus, | you've got a dilemma: either you're going to have the Ents | moot in a single forest, close together, and round trip | updates from across the globe in and out of that forest | (painfully slow to get things in and out of the cluster), | or you're going to try to have them moot long distance | (painfully slow to have the cluster converge). The other | thing you can do, though, is just sidestep this: we really | don't have the Raft problem at all, in that different hosts | on our network do not disagree with each other about | whether they're running particular apps; if worker-sfu- | ord-1934 says it's running an instance of app-4839, I | pretty much don't give a shit if worker-sfu-maa-382a says | otherwise; I can just take ORD's word for it. | | That's the intuition behind why you'd want to do something | like SWIM update propagation rather than Raft for a global | state propagation scheme. | | But if you're just doing service discovery for a well- | bounded set of applications (like you would be if you were | running engineering for a single large company and their | internal apps), Raft gives you some handy tools you might | reasonably take advantage of --- a key-value store, for | instance. You're mostly in a single data center anyways, so | you don't have the long-distance-Entmoot problem. And | HashiCorp's tools will federate out across multiple data | centers; the constraints you inherit by doing that | federation mostly don't matter for a single company's | engineering, but they're extremely painful if you're | servicing an _unbounded_ set of customer applications and | providing each of them a _single global picture_ of their | deployments. | | Or we're just holding it wrong. Also a possibility. | [deleted] | chucky_z wrote: | mrkurt have you considered some of the lower tiers of vault | enterprise that allow for performance replicas that just outright | solve that problem? 
might be cheaper than an engineer at this | point. | bmorton wrote: | Can we please stop with the fly.io spam. | | >I've hesitated to share this because, well, I'm fighting a | debilitating feeling of failure. Fear, too. | | This is gross and I don't buy the emotional play -- this whole | piece is just an ad. | sph wrote: | It's a post on their forum, hardly an ad. | outworlder wrote: | > This is a theme. Existing open source is not designed for | global deployment | | Eh? Unless you are consuming something as a service and it | actually advertises it as a feature, nothing is ready for 'global | deployment'. | | If you have a 'centralized' secret storage, then you have made it | tied to a region. Want to have redundancies and lower latency? | You'll have to distribute it. Vault has docs about this: | https://developer.hashicorp.com/vault/tutorials/day-one-raft... | theloco wrote: | I love reading stuff like this. I don't use fly, don't plan to, | not totally sure everything it does and will check it out. But | this is some great raw data on how stressful it is after you | launch. | ashiban wrote: | One of the key challenges we observe is that if you're small | enough, a Heroku-like experience works well - and most of your | needs would be covered by virtually any combination of | tech stacks. | | It gets significantly more challenging when you grow, either in | feature complexity or scale complexity - and then very few | services can offer what AWS/GCP/Azure offer - albeit at the | increased engineering/monetary cost of using them. | | We're building a different kind of approach[0] that aims to | absorb the mechanical cost of using public cloud capabilities | (that are proven to scale) without hiding it altogether. | | [0] https://github.com/KlothoPlatform/klotho | deivid wrote: | I'm a bit sour reading this.
I've always liked fly and | particularly the engineering blog, so much so that a couple of | months ago I decided to apply for an infra position, to work on | some of these very topics. Sadly after 4~5 rounds of interviews | (including a workday) they just ghosted me. | [deleted] | tptacek wrote: | If that happened, it absolutely was not on purpose. Shoot me an | email at thomas@fly.io. | ignoramous wrote: | > Don't feel too bad or take it personally. They probably have a | lot of applicants, and are looking to grow their team by hiring | someone with very specific skills. | | > I also applied a few months ago while I was in the middle of | my job search. For one, I couldn't really answer their | "favorite syscall question" because I've never dealt with | syscalls :) so maybe I just wasn't a good fit. | | Surely, everyone's favourite syscall is _exit()_ | 1023bytes wrote: | I get it, I like fly.io, but the last outage made me switch to | Railway.app | a3w wrote: | What is this about? ___________________________________________________________________ (page generated 2023-03-06 23:00 UTC)