[HN Gopher] AWS S3: Sometimes you should press the $100k button
       ___________________________________________________________________
        
       AWS S3: Sometimes you should press the $100k button
        
       Author : korostelevm
       Score  : 332 points
       Date   : 2022-02-17 10:32 UTC (12 hours ago)
        
 (HTM) web link (www.cyclic.sh)
 (TXT) w3m dump (www.cyclic.sh)
        
       | dekhn wrote:
       | I had to chuckle at this article because it reminded me of some
       | of the things I've had to do to clean up data.
       | 
        | One time I had to write a special mapreduce that did a multiple-
        | step map to convert my (deeply nested) directory tree into
       | roughly equally sized partitions (a serial directory listing
       | would have taken too long, and the tree was really unbalanced to
       | partition in one step), then did a second mapreduce to map-delete
       | all the files and reduce the errors down to a report file for
       | later cleanup. This meant we could delete a few hundred terabytes
       | across millions of files in 24 hours, which was a victory.
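The two-phase approach described above can be sketched in miniature (hypothetical helper names, a thread pool standing in for mapreduce): partition the key listing into roughly equal chunks, delete each chunk in parallel, and reduce the failures into a report.

```python
from concurrent.futures import ThreadPoolExecutor

def partition(keys, n_parts):
    """Split a flat key listing into n roughly equal chunks."""
    size = max(1, -(-len(keys) // n_parts))  # ceiling division
    return [keys[i:i + size] for i in range(0, len(keys), size)]

def map_delete(keys, delete_one, n_workers=8):
    """Delete keys in parallel; reduce failures into a report list."""
    errors = []

    def worker(chunk):
        for key in chunk:
            try:
                delete_one(key)
            except Exception as exc:
                errors.append((key, str(exc)))

    with ThreadPoolExecutor(max_workers=n_workers) as pool:
        # Exiting the with-block waits for all chunks to finish.
        list(pool.map(worker, partition(keys, n_workers)))
    return errors
```

The error report can then be retried or handed off for later cleanup, as in the comment above.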
        
       | valar_m wrote:
        | Though it doesn't address the problem in TFA, I recommend setting
        | up billing alerts in AWS. They wouldn't have solved this issue,
        | but the team would at least have known about it sooner.
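A billing alert of the kind suggested above can be expressed as an AWS Budgets payload. A hedged sketch (the budget name, limit, and email are placeholders; the actual boto3 call is left commented out so the payload can be inspected offline):

```python
# Hypothetical monthly cost budget with an email alert when forecasted
# spend crosses 80% of the limit.
def monthly_cost_budget(name, limit_usd, email):
    budget = {
        "BudgetName": name,
        "BudgetLimit": {"Amount": str(limit_usd), "Unit": "USD"},
        "TimeUnit": "MONTHLY",
        "BudgetType": "COST",
    }
    notifications = [{
        "Notification": {
            "NotificationType": "FORECASTED",
            "ComparisonOperator": "GREATER_THAN",
            "Threshold": 80.0,
            "ThresholdType": "PERCENTAGE",
        },
        "Subscribers": [{"SubscriptionType": "EMAIL", "Address": email}],
    }]
    return budget, notifications

# With boto3 this would be wired up roughly as:
# boto3.client("budgets").create_budget(
#     AccountId=account_id, Budget=budget,
#     NotificationsWithSubscribers=notifications)
```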
        
       | pontifier wrote:
       | DON'T PRESS THAT BUTTON.
       | 
       | The egress and early deletion fees on those "cheaper options"
       | killed a company that I had to step in and save.
        
         | pphysch wrote:
         | On a related note, suppose the Fed raises rates to mitigate
         | inflation and indirectly kills thousands of zombie companies,
         | including many SaaS renting the cloud. What happens to their
         | data? Does the cloud unilaterally evict/delete it, or does it
         | get handled like an asset -- auctioned off, etc?
        
           | cmckn wrote:
           | I'm not aware of a cloud provider that is contractually
           | allowed to do such a thing (except maybe alibaba by way of
           | the CCP). Dying companies get purchased and have their assets
           | pilfered every day, the same thing would happen with cloud
           | assets.
        
           | Uehreka wrote:
           | > does it get handled like an asset -- auctioned off, etc?
           | 
           | Who would buy that? I guess if this happened enough then
           | people would start "data salvager" companies that specialize
           | in going through data they have no schema for looking for a
           | way to sell something of it to someone else. I have to
           | imagine the margins in a business like that would be abysmal,
           | and all the while you'd be in a pretty dark place ethically
           | going through data that users never wanted you to have in the
           | first place.
           | 
           | Of course, all these questions are moot because if this
           | happened the GDPR would nuke the cloud provider from orbit.
        
         | Aeolun wrote:
         | If they were already paying 100k per month for their storage, I
         | doubt the additional 100k would severely impact their business.
         | 
         | Proven by the fact that they happily went on to pay the bill
         | for the next 6 months.
        
       | charcircuit wrote:
       | Can someone explain what happened in the end? From my
        | understanding nothing happened (they deprioritized the story for
       | fixing it) and they are still blowing through the cloud budget.
        
         | snowwrestler wrote:
         | They didn't resolve the issue.
         | 
         | There's an important moment in the story, where they realize
         | the fix will incur a one-time fee of $100,000. No one in
         | engineering can sign off on that amount, and no one wants to
         | try to explain it to non-technical execs.
         | 
         | They don't explain why. But it's probably because they expect a
         | negative response like "how could you let this happen?!" or
         | "I'm not going to pay that, find another way to fix it."
         | 
         | In a lot of organizations it's easier to live with a steadily
         | growing recurring cost than a one-time fee... even if the total
         | of the steady growth ends up much larger than the one-time fee!
         | 
         | It's not necessarily pathological. Future costs will be paid
         | from future revenue; whereas a big fee has to be paid from cash
         | on-hand now.
         | 
         | But sometimes the calculation is not even attempted because of
         | internal culture. When the decision is "keep your head down"
         | instead of "what's the best financial strategy," that could
         | hint at even bigger potential issues down the road.
        
           | hogrider wrote:
           | Sounds more like non technical leadership sleeping at the
           | wheel. I mean if they could just afford to lose money like
           | this why bother with all that work to fix it?
        
         | seekayel wrote:
          | The way I read the article, nothing happened. I think it is a
         | cautionary tale of why you should probably bite the bullet and
         | press the button instead of doing the "easier" thing which ends
         | up being harder and more expensive in the end.
        
       | lloesche wrote:
        | I had a similar issue at my last job. Whenever a user created a
        | PR on our open source project, 1GB of artifacts consisting of
        | hundreds of small files would be created and uploaded to a
        | bucket. There was just no process that would ever delete
       | anything. This went on for 7 years and resulted in a multi-
       | petabyte bucket.
       | 
       | I wrote some tooling to help me with the cleanup. It's available
       | on Github:
       | https://github.com/someengineering/resoto/tree/main/plugins/...
       | consisting of two scripts, s3.py and delete.py.
       | 
       | It's not exactly meant for end-users, but if you know your way
        | around Python/S3 it might help. I built it for a one-off purge of
       | old data. s3.py takes a `--aws-s3-collect` arg to create the
       | index. It lists one or more buckets and can store the result in a
       | sqlite file. In my case the directory listing of the bucket took
        | almost a week to complete and resulted in an 80GB sqlite file.
       | 
        | I also added a very simple CLI interface (calling it a virtual
        | filesystem would be a stretch) that lets you load the sqlite
       | file and browse the bucket content, summarise "directory" sizes,
       | order by last modification date, etc. It's what starts when
       | calling s3.py without the collect arg.
       | 
       | Then there is delete.py which I used to delete objects from the
       | bucket, including all versions (our horrible bucket was versioned
        | which made it extra painful). On a versioned bucket it has to run
        | twice, once to delete the object and once to delete the delete
        | marker that the first pass creates, if I remember correctly -
        | it's been a year since I built this.
       | 
       | Maybe it's useful for someone.
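For anyone attempting a similar purge by hand, a hedged boto3 sketch of the versioned-bucket case (bucket and prefix are placeholders): every version and delete marker has to be removed explicitly, and DeleteObjects accepts at most 1,000 keys per request.

```python
def batches(items, size=1000):
    """DeleteObjects accepts at most 1,000 keys per request."""
    return [items[i:i + size] for i in range(0, len(items), size)]

def purge_versions(s3, bucket, prefix):
    """Delete every version and delete marker under a prefix."""
    paginator = s3.get_paginator("list_object_versions")
    to_delete = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        for v in page.get("Versions", []) + page.get("DeleteMarkers", []):
            to_delete.append({"Key": v["Key"], "VersionId": v["VersionId"]})
    for batch in batches(to_delete):
        s3.delete_objects(Bucket=bucket, Delete={"Objects": batch})
```

Called as `purge_versions(boto3.client("s3"), "my-bucket", "artifacts/")`; note that on very large buckets the listing alone can take days, as the comment above describes.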
        
         | k__ wrote:
         | What about the lifecycle stuff?
         | 
         | I thought, S3 can move stuff to cheaper storage automatically
         | after some time.
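It can; a lifecycle rule is a small JSON document attached to the bucket. A sketch (the prefix and day counts here are made up) that tiers objects down and eventually expires them:

```python
# Hypothetical lifecycle rule: move to infrequent access after 30 days,
# Glacier after 90, and delete after a year.
lifecycle = {
    "Rules": [{
        "ID": "expire-old-artifacts",
        "Status": "Enabled",
        "Filter": {"Prefix": "artifacts/"},
        "Transitions": [
            {"Days": 30, "StorageClass": "STANDARD_IA"},
            {"Days": 90, "StorageClass": "GLACIER"},
        ],
        "Expiration": {"Days": 365},
    }]
}
# Applied with:
# s3.put_bucket_lifecycle_configuration(
#     Bucket=bucket, LifecycleConfiguration=lifecycle)
```

As the reply below notes, though, early-deletion and retrieval fees on the colder tiers can bite, so the transition days deserve real thought.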
        
           | lloesche wrote:
            | Like I wrote, for us it was a one-off job to find and remove
           | 6+ year old build artifacts that would never be needed again.
           | I just looked for the cheapest solution of getting rid of
           | them. I couldn't do it by prefix alone (prod files mixed in
           | the same structure as the build artifacts) which is why
           | delete.py supports patterns (the `--aws-s3-pattern` arg takes
           | a regex).
           | 
            | If AWS' own tools work for you they're surely a better
            | solution than my scripts, especially if you need something on
            | an ongoing basis.
        
         | coredog64 wrote:
         | AWS has an inventory capability for S3:
         | https://docs.aws.amazon.com/AmazonS3/latest/userguide/storag...
        
       | wackget wrote:
       | As a web developer who has never used anything except locally-
       | hosted databases, can someone explain what kind of system
       | actually produces billions or trillions of files which each need
       | to be individually stored in a low-latency environment?
       | 
       | And couldn't that data be stored in an actual database?
        
         | abhishekjha wrote:
         | An image service.
        
           | wackget wrote:
           | Yeah that use-case I get. Binary files which would be
           | difficult/impractical to index in a database.
           | 
           | However it feels like something at that scale will only ever
           | realistically be dealt with by enterprise-level software, and
           | I'd hazard a guess that _most_ developers - even those
           | reading HN - are not working on enterprise-level systems.
           | 
           | So I'm wondering what "regular devs" are using cloud buckets
           | for at such a scale over regular DBs.
        
         | rgallagher27 wrote:
          | Things like mobile/website analytics events. User A clicked
          | this menu item, User B viewed these images, etc. All streamed
          | into S3 in chunks of smallish files.
         | 
         | It's cheaper to store them in S3 over a DB and use tools like
         | Athena or Redshift spectrum to query.
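A common layout for such event files, sketched below with a hypothetical helper, is Hive-style date partitioning, which lets engines like Athena prune by day instead of scanning the whole bucket:

```python
from datetime import datetime, timezone

def event_key(event_id, ts=None):
    """Date-partitioned key layout (Hive-style dt= partitions) so query
    engines can restrict a scan to the days and hours they need."""
    ts = ts or datetime.now(timezone.utc)
    return f"events/dt={ts:%Y-%m-%d}/hour={ts:%H}/{event_id}.json"
```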
        
           | wackget wrote:
           | Wow. What makes it cheaper than using a DB? Is it just
           | because the DB will create some additional metadata about
           | each stored row or something?
        
       | zmmmmm wrote:
       | The rationale for using cloud is so often that it saves you from
       | complexity. It really undermines the whole proposition when you
       | find out that the complexity it shields you from is only skin
       | deep, and in fact you still need a "PhD in AWS" anyway.
       | 
       | But as a bonus, now you face huge risks and liabilities from
       | single button pushes and none of those skills you learned are
        | transferable outside of AWS so you'll have to learn them again
       | for gcloud, again for azure, again for Oracle ....
        
       | Mave83 wrote:
        | Just avoid the cloud. You can get Ceph storage with the
        | performance of Amazon S3 at the price point of Amazon S3 Glacier,
        | deployed in any datacenter worldwide if you want. There are
        | companies that help you do this.
       | 
       | Feel free to ask if you need help.
        
       | solatic wrote:
        | TL;DR: Object stores are not databases. Don't treat them like
       | one.
        
         | wooptoo wrote:
         | They're also _not_ classic hierarchical filesystems, but k-v
         | stores with extras.
        
         | throwaway984393 wrote:
         | Try telling that to developers; they love using S3 as both a
         | database and a filesystem. It's gotten to the point where we
         | need a training for new devs to tell them what not to do in the
         | cloud.
        
           | Quarrelsome wrote:
           | do you know if such sources exist publicly? I would be most
           | interested in perusing recommended material on the subject.
        
           | mst wrote:
           | Honestly a Frequently Delivered Answers training for new
           | developers is probably one of the best things you can include
           | in onboarding.
           | 
           | Every environment has its footguns, after all.
        
           | hinkley wrote:
           | Communicating through the filesystem is one of the Classic
           | Blunders.
           | 
           | It doesn't come up as often anymore since we generally have
           | so many options at our fingertips, but when push comes to
           | shove you will still discover this idea rattling around in
           | people's skulls.
        
             | ijlx wrote:
             | Classic Blunders:
             | 
             | 1. Never get involved in a land war in Asia
             | 
             | 2. Never go in against a Sicilian when death is on the line
             | 
             | 3. Never communicate through the filesystem
        
           | solatic wrote:
           | You can either train them with a calm tutorial or you can
           | train them with angry billing alerts and shared-pain ex-post-
           | facto muckraking.
           | 
           | I, for one, prefer the calm way.
        
       | lenkite wrote:
       | _sigh_. My team is facing all these issues. Drowning in data.
       | Crazy S3 bill spikes. And not just S3 - Azure, GCP, Alibaba, etc
       | since we are a multi-cloud product.
       | 
       | Earlier, we couldn't even figure out lifecycle policies to expire
       | objects since naturally every PM had a different opinion on the
       | data lifecycle. So it was old-fashioned cleanup jobs that were
       | scheduled and triggered when a byzantine set of conditions were
       | met. Sometimes they were never met - cue bill spike.
       | 
       | Thankfully, all the new data privacy & protection regulations are
       | a _life-saver_. Now, we can blindly delete all associated data
       | when a customer off-boards or trial expires or when data is no
       | longer used for original purpose. Just tell the intransigent PM
       | 's that we are strictly following govt regulations.
        
         | candiddevmike wrote:
         | Are you multi-cloud because your customers need you to be
         | multi-cloud?
        
         | CydeWeys wrote:
         | The data protection regulations really are so freeing, huh.
         | It's amazing to be able to delete all this stuff without
         | worrying about having to keep it forever.
        
           | jeff_vader wrote:
            | In the case of my previous employer it led to an incredibly
            | complicated encryption system. It took a couple of years to
            | implement in maybe 10% of the system. Deleting any old data
            | was rejected.
        
             | stingraycharles wrote:
             | How is encryption compliant? I've implemented GDPR data
             | infrastructures twice now, and as far as I'm aware, the
             | only way to be compliant with encryption is when you throw
             | the decryption key away.
        
               | aeyes wrote:
               | Sometimes it might be a single field in a 1MB nested
               | structure that you have to remove. So it gets encrypted
               | when the whole structure gets stored and when the field
               | is to be deleted you just throw away the key instead of
               | modifying the entire 1MB just to remove a few kB.
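The pattern described above is often called crypto-shredding. A toy sketch (illustration only: the hash-based XOR keystream here is not real cryptography; a production system would use an authenticated cipher such as AES-GCM):

```python
import hashlib
import secrets

def _keystream(key, n):
    # Toy keystream for illustration only: a real system would use an
    # authenticated cipher such as AES-GCM instead of hash-based XOR.
    out = b""
    counter = 0
    while len(out) < n:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return out[:n]

def encrypt_field(plaintext, key):
    # XOR the field with a key-derived stream; XOR twice round-trips,
    # so the same function decrypts.
    ks = _keystream(key, len(plaintext))
    return bytes(a ^ b for a, b in zip(plaintext, ks))

decrypt_field = encrypt_field

# "Deleting" the field later means forgetting the key, not rewriting
# the large structure the ciphertext is embedded in.
field_key = secrets.token_bytes(32)
blob = encrypt_field(b'"email": "user@example.com"', field_key)
del field_key  # crypto-shredded: blob is now unrecoverable
```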
        
               | dylan604 wrote:
               | If you're comparing gov't regulations to delete data to
               | saving a few KB, then I think you're looking at this
               | wrong.
        
               | spelunker wrote:
               | As mentioned, encrypt something and throw a way the key,
               | often called "crypto shredding".
        
               | stingraycharles wrote:
               | Ahh I see, and that way you can quickly "remove" a whole
               | lot of data by just removing the key, which makes for
               | cheap operations, and/or more flexible workflow (you can
               | periodically compact the database and remove entries for
               | which you have no key).
               | 
               | Is my understanding correct?
        
             | hinkley wrote:
             | I wonder sometimes if it would help if we collectively
             | watched more anti-hoarding shows, in order to see how the
             | consultants convince their customers they can get rid of
             | stuff.
        
               | mro_name wrote:
                | Humans spent their first 300k years as nomads - storing
                | was just impossible and decrufting happened by itself
                | when moving along.
               | 
               | So maybe that's why we're not good at it yet.
        
               | hinkley wrote:
               | Being a renter definitely kept me lighter for a long
               | time.
               | 
               | When you have to box things up over and over you find
               | that the physical and mental energy around keeping it
               | aren't adding up. I wonder if migrating from cloud to
               | cloud would simulate this experience.
        
               | Bayart wrote:
               | Being a renter just taught me to batch my $STUFF I/O to
               | minimize read-writes to disk and maximize available low-
               | latency space. ie. fill my bags to the brim with shit I
               | didn't plan using whenever I'd go to my parents'.
        
               | travisgriggs wrote:
               | Two space garbage collector in action right there. Maybe
               | all things software need a "move it or lose it" impetus.
               | Features in apps, old data, you name it. If you've gotta
               | keep transferring/translating it, it would definitely
               | pare things down.
        
           | whimsicalism wrote:
           | now this is a spin i havent heard before.
        
             | hvs wrote:
             | You haven't heard it because it's not spin, it's from an
             | engineer's point of view. That's not the view you hear in
             | the news when it comes to these things.
        
               | alisonkisk wrote:
               | Eh, Retention and Deletion are both pain for devs. Not
               | having to care is the happy state.
        
               | whimsicalism wrote:
               | HN seems like an odd place to assume that people only
               | hear about things from the news and aren't engineers
               | themselves.
               | 
               | i am a dev that has to deal with these regulations in my
               | day to day. it is a pain, it is not freeing in any sense,
               | and it makes my models worse.
               | 
               | granted, i think there are good reasons for it, but it
               | does not make my life easier for sure.
        
             | jabroni_salad wrote:
             | As a sysadmin I really wish you had. SO MANY problems have
             | come to my desk because some dude 3 years ago did not
             | consider retention or rotation and now I have to figure out
             | what to do with a 4TB .txt that is apparently important.
        
               | dylan604 wrote:
                | Find out how important it is with a `mv 4TB.txt 4TB.old`
                | type of thing. See how many people come screaming.
        
               | briffle wrote:
               | "You never know when you might need this info to debug"
               | The developer says as their cronjob creates a 250MB csv
               | file, and a few MB of debug logs per day, for the past
               | few years. "Disk is cheap" they say.
               | 
               | As a sysadmin, I hate that too.
        
               | whimsicalism wrote:
               | sometimes the data is just big...
        
               | colechristensen wrote:
               | Often a considerable portion of those logs are useless,
               | trace level misclassified as info, kept for years for no
               | reason.
               | 
               | You should keep a minimal set of logs necessary for
               | audit, logs for errors which are actually errors, and
               | logs for things which happen unexpectedly.
               | 
               | What people do keep are logs for everything which
               | happens, almost all of which is never a surprise.
               | 
               | One needs to go through logs periodically and purge the
               | logging code for every kind of message which doesn't
               | spark joy, I mean seem like it would ever be useful to
               | know.
        
               | whimsicalism wrote:
               | sure, in a world where machine learning doesnt exist i
               | would agree with you. for low level logs of things like
               | "memory low, spawning a new container" i would also agree
               | with you. not for user actions though (which is the topic
               | closest to whats under discussion given what sort of data
               | these regulations cover)
        
           | theshrike79 wrote:
           | Yep, having everything disappear at 2 months max is a life-
           | saver.
           | 
           | That "absolutely essential thing" isn't essential any more
           | when there is a possible GDPR/CCPA violation with a
           | significant fine just around the corner.
        
             | koolba wrote:
             | Just make sure you actually test your backups. Two months
             | of unusable backups are just as useful as no backups.
        
               | marcosdumay wrote:
               | Well, you should have done this before GDPR too, but
               | reminding people to test backups is never too late and
               | never too often.
        
         | StratusBen wrote:
         | Disclosure: I'm Co-Founder and CEO of a cloud cost company
         | named https://www.vantage.sh/ - I also used to be on the
         | product management team at AWS and DigitalOcean.
         | 
         | I'm not intentionally trying to shill but this is exactly why
         | people choose to use Vantage. We give them a set of features
         | for automating and understanding what they can do to manage and
         | save on costs. We're also adding multi-cloud support (GCP is in
         | early access, Azure is coming) to be a single pane of glass
         | into cloud costs.
         | 
         | If anyone needs help on this stuff, I really love it. We have a
         | generous free tier and free trial. We also have a Slack
         | community of ~400 people nerding out on cloud costs.
        
           | imwillofficial wrote:
            | I work on a team that computes bills, shoot me a slack invite
           | and perhaps I can offer insight.
        
             | [deleted]
        
           | cookiesboxcar wrote:
           | I love vantage. Thank you for making it.
        
           | samlambert wrote:
           | Vantage is a seriously awesome product. We love it at
           | PlanetScale. Obviously being a cloud product things can get
           | pricy and so Vantage is essential.
        
           | vdm wrote:
           | https://www.google.com/search?q=site%3Ahttps%3A%2F%2Fdocs.va.
           | ..
           | 
           | I gave vantage.sh 5 minutes and did not see anything for S3
           | that is not already available from the built-in Cost
           | Explorer, Storage Lens, Cost and Usage Reports, and taking 1
           | hour to study the docs https://docs.aws.amazon.com/AmazonS3/l
           | atest/userguide/Bucket...
           | 
           | Most "cloud optimisation" products want to tell you which EC2
           | instance type to use, but can't actually give actionable
           | advice for S3. Happy to be corrected on this.
        
             | StratusBen wrote:
             | We are in process of updating the documentation because
             | you're right that it needs more work. For the record, if
             | you're doing everything on your own via Cost Explorer,
             | Storage Lens and processing CUR you may be set. From what
             | we hear, most folks do not want to deal with processing CUR
             | (or even know what it is) and struggle with Cost Explorer.
             | 
             | Vantage automates everything you just mentioned to allow
             | you to make quicker decisions. Here's a screenshot of what
             | we do on a S3 Bucket basis: https://s3.amazonaws.com/assets
             | .vantage.sh/www/s3_example.pn...
             | 
             | We'll profile storage classes and the number of objects,
             | tell you the exact cost of turning on things like
             | intelligent tiering and how much that will cost with
             | specific potential savings. This is all done out of the
             | box, automatically - and we profile List/Describe APIs to
             | always discover newly created S3 Buckets.
             | 
             | From speaking with hundreds of customers, I can also assure
             | you that at a certain scale, billing does not take an
             | hour...there are entire teams built around this at larger
             | companies.
        
             | simonw wrote:
             | Saving people from learning how to use Cost Explorer,
             | Storage Lens, Cost and Usage Reports - and then taking 1
             | hour to study documentation - sounds to me like a
             | legitimate market opportunity.
        
               | alar44 wrote:
               | Not really. Sometimes you actually have to understand
               | things. If you're so concerned about your billing,
               | someone on your team should probably invest a freaking
               | hour to understand it. If that can't happen, you are just
               | setting yourself up for failure.
        
               | Jgrubb wrote:
               | I've been learning the ins and outs of the major 3
                | providers' cloud billing setups for the last year, and I'm
               | just getting started. This is not a 1 hour job, but
               | you're right that someone in your team needs to
               | understand it.
        
               | llbeansandrice wrote:
               | At my last job we had a team spend an entire quarter just
               | to help visualize and properly track all of our AWS
               | expenditures. It's a huge job.
        
               | beberlei wrote:
                | No wonder there's talk about a software developer
                | shortage when it seems a good number of them work on
                | this kind of nonsense. Talk about bs jobs.
        
               | banku_brougham wrote:
                | it's a lot more than an hour, in my experience
        
       | jopsen wrote:
       | One of the biggest pains is that cloud services rarely mention
       | what they don't do.
       | 
       | I think it's really sad, because when I don't see docs clearly
       | stating the limits, I assume the worst and avoid the service.
        
       | pattycake23 wrote:
       | Here's an article about Shopify running into the S3 prefix rate
       | limit too many times, and tackling it:
       | https://shopify.engineering/future-proofing-our-cloud-storag...
        
         | sciurus wrote:
         | Their solution was to introduce entropy into the beginning of
         | the object names, which used to be AWS's recommendation for how
         | to ensure objects are placed in different partitions. AWS
         | claims this is no longer necessary, although how their new
         | design actually handles partitioning is opaque.
         | 
         | "This S3 request rate performance increase removes any previous
         | guidance to randomize object prefixes to achieve faster
         | performance. That means you can now use logical or sequential
         | naming patterns in S3 object naming without any performance
         | implications."
         | 
         | https://aws.amazon.com/about-aws/whats-new/2018/07/amazon-s3...
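The old recommendation can be sketched with a hypothetical helper: prepend a short hash of the name so that lexicographically adjacent keys land in different partitions.

```python
import hashlib

def entropy_key(name, width=4):
    """Old-style S3 key randomization: prefix a short, deterministic hash
    so that sequentially named objects spread across key ranges."""
    h = hashlib.md5(name.encode()).hexdigest()[:width]
    return f"{h}/{name}"
```

Per the 2018 announcement quoted above, S3 now handles this internally, so new designs generally no longer need it.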
        
           | pattycake23 wrote:
            | Seems like it's a much higher rate limit, but it exists
            | nonetheless, and Shopify's scale has also grown significantly
           | since 2018 (when that article was written) - so it was
           | probably a valid way for them to go.
        
             | sciurus wrote:
             | I think two things happened that are covered in that blog
             | post
             | 
             | 1) The performance per partition increased
             | 
             | 2) The way AWS created partitions changed
             | 
             | When I was at Mozilla, one thing I worked on was Firefox's
              | crash reporting system. Its S3 storage backend wrote raw
             | crash data with the key in the format
             | `{prefix}/v2/{name_of_thing}/{entropy}/{date}/{id}`. If I
             | remember correctly, we considered this a limitation since
             | the entropy was so far down in the key. However, when we
              | talked to AWS Support they told us there was no longer a
             | need to have the entropy early on; effectively S3 would
             | "figure it out" and partition as needed.
             | 
             | EDIT: https://news.ycombinator.com/item?id=30373375 is a
             | good related comment.
        
       | ebingdom wrote:
       | I'm confused about prefixes and sharding:
       | 
       | > The files are stored on a physical drive somewhere and indexed
       | someplace else by the entire string app/events/ - called the
       | prefix. The / character is really just a rendered delimiter. You
       | can actually specify whatever you want to be the delimiter for
       | list/scan apis.
       | 
       | > Anyway, under the hood, these prefixes are used to shard and
       | partition data in S3 buckets across whatever wires and metal
       | boxes in physical data centers. This is important because prefix
       | design impacts performance in large scale high volume read and
       | write applications.
       | 
       | If the delimiter is not set at bucket creation time, but rather
       | can be specified whenever you do a list query, how can the prefix
       | be used to influence where objects are physically stored? Doesn't
       | the prefix depend on what delimiter you use? How can the sharding
       | logic know what the prefix is if it doesn't know the delimiter in
       | advance?
       | 
       | For example, if I have a path like
       | `app/events/login-123123.json`, how does S3 know the prefix is
       | `app/events/` without knowing that I'm going to use `/` as the
       | delimiter?
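One way to see that the delimiter is purely a list-time concept is to imitate the grouping in plain Python: the same stored keys produce different CommonPrefixes depending on the delimiter passed to the list call, while storage never consults it.

```python
def list_objects(keys, prefix="", delimiter=""):
    """Toy imitation of S3 ListObjects: the delimiter only groups the
    listing into CommonPrefixes; it plays no role in how keys are stored."""
    contents, common = [], set()
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Collapse everything past the first delimiter into a "folder".
            common.add(prefix + rest.split(delimiter)[0] + delimiter)
        else:
            contents.append(key)
    return contents, sorted(common)
```

For example, listing `["app/events/a.json", "app/logs/x.log"]` with prefix `app/` and delimiter `/` returns no Contents and two CommonPrefixes, which is exactly the "folder" illusion the replies below describe.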
        
         | twistedpair wrote:
         | This is where GCP's GCS (Google Cloud Storage) shines.
         | 
         | You don't need to mess with prefixing all your files. They auto
         | level the cluster for you [1].
         | 
         | [1] https://cloud.google.com/storage/docs/request-
         | rate#redistrib...
        
           | kristjansson wrote:
           | S3 does too, now.
           | 
           | https://aws.amazon.com/about-aws/whats-
           | new/2018/07/amazon-s3...
        
         | inopinatus wrote:
         | There's no delimiter. There is only the appearance of a
         | delimiter, to appease folks who think S3 is a filesystem, and
         | fool them into thinking they're looking at folders.
         | 
         | The object name is the entire label, and every character is
         | equally significant for storage. When listing objects, a prefix
         | filters the list. That's all. However, S3 also uses substrings
         | to partition the bucket for scale. Since they're anchored at
         | the start, they're also called prefixes.
         | 
         | In my view, it's best to think of S3's object indexing as a
         | radix tree.
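         | 
         | As a rough illustration (plain Python, my own sketch and not
         | S3's implementation): the "folder" view is just a flat,
         | sorted key list filtered by a prefix, with everything past
         | the first delimiter rolled up into a CommonPrefixes entry.

```python
# Toy model of ListObjectsV2 semantics over a flat key namespace.
# This illustrates the folder illusion only -- it is not S3's code.
def list_objects(keys, prefix="", delimiter=None):
    contents, common_prefixes = [], []
    for key in sorted(keys):
        if not key.startswith(prefix):
            continue
        rest = key[len(prefix):]
        if delimiter and delimiter in rest:
            # Everything past the first delimiter collapses into a
            # single CommonPrefixes entry -- the apparent "folder".
            cp = prefix + rest.split(delimiter, 1)[0] + delimiter
            if cp not in common_prefixes:
                common_prefixes.append(cp)
        else:
            contents.append(key)
    return contents, common_prefixes

keys = ["app/events/login-1.json", "app/events/login-2.json", "app/readme"]
print(list_objects(keys, prefix="app/", delimiter="/"))
# → (['app/readme'], ['app/events/'])
```

         | With no delimiter, the same call just returns every matching
         | key, which is all a prefix ever does.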
         | 
         | This article, as if you couldn't guess from the content, is
         | written from a position of scant knowledge of S3, so it's
         | not surprising that it misrepresents the details.
        
           | charcircuit wrote:
           | >There's no delimiter.
           | 
           | What's the delimiter parameter for then?
           | 
           | https://docs.aws.amazon.com/AmazonS3/latest/API/API_ListObje.
           | ..
        
             | inopinatus wrote:
             | To help you fool yourself. It affects how object list
             | results are presented in the api response.
        
               | ec109685 wrote:
               | They can't present a directory abstraction for list
               | operations without a delimiter. E.g. CommonPrefixes.
        
               | throwhauser wrote:
               | "To help you fool yourself" seems like a euphemism for
               | "to fool you". It's gotta be tough to go from "scant
               | knowledge of S3" to genuine knowledge if the
               | documentation is doing this to you.
               | 
               | If the docs are misrepresenting the details, who can
               | blame the author of the post?
        
               | inopinatus wrote:
               | The documentation is very clear on the purpose of the
               | delimiter parameter.
               | 
               | The OP does not read the docs, makes bad assumptions
               | repeatedly throughout, and then reaps the consequences.
        
             | nightpool wrote:
             | To provide a consistent API response as part of the
             | ListObjects call. It has nothing to do with the storage on
             | disk.
        
           | ebingdom wrote:
           | So if I have a bunch of objects whose names are hashes like
           | 2df6ad6ca44d06566cffde51155e82ad0947c736 that I expect to
           | access randomly, is there any performance benefit to
           | introducing artificial delimiters like
           | 2d/f6/ad6ca44d06566cffde51155e82ad0947c736? I've seen this
           | used in some places.
        
             | jstarfish wrote:
             | I don't know what impact that partitioning pattern has on
             | s3, but it has some obvious benefits if your app needs
             | to revert to writing to a normal filesystem instead
             | (like for testing).
        
             | dale_glass wrote:
             | To AWS S3, '/' isn't a delimiter, it's a character that's
             | part of the filename.
             | 
             | So for instance "/foo/bar.txt" and "/foo//bar.txt" are
             | different files in S3, even though they'd be the same file
             | in a filesystem.
             | 
             | This gets pretty fun if you want to mirror a S3 structure
             | on-disk, because the above suddenly causes a collision.
        
             | elcomet wrote:
             | No difference other than readability. And Amazon may
             | partition your objects under a different prefix anyway,
             | like "2d/f6/ad6c".
        
         | korostelevm wrote:
         | AWS does the optimizations over time based on access patterns
         | for the data. Should have made that clearer in the article.
         | 
         | The problem becomes unusual burst load - usually from
         | infrequent analytics jobs. The indexing can't respond fast
         | enough.
        
           | ebingdom wrote:
           | Thanks for the clarification. But now I'm confused about the
           | limits:
           | 
           | > 3,500 PUT/COPY/POST/DELETE requests per second per prefix
           | 
           | > 5,500 GET/HEAD requests per second per prefix
           | 
           | Most of those APIs don't even take a delimiter. So for these
           | limits, does the prefix get inferred based on whatever
           | delimiter you've used for previous list requests? What if
           | you've used multiple delimiters in the past?
           | 
           | Basically what I'm trying to determine is whether these
           | limits actually mean something concrete (that I can use for
           | capacity planning etc.), or whether their behavior depends on
           | heuristics that S3 uses under the hood.
           | 
           | I'm fine with S3 optimizing things under the hood based on
           | my access patterns, but not if it means I can't reason
           | about these limits as an outsider.
        
             | ec109685 wrote:
             | Delimiter isn't used for writes, only list operations.
             | 
             | S3 simply looks at the common string prefixes in your
             | object names and uses that to internally shard objects, so
             | you can achieve a multiple of those request limits.
             | 
             | aaa122348
             | 
             | aaa484585
             | 
             | bbb484858
             | 
             | bbb474827
             | 
             | Would have same performance as:
             | 
             | aaa/122348
             | 
             | aaa/484585
             | 
             | bbb/484858
             | 
             | bbb/474827
        
             | Macha wrote:
             | S3 does a lot of under the hood optimisation. e.g. Create a
             | brand new bucket, leave it cold for a while, and start
             | throwing 100 PUT requests a second at it. This is way less
             | than the advertised 3500, but they'll have scaled the
             | allocated resources down so much you'll get some
             | TooManyRequests errors.
        
             | acdha wrote:
             | Those are what I would assume for performance when the
             | system is stable. The concerns come from bursty behaviour
             | -- for example, if you put something new into production
             | you might have a period of time while S3 is adjusting
             | behind the scenes where you'll get transient errors from
             | some operations before it stabilizes (these have almost
             | always been resolved by retry in my experience). This is
             | reportedly something your AWS TAM can help with if you know
             | in advance that you're going to need to handle a ton of
             | traffic and have an idea of what the prefix distribution
             | will be like -- apparently the S3 support team can optimize
             | the partitioning for you in preparation.
        
         | xyzzy_plugh wrote:
         | The prefix isn't delimited, it's an arbitrary length based on
         | access patterns.
         | 
         | A fictitious example which is close to reality:
         | 
         | In parallel, you write a million objects each to:
         | 
         |     tomato/red/...
         |     tomato/green/...
         |     tomatoes/colors/...
         | 
         | The shortest prefixes that evenly divide writes are thus:
         | 
         |     tomato/r
         |     tomato/g
         |     tomatoes
         | 
         | If you had an existing access pattern of evenly writing to:
         | 
         |     tomatoes/colors/...
         |     bananas/...
         | 
         | the shortest prefixes would be:
         | 
         |     t
         |     b
         | 
         | So suddenly writing 3 million objects that begin with a t
         | would cause an uneven load or hotspot on the backing shards.
         | The system notices your new access pattern, determines new
         | prefixes, and moves data around to accommodate what it
         | thinks your needs are.
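         | 
         | A toy version of that load-splitting idea (my own sketch,
         | not S3's real partitioning algorithm): grow each prefix one
         | character at a time until no shard holds more keys than a
         | target threshold.

```python
# Hypothetical sketch of prefix-based sharding: split the key space on
# the shortest prefixes such that no shard exceeds max_per_shard keys.
# This is an illustration only, not AWS's actual algorithm.
def shard_prefixes(keys, max_per_shard):
    def split(prefix, group):
        # Stop when the shard is small enough, or a key is too short
        # to be split any further.
        if len(group) <= max_per_shard or any(len(k) <= len(prefix) for k in group):
            return {prefix: len(group)}
        buckets = {}
        for k in group:
            # Extend the prefix by one character and regroup.
            buckets.setdefault(k[: len(prefix) + 1], []).append(k)
        shards = {}
        for p, g in buckets.items():
            shards.update(split(p, g))
        return shards
    return split("", list(keys))

keys = ["tomato/red/1", "tomato/green/1", "tomatoes/colors/1", "bananas/1"]
print(shard_prefixes(keys, max_per_shard=1))
```

         | (The real system also adapts these boundaries over time, as
         | described above.)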
         | 
         | --
         | 
         | The delimiter is just a wildcard option. The system is
         | essentially just a key-value store. Specifying a delimiter
         | tells the system to transform a list query ending in a
         | delimiter, like
         | 
         |     my/path/
         | 
         | into a pattern match like
         | 
         |     my/path/[^/]+/?
        
           | stepchowfun wrote:
           | Thank you! This is the first explanation that I think fully
           | explains what I was confused about. So essentially the prefix
           | is just the first N bytes of the object's name, where N is a
           | per-bucket number that S3 automatically decides and adjusts
           | for you. And it has nothing to do with delimiters.
           | 
           | I find the S3 documentation and API to be really confusing
           | about this. For example, when listing objects, you get to
           | specify a "prefix". But this seems to be not directly related
           | to the automatically-determined prefix length based on your
           | access patterns. And [1] says things like "There are no
           | limits to the number of prefixes in a bucket.", which makes
           | no sense to me given that the prefix length is something that
           | S3 decides under the hood for you. Like, how do you even know
           | how many prefixes your bucket has?
           | 
           | [1] https://docs.aws.amazon.com/AmazonS3/latest/userguide/opt
           | imi...
        
             | inopinatus wrote:
             | It is related, in the sense both "prefixes" are a substring
             | match anchored at the start of the object name. They're
             | just not the same mechanism.
        
             | xyzzy_plugh wrote:
             | The sharding key is an implementation detail, so you're not
             | supposed to care about it too much.
        
               | kristjansson wrote:
               | That's true now. Used to be the case that they'd
               | recommend random or high-entropy parts of the keys go at
               | the beginning to avoid overloading a shard as you
               | described above.
               | 
               | From [0]:
               | 
               | > This S3 request rate performance increase removes any
               | previous guidance to randomize object prefixes to achieve
               | faster performance. That means you can now use logical or
               | sequential naming patterns in S3 object naming without
               | any performance implications. This improvement is now
               | available in all AWS Regions. For more information, visit
               | the Amazon S3 Developer Guide.
               | 
               | [0]: https://aws.amazon.com/about-aws/whats-
               | new/2018/07/amazon-s3...
        
       | wodenokoto wrote:
       | I've never been in this situation, but I do wish you could query
       | files with more advanced filters on these blob storage services.
       | 
       | - But why SageMaker?
       | 
       | - Why do some orgs choose to put almost everything in 1
       | bucket?
        
         | tyingq wrote:
         | >Why do some orgs choose to put almost everything in 1 buckets?
         | 
         | The article seems to be making the case it's because the
         | delimiter makes it seem like there's a real hierarchy. So the
         | ramifications of /bucket/1 /bucket/2 versus /bucket1/ /bucket2/
         | aren't well known until it's too late.
        
           | charcircuit wrote:
           | >So the ramifications of /bucket/1 /bucket/2 versus /bucket1/
           | /bucket2/ aren't well known until it's too late.
           | 
           | What's the difference?
        
             | musingsole wrote:
             | In the choice between a single bucket with hierarchical
             | paths versus multiple buckets, there's a long list of
             | nuances between either strategy.
             | 
             | For the purposes of this article, you can probably have
             | more intuitive, sensible lifecycle policies across multiple
             | buckets than you can trying to set policies on specific
             | paths within a single bucket. Something like
             | "ShortLifeBucket" and "LongLifeBucket" would allow you
             | to have items with similar prefixes (something like
             | "{bucket}/anApplication/file1.csv" in each bucket) that
             | then have different lifecycle policies.
        
         | liveoneggs wrote:
         | 1 athena?
         | 
         | 2 some jobs make a lot of data
        
         | korostelevm wrote:
         | For many at orgs like this, SageMaker is probably the shortest
         | path to an insane amount of compute with a python terminal.
         | 
         | Why single bucket? Once someone refers to a bucket as "the"
         | bucket - it is how it will forever be.
        
         | akdor1154 wrote:
         | > But why SageMaker?
         | 
         | You could ask the same thing of most times it gets used for ML
         | stuff as well.
         | 
          | > Why do some orgs choose to put almost everything in 1
          | bucket?
         | 
         | Anecdote: ours does because we paid (Multinational Consulting
         | Co)(tm) a couple of million to design our infra for us, and
         | that's what the result was.
        
       | liveoneggs wrote:
       | I have caused billing spikes like this before those little
       | warnings were invented and it was always a dark day. They are
       | really a life saver.
       | 
       | Lifecycle rules are also welcome. Writing them yourself was
       | always a pain and tended to be expensive, with list operations
       | eating up the API-call bill.
       | 
       | ----
       | 
       | Once I supported an app that dumped small objects into s3 and
       | begged the dev team to store the small objects in oracle as
       | BLOBs, to be concatenated into normal-sized s3 objects after a
       | reasonable timeout when no new small objects would reasonably
       | be created. They refused (of course) and the bills for
       | managing a bucket with millions and millions of tiny objects
       | were just what you'd expect.
       | 
       | I then went for a compromise solution asking if we could stitch
       | the small objects together after a period of time so they would
       | be eligible for things like infrequent access or glacier but,
       | alas, "dev time is expensive you know" so N figure s3 bills
       | continue as far as I know.
        
         | sharken wrote:
         | I suppose it's not just dev time on the line; the risk of
         | making the change is also thought to be too high.
         | 
         | If I ever get to be a manager I'd go for an idea such as yours.
         | Though I suspect too many managers are too far removed from the
         | technical aspect of things and don't listen nearly enough.
        
         | vdm wrote:
         | The warning should say "you have N million objects technically
         | eligible for an archive storage class and hitting the button to
         | transition them will cost $M".
         | 
         | Also S3 should no-op transitions for objects smaller than the
         | break-even size for each storage class, even if you ask it to.
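         | 
         | The break-even size is straightforward arithmetic. A sketch
         | with illustrative prices (my own placeholder numbers, not
         | current AWS list prices): a transition only pays off once
         | the one-time per-request cost is recouped by the per-GB-month
         | storage savings.

```python
# Break-even object size for a storage-class transition.
# All prices below are illustrative placeholders, not AWS list prices.
def break_even_bytes(transition_cost_per_obj, saving_per_gb_month, months):
    # Size (in bytes) at which the per-GB-month savings over `months`
    # exactly repay the one-time transition request cost.
    gb = transition_cost_per_obj / (saving_per_gb_month * months)
    return gb * 1024**3

# e.g. $0.00001 per transition, saving $0.019/GB-month, kept 12 months
size = break_even_bytes(0.00001, 0.019, 12)
print(f"break-even around {size / 1024:.0f} KiB")
```

         | Objects much smaller than that break-even size lose money on
         | the transition, which is the no-op behavior being asked for.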
        
         | darkwater wrote:
         | > I then went for a compromise solution asking if we could
         | stitch the small objects together after a period of time so
         | they would be eligible for things like infrequent access or
         | glacier but, alas, "dev time is expensive you know" so N figure
         | s3 bills continue as far as I know.
         | 
         | This hits home so hard that it hurts. In my case is not S3 but
         | compute bills but the core concept is the same.
        
           | WrtCdEvrydy wrote:
           | Because the bill isn't a "dev problem". Once you move those
           | bills to "devops", it becomes an infrastructure problem.
        
             | zrail wrote:
             | A big chunk of responsibility for teams doing cloud devops
             | is cost attribution. Cloud costs are incurred by services
             | and those services are owned by teams. Those teams should
             | be billed for their costs and encouraged (via spiffs or the
             | perf process if necessary) to manage them. Devops' job is
             | to build the tooling that allows that to happen.
        
           | [deleted]
        
       | StratusBen wrote:
       | On this topic, it's always surprising to me how few people even
       | seem to know about different storage classes on S3...or even
       | intelligent tiering (which I know carries a cost to it, but
       | allows AWS to manage some of this on your behalf which can be
       | helpful for certain use-cases and teams).
       | 
       | We did an analysis of S3 storage levels by profiling 25,000
       | random S3 buckets a while back for a comparison of Amazon S3 and
       | R2* and nearly 70% of storage in S3 was StandardStorage which
       | just seems crazy high to me.
       | 
       | * https://www.vantage.sh/blog/the-opportunity-for-cloudflare-r...
        
         | blurker wrote:
         | I think that it's not just people not knowing about the
         | lifecycle feature, but also that when they start putting data
         | into a bucket they don't know what the lifecycle should be yet.
         | Honestly I think overdoing lifecycle policies is a potentially
         | bigger foot gun than not setting them. If you misuse glacier
         | storage that will really cost you big $$$ quickly! And who
         | wants to be the dev who deleted a bunch of data they shouldn't
         | have?
         | 
         | Lifecycle policies are simple in concept, but it's actually not
         | simple to decide what they should be in many cases.
        
       | rizkeyz wrote:
       | I did the back-of-the-envelope math once. You get a Petabyte of
       | storage today for $60K/year if you buy the hardware (retail
       | disks, server, energy). It actually fits into the corner of a
       | room. What do you get for $60K in AWS S3? Maybe a PB for 3 months
       | (w/o egress).
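       | 
       | A rough check of that figure (assuming an S3 Standard rate of
       | about $0.021/GB-month; actual pricing is tiered and varies by
       | region):

```python
# Back-of-the-envelope: monthly S3 Standard cost for 1 PB.
# $0.021/GB-month is an assumed illustrative rate, not a quoted price.
PB_IN_GB = 1024**2  # 1,048,576 GB
monthly = PB_IN_GB * 0.021
quarterly = monthly * 3
print(f"~${monthly:,.0f}/month, ~${quarterly:,.0f}/quarter")
```

       | which lines up with roughly one PB-quarter on a ~$60K budget.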
       | 
       | If you replace all your hardware every year, the cloud is 4x
       | more expensive. If you manage to use your ghetto-cloud for 5
       | years, you are 20x cheaper than Amazon.
       | 
       | To store one TB per person on this planet in 2022, it would
       | take a mere $500M to do that. That's chump change for a
       | slightly bigger company these days.
       | 
       | I guess by 2030 we should be able to record everything a human
       | says, sees, hears and speaks in an entire life for every human on
       | this planet.
       | 
       | And by 2040 we should be able to have machines learning all
       | about human life, expression and intelligence, slowly making
       | sense of all of this.
        
       | zitterbewegung wrote:
       | I was at a presentation where HERE technologies told us that they
       | went from being on the top ten (or top five) S3 users (by data
       | stored) to getting off of that list. This was seen as a big deal
       | obviously.
        
       | harshaw wrote:
       | AWS Budgets is a tool for cost containment (among other
       | external services).
        
       | 0x002A wrote:
       | Each time a developer does something on a cloud platform, the
       | platform may start to profit for two reasons: vendor lock-in
       | and accrued long-term costs, regardless of the unit cost.
       | 
       | Anything limitless/easiest has a higher hidden cost attached.
        
       | Tehchops wrote:
       | We've got data in S3 buckets at nowhere near that scale, and
       | managing them, god forbid trying a mass delete, is absolute
       | tedium.
        
         | amelius wrote:
         | Mass delete also takes an eternity on my Linux desktop machine.
         | 
         | The filesystem is hierarchical, but the delete operation still
         | needs to visit all the leaves.
        
           | amelius wrote:
           | (This is a good example where Garbage Collection wins over
           | schemes which track reference explicitly, like reference
           | counting. A garbage collector can just throw away the
           | reference, while other schemes need to visit every leaf
           | resulting in hours of deletion time in some cases.)
        
           | sokoloff wrote:
           | Is S3 actually hierarchical? I always took the mental model
           | that the S3 object namespace within a bucket was flat and the
           | treatment of '/' as different was only a convenient fiction
           | presented in the tooling, which is consistent with the claim
           | in this article.
        
             | cle wrote:
             | This is mostly correct, with the additional feature that S3
             | can efficiently list objects by "key prefix" which helps
             | preserve the illusion.
        
               | sokoloff wrote:
               | Followup question: Is there something special about the
               | PRE notations in the example output below? I can list
               | objects by _any_ textual prefix, but I can 't tell if the
               | PRE (what we think of as folders) is more efficient than
               | just the substring prefix.
               | 
                | Full bucket list, then two text prefixes, then an
                | (empty) folder list:
                | 
                | sokoloff@ Downloads % aws s3 ls s3://foo-asdf
                |            PRE bar-folder/
                |            PRE baz-folder/
                | 2022-02-17 09:25:38          0 bar-file-1.txt
                | 2022-02-17 09:25:42          0 bar-file-2.txt
                | 2022-02-17 09:25:57          0 baz-file-1.txt
                | 2022-02-17 09:25:49          0 baz-file-2.txt
                | 
                | sokoloff@ Downloads % aws s3 ls s3://foo-asdf/ba
                |            PRE bar-folder/
                |            PRE baz-folder/
                | 2022-02-17 09:25:38          0 bar-file-1.txt
                | 2022-02-17 09:25:42          0 bar-file-2.txt
                | 2022-02-17 09:25:57          0 baz-file-1.txt
                | 2022-02-17 09:25:49          0 baz-file-2.txt
                | 
                | sokoloff@ Downloads % aws s3 ls s3://foo-asdf/bar
                |            PRE bar-folder/
                | 2022-02-17 09:25:38          0 bar-file-1.txt
                | 2022-02-17 09:25:42          0 bar-file-2.txt
                | 
                | sokoloff@ Downloads % aws s3 ls s3://foo-asdf/bar-folder
                |            PRE bar-folder/
        
               | jsmith45 wrote:
               | Umm... that output seems confusing.
               | 
               | The ListObjects api will omit all objects that share a
               | prefix that ends in the delimiter, and instead put said
               | prefix into the CommonPrefix element, which would be
               | reflected as PRE lines. (So with a delimiter of '/', it
               | basically hides objects in "subfolders", but lists any
               | subfolders that match your partial text in the
               | CommonPrefix element).
               | 
               | By default `aws s3 ls` will not show any objects within a
               | CommonPrefix but simply shows a PRE line for them. The
               | cli does not let you specify a delimiter, it always uses
               | '/'. To actually list all objects you need to use
               | `--recursive`.
               | 
               | The output there would suggest that bucket really did
               | have object names that began with `bar-folder/`, and that
               | last line did not list them out because you did not
               | include the trailing slash. Without the trailing slash it
               | was just listing objects and CommonPrefixes that match
               | the string you specified after the last delimiter in your
               | url. Since only that one common prefix matched, only it
               | was printed.
        
               | jrochkind1 wrote:
               | I don't understand the answer to that question either.
               | Other AWS docs says you can choose whatever you want for
               | a delimiter, there's nothing special about `/`. So how
               | does that apply to what they say about performance and
               | "prefixes"?
               | 
               | Here is some AWS documentation on it:
               | 
               | https://docs.aws.amazon.com/AmazonS3/latest/userguide/opt
               | imi...
               | 
               | > For example, your application can achieve at least
               | 3,500 PUT/COPY/POST/DELETE or 5,500 GET/HEAD requests per
               | second per prefix in a bucket. There are no limits to the
               | number of prefixes in a bucket. You can increase your
               | read or write performance by using parallelization. For
               | example, if you create 10 prefixes in an Amazon S3 bucket
               | to parallelize reads, you could scale your read
               | performance to 55,000 read requests per second.
               | 
               | Related to your question, even if we just stick to `/`
               | because it seems safer, does that mean that
               | "foo/bar/baz/1/" and "foo/bar/baz/2/" are two prefixes
               | for the point of these request speed limits? Or does the
               | "prefix" stop at the first "/" and files with these
               | keypaths are both in the same "prefix" "foo/"?
               | 
               | Note there was (according to docs) a change a couple
               | years ago that I think some people haven't caught on to:
               | 
               | > For example, previously Amazon S3 performance
               | guidelines recommended randomizing prefix naming with
               | hashed characters to optimize performance for frequent
               | data retrievals. You no longer have to randomize prefix
               | naming for performance, and can use sequential date-based
               | naming for your prefixes.
        
           | the8472 wrote:
           | Most recursive deletion routines are not optimized for speed.
           | This could be done much faster with multiple threads or
           | batching the calls via io_uring.
           | 
           | Another option are LVM or btrfs subvolumes which can be
           | discarded without recursive traversal.
        
           | res0nat0r wrote:
           | Use delete-objects instead and it will be much faster, as
           | you can supply up to 1000 keys to remove in a single API
           | call.
           | 
           | https://awscli.amazonaws.com/v2/documentation/api/latest/ref.
           | ..
        
           | deepsun wrote:
           | I believe it's mostly a problem of latency between your
           | machine and S3, since each Delete call is issued
           | separately in its own HTTP connection.
           | 
           | 1. Try parallelization of your calls. Deleting 20 objects in
           | parallel should take the same time as deleting 1.
           | 
           | 2. Try to run deletion from an AWS machine in the same region
           | as the S3 bucket (yes buckets are regional, only their names
           | are global). Within-datacenter latency should be lower than
           | between your machine and datacenter.
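           | 
           | Point 1 can be sketched with a thread pool (simulated here
           | with a stand-in delete function, since the win comes from
           | overlapping network round-trips):

```python
from concurrent.futures import ThreadPoolExecutor
import time

def delete_object(key):
    # Stand-in for one per-object DELETE round-trip (~50 ms latency).
    time.sleep(0.05)
    return key

keys = [f"obj-{i}" for i in range(20)]

start = time.perf_counter()
with ThreadPoolExecutor(max_workers=20) as pool:
    deleted = list(pool.map(delete_object, keys))
elapsed = time.perf_counter() - start

# 20 parallel "deletes" finish in roughly the time of one round-trip,
# not 20x that, because the waits overlap.
print(f"deleted {len(deleted)} objects in {elapsed:.2f}s")
```

           | The same shape works with a real per-key delete call in
           | place of the stand-in.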
        
         | anderiv wrote:
         | Set a lifecycle rule to delete your objects. Come back a day
         | later and AWS will have taken care of this for you.
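         | 
         | For example, a bucket-wide expiration rule (shape per the S3
         | lifecycle configuration API; "expire-everything" is a
         | made-up rule ID) looks like:

```json
{
  "Rules": [
    {
      "ID": "expire-everything",
      "Status": "Enabled",
      "Filter": {"Prefix": ""},
      "Expiration": {"Days": 1}
    }
  ]
}
```

         | The empty Prefix filter makes the rule apply to every object
         | in the bucket.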
        
           | ramraj07 wrote:
           | The issue is this isn't free. I played around and ended up
           | with a few-hundred-million-object S3 bucket on a personal
           | project and am trying to get rid of it without getting a
           | bill. Seriously considering just getting suspended from
           | AWS if that's a viable path lol.
        
             | anderiv wrote:
             | "You are not charged for expiration or the storage time
             | associated with an object that has expired."
             | 
             | From: https://docs.aws.amazon.com/AmazonS3/latest/userguide
             | /lifecy...
        
             | orf wrote:
             | Lifecycle rules are free. Use them to empty the bucket.
        
         | lnwlebjel wrote:
         | Very true: it took me about a month of emptying, deleting and
         | life cycling about a dozen buckets of about 20 TB (~20 million
         | objects) to get to zero.
        
       | vdm wrote:
       | DeleteObjects takes 1000 keys per call.
       | 
       | Lifecycle rules can filter by min/max object size. (since Nov
       | 2021)
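       | 
       | A minimal sketch of batching keys for DeleteObjects (the
       | boto3 call is shown commented out; "my-bucket" is a
       | placeholder name):

```python
# Batch keys into groups of 1000, the per-call DeleteObjects limit.
def chunks(keys, size=1000):
    for i in range(0, len(keys), size):
        yield keys[i:i + size]

keys = [f"logs/part-{i}.json" for i in range(2500)]
batches = list(chunks(keys))
print([len(b) for b in batches])  # → [1000, 1000, 500]

# import boto3
# s3 = boto3.client("s3")
# for batch in batches:
#     s3.delete_objects(
#         Bucket="my-bucket",  # placeholder bucket name
#         Delete={"Objects": [{"Key": k} for k in batch], "Quiet": True},
#     )
```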
        
         | electroly wrote:
         | Thank you for mentioning that lifecycle rule change. I must
         | have missed the announcement; that is exactly the functionality
         | I needed.
        
         | vdm wrote:
         | Athena supports regexp_like(). By loading in an S3 inventory
         | this can match what a wildcard would. Then a Batch Operations
         | job can tag the result.
         | 
         | Not easy, but is possible and effective.
        
       | asim wrote:
       | The AWS horror stories never cease to amaze me. It's like we're
       | banging our heads against the wall expecting a different outcome
       | each time. What's more frustrating, the AWS zealots are quite
       | happy to tell you how you're doing it wrong. It's the user's
       | fault for misusing the service. The reality is, AWS was built
       | for a specific purpose and demographic of user. Its complexity
       | and scale now make it unusable for newer devs. I'd argue we
       | need a completely new experience for the next generation.
        
         | jiggawatts wrote:
         | My theory is that single-platform clouds actually make more
         | sense than trying to be everything for everyone. While the
         | latter can scale to $billions, the former might actually have
         | higher margins because it delivers more value.
         | 
         | An example might be something like a Kubernetes-only cloud
         | driven entirely by Git-ops. Not TFVC, or CVS, or Docker Swarm,
         | or some hybrid of a proprietary cloud and K8s. Literally just a
         | Git repo that materialises Helm charts onto fully managed K8s
         | clusters. That's it.
         | 
         | If you try to do anything similar in, say, Azure, you'll
         | discover that:
         | 
         | Their DevOps pipelines are managed by a completely separate
         | product group and don't natively integrate into the
         | platform.
         | 
         | You now have K8s labels and Azure tags.
         | 
         | You now have K8s logging and Azure logging.
         | 
         | You now have K8s namespaces and Azure resource groups.
         | 
         | You now have K8s IAM and Azure IAM.
         | 
         | You now have K8s storage and Azure disks.
         | 
         | Just that kind of duplication of concepts _alone_ can take this
         | one system 's complexity to a level where it's impossible for a
         | pure software development team to use without having a
         | dedicated DevOps person!
         | 
         | Azure App Service or AWS Elastic Beanstalk are similarly overly
         | complex, having to bend over backwards to support scenarios
         | like "private network integration". Yeah, that's what
         | developers _want_ to do, carve up subnets and faff around with
         | routing rules!  /s
         | 
         | For example, if you deploy a pre-compiled web app to App
         | Service, it'll... compile it again. For compatibility with a
         | framework you aren't using! You need a poorly documented
         | environment variable flag to work around this. There are a
         | _dozen_ more issues like this, and they keep piling up fast.
         | 
         | Developers just want a platform they can push code to and have
         | it run with high availability and disaster recovery provided
         | as-if turning on a tap.
        
         | rmbyrro wrote:
         | What do you see missing or not well explained in AWS
         | documentation that newer devs wouldn't understand?
         | 
         | I started using S3 early in my career and didn't see this
         | problem. I always thought about data retention during the
         | design phase.
         | 
         | My opinion is that developers who are lazy, careless, or
         | under time pressure will not, and will then get bitten. But
         | that would happen with any tool. Maybe a different problem,
         | but they'll always get bitten hard ...
        
         | 015a wrote:
         | I agree 110%.
         | 
         | Actually, I disagree with one statement: "AWS was built for a
         | specific purpose and demographic of user". AWS wasn't built for
         | anyone. It was built for everyone, and is thus not especially
         | productive for anyone. AWS's entire product development
         | methodology is "customer asks for this, build it"; there's no
         | high level design, very few opinions, five different services
         | can be deployed to do the same thing, it's absolute madness and
         | getting worse every year. Azure's methodology is "copy whatever
         | AWS is doing" (source: engineers inside Azure), so they inherit
         | the same issues, which makes sense for Microsoft because
         | they've always been an organization gone mad.
         | 
         | If there's one guiding light for Big Cloud, it's: they're built
         | to be sold to people who buy cloud resources. I don't even feel
         | this is entirely accurate, given that this demographic of
         | purchaser should _at least_, if nothing else, be considerate
         | of the cost, and there's zero chance of Big Cloud winning that
         | comparison without deceit, but if there was a demographic
         | that's who it'd be.
         | 
         | > I'd argue, we need a completely new experience for the next
         | generation.
         | 
         | Fortunately, the world is not all Big Cloud. The work
         | Cloudflare is doing between Workers & Pages represents a
         | _really_ cool and productive application environment. Netlify
         | is another. Products like Supabase do a cool job of vendoring
         | open source tech with traditional SaaS ease-of-use, with fair
         | billing. DigitalOcean is also becoming big in the  "easy cloud"
         | space, between Apps, their hosted databases, etc. Heroku still
         | exists (though I feel they've done a very poor job of
         | innovating recently, especially in the cost department).
         | 
         | The challenge really isn't in the lack of next-gen PaaS-like
         | platforms; it's in countering the hypnosis puked out by Big
         | Cloud Sales that they're the only "secure", "reliable",
         | "whatever" option. This hypnosis has infected tons of otherwise
         | very smart leaders. You ask these people "let's say we are at
         | four nines now; how much are you willing to pay, per month, to
         | reach five nines? and remember Jim, four-nines is one hour of
         | downtime a year." No one can answer that. No one.
         | 
         | End point being: anyone who thinks Big Cloud will reign supreme
         | forever hasn't studied history. Enterprise contracts make it
         | impossible for them to clean the cobwebs from their closets.
         | They will eventually become the next Oracle or IBM, and the
         | cycle repeats. It's not an argument to always run your own
         | infra or whatever; but it _is_ an argument to lean on and
         | support open source.
        
           | jiggawatts wrote:
           | > Azure's methodology is "copy whatever AWS is doing"
           | (source: engineers inside Azure), so they inherit the same
           | issues, which makes sense for Microsoft because they've
           | always been an organization gone mad.
           | 
           | I guessed as much, but it's funny to see it confirmed.
           | 
           | I got suspicious when I realised Azure has many of the same
           | bugs and limitations as AWS despite being supposedly
           | completely different / independent.
        
         | inopinatus wrote:
         | That's just it, though: it isn't an AWS horror story. It's the
         | sorcerer's apprentice.
        
         | deanCommie wrote:
         | HackerNews loves to criticize the cloud. It always reminds me
         | of this infamous Dropbox comment:
         | https://news.ycombinator.com/item?id=9224
         | 
         | The cloud abstracts SO MUCH complexity from the user. The fact
         | that people are then gleefully taking these "simple" services
         | and overloading them with way too much data, and way too much
         | complexity on top is not a failure of the underlying
         | primitives, but a success.
         | 
         | Without these cloud primitives, the people footgunning
         | themselves with massive bills would just not have a working
         | solution AT ALL.
        
           | jiggawatts wrote:
           | > The people footgunning themselves with massive bills would
           | just not have a working solution AT ALL.
           | 
           | Sometimes guard rails are a good thing, and the AWS
           | philosophy has _very firmly_ been against guard rails,
           | especially related to spending. The issue has come up here
           | again and again that AWS refuses to add cost limits, even
           | though they are capable of it. Azure _copied_ this
           | limitation. I don't mean that they didn't implement cost
           | limits. They _did!_ The Visual Studio subscriber accounts
           | have cost limits. I mean that they refused to allow anyone
           | to use this feature in pay-as-you-go accounts.
           | 
           | Let me give you a practical example: If I host a blog on some
           | piece of tin with a wire coming out of it, my $/month is not
           | just predictable, but _constant_. There's a cap on the
           | outbound bandwidth, and a cap on compute spending. If my blog
           | goes viral, it'll slow down to molasses, but my bank account
           | will remain unmolested. If a DDoS hits it, it'll go down...
           | and then come back up when the script kiddie gets bored
           | and moves on.
           | 
           | Hosting something like this on even the _most efficient_
           | cloud-native architecture possible, such as a static site on
           | an S3 bucket or Azure Storage Account is _wildly dangerous_.
           | There is literally nothing I can do to stop the haemorrhaging
           | if the site goes popular.
           | 
           | Oh... set up some triggers or something... you're about to
           | say, right? The billing portal has a _multi-day_ delay on it!
           | You can bleed $10K per _hour_ and not have a clue that this
           | is going on.
           | 
           | And even if you know... then what? There's no "off" button!
           | Seriously, try looking for "stop" buttons on anything that's
           | not a VM in the public cloud. S3 buckets and Storage Accounts
           | certainly don't have anything like that. At best, you could
           | implement a firewall rule or something, but each and every
           | service has a unique and special way of implementing a "stop
           | bleeding!" button.
           | 
           | I don't have time for this, and I can't wear the risk.
           | 
           | This is why the cloud -- as it is right now -- is just too
           | dangerous for most people. The abstractions it provides
           | aren't just leaky; the holes have razor-sharp edges that
           | have cut the hands of many people who think it works just
           | like on-prem, only simpler.
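           | 
           | For illustration, a hedged sketch of the closest thing S3
           | offers to a "stop the bleeding" button: flipping on the
           | bucket-level public-access block. This only cuts off
           | anonymous traffic (authenticated and pre-signed requests
           | still go through), the bucket name is hypothetical, and it
           | assumes boto3:

```python
# Hypothetical "panic button": S3 has no spending cap, but blocking
# all public access at least stops anonymous reads of a bucket that
# has gone viral. NOT a true off switch: authenticated and pre-signed
# requests are unaffected.

def panic_block(s3_client, bucket):
    """Enable every public-access block flag on `bucket`."""
    config = {
        "BlockPublicAcls": True,
        "IgnorePublicAcls": True,
        "BlockPublicPolicy": True,
        "RestrictPublicBuckets": True,
    }
    s3_client.put_public_access_block(
        Bucket=bucket,
        PublicAccessBlockConfiguration=config,
    )
    return config

if __name__ == "__main__":
    import boto3  # requires AWS credentials; bucket name is made up
    panic_block(boto3.client("s3"), "my-haemorrhaging-bucket")
```

           | (Each service needs its own bespoke version of this, which
           | is exactly the problem.)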
        
         | jollybean wrote:
         | In this case it is absolutely the user 'doing it wrong'.
         | 
         | AWS allows you to store gigantic amounts of data, thus lowering
         | the bar dramatically for the kinds of things that we will keep.
         | 
         | This invariably creates a different kind of problem when those
         | thresholds are met.
         | 
         | In this case, you have 'so much data you don't know what to do
         | with it'.
         | 
         | Akin to having 'really cheap warehouse storage space' that just
         | gets filled up.
         | 
         | "Its complexity and scale now make it unusable for newer
         | devs."
         | 
         | No - the 'complexity' bit is a bit of a problem, but not the
         | scale.
         | 
         | The 'complexity bit' can be overcome if you stick to some very
         | basic things like running Ec2 instances and very basic security
         | configs. Beyond that, yes it's hard. But the 'equivalent' of
         | having your own infra would be simply to have a bunch of Ec2
         | instances on AWS and 'that's it' - and that's essentially
         | achievable without much fuss. That's always an option to small
         | companies, i.e. 'just run some instances' and don't touch
         | anything else.
        
         | dasil003 wrote:
         | I'm not sure any sizable group is banging their head against a
         | wall. Yes, AWS is complex. Yes, AWS has cost foot guns. These
         | are natural outcomes of removing friction from scaling.
         | 
         | Sure we could start with something simpler, but as you may have
         | noticed, even the more basic hosting providers like
         | DigitalOcean and Linode have been adding S3-compatible object
         | storage because of its proven utility.
         | 
         | In terms of making something meaningfully simpler, I think
         | Heroku was the high water mark. But even though it was a great
         | developer experience, the price/performance barriers were a lot
         | more intractable than dealing with AWS.
        
           | WaxProlix wrote:
           | Heroku did so much right. I recently was toying with some bot
           | frameworks (think Discord or IRC, nothing spammy or review-
           | gaming) and getting everything set up on a free tier dyno
           | with free managed sql backing it up, and a github test/build
           | integration, all took an hour or so. Really exceeded my
           | expectations.
           | 
           | Not sure how it scales for production loads but my experience
           | was so positive I'll probably go back for future projects.
        
             | greiskul wrote:
             | Yeah, heroku is absolutely the best in just getting
             | something running. Truth is most projects don't ever have
             | to scale, either because they are hobby projects or
             | because they just fail. Heroku is the simplest platform
             | that I know
             | to just quickly test something. If you do find a good
             | market fit and then need to scale, then sure, use some time
             | to get out of it. But for proof of concepts, rapid
             | iteration, etc. Heroku is awesome.
        
               | ericpauley wrote:
               | I'll argue that Fly.io is beginning to meet that need in
               | a lot of ways, especially with managed Postgres now.
        
           | marcosdumay wrote:
           | > These are natural outcomes of removing friction from
           | scaling.
           | 
           | Yes, and making scaling frictionless brings a very tiny bit
           | of value for everybody, but a huge amount of value for the
           | cloud operator. Any bit of friction would completely remove
           | that problem.
           | 
           | Also, focusing on scaling before efficiency benefits nobody
           | but the cloud provider.
        
         | _jal wrote:
         | > we need a completely new experience for the next generation
         | 
         | I mean, at some point, if you're (say) using some insane amount
         | of storage, you're going to pay for that.
         | 
         | I would agree that getting alerting right for billing-relevant
         | events _at whatever you're currently operating at_ should be a
         | lot easier than it is. And I agree that there is a lot of room
         | to baby-proof some of the less obvious mistakes that people
         | frequently make, to better expose the consequences of some
         | changes, etc.
         | 
         | But the flip side is that infra has always been expensive, and
         | vendors have always been more than happy to sell you far more
         | than you need along with the next new shiny whatever.
         | 
         | To the extent that these are becoming implicit decisions made
         | by developers rather than periodic budgeted refresh events
         | built by infra architects, developers need to take
         | responsibility for understanding the implications of what
         | they're doing.
        
         | ignoramous wrote:
         | > _It's the user's fault for misusing the service._
         | 
         | I believe AWS's _usage_-based billing makes for long-tail
         | surprises because its users are designing systems exactly as
         | one would expect them to. For example, S3 was never meant for
         | a bazillion small objects, which Kinesis Firehose makes it
         | easy to deliver. In such cases, dismal retrieval performance
         | aside [0], the costs to list/delete dominate abnormally.
         | 
         | We spin up an AWS Batch job every day to coalesce all S3
         | files ingested that day into large zlib'd parquets (kind of a
         | reverse _VACUUM_ as in Postgres / _MERGE_ as in
         | Elasticsearch). This setup is painful. I guess the lesson
         | here is, one needs to architect for both billing and scale,
         | right from the get-go.
         | 
         | [0] https://news.ycombinator.com/item?id=19475726
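         | 
         | For illustration, a minimal sketch of the batching step a
         | daily coalesce job like the one described above needs:
         | greedily packing (key, size) pairs into batches of roughly a
         | target byte count before merging each batch into one large
         | object. `plan_batches` is a hypothetical helper, not an AWS
         | API:

```python
# Greedy bin-packing of small S3 objects into merge batches. Each
# batch can then be downloaded, concatenated, and re-uploaded as a
# single large object (e.g. a parquet file).

def plan_batches(objects, target=128 * 1024 * 1024):
    """Pack (key, size) pairs into batches of ~`target` bytes each."""
    batches, current, current_size = [], [], 0
    for key, size in objects:
        # Close the current batch once adding this object would
        # overshoot the target.
        if current and current_size + size > target:
            batches.append(current)
            current, current_size = [], 0
        current.append(key)
        current_size += size
    if current:
        batches.append(current)
    return batches
```

         | The keys per batch would typically come from a ListObjectsV2
         | sweep or an S3 Inventory report, which is far cheaper at this
         | scale.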
        
       | jwalton wrote:
       | Your website renders as a big empty blue page in Firefox unless I
       | disable tracking protection (and in my case, since I have
       | noscript, I have to enable javascript for "website-files.com", a
       | domain that sounds totally legit).
        
         | mst wrote:
         | I have tracking protection and ublock origin both enabled and
         | it rendered fine (FF on Win10).
         | 
         | (presented as a data point for any poor soul trying to
         | replicate your problem)
        
         | tazjin wrote:
         | Chrome with uBlock Origin on default here, and it renders a big
         | blue empty page for me, too. That's despite dragging in an
         | ungodly amount of assets first.
         | 
         | Here's an archive link that works without any tracking, ads,
         | Javascript etc.: https://archive.is/F5KZd
        
         | moffkalast wrote:
         | Noscript breaking websites? Who woulda thunk.
         | 
         | How do you manage to navigate the web with that on by default?
         | It breaks just about everything since nothing is a static site
         | these days.
        
         | Sophira wrote:
         | The problem is that the DIV that contains the main text has the
         | attribute 'style="opacity:0"'. Presumably, this is something
         | that the JavaScript turns off.
         | 
         | A lot of sites like to do things like this for some reason. I
         | haven't figured out why. I like to use Stylus to mitigate these
         | if I can, rather than enabling JavaScript.
        
           | test1235 wrote:
           | mitigation for a flash of unstyled content (FOUC) maybe?
        
           | ectopod wrote:
           | A lot of these sites (including this one) do work in reader
           | view.
        
           | acdha wrote:
           | This is a common anti-pattern -- I believe they're trying to
           | ensure that the web fonts have loaded before the text
           | displays but it's really annoying for mobile users since it
           | can add up to 2.5 seconds (their timeout) to the time before
           | you can start reading unless you're using reader mode at
           | which point it renders almost instantly.
        
           | [deleted]
        
           | MattRix wrote:
           | The page animates in. I have no idea why it does, but it
           | does, which explains why the opacity starts at 0%.
        
       | cj wrote:
       | Off topic: for people with a "million billion" objects, does the
       | S3 console just completely freeze up for you? I have some large
       | buckets that I'm unable to even interact with via the GUI. I've
       | always wondered if my account is in some weird state or if
       | performance is that bad for everyone. (This is a bucket with
       | maybe 500 million objects, under a hundred terabytes)
        
         | hakube wrote:
         | I have millions of objects (about 16m PDF and text files) and
         | the console completely freezes.
        
         | albert_e wrote:
         | I suggest you raise a support ticket.
         | 
         | AFAIK there is server-side paging implemented in the List* API
         | operations that the Console UI should be using so that the
         | number of objects in a bucket should not significantly impact
         | the webpage performance.
         | 
         | But who knows what design flaws lurk beneath the console.
         | 
         | Curious to know what you find.
         | 
         | Does it happen only on opening heavy buckets, or in the
         | entire S3 console? Do a different browser / incognito / a
         | different machine make a difference?
        
         | liveoneggs wrote:
         | the newer s3 console works a little better. It gives pagination
         | with "< 1 2 3 ... >"
        
         | kristjansson wrote:
         | Just checked, out of curiosity. A bucket at $WORK with ~4B
         | objects / ~100TB is completely usable through the console. Keys
         | are hierarchal, and relatively deep, so no one page on the GUI
         | is trying to show more than a few hundred keys. If your keys
         | are flatter, I could see how the console be unhappy.
        
         | grumple wrote:
         | Sort of related, I faced such an issue when I had a gui table
         | that was triggering a count on a large object set via sql so it
         | could display the little "1 to 50 of 1000000". This is
         | presumably why services like google say "of many". Wonder if
         | they have a similar issue.
        
         | base698 wrote:
         | Yes, and sometimes even listing can take days.
         | 
         | I worked somewhere where someone decided piping the Twitter
         | Firehose into S3 was a good idea, keyed one file per tweet.
         | 
         | I ended up figuring out a way to fetch them in batches and
         | condense them. It ended up costing about $800 per hour to
         | fix, coupled with the lifecycle changes they mentioned.
        
           | properdine wrote:
           | Doing an S3 object inventory can be a lifesaver here!
        
           | orf wrote:
           | > Yes, and sometimes even listing can take days.
           | 
           | You have a versioned bucket with a lot of delete markers in
           | it. Make sure you've got a lifecycle policy to clean them up.
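           | 
           | For illustration, a sketch of such a lifecycle policy set
           | via boto3. The rule removes expired delete markers and, as
           | an arbitrary example, noncurrent versions older than 30
           | days; the bucket name is hypothetical:

```python
# Lifecycle configuration for a versioned bucket: clean up delete
# markers whose noncurrent versions are all gone, and expire old
# noncurrent versions so markers can actually become "expired".

LIFECYCLE = {
    "Rules": [
        {
            "ID": "purge-delete-markers",
            "Status": "Enabled",
            "Filter": {},  # empty filter = apply to the whole bucket
            "Expiration": {"ExpiredObjectDeleteMarker": True},
            "NoncurrentVersionExpiration": {"NoncurrentDays": 30},
        }
    ]
}

if __name__ == "__main__":
    import boto3  # requires AWS credentials; bucket name is made up

    boto3.client("s3").put_bucket_lifecycle_configuration(
        Bucket="my-versioned-bucket",
        LifecycleConfiguration=LIFECYCLE,
    )
```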
        
         | CobrastanJorji wrote:
         | I'm curious. If you have a bucket with perhaps half a billion
         | objects, what is the use case that leads you to wanting to
         | navigate through it with a GUI? Are you perhaps trying to go
         | through folders with dates looking for a particular day or
         | something?
        
         | jq-r wrote:
         | In my previous company we had around 15K instances in an EC2
         | region, and the EC2 GUI was unusable when set to the "new GUI
         | experience", so we always had to use the classic one. The new
         | one would try to fetch all the instances' details up front,
         | so once loaded it was fast, but getting there would take many
         | minutes or the request would just time out. Don't know if
         | they've fixed that.
        
         | twistedpair wrote:
         | Honestly this is when most folks move to using their own
         | dashboards, metrics, and tooling. The AWS GUIs were designed
         | for small to moderate use cases.
         | 
         | You don't peer into a bucket with a billion objects and ask for
         | a complete listing, or accounting of bytes. There are tools and
         | APIs for that.
         | 
         | That's what I do with my thousands of buckets and billions of
         | files (dashboards).
        
           | scapecast wrote:
           | It's also the reason why some AWS product teams have started
            | acquiring IDE- or CLI-type start-ups. They don't want to
           | be boxed in by the constraints of the AWS Console - which is
           | run by a central team. For example, the Redshift team bought
           | DataRow.
           | 
           | Disclosure, co-founder here, we're building one of those
           | CLIs. We started as an internal project at D2iQ (my co-
           | founder Lukas commented further up), with tooling to collect
           | an inventory of AWS resources and be able to search it
           | easily.
        
       | gfd wrote:
       | Does anyone have recommendations on how to compress the data
       | (gzip or parquet)?
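       | 
       | For illustration, one stdlib-only option: pack many small JSON
       | records into a single gzipped JSON-lines blob before upload.
       | (Parquet, e.g. via pyarrow, usually compresses and queries
       | better for columnar data, but adds a dependency.) The helper
       | names here are illustrative:

```python
# Concatenate small JSON records into one gzip-compressed
# newline-delimited JSON payload, and recover them on the way back.
import gzip
import json

def pack_records(records):
    """Serialize `records` as gzipped newline-delimited JSON bytes."""
    lines = "\n".join(json.dumps(r, sort_keys=True) for r in records)
    return gzip.compress(lines.encode("utf-8"))

def unpack_records(blob):
    """Inverse of pack_records."""
    text = gzip.decompress(blob).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line]
```

       | Fewer, larger objects also cut S3's per-request costs, which
       | is usually the bigger win at this scale.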
        
       | gtirloni wrote:
       | A "TLDR" that is not.
        
       | hughrr wrote:
       | For every $100k bill there's a hundred of us with 14TB that costs
       | SFA to roll with.
        
       ___________________________________________________________________
       (page generated 2022-02-17 23:00 UTC)