[HN Gopher] Gcloud storage: up to 94% faster data transfers for ...
       ___________________________________________________________________
        
       Gcloud storage: up to 94% faster data transfers for Cloud Storage
        
       Author : ingve
       Score  : 75 points
       Date   : 2022-10-07 18:31 UTC (4 hours ago)
        
 (HTM) web link (cloud.google.com)
 (TXT) w3m dump (cloud.google.com)
        
       | alpb wrote:
       | Does anyone know if the "gcloud storage" vs "gsutil" chart in the
       | article uses "gsutil -m" (parallel mode of the tool)?
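
For context on the question above: "gsutil -m" runs operations in parallel rather than one at a time. The sketch below only illustrates that general idea with local file copies and a thread pool. It is not gsutil's or gcloud storage's actual implementation, and `copy_one`, `copy_parallel`, and the worker count are hypothetical names and values chosen for illustration.

```python
import shutil
import tempfile
from concurrent.futures import ThreadPoolExecutor
from pathlib import Path


def copy_one(src: Path, dest_dir: Path) -> Path:
    """Copy a single file; stands in for one object transfer."""
    dest = dest_dir / src.name
    shutil.copyfile(src, dest)
    return dest


def copy_parallel(sources, dest_dir: Path, workers: int = 8):
    """Copy many files concurrently; results come back in input order."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(pool.map(lambda s: copy_one(s, dest_dir), sources))


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        src_dir = Path(tmp) / "src"
        src_dir.mkdir()
        files = []
        for i in range(20):
            p = src_dir / f"obj{i}.bin"
            p.write_bytes(bytes([i]) * 1024)
            files.append(p)
        copied = copy_parallel(files, Path(tmp) / "dst")
        print(len(copied), "files copied")
```

Whether a benchmark compares against the serial or the parallel mode matters because, for many small objects, per-request latency tends to dominate and concurrency hides much of it.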
        
       | davecamp1717 wrote:
       | Wow, feels good to see good news from GCP.
        
       | ZeroCool2u wrote:
       | "These tests have been performed on Google Cloud Platform using
       | n2d-standard-16 (8 vCPUs, 32 GB memory) and 1x375GB NVME in RAID0
       | in us-east4."
       | 
       | Am I missing something? Is there some configuration of RAID0 that
       | allows you to use a single drive? Or is this just a confusing way
       | of saying they have one logical drive available, because the two
       | attached NVME drives are configured in RAID0?
        
         | gtirloni wrote:
         | You can create a RAID-0 volume on top of a single drive. But it
         | seems this was a typo in the article.
        
       | unilynx wrote:
       | gsutil hardly ever finished when I tried to download a couple
       | hundred Google Workspace takeouts, preferring to consume all
       | available memory instead, so that's a pretty easy thing to beat.
       | 
       | Does it outperform rclone?
        
       | carbocation wrote:
       | I initially misunderstood the article's explanation of the gsutil
       | shim: namely, I thought it was "turned on" by default, but that
       | is not the case. One either needs to apply the shim, or to start
       | using `gcloud storage` instead of `gsutil`.
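
As a hedged illustration of "applying the shim": to my understanding of the gsutil documentation, the shim is enabled by setting `use_gcloud_storage=True` under the `[GSUtil]` section of the .boto config file. Treat that setting name as something to verify against the current docs. The helper below (`enable_shim` is a hypothetical name) is a minimal sketch that adds the setting to config text.

```python
import configparser
import io


def enable_shim(boto_text: str) -> str:
    """Return boto config text with use_gcloud_storage=True under [GSUtil].

    Assumption: this is the setting gsutil reads to enable the shim.
    """
    cfg = configparser.ConfigParser(interpolation=None)
    cfg.optionxform = str  # keep option names exactly as written
    cfg.read_string(boto_text)
    if not cfg.has_section("GSUtil"):
        cfg.add_section("GSUtil")
    cfg.set("GSUtil", "use_gcloud_storage", "True")
    out = io.StringIO()
    cfg.write(out)
    return out.getvalue()


if __name__ == "__main__":
    print(enable_shim("[Credentials]\ngs_service_key_file = /tmp/key.json\n"))
```

Note that `configparser` drops comments when rewriting, so for a real, heavily commented .boto file, editing the two lines by hand is likely the safer route.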
        
       | seabrookmx wrote:
       | This was inevitable it seems. If you look under the hood gsutil
       | is a wrapper around a vendored version of boto2. boto3 is a big
       | departure and never got support for Google Cloud Storage.
       | 
       | Also the fact that it was separate from the rest of the gcloud
       | CLI was a weird quirk. This fixes both of those issues.
        
       | mooman219 wrote:
       | I actually worked on gsutil a while ago! I added in-flight
       | compression (-J/-j). Glad to see gsutil is on its way out.
        
         | titanomachy wrote:
         | How did you choose the letter "j"? Were all the more obvious
         | ones just taken?
        
       | fragmede wrote:
       | Connect it up with Skyplane and you'll really be speeding along
       | 
       | https://medium.com/@paras_jain/skyplane-110x-faster-data-tra...
        
         | capableweb wrote:
         | I'm interested to see more benchmarks. The posted numbers in
         | that blog post seem too good to be true. Is tooling really so
         | far behind that a single tool could cut the transfer time
         | by that much?
         | 
         | The architecture goes through some of the methodology:
         | https://skyplane.org/en/latest/architecture.html which is what
         | I'm currently reading through and trying to understand.
        
       | keltex wrote:
       | Looks like they are missing the useful "rsync" command from
       | gsutil (Synchronize content of two buckets/directories).
        
       | nomadiccoder wrote:
       | Also a lot more convenient to use with WIF (Workload Identity
       | Federation).
        
       | rsync wrote:
       | From the article:
       | 
       | "The new gcloud storage CLI offers significant performance
       | improvements over the existing gsutil ..."
       | 
       | It makes sense to see this comparison because 'gsutil' was
       | relatively poor - we[1] made a decision to deprecate the 'gsutil'
       | binary[2] in our environment in favor of rclone[3] because
       | _rclone is better in every way_.
       | 
       | We made the same decision for 's3cmd'.
       | 
       | All of that to say:
       | 
       | The biggest question to ask about this is "how does it compare to
       | rclone". I _suspect_ the answer is that authentication and token
       | handling (and things like that) are much better as rclone has a
       | pretty clunky auth workflow for google cloud resources. That is
       | _not_ the case for Amazon resources which behave just as you'd
       | expect them to, with regard to API tokens, and make it hard to
       | justify using any tool other than rclone.
       | 
       | [1] rsync.net
       | 
       | [2] Yes, a binary - we "freeze" python into binary exe because we
       | don't allow interpreters in our environment.
       | 
       | [3] https://rclone.org/
        
         | xen2xen1 wrote:
         | Rclone is clunky because it works with everything, even things
         | like OneDrive on Linux, which is not supported officially. Love
         | rclone. Also just a great copy / move utility.
        
         | kldx wrote:
         | How do you freeze python tools into binaries?
        
           | kleton wrote:
           | par. I assume this is the open-sourced version:
           | https://github.com/google/subpar
        
           | rsync wrote:
           | "How do you freeze python tools into binaries?"
           | 
           | I think there are several workflows for doing this - I know
           | in the past we used something called 'cx_Freeze' but I think
           | the person in charge of this is now using 'py2exe'?
           | 
           | I think the 'borg' project packages up their single file .exe
           | distribution with 'pyinstaller' but I may not fully
           | understand that part of their process...
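
As one possible illustration of the PyInstaller route mentioned above (not the workflow rsync.net or the borg project actually uses), the sketch below builds a one-file PyInstaller invocation. It assumes PyInstaller is installed in the current interpreter; `pyinstaller_command`, `freeze`, and the script and binary names are hypothetical.

```python
import subprocess
import sys


def pyinstaller_command(script: str, name: str) -> list:
    """Build the argv for a one-file PyInstaller build of `script`."""
    return [sys.executable, "-m", "PyInstaller", "--onefile", "--name", name, script]


def freeze(script: str, name: str) -> None:
    """Run the build; requires PyInstaller to be installed."""
    subprocess.run(pyinstaller_command(script, name), check=True)


if __name__ == "__main__":
    # Only print the command here; call freeze(...) to actually build.
    print(" ".join(pyinstaller_command("mytool.py", "mytool")))
```

With default settings PyInstaller places the resulting executable under a `dist/` directory next to where the command is run.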
        
         | faizshah wrote:
         | I use s5cmd, it's a fast parallel s3 cli client:
         | https://github.com/peak/s5cmd
         | 
         | Haven't heard of rclone before so can't compare yet.
        
       ___________________________________________________________________
       (page generated 2022-10-07 23:00 UTC)