[HN Gopher] Launch HN: Blyss (YC W23) - Homomorphic encryption a...
       ___________________________________________________________________
        
       Launch HN: Blyss (YC W23) - Homomorphic encryption as a service
        
       Hi everyone! I'm Samir, and my co-founder Neil and I are building
       Blyss (https://blyss.dev). Blyss is an open source homomorphic
       encryption SDK, available as a fully managed service.  Fully
       homomorphic encryption (FHE) enables computation on encrypted data.
       This is essentially the ultimate privacy guarantee - a server that
       does work for its users (like fetching emails, tweets, or search
       results), without ever knowing what its users are doing - who they
       talk to, who they follow, or even what they search for. Servers
       using FHE give you cryptographic proof that they aren't spying on
       you.  Unfortunately, performing general computation using FHE is
       notoriously slow. We have focused on solving a simple, specific
       problem: retrieve an item from a key-value store, without revealing
       to the server which item was retrieved.  By focusing on retrievals,
       we achieve huge speedups that make Blyss practical for real-world
       applications: a password scanner like "Have I Been Pwned?" that
       checks your credentials against breaches, but never learns anything
       about your password (https://playground.blyss.dev/passwords),
       domain name servers that don't get to see what domains you're
       fetching (https://sprl.it/), and social apps that let you find out
       which of your contacts are already on the platform, without letting
       the service see your contacts (https://stackblitz.com/edit/blyss-
       private-contact-intersecti...).  Big companies (Apple, Google,
       Microsoft) are already using private retrieval: Chrome and Edge use
       this technology today to check URLs against blocklists of known
       phishing sites, and check user passwords against hacked credential
       dumps, without seeing any of the underlying URLs or passwords.
       Blyss makes it easy for developers to use homomorphic encryption
       from a familiar, Firebase-like interface. You can create key-value
       data buckets, fill them with data, and then make cryptographically
       private retrievals. No entity, not even the Blyss service itself,
       can learn which items are retrieved from a Blyss bucket. We handle
       all the server infrastructure, and maintain robust open source JS
       clients, with the cryptography written in Rust and compiled to
       WebAssembly. We also have an open source server you can host
       yourself.  (Side note: a lot of what drew us to this problem is
       just how paradoxical the private retrieval guarantee sounds--it
       seems intuitively like it should be impossible to get data from a
       server without it learning what you retrieve! The basic idea of how
       this is actually possible is: the client encrypts a one-hot vector
       (all 0's except a single 1) using homomorphic encryption, and the
       server is able to 'multiply' these by the database without learning
       anything about the underlying encrypted values. The dot product of
       the encrypted query and the database yields an encrypted result.
       The client decrypts this, and gets the database item it wanted. To
       the server, all the inputs and outputs stay completely opaque. We
       have a blog post explaining more, with pictures, that was on HN
       previously: https://news.ycombinator.com/item?id=32987155.)  Neil
       and I met eight years ago on the first day of freshman year of
       college; we've been best friends (and roommates!) since. We are
       privacy nerds--before Blyss, I worked at Yubico, and Neil worked at
       Apple. I've had an academic interest in homomorphic encryption for
       years, but it became a practical interest when a private Wikipedia
       demo I posted on HN (https://news.ycombinator.com/item?id=31668814)
       became popular, and people started asking for a simple way to build
       products using this technology.  Our client and server are MIT open
       source (https://github.com/blyssprivacy/sdk), and we plan to make
       money as a hosted server. Since the server is tricky to operate at
       scale, and is not part of the trust model, we think this makes
       sense for both us and our customers. People have used Blyss to
       build block explorers, DNS resolvers, and malware scanners; you can
       see some highlights in our playground:
       https://playground.blyss.dev.  We have a generous free tier, and
       you get an API key as soon as you log in. For production use, our
       pricing is usage-based: $1 gets you 10k private reads on a 1 GB
       database (larger databases scale costs linearly). You can also run
       the server yourself.  Private retrieval is a totally new building
       block for privacy - we can't wait to see what you'll build with it!
       Let us know what you think, or if you have any questions about
       Blyss or homomorphic encryption in general.
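        To make the one-hot idea above concrete, here is a toy sketch of
        the whole round trip in Python, using a simplified LWE-style
        additively homomorphic scheme (parameters are illustrative and
        nowhere near secure; the scheme Blyss actually uses is Regev +
        GSW, implemented in Rust):

```python
import random

# Toy LWE-style parameters -- illustrative only, nowhere near secure.
N_DIM, Q, T = 64, 2**32, 256      # lattice dimension, ciphertext modulus, plaintext modulus
DELTA = Q // T                    # scaling factor that separates message from noise
SECRET = [random.randrange(Q) for _ in range(N_DIM)]

def encrypt(m):
    """Encrypt m in [0, T) as an LWE sample (a, b = <a,s> + e + m*DELTA mod Q)."""
    a = [random.randrange(Q) for _ in range(N_DIM)]
    e = random.randrange(-8, 9)   # small random noise
    b = (sum(x * s for x, s in zip(a, SECRET)) + e + m * DELTA) % Q
    return (a, b)

def decrypt(ct):
    a, b = ct
    noisy = (b - sum(x * s for x, s in zip(a, SECRET))) % Q
    return ((noisy + DELTA // 2) // DELTA) % T   # round away the noise

def scale(ct, k):
    """Multiply a ciphertext by a public scalar k (homomorphic; noise grows by k)."""
    a, b = ct
    return ([x * k % Q for x in a], b * k % Q)

def add(c1, c2):
    """Add two ciphertexts (homomorphic)."""
    (a1, b1), (a2, b2) = c1, c2
    return ([(x + y) % Q for x, y in zip(a1, a2)], (b1 + b2) % Q)

database = [17, 42, 99, 3, 250, 123, 7, 88]   # server's public data (values only)

# Client: encrypt a one-hot vector selecting index 5.
query = [encrypt(1 if i == 5 else 0) for i in range(len(database))]

# Server: homomorphic dot product of the encrypted query with the database.
acc = scale(query[0], database[0])
for ct, item in zip(query[1:], database[1:]):
    acc = add(acc, scale(ct, item))

# Client: decrypt -- recovers database[5] while the server saw only ciphertexts.
print(decrypt(acc))   # 123
```

        (A real deployment also compresses the query and response; this
        sketch naively sends one ciphertext per database item.)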
        
       Author : blintz
       Score  : 110 points
       Date   : 2023-03-14 15:42 UTC (7 hours ago)
        
       | miketmahlkow wrote:
        | Can you elaborate on the differences between this and
        | end-to-end encryption?
        
         | blintz wrote:
         | Sure! End-to-end encryption (E2EE) in a messaging context is
         | about the service provider (Meta for WhatsApp, Apple for
         | iMessage) not learning the _contents_ of messages sent on the
         | platform. E2EE also gets used when referring to backups, where
         | it again refers to the service provider of the backups not
         | learning the contents of backups.
         | 
         | Private retrieval is a more general concept, which refers to
         | retrieving data from a server without letting it learn your
         | access pattern. In a specific application, it's easier to see
         | the contrast: for example, in our password checker
         | (https://playground.blyss.dev/passwords), the data that Blyss
         | helps keep encrypted, and prevents the server from learning, is
         | which password you are checking. With standard E2EE techniques,
          | it would not really be possible to keep your query private.
         | 
         | In messaging, Blyss can be used to build messaging services
         | that not only do not learn what you say (the standard E2EE
         | guarantee), but also do not learn who you talk to. We're
         | working on this, but it's a tricky thing to ship.
        
       | ngneer wrote:
       | This capability is not exactly Fully Homomorphic Encryption
       | (FHE). In the cryptographic literature this is typically referred
       | to as PIR, or Private Information Retrieval
       | [https://en.wikipedia.org/wiki/Private_information_retrieval].
       | Counterintuitive indeed. The idea is not totally new, though...
        
         | Oreko wrote:
         | I haven't read their protocol, but you can easily implement PIR
         | using FHE through polynomial evaluation.
        
         | blintz wrote:
         | True! We are _using_ FHE to perform PIR. The underlying scheme
         | we use is a real homomorphic encryption scheme (Regev + GSW),
         | but yeah, we explicitly do not support performing arbitrary
          | computation on encrypted data. As it turns out, that's still
         | quite slow - the Google FHE C++ transpiler still takes seconds
         | to do 32-bit arithmetic operations. Our PIR system is able to
         | achieve much more practical speed + communication overheads.
        
       | nurhdmsx wrote:
       | I read homophobic encryption as a service and was seriously
       | confused
        
       | arcanemachiner wrote:
        | Might just be my browser, but on the homepage, the "scan for
        | breached credentials" and "block malicious URLs" links both lead
        | to the password checker when clicked.
        
       | eternalban wrote:
       | This is great -- sorely needed and long overdue. Thanks for
       | sharing the code and good luck with the company!
        
       | wizzard0 wrote:
       | A thing long overdue, I'd say!
       | 
       | Have you thought about making some ELI5 explainer on how the algo
       | essentially works?
       | 
       | The post you link to is already a great start, I feel like it's
       | just a question of a little editing work and maybe more examples
       | 
       | -- for the nerds to get interested and actually read the paper
       | 
        | -- for the users to understand privacy properties better (e.g.
        | why this is better than TLS in the case of a server infected
        | with malware, etc.)
       | 
       | -- and also things which it doesn't do, which would calm anxiety
       | in those who /need/ to understand the limitations to feel safe
       | 
       | -- and to keep devs from thinking it's a magic pixie dust and
       | over-promising users, only to get hacked
        
         | blintz wrote:
         | A big part of this company has turned out to be figuring out
         | how to explain FHE :)
         | 
         | I'm working on a higher-level "why/how to use this" blog post
         | that should help. Thanks for the suggestions!
        
       | jonathan-kosgei wrote:
       | What is the read latency?
        
         | blintz wrote:
         | It's 1-2 seconds for a 1 GB database with millions of items.
         | 
         | (A couple years ago this was more like minutes, and about 10
         | years ago it would have taken hours!)
        
       | sshine wrote:
       | Awesome work, I look forward to finding applications for this.
       | 
       | Question: Have you considered using zk-STARKs for succinct proofs
       | of computation? Or would that be too far off target wrt. being
       | good at one thing?
       | 
       | E.g. https://github.com/TritonVM
        
         | blintz wrote:
         | Things tend to get pretty slow when you try to compute SNARKs
         | over FHE computations. Some progress is getting made, but it's
         | still pretty academic.
         | 
         | There is a cool company trying to instead use FHE to accelerate
         | SNARKs: https://github.com/Sunscreen-tech/Sunscreen. They seem
         | to be making some headway!
        
       | themoonisachees wrote:
       | I'm guessing this solves a very specific pet peeve of mine:
       | 
       | When your bitwarden vault is not opened, if you log in to
       | website, the extension will ask if you want to store the
       | password, even if your vault already has an entry for that
        | website. Of course, this is by design, so that Bitwarden doesn't
        | store websites you have credentials for in plaintext (unlike
        | LastPass, where it blew up in their face).
       | 
       | Would this allow your browser to query a database of "domains i
       | have a password for" without a leak on bitwarden's server
       | exposing this exact database? There are other implementation
       | details but you get the idea.
        
         | julvo wrote:
         | wouldn't salting and hashing be enough for this use case if you
         | keep the salt on the client?
        
       | brap wrote:
       | Looks good, congrats!
       | 
       | In your landing page example, where does the secret client key
       | fit in?
        
         | blintz wrote:
         | Thanks! The secret client key stays in the browser or app. It's
         | used to encrypt queries, and decrypt the server responses.
        
           | brap wrote:
           | Right, but is it generated under the hood for each query?
           | 
           | And how is the data that was initially written
           | encrypted/decrypted? who holds the key for that?
        
             | blintz wrote:
             | Yes, it's generated in the browser for each query.
             | 
             | And this depends on the application - for example, for the
              | private password checker, all the dumped password data is
              | from a public dataset, so it's not encrypted. In messaging,
             | the data would be encrypted under the intended recipient's
             | public key.
        
       | colesantiago wrote:
       | > This is essentially the ultimate privacy guarantee - a server
       | that does work for its users (like fetching emails, tweets, or
       | search results), without ever knowing what its users are doing -
       | who they talk to, who they follow, or even what they search for.
       | 
        | Isn't this perfect for criminals and other bad actors?
       | 
       | Is there anything you're going to do about these people using
       | your service?
        
         | blintz wrote:
         | We don't think this guarantee is only useful to bad actors, in
         | the same way that end-to-end encryption has turned out to be
         | useful even if you're not doing something illegal.
         | 
         | The businesses using Blyss want to perform tasks (like scanning
         | for breached credentials) without seeing sensitive customer
         | data. Even the US government's civilian cybersecurity agency,
         | CISA, recommends that you use end-to-end encrypted solutions
          | for credential vaults (https://www.cisa.gov/news-
          | events/cybersecurity-advisories/aa...). Blyss is an added
          | layer for these services, protecting even access metadata.
        
         | abtinf wrote:
         | Individuals have a right to privacy. This right is not
         | contingent on there being no bad actors on the planet. If
         | anything, the existence of bad actors reinforces the right to
         | privacy of good actors.
        
         | inariakagane wrote:
         | Crime is illegal, best to leave that up to various law
         | enforcement agencies.
         | 
          | Even where talking about a crime is illegal, it is the
          | person(s) talking about the crime who are criminals, not the
          | letter it is written on.
        
       | robszumski wrote:
       | Are there any hardware acceleration strategies for FHE or is it
       | all making the calculations more efficient on the software side
       | right now? My guess is that the software needs to mature before
       | baking silicon?
        
         | neilmovva wrote:
         | Our FHE scheme uses lots of Number Theoretic Transforms (NTTs),
         | which are pretty computationally expensive. NTT is a good
         | candidate for acceleration, and there is quite a bit of
         | interest from the zk community in doing so
         | (https://www.zprize.io/prizes/accelerating-ntt-operations-
         | on-...).
         | 
         | From a hardware perspective, NTT can be done in parallel, but
         | has a fairly large working set of data (~512 MB) with lots of
         | unstructured accesses. This is too big to fit in even the
         | largest CPU L3 caches, so DRAM bandwidth is still relevant. It
          | may eventually be feasible to build an ASIC with this much
         | on-chip memory, but in the meantime, GPUs do a pretty decent
         | job with their massive HBM bandwidth.
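
          | For the curious: the NTT is just the FFT carried out over a
          | finite field, used for fast polynomial multiplication. Here is
          | a minimal radix-2 version in Python, over the common
          | NTT-friendly prime 998244353 (an illustration, not the actual
          | parameters of our scheme):

```python
Q = 998244353          # NTT-friendly prime: Q - 1 = 119 * 2**23
G = 3                  # primitive root mod Q

def ntt(a, invert=False):
    """In-place iterative radix-2 NTT over Z_Q (len(a) must be a power of two)."""
    n = len(a)
    # Bit-reversal permutation.
    j = 0
    for i in range(1, n):
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j |= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    # Cooley-Tukey butterflies.
    length = 2
    while length <= n:
        w_len = pow(G, (Q - 1) // length, Q)
        if invert:
            w_len = pow(w_len, Q - 2, Q)   # inverse root of unity
        for i in range(0, n, length):
            w = 1
            for k in range(i, i + length // 2):
                u, v = a[k], a[k + length // 2] * w % Q
                a[k], a[k + length // 2] = (u + v) % Q, (u - v) % Q
                w = w * w_len % Q
        length <<= 1
    if invert:
        n_inv = pow(n, Q - 2, Q)
        for i in range(n):
            a[i] = a[i] * n_inv % Q
    return a

# Multiply two polynomials via pointwise products in the NTT domain.
f, g = [1, 2, 3, 4], [5, 6, 7, 8]
n = 8                                  # next power of two >= len(f) + len(g) - 1
fa, ga = f + [0] * (n - len(f)), g + [0] * (n - len(g))
ntt(fa); ntt(ga)
prod = [x * y % Q for x, y in zip(fa, ga)]
ntt(prod, invert=True)
print(prod[:7])   # [5, 16, 34, 60, 61, 52, 32]
```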
        
           | byteware wrote:
            | Interesting prize. I wonder why they require a radix-2 NTT;
            | using a higher radix speeds things up an order of magnitude
            | on GPU (granted, I am using a 256-bit field, so it might be
            | more memory-bound).
        
       | 1differential wrote:
       | Congrats on the launch! I actually considered launching something
       | tangential - though I never figured out who the customers would
       | really be nor how I would pitch this to companies. Excited to see
       | where this takes you!
        
         | neilmovva wrote:
         | Thanks! Yup, private retrieval is interesting as a product
         | because it's a fundamentally new capability; there aren't
         | really competitors we can show incremental improvements
         | against. If you're still interested in the space, we'd be happy
         | to compare notes! Feel free to email us: founders AT blyss.dev
        
       | dougk16 wrote:
       | Let's say I sent up a key "foo" to get the value "bar", and I did
       | this again and again. Will either "foo" or "bar" be encrypted to
       | the same ciphertext again and again? Or is there some kind of
       | nonce or salt or other mechanism that will make the ciphertext
       | always different? Congrats on launching and thank you for any
       | answer.
        
         | blintz wrote:
         | Great question! The ciphertexts will be different every time,
         | just like in standard encryption; the scheme uses something
         | very similar to a nonce.
         | 
         | We are trying to avoid the "ECB Penguin", of course:
         | https://crypto.stackexchange.com/questions/14487/can-someone...
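
          | A toy demonstration of the "different ciphertext every time"
          | property, using a simple HMAC-based stream construction
          | (illustration only -- this is not the lattice scheme Blyss
          | actually uses, and the keystream is capped at 32 bytes):

```python
import hashlib
import hmac
import secrets

KEY = secrets.token_bytes(32)

def encrypt(plaintext: bytes) -> tuple[bytes, bytes]:
    """Randomized encryption: a fresh nonce makes every ciphertext unique."""
    nonce = secrets.token_bytes(16)
    stream = hmac.new(KEY, nonce, hashlib.sha256).digest()[:len(plaintext)]
    return nonce, bytes(p ^ s for p, s in zip(plaintext, stream))

def decrypt(nonce: bytes, ct: bytes) -> bytes:
    stream = hmac.new(KEY, nonce, hashlib.sha256).digest()[:len(ct)]
    return bytes(c ^ s for c, s in zip(ct, stream))

# Encrypting the same value twice yields different ciphertexts...
n1, c1 = encrypt(b"bar")
n2, c2 = encrypt(b"bar")
print(c1 != c2)                           # True (with overwhelming probability)
# ...but both decrypt to the same plaintext.
print(decrypt(n1, c1), decrypt(n2, c2))   # b'bar' b'bar'
```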
        
       | asm64me wrote:
        | How do you guarantee that the server that sent the JavaScript
        | to the browser (which generates and stores the client secret
        | key) didn't get hacked to also send the client secret key
        | somewhere else after it was generated on the client?
        
         | wizzard0 wrote:
          | Assuming a malicious server operator, you need to obtain the
          | client out-of-band (package manager, app store, etc.), or, if
          | it has to be a web app, through something like an IPFS
          | gateway where you can be sure the bits received match a
          | particular hash.
         | 
         | Or do a git clone (pinned to commit hash) and host the client
         | locally, I guess))
        
         | blintz wrote:
         | Yeah, this is definitely a risk of any in-browser demo of this
         | tech. The story for apps is much better, since there's a
         | routine installation process, signatures are checked, etc. We'd
         | like private retrievals to eventually be part of the browser
         | itself, so that it can make a kind of "private GET" request
         | natively.
         | 
         | We'd also love to bind our client JS code to a hash of our
         | build output from GitHub, but as of now there's no simple way
         | to do this that the browser will pin automatically - integrity
         | checks are good, but don't prevent the server from just
         | changing the hash. We've toyed with writing an extension for
         | this, but haven't gotten around to it.
        
           | wizzard0 wrote:
           | I wonder how the Subresource Integrity can expand to the root
           | document hash (other than using IPFS gateways).
           | 
           | UPD yeah, extension hashing resources sounds nice too
        
             | holmesworcester wrote:
              | I've wanted this too! You could include a subresource
             | integrity hash in the URL that the browser will check
             | against the page. This would make things like Cryptpad and
             | Skiff, or group invite links in Signal, way more secure.
        
       | oars wrote:
       | Exciting times.
       | 
       | OpenAI's GPT-4 announcement, Google announcing AI for Workspace,
       | Meta additional 10k layoffs, and now we're seeing homomorphic
       | encryption come out to the masses.
       | 
       | All in one day!
        
       | JanisErdmanis wrote:
       | [dead]
        
       | namank wrote:
       | Which companies would use this and why? Data worth making private
       | is also worth some $$ to the business hosting it.
        
         | blintz wrote:
         | Some data, like passwords or other credentials, isn't stuff
         | anyone really wants to monetize - so secrets managers (things
         | like HashiCorp Vault) and password managers are both interested
         | in using this to allow them to collect even less data.
         | 
         | In other cases, for the same compliance and data security
         | reasons behind the desire for on-prem, larger enterprises
         | prefer that their SaaS vendors collect as little data about
         | them as possible. Blyss can get you the best of both worlds:
         | the data security of on-prem, with the convenience and ease-of-
         | deployment of SaaS.
        
       | haliax wrote:
       | Is this FHE or oblivious transfer?
        
         | blintz wrote:
         | It's FHE applied to solve a variant of oblivious transfer,
         | called "private information retrieval"
         | (https://en.wikipedia.org/wiki/Private_information_retrieval).
         | PIR is very similar to oblivious transfer, except that in
         | oblivious transfer, the privacy is mutual - the client learns
          | _exactly one_ element from the database; in PIR, it's ok if
         | the client learns some number of 'extra' items other than the
         | one it queried.
        
       | dariosalvi78 wrote:
       | this is great, thanks for launching it, I may actually be a
       | future user of yours ;)
       | 
       | one important feature I see missing is that one cannot run
       | queries with comparisons, such as "give me any message sent
       | between 2022-10-10 and 2023-02-01". This would be very important
       | when one doesn't have all the keys, or when the keys are too
       | many, like in the messages example above.
       | 
       | Any idea for this kind of scenario?
        
         | neilmovva wrote:
         | Thanks! Yup, it's not always practical to make a huge number of
         | queries when you expect many of them to come back empty.
         | Instead, we first perform private lookups against a Bloom
         | filter, to find out which keys actually hold data (e.g.
         | messages). Then, we privately retrieve only the useful keys.
         | 
         | The Bloom filter is also served over Blyss, so the server still
         | learns nothing about which keys you're interested in. We
         | implemented this system for our private password checker, which
         | tests passwords against almost a billion breached credentials:
         | https://playground.blyss.dev/passwords
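
          | A generic Bloom filter sketch (hypothetical sizes, not our
          | production implementation) showing the property that makes
          | this pre-filtering safe: false positives are possible, but
          | false negatives never happen, so no real match is missed:

```python
import hashlib

class BloomFilter:
    """Minimal Bloom filter: k hashed bit positions per key in an m-bit array."""

    def __init__(self, m_bits=1 << 20, k=7):
        self.m, self.k = m_bits, k
        self.bits = bytearray(m_bits // 8)

    def _positions(self, key):
        # Derive k positions from SHA-256 of "<i>:<key>" (demo construction).
        for i in range(self.k):
            h = hashlib.sha256(f"{i}:{key}".encode()).digest()
            yield int.from_bytes(h[:8], "big") % self.m

    def add(self, key):
        for p in self._positions(key):
            self.bits[p // 8] |= 1 << (p % 8)

    def might_contain(self, key):
        # True may be a false positive; False is always definitive.
        return all(self.bits[p // 8] & (1 << (p % 8)) for p in self._positions(key))

bf = BloomFilter()
for user in ["alice", "bob"]:
    bf.add(user)
print(bf.might_contain("alice"), bf.might_contain("mallory"))   # True False
```

          | The client privately reads only the filter positions it needs
          | over PIR, so the server never learns which keys were probed.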
        
           | dariosalvi78 wrote:
           | Thanks for the answer, but I was meaning from your customer
           | perspective. My understanding is that you offer a key-value
           | store, so the only operation available on the encrypted data
           | is a comparison (==).
           | 
           | If my application wants to retrieve data within a certain
           | range (< and > operators) is there anything I can do to
            | implement it on top of your SDK?
           | 
           | Think of the encrypted messages app: how can I retrieve this
           | month's messages using your SDK?
           | 
            | I hope this is clearer now...
        
       | Foomf wrote:
       | You stole my name!
       | 
       | I'm kidding... for a while I wanted to make a game named "blyss".
       | I own the blyss.io domain name. I'll sell it to you if you want!
        
       ___________________________________________________________________
       (page generated 2023-03-14 23:02 UTC)