[HN Gopher] Launch HN: Flower (YC W23) - Train AI models on dist...
       Launch HN: Flower (YC W23) - Train AI models on distributed or
       sensitive data
       Hey HN - we're Daniel, Taner, and Nic, and we're building Flower
       (https://flower.dev/), an open-source framework for training AI on
       distributed data. We move the model to the data instead of moving
       the data to the model. This enables regulatory compliance (e.g.
       HIPAA) and ML use cases that are otherwise impossible. Our GitHub
       is at https://github.com/adap/flower, and we have a tutorial here:
       https://flower.dev/docs/tutorial/Flower-0-What-is-FL.html.  Flower
       lets you train ML models on data that is distributed across many
       user devices or "silos" (separate data sources) without having to
       move the data. This approach is called federated learning.  A silo
       can be anything from a single user device to the data of an entire
       organization. For example, your smartphone keyboard suggestions and
       auto-corrections can be driven by a personalized ML model learned
       from your own private keyboard data, as well as data from other
       smartphone users, without the data being transferred from anyone's
       device.  Most of the famous AI breakthroughs--from ChatGPT and
       Google Translate to DALL-E and Stable Diffusion--were trained with
       public data from the web. When the data is all public, you can
       collect it in a central place for training. This "move the data to
       the computation" approach fails when the data is sensitive or
       distributed across organizational silos and user devices.  Many
       important use cases are affected by this limitation:  * Generative
       AI: Many scenarios require sensitive data that users or
       organizations are reluctant to upload to the cloud. For example,
       users might want to put themselves and friends into AI-generated
       images, but they don't want to upload and share all their photos.
       * Healthcare: We could potentially train cancer detection models
       better than any doctor, but no single organization has enough data.
       * Finance: Preventing financial fraud is hard because individual
       banks are subject to data regulations, and in isolation, they don't
       have enough fraud cases to train good models.  * Automotive:
       Autonomous driving would be awesome, but individual car makers
       struggle to gather the data to cover the long tail of possible edge
       cases.  * Personal computing: Users don't want certain kinds of
       data to be stored in the cloud, hence the recent success of
       privacy-enhancing alternatives like the Signal messenger or the
       Brave browser. Federated methods open the door to using sensitive
       data from personal devices while maintaining user privacy.  *
       Foundation models: These get better with more data, and more
       diverse data, to train them on. But again, most data is sensitive
       and thus can't be incorporated, even though these models continue
       to grow bigger and need more information.  Each of us has worked on
       ML projects in various settings (e.g., corporate environments,
       open-source projects, research labs). We've worked on AI use cases
       for companies like Samsung, Microsoft, Porsche, and Mercedes-Benz.
       One of our biggest challenges was getting the data to train AI
       while being compliant with regulations or company policies.
       Sometimes this was due to legal or organizational restrictions;
       other times, it was difficulties in physically moving large
       quantities of data or natural concerns over user privacy. We
       realized issues of this kind were making it too difficult for many
       ML projects to get off the ground, especially in domains like
       healthcare and finance.  Federated learning offers an alternative
       -- it doesn't require moving data in order to train models on it,
       and so has the potential to overcome many barriers for ML projects.
       In early 2020, we began developing the open-source Flower framework
       to simplify federated learning and make it user-friendly. Last
       year, we experienced a surge in Flower's adoption among industry
       users, which led us to apply to YC. In the past, we funded our work
       through consulting projects, but looking ahead, we're going to
       offer a managed version for enterprises and charge per deployment
       or federation. At the same time, we'll continue to run Flower as an
       open-source project that everyone can continue to use and
       contribute to.  Federated learning can train AI models on
       distributed and sensitive data by moving the training to the data.
       The learning process collects whatever it can, and the data stays
       where it is. Because the data never moves, we can train AI on
       sensitive data spread across organizational silos or user devices
       to improve models with data that could never be leveraged until
       now.  Here's how it works: (0) Initialize the global model
       parameters on the server; (1) Send the model parameters to a number
       of organizations/devices (client nodes); (2) Train model locally on
       the data of each organization/device (client node); (3) Return the
       updated model parameters back to the server; (4) On the server,
       aggregate the model updates (e.g., by averaging them) into a new
       global model; (5) Repeat steps 1 to 4 until the model converges.
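The loop above can be sketched in plain Python. This is a toy simulation, not Flower's API: the one-variable linear model, the `local_train`, `fed_avg` and `make_client` helpers, and the three synthetic clients are all assumptions made for illustration. Aggregation here is an average weighted by each client's example count, which is one common way to do the averaging mentioned in step 4.

```python
import random

def local_train(params, data, lr=0.05, epochs=5):
    """Step 2: a few epochs of gradient descent on one client's private data.
    The (assumed) model is y = w * x + b, with params = [w, b]."""
    w, b = params
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for x, y in data:
            err = (w * x + b) - y
            grad_w += 2 * err * x / len(data)
            grad_b += 2 * err / len(data)
        w -= lr * grad_w
        b -= lr * grad_b
    return [w, b]

def fed_avg(updates):
    """Step 4: average client parameters, weighted by number of examples."""
    total = sum(n for _, n in updates)
    dim = len(updates[0][0])
    return [sum(p[i] * n for p, n in updates) / total for i in range(dim)]

def make_client(n, rng):
    """Synthetic private dataset drawn from y = 3x + 1 plus small noise."""
    xs = [rng.uniform(-1, 1) for _ in range(n)]
    return [(x, 3 * x + 1 + rng.gauss(0, 0.01)) for x in xs]

rng = random.Random(0)
clients = [make_client(n, rng) for n in (20, 50, 30)]

global_params = [0.0, 0.0]                     # step 0: initialize on the server
for _ in range(100):                           # step 5: repeat
    updates = []
    for data in clients:                       # step 1: send params to each client
        new_params = local_train(list(global_params), data)  # step 2: train locally
        updates.append((new_params, len(data)))              # step 3: return update
    global_params = fed_avg(updates)           # step 4: aggregate on the server

print(global_params)
```

Only the parameter lists cross the client/server boundary in this sketch; the `(x, y)` pairs never leave `local_train`.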
       This, of course, is more challenging than centralized learning: we
       must move AI models to data silos or user devices, train locally,
       send updated models back, aggregate them, and repeat. Flower
       provides the open-source infrastructure to easily do this, as well
       as supporting other privacy-enhancing technologies (PETs). It is
       compatible with PyTorch, TensorFlow, JAX, Hugging Face, Fastai,
       Weights & Biases and all the other tools used in ML projects
       regularly. The only dependency on the server side is NumPy, but
       even that can be dropped if necessary. Flower uses gRPC under the
       hood, so a basic client can easily be auto-generated, even for most
       languages that are not supported today.  Flower is open-source
       (Apache 2.0 license) and can be run in all kinds of environments:
       on a personal workstation for development and simulation, on Google
       Colab, on a compute cluster for large-scale simulations or on a
       cluster of Raspberry Pis (or similar devices) to build research
       systems, or deployed on public cloud instances (AWS, Azure, GCP,
       others) or private on-prem hardware. We are happy to help users
       when deploying Flower systems and will soon make this even easier
       through our managed cloud service.  You can find PyTorch example
       code here: https://flower.dev#examples, and more at
       https://github.com/adap/flower/tree/main/examples.  We believe that
       AI technology must evolve to be more collaborative, open and
       distributed than it is today
       (https://flower.dev/blog/2023-03-08-flower-labs/). We're eager to
       hear your feedback, experiences regarding difficulties in training,
       data access, data regulation, privacy and anything else related to
       federated (or related) learning methods!
       Author : niclane7
       Score  : 110 points
       Date   : 2023-03-22 13:38 UTC (9 hours ago)
       | brookst wrote:
       | Very interesting project. Your write up here does a much better
       | job of explaining the market need and value prop than the GitHub
       | readme.md... consider bringing some of this text over as the "why
       | / what" story?
         | tanto wrote:
         | Thank you! We'll make sure to improve the readme and add more
         | explanation to it.
       | photochemsyn wrote:
       | This looks very interesting. I'd like to see a model trained on
       | the complete body of scientific research literature from the past
       | 100 years or so, I wonder if this approach could facilitate that?
         | niclane7 wrote:
         | Yes, this would be exciting to see. One approach wouldn't
         | require federated learning however. If you had direct access to
         | the data then you could build a conventionally trained large
         | language model (i.e., collect all the data together in a data
         | center). However, given the context of this discussion -- you
         | are probably asking whether we could use Flower to train in a
         | federated manner. I believe so. Although again, we'd probably
         | be training an LLM, which brings added complications due
         | to its size (and other factors). Internally at Flower we have
         | been testing methods to overcome this and are confident we can
         | pull this off. One could imagine someone hosting a pre-trained
         | LLM and contributing institutions acting as nodes in the
         | network, each performing some small part of the training based
         | on the fraction of the literature they have access to. We plan
         | to release LLM based federated technology in the coming months.
         | For those that are interested: The best work currently I've
         | seen on training very large models under federated learning,
         | that also makes very realistic assumptions about the likely
         | underlying participating hardware, is this:
         | https://arxiv.org/abs/2206.11239 -- although I expect more in
         | this direction to come soon.
         | haldujai wrote:
         | I'm not sure that this would be as useful as one might think at
         | face value. When you stretch out the training corpus like that
         | you're going to have more noise/inaccuracies/refuted facts than
         | you will have correct information.
         | It's also unclear how useful full scientific articles are,
         | Microsoft/PubMedBERT interestingly showed PMC abstracts was
         | better than full text.
       | JohnFen wrote:
       | Isn't this still moving your data to a central repository? It's
       | encoded in a neural net rather than in a more accessible form,
       | but it's still being moved out of your control.
       | techwizrd wrote:
       | I've been working with Flower to implement and study Federated
       | Learning for a few years, and have just started contributing back
       | on Slack and Github. Congrats on launching on HN!
         | tanto wrote:
         | Really happy to hear that and your support is much appreciated!
         | I saw you answering many questions before we could do so :)
         | Thank you for that. We are reaching out to all contributors. Let
         | me know in Slack if you are up for a short call to understand
         | better how we can support you.
           | techwizrd wrote:
           | Absolutely. Happy to help!
       | spangry wrote:
       | Another interesting use case - government training models on
       | legislatively protected data (e.g. tax data). Lots of data the
       | government holds is governed by confidentiality restrictions
       | built into legislation, limiting its utility. Sounds like
       | federated learning could be a way around that.
         | tanto wrote:
         | Absolutely, there are many cases where governments and
         | generally large organisations can use Flower to internally
         | train models on internal data silos. Often enough large
         | organisations have all the incentives to utilize their own data
         | but can't centralize it due to regulations.
       | rjtc wrote:
       | How is your approach different than tf federated or any of the
       | other federated libraries out there?
         | danieljanes wrote:
         | There are some similarities, but also some differences.
         | Flower's take is that it wants to support the entire FL
         | workflow from experimental research to large-scale production
         | deployments and operation. Some other FL frameworks fall either
         | in the "research" or "production deployment" bucket, but few
         | have good support for both.
         | Flower does a lot under the hood to support these different
         | usage scenarios: it has both a networked engine (gRPC,
         | experimental support for REST, and the possibility to "bring
         | your own communication stack") and a simulation engine to
         | support both real deployment on edge devices/server and
         | simulation of large-scale federations on single machines or
         | compute clusters.
         | This is - to the best of our knowledge - one of the drivers of
         | our large and active community. The community is very
         | collaborative and there are many downstream projects in the
         | ecosystem that build on top of Flower (GitHub lists 748
         | dependent projects:
         | https://github.com/adap/flower/network/dependents).
       | fedgenerativeai wrote:
       | This is not new at all. There is a much stronger competitor
       | existing in the market already: FedML (https://fedml.ai). They
       | have a much larger open-source community, and a well-managed and
       | widely-used MLOps (https://open.fedml.ai).
         | dang wrote:
         | I'm not sure what pre-existing conflict is breaking out in this
         | subthread between accounts created for the purpose, but would
         | all of you please stop?
         | sisso97 wrote:
         | I'm very familiar with FL frameworks. I chose Flower after
         | having spent a lot of time benchmarking many of them. FedML
         | wasn't able to carry on the easiest workloads I tried, like the
         | baseline of the Shakespeare dataset. The simulator you have was
         | the slowest compared to the other frameworks I tried. Your
         | platform doesn't give developers like me what they want. Flower
         | was the best to use since day one.
         | fednongenai wrote:
         | As a founder of an FL startup, I strongly advise against using
         | FedML as it is one of the worst frameworks in terms of
         | scalability. Regrettably, the FedML team has also a poor
         | reputation due to their toxicity and suspicious behavior. If
         | you value the trustworthiness of your product, I suggest
         | avoiding FedML at all costs.
           | ulikis2 wrote:
           | LOL, starting FL wars, are we? I never tested FedML in
           | production but it was cool for simple experiments. I concur
           | with what you are saying though about their devs. The CTO is
           | super sus.
       | guites wrote:
       | Hey! Glad to see flower getting attention on hn.
       | I've been working on a project for over a year that uses flower
       | to train cv models on medical data.
       | One aspect that we see being brought up again and again is how we
       | can prove to our clients that no unnecessary data is being shared
       | over the network.
       | Do you have any tips on solving that particular problem? I.e.
       | proving that no data apart from model weights are being
       | transferred to the centralized server?
       | Thanks a lot for the project.
       | edit: Just to clarify I am aware of differential privacy, I'm
       | talking more on a "how to convince a medical institution that we
       | are not sending its images over the network" level.
         | danieljanes wrote:
         | Thanks, glad you like it!
         | One approach to increase the transparency on the client side
         | (and build trust with the organization where the Flower client
         | is deployed) is to integrate a review step that asks someone to
         | confirm the update that gets sent back to the server.
         | On top of that, you should definitely use differential privacy.
         | To quote Andrew Trask here: "friends don't let friends use FL
         | without DP". Other approaches like Secure Aggregation can also
         | help, depending on what kind of exposure your clients are
         | concerned about.
         | My general take is that the best way to solve for transparency
         | and trust is to tackle it on multiple layers of the stack.
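A client-side review step like the one described here could look roughly like the sketch below. It is not a Flower feature: `summarize_update`, `send_with_review`, the `approve` and `send` callbacks, and the 1 KiB policy are all hypothetical names and values chosen for illustration. The idea is to record, before anything leaves the client, which layers are in the payload, how many bytes it holds, and a SHA-256 digest that an auditor can later compare with what the server received.

```python
import hashlib
import json
import struct

def summarize_update(update):
    """Describe what would leave the client: layer names and sizes,
    total bytes, and a digest of the serialized values."""
    digest = hashlib.sha256()
    layers = {}
    total_bytes = 0
    for name in sorted(update):
        values = update[name]
        raw = struct.pack(f"<{len(values)}d", *values)  # float64, little-endian
        digest.update(name.encode("utf-8"))
        digest.update(raw)
        layers[name] = len(values)
        total_bytes += len(raw)
    return {"layers": layers, "bytes": total_bytes, "sha256": digest.hexdigest()}

def send_with_review(update, approve, send, audit_log):
    """Log the summary, then send only if the reviewer callback approves."""
    summary = summarize_update(update)
    approved = bool(approve(summary))
    audit_log.append(json.dumps({**summary, "approved": approved}, sort_keys=True))
    if approved:
        send(update)
    return approved

audit_log, outbox = [], []
update = {"dense.weight": [0.1, -0.2, 0.3], "dense.bias": [0.05]}
# Hypothetical policy: refuse any payload larger than 1 KiB.
ok = send_with_review(update, lambda s: s["bytes"] <= 1024, outbox.append, audit_log)
print(ok, audit_log[-1])
```

A log like this shows what was sent and how large it was. It does not show that the weights themselves carry no information about the training data, so it complements differential privacy rather than replacing it.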
           | jorgeili wrote:
           | What about MPC + DP? Are you planning to integrate any SMPC
           | algorithms in Flower, or do you see limitations that prevent
           | you from doing so?
           | I'm trying to apply federated learning to the medical domain
           | too and I'm trying to define the best "stack" that guarantees
           | privacy and compliance with regulations like the GDPR
             | danieljanes wrote:
             | Agreed that this is an interesting direction. The core
             | Flower abstractions are "federated learning agnostic",
             | which means that they can be used for different kinds of
             | distributed/federated workloads, not just federated
             | learning. We'll add examples for more approaches (like
             | SMPC) in the future, we just don't have the bandwidth to do
             | it immediately.
             | williamtrask wrote:
             | I can't speak for Flower's core dev roadmap, but PySyft is
             | in the process of integrating Flower and some Secure
             | Enclave options which would let you do this.
             | Congrats on the launch Flower team!
               | danieljanes wrote:
               | Thanks! We're huge fans of the work that PySyft is doing,
               | and we're very supportive of the Flower PySyft
               | integration.
           | guites wrote:
           | A review step sounds like a good idea. Our implementation
           | involves very little interaction on the client side, besides
           | setting up the datasets etc, so maybe a way to log
           | information sent for later inspection would help.
           | I'll be looking into secure aggregation as I'm not fully
           | aware of how it works. As of now we rely on differential
           | privacy only.
           | Thanks!
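For readers who also want a rough picture of secure aggregation, the core trick can be sketched with pairwise masks. Every pair of clients shares a random vector, one adds it and the other subtracts it, so the server only sees masked updates while the masks cancel in the sum. This is a simplified toy, not Flower's implementation: a single seed stands in for the pairwise key agreement a real protocol would use, client dropouts are not handled, and the names (`pairwise_masks`, `mask_update`, `aggregate`) and constants are assumptions.

```python
import random

MOD = 2 ** 32          # all arithmetic is done modulo this value
SCALE = 10 ** 6        # fixed-point scale for encoding floats as integers

def encode(vec):
    """Encode floats as non-negative integers modulo MOD."""
    return [round(v * SCALE) % MOD for v in vec]

def decode_sum(vec):
    """Map values from [0, MOD) back to signed integers, then to floats."""
    return [((v + MOD // 2) % MOD - MOD // 2) / SCALE for v in vec]

def pairwise_masks(num_clients, dim, seed):
    """masks[(i, j)] is the vector shared by clients i < j.
    A real protocol derives it from a key exchange; a seed stands in here."""
    rng = random.Random(seed)
    return {(i, j): [rng.randrange(MOD) for _ in range(dim)]
            for i in range(num_clients) for j in range(i + 1, num_clients)}

def mask_update(client_id, vec, masks, num_clients):
    """Client side: add masks shared with higher ids, subtract lower ids."""
    out = encode(vec)
    for other in range(num_clients):
        if other == client_id:
            continue
        pair = (min(client_id, other), max(client_id, other))
        sign = 1 if client_id < other else -1
        out = [(o + sign * m) % MOD for o, m in zip(out, masks[pair])]
    return out

def aggregate(masked):
    """Server side: sum the masked vectors; the pairwise masks cancel."""
    dim = len(masked[0])
    return decode_sum([sum(v[i] for v in masked) % MOD for i in range(dim)])

updates = [[0.5, -1.25], [0.1, 0.2], [-0.3, 2.0]]
masks = pairwise_masks(len(updates), dim=2, seed=42)
masked = [mask_update(i, u, masks, len(updates)) for i, u in enumerate(updates)]
total = aggregate(masked)
print(total)
```

The server recovers the sum of the three updates without seeing any individual one in the clear. Dividing by the number of clients would give the average.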
             | ngneer wrote:
             | Cool. I saw a proposal to use TEEs for secure aggregation.
             | OpenFL uses Gramine for that. Not sure if that provides
             | sufficient protection, really, but worth having on the
             | radar.
             | https://arxiv.org/abs/2105.06413
             | https://openfl.readthedocs.io/en/latest/index.html
             | https://gramineproject.io/
         | cpmpcpmp wrote:
         | If you're concerned about data leakage, it's worth noting that
         | model weights can very easily be used to reconstruct the
         | original data that it was trained on: so it could be misleading
         | to claim that user data isn't being shared over the network. To
         | avoid this, you'd need to look into techniques like Secure
         | Aggregation or local differential privacy. Flower does provide
         | some of this, FWIW.
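The client-side part of a differential-privacy approach usually comes down to two steps: clip each update to a maximum L2 norm, then add Gaussian noise scaled to that norm. The sketch below shows only those two steps. It is not Flower's DP wrapper, the function names and parameter values are assumptions, and choosing a noise multiplier that meets a target privacy budget needs a privacy accountant that is not shown.

```python
import math
import random

def clip_update(update, clip_norm):
    """Scale the update down so its L2 norm is at most clip_norm."""
    norm = math.sqrt(sum(v * v for v in update))
    factor = min(1.0, clip_norm / norm) if norm > 0 else 1.0
    return [v * factor for v in update]

def privatize(update, clip_norm, noise_multiplier, rng):
    """Clip, then add Gaussian noise with std = noise_multiplier * clip_norm."""
    clipped = clip_update(update, clip_norm)
    sigma = noise_multiplier * clip_norm
    return [v + rng.gauss(0.0, sigma) for v in clipped]

rng = random.Random(0)
raw = [3.0, 4.0]                                   # L2 norm is 5.0
clipped = clip_update(raw, clip_norm=1.0)          # rescaled to norm 1.0
noisy = privatize(raw, clip_norm=1.0, noise_multiplier=0.5, rng=rng)
print(clipped, noisy)
```

Clipping bounds how much any single client can move the aggregate, and the noise masks what remains. Larger `noise_multiplier` values give more protection at the cost of model accuracy.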
         | tanto wrote:
         | Hi guites, Thank you! That is undoubtedly something relatable.
         | We have it on our radar and plan to provide helpful material
         | and presentations helping to convince stakeholders. If you are
         | up for a call to share the specific challenges, we could ideate
         | with you.
           | guites wrote:
           | Would love to! You can grab my email on my profile. Could you
           | ping me over there? Thanks
       | blintz wrote:
       | This is really cool. Federated learning seems like it could
       | unlock a lot of value in healthcare settings.
       | Have you had any luck convincing hospitals / insurers / etc that
       | this satisfies HIPAA and is safe? How do you convince them?
         | JohnFen wrote:
         | What about patients? I would be very, very worried about going
         | to a health provider that participated in this.
       | jaggirs wrote:
       | It has been shown that the input data can be reverse-engineered
       | from the model weights. How do you deal with this issue?
         | technologia wrote:
         | looks like they've introduced some differential privacy
         | wrappers, the changelog points to that:
         | https://github.com/adap/flower/blob/94a1f942abfce5dff4e9aff2...
           | danieljanes wrote:
           | Thanks for adding this here! We added these DP wrappers, and
           | we're working on something similar for Secure Aggregation,
           | but I must admit that we have to document them better to make
           | using them easier for everyone
           | niclane7 wrote:
           | Yes, we have developed modular and efficient secure
           | aggregation and differential privacy solutions that can help
           | people dial in the amount of protection they need. We have
           | documented an early version of the secure aggregation here:
           | https://flower.dev/docs/secagg.html Documentation and updates
           | on both methods will be released soon.
             | technologia wrote:
             | I hate to ask a product comparison question, but why would
             | I use this versus other projects like PySyft?
               | niclane7 wrote:
               | Thanks for the question, very natural to ask. We are also
               | fans of PySyft. It offers support for a very wide range of
               | privacy enhancing machine learning tools. But where
               | Flower and PySyft differ is in focus. Federated learning
               | is difficult and requires many technical moving parts all
               | working together (e.g., secure aggregation, differential
               | privacy, scalable simulation, device deployments,
               | integration with conventional ML frameworks etc.). All of
               | these need to be tightly integrated, and in a manner that
               | performs federated learning efficiently. This is where
               | Flower currently excels. It offers comprehensive,
               | extensible and, most important, _easy to use_
               | construction of federations that need these different
               | parts together. We believe it offers the best user
               | experience for federated learning currently out there. We
               | hope in the future many tool suites that offer private
               | machine learning (like PySyft and others) will actually
               | adopt Flower components so we can all work better
               | together.
               | williamtrask wrote:
               | Can confirm that PySyft is currently in the process of
               | integrating with Flower. Best of both worlds.
               | danieljanes wrote:
               | Indeed - looking forward to this
               | technologia wrote:
               | I appreciate you taking the time to break this down, I've
               | spent a decent chunk of time having to roll my own stuff
               | so when pygrid/pysyft came along it was just easier. I
               | will say the flower components look interesting and I'll
               | give it a shot
       | dontreact wrote:
       | There is so much hype around federated learning but often the
       | hard and insurmountable part of this is federated labeling.
       | For example for your cancer use case, you have to convince
       | multiple hospitals to feed the system labels and this is a very
       | very tall ask.
       | For healthcare it's also not clear how to get a regulatory
       | clearance if you can't actually test the performance of the
       | federated deployments.
       | So while federated learning solves some problems generated by an
       | unwillingness to share data, it doesn't solve all of them.
       | Describe the use cases of your product carefully.
         | niclane7 wrote:
         | Regarding federated labeling, you might be interested in some
         | recent prototypes built on Flower that use forms of
         | self-supervised learning. By combining SSL with federated
         | learning we can start to leverage unlabeled data and this will
         | be a big deal once it becomes commonplace. I'd suggest looking
         | at these two research papers that build on Flower and include
         | members of the Flower team as authors:
         | https://arxiv.org/abs/2207.01975
         | https://arxiv.org/abs/2204.02804
       | juanma91p wrote:
       | Great to see Flower here! We use the framework for our projects
       | because of its modularity, scalability, and ease of use. Another
       | important aspect of FL, on top of the already mentioned privacy
       | preservation, is network resource utilisation. By transferring
       | only the weights of the model, less bandwidth is required, which
       | can reduce network congestion. This is especially important given
       | that it is expected that by 2030, more than 50 billion devices
       | will be connected and transferring data.
         | danieljanes wrote:
         | Great to hear, thanks for sharing - modularity, scalability,
         | and user friendliness are what we think a lot about :)
       | yawnxyz wrote:
       | Hi! As someone new to all of this -- how would I interact with
       | the trained data after it's been trained?
       | Is it possible to create a conversation or QA style interaction
       | with it? I see there's examples of "pytorch" but as someone
       | new -- I'm not sure what that means in terms of public use cases.
       | I guess what I'm asking is "ok I use Flower to train on a bunch
       | of stuff... then what do I do with that?"
       | Thanks!
         | danieljanes wrote:
         | Hi there - the data never moves if you train a model using
         | federated learning. It stays on user devices or in
         | organizational silos. After the training, you have the model
         | parameters of the model on the server, without the server
         | having ever seen a single data example.
         | After the training, you can deploy the model in different ways.
         | If you want to use it on device (or in one of the
         | organizational silos), you can send the final model parameters
         | there and deploy it locally. Or you just deploy the model on
         | the server behind an API. It all depends on the use case.
         | Hope that helps, I'm happy to provide more details.
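To make the "then what?" step concrete: once training ends, the server holds an ordinary set of model parameters that can be saved and loaded like any other model. The sketch below assumes a hypothetical two-parameter linear model and a JSON file format, both chosen purely for illustration. A real project would use its ML framework's own checkpoint format.

```python
import json
import os
import tempfile

def save_parameters(params, path):
    """Write the final global parameters [w, b] to disk."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump({"w": params[0], "b": params[1]}, f)

def load_model(path):
    """Return a predict function built from the saved parameters."""
    with open(path, encoding="utf-8") as f:
        p = json.load(f)
    return lambda x: p["w"] * x + p["b"]

final_params = [3.0, 1.0]  # hypothetical result of federated training
path = os.path.join(tempfile.mkdtemp(), "model.json")
save_parameters(final_params, path)

predict = load_model(path)  # could run on a device, in a silo, or behind an API
print(predict(2.0))
```

From here on, nothing is specific to federated learning: the same file can be shipped to devices for local inference or loaded by a server process.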
       | northlondoner wrote:
       | Many congratulations! Glad to hear about UK & EU collaborative
       | innovation in open-source projects. Keep up the fantastic work!
       | Others asked a similar question regarding comparable projects.
       | What's your take on OpenFL from Intel? Do you think Flower is
       | moving in a more commercial MLOps direction? OpenFL looks
       | particularly focused on the academic imaging community.
       | elijahbenizzy wrote:
       | Congratulations! Really excited for you!
       | I love how you found a niche, valuable problem, built a
       | framework, and are seeing a lot of success. A question (and I'm
       | far from an expert so let me know if the assumptions are wrong):
       | It seems to me that the federated users have to be coordinated
       | around timing for this to work. Otherwise this could take
       | weeks/lots of slack messages for a single model to train. E.G.
       | one team is having infra issues and doesn't get a job started,
       | the other team is ready but then their lead goes on vacation,
       | etc... In the internal-to-an-organization case this is probably
       | fine (E.G. a hospital where the data has to be separated by
       | patient/cohort), but if there are different teams managing the
       | data then (a) have you seen this problem and (b) do you have
       | tooling to fix it?
         | danieljanes wrote:
         | Thanks, we're excited too!
         | Flower tries to automate this as much as it can. In cases where
         | multiple organizations are involved, the workload can run in a
         | fully automated manner if that's fine for all organizations. If
         | a review step is required, that can be integrated (either on
         | the client side or on the server side) - the availability of
         | reviewers will then become the bottleneck for end-to-end
         | latency.
         | In the long run, we will evolve the permissioning system to
         | allow workloads to be automatically executed if they fall
         | within pre-approved boundaries, or require manual review if
         | they don't. Pre-approved boundaries could, for example, be used
         | to configure a particular combination of models and
         | hyperparameter ranges that are ok to run without additional
         | (manual) approvals.
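A check against pre-approved boundaries could be as simple as the sketch below. This is not an existing Flower feature. The `POLICY` structure, the model names, and the ranges are hypothetical, chosen only to show how a workload might be routed to automatic execution or manual review.

```python
# Hypothetical policy agreed on by a participating organization.
POLICY = {
    "allowed_models": {"resnet18", "logreg"},
    "ranges": {"learning_rate": (1e-4, 1e-1), "local_epochs": (1, 5)},
}

def review_decision(workload, policy=POLICY):
    """Return ("auto", []) if the workload is inside the pre-approved
    boundaries, otherwise ("manual", reasons)."""
    reasons = []
    if workload["model"] not in policy["allowed_models"]:
        reasons.append(f"model {workload['model']!r} is not pre-approved")
    for name, value in workload["hyperparameters"].items():
        if name not in policy["ranges"]:
            reasons.append(f"hyperparameter {name!r} has no approved range")
            continue
        low, high = policy["ranges"][name]
        if not low <= value <= high:
            reasons.append(f"{name}={value} outside [{low}, {high}]")
    return ("auto" if not reasons else "manual", reasons)

in_bounds = {"model": "logreg",
             "hyperparameters": {"learning_rate": 0.01, "local_epochs": 2}}
out_of_bounds = {"model": "gpt", "hyperparameters": {"learning_rate": 0.5}}
print(review_decision(in_bounds))
print(review_decision(out_of_bounds))
```

Returning the reasons alongside the decision gives a human reviewer something specific to look at when a workload falls outside the boundaries.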
           | elijahbenizzy wrote:
           | Awesome! Makes sense. I think the challenge is going to be
           | coordinating with the various orchestration systems --
           | timeouts, etc. Excited to see how you pull it off!
       (page generated 2023-03-22 23:00 UTC)