[HN Gopher] Launch HN: Patterns (YC S21) - A much faster way to ...
       ___________________________________________________________________
        
       Launch HN: Patterns (YC S21) - A much faster way to build and
       deploy data apps
        
        Hey HN, I'm Ken, co-founder of Patterns (https://www.patterns.app/)
        with my friend Chris. Patterns gets rid of repetitive
       gruntwork when building and deploying data applications. We
       abstract away the micro-management of compute, storage,
       orchestration, and visualization, letting you focus on your
       specific app's logic. Our goal is to give you a 10x productivity
       boost when building these things. Basically, we're Heroku for AI
       apps. There's a demo video here:
        https://www.patterns.app/videos/homepage/demo4k.mp4.

        Our target audience is data engineers and scientists who feel
        limited by
       Jupyter notebooks and frustrated with Airflow. They're stitching
       together business apps (e.g. CRMs or email marketing tools), AI
       models, and building proprietary automations and analytics in
       between them (e.g. generating a customer health score and acting on
       it). We want to solve the impedance mismatch between analytical
       systems (like your pipelines for counting customers and revenue)
       and automations (like a "do something on customer signup" event).

        We built Patterns because of our frustration trying to ship data
       and AI projects. We are data scientists and engineers and have
       built data stacks over the past 10 years for a wide variety of
       companies--from small startups to large enterprises across FinTech,
        Ecommerce, and SaaS. In every situation, we've been let down by the
        tools available in the market.

        Every data team spends immense time
       and resources reinventing the wheel because none of the existing
       tools work end-to-end (and getting 5 different tools to work
       together properly is almost as much work as writing them all
       yourself). ML tools focus on just modeling; notebook tools are
       brittle, hard to maintain, and don't help with ETL or
       operationalization; and orchestration tools don't integrate well
        with the development process.

        As a result, when we worked on data
       applications--things like a trading bot side-project, a risk
       scoring model at a startup, and a PLG (product-led growth)
       automation at a big company--we spent 90% of our time doing things
       that weren't specific to the app itself: getting and cleaning data,
       building connections to external systems and software, and
       orchestrating and productionizing. We built Patterns to address
       these issues and make developing data and AI apps a much better
        experience.

        At its core, Patterns is a reactive (i.e.
       automatically updating) graph architecture with powerful node
       abstractions: Python, SQL, Table, Chart, Webhook, etc. You build
       your app as a graph using the node types that make sense, and write
        whatever custom code you need to implement your specific app.

        We built this architecture for modularity, composability, and
       testability, with structurally-typed data interfaces. This lets you
       build and deploy data automations and pipelines quickly and safely.
       You write and add your own code as you need it, taking advantage of
       a library of forkable open-source components--see
       https://www.patterns.app/marketplace/components and
        https://github.com/patterns-app/patterns-components.git .

        Patterns apps are fully defined by files and code, so you can
        check them
       into Git the same way you would anything else--but we also provide
       an editable UI representation for each app. You work at either
       level, depending on what's convenient, and your changes propagate
        automatically to the other level with two-way consistency.

        One surprising thing we've learned while building this is that the
       problem actually gets simpler when you broaden the scope.
       Individual parts of the data stack that are huge challenges in
       isolation--data observability, lineage, versioning, error handling,
       productionizing--become much easier when you have a unified
       "operating system".  Our customers include SaaS and ecommerce co's
       building customer data platforms, fintech companies building
       lending and risk engines, and AI companies building prompt
        engineering pipelines.

        Here are some apps we think you might like and can clone:
        1. Free Eng Advice - a GPT-3 slack bot:
       (https://studio.patterns.app/graph/kybe52ek5riu2qobghbk/eng-a...)
       2. GPT3 Automated Sales Email Generator:
       (https://studio.patterns.app/graph/8g8a5d0vrqfp8r9r4f64/sales...)
       3. Sales lead enrichment, scoring, and routing:
       (https://studio.patterns.app/graph/9e11ml5wchab3r9167kk/lead-...)

        Oh and we have two Hacker News specials. Our Getting Started
       Tutorial features a Hacker News semantic search and alerting bot
       (https://www.patterns.app/docs/quick-start). We also built a
        template app that uses an LLM from Cohere.ai to classify HN stories
       into categories like AI, Programming, Crypto, etc.
       (https://studio.patterns.app/graph/n996ii6owwi5djujyfki/hn-co...).

        Long-term, we want to build a collaborative ecosystem of reusable
       components and apps. To enable this, we've created abstractions
       over both data infrastructure (https://github.com/kvh/dcp.git) and
       "structurally-typed data interfaces"
       (https://github.com/kvh/common-model.git), along with a protocol
       for running data operations in Python or SQL (other languages soon)
       in a standard way across any cloud database or compute engine.

        Thanks for reading this--we hope you'll take a look! Patterns is an
       idea I've had in my head for over a decade now, and I feel blessed
       to have the chance to finally build it out with the best co-founder
       on the planet (thanks Chris!) and a world-class engineering team.
        We're still in early beta and have a long road ahead, but we're ready
       to be tried and eager for your feedback!
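
        To make the node abstraction above a bit more concrete, here is
        a simplified, purely illustrative sketch of a Python node with
        structurally-typed table inputs and outputs. It is not our
        actual SDK; the Table class and node function here are
        stand-ins for the real thing.

            from dataclasses import dataclass, field

            # Illustrative only -- not the real Patterns SDK. A node
            # is a function over tables; each table carries a
            # structural schema so nodes can be checked against each
            # other when wired into a graph.

            @dataclass
            class Table:
                schema: dict     # column name -> Python type
                rows: list = field(default_factory=list)

                def append(self, record: dict):
                    assert set(record) == set(self.schema)
                    self.rows.append(record)

            def customer_health(orders: Table) -> Table:
                """Python node: orders in, health scores out."""
                scores = Table({"customer_id": str,
                                "health_score": float})
                totals = {}
                for row in orders.rows:
                    cid = row["customer_id"]
                    totals[cid] = totals.get(cid, 0.0) + row["amount"]
                for cid, total in totals.items():
                    score = min(total / 1000.0, 1.0)
                    scores.append({"customer_id": cid,
                                   "health_score": score})
                return scores

            # A reactive graph would re-run this node whenever the
            # orders table changes; here we just call it once.
            orders = Table({"customer_id": str, "amount": float})
            orders.append({"customer_id": "c1", "amount": 250.0})
            orders.append({"customer_id": "c1", "amount": 900.0})
            print(customer_health(orders).rows)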
        
       Author : kvh
       Score  : 90 points
       Date   : 2022-11-30 14:37 UTC (8 hours ago)
        
       | moklick wrote:
        | This looks so good! Congrats on the launch!
        
       | marban wrote:
        | Yahoo Pipes would have been proud.
        
       | chrisweekly wrote:
       | Wow! Congrats! Looks amazing! :)
        
       | yuppiepuppie wrote:
       | Congrats on the launch! Looks really nice. Curious about the
       | marketplace: how is it populated? Are these a set of apps that
       | are available? Can third parties build apps that would be
       | available? And could a user build a custom app?
        
         | kvh wrote:
         | The marketplace is an open ecosystem, yes! Anyone can build
         | their own components and apps and submit them. More details
          | here https://www.patterns.app/docs/marketplace-faq/, and a
          | guide for building your own:
         | https://www.patterns.app/docs/dev/building-components. It's
         | early days but our goal is coverage of all data sources and
         | sinks, the ontology layer of common transformations and ETL
         | logic, and AI / ML models.
        
       | bukhtarkhan wrote:
       | Love the focus on these AI workflows. Any plans to support other
       | AI models like Dalle / Stable Diffusion?
        
         | cstanley wrote:
         | Absolutely. After we add better support for file storage,
         | building workflows that leverage APIs like Dalle/Stable
          | Diffusion becomes much easier.
        
       | ankrgyl wrote:
       | It'd be great to show the debugging experience in the video (in
       | fact, I'd prefer seeing that over the breadth of features). E.g.
       | what happens when there's a syntax error in my sql query or the
       | python code fails on an invalid input?
       | 
        | That tends to be the critical make-or-break feature when
        | you're writing code in an app builder.
        
         | kvh wrote:
         | Agree, debugging is a critical user experience! In Patterns,
         | you'll see the full stack trace and all logs when you execute
         | Python or SQL.
        
       | nathantotten wrote:
       | I really like the mix of drag and drop with the ability to also
        | write code. This is something that I always run into with tools
        | like Zapier - they get me close, but then I need a small amount
        | of customization that isn't built in, and then I hit a wall. This
       | seems like it might be a nice solution. Congrats on the launch.
        
       | sails wrote:
       | Congrats Ken and Chris! I think Patterns is very interesting,
       | having closely followed since the early days (but not
       | affiliated).
       | 
       | For those interested, I've been doing some exploring around
       | automation tools more broadly, and this is my attempt at
       | orientation, within the context that Patterns exists (my own
       | interpretation).
       | 
       | The paradigm is within the context of data analysts and data
       | engineers required to centralise data for reporting and
       | automation (see "analytical + operational workflows" on the
       | Patterns website). This process tool chain broadly looks like the
       | following aggressively reduced flow of core components:
       | 
       | `Extract > Integrate > Analyse/Report > Automate` (and Ops)
       | 
       | - Extract: Regularly pull data into a centralised storage/compute
       | location from multiple source systems such as SaaS APIs,
       | databases, logs etc (eg Airbyte)
       | 
       | - Integrate: Combine data into a coherent unified standardised
       | data warehouse (data warehouse is a thing you build not a tool
       | you buy) (eg dbt, Pandas)
       | 
       | - Analyse: Explore data, discover facets/aspects/dimensions in
       | the combined data. Typically notebook type tools, and includes
       | data science and analytics (eg Jupyter)
       | 
       | - Reporting: Dashboards and regular reports on formalised data
       | concepts such as metrics, using BI tools (eg Metabase)
       | 
       | - Automation: Tools that exist to trigger actions on the data
       | available in the system, typically by sending those actions into
       | other systems. (eg Patterns!, also Zapier et al, which are more
       | consumer oriented, no version control etc)
       | 
       | - Ops: The tools needed to effectively achieve the above, many
       | sub-categories (eg Airflow)
       | 
       | ---
       | 
       | Some observations to then mention, specifically about
       | Patterns/Automation:
       | 
       | 1. Probably best to be used in conjunction with the previous
        | components. Automation tools seemingly can exist without
        | them, but the Integrate, Analyse and Reporting are specific,
        | highly related and likely required processes at any kind of
        | scale (team, data volume or complexity); at least this has
        | been true historically, and
       | companies have deployed significant resources to implement these
       | tools.
       | 
       | 2. However, one obviously can use just an Automation tool alone,
       | as they provide the Ops and technical components to run the full
       | `Extract > Integrate > Analyse/Report > Automate` process, and
       | possibly with the least complexity, and the best containment of
       | abstractions. This is a huge gain for small teams with limited
       | resources.
       | 
       | 3. My reservation for bigger teams is the "Integrate" component,
       | which if not done carefully (regardless of where/how) leads to a
       | mountain of technical debt to try and maintain the data
       | transformations (data modelling), and nothing but care solves
        | this very time-consuming process.
       | 
       | 4. Data is stored on Patterns, and it would be interesting to
       | know how users with pre-existing processes would extract data
       | into other tools, say for example to write scored lead IDs into
       | another system for targeting (see Reverse ETL tools).
       | 
       | 5. Most Automation tools lack a real "user input" component by
       | default, as they seem not to be designed to build user-interface
        | CRUD apps per se. This is similar to Hex.Tech (as far as I can
       | tell?) which has an "app" interface, but users cannot really
       | change the state of the app. If they reload the page, their
       | inputs will be lost (I think I am correct here for Patterns -
       | could a "non technical user" change the lead scoring parameters
       | without getting into the code?) Feels like a simple feature, but
       | probably very complex to implement.
       | 
        | 6. Patterns, at face value, feels like a better way to deploy
        | data analytics in terms of what users are trying to achieve
        | (create and share insight, and automate in specific
        | instances), with nice abstractions over storage, streams,
        | scheduling, and compute (none of which are worth direct
        | contact for an analyst if they can avoid it).
       | 
       | Distinct but possibly worthwhile mentioning these categories:
       | 
       | - Reverse ETL: A similar desire to Automation; send the data back
       | to a tool that can use it to automate something
       | 
       | - No-code: providing CRUD App development capabilities without
       | (much) code (eg Retool)
       | 
       | - Feature Flag/Remote Config: Provide non-technical users with an
       | interface into the configuration of a web-app (eg Flagsmith)
        
         | cstanley wrote:
         | Spot on analysis of the market -- we also think the next
         | frontier for data eng/science is towards further automation.
         | After all, dashboards are input to a person that makes a
         | decision and takes an action; and to your point, tools on the
         | market don't really serve this use case.
         | 
         | To this end, we plan to support more human-in-the-loop
         | workflows by expanding on our dashboard feature and enabling
         | stateful user input interfaces.
         | 
          | One small note on point #4: while we provide out-of-the-box
          | infrastructure for easy setup, Patterns can run on your
         | existing data warehouse -- so it plays nicely with big teams
         | that have existing tools.
        
           | sails wrote:
           | > makes a decision and takes an action
           | 
           | Exactly, this is such an obvious pattern and yet the second
           | half is so badly catered for.
           | 
           | #4 - Great, that is useful, not obvious from the
           | docs/examples that this is possible.
        
       | sharp11 wrote:
       | Congrats on the launch! As an iOS dev who dabbles in ML, I'm
       | having trouble understanding what you mean by "data applications"
       | and who this is for. I'm guessing it's targeted at teams that
       | crank out lots of small apps and therefore investment in learning
        | your platform would make sense? It would be helpful if you gave
        | a clearer explanation of the use case(s), beyond the generic
       | "customer data platforms, fintech companies building lending and
       | risk engines, and AI companies building prompt engineering
       | pipelines" (which, tbh, means nothing to me).
        
         | dang wrote:
         | That was a good and important question - I've moved cstanley's
         | answer to the top text so other people will get the information
         | sooner. Thanks!
        
         | cstanley wrote:
          | Our target audience is data engineers and scientists who are
          | stitching together business apps like CRMs/email marketing
          | tools etc., and building proprietary automations and analytics
          | in between them, like generating a customer health score and
          | acting on it.
         | 
         | One of the problems we've intended to solve is the separation
         | of analytical systems (like all your pipelines for counting
          | customers and revenue) and your automations (like doing
          | something on a customer signup event).
        
       | candiddevmike wrote:
       | This looks like a data equivalent of WordPress, and I don't mean
       | that in a good way. What happens when someone no longer wants to
       | use Patterns?
        
         | kvh wrote:
          | Good concern. All Patterns apps are fully defined by code that
          | you can download. We're building our open-source execution
          | engine; once that lands, you'll be able to self-host forever
          | if desired.
        
       | ricklamers wrote:
       | First want to say congrats to the Patterns team for launching a
       | gorgeous looking tool. Very minimal and approachable. Massive
       | kudos!
       | 
       | Disclaimer: we're building something very similar and I'm curious
       | about a couple of things.
       | 
       | One of the questions our users have asked us often is how to
        | minimize the dependence on "product-specific"
        | components/nodes/steps. For example, if you write CI for GitHub
       | Actions you may use a bunch of GitHub Action references.
       | 
        | Looking at the `graph.yml` in some of the examples you shared,
        | you use a similar approach (e.g. patterns/openai-completion@v4).
        | That means that whenever you depend on such components, your
        | automation/data pipeline becomes more tied to the specific tool
        | (GitHub Actions/Patterns), effectively locking in users.
       | 
       | How are you helping users feel comfortable with that problem (I
       | don't want to invest in something that's not portable)? It's
       | something we've struggled with ourselves as we're expanding the
       | "out of the box" capabilities you get.
       | 
        | Furthermore, I would have loved to see this as an open source
        | project. But I guess the second-best thing to open source is
        | some open source contributions, and `dcp` and `common-model`
        | look quite interesting!
       | 
       | For those who are curious, I'm one of the authors of
       | https://github.com/orchest/orchest
        
         | rubenfiszel wrote:
          | I am working on something adjacent to this problem. We focus
          | much less on data pipelines and more on automation, but in
          | the end we also have an abstraction for flows that one can
          | use to build data pipelines. The lock-in issue was something
          | I thought a lot about, and I ended up deciding that our
          | generic steps should just be plain code in
          | typescript/python/go/bash; the only requirement is that
          | those code snippets have a main function and return a
          | result. We built https://hub.windmill.dev, where users can
          | share their scripts directly, and we have a team of
          | moderators to approve the ones to integrate directly into
          | the main product. The goal with those snippets is that they
          | are generic enough to be reusable outside of Windmill, and
          | the Python ones might work straight out of the box for
          | Orchest.
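          | 
          | For example, a step in that style is just a script like the
          | following (a rough sketch; the endpoint and fields are made
          | up):
          | 
          |     import requests
          | 
          |     # A generic step (made-up example): plain Python, a
          |     # main function, a returned result; nothing here is
          |     # tied to any particular tool.
          |     def main(base_url: str, user_id: int) -> dict:
          |         resp = requests.get(f"{base_url}/users/{user_id}",
          |                             timeout=10)
          |         resp.raise_for_status()
          |         user = resp.json()
          |         return {"id": user_id,
          |                 "email": user.get("email"),
          |                 "active": bool(user.get("active"))}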
         | 
         | nb: author of https://github.com/windmill-labs/windmill
        
           | ricklamers wrote:
           | Thanks for chipping in.
           | 
            | I've been leaning towards this direction. I think I/O is
            | the biggest part that still needs fixing in the case of
            | plain code steps: input being data/stream plus
            | parameterization/config, and output being some sort of
            | typed data/stream.
           | 
           | My "let's not reinvent the wheel" alarm is going off when I
            | write that though. Examples that come to mind are
            | text-based (Unix /
            | https://scale.com/blog/text-universal-interface) but also
            | the Singer tap protocol
            | (https://github.com/singer-io/getting-started/blob/master/doc...).
            | And config obviously has many standard forms like ini,
            | yaml, json, environment key-value pairs and more.
           | 
           | At the same time, text feels horribly inefficient as encoding
           | for some of the data objects being passed around in these
           | flows. More specialized and optimized binary formats come to
           | mind (Arrow, HDF5, Protobuf).
           | 
           | Plenty of directions to explore, each with their own
           | advantages and disadvantages. I wonder which direction is
           | favored by users of tools like ours. Will be good to poll (do
           | they even care?).
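            | 
            | As a concrete strawman for the Arrow option (column names
            | and config keys made up), the step contract could be as
            | small as:
            | 
            |     import pyarrow as pa
            |     import pyarrow.compute as pc
            | 
            |     # Typed I/O sketch: Arrow table in, Arrow table out;
            |     # config stays a plain dict.
            |     def step(data: pa.Table, config: dict) -> pa.Table:
            |         threshold = config.get("min_amount", 100)
            |         mask = pc.greater_equal(data["amount"], threshold)
            |         return data.filter(mask)
            | 
            |     orders = pa.table({"customer_id": ["c1", "c2"],
            |                        "amount": [250, 40]})
            |     print(step(orders, {"min_amount": 100}).to_pylist())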
           | 
           | PS Windmill looks equally impressive! Nice job
        
         | kvh wrote:
         | Yes, great point, we share that concern. All of our components
         | (patterns/openai-completion@v4) are open-source and can be
         | downloaded and "dehydrated" into your Patterns app. They all
         | use the same public API available to all apps.
         | 
         | We're working towards a fully open-source execution engine for
         | Patterns -- we want people to invest with full confidence in a
         | long-term ecosystem. For us, sequencing meant dialing in the
         | end-to-end UX and then taking those learnings to build the best
         | framework and ecosystem with a strong foundation. Stay tuned!
         | 
         | Thank you for the kind words and congrats on the great work on
         | Orchest!
        
       | karam_qusai wrote:
        | This is a step ahead indeed. Good job!!
        
       | hermitcrab wrote:
       | How is it different to node-based data tools such as Alteryx,
       | Easy Data Transform or Knime?
        
         | kvh wrote:
         | Those are great tools, but built for a different era. We've
         | built Patterns with the goal of fostering an open ecosystem of
         | components and solutions that interface with modern cloud
         | infrastructure and the rest of the modern data stack, so folks
          | can build on top of others' work. As more and more data
          | lives in the cloud, in standard SaaS tools, more and more
          | businesses are
         | solving the same data problems over and over. We hope to fix
         | that!
        
           | hermitcrab wrote:
           | So more of a development platform than an end user tool?
        
       ___________________________________________________________________
       (page generated 2022-11-30 23:00 UTC)