[HN Gopher] Launch HN: Jitsu (YC S20) - Open-Source Segment Alte...
       ___________________________________________________________________
        
       Launch HN: Jitsu (YC S20) - Open-Source Segment Alternative
        
       Hey HN! Vlad here with Sergey, Ildar, and Kirill. We are building
       Jitsu, an open-source Segment alternative
       (https://github.com/jitsucom/jitsu, https://jitsu.com/). We help
       companies collect events from their apps, websites, and APIs and
       send them to databases.

       I've been doing data engineering for more than ten years (for half
       of that time, I didn't know it was called "data engineering").
       Before Jitsu, I was a co-founder and CTO of GetIntent, an ad-tech
       startup. Although it was ad-tech (I'm sorry for that!), we also
       built quite a fascinating technology platform. We processed up to
       1 million events per second at peak, and all those events needed
       to be stored somewhere.

       We churned through a few data warehouse platforms along the way.
       In 2013, we started with Hadoop's HDFS and a bunch of map-reduce
       jobs on top of it. Then, when we decided to allow our customers to
       run ad-hoc reports, we switched to BigQuery. BigQuery was great
       but expensive, especially with some customers obsessively clicking
       the refresh button. Finally, in 2017 we migrated to self-hosted
       ClickHouse, which in my opinion is still the best analytics
       database in the world.

       All that time, we spent a fair amount of effort getting data into
       the database. When you're dealing with millions of events per
       minute, running an INSERT statement per event won't work. What if
       the DB is down for maintenance? How can you be sure that all 50+
       edge nodes are aware of recent DB schema changes? Also, did you
       know streaming data to BigQuery is costly while batching data is
       free?

       We tried different approaches: first, we would write local log
       files, sync them to HDFS, and load the data into BQ (or
       ClickHouse) with map-reduce jobs. To improve data freshness, we
       ditched HDFS and started to send data in batches to the DB
       directly from edge servers. We experimented with Kafka, but it
       felt too complex for that task at the time.

       I always dreamed about a straightforward service to which I'd
       throw JSON objects, and it would take care of the rest: queueing,
       retrying, updating the database schema, etc.

       Then I discovered Segment. I liked it at first. It seemed very
       developer-friendly, with a nice API and excellent documentation.
       But the pricing model and data delays (an event reaches the DB 12
       hours after it has been sent to Segment) killed the whole idea.
       And it was not open-source. In my opinion, being open-source and
       self-hostable is a must for such a fundamental part of the
       architecture as data collection.

       I left GetIntent and got accepted to YC with a different idea for
       the Summer 2020 batch. The idea was to build a churn prevention
       and BI tool for online retailers. It didn't take off, but in the
       process we made a component to collect customers' app events and
       put them into a DB. We tried to hack a solution on top of the ELK
       stack, but I was frustrated with ElasticSearch's lack of SQL
       support. Here I was back to square one: there was still no good
       open-source event collection service, and we needed to build one
       once again.

       So we decided to focus solely on that problem. We ditched all the
       previous code, which was in Java, rewrote the data collection
       server in Go, and hacked together what we called EventNative [1].
       It was received very well, and we started to get users.

       Over the last 11 months, we've been busy building the UI, adding
       Connectors (to pull data from external APIs), polishing data
       warehouse support, adding JavaScript support to transform incoming
       data, and implementing dozens of other features.

       Now we're launching Jitsu, an open-source Segment alternative.
       With Jitsu, we make it easy to collect data and send it to
       databases (we support all major players: ClickHouse, Redshift,
       Snowflake, BigQuery, and Postgres). We're deployed in production
       at a large gaming publisher, an eSignature service, and many
       other great companies. We're going for an open-core model. So far
       we don't have paid features, but soon we'll have some, presumably
       around things like authorization and data masking. We also run
       Jitsu.Cloud [2], which you can buy if you don't want to self-host.

       Give it a spin: https://github.com/jitsucom/jitsu.

       Thank you for reading this story - I hope it was interesting. I
       would love to read your feedback on Jitsu and answer questions!

       [1] https://news.ycombinator.com/item?id=24120325
       [2] https://cloud.jitsu.com
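
The "throw JSON objects at it and let it queue, batch, and retry" idea from the post can be sketched roughly as below. This is only an illustration under stated assumptions, not how Jitsu is implemented: `BatchBuffer`, `write_batch`, and `max_size` are hypothetical names, and `write_batch` stands in for whatever bulk-load call a given database offers.

```python
import json
from typing import Callable


class BatchBuffer:
    """Collect JSON events and hand them to the database in batches."""

    def __init__(self, write_batch: Callable[[list[dict]], None], max_size: int = 1000):
        self.write_batch = write_batch  # hypothetical bulk-insert callback
        self.max_size = max_size
        self.pending: list[dict] = []

    def add(self, raw_event: str) -> None:
        """Queue one JSON-encoded event; flush once the batch is full."""
        self.pending.append(json.loads(raw_event))
        if len(self.pending) >= self.max_size:
            self.flush()

    def flush(self) -> bool:
        """Try to write everything pending; keep it queued on failure."""
        if not self.pending:
            return True
        try:
            self.write_batch(self.pending)
        except Exception:
            # E.g. the DB is down for maintenance: keep the events and
            # let a later flush() retry them instead of losing data.
            return False
        self.pending = []
        return True
```

One bulk write per batch replaces one INSERT per event, and a failed write leaves the events queued so that a later `flush()` can retry them.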
        
       Author : vklmn
       Score  : 192 points
       Date   : 2021-11-04 12:07 UTC (10 hours ago)
        
       | davidkell wrote:
       | Many of your integrations talk about "syncing" rather than event
       | collection, which to me sounds like what Fivetran is doing. Does
       | that distinction make sense and how are you thinking about that?
        
         | vklmn wrote:
         | We call them "push" (you send events to Jitsu) and "pull" (Jitsu
         | pulls data) integrations. Technically speaking, push
         | integrations are outnumbered. But that's because we took
         | advantage of other open-source projects (Airbyte and Singer) to
         | implement the pull ones.
         | 
         | Our core is push integrations; that's the most complex part of
         | the system. We see "pull" integrations as an additional feature
         | that helps to enrich the data after events have made it to the
         | DWH.
        
       | xondono wrote:
       | Kudos on the work done, and I wish you the best of luck.
       | 
       | A little note of advice: I wouldn't start my company description
       | as "the Y of X" or "the Y alternative to X".
       | 
       | It's okay to mention if you are similar to another well-known
       | company, but don't use it to describe your company, especially
       | not in the first line.
        
         | oDot wrote:
         | I disagree, much easier to understand when it's described like
         | this
        
           | xondono wrote:
           | I'm just saying this is better:
           | 
           | We are building Jitsu, (https://github.com/jitsucom/jitsu,
           | https://jitsu.com/) We help companies collect events from
           | their apps, websites, and APIs and send them to databases.
           | 
           | Think of us as an open-source Segment alternative.
        
         | vklmn wrote:
         | There's always a tradeoff. By describing your company as an
         | "alternative to X", you save tons of time explaining what your
         | company does. People are overloaded with information, and the
         | average attention span is very short. But I see the
         | disadvantages too! We had an internal debate about the tagline
         | a few months ago, and essentially decided to go with
         | "Open-source alternative to Segment". But it wasn't an easy
         | decision!
         | 
         | However, I think that the product matters the most. You can
         | change the tagline in a few clicks. Can't say the same about
         | the product.
        
           | xondono wrote:
           | It's okay to use that for an elevator pitch, I'd just reverse
           | the order.
           | 
           | First tell me what you do, then tell me what you are similar
           | to.
           | 
           | >> We are building Jitsu, (https://github.com/jitsucom/jitsu,
           | https://jitsu.com/) We help companies collect events from
           | their apps, websites, and APIs and send them to databases.
           | Think of us as an open-source Segment alternative
        
         | nopcode wrote:
         | I agree.
         | 
         | I hadn't heard of Segment, and now I'm reading your competitor's
         | website.
        
           | vklmn wrote:
           | That's a chance you have to take when you're making this
           | decision!
        
       | laex wrote:
       | Can we expect React Native integration any time soon?
        
         | vklmn wrote:
         | It depends on whether we get a PR anytime soon. At the moment
         | we're just 4 engineers, and unfortunately React Native is not at
         | the top of our list. I'd say we'll have it in 2 months unless we
         | get a PR earlier.
        
       | shcheklein wrote:
       | Congrats on the launch! We've started using the open source
       | version for one of the tools we are building (CLI anonymized
       | telemetry) and it looks good so far - thanks for the great product.
       | It was very easy and straightforward to get started, deploy and
       | start collecting things (into BigQuery).
       | 
       | Overall, I like this recent trend a lot - more companies are
       | building open-source, lightweight, GDPR compatible analytics,
       | chats (e.g. Papercups). I hope there will be good ways to
       | monetize and sustain this. Wish you all the best, folks!
        
       | browsec wrote:
       | Great job!
        
       | nkotov wrote:
       | Congrats on the launch!
        
       | polskibus wrote:
       | How do I scale jitsu if the load from my app servers becomes too
       | big? Will adding more jitsu nodes trample the database nodes that
       | jitsu writes to? How should I plan capacity for a jitsu
       | deployment in a multi node scenario, and what should I take into
       | consideration when scaling it?
        
         | vklmn wrote:
         | Yes, just add more jitsu nodes. It's hard to say how many nodes
         | you need (it depends on transformations, CPU/RAM, etc.), but you
         | can count on at least thousands of requests per second per node.
        
       | einpoklum wrote:
       | > We help companies collect events from their apps, websites, and
       | APIs and send them to databases.
       | 
       | For those who don't know what "Segment" is (like me) - this Jitsu
       | thing seems to only be relevant to web-based/web-oriented apps.
        
         | leetrout wrote:
         | Not affiliated but that is not accurate.
         | 
         | Segment is a fancy event router / multiplexer. You emit events
         | to it and it sends them to reporting and storage destinations.
         | 
         | It does have more features for web apps but that is not the
         | only use case.
        
           | vklmn wrote:
           | Jitsu can multiplex events and send them to different
           | destinations too. I admit, we don't have as many destinations
           | as Segment (we support Amplitude, Hubspot, GA and Facebook).
           | But we can send data to any HTTP endpoint (see the Webhook
           | destination). Since the body of the HTTP request is a
           | JavaScript expression, with a little hacking you can support
           | almost any service.
        
         | vklmn wrote:
         | You mean we don't have libs for mobile platforms / backend
         | frameworks? That's true, though we have an iOS SDK [1] and a
         | community-maintained Go client. However, all libs are merely
         | wrappers around the HTTP API [2]. We will implement other client
         | libs soon, but calling the HTTP API directly works very well too.
         | 
         | [1] https://jitsu.com/docs/sending-data/mobile-apps/ios-sdk
         | [2] https://jitsu.com/docs/sending-data/api
        
         | bmm6o wrote:
         | There's an http endpoint so it's easy to use from a browser,
         | but of course it's usable from any process that can post to the
         | endpoint.
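
As a rough illustration of "any process that can post to the endpoint", the sketch below builds a JSON POST request with Python's standard library. The URL, the event fields, and the helper name `build_event_request` are placeholders rather than Jitsu's actual API, and authentication is omitted; the real path and parameters are described in the API docs linked in the thread.

```python
import json
import urllib.request


def build_event_request(endpoint: str, event: dict) -> urllib.request.Request:
    """Build (but do not send) a JSON POST request carrying one event."""
    body = json.dumps(event).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Hypothetical endpoint and payload, for illustration only.
req = build_event_request(
    "https://track.example.com/events",
    {"event_type": "signup", "plan": "free"},
)
# urllib.request.urlopen(req) would actually send it.
```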
        
       | reidalert wrote:
       | Congrats on your launch, and looks really exciting! I'm curious
       | how this compares to tools like Snowplow [1]? I guess Jitsu comes
       | with more sources and destinations out of the box?
       | 
       | [1] https://github.com/snowplow/snowplow
        
         | vklmn wrote:
         | Essentially we're doing the same thing. But we built Jitsu to be
         | as simple as possible: you don't need to set up multiple services
         | (just one Docker service!), and data goes to the DWH almost
         | instantly. And we can pull data from more than 100 external APIs.
         | 
         | Think of us as Snowplow 2.0 )
        
         | alexdean wrote:
         | Snowplow CEO here. We haven't used Jitsu before but are very
         | familiar with Segment. It looks like Jitsu sits in the Segment
         | product family, along with Rudderstack: basically a Customer
         | Data Platform bundle of simple JSON event tracking, Fivetran-
         | style transactional/SaaS data ingest, and then relaying of data
         | out to various SaaS endpoints plus cloud DWs.
         | 
         | Snowplow started at the same time as Segment (2012) but has
         | evolved along a separate tech tree. Micro-service architecture,
         | cloud native, using Kinesis or Cloud Pub-Sub as the data
         | transit, enrichment framework plus a Confluent-style schema
         | registry supporting very rich and versioned JSON Schema-based
         | event payloads. We are built by and for data platform teams;
         | our open-source behavioral data engine doesn't have a UI (our
         | commercial Behavioral Data Platform does). Hosted trial here
         | https://try.snowplowanalytics.com/
         | 
         | Definitely room for both product families in the market! I'm
         | sure Jitsu will do great.
        
       | hasurabd wrote:
       | Great product! Lovely team!
        
       | kposehn wrote:
       | Great product. I'm a frequent user of Segment from the early days
       | and have been curious to see when an open-source competitor comes
       | around that will match feature-for-feature.
       | 
       | Thoughts:
       | 
       | 1. You've got most major ads sources that I care about, but it
       | _seems_ that there is a higher bar to implementation. Segment
       | lets me just plug in Google  & FB ads and dump the entire shebang
       | right into my data warehouse. A lot of marketing teams are going
       | to have less time/resources to deal with implementation so
       | smoothing this out is key.
       | 
       | 2. Functions are an underrated and highly powerful feature of
       | Segment. The ability to operate on data in transit, create custom
       | connectors that "just run" (akin to CF Workers) and the like is a
       | big selling point for more technically advanced marketing teams.
       | It doesn't seem present here and that would hold a customer such
       | as myself back on bigger scope projects.
       | 
       | 3. I'd love to see a "compare us to _your_ segment usage " where
       | I select my data sources and destinations to see what you cover
       | vs. Segment in a specific use case (and possibly pricing
       | advantages on a self-hosted vs. non). This would make it much
       | easier to sell through procurement and devops for new customers
       | that are switching.
       | 
       | 4. There are going to be a lot of people like me that are soon to
       | start fresh in terms of marketing stack, so going after people
       | before they select Segment might also be a play.
       | 
       | Looking forward to seeing where you all take this. Good luck!
        
         | vklmn wrote:
         | Thank you!
         | 
         | 1. That's exactly the reason we have native connectors for
         | Facebook and Google Ads (we didn't use ones from Airbyte and
         | Singer). Jitsu can pull any combination from FB/GAds -- it's
         | almost like SQL! Airbyte/Singer just can't do that. Later we're
         | going to vet other connectors too and decide if we need to re-
         | implement them
         | 
         | 2. We have functions too! https://jitsu.com/blog/javascript-
         | transform
        
           | kposehn wrote:
           | Ah, ok - I didn't see the transform compared to Functions.
           | Very cool and I like the multiplexing.
        
           | sherifnada wrote:
           | Airbyte recently added support for custom GSQL & Facebook
           | Marketing queries FYI ;)
        
       | tarun_anand wrote:
       | Adding on to the previous comment, how does this compare to
       | rudderstack?
        
         | vklmn wrote:
         | We're very similar indeed. But we attack the same problem from
         | different angles:
         | 
         | - We truly believe that our product should be accessible for
         | small teams too. That's why we made Jitsu very easy to deploy.
         | I'm not sure you can deploy Rudder on Heroku, or on any service
         | with a single Docker file.
         | 
         | - Our ETL component is open-source (and based on other great
         | OSS projects - Airbyte & Singer). RudderStack hasn't published
         | Cloud Extract (their ETL), to my knowledge.
         | 
         | - RudderStack aims to replace Segment, we go beyond that. We
         | didn't copy Segment API one-to-one, we just added a Segment
         | compatibility layer. Jitsu can be used for any kind of data. An
         | example: a few companies (including ourselves) are using Jitsu
         | to collect open-source telemetry (anonymous usage). I'm not sure
         | Rudder can be used for that use case.
        
           | zimmerx wrote:
           | Rudderstack user here (and ex Segment). Thanks for sharing
           | this stuff, very cool project.
           | 
           | Just to answer some of this:
           | 
           | - rudderstack has deploy ready helm charts, which I'd argue
           | are significantly better than docker compose or docker setups
           | because they set up all the other niggly parts. Would be cool
           | to see that here :)
           | 
           | - rudderstack has gone quite far away from replacing segment.
           | It's true that their core API is compatible and I think your
           | transformation layer is really cool. However it can be used
           | for those use cases because rudderstack doesn't really care
           | about users or user IDs and can be used for any sort of data
           | generally.
           | 
           | There's a piece in the docs talking about the fact that you
           | don't get caught by adblock - whilst this may be true when
           | someone launches it, that's not true of your platform. That's
           | just the fact that a lot of smaller businesses will not get
           | their URLs added to the ad block lists. I think it's a bit
           | misleading to mention that in such a way because technically
           | we're all tracking users and ad block is a way for users to
           | choose not to be tracked, not be tricked into being tracked
           | because someone has masked the tracking script ;) if a huge
           | client (a la Adidas or something) decided to use your scripts
           | I'm sure someone would eventually add it to the ad block
           | lists that get propagated.
           | 
           | One of the things that would be cool would be some sort of
           | opt in configuration. Segment has some awful consent SDK that
           | is really bad, would be cool to see what you do there. GDPR
           | is a big deal and browser fingerprinting is data processing.
           | It's worth looking at your comments on being GDPR compliant
           | btw https://www.eff.org/deeplinks/2018/06/gdpr-and-browser-
           | finge...
        
           | dmolot wrote:
           | Fellow YC company (hotglue) here - we love the Singer spec so
           | it is cool to see you're modeling after that. It is worth
           | giving a shout out to Meltano who is helping grow it (an
           | Airbyte competitor). Love what you all are doing Vladimir!
           | Congrats on the launch :D
        
       | ComputerGuru wrote:
       | The black banners at the top and bottom of your website break
       | scrolling on mobile.
        
         | vklmn wrote:
         | Thanks, we will fix that!
        
       | adrianthedev wrote:
       | Good job guys. Amazing work!
       | 
       | It was fun watching you grow Jitsu and love the way you provide
       | support!
        
       | okhuman wrote:
       | I miss the eventnative name :)
        
       | MaxiaNN wrote:
       | Great story. How do you feel Jitsu compares to Rudderstack?
        
         | MaxiaNN wrote:
         | Read the explanations below in the comment thread and the
         | contribution from RudderStack itself.
         | 
         | Conclusion, Jitsu is wide event based tracking whereas
         | RudderStack is focused on ID specific customer events.
        
         | vklmn wrote:
         | Check the answer in another thread!
         | https://news.ycombinator.com/item?id=29106531
        
       | leftnode wrote:
       | Congrats on the launch. Unrelated to the actual software, are any
       | of you actual Jiu-Jitsu grapplers? If not, why'd you go with the
       | name Jitsu?
        
         | [deleted]
        
         | vklmn wrote:
         | To be honest, we just grabbed the first short .com domain name
         | that a) was available at a reasonable price and b) we liked.
        
       | simplyinfinity wrote:
       | The name is 1 letter off from an already existing open-source
       | project: Jitsi
        
         | airstrike wrote:
         | Very confusing indeed
        
         | jwithington wrote:
         | i thought this was a new product line within jitsi at first
         | haha
        
         | loceng wrote:
         | It's also a Segment-like tool?
        
           | jwithington wrote:
           | Mostly communications tooling IIRC
        
       | colesantiago wrote:
       | > Jitsu solves the AdBlocker problem...
       | 
       | This line alone is enough to infuriate me. So I am unable to
       | block spying and data collection now?
       | 
       | I don't understand why we are still praising spyware tools?
        
         | [deleted]
        
         | vklmn wrote:
         | Jitsu certainly can be used to build spyware (and Linux too,
         | probably 90% of spyware is running on Linux servers), but
         | following this logic any OSS project can be accused of being
         | spyware.
         | 
         | Yes, Jitsu can be deployed on a custom domain such as
         | track.yourcompany.com. And while some AdBlockers will block
         | *.segment.com, track.yourcompany.com will remain functional. We
         | don't consider this feature unethical, though. It depends on
         | how the data collected by Jitsu is used. If the app owner sells
         | it without telling end users, that's probably bad. However, I
         | believe most of our users are using the data to improve product
         | experience. And Jitsu can be configured to respect
         | do-not-track/GDPR settings.
        
         | bberenberg wrote:
         | Where is this line from?
        
           | sol_invictus wrote:
           | https://jitsu.com/vs-segment
        
         | FunnyLookinHat wrote:
         | I think calling this spyware is a pretty far stretch. We're
         | talking about SAAS platforms that want to record events based
         | on user behavior for a myriad of purposes; these are platforms
         | that you are choosing to use, not tracking pixels dropped all
         | across the web.
         | 
         | There are dozens of valid uses for this beyond ad-tech. Where
         | we use Segment, it barely even touches marketing. Most of
         | the value for Segment is piping user lifecycle events around to
         | every platform and service you use to help enrich customer
         | experience. Sure, call it sales or marketing or ad-tech, but
         | that's really just an umbrella for trying to maximize revenue-
         | per-customer - and isn't that the point of a SAAS platform?
         | 
         | I think we should be cautious about throwing the terms
         | "spyware" and "malware" around right now; there are lots of
         | very valid cases that should be labeled as such, but if we
         | over-use the word it just makes it harder for us to delineate
         | between powerful tools being used for good/valid purposes or
         | deceptive ones.
        
         | gizdan wrote:
         | I mean, that's a common difference between a SaaS and a self-
         | hosted solution. For a SaaS service a block list can include
         | *.somesaas.com vs a self-hosted service that has its own
         | domain per owner, meaning you'd have to add every single one to
         | the list. You can always find a common pattern in the URL (e.g.
         | the API), and block based on something else. There's also an
         | issue[0] to block based on POST body.
         | 
         | [0] https://github.com/uBlockOrigin/uBlock-issues/issues/1357
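
The difference described in the comment above can be shown with a toy hostname matcher. This is a simplified sketch: real ad blockers use their own filter syntax, and `BLOCK_LIST` and `is_blocked` are hypothetical names for illustration, with the hostnames taken from the thread.

```python
from fnmatch import fnmatch

# Hypothetical block-list entry covering every subdomain of one SaaS vendor.
BLOCK_LIST = ["*.somesaas.com"]


def is_blocked(host: str, patterns: list[str] = BLOCK_LIST) -> bool:
    """Return True if the host matches any wildcard pattern in the list."""
    return any(fnmatch(host, pattern) for pattern in patterns)


print(is_blocked("api.somesaas.com"))       # True: covered by the wildcard
print(is_blocked("track.yourcompany.com"))  # False: needs its own entry
```

A single wildcard covers every customer of a hosted service, while each self-hosted domain would have to be listed separately or caught by a different kind of rule, such as a URL-path pattern.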
        
       | soumyadeb wrote:
       | Congrats on your launch. Great to see more innovation in this
       | space. Segment deserves some serious competition.
       | 
       | -RudderStack team.
        
         | neximo64 wrote:
         | What is the difference between Jitsu and Rudderstack? Aware of
         | both projects but keen to get your take.
        
           | soumyadeb wrote:
           | Disclaimer: Haven't looked at Jitsu in depth so my
           | understanding below may be limited.
           | 
           | Reading through their comment below - "Jitsu can be used for
           | any kind of data while Segment compatibility is just a thin
           | layer on top".
           | 
           | I am guessing they have built a generic event API that can be
           | used to send any JSON payload while RudderStack (like
           | Segment) has an opinionated view of events - e.g. there has to
           | be a userID (or anonymousID), that ID is persisted in a
           | cookie (for web), and every event must include that userID.
           | Furthermore, there are certain standards for event tracking
           | for specific verticals which RudderStack supports (e.g
           | eCommerce https://rudderstack.com/docs/rudderstack-api/api-
           | specificati...)
           | 
           | Having this opinionated view helps us map these events to all
           | the 100s of destinations, otherwise, you cannot send any
           | arbitrary JSON to these destinations. It also lets us build
           | more post processing in the warehouse (e.g identity
           | stitching, user sessions etc
           | https://hub.getdbt.com/rudderlabs/, we are going to build
           | more MLish applications like churn-models and release them
           | too).
           | 
           | On the other hand, it becomes hard to send generic events
           | (e.g. application telemetry) via RudderStack which seems
           | possible via Jitsu. With RudderStack, you would have to
           | create a hacky userID to tag on every event, which doesn't
           | make sense.
           | 
           | In summary, go deep on one use case (customer-data) or wide
           | as a generic event streaming platform.
           | 
           | Beyond this, there are other feature differences
           | (transformation, reverse-ETL etc) but that's not a
           | fundamental difference imo. They are just getting started and
           | are a much smaller team so that's expected. Impressive to see
           | what they have built.
        
             | vklmn wrote:
             | We have a suggested event structure
             | (https://github.com/jitsucom/jitsu/blob/master/javascript-sdk...).
             | If you want to send data to a destination, you should either
             | follow the suggested structure or write your own JavaScript
             | mapping (https://jitsu.com/blog/javascript-transform).
             | 
             | And we have DBT models too
             | https://hub.getdbt.com/jitsucom/jitsu/latest/ !
        
               | tomnipotent wrote:
               | Is that dbt project also doing the sessionization? I see
               | this:
               | jitsu_sessionization_trailing_window: 3
               | jitsu_session_inactivity_cutoff: 30 * 60
               | 
               | EDIT: No idea why valid reply from dev is marked dead,
               | but thanks! Really, really cool that you're using dbt for
               | this process.
        
               | absorbb wrote:
               | Hi! Ildar here, one of Jitsu's core engineers. Yep. That
               | is exactly what it does.
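
For readers unfamiliar with sessionization, a generic version of the idea can be sketched as follows. This is not the logic of the dbt package itself, which is not shown in the thread; it only illustrates how an inactivity cutoff such as the `30 * 60` seconds quoted above can split a user's event timestamps into sessions. The function name and the choice to treat a gap exactly equal to the cutoff as the same session are assumptions.

```python
INACTIVITY_CUTOFF = 30 * 60  # seconds, mirroring the value quoted above


def sessionize(timestamps: list[int], cutoff: int = INACTIVITY_CUTOFF) -> list[list[int]]:
    """Group event timestamps (in seconds) into sessions by inactivity gap."""
    sessions: list[list[int]] = []
    for ts in sorted(timestamps):
        if sessions and ts - sessions[-1][-1] <= cutoff:
            # Still within the cutoff of the previous event: same session.
            sessions[-1].append(ts)
        else:
            # First event, or the gap exceeded the cutoff: new session.
            sessions.append([ts])
    return sessions


print(sessionize([0, 600, 5000, 5100]))  # [[0, 600], [5000, 5100]]
```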
        
               | dang wrote:
               | (New accounts are subject to extra restrictions and
               | sometimes software kills their posts. We review those and
               | try to find and unkill all the good ones, though it takes
               | time and we do miss a few. I've marked absorbb's account
               | legit now so this won't happen again.)
        
       | dominotw wrote:
       | Is https://meltano.com/ a more general version of this ?
       | 
       | i am wondering how this compares to it?
        
         | vklmn wrote:
         | We have some overlap. In simple words: Meltano is Singer + DBT,
         | Jitsu is Event Collector + Singer + Airbyte.
         | 
         | Meltano will pull data from Singer connectors and do
         | transformations, but they won't run Airbyte connectors, and you
         | can't push data to Meltano
         | 
         | Jitsu will use Airbyte or Singer to pull data, and you can push
         | the data to Jitsu. But Jitsu won't run DBT transformation.
         | Although we can trigger DBT cloud jobs:
         | https://jitsu.com/blog/dbt-integration
         | 
         | P.S. Meltano has a CLI, and we don't (yet)
        
       | polskibus wrote:
       | what value does the airbyte / elt integration provide? Surely I
       | could just run etls on airflow or similar on tables that jitsu
       | generates?
        
       | polskibus wrote:
       | Are there any examples of what the resulting SQL tables look like
       | in postgres or clickhouse for a given event schema? I'd like to
       | know how generic it is per event type (is it sth like (event id,
       | blob), or does it try to decompose each event field into a column
       | - what about nested objects then, etc.). Knowing this would
       | greatly improve my understanding of the reusability of jitsu for
       | various event-collection tasks I may have.
        
         | vklmn wrote:
         | That's what the website is missing indeed. We have a few words
         | about that in the docs, but it's still not enough:
         | https://jitsu.com/docs/internals/jitsu-server#mapping-step
         | 
         | Overall, Jitsu tries to decompose (aka flatten) JSON as deep as
         | possible. E.g. {a: {b:1, c:2}} will become a_b=1, a_c=2. If a
         | column is missing, it will be created. We don't decompose
         | arrays so far.
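
The flattening rule described above can be sketched in a few lines. This is a minimal illustration of the stated behaviour (nested objects become underscore-joined columns, arrays are left intact), not Jitsu's actual code; `flatten` and `missing_columns` are hypothetical helper names.

```python
from typing import Any


def flatten(event: dict[str, Any], prefix: str = "") -> dict[str, Any]:
    """Flatten nested objects into underscore-joined column names."""
    columns: dict[str, Any] = {}
    for key, value in event.items():
        name = f"{prefix}_{key}" if prefix else key
        if isinstance(value, dict):
            # Recurse into nested objects, carrying the parent name along.
            columns.update(flatten(value, name))
        else:
            # Scalars and arrays are kept as a single column value.
            columns[name] = value
    return columns


def missing_columns(existing: set[str], row: dict[str, Any]) -> set[str]:
    """Columns present in the row but not yet in the table."""
    return set(row) - existing


row = flatten({"a": {"b": 1, "c": 2}, "tags": ["x", "y"]})
print(row)  # {'a_b': 1, 'a_c': 2, 'tags': ['x', 'y']}
```

A loader following this scheme would add whatever `missing_columns` returns to the table before inserting the row.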
        
       | slig wrote:
       | Congrats on launching and thanks for making it easy to deploy
       | using docker! I'd like to suggest that you make it available as a
       | 1-click app on DigitalOcean as well.
        
       | tomislavpet wrote:
       | Congrats on the launch!
       | 
       | Noticed a typo on jitsu.com - DHW should probably be DWH.
        
         | vklmn wrote:
         | Oops... Thank you very much! Vercel is deploying the fix already.
        
       | 0xferruccio wrote:
       | Congrats on the launch, much needed! Would love to see if it's
       | possible for you to connect to https://june.so for easy to use
       | product analytics.
       | 
       | Are you following the same tracking convention spec as Segment?
        
         | vklmn wrote:
         | Yes, if Segment compatibility mode is enabled.
        
       | ajbosco wrote:
       | Why would I use this instead of Airbyte?
        
         | vklmn wrote:
         | Two reasons: a) you can push events from your apps; b) if you
         | want more connectors available (Singer and a few native
         | connectors).
        
           | mritchie712 wrote:
           | Airbyte uses Singer too, why would it have fewer connectors?
           | That doesn't make sense.
        
       | finiteseries wrote:
       | Please consider posting some form of this as a blog post as well,
       | I love to hear about product inceptions.
       | 
       | HN doesn't allow lesser users with lesser eyesight to read light
       | grey on beige self text, unfortunately.
        
         | dang wrote:
         | Sorry--it's on the list to make this more configurable. I know
         | we take a long time but we get there eventually.
        
       | public_void wrote:
       | Hey congrats on the launch, clearly a lot of thought and effort
       | went into this. I'm pretty new to this space, and maybe this is a
       | dumb question but how does this differ from Mixpanel? Would I use
       | this for something different?
        
         | vklmn wrote:
         | Mixpanel will store the data for you and do visualization.
         | Jitsu just helps you get your data to your data warehouse.
         | 
         | Downside: you'll need to build all visualization by yourself.
         | Fortunately that's easy with tools such as Looker, Mode,
         | Metabase etc
         | 
         | Upside: you can do whatever you want with your data - build any
         | reports, join with other datasets, etc. You're not limited to
         | the reports the Mixpanel team builds.
         | 
         | In reality, Jitsu and Mixpanel can co-exist. Jitsu supports
         | Mixpanel as a destination (e.g. you send data to Jitsu; Jitsu
         | sends it to Mixpanel and your data warehouse).
        
       | tailspin2019 wrote:
       | This looks really cool. I'm keen to try it.
       | 
       | It looks like it might play well with my current logging system
       | of choice, Seq [0].
       | 
       | Do you support inbound webhooks? I can see webhooks as a
       | destination but not as a source?
       | 
       | [0] https://datalust.co/seq
        
         | vklmn wrote:
         | You can hack almost anything using the inbound Event API
         | (https://jitsu.com/docs/sending-data/api) and JavaScript
         | transformations (https://jitsu.com/blog/javascript-transform)
        
           | tomnipotent wrote:
           | A standard webhook source abstraction would be very useful,
           | that captures the URI, POST payload and HTTP headers.
           | 
           | This way I can set up my source in Jitsu, get a unique URL,
           | and then paste that URL into the tool generating webhook
           | events (e.g. Shopify). A normalized schema based on the JSON
           | payload doesn't need to be created for this to be useful.
        
           | tailspin2019 wrote:
           | Ok cool.
           | 
           | As a bit of feedback, I highly suggest adding Webhooks as a
           | source on your marketing site.
           | 
           | The first thing I did was navigate to the Sources page and
           | search for "webhook", which brought up no results.
           | 
           | I then searched your docs which only mention Webhooks in the
           | context of being a destination rather than a source.
           | 
           | I realise now that you have quite a flexible ingestion API,
           | but it took quite a while (and your confirmation above) to
           | understand this!
           | 
           | The product looks awesome though! Good luck with the launch.
        
             | vklmn wrote:
             | Thanks for the observation! We will add it. A fresh look at
             | our marketing materials is always appreciated!
        
       | santiagobasulto wrote:
       | I really like this, congratulations on the launch. And this is
       | such a huge space that there's definitely room for other
       | options (aside from Segment).
       | 
       | I'm a little bit out of the loop in this event processing space.
       | Do you think Jitsu could replace lower-level event processing
       | implementations such as Kafka/Kinesis? Or is it meant for more
       | "high level" marketing stuff?
        
         | vklmn wrote:
         | That's a good question. We're aiming to replace Kafka in some
         | cases. There are many ways people use Kafka, but they can be
         | roughly divided into two buckets:
         | 
         | - Kafka as a company-wide message bus: dozens of
         | (micro)services send data there, and consumers listen for
         | it. No service knows which other services will consume its
         | data. For that case, we're not looking to replace
         | Kafka -- we're going to work alongside it. We have a PR about
         | supporting Kafka as a destination [1] (Jitsu sends data to
         | Kafka), and we will support Kafka as a source at some point
         | (PRs are always welcome :))
         | 
         | - Kafka used just as a transport between a web app and a DB. In
         | that case Jitsu is a perfect replacement.
         | 
         | [1] https://github.com/jitsucom/jitsu/pull/537
         | 
         | P. S. The same applies to Kinesis too
        
           | polskibus wrote:
           | In the second point, presumably you mean only one-way
           | communication, right?
        
           | tomnipotent wrote:
           | > We're aiming to replace Kafka in some cases
           | 
           | How are you handling data between collection agents and
           | storage? With Kafka, I know what I'm getting when it sits
           | between the two and the advantages it offers.
        
             | vklmn wrote:
             | The same way as Kafka. Jitsu nodes (= collection agents)
             | write to a write-ahead log, and then either send data to
             | the destination right away or send it in batches.
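
The two delivery modes mentioned above (send right away, or send in batches) can be illustrated with one small buffer. This is a sketch under assumptions, not Jitsu's implementation: the class name `BatchingSender`, the `send` callback, and the size-based flush rule are all hypothetical.

```python
from typing import Any, Callable, Dict, List

Event = Dict[str, Any]


class BatchingSender:
    """Buffers events and hands them to `send` in groups of `batch_size`.

    With batch_size=1 every event is forwarded immediately (streaming);
    larger values trade latency for fewer, bigger writes (batching).
    """

    def __init__(self, send: Callable[[List[Event]], None], batch_size: int = 1) -> None:
        if batch_size < 1:
            raise ValueError("batch_size must be at least 1")
        self._send = send
        self._batch_size = batch_size
        self._buffer: List[Event] = []

    def add(self, event: Event) -> None:
        self._buffer.append(event)
        if len(self._buffer) >= self._batch_size:
            self.flush()

    def flush(self) -> None:
        # Swap the buffer out before sending so a failing `send`
        # cannot cause the same events to be delivered twice.
        if self._buffer:
            batch, self._buffer = self._buffer, []
            self._send(batch)


if __name__ == "__main__":
    delivered: List[List[Event]] = []
    sender = BatchingSender(delivered.append, batch_size=3)
    for n in range(4):
        sender.add({"n": n})
    sender.flush()  # push out the partial last batch
    print(delivered)
```

A production version would normally also flush on a timer, so that a quiet period does not leave events sitting in the buffer.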
        
               | tomnipotent wrote:
               | Thanks! I take it this file is where I can get started to
               | learn more:
               | 
               | https://github.com/jitsucom/jitsu/blob/0aaa74b59eb9d8c885
               | c80...
               | 
               | I see that it instantiates an "AsyncLogger" - does the
               | service wait until data is written to the log prior to
               | returning success to the client?
               | 
               | Is the WAL the same source used to feed both database
               | storage destinations and other SaaS destinations?
        
               | xtreding wrote:
               | Hi! My name is Sergey, I'm a Jitsu product engineer. I'll
               | gladly answer your question! AsyncLogger works
               | asynchronously by design. There is a Go channel through
               | which event JSON is written to the log file. To answer
               | your question: the service doesn't wait until data is
               | written to the log before returning success to the client.
               | The WAL is designed to keep event JSON across Jitsu
               | instance restarts to prevent data loss. When you deploy
               | your Jitsu application, it handles service restart signals
               | (e.g. SIGTERM) and closes database connections as well as
               | other resources. All incoming events are stored in the WAL
               | during this time. Then, after Jitsu starts, all events
               | from the WAL are passed to the main event pipeline and
               | stored in the destinations.
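
The behaviour described above (accept the event, return at once, write the log in the background, replay it after a restart) can be sketched as follows. This is an illustration only, not Jitsu's Go code: a Python queue plus a worker thread stands in for the Go channel, and the names `AsyncEventLogger` and `replay` are hypothetical.

```python
import json
import os
import queue
import tempfile
import threading
from typing import Any, Dict, List

Event = Dict[str, Any]


class AsyncEventLogger:
    """Queues events and appends them to a log file from a background thread."""

    def __init__(self, path: str) -> None:
        self._path = path
        self._queue: "queue.Queue[Any]" = queue.Queue()
        self._stop = object()  # sentinel that tells the worker to exit
        self._worker = threading.Thread(target=self._drain, daemon=True)
        self._worker.start()

    def log(self, event: Event) -> None:
        # Returns immediately; the caller does not wait for the disk write.
        self._queue.put(event)

    def _drain(self) -> None:
        with open(self._path, "a", encoding="utf-8") as log_file:
            while True:
                item = self._queue.get()
                if item is self._stop:
                    break
                log_file.write(json.dumps(item) + "\n")
                log_file.flush()

    def close(self) -> None:
        # Graceful shutdown: write out everything queued so far, then stop.
        self._queue.put(self._stop)
        self._worker.join()


def replay(path: str) -> List[Event]:
    """Read logged events back, e.g. on startup, to feed the main pipeline."""
    if not os.path.exists(path):
        return []
    with open(path, encoding="utf-8") as log_file:
        return [json.loads(line) for line in log_file if line.strip()]


if __name__ == "__main__":
    wal_path = os.path.join(tempfile.mkdtemp(), "events.log")
    logger = AsyncEventLogger(wal_path)
    logger.log({"event": "page_view"})
    logger.log({"event": "signup"})
    logger.close()
    print(replay(wal_path))
```

In this sketch, an event that is still in the queue when the process is killed abruptly never reaches the file. That is the usual cost of acknowledging the client before the write completes.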
        
               | tomnipotent wrote:
               | Is the WAL only used during restart, or also during
               | normal operations? Trying to create a mental model of how
               | data flows through the system and into destinations.
        
       | ThePhysicist wrote:
       | Great work! It's funny that YC funded Segment as well as two
       | direct Segment competitors (RudderStack is the other one, though
       | I think they initially started with a different idea and pivoted
       | to that afterwards), though given the size of YC this is probably
       | expected and Segment is probably large enough to "deserve" some
       | good smaller competition.
       | 
       | As someone who also builds an open-core product (though not
       | directly modeled after an existing closed-source product) I
       | really hope this kind of business model will become more
       | accepted.
        
         | tarun_anand wrote:
         | They have been doing that in many areas now...
         | 
         | Look at the alternatives to closed source analytics that they
         | have funded
        
           | [deleted]
        
         | vklmn wrote:
         | I don't believe that YC funded RudderStack, but they funded
         | Segment indeed! It's not uncommon. If I had to guess, every
         | batch or so has at least 2 companies directly competing with
         | each other.
        
           | soumyadeb wrote:
           | Yes, RudderStack is not YC funded. Though I have learnt a lot
           | from YC (and PG) through the years and attribute our
           | entrepreneurship to them.
           | 
           | Founder of RudderStack here.
        
             | ThePhysicist wrote:
             | Ah sorry, I mixed you up with Freshpaint, which is actually
             | a YC startup and Segment competitor that pivoted to their
             | current business model.
        
         | hima_hydra wrote:
         | I'm sure YC is out of Segment by now.
        
       ___________________________________________________________________
       (page generated 2021-11-04 23:00 UTC)