[HN Gopher] Launch HN: SigNoz (YC W21) - Open-source alternative...
       ___________________________________________________________________
        
       Launch HN: SigNoz (YC W21) - Open-source alternative to DataDog
        
       Hi HN, Pranay and Ankit here. We're founders of SigNoz (
       https://signoz.io ), an open source observability platform. We are
       building an open-core alternative to DataDog for companies that are
       security and privacy conscious, and are concerned about the huge
       bills they need to pay to SaaS observability vendors.

       Observability means being able to monitor your application
       components - from mobile and web front-ends to infrastructure - and
       being able to ask questions about their states: things like
       latency, error rates, RPS, etc. Better observability helps
       developers find the cause of issues in their deployed software and
       solve them quickly.

       Ankit was leading an engineering team, where we became aware of the
       importance of observability in a microservices system where each
       service depended on the health of multiple other services. And we
       saw that this problem was getting more and more important, esp. in
       today's world of distributed systems.

       The journey of SigNoz started with our own pain point. I was
       working in a startup in India. We didn't use application monitoring
       (APM) tools like DataDog/NewRelic as they were very costly, though
       we badly needed them. We had many customers complaining about
       broken APIs or a payment not processing - and we had to get into
       war room mode to solve it. Having a good observability system would
       have allowed us to solve these issues much more quickly.

       Not having any solution which met our needs, we set out to do
       something about this.

       In our initial exploration, we tried setting up RED (Rate, Error
       and Duration) and infra metrics using Prometheus. But we soon
       realized that metrics can only give you an aggregate overview of
       systems. You need to debug why these metrics went haywire. This led
       us to explore Jaeger, an open source distributed tracing system.

       Key issues with Jaeger were that there was no concept of metrics in
       Jaeger, and the datastores supported by Jaeger lacked aggregation
       capabilities. For example, if you had tags of "customer_type:
       premium" for your premium customers, you couldn't find the p99
       latency experienced by them through Jaeger.

       We found that though there are many backend products - an open
       source product with UI custom-built for observability, which
       integrates metrics & traces, was missing. Also, some folks we
       talked to expressed concern about sending data outside of their
       boundaries - and we felt that with increasing privacy regulations,
       this would become more critical. We thought there was scope for an
       open source solution that addresses these points.

       We think that currently there is a huge gap between the state of
       SaaS APM products and OSS products. There is scope for open core
       products which are open source but also support enterprise scale
       and come with support and advanced features.

       Some of our key features - (1) Seamless UI to track metrics and
       traces (2) Ability to get metrics for business-relevant queries,
       e.g. latency faced by premium customers (3) Aggregates on filtered
       traces, etc.

       We plan to focus next on building native alert managers, support
       for custom metrics and then logs (waiting for OpenTelemetry logs to
       mature more for this). More details about our roadmap here (
       https://signoz.io/docs/roadmap )

       We are based on Golang & React. The design of SigNoz is inspired by
       streaming data architecture. Data is ingested to Kafka and relevant
       info & meta-data is extracted by stream processing. Any number of
       processors can be built as per business needs. Processed data is
       ingested to a real-time analytics datastore, Apache Druid, which
       powers aggregates on slicing and dicing of high dimensional data.
       In the initial benchmarks we did for self-hosting SigNoz, we found
       that it would be 10x more cost-effective than SaaS vendors (
       https://signoz.io/blog/signoz-benchmarks/ )

       We've launched this repo under MIT license so any developer can use
       the tool. The goal is to not charge individual developers & small
       teams. We eventually plan on making a licensed version where we
       charge for features that large companies care about like advanced
       security, single sign-on, advanced integrations and support.

       You can check out our repo at https://github.com/SigNoz/signoz We
       have a ton of features in mind and would love you to try it and let
       us know your feedback!
        
       Author : pranay01
       Score  : 165 points
       Date   : 2021-02-09 16:29 UTC (6 hours ago)
        
       | sbuccini wrote:
       | How is this different than Opstrace?
       | 
       | https://news.ycombinator.com/item?id=25991485
        
         | knrz wrote:
         | I guess YC is diversifying their bets :-)
        
         | pranay01 wrote:
         | As we understand, both of us are taking a very different
         | approach. Opstrace is removing the operational burden of
         | running existing open source projects like Cortex & Loki, while
         | we are building a new observability platform including the UI.
         | 
         | We are focusing on making the experience seamless like existing
         | SaaS tools rather than stitching together disparate tools. We
         | are more focused on observability with traces rather than only
         | metrics and logs - and support things like custom aggregates on
         | traces. We believe that going from metrics to traces to find
         | the exact root cause will be increasingly important.
        
         | zapita wrote:
         | I think PostHog is a more relevant comparison:
         | https://posthog.com
        
           | pranay01 wrote:
           | Yes, PostHog is one of the projects we really like.
           | 
           | They are also taking a similar approach of providing a great
           | open source alternative to existing SaaS tools.
           | 
           | Though we are in very different domains - PostHog primarily
           | deals with product analytics, while we focus more on
           | application monitoring like finding application latency of
           | your deployed applications, finding error rates in APIs, etc.
           | 
           | Our product will be useful for devops engineers while PostHog
           | is for product managers & digital marketing managers.
        
           | nijave wrote:
           | Posthog looks like OSS MixPanel
        
             | ignoramous wrote:
             | nit: public github repo != f/oss.
        
       | [deleted]
        
       | tnolet wrote:
       | First of all: I love the idea, effort and everything in general.
       | So take my comment lightly.
       | 
       | Datadog is big because they shipped a gazillion integrations
       | across I don't know how many products.
       | 
       | Anytime I see an "alternative to Datadog" I think: so you are
       | going to have an agent and integrations page that integrates with
       | everything from HAproxy to Kafka to the full AWS and Azure API's
       | and etc. etc. etc?
        
         | pranay01 wrote:
         | Thanks. You make an interesting point. That is actually
         | something we are constantly asked by our users. I guess our
         | approach would be to prioritise developing integrations based
         | on community demand, and as we mature as a project - we would
         | possibly have integrations contributed by the community also.
         | 
         | Though one thing which is making things a bit simpler for us is
         | the increasing maturity of OpenTelemetry (
         | https://opentelemetry.io/ ). It is an instrumentation library
         | which supports many languages and frameworks, and by supporting
         | OpenTelemetry we get at least instrumentation for many
         | languages and frameworks in one go.
        
           | [deleted]
        
       | rubiquity wrote:
       | Kafka and Druid are expensive and complicated components to run
       | for an open core biz trying to help people save money against
       | Datadog. This guarantees a lot of people won't take advantage of
       | the "open" part of your open core but maybe that doesn't matter
       | for your business anyway.
       | 
       | I can't speak to Druid but I'm always puzzled when I hear about
       | Kafka being used for metrics. Most metrics are timestamped and
       | also support being calculated in ways that support out of order
       | handling.
       | 
       | It's true that people are being gouged on storage markup by
       | monitoring companies but I don't think this particular approach
       | is the solution. Obsessing over storage and querying costs isn't
       | a good starting point for a startup so maybe driving good habits
       | (stop collecting so much junk, keep it around for less time,
       | etc.) is a better route to help people save quiche. Either way
       | good luck!
        
         | gen220 wrote:
         | I'd totally agree that the operating costs (speaking eng time,
         | which is more expensive than machines) of Kafka+ZK alone are
         | quite high.
         | 
         | If a company is not already using Kafka, they wouldn't want to
         | maintain it "just" to have a self hosted APM system.
         | 
         | If I could make a recommendation to the developers of this
         | system, it would be to focus on the _interface_ with the
         | streaming platform, before the implementation of using Kafka to
         | support that interface.
         | 
         | Ideally, one should be able to plug in and out the queueing
         | system of preference.
         | 
         | This will help adoption, and avoid coupling the success of your
         | project to the implementation and success of Kafka.
        
           | jwatte wrote:
           | Even if you maintain Kafka for business logic, you don't want
           | to run your observability in the same Kafka cluster, because
           | then when the business Kafka goes down, how will you debug
           | it?
        
             | ankitnayan wrote:
             | Correct. Ideally the monitoring stack should be outside the
             | blast radius of other applications. Would handling another
             | Kafka cluster (probably smaller than the business Kafka) be a
             | pain for the team, given the team already knows how to manage
             | one business Kafka? What do you think?
        
           | ankitnayan wrote:
           | I completely agree with you. For companies not already using
           | Kafka, this will ask for a big commitment to self-host Kafka.
           | 
           | You mentioned a great approach. Queueing system as a plugin.
           | Thanks
        
         | TameAntelope wrote:
         | Kafka has a terrible reputation, but once you get familiar with
         | it, it only occasionally lives up to that reputation, and
         | oftentimes outperforms expectations by quite a bit.
        
           | ankitnayan wrote:
           | I was pretty surprised to see the results too. A single-node
           | Kafka with 2GB as the Xmx value was ingesting 4500 events/sec
           | (around 1MB/s) on a single partition.
           | 
           | I blogged my experiments with SigNoz's scale at
           | https://signoz.io/blog/signoz-benchmarks/. Hoping to get
           | better in fine-tuning configs and blogging.
        
             | gen220 wrote:
             | I think the concerns raised in this thread are less
             | regarding raw throughput, and more about (1) the complexity
             | of the typical production Kafka deployment (2) the arguably
             | unnecessary, highly complex ecosystem around Kafka that you
             | have to pay people or companies to use effectively, (3) the
             | history of problems regarding data loss with ZK/Kafka,
             | caused by leadership election bugs.
        
               | ankitnayan wrote:
               | Hmm, I get your point. I searched for Kafka alternatives
               | for a bit before including it in our stack, though I
               | couldn't find anything more widely adopted. It would be
               | good to know a few Kafka alternatives you prefer which
               | can handle equivalent production scale.
        
               | gen220 wrote:
               | I agree with my sibling comment, and reiterate my cousin
               | comment that you've replied to (commenting here to
               | complete this sub-tree).
               | 
               | Queuing technologies will come and go, IMO it's better to
               | focus on the interface, and allow people to swap in
               | whatever implementation they prefer and are accustomed
               | to. It also benefits you in the long-term too, because an
               | application that is less-coupled to a particular external
               | dependency will be easier to test.
               | 
               | Some examples of queuing tech that's deployed
               | successfully at scale: Redis Streams, RabbitMQ, Amazon's
               | SQS. Since this is written in Go, you could even offer an
               | in-memory, channel-oriented stream implementation, with
               | no external dependencies.
               | 
               | Not one of these is universally better than Kafka: each
               | offers a set of trade-offs, but a very similar interface
               | from SigNoz's point of view.
               | 
               | For SigNoz's hosted/tenant-based solution, it might
               | absolutely make more sense to use Kafka. But self-hosted
               | users bring different trade-offs to the table, and might
               | prefer to use another solution.
               | 
               | Strategically, you can write/maintain the plugin for Kafka
               | (very similar to how you operate right now, except it
               | leaves the door open to more plugins existing in the
               | future), and encourage community contributions for other
               | tech. Or, when you're big enough, you might want to
               | employ people to maintain those plugins too, since
               | they're good for adoption.
        
               | ankitnayan wrote:
               | Really liked how clearly you put things. Thanks
               | for these inputs and suggestions, will definitely think
               | harder on this.
        
               | dflock wrote:
               | Have an interface for a queuing system and support other
               | things, not just Kafka. Ideally, you want a default/dev
               | instance to ship with something super simple, zero setup
               | and maybe in-memory - but allowing you to swap-in kafka
               | or something more capable as needed.
        
               | pranay01 wrote:
               | That's an interesting point. Curious, would you use a
               | project which supports a simple/in-memory datastore, but
               | not anything which would be useful in a production
               | environment? Do you think that being easy to get running
               | and set up in a dev environment is valuable for adoption -
               | even if it won't work in prod?
               | 
               | I am trying to understand - what would be a good way to
               | prioritise.
        
               | stmw wrote:
               | Exactly right. In my personal experience, Kafka's
               | reputation for data loss and other mishaps is well-
               | earned. Some of them are well explained by Jepsen tests.
        
           | stmw wrote:
           | Guess you were luckier? YMMV, I've found Kafka generally
           | lives up to its terrible reputation -- and even when it
           | doesn't, it's all somehow more difficult than its initial
           | appeal. I certainly agree with others that inclusion of Kafka
           | in an open-source package like this would discourage me from
           | using it.
        
             | pranay01 wrote:
             | Curious, is there an alternative to Kafka which would be
             | easier for you to adopt?
        
               | vladsanchez wrote:
               | - [RedPanda|https://vectorized.io/redpanda/]?
               | 
               | - [Apache Pulsar|https://pulsar.apache.org]?
        
               | pranay01 wrote:
               | We had checked out both these projects. Our view was that
               | RedPanda was still an early project ( ~1.5K stars) and
               | Pulsar was very similar to Kafka, and Kafka was more well
               | known compared to Pulsar
        
               | stmw wrote:
               | some options:
               | 
               | - https://pulsar.apache.org/
               | 
               | - the many systems based on
               | https://en.wikipedia.org/wiki/Advanced_Message_Queuing_Proto...
               | 
               | - cloud-specific queues (SNS, Kinesis et al)
        
               | TameAntelope wrote:
               | Kafka does not fill the same use cases as these
               | suggestions, this may be part of the trouble you were
               | experiencing!
               | 
               | My use of Kafka was as a "system of record", and
               | attaching connectors to create views into the data from
               | there.
               | 
               | I could replay a Kafka topic into a MongoDB, run some
               | analysis, and destroy the MongoDB instance.
        
             | TameAntelope wrote:
             | I don't think it was luck, I just continued to learn about
             | the mistakes I was making, until I resolved the problems
             | people typically bail upon encountering.
        
             | stmw wrote:
             | TameAntelope - sorry I should've phrased it differently,
             | not "lucky" but more skillful as you say.
        
               | TameAntelope wrote:
               | I should say I had a team backing me up, and I lost many
               | nights and a handful of weekends because Kafka didn't do
               | what we expected.
               | 
               | The journey to feeling good with Kafka was difficult, but
               | I was too stubborn to let us give up. :)
        
         | ankitnayan wrote:
         | Nice thoughts. A few other users also pointed this out.
         | 
         | We observed that enterprise and other observability SaaS
         | vendors have some scripts and controllers to keep these
         | components running. We plan to open-source that too. As you
         | rightly pointed out, running OSS needs man-hours and we will
         | try to remove those frictions.
         | 
         | Also, when working with Prometheus and Jaeger, we observed that
         | people have to use Kafka anyhow to handle scale, and most OSS
         | tools are good at the start but become pretty complicated at
         | scale. E.g., Prometheus's long-term storage solution is Cortex,
         | which itself is difficult to manage. In that case, Kafka should
         | be a better beast to handle than the multiple moving components
         | inside Cortex. We built SigNoz as a scalable alternative
         | inspired by stream processing architecture.
         | 
         | We will also be providing sampling strategies, including
         | tail-based sampling, to retain important data and not
         | unnecessarily clog disks.
        
       | jzer0cool wrote:
       | Leaving a note here so I can come back and visit to try this out.
       | I was recently looking at some new monitoring services, so I'd
       | like to try this and see how it goes.
        
       | esseti wrote:
       | what about an ELK stack?
        
         | ankitnayan wrote:
         | Hey, I am one of the maintainers of SigNoz. ELK is tightly
         | coupled to Elastic which may not be the ideal database to
         | handle OpenTelemetry data. We wanted to be more of a platform
         | where we can provide different DBs as plugins. Users can also
         | build their own use cases by building more stream processing
         | applications.
         | 
         | On the other hand, Druid powers analytical queries on data and
         | is efficient in handling high-dimensional data. Many companies
         | use Druid at scale (https://druid.apache.org/druid-powered).
         | 
         | Also Jaeger, a distributed tracing tool, provides plugins for
         | Cassandra, Elastic, Badger, etc. Some users found limitations in
         | running fast aggregation of filtered traces. With Druid we can
         | now search by annotations (without need of a service name) and
         | get aggregates on filtered traces, like p99 of version=xyz
         | filters.
        
       | polskibus wrote:
       | How do you compare to opstrace that has also launched recently?
       | 
       | https://news.ycombinator.com/item?id=25991485
       | 
       | Another comparison I'm interested in is Microsoft's Application
       | Insights. What is your value prop over their offering?
        
         | pranay01 wrote:
         | Reg. Opstrace, as I understand, they are taking a very
         | different approach than us. Have answered this in an earlier
         | comment - https://news.ycombinator.com/item?id=26079637
         | 
         | Regarding Application Insights, I have not used the product -
         | so don't have much idea about detailed features. But generally
         | application monitoring tools provided by cloud vendors like
         | MSFT, AMZN, etc. are very tied to that particular cloud - and
         | are not as advanced as independent APM products like DataDog.
         | Also, some users prefer to keep monitoring independent of cloud
         | vendors so that it's easier to change cloud vendors and have a
         | multi-cloud strategy.
        
       | amzans wrote:
       | Congrats on the launch! It's always nice to see alternatives in
       | this space.
       | 
       | I just have a couple of observations:
       | 
       | > Industry trusted Kafka & Druid to handle enterprise scale. No
       | scaling pains. Ever.
       | 
       | From my (limited) experience, Kafka and Druid are not exactly
       | simple pieces of infrastructure for most shops, often requiring
       | significant effort to scale and maintain.
       | 
       | Also, in the past I've had some pains supporting those self-
       | hosting my open source projects, and just wanted to give some
       | friendly suggestions:
       | 
       | - A quickstart guide plus a "Production tips" article would be
       | really helpful for those self-hosting.
       | 
       | - A troubleshooting guide would help reduce common support
       | requests.
       | 
       | - Creating a chat group or a forum can reduce the load as users
       | might help each other out.
       | 
       | It's mostly about small things that can help save you time and
       | effort, while making it easier for people to adopt the project.
       | 
       | Besides that, I think a lot of the value DataDog provides is in
       | the form of integrations with pretty much every other service out
       | there. We use plenty of these at my day job and it's particularly
       | useful to connect PagerDuty/Slack to the monitoring system. Maybe
       | these features would help you drive adoption over time, and
       | enable more use cases too.
        
         | [deleted]
        
         | pranay01 wrote:
         | Thanks for your suggestions on better ways to support self
         | hosting! I agree we need to do a much better job here.
         | 
         | We chose Kafka and Druid because:
         | 
         | 1. Any company which reaches a decent scale invariably uses
         | some form of Kafka. And it is a trusted system which scales up
         | to huge scale.
         | 
         | 2. Community adoption and support. When choosing a datastore, we
         | also evaluated Apache Pinot & ClickHouse, but Druid seemed to
         | have the best community. Also, it was proven at scale in places
         | like Lyft.
         | 
         | I agree though that these are not simple systems, and may be
         | too much for smaller orgs. We are also evaluating supporting
         | simpler datastores, but that would depend on what the community
         | demands. Our architecture is modular so we are not strictly
         | tied to druid and we can support other datastores if there is
         | interest.
         | 
         | I agree with your point around integrations. That is one of the
         | moats of DataDog in my opinion. Agree to the usefulness of
         | integrations for PagerDuty/Slack. I have added an issue for
         | this -
         | https://github.com/SigNoz/signoz/issues/21#issue-804860212
         | 
         | Though we are hoping that, being an open source project, our
         | community would be able to create integrations. Have answered
         | this in more detail in another comment -
         | https://news.ycombinator.com/item?id=26080530
        
           | julienfr112 wrote:
           | What was wrong with clickhouse ?
        
             | ankitnayan wrote:
             | Nothing wrong there. If enough users want, we can add
             | clickhouse also
        
       | polskibus wrote:
       | What are your plans on supporting open telemetry?
        
         | pranay01 wrote:
         | We do support OpenTelemetry. Our current instrumentation
         | instructions are in OpenTelemetry and our stack also uses the
         | otel collector.
        
       | vhiremath4 wrote:
       | Off-topic, but I noticed you're a Loom user from the demo video
       | you created. Just wanted to say thank you for recording with us!
       | (co-founder)
        
         | pranay01 wrote:
         | Loom is awesome!
        
       | BringerOfChaos wrote:
       | "We found that though there are many backend products - an open
       | source product with UI custom-built for observability, which
       | integrates metrics & traces, was missing."
       | 
       | Ummm... https://grafana.com/products/cloud/features/
        
         | spech wrote:
         | +1 for Grafana together with Loki and Prometheus. This saved us
         | so much trouble. We thought about using DD first but the costs
         | were so intransparent.
        
           | pranay01 wrote:
           | Curious, do you self-host Prom+Loki?
        
             | technics256 wrote:
             | I do it with my clients, but we're not using a lot of
             | clusters.
        
         | ankitnayan wrote:
         | Grafana, for long, has been used to monitor time-series data
         | and recently has been moving towards observability (including
         | traces and logs). We are different on quite a few fronts.
         | 
         | 1. There are observability-specific UI widgets like
         | serviceMap, SLOs and error budgets; I don't know whether
         | Grafana provides them now. Also, last I used Grafana, linking
         | and moving from one dashboard to another was still a pain. You
         | can get a better idea of how different an observability UI can
         | get from Grafana by looking into the LightStep demo.
         | 
         | 2. We can run aggregates on filtered traces. E.g., I can get the
         | 99th percentile response time of a tag, say payment_channel. I
         | am afraid this cannot be extracted from traces by Grafana.
         | 
         | 3. SigNoz is easily extensible by adding your stream processing
         | application to slice n dice data in your own way.
        
       | primitivesuave wrote:
       | This is awesome, can't wait to try it out. I run engineering for
       | a healthcare startup, where HIPAA requirements prevent us from
       | using many SaaS products since we cannot risk PII leakage to a
       | third party. Assuming this works for us, if you set up GitHub
       | sponsorship, it would take me 15 seconds to convince management
       | to financially support your project.
        
         | pranay01 wrote:
         | Thanks!
         | 
         | This is exactly the sort of use cases we had in mind. Would
         | love to work closely with you to help you in any way. If
         | possible, can you drop me a note on pranay at signoz dot io
        
       | devops000 wrote:
       | I think what is missing is a tool that suggests how to set
       | hardware parameters (RAM, CPU) and configuration settings (n.
       | workers etc...) based on usage metrics, and tells you when you
       | need to scale servers.
        
         | SteveNuts wrote:
         | That's what monitors are for, at least in datadog that's how it
         | would work. "Tell me when available workers drops below x"
        
           | devops000 wrote:
           | Yeah, but it doesn't tell you how to set up Postgres
           | parameters or Rails server workers or optimal thread size
           | based on usage metrics. There are tons of parameters to
           | config
        
         | ankitnayan wrote:
         | Great point. To start off we shall provide different hardware
         | configs like micro, small, medium, large, xlarge with the scale
         | that they can handle.
         | 
         | We soon plan to emit metrics from different components of
         | SigNoz and set up autoscaling of different components. Druid
         | has already put some thought into autoscaling. Check out
         | https://druid.apache.org/docs/latest/configuration/index.htm...
         | and https://www.adaltas.com/en/2019/07/16/auto-scaling-druid-
         | wit...
        
       | adolph wrote:
       | Best of fortune to the SigNoz team! This seems like an area where
       | many different solutions are being tried and maybe this will be
       | the right choice for some.
       | 
       | Here is the CNCF Landscape for Observability products like the
       | compared DataDog. Many of the products listed are partial
       | components that would go into an overall solution (e.g. Beats or
       | Grafana) or are specific to a particular cloud (e.g. Amazon
       | CloudWatch does AWS or onprem).
       | 
       | https://landscape.cncf.io/card-mode?category=monitoring&grou...
        
         | pranay01 wrote:
         | Thanks! We are aware of the CNCF landscape. As you mentioned,
         | most of the products are point solutions which need to be
         | combined to build an end to end solution. What we found was
         | that combining different point solution tools can be
         | non-trivial and sometimes they don't talk well with each other -
         | making correlation between different tools difficult.
         | 
         | For example, what were the traces responsible when the p99
         | latency of a service crossed a threshold? This would be
         | non-trivial to do if traces and metrics are in different
         | systems. And that's
         | why solutions like DataDog are popular as they provide a single
         | pane of view. Our motivation is to make such a 'single pane of
         | view' tool in open-source.
        
       | Sebb767 wrote:
       | Looks great! I'll set up an instance and play with it this week
       | to take a look :)
       | 
       | Only minor nitpick: Your README first describes deploying on
       | Kubernetes in the "getting started" section and then links to the
       | docker deployment guide in the documentation section. An overview
       | with "you can deploy on docker or Kubernetes", with
       | subsections/links for each one, would be great, especially since
       | it would immediately show that you don't need a full k8s cluster
       | to get started.
        
         | pranay01 wrote:
         | Cool. Hit me up on our slack community if you face any issues.
         | 
         | Good point regarding README. We certainly need to do a better
         | job at it. Will update it soon.
        
       | mayank wrote:
       | Any thoughts around the viability of open-core/self-hosted
       | monitoring, when that also entails bringing the burden of
       | scaling/monitoring your monitoring solutions in-house?
       | 
       | Do you feel that the market of people who need DataDog or Splunk
       | or Lightstep for their scale but can't afford it is large enough
       | to sustain this model? Or is this targeted at smaller shops where
       | cost overrides other concerns?
        
         | pranay01 wrote:
         | Great question! Our aim is to make the self-hosting of
         | monitoring/observability systems so simple that people would
         | prefer it compared to sending everything to SaaS vendors. We
         | think that the current open source solutions are disparate
         | systems (like prom, jaeger) and that's what makes them difficult
         | to manage. Of course, scaling Kafka/Druid is also not trivial,
         | but this can be managed by providing better scripts and
         | controllers to manage the complexity.
         | 
         | The market we are primarily targeting is customers who see that
         | they are paying huge (storage) price to Datadog/Lightstep and
         | would prefer to have things in house. Self hosting also becomes
         | more important for users who prefer data to not leave their
         | network boundaries - either due to privacy or security concerns
        
       | LawnGnome wrote:
       | Given recent events around licensing, such as Elastic moving to
       | the SSPL, choosing an MIT licence is certainly bold! Do you
       | require a CLA from contributors?
        
         | pranay01 wrote:
         | Thanks. No, we don't require a CLA from contributors. Though
         | honestly speaking, we have not given lots of thought around
         | CLAs - as we are still pretty young as a project.
        
       | gianm wrote:
       | I'm a committer on Apache Druid and generally a big fan of
       | observability. I'm glad that you found Druid useful in building
       | this!
       | 
       | A tip, if you aren't already doing it: with metric and trace
       | data, it helps a ton to set up partitioning and sorting according
       | to the query patterns you expect. Timeseries databases usually do
       | this out of the box, because they can make assumptions about your
       | query patterns, but general purpose databases like Druid usually
       | need an extra step or two. Some references:
       | 
       | https://druid.apache.org/docs/latest/ingestion/index.html#pa...
       | 
       | https://twitter.com/gianmerlino/status/1287134114844270592
        
         | pranay01 wrote:
         | Thanks for the tips. Agree, we need to fine tune our Druid
         | setup and make it more performant. If it's OK, can I reach out
         | to you on Twitter DM to get some specific advice?
        
       ___________________________________________________________________
       (page generated 2021-02-09 23:00 UTC)