[HN Gopher] Show HN: DataStation - App to easily query, script, ...
___________________________________________________________________
Show HN: DataStation - App to easily query, script, and visualize data

Author : eatonphil
Score  : 45 points
Date   : 2022-05-31 20:10 UTC (2 hours ago)

(HTM) web link (github.com)
(TXT) w3m dump (github.com)

| canMarsHaveLife wrote:
| How does it compare to Redash (now Databricks SQL):
| https://github.com/getredash/redash?

| [deleted]

| eatonphil wrote:
| I haven't used it, but just from looking at the GitHub page it looks
| like Redash has more advanced dashboarding features today (I'd like
| to catch up here). In contrast, Redash doesn't really allow you to
| manipulate data very much if it doesn't come in a form you want or
| if you can't get it into the right form with SQL alone.
|
| DataStation allows you to script results of database queries (or
| loaded Parquet, Excel, CSV, etc. files or HTTP API responses) in
| Python, Node, R, Julia, etc.
|
| Also, DataStation is first and foremost a desktop app today so it's
| very easy to install and use -- especially in a corporate
| environment. Data never leaves your laptop. In the future I think
| more people will use the server version of DataStation so you can
| get server features like recurring exports and hosted dashboards but
| desktop will always be supported too.

| programmarchy wrote:
| Looks very useful! In terms of feedback, I think if you brought in a
| designer you'd have a much bigger "wow" factor. There's a lot of
| low-hanging fruit like consistent button styles, fonts, whitespace,
| larger text inputs, that'd go a long way. And I'm sure you've
| thought of this already, but seems like a node-based paradigm could
| be an improvement over the panel-based paradigm e.g. more akin to
| something like Blender nodes, or Tableau.

| eatonphil wrote:
| > In terms of feedback, I think if you brought in a designer you'd
| have a much bigger "wow" factor.
| There's a lot of low-hanging fruit like consistent button styles,
| fonts, whitespace, larger text inputs, that'd go a long way.
|
| Yes this would be nice to have! If there's a version of this that
| gets funded or bootstrapped then I'd definitely like to bring
| someone on to help.
|
| > And I'm sure you've thought of this already, but seems like a
| node-based paradigm could be an improvement over the panel-based
| paradigm e.g. more akin to something like Blender nodes, or Tableau.
|
| Actually no, I'm not familiar with this concept. But I have seen
| what natto.dev does and I'm concerned that that is too free-form
| compared to how DataStation works. A little structure is useful IMO.
| I'm not sure how similar Blender nodes or Tableau are to natto.dev.
|
| That said, DataStation panels show up in an order but the order of
| evaluation is not set. You can import the results of a panel defined
| below the current panel; it just matters that the panel you refer to
| has been _run_. So it may be closer to a node-based design in that
| case. But again I'm not sure if that's what you mean.

| programmarchy wrote:
| Hadn't seen natto before, but I agree that's pretty far out there!
| If you search images of Tableau Prep, that's more along the lines of
| what I had in mind. Although Tableau supports Python and R, it's not
| nearly as well integrated as what you've done with DataStation. In
| general, it's more geared towards Excel power user types, rather
| than programmers.

| eatonphil wrote:
| > If you search images of Tableau Prep, that's more along the lines
| of what I had in mind.
|
| Ah! I think this is a visualization of what does happen with
| DataStation panels too. Eventually I'd like to have better support
| for understanding the dependency graph like this but for now that's
| just been a nice idea to have sometime in the future.
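|
| For illustration only, the dependency-graph idea can be sketched in
| plain Python. This is not how DataStation evaluates panels (per the
| comment above, a referenced panel simply has to have been run
| already); the panel names and the evaluation_order helper are
| hypothetical, and the sketch assumes Python 3.9+ for graphlib.

```python
from graphlib import TopologicalSorter

# Hypothetical panels: each name maps to the set of panels it reads from.
panel_refs = {
    "csv_file": set(),
    "postgres_query": set(),
    "join_code": {"csv_file", "postgres_query"},
    "graph": {"join_code"},
}


def evaluation_order(refs):
    """Return one valid run order in which every panel comes after the panels it reads."""
    return list(TopologicalSorter(refs).static_order())


order = evaluation_order(panel_refs)
print(order)
```

| The display order of the panels plays no role here; only the
| references between them do, which is roughly what a Tableau
| Prep-style graph view would draw.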
|
| > Although Tableau supports Python and R, it's not nearly as well
| integrated as what you've done with DataStation. In general, it's
| more geared towards Excel power user types, rather than programmers.
|
| Yeah it was definitely my impression it was not geared toward
| programmers as much (though I know many programmers or data
| scientists use it).

| bamazizi wrote:
| The UX reminded me of [PipeDream](https://pipedream.com/)
|
| The industry around abstraction tools/UI on top of DBs is growing.
| We use Retool very heavily and it does get pricy.
|
| This is a very neat execution and has potential for a SaaS or cloud
| offering. Like "Bring your own DB" and build your own abstractions.

| [deleted]

| eatonphil wrote:
| > This is a very neat execution and has potential for a SaaS or
| cloud offering. Like "Bring your own DB" and build your own
| abstractions.
|
| Definitely my goal for the future is SaaS/Cloud where you can work
| on projects as a team and configure hosted dashboards, recurring
| exports and alerts out of panels you set up in a DataStation
| project.

| eatonphil wrote:
| Hey folks! I quit my job at Oracle almost a year ago now to build
| DataStation. It's an app I've wanted as an engineering manager for
| years. It's entirely open-source and while I've had a few awesome
| contributors I'm mostly the only person on it. It has been funded
| out of contract development and savings.
|
| DataStation helps you query a variety of data sources (conventional
| SQL like PostgreSQL and MySQL, non-SQL like Prometheus or
| Elasticsearch), files and HTTP APIs. It is not a SQL layer on top of
| these various APIs like FDW in Postgres or Apache Calcite.
|
| DataStation just tries to abstract away glue code. So in DataStation
| for Prometheus you query with PromQL. For Elasticsearch you query
| with Lucene. And for SQL databases you query with their SQL dialect.
| But you don't need to remember how to use the appropriate library
| for your language.
| You just need your own credentials.
|
| DataStation is made of panels (other apps might call them cells)
| that each produce a result. Panels can refer to other panels. These
| allow you to build workflows that cross the boundary of a particular
| datasource. For example you might have some data in a CSV a product
| manager gave you and the bulk of your data is in PostgreSQL. In
| DataStation you could pull in the CSV with a File panel and pull in
| the Postgres data with a Database panel. Then you can join both
| panel results in a Code panel using your favorite language like
| Python, Ruby, R, Node, Julia, etc. You can even script Code panels
| in a SQLite dialect with a bunch of rich addons (url parsing,
| best-effort date parsing, statistics aggregation, etc.):
| https://github.com/multiprocessio/go-sqlite3-stdlib.
|
| You can watch a simple introductory video:
| https://www.youtube.com/watch?v=q_jRBvbwIzU. Or if you want to see
| that cross-datasource interaction taken to an extreme, check out
| this video using Postgres metadata to filter log data in
| Elasticsearch to do historic request analysis on a subset of
| customers: https://www.youtube.com/watch?v=tIh99YVHoRE.
|
| DataStation is mainly a desktop app today where the end result is
| that you export graph SVGs or HTML tables or markdown tables or just
| a CSV file. All this data stays on your laptop so it's as easy to
| use in a corporate environment as any existing SQL IDE or Jupyter
| Notebook.
|
| In the last year it's reached 1.5k stars on GitHub, over 1000 unique
| users and currently on average about 40 fairly active users per
| month (defined as having opened the app more than a few times).
|
| Since it's only just now 12 months old it's been going through a lot
| of maturing during this time. If you've tried it before and it was
| buggy or too slow it's probably worth another try now if you're
| still interested.
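|
| As a rough sketch of what the CSV-plus-Postgres join described
| earlier in this post amounts to, here is the same idea in plain
| Python, outside DataStation. Everything in it is an assumption made
| for illustration: the CSV contents, the orders table, the column
| names and the helper functions are made up, and an in-memory SQLite
| database stands in for PostgreSQL so the example runs anywhere.
| Inside DataStation the two inputs would come from the File and
| Database panels instead of these helpers.

```python
import csv
import io
import sqlite3

# Hypothetical CSV from a product manager (stands in for a File panel).
CSV_TEXT = "customer_id,segment\n1,enterprise\n2,startup\n"


def load_csv_rows(text):
    """Parse CSV text into a list of dicts, one per row."""
    return list(csv.DictReader(io.StringIO(text)))


def load_db_rows():
    """Query an in-memory SQLite table (stand-in for the Postgres Database panel)."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE orders (customer_id INTEGER, total REAL)")
    conn.executemany(
        "INSERT INTO orders VALUES (?, ?)",
        [(1, 120.0), (1, 30.0), (3, 9.5)],
    )
    cur = conn.execute(
        "SELECT customer_id, SUM(total) AS total "
        "FROM orders GROUP BY customer_id ORDER BY customer_id"
    )
    columns = [d[0] for d in cur.description]
    rows = [dict(zip(columns, r)) for r in cur.fetchall()]
    conn.close()
    return rows


def join_rows(csv_rows, db_rows):
    """Inner-join the two row lists on customer_id."""
    totals = {row["customer_id"]: row["total"] for row in db_rows}
    joined = []
    for row in csv_rows:
        key = int(row["customer_id"])  # CSV values arrive as strings
        if key in totals:
            joined.append(
                {"customer_id": key, "segment": row["segment"], "total": totals[key]}
            )
    return joined


result = join_rows(load_csv_rows(CSV_TEXT), load_db_rows())
print(result)
```

| The connection setup, cursor handling and type conversion in this
| sketch are the kind of glue code the post says DataStation tries to
| abstract away; only the join itself would remain in a Code panel.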
|
| DataStation is primarily an Electron app but the code that evaluates
| panels is written in Go. The Go evaluation code forms the backbone
| of another app you may have seen around HN, dsq:
| https://github.com/multiprocessio/dsq, which is a limited version of
| DataStation as a CLI for querying files with SQL.
|
| In the future I'd like to see more people using it as a server app
| where my goal is to support read-only dashboards and recurring
| exports. That part is still work-in-progress.
|
| You can find a ton of tutorials on how to interact with supported
| databases on the DataStation website:
| https://datastation.multiprocess.io/docs/.
|
| Looking forward to your feedback!

| lopatin wrote:
| This is really cool. Maybe in the future you can make a paid version
| with a bunch of BI features.
|
| In your opinion, how does it compare to PyCharm (Enterprise version)
| when it's all blinged out with big data tools and integrations? I
| recently realized that PyCharm is my Data IDE and not just my Python
| editor. I only use limited features though, so hard for me to
| compare the extent of functionalities between the two.
|
| Edit: Well, PyCharm won't let you join two different data sources,
| so that's one big difference!

| eatonphil wrote:
| > Edit: Well, PyCharm won't let you join two different data sources,
| so that's one big difference!
|
| Right!
|
| On the other hand, any real code IDE will have high-quality
| autocomplete, jump-to-definition, all that code IDE stuff. In the
| future DataStation may be able to hook into tree-sitter or LSP but
| for now it's more like a textarea with syntax highlighting (although
| the SQL code panel autocomplete is relatively complete).
|
| Similarly, SQL IDEs have better exploration of your database.
| DataStation can't tell you about which tables or schemas exist yet
| (although I want it to in the future).
|
| DataStation competes more directly with _Python scripts_ than with
| SQL IDEs and code IDEs (although there is of course overlap).

| tyingq wrote:
| It does look a bit like parts of Tableau's desktop product.

| eatonphil wrote:
| I haven't used Tableau but I have had some people show up in Discord
| to ask about using DataStation as an alternative. So maybe it is
| similar, but I don't know.

| alashow wrote:
| Any reason for not having a web client?

| eatonphil wrote:
| You can run it as a web server! It's just not as commonly done right
| now since I haven't put much time into integration with cloud
| providers (stuff like CloudFormation templates I mean) and I don't
| yet have a public Docker image that is up to date.
|
| https://datastation.multiprocess.io/docs/0.11.0/DataStation_...

| moltar wrote:
| Looks amazing.
|
| Will try tomorrow. Athena alone is a superior offer in my mind. Even
| TablePlus, my favourite SQL client, doesn't do that :)
|
| If you can add dbt integration it will be a killer product!
|
| Thank you!

| eatonphil wrote:
| Thanks for the kind words!
|
| The only caveat I'll say is that it's definitely not as mature in
| general as SQL clients (stuff like table, column discovery and
| autocomplete does not exist yet). But it is pretty convenient to use
| DataStation if you like being able to easily switch into
| Python/JavaScript/whatever without needing to look up the docs for
| how to connect to and run a query against every database.
|
| > If you can add dbt integration it will be a killer product!
|
| I haven't used dbt and my impression was that it was a glue system
| for copying data from one place to another. But maybe that's not
| correct. Is it possible to query dbt data directly? Or how would you
| imagine it fitting into a DataStation flow? Thank you!
___________________________________________________________________
(page generated 2022-05-31 23:00 UTC)