[HN Gopher] Who needs MLflow when you have SQLite?
       ___________________________________________________________________
        
       Who needs MLflow when you have SQLite?
        
       Author : edublancas
       Score  : 202 points
       Date   : 2022-11-16 14:55 UTC (8 hours ago)
        
 (HTM) web link (ploomber.io)
 (TXT) w3m dump (ploomber.io)
        
       | praveenhm wrote:
       | What are the alternatives to MLflow other than SQLite? Something
       | like Kubeflow or Metaflow?
        
         | crucialfelix wrote:
         | Weights and Balances https://wandb.ai/site
        
           | pcerdam wrote:
           | Weights and *Biases :)
        
         | kuba_dmp wrote:
         | neptune.ai https://neptune.ai/
        
         | the83 wrote:
         | Comet: https://www.comet.com/site/
        
         | mmq wrote:
         | https://github.com/polyaxon
        
       | isoprophlex wrote:
       | Yeah, MLFlow is a shitshow. The docs seem designed to confuse,
       | the API makes Pandas look good and the internal data model is
       | badly designed and exposed, as the article says.
       | 
       | But, hordes of architects and managers who almost have a clue
       | have been conditioned to want and expect mlflow. And it's baked
       | into databricks too, so for most purposes you'll be stuck with
       | it.
       | 
       | Props to the author for daring to challenge the status quo.
        
         | idomi wrote:
         | How many data scientists that use Databricks for modeling do
         | you know?
        
           | isoprophlex wrote:
           | It's ubiquitous. I've consulted for a 100 person company that
           | built a data product on top of some IoT data. Everything was
           | in databricks, literally everything. (Not endorsing that,
           | just an observation)
           | 
           | Talking to a 2000+ person org now that is standardizing data
           | science across the org using... you guessed it
        
             | idomi wrote:
             | Pretty interesting. I think this is part of the trend of
             | releasing half-baked products: some of the stuff in there
             | is really cool, just enough to get you in, but it doesn't
             | scale and is usually complex to deploy/use.
        
           | dachryn wrote:
           | It's forced upon many of them that are in finance, banking,
           | insurance, ...
           | 
           | Mainly because those tend to run on Microsoft Azure, which
           | has no decent analytics offering, and are pushing Databricks
           | extremely hard. The CTO or whatever just pushes databricks.
           | On paper it checks all the boxes. Mlops, notebooks,
           | experiment management. It just does all of those things very
           | badly, but the exec doesn't care. They only care about the
           | Microsoft credits. It also lets them avoid Jupyter, which
           | keeps the compliance teams happy, because Microsoft
           | salespeople scared them away from open source.
        
             | akdor1154 wrote:
             | What would you go with instead for collaborative notebooks?
             | 
             | I ask because normally I tend pretty strongly towards the
             | "NO just let the DSes/analysts work how they want to",
             | which in this case would be running Jupyter locally.
             | However DBr's notebooks seem genuinely useful.
             | 
             | Is your issue "but I don't need Spark" or "i wanna code in
             | a python project, not a notebook?", or something else?
             | 
             | Imo if DBr cut their wedding to Spark and provided a
             | Python-only nb environment they'd have a killer offering on
             | their hands.
        
             | nerdponx wrote:
             | My team very nearly had this happen to us.
             | 
             | We pushed back on it very, very, very hard, and finally
             | convinced "IT" to not turn off our big Linux server running
             | JupyterHub. We actually ended up using Databricks (PySpark,
             | Delta Lake, hosted MLFlow) quite a bit for various
             | purposes, and were happy to have it available.
             | 
             | But the thought of forcing us into it as our _only_
             | computing platform was a spine-chilling nightmare.
             | Something that only a person who has no idea what data
             | analysts and data scientists actually do all day would
             | decide to do.
        
         | chaps wrote:
         | "the API makes Pandas look good"
         | 
         | It sparks joy in my heart whenever I see shade cast against
         | pandas.
        
           | lordgroff wrote:
           | Every time I open up pandas I jealously remember the
           | expressive beauty of R for these tasks. But because we're all
           | "serious" of course we must use Python for production lest we
           | not be serious.
        
             | laichzeit0 wrote:
             | To be fair, taking R to production is a goddamn nightmare.
        
               | chaxor wrote:
               | R is a trash of a language. It doesn't have any sense of
               | coherency to it at all. They keep trying to fix the
               | underlying problems by duct-taping paradigms on to it over
               | and over (S3, S4, R6, etc). There's never a clear sense
               | of the best way to do anything, but plenty of options to
               | do a thing in a very hacky 'script-kiddy' way. Looking
               | out at the community of different projects it becomes
               | clear that everyone is pretty lost as to what design
               | principles should be used for certain tasks, so every
               | repo has its own way of doing things (I know personal
               | style occurs in other languages, but commonalities are
               | much less recognizable in R projects). It's tragic that
               | such a large community uses it.
        
               | jmt_ wrote:
               | Trash language is a bit harsh. I'm not sure I would try
               | to put an R project into production or build a huge
               | project with it but, at the very least, R/R Studio was
               | the best scientific calculator I've ever used. Was
               | particularly great during college
        
               | lordgroff wrote:
               | Yep, this is a mark of someone that's never used R but
               | has heard a lot of incredibly ill informed criticism
               | around it.
               | 
               | One look of dplyr code over pandas would of course
               | disabuse anyone of the notion that R is trash and the
               | tragedy is Python will in the current state never have
               | anything like that. That's the advantage of the language
               | being influenced by Lisp vs not.
        
               | tomrod wrote:
               | I've heavily used R several times.
               | 
               | I agree that it is a trash language and that, outside
               | that many frontier academic ideas are available and some
               | plotting preferences are solidly prescriptive, it should
               | be thrown into the trash bin.
               | 
               | Python, Julia when it gets its druthers for TTFP, Octave,
               | Fortran, C, and eventually Rust. These are the tools I've
               | found in use over and over and over again across
               | business, government, and non-profits.
               | 
               | Everywhere R is used by the org I have seen major gaps in
               | capacity to deliver specifically because R doesn't scale
               | well.
        
               | nerdponx wrote:
               | Try to separate the language from its standard library.
               | Neither one is "trash".
               | 
               | I agree that the standard library is what you might call
               | "a chaotic disorganized mess".
        
               | tomrod wrote:
               | I'm not emotionally invested in tools so am happy to
               | identify the user experience and operational experience
               | as "trash."
               | 
               | "Trash", despite its connotations of lacking value, is
               | really just a chaotic disorganized mess of something made
               | by artifice with dubious reclaim/reuse/recycle value.
               | Being a subjective assessment, it is natural that one
               | person's trash is a treasure to another.
        
               | nerdponx wrote:
               | I take issue with your implication that I'm emotionally
               | invested in something when I shouldn't be. You are free
               | to dislike R and not use it, but to claim that it's
               | "trash" is to wrongly disavow its usefulness for the many
               | people that do find it useful, and to cast aspersions on
               | the judgement of all those people.
        
           | whatever1 wrote:
           | I have never seen a worse documented library. Initially I
           | thought that they were lazy, now I realize that it cannot be
           | documented because it is a total mess of a library held
           | together with tape.
           | 
           | Close second is the plotly library.
        
             | nerdponx wrote:
             | The Pandas documentation has improved quite a bit. Last I
             | checked, the only part of the reference docs with a big gap
             | was the description of "extension arrays" and accessors.
             | 
             | The _user guide_ material absolutely needs work, and the
             | examples in the reference docs tend to be a little
             | contrived. But I absolutely have seen worse-documented
             | libraries, such as Gunicorn and Pydantic.
        
               | claytonjy wrote:
               | I'm surprised to see Pydantic in here; I've used Pandas
               | and Pydantic both quite a lot, and have found the
               | Pydantic docs to be quite good! Also a much smaller
               | library with a saner API, and thus easier to document
               | well.
        
             | __mharrison__ wrote:
             | Genuinely curious what you have against the Pandas
             | documentation. It has some of the best docstrings I've
             | seen.
             | 
             | (I also wrote a Pandas book or two... So there's that)
        
               | chaps wrote:
               | Docstrings are one thing, but functionality discovery,
               | picking up from scratch, troubleshooting, etc are... not
               | fun, nor easy with the documentation. If you know it well
               | already and use it a lot it's easier to forgive its
               | documentation faults since you can wave off the problems
               | as "that's just learning something new".
               | 
               | But for a lot of people who use it infrequently its
               | documentation is a frustrating mess. Simple problems turn
               | into significant time sinks of trying to find which page
               | of the documentation to look at.
               | 
               | A lot of issues are made worse by shit-awful interop
               | between libraries that claim to fully support dataframes,
               | but often fail in non-obvious ways... meaning back to the
               | documentation mines.
               | 
               | I'd argue that because there's a market for a single
               | author to write two books about it is indicative of
               | documentation problems.
        
             | 333luke wrote:
             | What makes the documentation so bad in your opinion? I'm
             | not arguing but curious since I use pandas all day at my
             | job and can't think of any times the docs weren't clear to
             | me. (Plotly I have had some annoying times with!)
        
             | bobertlo wrote:
             | I think the R docs are the intended reference material for
             | pandas ;)
        
           | dekhn wrote:
           | What bothers me the most is the egregious data types for any
           | argument. If it's a string, do this. If it's a list, do that.
           | If it's a dictionary of lists, do this other thing.
           | 
           | No, I want you to force me to provide my data in the right
           | way and raise a noisy exception if I don't.
        
             | nerdponx wrote:
             | Series and DataFrame have "alternate constructors" for this
             | purpose, and the loc/iloc accessors give you a bit more
             | control.
             | 
             | I agree that the magic type auto-detection is a bit too
             | magical and sloppy, but you have to realize that data
             | analysts and scientists have historically been incredibly
             | sloppy programmers who _wanted_ as much magic as possible.
             | It's only in recent years that researchers have begun to
             | value some amount of discipline in their research code.
        
         | mostdataisnice wrote:
         | Where does the article say that?
        
           | isoprophlex wrote:
           | About exposing the data inside MLFlow
           | 
           | > I found the query feature extremely limiting (if my
           | experiments are stored in a SQL table, why not allow me to
           | query them with SQL).
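For readers wondering what "query them with SQL" buys you, here is a minimal sketch using Python's stdlib sqlite3. The `experiments` table and its columns are hypothetical, made up for illustration; they are not the article's actual schema, nor MLflow's.

```python
import sqlite3

# Hypothetical schema: one row per experiment run (illustrative only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE experiments ("
    "uuid TEXT PRIMARY KEY, model TEXT, n_estimators INTEGER, accuracy REAL)"
)
conn.executemany(
    "INSERT INTO experiments VALUES (?, ?, ?, ?)",
    [
        ("a1", "random_forest", 50, 0.91),
        ("b2", "random_forest", 100, 0.93),
        ("c3", "logistic_regression", None, 0.88),
    ],
)

# Plain SQL: best accuracy per model type, highest first.
best_runs = conn.execute(
    """
    SELECT model, MAX(accuracy) AS best_accuracy
    FROM experiments
    GROUP BY model
    ORDER BY best_accuracy DESC
    """
).fetchall()

for model, best_accuracy in best_runs:
    print(model, best_accuracy)
```

Filtering, grouping and joining against a baseline run all come for free once the runs live in a table you can reach directly.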
        
       | guangyeu wrote:
       | As noted in an earlier comment, I think there is a false
       | equivalence between end-to-end MLOps platforms like MLflow and
       | tools for experiment tracking. The project looks like a solid
       | tracking solution for individual data scientists, but it is not
       | designed for collaboration among teams or organizations.
       | 
       | > There were a few things I didn't like: it seemed too much to
       | have to start a web server to look at my experiments, and I found
       | the query feature extremely limiting (if my experiments are
       | stored in a SQL table, why not allow me to query them with SQL).
       | 
       | While a relational database (like sqlite) can store
       | hyperparameters and metrics, it cannot scale for the many aspects
       | of experiment tracking for a team/organization, from visual
       | inspection of model performance results to sharing models to
       | lineage tracking from experimentation to production. As noted in
       | the article, you need a GUI on top of a SQL database to make
       | meaningful model experimentation. The MLflow web service allows
       | you to scale across your teams/organizations with interactive
       | visualizations, built-in search & ranking, shareable snapshots,
       | etc. You can run it across a variety of production-grade
       | relational dBs so users can query the data directly through the
       | SQL database or through a UI that makes it easier to search for
       | those not interested in using SQL.
       | 
       | > I also found comparing the experiments limited. I rarely have a
       | project where a single (or a couple of) metric(s) is enough to
       | evaluate a model. It's mostly a combination of metrics and
       | evaluation plots that I need to look at to assess a model.
       | Furthermore, the numbers/plots themselves have no value in
       | isolation; I need to benchmark them against a base model, and
       | doing model comparisons at this level was pretty slow from the
       | GUI.
       | 
       | The MLflow UI allows you to compare thousands of models from the
       | same page in tabular or graphical format. It renders the
       | performance-related artifacts associated with a model, including
       | feature importance graphs, ROC & precision-recall curves, and any
       | additional information that can be expressed in image, CSV, HTML,
       | or PDF format.
       | 
       | > If you look at the script's source code, you'll see that there
       | are no extra imports or calls to log the experiments, it's a
       | vanilla Python script.
       | 
       | MLflow already provides low-code solutions for MLOps, including
       | autologging. After running a single line of code -
       | mlflow.autolog() - every model you train across the most
       | prominent ML frameworks, including but not limited to scikit-
       | learn, XGBoost, TensorFlow & Keras, PySpark, LightGBM, and
       | statsmodels is automatically tracked with MLflow, including all
       | relevant hyperparameters, performance metrics, model files,
       | software dependencies, etc. All of this information is made
       | immediately available in the MLflow UI.
       | 
       | Addendum: As noted, there is a false equivalence between an end-
       | to-end MLOps lifecycle platform like MLflow and tools for
       | experiment tracking. To succeed with end-to-end MLOps,
       | teams/organizations also need projects to package code for
       | reproducibility on any platform across many different package
       | versions, deploy models in multiple environments, and a registry
       | to store and manage these models - all of which is provided by
       | MLflow.
       | 
       | It is battle-tested with hundreds of developers and thousands of
       | organizations using widely-adopted open source standards. I
       | encourage you to chime in on the MLflow GitHub on any issues and
       | PRs, too!
        
         | czumar wrote:
         | +1. I'd also like to note that it's very easy to get started
         | with MLflow; our quickstart walks you through the process of
         | installing the library, logging runs, and viewing the UI:
         | https://mlflow.org/docs/latest/quickstart.html.
         | 
         | We'd love to work with the author to make MLflow Tracking an
         | even better experiment tracking tool and immediately benefit
         | thousands of organizations and users on the platform. MLflow is
         | the largest open source MLOps platform with over 500 external
         | contributors actively developing the project and a maintainer
         | group dedicated to making sure your contributions &
         | improvements are merged quickly.
        
       | bfung wrote:
       | How about a side-by-side comparison?
       | 
       | Far too often, these "X is bad, use my homebrew Y instead"
       | articles don't show a comparison to X, which doesn't help
       | illustrate 'why Y instead'.
       | 
       | You know... <cheeky>For science.</cheeky>
        
       | benjaminwootton wrote:
       | The elephant in the room with data is that we don't need a lot of
       | the fancy and powerful technology. SQL against a relational
       | database gets us extraordinarily far. Add some Python scripts
       | where we need some imperative logic and glue code, and a sprinkle
       | of CI/CD if we really want to professionalise the work of data
       | scientists. I think this covers the vast majority of situations.
       | 
       | Despite being around it for some time, I'm not sure big data or
       | machine learning needed to be a thing for the vast majority of
       | businesses.
        
         | bob1029 wrote:
         | > SQL against a relational database gets us extraordinarily
         | far.
         | 
         | I think it gets us all the way once you consider the ability to
         | expose domain-specific functions to SQL that are serviced by
         | your application code.
         | 
         | I've always been of the mindset that you can do anything with
         | SQL if you are clever enough.
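As a rough illustration of exposing application code to SQL, Python's stdlib sqlite3 lets you register a Python function that queries can then call. The `f1_score` helper and the `runs` table below are hypothetical names chosen for the example.

```python
import sqlite3


def f1_score(precision, recall):
    """Application-side function that SQL queries will be able to call."""
    if precision is None or recall is None or precision + recall == 0:
        return None
    return 2 * precision * recall / (precision + recall)


conn = sqlite3.connect(":memory:")
# Register the Python function under the SQL name "f1_score" with 2 arguments.
conn.create_function("f1_score", 2, f1_score)

conn.execute("CREATE TABLE runs (name TEXT, prec REAL, rec REAL)")
conn.executemany(
    "INSERT INTO runs VALUES (?, ?, ?)",
    [("baseline", 0.5, 0.5), ("tuned", 0.8, 0.6)],
)

# The domain-specific function is now usable like any built-in SQL function.
rows = conn.execute(
    "SELECT name, f1_score(prec, rec) AS f1 FROM runs ORDER BY f1 DESC"
).fetchall()

for name, f1 in rows:
    print(name, round(f1, 3))
```

The function runs inside your process, so it can use whatever libraries the application already has.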
        
         | citizenpaul wrote:
         | Unless your income depends on carrying out the exact demands of
         | some money guy whose most common phrase while using a computer
         | is "it won't let me", and they want "big data".
         | 
         | Then you just suck it up and build one of the totally
         | unnecessary big data systems that have been excreted all over
         | the business world these days. I don't think the problem is
         | that devs are over-engineering.
         | 
         | I wonder what it's called; it makes me think of the tragedy of
         | the commons, but that's probably not quite right.
        
           | morelisp wrote:
           | Maybe like 20 years ago you were right but today there's a
           | generation that's _been working for 10 years_ on systems
           | built like that. They don't know any better, and in most
           | cases nobody is around to teach them otherwise.
        
           | tomrod wrote:
           | Hierarchies and bureaucracies, by Jean Tirole. I know because
           | this was the phenomenon I wanted to study in grad school only
           | to find he scooped me (on this and several other items) by
           | several decades.
           | 
           | Edit: Tirole, Jean. "Hierarchies and bureaucracies: On the
           | role of collusion in organizations." JL Econ. & Org. 2
           | (1986): 181.
        
         | chasil wrote:
         | The article mentions this workflow:
         | 
         | "Let's now execute the script multiple times, one per set of
         | parameters, and store the results in the experiments.db SQLite
         | database... After finishing executing the experiments, we can
         | initialize our database (experiments.db) and explore the
         | results."
         | 
         | Be warned that issuing queries while DML is in process can
         | result in SQLITE_BUSY, and the default behavior is to abort the
         | transaction, resulting in lost data.
         | 
         | Setting WAL mode for greater concurrency between a writer and
         | reader(s) can lead to corruption if the IPC structures are not
         | visible:
         | 
         | "To accelerate searching the WAL, SQLite creates a WAL index in
         | shared memory. This improves the performance of read
         | transactions, but the use of shared memory requires that all
         | readers must be on the same machine [and OS instance]."
         | 
         | If the database will not be entirely left alone during DML,
         | then the busy handler must be addressed.
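A minimal sketch of the busy-handler and WAL points, assuming Python's stdlib sqlite3 and a database on a local disk. The `results` table is hypothetical. The `timeout` argument to `sqlite3.connect` is how the Python module configures how long to wait on a lock before raising "database is locked".

```python
import os
import sqlite3
import tempfile

db_path = os.path.join(tempfile.mkdtemp(), "experiments.db")

# Wait up to 30 seconds for a lock instead of giving up quickly.
writer = sqlite3.connect(db_path, timeout=30)

# WAL lets readers keep reading while a write is in progress.
# Only sensible when every process is on the same machine (local disk).
journal_mode = writer.execute("PRAGMA journal_mode=WAL").fetchone()[0]
print("journal mode:", journal_mode)

writer.execute("CREATE TABLE IF NOT EXISTS results (run TEXT, metric REAL)")

# This INSERT opens a write transaction that stays open until commit().
writer.execute("INSERT INTO results VALUES (?, ?)", ("run-1", 0.9))

# A second connection, standing in for someone exploring results mid-run.
reader = sqlite3.connect(db_path, timeout=30)
visible_before_commit = reader.execute(
    "SELECT COUNT(*) FROM results"
).fetchone()[0]

writer.commit()
visible_after_commit = reader.execute(
    "SELECT COUNT(*) FROM results"
).fetchone()[0]

print(visible_before_commit, visible_after_commit)

reader.close()
writer.close()
```

The reader only sees committed rows, so the experiment script should commit after each run if you want to explore results while it is still going.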
        
           | habibur wrote:
           | None of these are a problem for the workload discussed.
           | 
           | When I am working with sqlite I am more likely accessing it
           | from a single machine.
           | 
           | And in this case of ML, most likely from 1 process and by
           | running multiple times in serial.
        
         | isoprophlex wrote:
         | Yeah and even if you do need to do proper big-dataset-ML... a
         | SQL box and maybe something like a blob storage for large
         | artifacts (S3, Azure storage account, whatever) is all you need
         | as well. But if your boss bought The MLOps Experience, you
         | gotta do what the cool kids are doing!
        
       | navbaker wrote:
       | I work in an environment where there are multiple tech teams
       | developing models for multiple use cases on VMs and GPU clusters
       | spread across our corporate intranet. Once you move beyond a
       | single dev working on a model on their laptop, you absolutely
       | need something that can handle not just metrics tracking, but
       | making the model binaries available and providing a means to
       | ensure reproducibility by the rest of the team. That's what
       | MLFlow is providing for us. The API is a mess, but at least we
       | didn't have to code up some bespoke in-house framework, we just
       | put some engineers on task to play around with it for a few hours
       | and figure out the nuances of basic interactions and deployed it.
        
         | edublancas wrote:
         | Agree. Once you have a team, you need to have a service they
         | can all interact with. This release is a first step, we want to
         | get the user experience right for an individual and then think
         | of how to expand that to teams. Ultimately, the two things
         | we're the most excited about are 1) you don't need to add any
         | extra code (and it works with all libraries, not a pre-defined
         | set) 2) SQL as the query language
        
       | cdong wrote:
       | I don't get why a lot of people are calling mlflow a shitshow
       | when it has done so much to get data scientists away from
       | recording experiments via CSV. I can log models and parameters
       | and use the UI to track different runs. After comparisons, I can
       | use the registry to register models at different stages. If you
       | have other model diagnostic charts you can log those as artifacts
       | as well. I think mlflow v2 has auto logging included so why all
       | the fuss?
        
         | nerdponx wrote:
         | People tend to forget that first movers rarely tend to also
         | have the best design. MLFlow (and DVC) brought us out of the
         | dark ages. Now we can build better tools, with the benefit of
         | hindsight.
         | 
         | Claiming that something is "broken" or "trash" when you mean "I
         | don't like it" is a good way to make yourself feel big and
         | smart, but it's not actually constructive.
        
         | cameronfraser wrote:
         | There are those who create and those who complain on the
         | internet about tools they've used one time
        
           | isoprophlex wrote:
           | Okay that's coming across as a pretty snide remark aimed at
           | me, I'll bite.
           | 
           | Yes, I can understand why you comment that. I don't like
           | blind slagging of free software either.
           | 
           | But there are ALSO those whose day job it is, and has been
           | for the last 2 years, to use a badly designed overcomplex
           | horrorshow of a tool that could be replaced easily by
           | something better ... if it wasn't for the lock-in effects and
           | strong marketing.
           | 
           | So I'm ventilating my frustration and at the same time
           | expressing my gratitude to the person who made something
           | fresh, that shows us things can be better.
           | 
           | I can't build the replacement to MLFlow myself, but I can
           | cheer people on who do, and let them know their efforts are
           | sorely needed.
        
       | guangyeu wrote:
       | Could you provide context on why SQLite would replace MLflow?
       | From the standpoint of model tracking (record and query
       | experiments), projects (package code for reproducibility on any
       | platform), deploy models in multiple environments, registry for
       | storing and managing models, and now recipes (to simplify model
       | creation and deployment), MLflow helps with the MLOps life cycle.
        
         | [deleted]
        
         | edublancas wrote:
         | Fair point. MLflow has a lot of features to cover the end-to-
         | end dev cycle. This SQLite tracker only covers the experiment
         | tracking part.
         | 
         | We have another project to cover the orchestration/pipelines
         | aspect: https://github.com/ploomber/ploomber and we have plans
         | to work on the rest of features. For now, we're focusing on
         | those two.
        
       | mostdataisnice wrote:
       | SQLite is literally a backend for MLflow, so the argument being
       | made really is that you should just use SQL when you can, which
       | is kind of adjacent to any criticisms of MLflow
        
         | edublancas wrote:
         | Is querying the underlying SQL database officially supported in
         | MLflow? Last time I used it, it wasn't documented. I took a
         | look at the database and it wasn't end-user friendly.
        
           | mostdataisnice wrote:
           | As someone replied above, it's because SQL is just 1 backend
           | and it's weird to expose an API that only works on 1 backend.
           | Once you have many devs working together, you need a remote
           | server. If you have a remote abstracted backend, it needs to
           | have a unified API surface so the same client can talk to any
           | backend. You might argue "This interface _should_ be SQL",
           | and to that I would say there are many file stores (like your
           | local file system) that are not easy to control with SQL.
        
       | afrnz wrote:
       | You can also use mlflow locally with SQLite
       | (https://www.mlflow.org/docs/latest/tracking.html#scenario-2-...).
       | Even though I haven't tried querying the db directly ...
        
       | frgtpsswrdlame wrote:
       | Wow this looks perfect for what I need right now - just a bit of
       | lightweight tracking.
        
         | nerdponx wrote:
         | DVC also fills the "lightweight tracking" niche, although it
         | relies on automatically creating Git branches as its technique
         | for tracking experiments. I personally find that distasteful,
         | so I don't use it specifically for experiment tracking, but the
         | feature is there.
         | 
         | The company behind DVC is also building a handful of other
         | related tools, e.g.
         | https://iterative.ai/blog/iterative-studio-model-registry
        
           | wxnx wrote:
           | Hm, in what way do you find that DVC requires creating new
           | branches for experiment tracking?
           | 
           | I find the following workflow works well, for example:
           | 
           | 1. Define steps depending on a `config.yml`.
           | 
           | 2. Run an initial experiment (with an initial config) and
           | commit the results.
           | 
           | 3. Update config (preserving the alternate config and using
           | symlinks from `config.yml` to various new configs if
           | necessary), re-run, and commit.
           | 
           | 4. Results are then all preserved in your git history.
        
           | shcheklein wrote:
           | It doesn't require creating a branch when you iterate; it
           | requires creating a branch or commit only if you want to share
           | it with the team and see it on GitHub or in Studio. Even those
           | lightweight iterations
           | (https://dvc.org/doc/command-reference/exp/run) could be shared
           | via a Git server as well; they just won't be visible in the
           | GH/Studio UI at the moment.
           | 
           | Happy to provide more details on how it's done. It's actually
           | quite an interesting technical thing: a custom Git namespace
           | https://iterative.ai/blog/experiment-refs
        
         | edublancas wrote:
         | If you need help, you can open an issue on GitHub
         | (https://github.com/ploomber/ploomber-engine) or join our
         | Slack! (https://ploomber.io/community/)
        
       | geminicoolaf wrote:
       | What about BentoML?
        
       | LeanderK wrote:
       | I think MLflow is a good idea (very) badly executed. I would like
       | to have a library that combines:
       | 
       | - simple logging of (simple) metrics during and after training
       | 
       | - simple logging of all arguments the model was created with
       | 
       | - simple logging of a textual representation of the model
       | 
       | - simple logging of general architecture details (number of
       | parameters, regularisation hyperparameters, learning rate, number
       | of epochs etc.)
       | 
       | - and of course checkpoints
       | 
       | - simple archiving of the model (and relevant data)
       | 
       | and all that without much (coding) overhead and only using a
       | shared filesystem (!) And with an easy notebook integration.
       | MLflow just has way too many unnecessary features and is
       | unreliable and complicated. When it doesn't work it's so
       | frustrating, it's also quite often super slow. But I always end
       | up creating something like MLflow when working on an architecture
       | for a long time.
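A rough sketch of what such a filesystem-only tracker might look like, using nothing but the Python standard library. `RunLogger` and `load_runs` are hypothetical names invented here, not an existing library, and checkpoints and model archiving are left out.

```python
import json
import tempfile
import time
import uuid
from pathlib import Path


class RunLogger:
    """Hypothetical minimal tracker: one directory per run, plain JSON files."""

    def __init__(self, root, params):
        stamp = time.strftime("%Y%m%d-%H%M%S")
        self.run_dir = Path(root) / f"{stamp}-{uuid.uuid4().hex[:8]}"
        self.run_dir.mkdir(parents=True)
        # Arguments / hyperparameters the model was created with.
        (self.run_dir / "params.json").write_text(json.dumps(params, indent=2))
        self.metrics_path = self.run_dir / "metrics.jsonl"

    def log_metrics(self, step, **metrics):
        # Append one JSON object per line so partial runs stay readable.
        with self.metrics_path.open("a") as fh:
            fh.write(json.dumps({"step": step, **metrics}) + "\n")

    def save_text(self, name, text):
        # E.g. a textual representation of the model.
        (self.run_dir / name).write_text(text)


def load_runs(root):
    """Read every run directory back into plain dicts (notebook friendly)."""
    runs = []
    for run_dir in sorted(Path(root).iterdir()):
        if not run_dir.is_dir():
            continue
        params = json.loads((run_dir / "params.json").read_text())
        metrics_file = run_dir / "metrics.jsonl"
        metrics = []
        if metrics_file.exists():
            metrics = [
                json.loads(line)
                for line in metrics_file.read_text().splitlines()
            ]
        runs.append({"run": run_dir.name, "params": params, "metrics": metrics})
    return runs


root = tempfile.mkdtemp()  # stand-in for a directory on the shared filesystem
logger = RunLogger(root, {"lr": 0.01, "epochs": 2})
for epoch in range(2):
    logger.log_metrics(epoch, loss=1.0 / (epoch + 1))
logger.save_text("model.txt", "Linear(in=10, out=1)")

runs = load_runs(root)
print(runs[0]["params"], runs[0]["metrics"])
```

Since every run writes only to its own directory, concurrent jobs on a cluster don't contend for a shared file.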
       | 
       | EDIT: having written this...I feel like trying to write my own
       | simple library after finishing the paper. A few ideas have
       | already accumulated in my notes that would make my life easier.
       | 
       | EDIT2: I actually remember trying to use SQLite to manage my
       | models! But the server I worked on was locked down and going
       | through the process to get somebody to install SQLite for me was
       | just not worth it. It also was not available on the cluster for
       | big experiments, where it would be even more work to get it, so I
       | gave up on the idea of trying SQLite.
        
         | Fiahil wrote:
         | > I think MLFlow is a good idea (very) badly executed.
         | 
         | Oh yes, I'm glad to see others with a similar opinion.
        
         | pletnes wrote:
         | Sqlite is in python's stdlib, so how can this be an issue? Was
         | there no local filesystem whatsoever?
        
           | tekknolagi wrote:
           | sqlite bindings are in the stdlib but not the library itself.
        
             | imachine1980_ wrote:
             | I'm asking from ignorance: in this context, what difference
             | does it make in practice not to have the library itself?
        
               | funklute wrote:
               | Using the bindings is only possible if the library itself
               | is already installed (since the bindings directly make
               | use of the library, under the hood).
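A quick way to check this on a given machine, assuming a CPython install: the stdlib `sqlite3` module reports which version of the underlying SQLite C library it is linked against. If the import itself fails, that Python was most likely built without the library available.

```python
import sqlite3

# Version of the SQLite C library the bindings are linked against.
library_version = sqlite3.sqlite_version
print("SQLite library version:", library_version)

# The same information, asked of the library itself through SQL.
conn = sqlite3.connect(":memory:")
queried_version = conn.execute("SELECT sqlite_version()").fetchone()[0]
print("Reported by SQL:", queried_version)
conn.close()
```

Note that this does not need the separate `sqlite3` command-line shell to be installed; the module only needs the C library it links against.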
        
         | edublancas wrote:
         | I'm happy to collaborate with you, let's build the best
         | experiment tracker out there! Feel free to ping me at
         | eduardo@ploomber.io
        
         | smehta73 wrote:
         | Have you used Comet? It basically does everything you are asking
         | for and is a lot more user-friendly than MLFlow.
        
           | nerdponx wrote:
           | Isn't Comet a proprietary SaaS? I like MLFlow because I can
           | run it on my own computer if I want to.
        
             | tomrod wrote:
             | Check out flyte and union.ml. No personal affiliation, just
             | good projects in the vein of
             | airflow/prefect/mlflow/kubeflow
        
               | YetAnotherNick wrote:
               | I really like guild.ai. The best thing is that its
               | developers assumed people are lazy, so it automatically
               | turns global variables into flags and tracks them.
        
       ___________________________________________________________________
       (page generated 2022-11-16 23:00 UTC)