[HN Gopher] Designing better file organization around tags, not ...
       ___________________________________________________________________
        
       Designing better file organization around tags, not hierarchies
        
       Author : homarp
       Score  : 122 points
       Date   : 2021-11-07 18:01 UTC (4 hours ago)
        
 (HTM) web link (www.nayuki.io)
 (TXT) w3m dump (www.nayuki.io)
        
       | techsin101 wrote:
       | it's easier for me to put a folder "summer photos 2012" under
       | photos than to tag everything. In some cases tags would just
       | bring garbage, e.g. a "script" tag -> get ready for a tsunami of
       | script.js files that came along with HTML pages I downloaded.
       | I'm saying I can't think of a scenario where tags would be
       | mostly helpful, though I don't know whether there are other
       | scenarios where tags are better.
        
       | civilized wrote:
       | Tags are superfluous if you have a good search engine. Hierarchy
       | isn't.
        
         | guerrilla wrote:
         | Yeah, tagging really seems like the job of a file manager (and
         | indexer) rather than a file system. That seems like an easy way
         | to get everything we want without rewriting billions of lines
         | of code that deal with hierarchies.
        
       | encyclic wrote:
       | Wasn't this the intent behind Windows Vista - a tag-based DB-as-
       | filesystem with hierarchical paths just one "lens" through which
       | to view the DB? I use Google Drive this way, largely through
       | search rather than directory-based organization, though I do also
       | employ that for often-used collections.
        
         | ximeng wrote:
         | https://en.wikipedia.org/wiki/WinFS WinFS - was cancelled
        
         | sigg3 wrote:
         | It's the entire point of MS SharePoint afaict: create a tag
         | based FS on top of some SQL database.
        
       | fungiblecog wrote:
       | tags are great when you know what you're looking for but are
       | terrible for browsing a new dataset.
       | 
       | this is especially true when you're the newbie eg on a project
       | and you didn't have any input into the tag structure.
        
         | sgc wrote:
         | There just needs to be a default view that shows you the most
         | important tags to start from - either by most used files, or
         | largest number of files, or most active recent changes, or as
         | managed by someone. The nice thing about tags is all of those
         | could be top level tabs and you are sailing.
        
         | japanuspus wrote:
         | This. Hierarchy allows a simple manual traversal, leading to
         | better discoverability.
        
           | sgc wrote:
           | Any tag based system should have a breadcrumb function that
           | allows for manual traversal.
        
       | Tarsul wrote:
       | ok, maybe this is the right thread for this question: Does
       | anyone have a good structure for their own mp3s? I'd imagine
       | tagging all files and using those tags for playlists would be a
       | good way to do it. Does anyone do something like this or similar
       | and can give pointers?
        
         | sparkie wrote:
         | Use MusicBrainz Picard.
         | 
         | The MusicBrainz database is probably the best there is, and it
         | does a good job of automatically tagging your music. Once
         | tagged, every file will have a unique musicbrainz_trackid
         | (UUID) in its metadata, which can be used to recover/update the
         | metadata associated with the track automatically from the
         | database, which is constantly updated (and to which you can
         | contribute if it is missing metadata for your tracks).
         | 
         | You can configure Picard to arrange files and rename them
         | however you want. It has some simple scripting functionality so
         | you can name things conditionally based on the presence or
         | absence of metadata, etc. [https://picard-
         | docs.musicbrainz.org/en/tutorials/naming_scri...]
         | 
         | If you are concerned about privacy, you can run your own
         | musicbrainz instance in a VM and download a copy of the entire
         | database.
         | 
          | Picard is extensible with Python. There are some existing
          | plugins for generating playlist files.
        
           | ghostly_s wrote:
           | A complementary alternative I'd suggest is beets[1], a front-
           | end agnostic CLI tagging utility that also matches your files
           | against the MusicBrainz database and can both correct the ID3
           | tags and maintain a directory hierarchy based on those tags.
           | 
           | The biggest shortcoming of the MusicBrainz database I've
           | found so far, however, is genre tags. Most releases seem to
           | have only one or a handful of genres listed with no
           | consistent genre hierarchy convention, but I've been
            | experimenting with an extension that pulls genre tags from
           | discogs.
           | 
           | 1. https://beets.readthedocs.io/
        
         | jazzyjackson wrote:
         | I guess I'll need to blog about it because I can't find any
          | information online, but still nothing has beaten Sony's
         | SonicStage, which is by most accounts very annoying proprietary
         | software for working with minidisc players that want ATRAC
         | encoding, but also included an excellent tag navigator that
         | worked like so:
         | 
         | While playing any song in your library, you could display a
         | graph view which showed the song center screen, and radially
         | arranged spokes enumerating what the song was tagged with,
         | "rock", "instrumental", "upbeat" etc, and when you clicked that
         | tag, it would become center-screen and all the songs with that
         | tag would be radially arranged around that tag. So you could
         | navigate your library by kind of surfing the tag-graph and hit
         | play/add to playlist as you go.
         | 
         | Last I checked there were .exe's compatible with Windows 10
         | available so I'll have to download it again and try it out.
        
         | crooked-v wrote:
         | Run an old version of iTunes in a VM?
         | 
         | The column browser is still one of the generally best media
         | browsing interfaces I've ever dealt with.
        
       | throwaway984393 wrote:
       | I think the author is just a little bit behind the times in terms
       | of organizing large corpuses of inter-related data. Tagging as a
       | general idea is too fast-and-loose without a system/structure to
       | organize the tags. Semantic Web stacks are one way to improve on
       | this using taxonomies and ontologies, query languages and data
       | specs; they almost mention it in the alternative systems
       | ("tuples") but don't dig into just how hard it is to manage data
       | using unstructured references (or even structured ones!).
       | 
       | Simple hierarchies are ..... simple. Tags are simple too, but
       | they quickly devolve into new complexities as people try to
       | figure out how to apply them, find them, organize them.
       | Hierarchies aren't typically as difficult to manage because they
       | box you into re-creating the same mental model for
       | organization, just with different classifiers for each level of
       | the hierarchy. They're less flexible, but they're easier to grok,
       | maintain, and use.
       | 
       | My major concern is that there isn't really a need to "fix"
       | hierarchies, it's just a nagging problem that someone doesn't
       | want to deal with, so their solution is to make something more
       | complicated.... and more complicated might not make it better. It
       | should also be feasible to design applications to organize the
       | files without having to rewrite filesystems.
        
         | pimlottc wrote:
         | You seem to know a lot about this; what is considered the
         | "correct" way to organize data like this these days?
        
           | throwaway984393 wrote:
           | I couldn't claim to know what is "correct" and what isn't!
           | I've just worked on projects to organize large collections of
           | interrelated datasets (for example, to update correlations
           | between concepts, to make search engines more effective, to
           | identify related or dependent item relationships, etc) and
           | for our project we used a Semantic Web stack. Browsing GitHub
           | for "knowledge graph" or "knowledge management" seems to pop
           | up some cool looking projects, but I think everyone is still
           | trying to figure out what works for a particular use case
           | rather than generally.
           | 
           | I hope a real data scientist can reply with whatever the
            | latest and greatest solutions are. Semantic Web tech for
            | knowledge graphs is continuing to evolve, but it's also kind
            | of old, and it's still mostly used for research projects.
           | Part of that is probably because the terminology is unusual,
           | and application leads you down a long rabbit hole of new and
           | confusing concepts. So that's why I'm thinking that just
           | sticking to a boring inefficient hierarchy might not be so
           | bad...
        
       | pininja wrote:
       | I like how this article is laid out: it first defines the
       | existing systems used today (while they're all I know, I
       | haven't spent the time defining them), and then describes a
       | number of inspirational examples of how it could be different.
       | 
       | Desktop file systems seem impossibly hard to change at this
       | point, but cloud storage and mobile file systems are still so new
       | and not amazing in my opinion - there's still hope for a better
       | experience.
        
       | ghoward wrote:
       | Good ideas, but there are a few things wrong with this.
       | 
       | First, we forget that filesystems are not hierarchies, they are
       | graphs, whether DAGs or not. [1]
       | 
       | Second, and this follows from the first, _both_ tags and
       | hierarchy are possible with filesystems as they currently are.
       | 
       | Here's how you do it:
       | 
       | 1. Organize your files in the hierarchy you want them in.
       | 
       | 2. Create a directory in a well-known place called `tags/` or
       | whatever you want.
       | 
       | 3. For every tag `<name>`, create a directory `tags/<name>/`
       | 
       | 4. Hard-link each file you want to tag under every tag directory
       | that applies.
       | 
       | 5. For extra credit, create a soft link pointing to the same
       | file, but with a well-known name.
       | 
       | This allows you to use the standard filesystem tools to get all
       | files under a specific tag. For example,
       | 
       |     find tags/<name> -type f
       | 
       | (The find on my machine does not follow symbolic links and does
       | not print them if you use the above command.)
       | 
       | If you want to find where the file is _actually_ under the
       | hierarchy, use
       | 
       |     find -L tags/ -xtype l
       | 
       | Having both hard and soft links means that 1) you cannot lose the
       | actual file if it's moved in the hierarchy (the hard link will
       | always refer to it), and 2) you can either find the file in the
       | hierarchy from the tag or you know that the file has been moved
       | in the hierarchy.
       | 
       | Of course, I'm no filesystem expert, so I probably got a few
       | things wrong. I welcome smarter people to tell me how I am wrong.
       | 
       | [1]:
       | https://lobste.rs/s/ydno8w/tree_structure_file_systems#c_njg...
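       | 
       | The steps above can be sketched in Python. This is a minimal
       | sketch, not a tested tool: the helper names `tag_file` and
       | `files_with_tag` are made up for illustration, it skips the
       | optional soft link from step 5, and hard links only work when
       | the tag directory is on the same filesystem as the file.

```python
import os
import tempfile

def tag_file(path, tag, tags_root):
    """Hard-link `path` into tags_root/<tag>/ and return the new link's path."""
    tag_dir = os.path.join(tags_root, tag)
    os.makedirs(tag_dir, exist_ok=True)
    link = os.path.join(tag_dir, os.path.basename(path))
    if not os.path.exists(link):
        os.link(path, link)  # hard link: a second name for the same inode
    return link

def files_with_tag(tag, tags_root):
    """List regular files directly inside one tag directory (non-recursive)."""
    tag_dir = os.path.join(tags_root, tag)
    return sorted(
        e.path for e in os.scandir(tag_dir) if e.is_file(follow_symlinks=False)
    )

# Demo in a throwaway directory (all paths here are hypothetical).
root = tempfile.mkdtemp()
docs = os.path.join(root, "docs")
os.makedirs(docs)
original = os.path.join(docs, "taxes.txt")
with open(original, "w") as f:
    f.write("2020 return\n")

tags_root = os.path.join(root, "tags")
link = tag_file(original, "finance", tags_root)

# Moving the original inside the hierarchy does not break the hard link.
moved = os.path.join(root, "taxes-2020.txt")
os.rename(original, moved)
same_file = os.path.samefile(link, moved)
```

       | After the rename, the entry under `tags/finance/` still refers
       | to the same file, which is the property the hard link is meant
       | to provide.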
        
         | jeddy3 wrote:
         | > 2. Create a directory in a well-known place called `tags/` or
         | whatever you want.
         | 
         | > 3. For every tag `<name>`, create a directory `tags/<name>/`
         | 
         | > 4. Hard-link all files you want to tag under each tag
         | directory that apply.
         | 
         | Does this only give you one level of tags? (i.e you can't
         | combine tags when exploring)
        
           | ghoward wrote:
           | It does, unfortunately. But you probably could implement
           | finding something with two tags with some command-line fu.
           | 
            | My first crack at it is this:
            | 
            |     find -L tags/tag1 tags/tag2 -xtype l | sort | uniq -d
           | 
           | I don't know if that would work (on mobile; can't test), but
           | from reading the man pages, it seems like it would do the
           | trick.
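            | 
            | One way to combine two tags, sketched in Python under the
            | assumption that tags are hard links as described upthread.
            | As far as I can tell, the same file shows up under a
            | different path in each tag directory, so comparing path
            | strings would not line up; this sketch compares device and
            | inode numbers instead. The function names are hypothetical.

```python
import os
import tempfile

def inode_index(tag_dir):
    """Map (device, inode) -> path for every regular file in one tag directory."""
    index = {}
    for entry in os.scandir(tag_dir):
        if entry.is_file():
            st = os.stat(entry.path)
            index[(st.st_dev, st.st_ino)] = entry.path
    return index

def files_with_all_tags(tags_root, tags):
    """Return paths (under the first tag) of files hard-linked under every tag."""
    indexes = [inode_index(os.path.join(tags_root, t)) for t in tags]
    common = set(indexes[0])
    for index in indexes[1:]:
        common &= set(index)
    return sorted(indexes[0][key] for key in common)

# Demo: a.txt carries both tags, b.txt only tag1.
root = tempfile.mkdtemp()
tags_root = os.path.join(root, "tags")
for tag in ("tag1", "tag2"):
    os.makedirs(os.path.join(tags_root, tag))
for name, tags in (("a.txt", ("tag1", "tag2")), ("b.txt", ("tag1",))):
    src = os.path.join(root, name)
    with open(src, "w") as f:
        f.write(name)
    for tag in tags:
        os.link(src, os.path.join(tags_root, tag, name))

both = files_with_all_tags(tags_root, ["tag1", "tag2"])
```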
        
       | hyperpallium2 wrote:
       | MS was going to use a relational filesystem at one time.
        
       | spicybright wrote:
       | I've been hearing this for years, but I've never had the mental
       | model for it. Maybe I just have to try it out.
       | 
       | There's a very neat project (can't remember the name, and some
       | details below may be wrong,) that mounts a FUSE filesystem that
       | uses only tags.
       | 
       | Paths have identical syntax to hierarchical ones, only each
       | segment of a path is a tag.
       | 
       | so `/document/taxes/2020` would return a collection of files, but
       | you could also write `/taxes/2020/document` and it would mean the
       | same thing.
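       | 
       | A rough Python sketch of that lookup rule, to make the mental
       | model concrete. It is not how any particular FUSE project
       | works; the in-memory `index` and the function names are
       | assumptions for illustration only.

```python
def tags_from_path(path):
    """Treat each non-empty path segment as a tag; segment order is irrelevant."""
    return frozenset(seg for seg in path.split("/") if seg)

def lookup(index, path):
    """Return names of files whose tag set contains every tag in `path`."""
    wanted = tags_from_path(path)
    return sorted(name for name, tags in index.items() if wanted <= tags)

# Hypothetical index: file name -> set of tags.
index = {
    "return.pdf": {"document", "taxes", "2020"},
    "receipt.png": {"taxes", "2020"},
    "notes.txt": {"document", "2021"},
}

a = lookup(index, "/document/taxes/2020")
b = lookup(index, "/taxes/2020/document")
```

       | Both calls build the same tag set, so `a` and `b` hold the same
       | list of files.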
       | 
       | The advantage is you can use standard unix tools like ls, mv, cp,
       | etc. as well as R/W using standard programs (like how you'd
       | export an image.)
       | 
       | I'm sure there's edge cases to it, but I always found the idea
       | very neat. Anyone remember what I'm talking about?
        
         | daenz wrote:
         | I'm the author of Supertag, which is what you might be talking
         | about. https://github.com/amoffat/supertag
        
         | sdeer wrote:
          | You are probably thinking of https://github.com/cfagiani/cotfs
         | which allows ln but not mv or cp.
        
           | spicybright wrote:
           | Ah, that's right. Thank you for the link!
        
         | webmobdev wrote:
          | > _I've never had the mental model for it._
         | 
         | This is one way to think about the hybrid model of folders +
         | tags that is currently available:
         | 
         | 1. Folders tell you _where_ the file is stored.
         | 
          | 2. Tags tell you _what_ the file is.
         | 
         | So basically use the _Tags_ to add more data (metadata) about
         | your files, so that if you forget where the file is, you can
         | still search for it by what is in it. This also slightly helps
         | in easing the burden of trying to figure out where to store a
          | file (e.g. _"Do I put a home video in my 'Videos' folder or my
         | 'Personal' folder?"_ - if you tag it properly, you can put it
         | in either, as you can use the tags to figure out where the
         | video is later).
         | 
         | Examples:
         | 
          | 1. In Documents folder - _"mom-2020.xls"_ (tags => docs, tax,
         | finance, mom, unfiled).
         | 
          | 2. In Videos folder - _"newyear bash.mp4"_ (tags => video,
         | family, 2021, home).
        
         | oneplane wrote:
         | While I don't remember that specific project's name, I do
         | remember that a lot of that was possible because files in a
         | filesystem often aren't just some hierarchical folder
         | structure. There is often a tree-like structure to be able to
         | find a single file quickly amongst many files, but there was no
         | technical reason why it wouldn't also support off-tree indexing
         | based on tags.
         | 
          | I imagine one of the problems you'd run into is mass updates
         | since deleting a directory with many files in it below a single
         | tree node is very much limited to that part of the filesystem's
         | tree, but doing it with tags causes updates all over the place.
         | 
         | It does remind me of this more metadata-like FS:
         | https://github.com/marook/tagfs
         | 
         | As well as the thing Apple does in Finder where it has a tag
         | database and dynamic 'tag' directories where it shows you
         | everything that was tagged with that tag. Then the search
         | function would allow selecting files based on tags so you can
         | do many-tag searches and only find the files that match all of
         | them. I think that one is based on an in-filesystem metadata
         | stream.
        
         | remus wrote:
         | > I've been hearing this for years, but I've never had the
         | mental model for it. Maybe I just have to try it out.
         | 
         | Gmail uses tags rather than folders, nice easy way to have a
         | play if you've already got a google account.
         | 
         | Conceptually, I think of it as basically being able to have
         | lots of different hierarchical folder structures applied to a
         | set of objects at the same time. Or to flip it round, a normal
         | folder structure is like using tags but where you are limited
         | to one tag (the folder that holds the object) per object.
        
           | thunderbong wrote:
           | Not exactly. In Gmail, tags are hierarchical.
           | 
           | So, if I have a tag called 'projects', and another tag called
           | 'newproject', the overall tag is 'projects-newproject'.
           | 
           | If I try to see everything under 'projects', it'll only show
           | me the ones tagged 'projects' directly, not the ones under
           | 'newproject'.
        
             | joshuamorton wrote:
             | I think what you're seeing is the opposite: tags aren't
             | hierarchical. Tags can be organized hierarchically, for
             | organizational purposes, and you can use rules to ensure
             | that everything tagged "projects-FOO" is also tagged
             | "projects", but the tags as they apply to tagged objects
             | aren't hierarchical.
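              | 
              | The kind of rule mentioned above could look roughly like
              | this in Python. It is only a sketch of the idea, not
              | Gmail's filter mechanism; `with_ancestors` and the "-"
              | separator are assumptions taken from the notation used in
              | this thread.

```python
def with_ancestors(labels, sep="-"):
    """Expand each label into itself plus all of its prefix labels."""
    expanded = set()
    for label in labels:
        parts = label.split(sep)
        for i in range(1, len(parts) + 1):
            expanded.add(sep.join(parts[:i]))
    return expanded

# Hypothetical mailbox: message id -> labels applied by the user.
mail = {
    "msg1": {"projects-newproject"},
    "msg2": {"projects"},
    "msg3": {"receipts"},
}

# Apply the rule, then do a flat (non-hierarchical) lookup on "projects".
tagged = {msg: with_ancestors(labels) for msg, labels in mail.items()}
under_projects = sorted(m for m, labels in tagged.items() if "projects" in labels)
```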
        
             | maxpro wrote:
             | This is only if you use it hierarchically. No one prevents
             | you from having a tag projects and another tag newproject.
             | Your emails just have both tags assigned and you are good
             | to go
        
         | DurhamPete wrote:
         | Fuse tagging: TMSU. At tmsu.org.
        
         | rkagerer wrote:
         | 10 years ago I was prototyping something very similar for
         | Windows.
         | 
         | If I recall, CBFS and Dokan were the closest things you had to
          | FUSE on Windows back then. I also considered emulating a
         | network drive. Like you pointed out, it had to be transparent
         | to your existing software's Open / Save dialogs (although there
         | was some effort to hook the standard ones to give users a place
         | to apply tags when saving).
         | 
         | We've been stuck in the same old directory paradigm for a long
         | time. There are some use cases where the traditional
         | hierarchical approach is desirable (e.g. when you need to
         | "visit" a set of files exactly once, like to browse through a
         | folder to clean it up, enumerate for backup, calculate sizes,
         | etc). But it's a constraint when a file belongs in more than
         | one place.
        
         | tacticalmook wrote:
         | The downside of folders is trying to figure out where things
         | belong in the hierarchy, or trying to update that hierarchy to
         | a new standard.
         | 
         | The downside of tagging is you still need to establish
         | conventions to ensure things can be found again, but
         | enforcement of your conventions is harder.
         | 
         | Exploring, learning, and using an unfamiliar folder hierarchy
         | is easier than exploring, learning, and using an unfamiliar
         | tagging methodology.
         | 
         | But manually searching for something in somebody else's tagged
         | data is easier than manually searching for something in
         | somebody else's folders.
        
           | johnchristopher wrote:
           | > The downside of folders is trying to figure out where
           | things belong in the hierarchy, or trying to update that
           | hierarchy to a new standard.
           | 
           | That's why I stick to Documents/{folder1..folder[?]} and
           | folders don't have hierarchical sub folders, just contextual
           | folders. Eg: Documents/taxes 2021/{invoices, stuff},
           | Documents/taxes 2021/, Documents/Cthulluh Roleplaying/{pdf
           | files of characters}, Documents/Covid vaccination
           | certificates,
           | 
            | Yes, it's messy but I don't have the mental burden of holding
            | a tree in my head or a tagging system.
        
             | Andrex wrote:
             | I try not to go more than three folders deep (starting from
             | ~). I don't mind a lot of files in a folder, search helps
             | me with that.
        
               | johnchristopher wrote:
               | My current setup is: Documents, owncloud, Downloads, Dev,
               | Media and tmp. No desktop. At work I have an additional
               | git folder.
        
       | Kalanos wrote:
       | Tags are the lazy man's schema
        
       | Razengan wrote:
       | One of the best uses of tags is to let a file effectively exist
       | in multiple "folders"
       | 
       | For example I have folders of screenshots named after various
       | shows and games, and I use tags to further organize the images
       | based on their suitability for different "reactions" on online
       | forums :)
       | 
       | So naturally I have hundreds of tags but macOS doesn't seem to
       | keep up with that many and after a certain point it feels like
       | Apple have forgotten about tags and improving their integration
       | into the system.
        
       | polote wrote:
       | (2017)
        
       | tiberiusteng wrote:
       | If using Windows, use Everything (https://www.voidtools.com/) and
       | when naming a new file treat the filename as a tag set.
       | 
       | Then Everything could become your shell ...
        
       | dusted wrote:
       | Reminds me of an old project I did where I built a fuse system
       | for this, files still lived on a normal ext3 fs, but the overlay
       | presented files as tag paths, so you could access, for instance,
       | a movie like /tagfs/movies/year/1999/matrix.avi or
       | /tagfs/movies/genre/scifi/matrix.avi
       | 
       | There was a special path /tagfs/untagged/ which listed any files
       | that didn't have at least one tag.
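       | 
       | A small Python sketch of how such an overlay might derive its
       | virtual paths. The category/key/value layout is only inferred
       | from the two example paths above, and `virtual_paths` is a
       | hypothetical name, not code from the original project.

```python
def virtual_paths(files, mount="/tagfs"):
    """Return one virtual path per key/value tag, or an 'untagged' path."""
    paths = []
    for name, (category, attrs) in sorted(files.items()):
        if not attrs:
            # Files without any tag are listed in one special directory.
            paths.append(f"{mount}/untagged/{name}")
            continue
        for key, value in sorted(attrs.items()):
            paths.append(f"{mount}/{category}/{key}/{value}/{name}")
    return paths

# Hypothetical catalogue: file name -> (category, key/value tags).
files = {
    "matrix.avi": ("movies", {"year": "1999", "genre": "scifi"}),
    "holiday.avi": ("movies", {}),
}
paths = virtual_paths(files)
```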
        
       | dsr_ wrote:
       | The difference between tags and hierarchies is this: if you don't
       | know what you're looking for, hierarchy can guide you.
        
         | Rygian wrote:
         | That's just bad UX of the tag browsers.
        
         | arduinomancer wrote:
         | Couldn't you just look at a list of tags?
        
           | dsr_ wrote:
           | Sure. How many tags are in your system?
           | 
           | The number of tags should be more or less similar to the
           | number of non-leaf nodes in a hierarchy, or else you aren't
           | capturing the same information. Any tag that applies to more
           | than half of the files is probably useless. On my blog, the
           | tags "blog" and "technology" are definitely useless. That's
           | fewer than 250 entries and already it has cruft.
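       | 
       | The "more than half" rule of thumb is easy to check
       | mechanically. A minimal Python sketch, with made-up post data
       | and a hypothetical `overused_tags` helper:

```python
from collections import Counter

def overused_tags(tagged_files, threshold=0.5):
    """Return tags attached to more than `threshold` of all files."""
    counts = Counter(tag for tags in tagged_files.values() for tag in set(tags))
    total = len(tagged_files)
    return sorted(tag for tag, n in counts.items() if n > threshold * total)

# Made-up example data: post id -> tags.
posts = {
    "post1": {"blog", "technology", "linux"},
    "post2": {"blog", "technology", "cooking"},
    "post3": {"blog", "travel"},
    "post4": {"blog", "technology"},
}
cruft = overused_tags(posts)
```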
           | 
           | Were people consistent when they added tags? Does your system
           | suggest tags automatically? Is this actually a full-text
           | search minus stop words? Is there a librarian who cleans up
           | after you and merges tags that have the same meanings? Would
           | the full-text search be more useful than tags?
        
       | jasode wrote:
       | The previous 2018 discussion had comments from the author:
       | 
       | https://news.ycombinator.com/item?id=16763235
        
       | [deleted]
        
       | btrettel wrote:
       | From the article:
       | 
       | > ### Hard and soft links as non-solutions
       | 
       | > Soft links have a nearly opposite set of problems as hard links
       | - soft links can span different file systems, but they generally
       | don't track the target files getting moved or renamed (except
       | that Windows provides such a system service, but may not be
       | reliable); while hard links are all indistinguishable, some
       | application software behaves differently on soft links than on
       | real hard-linked files. In spite of these problems, hard and soft
       | links still require an exponential amount of effort to classify a
       | set of files in multiple ways, and require the user to manually
       | remember all the possible paths that a file can be reached from
       | (important when editing and removing files, not important when
       | browsing/retrieving files). They are non-scalable kludges
       | compared to true tagging.
       | 
       | As a fan of hierarchies for file organization, and as someone
       | with quite literally thousands of soft links, when I read things
       | like this, I don't know if the person arguing against links has
       | tried this approach. It works totally fine for me.
       | 
       | Yes, with soft links, moving the original file breaks the links.
       | I wrote a fairly simple bash script to automatically fix these
       | for my reference PDF files. It works because each PDF file I save
       | has a unique file name. So figuring out where the links need to
       | point is pretty simple. That makes me a "power user", I guess,
       | but the author is at least at the same level and I think could
       | figure it out.
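       | 
       | For illustration, here is the same idea sketched in Python
       | rather than bash. It is not the script mentioned above; the
       | function name and directory layout are assumptions, and it
       | only works if file names in the library are unique.

```python
import os
import tempfile

def fix_broken_links(links_dir, library_dir):
    """Re-point broken symlinks in links_dir at same-named files in library_dir."""
    # Index the library by base name; this relies on names being unique.
    by_name = {}
    for dirpath, _dirs, filenames in os.walk(library_dir):
        for name in filenames:
            by_name[name] = os.path.join(dirpath, name)

    fixed = []
    for entry in list(os.scandir(links_dir)):
        # os.path.exists follows the link, so it is False for a dangling one.
        if entry.is_symlink() and not os.path.exists(entry.path):
            target_name = os.path.basename(os.readlink(entry.path))
            new_target = by_name.get(target_name)
            if new_target is not None:
                os.remove(entry.path)
                os.symlink(new_target, entry.path)
                fixed.append(entry.path)
    return sorted(fixed)

# Demo in a throwaway directory.
root = tempfile.mkdtemp()
library = os.path.join(root, "library")
links = os.path.join(root, "links")
os.makedirs(os.path.join(library, "old"))
os.makedirs(os.path.join(library, "new"))
os.makedirs(links)
pdf = os.path.join(library, "old", "paper.pdf")
with open(pdf, "w") as f:
    f.write("pdf")
link = os.path.join(links, "paper.pdf")
os.symlink(pdf, link)
new_pdf = os.path.join(library, "new", "paper.pdf")
os.rename(pdf, new_pdf)  # moving the target leaves the link dangling
fixed = fix_broken_links(links, library)
```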
       | 
       | With respect to the "exponential amount of effort to classify a
       | set of files in multiple ways", I guess the author is referring
       | to navigating the hierarchy to link a file in multiple places? I
       | use tagging at my work, and I personally find scrolling through
       | my list of about 200 tags to be comparable in terms of time to
       | navigating through a hierarchy. The bottleneck is the human.
       | 
       | Edit: Here's another article by someone who is not a fan of
       | links: https://karl-voit.at/2018/08/25/links/
        
         | livrem wrote:
         | I use hard and soft links (mostly the former for files and the
         | latter for directories) and I think it works very well in
         | practice despite the theoretical problems brought up.
        
       | lmilcin wrote:
       | Or better idea. Use directed graphs. Directed graphs are strictly
       | a superset of both tree and tag (set) functionality.
       | 
       | Tags (sets) can be thought of as a special case of a directed
       | graph. You make the tag a node in the graph, make the files
       | nodes as well, and add an edge pointing from each file to its
       | tag node.
       | 
       | You can then do graph queries to find files tagged in a certain
       | way.
       | 
       | But graphs offer so much more.
       | 
       | Because they are a superset of both trees and sets, you can use
       | them to represent both, _at the same time_.
       | 
       | It is not very useful to have a lot of things tagged the same way
       | because you end up with just a long list of things. Whereas in a
       | graph you could say that you want to find objects from which you
       | can reach certain tag node and all these objects can still have
       | their own structure and even be part of multiple structures.
       | 
       | For many years I had this idea to build my PIM where I could make
       | arbitrary nodes being anything that could let me connect anything
       | to anything.
       | 
       | A node could be an email, a file, a link to an external website,
       | a task, a contact, a reminder, etc.
       | 
       | And you could connect everything to anything and have, for
       | example a project that has important emails attached to it, the
       | email could have attached a reminder to respond and a file that
       | you want to include in response, and a note.
       | 
       | You could browse this graph as a tree because locally it could be
       | interpreted as a tree: you open the tree one level by finding all
       | elements that point to the node.
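       | 
       | A toy Python sketch of the "tags as graph nodes" idea: every
       | item is a node, edges point toward what the item belongs to,
       | and a tag query becomes a reachability question. The `Graph`
       | class and the node names are invented for illustration.

```python
from collections import defaultdict

class Graph:
    """Tiny directed graph: an edge src -> dst means 'src belongs to dst'."""

    def __init__(self):
        self.out = defaultdict(set)

    def link(self, src, dst):
        self.out[src].add(dst)

    def reaches(self, start, goal):
        """Depth-first search; the `seen` set makes cycles harmless."""
        stack, seen = [start], set()
        while stack:
            node = stack.pop()
            if node == goal:
                return True
            if node in seen:
                continue
            seen.add(node)
            stack.extend(self.out.get(node, ()))
        return False

    def nodes_reaching(self, goal):
        """All nodes from which `goal` can be reached, directly or indirectly."""
        nodes = set(self.out) | {d for dsts in self.out.values() for d in dsts}
        return sorted(n for n in nodes if n != goal and self.reaches(n, goal))

g = Graph()
g.link("reminder", "email")        # reminder attached to an email
g.link("file.pdf", "email")        # file to include in the reply
g.link("email", "project-x")       # email attached to a project
g.link("project-x", "tag:work")    # project carries a plain tag
g.link("photo.jpg", "tag:family")
work_items = g.nodes_reaching("tag:work")
```

       | Only the project is tagged directly, yet the email, the file
       | and the reminder are all found through it.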
        
         | MrLeap wrote:
         | How do you transform a digraph with cycles into a tree without?
         | Seems perilous.
        
           | [deleted]
        
           | lmilcin wrote:
           | How do you traverse a symlink that points to one of its
           | parent folders? How do you traverse a web of interlinked
           | sites?
           | 
           | Somehow Linux, Linux software, web browsers etc. are already
           | able to deal with directed graphs.
        
             | MrLeap wrote:
             | You can handwave cycles created by symlinks away by
             | ignoring them. Full traversal is possible relying only on
             | the tree.
             | 
             | If all you have is the digraph you have to rely on cycle
             | detection. Transforming it into a hierarchy of depth
             | requires snipping edges somewhere.
             | 
             | I'm just gently disputing the claim that a digraph can be
             | mapped 1:1 to a hierarchical filesystem. I'm open to being
             | wrong.
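              | 
              | To make the "snipping edges" point concrete, a minimal
              | Python sketch: a breadth-first walk that keeps the first
              | edge into each node and reports every edge it had to
              | drop. The function name and sample graph are made up.

```python
from collections import deque

def spanning_tree(graph, root):
    """Keep the first edge found into each node; return (tree, dropped edges)."""
    tree = {root: []}
    dropped = []
    queue = deque([root])
    while queue:
        node = queue.popleft()
        for child in graph.get(node, []):
            if child in tree:
                # Already placed in the hierarchy: this edge cannot be kept.
                dropped.append((node, child))
            else:
                tree[child] = []
                tree[node].append(child)
                queue.append(child)
    return tree, dropped

# A small digraph with cycles: adjacency lists keyed by node.
graph = {"a": ["b", "c"], "b": ["c", "a"], "c": ["a"]}
tree, dropped = spanning_tree(graph, "a")
```

              | Here three of the five edges end up in `dropped`, so the
              | resulting hierarchy is a lossy view of the graph rather
              | than a 1:1 mapping.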
        
       | tdrdt wrote:
       | The same is happening in webshops. Products can be assigned to
       | hierarchies (categories) but in shops like Amazon it is obvious
       | it is more like tagging.
        
         | indymike wrote:
         | In ecommerce, products are often presented to the shopper by
         | search facets, so kind of all of the above applies.
        
         | kzrdude wrote:
          | And Amazon, thanks to that, is opaque; I never look around for
         | stuff, just search and find or not. They could have made
         | something better where you can discover other stuff.
        
       | s1k3s wrote:
       | Just cancelled my Google Drive subscriptions last week due to
       | this, and moved to SFTP on rented metal. If you can't 100%
       | predict what I'm looking for then don't even bother. Just give me
       | files and folders.
        
       | agumonkey wrote:
       | I had this idea long ago but at the same time I worry about
       | things being too fluid, both for performance and for information
       | efficiency. A tree can be seen as a preemptive, good-enough tag
       | order. Some obvious dimensions like category and time will
       | always be of use.
        
       | austincheney wrote:
       | For tags I use meta data, which are included as mapped text at
       | the extreme end of many media formats. For example ID3 data on
       | MP3 files.
       | 
       | I have found this incredibly helpful for MP3s because I have
       | thousands of them and there are many similar names. Windows
       | Explorer provides columns for this data in its detailed file
       | system view (not by default) which is incredibly helpful and
       | trivial to customize.
       | 
       | For everything else folders are enough. I have hundreds of movies
       | on a hard disk and yet folders are enough. When I do need more
       | the data I want is generated by the file system: last modified,
       | file size, and so forth.
       | 
       | What improves file usability the most for me is network access by
       | metadata. For example, Windows Explorer and OSX Finder are nice,
       | but I would rather have the exact same interface on the same
       | local machine for a bunch of remote machines, regardless of their
       | file system or operating system. Then copying to a different
       | machine is just drag and drop from one window onto another in an
       | application that looks like some local OS. That windowing
       | interface needs to allow sorting, filtering, and search by
       | metadata, just like Windows Explorer. Having an application that
       | does this for me has been great.
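As a rough illustration of metadata stored at the very end of a media file, here is a sketch that reads an ID3v1 tag, which occupies the last 128 bytes of an MP3 and starts with the marker `TAG`. This covers only ID3v1; the function name `read_id3v1` and the sample bytes are made up for the example, and real collections often use ID3v2, which this does not parse:

```python
import struct


def read_id3v1(data: bytes):
    """Return the ID3v1 fields from the last 128 bytes of an MP3, or None if absent."""
    if len(data) < 128 or data[-128:-125] != b"TAG":
        return None
    # Layout: marker(3) title(30) artist(30) album(30) year(4) comment(30) genre(1)
    _, title, artist, album, year, _comment, genre = struct.unpack(
        "<3s30s30s30s4s30sB", data[-128:]
    )

    def text(raw):
        # Fields are fixed-width and padded with NUL bytes or spaces.
        return raw.split(b"\x00", 1)[0].decode("latin-1").strip()

    return {
        "title": text(title),
        "artist": text(artist),
        "album": text(album),
        "year": text(year),
        "genre": genre,
    }


def pad(value, width):
    return value.encode("latin-1").ljust(width, b"\x00")


# Fabricated sample: some fake audio bytes followed by a 128-byte tag.
sample = (
    b"\xff\xfb fake audio bytes"
    + b"TAG" + pad("Song", 30) + pad("Artist", 30) + pad("Album", 30)
    + b"2012" + pad("", 30) + bytes([17])
)
print(read_id3v1(sample))
```

For a real file, something like `read_id3v1(open(path, "rb").read())` would work, though seeking to the last 128 bytes avoids reading the whole file.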
        
       | xixixao wrote:
       | I love Bear with its 1+ hierarchical tags per note. It's the
       | perfect combination of both worlds for me. Would encourage the
       | author to check it out (if it's any different).
        
       | megous wrote:
       | Usefulness of this is highly content specific. It maybe works
       | for mp3s or videos/photos you made yourself and contain some
       | metadata. I can't imagine tag based organization of all the
       | random 2 mil. files I have that don't fit into these neat
       | categories.
       | 
       | I don't need access to most of these files unless I'm working on
       | something relevant to them. When I work on X, I go to directory X
       | and everything I need sits below X in some hierarchy. I'm never
       | interested in anything that's below X unless I'm working on X. I
       | don't want stuff from X to pollute some global namespace just
       | because some mp3 or PDFs are present under X. That's true for
       | hundreds of personal projects and tools I've made.
       | 
       | Directory hierarchy is a pretty neat abstraction to me and all
       | the tools I use already support it well.
        
         | Rygian wrote:
         | Does it mean that you will store multiple copies of a file, if
         | it ends up being useful for multiple projects?
         | 
         | About the random 2 mil. files you have lying around, the tags
         | you should set should describe not the files themselves, but
         | the reason why _you_ chose to keep those files.
        
       | fmajid wrote:
       | Librarians have been classifying the world's knowledge since
       | forever, and they developed faceted classification systems
       | (Ranganathan, 1933) to deal with these issues.
       | 
       | https://www.researchgate.net/publication/321840994_Ranganath...
       | 
       | A notable one is that hierarchies embed the point of view of the
       | classifier, e.g. the Dewey Decimal's ridiculous classifications of
       | religions (codes 200-299), making minute distinctions like 285
       | (Presbyterian, Reformed, Congregational), 286 (Baptist, Disciples
       | of Christ, Adventist), then putting all non-Christian religions
       | under a handful of afterthought headings 292-299: 294 for
       | Hinduism, Buddhism, Sikhism and other religions of Indian
       | origin, 295 for Zoroastrianism and its descendants, 296 for
       | Judaism, 297 for Islam and Bahaism lumped together, and 299 for
       | New Age.
       | 
       | Unfortunately, most of us do not have access to the services of a
       | librarian to develop a taxonomy that corresponds to our own point
       | of view then classify our files accordingly, which is why simple
       | hierarchical taxonomies have endured and faceted ones are seldom
       | seen beyond specialized applications like Digital Asset Management.
        
         | maratc wrote:
         | Dewey Decimal System has only one (often-misunderstood) task:
         | organizing bookshelf space. It does not have a task of
         | classifying world information.
         | 
         | The reason that the fairly narrow "Presbyterian, Reformed,
         | Congregational" topic has one DDS code and the extremely wide
         | "Hinduism, Buddhism, Sikhism and other religions of Indian
         | origin" topic also has one DDS code is simple: an average
         | library has similar book-width for both topics.
        
           | karaterobot wrote:
           | > Dewey Decimal System has only one (often-misunderstood)
           | task: organizing bookshelf space. It does not have a task of
           | classifying world information.
           | 
           | How is this distinct from organizing world information? If
           | the goal is to shelve books in a way that adds value to the
           | information-seeker, you have to arrange them according to
           | some definition of similarity. That decision about what makes
           | one thing similar to another encodes a view of the world.
        
             | TheCoelacanth wrote:
             | The physical shelves in a library only serve a single,
             | small geographic area, not the whole world.
        
           | tobr wrote:
           | An average _American_ library, you mean.
        
             | maratc wrote:
             | Of course, and while we are there, _an average American
             | Public library as of the end of the 19th century_.
             | 
             | In the last century, DDS has seen its adaptations for
             | China, Japan and other countries which are based on the
             | same idea but are rather different.
        
         | ifethereal wrote:
         | One example of a data structure implementing faceted
         | classification would be the multitree [0]. Unfortunately
         | multitrees seem to receive far less support than the 2 data
         | structures they sit between: trees and DAGs.
         | 
         | k-d trees [1] are close but use cases seem to predominantly
         | target data with inherently ordinal (rather than nominal)
         | dimensions.
         | 
         | Further abstraction could lead to the knowledge graph [2] or
         | graph databases.
         | 
         | In all cases, the availability of "low-code" tools (in the
         | domain of single-user personal information management, at
         | least) seems sparse. I have been looking for some time, but the
         | search continues.
         | 
         | [0]: https://en.wikipedia.org/wiki/Multitree
         | 
         | [1]: https://en.wikipedia.org/wiki/K-d_tree
         | 
         | [2]: https://github.com/JeffreyBenjaminBrown/hode
        
       | vinodkd wrote:
       | Some really interesting ideas in a well organized article.
       | Bookmarked!
        
       | nathanmcrae wrote:
       | I find that using Everything search (https://www.voidtools.com/)
       | makes me use the filesystem more like a tagging system. I still
       | name files and directories meaningfully, but I don't worry about
       | the hierarchy at all. Then when I want to get something, I just
       | search (parts of) the terms I want and see the matching paths
       | instantly.
       | 
       | Using the filesystem hierarchically now feels painfully slow and
       | awkward.
        
         | _dain_ wrote:
         | I do this with fd and ripgrep, which also lets me do full-text
         | search through the files themselves. My filesystem has become
         | much flatter and coarser-grained; just a few very large
         | categories.
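The "search parts of the terms and see the matching paths" workflow can be approximated in a few lines. This is only a sketch of the idea, not how Everything, fd, or ripgrep are implemented. The function name `search_paths` and the example paths are hypothetical:

```python
def search_paths(paths, query):
    """Return paths containing every whitespace-separated term, case-insensitively."""
    terms = query.lower().split()
    return [p for p in paths if all(term in p.lower() for term in terms)]


# Hypothetical file listing (illustration only).
paths = [
    "/docs/finance/2020/tax-return.pdf",
    "/docs/finance/2021/tax-return.pdf",
    "/photos/2020/summer/beach.jpg",
    "/misc/Tax notes 2020.txt",
]
print(search_paths(paths, "tax 2020"))
```

Each path component effectively acts as a tag here: term order does not matter, and a file matches whether "2020" is a directory name or part of the file name.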
        
       | webwielder2 wrote:
       | Tags vs. Hierarchy are like the WFH vs. Office debate: something
       | people assume has a right answer but is actually personal
       | preference.
        
       | slaymaker1907 wrote:
       | I don't like the idea of file hashes for storing files. That
       | really only works well for a limited set of files completely
       | ignoring things like databases, note files, etc. If you want to
       | generate a name, UUIDs would make way more sense. Additionally,
       | even for files which never change, what happens when a bit gets
       | flipped inside of the file? That would presumably change the hash
       | without knowledge of the file system until you try and read it.
       | For drives intended for consumers, you really don't want to have
       | more data loss than is absolutely necessary in case of minor
       | corruption.
       | 
       | On a different note, I use Tiddlywiki with tags a lot for my
       | personal notes and I think a hybrid approach with tags and a
       | hierarchy works best. Hierarchies are useful for generating
       | meaningful unique names where it is nice to have the notes be
       | meaningful. Another thing that would be useful with a hybrid
       | system is to have rules like
       | /Skyrim/Guilds/ThievesGuild/NPC/Mercer automatically assign tags
       | to Mercer by virtue of his placement like {"Skyrim", "NPC",
       | "ThievesGuild"}. You could do this automatically via the path or
       | by some sort of rules engine.
       | 
       | Finally, I would really like it for a more complex information
       | storage system to fully support 1, 2, and 3-tuple metadata. A
       | 1-tuple is a tag, a 2-tuple is a key-value pair associated with a
       | file, and a 3-tuple would be a relation between files with an
       | optional key/value (some relations would be merely a tag while
       | others might want a key/value). 2-tuples are obviously useful
       | since they exist in limited form currently via file attributes,
       | though they unfortunately are not exposed very well to users even
       | though some file systems support arbitrary key/value pairs as
       | attributes.
       | 
       | 3-tuples are a little more niche, but I think it would be useful
       | for keeping track of stuff like what file imports another. Humans
       | probably wouldn't generate the metadata in the previous example,
       | but it would help tools play nicely together since you could have
       | a small static analysis tool which simply updates this metadata
       | which could then be used by other tools or by users directly. One
       | of the greatest things about file systems is that they allow for
       | separate tools to interact with the same data in a structured
       | way.
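A small sketch of the hybrid idea above, assuming tags are derived from a file's parent directories and that 1-, 2-, and 3-tuple metadata live in a simple in-memory store. The class `MetadataStore`, the `ignore` rule, the `faction` attribute, and the source file paths are all invented for illustration. A real system would persist this data and might use a proper rules engine:

```python
from pathlib import PurePosixPath


def tags_from_path(path, ignore=frozenset()):
    """Derive tags from every directory above the file, skipping ignored names."""
    parents = PurePosixPath(path).parts[:-1]
    return {p for p in parents if p != "/" and p not in ignore}


class MetadataStore:
    def __init__(self):
        self.tags = {}          # 1-tuples: file -> set of tags
        self.attrs = {}         # 2-tuples: file -> {key: value}
        self.relations = set()  # 3-tuples: (source file, label, target file)

    def add_file(self, path, ignore=frozenset()):
        self.tags.setdefault(path, set()).update(tags_from_path(path, ignore))

    def set_attr(self, path, key, value):
        self.attrs.setdefault(path, {})[key] = value

    def relate(self, source, label, target):
        self.relations.add((source, label, target))

    def related_to(self, target, label):
        """List the files that have the given relation to target."""
        return sorted(s for s, l, t in self.relations if l == label and t == target)


store = MetadataStore()
mercer = "/Skyrim/Guilds/ThievesGuild/NPC/Mercer"
store.add_file(mercer, ignore={"Guilds"})           # rule: "Guilds" is not a useful tag
store.set_attr(mercer, "faction", "ThievesGuild")   # hypothetical key/value pair
store.relate("/src/main.py", "imports", "/src/util.py")
store.relate("/src/cli.py", "imports", "/src/util.py")

print(store.tags[mercer])
print(store.related_to("/src/util.py", "imports"))
```

A static analysis tool could populate the `imports` relations while other tools or users only read them, which is the kind of cooperation through shared structure the comment describes.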
        
       | rasengan0 wrote:
       | I ditched everything for a simple Zettelkasten in PlainVanillaVim
       | all on a SdCard
       | 
       | EverythingIsAFile NoExtensions CamelCase
       | 
       | hit gf on a word/file/tag and enjoy freedom
       | 
       | :map gf :e <cfile><CR>
       | 
       | https://vimhelp.org/editing.txt.html#gf
       | 
       | sprinkle in RipGrep Fzf CtrlP as needed
        
         | klodolph wrote:
         | I like to organize files in a similar way but use spaces in
         | filenames. I find it a lot easier to read filenames that way,
         | and in the shell, <tab> will add all the necessary quotes /
         | backslashes.
        
         | megous wrote:
         | This is tempting, at least for note taking. :)
        
       | jcelerier wrote:
       | I remember trying to use tags to tag my media collection a decade
       | ago and miserably failing
        
       ___________________________________________________________________
       (page generated 2021-11-07 23:00 UTC)