[HN Gopher] Designing better file organization around tags, not ...
___________________________________________________________________
Designing better file organization around tags, not hierarchies

Author : homarp
Score  : 122 points
Date   : 2021-11-07 18:01 UTC (4 hours ago)

(HTM) web link (www.nayuki.io)
(TXT) w3m dump (www.nayuki.io)

| techsin101 wrote:
| it's easier for me to put folder "summer photos 2012" under
| photos than tag everything. In some cases it'd just bring
| garbage, i.e. script tag -> get ready for a tsunami of script.js
| that got downloaded with html files i downloaded. I can't think
| of a scenario where it'd be mostly helpful, but I don't know
| whether there are other scenarios where tags are better.

| civilized wrote:
| Tags are superfluous if you have a good search engine. Hierarchy
| isn't.

| guerrilla wrote:
| Yeah, tagging really seems like the job of a file manager (and
| indexer) rather than a file system. That seems like an easy way
| to get everything we want without rewriting billions of lines
| of code that deal with hierarchies.

| encyclic wrote:
| Wasn't this the intent behind Windows Vista - a tag-based
| DB-as-filesystem with hierarchical paths just one "lens" through
| which to view the DB? I use Google Drive this way, largely through
| search rather than directory-based organization, though I do also
| employ that for often-used collections.

| ximeng wrote:
| https://en.wikipedia.org/wiki/WinFS WinFS - was cancelled

| sigg3 wrote:
| It's the entire point of MS SharePoint afaict: create a tag
| based FS on top of some SQL database.

| fungiblecog wrote:
| tags are great when you know what you're looking for but are
| terrible for browsing a new dataset.
|
| this is especially true when you're the newbie eg on a project
| and you didn't have any input into the tag structure.
| sgc wrote:
| There just needs to be a default view that shows you the most
| important tags to start from - either by most used files, or
| largest number of files, or most active recent changes, or as
| managed by someone. The nice thing about tags is all of those
| could be top level tabs and you are sailing.

| japanuspus wrote:
| This. Hierarchy allows a simple manual traversal, leading to
| better discoverability.

| sgc wrote:
| Any tag based system should have a breadcrumb function that
| allows for manual traversal.

| Tarsul wrote:
| ok, maybe this is the right thread for this question: Does anyone
| have a good structure for their own mp3s? I'd imagine tagging all
| files and using those tags for playlists would be a good way to
| do it. Does anyone do something like this or similar and can give
| pointers?

| sparkie wrote:
| Use MusicBrainz Picard.
|
| The MusicBrainz database is probably the best there is, and it
| does a good job of automatically tagging your music. Once
| tagged, every file will have a unique musicbrainz_trackid
| (UUID) in its metadata, which can be used to recover/update the
| metadata associated with the track automatically from the
| database, which is constantly updated (and to which you can
| contribute if it is missing metadata for your tracks).
|
| You can configure Picard to arrange files and rename them
| however you want. It has some simple scripting functionality so
| you can name things conditionally based on the presence or
| absence of metadata, etc.
| [https://picard-docs.musicbrainz.org/en/tutorials/naming_scri...]
|
| If you are concerned about privacy, you can run your own
| musicbrainz instance in a VM and download a copy of the entire
| database.
|
| Picard is extensible with Python. There are some existing plugins
| for generating playlist files.
| ghostly_s wrote:
| A complementary alternative I'd suggest is beets[1], a
| front-end agnostic CLI tagging utility that also matches your
| files against the MusicBrainz database and can both correct the
| ID3 tags and maintain a directory hierarchy based on those tags.
|
| The biggest shortcoming of the MusicBrainz database I've
| found so far, however, is genre tags. Most releases seem to
| have only one or a handful of genres listed with no
| consistent genre hierarchy convention, but I've been
| experimenting with an extension that pulls genre tags from
| discogs.
|
| 1. https://beets.readthedocs.io/

| jazzyjackson wrote:
| I guess I'll need to blog about it because I can't find any
| information online, but still nothing has beat Sony's
| SonicStage, which is by most accounts very annoying proprietary
| software for working with minidisc players that want ATRAC
| encoding, but also included an excellent tag navigator that
| worked like so:
|
| While playing any song in your library, you could display a
| graph view which showed the song center screen, and radially
| arranged spokes enumerating what the song was tagged with,
| "rock", "instrumental", "upbeat" etc, and when you clicked that
| tag, it would become center-screen and all the songs with that
| tag would be radially arranged around that tag. So you could
| navigate your library by kind of surfing the tag-graph and hit
| play/add to playlist as you go.
|
| Last I checked there were .exe's compatible with Windows 10
| available so I'll have to download it again and try it out.

| crooked-v wrote:
| Run an old version of iTunes in a VM?
|
| The column browser is still one of the generally best media
| browsing interfaces I've ever dealt with.

| throwaway984393 wrote:
| I think the author is just a little bit behind the times in terms
| of organizing large corpuses of inter-related data. Tagging as a
| general idea is too fast-and-loose without a system/structure to
| organize the tags. Semantic Web stacks are one way to improve on
| this using taxonomies and ontologies, query languages and data
| specs; they almost mention it in the alternative systems
| ("tuples") but don't dig into just how hard it is to manage data
| using unstructured references (or even structured ones!).
|
| Simple hierarchies are ..... simple. Tags are simple too, but
| they quickly devolve into new complexities as people try to
| figure out how to apply them, find them, organize them.
| Hierarchies aren't typically as difficult to manage because they
| box you into re-creating the same mental model for
| organization, just with different classifiers for each level of
| the hierarchy. They're less flexible, but they're easier to grok,
| maintain, and use.
|
| My major concern is that there isn't really a need to "fix"
| hierarchies, it's just a nagging problem that someone doesn't
| want to deal with, so their solution is to make something more
| complicated.... and more complicated might not make it better. It
| should also be feasible to design applications to organize the
| files without having to rewrite filesystems.

| pimlottc wrote:
| You seem to know a lot about this; what is considered the
| "correct" way to organize data like this these days?

| throwaway984393 wrote:
| I couldn't claim to know what is "correct" and what isn't!
| I've just worked on projects to organize large collections of
| interrelated datasets (for example, to update correlations
| between concepts, to make search engines more effective, to
| identify related or dependent item relationships, etc) and
| for our project we used a Semantic Web stack. Browsing GitHub
| for "knowledge graph" or "knowledge management" seems to pop
| up some cool looking projects, but I think everyone is still
| trying to figure out what works for a particular use case
| rather than generally.
|
| I hope a real data scientist can reply with whatever the
| latest and greatest solutions are. Semantic Web tech for
| knowledge graphs is continuing to evolve, but it's also kind of
| old, and it's still mostly used for research projects.
| Part of that is probably because the terminology is unusual,
| and application leads you down a long rabbit hole of new and
| confusing concepts. So that's why I'm thinking that just
| sticking to a boring inefficient hierarchy might not be so
| bad...

| pininja wrote:
| I like how this article is laid out: it first defines the
| existing systems used today (which, while they're all I know, I
| haven't spent the time defining), and then describes a number
| of inspirational examples of how it could be different.
|
| Desktop file systems seem impossibly hard to change at this
| point, but cloud storage and mobile file systems are still so new
| and not amazing in my opinion - there's still hope for a better
| experience.

| ghoward wrote:
| Good ideas, but there are a few things wrong with this.
|
| First, we forget that filesystems are not hierarchies, they are
| graphs, whether DAGs or not. [1]
|
| Second, and this follows from the first, _both_ tags and
| hierarchy are possible with filesystems as they currently are.
|
| Here's how you do it:
|
| 1. Organize your files in the hierarchy you want them in.
|
| 2. Create a directory in a well-known place called `tags/` or
| whatever you want.
|
| 3. For every tag `<name>`, create a directory `tags/<name>/`
|
| 4. Hard-link all files you want to tag under each tag directory
| that apply.
|
| 5. For extra credit, create a soft link pointing to the same
| file, but with a well-known name.
|
| This allows you to use the standard filesystem tools to get all
| files under a specific tag. For example,
|
|     find tags/<name> -type f
|
| (The find on my machine does not follow symbolic links and does
| not print them if you use the above command.)
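
The hard-link recipe above can be sketched in Python. This is only an illustration under stated assumptions: `tag_file`, `files_with_tags`, and `demo` are hypothetical names, the soft link from step 5 is left out, name collisions inside a tag directory are not handled, and the file system is assumed to support hard links. Matching by inode to combine several tags is an addition of this sketch, not part of the recipe.

```python
import os
import tempfile
from pathlib import Path


def tag_file(path, tags_root, tag):
    """Hard-link `path` into tags_root/<tag>/ (steps 3 and 4 of the recipe)."""
    tag_dir = Path(tags_root) / tag
    tag_dir.mkdir(parents=True, exist_ok=True)
    link = tag_dir / Path(path).name
    if not link.exists():  # note: same-name collisions are silently skipped
        os.link(path, link)  # a second name for the same inode
    return link


def files_with_tags(tags_root, *tags):
    """Return names of files hard-linked under every one of `tags`.

    Files are matched by (device, inode) rather than by name.
    """
    common = None
    names = {}
    for tag in tags:
        tag_dir = Path(tags_root) / tag
        found = set()
        if tag_dir.is_dir():
            for entry in tag_dir.iterdir():
                st = entry.stat()
                key = (st.st_dev, st.st_ino)
                found.add(key)
                names.setdefault(key, entry.name)
        common = found if common is None else common & found
    return {names[key] for key in (common or set())}


def demo():
    """Build a throwaway hierarchy, tag two files, and query the tags."""
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        docs = root / "documents" / "taxes"
        docs.mkdir(parents=True)
        (docs / "return-2020.txt").write_text("tax return")
        (docs / "receipt.txt").write_text("receipt")

        tags = root / "tags"
        tag_file(docs / "return-2020.txt", tags, "finance")
        tag_file(docs / "return-2020.txt", tags, "2020")
        tag_file(docs / "receipt.txt", tags, "finance")

        return (
            files_with_tags(tags, "finance"),
            files_with_tags(tags, "finance", "2020"),
        )
```

Calling `demo()` returns the set of names under the "finance" tag and the set of names carrying both "finance" and "2020". Because the tag directories hold hard links, moving the original inside the hierarchy does not break them, provided everything stays on one file system.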
|
| If you want to find where the file is _actually_ under the
| hierarchy, use
|
|     find -L tags/ -xtype l
|
| Having both hard and soft links means that 1) you cannot lose the
| actual file if it's moved in the hierarchy (the hard link will
| always refer to it), and 2) you can either find the file in the
| hierarchy from the tag or you know that the file has been moved
| in the hierarchy.
|
| Of course, I'm no filesystem expert, so I probably got a few
| things wrong. I welcome smarter people to tell me how I am wrong.
|
| [1]:
| https://lobste.rs/s/ydno8w/tree_structure_file_systems#c_njg...

| jeddy3 wrote:
| > 2. Create a directory in a well-known place called `tags/` or
| whatever you want.
|
| > 3. For every tag `<name>`, create a directory `tags/<name>/`
|
| > 4. Hard-link all files you want to tag under each tag
| directory that apply.
|
| Does this only give you one level of tags? (i.e. you can't
| combine tags when exploring)

| ghoward wrote:
| It does, unfortunately. But you probably could implement
| finding something with two tags with some command-line fu.
|
| My first crack at it is this:
|
|     find -L tags/tag1 tags/tag2 -xtype l | sort | uniq -d
|
| I don't know if that would work (on mobile; can't test), but
| from reading the man pages, it seems like it would do the
| trick.

| hyperpallium2 wrote:
| MS was going to use a relational filesystem at one time.

| spicybright wrote:
| I've been hearing this for years, but I've never had the mental
| model for it. Maybe I just have to try it out.
|
| There's a very neat project (can't remember the name, and some
| details below may be wrong) that mounts a FUSE filesystem that
| uses only tags.
|
| Paths have identical syntax to hierarchical ones, only each
| segment of a path is a tag.
|
| so `/document/taxes/2020` would return a collection of files, but
| you could also write `/taxes/2020/document` and it would mean the
| same thing.
|
| The advantage is you can use standard unix tools like ls, mv, cp,
| etc. as well as R/W using standard programs (like how you'd
| export an image.)
|
| I'm sure there are edge cases to it, but I always found the idea
| very neat. Anyone remember what I'm talking about?

| daenz wrote:
| I'm the author of Supertag, which is what you might be talking
| about. https://github.com/amoffat/supertag

| sdeer wrote:
| You are probably thinking of https://github.com/cfagiani/cotfs
| which allows ln but not mv or cp.

| spicybright wrote:
| Ah, that's right. Thank you for the link!

| webmobdev wrote:
| > _I've never had the mental model for it._
|
| This is one way to think about the hybrid model of folders +
| tags that is currently available:
|
| 1. Folders tell you _where_ the file is stored.
|
| 2. Tags tell you _what_ the file is.
|
| So basically use the _Tags_ to add more data (metadata) about
| your files, so that if you forget where the file is, you can
| still search for it by what is in it. This also slightly helps
| in easing the burden of trying to figure out where to store a
| file (e.g. _"Do I put a home video in my 'Videos' folder or my
| 'Personal' folder?"_ - if you tag it properly, you can put it
| in either, as you can use the tags to figure out where the
| video is later).
|
| Examples:
|
| 1. In Documents folder - _"mom-2020.xls"_ (tags => docs, tax,
| finance, mom, unfiled).
|
| 2. In Videos folder - _"newyear bash.mp4"_ (tags => video,
| family, 2021, home).

| oneplane wrote:
| While I don't remember that specific project's name, I do
| remember that a lot of that was possible because files in a
| filesystem often aren't just some hierarchical folder
| structure. There is often a tree-like structure to be able to
| find a single file quickly amongst many files, but there was no
| technical reason why it wouldn't also support off-tree indexing
| based on tags.
|
| I imagine one of the problems you'd run into is mass updates
| since deleting a directory with many files in it below a single
| tree node is very much limited to that part of the filesystem's
| tree, but doing it with tags causes updates all over the place.
|
| It does remind me of this more metadata-like FS:
| https://github.com/marook/tagfs
|
| As well as the thing Apple does in Finder where it has a tag
| database and dynamic 'tag' directories where it shows you
| everything that was tagged with that tag. Then the search
| function would allow selecting files based on tags so you can
| do many-tag searches and only find the files that match all of
| them. I think that one is based on an in-filesystem metadata
| stream.

| remus wrote:
| > I've been hearing this for years, but I've never had the
| mental model for it. Maybe I just have to try it out.
|
| Gmail uses tags rather than folders, nice easy way to have a
| play if you've already got a google account.
|
| Conceptually, I think of it as basically being able to have
| lots of different hierarchical folder structures applied to a
| set of objects at the same time. Or to flip it round, a normal
| folder structure is like using tags but where you are limited
| to one tag (the folder that holds the object) per object.

| thunderbong wrote:
| Not exactly. In Gmail, tags are hierarchical.
|
| So, if I have a tag called 'projects', and another tag called
| 'newproject', the overall tag is 'projects-newproject'.
|
| If I try to see everything under 'projects', it'll only show
| me the ones tagged 'projects' directly, not the ones under
| 'newproject'.

| joshuamorton wrote:
| I think what you're seeing is the opposite: tags aren't
| hierarchical. Tags can be organized hierarchically, for
| organizational purposes, and you can use rules to ensure
| that everything tagged "projects-FOO" is also tagged
| "projects", but the tags as they apply to tagged objects
| aren't hierarchical.
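
The rule joshuamorton describes (anything tagged "projects-FOO" is also tagged "projects") can be sketched as a small normalization step over flat tags. This is a minimal illustration, not how Gmail works internally: the function names are hypothetical, and the "-" separator is simply taken from the example in the thread.

```python
def expand_tag(tag, sep="-"):
    """Return the tag plus every ancestor implied by its name."""
    parts = tag.split(sep)
    return {sep.join(parts[:i]) for i in range(1, len(parts) + 1)}


def normalize(tags, sep="-"):
    """Apply the 'child implies parent' rule to a flat set of tags."""
    expanded = set()
    for tag in tags:
        expanded |= expand_tag(tag, sep)
    return expanded


def items_with_tag(items, tag, sep="-"):
    """Names of items whose normalized tag set contains `tag`."""
    return sorted(
        name for name, tags in items.items() if tag in normalize(tags, sep)
    )


# Hypothetical mailbox: message name -> tags applied by the user.
mail = {
    "kickoff": {"projects-newproject"},
    "budget": {"projects"},
    "lunch": {"social"},
}
```

With the rule applied, `items_with_tag(mail, "projects")` finds both "budget" and "kickoff", while the stored tags stay flat. One caveat of this sketch is that any tag that merely contains the separator would also be split.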
| maxpro wrote:
| This is only if you use it hierarchically. No one prevents
| you from having a tag projects and another tag newproject.
| Your emails just have both tags assigned and you are good
| to go

| DurhamPete wrote:
| Fuse tagging: TMSU. At tmsu.org.

| rkagerer wrote:
| 10 years ago I was prototyping something very similar for
| Windows.
|
| If I recall, CBFS and Dokan were the closest things you had to
| FUSE on Windows back then. Alternatively considered emulating a
| network drive. Like you pointed out, it had to be transparent
| to your existing software's Open / Save dialogs (although there
| was some effort to hook the standard ones to give users a place
| to apply tags when saving).
|
| We've been stuck in the same old directory paradigm for a long
| time. There are some use cases where the traditional
| hierarchical approach is desirable (e.g. when you need to
| "visit" a set of files exactly once, like to browse through a
| folder to clean it up, enumerate for backup, calculate sizes,
| etc). But it's a constraint when a file belongs in more than
| one place.

| tacticalmook wrote:
| The downside of folders is trying to figure out where things
| belong in the hierarchy, or trying to update that hierarchy to
| a new standard.
|
| The downside of tagging is you still need to establish
| conventions to ensure things can be found again, but
| enforcement of your conventions is harder.
|
| Exploring, learning, and using an unfamiliar folder hierarchy
| is easier than exploring, learning, and using an unfamiliar
| tagging methodology.
|
| But manually searching for something in somebody else's tagged
| data is easier than manually searching for something in
| somebody else's folders.

| johnchristopher wrote:
| > The downside of folders is trying to figure out where
| things belong in the hierarchy, or trying to update that
| hierarchy to a new standard.
|
| That's why I stick to Documents/{folder1..folderN} and
| folders don't have hierarchical sub folders, just contextual
| folders. Eg: Documents/taxes 2021/{invoices, stuff},
| Documents/taxes 2021/, Documents/Cthulluh Roleplaying/{pdf
| files of characters}, Documents/Covid vaccination
| certificates,
|
| Yes, it's messy but I don't have the mental burden of
| holding a tree in my head or a tagging system.

| Andrex wrote:
| I try not to go more than three folders deep (starting from
| ~). I don't mind a lot of files in a folder, search helps
| me with that.

| johnchristopher wrote:
| My current setup is: Documents, owncloud, Downloads, Dev,
| Media and tmp. No desktop. At work I have an additional
| git folder.

| Kalanos wrote:
| Tags are the lazy man's schema

| Razengan wrote:
| One of the best uses of tags is to let a file effectively exist
| in multiple "folders"
|
| For example I have folders of screenshots named after various
| shows and games, and I use tags to further organize the images
| based on their suitability for different "reactions" on online
| forums :)
|
| So naturally I have hundreds of tags but macOS doesn't seem to
| keep up with that many and after a certain point it feels like
| Apple have forgotten about tags and improving their integration
| into the system.

| polote wrote:
| (2017)

| tiberiusteng wrote:
| If using Windows, use Everything (https://www.voidtools.com/) and
| when naming a new file treat the filename as a tag set.
|
| Then Everything could become your shell ...

| dusted wrote:
| Reminds me of an old project I did where I built a fuse system
| for this, files still lived on a normal ext3 fs, but the overlay
| presented files as tag paths, so you could access, for instance,
| a movie like /tagfs/movies/year/1999/matrix.avi or
| /tagfs/movies/genre/scifi/matrix.avi
|
| There was a special path /tagfs/untagged/ which listed any files
| that didn't have at least one tag.
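
The overlay dusted describes can be pictured as a pure mapping from a tag index to virtual paths. The sketch below is not a FUSE filesystem and is not dusted's code; `virtual_paths` and the sample index are hypothetical, and tags are assumed to be stored as slash-separated strings such as "movies/year/1999".

```python
def virtual_paths(index, mount="/tagfs"):
    """List the virtual paths an overlay would expose for a tag index.

    `index` maps a real file name to a set of slash-separated tags.
    Files with no tags appear under <mount>/untagged/.
    """
    paths = []
    for filename, tags in sorted(index.items()):
        if not tags:
            paths.append(f"{mount}/untagged/{filename}")
        for tag in sorted(tags):
            paths.append(f"{mount}/{tag}/{filename}")
    return paths


# Hypothetical index: one tagged movie and one file without tags.
index = {
    "matrix.avi": {"movies/year/1999", "movies/genre/scifi"},
    "notes.txt": set(),
}

for path in virtual_paths(index):
    print(path)
```

A real overlay would answer directory listings and reads lazily instead of enumerating every path up front; the enumeration here only shows that one stored file can appear under several tag paths at once.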
| dsr_ wrote:
| The difference between tags and hierarchies is this: if you don't
| know what you're looking for, hierarchy can guide you.

| Rygian wrote:
| That's just bad UX of the tag browsers.

| arduinomancer wrote:
| Couldn't you just look at a list of tags?

| dsr_ wrote:
| Sure. How many tags are in your system?
|
| The number of tags should be more or less similar to the
| number of non-leaf nodes in a hierarchy, or else you aren't
| capturing the same information. Any tag that applies to more
| than half of the files is probably useless. On my blog, the
| tags "blog" and "technology" are definitely useless. That's
| fewer than 250 entries and already it has cruft.
|
| Were people consistent when they added tags? Does your system
| suggest tags automatically? Is this actually a full-text
| search minus stop words? Is there a librarian who cleans up
| after you and merges tags that have the same meanings? Would
| the full-text search be more useful than tags?

| jasode wrote:
| The previous 2018 discussion had comments from the author:
|
| https://news.ycombinator.com/item?id=16763235

| [deleted]

| btrettel wrote:
| From the article:
|
| > ### Hard and soft links as non-solutions
|
| > Soft links have a nearly opposite set of problems as hard links
| - soft links can span different file systems, but they generally
| don't track the target files getting moved or renamed (except
| that Windows provides such a system service, but may not be
| reliable); while hard links are all indistinguishable, some
| application software behaves differently on soft links than on
| real hard-linked files. In spite of these problems, hard and soft
| links still require an exponential amount of effort to classify a
| set of files in multiple ways, and require the user to manually
| remember all the possible paths that a file can be reached from
| (important when editing and removing files, not important when
| browsing/retrieving files). They are non-scalable kludges
| compared to true tagging.
|
| As a fan of hierarchies for file organization, and as someone
| with quite literally thousands of soft links, when I read things
| like this, I don't know if the person arguing against links has
| tried this approach. It works totally fine for me.
|
| Yes, with soft links, moving the original file breaks the links.
| I wrote a fairly simple bash script to automatically fix these
| for my reference PDF files. It works because each PDF file I save
| has a unique file name. So figuring out where the links need to
| point is pretty simple. That makes me a "power user", I guess,
| but the author is at least at the same level and I think could
| figure it out.
|
| With respect to the "exponential amount of effort to classify a
| set of files in multiple ways", I guess the author is referring
| to navigating the hierarchy to link a file in multiple places? I
| use tagging at my work, and I personally find scrolling through
| my list of about 200 tags to be comparable in terms of time to
| navigating through a hierarchy. The bottleneck is the human.
|
| Edit: Here's another article by someone who is not a fan of
| links: https://karl-voit.at/2018/08/25/links/

| livrem wrote:
| I use hard and soft links (mostly the former for files and the
| latter for directories) and I think it works very well in
| practice despite the theoretical problems brought up.

| lmilcin wrote:
| Or better idea. Use directed graphs. Directed graphs are strictly
| a superset of both tree and tag (set) functionality.
|
| Tags (sets) can be thought of as a special case of a directed
| graph. You make the tag a node in the graph and you make files
| also nodes and have an edge pointing in the direction of the tag
| node.
|
| You can then do graph queries to find files tagged in a certain
| way.
|
| But graphs offer so much more.
|
| Because they are a superset of both trees and sets, you can use
| them to represent both, _at the same time_.
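
A minimal sketch of the "tags as graph nodes" idea from the comment above, assuming a plain adjacency dictionary; `reachable`, `files_reaching`, and the sample graph are hypothetical. Files and tags are both nodes, edges point from a file toward its tags (or toward other files), and a file counts as tagged if the tag node can be reached from it.

```python
def reachable(graph, start):
    """All nodes reachable from `start` by following directed edges.

    The `seen` set keeps the walk finite even if the graph has cycles.
    """
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        for nxt in graph.get(node, ()):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen


def files_reaching(graph, files, tag):
    """Files from which the given tag node can be reached."""
    return sorted(f for f in files if tag in reachable(graph, f))


# Hypothetical graph: edges point from a node toward what it belongs to.
graph = {
    "report.pdf": {"taxes", "notes.txt"},  # also links to a related file
    "notes.txt": {"report.pdf"},           # which links back (a cycle)
    "taxes": {"finance"},                  # a tag nested under another tag
    "photo.jpg": {"family"},
}
files = ["notes.txt", "photo.jpg", "report.pdf"]
```

Here `files_reaching(graph, files, "finance")` returns both "notes.txt" and "report.pdf": the first reaches the tag through its link to the second, and the second reaches it through "taxes". The same structure holds a tag hierarchy and flat tags side by side.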
|
| It is not very useful to have a lot of things tagged the same way
| because you end up with just a long list of things. Whereas in a
| graph you could say that you want to find objects from which you
| can reach a certain tag node and all these objects can still have
| their own structure and even be part of multiple structures.
|
| For many years I had this idea to build my PIM where I could make
| arbitrary nodes being anything that could let me connect anything
| to anything.
|
| A node could be an email, a file, a link to an external
| website, a task, a contact, a reminder, etc.
|
| And you could connect everything to anything and have, for
| example a project that has important emails attached to it, the
| email could have attached a reminder to respond and a file that
| you want to include in response, and a note.
|
| You could browse this graph as a tree because locally it could be
| interpreted as a tree, you open the tree one level by finding all
| elements pointing to the node.

| MrLeap wrote:
| How do you transform a digraph with cycles into a tree without?
| Seems perilous.

| [deleted]

| lmilcin wrote:
| How do you traverse a symlink that points to one of its
| parent folders? How do you traverse a web of interlinked
| sites?
|
| Somehow Linux, Linux software, web browsers etc. are already
| able to deal with directed graphs.

| MrLeap wrote:
| You can handwave cycles created by symlinks away by
| ignoring them. Full traversal is possible relying only on
| the tree.
|
| If all you have is the digraph you have to rely on cycle
| detection. Transforming it into a hierarchy of depth
| requires snipping edges somewhere.
|
| I'm just gently disputing the claim that a digraph can be
| mapped 1:1 to a hierarchical filesystem. I'm open to being
| wrong.

| tdrdt wrote:
| The same is happening in webshops. Products can be assigned to
| hierarchies (categories) but in shops like Amazon it is obvious
| it is more like tagging.
| indymike wrote:
| In ecommerce, products are often presented to the shopper by
| search facets, so kind of all of the above applies.

| kzrdude wrote:
| And amazon thanks to that is opaque, I never look around for
| stuff, just search and find or not. They could have made
| something better where you can discover other stuff.

| s1k3s wrote:
| Just cancelled my Google Drive subscriptions last week due to
| this, and moved to SFTP on rented metal. If you can't 100%
| predict what I'm looking for then don't even bother. Just give me
| files and folders.

| agumonkey wrote:
| I had this idea long ago but at the same time I worry about
| things too fluid. Both for performance and for information
| efficiency. A tree can be seen as a preemptive good enough tag
| order. Some obvious dimensions like category, time will always be
| of use.

| austincheney wrote:
| For tags I use meta data, which are included as mapped text at
| the extreme end of many media formats. For example ID3 data on
| MP3 files.
|
| I have found this incredibly helpful for MP3s because I have
| thousands of them and there are many similar names. Windows
| Explorer provides columns for this data in its detailed file
| system view (not by default) which is incredibly helpful and
| trivial to customize.
|
| For everything else folders are enough. I have hundreds of movies
| on a hard disk and yet folders are enough. When I do need more
| the data I want is generated by the file system: last modified,
| file size, and so forth.
|
| What improves file usability the most for me is network access by
| meta data. For example Windows Explorer and OSX Finder are nice
| but I would rather have the exact same interface on the same
| local machine for a bunch of remote machines regardless of their
| file system or operating system. Then copy to a different machine
| is just drag and drop from one window onto another in an
| application that looks like some local OS, that windowing
| interface needs to allow sorting and filtering and search by meta
| data just like Windows Explorer. Having an application that does
| this for me has been great.

| xixixao wrote:
| I love Bear with its 1+ hierarchical tags per note. It's the
| perfect combination of both worlds for me. Would encourage the
| author to check it out (if it's any different).

| megous wrote:
| Usefulness of this is highly content specific. It maybe works
| for mp3s or videos/photos you made yourself and contain some
| metadata. I can't imagine tag based organization of all the
| random 2 mil. files I have that don't fit into these neat
| categories.
|
| I don't need access to most of these files unless I'm working on
| something relevant to them. When I work on X, I go to directory X
| and everything I need sits below X in some hierarchy. I'm never
| interested in anything that's below X unless I'm working on X. I
| don't want stuff from X to pollute some global namespace just
| because some mp3 or PDFs are present under X. That's true for
| hundreds of personal projects and tools I've made.
|
| Directory hierarchy is a pretty neat abstraction to me and all
| the tools I use already support it well.

| Rygian wrote:
| Does it mean that you will store multiple copies of a file, if
| it ends up being useful for multiple projects?
|
| About the random 2 mil. files you have lying around, the tags
| you should set should describe not the files themselves, but
| the reason why _you_ chose to keep those files.

| fmajid wrote:
| Librarians have been classifying the world's knowledge since
| forever, and they developed faceted classification systems
| (Ranganathan, 1933) to deal with these issues.
|
| https://www.researchgate.net/publication/321840994_Ranganath...
|
| A notable one is that hierarchies embed the point of view of the
| classifier, e.g. the Dewey Decimal's ridiculous classifications of
| religions (codes 200-299), making minute distinctions like 285
| (Presbyterian, Reformed, Congregational), 286 (Baptist, Disciples
| of Christ, Adventist), then putting all non-Christian religions
| under a handful of afterthought headings 292-299: 294 for
| Hinduism, Buddhism, Sikhism and other religions of Indian
| origin, 295 for Zoroastrianism and its descendants, 296 for
| Judaism, 297 for Islam and Bahaism lumped together, and 299 for
| New Age.
|
| Unfortunately, most of us do not have access to the services of a
| librarian to develop a taxonomy that corresponds to our own point
| of view then classify our files accordingly, which is why simple
| hierarchical taxonomies have endured and faceted ones are seldom
| seen beyond specialized applications like Digital Asset
| Management.

| maratc wrote:
| Dewey Decimal System has only one (often-misunderstood) task:
| organizing bookshelf space. It does not have a task of
| classifying world information.
|
| The reason that fairly narrow "Presbyterian, Reformed,
| Congregational" topic has one DDS code and an extremely wide
| "Hinduism, Buddhism, Sikhism and other religions of Indian
| origin" topic has one DDS code is simple: an average library
| has similar book-width for both topics.

| karaterobot wrote:
| > Dewey Decimal System has only one (often-misunderstood)
| task: organizing bookshelf space. It does not have a task of
| classifying world information.
|
| How is this distinct from organizing world information? If
| the goal is to shelve books in a way that adds value to the
| information-seeker, you have to arrange them according to
| some definition of similarity. That decision about what makes
| one thing similar to another encodes a view of the world.

| TheCoelacanth wrote:
| The physical shelves in a library only serve a single,
| small geographic area, not the whole world.
| tobr wrote: | An average _American_ library, you mean. | maratc wrote: | Of course, and while we are there, _an average American | Public library as of the end of the 19th century_. | | In the last century, DDS has seen its adaptations for | China, Japan and other countries which are based on the | same idea but are rather different. | ifethereal wrote: | One example of a data structure implementing faceted | classification would be the multitree [0]. Unfortunately | multitrees seem to receive far less support than 2 other data | structures it intermediates: trees and DAGs. | | k-d trees [1] are close but use cases seem to predominantly | target data with inherently ordinal (rather than nominal) | dimensions. | | Further abstraction could lead to the knowledge graph [2] or | graph databases. | | In all cases, the availability of "low-code" tools (in the | domain of single-user personal information management, at | least) seems sparse. I have been looking for some time, but the | search continues. | | [0]: https://en.wikipedia.org/wiki/Multitree | | [1]: https://en.wikipedia.org/wiki/K-d_tree | | [2]: https://github.com/JeffreyBenjaminBrown/hode | vinodkd wrote: | Some really interesting ideas in a well organized article. | Bookmarked! | nathanmcrae wrote: | I find that using Everything search (https://www.voidtools.com/) | makes me use the filesystem more like a tagging system. I still | name files and directories meaningfully, but I don't worry about | the hierarchy at all. Then when I want to get something, I just | search (parts of) the terms I want and see the matching paths | instantly. | | Using the filesystem hierarchically now feels painfully slow and | awkward. | _dain_ wrote: | I do this with fd and ripgrep, which also lets me do full-text | search through the files themselves. My filesystem has become | much flatter and coarser-grained; just a few very large | categories. | webwielder2 wrote: | Tags vs. Hierarchy are like the WFH vs. 
Office debate: something | people assume has a right answer but is actually personal | preference. | slaymaker1907 wrote: | I don't like the idea of file hashes for storing files. That | really only works well for a limited set of files, completely | ignoring things like databases, note files, etc. If you want to | generate a name, UUIDs would make way more sense. Additionally, | even for files which never change, what happens when a bit gets | flipped inside of the file? That would presumably change the hash | without the file system knowing until you try to read it. | For drives intended for consumers, you really don't want to have | more data loss than is absolutely necessary in case of minor | corruption. | | On a different note, I use Tiddlywiki with tags a lot for my | personal notes and I think a hybrid approach with tags and a | hierarchy works best. Hierarchies are useful for generating | meaningful unique names where it is nice to have the notes be | meaningful. Another thing that would be useful with a hybrid | system is to have rules like | /Skyrim/Guilds/ThievesGuild/NPC/Mercer automatically assign tags | to Mercer by virtue of his placement like {"Skyrim", "NPC", | "ThievesGuild"}. You could do this automatically via the path or | by some sort of rules engine. | | Finally, I would really like a more complex information | storage system to fully support 1, 2, and 3-tuple metadata. A | 1-tuple is a tag, a 2-tuple is a key-value pair associated with a | file, and a 3-tuple would be a relation between files with an | optional key/value (some relations would be merely a tag while | others might want a key/value). 2-tuples are obviously useful | since they exist in limited form currently via file attributes, | though they unfortunately are not exposed very well to users even | though some file systems support arbitrary key/value pairs as | attributes.
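As a rough illustration of the two ideas above (path-derived tags and 1/2/3-tuple metadata), here is a minimal in-memory sketch. Everything in it is an assumption made for illustration: the names `MetadataStore` and `tags_from_path`, the `ignore` parameter standing in for a "rules engine", and the simplification of a 3-tuple to (source, relation, target) without the optional key/value. It is not how any existing file system or tool does it.

```python
from dataclasses import dataclass, field
from pathlib import PurePosixPath


def tags_from_path(path, ignore=frozenset()):
    """Derive tags from the directory components above a file.

    `ignore` is a hypothetical stand-in for a rules engine: components
    listed there are not turned into tags.
    """
    directories = PurePosixPath(path).parts[:-1]  # drop the file name itself
    return {d for d in directories if d != "/" and d not in ignore}


@dataclass
class MetadataStore:
    """Toy store for 1-, 2-, and 3-tuple metadata (hypothetical design)."""
    tags: set = field(default_factory=set)       # 1-tuples, stored as (file, tag)
    attrs: dict = field(default_factory=dict)    # 2-tuples, (file, key) -> value
    relations: set = field(default_factory=set)  # 3-tuples, (source, relation, target)

    def add_file(self, path, ignore=frozenset()):
        # A file's placement in the hierarchy assigns its initial tags.
        for tag in tags_from_path(path, ignore):
            self.tags.add((path, tag))

    def set_attr(self, path, key, value):
        self.attrs[(path, key)] = value

    def relate(self, source, relation, target):
        self.relations.add((source, relation, target))

    def tags_of(self, path):
        return {tag for f, tag in self.tags if f == path}

    def files_with(self, *wanted):
        """Return the files that carry every tag in `wanted`."""
        files = {f for f, _ in self.tags}
        return {f for f in files if set(wanted) <= self.tags_of(f)}


store = MetadataStore()
mercer = "/Skyrim/Guilds/ThievesGuild/NPC/Mercer"
store.add_file(mercer, ignore={"Guilds"})          # tags come from the path
store.set_attr(mercer, "race", "Breton")           # illustrative key/value pair
store.relate("/src/main.py", "imports", "/src/util.py")
print(sorted(store.tags_of(mercer)))
```

With `ignore={"Guilds"}` the derived tags match the set given in the comment; without it, every directory component becomes a tag. A real system would also have to decide what happens to path-derived tags when a file is moved.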
| | 3-tuples are a little more niche, but I think it would be useful | for keeping track of stuff like which file imports another. Humans | probably wouldn't generate the metadata in the previous example, | but it would help tools play nicely together since you could have | a small static analysis tool that simply updates this metadata, | which could then be used by other tools or by users directly. One | of the greatest things about file systems is that they allow for | separate tools to interact with the same data in a structured | way. | rasengan0 wrote: | I ditched everything for a simple Zettelkasten in PlainVanillaVim | all on a SdCard | | EverythingIsAFile NoExtensions CamelCase | | hit gf on a word/file/tag and enjoy freedom | | :map gf :e <cfile><CR> | | https://vimhelp.org/editing.txt.html#gf | | sprinkle in RipGrep Fzf CtrlP as needed | klodolph wrote: | I like to organize files in a similar way but use spaces in | filenames. I find it a lot easier to read filenames that way, | and in the shell, <tab> will add all the necessary quotes / | backslashes. | megous wrote: | This is tempting, at least for note taking. :) | jcelerier wrote: | I remember trying to use tags to tag my media collection a decade | ago and failing miserably ___________________________________________________________________ (page generated 2021-11-07 23:00 UTC)