[HN Gopher] Johnny Decimal ___________________________________________________________________ Johnny Decimal Author : ralgozino Score : 330 points Date : 2023-06-13 11:03 UTC (11 hours ago) (HTM) web link (johnnydecimal.com) (TXT) w3m dump (johnnydecimal.com) | noirscape wrote: | Decimal organization is a good system... but it's explained here | in a completely obnoxious way that makes you want to hate it. | | Firstly, I strongly recommend just reading up on Dewey Decimal[0] | (which is what JD cribs almost everything conceptually from), | there's a decent explanation about it on Wikipedia. Should help | you "get" the categories you might want to make a bit more. | | Secondly, don't marry yourself to JD's limitations. The site likes | to evangelize about some things that _really_ aren't as | important as you might think. Feel free to ignore something if it | doesn't work for you - in particular the "no subfolders" rule | might just... not be worthwhile to follow. | | Personally I've always pretty much ignored this rule - if you | look at Dewey, the left hand of the number is meant to be a | classification for the broad _category_ while the number on the | right is meant for the broad _project_. In other words, applying | a decimal organization system to specific files? Yeah, not what | it's meant for, don't do that. | | Even in a library, where Dewey is used, an individual book's Dewey | Classification isn't actually unique to that book. For example | all books on MySQL will have the same Dewey Class. | | Build it as a system that works for you, don't try to forcefully | refit your system to match the explanation of this website. Also, | _don't_ use it for small projects. That'll just make it a bigger | mess than it's worth. Stick a small project in a bigger folder | system, it'll work way better that way. | | As for mental mapping - keep a readme file to just list the broad | categories in the top of the structure, it'll help a lot. 
The | site recommends spreadsheets but really, that's wayy overkill and | will just cause dumb overhead each time you have to add a file. | | [0]: https://en.wikipedia.org/wiki/Dewey_Decimal_Classification | bbor wrote: | I'm conflicted here, because you made some _great_ points that | I'm excited to think about /try, but you're so angry! I guess | I'm just joining an ongoing decimal-themed flame war from many | years ago, lol. For example, this: | | " Dewey Decimal[0] (which is what JD cribs almost everything | conceptually from)" | | seems a little uncharitable! It's pretty openly a | specialization/variation on DD, and I'd be surprised if many | people on here (or really in the culture at all) weren't | vaguely aware of DD from their school days. So "crib" seems a | little pejorative imo | | Re: substance, I'd be interested in a clarification if you find | the time: _why_ do codes for individual files bother you so? | | You need to differentiate them somehow, and the first pure-DD | solution I found doesn't apply at all: | | " we also add to the end the first three letters of the | author's last name (or, if no author is given, then the first | three letters of the title). In our example, the author is | James Brock, so BRO is added to the end of the Dewey call | number to get 595.789/BRO." - | https://www.oakland.edu/Assets/upload/docs/SEHS/ERL/Document... | | It just seems plainly helpful to have numbers before files, | especially for ones that you'll be returning to and/or | recreating for other projects a lot, e.g. documents within your | usual project management system. | noirscape wrote: | Eh, it's by and large annoyance with a lot of these "here's | how to get organized" guides. I have a bad tendency to kinda | accrue files and as a result need these types of | organizatorial systems to make sense of it all. 
| | The problem is that rather than being _descriptive_ (as in | "this works for me, see what works for you"), lots of these | organization guides are _prescriptive_ , which helps pretty | much only the person who wrote them to begin with. It gets | really grating after a while, especially if they offer things | like templates that are a pain to actually refit for personal | use. (Which to be fair, JD doesn't do, but the author very | clearly has that type of workflow in mind - older versions of | the JD website straight up recommended using airtable for | organizing stuff, template iirc included.) | | My annoyance with numbering individual files in JD in this | case is pretty much the result of "nobody else works in | _your_ Dewey decimal system ". Like, start working with any | kinda enterprise-y management tool and you'll _very quickly_ | learn that a lot of software is not written with JD in mind | because they assume control over an entire folder and | organize it in a way that makes sense to them. That is a | problem that often combines with when you start receiving | external files which are a folder of dependencies with one | file you can open in the aforementioned tool. Yes, you can | often spend time to edit the internals to "correct" that | document to the Dewey decimal system, but that creates extra | overhead and can also sometimes gravely annoy the other | person if the document has to be send back and forth a couple | times. | | In that case, it's just way more straightforward to assign a | unique ID to the parent folder instead of spending upwards of | 30 minutes fiddling with every incoming file. | | As for adding author last name - that's just for shelf | organization in libraries, libraries sort all books on | author/title alphabetical level. DDC just adds another | organizational layer on top of that for scientific books | (most fiction and (auto)biographies usually ends up organized | outside Dewey entirely for practical reasons). 
You can have | multiple 595.789/BRO in a single library (dictionaries for | example with multiple books will have the same DDC). | piokoch wrote: | If only life was that simple that it could be enclosed into | series of two digits categories. | | The problem with such strongly hierarchical system is that it | fails if there is some document, note, picture, etc. that would | be useful to keep in multiple locations. Obviously we can | introduce links between objects, but I believe tags are more | comfortable to use. | | Hierarchical system, folders are artifacts of the physical world | in which a single object, tool, pipe, screw, book cannot be in | two places at the same time. In the abstract world of computers a | note about new game could be in #games, #fun, #to-check, | #interesting-ideas, #great-graphics, etc. | Imnimo wrote: | But step 2 is to just "Make sure the buckets are unambiguously | different."! How hard could that be? \s | bachmeier wrote: | Your argument seems to come up a fair amount in these | discussions. In the end, you have to deal with storage of many | items, and you can either browse or search. The browse approach | requires you to know where you'll be browsing in the future. | The searching approach requires you to know what to search for. | No system is going to deliver all relevant documents, but you | can do a good enough job with a hierarchical system plus | search. | hotsauceror wrote: | I think this is exactly right, and it is a facet of the same | discoverability issues that crop up when people talk about | GUI vs CLI - one is more useful when you're discovering, and | one when you are searching. Tags are really set-based search | operations like a SQL query, but the 'primary key' is the | filename, and if you knew that you'd just search for it. | You're rarely going to have a tag or attribute that can | pinpoint a single document. 
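To illustrate the comparison of tags to a set-based query made just above, here is a minimal Python sketch. The `tag_index` mapping, the file names, and both helper functions are hypothetical and are not taken from any tool mentioned in the thread. The point it shows is that a tag query returns a set of candidates and only rarely narrows down to one document:

```python
# Hypothetical tag index: tag -> set of file names. All names are made up.
tag_index = {
    "games": {"doom-notes.md", "zelda-review.md", "engine-ideas.md"},
    "to-check": {"zelda-review.md", "tax-form.pdf"},
    "great-graphics": {"zelda-review.md", "engine-ideas.md"},
}


def files_with_all(index, *tags):
    """Return files carrying every given tag (set intersection, like SQL AND)."""
    sets = [index.get(tag, set()) for tag in tags]
    return set.intersection(*sets) if sets else set()


def files_with_any(index, *tags):
    """Return files carrying at least one given tag (set union, like SQL OR)."""
    return set().union(*(index.get(tag, set()) for tag in tags))


# Two tags still leave more than one candidate file.
print(sorted(files_with_all(tag_index, "games", "great-graphics")))
# A third tag happens to narrow it to one file, which is the rare case.
print(sorted(files_with_all(tag_index, "games", "great-graphics", "to-check")))
```

Under this model a file name plays the role of the primary key, so if you already know it you would search for it directly rather than intersecting tags.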
| floren wrote: | I built a hierarchical note-keeping system for myself and have | been intending to add tags to it, but I've never gotten around | to it -- because the hierarchy is generally "good enough" after | I added two features: linking, and grep. | | Grep is self-explanatory. Linking works like hard links in | Unix, where the same note appears as a child of multiple | different parents (added a command to find "orphans" in case | you unlink it from everywhere). | | At this point I might not even bother adding tags. | jkubicek wrote: | > Linking works like hard links in Unix, where the same note | appears as a child of multiple different parents (added a | command to find "orphans" in case you unlink it from | everywhere). > At this point I might not even bother adding | tags. | | What you described with hard links is exactly how I use tags, | so that would satisfy my need for tags as an organizational | tool. | dclowd9901 wrote: | While I haven't gone so far as to attribute a numbering system | to my organization, I have done well at organizing things into | red-line distinctive categories. The idea is to create | categories that _cannot_ overlap. If there's any commonality | between them that's not useless, they need to be grouped at a | higher level. | | As an example, if you're organizing your toolbox, you don't | mark a drawer "hand tools" because it's a useless | categorization. You mark one "socket tools" which will include | everything from the sockets and wrenches themselves to adapters | that connect a socket to an impact wrench (but an impact wrench | does not go in there because it is not _exclusively_ a socket | tool). If it really does come down to something that may really | fit in two categories (hey, there's always exceptions), you put | your mindset in the place of yourself when you want to look it | up: what's the most common situation in which you'll be looking | this thing up? | edpichler wrote: | I solved this problem with hard links. 
I became fan of | Hierarchical systems, it just works. | MarceColl wrote: | I think that's the whole point of this system, when you have | infinite tags it's impossible to maintain a correct taxonomy, | you add #great-graphics to this game, but now you have to | backfill it to all other games, or in the future you may miss | them. | | They created this so the hierarchy is unambiguous (as much as | possible), you want a document, you are two steps away from it | in an easy to find way. | | tag systems have far too much maintenance and adding a new tag | is almost impossible to do exhaustively so you have a lot of | partial tags. | cjbprime wrote: | > you want a document, you are two steps away from it in an | easy to find way | | This isn't a response to the parent commenter's point, right? | They were describing how many projects have items where a | resource easily fits within the scope of N different | categories, at which point they become max N steps away from | it, not max 2 steps. | horsawlarway wrote: | Two thoughts: | | 1. This is much, much less likely with the enforced limits | on categorization in the post. | | 2. No - you are still 2 steps away. Make a choice about | where that item lives. If it's shared across many | categories, maybe you really need a distinct category like | "Ambiguous" or "Shared" | albedoa wrote: | > No - you are still 2 steps away. Make a choice about | where that item lives. | | You misunderstand. The max N steps are at the point of | recall, not categorization decision. | chaxor wrote: | This is a great point about tag maintenance *if you have to | make the tags yourself*. However, if you have a simple ML | system that you can run to categorize your files and pull out | good single word descriptors that have a large explained | variance over your files, you can run this and check the tags | that are constructed. 
| | I think there's a good way forward that uses typical | hierarchical Johnny.Decimal filesystems, with an overlay | filesystem with tags that can update the tags every so often | based on the content in the files. Obviously letting the user | have a hand in this via a TUI/gui would be helpful for | choosing tags for which they're comfortable. | | Unfortunately I haven't settled on a good filesystem with | tags (how to do this with ZFS?) or how to interact with it as | a network filesystem served to many different OS (cifs with | tags?). | MarceColl wrote: | It doesn't seem to me like a simple ML system, it needs to | be able to extract tags from all kinds of filetypes (video, | games, images, assets, text, ...), at a decent speed and | then it has to assign tags to what you would also assign, | because if it doesn't do that then it's even worse, because | you can never find anything as your mapping and the ML | mapping would not be the same. | esperent wrote: | > However, if you have a simple ML system | | Or the old-school method, a community of people with | tagging powers and a few moderators to do sanity checks. | majkinetor wrote: | #great-graphichs problem is not something category based | system will solve either, as you have the same problem. | Nothing will, to be honest, maybe AI eventually and even it | can't do it in all the things. | | > you want a document, you are two steps away from it in an | easy to find way | | This is not how people work in general. This kind of thing | might be OK for institution for taxonomy like collections. | jasode wrote: | _> Hierarchical system, folders are artifacts of the physical | world in which a single object, tool, pipe, screw, book cannot | be in two places at the same time._ | | Many think hierarchies come from limits in the physical world | but that's not what's happening. Yes, that's some of the cause | but does not explain all of it. 
| | The deeper rooted reason is that hierarchies are a convenience | to _aid the human mind_. Even without any limitations of | physical shelves, the brain likes to: | | - notice the relationships from the general-to-specific and | navigate them with spatial cues of dirs | parent-->child-->grandchild-->etc | | - group related items together -- using spatial cues of _moving | file icons_ into a file system folder | | The world the blog essay is working in is the OS _file | system_. The various files have to be put _somewhere_ on the | file system. Since putting hundreds/thousands of files into a | single flat folder is useless, one creates some child | subfolders to organize them in some way. | | The tagging system assumes a different mechanism (e.g. a | separate "database" of tags which filesystems like Microsoft | NTFS and Linux ext4 do not have natively.) This happens above | the native filesystem. (Incidentally, by placing a file into a | subfolder, the name of that folder and the names of parent | folders above it act as an _"implied set of tags"_ for free.) | | That said, both hierarchical folders and tags solve different | needs. Also, hierarchies simulate/approximate "tags" by | "virtual folders" and 1-to-n softlinks. Likewise, tagging can | simulate "hierarchies" via compound-multi-word-tags. | II2II wrote: | The article points out that it is too easy to create duplicate | files. Part of that ties into what you're talking about. Part | of that deals with how people deal with files (e.g. few people | use versioning outside of software development). The article is | suggesting that a strong hierarchical system will help to avoid | that problem. | | Of course the other problem with tags is management. Placing | something into multiple relevant categories involves more | effort. 
Failing to place something into a relevant category | makes it harder to find since you are now dealing with either a | flat file namespace (worse yet, a disorganized one) or a flat | tag namespace. In theory, some of this can be handled by | letting someone else handle the tags (e.g. the creator, the | publisher, or the seller), but that has its own problems since | there is frequently a conflict of interest (e.g. irrelevant | tags are applied to increase the visibility of a product). | | At the end of the day, we have to accept there is no perfect | system of categorization. Some will prefer hierarchies. Some | will prefer tags. From the tone of the article, it is clear | that they prefer hierarchies. | jen729w wrote: | > At the end of the day, we have to accept there is no | perfect system of categorization. Some will prefer | hierarchies. Some will prefer tags. From the tone of the | article, it is clear that they prefer hierarchies. | | I'm the Johnny who wrote Johnny.Decimal and this is basically | it. | | The OP clearly isn't one of the people for whom finding JD is | a massive mental relief. I know those people exist: they | write and tell me. | | Others find the idea baffling. Stupid, even. That's fine. If | this helps you, enjoy it. If it doesn't, use something else. | dxs wrote: | Thank you. "If this helps you, enjoy it. If it doesn't, use | something else." is a sane, humble, and adult attitude. You | have my respect. | syntheweave wrote: | I spent some time studying the world of professional home | organization(as seen on Youtube) and the core concepts always | come down to these: | | * Allocate space up front in the form of containers | | * Position containers around workspaces | | * Use containers appropriate to the type of object and its | use(e.g. "rounds in rounds" - put round bottles on turntable | racks so you can spin to access) | | * Duplicate objects you need to use in multiple locations, e.g. 
| scissors for the kitchen and for the office | | * Label spaces where things belong | | And the key thing to it is that this isn't a hard rule like | always organizing hierarchically or always labelling. The | hierarchy helps compress space(that's why books and folders are | powerful) and the labels help define uses, but in many | instances, the level of organization you need is an open bin | with some dividers - the drawer organizer, cube storage, | cardboard box, book bin, cafe tray etc. | | Computer file systems are somewhat resistant to unlabelled | open-bin storage because that means you're allocating with less | precision, but I think everyone in practice knows that they | will shove things in "Documents" or "Downloads" and just | periodically purge it. | cnity wrote: | Godel's incompleteness theorems strike again. | horsawlarway wrote: | Personally - I've come to the absolute opposite opinion. To be | overly blunt: | | "Tags fucking suck." | | They are literally the worst possible way to store and organize | your information, and they are only useful when you just want a | random sampling of a category - not a specific document or | piece of information. Ex: Great for social media or looking at | old photos or just playing a song from a genre you like, bad | (fucking terrible) for organization and structure. | | --- | | Hierarchical structures have downsides, but the exact thing you | complain about (artifacts of the physical world) is exactly | their strength... You have a body that is adapted to the | physical world - routing and navigation through a series of | ordered steps is a _VERY_ well developed human skill. We are | primed to be able to remember things like: | | - Go left at the tree, | | - Straight until you hit road | | - Right at the road | | - continue until you hit a red house with a big garden | | - etc... 
| | That skill set maps directly into the hierarchical system of | folder: | | - Find the "documents" folder on the desktop | | - scroll down to "my super sweet project" | | - open that folder | | - Find the "icons" folder | | - open it and double click "exactly_the_thing_you_wanted.jpg" | | ------ | | You can absolutely still make horrible, unorganized messes - | but if done well (ex: this article is actually a fairly good | system) it's a much, much better system than tags. | douglee650 wrote: | Can't have both ... tags and hierarchal? | richardjam73 wrote: | Yes you just use a wiki with a traditional tree structure | and search. I use Obsidian which lets you do just that. | ape4 wrote: | I hope hierarchical aren't disallowed sometime in the future | - I could see it happening for phones. | PaulHoule wrote: | I've thought a bit about tags++, that is adding some logical | and not-so-logical features to them. | | For instance there are ideas from OWL where you could define | a category instead of other categories and their attributes, | for instance tag D could be the union of tag A and tag B and | the complement of tag C. | | _Implication_ is also useful both as a way to implement | subclassing but also containment relationships. For instance | on Danbooru a character that has several forms would have the | various forms of the character imply that character and the | character would imply the media property that the character | comes from. | | I am looking at what a tagging system looks like in the | transformer age and one key idea is a kind of three value | logic around tags which can be in a "positive", | "indeterminant" and "negative" state. 
If you are training a | machine learning system to auto tag you will need (1) a | number of examples where a tag does not apply (the tag not | being applied is not evidence that the tag doesn't apply, | poor coverage of negative examples is one reason why YouTube | recommendation is worse than TikTok) and (2) to deal with | cases where the ML model tags something incorrectly. If the | model tagging something puts it in an indeterminant polarity | and that result can later be switched to negative or positive | that is a great way to manage the situation. | esperent wrote: | > ideas from OWL | | What is OWL? Except for a good lesson in why not to use | common and hence impossible to search for words as names | for a project. | gglitch wrote: | https://en.wikipedia.org/wiki/Web_Ontology_Language | PaulHoule wrote: | They used to call the semantic web that OWL is a part of | "Web 3.0" which failed to make an impression or was | overwritten with the "Web3" moniker for NFT grifts by | exceptionally ignorant people. | | I learned OWL the hard way, I had been involved with the | semantic web for 10+ years on and off and didn't meet | anyone who knew how to do meaningful modeling with OWL | until last year, and that even includes famous academics | who've written books on it. | gglitch wrote: | OWL and RDF interest me immensely, intellectually. I've | never been positioned to use either one professionally, | but it looks fascinating. Is there a shorter path to | successful modeling than the hard way? Is there a good | source on this? | enord wrote: | RDF is not magic and OWL is... showing its age. | | If you are willing to eat the up-front cost of | coordinating global resource identification (a daunting | task, make no mistake), you get non-trivial dataset | integration almost for free. Imagine if concatenating two | ginormous JSON documents describing different aspects of | the same entity would amount to a useful merge into a | single combined JSON. 
If you Need this with a big N, RDF | has no alternative. | | The rise of SSDs has also more or less obviated the need | for clustered indexes as a practical performance | consideration. For the small price of trebling your | storage footprint, commodity RDF triplestores will index | _all_ your attributes/columns without a schema (usually | red/black or equiv). Will it scan an integer PK over 100b | records as fast as postgres? No. Is that use case in your | hot path? Also no (most likely). | | Edit: as for OWL, just take the plunge into rule based | inference directly. From forward chaining inference (if | you want performance and decidability guarantees) all the | way up to full blown prolog or | [miniKanRen](http://minikanren.org/) (if you want it in a | library in your runtime of choice) | munificent wrote: | Your example about navigating roads has nothing to do with | hierarchy. And, in fact, most road networks are _not_ | hierarchical and the interconnectedness is their strength: | | https://en.wikipedia.org/wiki/A_City_Is_Not_a_Tree | | Your brain doesn't organize information hierarchically. Let's | say I ask you: | | 1. Name a band that starts with "B". | | 2. Name a band from England. | | 3. Name a rock band. | | If your brain stored bands in a hierarchy, you'd only be able | to come up with "The Beatles" as an answer for _one_ of those | questions. You 'd have to figure out whether to categorize | the Beatles by name, location, or genre and it would be | absent from the other categories. | schoen wrote: | Or you'd have to do an inefficient search in order to find | something that matched, which would be slow, but not | impossible. | | Or you'd have to maintain several redundant hierarchies. | | (I agree with you that our subjective experience and speed | in thinking of things is evidence that we probably don't | mentally represent things this way.) | still_grokking wrote: | I strongly disagree. 
| | Everywhere where you have a lot of stuff to manage (photos, | music, videos, documents, links) hierarchies don't work and | only tags can tame all the chaos. | | The analogy to "path finding" doesn't hold, imho. That's | _not_ how our brains organize information! We organize | memories by association and not by some hierarchical | structures. | egypturnash wrote: | Tags are great _as an adjunct_ to a thoughtful folder | hierarchy, IMHO. | | Links are great as part of that too, they can provide | shortcuts. | | Real-world use: I am an artist, and I have found that the | best way to organize my work is with a series of yearly | directories. If I begin a large, multi-year project, it goes | in a directory within the year I start it; I'll make a link | to it that lives next to all the yearly directories. | | I also use OSX's tags a ton. Files get marked as 'in | progress', 'complete', 'paid for', 'commission', and | 'experiment' (and a few other things). When I want to decide | what to work on in any particular day it's super easy to open | up the saved search for "everything in progress" that I keep | on my desktop; this shows me everything in those yearly | directories that's marked as 'in progress', whether it's | personal work, client work, whether it's part of a large | multi-file project with its own folder hierarchy or just a | single file in the yearly directory. I also have a saved | search for 'commission'+'in progress' for those days when I | know I want to work on clearing the commission queue. And | whenever I spend some time just fooling around with different | effects to create interesting looks, I'll save my scribblings | with the 'experiment' tag; when I decide to use it later I | can easily tell Illustrator to open a file, and look through | the 'experiment' tag to find the file full of some crazy | procedural explorations, regardless of how long ago I did it. 
| This habit has saved me _hours_ of digging for that one file | where I did that cool trick once. | | Trying to organize all the files in my artwork directory with | _just_ tags would be a total fucking nightmare, the | subdirectory for a multi-year graphic novel has its own | folder hierarchy that 's several levels deep, and when I know | that what I want to work on today is "getting the prepress | files together for book 3 of the graphic novel" it's | definitely great to be able to just hit the top-level link to | the graphic novel directory, then go into "books", then "3", | and have its own little file hierarchy in there. | | Tags by themselves are not very good for serious | organization, but they can be very good for pulling things | _out_ of a hierarchical structure. They take work - I have to | remember to mark a new file as 'in progress' and possibly a | 'commission', though that's become routine, and changing | something from 'in progress' to 'complete' is a _pleasure_. | But it 's work well worth doing to create a nice little | network of shortcuts and secret passages through the terrain | of your thoughtfully-laid-out tree of folders. | setr wrote: | hierarchal tagging is the one true path | masukomi wrote: | there have been many, MANY historical attempts to organize | the worlds knowledge hierarchically. They have all failed to | achieve their goals spectacularly. | | some of the most common reasons | | - things exist in multiple categories that aren't in the same | branch of the tree | | - different state of mind during data retrieval means you | expect the same item to be in different categories. | | - different humans think the same thing belongs in different | hierarchical locations | | there's also been a LOT of scientific research around | informational organization. It all came to the same | conclusion. Hierarchies have interesting promises but fail | when it meets the practical reality of the human brain. 
| | in the end hierarchical organization of knowledge is a | terrible solution except in VERY restricted cases. | fellowniusmonk wrote: | Do you have any suggestions of where to start reading on | this? A seminal paper or cluster of papers? I want to deep | dive on this not just to map out where it doesn't work but | also to get a map of the restrictive cases where it does | work. | | edit: never mind, I just put your quote into gpt-4 and it | passed me on to Eleanor Rosch, prototype theory and some | other interesting works. I feel like this is my own modern | lmgtfy moment. | tester457 wrote: | > You have a body that is adapted to the physical world - | routing and navigation through a series of ordered steps is a | VERY well developed human skill. | | I find that this skill is better utilized with a system that | has hyperlinks like Obsidian. | | Also purely hierarchical systems break down over time, they | can be supported with tags. | https://karl-voit.at/2022/01/29/How-to-Use-Tags/ | | > To my surprise, we tend to think in hierarchical categories | all the time. As I have written in my article on Logical | Disjunct Categories Don't Work, the real world does not fit | into disjunct categories. | | > Therefore, we should embrace multi-classification more | often. If you do want to learn more about the rationale, you | may as well read the first chapters of my PhD thesis or the | book "Everything is Miscellaneous" by David Weinberger, just | to give you two resources of many. | | > Long story short: tagging does take away the burden of | finding one single spot in a strict hierarchy of entities | which is actually a heavily intertwined network of concepts | we do find in the real world. It's far from being a neat | hierarchy. Everybody who tries to put "the world" into a | strict hierarchy will fail. 
| 0xr0kk3r wrote: | Tags are superior because tags can model hierarchies, but | hierarchies cannot model tags. There are far too many times | when a single document crosses multiple categories that are | served by tags. I used Outlook for 15+ years and thought tags | were a joke, then moved to GSuite for 13 years and learned to | use tags, now I'm back on Outlook and I feel like I'm | suffocating without them. That's two decades of experience | with both systems. Not to make a fallacy / whizzing contest | out of this, but how long have you tried both systems? I'm | guessing not as long. | horsawlarway wrote: | > Tags are superior because tags _can_ model hierarchies | | Tags are inferior because tags must be coerced into | hierarchies. | | Tags are inferior because they do not properly link | hierarchies that they model without extensive software | support (which is present for file directories by design, | and absent for tags). I have yet to see a hierarchical | tagging scheme work well when you need to do something like | change a mid-level directory name (you end up having to | rewrite many tags, often without good software support for | what you're trying to do). | | Tags themselves are _fine_. It's a perfectly valid way to | label data. It is _not_ a good way to organize that data | for human recall and reference. | 0xr0kk3r wrote: | > It is not a good way to organize that data for human | recall and reference. | | Yet here I am: using them for recall and reference faster | than hierarchies (after 30+ years of using both). | account-5 wrote: | Pretty sure Categories is what you're talking about for | Outlook. | phailhaus wrote: | Workflowy [1] solves this problem by supporting mirrored nodes | as well as tags. | | [1] https://workflowy.com/ | fallat wrote: | Ok, this made me look into why not the Dewey. 
| | It seems too hard to memorize the numbers for first time | placement. | | So let's make a program that asks us when moving it into our | collections? | | `dewey <file to organize>` | | Will then lead you down a tree of decisions. Insta-organized. | It's so good I just might try it. | | (The file will move to wherever your organized files are | specified in your .config/dewey.conf) | | On Windows this could be a right-click -> Dewey, where it then | pops up a small window to pick the categorization. | ttul wrote: | This seems pretty backwards in the age of AI, where semantic | search can ingest and numerically sort embeddings with | extraordinary finesse. | Exuma wrote: | I freaking love this sidebar design | liendolucas wrote: | Question aside. Can anyone recommend any opensource de- | duplication tool(s)? I've realized that I have the same data over | many drives but manually going through them even for a single | drive will take a ton of time. I'm wondering if there's something | smart enough where you input paths to be scanned and magically | outputs de-duplicated data to a single coherent place... | | Edit: Some corrections. I forgot to mention which OS: GNU/Linux | and/or BSDs. | JTyQZSnP3cQGa8B wrote: | I used DupeGuru (https://dupeguru.voltaicideas.net/) in the | past but I'm not sure it's the best solution for you. Try it, | it's open-source. | sea-gold wrote: | https://github.com/qarmin/czkawka | harshreality wrote: | what os? for use in a console, there's rdfind or fdupes | hummingly wrote: | I've never tried it myself but the README mentions several | other tools. | | https://github.com/dpc/rdedup/ | jbverschoor wrote: | rmlint. | | I've tried many, but rmlint is the most flexible and reliable. | Esp. the tagging works really well. | | https://github.com/sahib/rmlint | tjoff wrote: | My research into this many years ago turned out that jdupes was | the right / best solution I could find for my usecase. 
| | https://github.com/jbruchon/jdupes | | Though that works fine from a script perspective I'd like some | more interactive way of sorting directories etc. Identifying is | just the first step, jdupes helps with linking the files (both | soft and hard links comes with caveats though!) but that is | mostly to save space, not to help in reorganisation. | liendolucas wrote: | It seems to me that is not a trivial problem to solve: de- | duplication + reorganization. Maybe I'm incorrect. It also | seems the kind of problem where it could be super-easy to | screw it if you go with a custom made script plugging | different tools... | zie wrote: | When it gets bad enough you need this for an organization, you | hire a "librarian"[0], it's literally their job to classify and | keep track of information. They have a whole degree program | called Library and information science. | | Let the experts handle this stuff. How many times have you found | some super important production piece being handled in a disaster | of Excel and 400 different versions all named ridiculous things, | and nobody knows which is the right one to use? Why? Because they | didn't bring software development in soon enough. | | 0: Librarian is our commonly understood word for the broad | profession of information management, but the experts tend to | have many different job titles for their discipline, get a | subject matter expert(I'm not one) to help you track down the | right job title for your specific project. | bbor wrote: | I think most people here are applying this to their personal | notes/projects. | jen729w wrote: | Johnny here. I think more people should apply this at work. | | I work on large IT transformation projects and the | disorganisation and resulting waste of time and money is | borderline criminal. | MH15 wrote: | I've used this system for a few years in the past. It's | definitely handy for some people, but it didn't fit my use case. 
| I now store all important documents/photos/backups in the cloud | and consider the computer to be basically throwaway. | | One organizational system many programmers may appreciate is | keeping your git/GitHub repos in the same place, under | `.../g/<username>/<reponame>`. Huge fan of this method. | bbor wrote: | Hmm this seems unrelated to me - why not implement Johnny | decimal around or within git repos? And what about it would | change if used for cloud directories instead of local ones? | | Probably just missing something obvious! | MH15 wrote: | The only relation was that I personally used to use Johnny | decimal to group my different projects, but then moved to the | git repo namespace setup and no longer had a need for Johnny | decimal. They aren't really substitutes so I understand your | confusion! | scrapcode wrote: | > Nothing is more than two clicks away | | > An important restriction of the system is that you're not | allowed to create any folders inside a Johnny.Decimal folder. | | This being said immediately after a screenshot with three levels | of directories confuses me. One problem I immediately identified | with this system is that I would have to take extra steps to peek | into the applicable directory to see what the current index is... | | I'm always looking for a good organizational methodology. This | seems to be _per project_ , no? Any suggestions for a system for | overall data organization? | gglitch wrote: | I'm enjoying using it for overall family/personal data | organization. Took me a year to migrate in, and I allow myself | the privilege of reorganizing as needed; but I've found it | super simple and super stress-relieving. | | I do allow myself subsubdirs wherever it makes sense though. | E.g. right now I have a file browser open to "64.05 TV Shows" | (60 - 69 is "Media"; 64 is "Video"), and within 64.05 I have | one subdir per TV show. 
I don't feel obliged to give each show | a special number, and I also don't feel troubled by each show | being a sub(sub)dir. This system is searchable and browsable | within my tolerances. | makeworld wrote: | > This being said immediately after a screenshot with three | levels of directories confuses me. | | Me too, but reading more I understand this now. A | "Johnny.Decimal folder" is a folder that starts with a name | like 12.04, meaning it represents a unique item. It will | already be inside two other folders, the 12 folder and the | 10-19 folder. The point is that while 12.04 can be a folder if | the unique item is actually multiple files, you can't have more | folders inside 12.04, because that's considered too much | nesting. | | > This seems to be per project, no? Any suggestions for a | system for overall data organization? | | Multi-project organization is covered later on: | https://johnnydecimal.com/10-19-concepts/13-multiple-project... | GuB-42 wrote: | Oh, now I understand where the directory structure comes from at | work. | | I hate it. | | The problems I have with it (some of them implementation details | that can probably be fixed) | | - On smaller projects, you have a big directory tree of nothing, | with maybe a quarter of the directories being populated. This is | because it starts from a template. | | - You tend to get long directory paths, enough to get over | MAX_PATH in some instances, don't fit in a single line, etc... | | - Remembering arbitrary numbers is hard. Try using arbitrary | numbers in your code for your variables, I am sure it will be | appreciated... | | - And especially when there are several number based systems in | place. So you have the software version number, the ticket | number, the number system used by your customer, etc... Do you | really want another number system on top of that? | | - The article says there is no overlap. There is never "no | overlap" in the real life. 
For example, as a dev, I should have | nothing to do in the "sales" folder, except that the technical | specifications are there because they are part of the contract. They | really belong in both "sales" and "dev". | | - I still use search as my primary tool. | | Note that someone mentioned the military. I have worked on | defense contracts, they are the worst. Acronyms and codes | everywhere, I guess they are too special to name things with | regular words. And I am talking about the unclassified stuff, it | is even worse when confidential information is involved: "The | name should follow the ZB4455 convention, ZB4455 is in document | L45.34c, can I have L45.34c? No it is classified, but actually, | it just means it should be lowercase and start with an | underscore." So I wouldn't take what the military does as a good | example. | quietbritishjim wrote: | I haven't used this system (thankfully), but my first thought | is complementary to the comment you made about small projects: | in big projects, probably it's more useful to break down by | component rather than type of document. E.g. the front end test | spec is better off grouped with the front end architecture | diagram rather than the test spec for a bunch of other | individual components. | hotsauceror wrote: | An IT professional criticizing the military for using acronyms | and codes is a pretty bold stroke... | | Besides, if you named everything with regular words your poor | MAX_PATH would be working overtime. There's a time and a place | for abbreviations and codes, and if a multi-theater, | technically-advanced military force with global and | extraplanetary reach isn't one of those times and places, then | I don't know what would be. | | But I do agree with you about assigning an arbitrary number to | a project. 773.0034 is not that helpful a descriptor and I | wouldn't want to see a whole "Downloads" folder full of those. | But it does help you find things quickly. 
| bbor wrote: | > multi-theater, technically-advanced military force with | global and extraplanetary reach | | Neither here nor there but god I need a drink. And it's only | 8am. Reading that sentence reminds me of the nationalistic | radio broadcasts mentioned in A Canticle for Leibowitz before | everyone is annihilated in nuclear fire... | hotsauceror wrote: | I've been chipping away at moving to my own flavor of JD over the | last year. One of the first things I did was add one higher level | with broad categories, numbered as x00. This way things are | broadly organized, I still don't have to 'fundamentally' go more | than two folders deep or 'have more than 100 folders', but I can | use it for my entire work life despite having 100-ish actual | technical projects. | | Backporting old docs to this system is a real chore and honestly, | I haven't been very disciplined about that part, besides moving | old Project folders under the top-level Projects folder. But this | is always going to be an issue with any new filing system, and I | don't think there's a lot of value in doing it. Maybe it would be an | interesting programmatic exercise. But I, hotsauceror at his | keyboard, am NOT going to go and retroactively assign a 753.0026 | etc identifier to every document lol... | | My rough, rough hierarchy is as follows:
|
|     100 - Administrative
|       - 110 Interview Notes
|         - 110.001-eng-john-smith.md
|       - 120 Onboarding
|       - 130 Performance
|       - 140 Training + Certification
|       - 150 Travel + Expense
|     200 - Analysis
|       - 210 Code Review
|       - 220 Performance Tuning
|       - 230 Technical Specs
|     300 - Documentation
|       - 310 HOWTOs and Runbooks
|       - 320 Technical Specifications
|       - 330 Environment
|       - 340 Processes
|     400 - Meetings (this is a catchall)
|       - YYYY-MM-DD-annual-project-plan.md
|       - YYYY-MM-DD-budget.md
|       - YYYY-MM-DD-new-policy-rollout.md
|     500 - Operations
|       - 510 Stack #1
|         - 510.001-turn-it-off-and-back-on-again.md
|       - 520 Stack #2
|         - 520.001-reset-proxysql-after-network-partition.md
|       - 530 ...
|     600 - Troubleshooting (another outlier)
|       - yyyy-mm-dd-stack-2345-bad-plan
|       - yyyy-mm-dd-stack-1234-cpu-peg
|       - yyyy-mm-dd-stack-3456-non-yielding-scheduler
|     700 - Projects
|       - 701 Project 01
|       - 702 Project 02
|       - 703 Project 03...
|     800 - Reports
|     900 - Training
|       - 901 Brown Bags / Lunch+Learn
|       - 902 Terraform Certification
|       - 903 AWS Certification
|
| I have recently added a 000 - Logs folder for places like coding | journals, another trendy suggestion that pops up here on HN from | time to time that I may or may not stick with... | wingmanjd wrote: | We implemented this for our shared storage at $DAYJOB. We had a | long tail of decade old files on our shared drive, so we started | again with the Johnny Decimal system on a new one. It's helped | tremendously for us for finding stuff. | | I had previously implemented it on my personal Nextcloud | instance, but found it to be less impactful, as I already tended | to over-organize my digital files. | citizenkeen wrote: | Every time I think about implementing this I realize the | categories I have today and the categories I have five years from | now are unlikely to mesh well. | | At least based on my priorities from five years ago. | JKCalhoun wrote: | I've been thinking about that too. We either make broader | categories or allow ourselves to deprecate and refactor | category numbers in the future. | | To me though the overwhelming benefit of the process is the act | of bucketing. Another strategy then would be to bucket down to | 8 categories instead of 10 -- like line numbering in BASIC you | allow yourself a bit of space if needed in the future. | PurpleRamen wrote: | Gosh, that's a really awful explanation. Not sure I get this | correctly, but the gist is, you organize things by nesting | general Categories with specialized categories and put a number | on them. With the "lifehack" that the first digit is the general | category, and the second digit is the specialized category? 
And | then every folder under a specialized category gets another | number? And this is only meant per Project? Not globally? Meaning | every project can have slightly different categories & numbers? | Have I understood this correctly? | | How does this handle inter-project-files? What exactly is a | project even in this context? How does it handle things which can | be in multiple categories? This smells like someone pressing | everything into a hard form to circumvent the flaws of their | tools, instead of getting better tooling. | adrianmonk wrote: | I don't plan to use this, but I think I get why it might work. | | (1) Although it's just a hierarchy/tree, which is nothing new, | its size and shape is (supposed to be) a sweet spot. There are | trade-offs with hierarchy sizes and shapes, so a sweet spot is a | plausible idea. | | (2) By limiting the size of the tree, you force people all across | the organization to share the same parts of it rather than giving | them private spaces they control exclusively. This means they are | forced to work together on how information is organized. This | could encourage there being one coherent idea of how information | is organized. Everyone will have to agree on how it's organized, | and everyone will be more familiar with how others' stuff is | organized. | | (3) The numbers are small enough that you can remember them and | talk about them. When you ask someone where something is, they | can give you the answer directly instead of promising to send you | a link. (It's like how you can read an IPv4 address off one | screen and go type it into a config file on another computer, | whereas unfortunately this is not easy with IPv6.) This increases | the odds of success in finding the info. | acyou wrote: | Couple caveats that I think should be included: | | Use this for your own files where no one else has to find | anything. | | Avoid reorganizing other people's files. 
| | If you do the organizing, it may make sense to you, but may not | for other people. | | Adding the decimals has the primary benefit of nothing being | recognizable from before, so that new brain maps can be made, not | horribly and painfully mangled, warped and twisted from the old | maps. | | If you have to navigate one of these systems and you didn't | create it, use search and hope files are named well, and hope the | creator didn't go overboard with making folders. Otherwise, | welcome to a little hell of clicking into a million empty folders | and never being able to find anything. | | Has anyone mentioned Aristotle yet? His abstraction of | categorizability works, but is so obviously wrong once you have | to accomplish any practical task. | | For us, organic folder structure development for as long as | possible, or avoiding folders as much as possible is better. | Then, some intelligent and pragmatic decision making, and no hard | and fast rules. We are human friendly first, where file systems | are primarily intended for human navigation. | drcongo wrote: | Absolutely no. For all the reasons listed here: | https://heyluddite.com/post/4043411544/how-to-name-folders | Mizoguchi wrote: | There's a lot of nonsense in that article, talking about | evolutionary conditioning to alphabetical folder organization. | Hundreds of millions of humans can't organize their documents | alphabetically because they don't have an alphabet. They don't | seem to have a problem with that. | thechao wrote: | I thought most languages have a collating order -- I think | even a slightly generous reading is what they mean by | 'alphabetical'? Even Chinese (an important edge-case) has the | _traditional_ radical-and-stroke ordering mechanism. | f1shy wrote: | The algorithm at the end is what I go through every single time | I search for something. I think this article has a point! 
| nequo wrote: | The algorithm at the end: Start -> Open | Folder -> End | | In reality, natural language uses synonyms that often start | with different letters. So without numbers, I still need to | scan every directory one by one. | | With numbers, I assign categories according to the phase of | the process in which the item occurs. For example,
|
|     1 plans
|     |- A first draft
|     |- B Lisa's notes
|     `- C design
|     2 analysis
|     |- A exploratory
|     `- B design implementation
|     3 deliverables
|     |- A May 2023 report
|     |- B June 2023 presentation
|     `- C August 2023 report
|
| I can limit my search to folders and items that are in the | low/medium/high range, according to what I am looking for. | But alphabetically sorted, this directory structure would | look much more ad hoc:
|
|     analysis
|     |- design implementation
|     `- exploratory
|     deliverables
|     |- August 2023 report
|     |- June 2023 presentation
|     `- May 2023 report
|     plans
|     |- Lisa's notes
|     |- design
|     `- first draft
|
| tomjakubowski wrote: | Collation order is not necessarily best for organizing. Many of | us think spatially. Having related things near each other can | be useful. Same with putting the most commonly used or most | important things near the top. | dSebastien wrote: | I wrote an article about this system a while ago: | https://www.dsebastien.net/2022-04-29-johnny-decimal/ | | I rely on it a lot for my personal data and projects. The | simplicity and constraints have a positive impact on the | usability of the organized information | macintux wrote: | The older I get the more I appreciate the intrinsic value of | constraints. | poutrathor wrote: | As said before in the post "BIG DATA is just data", a lot | of information is worthless after 1 or 2 years and most after 5 | years. Long term value data seems to be stored in IT systems' DBs | rather successfully. 
| | And I have so far always found important emails (notably because | important topics are easily found in email chains and, far more | often than not, in the dedicated meeting report). | | Structuring data is cultural so you should rather learn to use | the system used by your organization. Only super small teams and | solo-founders need to think about how to store data. Most workers | should follow their community to let other people find the | information. | | Folders, drawers and cabinets have been around for 3 centuries at | least and imho, we are not gonna reinvent the wheel with this or | that way to structure information. | massysett wrote: | If your organization has a system, by all means use it. | | The whole point of Johnny.Decimal is that most organizations | have absolutely no system to organize information. It's tossed | into a huge pile. | | Even organizations that have systems concern themselves only | with organization-wide needs. Individuals still have needs that | the organization does not address. | James_K wrote: | It seems like nothing of use is gained by replacing folder names | with numbers that index those names aside from making the path | shorter. In a library this is useful because books have to be | stored physically in order, but a computer does not have these | restrictions. You could just as easily apply the same set of | rules without the numbers and see similar results, with the | advantage that the names of things reflect what they are. You | also wouldn't have to create silly rules like "1- is always | project management", because under the new system, "project | management" will always be project management. | | He does seem to address this at least somewhat[0], but the | justification is so flimsy it's hardly worth addressing. In | essence, he doesn't like alphabetical ordering because the index | can change when something new is added. He would prefer new | folders to be inserted at the end of the list. 
He is evidently | unaware that folders can be sorted by creation date. | | [0] | https://johnnydecimal.com/10-19-concepts/11-core/11.02-areas... | JKCalhoun wrote: | > It seems like nothing of use is gained by replacing folder | names with numbers | | It forces you to whittle your categories down to ten (and sub | categories). I would argue that in and of itself is a useful | constraint. | NeoTar wrote: | I have invented a superior system - Johnny Binary. | | It's basically the same as the described system except you | are forced to categorize your files even more severely since | every level of the hierarchy only allows two subcategories. | | It must therefore be superior, right? | bbor wrote: | Very funny but hopefully we can all see that "constraints | are good" does not imply "you should be as constrained as | possible" | hotsauceror wrote: | When I first started out I used Johnny Unary. I dumped all | my documents into a folder called - get this - "Documents". | It actually worked remarkably well for a number of years. | ellyagg wrote: | Yeah, there was a system/OS/UI concept I came across | years ago that I can't find anymore, but every document | on your system is in one time-ordered stack/stream and | then I guess you just have filters and such to manage | random access. | James_K wrote: | You don't need numbers to do that. | evandale wrote: | Using numbers makes it easier. | bgribble wrote: | I find that the numbers are really helpful when trying to find | related items. Things that are topically connected sort | together and can be filtered by common criteria. | | I use SimpleNote a lot for JD content and put the category in | the title of each note. I type a piece of the JD number in the | search box and it instantly filters down to relevant notes. | Sort by title sorts by topic. | laputan_machine wrote: | I agree, these mental maps you have to create is adding _extra | mental overhead_ , which apparently the author aims to | reduce... odd. 
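A minimal Python sketch of the prefix filtering bgribble describes above, combined with the 10-19 / 12 / 12.04 nesting makeworld outlines earlier in the thread. The note titles and the helper names `parse_title` and `filter_notes` are hypothetical, made up for illustration; nothing here reflects how SimpleNote or Johnny.Decimal are actually implemented.

```python
import re

# Hypothetical note titles; the "AC.ID name" prefix follows the 12.04 style
# quoted in the thread. None of these are real files.
NOTES = [
    "12.04 Flight bookings",
    "64.05 TV Shows",
    "12.01 Passport renewal",
    "31.07 Tax return",
    "64.01 Films",
]

# Two digits, a dot, two digits, whitespace, then the human-readable name.
JD_PREFIX = re.compile(r"^(\d{2})\.(\d{2})\s+(.*)$")


def parse_title(title):
    """Split 'AC.ID name' into (area, category, item_id, name), or None."""
    match = JD_PREFIX.match(title)
    if match is None:
        return None
    category, item, name = match.groups()
    # The area is the block of ten that the category falls in, e.g. 12 -> 10-19.
    area_start = int(category) // 10 * 10
    area = f"{area_start:02d}-{area_start + 9:02d}"
    return area, category, f"{category}.{item}", name


def filter_notes(titles, fragment):
    """Return titles whose number starts with `fragment`, sorted by title."""
    return sorted(t for t in titles if t.startswith(fragment))
```

Calling `filter_notes(NOTES, "12")` keeps only the category 12 notes, and because the number leads the title, a plain sort by title doubles as a sort by topic. Whether that is worth the numbering overhead is exactly what the thread disagrees on.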
| chaxor wrote: | The most useful part of the process is simply thinking about | how to organize your files. | | The 03.65-like naming can indeed be switched out for something | with words, but I believe the best of both worlds is to make | the words "unix-like", i.e. small, and explanatory. | | For instance ~10 main directories (code, doc, vid, etc) with | ~10 subdirectories (note, tv, movie, etc) is nice to try to | fit your data into, but if one of the subdirectories has only 8 | things, it's not the end of the world. This tends to work | extremely well for "longer term" storage (a drive mounted | beside your OS for data when 'finalized' or 'semi-finalized') | but the mess of OS and everyday files isn't as appropriate for | it. | juancn wrote: | I'm messy, I like being messy. | | I cannot follow any of those organizational, rigidly structured | methods. They make me anxious, I'd much rather live in my mess and | let it automatically prioritize stuff for me. | | Things I don't know where I left are likely unimportant, and no | energy should be wasted on them. | | I think I finally made peace with my mess. | deofoo wrote: | This can work extremely well for one or two people. It becomes a | problem when different people need to agree on what are the 10 | things, categorization and maintenance. | f1shy wrote: | And even when defined, at some point some document will be "in | the middle", one coworker will place it in 10, the other in 50. | It has happened to me many more times than I can remember | jen729w wrote: | Johnny here. | | I was mid-reply and I realised I was typing out my problem | statement, so I'll just paste it here. This is a work in | progress. | | --- | | # The problem | | When we kept everything on paper, organised people had these | things called filing cabinets. They stored all of their documents | in them in a structured way so that they could find them again. 
| | Now those same people store all of their files in arbitrarily | named folders on their company's shared drive and wonder why they | can't find anything. | | ## Information wasn't always free | | When we kept everything on paper, generating information came | with a cost. Paper cost money. Typing out a document took real | effort. Duplicating a document meant a trip to the photocopier. | | Every document produced was a tangible thing. It was there, on | your desk. You couldn't ignore it. | | Now anyone can duplicate anything, instantly, invisibly, for | free. We assume this is an improvement. | | Is it? | | ## You had to be organised | | When we kept everything on paper, you had to be organised. There | was no other option. | | If you weren't organised, the information was lost. Not lost as | in 'it'll take me a while to find it': lost as in 'gone forever'. | | Now you _can_ be disorganised, but at what cost? The cost is the | time it takes you to find a thing; it is the risk that the thing | that you find is a duplicate or an old version. It is the | constant frustration that comes from knowing that something | exists, but having no idea where it is. | | We all feel this every day and we have come to believe that it is | normal. | | It is not normal. | | ## Why aren't we given training? | | When we kept everything on paper, it was someone's job to | organise it. This was an occupation: you were trained. You became | an expert. | | Now we employ Gen Z's who didn't grow up with the concept of 'a | file' yet we expect them to navigate the byzantine hierarchy of | the company's SharePoint.[genz] | | [genz]: https://www.theverge.com/22684730/students-file-folder- | direc... | | You work at a keyboard all day, so we make you sit through a | module so you know to bend your knees when you lift a box. | | But when it comes to information management: you're on your own. 
| christiangenco wrote: | I stumbled on this system several years ago and found it useful | as inspiration for organizing my external storage. | | My top level categories are `inbox` (stuff that isn't sorted | yet), `Media` (stuff that other people made), and `Vault` (stuff | that I made). | | `Media` contains `Audiobooks`, `Books`, `Courses`, `Films`, `TV`, | `Music`, and `Broadway`. | | `Vault` contains `Backups`, `Projects`, `Audio`, `Video`, and | `Photos`. | | Anything one layer deeper is either a file of the type described | by the parent folder name or a folder containing related files | (ex: `Video/2023-06-12 makers.dev 119` is a folder containing the | raw recordings and processed end video and audio for my podcast). | | I've got about 10TB and tens of millions of files organized in | this system. It works better than anything else I've tried. | jehb wrote: | I do something similar. Where it always seems to fall apart is | with things I collaborate on with others. Sometimes, joint | projects get their own home (i.e., they become an organization, | or at least get their own public repository of some sort). So | in addition to "inbox" "media" and "private" (my version of | "vault") I've also got a "shared" category. | | It's still not perfect, because ultimately the subcategories of | "shared" need to actually be accessible, or mirrored, or it's | not actually true. And sometimes, a project goes into "shared" | aspirationally, even if I have no collaborators yet, as a | subtle reminder that I might share it someday, so I don't want | to put anything in that folder that I'm not comfortable being | public or semi-public. | ellyagg wrote: | I incorporated some of these ideas like 10-15 years ago. 
| | My top level relations: | | * Fun: Sex, drugs, rock & roll | | * Home: Rent, buy, interior, yard, cars, places | | * Meta: This system | | * Mind: Philosophy, language, math, art, music, science | | * Money: Accounts, investments, Bitcoin | | * People: Family, friends, everyone | | * Self: Fitness, health & illness, spirit, food, fashion | | * Tools: Computing, devices, productivity, maker, crafts | | * Work: Career, job | | Roget's original thesaurus, which divides every word into 6 (or | something) top-level relations was also an inspiration. | | These are my root items in Workflowy (with its infinitely nested | bullets). | | I star active projects so they show up in the sidebar. I shift- | drag (to mirror) items out of projects into the root (above the | relations) to serve as my daily todo list. All in all, simple, | efficient, and comprehensive. | evandale wrote: | I was trying to imagine what my 10 categories might look like | and it's very similar to this! I tried getting into using | Obsidian and used top level categories such as: Ideas, Lists, | Learning, My News, Reminders, Misc. | laputan_machine wrote: | Terrible advice. Arbitrary rules (make 10 folders!) are just | utterly bonkers for everyone except a small subset of people who | could categorise their life in this way. | | It really grates on me when people offer solutions that work for | them, as if they will work for _everyone_. | | No. | wwn_se wrote: | If I did this with my e-mail I would have over 1000 in some | folders. | | "It's very unlikely you will end up with a hundred categories." | -the page | | Exactly this will result in about 20-30 folders for most, with | any real amount of documents some folders might hold 100-1000 | docs. | | The advice you should take from this is that forcing structure is | useful. Look at large code repos for example. 
| JKCalhoun wrote: | I'm going to allow that some things, like photos for example, | can live in their own folder apart from the Johnny Decimal data | hierarchy. | | (Also, it would force me to consider ... do I need 1000 files | here? I've certainly been known to join related documents into | a single PDF, Uber-document, if you will.) | AndrewKemendo wrote: | This is effectively how formal military instructions are | structured - and generally US code for that matter, with chapters | generally reserved for certain functions going down to the .01 | decimal specificity [1] | | Way back in 2010 or so I published a series of instructions for | the 36th Wing that followed this kind of naming/information | numbering convention which was frustrating to fit into, but | ultimately once you understand the framework it's faster to | write. | | That isn't to say it isn't confusing and complicated - which | happens to everything at scale - simply that this kind of | structure for documentation is pretty common and literally battle | tested. | | [1]https://www.esd.whs.mil/Portals/54/Documents/DD/iss_process/.. | . | deltarholamda wrote: | Commercial construction specifications are done in this way as | well. So all electrical specifications are in division 16000 | (or 26000 nowadays) and subdivided from there. | | This method of only being two levels deep is interesting. If it | works, that's great, but there's nothing to stop you from going | three if required, e.g. 10.20.30. But keeping everything | constrained has value in itself, if only in that it forces you | to think in larger discrete chunks. | jawns wrote: | Number-based organization systems (e.g. US code) work best when | there are frequent references to specific nodes in the | hierarchy (e.g. legal citations) and there is no guarantee that | they're being accessed digitally. | | But there is a good reason why I navigate to | news.ycombinator.com and not 209.216.230.240. 
| | For digital resources like URLs or file systems, using numbers | as prefixes or primary IDs only makes sense if their ordinal | values represent the most important and intuitive way to browse | through the hierarchy. | | But in most cases, the name rather than the number is the most | important thing, and it's very easy to sort or filter by name | -- whereas sorting or filtering by number is only useful if | there's an inherent ordering (e.g. date modified) to the | numbers. | seanosaur wrote: | > But in most cases, the name rather than the number is the | most important thing, and it's very easy to sort or filter by | name | | Names can also be difficult if not done correctly / | uniformly. For instance, "Category Name", "CategoryName", | "category_name", and "category-name" can all return | differently through search. | | I don't think the key is names vs. numbers vs. whatever else, | I think it's more important to pick a system that works for | the use case, then define / document / communicate it as wide | and loud as possible. | monooso wrote: | >The Chief, Directives Division (DD) assigns numbers to DoD | issuances based on the established subject groups and subgroups | provided in Tables 1 through 9 and the recommendations of OSD | and DoD Component heads with equity in a particular issuance. | | What an opening sentence. | egberts1 wrote: | Part of me originally thought that Johnny Decimal would be in | groups of 10. | | But first visit to their web site shows numberings exceeding 10. | | Ok. | | Still a novel idea worth pursuing. | copperx wrote: | This is one area where LLMs will help tremendously. I've always | hated the Save operation, because it forces you to think about a | name that describes what you're working on, even though the idea | isn't fully formed yet. | | I'm pretty sure Microsoft will integrate LLMs to automate file | naming, and I hope other systems follow suit. 
| | More interestingly, LLMs will easily organize data hierarchically | based on the contents. I hope this becomes a reality this or next | year. | | I hate manually organizing a filesystem. | hotsauceror wrote: | I agree with this. Having an OS option to scan and tag your | documents into some taxonomy that is a built-in in your file | browser would be quite attractive. I'm sure Microsoft and | Google are both working on it. | | e-Discovery applications like Relativity have been doing this | for years. You run a PCA against a bunch of OCR'd documents, | look for correlations between words or phrases within | documents, look for repetitions of those particular | correlations, call them 'issues' or 'motifs' and slap a label | on them. Attorneys used to use it to scan millions of documents | in a discovery set and auto-flag them for possible privilege | issues for further review, and even automatically mark them as | such. | Hbruz0 wrote: | What about project folders like git repos and all that? How do | they fit into this system? | aezart wrote: | I genuinely can't imagine this working at all for any sort of | software development project. | jen729w wrote: | Johnny here. Correct: don't use it to organise anything that | smells like a software project. | hosteur wrote: | I have been using JD for several years to organize both my | personal documents as well as my business' documents. I think the | system works really well. | | One thing I have learned to do which bends the rules a bit is to | use date stamped folders in the lowest level instead of XX.YY. | | Examples of places where I use this with success are folders | containing: meeting minutes, travel documents, receipts, etc. | mberning wrote: | The fact that this is indistinguishable from satire is an incredible | feat. Very well done. | ktpsns wrote: | This pops up on HN regularly.
Was extensively discussed at | https://news.ycombinator.com/item?id=25398027 | | Interestingly, we have a similar "BASIC line numbering" system in | our company. Allows for easy traversing the directories if you | can remember the numbers (I cannot), such as | "05_Contracts/15_Employees/041_John_Doe/07_Testemonies". | jmiskovic wrote: | First time I heard of it. | | I like how simple the core concept is explained, but I feel it | would box me into categories when I like tags more | (categorizing items in multiple orthogonal domains). OTOH maybe | well thought-out categories would bring more structure than | tags. | | My current notes strategy is to prefix the date to markdown | filenames (for example '2023-05-31 canvas scan transform | matrix.md') and put them into single dir. These are active | journal-style notes that I'm free to update over next days | while they are still in focus. Every few weeks the list of | nodes gets busy and I 'archive' older notes into sub-dirs | (personal, hobby project, work project) and backup the whole | structure. The method requires minimal maintenance and the full | text search works well for my needs. | | Edit: I like how the author leverages the CLI auto-completion | and I try to do the same, but I think Johnny would work against | my brain. When naming the directory or a script, I put myself | in mind frame where I'd want to use it and I'm trying to recall | its name. So I give semantic names like 'build-android.sh'. If | it's a new thing I try to come up with a short catchy name for | it. Having to recall the `10-19` category each time I want to | access specific subscope seems like too much cognitive burden. | Just theorizing, haven't given it a shot so far. | f1shy wrote: | Where I work we do use that structure. I can never find | anything. For me remembering the numbers is as remembering IPs | instead of URLs. 
The problem is not the naming of the | directories, the problem is that the next idea after "johnny | decimal" was to make a standard structure. Because this | structure has to serve the full company, it is HUGE! So | irrespective of project or area size, you have a structure of | 10 levels with 30 directories in each level. The names are very | generic, and sooner or later somebody has a different | interpretation of where document X should be placed... we have | lost days searching for lost documents... | kazinator wrote: | The system is spoiled by confusion between division into 10 and | division into 100. This creates extra levels so that the | implementation does not live up to its "two clicks away" promise. | | For instance, in the site's own structure, we have | 11-core/11.01-introduction | | But that would leave two digit categories at the top level. The | top level is organized by groups of ten and so we need | 10-19-concepts/11-core/11.01-introduction | | One question is what if 10/11 gets more than ten items, so there | is an 10/11/11, 10/11/12? | | Isn't there a division into ten needed there? | | If the bottom level never goes beyond 00-09, the zero is | redundant. It's actually a three level system with a branching | factor of 10, and you might as well just have | 10-concepts/11-core/1-introduction | | I would just have 10/11/1 | | and have symlinks concepts -> 10 | 10/core -> 11 10/11/introduction -> 1 | | Using the numbers as prefixes for the symbolic names means that | someone who remembers the symbolic name but not the number cannot | use tab completion nicely. They have to use tab completion to | scan the entire directory level, then type the number, then tab | complete again. | | Symlinks going from symbolic to numeric is probably the right | direction. The OS symlink resolution then teaches the users what | the categories are: $ realpath --relative-to=.
| concepts/core/introduction 10/11/01 | | There could be accelerator symlinks at the top level: | 11.1 -> 10/11/1 | | Now you get the full benefit. If you remember that introduction | is 11.1, you actually have that as an instantly navigable | identifier in the system. | slaughtr wrote: | > One question is what if 10/11 gets more than ten items, so | there is an 10/11/11, 10/11/12? | | I'm not following this (and thus, I think, your entire point). | I think you might be slightly misunderstanding something, the | files inside a category(11-core in the example) would never | have a prefix other than the category - 10/11/11 is the only | option - 10/11/12 would be breaking the system. | | Once you're inside a category, there is no division into 10 | anymore. The 11 category would allow documents from 11.01 to | 11.99. And as I believe is mentioned in the spec, if you need | more than .99 you likely have too broad of a category or area. | | For what it's worth, I've used this system at work and in my | own notes for around 2 years and haven't run into this problem | (yet). | kazinator wrote: | OK. So if the 11 category can go to 99, why can't the top | level just go from 00 to 99 as well without being broken into | batches of 10 requiring another level. | Player6225 wrote: | I have been using JD for a while now, to the point that I built a | CLI for it (using Deno). | | But I just enjoy the speed of feeling like I can cd to any | directory at any time in like... 8 keypresses (`jd 20.21` is an | alias I use to cd). | | https://github.com/bpevs/johnny_decimal | | https://johnny.bpev.me/ | | Edit: I had a separate hierarchy I used on my work machine when I | was still working at a larger company, but this is the one from | my personal machine (with some redacted)...
|     10-19 Notes
|       10 Quick [Daily-life kind of stuff]
|         10.01 Daily Notes
|         10.02 Cooking
|         10.03 Listening Notes
|         ...
|       11 Research
|         11.00 Device Setup
|         11.01 Project Name 1
|         11.02 Project Name 2
|       12 Reference [Basically categorizing random notes]
|         12.00 Unsorted
|         12.05 History and Current Events
|         ...
|         12.28 Spatial Audio
|         12.29 Music, Cognition, and Computerized Sound
|       13 Travel
|         13.01 Zhong Wen
|         ...
|         13.10 Maps
|       18 bpev.me
|       19 Documents [Various documents here]
|     20-29 Projects [Active Projects]
|       20 Code
|         20.00 gists
|         20.01 bpev.me
|         [insert projects I am committing to often]
|       21 Media
|         21.01 Music
|         [insert Music album work here]
|     30-39 Archives
|       30 Code
|         30.03 favioli
|         30.04 johnny_decimal
|         ..... basically, maintenance-mode projects. If I start committing on a more regular cadence, I move to `20 Code`
|       31 Media
|         I have a separate, date-based hierarchy within these...
|         31.01 Music
|         31.02 Photos
|         31.03 Videos
|         31.04 Memes
|         31.05 Screenshots
|       39 Backups
|         39.01 Contacts
|         39.03 bpev.me
|         39.04 Savefiles
|         39.05 Applications
| magicalhippo wrote: | My grandpa was very interested in libraries. He had drawers full | of index cards[1] for his personal library, organized using the | Dewey decimal system[2]. | | When he first got a computer, back in Windows 3.11 days, it only | seemed natural to use what he was familiar with. So he would | store documents and emails in directories based on the Dewey | decimal system. | | However a problem quickly arose. A document might pertain to | multiple topics. With index cards this was simple, you just noted | the book or document on each of the relevant index cards. | | With files however it was less clear. The only way he found was | to save the same file in multiple directories. With the obvious | nightmare of keeping it all in sync. | | It got somewhat better when I taught him how to make shortcuts to | the documents, but still...
| | [1]: https://en.wikipedia.org/wiki/Index_card | | [2]: https://en.wikipedia.org/wiki/Dewey_Decimal_Classification | tokai wrote: | Universal Decimal Classification solved this issue by being | fully built to do faceted classification. It does take more | work to create classes though, and the class notation can get | very complex. | danman114 wrote: | I like PARA a lot, which has some great ideas: | https://fortelabs.com/blog/para/ | vlovich123 wrote: | > and the cues that Google uses to determine what's useful -- the | links that are the fabric of the internet -- just don't exist at | work. | | Says someone who's never worked at Google and used Moma. I still | don't understand why Google doesn't offer Moma as an on-prem thing | to replace JIRA's suite. Is the market too small? They used to | have an on-prem appliance way back when but surely a container | package is all you need these days? | [deleted] | gglitch wrote: | As this thread currently has a lot of critics, I just want to put | in a personal plug for JD. I've been using it for some time now | for family and personal data and it has been enormously helpful. | It's true that it is occasionally vexing to have to choose one | category for a given thing, but (a) it's usually not, (b) it's ok | to reorganize categories, (c) I have found that often if there's | something important that fits equally well in more than one | category, (c1) either I need to refactor my categories, or (c2) | it's probably going to be ok if I just pick one and allow myself | to recategorize later. This almost never happens anyway. | | And in the meantime, all my stuff is searchable, browsable, | findable, and tidy. | | I'm not saying it will work equally well in all environments or | for all purposes, but for mine, it solved many years' worth of | stress. | cjohnson318 wrote: | Agreed. I find this useful too. Especially for random stuff I | use once a year. | whelchel wrote: | I have a similar experience for personal life.
I use this too - | it's imperfect, but it's a nice balance of complexity and | utility that doesn't get in the way once you set it up. | kashunstva wrote: | > it's ok to reorganize categories | | This is an important point. A person's interests and areas of | responsibility evolve over time; so refactoring is not only | permissible; it's probably also helpful to unload accumulated | organizational cruft that's no longer relevant. | | When it comes to indecision about where a file goes, I'll often | just place a .txt file in the "wrong" location pointing to the | correct spot. Or an alias. | jejones3141 wrote: | If you refactor, how does that affect email searches including | pre-refactor subject lines? | rrradical wrote: | I imagine you could just reply to the old thread with the new | category.id. Or only reply to yourself if it's only your | organization system. Email search should include the email | bodies. | | (I'm not a user of this; just guessing) | thenoblesunfish wrote: | This seems like one particular example of a good general set of | principles: organize things intentionally, put things in one | place, use hierarchies with a branching factor of about ten. The | specifics beyond that are probably not worth arguing about. | bbor wrote: | Beautiful sentiment, but sadly akin to muttering a poem to a | raging River - I can't think of anything more HN-y than arguing | passionately about directory organization systems! | myth2018 wrote: | Never used such system, but I'm inclined to believe in its | promises. In addition to what I've recently commented in another | HN post [1], this system also slightly resembles the | classification system used in accounting. At a first glance those | account numbers look cryptic and arbitrary, but soon enough you | realize how helpful they are on enabling accountants to | communicate and creating journal entries. 
| | [1] https://news.ycombinator.com/item?id=36301140 | dxs wrote: | More Karl Voit: (1) "Managing Digital Files (e.g., Photographs) | in Files and Folders" at https://karl-voit.at/managing-digital- | photographs/ | | (2) "TagTrees: Improving Personal Information Management Using | Associative Navigation" at https://karl- | voit.at/tagstore/en/papers.shtml | | (3) "TagTree: Storing and Re-finding Files Using Tags" at | https://karl-voit.at/tagstore/downloads/Voit2011.pdf | rsecora wrote: | Melvil Dewey[1] started this way, but then things got bigger, and | a cast of clerks were born to serve the system. | | [1] https://en.wikipedia.org/wiki/Melvil_Dewey | pantulis wrote: | I think JD main value resides in the restrictions it suggests. | They will work for some people, for others they will not, and | others like me will adopt JD in an informal way. For example my | most used folders, loosely corresponding to main areas of focus | have unique numeric prefixes, but inside them the folders do not | follow the numeric prefix approach. What I appreciate is having | the same numeric prefix in _all_ the applications I happen to | use, like GMail labels, task manager projects, Evernote | notebooks, and file systems. | zetalyrae wrote: | If anyone has implemented this successfully/satisfactorily please | post your folder hierarchy so everyone can compare notes and | improve their organization. | tmslnz wrote: | We use a loose version of it in my small company. | | It helps with two things: - 1. A little easier to be consistent | across projects so not to reinvent the wheel every time - 2. | The prefix increments as new folders are added during a | project, painting a convenient picture of "progress" as things | move along. | | We tend to have: 10 to 19 reserved for admin stuff, like Admin, | Incoming, Outgoing, Documentation, Meeting notes, etc. 
| | Then anything from 20 onwards is ad-hoc per project | | We also timestamp children of Incoming and Outgoing, with an | ISO prefix. This is very useful to keep track of what was | received and shared and when. | | Overall the goal is to have as little protocol as possible to | prevent total chaos. Anything more than that is usually too | much to ask or doesn't stick longer than a single project.
|     10. Admin
|     11. Incoming
|       2023-10-12 sender, subject
|     12. Outgoing
|       2023-09-01 Estimate
|     13. Documentation
|     20. Design
|     30. Production
|     40. Blah
| majkinetor wrote: | I also use a variant of incoming/outgoing, it's very convenient. | JTyQZSnP3cQGa8B wrote: | I have only used this (alone) for a few weeks because it is the | first kind of organization that really resonated with me. I | understand it may not be for everyone, but when it comes to | organizing small to medium projects, it's really good IMHO. I | use the standard organization because I'm not creative. Every | project has its directory with a prefix (like "FMW01 xyz" for | "firmware, first project, named xyz"), and subdirectories named | "00-09 System," "10-19 Project management," and (my choice) | "20-29 Data" with "20 Inputs" and "21 Outputs." | | I have a template with empty folders and files (like Notes.md, | Todo.md, etc.), and I can copy-paste this template for each new | project. As long as I improve my template, every future project | will have the new structure. | | It's like the GTD system (which I also enjoy), but for | organizing your thoughts, notes, and files in different | projects. It's weird because I'm not fond of naming folders | with numbers but this time it seems to work. Every project has | the same structure and I'm not lost. I guess it's good for | people who need a serious structure as it forces you to have | a good organization. | | Interestingly, I had a boss 10 years ago that was using an | equivalent method with a template and numbered directories.
He | was successful at managing projects and I think I discovered | his secret. | | Last but not least, once a project is done, I can zip it and | reuse its number. | majkinetor wrote: | I am obsessing over this when maintaining my knowledge/artifact | base. Currently I am keeping it in a git repository with a few | categories and I use 3 mechanisms - tags at the end of file names and | directories, ISO 8601 dates as a prefix in some locations, and | nothing in the third ones. | | So, | | 1. notes a-la gists use tags: 'notes/Rsync | notes #cli #foss #notes #x-platform.md' 'notes/Windows | initialization #windows #powershell.md' 'notes/Modafinil | notes #medical #nootropic.md' | | 2. event-like things use both dates and tags | 'work/meetings/2023-01-03 Project XYZ meeting #project-xyz.md' | | 3. stuff I just collect doesn't use anything, or some of the above | 'dms/wallpapers/w1.png; w2.png ...' | 'dms/shopping/2023-06-13 Dyson Absolute 15/README.md; | receipt.jpg' | | I keep the basic folder hierarchy very limited for now. I use vscode | to commit any change on save and pull git on folder open, making | this behave like an always-in-sync cloud a la Github Gists, | especially together with vscode sync that brings my plugins, | configs and shortcuts everywhere. | | CTRL+P to quickly find stuff by name or tags, and vscode's very | fast ripgrep search to get files containing any content - so I | just need to remember any word or phrase to find it. If I can't | remember anything I browse over tags (having a handy script to | display all of them) or dates (since I usually know a time | range). As another mechanism, I use the double commander file | manager with its fuzzy file names search to get interactive lists | by typing tags or keywords while in a particular folder. | | To encrypt some pages I use GPG with a vscode extension. | | This serves me well, and I don't get lost, either when searching | for previous knowledge or when trying to find where the single | one is.
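For illustration only, a tag-listing script of the kind mentioned above might look roughly like the Python sketch below. This is not the commenter's actual script: the function names `extract_tags` and `count_tags` are hypothetical, and the tag syntax ('#' followed by letters, digits, or hyphens) is an assumption inferred from the filename examples in the comment.

```python
import re
from collections import Counter
from pathlib import Path

# Assumed tag syntax: '#' followed by letters, digits, or hyphens,
# as in 'Rsync notes #cli #foss #notes #x-platform.md'.
# The '.' of the file extension is not in the character class,
# so '#x-platform.md' yields the tag 'x-platform'.
TAG_PATTERN = re.compile(r"#([A-Za-z0-9][A-Za-z0-9-]*)")


def extract_tags(name: str) -> list[str]:
    """Return the tags found in the last component of a path."""
    return TAG_PATTERN.findall(Path(name).name)


def count_tags(root: str) -> Counter:
    """Walk root recursively and count how often each tag appears
    in file and directory names."""
    counts: Counter = Counter()
    for path in Path(root).rglob("*"):
        counts.update(extract_tags(path.name))
    return counts
```

Calling `count_tags(".")` from the top of the notes repository and printing the sorted items would give an overview of every tag in use, which is also a starting point for the merge/rename housekeeping that a tag-based system needs.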
| | I evaluated Johnny Decimal prior to this, and it didn't fit this | workflow - seems ad hoc enough so I can live without it and has | nothing tags or good search can't solve. Also, it feels not | flexible enough particularly as stuff can't have multiple | categories. Tags are much better mechanism for information | organization, you just need to keep them organized, keep their | number relatively low, and have mechanism for | delete/merge/move/rename which is simple enough here as it is all | on the file system and is a few shell commands away. | JKCalhoun wrote: | I love everything about this: the concept, even the name. I feel | Johnny Decimal just needs a graphic. From a few minutes of | Googling, I think something like this: https://clipart- | library.com/img1/1252227.gif | sproketboy wrote: | [dead] | kstrauser wrote: | When I first saw this, I thought it looked silly and too simple | to be useful. | | The other day I looked at my DEVONthink database I've populated | over the last 15 years or so, and what do ya know. It has a | couple dozen top-level folders, each with a handful of folders | inside, and that's about it. I didn't deliberately set out to do | this, but "Banking/{Bank1|Bank2|Bank3}", "Medical/{Me,wife,kid}", | "Taxes/{2020,2021,2022}", and so on evolved that way anyway. | | I love the _idea_ of tagging, but turns out nearly all the | information I care to store long-term can be filed more easily | than it can be tagged. It's rare that I want to have the same doc | in 2 places, mainly limited to when I'm collecting information to | send to someone else (e.g. filing taxes, applying for a business | loan). When that happens, I just - shocker! - make copies of | those docs in a new folder I've created to collect everything I | need. DEVONthink makes the copy a zero-sized reference to the | original doc and gives each copy a special icon so you know it's | a duplicate. 
| | So basically, Johnny Decimal couldn't possibly work for me, and | yet I ended up with a sad version of the exact same thing on my | own naturally. Well, huh. Maybe it's not so silly after all. | | (Also, regarding tagging: the idea of a database with a few tens | of thousands of files in the same namespace, searchable by | tagging, gives me hives. I know people do this all the time, and | it's a "me problem" that it bothers me, but oh, how it bothers | me.) | moritz wrote: | What I like about DEVONthink is actually not the "duplicant", | but the "replicant". | | My organization also evolved to a simple hierarchy over time, | but the fact that files can live in several directories at the | same time is very useful in some cases. When there is | ambivalence where a file should go, it can just go in two | places - but it's not a duplicate, so you don't run into | uncertainties which one is the latest version, etc. So it's _a | bit_ like tagging (which in DT you can additionally do), but | also not quite... | kstrauser wrote: | Yep. I didn't get into the specifics, but I use the replicant | feature all the time for the kinds of things I mentioned. | | Plus, it's so freaking good at finding stuff wherever you | might have happened to have squirreled it away. | rsecora wrote: | Paraphrasing Greenspun's tenth rule [1] | | Any sufficiently complicated library management system contains | an ad-hoc, informally-specified, bug-ridden, inconsistent | implementation of half of the Dewey System [2]. | | [1] https://en.wikipedia.org/wiki/Greenspun%27s_tenth_rule | | [2] https://en.wikipedia.org/wiki/Dewey_Decimal_Classification | [deleted] | [deleted] | esperent wrote: | I think you're being unfair. There's nothing bug ridden or | inconsistent here, just a simple categorization system that | looks like it would be pretty decent for small to medium sized | projects. | | It's also not informally specified. The shared link is | literally the specification document. It's written in a kind
It's written in a kinds | of informal style, sure, but that's a different kind of | informal - Greenspun's informal means "not written down at | all". | JadeNB wrote: | I suspect that rsecora was going for humorous parallelism | rather than meaning any dig at the linked project. | hammock wrote: | >I think you're being unfair. | | Uncharitable. The fact that this is called "Johnny Decimal" | is a nod to Dewey Decimal in the first place | maliker wrote: | It's always boggled my mind how disorganized most companies are | with written information. It's always a wiki here, 7 different | file shares over there, most of the latest data is on workers' | desktops named "mgmt report 04032023 latest jb edits 2.0.doc". | Constant stream of "can you send me the thingamajig file?" | | And yet, we've all been to a library. Information organized by | topic, then by author, and inside the books everything is further | organized into chapters, and then there's an index referencing | all of that (plus a card catalog/search system). | | I use something similar to the Johnny Decimal system described at | work, except the high level is by project not by topic. I find | chronological filing split into projects (i.e. chunks of | time/money/effort) matches my workday better. | SanderNL wrote: | Libraries are also easy. Books are done, one dimensional pieces | of linear writing. The text itself is the thing you care about. | | Companies run on mental models that are occasionally _partly_ | solidified (and ultimately ossified) in a textual format. | argiopetech wrote: | The thing with libraries is that they're full of librarians. | For some reason, it has fallen out of vogue for all but the | oldest/largest companies (and government agencies) to hire | librarians to work outside their libraries. | jen729w wrote: | https://johnnydecimal.com/10-19-concepts/11-core/11.08-the-l. | .. | | I couldn't agree more. 
:-) | | - johnny | mxuribe wrote: | And, librarians are professionals who are trained | specifically in the challenges around managing info, | etc... Like many other areas, many corporations don't value | long-term attention to things that will help them the | most...in the long term. It's just too much short-term | thinking...as well as, "oh hell, we don't need to hire | librarians...that takes money away from | stockholders...Everyone in the company will just figure out | how to manage the data at some point in some fashion on their | own...etc." :-p ___________________________________________________________________ (page generated 2023-06-13 23:01 UTC)