[HN Gopher] I'm "still afraid to use spaces in file names" years old
___________________________________________________________________
Author : dario_satu
Score  : 1004 points
Date   : 2021-11-11 09:40 UTC (13 hours ago)

(HTM) web link (twitter.com)
(TXT) w3m dump (twitter.com)

| fortran77 wrote:
| I think people who use a terminal interface, regardless of OS,
| don't like spaces in file names. I avoid them.

| mrb wrote:
| Sort of related, but here's a joke: _Windows 95 does support long
| filena~1_

| necovek wrote:
| I don't use spaces because it's so much faster to type filenames
| out (including with TAB-completion) in the terminal.
|
| I do, however, use Cyrillic (UTF-8) in filenames, and I regularly
| try moving a file to an ASCII-only path to see if some programs
| will then open it (half the time that's the cause when I am having
| trouble).

| spurgu wrote:
| I'm not "afraid" of it, I just think it's unnecessarily
| complicated to work with spaces in filenames on the command line.

| gorgoiler wrote:
| I like to store data on USB flash drives. After being left to
| mature for a few years in a humidity and temperature environment,
| you get some really interesting and _complex_ byte streams where
| your original file names used to be.
|
| Often they are not even valid UTF8 which, when you uncork the
| filesystem for the first time in a decade, causes the most
| delightful crashes. The more years the better the aroma.

| boffinAudio wrote:
| Every week, I encounter a user - just like I did in the 80's -
| who cannot explain the _difference_ between a file and a folder.
|
| "What do I use a folder for?", they ask, in the same breath that
| they request "some way to organize things logically".
|
| The no-filesystem movement has worked hard to eradicate this
| scourge from user experiences, but I fear that this is the devil's
| work.
Computer users _should know what a file is_ , and what its | for - and they should _know_ what a folder is for, and why they | would want to create one to put their files into it .. | | But yet: they don't. | | It hasn't improved since the 80's. Taking away the users | responsibility to understand these things, only makes computing | worse. The fact that "special chars in paths" breaks things, also | holds this factor into place, imho. | octorian wrote: | > The no-filesystem movement | | Is that the movement to store all your data as an amorphous | pile of crap, and then provide easy-to-use search tools to | actually find the content you're looking for? | | On one hand, I really like the search tools that come from | this. But I still like to actually organize my data, so I can | browse it if I want to. Also, these search tools seem to only | work well enough on macOS and fall flat on their face in | Windows. (and no idea where Linux falls on this) | vbg wrote: | Spaces in file names are a bad idea because spaces delimit the | name of separate distinct files, | | At least in my crazy old illogical head anyway. | vbg wrote: | File names should be long enough to clearly communicate | meaning/purpose/context, no more no less. | koziserek wrote: | .doc | koziserek wrote: | och my emojis didn't display, sorry | iknowstuff wrote: | Hahah how ironic. | hajile wrote: | I dislike constantly having to backslash escape files on the | command line, so I use dashes instead. | sixdimensional wrote: | This seems like a case for an axiom I hear infrequently, but I | think comes up a lot - things that seem like they should be | simple and easy, but are in fact difficult. | sieve wrote: | This is a UI/UX problem that I only face when dealing with shells | and shell scripts. Never had any issues when spawning processes | from within languages/runtimes that support sane argument arrays. | | _sh_ , _bash_ and _cmd.exe_ are shit. The shell needs serious | rethinking. 
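A minimal sketch of the argument-array point above, assuming Python 3 and its standard subprocess module. The file name and the child one-liner are made up for illustration, and nothing is written to disk. Each list element should reach the child process as exactly one argument, so the spaces never need quoting:

```python
import subprocess
import sys

# Hypothetical file name containing spaces; no file is actually created.
filename = "my holiday photos.txt"

# The child process only reports how many arguments it got and the first one.
child_code = "import sys; print(len(sys.argv) - 1); print(sys.argv[1])"

# Passing a list (and not using shell=True) hands each element to the child
# as one argument; no shell parses the string, so nothing splits on spaces.
result = subprocess.run(
    [sys.executable, "-c", child_code, filename],
    capture_output=True,
    text=True,
    check=True,
)

received = result.stdout.splitlines()
print(received)
```

The child reports one argument, with the spaces intact. Switching to a single command string with shell=True would bring all of the shell's quoting rules back.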
| necovek wrote:
| This is a difference between $@ and "$@" (note the quotes):
|
|     $ cat proba.sh
|     #!/bin/sh
|     echo "Using quotes:"
|     for i in "$@"; do echo "$i"; done
|     echo "No quotes:"
|     for i in $@; do echo "$i"; done
|     $ ./proba.sh "ho ho ho"
|     Using quotes:
|     ho ho ho
|     No quotes:
|     ho
|     ho
|     ho

| tomcam wrote:
| Damn I didn't know that. Thanks

| Joker_vD wrote:
| I see that there are lots of comments about problems of
| TAB-completions with filenames with spaces in this comment section
| and I am frankly puzzled: both Bash and cmd.exe actually
| TAB-complete those perfectly fine, inserting quoting where it's
| needed.

| tremon wrote:
| And where it isn't needed. If you have a path that contains a
| variable _and_ a space, bash will happily escape the $,
| making the path invalid. See the following:
|
|     $ cd $HOME
|     $ mkdir my\ dir
|     $ ls my[tab]
|     $ cd /
|     $ ls $HOME/my[tab]
|     ls: cannot access '$HOME/my dir/': No such file or directory
|
| That error is because when you press [tab], bash changed the
| path to \$HOME/my\ dir/ but that isn't obvious from the
| output and I couldn't find a proper way to include the
| tab-expanded result in the transcript.
|
| (edit: this is on GNU bash, version 4.3.48(1)-release but
| I've seen this behaviour for years)

| Joker_vD wrote:
| Depends on the Bash version, I guess? Mine is 4.4.20(1) and
| when I do "cd $HOME/my[TAB]", it replaces the input line
| with "cd /home/joker/my\ dir/", and pressing [ENTER]
| changes the directory to '/home/joker/my dir', as can be
| seen from the prompt.

| akovaski wrote:
| The variable escaping behavior has existed for a while
| https://stackoverflow.com/questions/32463052/bash-tabbing-fo...
| https://askubuntu.com/questions/70750/how-to-get-bash-to-sto...
| https://askubuntu.com/questions/41891/bash-auto-complete-for...
|
| And I experience the problematic behavior on my Ubuntu VM.
However, I can get the above describe expansion | behavior if I run: shopt -s direxpand | necovek wrote: | I seem to remember bash losing preferred escaping when TAB- | completing, but can't reproduce it now with 5.0.17. | | Eg. you'd type `ls -l "Spaced [TAB]` and it would turn it | into `ls -l Spaced\ Name`. I remember similar annoyances with | other special shell characters (eg. single quotes, dollars, | slashes), but that all seems to behave sane now. | xyzzy_plugh wrote: | I didn't even know this was a thing, but can't say I've | ever preferred an escape style. I actually use backslashes | a fair bit, usually just with spaces. I tend to reserve | double quotes for variable or shell expansion, explicitly. | necovek wrote: | It's not so much about a preference, but your cursor | would jump about and you'd need to be on the lookout if | you wanted to edit the completion (eg. to change the | extension). | sieve wrote: | > inserting quoting where it's needed | | You have to remind yourself to do this manually in scripts if | you don't want to see lines full of "No such file or | directory." | | One of the reasons the shell is broken is because the | character they use as an argument array member separator is | something that regular people use to distinguish between two | words, such as in a file name. | Joker_vD wrote: | Well, writing scripts would be much less painful if | $VARNAME did not explode into pieces by default. Alas, this | ship has sailed long ago. | goohle wrote: | IMHO, it's possible to add a flag to bash, which will | turn on this behavior, so problem can be fixed, but it | will diverge bash from POSIX sh a lot. | Pxtl wrote: | Yes, but working with filenames with spaces in them is a huge | PITA in command-line tools, because you have to quote everything. | The ergonomics is just really annoying. | | Personally I wish console shells had chosen another delimiter | than space, but here we are. 
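The "you have to quote everything" burden can be imitated without starting a shell. A rough sketch, assuming Python 3's standard shlex module, which follows POSIX-style quoting rules; the file name is invented. Real sh splits unquoted variables on IFS instead of re-parsing quotes, so treat this as an approximation of the effect, not of the mechanism:

```python
import shlex

# Hypothetical file name with spaces.
filename = "ho ho ho.txt"

# Without quoting, the name falls apart into separate words, which is
# roughly what happens to an unquoted $VARNAME in sh.
unquoted_words = shlex.split(f"ls {filename}")

# shlex.quote wraps the name so that a POSIX shell reads it as one word.
quoted = shlex.quote(filename)
quoted_words = shlex.split(f"ls {quoted}")

print(unquoted_words)  # four words: the name was split in three
print(quoted)          # the name wrapped in single quotes
print(quoted_words)    # two words: the name survived intact
```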
| shadowgovt wrote: | And honestly, it's a good fear to have; there are contexts where | it still just doesn't work. | | Last I checked, the standard answer for GNU make is "Spaces are | expected to break the tool, that's working as intended, it will | never be fixed." And because we build our towering edifices of | software on the pillars of the past, I can't guarantee to you | that a project of arbitrary complexity _won 't_ try to cram a | list of filenames through a make script. | timakro wrote: | Maybe if we'd do it more software would actually learn to deal | with it. | TacticalCoder wrote: | Define " _space_ ". Is the Hangul filler we talked about | yesterday a spacing character? Is the zero-width non-breaking | space a spacing character? What about the typographic spacing | characters? | | You should better be _very_ afraid of using spaces in filenames. | | You should do everything you can to support them but you have to | know you'll invariably encounter countless cases where you'll | have this or that tool that won't work properly with them. | | I still live in a world where I cannot name a song from the | french group _L 'imperatrice_ with an eacute in the filename or | my car's media system will display garbage (it's running QNX and | I don't know which filesystem). | | FWIW, and it should be food for thought, every single Git | repository in the world contains a pre-commit hook sample | (disabled by default but it's there) that enforces that every | committed file in the repo is named using a subset of ASCII | characters. | | Every Git repository in the world has that example: let that sink | in. | selfhoster11 wrote: | > FWIW, and it should be food for thought, every single Git | repository in the world contains a pre-commit hook sample | (disabled by default but it's there) that enforces that every | committed file in the repo is named using a subset of ASCII | characters. | | I use Git for documents too, not only code. 
Why shouldn't I use my native language?

| cerved wrote:
| non-ascii characters cause annoying, hard-to-fix problems. If
| you're willing to deal with that - kudos. Personally I don't
| find it worthwhile

| numpad0 wrote:
| Tab completion doesn't work well for languages that require
| IME. That is one reason why _I_ don't.

| dredmorbius wrote:
| IME == Input method editor?
|
| https://en.wikipedia.org/wiki/Input_method

| selfhoster11 wrote:
| That's actually a good point. On the other hand, not all
| languages use IMEs. Mine just uses the AltGr modifier key,
| but is otherwise just a standard QWERTY layout without any
| features.

| glandium wrote:
| Tab completion works just fine for me with a Japanese IME.

| chungy wrote:
| > I still live in a world where I cannot name a song from the
| french group L'imperatrice with an eacute in the filename or my
| car's media system will display garbage (it's running QNX and I
| don't know which filesystem).
|
| I have an Android phone and I tell MusicBrainz Picard to save
| all files with ASCII-only names and Windows-compatible names
| for the ones that get sent over to the phone. Basically for
| this reason. Sometimes it's players on Android itself, but even
| more frequently, whatever bluetooth radio I'm connected to
| freaking out with non-ASCII characters.

| torstenvl wrote:
| What do you mean, display garbage?
|
| L'imp?ratrice? L'impratrice? L'impA(c)ratrice? L'imp,ratrice?
| L'impUratrice?

| kingcharles wrote:
| You get all those space characters working and then some jerk
| comes along and uploads a file like this:
|
|     regex-this.exe
|     [Zalgo text: every letter carried a stack of combining marks,
|     which this text dump could only render as "[?]"]

| dang wrote:
| Please don't Zalgo on HN. It's enough to speak its name.

| allemagne wrote:
| It would be one thing if it was making other comments
| difficult to read or causing browser issues, but I
| appreciated the demonstration that both would presumably be
| possible on certain browsers

| quantified wrote:
| Glad you didn't choose a sequence that crashes my browser.

| meepmorp wrote:
| regex this, bravo

| jagged-chisel wrote:
| 768 characters is too long for macOS it seems. (References
| online say HFS+ has a limit of 255 UTF-16 characters. Didn't
| find anything for APFS immediately... edit: same for APFS)

| hnuser847 wrote:
| Honest question - what the heck are those characters?

| sdenton4 wrote:
| Zalgo text: https://zalgo.org/
|
| It was a great joke for a couple weeks two internets ago.

| Sohcahtoa82 wrote:
| > two internets ago
|
| It's been like three internets since I heard someone
| using "internet" as a measurement of time.
|
| It's actually interesting to think about "generations" of
| internet, just like generations of people, and how the
| culture shifted between them.
|
| There was a time in the early '00s when broadband was
| catching on, yet YouTube didn't exist. A time when
| Ebaumsworld and Newgrounds ruled the internet. When
| Homestar Runner was pop internet culture. Weebls Stuff.
| The frog blender.

| grishka wrote:
| Combining diacritic marks.

| Valgrim wrote:
| It's corrupted text or "Zalgo" text, it relies on diacritics.
|
| See this answer on stackoverflow:
|
| https://stackoverflow.com/questions/1732348/regex-match-open...

| lmkg wrote:
| I disagree with calling it "corrupted." We're not
| tricking the browser into trying to render garbage bytes
| that are actually the middle of a jpeg or something. It's
| actually valid Unicode. It's an edge-case which is not
| seen in regular usage, but it's technically following all
| of the rules.
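To illustrate the "valid Unicode" point: a small sketch, assuming Python 3 and its standard unicodedata module. The strings are toy examples, not the file name from the thread. Combining marks are ordinary code points that attach to the preceding base character, which is also how a short-looking name can grow past a file system's length limit:

```python
import unicodedata

# "e" followed by U+0301 COMBINING ACUTE ACCENT: two code points, one glyph.
decomposed = "e\u0301"
# NFC normalization folds the pair into the single code point U+00E9.
composed = unicodedata.normalize("NFC", decomposed)

# A tiny Zalgo-style string: one base letter with three stacked marks.
stacked = "x" + "\u0301\u0302\u0303"


def strip_marks(text: str) -> str:
    """Drop every combining mark, keeping only the base characters."""
    decomposed_text = unicodedata.normalize("NFD", text)
    return "".join(
        ch for ch in decomposed_text if not unicodedata.combining(ch)
    )


print(len(decomposed), len(composed))   # code points before and after NFC
print(len(stacked))                     # code points in the stacked string
print(len(stacked.encode("utf-8")))     # bytes the name would take in UTF-8
print(strip_marks(stacked))             # the base letter on its own
```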
| orangepurple wrote: | In digital typography, combining characters are characters | that are intended to modify other characters. The most | common combining characters in the Latin script are the | combining diacritical marks (including combining accents). | | https://en.wikipedia.org/wiki/Combining_character | db48x wrote: | Specifically _Vietnamese_ combining characters. The | Vietnamese writing system uses multiple combining | characters at a time, and stacks them vertically. Throw | in a few that wrap around the character like | t*1.000.000*his, some alternat lttr frms, disturbing | imagery, and perhaps a few other tricks, and you have | zalgo. See also | https://stackoverflow.com/a/1732454/823846 | Loughla wrote: | This legitimately made me laugh out loud in my office. | | The characters reach up off the screen as I reply to this. | They overlay the comment above you. Amazing. How? | pxndxx wrote: | It's usually called Zalgo text, and it's what you get when | you start stacking all kinds of Unicode diacritics on poor | unsuspecting characters. | | https://en.wikipedia.org/wiki/Zalgo_text | Macha wrote: | Interestingly I get different behaviour per browser/OS. | Firefox/Linux clips it to the bounding box of the parent | element, Firefox/Mac and Safari/Mac clip it to the line | height, and only Chrome/Mac lets it extended further. | terr-dav wrote: | Firefox and Safari on iOS 15 both render all the glyphs | attached to the base character. Vivaldi, Chrome and | Firefox on Win10 all render them stacked and overlapping | the parent and child comments. | detritus wrote: | Huh, I tried it in Chrome to see how it reacted here and | it maintained about the same position as it did in my | usual browser, Firefox. | kingcharles wrote: | This is the best generator I found: | https://lingojam.com/GlitchTextGenerator | nyanpasu64 wrote: | I find | http://animalswithinanimals.com/generator/generator.html | much more controllable. 
| Liquix wrote:
| For anyone who is curious (and acolytes of Zalgo): "In
| Unicode, character rendering does not use a simple character
| cell model where each glyph fits into a box with given
| height. Combining marks may be rendered above, below, or
| inside a base character. So you can easily construct a
| character sequence, consisting of a base character and
| "combining above" marks, of any length, to reach any desired
| visual height, assuming that the rendering software conforms
| to the Unicode rendering model."
|
| [https://stackoverflow.com/questions/6579844/how-does-zalgo-t...]

| shadowgovt wrote:
| Hah, lucky for me Chrome on Ubuntu didn't implement the
| spec correctly. ;)

| prepend wrote:
| Let me tell you how much of a pain in the ass it is that my
| employer forces spaces in the corporate OneDrive directory.
|
| PS: Microsoft is horrible about stupidly named folders being
| created and dumped in there.

| mtift wrote:
| I have an overly-aggressive function in my .bashrc to rename all
| files in the current directory:
|
|     # Rename all files in a directory
|     rn() {
|       rename "s/ /-/g" *
|       rename "s/_/-/g" *
|       rename "s/–/-/g" *
|       rename "s/://g" *
|       rename "s/\(//g" *
|       rename "s/\)//g" *
|       rename "s/\[//g" *
|       rename "s/\]//g" *
|       rename 's/"//g' *
|       rename "s/'//g" *
|       rename "s/,//g" *
|       rename "y/A-Z/a-z/" *
|       rename "s/---/--/g" *
|       rename "s/---/--/g" *
|     }
|
| I use this all the time, especially when I download files.

| BiteCode_dev wrote:
| Thanks to all the comments in this thread, I now have "sudo
| apt install rename detox" in my install script, and:
|
|     normalize_names() {
|       rename "s/-/_/g" *
|       detox -s lower *
|     }
|
| in my .bashrc.
|
| I've thrown some edge cases at it, and it handles them super
| well. It deals with consecutive "_", removes leading garbage,
| normalizes unicode, and even prevents naming conflicts by opting
| out early.
|
| Thank you.

| cerved wrote:
| I wonder if rename has an -e flag like sed.
It might be worth | baking this into one monolithic regex if you call this often | mrzool wrote: | You might be interested in detox: | | https://github.com/dharple/detox | OskarS wrote: | Overly aggresive is right! I don't know if this is genius or | deranged! I'm leaning towards genius and stealing the idea. | | By the way: what's your beef with en dashes? I mean, if it was | "everything should be 'HYPHEN-MINUS' (U+002D)", then fine, but | why specifically en dashes and not em dashes? | michaelt wrote: | _> By the way: what 's your beef with en dashes?_ | | Of all the changes in that list, removing _the character that | doesn 't appear on a standard keyboard_ seems like the least | controversial... | mywittyname wrote: | To add, it's a character that gets magically inserted for | no reason in various situations. | | It's up there with those damn angled quotes. | jedimastert wrote: | A better question might be "how did it get there in the | first place?" | dredmorbius wrote: | Presume all inputs are hostile. | | Whether people or processes, something is likely to | introduce the character at some point. | ggm wrote: | Sw which converts -- and __ on the fly. Same sw converts | quote pairs "for your convenience" | tgbugs wrote: | Word of warning from hard experience: rn is a really dangerous | thing to name a function because it is one char away from rm. | post-it wrote: | Looks like it's typically run without any arguments, so it's | probably fine. | lioeters wrote: | A typo can go the other way, like "rn somefile" where it | was meant to remove a file but instead it renames all | files. | spurgu wrote: | One char away also physically on the keyboard (maybe that's | what you meant?). | tgbugs wrote: | Yeah, the physical layout is the primary concern. I should | have noted that since there is ambiguity because n and m | also happen to be next to each other in the alphabet. 
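One way to make a bulk rename like the functions in this subthread less risky is to compute the whole plan first and refuse any target that two files would share. A hedged sketch in Python 3: sanitize and plan_renames are hypothetical helpers, the cleaning rules are a simplified variant and not the exact rules of the shell function, and the file names are invented. Nothing is touched on disk:

```python
import re
from collections import defaultdict


def sanitize(name: str) -> str:
    """Lowercase a name and turn runs of spaces/underscores into hyphens."""
    cleaned = re.sub(r"[ _]+", "-", name.lower())
    # Drop brackets, quotes, commas and colons.
    cleaned = re.sub(r"[()\[\]\"',:]", "", cleaned)
    # Collapse runs of hyphens left behind by the removals.
    return re.sub(r"-{2,}", "-", cleaned)


def plan_renames(names):
    """Return (safe, collisions).

    safe maps old name -> new name for targets claimed by one file only;
    collisions maps a contested new name -> all old names that want it.
    """
    by_target = defaultdict(list)
    for name in names:
        by_target[sanitize(name)].append(name)
    safe = {
        olds[0]: new
        for new, olds in by_target.items()
        if len(olds) == 1 and olds[0] != new
    }
    collisions = {
        new: olds for new, olds in by_target.items() if len(olds) > 1
    }
    return safe, collisions


safe, collisions = plan_renames(
    ["Movie Bla (2020).mp4", "Movie_Bla_(2020).mp4", "Notes, Final.txt"]
)
print(safe)
print(collisions)
```

Only the entries in safe would then be passed to an actual rename step; anything listed in collisions needs a human decision.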
| Extigy wrote:
| I once ran "crontab -r" instead of "crontab -e" and also
| thought that was terrible design for the same reason.

| TheSkyHasEyes wrote:
| ren would be better than rn. :)

| theshowmustgo wrote:
| Nice but how do you prevent overwrites? What about
| directories/folders and the files in that directory/folder?
|
| I have:
|
|     Movie Bla (2020)
|     Movie Bla (2020).mp4
|
| But also:
|
|     Movie_Bla_(2020)
|     Movie_Bla_(2020).mp4
|     Movie_Bla_(2020).srt
|
| Would not like to lose files like the srt.

| BiteCode_dev wrote:
| rename will stop and output an error.

| Tempest1981 wrote:
| Surely you must run into conflicts now and then?

| nybble41 wrote:
| That's the most beautiful part! After running this script
| there are no more conflicts, because it just silently
| overwrites all but one version of the "cleaned" filename.
|
| (Also--that entire function is super inefficient and could be
| replaced with a single invocation of "rename".)

| donio wrote:
| https://github.com/dharple/detox is a nice tool for this. Sane
| defaults but configurable.
|
| In addition to CLI I use it from emacs dired-mode too:
|
|     (defun my-dired-detox ()
|       (interactive)
|       (dired-do-shell-command "detox" nil (dired-get-marked-files))
|       (revert-buffer))
|
| I bind it to "_" in dired-mode.

| niccl wrote:
| I use this snippet, to change spaces to underscore for
| directories and files in the current directory and below.
| Haven't made it a function yet, but should. I got it from stack
| overflow or somewhere, but no attribution. Thanks to whoever
| did it first:
|
|     find . -depth -name '* *' |
|     while IFS= read -r f ; do
|       mv -i "$f" "$(dirname "$f")/$(basename "$f"|tr ' ' _)"
|     done

| cmg wrote:
| I nearly gave up on learning newer front-end JavaScript stuff
| like React & webpack and so on a few years ago because of spaces
| in paths.
|
| node-gyp doesn't like it when there's a space anywhere in your
| working path.
Stuff I was messing around with was all in ~/Code | Projects at the time, and using npm install on some things just | broke. Looking back, I definitely could have done a better job | parsing the error messages but still... | | There's an issue but it was closed in 2018 as "The workaround is | to use a path without blanks" https://github.com/nodejs/node- | gyp/issues/439 | xdennis wrote: | Looks like I'm in the minority. I always use spaces and non-ASCII | characters in filenames. | | In many languages it's a requirement. For example, in Romanian, | there are 8 words that collide with ,,fata" if you remove the | diacritics (fata, fata, fata, fata, fata, fata, fata, fata). | | Given that we have to use diacritics, spaces don't seem like a | big deal. | vadfa wrote: | >In many languages it's a requirement. For example, in | Romanian, there are 8 words that collide with ,,fata" if you | remove the diacritics | | That is what context is for. | selfhoster11 wrote: | So do I. I have a language, and I'm not afraid to use it. My | computer should speak it just as well as I do. | rob74 wrote: | Hmmm, I thought I was fluent in Romanian (born there and lived | there for 26 years), but I only know 5 of those 8 words... | xdennis wrote: | That doesn't seem unusual. Only the first 5 are very common. | theshrike79 wrote: | According to Google Translate the first two are "girl" and | the rest are "face". =) | xdennis wrote: | * fata - the girl | | * fata - girl | | * fata - the face | | * fata - face | | * fata - was giving birth | | * fata - a small fish, or a child who won't sit still | | * fata - was fussing | | * fata - variant of fata | | As you might infer from the first 4, Romanian uses postfix | "the" and for singular feminine words you can't tell the | difference if you use only ASCII. | qayxc wrote: | Google Translate is a horrible tool for "translating" | single words or lists of unrelated words. | | Use a proper dictionary for that. 
The very nature of | statistical models makes proper translation without context | impossible for these systems, especially when uncommon | words and diacritics are involved. | hdjjhhvvhga wrote: | So how did you deal with it in the 80s/90s? | PeterisP wrote: | Not sure about Romanian, but for many other languages people | essentially came up with transliteration schemes (multiple, | incompatible, ambiguous) to squeeze your language into ascii. | | The resulting text was understandable by the "computer | people" but not the general population who did not use the | networks back then, perhaps somewhat comparable to when some | time ago USA parents encountered the "SMS slang" used by | their teenagers. | octorian wrote: | Back in the day there were dozens of character sets that were | alternatives to US-ASCII. Having once worked on an Email | client, I needed to bake in a bunch of translation tables to | convert stuff sent that way into UTF-8. | xdennis wrote: | As you would assume: use ASCII and deduce from context. Many | people still do that. | | That has lead to phantom diacritics: reading letters in | unfamiliar words/names based on what you assume they are. For | example some pronounce Chirica as Chirica because they assume | someone forgot to type the breve in a. | apricot wrote: | I call it the habanero trap. There is no n in "habanero", | yet a lot of people say "habanyero", probably by analogy | with "jalapeno". | masklinn wrote: | > Given that we have to use diacritics, spaces don't seem like | a big deal. | | There is one big difference: CLI utilities don't usually care | about diacritics (though encoding issues can throw a wrench in | that), but they care a lot about spaces. So putting spaces in | filenames requires properly quoting or escaping parameters, | whereas diacritics does not. That makes one-off shell snippets | and scripts a lot more annoying (though TBH I tend to shy away | from those anyway, these days). 
| yread wrote: | We have a few words that depend on diacritics to be unique in | Czech as well - though not as bad as this example - but people | just manage without. Hell, I don't even bother installing the | Czech keyboard, if I REALLY need it (like in names), I just | google for words that have the character and copy it | enriquto wrote: | Why stop here? Why not put spaces in your variable names also? | Allowing spaces only in file names and not in variable names is | short-sighted when not inconsistent. | ricardobayes wrote: | You can now? | morpheuskafka wrote: | I'm 19 now and learned this advice from my dad growing up. Still | run into situations in my IT work and programming stuff where it | makes a difference. | pulse7 wrote: | I'm still afraid to use national specific characters in file | names... | Joker_vD wrote: | One of the main reasons why Windows used "Program Files" and | "Documents and Settings" was to _force_ the programs (and | programmers) to deal with paths with spaces. And you know, for | the most part it kinda, more or less worked out although of | course even today you will find programs that ask you to install | them in a folder without spaces in the path. | Rerarom wrote: | VFAT and stuff like that actually provided alternate names like | PROGRA~1 | beardyw wrote: | Yes, I was doing code to quickly read FAT folders (on a micro | controller) and got to the bit about filenames more than 8.3. | I decided my life was too short (and processing time) to go | and sort out what the "real" file name is. Enforced 8.3 as a | requirement! | toyg wrote: | The main culprit for space issues is stuff relying on BAT or | CMD files, where escaping variables seems to be a black art. | | Sadly such set includes loads of Java programs. If only SUN had | shipped a standard way to generate isolated exe files in | 1998... 
but they worked under the presumption that you'd have a | JVM already there, because distributing that monster was | difficult in dialup times, so you could just hand people a jar; | and the enterprise market did not care, since they had webapp | servers. Sadly it's an "optimization" that became obsolete very | quickly but wasn't rectified until it was too late (java 9+). | ReleaseCandidat wrote: | > The main culprit for space issues is stuff relying on BAT | or CMD files, where escaping variables seems to be a black | art. | | Actually it isn't, just use double quotes and add a '~'. It's | just about the only thing batch files handle better than | shell scripts. set "VARIABLE=%~PATH" | makecheck wrote: | They may have thought that would happen but I saw just as much | stuff end up in C:\Windows or \Users or (always my favorite) | those "Documents" that are really just "whatever random crap | every app wants to put there". | dale_glass wrote: | And that was a good idea, if only Microsoft also fixed the | CreateProcess function, Windows would be somewhat sane in this | regard. But somehow nobody seemed to think of it. Seriously, | look at it: | | https://docs.microsoft.com/en-us/windows/win32/api/processth... | | The arguments are a single string. So you want to pass | parameters with spaces in them? You've got to add quotes and | stuff all of that into a single string. Instead of doing it in | a more sane manner, like oh, the arguments to main(). | IiydAbITMvJkqKf wrote: | The root cause is that argv isn't a first-class citizen like | on linux, but an abstraction. The kernel only cares about a | single string argument. If you use main instead of WinMain, | the CRT will transform the single string into an argv for | you. | | Oh and cmd.exe uses a different escaping scheme than the CRT. 
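To make the single-string problem concrete: a sketch assuming CPython 3, whose subprocess module contains a helper, list2cmdline, that joins an argument list into one command-line string using the Microsoft C runtime quoting rules. The helper exists in the module but is not listed in its public __all__, so treat relying on it as an assumption. It is pure string handling and runs on any OS. The program path and arguments are invented, and cmd.exe's separate escaping scheme is not covered:

```python
import subprocess

# Hypothetical program path and arguments, chosen to exercise the rules.
args = [
    r"C:\Program Files\Example\tool.exe",
    "arg_1",
    "second argument",
    'argument with literal " symbol',
]

# Build the one string that a CreateProcess-style API would receive.
# Arguments containing spaces get wrapped in double quotes, and a literal
# double quote inside an argument is escaped with a backslash.
command_line = subprocess.list2cmdline(args)
print(command_line)
```

A program whose C runtime follows the same rules should split that string back into the original four arguments; a program that parses its command line differently may not, which is the fragility described above.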
| dale_glass wrote: | Microsoft is in full control of the Windows kernel, so they | can make it care about whatever they want to, and one would | think better argument passing would be a nice quality of | life improvement. Less nonsense for developers to deal | with, and less weird bugs on the platform. | exciteabletom wrote: | Sure, but MS values backwards compatibility a lot. | | They aren't going to break existing API or bloat the | kernel with a bunch of functions that do the same thing. | Joker_vD wrote: | They can either add a new API which almost nobody would | use -- because everyone already learned to use the | existing one and either reused or reimplemented the | MSVCRT's logic so that most of the software parse the | command lines the same way; or they can literally break | every single program in existence by breaking the | interface of CreateProcess -- which is just as likely as | Linux breaking the interface of execve(2). | | Giving CreateProcess a new flag so it would to correctly | accept "path\\\to\\\my\\\program.exe\0arg_1\0second | argument\0argument with literal \" symbol" (with an | implicit \0 terminating it) as lpszCmdLine is an easy | part; the hard part would be forcing everyone to switch | to using it. | | Also, I'm pretty certain this processing happens in the | user space, and Win32 API is already bloated beyond any | belief. | Avalaxy wrote: | Yet in Microsofts own cmd tool I need to put quotes around my | path if I want to refer to any files/folders below those | folders. | uwagar wrote: | or Capital letters | mindslight wrote: | I\ am\ not\ afraid,\ I\ just\ do\ not\ see\ how\ it\ benefits\ | my\ quality\ of\ life. | student2k wrote: | I recently find out a windows folder can't end by a space.. But | python for example you can create this folder 'example ' every | file you create in this folder will be inaccessible, and | impossible to delete. 
| goto11 wrote: | I still use the "web safe palette" when choosing color codes for | CSS | GuB-42 wrote: | Spaces in file names break half of the shell scripts I have | encountered. | | And it is one of the biggest reason I hate Unix shells as | programming languages, it is a minefield. In fact I think that | after a dozen lines, Perl is a better option. It has most of what | shells are good at (i.e. running commands), but saner and more | powerful. | ndesaulniers wrote: | my god, I was simply trying to loop over every file in a dir | and zip it in a bash one liner. Of course, some of the inputs | had spaces in the file names. What an exercise in | frustration!!! | chrisBob wrote: | I know I _can_ put spaces in file names, but \ is one of the | characters I still can 't touch type, so I still hate dealing | with them in the terminal. | Aulig wrote: | I had to move my development folders because you can't develop | Android apps if your project path contains a space. Not sure | where the issue is, if it's gradle or something else. | | Edit: thinking about it again, it might not have even been the | space but the exclamation mark in my path. Or both. | matchagaucho wrote: | Keep%20the%20names%40and%20links%20readable%20or%20submit%20to%20 | encoding | davidjgraph wrote: | You think space are bad (and yes I'm old enough that I don't use | them)... We work with a company that has forward slashes "/" in | their trading name and insist on shared cloud directories | involving them to be prefixed with that trading name. | | As you as you do anything programmatic in/out of these drives it | all hits the fan. So I'd add to the original statement - "Avoid | 'technical' companies with special characters in their name", | it's just not right... | Decabytes wrote: | It's just such a pain in the butt to work with files with spaces. | In a script it's fine b/c I just surround it in double quotes, | but on the command line I hate having to escape the spaces. 
| | This might already exist, but I wonder about a terminal that was | really just a multi-line repl to a language. It would be | preloaded with libraries that replicated all the features of the | gnu core utils, but instead of calling grep like normal, you | called a function like grep("args"). The advantage would be that | you had access to a full blown programming language at all times. | So when you needed to do something more complicated you would | still have access to all the standard language features. And when | you didn't need that, your canned core utils like functions would | work | glitcher wrote: | Wait, what's a file? :P | mherdeg wrote: | There was some prior discussion about a generational shift here | at https://news.ycombinator.com/item?id=28615884 -- there's an | idea that people no longer need to know what files or folders are | in order to get things done day-to-day with software ( | https://www.theverge.com/22684730/students-file-folder-direc... | ). | | I'm wondering when the first generation of college students will | start who have never used a physical keyboard to input text. | ctur wrote: | i feel so seen | mgdv wrote: | Years of Java has me seeing the world in camel case | maydup-nem wrote: | Not afraid, but typing a dash in the terminal is easier and | shorter than typing a reverse slash and a space. Spaces are kind | of a pain in the ass in the terminal, tbh. | ezfe wrote: | Quotes around the path is easier and avoids any issues - but | tab completion and drag and drop files into terminal handles | most cases for me. | fragmede wrote: | And I'm older than Google. If you want some hilarity, newlines | are allowed in filenames as well (\n, \r, \r\n). Try getting bash | to handle that! (It's possible, though annoying. try redirecting | to `while read line` in addition to xargs -print0 hackery) | mikewarot wrote: | File names shouldn't have anything except a-z,0-9,_ and perhaps a | -. No unicode, no spaces, no nulls. 
| | It's not fear that keeps me from using spaces in file names, it's | habit. | | If we're going to play this dangerous game, from now on I'll | figure out how to use nulls (\0) in my file names, and make all | the C/C++ programmers cry. | codetrotter wrote: | I do it the other way around. I used to be afraid of spaces. But | I have come to realize that it is better to learn sooner than | later which pieces of software are in such a bad state that they | aren't handling spaces correctly. | | That being said, even after all these years I sometimes need to | try a few times in order to get the quoting and the escapes right | when communicating names of files with spaces through multiple | layers of software. | duxup wrote: | Some React scripts freaked out on me recently because my login | (and thus user folder) in Windows contained a space. | MisterTea wrote: | 2021-11-11_I_have_absolutely_no_idea_what_you_are_talking_about.txt | DeathArrow wrote: | If I'm going to use the file in the command line, I won't use | spaces, since I don't know what sick bug I might encounter. | [deleted] | alpaca128 wrote: | I avoid spaces because they make tab completion more cumbersome | in bash. | eloisius wrote: | Same. For documents and stuff that I use in normiespace I give | them friendly names with capitalization and spaces and such, | but for anything I'm going to be working on via CLI I try to | use filenames that will be easily chunked as "words" when doing | things like double clicking it in terminal to select, ^w to | erase it, tab completion etc. | floatingatoll wrote: | Coming from web-heavy _and_ perl5 backgrounds, it's insane to me | that people don't treat filenames and arguments and environment | variables as tainted user input, and just blindly trust | properties about them like "does not contain whitespace or | control characters". | jmull wrote: | This is a general issue to this day. So that isn't very old.
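To make the "treat filenames as tainted input" point concrete, here is a minimal sketch of a conservative name check along the lines of the a-z, 0-9, underscore and hyphen rule suggested above. The function name and the exact policy are hypothetical; allowing dots (for extensions) and rejecting a leading hyphen or dot are assumptions added for illustration, not something the comments specify.

```python
import re

# Hypothetical policy: lowercase ASCII letters, digits, underscore, hyphen,
# and dot. The first character may not be a hyphen or a dot, so a name can
# never look like a command-line option or a hidden file.
SAFE_NAME = re.compile(r"[a-z0-9_][a-z0-9_.-]*")


def is_conservative_name(name: str) -> bool:
    """Return True only if the whole name matches the conservative policy."""
    return SAFE_NAME.fullmatch(name) is not None


for candidate in ["2021-11-11_notes.txt", "my file.txt", "-rf", "a\nb"]:
    print(repr(candidate), is_conservative_name(candidate))
```

A check like this belongs at the point where names enter the system; code further down should still quote and escape properly rather than rely on it.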
| apricot wrote: | That's funny because the first operating system I used (Apple DOS | 3.3) was very liberal about file names. There was a 30-character | limit which was a lot, and it didn't mind spaces in file names. | Even control characters were fair game, which made things fun | when you accidentally inserted a ^A in a SAVE command. | yboris wrote: | I've been stuck for years with a bug in my commercial Electron | application where images do not get displayed if the folder path | has spaces in it :'( | | https://github.com/whyboris/Video-Hub-App/issues/667 | | Any help would be really appreciated! | authed wrote: | I try to avoid spaces and special characters because issues still | happen to this day (just yesterday, I had an issue with a file | with an accent in it). | shockeychap wrote: | Maybe it's just me, but it always seemed like prohibiting spaces | and other special characters was a reasonable way to avoid | unnecessary complexity (and the bugs that accompany it) when | parsing and navigating directory trees and files. | | I'm old enough to remember working with 8.3 filenames in DOS, and | while the length limitation was maddening, the space part never | was. Then Windows 95 came out and all restrictions were thrown | out. | | Why couldn't we just have a file system that robustly supports | long filenames, including variable length extensions, while | prohibiting certain special characters - namely spaces, slashes | or any directory denoting characters in files, and characters | that have special meaning in regex context? (brackets, asterisk, | etc.) | kasabali wrote: | Related: David Wheeler's Fixing Unix/Linux/POSIX Filenames | | https://dwheeler.com/essays/fixing-unix-linux-filenames.html | tgv wrote: | By coincidence, I found another reason just two days ago. A web | app lists uploaded files' names, and (in a rarely used context) | lets the user search for them. 
One user has copied a file name | from the web page, and pasted it into the search box, but got | no results. Turned out that the file name contained two | consecutive spaces, which the browser turns into a single | space, hence no match. Every layer between the user and file | system can do something unexpected. | IiydAbITMvJkqKf wrote: | Posix makefiles don't support spaces in dependency names. Not | sure about gmake. | | Cmake doesn't support semicolons, because everything in cmake is | a string, and ; is the list item separator. | | PATH is separated by colons, so you can't add directories | containing : to it. | reaperducer wrote: | Spaces are still not "permitted" in URLs. | | Browsers will take http://example.com/some name.pdf and | automagically turn it into http://example.com/some%20name.pdf, | and deliver the goods without a problem. But having that space in | the URL is still out of spec, and will cause your web page to | fail validation, even though it works fine. | crescentfresh wrote: | Our local development environment has evolved to a complex enough | sequence of steps to set up and troubleshoot that I spent 2 weeks | creating tooling that you can simply point at source checkout | locations and the tool will take care to setup that repo. | | It broke on the first try on a jr hire's machine, the source | checkout location was `C:\source code`. | Vrondi wrote: | If you're in tech long enough, you can be traumatized by | anything. Like the time a vendor-supplied system decided after an | update that nothing could have a hyphen in the title, and a lot | of existing content just... broke at once. Fun times. | mhd wrote: | I'm >>still tempted to write umlauts like 'Mot"orhead' old.<< | | But also a "use a font that has a proper capital ss" hipster. 
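The URL point above (a literal space is out of spec, `%20` is the encoded form) can be reproduced with Python's standard library. A small sketch, using the `some name.pdf` example from the comment; the variable names are illustrative only.

```python
from urllib.parse import quote, unquote

path = "/some name.pdf"

# Percent-encode the path; "/" is left alone by default, the space is not.
encoded = quote(path)

# Decoding restores the original name, including the space.
decoded = unquote(encoded)

# Two consecutive spaces stay distinct once encoded, unlike in rendered HTML.
double_space = quote("a  b")

print(encoded)
print(decoded == path)
print(double_space)
```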
| jl6 wrote: | I had half a feeling that the warning against using spaces in | names pre-dates computing, but after a little research into | library call numbers and archive accession numbers, which turn | out to have both historically included spaces, I have found no | evidence to support this feeling. | mindvirus wrote: | Heck, I'm still afraid to use caps! | msoucy wrote: | My coworkers still don't quote strings in their bash scripts, | even when they're paths... and yet they wonder why everything | falls apart. | branko_d wrote: | I have an uneasy feeling whenever I see a path parameter declared | as string. Path is not a string - it's a sequence of path | components and should be treated as such by our APIs. A path | should be parsed once - on user input - and then used in its | "sequence form" throughout the software stack. | | And "path component" is not an arbitrary string either - e.g. | appending a path component to the path should first require | converting/parsing the string into the path component, and only | if that's successful appending it to the path. | dahfizz wrote: | > I have an uneasy feeling whenever I see a path parameter | declared as string. Path is not a string | | I guess that depends on what you mean by "string". `open` and | `fopen` need a char* path to open a file. Whatever fancy Path | abstraction you use eventually becomes a char* string, because | that's what the kernel needs. | VWWHFSfQ wrote: | yeah. it's a string. | dwheeler wrote: | On POSIX systems file names are not strings, they are | sequences of bytes. They might not be UTF-8 or have any | meaning. Python3 had to hack around this, they thought they | could force everything to Unicode and discovered that | doesn't work. | guntars wrote: | Which makes for fun issues like there's no standard way | to display a filename in Unix. A system that's, you know, | all about files. | warkdarrior wrote: | Unix: everything is a file, including file names! 
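The "file names are bytes, not strings" point can be illustrated without touching a real filesystem. The sketch below assumes a name stored as Latin-1 bytes and shows that strict UTF-8 decoding fails, while the `surrogateescape` error handler (the mechanism Python 3 uses for file names on POSIX systems, per PEP 383) lets the bytes survive a round trip. The sample name is made up.

```python
# A file name as raw bytes: "café.txt" encoded as Latin-1, not valid UTF-8.
raw = b"caf\xe9.txt"

# Strict decoding rejects the name outright.
try:
    raw.decode("utf-8")
    strict_ok = True
except UnicodeDecodeError:
    strict_ok = False

# surrogateescape maps each undecodable byte to a lone surrogate code point.
name = raw.decode("utf-8", errors="surrogateescape")

# Encoding with the same handler restores the original bytes exactly.
round_tripped = name.encode("utf-8", errors="surrogateescape")

print(strict_ok)
print(ascii(name))
print(round_tripped == raw)
```

The resulting string is fine for passing back to the OS but is not safe to print or store as text as-is, which is roughly the display problem raised above.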
| sipos wrote: | At least for most Linux systems (not sure about other | *nix, but I expect the same?), there is a system default | encoding, defined by the locale, and I think decoding the | filename in that encoding and displaying the resulting | string, is probably the correct way to display a | filename? That seems as good as you are likely to get on | any system really. | | I think for any POSIX system, either there is locale | support defining the encoding, or it uses the POSIX | locale, which defines the encoding (ASCII). | | Of course you need to handle cases where filenames cannot | be decoded in the system encoding (probably by replacing | characters that cannot be decoded), because a filename in | a different encoding, or even with no valid encoding, has | been used on disk. While systems can say that file names | containing bytes that are not valid characters in the | system's encoding are not valid file names, that doesn't | stop people mounting disks with them, so the problem | never goes away if you support opening media from other | systems. | | What I am saying is that this is no more a Unix problem | than it is a problem on any system that supports | removable media. | duped wrote: | That's probably because paths aren't properties of the | file itself, they're helpers to reference the file. | SAI_Peregrinus wrote: | POSIX "Fully portable filenames" allow all characters except | 0x2F (/) and 0x00 (NULL). That means file names can include | line feeds, backspaces, EOF, etc. | | "This is `a | | perfectly vali'd.\010! file name\377, despite the weirdness" | jerf wrote: | "Path is not a string - it's a sequence of path components and | should be treated as such by our APIs." 
| | For maximum correctness, you want to turn it into a file handle | as soon as possible, and do all operations through the | variations of the file functions that end in "at", like: | https://linux.die.net/man/2/openat | | The downside of this approach is that you still technically | have to carry the path around with you if you ever want to | present it back to the user, because once you have a directory | handle, you can get back to the root directory easily enough by | following parent links and seeing what directories you end up | in, but that may not be what the user "thinks" the path is, and | they want to see their path, not a canonicalized one. And | they're mostly right. And it's not easy to correctly track | changes to their intended path from this basis either. | | Basically, I don't know of a really solid, 100% correct way to | handle this with any reasonable degree of effort. | Pxtl wrote: | "you want to turn it into a file handle as soon as possible" | | But no sooner. | | For example, I've run into problems where I'm configuring | program A server to talk to file location B... but _I_ don 't | have access to file location B. But the client-side library | for talking to the server tries to convert location B into a | file handle and then freaks out because I can't access it. | When I don't want to access it. I want that program to serve | it. | | If it was using simple "path" objects that _didn 't_ confirm | that I have access to the path, everything would be hunky | dory. But because it tried to convert it into a file handle | unnecessarily, I get blocked. | jmull wrote: | > For maximum correctness, you want to turn it into a file | handle as soon as possible | | That's not right. You want to resolve a file/folder path to a | file/folder at the exact point it makes sense. | | It's a problem if you're using a path when you wanted the | file. The file can be switched/modified out from underneath | you. 
| | It's also a problem if you've got the file when you only | wanted a reference. Now you can't simply switch/modify the | file independent of the reference. E.g., maybe you want | config file changes to take effect immediately and | transparently. | | You can also have the hybrid case, e.g., where you want the | folder directly, but have a relative path to a file that is | resolved late. | | If you're unsure, I'd err on the side of late resolution. | BoorishBears wrote: | > For maximum correctness, you want to turn it into a file | handle as soon as possible | | This is why I get stressed out when I see paths turned into | special objects encoding separators and such. | | It tells me the path is living for way too long compared to | the file handle. | | I only want to see path-specific objects if we're modifying | the path, and even then I want that to happen as late as | possible. | cerved wrote: | doesn't this lock the file? | aspaceman wrote: | Why not just hold onto both? The users representation and the | file handle. Only ever "display" the representation, while | you do all operations on the handle. (Not trying to be | sarcastic, just curious). | globular-toast wrote: | This goes for most instances of user input. Timestamps is the | other common one people get wrong. I've even seen programs | that pass around timestamps as strings in multiple formats | and as integers (Unix time). | aqfamnzc wrote: | As a programming noob, I'm wondering what would be the | better way to pass or return a unix time value as opposed | to an integer? | globular-toast wrote: | Depends on the language but most high-level languages | have a timestamp or datetime abstraction which you should | be using. | joe_guy wrote: | If it's being serialized, consider fully qualified | iso8601. | tmerr wrote: | Another inconvenience with this approach is that you can keep | thousands of paths in memory no problem. But thousands of FDs | may cause you to exceed per-process limits. 
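For the "a path is a sequence of components" idea discussed above, Python's `pathlib` is one existing approximation. This is only a sketch: the example path is made up, and `pathlib` is weaker than what the comment asks for, since joining does not validate that the right-hand side is a single component.

```python
from pathlib import PurePosixPath

# Parse once, then work with components instead of slicing strings.
path = PurePosixPath("/home/user/source code/report v2.txt")
components = path.parts

# Appending a component needs no quoting or escaping of the space.
sibling = path.parent / "new file.txt"

# Caveat: a string containing "/" is silently split into several components.
joined = PurePosixPath("a") / "b/c"

print(components)
print(sibling)
print(joined.parts)
```

Pure paths never touch the filesystem, so this sidesteps the handle-versus-path timing question entirely; it only addresses representation.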
| anyfoo wrote: | Strings following certain rules are entirely valid | _representations_ of paths, just like sequences of path | components in the chosen language /framework are. Similarly, | the sequences of bits that make up the sequences of your | language/framework in memory are an entirely valid | representation of said sequences of components. | | Yes, paths have structure, but saying "a path is not a string" | is equivalent of saying "C source code is not a string". Both | are strings, and both are something else, represented by | strings according to rules. Different internal representations | have different advantages and disadvantages. I fully agree that | for things such as "adding components" an internal | sequence/list representation is better, but strings can pass | arbitrary IPC or even ABI boundaries much easier for example. | (And you wouldn't bat an eye for example when you see FQDNs | like "www.google.com" passed as a string instead of as | ["www","google","com"] because the string representation works | pretty well.) | fouric wrote: | C source code and paths are both representable by strings, | true, but the fact that they're not actually strings is still | important, because most people don't know that, and in the | case of paths that leads to a lot of edge cases (in the case | of source code it leads to a bunch of inefficient and weak | tooling, which isn't quite as bad). | | Because neither are strings, their _native representation_ | shouldn 't be such - it should be something structured, and | only when necessary (IPC, FFI, serdes) be serialized into a | string representation. This would save people a lot of time | and effort. | gadders wrote: | Where I used to work they had a risk system that created | directories on the window server that matched the book name. They | had a trader that named one of his books "COM1"... | rob74 wrote: | Well, you should still be afraid! Be very afraid! 
Seriously: only | a few months ago I was confronted with a video encoding tool that | didn't work properly when the file names contained spaces - so | yes, even in 2021 it's still safer not to use spaces in file | names... | nojs wrote: | Not to mention most naively written bash scripts! | tiagod wrote: | Honestly, this still causes a lot of problems with some Software. | I've had friends asking for help with obscure errors that were | ultimately caused by the files they were using being on a path | that contains a space or special character. | shoto_io wrote: | On a similar note: "it makes sense to add a date to a file name" | years old. | cbushko wrote: | Base64 is your best friend! | shmerl wrote: | Never use spaces in file names. It shouldn't depend on age, it's | common sense. | imchillyb wrote: | This is why \Program Files, and \Program Files(x86) exist as they | do. With spaces, and strange characters, in the name. | shaoner wrote: | Any shell script that uses files should use double quotes for at | least the variables: `mv $1 $2` is not safe, should be `mv "$1" | "$2"` | neogodless wrote: | I work in Azure Data Factory, and there are places where a space | in a name will cause you difficult to troubleshoot errors. But I | can never remember where. It's not universal. So I just avoid | them entirely. | sva_ wrote: | What about long filenames and paths? | rndgermandude wrote: | I still feel slight unease sometimes when using more characters | than 8.3 | | Damn, I feel old now :P | ourmandave wrote: | A lot of my stuff is cross platform so making filenames portable | means avoiding spaces. | | Ironically, even NASA doesn't like space. | | https://www.nas.nasa.gov/hecc/support/kb/portable-file-names... | zibzab wrote: | Touche my friend, had a good laugh | hardwaresofton wrote: | I am also that age, and kebab-case is the best case for | filenames. | | 2021-01-01-some-important-document.pdf gives me the warm fuzzies. 
| On the off chance that some more differentiation is needed, throw | in an underscore and a whole new world opens up | ModernMech wrote: | Kebab case is the often overlooked benefit of prefix notation | and semantic white space in programming languages. Honestly the | best case of all cases imo. | kibwen wrote: | One glorious day we'll accept programming languages that | require spaces around infix arithmetic operators so that we | can make kebab case a reality! | JasonFruit wrote: | Lisps, especially Scheme with its `x->something-else` | convention, have ruined naming in other languages for me. | MaxBarraclough wrote: | Forth does something like this, by virtue of its reverse | Polish notation. | | In Forth, 'words' (which are roughly analogous to functions | and operators) must always be separated by whitespace, as | Forth doesn't parse out operators the way most languages | do. In exchange, you get the ability to use symbols in | identifiers, as Forth has no reason to single out symbols | like _+_ as being syntactically special. You can even use a | number for the first character. (For that matter, Forth | will even let you override the usual interpretation of a | numerical literal, but that 's always struck me as going a | bit far.) | | It gives you a _+_ word, analogous to the _+_ operator of | most languages [0]. It also gives you a _1+_ word, as an | (admittedly slight) abbreviation of the sequence _1 +_. [1] | If you wanted a _2+_ word, you could easily define it | yourself. | | (This property of Forth evidently wasn't enough to get it | to take over the world, but it's still neat.) | | [0] https://www.complang.tuwien.ac.at/forth/ansforth- | cvs/documen... | | [1] https://www.complang.tuwien.ac.at/forth/ansforth- | cvs/documen... | eCa wrote: | Maybe Raku[1] is for you! 
| | [1] https://raku.guide/#_syntax_overview (see section | 1.7.1) | apricot wrote: | I'm of the opinion that kebab-case is the best case for all | identifiers, because it's easy to read and to type. As always, | Lispers were right all along. | jerry1979 wrote: | I found that some_document_2021-01-01_v03.pdf works best | because it keeps the same document next to its other versions | alphabetically, keeps them in date order, and keeps them in a | sub-day version order. | jaclaz wrote: | As a side note, in the good ol' times of ISO9660 level 1-4 and | the various mkisofs parameters, an underscore _ which is a | CAPITAL -, may have given issues, only for the record/as a | curiosity: | | https://web.archive.org/web/20151007005513/http://www.911cd.... | | P.S. should anyone want to see/run the actual batch, a copy has | been uploaded here: | | http://reboot.pro/index.php?showtopic=18962&page=29#entry204... | Raineer wrote: | In my work, today's date would be 21K11, to save space over the | longer date. | blackboxlogic wrote: | How do you distinguish 21K111 and 21K111? | inanutshellus wrote: | Are you trying to catch GP on differentiating hours, were | it to be appended to his time format (1st @ 11 vs 11th @ | 1am)? | | Notably he didn't promise any, but presumably one'd need a | separator... Maybe, per his "K" usage of the month, one'd | use the alphabet again. 11am would be "K" again... or | lowercase just for giggles? | | I don't think it reads very well, but I also think one'd | get used to it pretty quickly. | blackboxlogic wrote: | I was thinking January 11th vs November 1st. Maybe their | "date" doesn't need/support day-of-month? Or they typod | and I should just focus on my work. | apricot wrote: | I imagine January is A and November is K, so 21A11 vs. | 21K1 (or maybe 21K01). | blackboxlogic wrote: | Ah yes, I missed that K was a month. | onychomys wrote: | Are you working in some embedded system with tiny memory | space or something? 
What's the use of saving one character? | Just make it YYMMDD! | jjoonathan wrote: | > kebab-case | | I hadn't heard that before and I love it. | FpUser wrote: | Same. I had tears in my eyes from laughing. For some | inexplicable reason it seems incredibly funny. | Asraelite wrote: | Google considers it too violent apparently. In one of their | recent changes to their style guide, they started | recommending "dash-case" instead. | | https://developers.google.com/style/word-list#letter-k | [deleted] | prepend wrote: | This guide is stupid. They recommend not using "janky." | sodapopcan wrote: | If you hadn't heard kebab-case called that before there's a | chance you haven't heard SCREAMING_SNAKE_CASE called that | before, and I couldn't live with myself if I didn't let you know. | inanutshellus wrote: | that's hilarious, thanks for sharing that. | | Perennially relevant xkcd: https://xkcd.com/1053/ | sodapopcan wrote: | Aw, in turn I have never seen that particular xkcd--it's | great! I learned to call it "feigning surprise" and I | always try and be conscious of it (though I still catch | myself doing it from time-to-time). | ur-whale wrote: | > 2021-01-01 | | Yes on the date format. | | Saves you so much time. | hnburnsy wrote: | I don't bother with the century or the dashes, saves time... | | 211111_foobar_v1.txt | | I am old enough that I still save before printing. I think it | was Lotus 123 that ingrained it for me. | zz865 wrote: | Agreed on dates ordering problem but 20210101 is so much | easier to type. | testplzignore wrote: | Years that end in a 1 are awful when doing this, especially | in October and November. We've had 20211001, 20211010, | 20211101, 20211110, and now today 20211111. | nicoburns wrote: | But much less easy to read! | zokier wrote: | I just tend to use $(date -Is) so I don't need to think | what date it happens to be today. I guess -Id would work if | you don't want the time part.
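The reason the year-first date prefix keeps coming up in this subthread is that plain alphabetical sorting then matches chronological order. A minimal sketch, assuming a hypothetical `stamped_name` helper and the kebab-case layout from the example upthread:

```python
from datetime import date


def stamped_name(day: date, label: str, ext: str) -> str:
    """Build a name like 2021-01-01-some-label.ext (ISO 8601 date first)."""
    return f"{day.isoformat()}-{label}.{ext}"


names = [
    stamped_name(date(2021, 11, 11), "notes", "txt"),
    stamped_name(date(2020, 12, 31), "notes", "txt"),
    stamped_name(date(2021, 1, 1), "notes", "txt"),
]

# Plain string sorting puts the files in date order, no parsing needed.
ordered = sorted(names)

for name in ordered:
    print(name)
```

Day-first or month-first layouts lose this property, and so does a two-digit year once the names span a century boundary.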
| tambourine_man wrote: | I go one step further: 2021-11-11_client_project-name.ext | | 2021-11-11_client_projectName.ext is also OK. But underscore | separates fields, hyphens for space replacement. | hardwaresofton wrote: | I see and applaud your use of the underscore there, but I | must reject the premise! | | work/client/project/2021-11-11-file.ext is more or less how I | lay stuff out. I'd say client/project is a folder level | distinction (arguably dates too). | | [EDIT] Realistically most of the stuff under <project> is git | repos and I usually make a "home" repo where I keep org files | for tracking hours, notes, and resources related to the | engagement. | Zababa wrote: | I'll be the opposite voice: the file system isn't for | precise organisation, it's just for storing. For | organisation, the ideal thing to use is tags. Since most | file systems don't have tags and using software for that | would be a pain, the best way to do this is to list the | tags in the file name. | cmg wrote: | work/client/project/2021-11-11-file.ext is great until | you've got a '2021-11-11-project-status.txt' in a few | directories and you need to find one quickly! I do a | combination: clients/client/project/2021-11-11-client- | project-update.txt | renewiltord wrote: | I just store it as a content hash and then when I want to | find the file, I just have to recreate its content and I | can then just get the hash. | ModernMech wrote: | It sounds like what everyone in this thread needs is a | database file system. This was always my favorite | proposed feature of Windows Longhorn that never made the | cut. Almost 2 decades later and Microsoft's latest OS | still doesn't have this feature. | nayuki wrote: | I wrote about what I perceived as deficiencies of | hierarchical file systems, and proposed an alternative | organization based on tags and hashes. It was discussed | on Hacker News last week and many years ago. 
| | https://www.nayuki.io/page/designing-better-file- | organizatio... ; | https://news.ycombinator.com/item?id=29141800 | zajio1am wrote: | > But underscore separates fields, hyphens for space | replacement | | But why not the other way, hyphen-minus for separating fields | and underscore for space replacement? That seems to me more | consistent with how underscores and dashes are used. | zepearl wrote: | I fully agree, that's how I do it :) | | my_project-some_activity-this_document-20210923-v02.txt | ridaj wrote: | Maybe you mean `2021-11-11_client_project-name_v2_final.ext` | whatusername wrote: | 2021-11-11_client_project-name_v2_final_ridaj(1).ext | eurasiantiger wrote: | Copy (2) of 2021-11-11_client_project- | name_v2_final_ridaj(1)__FINAL-v2.ext | pluc wrote: | this is the way | tomcam wrote: | but the extra Shifts, no thank you | pluc wrote: | you gotta involve your pinky or it'll atrophy | reaperducer wrote: | Cut most mine off in an unsupervised Halloween pumpkin | carving accident when I was a kid. I think the lack of | length actually allows me to type faster. | FpUser wrote: | I use this style: | | 2021-01-01_what-happened_who-did-it_possible-reason | jonnycomputer wrote: | I've recently shifted sharply toward the dash from the | underscore. I find it more readable, and it doesn't require the | shift key. However, I do find it useful to use underscores to | create groups, e.g. test-001_2021-10-11.log. Including hours, | minutes, seconds is still awkward. | FpUser wrote: | Brother in arms. I just posted similar thing below. | kingcharles wrote: | Burn the witch! | discreteevent wrote: | There's a customer for everything. I've just never liked the | aesthetics of the underscore. Also if your underscored thing | gets put in some document and then underlined the underscores | can become invisible. | jonnycomputer wrote: | A lot of this is personal aesthetics, for sure. Personally, | I am not a big fan of camel casing. 
In code, I only use it | for class names, generally. I don't find it particularly | readable, and for filenames, not all filesystems are case | sensitive, so best not to rely on case to differentiate | files. Camel case does have the nice property of being more | compact, as no character is required. That's its main | benefit. | | R traditionally uses the . as a legal character in | identifiers. Once you get used to it not being syntactic, I | found I actually prefer them to underscores. | ur-whale wrote: | If any of you reading this have to deal with very large scale | data pipelines for data science / ML type processing, and if | "don't use spaces and weird chars in file names" hasn't become | second nature by now, let me just say: you are very, very brave. | intrasight wrote: | My first job as a SW Eng was in 1989 in the nuclear industry. Our | folders and files were limited to 8 letters. So names were | effectively acronyms. It was actually pretty awesome. Clean and | concise. Years later, I still remembered the whole folder | structure. | roody15 wrote: | me too... still use underscore all the time. | hirako2000 wrote: | I never put spaces, and won't go over 32 characters, preferably | fewer than 16. even when sending a file to my grand mom. | that's how deep rooted the trauma is. and yes, it remains an | issue with some parsers and what not. | johnchristopher wrote: | I still find files on the internet that my browser can't | download because too many characters :(. | | Edit: can't save, downloading works. | qwertox wrote: | I have experienced a person using a space _in a password_ for | Windows login. | | I still don't know how to process this emotionally. Either it is | somehow naively really genius, or stupid. | | In any case, it scares me, mostly because it is a non-IT person.
| Waterluvian wrote: | Even if libraries all handled it, I'd still personally avoid | spaces because spaces get semantically used to separate tokens | and I see file names as tokens. | ajsnigrutin wrote: | ascii, no spaces for me | | i still get issues with old one-off scripts, that still work, and | I forgot to properly quote stuff... plus the urls are pain in the | ass with the %20;s. | vbezhenar wrote: | [0-9A-Za-z_-]+ for me. | lkuty wrote: | Same here and most of the time it's even just [0-9a-z_]+ It's | simple and there are no surprises around the corner | anovikov wrote: | But it still breaks in so many situations and becomes a pain in | the ass in so many other ones! I HATE people who use spaces in | file names. For me it is a sign of a "deeply nontechnical | person". | joshlemer wrote: | I don't know that this is really hacker news material guys... | amelius wrote: | You should still be afraid. Many commands such as Unix "xargs" | don't work properly with spaces if the right flag is omitted. | AdamN wrote: | The meta point here is that spaces are the type of thing that | work fine ... until they don't. This class of bug is best avoided | entirely, especially if there is an easy workaround (not using | spaces). | doodpants wrote: | I'm not young, but I've been using Macintosh computers regularly | since 1990, and even back then file names could be up to 31 | characters long, and could include any character except colon.1 | So I'm pretty comfortable using spaces, and sometimes even | non-ASCII characters, in file names. | | Also back then Mac file names typically did not include an | extension, because the file's type was stored as part of the | metadata in its resource fork. I remember one time a friend of | mine was visiting and was playing around with a paint program on | my Mac. Being used to DOS, when she went to save her file, she | typed a very short name, and then asked me what the proper file | extension should be.
I smirked and said, "That's not how you name | files on a Mac. THIS is how you name files on a Mac." And then I | named her file "Ailsa's Cool Picture". Her mind was blown. :-) | | 1This is because the colon was the path separator. But since the | classic Mac OS had no command line interface, the typical user | would never type or even see a file path written out. | forgotmypw17 wrote: | All of that was very cool and impressive and extremely user- | friendly. | | However, I found the lack of a command-line to be restricting. | harshadwaj wrote: | I have been following the guidelines from this presentation for | all my filenames, everywhere and it has been working well so far | - https://speakerdeck.com/jennybc/how-to-name-files | kreeben wrote: | Slightly off topic but I find myself stuck at being "please for | the love of god don't use spaces in git branch names" old. Anno | dazumal this might not even have been an issue and I'm just cargo | culting. | jrimbault wrote: | And on that topic, git branches are case sensitive but windows | filesystem API isn't. Git branches are materialized on the | filesystem as files and directories. | qayxc wrote: | The Windows filesystem API supports CS file- and directory | names just fine. | | It can be enabled on a per-directory basis like so: | | > fsutil.exe file setCaseSensitiveInfo C:\folder enable | | NTFS had support for this for decades now - it was designed | that way to be POSIX-compliant. | | It's shoddy software that lacks support for it, not the OS or | the file system. | jhallenworld wrote: | Yep, I recently got bit by this, someone checked in a branch | named something like "x<-->y", Windows was unhappy. I think | this is a git bug: git should escape these names for the | native platform. | | https://stackoverflow.com/questions/1976007/what- | characters-... | masklinn wrote: | If people actually abuse git branches being CS, odds are good | they're also abusing CS in the repository content. 
| | The linux kernel is one of the offenders, if you check it out | on Windows or macOS (which supports CS but remains CI by | default) you'll immediately get garbage in netfilter, because | it's an habitual user of having different files with names | identical but for the casing e.g. xt_TCPMSS.h and | xt_tcpmss.h. | chrismorgan wrote: | I enjoy choosing fun branch names from time to time. A few of | them: Russian when a user reported a typo in a Russian | translation; emoji (mostly _added_ emoji rather than _pure_ | emoji); and my personal favourite, a ~250 character diatribe | about a single-character bug I was fixing (~250 after I | discovered that Git's error messages when you cause it to try | to use file names too long for the file system are fairly | mediocre). | swayvil wrote: | Me too. Afraid of dashes too as they might be interpreted as | minus. I use a lot of underscores __ _____ _ _ _ | | Weirdly, my friend hates underscores. But he's a baseball fan | nvilcins wrote: | Tangentially, I frequently add dates to filenames to keep things | organized. And _always_ in the `YYYYMMDD` format for clarity and | technical reasons; `DDMMYYYY` (or God forbid the Americans' | `MMDDYYYY`) never made much sense to me. | wglb wrote: | I do this so often that I have an emacs macro or two that helps | me out: (defun mdy () (interactive) | (insert (format-time-string "%04Y-%02m-%02d"))) | | That inserts the "proper" date format (e.g., 2021-11-11) at the | current point. | | Then to create a date-stamped file name: (defun | file-mdy (file-name) (interactive "sbasename: ") | (find-file (format "%s-%s.org" (format-time-string | "%04Y-%02m-%02d") file-name)) (save-buffer)) | | And a few others. | | Nobody seems to misunderstand this date format. US folks might | find it annoying, but understand what it means. | sclangdon wrote: | If you're developing on Windows, I find a good way of dealing | with this to convert paths to short format before using them | (E.G. 
GetShortPathName in kernel32.dll). | andreareina wrote: | Spaces breaking tab completion is still an issue, so, yeah. | | ETA: not broken in a technical sense, but having to escape them | isn't the best experience. So it's just easier for me to avoid | spaces. | JadeNB wrote: | Where? It works fine in bash and I think most shells .... | andreareina wrote: | That was a bit of hyperbole on my end, my bad. But you do | have to escape the space, which I'm counting as a minor | break. | thriftwy wrote: | I have had a huge music library on my RAID, and naturally it had | a lot of spaces, and non-ASCII, in the file names. | | It's cumbersome-ish, but can be made to work. | | Then there's shell injection via files containing a newline | character in their name... | slmjkdbtl wrote: | Can someone convince me to not use spaces in music, film, and | book files where they have a "standard title"? | wglb wrote: | I still find them annoying, doing lots of work on the command | line. I use this hack: #!/usr/local/bin/sbcl | --script (load "~/.sbclrc2") (require 'replace-all) | (in-package :replace-all) (format t "file is ~s" | (second sb-ext:*posix-argv*) (probe-file (second sb-ext:*posix- | argv*))) (let* ((args sb-ext:*posix-argv*) (orig | (second args) ) (newfn (if orig (replace-all | orig "(" "-") orig)) (newfn1 (replace-all | newfn ")" "_")) (newfn2 (replace-all newfn1 " " "-")) | (newfn3 (replace-all newfn2 "&" "-")) (newfn4 (replace- | all newfn3 ":" "-"))) (when orig (format t "renaming | \"~a\" to \"~a\"~%" orig newfn4) (multiple-value-bind (new- | name old-truename true-newname) (rename-file orig newfn4) | (format nil "new-name ~a old-true ~a new true ~a" new-name old- | truename true-newname)))) | forgotmypw17 wrote: | I'm "whitespace as syntax is stupid" years old | 3guk wrote: | Somehow the OneDrive clients still refuse to allow leading or | trailing spaces in the filenames, along with a few other | characters that are not allowed - seems to cause quite a bit of | 
user friction at least with the non-tech guys that I work with | who are confused about why OneDrive is one of the few file | syncing clients that has these requirements.... | icefo wrote: | Gdrive has the same "issue". I think it's on purpose to avoid files | that seem to have exactly the same name. | | This can cause user confusion | luckman212 wrote: | I have had to deal with that nightmare multiple times this | year! It was a real head scratcher at first. | bryanrasmussen wrote: | I'm "still afraid to use spaces in file names" wise, dammit! | nocman wrote: | I would say I'm "wise enough to not use spaces in filenames". | | It's not about fear, it's about making good decisions, and | avoiding unnecessary complication. | cabaalis wrote: | I'm hoping to one day be "Windows adds user root folder to the | quick links in explorer by default" years old. | toyg wrote: | Shells are indeed the main culprits for the continued fear of | spaces, but not the only ones. A lot of programs that deal with | "metadata" which will then generate database tables and stuff | like that, still struggle when working with any sort of special | character. And the same for anything that, behind the scenes, | just feeds text into regexes. | frzj wrote: | Just this weekend I learned that the Espressif Framework | doesn't like it as well. | jimnotgym wrote: | I won't use a space if I think I may need to address that file | from the command line... | foxrider wrote: | I must be a nightmare customer, because I've always been exploiting | my ability to use filenames in full UTF-8. I'm that guy that | sends .pdf to your website. | notacoward wrote: | If putting spaces in file names makes you queasy, try punctuation | - especially punctuation like semicolon or ampersand or single | quote that's meaningful to shells and such. <shudder> | | Also, emoji. | hutzlibu wrote: | Or for more fun, use language specific characters, like | aouss...
| | And even more fun is, when it mostly works, but then it doesn't | and you notice too late. | sokoloff wrote: | You don't name your files with extensions && rm -rf? | amitaibu wrote: | I can relate! :) | stavros wrote: | I saw this and felt old, but then the comments in here made me | realize that the fear\ is%20real. | Pensacola wrote: | I'm newly afraid to use emojis in domain names: | https://tinyprojects.dev/projects/mailoji | dukoid wrote: | I'm "still afraid to use more than 8.3 characters in file names" | years old! | distant_hat wrote: | I had a guy in my team use forward slashes in filenames. Terrible | idea, caused all sorts of weird issues. | zokier wrote: | Did you mean backslashes? I don't know if any filesystem/OS | supports forward slashes in filenames | kps wrote: | OS X does in the GUI; they're isomorphic to ':' at the UNIX | level. (The Mac used ':' as the directory separator.) | rootbear wrote: | And a : in a file name at the GUI level gets turned into a | dash! I just tried to name a text file "Foo/Bar 10:01.rtf" | and it changed it to "Foo/Bar 10-01.rtf"! | kps wrote: | In that case the GUI is merely changing the file name you | type; in a shell you'll see it as "Foo:Bar 10-01.rtf". | danachow wrote: | How was this possible? None of the mainstream operating systems | allow this. | distant_hat wrote: | via GUI in OS X. | danachow wrote: | Ah so that's not really putting a slash in the name on disk | - finder is just displaying the colon that way - it | substitutes with a colon for historical reasons that have | to do with pre OSX MacOS (but you can see if you create a | file from a program or the command line with a colon in it, | it will display as a slash in finder). It shouldn't cause | any problems on its own on the system - but the colon is | troublesome if you have to interact with DOS/Windows | lineage machines. | mrweasel wrote: | But nice for testing. 
I spent a few months on Windows while | doing a Django project and found a number of bugs no one else | discovered because they used Mac or Linux. | 1970-01-01 wrote: | I'm still afraid of any non-8.3 filename. | | https://en.wikipedia.org/wiki/8.3_filename | zwieback wrote: | Anything more than 8.3 is for sissies. | rsync wrote: | acme.sh - a shell script that I use to create "Let's Encrypt" SSL | certificates - creates and maintains directories with asterisks | in them: | | https://github.com/acmesh-official/acme.sh/issues/1408 | | This is the sysadmin equivalent of piercing your nose just to | make your parents mad. | lostgame wrote: | I name almost everything with underlines still. I think it's a | programming habit. | | Although lately I have started saving my Logic Pro files with | spaces, simply because I prefer it to be the name of the song as- | is. | ReleaseCandidat wrote: | Still way too many libraries and programs can't handle spaces in | filenames. | | And shells and other programs still have problems with perfectly | legal characters in filenames too, like '!' or ':'. | HenryKissinger wrote: | > Still way too many libraries and programs can't handle spaces | in filenames. | | "It's nothing." | | "What do you mean?" | | "It's nothing... It's empty space. I never taught the computer | how to read empty space!" | | "I never taught Virgil how to fly." | Pxtl wrote: | Colons are a problem on Windows, so it's reasonable to | discourage creating files with colons in the name. | danielvaughn wrote: | yep, I still don't use spaces. I also don't use uppercase | characters. Just underscores or hyphens. | boringg wrote: | Sometimes I break the rule and use uppercase but never | spaces. | ptha wrote: | I've had issues when moving between Windows/*nix file | systems, where Windows file names are case insensitive and | *nix systems are case sensitive.
| | Build script works fine locally on Windows, but then chokes | in *nix test server, as it's effectively a different path. | danielvaughn wrote: | I've had issues with git when changing a filename, if the | only change is the casing. | jerf wrote: | Was recently encoding my Stargate: SG-1 DVDs to move them to | plex. I was encoding it on a system other than what was serving | it, so I had to copy it. It's surprisingly difficult to "scp" a | file with a colon in it directly. | | I also love when you're using bash and you have a file with ! | in the name, and you accidentally fail to correctly backslash | it, you not only get "bash: !rest_of_filename: event not | found", but it _also_ fails to add that command line to the | history, so you can 't just hit up and fix it. You have to | actually go to the mouse and copy and paste. | philote wrote: | Can't you usually just put quotes around the filename and/or | path to prevent all those issues? | | Edit: nope, just tried it and scp still sees the quoted | filename as a host + path | warkdarrior wrote: | That is just lazy programming. If the input "foo:bar" is | ambiguous, the program should try both interpretations | (HOST:FILE and FILE) and then present the user with a | prompt that provides sufficient information. | | "Does foo:bar refer to the local file `foo:bar' (size: | 102kB, date: 2021-11-11) or to the file `bar' on host `foo' | (FQDN: foo.example.com, IP address: 1.2.3.4)? | | 1: local file `foo:bar' | | 2: file `bar' on remote host `foo' | | Your selection: " | AnIdiotOnTheNet wrote: | It's almost like in-band signaling isn't a good idea or | something. | kerblang wrote: | That sounds like... Puzzle time! I had to cheat, sort of, by | looking at the man page: | | > Local file names can be made explicit using absolute or | relative pathnames to avoid scp treating file names | containing ':' as host specifiers. | | So `scp foo:bar user@host:~` fails because it tries to find | the host foo. 
But `scp ./foo:bar user@host:~` works just | fine. I feel kind of stupid for not guessing as much. | mywittyname wrote: | Is "!" legal in Windows? I'm pretty sure it is not, but I'm not | on a Windows machine to test. | remram wrote: | If you suspect that the file might be handed to a bash script | at any point, being afraid of spaces is very healthy for sure. | chrisseaton wrote: | > And shells and other programs still have problems with | perfectly legal characters in filenames too, like '!' or ':'. | | Without asking you to always quote and escape every file name - | what alternative is there? If they tried this you'd probably | find you didn't like it. | zeroimpl wrote: | Not exactly - the problem is mostly when doing variable | expansion. The fact that bash treats "$x" and $x as different | is a bit of a design flaw. Of course there's still an issue | with evaluating dynamically generated code, but that problem | is partly solved by working with arrays. | chrisseaton wrote: | I mean how do you want shells to deal with file names with | spaces in? Do you think we should have to quote and escape | all file names all the time? If not then how do you think | it should work? | rcxdude wrote: | Shells should treat data as data, and not have the | default behaviour be treating it as code (i.e. you should | need to do 'eval $x' or some equivalent if you actually | want the string to be treated as a shell command). This | would also mean having a real list type, instead of | depending on arbitrary separators in strings. This is | exactly how other languages treat it, and it is not a | significant challenge for interactive use (in fact, it | would substantially reduce the opportunity for surprises | when running commands interactively as well). | billpg wrote: | "You need to add -print0 to your find call and -0 to your | xargs." | jrootabega wrote: | I tend to follow a Postel-like system when it comes to this.
When | I write a script I'll usually get paranoid and make at least | token efforts to handle spaces. Which I will then never, ever | use. | adulion wrote: | I don't even use spaces in csv column names | uncomputation wrote: | I don't think this is so much an age thing as a programmer thing. | Old people will still name files all sorts of things, and a lot | of young programmers today avoid spaces. | NoblePublius wrote: | I love it when characters like | break OneDrive | alephan wrote: | I've never created a filesystem entry name with a space. Mainly | because fear and when fear is not proven, "\" looks so ugly. But | I think I'm even worse, I dislike capital letters too. | JadeNB wrote: | So, born today, eh?--says the guy who still regularly runs into | build scripts that cheerily command that they be run from | directories without spaces, since that's easier than proper | quoting in the script. | HNo wrote: | Anyone else totally fine with spaces in filenames? I use to rip | _a lot_ of CDs back in the day, and never had an issue with the | spaces in the file names. | | 01 - Metallica - Metallica - For Whom the Bell Tolls.mp3 | | Names like that were common, and had many spaces. | snvzz wrote: | Spaces in filenames were a mistake to begin with. | | Spaces are used to separate parameters in the command line. | There's also no real need for filenames to support spaces. | jfb wrote: | The filename belongs to the user. Therefore, it is incumbent on | the computer to adapt, not the other way around. | nomel wrote: | Or, one could claim that the poor parsing of a text interface | shouldn't dictate the for-human names of files, especially when | an exceedingly small percentage of users deal with that text | interface. | | But, of course, if you mix the abstractions of metadata | (filename) with location, things won't be trivial. | kazinator wrote: | The nice thing about spaces is there are so many to choose from, | thanks to Unicode. 
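A minimal sketch of the "so many to choose from" point, assuming Python 3 and a handful of hand-picked code points (the `report` file name is purely illustrative): each of these characters counts as whitespace, yet each produces a different file name.

```python
import unicodedata

# Five code points that are all whitespace (Unicode category Zs).
# The selection is illustrative, not exhaustive.
spaces = ["\u0020", "\u00a0", "\u2000", "\u2001", "\u3000"]

# Official Unicode names, to tell apart what the eye usually cannot.
names = [unicodedata.name(ch) for ch in spaces]

# Hypothetical file names that differ only in which "space" they contain.
filenames = ["report" + ch + "2021.txt" for ch in spaces]

for name, filename in zip(names, filenames):
    print(f"{name:<18} {filename!r}")

# All are whitespace to str.isspace(), yet the names are five distinct strings.
print(all(ch.isspace() for ch in spaces), len(set(filenames)))
```

Printing with `repr` (the `!r` above) is one way to spot which space a name actually contains, since the non-ASCII ones show up as escapes.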
| makapuf wrote: | Well, I'm using makefiles old | hknapp wrote: | Literally just fixed a bug in our software because of an issue | with spaces. | rvense wrote: | There was a discussion yesterday at work about allowing quotation | marks and semicolons in some user-set titles. We use Mongo. But I | empathize. | bborud wrote: | Not obeying the "Robustness Principle" in software is just poor | engineering. | | https://en.wikipedia.org/wiki/Robustness_principle | armandososa wrote: | I'm "8 characters max plus a 3 character extension in your file | names" old. | Havoc wrote: | I_promise_I'm_not. | comeonseriously wrote: | Without exception, I never ever ever use spaces in filenames. | Ever. | trudler wrote: | tbh using spaces in file names is still stupid. | dncornholio wrote: | Remember when we put + instead of %20? Spaces in URLs are still | a nightmare IMO. I still get strange access log entries where | some encoding went wrong, especially in heavy JavaScript | environments. | | Same goes for capitalisation. All filenames should be lowercase. | | Maybe it's not strictly necessary, but it can avoid headaches. | necovek wrote: | Plus sign actually came from | https://en.wikipedia.org/wiki/Query_string#Indexed_search | jasode wrote: | Yes, spaces in filenames introduce edge cases and bugs that | people are not always aware of. | | E.g. Here's a random StackOverflow q&a about a Git pre-commit | hook where the _top-voted answer does not properly handle | filenames with spaces_ : | https://stackoverflow.com/questions/2412450/git-pre-commit-h... | | However, the 2nd and 3rd most upvoted answers do mention the "-z" | option to handle spaces: | https://stackoverflow.com/questions/2412450/git-pre-commit-h... | jonathanoliver wrote: | I always format my filesystems (macOS) as case sensitive and I'm | surprised by the software that has a hard time with that.
| | On Unix/Linux we've grown up with case sensitive by default but | everywhere else it still seems to be a problem now and again. | | I should qualify this...I'm en-US so I have no idea what the | experience is like for anyone else. | phreack wrote: | My username has been my name which has an accented character and | has broken countless Windows apps every year since forever, so I | just keep a C:/Programs folder where I run stuff. You should | never not fear filenames. | ASalazarMX wrote: | I am overly aggressive with spaces and special characters in | filenames: I use them everywhere and report a bug when they | cause errors, because they shouldn't in this UTF-8 age. | | I still don't use the special character of my name in my | username because that has caused me many hard to fix troubles. | Think "cannot recover user password because this user doesn't | exist". | antiquark wrote: | You mean, I'm linux years old? | meepmorp wrote: | This is much older than linux or gnu. | darepublic wrote: | If you're working on cli this is reasonable | glandium wrote: | Spaces in file names are a nightmare in Makefiles. | necovek wrote: | Not if you are careful (a bit like "$@" vs $@ in shell | scripts). | | Edit: replace $@ with quoted version which actually changes the | behavior (I was wrong that the difference is between $* and | $@). | chrismorgan wrote: | I don't think it's fair to claim that any Make implementation | supports spaces: there are too many fundamental bugs and | breakages, so that lots of rather important Make | functionality is off-limits if any of your file names will | have spaces. | | https://www.cmcrossroads.com/article/gnu-make-meets-file- | nam... explains the situation in GNU Make in 2007 (and I | don't think it's changed since then, though jgrahamc | especially could correct me). Not being able to use such | features as $^ and $(patsubst) is _severely_ debilitating for | all but the simplest of makefiles. 
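To make the Make complaint concrete, here is a rough Python sketch of why word-oriented functions fall over. It assumes only that Make's text functions operate on whitespace-separated words; `patsubst_like` and `replace_suffix` are hypothetical helpers written for this illustration, not Make's implementation.

```python
def patsubst_like(suffix_from, suffix_to, text):
    """Loose stand-in for $(patsubst %.c,%.o,text): the input is one string,
    which is split on whitespace before each word is rewritten."""
    words = text.split()
    return [
        word[: -len(suffix_from)] + suffix_to if word.endswith(suffix_from) else word
        for word in words
    ]


def replace_suffix(names, suffix_from, suffix_to):
    """Same rewrite, but on a real list, so every name stays intact."""
    return [
        name[: -len(suffix_from)] + suffix_to if name.endswith(suffix_from) else name
        for name in names
    ]


sources = ["main.c", "my file.c"]

# Flattening to a single string loses the boundary inside "my file.c".
as_words = patsubst_like(".c", ".o", " ".join(sources))
as_list = replace_suffix(sources, ".c", ".o")

print(as_words)  # ['main.o', 'my', 'file.o']
print(as_list)   # ['main.o', 'my file.o']
```

Once the list has been flattened into a string, the information about where one file name ends and the next begins is gone, which is why no amount of later quoting brings it back.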
| necovek wrote: | That's a fair point, thanks! | 123pie123 wrote: | I still use the NetBIOS limitations (15 Characters) when naming | servers | kabdib wrote: | My proposal for a shell on the Mac, in the late 80s, was: | | - Spaces in filenames get transformed to non-breaking spaces by | the filesystem; | | - The filesystem treats nbsp as equal to space (just as case- | folding treats A=a, B=b, etc.) | | Now, argument parsing, mouse double-clicks, etc. all respect | filenames as "words", and the output from things like 'ls' just | works. | | (Yes, I'm well aware that there are case-sensitive filesystems | out there. I'd forgotten that iOS was one of those). | throwawayffffas wrote: | If a filename doesn't match \w+\\.\w+ I hate it | rapind wrote: | I wonder why "space" wasn't always simply treated as another | character. To save a couple bytes back in the 50s (when it | mattered) I assume? | deepsun wrote: | All because we programmatically use interfaces that were intended | for humans to write: command line, sql, html, email headers. | qayxc wrote: | It's worse than that. Whitespace is a hellish invention in the | world of computers: there are multiple characters that may or | may not render as whitespace with no way to distinguish them by | just looking at the output. | | Yet to the machine (script, shell, program, ...) it matters a | lot, since u0020≠u0009≠u00A0≠u2000≠u2001, etc. whereas | the aforementioned codepoints render like this: " " (and yes, | that's indeed the five codepoints in that order - at least I | typed them that way). | | (Ab)Using whitespace like that can lead to all sorts of funny | business, not just when dealing with shell scripts and variable | expansion. | bravetraveler wrote: | Admittedly trite/unhelpful comment: avoid xargs | stochastic_tn wrote: | I see that this guy must be in his early twenties as well. | fallingfrog wrote: | No way I would put anything but a-z, 0-9, and underscore in any | file name.
Too many stupid ways it can go wrong. I guess I have | very little trust in my fellow programmers! | pixelbeat__ wrote: | POSIX portable file names were defined not to have spaces, and | just contain '[[:alnum:]_./]'. | | The findnl script as part of fslint identifies problematic | patterns, and has 4 levels of stringency, with "POSIX" being the | most stringent. | https://github.com/pixelb/fslint/blob/master/fslint/findnl | zibzab wrote: | Why stop at spaces? | | An old prof of mine used to send emails where the subject line | was always a valid identifier in C. | | Hello_dear_students_where_are_your_reports_ | wruza wrote: | That identifier is clearly too long. | | MISRA C:2004, 5.1 - Identifiers (internal and external) shall | not rely on the significance of more than 31 characters. | meshaneian wrote: | As a software engineer, I require testing of paths and files | with spaces, and forbid the use of spaces for any system generated | file possible to make cli easier. | ineedasername wrote: | Instead of spaces I just use U+2215 | kazinator wrote: | Spaces in file names are a poor idea. File names are identifiers, | not titles. | | Let's test something: http://example.com/my silly webpage.html. | | Hey look, HackerNews just broke a URL with spaces in it. And it's | written in a Lisp dialect and all; it's not some Unix job cobbled | together with shell, sed and awk. The language has a string data | type, and strings are passed to functions without word-breaking | interpolations taking place. | | You know what else breaks on spaces? Basic everyday gui text | manipulation. | | Suppose that in a block of text we have the sentence: | | > Please look for the Holiday Schedule 2021 file. | | If you double click on any part of the name like Schedule, pretty | much every text widget on the planet will just select only that | word, and not the entire filename. | | However, if you have: | | > Please look for the holiday-schedule-2021 file.
| | There is at least a ghost of a chance that a semi-intelligent GUI | can pick that out as a word. | | There exist good reasons to keep identifiers as clump beyond just | command line shells. | | It's why we need encoding like %20 in URLs that never pass | through a shell script. | NelsonMinar wrote: | Nothing old about that; lots of stuff is still broken. What are | the odds Homebrew works if installed to a directory with a space | in the name? Maybe the core brew manager itself, but all the | packages? | totetsu wrote: | It messes with tab completion in bash is why I avoid spaces | foxfluff wrote: | I'm hardly afraid but I just think it's poor ergonomics. Same as | the move from xset m 0 0 | | to xinput --set-prop 'pointer:Logitech USB | Receiver' 'libinput Accel Profile Enabled' 0, 1 | | Everything seems to be going this way in Linux land. Longer | names, harder to type names, camelcase names, spaces... I'm | looking forward to an OS that treats command line ergonomics as a | first class feature and where camelcase & spaces are verboten. | martin-t wrote: | I find this attitude misguided. More descriptive names are more | ergonomic for things you only use rarely but they need to be | combined with much better autocompletion than most shells | provide by default. | foxfluff wrote: | You state that as if that were objective.. but that's not my | subjective experience at all. Somehow I have a hard time | remembering these long names, (is it --conf or --config or | --config-file or --config-path? -c would've done it for me. | --set or --set-prop or --set-property or --prop or | --property?), and I need to look them up in a man page | anyway, and I make more typos typing them, and shell | completion rarely works well if at all. I also find it harder | to read and edit long lines that wrap. 
| | Somehow these short letters stick much better for me, and the | effort for finding them in the manual is the same, although | in case of extra complexity as with xinput, it's even worse | with the long names. I don't use either command often, but | it's hard to forget xset m. The only thing I remember about | xinput is that it's a horribly long litany of things which I | need to look up every time, and the syntax still feels weird. | me_me_me wrote: | the most used options for properly written tools have both | short single char option like -c and long-form version | --config if you need verbose self-describing option. | | If you are using CLI tools off GitHub written by a random | person, then no wonder you will see non-standard approaches | to UX. | sidpatil wrote: | PowerShell takes an interesting approach in that it | accepts any truncated variant of a long-form flag as a | short form, provided it isn't ambiguous (i.e. provided the | interpreter can decide which long-form flag to expand the | short-form flag to.) | | For example, if a command features a "-ConfigFile" flag, | valid short-form variants include "-C", "-Co", "-Con", | "-Conf", and so on. But if the command featured an | additional flag "-ConfigURL" for example, the | aforementioned short-form flags would be ambiguous. | Mindless2112 wrote: | getopt_long (and thus most GNU programs) works this way. I | think it's probably a misfeature though since it means | that adding a new option can introduce ambiguity. Having | both short (ex. -x) and long (ex. --exclude) options is a | less problematic solution. | ufo wrote: | The shell ought to be able to help with that. There's no | need to remember if it's --conf or --config if you can | press --conf<tab>. | | One of the things I like about Fish is that by default it | can tab-complete program options and also shows a one-line | description of what each of them does. (It grabs that info | from the man page).
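The unambiguous-prefix behaviour described above for PowerShell and getopt_long can be tried out with Python's argparse, which also accepts abbreviated long options by default. The sketch below is only an illustration: the `demo` program and its `--config-file` / `--config-url` options are made up, and `try_parse` is a hypothetical helper that turns argparse's exit-on-error into a `None` result.

```python
import argparse


def build_parser(option_names):
    """Build a throwaway parser whose long options each take one value."""
    parser = argparse.ArgumentParser(prog="demo")
    for name in option_names:
        parser.add_argument(name)
    return parser


def try_parse(parser, argv):
    """Return the parsed options as a dict, or None if argparse rejects argv.

    argparse reports errors (including ambiguous abbreviations) by printing
    a message to stderr and raising SystemExit.
    """
    try:
        return vars(parser.parse_args(argv))
    except SystemExit:
        return None


# With a single option, any prefix of it is accepted.
one_option = build_parser(["--config-file"])
unique = try_parse(one_option, ["--conf", "a.ini"])

# Adding a second option with a shared prefix makes the same abbreviation ambiguous.
two_options = build_parser(["--config-file", "--config-url"])
ambiguous = try_parse(two_options, ["--conf", "a.ini"])
still_ok = try_parse(two_options, ["--config-f", "a.ini"])

print(unique)     # {'config_file': 'a.ini'}
print(ambiguous)  # None
print(still_ok)   # {'config_file': 'a.ini', 'config_url': None}
```

This also shows the downside raised in the thread: a script that relied on `--conf` stops working once the second option is added. argparse lets you opt out with `ArgumentParser(allow_abbrev=False)`.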
| ori_b wrote: | So much of computing is dedicated to solving problems | that could be omitted. | salawat wrote: | Seriously. Just get up from the computer and go do | something else. /s | | We computer people are truly an odd bunch. | Joker_vD wrote: | I mean, that's precisely my thoughts on copyright and | licensing in general but what can you realistically do? | forgotmypw17 wrote: | Realistically, on an individual scale, you can pretend it | doesn't exist and go on with living your life? | fouc wrote: | > and shell completion rarely works well if at all | foxfluff wrote: | I just tried fish. xinput --set-[TAB] and nothing. | Apparently it doesn't understand the standard long-option | format that is supported by xinput and documented in the | man page. You have to know to omit the dashes and then | it'll complete. And it's downhill from there. | | Yeah I used to have all kinds of simple as well as | supposedly sophisticated completion setups with zsh years | ago but I've given up on it since then. It's always half- | assed and half the time causes more problems than it | solves. Same with bash. There are some places where I | must resist the urge to try complete a filename because | the shell starts trying to figure out which target it can | complete from a Makefile in a large build system and just | freezes. The only practical way out is to interrupt and | type the command again or wait a stupidly long time. | There are other issues like completion trying to be smart | and filtering out things it thinks you don't want to | complete. Nothing is more frustrating than a shell | refusing to complete a filename that you know is there. | throw10920 wrote: | I run fish. I was able to get long-option completion for | gcc, polybar, firefox, man, emacs, xrandr, and fish | itself. The only command I was _not_ able to get long- | option completion for was xinput. You just picked a bad | program to try. | tambourine_man wrote: | I'm with you. Terseness is paramount. 
| | I could never overcome my repulsion for Java and ObjC | because of that. On the other hand, I fell at home with | crazy RegEx that look like line noise to most people. | yepguy wrote: | I think shells could use something like a built-in | eldoc[1], in addition to tab completion. It would make | terse command line interfaces much more usable if you | could see what the positional arguments were for. | | [1]: https://docs.cider.mx/cider/config/eldoc.html | omnicognate wrote: | Spaces don't make anything more descriptive, they just cause | completely unnecessary quoting and escaping hassle. | | The amount of time that has been wasted by Windows using | "C:\Program Files" instead of "C:\Program_Files" far | outweighs any highly questionable aesthetic benefit IMO. | skohan wrote: | What's wrong with camelCase? It's easier to type than snake | thrwyoilarticle wrote: | There's a tendency away from snake_case and towards kebab- | case in things you interact with via CLI. Even moreso towards | nocase. | | Programs like Powershell eschew ease of use in CLI for | readability in scripts. | pvaldes wrote: | Snake_case is problematic for including filenames in TeX | also. This is a big no for me, even if I find it more | readable than the other. | JadeNB wrote: | > Even moreso towards nocase. | | Nocase (did I break a rule by writing it that way?) seems | great when you're enmeshed in the domain and you can see | the implicit separators, but then someone looks at your | naming from the outside and you're guaranteed to have an | 'expertsexchange' in there somewhere. | thrwyoilarticle wrote: | oh, fsck | rk06 wrote: | Powershell is case-insensitive, so camelCase is only a | writing preference | thrwyoilarticle wrote: | It's still verbose in places | chrismorgan wrote: | camelCase is objectively harder to read than snake_case or | kebab-case, though familiarity can mitigate that. | skohan wrote: | I'd argue it's at most a tiny bit harder to read, and a | _lot_ easier to type. 
On balance I'd rather avoid making a | pinky key one of the keys I have to use the most. | frenchyatwork wrote: | Having used a lot of all the formats, I'd argue it's a | lot easier to read and a tiny bit harder to type. For | typing it's basically just an extra `-` unless | your alternative is nocase. | | For reading, CamelCase has 2 significant ambiguity | issues: similarity between I and l, and what do you do | with acronyms. Acronyms wouldn't actually be a problem if | everybody just wrote them as they would in snake_case (i.e. only | capitalize the first letter), but they don't and so it's | anyone's guess whether you're going to get "Id" or "ID". | | There's also a minor issue where if you're on a case- | insensitive file system it can be a little difficult to | change casing, but adding/removing underscores is easy. | [deleted] | daneel_w wrote: | _"On balance I'd rather avoid making a pinky key one of | the keys I have to use the most."_ | | And you use something else than your pinky finger for the | shift key specifically when typing capitalized letters | for camelCase? | skohan wrote: | At least it's where they sit naturally on the keyboard. | And the shift key is wider specifically so you don't have | to be accurate with your pinky when you're pressing it. | The underscore is one of the least ergonomic keys there | is. And you need _both_ pinkies to do it | daneel_w wrote: | I might be misunderstanding. On all layouts I'm familiar | with the underscore key is directly next to one of the | shift keys, or left of backspace. Neither layout requires | the Vulcan death grip. Shift should always be under your | pinky fingers to avoid contortions. | skohan wrote: | On the US layout it is next to the zero key on the top | row. | Pxtl wrote: | imho, the fundamental problem is using space as a delimiter. | Also, case-sensitivity is a disaster for ergonomics.
| | If you had comma-delimiting like in an algol-derived language, | you wouldn't need to quote things with spaces. | | edit: also, code is read more times than it is written, so | optimizing for readability over brevity is generally a good | move. | Dudeman112 wrote: | I could infer a lot about the second and what those params mean | and what they do. | | The first one is some magical incantation. | zsmi wrote: | Another interpretation is: | | On the first, you think you know what it does, but you're not | sure. So maybe it gets looked up. | | On the second, you know you don't know what it does. So you | know to look it up. | | Personally, I'll take the second. Assumptions during | debugging are dangerous things. | foxfluff wrote: | Sure. One could also make "move-down-one-line" be the | incantation to move the cursor down a line in vi, but I | prefer j. | | Ergonomics isn't all about making everything self-descriptive | for someone seeing the thing for the first time. It's about | making things comfortable to actually use. If it's so long | and complicated that you can't even remember how to do it, | it's not very comfortable to use. Even if I could remember, | xset m 0 0 is still far more comfortable. | | And fwiw you still don't know what 0, 1 in accel profile do; | you need to look that up or take a wild guess, and if you | want to use that command, you'll also have to know how to | look up the device because chances are yours is not the same | as mine. So it's not any less magical in the end, just more | verbose. | | The "cool" thing about the xinput command is that you don't | even find accel profile in the man page. You gotta look | elsewhere if you want to understand what it is and what it | does and what the parameters are. | | xset m? Yes, that is documented in the man page. | Gigachad wrote: | It should be based on frequency of usage. I can tell you | that moving down a line in vim is a little more common than | toggling the mouse acceleration.
| | I would never even type such a command. I would just copy | paste it once. | foxfluff wrote: | Yeah well, given that mouse acceleration tends to be on | by default, I need to turn it off every time I'm on a | fresh install or computer I haven't used before. The last | time I needed that was yesterday. | | I don't want to waste time searching for a command to | copy-paste when it could just be made short, simple, | memorable and ergonomic. I could type xset m 0 0 faster | than I could open a browser and ask google how to disable | acceleration with libinput. And again: you can't just | copy-paste the xinput command unless you're lucky enough | that it matches your device. On my new computer, the | device has a different name than on my old laptop even | though it's the same damn mouse. | TheOtherHobbes wrote: | It should be, but how would you keep track of usage | frequency? | | At least it would push all the "This switch was added by | someone playing with UNIX at a university in 1986 and | hasn't been used since" options to the end of the list. | ReleaseCandidat wrote: | > Ergonomics isn't all about making everything self- | descriptive for someone seeing the thing for the first | time. | | We're talking about `xset`. It doesn't make sense to | optimize that for usage of more than once a year. | foxfluff wrote: | The less frequently I need something, the more | frustrating it is if it's not short and memorable (or | easy to look up in the synopsis or built-in help). | Forgetting and googling a needlessly complicated command | over and over again every year isn't fun. | | xset achieves that perfectly. If I somehow _didn't_ | remember how to set mouse acceleration with it, a quick | glance at the synopsis immediately tells me.
Or I can | just run the command and it'll tell me: | To set mouse acceleration and threshold: m | [acc_mult[/acc_div] [thr]] m default | | Zero frustration, and the command is so short and simple | that I end up remembering it without trying. | | This is something I've observed more than once: I easily | memorize useful sets of one-letter flags even if I can't | remember or know what they all stand for. This just | doesn't happen nearly as much with long options. Commands | like ls -ctrl or ss -nap quickly become part of my | repertoire even if I don't use them very often, but I | really couldn't remember ss --numeric --all --processes | (if I had written that from memory, it could've ended up | as --num --all --pid or --numeric --any --process), and I | don't even know what the corresponding long options for | ls are. In the rare case when I have to deal with an | option that has no short equivalent, I feel like I have | to look it up every time if it's been longer than a few | weeks. | | You talk of optimization but I think this is just a very | basic (and reasonably successful) attempt at sane design. | It's not like someone had to go far out of their way to | make this in a manner that isn't batshit insane. | eloisius wrote: | But which case should software interfaces optimize for? | Ergonomics of someone who uses a tool frequently, or | interpretability for casual by-standers of some out-of- | context shell command? | formerly_proven wrote: | Cue nmcli (CLI for Gnome's NetworkManager) which uses UUIDs for | everything and (at least a while ago) did not accept partial- | but-unique UUIDs. Basically goes "nmcli connection up | 5095665a-d82c-4ae6-8964-283623387941". | gertlex wrote: | Weird, I haven't had to do this. Most(/all?) connections have | nice names you can see with `nmcli c`... and so I can do | `nmcli c up id DroidNet` and that's pretty dang nice. 
Pretty | sure this worked with Ubuntu 14.04 (though, nmcli has gotten | much more featureful since then) | | (The ability to shorthand connection->c and similar is great, | too; obviously not unique to nmcli) | apricot wrote: | By this point, I'm pretty sure there are people at gnome who | compete to see who will make the stupidest suggestion that | gets put in production. | MonkeyClub wrote: | It's a Gnomespiracy to determine whether worse is actually | better. | prionassembly wrote: | apt-get install nmtui # it's better | apricot wrote: | The problem is we're optimizing for "easy to learn" rather than | "easy to use". | jjoonathan wrote: | In a world of broken promises and tool churn, minimizing | tooling investment isn't laziness, it's a defense mechanism. | | This is a lesson I had to learn the hard way, multiple times. | forgotmypw17 wrote: | I've learned this lesson too, and I now avoid using any | tools that have broken backwards compatibility in the past | 20 years. | foxfluff wrote: | That may be a part of the problem but honestly I don't feel | like all these new crazy interfaces are easy to learn either. | I mean how do you come up with the litany xinput calls for? | You need to understand the syntax for specifying a device. | You need to know that you're to set a libinput property, and | you need to know the name of that property, and it's not | documented in xinput man page, and of course you need to know | the values to pass which again are not documented in xinput | man page. You can play with --list-props and then take your | search elsewhere because it is completely opaque and doesn't | explain what the properties actually do. | | I suspect the number of people who figured all that out | without having to find it by googling / arch wiki / whatever | is very very low.
| | Now I'm not gonna say xset is the easiest interface to figure | out, but the syntax for setting mouse acceleration is right | there in the synopsis, and if you search down the man page, | you'll learn a little more (and also if you just run xset | without arguments, it'll tell you how to set mouse | acceleration). It might not be the best designed tool but | it's something I learned back in the day as a teenager just | by looking at the man page. | | I think the real issue is that people nowadays are designing | these interfaces to be consumed by interactive configuration | tools, GUI apps, and desktop environments; they're more | dynamic, more complex, more flexible, but not easier to | figure out, not for you on the command line. The command line | is just a last resort. Second class citizen if you will. | forgotmypw17 wrote: | alias mouseoff='xinput set-prop 11 "Device Enabled" 0' | alias mouseon='xinput set-prop 11 "Device Enabled" 1' | | Kind of ridiculous if you ask me. | deckard1 wrote: | On some level it makes sense. The problem with the command | line is familiarity. | | How often do you reach for iptables? If you're like myself, | and most home/desktop users, then probably once in a blue | moon to set it up and then you leave it alone. But a system | admin? Maybe they touch it a few times a week or month. Every | time I use iptables I have to relearn how Linux networking | works. | | Similarly, the xset/xinput thing. When I need those tools I | just create a script or throw it in .bashrc. I adjust the | settings once and will not touch them again for a couple | years. It makes sense to have long parameters that are | _readable_. I can look at my .bashrc and see exactly what | device is getting adjusted. | zibzab wrote: | I've a feeling you will hate powershell | akersten wrote: | Needlessly long parameter/command names and the bizarre | insistence on capital letters are the #1 and #2 reasons I | detest PowerShell. 
Like GP, I resent that Linux tools are | moving in that direction. | ansible wrote: | Well, if you think that's bad, behold the recent trend in | network interface names on Linux. | | We started out with 'eth0', 'eth1', etc. Which adapter was | which could change when adding and removing a network card. | That was bad, so that prompted the evolution. | | Now we have 'enp1s0', 'enp0s31f6', 'enp13s0' and many similar | variations. These are supposedly more stable across device | changes. As it turns out, they weren't. | | But wait, there is more! Now we have the "predictable names" | scheme that produces interface names that are even longer, and | not even slightly easier to remember. | | Read about the whole sorry saga here: | | https://wiki.debian.org/NetworkInterfaceName | | I do get that it is not an easy problem to solve, especially in | the face of removable network interfaces (like USB Ethernet / | WLAN). But surely this is not the best we can do. | foxfluff wrote: | I was actually ranting about this on IRC last night (yeah now | my laptop has two enp* interfaces and enx[MAC]).. | | One thing I like about OpenBSD is that buses are scanned and | drivers probe in order and there's no race between drivers | coming up. Unless your hardware is physically tampered with | or broken, all interfaces come up with the same name across | reboots. Linux isn't like that (even if you don't touch your | hardware, interfaces could swap across reboots), so you need | to do something about it. | | As is typical on Linux, the default is unergonomic and if you | want something nice, you're on your own to make it so. | | If you already have userspace daemons responsible for device | insertion and naming, it really wouldn't have been so hard | for them to e.g. automatically add a config file / database | entry for each interface the first time it is seen.
So the | devices that came up as eth0 and eth1 are still eth0 and eth1 | on the next boot; if I unplug eth0 and add a new card, the | new one would be eth2 because eth0 is still reserved for the | first card I had. | ReleaseCandidat wrote: | > add a config file / database entry for each interface the | first time is seen. | | Ubuntu did that with their persistent-net.rules udev rule. | That was a part of the PITA of the old naming. | nocman wrote: | Missed the 's', it's: | | https://wiki.debian.org/NetworkInterfaceNames | ReleaseCandidat wrote: | > These are supposedly more stable across device changes. | | No. These are stable across reboots. The old eth? weren't. | And yes, that had been a PITA. | nomorecommas wrote: | Long option names are more descriptive, more easily | distinguished, and easier to remember. Your shell should be | intelligent enough to provide tab completion for option names, | assuming it is configured to. | forgotmypw17 wrote: | >Your shell should be intelligent enough to provide tab | completion for option names, assuming it is configured to. | | Wait, are you saying that I need to change my shell or config | to make up for another tool's poor design? | | No, thanks. | Jiro wrote: | Long option names are more difficult to remember because a | long option name can be spelled multiple ways and it is | difficult to remember which spelling is correct. | Angostura wrote: | > Long option names are ... easier to remember ... Your shell | should be intelligent enough to provide tab completion | | They are so easy to remember that you need to configure your | shell to remember them for you? | [deleted] | throw10920 wrote: | These changes are meant to make it easier to _read and | understand_ command-line incantations (and to make them more | explicit, which is always good), because the command-line | paradigm, being text-based, imposes an unavoidable trade-off | between ergonomics and understandability /ease-of-use. 
It | sounds like you prefer ergonomics - although I wouldn't be | surprised if most users would prefer ease-of-use. | | Of course, if one doesn't write a CLI to begin with, this | trade-off doesn't exist - you can have your cake and eat it | too. | hackbinary wrote: | It seems to me that many of the problems associated with spaces | in filenames are due to the OS assuming that a space signals the end | of a command or filename. | | Maybe we ought to have a different character signify the end of a | name? Or signify an option section, or the next option section of | a command? | bcrl wrote: | The Amiga supported spaces in filenames in 1985... =-) | pimterry wrote: | I work on a complex desktop application, and it's been astounding | the number of bugs that have appeared over the years triggered by | spaces and other unusual characters in file names. If you do | anything with subprocesses or path processing, it's absurdly easy | to hit in a thousand different ways, over and over again. | | Pro tip: rename your development directory (or even better: the | workspace path in CI) to put a space and/or special characters in | it. | | Forces you to deal with this properly, and immediately ensures | that every automated test checks this case without you having to | remember every time. Hasn't been particularly inconvenient, since | I'm autocompleting it 99% of the time anyway, and I haven't | shipped a single path parsing bug since. | josteink wrote: | > it's been astounding the number of bugs that have appeared | over the years triggered by spaces and other unusual characters | in file names | | If you consider spaces "unusual" I would say you haven't | encountered a single average user in your lifetime. Spaces in | file-names are the single most common thing people have, outside | programming environments.
| | As an x-plat developer, the only platforms where I (still) | regularly encounter these kinds of bugs are platforms where | solving problems through scripting is common, like Linux, where | the primary means of operation is through stringly-typed | statements getting parsed and processed in an untyped fashion. | It's not very reliable. | | On Windows people more often use "real APIs" (because scripting | doesn't really work as well), but then these problems just go | away. | | Pros and cons, I guess. | SAI_Peregrinus wrote: | It's especially funny that it affects Linux so much. Most | file systems allow everything except `/` and NULL in file | names. Early AT&T UNIX even allowed NULLs! POSIX shells use | the IFS variable to perform field splitting, and it defaults | to <space>, <tab>, and <newline>. The choice to perform field | splitting by default (particularly with spaces in the default | IFS set) has caused no end of headaches for developers and | users. | InfiniteRand wrote: | It's easy to tell users to make a folder with no spaces if | you're setting up a global path, however if you have an | application that runs in user directories things can become | painful fast. Changing your user name is a pain and can leave | things inconsistent, but having to handle all the variations in | people's names with spaces, punctuation, international | characters, can just be mind boggling. | ralphc wrote: | Late '90s I worked on Java software that got installed on | several Unix platforms, including Linux for IBM mainframes. | When you deal with the default en/de-coding of Unicode to | EBCDIC you never have trouble with Java byte encodings ever | again. | dheera wrote: | Or not, which when bugs crop up will teach the businessy types | to stop putting spaces in their filenames. | macintux wrote: | The beatings will continue until morale improves? | | Spaces are very useful for readability.
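The default field splitting SAI_Peregrinus describes earlier in this subthread can be sketched in a few lines. This is only a rough model written in Python for illustration, not how any particular shell is implemented: `str.split()` stands in for splitting on the default IFS characters, and the filename is a made-up example.

```python
import shlex

# Hypothetical filename containing spaces.
filename = "Annual Report 2021.txt"

# Rough model of an unquoted expansion: whitespace separates fields,
# so one filename turns into three separate words.
unquoted_fields = filename.split()

# shlex.quote wraps the name so a POSIX-style parser keeps it as one word.
command_line = "cat " + shlex.quote(filename)
parsed = shlex.split(command_line)

print(unquoted_fields)  # ['Annual', 'Report', '2021.txt']
print(parsed)           # ['cat', 'Annual Report 2021.txt']
```

The first result is what a command would receive if the name were expanded without quotes; the second shows the name surviving as a single argument once it is quoted.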
| cerved wrote: | depends entirely what you're using to browse files | lifthrasiir wrote: | While I agree that we should do this in the ideal world, doing | so will inevitably break other necessary tools so it is | unworkable for me :( | Spooky23 wrote: | Someone should provide the OneDrive/SharePoint people some of | this religion. | | Mysterious character requirements that do not conform with | Microsoft's OS limits, limits on the fully qualified pathname | length, etc. | alpaca128 wrote: | Seems like MS had the same idea according to an answer in the | link: | | _> Microsoft intentionally made programs install to C:\Program | Files on Windows 95+ to force programmers to deal with spaces | in filenames._ | vesinisa wrote: | Except for programs that were too old / obscure to fix I | guess. I think at least the Symbian Development Kit was such | that builds would fail with strange errors if you | installed it in any path other than the default immediate | subdirectory of C:\, let alone under "Program Files". | henrikschroder wrote: | C:\PROGRA~1 | | Easy fix! | billti wrote: | And then to really mess you up and ensure you handle parens | properly, threw "(x86)" into the mix. (A real pain on some | REPLs as well as dealing with environment variables). | lifthrasiir wrote: | And yet they introduced C:\ProgramData in later versions. | kitkat_new wrote: | why "yet"? | | one occurrence is enough to make devs care about it | jjoonathan wrote: | Imagine if they made programmers put 64 bit DLLs in a | "System32" directory and 32 bit DLLs in a "SysWoW64" | directory. That would really keep 'em on their toes! | eyegor wrote: | You should look into the behavior of the | /windows/sysnative link. It appears and disappears | depending on whether your process is running as 32 bit or | 64 bit. | Karuma wrote: | Programmers should never put DLLs in those folders... Or | even ever touch them. | mastax wrote: | Except for \Windows\System32\drivers\etc\hosts, of | course.
| jaywalk wrote: | I occasionally try to search for the reasoning behind the | location of the hosts file in Windows, and I always come | up blank. | jve wrote: | https://superuser.com/questions/355297/why-does-windows- | have... | HideousKojima wrote: | They originally copied BSD's network stack, IIRC | blincoln wrote: | Maybe it's from back before Windows had a built-in TCP/IP | stack? If it were a third-party/optional driver, having | files related to it in a path under system32\drivers | would make sense. | mjevans wrote: | Back around Win 95 when they added networking it was | based off of (IIRC) BSD's TCP stack and related tools. | They were an optional 'third party' driver of sorts, but | shipped by the first party. I'm not positive about WinNT | or Win3.11 (for workgroups?) | mixmastamyk wrote: | I remember adding "trumpet winsock" to Win 3.x back in | the day. Says '94 for that, and summer of '93 for NT 3.1 | debut: | | https://en.wikipedia.org/wiki/Trumpet_Winsock | | https://en.wikipedia.org/wiki/Windows_NT_3.1 | [deleted] | cerved wrote: | Sure. Microsoft only ever ships features | hetspookjee wrote: | I wonder how much global work could have been saved if | Microsoft also provided a covered interface for all paths in | the system. Not sure if there is any, but one good | implementation might save thousands of poor implementations | required to handle it. | moontear wrote: | You mean like the Environment.SpecialFolder enum? | | https://docs.microsoft.com/en- | us/dotnet/api/system.environme... | | There are several other classes that take care of getting | folders, least of which checking system variables. | lamontcg wrote: | Then they made poor APIs so that you have to do this to get | it correct: | | https://docs.microsoft.com/en- | gb/archive/blogs/twistylittlep... | | In *nix at least you can call execve or other APIs that take | a char* argv[] and the whole problem is largely solved and | you don't need to quote things.
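The argument-array approach lamontcg mentions can be illustrated with a minimal sketch. It assumes Python 3.7+ and uses `subprocess.run` with a list, which on POSIX systems hands the child an argument vector rather than a command string for a shell to re-parse. The directory and file names are invented for the example, and the helper `read_via_child` is hypothetical.

```python
import subprocess
import sys
import tempfile
from pathlib import Path


def read_via_child(path: Path) -> str:
    """Have a child process read `path`, passed as a single argv element."""
    # The child is another Python interpreter; sys.argv[1] is the path.
    child_code = "import sys; print(open(sys.argv[1]).read(), end='')"
    # List form: each element becomes one argument, so no shell quoting
    # is needed even though the path contains spaces and parentheses.
    result = subprocess.run(
        [sys.executable, "-c", child_code, str(path)],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


with tempfile.TemporaryDirectory() as tmp:
    # Deliberately awkward path, in the spirit of "Program Files (x86)".
    target = Path(tmp) / "Program Files (x86)" / "my notes.txt"
    target.parent.mkdir()
    target.write_text("hello")
    content = read_via_child(target)

print(content)
```

The same call with a single command string and `shell=True` would need the path quoted by hand; the list form avoids that step entirely.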
| ealexhudson wrote: | I wish they did "User Files" instead of "Users" too, because | so much software breaks on the home area having a space in | it. | | Not least, it makes writing scripts for various shells and | getting the quoting rules right an absolute pain as well... | the_mitsuhiko wrote: | They used to. The folder was called `Documents and | Settings` until Win7. | 323 wrote: | "Documents and Settings" still exists on Windows 10, as a | soft link to "Users". | sixothree wrote: | I know this is completely tangential. But you can Win-R | and just type Documents and it will load your documents | folder. Same for downloads, pictures, temp (windows | temp), and I'm sure many others. | | Works from File-Open dialogs and address bars and even in | the command prompt you can even do "explorer documents". | thedday wrote: | Yeah, it's a junction point, but it's also useless. Open | a command box and CD to it; now what? A file explorer and | set it as the directory, again, now what? | 0des wrote: | You know, this makes me wonder.. tangentially speaking- I | wonder how hard it would be to rearrange the folder | structure in linux so that I have something like this: | | /Users/{root, user0, user1, ... }... | | /System/{Logs, Apps/{opt, container, ...}, Temp, Conf | ...}... | | /Devices/{Mount, sda, sdb, null ...}... | | /Boot/... | Bad_CRC wrote: | macos does something like that. | matheusmoreira wrote: | > I wonder how hard it would be to rearrange the folder | structure in linux | | Restructuring the directories is the easy part. You just | delete the old tree and make a new one. You can also | mount procfs and sysfs wherever you want. | | The hard part is modifying existing software to work with | the new tree. So many programs assume you have a | "standard" file system tree. So many programs assume | procfs is mounted at /proc. So many programs have | hardcoded paths. Shared library locations can become part | of the binaries when they're compiled.
It's insane and | you'd essentially be creating a new Linux distribution. | anyfoo wrote: | Is it coincidence that you almost exactly replicated what | macOS has? Except that /Devices is /Volumes, .../Apps is | .../Applications, and /Boot is handled differently. | | Of course, that's not perfect either, because a) decades | of changes vs. compatibility have made it less clean in | certain places, and b) pretty much all the POSIX paths | still exist for unix-y compatibility, but overall it's | like that. | caymanjim wrote: | You monster! | 0des wrote: | Don't even get me started on /usr/local/bin.. | ThaJay wrote: | You mean "Start Menu"? | riccardomc wrote: | Why not just symlink them? You can have best of both | worlds with relatively little effort. | | Make the overlay of your dreams! | Spivak wrote: | I mean we're heading there with /usr being your /System. | Redhat/Pottering are doing heroic work in this space. |
| /Users -> /home
| /System -> /usr
| /Data -> /var
| /Config -> /etc
| /Boot -> /boot
| /Ephemeral Temp -> /run
| /Persistent Temp -> /tmp
| | The only real holdouts are proc/sys/dev which are the | kernel and mnt/media/opt/srv which are really for the | user/sysadmin and aren't really used by the OS anymore. | woodruffw wrote: | Genuine question: on what systems is `/tmp` persistent? | Both macOS and Ubuntu 20.04 clear `/tmp` on every reboot | for me, and I haven't changed the defaults at all. | earthboundkid wrote: | All storage is temporary. You just gotta wait long | enough. | novok wrote: | People don't reboot often. Persistent tmp basically means | it will be cleared in an infrequent manner, so the | likelihood of it going away 1s after you release your | file handle is low. | mike_hock wrote: | "Persistent Temp" should be /var/tmp. "Persistent Temp" | is also an oxymoron. | nybble41 wrote: | > "Persistent Temp" is also an oxymoron. | | It's not an oxymoron to have files which are temporary | but not limited in scope to a single power cycle.
For | example, you could have a long-running process which you | want to be able to resume if it's interrupted; /var/tmp | would be an appropriate place for the state. The data is | temporary because it will be deleted once the process is | finished, but you wouldn't want it wiped out by a system | reset. Generally /tmp is cleared at every reset, and is | often a tmpfs mount, while files in /var/tmp are | automatically cleaned up only when they reach a certain | age. | tremon wrote: | Except that the FHS says that "data stored in /var/tmp is | typically deleted in a site-specific manner", and as an | application vendor you have no control over that site- | specific clean frequency. On all my systems, /var/tmp is | a symlink to /tmp and that has never caused any issue. | nybble41 wrote: | The FHS is not wrong; cleaning policies are indeed site- | specific and files placed in any temp directory can in | principle disappear at any time. (Though, in theory, it's | not supposed to happen while the files are still "in use" | by running programs.) Still, historically you could count | on files in /var/tmp lasting longer than files in /tmp, | including across reboots. | | Nothing will immediately break because you linked | /var/tmp to /tmp. Whether it causes issues depends on the | programs that you (or your users) run and how they make | use of /var/tmp. However, if someone did have to restart | a long-running process from the beginning because recent | state information in /var/tmp was not preserved across a | reset, I would say that is a problem with the | administration of the system and not the program that | stored its state there. | Spivak wrote: | Basically no one uses /var/tmp for anything (and nobody | should either). World writable directories are a mistake | and only continue to exist because apps assume they are | available. | | /tmp and friends are poorly named. They really should be | /shared or /dmz or /freeforall or something. 
| | * If you need service-specific tmp space use | RuntimeDirectory or PrivateTmp if your app is hardcoded | to /tmp. | | * If you need service-specific persistent data that goes | in /var/lib/your-app. | | * If you need temp space for your user it's at | /var/run/user/your-uid. | | * If you need more than one user/service to share files | _but not everyone_ then god have mercy on your soul | because all options are bad. There sure are a lot of them | but none of them are at all satisfying. | nybble41 wrote: | Right, /var/tmp is the "Persistent Temp" directory, and | /tmp is "Ephemeral Temp". The /run directory is for | _runtime data_ such as PID files, Unix sockets, named | FIFOs, and generated systemd units--it has a specific | internal structure and shouldn't be used as a direct | alternative to the relatively unstructured /tmp | directory. While both are generally ephemeral tmpfs | mounts, only /tmp is writable to all users. | carlhjerpe wrote: | I'm not sure I'm a fan of the capitalization and spaces, | other than that I'm all for more self-explanatory names. | abdusco wrote: | This is what I want from Linux. Sensible & guessable | names for newcomers to figure out where to put files and | programs. | | It's frustrating having to spend time to decide whether I | should install a program in /var or /opt or /usr. What do | they even mean! | | So, I disagree with this convention altogether and use | /apps or ~/apps now. | tenebrisalietum wrote: | The directories that house your executables are read only | to users other than root, to prevent attacks and | overwriting them by non-root users. | | /var stands for variable data--like log files, cache | directories, spool directories, etc. You shouldn't put | executables there. Ideally you should be able to set the | noexec flag on it. | | `/usr` actually exists because the original UNIX | developers ran out of disk space and had to attach | another disk.
The difference between /bin and /usr/bin is | not worth it and even Debian symlinks /bin to /usr/bin. | | But your _distribution's package manager_ should be | putting stuff in /bin or /usr/bin, not you. Anything that | follows the regex "{asterisk}/local{asterisk}" is | something the system owner can do whatever with. So you | should be using /usr/local/bin or $HOME/local/bin. I | don't know why there's no /local off of the root. (One | thing I do on my own systems is make and use an | /etc/local although I think you're supposed to use | something like /usr/local/etc). | | /opt is for third party programs that aren't installed | via your distro's package manager. | | If you do this, any customizations you make to a system | can be easily backed up by copying all dirs with local in | the name. | | There's multiple decades of tradition behind these names, | but they do date back to the age where actual teletypes | were used. | chasil wrote: | Oh, my young friend, you have no idea what POSIX has done | to you. | | "While no one sane would put newlines in directory names, | such corruption of the results could lead to exploitable | vulnerabilities in scripts." | | http://www.etalabs.net/sh_tricks.html | oblio wrote: | He he. | | Want to see true craziness? POSIX file names are just a | bag of bytes. They don't even have to be text, they can | be anything (almost), there's no standard text encoding: | | https://lwn.net/Articles/325304/ | | And in typical Open Source fashion, someone actually | claims it's a feature: https://lwn.net/Articles/325398/ | because hey, you 99.999% percenters can suffer so that I, | 0.001% percenter can implement my wacky system. | | https://xkcd.com/1172/ | chasil wrote: | This appears to demonstrate the full range of abuse.
| $ mkdir hold
| $ cd hold
| $ cat ../wildname.c
| #include <stdio.h>
|
| int main(int argc, char **argv)
| {
|     char n[256];
|     int i, j = 0;
|     FILE *fp;
|
|     /* Every byte value from 1 to 255 except 47 ('/'), the path separator. */
|     for (i = 1; i < 256; i++)
|         if (i != 47)
|             n[j++] = i;
|     n[j] = 0;
|
|     /* Create a file whose name is that whole byte sequence. */
|     if ((fp = fopen(n, "w"))) {
|         fprintf(fp, "hello world!");
|         fclose(fp);
|     }
|     return 0;
| }
| $ cc ../wildname.c
| $ ./a.out
| $ ls -l
| total 16
| -rw-r--r--. 1 luser lgroup   12 Nov 11 16:32 ???????????????????????????????
| !"#$%&'()*+,-.0123456789:;<=>? @ABCDEFG
| HIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~??
| ?????????????????????????????????????????????????????????
| ?????????????????????????????????????????????????????????
| ?????????????
| -rwxr-xr-x. 1 luser lgroup 8464 Nov 11 16:32 a.out
| | Just because you can do something does not mean that you | should. | 0des wrote: | Behold! https://en.m.wikipedia.org/wiki/Filesystem_Hierar | chy_Standar... | LordDragonfang wrote: | I feel like it just highlights how | antiquated and confusing Linux terminology is: so many | of those entries reference "single-user mode", used to refer to | booting into root, when the vast majority of computing | devices a given user will interact with only have a | single _actual_ user, making this a confusing and almost | meaningless distinction to someone not already intimately | familiar with Linux. | emteycz wrote: | Yeah, except that tells me nothing useful... The question | is exactly the same: So where do I install this random | binary I downloaded from the internet or compiled myself? | Is it /opt, /usr/bin, /usr/local/bin, or /bin? Where do I | put the dependencies I compiled for this software - | /usr/lib, /usr/local/lib, /lib, /opt/lib, /opt/<app | name>/lib, or what? | woodruffw wrote: | Is your account the only account that's expected to run | the binary? If so, then `$HOME/bin` is a perfectly | acceptable (albeit not standard) place to put it. 
| | If you expect other users to be able to execute the | program, then you should put it in either `/usr/bin` or | `/usr/local/bin`, depending on whether the former is | already being used by a package manager. `/opt` is | _generally_ for self-contained software that doesn't | play nicely with the rest of the system, but _might_ | still be installable through the default package manager. | megous wrote: | $HOME/.local is the equivalent of /usr/local for per-user | stuff. | mananaysiempre wrote: | I don't think there's any "official" word on that (the | XDG spec that defines ~/.local/share doesn't mention | ~/.local/{bin,lib} IIRC, and the traditional per-user | entry in PATH seems to be ~/bin), but a fair number of | people use it this way, yes, including me. | tom_ wrote: | I started out using $HOME/bin, but a fair amount of stuff | assumes a /usr- or /usr/local-style folder structure when | doing make install, so I've settled on using | $HOME/usr/bin instead, so that programs can create | $HOME/usr/include and $HOME/usr/share and whatever, | without trampling on stuff in my home folder. | | Can't remember the last time I had a problem arranging | this. If using autotools, which covers 95+% of stuff, | it's usually a question of something like "./configure | --prefix=$HOME/usr". | | (If I want to share stuff between users, /usr/local/ is | of course a better place. macOS is a bit more | restrictive, so I have a separate user for this, whose | /usr folder is readable by everybody.) | woodruffw wrote: | Yeah, it definitely gets hairier when using anything | that's more than just a drop-in binary. | matheusmoreira wrote: | > $HOME/bin | | On freedesktop systems there's the ~/.local directory | which is supposed to be a mirror of the file system | hierarchy. Seems like a good place for bin, lib, include | directories. 
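A minimal sketch of the per-user prefix idea discussed above, assuming a POSIX-style home directory. It lays out bin, lib, include and share under a ~/.local-style prefix and checks whether the bin directory appears in a PATH-style string. The helper names (`user_prefix_dirs`, `is_on_path`) and the `/home/alice` path are made up for illustration, and nothing here is mandated by any spec.

```python
from pathlib import PurePosixPath


def user_prefix_dirs(home, prefix_name=".local"):
    """Return FHS-style directories under a per-user prefix (hypothetical helper)."""
    prefix = PurePosixPath(home) / prefix_name
    return {name: prefix / name for name in ("bin", "lib", "include", "share")}


def is_on_path(directory, path_value):
    """Check whether `directory` is one of the entries of a colon-separated PATH string."""
    entries = [PurePosixPath(entry) for entry in path_value.split(":") if entry]
    return PurePosixPath(directory) in entries


# Illustrative values only: /home/alice is an assumed home directory.
dirs = user_prefix_dirs("/home/alice")
print(dirs["bin"])
print(is_on_path(dirs["bin"], "/usr/bin:/home/alice/.local/bin"))
```

The same function covers the $HOME/usr variant mentioned in the thread by passing `prefix_name="usr"`; either way the prefix is what you would hand to something like `./configure --prefix=...`.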
| mananaysiempre wrote: | The standard is, indeed, excessively vague because it was | written to let many existing implementations be | conformant as is, though I'd say it's still more helpful | than many other standards with that deficiency. There's a | method to it, however: | | - Things installed in /, if it's different from / _usr_ , | are generally not to be touched; | | - Things installed in / _usr_ are under the distro's | purview or otherwise under a package manager, any | modifications are on pain of confusing it; | | - Things installed in / _usr_ / _local_ are under the | admin's purview and unmanaged one-offs, there are always | some but overuse will lead to anarchy; | | - Things installed in / _opt_ are for whatever is so | foreign and hopeless in not conforming to the usual | factoring that you just give up and put it in its own | little padded cell (hello, Mathematica); | | - Everything is generally configured using files in / | _etc_ , possibly with the exception of some of the | special snowflakes in / _opt_ ; the package manager will | put config files meant to be edited there and expect the | admin to merge any changes in manually, and sometimes put | default settings meant to be overridden by them in / | _usr_ / _share_ (see below)--both approaches can be | problematic, but the difficulty is with migrating | configuration in general, not the FHS as such. | | There used to be additional hierarchies like / _usr_ / | _X11R6_ , and even a / _usr_ / _etc_ on some (non-Linux?) | systems, but AFAIU everyone agrees their existence makes | no sense (anymore?), so much that even FHS doesn't lower | itself to permitting them. | | The distinction between / and / _usr_ might appear to be | pointless as well, and nowadays it might be (some distros | symlink them together), but previously (especially before | initial ramdisks were widespread) stuff in / was | whatever was needed to bring up the system enough that it | could netmount a shared / _usr_. 
| | Inside each of /, / _usr_ and / _usr_ / _local_ there is | _bin_ for things that are supposed to be directly | executable, whether binary or a script and all in a | single place; _share_ and _lib_ for other portable and | non-portable (usually but not necessarily text and | binary) shared files, respectively, segregated by | application or purpose; finally, due to the dominance of | C ABIs and APIs on Unices, the top level of _lib_ also | hosts C and C++ library files and there's an additional | directory called _include_ for the headers required to | use them. Some people also felt that putting auxiliary | executables (things like _cc1_ , the first pass of the C | compiler) inside _lib_ was awkward so they created | _libexec_ for that purpose, but I don't think the | distinction turned out to be particularly useful so not | all distros maintain it. | | That's it, basically. There are subtler but logical | points (files _vs_ subdiretories in / _etc_ ) and things | people haven't found an obviously superior solution for | (multilib and cross environments), and I made no attempt | to be historically accurate (the original separation of / | and / _usr_ happened for intensely silly reasons), but | those are the fundamental principles of the system, and I | feel it does make sense as a coherent implementation of a | particular design. Other designs are possible (separation | by application or package not purpose, Plan 9-ish | overlays, NixOS's isolated environments), but that's a | discussion on a different level; the point is that _this_ | one is at the very least internally consistent. | | Re the unfriendly names ... I honestly don't know. | Newbie-friendliness matters, but it's not the only thing | that does; particularly in a system intended for | interactive text-mode use, concise names have a quality | of their own. 
There's a reason I'm more willing to reach | for curl and jq rather than for httpx and lxml, for | regular expressions rather than for Parsec, and even for | cmd.exe, as miserable as it is, rather than for | PowerShell. | | I feel weird that no HCI people seem to have seriously | considered the tension between interactive and | programmatic environments and what the text-mode user's | experience in Unix says about it, but even Tcl, which is | in many ways a Bourne shell done right, loses something | in casual REPL use when it eliminates (as far as | idiomatic libraries are concerned) short switches. Coming | up with things like _rsync -avz_ or _objdump -Ctsr_ is | not very pleasant initially, but I certainly wouldn't | want to type out the longhand form that would be the only | possible one in most programming languages (even if I | find their syntax beautiful, _e.g._ Smalltalk/Self). | aranchelk wrote: | I was taught /usr/local/bin | | /opt is for standalone packages, so if it's a single | file, no. | | /bin is only for stuff needed in single user mode, so | probably not (unless that's what the binary is for). | | /usr/bin is going to typically contain files installed by | your package manager and should probably be left | unaltered by human hands. | | The deps I would assume /usr/local/lib but it hasn't ever | come up for me. | nsv wrote: | To add: when you install software yourself you choose | this, when you install software from e.g. a distribution | package it is chosen by the package maintainers, and to a | larger extent the maintainers of the distribution. | | This is one of the big | advantages of using a ready-made Linux distribution: | beyond the convenience of having an installer or | easy-to-install packages, you get some assurance that the system | as a whole has been thoughtfully put together. | | Arch Linux for example symlinks /bin and /sbin to | /usr/bin and /lib to /usr/lib among other things. 
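If you want to see whether your own distro merges these directories the way the Arch example above describes, a rough sketch follows. It assumes a POSIX system where symlinks can be created and read; `symlink_target` and `report_merged_usr` are hypothetical helper names, not part of any distro tooling.

```python
import os


def symlink_target(path):
    """Return the resolved target if `path` is a symlink, else None (also None if missing)."""
    if os.path.islink(path):
        return os.path.realpath(path)
    return None


def report_merged_usr(root="/"):
    """Map bin, sbin and lib under `root` to their symlink targets, or None for real directories."""
    return {
        name: symlink_target(os.path.join(root, name))
        for name in ("bin", "sbin", "lib")
    }


# On a merged-/usr system the values are paths under /usr; elsewhere they are None.
for name, target in report_merged_usr().items():
    print(name, "->", target)
```

The `root` parameter is there so the check can be pointed at a chroot or a throwaway test tree instead of the live system.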
| matheusmoreira wrote: | > So where do I install this random binary I downloaded | from the internet or compiled myself? | | In your home directory. | db48x wrote: | Wherever you want. All of the above, or none. It really | is up to you. | emteycz wrote: | That's exactly the problem. This leads to mess. The | Windows model of C:\Program Files\<app name> is much | better. | db48x wrote: | No, it frees you to pick whatever unmessy solution you | want. | | You can do `configure --prefix=/Program\ Files/<app>` if | you want. | emteycz wrote: | If I am not writing all of my installation scripts by | hand, because that would be really intense, then _every | folder_ gets filled with random bits of software. | | Offering too many similar choices leads to mess. There's | nothing fundamentally different between using one or more | of these options and using the only option, except that | in the second case there isn't any opportunity to make | mess. | | > You can do `configure --prefix=/Program\ Files/<app>` | if you want. | | Thanks for the tip! Can't do that with distro repo | software though :-/ | db48x wrote: | > then every folder gets filled with random bits of | software. | | What does that even mean? When you install something, you | put it where you want it. | | If you don't like where your distribution puts files, | choose a different one. Not all of them use the same | convention. | emteycz wrote: | All (except aforementioned GoboLinux) use FHS. | kevin_thibedeau wrote: | Use Gnu Stow to keep the random bits contained in their | own app directory that is symlinked into the /usr/local | tree. Then you can manage everything without leaving | orphan files behind. | yjftsjthsd-h wrote: | When you download a portable app (just a bare .exe), do | you make a folder for it and drop it in program files? | (quite possible, you'd just be unusual) If not, why does | Windows get a free pass? | drewzero1 wrote: | Okay, but what about ProgramData? 
I have enough programs | that put their junk in there instead of Program Files, | and others that make their own directories on the root of | the drive (driver installers are really bad about this). | | I think the best model I've seen for consistent binary | locations is the 'Applications' folder in Mac OS X, but | it fails as well by retaining the /usr/bin elsewhere. | tremon wrote: | But why are many Windows programs under | C:\Windows\System32 then, if Windows has only a single | model? Why aren't all Steam-provided (for example) games | in a single location? Or, if they are, does Windows | really have a single model? | | Yes, the Linux/POSIX model is confusing, but the split is | to segregate administrative domains: | | - / and /usr are the domain of the distribution. As a | user, you should never install there. The administrative | group is root. | | - /usr/local is the domain of the machine admin. If the | machine is yours to manage, you can install software | there. The administrative group is staff. | | - /opt/$vendor is the domain of third-party vendors. Each | vendor (like Steam, Eclipse, Arduino Studio) can get its | own subdirectory and its own administrative user group. | | How would you achieve the same on Windows? How do you | make sure the Adobe updater can only install new versions | of CS, but not surreptitiously install a new (free!) | spyware package under C:\Windows? How would you allow | certain power users to share one Google Chrome | installation, allow each of them to update it, but not | let them install additional software system-wide? | somehnguy wrote: | I've read that a handful of times (whenever trying to | figure out where to put some new random thing), and still | have never come to a clear conclusion. Even better, | because there are so many similar places, you might | choose completely different ones depending on the day of | the week and your current mood. | | Too much choice for things like this is harmful IMO. 
Deep | down I truly couldn't care less where the files end up, | as long as that place is the 'right' place. There are too | many 'right' places, which makes it hard to find random | things at a later date or when on a box you're not super | familiar with. It's also a complete waste of time to | think about it at all. | NavinF wrote: | It's not just you: Every distro is its own special | snowflake and patches the programs they distribute to | store files in a different place. | | The "standard" doesn't tell you what directory structure | to use inside /etc to group related config files. The | "standard" doesn't tell you where an HTTP server should | serve its files. Everyone just does their own thing which | makes upstream docs incorrect and useless for newcomers. | stryan wrote: | > The "standard" doesn't tell you what directory | structure to use inside /etc to group related config | files. The "standard" doesn't tell you where an HTTP | server should serve its files. Everyone just does their | own thing which makes upstream docs incorrect and useless | for newcomers. | | The FHS does actually answer both of those questions. | Files inside /etc/ should be grouped in subdirectories[0] | and the HTTP server should serve user-specified website | files from /srv[1] and normal distro-provided files (such | as the Apache test page) from /var[2]. | | [0]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0 | 3s07.htm... | | [1]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0 | 3s17.htm... | | [2]: https://refspecs.linuxfoundation.org/FHS_3.0/fhs/ch0 | 5.html#p... | selfhoster11 wrote: | GoboLinux does exactly that: | https://en.m.wikipedia.org/wiki/GoboLinux | DarkWiiPlayer wrote: | Dammit, I wanted to be the one to mention gobo linux [HN | deleted my laughing emoji ffs] | andai wrote: | Beat the system ?**? | 0des wrote: | Wow, thanks for the reply, nice find! 
I did some poking | around on my Linux system and even re-arranging the home | folder was a task of its own because the system kept | trying to replace folders in their original places. I | will do some digging in to Gobo and see how they're | handling this. Thanks again for pointing this out. | dotancohen wrote: | > the system kept trying to replace folders in their | original places. | | This is the file that you want: $ cat | ~/.config/user-dirs.dirs XDG_DESKTOP_DIR="$HOME/" | XDG_DOWNLOAD_DIR="$HOME/Downloads" | XDG_TEMPLATES_DIR="$HOME/" | XDG_PUBLICSHARE_DIR="$HOME/" | XDG_DOCUMENTS_DIR="$HOME/" XDG_MUSIC_DIR="$HOME/" | XDG_PICTURES_DIR="$HOME/" XDG_VIDEOS_DIR="$HOME/" | yjftsjthsd-h wrote: | That helps, but be warned that there are still programs | running around that just hardcode their paths | kaba0 wrote: | Cries in nixpkgs | | (Anyone who tried to package a program that hardcodes the | "usual" binary paths know the pain) | lostlogin wrote: | You're clearly a more capable user than me, but even so, | take care. The time I accidentally moved /etc has scarred | me for life. | tech2 wrote: | Early on in my Linux-using-life I made the mistake of | deleting /etc. That was a learning experience like no | other :) | mixmastamyk wrote: | Since Live CDs/Flash drives were invented, I wouldn't | worry about this stuff any longer. Certainly have your | personal files in a centralized location and backed up | first. | | Probably the easiest way to experiment these days is to | create a VM and make snapshot, then start knocking down | walls, just to see when and where the house collapses. | Then revert and try something new. | genewitch wrote: | There's a computer game that deletes random files when | you make a mistake or lose. | | There could be a competition! | gavinray wrote: | How do you deal with lack of being able to just point to | "/usr/lib/include" or other things when saying "here's my | directory of shared libs"? 
| | This is definitely interesting though, and an improvement | I would say | JonathonW wrote: | GoboLinux symlinks everything into an FHS-ish structure | under /System/Index/ so you still have a single place | where binaries/libraries/includes/etc. live. (There are | also symlinks from /usr/lib, /usr/bin, and others into | /System/Index/ for compatibility with programs where | those might be hardcoded.) | short12 wrote: | That actually seems like some low hanging fruit to go on | a commit spree correcting code that hard codes paths | oblio wrote: | GoboLinux is old enough to vote in most countries. | | So either those low hanging fruits are higher than they | seem, or we're all just a bunch of dwarves. | | My bet is on the second option. | matheusmoreira wrote: | It _is_ low hanging fruit as far as the software is | concerned. Simply parameterize all paths. | | Will upstream accept such patches though? Sounds unlikely | to me. | biryani_chicken wrote: | You don't even need to rearrange the folders themselves, | just show them like that in the file explorer. Same way | the windows explorer does. | 0des wrote: | Do you have any docs on how to do that? Thanks for the | reply, I look forward to trying that. | post-it wrote: | MacOS too. /usr/ and /dev/ and whatnot exist, they're | just flagged as invisible in Finder. There's a command to | globally unhide them for those who want to see them. | e0a74c wrote: | Couldn't you do it with plain old symlinks? | grishka wrote: | Huh, spaces. There's way too much software, especially on | Windows, that breaks when there are Cyrillic characters in | a path. I'll let you guess how I found out. | DarkWiiPlayer wrote: | A friend had the username "Ruben" and jfc it broke | everything other than windows itself xD | dhosek wrote: | The problem isn't the Cyrillic or the e but the fact that | Windows lets you put those characters in file names in | non-Unicode encodings which will create sequences of | bytes which are invalid UTF-8. 
It's 2021, FFS, stop using | legacy encodings. | grishka wrote: | All win32 functions that accept or return strings come in | two varieties, with A and W suffixes, e.g. | MessageBoxA/MessageBoxW. The A variant works with the system | default 8-bit encoding (cp1251 in the case of Cyrillic); the | W variant works with Unicode (UTF-16) in wide chars. There shouldn't be | much of a problem with string handling if you stick | exclusively with W functions. | ziml77 wrote: | Using the W functions has been the advice from | Microsoft's documentation for ages. But people still use | the A functions because they're easier, especially when | writing cross-platform software since Windows is the only | major OS that made the unfortunate choice of having the | base character type 16 bits wide. | | Fortunately the future of the Windows API does look | better now that Microsoft has added proper UTF-8 support | as of Windows 10 version 1903. All you have to do is request it in | the application manifest and the A functions will accept | and return UTF-8. | grishka wrote: | > since Windows is the only major OS that made the | unfortunate choice of having the base character type 16 | bits wide | | Apple OSes use something they call "unichar" inside | NSStrings. I'm not 100% sure what it is, but it feels | like it's the same 16-bit wide character. | ziml77 wrote: | It's possible! It seemed like a sensible choice back in | the early 90s when the answer to making a system for | global use was UCS-2. I know Java was another one that | went with that decision. | mjevans wrote: | I would rather they added a U suffixed version and better | still backported that all the way to Win 7. Now in 3-7 | years people can write programs that use the A functions, | but have to check the version of Windows and refuse to | run if it isn't new enough. | DnDGrognard wrote: | I had a really odd one last year where a Grave I (a | well-known brand name) got converted by Office/Excel into a | Double Grave I. 
| | The double grave I is used by some obscure orthodox | religious texts. | kaba0 wrote: | If you have a username with your full name (plus points if | you have special characters in your name), you will get the | whole deal with shitty programs. I'm not sure if it's me, | but there were cases I simply could not use a program | installed in such a location, to the point where at my | previous (admittedly shitty) workplace, we often installed | software in a root location... | 323 wrote: | Laughs in C:\PROGRA~1\ (try it, still works in Windows 10) | selfhoster11 wrote: | Truly lifesaving for when she'll quoting gets in the way. | kijin wrote: | You've got a stray single quote in your shell. :) | selfhoster11 wrote: | That was a typo, but it seemed like a perfect | illustration of my point, so I left it in. | Someone wrote: | Typo? I would guess it's autocomplete at work. iOS does | that all the time for me. | the_mitsuhiko wrote: | There is no guarantee that the short name has that. In fact | on a lot of German Windows installations it was PROGRA~2. | 323 wrote: | Well, on my disk PROGRA~1 is "Program Files" and PROGRA~2 | is "Program Files (x86)", so still works :) | floatingatoll wrote: | That order is not guaranteed consistent across | installations, however. | marginalia_nu wrote: | I wonder if code to this effect has ever been written | before for (int i = 1; i < INT_MAX; i++) | { if (dirExists("C:\\PROGRA~%d\\ProgramName", i)) | { | gmfawcett wrote: | And that, children, is when marginalia_nu unlocked the | seventh circle of the inferno. Tomorrow we'll read the | story of how our new demon overlords forced us all back | to Windows 3.1. | jagged-chisel wrote: | Win 3.1? on DOS 6.22? Actually, this sounds like heaven. | Just don't put it on the public 'tubes. | floatingatoll wrote: | Or do. Can't hack a Mac Classic web server! | marginalia_nu wrote: | Got to tweak HIMEM.SYS before the slumbering one can be | awakened. 
| floatingatoll wrote: | For whatever it's worth, this is a terrible idea, for so | many different reasons: | | https://web.archive.org/web/20100107184218/http://blogs.m | sdn... | | And so, yes, I'm certain someone must have done it, | because it's clearly bad idea jeans and so Murphy's Law | says it must exist. | Someone wrote: | Apart from what others mentioned, that can only work if the | file system automatically creates 8.3 names. NTFS does not | necessarily do that (https://docs.microsoft.com/en- | us/windows-server/administrati...) | antihero wrote: | Shame it wasn't | | > C:\Pr[?]og[?]ram Fil[?]es[?]\ | gattilorenz wrote: | Funny, in the Italian Win9x it is C:\Programmi, which I | always thought was more convenient because of the lack of | spaces :) | dan-robertson wrote: | On the other hand their case sensitivity behaviour means that | "cross-platform" Java applications can break if they are run | on a non-windows platform where opening files is case | sensitive (unlike on windows) | sysadm1n wrote: | > other unusual characters in file names | | Saw a few hacks where malware authors used the RTL feature | (which is baked into Windows) to obfuscate file extensions. It | looked like .exe.innocuous-document.docx, but was actually | .docx.innocuous-document.exe | redwall_hp wrote: | I don't know if it's still a problem, but it used to break | Python virtualenv badly. If your working directory had a space | anywhere in the path, it would throw a huge fit and not work. | Which is problematic when the expected name for a Mac's boot | drive is "Macintosh HD" (if you ever had a reason to run a | virtualenv outside of your home directory). | mwcampbell wrote: | My favorite filename special character bug was when I | implemented CD ripping in 2005, and one of our beta testers | ripped a CD with a song called "Have You Ever?". My code wasn't | prepared to filter out the question mark on Windows. 
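For the "Have You Ever?" case above, a minimal sketch of the kind of filter that was missing. It replaces the characters Windows rejects in file names (`< > : " / \ | ? *` and ASCII control characters) and strips trailing dots and spaces, which Windows drops silently. The function name `sanitize_filename` and the underscore replacement are illustrative choices, not anything the original ripper used.

```python
import re

# Characters Windows rejects in file names, plus ASCII control characters.
_FORBIDDEN = re.compile(r'[<>:"/\\|?*\x00-\x1f]')


def sanitize_filename(title, replacement="_"):
    """Make a title usable as a single file name on Windows (hypothetical helper)."""
    cleaned = _FORBIDDEN.sub(replacement, title)
    # Windows silently drops trailing dots and spaces, so strip them up front.
    cleaned = cleaned.rstrip(". ")
    # Never return an empty name.
    return cleaned or replacement


print(sanitize_filename("Have You Ever?"))  # Have You Ever_
```

This deliberately ignores reserved device names such as CON or NUL and does nothing about length limits; a real implementation would need to handle those too.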
| mixmastamyk wrote: | I just hit the one where an album folder ends in a period. | Rsync copies every time because the period is dropped by the | filesystem silently. :-/ | Foobar8568 wrote: | Let's not forget return carriages in filenames within apps... | shane_b wrote: | My Mac is formatted case sensitive when the default is case | insensitive. This will also catch a ton of import related bugs. | | League of legends doesn't run until I sed files for instance. | deckard1 wrote: | I have coworkers on Mac that write node/JS code. Every once | in awhile I'd pull down the latest code and it wouldn't run. | I'm on Linux. | | Sure enough, they had SomeFile and were importing Somefile | and it works fine on Mac but not on Linux (which, of course, | is what our production servers use). It amazes me that "works | fine on my machine" is still a thing when I definitely worked | at companies that solved this back in the 2000s. It was | solved. It was done. Then devs became enamored with running | everything locally. Even dozens of microservices or | databases. Even though JS is fairly isolated, you still have | NPM packages that need built against the local OS and C/C++ | library and compilers, etc. Which also has caused issues in | the past. | speedgoose wrote: | Good news, we have solutions. You could use continuous | integration and software containers like Docker. | fouric wrote: | Does Docker abstract filesystem behaviors like this? I | always thought that it stopped at the libc level - that | is, libc is included in the container, but it calls the | host kernel's system calls, and so inherits the host | kernel's behavior (including things like underlying | filesystem case sensitivity). | handrous wrote: | Docker relies on LXC, so it's Linux-only. On other | platforms it runs in a Linux VM. The host for Docker, | then, is Linux no matter where you are. | [deleted] | dunham wrote: | Circa Y2k, I learned that the OSX Palm Pilot software didn't | work with case sensitive. 
I've since given up and stuck with | the default. (I'm anti-case folding in general, because of | the ambiguity.) | mdaniel wrote: | I also enjoyed doing that, but had to make a DMG just for | Steam because it straight-up refuses to run on a case | sensitive FS (that's true on Windows, also, which I suspect | is how we all got here). I think the most recent Steam | versions either caught wind of my trickery or -- more likely | -- run something from $HOME/Library/SomethingOrOther and thus | the work-around it no longer works | | When I got a new Mac, I just gave up and acquiesced to the | case-retentive world :-( | memsom wrote: | I once returned a printer because the Mac driver and support | software expected and enforced case insensitive access and | basically couldn't install properly on my case-sensitive HFS+ | volume. It half installed and blatantly just didn't work in | any way when installed. | NegativeLatency wrote: | Adobe software used to refuse to install on case sensitive | file systems back in the not too distant past. | agumonkey wrote: | See the recent article about unicode invisible glyphs in | JavaScript or bash. | | Naming freedom needs a stdlib module | kitkat_new wrote: | Pro tip2: Use std lib path processing utilities | idatum wrote: | Somewhat related to injecting unusual characters, in my | experience in localization efforts: | | Inject a Turkish 'I'. I don't know how to type or paste it | here, but picture an English lower case 'i' that is upper case. | It is a splendid way among many to shake out some loc bugs. | gus_massa wrote: | I | | From https://en.wikipedia.org/wiki/%C4%B0 | jeffwask wrote: | It doesn't even have to be complex, often basic automation | tasks fail with spaces and special characters. Honestly, | treating a file system like a natural language processor is a | bad idea. Besides at this point with how digital we have all | become who can't understand... 
| | thisismyconfig.txt vs this is my config.txt or | this_is_my_config.txt | | ...I've forced myself to stop using spaces, special characters, and even | caps. They are all constructs that provide minimal value for the | extra complexity. | rch wrote: | I'm similar, but I would like to support labels intended for | humans, along with various translations, as metadata on top | of e.g. filesystem path components. | fouric wrote: | You nailed it - getting rid of spaces and dashes and | underscores is extremely human-hostile. People added spaces | to the English language for a reason, and that's because | they make it way easier to read. | | Your system is only intended for other programs to interact | with? Go nuts, make hex UUIDs. Actual people are supposed | to use it? You need separator characters. | | I also don't see how those characters add "extra | complexity" unless you're doing dumb things like text | processing on paths and filenames (as opposed to using | OS/library functions that handle paths correctly) - in | which case, there's your problem. | long_time_gone wrote: | > thisismyconfig.txt vs this is my config.txt or | this_is_my_config.txt | | Just wondering, what is the readability of this for people | who are dyslexic? | JCharante wrote: | I'm not sure, but my gut instinct is that it wouldn't help. | Dyslexia rates are much lower in China, so I suppose we | could start naming files with Chinese characters (on | systems that support Unicode). It would take a bit to get | used to, but eventually we'd develop a pidgin language for | when we talk about software, much like how if you overhear | Chinese or Vietnamese developers they will mix in English | words like "linked list" into their sentences, because | there's not a more natural sounding alternative. | | Switching to Chinese would also help eliminate the spaces | issue. | reaperducer wrote: | Or in my case, people for whom English is a second | language, or who have low education levels. 
| | Saying, "who can't understand..." is arrogant, selfish, and | an example of why normal people hate people in the SV echo | chamber. | jeffwask wrote: | cestmaconfig.txt vs cest ma config.txt vs | cest_ma_config.txt | | It's the same in any language. | | Hugs who hurt you. | | I'm also pretty sure most of us in any language use | Slack, SMS or other forms of communication where text | isn't necessarily presented in a grammatical correct | manner and we all figure out what the person is saying. | throwaway2077 wrote: | SV echo chamber is on your side here - it is very in | vogue to denounce anglocentrism. they were defending | hieroglyphs and emoji in variable names in that thread | about invisible javascript backdoor a day or two ago if | you'd like a recent example | dang wrote: | Could you please stop posting ideological battle comments | to HN? We ban accounts that do that, regardless of their | ideology, because it's (a) not what this site is for, and | (b) destroys what it is for. | | If you wouldn't mind reviewing | https://news.ycombinator.com/newsguidelines.html and | taking the intended spirit of the site more to heart, | we'd be grateful. | danlugo92 wrote: | Agreed. | | But Hacker News should do something about all of the | anti-bitcoin and anti-anti-nuclear ideologies running | around in here. | | I don't really mind it that much but it'd be nice, it's | really the only 2 extremisms I've experienced here, all | other subjects are discussed in a fair manner. | beambot wrote: | I appreciate informed discussion about bitcoin & nuclear, | as both topics are highly relevant to the technical, | business, and hacker roots of HN. They seem distinctly | different from, say, "anglocentrism" @dang was calling | out. | danlugo92 wrote: | > discussion | | There's no such thing as fair discussion about those | topics here. | long_time_gone wrote: | > Saying, "who can't understand..." 
is arrogant, selfish, | and an example of why normal people hate people in the SV | echo chamber | | Exactly how I feel every time Economics is brought up on | HN. | teorema wrote: | tbh I'm not dyslexic and realized the spaces make it really | difficult to know what the filename actually is. If you | just take the second example, how would you know if the | file was "this is my config.txt" versus "config.txt"? | | Aside from parsing errors it just seems to lend itself to | ambiguity. | vertere wrote: | This. People are saying spaces improve ergonomics. Unless | everyone always quotes their paths in documentation, | emails, etc -- which they won't -- I say it actually | reduces readability. | | Also, programs that automatically turn paths into links | don't work with spaces. | 400thecat wrote: | > treating a file system like a natural language processor is | a bad idea | | Could you please explain what you mean by that? | KronisLV wrote: | > Pro tip: rename your development directory (or even better: | the workspace path in CI) to put a space and/or special | characters in it. | | This will also break any code in external tools that are called | during the builds of your application and do not handle spaces | correctly for whatever reason, thus making it so that you won't | be able to successfully finish the build. | | Then again, you probably shouldn't be relying on technologies | like that, but when you're struggling to keep an old enterprise | system alive, causing yourself more problems is not necessarily | what you should do. | | Still a good idea in most cases, though. | wldcordeiro wrote: | Even capitalization is a pain in the ass thanks to how OSes | treat file names. I pretty much stick with either `file-name.ext` | or `file_name.ext` exclusively now. | BiteCode_dev wrote: | > Pro tip: rename your development directory (or even better: | the workspace path in CI) to put a space and/or special | characters in it.
| | The problem with that is that YOUR code may handle it, but your | tooling may not. If my code formatter breaks on spaces, I'm not | going to change the formatter. | ChrisSD wrote: | You could submit a PR to their repo. | echelon wrote: | Better solution: only allow ASCII, maybe dashes, and up to | twelve characters. Problem solved. | | Enforce this in LDAP. | | Strict convention is better than flexibility and predicting | obscure edge cases that can fail. | pimterry wrote: | In my case, and for many people writing desktop software, and | for absolutely everybody writing open-source tools or | libraries, unfortunately you can't control the environment. | | Non-ASCII paths are extremely common (e.g. the user's home | directory on Windows, for the large majority of users outside | the English-speaking world) and spaces, punctuation and | weirder characters will definitely happen when you least | expect it. | | Yes, if you can avoid it then absolutely that's great, but I | don't think most people can. | | It's also not usually very difficult to deal with, as long as | you actually spot the issue in the first place. | MayeulC wrote: | Ah, that's the enterprise edition. | | But then your program will crash hard and unexpectedly when a | user decides to save under "~/house plans" or | ~/Téléchargements. | | I think it's better to exercise this in CI, that's what CI is | for. | mikepurvis wrote: | Ugh, we have the 15 character Active Directory limit now with | hostnames, and a previous IT administration imposed a | convention that every name has to follow | [prod|dev]-[ph|vm]-[service]-[nn]. So basically every | production service is prod-vm-owtf-01 -- you get exactly four | characters to actually describe what the machine does. Works | great when the service is "jira" or "wiki", but there are a | lot that are pretty mystical-sounding, like jkns, jwrk, cntr, | hrbr, etc, where you kind of just have to know.
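The character budget in that comment can be checked with a few lines of Python. This is only a sketch: the 15-character limit and the [prod|dev]-[ph|vm]-[service]-[nn] convention come from the comment, while the regex, `MAX_HOSTNAME_LEN`, `service_budget` and `is_valid` are hypothetical names made up for illustration, not anything the poster's IT department is known to use.

```python
import re

# Assumption: 15 characters, the hostname limit quoted in the comment above.
MAX_HOSTNAME_LEN = 15

# Hypothetical pattern for the [prod|dev]-[ph|vm]-[service]-[nn] convention.
HOSTNAME_RE = re.compile(r"(prod|dev)-(ph|vm)-([a-z0-9]+)-(\d{2})")


def service_budget(env: str, kind: str) -> int:
    """Characters left for the service name once the fixed parts are spent."""
    # Fixed parts: env, kind, the two hyphens before the service, and "-nn".
    fixed = len(env) + len(kind) + 2 + 3
    return MAX_HOSTNAME_LEN - fixed


def is_valid(hostname: str) -> bool:
    """True if the name follows the convention and fits within the limit."""
    within_limit = len(hostname) <= MAX_HOSTNAME_LEN
    return within_limit and HOSTNAME_RE.fullmatch(hostname) is not None


if __name__ == "__main__":
    print(service_budget("prod", "vm"))    # budget for production VMs
    print(is_valid("prod-vm-jira-01"))     # a four-letter service name
    print(is_valid("prod-vm-jenkins-01"))  # a longer, more readable name
```

With a "prod" prefix the fixed parts use 11 of the 15 characters, which is where the four-letter service names come from; a shorter prefix would leave correspondingly more room.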
| icedchai wrote: | Do they at least allow you to set up CNAMEs? | mikepurvis wrote: | Yes, and for many of the web-serving machines, that's | what happens, they're jenkins.example.com or | containers.example.com or similar. But often a singular | service is backed by hidden worker nodes, databases, | whatever else, and it seems silly to give those machines | that level of indirection vs just using the hostname as | their sole identifier. | HNo wrote: | I kind of like that honestly. No doubt you need some | documentation so everyone knows what the service | abbreviations are, but after you've been working there for | a month you get it. Makes everything clean, consistent, and | informational. You can quickly ascertain what a specific | host is doing just from the name. | mikepurvis wrote: | Oh absolutely it makes sense to have a standard, and | being able to tell at a glance if something is a VM or | physical machine is of value also. But dedicating 2/3s of | the character budget to such a scheme is madness. If the | prod-vm- prefix simply became pv-, then you'd at least be | able to do pv-jenkins-01 again. | | Anyway, all this was fine when we were on LDAP rather | than Active Directory. So basically it's all Windows' | fault. | reaperducer wrote: | _only allow ASCII, maybe dashes, and up to twelve characters. | Problem solved_ | | ...and only hire people from the exact same background as | you, who will never have unusual characters or accents in | their name. And also make sure not to have any users who | aren't exactly like you, and conform to this very narrow | requirement. Surely, excluding 90% of the world won't hurt | revenue in any way. | stopagephobia wrote: | This is not excluding? I just use an ASCII canonicalized | version of my name and it works fine. | echelon wrote: | Snarky, but I'll take it. | | Use a strict schema for the hardware interface, networking, | physical stuff the user never sees. Microservice names | don't need to be non-Latin.
Database replicas, | infrastructures, etc. And you're not going to piss off | employees by giving them ASCII LDAP/email addresses. | | Use utf8mb4 or similar for storing names. Don't state | "first" or "last". I've been through this rodeo too many | times. You're not surprising anyone. | numpad0 wrote: | UTF-8 strings aren't reproducible anyways. User ID should | be strictly for identification, and be a random alphanumeric | string if necessary. | chris_wot wrote: | And yet OneDrive won't allow for spaces before or after a file | name. | wongarsu wrote: | > Pro tip: rename your development directory | | I changed my username to not contain a space because it was too | annoying to deal with all the random dev tools breaking. The | worst offender was probably npx on Windows [1] (resolved after | four years by deprecating npx), but it was far from the only | one (though the JS ecosystem was somehow the worst in this | regard of all languages I worked with). | | 1: https://github.com/zkat/npx/issues/100 | qwertox wrote: | In that case, be thorough and insert a Chinese and an Arabic | character to enforce a Unicode check. | cduzz wrote: | And add an emoji, a character in a right to left language | and perhaps Tai. Maybe italicize one of those too... | achn wrote: | I maintain a similar system, where a variety of companies | submit files that get processed through multiple services - it | is astounding how ridiculous people's naming of files can be; | spaces are the least concerning! | 5faulker wrote: | For those purposes I've found the hyphen to be a nice substitute. | Izkata wrote: | > Pro tip: rename your development directory (or even better: | the workspace path in CI) to put a space and/or special | characters in it. | | A former co-worker changed his name in our auth system to | include an apostrophe, so that whenever we handled names wrong | he'd find it. | geoduck14 wrote: | Oh, I like this! | curuinor wrote: | the proper name of the glorious sultan of slack, j. r.
"bob" | dobbs, has the quotation marks and therefore is a great | subject for this | floatingatoll wrote: | I set my nickname to U+FFFD at one point in one work system, | resulting in a variety of bug reports and concerned emails. I | think I dropped it since it was generating false reports from | people who didn't check what character the page contained | before reporting it. | reaperducer wrote: | One of the systems I built is being used by a group of | younger people. I included an emoji in the superuser account | name, just to make sure it would work. And to remind me to | think more broadly about user input. | ajmurmann wrote: | A related too for CI: change the system time to be a time | zone that is during your work hours in a different day | already than UTC. Really helped getting failures earlier than | 4pm PST. | brundolf wrote: | At my last job we had a wild time-zone bug that only | happened with your system location set to Mumbai. I left | mine set to that for the rest of my time there. | cpeterso wrote: | Related: here's a recent Firefox bug about a test that | failed during the daylight saving time change: | | https://bugzilla.mozilla.org/show_bug.cgi?id=1739847 | scubbo wrote: | Could you consider rephrasing this? It sounds like an | interesting observation that I'd love to understand, but | I'm genuinely not able to parse it. | | My best guess is "change the system time to be a timezone | for which, during your work hours, the other-timezone is in | a different day than UTC is" - but I'm still not sure what | effect that would have on CI failures. | ridaj wrote: | Accents help too | [deleted] | qwertox wrote: | I add a Japanese character into any .py, .js and .html file | to ensure that Unicode is working properly through the entire | chain. Mostly in form of a variable which gets passed along, | even in URL parameters. | fernandotakai wrote: | my test accounts always have emojis + accents + other weird | characters. 
| | it keeps everybody on their toes lol. | enragedcacti wrote: | To have such thoughtful coworkers. On an old team I had two | coworkers named Chris and once in a blue moon when they | reviewed each other's code, master would start crashing because | one of them accidentally left in an absolute path starting | with "/home/chris/". | cerved wrote: | Spaces are a pain in the ass when you're using the CLI so I'd | rather enforce a no-space policy | reayn wrote: | Most shells will behave just fine if you put a quote (single | or double) before anything that has a space. | | A small extra step but something you get used to if you spend | a lot of time in the CLI. ___________________________________________________________________ (page generated 2021-11-11 23:00 UTC)