[HN Gopher] Write Better Error Messages ___________________________________________________________________ Write Better Error Messages Author : noch Score : 386 points Date : 2022-10-19 12:27 UTC (10 hours ago) (HTM) web link (wix-ux.com) (TXT) w3m dump (wix-ux.com) | egberts1 wrote: | 0x0000001E, KMODE_EXCEPTION_NOT_HANDLED | | That is all. | _nalply wrote: | It is my opinion that software problems tend to be analyzed | along these four axes: | | - Can an end-user solve the problem themselves? If so, tell them | how; if not, display a generic error message telling them to ask | for support (with an error identifier they can give to support). | | - Developers and end-users need different information: developers | need as much information as possible, like file names, contents | of important variables and especially where the error happened in | the source code with a backtrace, sometimes even two backtraces | (one for the cause of the error, too); end-users only | need to be told what they can do, but this needs to be worded | clearly and carefully. This means that error messages need to be | written twice. | | - Is the problem serious? If so, report, crash and restart; if | not, just report and abort the affected operation when | necessary. | | - The problem should be logged. Sometimes it can be sent to | developers automatically. | usrme wrote: | Here's another link for how to write useful error messages: | https://www.bbc.co.uk/gel/features/how-to-write-useful-error.... | allisonbrow wrote: | They recommend avoiding technical jargon, so change it to: | | 'due to a technical issue on our end' | | but isn't that also generic and obvious, which they were trying to | avoid too? | Minor49er wrote: | At a previous job, writing unambiguous error messages was | discouraged. Everything just had to be "Oops! 
Something went | wrong" | | The reasoning was that "users can't do anything with information | we tell them anyways", despite the overwhelming number of help | desk tickets we'd get from "Oops!" appearing in a million | different scenarios with no clear way for us to tell what error | actually caused the message to appear. | | Users naturally report the messages that they see because they're | helping us to see the problem. I didn't get why that was such a | hard concept to understand | MattGaiser wrote: | I have only worked at one place that wanted informative error | messages. | | All the others wanted to hide the reason because "if we know | the reason and tell the user, we seem incompetent" or "then | hackers will know which API call isn't working right" | (apparently the network console in Chrome is beyond hackers) to | wanting customers to be dependent as they paid for support. | rockemsockem wrote: | People who don't know anything about computer security use it | as a bludgeon to not do the thing that they didn't want to do | anyway. | PetahNZ wrote: | As long as you are logging the error with the context somewhere | that's fine. You could always include a timestamp or request ID | with the user message to not give away information, but be able | to easily search your logs for the occurrence. | m-p-3 wrote: | The epitome of uselessness: making an error message so "user- | friendly" that it doesn't help anyone. | | At least a "Details" button to unmask the technical details | would be useful in some way, while hiding the "ugliness" to the | end-user. | wongarsu wrote: | That seems like peak uselessness. Even "Error code 0x00ad4829" | is a more useful message, because even if it's useless to the | user it is useful to _somebody_. | cogman10 wrote: | There is some logic, the "you don't want to expose your | internals". Really useful messages might contain a lot of | details about the tech stack you use (giving a nice hint into | which CVEs to try). 
| | That said, this is an easily solved problem. The best | solution is to aggressively log errors AND prioritize having | dev teams push that error count to 0. If an error happens, | it's a bug. | | The next way to solve it is simply a report button. Let the | users click a "I'm mad at you for not working" button and | embed something like a session ID that allows internal | queries into what went wrong. | | Error codes are a terrible solution, but perhaps an OK option | if this is not hosted software. That said, a more user- | friendly approach would be a QR code with all the relevant | details embedded. | marcosdumay wrote: | > Really useful messages might contain a lot of details | about the tech stack you use (giving a nice hint into which | CVEs to try). | | Nope. Useful messages contain details about what your | software does. Anything about your tech stack is redundant | and can be removed. | | > The best solution is to aggressively log errors AND | prioritize having dev teams push that error count to 0. | | Many errors can only be replicated by talking to users. And in | the cases where your dev team is not capable enough to remove | all errors, you will still want to provide customer support | and work-arounds. | | > The next way to solve it is simply a report button. | | A report button is good. But neither session ID nor any | data that you can reasonably add to your logs will be | enough to let dev know what went wrong. Besides, your | report button will have errors too. | | And anyway, anything that you said applies exclusively to | people that create web applications. Many other types of | application exist, and everybody writing them is better | off not following any of your recommendations. | berkes wrote: | Why are error codes a terrible solution? I'd rather have an | error "bad request f12793b2" than a "bad request". | Obviously I prefer a "bad request, 'expiresAt cannot be | after 2022-12-19'. Code f12793b2". 
| | Having a unique ID to be able to search in documentation or | even source code is -IMO- preferable. It's still rather | technical and helps only those who can search such docs, | but at least it gives something unique to google/search | for. | slavik81 wrote: | This seems to be the approach that Android takes. If you try to | connect to a WiFi network and it fails, it just gives up. It | won't tell you why it failed. This makes it very frustrating to | figure out what's wrong. Maybe I wouldn't understand the error | message, but at least it would provide a starting place for me | to look up more information or ask for help from someone | knowledgeable. | jrochkind1 wrote: | > The reasoning was that "users can't do anything with | information we tell them anyways", | | I mean, I feel like the focus of the OP was on giving them | something they _could_ do something with. Like the information | that their information was not lost; and the recommendation to | change X or try again in Y way; and the fallthrough to contact | customer support with a quick link. | | The OP was definitely not recommending giving more specific | technical info without thinking about what the user could do | with it, but instead specifically thinking about what the user | could do or would want to know (about their data/account, not | about your under the hood services), and giving info to that | end. | grandinj wrote: | Probably just me, but I am less concerned with how good my error | messages are, and more concerned with trying very very hard to | make the errors happen closer to the cause of the problem, rather | than further away. | | "Fail early, fail hard" | | i.e. if I can make the error message happen near the beginning of | a process, I can get away with making it a hard error. | | Hard errors in the middle of a multi-hour operation tend to annoy | people. | vbezhenar wrote: | Exactly. 
Software must crash as soon as possible and include | some context information which is necessary to further debug | the problem. | Merad wrote: | This is an attitude I really try to build up in junior devs. | Soooo many people seem to default to writing code like, "if | input is null return null" (when input should never be null) or | "if valueThatShouldBePositive < 0 silently skip the code that was | going to use the value". If the app detects that something is | in an invalid state _I want it to break_. The worst problems to | debug are the ones where you have to work backwards through | miles of strange behavior and corrupted data to find the root | cause, because the program tried valiantly to soldier on long | after it had been shot through the heart with bad data. | | I guess this is because no one really teaches error handling. I | assume a lot of students end up with a mindset of "just make the | errors go away" instead of "deal with the errors effectively". | S201 wrote: | Agreed; I've often wondered if this is a result of early CS | classes usually expecting students to handle weird/bad | inputs. It's only natural for a programmer to want to write a | program that gracefully handles all reasonably bad inputs, | like nulls. So we're taught early on to write defensive code | that handles those. And that's fine when you're writing | short, academic programs. But when the complexity goes up by | a few orders of magnitude trying to gracefully handle that | null value 10 levels deep in some parsing logic maybe isn't | the best thing to do. Old habits die hard, however. | nightpool wrote: | Yeah, this is a great point. Both overly defensive | programming and (my personal least favorite) overly- | commented code are instilled in students at a very early | point in their careers by irresponsible teachers trying to | find something to grade students on (Didn't handle negative | values? 5 points off! Didn't leave a comment on every line? | 1 point off per line!) 
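A minimal sketch of the fail-fast style Merad describes above, in Python. The function `apply_discount` and its parameters are hypothetical and exist only for illustration: instead of returning None or silently skipping work on bad input, it raises immediately with a message that names the offending value.

```python
def apply_discount(price: float, percent: float) -> float:
    """Return price reduced by percent, failing fast on invalid input."""
    # Hypothetical example: reject bad state at the boundary instead of
    # returning None or silently skipping the calculation.
    if price is None or percent is None:
        raise ValueError(
            f"price and percent are required, got price={price!r}, percent={percent!r}"
        )
    if price < 0:
        raise ValueError(f"price must be non-negative, got {price!r}")
    if not 0 <= percent <= 100:
        raise ValueError(f"percent must be between 0 and 100, got {percent!r}")
    return price * (1 - percent / 100)
```

The error surfaces at the call that supplied the bad value, so nobody has to work backwards through corrupted data later.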
| elboru wrote: | When I was a jr dev, getting exceptions was synonymous with | "me messing something up". Null exceptions were especially | annoying, so the naive approach is to check for nulls and | avoid the code that will cause the exception. And it "works"! | You don't get exceptions and your code keeps running. It's | only when you need to fix difficult bugs by going through | logs that you understand the value of having the right | exception with the right message. And you learn to love them | and start caring about them. | tiborsaas wrote: | That's not really a respectful practice. Error messages should | be clear and actionable. | | Users don't care if you consider an error soft or hard. | madeofpalk wrote: | I think the point is that the higher up you fail, the harder | it is to identify why you errored in order to give the user | clear and actionable feedback. | tiborsaas wrote: | That's possibly indicating a bad UI / information | architecture if you are unable to tell that. | llanowarelves wrote: | When you have nested exceptions being caught by other | exceptions, how do you determine what level is correct to | show the user? Especially when it's a service class or | something that is used by a lot of calling code. | | It's implied that it would be the upper top-most | exception handlers in that code path but those are gonna | be more generic in their messages, and anything more | detailed has to be manually wrapped to add useful | description (that's not some internal developer | exception). | | Error codes may be the least bad solution to fall back | on. | lupire wrote: | After it fails fast (thank you!), we also want to fix fast. So | we need info. | ChrisMarshallNY wrote: | The general approach that I take, is that an error message is one | of the most stressful occurrences that a user encounters, so it's | incumbent upon me to make it as pain-free as possible. 
| | First of all, unless I'm writing an engineering tool, my users | aren't geeks, and don't especially care _why_ the error is | happening (geeks always need to know _why_). They just need to | know that what was expected, did not happen. If there is a | remedy, and it can be simply stated, then I can add that, but _it | needs to be short and simple_. Longer stuff needs to go into some | kind of secondary screen (which probably won't be read). | | Also, I take the "shopkeeper" approach. The customer is always | right, and it's never the customer's fault. I avoid any hints of | blaming the user (even if it is their fault), and try to be | polite and helpful[0]. | | Of course, the best way to deal with errors, is to avoid them. I | try to design good affordances. | | The rules are different for SDKs, though. In that case, I tend to | send a great deal of information back. I take advantage of | Swift's enums, and the ability to associate data. It can allow me | to nest error reports. | | [0] https://littlegreenviper.com/miscellany/the-road-most- | travel... | Stratoscope wrote: | 20 years ago I was working on Acrobat at Adobe. I was mostly the | "Windows guy" but also worked and tested on the Mac. | | When I tried to install Acrobat on my Mac, I got this message: | | "Your hard disk is too small" | | My _what_ is too small?! | | Later, on Windows I got this unexpected popup: | | "You are not here" | | WTF? | | I searched the code for that string and found it in a function | named "CantHappen()". This function was called in numerous places | where the programmer thought there was no possible way for the | code to get to that place. But of course CantHappen() _did_ | happen. | | As I looked through the code I found many other messages that | were bizarre and incomprehensible and sometimes downright | offensive. 
| | So I started a project to go through all our messages and make | them more clear and informative - and even better, when possible | to not have the message at all but just take care of the | situation. | | The underlying cause of these bad messages was twofold: | | 1. Programmers never got raises for writing great error messages | or finding ways to avoid them in the first place. We were just | rated on how much work we got done. | | 2. We did have a product designer who was supposed to specify all | user-facing messages. But the designer mainly considered the | "happy path" and didn't think about edge cases. It was left to | developers working under time pressure to handle those. | TacticalCoder wrote: | > Later, on Windows I got this unexpected popup: > > "You are | not here" | | The absolute best I had in a Microsoft product was this | (paraphrasing): _"An error happened because your computer may | be turned off"_. I still have a screenshot of that somewhere. | What it meant was that a hypothetical computer I may be trying | to connect to (which I wasn't, it was all local) was off, but | that wasn't the case. This was seriously WTF. | | The second most beautiful one from another Microsoft product | was whatever software generating a password and asking me, in a | pop-up window, to write it down. The problem was the password | was something like: 9mZOvy9E(4)?6b(w(<$KcTU%> | 9T6cz0Z4YxgQ-<tw035X6S.dLE0[2n0"42`/S=S1{q5{)61s190':&6UHT.4hZX | jO6b%l#X7v]~4tIT2Y0._ebFH,>2:G>%*P]7n4" | | I probably also still have a screenshot of that somewhere. | | Haven't used Microsoft stuff in two decades so it was a long | time ago. But it's still seriously WTF. | im3w1l wrote: | The paradox of CantHappen is that if the programmer truly | thought it can't happen then there would be no need for it in | the first place. The only reason to include it is because of a | fear that it may in fact happen. | | Rust, funnily enough, has unreachable!() 
for that case, but it also | has unreachable_unchecked() for actually unreachable code. The | latter has undefined behavior and exists to help the optimizer. | giancarlostoro wrote: | What does unreachable!() actually do? I had no idea that was | a thing. | maleldil wrote: | It terminates the program with panic! | | https://doc.rust-lang.org/std/macro.unreachable.html | remram wrote: | Rust has a few of those, they all panic but with different | default messages: panic!(), todo!(), unimplemented!(), and | unreachable!() | FartyMcFarter wrote: | I've been guilty of this in the past - I remember writing an | error message that looked like "if you used X setting, do this, | otherwise that". The code should have instead checked what | settings the user enabled and given a clearer error for the | situation at hand. | marginalia_nu wrote: | That sort of code is a bit tricky though. | | Since the fault code paths (hopefully) are very rarely | executed, the error messages are easy to overlook, and tend to | rapidly become stale. This is to an extent always a problem | with error messages, but it's an even bigger problem when you | have half a dozen error messages depending on various | parameters, since they create more and even more rare code | paths for staleness to hide in. | Too wrote: | Internet connectivity is an obvious candidate for this. | | Could not connect to server? Check if WiFi is on. Check if DNS | is working. Check if ping to router is working. Check if ping | to google is working. Link to wifi settings. | | Whatever you do. Just don't do this the reverse way, like my | smart ass Samsung tv does! It determines if internet is working | by pinging a Samsung server, _before_ it even allows _other_ | apps to use their internet. You can probably figure out what | will happen when Samsung servers are down. | e40 wrote: | Tried to download my data from takeout.google.com and got this | error: | | "500. It's an error." | | Thanks, google. 
I tried to start a chat (I'm a Workspace | customer) and could not continue because all the language choices | were disabled (even English). | hulitu wrote: | You should be happy that you got an error message from Google | because their default is not to give any. | robmoore121 wrote: | I really like this. There are clear shibboleths which identify | the author as a person who deeply respects and cares for the | readers of error messages, and their experiences. It makes me | hopeful for the future of software when I see that there are | others. Thanks for sharing. | upofadown wrote: | There are fundamentally two classes of error message: | | 1. Information that can help a technically engaged person debug a | problem. | | 2. Information that can help a user of the system understand what | they have to do to overcome the problem. | | Since most error messages are created by people responsible for | debugging the system they tend to be of the 1st class. There has | to be a way to provide different information based on who is | getting the error. | tremon wrote: | There's a fatal flaw in assuming that there's no overlap | between groups 1 and 2. | koblas wrote: | The error message that is presented to the user should always | be clear and helpful. When an error is presented to the user, | you should have matching logging (e.g. sentry) that provides | technical reporting on what happened. By having both solutions | in place you have error handling that is complete and services | both communities. | FridayoLeary wrote: | There's also a third class which is "Oops! Something went | wrong..." which basically means "i don't know. Try and reload | the page." Why this is better than a simple "error" is beyond | me, but it's mildly frustrating. | mttjj wrote: | > There has to be a way to provide different information based | on who is getting the error. | | Yes, this concept exists. The error message that is shown to | the user (number 2) is what's discussed in the article. 
The | error message that an engineer or someone else debugging the | system should get (number 1) is the full stack trace and data | dump that should be sent to the application log at the same | time that the user is shown the error dialog. | | Users can fix the problem by following the instructions in the | error dialog and engineers or technical people can come back | later and look at the more detailed stack trace to determine | the best course of action. | lupire wrote: | It's easy. Just provide both, with mark-up to label them. | MetaWhirledPeas wrote: | > There has to be a way to provide different information based | on who is getting the error. | | This is already solved. Provide one error to the user and | another to your logging system. In the user error provide a | mechanism to point you to the logged error (even a simple | timestamp helps). | munk-a wrote: | I don't disagree with any points but they missed a big one. If at | all possible, include some application (or attempt at a globally) | unique error code on each of your errors - i.e. YCOM-HN-9021. | When you provide a clearly googleable string you can help your | users independently resolve the issue and you can also set up | google alerts on the string - if you roll out a new feature that | took 3 months to develop and a week later google tells you that | YCOM-HN-9021 is up 9000% you probably broke something. If at all | possible make yourself open to client communication but most | users won't reach out about an error - users have very low trust | in customer care in the modern world (and it is, honestly, often | more trouble than it's worth) and are more likely to turn to | reddit/technical forums for a solution. It is extremely | advantageous to try and track these users. | CodeWriter23 wrote: | Even their example of terrible, "Whoops something went wrong" is | miles ahead of Chrome's "Oh snap!" 
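A rough Python sketch of the "one error for the user, another for the log" pattern that mttjj, MetaWhirledPeas and munk-a describe above. The function `report_error`, the logger name `app.errors` and the 8-character reference are assumptions made for illustration; the code `YCOM-HN-9021` is munk-a's example. The technical detail goes to the log, and the user only sees a hint plus two searchable identifiers.

```python
import logging
import uuid

# Hypothetical logger name; a real application would configure handlers for it.
logger = logging.getLogger("app.errors")


def report_error(exc: Exception, user_hint: str, code: str) -> str:
    """Log full technical detail and return a short message for the user."""
    # Random reference that ties the user-visible message to the log entry.
    reference = uuid.uuid4().hex[:8]
    # Developer-facing record: error code, reference, exception type and text,
    # plus the traceback (exc_info accepts an exception instance).
    logger.error(
        "code=%s ref=%s %s: %s",
        code, reference, type(exc).__name__, exc,
        exc_info=exc,
    )
    # User-facing text: what to do, plus identifiers support can search for.
    return f"{user_hint} (error {code}, reference {reference})"


message = report_error(
    FileNotFoundError("/var/spool/jobs/125.json"),
    "We couldn't load your job. Please try again.",
    "YCOM-HN-9021",
)
```

The returned message never contains the file path; support can find it by searching the log for the reference.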
| est wrote: | Redesign the error message or UX all you want, I hope there is always a "more" | button to show exactly what went wrong. I hate eventvwr.msc or | less -nir wall of log texts. | 734129837261 wrote: | I completely agree with this article, but it never bothers me in | particular. But I'm a developer, so I'm an outlier. That said, I | do wish that the error message I see every day would be simpler. | <looks at TypeScript> | hdesh wrote: | Nicely written piece with clear examples. It would be great to | know the impact of this work. Perhaps one metric to look at would | be the number of tickets submitted to customer care? | Terr_ wrote: | Over time I've come to believe in the "grepability" of error | messages, and the code-lines that construct them. | | Sometimes the data (and error-messages) are flowing up and down | through many different modules and APIs and job-queues and | whatnot, that when an error pops up it saves a lot of developer- | time when you can just text-search on the code repo(s) and see | exactly the line that generated it in the first place. | klik99 wrote: | "Passing the Blame" in particular is a personal pet peeve. I hate | when apps phrase errors like I did something wrong by clicking | the totally normal link. Closely related is the general trend of | "lol wut" tone in error messages, which really grates when you're | frustrated and doing something that might be very important. | "Whoops! We made an Oopsies! Sorry :(" | manv1 wrote: | Really, error handling has been my big beef with CS education for | like 40 years. There is none. | | Error handling has been left to engineers, and when left to their | own devices engineers will almost always make the wrong choice | from a user point of view. 
| | Engineering needs to think of error messages this way: the error | message is there to help people (which might be fellow engineers, | support, and/or your consultants) identify the error quickly | so that they can manage the user's expectations, fix the error, | or both. | | Unfortunately, many engineering paradigms make this an impossible | task. | | Layering and encapsulation means that you have little idea what's | happening downstream or how the downstream stuff actually works, | but the lower-level you are the less likely the error will mean | anything to the end-user. | | Then, it's a question of who's responsible for handling the | error? If you're on the backend, where does it go? Does the user | care that the backend microservice can't connect to the database? | Heck, the UI probably has no idea what's happening back there. | | However, for accurate troubleshooting detail is needed. | | For many orgs, leaving transaction IDs in your log files is the | primary way that you figure out errors, especially in big | distributed systems. That doesn't really help end-users, and | requires developer discipline, something many engineering teams | find challenging. | | Ideally error objects would aggregate error codes up the stack, | so that if an error occurs you can at least present technical | people with the errors that were thrown... and they can search | through the source code trying to find that unique error code. | But designing that is difficult; conceptually you don't want a | list of 500 error codes being thrown upwards, one from each | function in the call chain. But sometimes you do. | | Anyway, error handling design really should be part of the | initial architecture, but it usually isn't because architecture | guys don't really understand support. 
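One possible way to approximate the "aggregate error codes up the stack" idea manv1 describes, sketched in Python with exception chaining (`raise ... from ...`). The names `AppError`, `error_codes`, `load_profile` and the code `PROFILE-001` are invented for this example; only layers that choose to wrap an error add a code, which avoids the "500 codes, one per function" problem.

```python
from typing import List


class AppError(Exception):
    """Hypothetical application error that carries a searchable code."""

    def __init__(self, code: str, message: str):
        super().__init__(message)
        self.code = code


def error_codes(exc: BaseException) -> List[str]:
    """Collect codes from the outermost error down to the root cause."""
    codes = []
    current = exc
    while current is not None:
        # Errors without a code are represented by their type name.
        codes.append(getattr(current, "code", type(current).__name__))
        # __cause__ is set by `raise ... from ...`.
        current = current.__cause__
    return codes


def load_profile() -> None:
    """Simulate a high-level operation failing because of a low-level error."""
    try:
        raise ConnectionError("database unreachable")
    except ConnectionError as err:
        # Wrap the low-level error; the original stays attached as the cause.
        raise AppError("PROFILE-001", "could not load profile") from err


try:
    load_profile()
except AppError as err:
    collected = error_codes(err)
```

Here `collected` lists the wrapping error's code first and the root cause last, which is the order a support person would usually read them in.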
| residualmind wrote: | Watched the new Quantum Leap yesterday (it's not great) and there | was this really cringeworthy moment when something goes wrong | with their awesome supercomputer and the screen flashes a giant | "INTERNAL SYNTAX ERROR". Apparently, somebody didn't run their | linter before sending people through time. Too bad. | londons_explore wrote: | How about just engineering stuff to not have errors in the first | place. | | My toaster is a complex bit of engineering - it has thousands of | parts which all work together to take power from the wall to make | toast. | | Yet it has no errors. It just does the job I ask it to do. | | A computer on the other hand seems to have a lot of ways to fail, | and does so nearly every day. I suspect everyone reading this | comment has seen at least one error _today_. Can't we engineers | make the software better so that these errors can't/don't happen? | frontiersummit wrote: | A toaster is probably a bad example, given the common error | states (burnt toast, stuck toast) which are no doubt amplified | by design flaws in some units. I've never seen a toaster with | 2000+ components, so maybe such a machine is different. A | toaster is also historically famous for a dangerous error | state: if the plug is inserted the wrong way round, the coils | will be switched on neutral. A toaster which is "off" is thus | liable to shock an unwitting person using a fork to resolve the | stuck-toast error state. | jdiez17 wrote: | I don't know what kind of toaster you have but mine doesn't | have thousands of parts. Maybe 20 or so. | dietsche wrote: | This quote from a textbook in my graduate studies helped me a | lot: "Error messages should be how to fix it messages." | TruthWillHurt wrote: | So essentially go back to dev style error messages? | | A UX person telling us not to do what the previous UX person | thought was cute. | | Thank you sooo much! Ask PM for a pat on the back. 
| magicalhippo wrote: | If you're raising an exception deep in some internal code, | provide as much detail as possible. | | If the error bubbles up to the user, then either the information | is over their head, in which case there's no difference to a non- | detailed error message, or the user/support person can actually | act on it. | | The most infuriating error I see is "file not found"... WHICH | FILE?! | | Of course if the error is found in the higher level due to some | consistency check in the business logic, then yeah try to guide | the user. But for internal stuff, try to help the person who | needs to fix it or find a workaround. It might be you. | riskable wrote: | > If you're raising an exception deep in some internal code, | provide as much detail as possible. | | > If the error bubbles up to the user, | | ...then you have an information disclosure vulnerability! | There's a _really good reason_ why we don't bubble up deep | exceptions to end users: Attackers can use that info to gain | information about your back end that they can use to find worse | vulnerabilities. | | Put all the detail you want in your logs. Keep the end users | out of it. They shouldn't be able to tell what line broke | things. | magicalhippo wrote: | Yeah things are a bit different with web apps. There users | usually can't do anything with the info even if they had | details, so internal logs is clearly the place. But my point | still stands: you want detailed info in those logs, not just | a lone "file not found" without anything else. | lupire wrote: | > The most infuriating error I see is "file not found"... WHICH | FILE?! | | Filenames might contain user data, which must not be logged | outside of a database with proper access control, schema | annotations, and access auditing. | | We can only display an opaque object key, so authorized devs | can look up the filename using secure tools. | magicalhippo wrote: | Fair enough. 
I work mostly with good old desktop applications | though, so if there's user data, it's almost always the users | data. | | For the majority of errors in most applications one can | provide some helpful information. But yeah, one needs to be a | bit careful if one has PII in the mix. | thenerdhead wrote: | As with everything, context matters. It's a great run-down of how | to empower an error message. Many products can add so much value | and save support resources by doing so. | | There's one thing I wasn't sure about in this article though. Did | they talk to actual users regarding these empowered error | messages or even ask them what they want to see out of common | error messages they run into? It seems rather difficult to | empower error messages without first understanding the scenarios | that got them into the error state to begin with. Next would be | understanding if these error messages are helpful to the users | and asking them how they go about resolving these types of | issues. All of that is hinted at in the "what makes a good error | message". | duxup wrote: | For me error messages come in two forms. | | 1. For the user. | | You can't do that (maybe explain why). Don't do that. | | 2. Error that's actually there for the support or engineering | team for a customer to convey to support, probably with a handy | copy to clipboard link (that the user has at best a 50/50 chance | of using no matter how much prodding). | | That's it. | | Humans generally lock up hard when they see an error in my | experience. No amount of information or hand holding will help | most of them figure it out. It's better to try to solve it in | software. | | If the software can't fix the issue internally then they get an | error message and 2 things happen: | | 1. The user is going to try something else and solve it themself | (awesome) regardless of the error because they're smart and | capable people and could probably solve it no matter what you | told them. | | 2. 
Their brain locks up, they do the same thing 20 times and get | the same result and complain to support with some form of | "doesn't work". Doesn't matter what error you give them, they | won't even try to tell you what the error was / doesn't register | in their brain unless it had a cute cat on it or something (that | actually works... so forget this "tone" stuff). | | I like the article, but I am skeptical about a UX team who | doesn't answer support tickets ... just magically knows what the | user is thinking / will work. I get lots of advice on error | messages, I change them when they ask, but when it's from folks | inside the company who know the product it often isn't helpful. | | Heck even users give bad advice about errors. I've had them tell | me "Well it should have said X" where X is exactly word for word | what it said (they forgot...). | | Granted I still try to help the user along, but I'm skeptical | that software with any large user base can have "good" error | messages. | jeremy_wiebe wrote: | I'm not sure we'll ever eclipse the awesomeness of the VB6 error: | "Method ~ of object ~ failed". | | On a more serious note, error messages are something I always try | to keep in mind in code reviews. Most error messages the code | I review deals with are only ever seen in production logs, so I | try to think what I'd do with that message (and accompanying | details) if I saw it in production. | dale_glass wrote: | I'll add a few for developer-oriented messages. | | * Say what the program was trying to do. | | * Make the message unique and searchable. | | * Make it detailed. | | * FFS, include the filename or whatever else the program is | having trouble with. | | * If possible, include the source code location. | | * If possible, include useful contextual information. | | * Quote strings. Once in a while, some unexpected whitespace | sneaks in somewhere and this can be hard to figure out. | | Eg, don't just abort with "Open failed: NOT_FOUND". 
Abort with | "job.c:2105 Failed to open job description file | '/var/spool/jobs/125.json' when processing job #5 for user | 'alice': NOT_FOUND". | | This way I don't have to strace the damn thing to try and figure | out what it's looking for, and I know which user it was for, so I | don't have to dig around and try and figure out which entry in | the database might contain the wrong information. | | Also, context-free, generic error messages are awful. A large | enough codebase may be impossible to search for some very common | keywords. | | If possible, googleable error codes are great to have, but they | shouldn't replace the error message. It's ideal if you can search | the source code and instantly find where the error message | originates. | chiefalchemist wrote: | A couple+ years ago my then employer required I take (what | amounted to) Security Training 101 for Software Developers. I | believe one of the client orgs expected everyone to go through | the program. | | That said, pretty much everything you're suggesting was | considered a bad idea (for security). Mainly because the more | details you give away, the more a hacker can understand about | the underlying system. The more they probe and possibly break | things, the more you're showing your cards. | | It was then that the bland, cryptic error msg made perfect sense to | me. | m-p-3 wrote: | I'll also add to make them easy to copy to clipboard in the | case of a GUI-based program. | | It's easier to search and store in an incident management | system. | golergka wrote: | Also, make sure that sensitive information like users' | passwords, emails, credit card numbers etc, is filtered out of | the logs and not sent to your servers. | at_a_remove wrote: | Yup, all of these. Sometimes I look "around" the problem, like, | "I found _THIS_ directory but the file 'z.txt' was not in it!" | or "Not only could I not find 'z.txt' I could not find _THIS_ | directory it was supposed to be in. 
" Check to see that it is | really a file, not a directory. "I found 'z.txt' in _THIS_ | directory but it was zero bytes in length! " | | In terms of "fail early," my larger programs have a section | called Pre-Flight Checklist, which looks for files (and that | they _are_ files), databases, that the databases have the | expected tables and the correct columns, and so on. Are the | files sufficiently recent? More or less the expected length? | Because this is ETL stuff, it 's usually okay to push this | stuff up as early as I can. | redact207 wrote: | For Saas products, this plus use structured logging so you | don't have to grep-parse log messages when searching your log | collectors. | | Ie all the meta/log context in a hashmap alongside the error | message. | [deleted] | jrochkind1 wrote: | Very well-written article with good examples and advice. | cosmotic wrote: | All the 'do this' versions suffer from the same problems as the | 'don't do this' versions. Aside from fixing the tone, they are | still generic, still inactionable, and still verbose. | donatj wrote: | > Even in today's world of user-centered design, technical jargon | still sneaks its way into error messages. You couldn't fetch my | data? My credentials were denied? What? The technical stuff is | not important to the user | | This is the opposite of what I want. Stop condescending and just | tell me what actually went wrong. | lucumo wrote: | I have this issue with Google Family Link, where I want to add | my child's voice to a Nest Audio. The app straight up tells me | that I'm not connected to the wifi, which is clearly not true. | Furthermore, the app knows I'm connected because in the logging | you can see it finding the Nest Audio. | | It's impossible to figure out what goes wrong. Plenty of people | have the same problem, but Google only has this forum where | superusers assume everybody else is either lying or an idiot. 
| Meanwhile, they take such error messages at face value, despite | many people saying they have wifi. | | All that to say that I'd rather have an overly technical error | that actually tells me what's wrong, instead of a friendly | error message that's straight up wrong. | deathanatos wrote: | This; particularly because more and more, "support" seemingly | has no means to access logs, no ability to do the debugging, | and no way to escalate obvious bugs in the application to the | developers. | | I need the technical jargon to do support's job -- and that of the company | whose product I'm using -- for them. | | Is it not helpful to laypeople? Perhaps not, but it is what the | technical friend they're going to drag into the problem needs. | gpderetta wrote: | How many times I've had to strace an application because the | fucking error message didn't give enough information!! | jaywalk wrote: | It all depends on the context. If it's a web application that | can't connect to some backend service, for example, what | exactly are you going to do with that information? | aeonik wrote: | Depends on why it didn't connect, right? | | Was it a timeout? Maybe an HTTP 401? Was it a DNS failure? Was | there a TCP reset immediately? | | Each one has a myriad of troubleshooting steps associated | with it. Some could be local to the host, some could be | network/firewall, some could be from the remote host or behind | that. | lupire wrote: | I'm going to web search it and find advice from other users | or devs. Maybe I need to use my email address instead of | username, or delete my cookies, or something. | | If it's proprietary locked down user-hostile junk, then yeah, | all I want in the error message is a statement of a refund on | my payment, and a link to a competitor website. | [deleted] | josefresco wrote: | > Stop condescending | | Wix is mostly a platform for non-techie DIY website builders. I | can't imagine they'd know what to do with a highly technical | error. 
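aeonik's point above (a failed backend connection might be a timeout, an HTTP 401, a DNS failure, or an immediate TCP reset, and each calls for different troubleshooting) could be sketched roughly as follows. This is only an illustration using Python's standard library; the helper name `describe_failure`, the example URL, and the exact wording of each message are assumptions, not anything from the article or the thread.

```python
import socket
import urllib.error


def describe_failure(url: str, exc: BaseException) -> str:
    """Hypothetical helper: turn a low-level failure into a specific message."""
    # Quote the URL so stray whitespace in it is visible in the message.
    where = f"Request to '{url}' failed"

    # HTTPError is a subclass of URLError, so it has to be checked first.
    if isinstance(exc, urllib.error.HTTPError):
        if exc.code == 401:
            return f"{where}: HTTP 401, credentials were missing or rejected"
        return f"{where}: HTTP {exc.code}"

    # urllib wraps socket-level errors in URLError; unwrap to the real cause.
    cause = exc.reason if isinstance(exc, urllib.error.URLError) else exc

    if isinstance(cause, (TimeoutError, socket.timeout)):
        return f"{where}: timed out waiting for the server"
    if isinstance(cause, socket.gaierror):
        return f"{where}: DNS lookup failed ({cause})"
    if isinstance(cause, ConnectionResetError):
        return f"{where}: TCP connection was reset by the remote side"
    if isinstance(cause, ConnectionRefusedError):
        return f"{where}: connection refused"
    # Unknown cause: still show what we have instead of a generic message.
    return f"{where}: {cause!r}"
```

In use, a caller would wrap `urllib.request.urlopen(url, timeout=5)` in `try/except Exception as exc` and log `describe_failure(url, exc)`. Whether that text goes to the end user or only to a log is the judgment call the rest of the thread argues about.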
| taink wrote: | I think the point they are making here is that clearly stating | what went wrong doesn't necessitate using "technical jargon". | | Now, "your credentials have been denied" seems pretty clear and | does not use jargon in my opinion, but telling the user "the | ajax request failed, returning a 403 http error code" seems | unhelpful and doesn't tell them what happened. | InCityDreams wrote: | ...then they clearly didn't make their point at all. Big | error (in communication) on their part. Your single (2nd) | sentence communicates everything required. | bornfreddy wrote: | Even in your example there's a world of difference. "Your | credentials have been denied" implies a problem with | credentials, while 403 clearly states that the credentials | are valid, they are just denied access to this resource. | | I know it is a made up example, but it does show the problem | with "dumbing down" the error messages. Details matter. | pwinnski wrote: | Error messages should definitely be written with a target | audience in mind. For Wix, a blogging platform, the target | audience is usually decidedly non-technical. For many of the | tools I use, more technical detail would be welcome. Then | again, my parents are unlikely to use the same tools, while | they might use Wix. | artogahr wrote: | I don't understand why they wouldn't have a dropdown below the | error that would reveal the technical jargon. | Too wrote: | They do. Press F12 ;) | ARandomerDude wrote: | > If the issue keeps happening, contact Customer Care. | | This actually means "if you like wasting your time and want to | speak to incompetent fools who will pass you to an endless stream | of their 'colleagues' then dial this number." | simion314 wrote: | My recent experience with docker, I am a total newb so I was | running a tutorial step by step, then I get some error about apt | certificates/keys/repo stuff. 
After a lot of googling, it turned out the issue was | that there was not enough disk space, but the fucking error was | pointing in a different direction. Also this is a good example of | why Stack Overflow is useful, for the dudes that hate on it and | RTFM everyone else. | | This is why I love exceptions. I had an issue with a C# game, but | with a stack trace I could figure out myself that the issue was | happening when the app initializes and fails to open a file. | | I think we should always give the users a detailed log and stack | traces; also docker should fucking have some way to catch the | issue when there is not enough space and report the error | properly. | progx wrote: | @Microsoft read this article! ;) | hprotagonist wrote: | I would, if i had any evidence at all that they would be read and | acted on. I'm convinced even seemingly competent people are just | rendered contextually blind by the appearance of any error at | all. | | In the past month, i've had about a dozen interactions like this: | | developer: your service crashed, here's a screenshot of the last | 5 lines of the crash | me: do you see where the final | text you just pasted is "RuntimeError: Did not find ENVVAR, | ensure this is set to the proper value (see <internal wiki link>) | and then restart this service" | developer: yeah? | me: well, did you do that thing? | developer: what | thing? | me: <headdesk> | | and this at work, where the developer in question is intimately | acquainted with the context and purpose of the project. | grandinj wrote: | Some developers are just lazy, and will likely need some kind | of negative feedback to force them to confront their own | laziness. | | Which can be tricky, because the degree of negative feedback | that is appropriate to the person in question can range from | | "Polite one-on-one suggestion that you read the error message | more than once before calling me" | | to | | "Full on yelling at the person in the middle of an open-plan | office". 
| | Thankfully, type II is rare, but it does occur. | lupire wrote: | Send a link to wiki. Last line of page is "if you have | questions, reach out and include the keyword $THIS_PAGE_KEY | in your message." | bartread wrote: | > Some developers are just lazy | | I'm _really_ lazy: if I were on the receiving end of emails | with error messages that included instructions about how to | fix said error I'd automate Freshdesk (or whatever ticketing | system I was using) to respond with instructions specific to | that error message, in the first instance, along with a note | to get in touch again if that didn't solve the problem. I'd | also set the ticket to autoresolve after a set period of | time. | Taylor_OD wrote: | It's a little annoying but to be fair, because most error | messaging is garbage, it's easy to start to ignore them. How | often is the error message shown, and the little fix given, | actually going to solve the problem in modern web development? | 10% of the time? 25% of the time? I'd be shocked if it's that | high. | bonoboTP wrote: | It's error message blindness, similar to ad blindness. Even if | you make a great banner ad with some very useful information, | or the perfect and affordable product for my life I won't see | it because I mentally filter out ads because they are junk most | of the time. | | Some people develop the same with relation to error messages | because most of them are not actionable, other than "stuff | broke somehow, [gibberish] blabla". Even if your error message | is impeccable, it's in the class of things that are noise. | | If you come up to me at some busy tourist location, where I'm | used to lots of scammers, I won't listen to you even if you are | actually a nice person and just want to have a nice chat and we | would be compatible friends. | | Often it _is_ a good strategy to just ask people. Documentation | and comments get out of date very fast. 
If you are the kind of | person who reads everything meticulously and googles around, | reads manuals etc. you may be wasting a lot of time. Of course | there is a right balance to find. Some people err too much on | the side of not thinking for themselves and immediately asking for | handholding, but overall it's often the right thing to do. | | In many cases I found that trying to reason out what was going | on was hopeless, because when I eventually gave up and asked | someone, it turned out that the solution was unguessable, | something like "ah of course, that thing is out of date, do | this magic incantation, then this and that, yeah we should | update the docs sometime!". | | A lot of knowledge is locked up inside people's brains and just | spreads around as "rumors" on the grapevine. Is that state of | affairs ideal? No. But it's realistic and people are going to | adapt by asking first, thinking second. | Jiro wrote: | There's also the situation where the program creator likes | changing functionality on a whim, and every time you google | up your problem, you find a solution for a version of the | software that doesn't have the particular menu or whatever | that you had the problem with. | | (This is a big problem if you've ever had a problem with | Android.) | BlargMcLarg wrote: | Asking people is mostly a bad habit from a culture too | ingrained into the whole 'ask first' thing, and oftentimes | it is the people _trying to help_ that are to blame. | | I had this recently. Many individuals like to play hero and | make sure I don't get stuck because their business is an | undocumented mess. Before I even read the thing and tried, | they are already trying to give me the answer. When I ask 'is | this documented and if so, how would it be discovered easily' | their first reaction is 'no' followed by a lengthy | explanation which _should_ be in the wiki and easy for | newcomers to find. 
| | And it shows when I forget a few days later because my brain | never put in the effort to get to the answer and my memory is | that of a fruit fly's. | jimmytidey wrote: | This is a context where people are used to seeing errors that | they don't know what to do with. | | If a web app pops a well written error it is much more likely | to be acted on than an unmotivated dev seeing a some (probably | badly formatted) text. | | Every time I see an error in terminal with a link to | documentation I'm delighted. And surprised. | Kalium wrote: | Once upon a time, I worked at a financial startup (the | company is irrelevant). I created a little harness around a | static analysis tool. It would fail builds when a library had | an outstanding vulnerability scored as HIGH or SEVERE with a | patch available. The harness put a friendly error message | around it. It ran roughly as follows: | | > Hi! If you're reading this message, it's likely because | this tool failed your build. To understand why and fix it, | please click this link <link_to_internal_doc>. Below is a | table that lists the packages you need to update and the | version you need to update them to. | | The doc had at the very top in big flashing red text with | siren anigifs a link to the portion that explained that they | needed to update their libraries with _very_ clear copy- | paste-into-Dockerfile actionable guidance. The page also | explained the broader context, such as the point of the tool | and why we were doing this despite having a firewall and so | on. | | This is where you might be delighted and surprised. | | What was perhaps less delightful and surprising were the | consequences for me. About 4-6 times a week, I would then | have a Slack conversation akin to this: | Dev: Why did you break my build!?! Me: Can I see | the error message? Dev: <pastes message above> | Me: Thanks! Looking at the message, is there something | unclear about the documentation? Does it not work? 
| <ten minutes pass> Dev: Nope! Docs are great! | | At this point the conversation would end. | nerdponx wrote: | So? That's no excuse for a _developer_ to disregard the | content of an error message in their own application. | lijogdfljk wrote: | It kinda is. Kinda like when documentation is so repeatedly | outdated and incorrect, that when you need new information | you just skip documentation entirely. | | Are you wrong for skipping documentation? Yea, maybe. Is it | entirely expected? Yea. | | Based on the parent comment, at least. | monknomo wrote: | And yet developers do disregard the content of error | messages. Try to figure out why they disregard it. I doubt | the answer is "because they're stupid". The answer probably | also isn't "because they just aren't trying". | | What could it be? Why do people read things and react in | similar ways, even if they have different jobs? If only | there was some field of study that could answer these | mysteries. | outworlder wrote: | I have managed to get a lot of notoriety in my company by just: | | 1. Paying attention to error messages | | 2. Reading documentation | | 3. Looking up stuff I don't fully understand (including googling | error messages) | | That's it. | | Some people don't even read error messages at all. I understand | non technical people doing that, but I've seen far too many | engineers doing it. If anything doesn't go exactly as expected, | they freeze. I have no idea how a person gets so far in their | careers without reading error messages. Actually, I do, those | people ask others to figure out stuff for them. That's way too | prevalent in enterprise settings. Sure, collaboration is good, | but I've seen a lot of instances where there's a massive | imbalance - you'll have 10 people pinging a single person to | 'unblock' them. They could have spent a couple of minutes | trying to figure it out themselves. 
| | I'll move mountains to help someone that comes to me after | having done some basic homework to try to fix (or at least | triage) an issue. It very rare though. | | It's also amazing how many people will just go ahead without | having read a single line of documentation of the thing they | are working on. I've even had a developer dive in a Golang | codebase without having _ever_ worked on the language. That | would have been fine - that's how I learn new languages, just | get accustomed, before doing some more formal training and | exercises - except that he continued to not read the language | documentation before asking a bunch of questions. Needless to | say, the questions weren't good. | | And number 3... just rubber ducky everything. If you can't | explain it, you don't get it. Go read up on the topic. | Sometimes I'll find out that I don't fully understand something | as I'm writing an email to others. | vladvasiliu wrote: | > I'll move mountains to help someone that comes to me after | having done some basic homework to try to fix (or at least | triage) an issue. It very rare though. | | This. I actually am OK with people not figuring out even | basic stuff. But please, at least try to give the impression | that you've put some effort in, instead of just trying to | have me do your homework while you browse facebook or | whatever. | dan_mctree wrote: | > except that he continued to not read the language | documentation before asking a bunch of questions | | Can't really blame people for that too much, most language | documentation is utterly unreadable unless you already know | exactly what you're doing. And even if you do get it, it's in | one eye and out the other. Most people just don't learn very | well from reading technical information you don't need to use | right away. 
You might be a happy exception and got to build | up your notoriety that way | tetha wrote: | > I'll move mountains to help someone that comes to me after | having done some basic homework to try to fix (or at least | triage) an issue. It very rare though. | | These are rare, but they also tend to be the really effective | ones. We have a couple of teams who understand the stack, | read documentation and read error messages. We generally | don't hear of them for months and months, because they are | too busy being productive. | | But when we hear of them, it's usually time to push | boundaries of the infrastructure and the processes. They | tried everything and nothing worked and now it's time to make | it work. | bob1029 wrote: | This is a lesson I learned while being system owner of the | primary user interface that runs on a semiconductor factory | floor. No amount of confirmation/warning dialogs will actually | stop someone from doing a wrong thing. Doesn't matter how scary | the language is. Here's an approximate sample of one: | "DANGER! Confirming this action may result in 8 figures worth | of scrap!!!" | | Even if you are super careful and make sure your error messages | are terse in all cases, you will still succumb to things like | muscle memory among your users. I've caught _myself_ mindlessly | dismissing these while testing. How can I expect my users to be | better than the person who developed the UI? That is | unreasonable. | | It got to a point where we started _removing_ these alerts | /confirmations because it was training people to do the wrong | thing in a few places. If you have part of a UI where all | actions are immediate and final, the game theory changes. The | moment a user enters into one of these spaces, they are much | more cautious. | | If the user thinks the UI will save them, they may eventually | tire of these protections and forget why they are there in the | first place. 
I feel like this is very similar to the problem of | driver assistance and partial self-driving capabilities today. | nkrisc wrote: | The goal of writing better error messages isn't to help the | people who never read error messages, it's to help the people | who do and who you never have to hear from. | marklubi wrote: | The trick that I've found is that each error message needs to | be unique... not just the stack trace, but the actual wording | of the message leading up to that. | | Get a screenshot or the exact verbatim of it, and you can | identify exactly where in the code it originated. | | User reports are unreliable, but when I can pinpoint where | the message originated from, it massively cuts down on the | troubleshooting time. | legulere wrote: | In RFC 7807 all errors get an unique URI. Message texts | might change or be translated into a language you don't | understand. | shadowgovt wrote: | It turns out translating error messages is controversial. | | Users, upon hitting an error, often go check Stack | Overflow. If you localize your error messages, you | Balkanize the collective wisdom on how to address the | error (which will always be larger than your team's | ability to troubleshoot errors and offer correctives in | your documentation and FAQs). | BerislavLopac wrote: | To be precise, each error _type_ gets a unique URI. | | A good way to take advantage of that is to have a central | database of all error types, but not many companies | bother to do that. | mi_lk wrote: | > have a central database of all error types | | do you have any example? | zem wrote: | here's ours for pytype (a python type checker): | https://google.github.io/pytype/errors.html | alisonatwork wrote: | A useful thing here is not just to include a unique error | code for the type of error (usually numeric), but also to | generate some kind of short Base32 or similar hash and | print that right next to the error message while logging it | to your normal back end. 
Then whether people send you a | screen shot, copy/paste, whatever, you can easily search | the logs to find the exact event that occurred. | mceachen wrote: | Better still: add a unique prefix to the error code, so | it's googlable. | | The Typescript team does this with compilation errors, | like `TS12345: frobulating types cannot be transmuted`. | [deleted] | rmetzler wrote: | Yes, that type of thing is pretty useful for linters. | These error codes act as identifiers if you need to | google them and whenever you need to configure the linter | the way you like it or for one-off exceptions. | lucb1e wrote: | > each error message needs to be unique | | Include random numbers. "Error 7743929" is super easy to | track down (grep -r 7743929 takes 2 seconds to type), you | don't need a NATO alphabet to understand what they're | saying on the phone in order to be able to search it | correctly, its general purpose is understood | internationally, and it won't change between versions (like | when you'd encode a file name and line number, for | example). When I first figured this out at, idk, 17 years | old and mentioned the idea in a game making forum, people | called me crazy, but I still use it and don't know of any | better system. | | Of course, this is _alongside_ an actual error message to | help the user help themselves. This is just to trace the | line where it originated, which already helps a lot for | small software projects like I make. | Too wrote: | About that, the number of _developers_ that can't read, or | even understand the value of, a stack trace is also | astonishing. | | If only I had a penny every time someone sent me a "log of | the error", that only contains the final line with the | unhelpful message saying nothing but KeyError. | vladvasiliu wrote: | Forget stack traces. 
| | I've met multiple "web developers" (actually working on | the backend or "full-stack", building API servers and | whatnot) who came complaining about this or that server | being "unreachable" and could I check it's up / whether | the firewall allows them through. Only to find they were | getting HTTP 404 errors or the like. Which were explicit | in the errors they'd show me. | lamontcg wrote: | At prior work we removed stack traces from the default | error output because it was thought to "scare" too many | users. | | Then for years almost without fail when an error was | pasted into a GH issue it would include the big "If | submitting a bug report, please include the full stack | trace at /var/log/stacktrace.out" message--without the | stacktrace. I added some whitespace around it and all | caps to it and still nobody read it. | [deleted] | dylan604 wrote: | I used to lean on line numbers, but those quickly fall out | of sync with deployed code and what's currently checked out | and available for immediate debugging. I've also switched | to using unique text you mention as it will always find the | place in the code regardless if it has been moved. | | I wish I had learned that earlier than I had. | EvanAnderson wrote: | I am reminded of the classic non-intuitive survivorship bias | example from WWII re: armoring bombers: https://en.wikipedia. | org/wiki/Survivorship_bias#In_the_milit... | vkou wrote: | Or, in the anecdote above, to help yourself, when you are | inevitably contacted by the person who never reads error | messages. | rjmill wrote: | > evidence at all that they would be read | | I just had an idea: Put tracking info in the error URL. If your | company has an internal URL shortener, that could do the trick. | | More practically, I feel like it helps to put an empty line | before the call to action. For many people, a traceback is just | noise. The empty line helps split the useful info out from the | traceback. 
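Several ideas from the comments above (a unique, greppable code per error type, a short per-incident ID that also lands in the back-end log, and a blank line before the call to action) might be combined along these lines. This is a minimal sketch under stated assumptions: the function name `report_error`, the logger name, and the code `APP1042` are hypothetical, and the incident ID here is random Base32 rather than a hash of anything.

```python
import base64
import logging
import secrets
from typing import Tuple

# Hypothetical logger name; a real service would use its own logging setup.
logger = logging.getLogger("app.errors")


def report_error(code: str, summary: str, action: str) -> Tuple[str, str]:
    """Log an error and build the text shown to the person hitting it.

    `code` identifies the error *type* (stable, greppable in source and docs).
    The returned incident ID identifies this one *occurrence*.
    """
    # 5 random bytes encode to exactly 8 Base32 characters with no padding.
    incident = base64.b32encode(secrets.token_bytes(5)).decode("ascii")

    # The same code and incident ID go to the log, so a screenshot or a
    # pasted message can be matched to the exact logged event.
    logger.error("%s incident=%s %s", code, incident, summary)

    # Blank line before the call to action so it stands apart from the
    # description (and from any traceback printed above it).
    message = f"{code} [{incident}]: {summary}\n\nWhat to do: {action}"
    return incident, message
```

For example, `report_error("APP1042", "Did not find ENVVAR", "Set ENVVAR to the proper value, then restart this service.")` returns the incident ID and a two-part message. Searching the logs for the ID finds the occurrence, and grepping the source for `APP1042` finds where it was raised.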
| | Or if it's a script/CLI (and you know the error reason) don't | even show a traceback. Just print the error message to stderr, | exit non-zero, and be done with it. | residualmind wrote: | Actually reading (and understanding, acting upon) error | messages seems to be part of the learning process of every | developer. And while more senior devs usually do read error | messages, even they sometimes, rather than reading it will jump | to behavior like "trying again a different way", before looking | closely what went wrong. | hinkley wrote: | Developers often seemed shocked that people can't find the | important error in a wall of text. A particular peeve is when | the same error is reported three ways and the real error is | sandwiched between others or scrolled off the screen due to | spammy behavior. | [deleted] | ajnin wrote: | How many interactions didn't you have, because the developer | read the error message, read the Wiki, and ultimately solved | the issue themselves ? | [deleted] | zagrebian wrote: | This just means that the error message needs to be more clear. | For example, after the error itself, it could give direct | advice: "PERFORM THESE STEPS: You must define ENVVAR. Go to | <wiki link>. Set ENVVAR to a proper value and restart the | service." | | Notice the direct language. It reads like an order. The less | direct the message, the higher the chances that the user will | not act upon it. | MiddleMan5 wrote: | I can't tell if this is sarcasm or not, this is obviously | highlighting a deeper issue in developer culture. | | The example given _was clear_ compared to 90% of other error | messages, and saying that it needs to be "more clear" is | almost dismissive | Aperocky wrote: | Don't blame developer culture, if _that_ error cannot be | acted on, attribute to incompetence and not culture. | ckozlowski wrote: | I think you're correct. 
To add to this (and I think it's the | point that the article was trying to make), errors written in | fragmented language or "developer speak" I feel are likely to | get glossed over. The "Write it like you're talking to a | friend." advice the article gives I think is spot on. Making | the message more conversational is to invite better | understanding and comprehension. | | I feel there's a trend when it comes to disseminating | messaging like this that we adopt an attitude of our audience | "is smart, and should figure the rest out". They may be. But | they already have lots to do and plenty to figure out. Any | opportunity we, the requestor, can lighten their mental load, | is going to increase the odds that they'll be inclined to | take action right away. | duxup wrote: | The problem is people are not rational... and we try to solve | that with software. | | Many people just lock up when software doesn't do what they | expect. | hinkley wrote: | Lots of people find ways to irrationalize being rational. | vbezhenar wrote: | Not rational people must be fired from IT. | duxup wrote: | Generally a pipe dream in my experience. | bee_rider wrote: | There's a type of error for which the user can be given | detailed step-by-step instructions (permission issues, etc). | But to some extent, errors should handle situations the | programmer didn't expect. If it is possible to provide | detailed step-by-step fixes, then the program should do those | steps itself. | | Adding a URL might not be a great plan, never know how long | an old copy of a program will stick around, might not control | that website forever. | dvtrn wrote: | I'm not seeing how what the message already is any less | direct or clear than what you're saying it should be? It | straight up tells you it can't find the var and what to do | about it. | | Can you help me understand what isn't clear about the message | as is, or maybe point out the ambiguity to someone who just | isn't seeing it? 
I want to write better error messages but I | share the frustration of the above poster. The message tells | you specifically what to do, but you're coming back saying | it's not clear. | lupire wrote: | Some people don't read anything that isn't an all-caps | command. They have learned helplessness from seeing too | much useless error text in the past. | j-bos wrote: | I think the original error is quite clear, under normal | circumstances. | | Not OP but I've noticed that people often get brain fog | when something goes wrong and often need BIG, SHORT, | WORDS to shake out of it. Or really anything that can shake | them out of the 'idunno' state of mind. | | But maybe if something like that became standard it would | no longer be a context switcher. | ckozlowski wrote: | I think you're spot on, and I made a similar comment | above. | | It's easy to say "they can figure it out". Sure, in a | restful state. But the people we're asking to take action | already have a lot on their plate. Using plain, | conversational language whenever possible with | exceedingly clear steps means less mental exertion on the | receiver. And since we need their help, anything we can | do to make it easier on their end helps us. | Too wrote: | Conversational errors can also be fatiguing. Often what | you want is something short and dry that can be pattern | matched. Compilers are pretty good at this because all | their errors start the same way. | | Error | in file foo/bar.c, line 32, missing semicolon. | | No conversation needed. These can then be complemented | with more conversational language on the next line to | explain why semicolon is needed. Rust is quite good at | this. | dvtrn wrote: | These are fascinating responses to me, as with the | example given my mind first went to someone for whom | English is a second language. 
that group having trouble | with this message I would understand, or at least have an | easier time understanding having trouble, if even a very | little amount. | | For someone who was born speaking English and spoke it | their entire lives, the example provided couldn't | possibly be more to the point in my opinion. | | Though I agree overall with the general idea and that yes | there are some pretty baffling and downright awfully | written error messages and log entries that take a minute | to grok (I just don't think the example replied to is one | of them). | bombcar wrote: | Some of the errors that Gentoo portage can encounter do | exactly this - and they do it with beautiful terminal colors | that make it easy to figure out what you need to run, or | where to go to figure out which of the three options you | need. | | The problem can come when there's a wall of "useless" | logging/error messages, and the last one or near the last one | is the actual important one to look at. You have to | explicitly call it out on a clear screen and make it obvious | - and even then, people won't always read it. | mariusmg wrote: | >it could give direct advice: "PERFORM THESE STEPS: You must | define ENVVAR. Go to <wiki link>. Set ENVVAR to a proper | value and restart the service." | | Really, should logs also be documentation now ? Just | mindlessly logging the same "advice" over and over again each | time the error happen ? | dementiapatent wrote: | It will be so much fun when the implementation is | refactored and half of these comments are forgotten about | and no longer meaningful. | prerok wrote: | Exactly. At one of my previous workplaces there was a | cumulative effect of misattributed error messages so the | actions to perform were often of no help. | | Not even to mention the fact that new or changed error | messages caused a landslide in costs in translations to | various languages. I guess this product has no | localization? 
At that time, when I was working at such a | product that had it, we had to go through a deliberate | process to describe why we want to change it, what the | impact is, etc. Tell me you want 100 new messages and you | will be stuck in meetings for the next month. | | In their case, though, it seems they at least have the | support in management for it. I hope it turns out better | for them than it did for me. | SpicyLemonZest wrote: | I had an error message a few months ago that instructed | me to reinstall the AWS CLI, I filed a ticket when that | didn't work, and the team was annoyed with me because | _obviously_ the real problem was a Python configuration | warning with no suggested action 10 lines up. | kortex wrote: | It depends who, what, and when the error is about. Failures | are generally a bathtub curve. You have a high rate at | start (usually configuration issues), some fairly fixed | rate during operation, and then more at end of lifecycle | (exhaustion, service hiccups on scale-in). | | If it's in the early lifecycle, absolutely, because it's | most actionable. X is set wrong, Y can't be reached, etc, | guide whoever is operating the system how to fix it. | | If it's mid cycle, it's often post-hoc, but context is | worth its weight in gold. Less about telling the operator | how to fix and more about why it broke, to avoid in the | future. | | End of cycle, whatever. | 0xbadcafebee wrote: | Logs actually are a form of documentation. Documentation | can provide instructions on how to diagnose and fix | problems, and that's what logs do: tell a human being what | a problem is and how to fix it. | | Remember that often the person reading the logs is not the | person who wrote the software. Maybe it's an Ops person at | 2AM trying to fix a broken deploy. Maybe it's a developer | who joined the company 3 years after the software was | written. Maybe the log is passing through an error message | from 3 layers deep in the stack. 
The more literate your | logs are, the better. | ddulaney wrote: | Logs can definitely be a form of documentation. | | I write software that is generally run low in the stack, | quietly doing some mundane tasks that are business-critical | but rarely thought about. If one of our clients has to mess | with our software beyond the occasional update, that was a | failing. Not all software is like this, but lots of it is | -- its value is that no human needs to be involved. | | I need to write log messages with the expectation of an | audience who doesn't know much about the software -- it's | been running uninterrupted for months or years and suddenly | something has gone wrong. If the log line doesn't tell the | user how to solve their problem, I will end up getting a | call. | throw827474737 wrote: | If it is that simple, then why doesn't the code fix it | itself? But no, usually there are 1/2/3 likely things, but | it also could be anything else.. and that kind of | unexpected error often has no default fix. | | No, the best thing is to point to the documentation | which has that, and not to print out manpages of docs in | error messages now. | | > I write software that is generally run low in the stack | | What stack, how low? Me too.. that low that I usually | cannot return or even log a " see error code doc at | http.." string for various reasons (bandwidth, mem, | performance) but only have error codes ;) | pwinnski wrote: | In the case at hand, where an environment variable isn't | set, how exactly should the code fix itself? Human | interaction is necessary, which is the reason the log | message should spell out what the human needs to do. | | If I'm starting a service and see a pointer in the logs | to documentation, that seems like an incredibly broken | approach to me. Why would I look at missing or out-of- | date documentation that may or may not be at hand when | the code that knows the problem is _right there_ and can | just tell me? 
A log message like you're describing might | as well say, "Something went wrong, but I don't want to | tell you what. Instead check page 43 of the document in | the third file cabinet from the left in that room over | there on your right. No, your other right." | an_ko wrote: | I don't want to have to hunt for documentation if it | breaks. It may have been 30 years and everything but the | binary has been lost, and the vendor is out of business. | If in that situation all I get is an error code and a | link to documentation that doesn't exist, I'd have to | start reverse-engineering. And while doing so I'd | definitely be cursing the coder who decided that saving a | couple hundred bytes of space in a log file in the event | of an "abort the program"-severity event was worth | dumping this in my lap. | Spivak wrote: | Errors on initialization, fatal errors, and non-recurrent | errors that require human/support intervention should be | documentation. | hinkley wrote: | If the error results in the program shutting down, it's | once per fatal interaction. | | In other words, yes. | chillfox wrote: | Yes! We have tools to filter what gets saved and | compression that handles repeated text very well. | | So why not provide docs on how to solve the error along | with the error. | eyelidlessness wrote: | This is fairly common in good error logs. | pwinnski wrote: | Yes! | | There are people who don't read formal documentation but do | read logs, after all. | | If the advice is the same over and over again, then yes, | give the advice over and over again. I wouldn't want to | assume that someone has read every line of the logs, or has | started to read top-to-bottom, so the advice should always | be among the most recent lines in the log, and the only way | to ensure that is to give the advice again each time the | error happens. | pydry wrote: | It more likely means that the developer views the service as | OP's responsibility. 
They'll view an order as something OP | needs to do. | | The clarity of the error message doesn't really matter if the | recipient believes it is intended for somebody else. | quintussss wrote: | Isn't this just survivor bias though? You only hear from those | that fail to read and act on the error message. | Joker_vD wrote: | Well, imagine the error was simply "RuntimeError: Environment | variable not set" instead, then how much of your time would | have been wasted by those dozen interactions? | chillfox wrote: | Don't send people somewhere else to learn how to fix the error. | The more steps and indirection you add the fewer people will | bother doing it themselves, especially if they can bump it to | the developer. Make it easy for people to fix their own | problems by being explicit, direct and complete. List all the | steps and use formatting to make it visually easier to consume. | | So your error message while a far cry from the worst I have | seen is also pretty far from the good ones I have seen. | starkd wrote: | I think his point was the developer tends not to even | investigate the ENVVAR at issue or visit the link. If the | developer does investigate the link and still has an issue, | then you have a point. | chillfox wrote: | Pretty sure his problem was he got contacted about an issue | he considers uninteresting, and his preferred solution is | the user stops behaving like a human. | | Reaching for the easiest way to solve a problem first is a | very human thing to do, and in this case he was easier to | contact than opening up a browser and reading an article | that presumably is written in the same kind of language as | the error message. | starkd wrote: | I admit to doing this. Even many of the useful error | messages that clearly indicate the fix are drowned out | by the mass of output. I've made this mistake before, | and I'll probably do so again. 
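One way to keep an error from being drowned out by routine output is to write it to stderr as a spaced, multi-line block with numbered steps. The Python below is only a rough sketch of that idea; `format_error_block` and `report_error` are hypothetical names made up for illustration, not code from anyone in this thread.

```python
import sys


def format_error_block(summary, steps):
    """Return a multi-line error block padded with blank lines so it stands out."""
    lines = ["", f"ERROR: {summary}", "To fix this:"]
    for number, step in enumerate(steps, start=1):
        lines.append(f"  {number}. {step}")
    lines.append("")
    return "\n".join(lines)


def report_error(summary, steps, stream=None):
    """Write the error block to stderr (or a given stream), never to stdout."""
    stream = sys.stderr if stream is None else stream
    stream.write(format_error_block(summary, steps) + "\n")


if __name__ == "__main__":
    report_error(
        "ENVVAR is not set.",
        ["Set ENVVAR to a valid value.", "Restart the service."],
    )
```

Because the block goes to stderr, it can be redirected to its own file (for example with `2> errors.log` in a shell) while normal activity stays on stdout.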
| chillfox wrote: | I feel like this is a problem of overly chatty | application logs + lack of formatting for errors. | | If the volume of drivel was lowered and errors were | formatted with spacing and color to stand out, then they | would be easier to focus on. | | So log errors to stderr, send it to a separate log file, | and format it well (use multiple lines). | marcosdumay wrote: | > So log errors to stderr, send it to a separate log | file, and format it well (use multiple lines). | | Oh, for sure. Never: | | - send errors to the same log you send normal activity. | | - default into logging things that aren't errors on the | error log (make this possible to override if you want, | but never the default). | | - log the errors there, but the necessary context on | stdout so it appears correct on a terminal. (E.g. build | tools that print entering into target in stdout; error in | stderr; leaving target in stdout) | | - try to recover just to show a different error later. | dvtrn wrote: | I'm left wondering at what point the "give a man a | fish/teach a man how to fish" method of pedagogy applies in | terms of 'acting like a human' in this context? | | Asking as someone who otherwise generally agrees that | there are some truly poorly written errors and exceptions | out there, but has also been on the admittedly | frustrating end of the constant requests for help | deciphering error messages that were very plainly stating | what the problem is for someone who didn't even try | looking for the fishing rod. | chillfox wrote: | Sure, clearly there are people who will never try, or | learn, but in general as an industry I feel like the vast | majority of errors are very very far from good. | | Few error messages are written well, have good formatting | and are self contained (can be used to fix the issue | without having to seek further information elsewhere). | Sometimes you see errors that contain one of those | elements, but rarely all of them. 
| | There has been an effort the last few years improving | compiler errors for some languages, but those same | improvements have not reached applications. | coldacid wrote: | The help desk guys are on the other side of a cubicle wall from | my workstation, and almost every call I overhear about someone | getting errors just convinces me further and further that | people don't only not pay attention to the error message, they | don't pay attention to the people they're calling to help them | get through the situation either. | lupire wrote: | Use the error messages you wrote! Send them the link they sent | you, and move on. | pizza wrote: | I mean it kinda makes sense. When you're coding, you're | constructing something. When you're debugging, you're | deconstructing something. I feel like it's natural for people | to take a sec to codeswitch, bc they were likely in a state of | flow w/ considerable momentum up until they saw the error | llbeansandrice wrote: | I feel like I can't get folks to open the log file and cmd-F | "ERROR" half the time. | TillE wrote: | I've seen this _constantly_ over the years, people who | absolutely refuse to read the simplest instructions, but | instead require step-by-step hand-holding from you personally. | | I have no idea how these people get through life at all. | 0x457 wrote: | Hey, let's jump on a quick call, so we can go through this | together and maybe update docs if they're out of date? | dagw wrote: | I suspect that, at least subconsciously, they're to some | extent doing that to punish you for writing 'bad' software | that they have to struggle with. If they're going to suffer, | you're going to suffer right along side them. | 0xbadcafebee wrote: | The problem is here: _" RuntimeError:"_. Once they saw that, | they stopped reading. _" Did not find ENVVAR"_ [..] _" ensure | this is set to the proper value"_ [..] _" and then restart the | service"_ are also obscure and will stop them from reading. 
| | Why is the user like this? Error message PTSD. Years of staring | at obscure errors full of technical jargon that are not helpful | to the user, has left them scared to even _look_ at the content | of the error message. They have tried to Google these things | before and failed, and now they just avoid it entirely and run | for help. | | I'm sure there's enough detail in the link you provided to help | the user. But if that's the case, it will be better for the | error message to simply say: A problem | occurred, but don't worry! You can fix it yourself in 5 | minutes! For instructions, visit https://internal-wiki- | link/spaces/BLAH/AppUserRuntimeError#A013579 | | Even if you expect the user to be "smart enough" to fix their | own problem, they are more likely to try it themselves if you | make it seem easier. | Kalium wrote: | I tried exactly this approach! What I got was a bunch of | developers copy-pasting the error message with helpful URL at | me and demanding to know what they should do. The number who | followed the link and fixed the problem themselves was | shockingly small. | | Going out on a limb, I think we're all going astray by trying | to parse the error messages our fellow developers are | reacting to. A great many seem to handle any unfamiliar or | unexpected error message by giving up, no matter how friendly | or informative or helpful it may be. | bonoboTP wrote: | They don't parse the error message as a natural language | sentence talking to them. They take it as an opaque string, | like a big error code. It literally passes through them | without getting interpreted. | | They learned that the affordances of these error messages | are copy pasting into some place: a google search box, or a | chat box asking for help. But it has no affordance of | "interpret as an English sentence" for them. | 0xbadcafebee wrote: | If that's the case, then these people may just need | training. 
It's likely that nobody has ever sat them down | and explained that they have a responsibility to | investigate their own issue. Often people feel they have to | rush to get something done, and that they _can 't_ take | time to troubleshoot. But if their bosses explained that, | actually, it's fine if your work is a little late due to | troubleshooting, they might do it themselves more often. | You also may need to provide back-pressure by interacting | via email/ticket. | Kalium wrote: | That's a kind, caring, compassionate, empathetic approach | founded on assuming good faith. | | Unfortunately, it is perhaps not an ideal fit. I was | mostly not dealing with the most junior and new of | developers here. I was often dealing with senior | developers who fully understood that they were | responsible for investigating their own issues in a | context where it was understood that troubleshooting | takes time. | | I often wound up regurgitating the error message back to | them, asking them to point to the problems in the | documentation getting in the way of them solving their | own problems. This generally resulted in a conspicuous | silence and the issues shortly thereafter being resolved. | | The lesson I drew from this was not that the developers | in question needed training. What I learned was that they | needed to be convinced to treat these errors as natural- | language strings they could interpret themselves. | tlogan wrote: | This is 100% correct. | | In theory, all errors should: explain the input, explain the | problem and explain how to solve the problem (actions). And | that should help and reduce number of support calls. However, | error messages and actions how to solve the error are read by | maybe 1% of users. | | The only way to improve your UI is to prevent errors and use | standards / familiar design. 
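The three-part rule above (explain the input, explain the problem, explain how to solve it) might be sketched in Python roughly as follows. This is an illustration under assumptions, not anyone's actual code: `ConfigError` and `require_env` are hypothetical names, and ENVVAR stands in for the variable discussed earlier in the thread.

```python
import os


class ConfigError(Exception):
    """Error that records what was checked, what went wrong, and what to do."""

    def __init__(self, checked, problem, action):
        self.checked = checked
        self.problem = problem
        self.action = action
        super().__init__(
            f"{problem}\n  Checked: {checked}\n  What to do: {action}"
        )


def require_env(name, environ=None):
    """Return the value of an environment variable or raise a ConfigError."""
    environ = os.environ if environ is None else environ
    value = environ.get(name)
    if not value:
        raise ConfigError(
            checked=f"environment variable {name}",
            problem=f"{name} is not set, so the service cannot start.",
            action=f"Set {name} to a valid value and restart the service.",
        )
    return value
```

Keeping the three parts as separate attributes means a UI could show only the action to end users while the full text still goes to the log.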
| JTbane wrote: | >>>"RuntimeError: Did not find ENVVAR, ensure this is set to | the proper value (see <internal wiki link>) and then restart | this service" | | I'm laughing as you could not make it clearer if you tried. | PEBKAC | [deleted] | onion2k wrote: | Shouldn't the app gracefully exit with a clear message, and not | bail out in a way that looks like a crash? I'd guess that the | person who wrote it hooked into the error handler because that | was the easy thing to do rather than bother to write a nice way | to exit properly. | | The fact that you've had this _a dozen times_ points to a | problem with the app more than the people using it to me. | CityCobra wrote: | Still, if you write proper error messages then at least _you_ | can figure out what the issue was without SSHing into the | person's computer and checking their logs. | [deleted] | xiphias2 wrote: | A "Try again" button is the worst way to solve the problem of | having no connection. GMail does it right by trying again | automatically periodically while having an error bar on the top | of the screen, at the same time not stopping the user from using | the application. | | If Wix can save the data locally, why not just copy the GMail | error interface and let the user decide when to connect to | the internet? | he0001 wrote: | I believe that any language that treats errors and error | management as an afterthought is bad. Also any programmer that | treats errors as an afterthought or simply ignores them is going | to write bad code/programs. Errors are hard and need language | first level support. People talk about "higher order functions" | but never how to deal with errors (mainly because it's boring and | complicated). Also errors are tightly coupled with intentions, as | if you fail to do something, well that's an error. But that also | means that it's tightly coupled with what the program is trying | to achieve. So anywhere an error happens should be close to what | it tries to do. 
Also it solves what an error is all about, which | makes it easy to describe what it should be. Yes there are errors | that may not fall into this category as they are much less | related to what you are trying to do functionally. Any program | which ignores how errors work and flow, in my experience, has | always been bad in general, as the structure of it is also bad as | there's no organization. | londons_explore wrote: | A big part of this is to direct more of your development time | into errors that happen more frequently. | | Most systems I was involved in designing have some kind of error | tracking system, so we can know exactly how often each error | occurs. | | An error that never happened needs (usually) no attention. | | An error that 28% of installations have seen needs _a lot_ of | attention. The error text should be translated into local | languages, wiki pages should be written about how to resolve it, | efforts should be made to auto-resolve the error. The error | message should include helpful info, etc. | | Eg. "SSH server can't start. Config file unreadable". | | Could be split into: | | SSH server can't start. Config file error on line 7. | 'AllowPasswordLoogin' is an invalid setting. Did you mean | 'AllowPasswordLogin'? If you want to make this change, 'sudo nano | /etc/sshserver.conf' will let you change this config. | [deleted] | imwillofficial wrote: | I saw an error message the other day: | | "Deployment failed because: deployment succeeded" | cpeterso wrote: | If you have tech support or knowledge base articles for your | product, you can include unique error codes in your error | messages so that Googling the error code will find the | appropriate support article. Microsoft is pretty good about this | with their KB article numbers and their compiler error messages | like C4000: https://learn.microsoft.com/en-us/cpp/error- | messages/compile... 
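The unique-error-code idea above could look something like this in Python. It is a minimal sketch only: the `APP-1001` style codes, the catalog entries, and `format_error` are all made up for illustration and do not correspond to any real product's codes.

```python
# Hypothetical catalog: the codes and texts below are invented for illustration.
ERROR_CATALOG = {
    "APP-1001": "The configuration file could not be read.",
    "APP-1002": "The server could not be reached.",
}


def format_error(code, detail=""):
    """Build a message that leads with a stable, searchable error code."""
    summary = ERROR_CATALOG.get(code, "An unexpected error occurred.")
    message = f"[{code}] {summary}"
    if detail:
        message += f" ({detail})"
    return message


if __name__ == "__main__":
    print(format_error("APP-1001", "permission denied"))
```

The code stays the same even if the wording or translation of the message changes, so a support article indexed under it remains findable.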
| andrewguenther wrote: | Bonus points if your link to customer care auto-populates the | fields necessary to get the ticket where it needs to go and can | attach relevant diagnostic information to the resulting ticket. | swyx wrote: | write errors that don't make me think: https://dev.to/swyx/write- | errors-that-don-t-make-me-think-24... | larsonnn wrote: | Just tell me you can't connect with a big red Error message. I | don't give a damn about polite error messages. | kgeist wrote: | What the article is missing is how they learned the new error | messages are now more helpful to the end user. Some kind of | metrics: maybe, the number of support tickets/angry reviews | decreased? Otherwise without clear criteria for success I'm not | sure if it was worth it and wasn't just changing the error | messages for the sake of changing. Sure what they talk about | makes sense but "it makes sense" is not a business metric. | fleddr wrote: | This is great, I would add one critical ingredient: provide | actual customer care. | | Meaning, the "way out" is to point users to customer care, but | this still does not help if customer care is shit. And we know it | often is. | | Customer care should be an email address (and/or phone number) in | the footer. Not a contact form. Self-help/FAQ is fine, but no | replacement for direct contact. Nor is a shitty AI bot. | | And when contacting support directly, answers should not be | scripted non-sense completely ignoring the actual issue at hand. | | I don't care if it doesn't scale. Make it scale. Your problem. | p5a0u9l wrote: | Was hoping to get insight on better logging for engineering | users, not UX design. ___________________________________________________________________ (page generated 2022-10-19 23:00 UTC)