[HN Gopher] A clickjacking vulnerability in WhatsApp that enable...
       ___________________________________________________________________
        
       A clickjacking vulnerability in WhatsApp that enables phishing
       attacks
        
       Author : enimodas
       Score  : 297 points
       Date   : 2023-12-22 10:12 UTC (12 hours ago)
        
 (HTM) web link (00xbyte.github.io)
 (TXT) w3m dump (00xbyte.github.io)
        
       | neverrroot wrote:
       | Using a unicode character to reverse order of characters and
       | create links that have "trusted" value like:
       | ln.instagram.com//:sptth. Neat and indeed something that could be
       | well exploited.
        
       | j4yav wrote:
       | Really interesting approach to use right to left override in that
       | way, that's very clever.
        
       | AlexSW wrote:
       | Interesting ideas and vulnerability! With a nice and concise
       | summary. Thanks for sharing
        
       | iLoveOncall wrote:
       | I remember this already existed on Windows Explorer 2 decades
       | ago, it's funny to see it "rediscovered".
        
         | rany_ wrote:
         | The attack still works and it is less obvious than you might
         | expect. For context, an SCR file is a regular executable,
         | treated the same as a .EXE or .COM.
         | 
         | From https://attack.mitre.org/techniques/T1036/002/:
         | 
         | > RTLO is a non-printing Unicode character that causes the text
         | that follows it to be displayed in reverse. For example, a
         | Windows screensaver executable named `March 25 \u202Excod.scr`
         | will display as `March 25 rcs.docx`. A JavaScript file named
         | `photo_high_re\u202Egnp.js` will be displayed as
         | `photo_high_resj.png`
         | 
         | I think the examples are pretty scary if you ask me, but most
         | anti-virus software does warn you when it comes across those
         | types of files.
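
A minimal sketch of how a scanner might catch the file names quoted above. The helper names `real_extension` and `has_bidi_controls` are made up for illustration and are not taken from any anti-virus product. The idea is that the stored (logical) order of the name decides what the OS runs, no matter how the name is rendered.

```python
# Explicit bidirectional formatting characters: embeddings, overrides,
# isolates and their terminators.
BIDI_CONTROLS = {
    "\u202a", "\u202b", "\u202c", "\u202d", "\u202e",
    "\u2066", "\u2067", "\u2068", "\u2069",
}


def real_extension(filename: str) -> str:
    """Return the extension in stored order, which is what the OS acts on."""
    _, dot, ext = filename.rpartition(".")
    return ext.lower() if dot else ""


def has_bidi_controls(filename: str) -> bool:
    """True if the name contains any explicit bidi formatting character."""
    return any(ch in BIDI_CONTROLS for ch in filename)


if __name__ == "__main__":
    for name in ["March 25 \u202excod.scr", "photo_high_re\u202egnp.js", "report.docx"]:
        print(repr(name), real_extension(name), has_bidi_controls(name))
```

Printing with `repr` keeps the `\u202e` escape visible instead of letting the terminal reorder the text.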
        
           | moritzwarhier wrote:
           | Unicode is hell really.
        
             | jeroenhd wrote:
             | The RTL override is necessary for embedding right-to-left
             | content inside left-to-right text. If you ever want to
             | combine Arabic and English in one sentence, you'll probably
             | want an override in there.
             | 
             | You could use HTML and other formatting tricks to do the
             | same, but this control character is a very valid and useful
             | part of Unicode.
        
               | moritzwarhier wrote:
               | Yes, hell is other people (pun not meant in a culturally
               | divisive way).
               | 
               | Unicode is extremely useful and a great engineering
               | success. My comment was a bit tongue-in-cheek, sorry.
        
               | jeroenhd wrote:
               | I guess I misunderstood you, apologies!
        
               | moritzwarhier wrote:
               | No need to apologize, my comment really wasn't very
               | clear. Have a nice weekend!
        
       | Flimm wrote:
       | It's disappointing that Meta chose not to fix this and chose not
       | to reward this researcher with a bug bounty.
        
         | IshKebab wrote:
         | I expect they probably didn't make clear exactly what they
         | wanted fixed (blacklisting the RTL character) and Meta thought
         | they wanted all misleading URLs fixed which is not really
         | possible.
        
           | acdha wrote:
           | That's still a Meta problem. Simply confirming the PoC should
           | have made it clear that they need to fix something.
        
             | AlecSchueler wrote:
             | POC?
        
               | Zu_ wrote:
               | Proof of concept.
        
               | cyann wrote:
               | Proof of Concept
        
           | henearkr wrote:
           | How can they blacklist this character while still supporting
           | URLs in right-to-left languages?
        
             | amenhotep wrote:
             | The client should probably be aware of whether the user
             | might be expecting RTL text and maybe display a warning if
             | not? Arabic users receiving a URL containing the character
             | shouldn't raise any eyebrows, but if a random Anglo user
             | clicks on one it might be worth displaying a warning that
             | it's backwards text? At least the first time.
        
               | WilTimSon wrote:
               | It wouldn't really solve the problem, sadly. The
               | percentage of people who'd bother to read that warning is
               | likely quite low.
        
               | berdario wrote:
               | I don't think that's a good reason for not including such
               | a warning.
               | 
               | "quite low" for a service with billions of users, can
               | still allow for millions of users who would benefit from
               | seeing the warning.
        
               | thiagoharry wrote:
               | This does not solve the issue for Arabic users. It doesn't
               | sound good to me to declare the problem solved just because
               | it was solved for people speaking certain languages, or to
               | attack the problem by excluding certain languages.
        
               | berdario wrote:
               | That's a good point, but the algorithm for
               | detection/flagging doesn't have to be what the
               | grandparent post proposed.
               | 
               | Maybe something like: strip all tags (leaving only the
               | unstyled text) and check:
               | 
               | - there shouldn't be any RLM within the URL
               | 
               | - RLM marks are accepted before/after the URL only if the
               | URL uses only characters for a language that is RTL and
               | the surrounding text uses characters for a language that
               | is LTR (or vice versa, LTR and surrounding text is for RTL
               | text)... Otherwise the text is flagged
               | 
               | - flag URLs that contain both characters for RTL and LTR
               | languages (with possible exceptions for ccTLD/TLDs? )
               | 
               | Of course, this leaves some open problems (how big should
               | the sample of the "surrounding text" be?)
               | 
               | And also, Meta could roll out this logic/algorithm in
               | public Facebook/Instagram posts, where it has more
               | control of it... Rolling it out in WhatsApp first could
               | be more problematic, since due to e2e, Meta wouldn't be
               | able to easily spot false positives (messages with URLs
               | that are flagged as potentially malicious, but which are
               | actually fine)
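
A rough sketch of two of the checks proposed above, under stated assumptions: it flags any bidi control character inside the URL (the whole family, not only the mark mentioned in the comment), and it flags host names that mix strong LTR and strong RTL letters. The surrounding-text rule and the ccTLD/TLD exceptions are left out, and `flag_url` is a hypothetical name, not anything Meta ships.

```python
import unicodedata
from urllib.parse import urlsplit

BIDI_CONTROLS = frozenset(
    "\u061c\u200e\u200f"              # ALM, LRM, RLM marks
    "\u202a\u202b\u202c\u202d\u202e"  # LRE, RLE, PDF, LRO, RLO
    "\u2066\u2067\u2068\u2069"        # LRI, RLI, FSI, PDI
)


def direction_classes(text: str) -> set:
    """Collect which strong directions ("LTR", "RTL") occur in the text."""
    classes = set()
    for ch in text:
        bidi = unicodedata.bidirectional(ch)
        if bidi == "L":
            classes.add("LTR")
        elif bidi in ("R", "AL"):
            classes.add("RTL")
    return classes


def flag_url(url: str) -> list:
    """Return the reasons this URL looks suspicious (empty list if none)."""
    reasons = []
    if any(ch in BIDI_CONTROLS for ch in url):
        reasons.append("bidi control character inside URL")
    host = urlsplit(url).hostname or ""
    if direction_classes(host) == {"LTR", "RTL"}:
        reasons.append("host mixes LTR and RTL letters")
    return reasons
```

Only the host is checked for mixed directions, since the `https` scheme is always Latin and would otherwise make every RTL host name look mixed.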
        
             | avel wrote:
             | You can only apply the fix to the URL field in the payload
             | described, not to normal message text.
        
             | yorwba wrote:
             | You don't need RIGHT-TO-LEFT OVERRIDE to support URLs in
             | right-to-left languages. It's an extremely rare codepoint
             | that's used to force left-to-right characters to be
             | displayed as if they were right-to-left characters. The
             | only use case I can think of off the top of my head is some
             | kind of interlinear phonetic transcription where you want
             | Latin characters to flow the same direction as the
             | corresponding Arabic for ease of cross-referencing.
             | 
             | For ordinary bidirectional texts, RIGHT-TO-LEFT ISOLATE,
             | its sibling LEFT-TO-RIGHT ISOLATE and POP DIRECTIONAL
             | FORMATTING are plenty:
             | 
             | [?]`nwn URL lhdh lt`lyq hw:
             | https://news.ycombinator.com/item?id=38734329
             | 
             | Where I used RIGHT-TO-LEFT ISOLATE in the beginning to make
             | sure the Arabic text in front of the colon is to the right
             | of it, and then POP DIRECTIONAL FORMATTING in the end to
             | restore the original directionality. (Amusingly, HN's URL
             | parser treats the POP DIRECTIONAL FORMATTING as part of the
             | URL, which breaks the link.)
             | 
             | Otherwise it would show up like below, where "in front of
             | the colon" means "left of it" (as is customary in English
             | text):
             | 
             | `nwn URL lhdh lt`lyq hw:
             | https://news.ycombinator.com/item?id=38734329
        
           | Jerrrry wrote:
           | There's nothing to fix, this is intended, just often "re-
           | discovered" behavior.
        
             | nequo wrote:
             | TFA mentions that Twitter, TikTok, and Pinterest all
             | sanitize U+202E. Sanitizing it in WhatsApp would be a nice
             | fix.
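
For illustration only, a sketch of what sanitizing could look like: dropping the explicit embedding and override characters from the link text before display. How the services named above actually do it is not described in the thread, and `strip_bidi_overrides` is a made-up name.

```python
# Code points removed: LRE, RLE, PDF, LRO, RLO (U+202A..U+202E).
_STRIP = dict.fromkeys(map(ord, "\u202a\u202b\u202c\u202d\u202e"))


def strip_bidi_overrides(text: str) -> str:
    """Remove explicit embedding/override characters; keep everything else."""
    return text.translate(_STRIP)


if __name__ == "__main__":
    spoofed = "\u202ehttps://moc.margatsni.nl"
    print(repr(strip_bidi_overrides(spoofed)))
```

With the override gone, the link renders in its stored order, so the real host name is what the user sees.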
        
         | smashah wrote:
         | They're too busy sending legal threats to OSS projects
        
         | pera wrote:
         | I reported a similar issue to Google early this year and they
         | declined the submission because it "can only result from social
         | engineering" and "we think that addressing it would not make
         | our users significantly less vulnerable".
         | 
         | I won't mention the details here but Google Search sometimes
         | rewrites URLs in such a way that an attacker can spoof the
         | actual URL.
         | 
         | My advice is to never trust URLs displayed by websites and
         | apps.
        
           | LelouBil wrote:
           | > I won't mention the details here but Google Search
           | sometimes rewrites URLs in such a way that an attacker can
           | spoof the actual URL.
           | 
           | I think I saw something like this a while ago, with some fake
           | KeePass website maybe.
        
             | chatmasta wrote:
             | This is an actual feature for AdWords (which show up in
             | search results). But at least there's some moderation of
             | the rendered domain in that case.
        
               | pera wrote:
               | I know you are referring to those fake KeePass malware
               | ads, but just to clarify: the issue I reported was not
               | related to AdWords - it was for normal search results
        
         | dclowd9901 wrote:
         | They'll fix it. They just won't reward the bounty hunter.
        
           | rollcat wrote:
           | And therefore, future bounty hunters will make sure to offer
           | their next 0day to the highest bidder instead.
        
       | tambourine_man wrote:
       | Nice hack. The real problem is not WhatsApp or the Unicode
       | reverse character, though, it's that URLs are hard.
       | 
       | Just this simple visa.securesite.com fools a lot of people. And I
       | don't see a good solution in the near future.
        
         | acdha wrote:
         | This specific example is poor sanitization because it actively
         | misleads the users who try to understand what they're clicking
         | on.
         | 
         | Your example of the generic confusion around host names and
         | domains is a harder problem but people have tried to mitigate
         | it somewhat by doing things like highlighting the domain name
         | portion. Like most phishing techniques, passkeys will end it
         | eventually.
        
           | paulryanrogers wrote:
           | > Like most phishing techniques, passkeys will end it
           | eventually.
           | 
           | This assumes passkeys will be widely adopted. And that users
           | will know to stop wherever the passkey doesn't work. I have
           | doubts about both.
        
             | jpc0 wrote:
             | So Google and Amazon have support, and it seems, depending on
             | which A/B group you are in, Apple does too?
             | 
             | I think it is a significant benefit and likely to be
             | implemented, especially considering client support is already
             | there and there are good libraries available to do it.
        
               | paulryanrogers wrote:
               | Large providers have supported other standards and not
               | seen uptake. I'll believe it if/when it happens.
               | 
               | Lack of understandability is the primary downside of
               | passkeys, and I doubt it will be overcome in this decade.
               | Authentication is like investing, one must understand the
               | options for it to be effective.
        
             | jeroenhd wrote:
             | Passkeys work, password managers with autofill should also
             | work. You can override password managers, but "why can't I
             | find my credentials" makes you look at the URL again at
             | least once.
             | 
             | For Bitwarden, this will be the hostname, and as such, will
             | tell you that you don't have any passwords for
             | moc.margatsni.nl
             | 
             | There are design issues at play here, but mitigations for
             | most types of phishing are already available. Websites need
             | to implement Passkey support, but any username+password
             | website should work perfectly fine with password managers.
        
             | acdha wrote:
             | Your first assumption is dubious: Apple, Microsoft, and
             | Google all have well-integrated support and usage is
             | increasing on mainstream sites. It seems unlikely that
             | there will be strong popular backlash against something
             | which is easier to use in addition to being safer.
             | 
             | The second is flat out wrong. Passkeys and U2F/FIDO2 do
             | not depend on the user at all. Even if I completely fool
             | you, the credential you get for example.com cannot be used
             | on example.org because the protocol incorporates the host
             | name. That's why the security community is pushing them
             | since phishing is so common and this shuts that down
             | entirely. The attacks now tend to involve getting people to
             | downgrade to password + SMS/TOTP so the more those fade
             | from common usage the better everyone will be.
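
A simplified sketch of the host binding described above, showing two of the checks a WebAuthn relying party performs on an assertion: the origin recorded by the browser in the client data, and the SHA-256 hash of the RP ID in the first 32 bytes of the authenticator data. Signature and challenge verification are deliberately omitted, the function name `check_binding` is hypothetical, and the sample data is fabricated for the demo.

```python
import hashlib
import json


def check_binding(client_data_json: bytes, authenticator_data: bytes,
                  expected_origin: str, rp_id: str) -> bool:
    """Reject assertions that were produced for a different host."""
    client_data = json.loads(client_data_json)
    # The browser, not the page or the user, fills in the origin it saw.
    if client_data.get("origin") != expected_origin:
        return False
    # The first 32 bytes of authenticator data are SHA-256(RP ID).
    expected_hash = hashlib.sha256(rp_id.encode("utf-8")).digest()
    return authenticator_data[:32] == expected_hash


def fake_assertion(origin: str, rp_id: str):
    """Build demo inputs; real ones come from the browser and authenticator."""
    client_data = json.dumps({"type": "webauthn.get", "origin": origin}).encode()
    auth_data = hashlib.sha256(rp_id.encode("utf-8")).digest() + b"\x01" + bytes(4)
    return client_data, auth_data


if __name__ == "__main__":
    phished = fake_assertion("https://example.org", "example.org")
    print(check_binding(*phished, "https://example.com", "example.com"))
```

An assertion obtained on example.org fails both checks at example.com, regardless of what the user believed they were looking at.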
        
       | aeonik wrote:
       | Very cool attack, and easy to read write up.
       | 
       | I have one basic question: It was mentioned that attacking the
       | encryption was skipped in favor of using a debugger.
       | 
       | Was this debugger applied to the WhatsApp Web app? Or was the
       | debugger deployed on the phone? Was it an emulator?
       | 
       | For some reason I didn't think WhatsApp had a web app (I don't
       | use it).
        
         | ajb wrote:
         | The article says "I decided to intercept a message via WA web".
        
           | isaacfrond wrote:
           | That was the initial idea, but it failed because Whatsapp
           | traffic is end to end encrypted. The second idea, which
           | actually worked, was to put a breakpoint in Whatsapp while
           | running in an emulator.
        
             | LelouBil wrote:
             | No, not an emulator. Just using your browser's JavaScript
             | debugger.
             | 
             | You can do that on any website.
        
               | flutas wrote:
               | Yeah, the article also says "WA web's javascript was
               | uglified and minified, however after a while of searching
               | I found the right place."
        
         | blincoln wrote:
         | The article doesn't make it 100% unambiguous, IMO, but the
         | debugger screenshot looks like a desktop browser's debugger.
         | You could also potentially do the same thing in the mobile app
         | using Frida.
        
       | Dah00n wrote:
       | What's up with the font?
       | 
       | >web' s
       | 
       | All 's have a space after them.
        
         | qwertox wrote:
         | Not on my machine, not on Firefox or Chrome.
        
         | jeroenhd wrote:
         | I see it too, because I block web fonts. Enabling web fonts
         | resolves the issue.
         | 
         | For me, this is some kind of Linux + Firefox + certain fonts
         | issue with the ’ character (right single quotation mark, not
         | '). We're not the first to run into it:
         | https://bugzilla.mozilla.org/show_bug.cgi?id=48152 but
         | reproducing seems quite hard.
         | 
         | According to
         | https://www.reddit.com/r/firefox/comments/is9twh/why_would_a...,
         | this happens when you have Chinese (I'm guessing Asian) fonts
         | installed.
         | 
         | The reason, as far as I can tell:
         | 
         | - font-family is "Source Sans Pro","Microsoft Yahei",sans-serif
         | 
         | - Source Sans Pro has no fallback font
         | 
         | - the fallback font for Microsoft Yahei is Noto Sans CJK SC
         | (result of ~ $ fc-match 'Microsoft Yahei') because YaHei is a
         | CJK optimised font. This is configured in
         | /etc/fonts/conf.d/30-cjk-aliases.conf
         | 
         | - Noto Sans CJK SC is a wide font (common for CJK fonts)
         | 
         | I think the solution to this problem is altering config files
         | in /etc/fonts/conf.d somehow but I haven't figured out what I
         | need to change exactly. Commenting out lines 466-473 (the alias
         | containing <family>Microsoft YaHei</family>) to kill the
         | association works, but I'm pretty sure that breaks any attempt
         | to render MS YaHei.
        
       | coderag wrote:
       | Interesting attack and a nice write up. I see Google services are
       | also mentioned. Are they taking any action on this unlike Meta?
        
       | fruit2020 wrote:
       | What's happening with the Whatsapp osx app. It's so bad to use
       | nowadays, slow, buggy.
        
         | Jleagle wrote:
         | It's an Electron app, they also have a native one in beta you
         | can download
        
       | Erratic6576 wrote:
       | Preview and message are sent separately. My intuition tells me
       | the preview is used to track user activity. I wish I could
       | contact the author to know more about how WhatsApp tracks
       | activity
        
         | kioleanu wrote:
         | They cache the preview for subsequent messages containing the
         | same link. For example, if a link is making the rounds and gets
         | sent 200k times in an hour, they don't call the URL 200k times
         | to build the preview, as that's a huge waste of resources on
         | both sides, with a huge chance that the servers hosting the
         | link get DDoSed
        
       | darkwater wrote:
       | Everyone is fixating on the Unicode RTL character, but Meta should have at
       | least acknowledged the issue with the preview URL that can be
       | different from the message URL. I understand that this is
       | probably to unfurl shortened URLs, but there has to be some
       | clever workaround that Meta & Whatsapp can implement
        
         | amadeuspagel wrote:
         | No. End-to-end encryption means that the preview has to be
         | generated either by the sender or the receiver. Having the
         | receiver generate the preview would leak his IP. They have to
         | remove the preview feature.
        
           | josu wrote:
           | > They have to remove the preview feature.
           | 
           | They can just disable it for contacts that you don't have on
           | your contact list.
        
           | darkwater wrote:
           | Yes, preview is generated by the sender to avoid receiver's
           | address leak to a sender-controlled host, but what I'm saying
           | is that WA should enforce on the receiver side that both
           | point to the same URL. As said initially, they are most
           | certainly doing it this way to unfurl URL shorteners, which
           | would otherwise be the easiest way to phish people. At the same
           | time it's also noteworthy that the preview can fail to be
           | generated on the sender side and the message will be sent out
           | anyway, so yeah, I agree with you that they could just remove
           | the preview feature. Probably in their opinion the trade-offs
           | are worth it, I guess.
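
A minimal sketch of the strictest form of that receiver-side check, assuming the receiver gets the message link and the preview link as two separate strings: compare the host names without making any network request. `same_host` is a hypothetical helper, and as noted above this version would also reject every legitimate unfurled short link.

```python
from urllib.parse import urlsplit


def same_host(message_url: str, preview_url: str) -> bool:
    """True only if both URLs parse and share the same (lowercased) host."""
    message_host = urlsplit(message_url).hostname
    preview_host = urlsplit(preview_url).hostname
    return message_host is not None and message_host == preview_host


if __name__ == "__main__":
    print(same_host("https://moc.margatsni.nl/login", "https://instagram.com/"))
```

Because nothing is fetched, the check leaks no receiver IP. The price is that redirects cannot be followed, which is the trade-off being discussed.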
        
             | chimeracoder wrote:
             | > Yes, preview is generated by the sender to avoid
             | receiver's address leak to a sender-controlled host, but
             | what I'm saying is that WA should enforce on the receiver
             | side that both point to the same URL.
             | 
             | How do you do that without having the receiver make an HTTP
             | request to that address, in order to follow all redirects?
        
               | o11c wrote:
               | The receiver can do the verification while clicking
               | (which would make the request anyway).
        
               | darkwater wrote:
               | Exactly, that's why I say that they chose the trade-off
               | of easy-to-send shortener over more complicated/manually
               | crafted attacks like the one in the article.
        
       | eviks wrote:
       | > Exactly as I suspected, the link and the preview were sent
       | separately!
       | 
       | This is an even bigger issue with the UI design, why should poor
       | users compare links and previews to be safe?
        
         | kevincox wrote:
         | It's a security tradeoff. Given that you want to provide a link
         | preview (which is a nice feature) you have a few options:
         | 
         | 1. Generate on the sender side. Downside: Can be spoofed.
         | 
         | 2. Generate on the receiver side: Downside: Leaks receiver IP.
         | 
         | 3. Generate via third party: Downside: Leaks information to the
         | third party.
         | 
         | Overall I think that 1 is the best option. The sender can
         | "spoof" all of their messages anyways, including the preview as
         | part of the message is really no different. The problem here is
         | that it isn't obvious that this content comes from the sender,
         | it is displayed as a separate bubble and I would bet that 99%
         | of users don't realize that the content is from the sender.
         | 
         | Plus the URL is all that really matters anyways. If you are
         | clicking on an attacker-controlled URL they can make the
         | preview display anything they want. So you gain very little by
         | forcing the preview to be "authentic".
         | 
         | Option 3 can be good as well. Especially if implemented with
         | something like double-blinding. So you connect to one party
         | which forwards you to a second party. This way the first sees
         | your IP and the second sees the destination IP but neither sees
         | both (unless they collude). However that is a lot of
         | infrastructure to set up and maintain for relatively little
         | benefits.
        
           | robertlagrant wrote:
           | Another comment picked what I think is the best option: the
           | sender generates it, and receiver verifies it, but only on
           | click. That way the receiver's already going to leak their
           | IP, so WhatsApp can verify before opening up the web page.
        
             | kevincox wrote:
             | Verifies what? That the preview matches? What if it changed
             | between the send and the click legitimately? Also what is
             | the threat model here? If the sender controls the URL they
             | can generate any preview that they want.
        
       | bugsliker wrote:
       | I love that this is categorized as "reverse engineering" at the
       | bottom of the post.
        
       | rashidujang wrote:
       | Amazing article! In case the author sees this, it'd be great if
       | the author can deep dive into how he "found the right place" in
       | finding the correct breakpoint to produce the decrypted message.
       | It seems to me that if you're able to do this there's a lot of
       | interesting things one could do.
        
         | j0hnyl wrote:
         | Probably just painfully stepping through the debugger.
        
       | charcircuit wrote:
       | This isn't clickjacking. Clickjacking is when an attacker hijacks
       | a click to actually click on something else that the user was not
       | intending to click or was not aware of clicking. The existence of
       | the RTL codepoint to make text go from right to left is an i18n
       | feature, and using it to confuse people is not a novel
       | vulnerability.
        
       | blincoln wrote:
       | This is a pretty clever combination of feature misuse, although I
       | think I'd rate the overall security impact fairly low, because
       | the best-case scenario is that you cause the recipient to open a
       | link in their browser. That can be useful in some cases, but
       | unless the attacker is a police force, intelligence agency, or
       | similar, there would usually need to be some kind of follow-up
       | attack, e.g. exploiting unpatched software on the device.
       | 
       | In the interest of technical accuracy, I don't think I'd label
       | this one "clickjacking" specifically. "Clickjacking" usually
       | refers to a very specific technique that involves invisible HTML
       | frames overlaid on top of other content.[1][2]
       | 
       | [1] https://owasp.org/www-community/attacks/Clickjacking [2]
       | https://portswigger.net/web-security/clickjacking
        
         | Thorrez wrote:
         | Yeah, I wouldn't call this clickjacking, because real
         | clickjacking is a technique that makes a victim perform some
         | account action without knowing it. Simply opening an unintended
         | link isn't as bad as that.
        
       | josephcsible wrote:
       | RTL has been a huge source of security vulnerabilities for its
       | entire existence. Why don't operating systems have a setting to
       | disable all RTL, so that people who don't know any such languages
       | aren't unnecessarily exposed to the dangers with zero benefit?
        
         | altfredd wrote:
         | Operating systems notwithstanding, there should definitely be
         | such option for every OS widget, that displays text (including
         | Android TextView). And it should default to disabling all BiDi
         | backdoors unless developer explicitly vetted specific text span
         | to enable them.
         | 
         | Making entire text rendering stack vulnerable by default under
         | pretext of catering to less than 1% of world population is
         | ridiculous.
        
           | dixie_land wrote:
           | Exactly, some things should've just been left ASCII. The push
           | to use Unicode in many components for "inclusiveness" is
           | simply political.
        
             | ptx wrote:
             | Or perhaps KOI-7 N1, another 7-bit text encoding [0]. The
             | Cyrillic alphabet should be good enough for anyone. As you
             | say, no need to let Americans use their native alphabet
             | just to feel included.
             | 
             | [0] https://en.wikipedia.org/wiki/KOI-7
        
       ___________________________________________________________________
       (page generated 2023-12-22 23:00 UTC)