[HN Gopher] Reverse engineering my router's firmware with binwalk
___________________________________________________________________
Author : sprado
Score  : 480 points
Date   : 2020-02-06 13:52 UTC (9 hours ago)
(HTM) web link (embeddedbits.org)
(TXT) w3m dump (embeddedbits.org)
| bonyt wrote:
| This is indeed a cool tool! I've used it before when forensically analyzing a cell phone, and found interesting things. For example, I found that a web browser had cached the unencrypted bytes from an HTTP message. Binwalk identified the gzip header's magic number (1f 8b), and after decompression there were interesting results.
|
| Another cool tool I learned about recently is signsrch. It's more for reverse engineering binaries of software that implements encryption of some type. It'll find signatures in the binaries of these encryption methods, giving you a place to look when, for example, reverse engineering a file format that you suspect is encrypted in some way.
|
| https://www.oreilly.com/library/view/learning-malware-analys...
| xenocratus wrote:
| I first found out about binwalk from this YT video on Firmware Reverse Engineering: https://www.youtube.com/watch?v=GIU4yJn2-2A
|
| Quite a good, short intro into the subject as well!
| ggcdn wrote:
| A slightly related question for HNers: Is there any easy tool for a non-cs guy to reverse engineer a binary file containing numbers and text in some specific format?
|
| I have to work with some old structural analysis software. The material and element definitions come in an obscure file format ".PF3CMP". I know it contains text like the material names, and numbers/letters for the material properties.
|
| Ultimately it's my goal to be able to write these files from matlab or python, instead of using the horribly clunky user interface.
| But first I need to know the structure of the file, and I'm not even sure how to begin figuring that out.
|
| [0] is what it looks like when opened in a hex editor
|
| [0] https://imgur.com/a/jvqV3k8
| mml wrote:
| related possibly? what domain is this file from?
|
| https://techdocs.broadcom.com/content/broadcom/techdocs/us/e...
| ggcdn wrote:
| thanks but sadly not, it's from a structural analysis program called PERFORM-3D.
|
| I've contacted the developer but they will not release the format of the files to me.
| thebruce87m wrote:
| The Linux tool "od" might help you here. The -c flag will print ASCII characters.
|
| You can get it with WSL on Windows, or even just install git and you'll get git-bash for another easy option.
| PeterisP wrote:
| Depending on how weird the format is, it might be more efficient to reverse-engineer the file-reading routines of that program which can work with these files.
| Youden wrote:
| I don't know of any straightforward tools, most people I've seen reverse engineer a format do it with a hex editor and writing custom scripts. It's not directly relevant but the best I've seen is this presentation about reverse engineering the protocol used to communicate within a car: https://www.youtube.com/watch?v=KkgxFplsTnM
|
| It uses some techniques that might be relevant, like monitoring different parts of a file as you make different changes (like accelerating or decelerating). In your case it might be possible to compare between different material definitions for example.
| [deleted]
| ggcdn wrote:
| Ok thanks, I'll take a look. It's possible for me to generate these files for each of the various material settings so I can manually 'diff' them, similar to what you're describing
| josteink wrote:
| Did I read the blog wrong, or was the stock firmware too based on an OpenWRT kernel?
|
| That would be pretty hilarious if it was true.
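A minimal sketch of the "diff" approach Youden and ggcdn discuss further up for an undocumented format such as .PF3CMP: save two files that differ in a single setting, then list the byte ranges that changed. The names `diff_ranges` and `diff_files` and the sample bytes are made up for illustration; nothing here reflects the real PERFORM-3D layout.

```python
from pathlib import Path
from typing import List, Tuple


def diff_ranges(a: bytes, b: bytes) -> List[Tuple[int, int]]:
    """Return (start, end) offsets, end exclusive, where a and b differ.

    Only the overlapping part is compared byte by byte. If the lengths
    differ, the tail of the longer input is reported as one extra range.
    """
    ranges: List[Tuple[int, int]] = []
    start = None
    common = min(len(a), len(b))
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))  # the run ended just before i
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


def diff_files(path_a: str, path_b: str) -> List[Tuple[int, int]]:
    """Convenience wrapper: diff two files on disk."""
    return diff_ranges(Path(path_a).read_bytes(), Path(path_b).read_bytes())


# Made-up sample data: two "files" that differ in one property byte.
old = b"MAT1\x00\x00\x10\x00steel"
new = b"MAT1\x00\x00\x20\x00steel"
for start, end in diff_ranges(old, new):
    print(f"0x{start:04x}-0x{end:04x}: {old[start:end].hex()} -> {new[start:end].hex()}")
```

Changing one material property at a time and re-running the diff should narrow down which offsets hold which field, assuming the format uses fixed offsets; if fields are variable-length the ranges will shift and need manual alignment.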
| fencepost wrote:
| I'm pretty sure a lot of stock firmware is based on OpenWRT or used to be, though I'm pretty sure most of them lag well behind the current version. I haven't paid much attention for a while, but I think a lot were based on Kamikaze which is more than 10 years old now.
|
| For the vendors with access to closed-source drivers and chipset info they can likely support devices not supported on the open source packages.
|
| Edit: Per Wikipedia, "Qualcomm's QCA Software Development Kit (QSDK) which is being used as a development basis by many OEMs is an OpenWrt derivative"
|
| It also notes Ubiquiti's wireless router firmware as being derived from OpenWRT, but I thought I remembered discussion of Ubiquiti being derived from a different open source distribution - unless perhaps the routers and wireless devices don't share a code base.
| josteink wrote:
| That's pretty cool. I didn't know that.
|
| Looking into the equivalent firmware[1] for my Archer C7 v2, I didn't find any OpenWRT bits though. I was honestly a little bit disappointed.
|
| I guess the difference between hardware revisions might be more fundamental than I assumed.
|
|     DECIMAL       HEXADECIMAL     DESCRIPTION
|     --------------------------------------------------------------------------------
|     0             0x0             TP-Link firmware header, firmware version: 1.-15188.3, image version: "", product ID: 0x0, product version: -956301310, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 16384512, kernel length: 512, rootfs offset: 855873, rootfs length: 1048576, bootloader offset: 15204352, bootloader length: 0
|     71520         0x11760         Certificate in DER format (x509 v3), header length: 4, sequence length: 64
|     98560         0x18100         U-Boot version string, "U-Boot 1.1.4 (Mar 5 2018 - 13:57:29)"
|     98736         0x181B0         CRC32 polynomial table, big endian
|     131584        0x20200         TP-Link firmware header, firmware version: 0.0.3, image version: "", product ID: 0x0, product version: -956301310, kernel load address: 0x0, kernel entry point: 0x80002000, kernel offset: 16252928, kernel length: 512, rootfs offset: 855873, rootfs length: 1048576, bootloader offset: 15204352, bootloader length: 0
|     132096        0x20400         LZMA compressed data, properties: 0x5D, dictionary size: 33554432 bytes, uncompressed size: 2451644 bytes
|     1180160       0x120200        Squashfs filesystem, little endian, version 4.0, compression:lzma, size: 9878520 bytes, 789 inodes, blocksize: 131072 bytes, created: 2018-03-05 06:16:10
|
| [1] https://static.tp-link.com/2018/201806/20180611/Archer%20C7(...
| mjevans wrote:
| The BOM can vary quite a lot between 'revisions', using your product as an example...
|
| https://openwrt.org/toh/tp-link/archer-c7-1750 (Scroll down to the Info Links table and the Wikidevi Info column)
|
| v1 to v2 upgrades the Flash (8MB to 16MB) and uses a slightly different AN+AC wifi chip. v2 and v3 seem pretty similar at a glance. v4 is rated at 12v 2a rather than 2.5a; using a completely different BGN(2.4ghz) chip and also different ethernet chip/switch.
| v5 is lower power still at 1.5a, but it's less obvious where that change happened due to lack of pictures. A guess based on the simpler antenna list is that it uses fewer antennas.
| bradknowles wrote:
| Ubiquiti is based on Vyatta.
| crankylinuxuser wrote:
| Given this line...
|
|     image name: "MIPS OpenWrt Linux-3.3.8"
|
| I would say you are right.
| hyper_reality wrote:
| It's a good article but there are much easier ways to use binwalk than presented here.
|
| In the first example he uses the "--signature" and "--term" flags, these are unnecessary. Running binwalk with no flags will produce the same output.
|
| To extract part of the file, he also uses dd with the "skip" and "count" options painfully calculated. You can just use:
|
|     binwalk --dd='.*' img.bin
|
| and it will extract everything that matches the pattern - the pattern above will extract all found files.
| leeoniya wrote:
| glad i flashed latest dd-wrt beta on my archer-c7 v5 :D. though my wan-facing device runs OPNSense.
|
| i actually prefer to run Tomato, but archer c7 is not broadcom :(
|
| can anyone offer advice about dd-wrt vs openwrt (considering trying openwrt).
| josteink wrote:
| Latest version of OpenWRT (19) runs noticeably better on this device, with better HW offloading support and based on a nearly mainline, modern Linux kernel and a brand new device-tree for the Atheros SoC.
|
| What reasons do you have to stay on dd-wrt?
| leeoniya wrote:
| > What reasons do you have to stay on dd-wrt?
|
| mostly that i've used it before. can i gui-flash to openwrt from dd-wrt? i've done tftp flashes before but they're pretty fiddly with getting the stupid 30-30-30 or whatever timing right. also i think these routers try to "pull" from a tftp server rather than having you push to one that they bootstrap - i've never been able to get the "pull" variant to work.
|
| would be hell of a lot easier if the router could be booted into something like android's (arm's?)
| fastboot or flashmode so i can just push an image.
| SpikedCola wrote:
| Going from dd-wrt to openwrt should be as simple as a firmware flash from the web gui, and an nvram reset. Worst case, you can flash a "revert to stock" image from ddwrt to go back to factory, then flash openwrt as if the device was factory.
|
| Openwrt also has a handy failsafe built into a lot of models. It boots a stripped down http server where you can upload recovery firmware.
|
| Used to swear by dd-wrt, now I prefer openwrt.
| josteink wrote:
| Flashing the OpenWRT "factory" (as opposed to sysupgrade) image in the web UI should probably work fine, but don't quote me on it.
|
| That's how I flashed from stock to OpenWRT on 3+ Archer units anyway. Make sure not to keep settings.
| bxparks wrote:
| I use Gargoyle on my Archer C7 v2. This thread (https://www.gargoyle-router.com/phpbb/viewtopic.php?t=11896) says that C7 v5 is supported.
| magduf wrote:
| >i actually prefer to run Tomato, but archer c7 is not broadcom :(
|
| Not being Broadcom is a very good thing.
| 12bits wrote:
| Did you notice your wireless signal strength considerably lower when going to dd-wrt?
|
| I put openwrt on my c7 V5 and could barely get any bars.
|
| Flashed back to the stock and was back in business.
|
| Another thing I've read is the third party firmwares don't get hardware access to NAT resulting in speed hits.
|
| Cheers
| leeoniya wrote:
| yes, and i had throughput issues when running in full-width G/N mixed mode compared to my previous Tomato/Asus RT-N16 setup. my phone would also drop out and reconnect intermittently with the c7. but in dedicated AC it seems to be doing well thus far. i cannot say for sure whether this was due to DD-WRT or not as i did not do a thorough comparison to stock.
|
| > Another thing I've read is the third party firmwares don't get hardware access to NAT
|
| i read that too :(
| dpcx wrote:
| Where can one find the dd-wrt you used for your c7? I have the same device and have been unable to get it to flash anything other than official firmware.
| RussianCow wrote:
| These are the instructions I successfully followed on my C7 V2: https://wiki.dd-wrt.com/wiki/index.php/TP_Link_Archer_C7#Ins...
|
| Here is the exact `factory-to-ddwrt` image I used (this will depend on which version you have): ftp://ftp.dd-wrt.com/betas/2019/10-15-2019-r41328/tplink_archer-c7-v2/
| TimSchumann wrote:
| I'm running OpenWRT and the Archer c7 is on the list of supported devices. I'd say give it a try.
| ChuckNorris89 wrote:
| _> Although the firmware was released last year (August 2019) as I write this article, it uses an old Linux kernel version (3.3.8) released in 2012 compiled with a very old GCC version (4.6) also from 2012!_
|
| This is what happens when you pay peanuts for embedded devs and outsource development to the cheapest sweatshop you can find so your products can meet a competitive price point.
|
| Sadly this will not change until there's regulation in place to hold manufacturers accountable for their massively obvious vulnerabilities since nobody cares that they're flooding the market with potential botnet hosts when they're overworked, paid miserably and have a manager constantly breathing down their neck.
| bluesign wrote:
| It is mostly related to drivers for the SoC, not about paying devs
| scoutt wrote:
| Exactly. What I see is that the SoC provider just _freezes_ everything at a given version and supports just that. For example I am currently building Android 9 on a QCOM SoC with a 4.9 Kernel. I don't think it will receive any future update...
| ChuckNorris89 wrote:
| So how did OpenWRT manage to build firmware with up to date components for it?
| The Qualcomm chips inside of it seem fairly modern for such an old kernel.
| prashnts wrote:
| Note that openwrt has a big community of contributors and not all devices/features are supported. In contrast the manufacturer firmware is at least feature complete and easy for regular users to set up.
| rahuldottech wrote:
| OpenWrt is also free. Both as free software, and free of cost. When you're paying a manufacturer for a product, surely it's not too much to expect them to ship with functional software that also happens to be up-to-date and secure?
| jschwartzi wrote:
| You can get that, but not at consumer-grade router prices. I have a separate router that I put behind my stand-alone cable modem. I paid for that separate router about $200.00. And another $100 for the modem. A wifi access point cost me another $100.
|
| So it's about $400.00 for a router that has updated firmware(pfSense). Or you can cheap out and spend only $100.00. This is what you get by doing that.
| bluesign wrote:
| Cause openwrt doesn't care if some feature doesn't work but oem should support all features
| mjevans wrote:
| Support varies, you should purchase devices that include hardware which is supported by the Open Source drivers (even if you have to compromise and it still uses some small blobs that are free to distribute).
|
| You should also purchase a device that includes enough storage space and RAM to support more than the bare minimum; that will help keep things future proof.
| tenebrisalietum wrote:
| OpenWRT doesn't guarantee support of all hardware. I have a router flashed to a certain version with a newer kernel, and the Wifi doesn't work because of no driver available.
| Thaxll wrote:
| Most routers run 2.6.32 kernel.
| non-entity wrote:
| Is there any particular reason for this? Like some feature that was removed in later versions?
| gvb wrote:
| The primary reason is likely because the hardware (SoC peripherals) drivers were written for 2.6.x and not forward ported to newer versions of the linux kernel. A lot of hardware drivers were (are) written by the hardware (chip) manufacturers and then abandoned.
| ChuckNorris89 wrote:
| I have the feeling most home routers are designed by the same OEM shop in Shenzhen.
| GEBBL wrote:
| This is amazing! I've used binwalk extract for 'capture the flag' challenges but I never really thought about the practical applications of it. Wow! Thank you
| LeonM wrote:
| Funny, I always assumed that there would be no application for binwalk other than for extracting binary firmware images of embedded devices.
|
| Using binwalk for CTF challenges is actually a new insight for me :)
| beefhash wrote:
| Conversely, it's a convenient tool for obfuscation. You can trigger plausible false positives all over, while also making sure that there's nothing of immediate use with binwalk left.
| commandlinefan wrote:
| From the output I see:
|
|     23296         0x5B00          LZMA compressed data, properties: 0x5D, dictionary size: 8388608 bytes, uncompressed size: 97476 bytes
|     64968         0xFDC8          XML document, version: "1.0"
|
| So it looks like the size of the bootloader should be 64968 - 23296 = 41672. But he extracts 41162:
|
|     $ dd if=archer-c7.bin of=u-boot.bin.lzma bs=1 skip=23296 count=41162
|
| Curious if anybody knows why 41162; is this a block-size alignment requirement?
| mrspeaker wrote:
| I'm wondering how these values are determined too. I'm "following along at home" without any idea what I'm doing (though all the files, bytes, and offsets are matching with the tutorial... Also, if the original author finds this thread: amazing write-up - got me really interested in the topic!).
|
| At the step where they remove the header with
|
|     dd if=uImage of=Image.lzma bs=1 skip=72
|
| It results in a file that if I try and uncompress it with `unlzma Image.lzma` it complains with "Compressed data is corrupt"
|
| I don't know where the magic number "72" comes from. Is it likely that could be different on my machine (a mac)?
|
| [edit: I think there's something else wrong - if I use `mkimage` to examine the uImage file I only get:
|
|     mkimage -l uImage
|     GP Header: Size 27051956 LoadAddr 78a267ff
|
| Instead of image information]
| zerocrates wrote:
| The 41162 bytes comes from the preceding uImage header, you'll see it listed in that big description. I'm not sure what the 510 bytes of padding are, though. Just padding? A checksum?
| jschwartzi wrote:
| Maybe bootloader code?
| JoeAltmaier wrote:
| Cool tool! I wrote something for reverse-engineering code, as a consultant years ago. They had a radio module but the manufacturer had lost the source code.
|
| So the tool was called Golem. It had tables for defining opcode to assembler pattern matching, that could be written for any machine (instead of just the one I was cracking).
|
| It worked iteratively. You ran it over the binary once, it produced arbitrary labels from jump-points. You could annotate that output by changing the labels to something human-readable (e.g. Loop-back, Main, TimerISR etc) and add comments.
|
| The next iteration would read that back in to build a symbol table, rescan the binary and re-output. But this time it would understand that the symbols were always on opcode boundaries, distinguish data table from code entry points (because you marked them) etc. So it would do a better job of staying in sync with the code.
|
| Once I was done with that project (and had re-compilable source for the radio module) I put it away and never thought of it again.
| souprock wrote:
| You were on your way to cloning IDA Pro, Ghidra, Binary Ninja, or Hopper Disassembler. To varying degrees, sometimes as a pay-extra option, those tools can produce source code.
| JoeAltmaier wrote:
| Um. I think they post-dated me! But I didn't go anywhere with it.
| souprock wrote:
| IDA Pro started as a 16-bit MS-DOS program. It's real old. I'm pretty sure I was using it back in 1992, when it was already a well-developed program.
|
| Ghidra is old too, although only recently public. It couldn't be older than Java, which is from 1996.
| JoeAltmaier wrote:
| Cool. I did mine in 2006. Hey, those have mostly Intel disassemblers. Mine did any machine code you cared to write a dissector for.
|
| Are they iterative? Can you add human clues/cues so they do a better job the next time?
| souprock wrote:
| They are not at all mostly Intel disassemblers, though some of them have freeware versions (to suppress competition) or time-limited demo versions that are purposely limited. They are very much designed around humans adding clues: you can declare function parameters, struct types, enumerations, and the meaning of various offsets in code. They are interactive GUI tools, continuously updating automated analysis as the user assists by providing clues to the analysis engine. Ghidra and Binary Ninja can be simultaneously multi-user, storing the database on a server for collaboration.
|
| IDA Pro supports dozens of processor architectures. I count about 70, not including model variations and not including community support. https://www.hex-rays.com/products/ida/processors/
|
| Ghidra supports "X86 16/32/64, ARM/AARCH64, PowerPC 32/64/VLE, MIPS 16/32/64/micro, 68xxx, Java / DEX bytecode, PA-RISC, PIC 12/16/17/18/24, Sparc 32/64, CR16C, Z80, 6502, 8051, MSP430, AVR8, AVR32, and variants of these processors."
|
| Binary Ninja officially supports x86, x64, ARMv7, Thumb2, ARMv8, PowerPC, MIPS, 6502.
| Community support adds AVR, MSP430, and VMNDH-2k12.
|
| Hopper Disassembler supports "x86{16,32,64}, Dalvik, avr, ARM, java, PowerPC, Sparc, MIPS"
| dmitrygr wrote:
| ida handles any arch
|
| it is interactive (so by definition iterative)
| tasubotadas wrote:
| I am really surprised that firmware images are not just .tar.gz files renamed to .bin :/. That's how I would have implemented a distribution of new firmware.
| josteink wrote:
| And how do you partition boot-loaders, kernels, and rootfs and such in that tar.gz?
|
| Embedded device will be hard coded to look at a fixed point and start booting from there, there's no UEFI. How will you ensure boot-loaders get unpacked precisely where they need to be?
|
| And that doesn't even touch the idea of having a _router_ understand a file system before any firmware code is loaded.
|
| Routers really are quite different from PCs.
| tenebrisalietum wrote:
| I think firmware images are typically not the fixed ROM code the CPU first encounters upon startup, even if they contain U-Boot. Especially if stored in NAND flash they probably aren't.
|
| AR7 platform, for example, the MIPS core runs a small ROM that initializes RAM, then reads some blocks from flash. Not sure how much code you'd need to unpack a tar.gz but completely possible.
| bshipp wrote:
| True enough, but I think they used to be even more unique and over time they've become more like PCs.
|
| One of these days I'm going to log in to the admin interface and find candy crush installed.
| vlovich123 wrote:
| They're "like PCs" in the sense that the instruction set of the CPUs has caught up and in theory you can attach more complicated peripherals. However, unless your embedded product has MMC flash attached (for many applications it doesn't due to cost + physical size) you're SOL for the following reasons:
|
| 1. For M4s your storage is typically some kind of SPI flash which doesn't act like the traditional desktop flash you're dealing with. You have to manually specify the address you're reading/writing & you have to do it on block boundaries (multiple KB). You're generally looking at 8-64MB.
| 2. For M0 your storage is typically flash built-in with potentially even more restrictions.
| 3. These devices have _very_ little RAM. Decompression means you have to have a way of enforcing constraints on the amount of space you'll need. Aside from the space needed regularly for decompression you may need to buffer the decompressed content in-memory to align with block boundaries. All of this means development time, increased costs & risk for something you may not be able to pull off.
|
| If your vendor actually internally compresses their image then great but generally they don't for all the same reasons (+ sometimes this is touching ROM code in the chip).
| monocasa wrote:
| > And how do you partition boot-loaders, kernels, and rootfs and such in that tar.gz?
|
| In the past, each of those would be a separate MTD partition with a separate device file. You just dd them over those files.
| andrewshadura wrote:
| Another similar tool to look at is Hachoir.
___________________________________________________________________
(page generated 2020-02-06 23:00 UTC)