[HN Gopher] Tell HN: People forget that you can stick any data a...
       ___________________________________________________________________
        
       Tell HN: People forget that you can stick any data at the end of a
       bash script
        
       This is a neat trick I've used to write self-extracting software or
       scripts that extract files from archives by just using

           tail -c <number of bytes for the binary> $0

       All you have to do is make sure you append an explicit 'exit' to the
       end of your program before your new 'data section', so that bash
       won't parse any of the 'data section'.  One thing to bear in mind is
       that if you append binary data, it will be corrupted if you save it
       in most text editors so when I want to make changes I just delete all
       the binary and reappend it.
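
A minimal sketch of the trick described above, assuming bash plus the standard tail, wc and cmp tools. The file names (payload.bin, extract.sh, payload.out) and the payload bytes are made up for illustration; the byte count is written into the script when it is generated.

```shell
set -eu

workdir=$(mktemp -d)
payload="$workdir/payload.bin"   # hypothetical binary payload

# Some bytes a text editor would likely mangle (a NUL and a 0xFF).
printf 'hello\000binary\377data' > "$payload"
size=$(wc -c < "$payload" | tr -d ' ')

# Script part: copy the last $size bytes of itself to the path in $1,
# then exit so bash never tries to parse the appended bytes.
cat > "$workdir/extract.sh" <<EOF
#!/usr/bin/env bash
tail -c $size "\$0" > "\$1"
exit 0
EOF

# Data section: append the raw payload after the explicit exit.
cat "$payload" >> "$workdir/extract.sh"

bash "$workdir/extract.sh" "$workdir/payload.out"
cmp "$payload" "$workdir/payload.out" && echo "payload round-tripped"
```

Regenerating the script this way, rather than editing it by hand, also sidesteps the text-editor corruption problem mentioned in the post.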
        
       Author : BasedAnon
       Score  : 106 points
       Date   : 2023-07-05 19:32 UTC (3 hours ago)
        
       | jjgreen wrote:
       | Ruby (and earlier, Perl) formalised this with the __END__
       | section: https://www.honeybadger.io/blog/data-and-end-in-ruby/
        
         | francislavoie wrote:
         | PHP also has __halt_compiler()
         | https://www.php.net/manual/en/function.halt-compiler.php
        
         | brasic wrote:
         | Are you sure that Perl took it from ruby and not the other way
         | around?
        
         | x86x87 wrote:
         | yup. after that you can use the global var DATA to access the
         | data injected after the __END__
        
       | ShowalkKama wrote:
       | portswigger does that for the burpsuite installers.
       | 
       | https://portswigger-cdn.net/burp/releases/download?product=c...
        
         | vram22 wrote:
         | >portswigger does that for the burpsuite installers.
         | 
         | Wow, that triggered my wordplay radar, which I'm working on as
         | a fun side line these days, thanks :)
         | 
         | port, suite (sweet)
         | 
         | swig, burp
         | 
         | Heh.
        
       | INTPenis wrote:
       | That's how I made a bash backdoor once. It was just a script
       | somewhere on the FS, until it unpacked itself and executed the
       | rest of the rootkit.
       | 
       | Long story but trust me that I had good intentions.
        
       | sumosudo wrote:
       | I use a fun little hack, a la awk:
       | 
       | ```
       | #!/usr/local/bin/bash
       |
       | echo "HELLO"
       |
       | TAIL_REMOTE_MARKER=`awk '/^__THE_REMOTE_PART__/{flag=1;next}/^__END_THE_REMOTE_PART__/{flag=0;exit}flag' ${0}`
       |
       | eval "$TAIL_REMOTE_MARKER"
       |
       | exit 0
       |
       | __THE_REMOTE_PART__
       |
       | echo "WORLD"
       |
       | __END_THE_REMOTE_PART__
       | ```
        
       | kkfx wrote:
       | Makeself archives are a classic kind of self-extracting tarball
       | that does exactly that...
        
       | xg15 wrote:
       | If you care less about space efficiency and more about
       | maintainability of the script, you can also encode the binary as
       | base64 and put an
       |
       |     echo '...base64 data...' | base64 -d > somefile
       |
       | in your script.
       |
       | Or add compression to reclaim at least some of the wasted space:
       |
       |     echo '...base64 gzipped data...' | base64 -d | gunzip > somefile
       |
       | Also note that bash accepts line breaks in quoted strings and the
       | base64 utility has an "ignore garbage" option that lets it skip
       | over e.g. whitespace in its input. You can use those to break up
       | the base64 over multiple lines:
       |
       |     echo '
       |       ...base64 gzipped data...
       |       ...more data...
       |       ...even more data...
       |     ' | base64 -di | gunzip > somefile
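
A rough sketch of how such a script could be generated rather than pasted together by hand, assuming GNU coreutils base64 and gzip. The file names are hypothetical. Because the encoded lines are not indented here, plain `base64 -d` is enough (GNU base64 skips newlines when decoding); indenting them as in the comment would need the `-i` option.

```shell
set -eu

workdir=$(mktemp -d)
printf 'example payload\n' > "$workdir/original"   # hypothetical file to embed

# Compress, then base64-encode; GNU base64 wraps its output at 76 columns.
encoded=$(gzip -c "$workdir/original" | base64)

# Generate a script that carries the payload in a multi-line quoted string.
cat > "$workdir/unpack.sh" <<EOF
#!/usr/bin/env bash
echo '
$encoded
' | base64 -d | gunzip > "\$1"
EOF

bash "$workdir/unpack.sh" "$workdir/restored"
cmp "$workdir/original" "$workdir/restored" && echo "restored matches original"
```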
        
         | AlDante2 wrote:
         | Just to be sure I'm following you correctly, what is the
         | advantage of zipping the base64 data vs having the original
         | binary, zipped if you like?
        
         | mike_hock wrote:
         | If you care about maintainability, you keep the binary data out
         | of the source file and have a build process.
        
         | saltcured wrote:
         | You can also use here-documents to avoid hitting any argv
         | length limits:
         |
         |     { base64 -d | gunzip > output; } <<EOF12345
         |     ...data...
         |     EOF12345
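
The here-document variant could be generated along these lines; this is only a sketch, assuming GNU base64 and gzip, with made-up file names and an arbitrary delimiter (PAYLOAD_EOF). Quoting the delimiter is an addition not in the comment above: it stops bash from expanding anything inside the data, and the underscore means no base64 line can ever equal the delimiter.

```shell
set -eu

workdir=$(mktemp -d)
printf 'example payload\n' > "$workdir/original"   # hypothetical file to embed
encoded=$(gzip -c "$workdir/original" | base64)

# Outer here-document writes the script; the inner, quoted one
# ('PAYLOAD_EOF') ends up inside the generated file and feeds the
# decoder on stdin, so the data never passes through argv.
cat > "$workdir/unpack.sh" <<OUTER
#!/usr/bin/env bash
{ base64 -d | gunzip > "\$1"; } <<'PAYLOAD_EOF'
$encoded
PAYLOAD_EOF
OUTER

bash "$workdir/unpack.sh" "$workdir/restored"
cmp "$workdir/original" "$workdir/restored" && echo "restored matches original"
```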
        
         | dheera wrote:
         | Is there an encoding that is less wasteful than base64 but not
         | vulnerable to text editor corruption issues? I think avoiding
         | 0x0 to 0x20 should be enough to not get corrupted by text
         | editors, though base64 avoids a lot more than that.
        
           | delusional wrote:
           | At that point you're basically doing yEnc.
        
           | doublerabbit wrote:
           | base16
        
           | ElectricalUnion wrote:
           | If you can count on every printable ascii character being
           | not-mangled, you can use ascii85/base85/Z85 (5 "ascii
           | characters" to 4 bytes) instead of base64.
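
To put rough numbers on the encodings mentioned in this subthread, here is a small arithmetic sketch. It assumes the final group is padded to full size and ignores line breaks, so treat the results as approximate; the 3000-byte payload and the function name are made up.

```shell
set -eu

# encoded_len BYTES GROUP_IN GROUP_OUT
# Size of BYTES after an encoding that turns every GROUP_IN input bytes
# into GROUP_OUT output characters, padding the last group to full size.
encoded_len() {
  echo $(( ($1 + $2 - 1) / $2 * $3 ))
}

n=3000   # hypothetical payload size in bytes
len_base16=$(encoded_len "$n" 1 2)   # 2 characters per byte
len_base64=$(encoded_len "$n" 3 4)   # 4 characters per 3 bytes
len_base85=$(encoded_len "$n" 4 5)   # 5 characters per 4 bytes

echo "base16: $len_base16"   # prints "base16: 6000"
echo "base64: $len_base64"   # prints "base64: 4000"
echo "base85: $len_base85"   # prints "base85: 3750"
```

So for this size base85 saves 250 characters over base64, while base16 doubles the payload.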
        
             | raverbashing wrote:
             | There's probably a base(bigger number) with Unicode chars
             | today
        
               | bashinator wrote:
               | base65536, and look who the author is :-D
               | 
               | https://github.com/qntm/base65536
        
               | cassianoleal wrote:
               | Who is the author?
        
               | yccs27 wrote:
               | https://www.qntm.org
               | https://news.ycombinator.com/from?site=qntm.org
        
               | CaptainFever wrote:
               | Base65536? https://github.com/qntm/base65536
        
       | heresie-dabord wrote:
       | In Perl, __DATA__ indicates the beginning of the data section of
       | the file. A portable way to provide test data or sample data.
       | 
       | https://perldoc.perl.org/functions/__DATA__
        
       | davidw wrote:
       | I seem to recall that you can do the opposite as well: stash some
       | extra data at the end of a binary file. The 'tclkit' system used
       | this to package up an executable with the scripts you wanted to
       | ship.
        
       | cocodill wrote:
       | I can vaguely remember that many programs used to install
       | themselves this way under Linux.
        
         | a2tech wrote:
         | Lots of commercial Linux software still uses this for installing
         | their stuff. It's a neat trick.
        
         | nerdponx wrote:
         | I've seen it recently with the Conda and Mamba package
         | managers.
        
         | teddyh wrote:
         | It was used on Unix systems even before that.
        
           | dekhn wrote:
           | definitely used something similar on VAX/VMS called VMS_SHARE
           | (https://www.glaver.org/ftp/multinet-contributed-
           | software/vms...) circa '90-91
           | 
           | in fact I found an old archive of mine floating around on
           | usenet and wrote a python script to unpack it. Looking at the
           | original, it was using a scripting language bootstrap to
           | make a COM script unpack the embedded original code.
        
       | dietrichepp wrote:
       | This trick is used in the demoscene. Instead of using -c, I use
       | -n,
       |
       |     tail -n +2 $0
       |
       | The -n +2 option means "starting at line 2", which is what you
       | want if you cram your script into one line. You can make an
       | executable packed with lzma this way,
       |
       |     a=`mktemp`;tail -n+2 $0|unxz>$a;chmod +x $a;$a;rm $a;exit
       | 
       | This is the polite way to do it, using mktemp. You can save some
       | bytes if you don't care about that stuff.
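
A sketch of how such a packed file could be assembled and run. Two things here are assumptions that differ from the comment: gzip/gunzip stands in for xz/unxz (it is more likely to be installed), and the "executable" is a tiny shell script standing in for a real binary. The header line also quotes "$0" and "$a", which costs a few bytes but survives paths with spaces.

```shell
set -eu

workdir=$(mktemp -d)

# Hypothetical payload: a tiny shell script standing in for a real binary.
printf '#!/bin/sh\necho "hello from the payload"\n' > "$workdir/payload"

packed="$workdir/packed.sh"

# Line 1 is the whole unpacker: decompress everything from line 2 onward
# into a temp file, run it, clean up, and exit before reaching the data.
printf '%s\n' 'a=`mktemp`;tail -n+2 "$0"|gunzip>"$a";chmod +x "$a";"$a";rm "$a";exit' > "$packed"

# Everything after the first newline is the compressed payload.
gzip -c "$workdir/payload" >> "$packed"

output=$(bash "$packed")
echo "$output"   # prints "hello from the payload"
```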
        
       | 2OEH8eoCRo0 wrote:
       | I think this is how GOG ships the Linux version of Battletech.
        
       | hey00 wrote:
       | momomom
        
       | JohnFen wrote:
       | This is my default approach to writing installers for the Unices.
       | The program is compressed and added to the end of the script, and
       | the script does the unpacking and any needed setup/configuration
       | for the specific platform it's getting installed on.
       | 
       | I don't append it in binary form, though. I uuencode it. That
       | way, there is no danger in using text editors.
        
         | themerone wrote:
         | Why uuencode? Base64 is the de facto standard these days.
        
           | vram22 wrote:
           | I've used both, but only briefly. I think I used uuencode
           | when using uucp. And Base64 in one of my Python programs.
           | 
           | What are their pros and cons, in your opinion?
        
             | themerone wrote:
             | Base 64 is slightly more space efficient. Other than that
             | it's just more popular and better supported.
        
           | JohnFen wrote:
           | Sorry, I did mean base64. I have a bad habit of calling all
           | "binary as text" encodings "uuencode". I usually catch myself
           | before I put it in writing, though.
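
One way the installer approach from this subthread might look, with the payload base64-encoded as clarified above. This is a sketch under assumptions: the `__PAYLOAD__` marker, the file names and the single-file "program" are all invented, and it relies on GNU base64, sed and tar.

```shell
set -eu

workdir=$(mktemp -d)
mkdir "$workdir/src" "$workdir/dest"
printf 'hi\n' > "$workdir/src/hello.txt"   # hypothetical program file

installer="$workdir/install.sh"

# Script part: drop everything up to and including the marker line,
# then decode and unpack the rest into the directory given as $1.
cat > "$installer" <<'EOF'
#!/usr/bin/env bash
sed '1,/^__PAYLOAD__$/d' "$0" | base64 -d | tar -xzf - -C "$1"
exit 0
__PAYLOAD__
EOF

# Data section: a gzipped tarball, base64-encoded so it stays plain text
# and survives being opened in a text editor.
tar -czf - -C "$workdir/src" hello.txt | base64 >> "$installer"

bash "$installer" "$workdir/dest"
cat "$workdir/dest/hello.txt"   # prints "hi"
```

Using a marker line instead of a byte count means the script part can be edited freely without recomputing any offsets.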
        
       | hey00 wrote:
       | I dont understand this website it is too hard and i dont
       | understand anything. Anyone help me with this?
        
       | onion2k wrote:
       | This reminds me of ZX Spectrum Basic where all the graphics,
       | sound, and level layouts were defined using DATA lines at the end
       | of the program.
        
         | antod wrote:
         | Or any machine code routines you wanted to POKE into memory.
         | 
         | A suppressed obscure part of my lizard brain secretly wishes I
         | could just code for 8bit computers from the 80s, just with all
         | the modern niceties like text editors, assemblers and emulators
         | etc.
        
         | allarm wrote:
         | You could also put the binary data in the first line of the
         | Basic program after the 'rem' command, change the line number
         | to 0 using the poke command, so that it's not possible to edit
         | this line. The second line would run the code using 'randomize
         | usr'. There were also fun tricks with control sequences, that
         | would hide the 'rem' command and the line number, and put
         | something like "Cracked by Bill Gilbert (c) 1982" instead.
         | Gosh, why do I still remember all this nonsense after all these
         | years...
        
       | eadler wrote:
       | See
       | https://man.freebsd.org/cgi/man.cgi?query=shar&sektion=1&for...
       | for a tool to generate these types of archives.
        
         | teddyh wrote:
         | Also on GNU/Linux systems:
         | <https://manpages.debian.org/stable/sharutils/shar.1.en.html>
        
       | vram22 wrote:
       | BASIC and Perl had or have something like that too.
       | 
       | IIRC, Perl copied it from BASIC, because BASIC came much before
       | Perl.
       | 
       | And, again, IIRC, I've read about the shar (shell archive) method
       | that someone else commented about in this thread (and which even
       | has a Wikipedia entry), in either the classic Kernighan and Pike
       | book, The Unix Programming Environment (which I've recommended
       | here multiple times before), or in some Unix man pages, long ago.
       | 
       | So it's quite an old method.
        
       | norir wrote:
       | This is a great trick, but no one should ever run someone else's
       | script that does this unless they have verified the script line
       | by line beforehand.
        
       | nottorp wrote:
       | Shell archive it was called? There used to be a lot of installers
       | like that.
        
         | NovemberWhiskey wrote:
         | Yup; "shar" https://en.wikipedia.org/wiki/Shar
        
       | twic wrote:
       | Since zip files use a directory at the end, you can make a kind
       | of mullet file - script at the front, archive at the back. I
       | generated single-file runnable Java binaries like that at one
       | point.
        
       ___________________________________________________________________
       (page generated 2023-07-05 23:00 UTC)