|
|
If you care less about space efficiency and more about maintainability of the script, you can also encode the binary as base64 and put an
echo '...base64 data...' | base64 -d > somefile
in your script. Or add compression to reclaim at least some of the wasted space:
echo '...base64 gzipped data...' | base64 -d | gunzip > somefile
Also note that bash accepts line breaks in quoted strings and the base64 utility has an "ignore garbage" option that lets it skip over e.g. whitespace in its input. You can use those to break up the base64 over multiple lines:
echo '
...base64 gzipped data...
...more data...
...even more data...
' | base64 -di | gunzip > somefile
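A minimal round-trip sketch of the idea above, assuming GNU coreutils `base64` and `gzip` are available. The file names `payload.bin` and `restored.bin` are placeholders, and the encoded text is kept in a variable here rather than pasted into a script by hand:

```shell
#!/bin/sh
# Create a small stand-in for the binary (placeholder content).
printf 'hello, binary world\n' > payload.bin

# This is the text you would paste between the quotes in the script.
encoded=$(gzip -c payload.bin | base64)

# What the script does at run time: decode, then decompress.
echo "$encoded" | base64 -d | gunzip > restored.bin

# Confirm the restored file is byte-identical to the original.
cmp -s payload.bin restored.bin && echo "round trip OK"
```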
|
|
If you care about maintainability, you keep the binary data out of the source file and have a build process.
|
|
For something small, I would take a data.bin file rather than a build process.
But yes.
|
|
You can also use here-documents to avoid hitting any argv length limits:
{ base64 -d | gunzip > output; } <<EOF12345
...data...
EOF12345
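Here is a rough sketch of a packer that generates such a script, assuming GNU `base64` and `gzip`. The names `payload.txt`, `unpack.sh` and `output.txt` are made up for the example:

```shell
#!/bin/sh
# Placeholder payload.
printf 'hello from a here-document\n' > payload.txt

# Emit a script whose data travels in a here-document instead of argv.
{
  echo '#!/bin/sh'
  echo '{ base64 -d | gunzip > output.txt; } <<EOF12345'
  gzip -c payload.txt | base64
  echo 'EOF12345'
} > unpack.sh

# Run the generated script and show what it unpacked.
sh unpack.sh
cat output.txt
```

Since the base64 alphabet is only letters, digits, `+`, `/` and `=`, nothing in the body gets expanded even with an unquoted delimiter. A delimiter containing a character outside that alphabet (an underscore, say) can never collide with a line of payload.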
|
|
An even simpler way would be to include a marker to denote the end of the shell script and the start of the data. For example, if you put this in extract.sh:
#!/bin/sh
sed -E '1,/^START-OF-TAR-DATA$/d' "$0" | tar xvzf -
exit
START-OF-TAR-DATA
and then run:
cat extract.sh ../foobar.tar.gz > foobar.tar.gz.sh
You can then run foobar.tar.gz.sh to self-extract. You still get the benefit of being able to modify the shell script without needing to count lines or characters, and without sacrificing any compression.
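An end-to-end sketch of the marker approach, assuming GNU `sed` (which passes binary lines through untouched) and `tar` with gzip support. The directory names `demo_src` and `demo_out` and the file `foo.txt` are placeholders:

```shell
#!/bin/sh
# Build a small tarball to act as the payload.
mkdir -p demo_src demo_out
echo 'some content' > demo_src/foo.txt
tar czf foobar.tar.gz -C demo_src foo.txt

# The extractor stub: everything after the marker line is archive data.
cat > extract.sh <<'STUB'
#!/bin/sh
sed -E '1,/^START-OF-TAR-DATA$/d' "$0" | tar xzf -
exit
START-OF-TAR-DATA
STUB

# Glue stub and archive together.
cat extract.sh foobar.tar.gz > foobar.tar.gz.sh

# Run it from another directory; "$0" holds the path used to invoke it.
(cd demo_out && sh ../foobar.tar.gz.sh)
cat demo_out/foo.txt
```

The `exit` before the marker matters: it stops the shell from ever trying to parse the binary tail as commands.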
|
|
Just to be sure I’m following you correctly, what is the advantage of zipping the base64 data vs having the original binary, zipped if you like?
|
|
As I understood it, you base64 the zipped data on input and do the reverse on output. The reasoning is that the base64'd binary data is safe from being corrupted when the file is edited in text editors, as a response to the warning stated in the last paragraph of the original post.
|
|
Is there an encoding that is less wasteful than base64 but not vulnerable to text editor corruption issues? I think avoiding 0x0 to 0x20 should be enough to not get corrupted by text editors, though base64 avoids a lot more than that.
|
|
If you can count on every printable ascii character being not-mangled, you can use ascii85/base85/Z85 (5 "ascii characters" to 4 bytes) instead of base64.
|
|
There's probably a base(bigger number) with Unicode chars today
|
|
While a couple of people suggested Base65536, that encoding isn't particularly compact, and it can't be as elegant as 65536 would suggest because it has to dodge special cases in unicode. It's almost always the case that either Base32768 is denser, or encodings with 2^17 or 2^20 characters are denser.
|
|
|
|
|
|
But you need to make sure to use utf-16 or utf-32 instead of utf-8, or you may be worse off.
|
|
|
|
|
|
This trick is used in the demoscene. Instead of using -c, I use -n:
tail -n +2 $0
The -n +2 option means “starting at line 2”, which is what you want if you cram your script into one line. You can make an executable packed with lzma this way:
a=`mktemp`;tail -n+2 $0|unxz>$a;chmod +x $a;$a;rm $a;exit
This is the polite way to do it, using mktemp. You can save some bytes if you don’t care about that stuff.
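A sketch of packing and running such a file. Two assumptions to flag: `gunzip` is swapped in for `unxz` only because gzip is more likely to be installed, and the payload is a tiny shell script (`payload.sh`, a made-up name) standing in for a real executable. It also assumes the temp directory allows executing files:

```shell
#!/bin/sh
# Stand-in payload; a real packer would compress a binary here.
printf '#!/bin/sh\necho "payload ran"\n' > payload.sh

# Line 1 is the entire loader; everything from line 2 on is compressed data.
printf '%s\n' 'a=$(mktemp);tail -n +2 "$0"|gunzip>"$a";chmod +x "$a";"$a";rm "$a";exit' > packed.sh
gzip -c payload.sh >> packed.sh

# The loader unpacks itself to a temp file, runs it, and cleans up.
sh packed.sh
```

The quotes around `$0` and `$a` cost a few bytes, so a size-obsessed intro would drop them as in the one-liner above.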
|
|
There must be a way to run something without needing a temp file...
|
|
|
|
yup. after that you can use the global var DATA to access the data injected after the __END__
|
|
Are you sure that Perl took it from ruby and not the other way around? (edit: a subsequent correction has obsoleted this comment)
|
|
|
|
|
A very large Electronic Medical Records company shipped an extremely large shell script to us for an install. Upon examination it contained binary data and a command to extract it to a file and then installed the application. This was the “efficient” way to ship and install the binary.
|
|
|
|
Shell archive it was called? There used to be a lot of installers like that.
|
|
|
|
|
|
This reminds me of a job I had 15+ years ago where we did code reviews by emailing files to one another with our changes. It worked like this with the first part of the file being a script and the end of the file being a base64 encoded zip of the changed files. We had tooling that would pack them, but unpacking was done by execution. What could possibly go wrong with emailing executable scripts?
|
|
|
|
This is a great trick, but no one should ever run someone else's script that does this unless they have verified the script line by line beforehand.
|
|
I don't think I've ever read through the Nvidia binary drivers that way. (They're named *.run but are basically shar files)
|
|
Sure, but that's turtles all the way down... any time you run untrusted code, you are making a risk-based decision, usually based on the provenance of the code.
|
|
Maybe? People run all manner of binaries/installers without checking them; I'm not sure why these sorts of things require any EXTRA scrutiny.
|
|
|
|
|
|
Since zip files use a directory at the end, you can make a kind of mullet file - script at the front, archive at the back. I generated single-file runnable Java binaries like that at one point.
|
|
|
|
|
Java JAR files are similar, but reversed. You can add anything you want to the beginning of the JAR file (or is it any ZIP file?) so long as it doesn't include the Zip file header "PK". So, I use this to prepend a bash script that ultimately calls
java -jar $0
It makes it very easy to set up and use Java-based command line programs on a server.
|
|
This sounds incredibly useful but I couldn't get it to work. I just get
java.util.zip.ZipException: invalid CEN header (bad signature)
at java.base/java.util.zip.ZipFile$Source.zerror(ZipFile.java:1623)
if I try to do anything with a JAR file that has leading text. I'm creating it just using
echo 'java -jar $0' | cat - test.jar > test.run.jar
Is there more to it?
|
|
Technically, you should update the offset to the central directory in the Zip footer, along with the offsets to each file header in each central directory entry. If you don’t, the zip file reader has to apply some heuristics to locate the central directory; not all readers implement these heuristics, and those that do won’t always be robust. The “unzip” utility can be useful as a sanity check; run “unzip -t” to test the integrity of the file.
|
|
|
|
|
I did a similar thing for a lowish volume embedded product. The update files are just bash scripts with a tar file cat'd on them. The unit just looks for a particular file on an external flash drive to run and the bash script runs, copies off a tar and checks that it has the right hash. Super simple and flexible when customers need me to do something special. Like extract some specific log onto a flash drive.
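Not the actual product code, just a sketch of what a script like that could look like, assuming GNU `sed`, `tar` and `sha256sum`. All names here (`update.sh`, `__PAYLOAD__`, `upd_src`, `upd_out`, `app.conf`) are hypothetical:

```shell
#!/bin/sh
# Build a placeholder tar and record its hash at packing time.
mkdir -p upd_src
echo 'new firmware file' > upd_src/app.conf
tar cf update.tar -C upd_src app.conf
sum=$(sha256sum update.tar | cut -d' ' -f1)

# Assemble the update script: header, expected hash, stub, then the raw tar.
{
  echo '#!/bin/sh'
  echo "expected=$sum"
  cat <<'STUB'
# Copy off everything after the marker line.
sed '1,/^__PAYLOAD__$/d' "$0" > extracted.tar
actual=$(sha256sum extracted.tar | cut -d' ' -f1)
if [ "$actual" != "$expected" ]; then
  echo "hash mismatch, refusing to install" >&2
  rm -f extracted.tar
  exit 1
fi
mkdir -p upd_out && tar xf extracted.tar -C upd_out
echo "update applied"
exit 0
__PAYLOAD__
STUB
  cat update.tar
} > update.sh

# On the unit, this is the file that gets picked up and run.
sh update.sh
```

Special requests from customers then only mean editing the stub portion, for example adding a line that copies a log file onto the flash drive before `exit 0`.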
|
|
|
|
|
|
I can vaguely remember that many programs used to install themselves this way under Linux.
|
|
It was used on Unix systems even before that.
|
|
Definitely used something similar on VAX/VMS called VMS_SHARE ( https://www.glaver.org/ftp/multinet-contributed-software/vms...) circa '90-91. In fact, I found an old archive of mine floating around on Usenet and wrote a Python script to unpack it. Looking at the original, it used a scripting-language bootstrap to make a COM script unpack the embedded original code.
|
|
Lots of commercial Linux software still uses this for installing their stuff. It’s a neat trick.
|
|
I've seen it recently with the Conda and Mamba package managers.
|
|
|
|
This is my default approach to writing installers for the Unices. The program is compressed and added to the end of the script, and the script does the unpacking and any needed setup/configuration for the specific platform it's getting installed on. I don't append it in binary form, though. I uuencode it. That way, there is no danger in using text editors.
|
|
Why uuencode? Base64 is the de facto standard these days.
|
|
Sorry, I did mean base64. I have a bad habit of calling all "binary as text" encodings "uuencode". I usually catch myself before I put it in writing, though.
|
|
I've used both, but only briefly. I think I used uuencode when using uucp.
And Base64 in one of my Python programs. What are their pros and cons, in your opinion?
|
|
Base64 is slightly more space efficient. Other than that it's just more popular and better supported.
|
|
Got it, thanks. Yes, uuencode / uudecode are probably older too. They are from the uucp dialup comms era of networking.
|
|
|
|
|
That's what uuencode / uudecode were once used for.
|
|
|
|
This works for any sh-type script, not just bash :) It will work with sh, ksh and even [t]csh.
|
|
|
|
That's how I made a bash backdoor once. It was just a script somewhere on the FS, until it unpacked itself and executed the rest of the rootkit. Long story but trust me that I had good intentions.
|
|
|
|
BASIC and Perl had or have something like that too. IIRC, Perl copied it from BASIC, because BASIC came much before Perl. And, again, IIRC, I've read about the shar (shell archive) method that someone else commented about in this thread (and which even has a Wikipedia entry), in either the classic Kernighan and Pike book, The Unix Programming Environment (which I've recommended here multiple times before), or in some Unix man pages, long ago. So it's quite an old method.
|
|
|
|
This reminds me of ZX Spectrum Basic where all the graphics, sound, and level layouts were defined using DATA lines at the end of the program.
|
|
Or any machine code routines you wanted to POKE into memory. A suppressed obscure part of my lizard brain secretly wishes I could just code for 8bit computers from the 80s, just with all the modern niceties like text editors, assemblers and emulators etc.
|
|
You could also put the binary data in the first line of the Basic program after the ‘rem’ command, then change the line number to 0 using the poke command, so that it’s not possible to edit this line. The second line would run the code using ‘randomize usr’. There were also fun tricks with control sequences that would hide the ‘rem’ command and the line number, and put something like “Cracked by Bill Gilbert (c) 1982” instead. Gosh, why do I still remember all this nonsense after all these years…
|
|
|
|
I use a fun little hack, a la awk:
```
#!/usr/local/bin/bash
echo "HELLO"
TAIL_REMOTE_MARKER=`awk '/^__THE_REMOTE_PART__/{flag=1;next}/^__END_THE_REMOTE_PART__/{flag=0;exit}flag' ${0}`
eval "$TAIL_REMOTE_MARKER"
exit 0
__THE_REMOTE_PART__
echo "WORLD"
__END_THE_REMOTE_PART__
```
|
|
|
|
I used to do something similar for Windows executable files. Append a large file to the end as necessary.
|
|
|
|
I seem to recall that you can do the opposite as well: stash some extra data at the end of a binary file. The 'tclkit' system used this to package up an executable with the scripts you wanted to ship.
|
|
|
|
I vaguely remember this is what Ocaml does for one format of its executable.
|
|
|
|
This is a malware technique. I am not saying don't do it. But that is mostly where I see this type of trick.
|
|
Malware is about intent and consent, not executable format.
|
|
|
> portswigger does that for the burpsuite installers.
Wow, that triggered my wordplay radar, which I'm working on as a fun side line these days, thanks :) port, suite (sweet), swig, burp
|
|
|
I think this is how GOG ships the Linux version of Battletech.
|
|
I believe this is how GOG ships all of its Linux titles, all of the installs I've used from them are downloaded as a single *.sh file. I just checked an example game, and it looks to be using this method.
|
|
|
|
Makeself archives are classic self-extracting tarballs that do exactly that...
|
|
|
|
|
I dont understand this website it is too hard and i dont understand anything. Anyone help me with this?
|
|
|