A Little Shell Rabbit Hole

 1 year ago
source link: https://zwischenzugs.com/2022/09/28/a-little-shell-rabbit-hole/

Occasionally I run dumb stuff in the terminal. Sometimes something unexpected happens and it leads me to wonder ‘how the hell did that work?’

This article is about one of those times and how looking into something like that taught me a few new things about shells. After decades using shells, they still force me to think!

The tl;dr is at the end if you don’t want to join me down this rabbit hole…

The Dumb Thing I Ran

The dumb thing I occasionally ran was:

grep .* *

If you’re experienced in the shell you’ll immediately know why this is dumb. For everyone else, here are some reasons:

  • The first argument to grep should always be a quoted string. Without the quotes, the shell treats the .* as a glob, not a regexp
  • Even when quoted, the regexp .* just matches every line, so…
  • you could just get almost the same output by running cat *

Not Quite So Dumb

Actually, it’s not quite as dumb as I’ve made out. Let me explain.

In the bash shell, ‘.*‘ (unquoted) is a glob matching all the files beginning with the dot character. So the ‘grep .* *‘ command above interpreted in this (example) context:

$ ls -a1
.
..
.adotfile
file1
file2

would be interpreted as the command that echo prints below:

$ echo grep .* *
grep . .. .adotfile file1 file2

The .* gets expanded by the shell as a glob to all files or folders beginning with the literal dot character.

Now, remember, every folder contains at least two folders:

  • The dot folder (.), which represents itself.
  • The double-dot folder (..), which represents the parent folder

So these get added to the command:

grep . ..

Followed by any other file or folder beginning with a dot. In the example above, that’s .adotfile.

grep . .. .adotfile

And finally, the ‘*‘ at the end of the line expands to all of the files in the folder that don’t begin with a dot, resulting in:

grep . .. .adotfile file1 file2

So, the regular expression that grep takes becomes simply the dot character (which matches any line with at least one character in it), and the files it searches are the remaining items in the file list:

..
.adotfile
file1
file2
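
To watch the two globs expand separately, here is a minimal sketch that rebuilds the example folder in a scratch directory. The file contents and variable names are made up for illustration. Note that the result for ‘.*’ depends on the shell: bash 5.1 includes . and .., while some other shells and newer bash releases may leave them out.

```shell
# Build a scratch folder that mirrors the example listing above.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'content in a dotfile' > .adotfile
echo 'a line in file1' > file1
echo 'a line in file2' > file2

# Let the shell expand each glob on its own and capture the resulting words.
dot_words=$(echo .*)
star_words=$(echo *)

echo "'.*' expands to: $dot_words"
echo "'*'  expands to: $star_words"

# Clean up the scratch folder.
cd / && rm -rf "$demo_dir"
```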

Since one of those is a folder (..), grep complains that:

grep: ..: Is a directory

before going on to match every line that contains at least one character. The end result is that empty lines are ignored, but every other line is printed on the terminal.
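
The difference between the regexp ‘.’ (what grep actually received) and the quoted regexp ‘.*’ can be checked with a small sketch. The three-line sample input and the names below are invented for illustration.

```shell
# Three lines of input; the middle one is empty.
sample_lines() { printf 'first\n\nthird\n'; }

# '.' needs at least one character, so the empty line is skipped.
dot_count=$(sample_lines | grep -c '.')

# '.*' also matches the empty string, so every line counts.
dotstar_count=$(sample_lines | grep -c '.*')

echo "lines matching '.':  $dot_count"
echo "lines matching '.*': $dotstar_count"
```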

Another reason why the command isn’t so dumb (and another way it differs from ‘cat *‘) is that since multiple files are passed into grep, it reports on the filename, meaning the output automatically adds which file the line comes from.

bash-5.1$ grep .* *
grep: ..: Is a directory
.adotfile:content in a dotfile
file1:a line in file1
file2:a line in file2
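
The filename prefix only appears when grep is given more than one file. As a rough sketch of that behaviour, adding /dev/null as a second file is a common way to get the prefix even with a single real file (the scratch folder and variable names here are just for the demonstration):

```shell
# One real file in a scratch folder.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'a line in file1' > file1

# A single file argument: grep prints only the matching line.
single=$(grep . file1)

# Two file arguments (the second is empty): grep prefixes the filename.
paired=$(grep . file1 /dev/null)

echo "one file:  $single"
echo "two files: $paired"

cd / && rm -rf "$demo_dir"
```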

Strangely, for two decades I hadn’t noticed that this is a very roundabout and wrong-headed (i.e. dumb) way to go about things, nor had I thought about its output being different from what I might have expected; it just never came up. Running ‘grep .* *’ was probably a bad habit I picked up when I was a shell newbie last century, and since then I never needed to think about why I did it, or even what it did until…

Why It Made Me Think

The reason I had to think about it was that I started to use zsh as my default shell on my Mac. Let’s look at the difference with some commands you can try:

bash-5.1$ mkdir rh && cd rh
bash-5.1$ cat > afile << EOF
text
EOF
bash-5.1$ bash
bash-5.1$ grep .* afile
grep: ..: Is a directory
afile:text
bash-5.1$ zsh 
zsh$ grep .* afile
zsh:1: no matches found: .*

For years I’d been happily using grep .* but suddenly it was telling me there were no matches. After scratching my head for a short while, I realised that of course I should have quotes around the regexp, as described above.
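
For comparison, here is a sketch of the quoted form. Because the quotes stop the shell from touching the pattern, grep receives ‘.*’ itself as the regexp, so the result should not depend on how the shell expands globs. The scratch folder is an assumption made for the demo; the file name and content follow the example above.

```shell
# Recreate the single-file folder from the transcript above.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'text' > afile

# Quoted: the shell passes '.*' through unchanged and grep treats it as a regexp.
quoted_out=$(grep '.*' afile)
echo "$quoted_out"

cd / && rm -rf "$demo_dir"
```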

But I was still left with a question: why did it work in bash, and not zsh?

Google It?

I wasn’t sure where to start, so I googled it. But what to search for? I tried various combinations of ‘grep in bash vs zsh‘, ‘grep without quotes bash zsh‘, and so on. While there was some discussion of the differences between bash and zsh, there was nothing which addressed the challenge directly.

Options?

Since google wasn’t helping me, I looked for shell options that might be relevant. Maybe bash or zsh had a default option that made them behave differently from one another?

In bash, a quick look at the options did not reveal many promising candidates, except for maybe noglob:

bash-5.1$ set -o | grep glob
noglob off
bash-5.1$ set -o noglob
bash-5.1$ set -o | grep glob
noglob on
bash-5.1$ grep .* *
grep: *: No such file or directory

But this is different from zsh’s output. What noglob does is completely prevent the shell from expanding globs. This means that the ‘.*’ reaches grep untouched as the regexp and the last ‘*’ reaches it as a literal filename, so grep complains that there is no file named ‘*’ in this folder.
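
The effect of noglob is easy to see with echo. This sketch uses ‘set -f’, which is the short spelling of ‘set -o noglob’ in bash and POSIX sh (zsh assigns its single-letter flags differently, so the long name is safer there). The scratch folder and variable names are illustrative.

```shell
# A scratch folder with one file in it.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'text' > afile

globbed=$(echo *)   # normal behaviour: the glob is replaced by the filename
set -f              # same as 'set -o noglob' in bash and POSIX sh
literal=$(echo *)   # globbing is off, so the '*' is passed through as-is
set +f              # switch globbing back on

echo "with globbing:    $globbed"
echo "without globbing: $literal"

cd / && rm -rf "$demo_dir"
```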

And for zsh? Well, it turns out there are a lot of options in zsh…

zsh% set -o | wc -l
185

Even just limiting to those options with glob in them doesn’t immediately hit a jackpot:

zsh% set -o | grep glob
nobareglobqual        off
nocaseglob            off
cshnullglob           off
extendedglob          off
noglob                off
noglobalexport        off
noglobalrcs           off
globassign            off
globcomplete          off
globdots              off
globstarshort         off
globsubst             off
kshglob               off
nullglob              off
numericglobsort       off
shglob                off
warncreateglobal      off

While noglob does the same as in bash, after some research I found that the remainder are not relevant to this question.

(Trying to find this out, though, is tricky. First, zsh’s documentation is not a single man page like bash’s; it’s divided into multiple man pages. Second, concatenating all the zsh man pages with man zshall and searching for noglob gets no matches. It turns out that options are documented in caps with underscores separating words. So, in noglob’s case, you have to search for NO_GLOB. Annoying.)

zsh with xtrace?

Next I wondered whether this was due to some kind of startup problem with my zsh setup, so I tried starting up zsh with the xtrace option to see what’s run on startup. But the output was overwhelming, with over 13,000 lines pushed to the terminal:

bash-5.1$ zsh -x 2> out
zsh$ exit
bash-5.1$ wc -l out
13328

I did look anyway, but nothing looked suspicious.

zsh with NO_RCS?

Back to the documentation, and I found a way to start zsh without any startup files by starting with the NO_RCS option.

bash-5.1$ zsh -o NO_RCS
zsh$ grep .* afile
zsh:1: no matches found: .*

There was no change in behaviour, so it wasn’t anything funky I was doing in the startup.

At this point I tried using the xtrace option, but then re-ran it in a different folder by accident:

zsh$ set -o xtrace
zsh$ grep .* *
zsh: no matches found: .*
zsh$ cd ~/somewhere/else
zsh$ grep .* *
+zsh:3> grep .created_date notes.asciidoc

Interesting! The original folder I created to test the grep just threw an error (no matches found), but when there is a dotfile in the folder, it actually runs something… and what it runs does not include the dot folder (.) or parent folder (..)

Instead, the ‘grep .* *‘ command expands the ‘.*‘ into all the files that begin with a dot character. For this folder, that is one file (.created_date), in contrast to bash, where it is three (. .. .created_date). So… back to the man pages…

tl;dr

After another delve into the man page, I found the relevant section in man zshall that gave me my answer:

FILENAME GENERATION

[...]

In filename generation, the character '/' must be matched explicitly; also, a '.' must be matched explicitly at the beginning of a pattern or after a '/', unless the GLOB_DOTS option is set. No filename generation pattern matches the files '.' or '..'. In other instances of pattern matching, the '/' and '.' are not treated specially.

So, it was as simple as: zsh ignores the ‘.‘ and ‘..‘ files.
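
If you want something close to zsh’s result in a shell that does include . and .., one portable workaround (not something from the zsh docs, just a common idiom) is the pattern ‘.[!.]*’, which requires a second character that is not a dot. A minimal sketch, with the caveat that this pattern also skips any name starting with two dots, such as a hypothetical ‘..foo’:

```shell
# A scratch folder with one dotfile and one ordinary file.
demo_dir=$(mktemp -d)
cd "$demo_dir" || exit 1
echo 'content in a dotfile' > .adotfile
echo 'a line in file1' > file1

# '.[!.]*' matches names starting with a dot followed by a non-dot character,
# so neither '.' nor '..' can match.
dotfiles_only=$(echo .[!.]*)

# The first expanded word (.adotfile) is now a file, not a regexp, so the
# regexp has to be given explicitly.
grep_out=$(grep . .[!.]* *)

echo "dotfiles: $dotfiles_only"
echo "$grep_out"

cd / && rm -rf "$demo_dir"
```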

But Why?

But I still don’t know why it does that. I assume it’s because the zsh designers felt that that wrinkle was annoying, and wanted to ignore those two folders completely. It’s interesting that there does not seem to be an option to change this behaviour in zsh.

Does anyone know?

