2021-05-13
I’m regularly asked to write something about the magic of shell scripting, so here goes. While I don’t expect deep understanding from the reader, I assume basic knowledge of how to work with the terminal itself, and that you have seen some scripts (while maybe being too scared to touch them yet).
Shell scripting is different from other scripting or programming in that we don’t have “libraries” we include. Instead, all programs we have installed serve as our huge library of tools we can invoke, chain together, loop over, etc. Thus, “learning shell scripting” consists of a) learning the tools commonly available on your regular UNIX/Linux workstation, and b) learning the language that chains together these tools.
For the language we will use the POSIX Shell subset 1 that is supported by virtually any shell, including Bash and Zsh, but also more modern incarnations of Ksh. This isn’t only a plus due to portability, but also because POSIX Shell is much simpler than the many different ways we can build an if in Bash, or iterate in Zsh. While those are definitely useful in some contexts, most often the multitude of syntaxes only confuses the user.
Probably the most esoteric part of the classic shell is if in combination with the test program. Ksh, Bash, Zsh and so on all set out to “fix” this; however, the added complexity arguably made things worse. And while it is definitely an idiosyncratic design, it’s rather easy to understand, so let’s start:
The if built-in keyword simply executes a program and checks its exit code. If the program exited with code 0, this is considered a true condition. Or, as described more verbosely in the standard under The if Conditional Construct:
The if compound-list shall be executed; if its exit status is zero, the then compound-list shall be executed and the command shall complete. Otherwise, each elif compound-list shall be executed, in turn, and if its exit status is zero, the then compound-list shall be executed and the command shall complete. Otherwise, the else compound-list shall be executed.
In most environments you will have two programs called true and false available at /bin/true and /bin/false, or /usr/bin/true and /usr/bin/false, respectively.
Let’s check what exit code they have! You can either enter sh to get a POSIX interactive shell and type the following directly, or save it as a file, e.g., foo.sh, and run it as sh ./foo.sh:
if /usr/bin/true; then
echo 'exit code 0!'
else
echo 'exit code non-zero!'
fi
It prints “exit code 0!” which makes sense since the executable is called “true”.
More commonly, however, we don’t want to check the exit code of a program, but the value of a variable. We can reduce this problem to checking an exit code if we have a program that takes an expression and exits with the appropriate exit code. Luckily for us, someone already went through the hassle of writing this and called the program test. Let’s give it a ride:
answer=42
if test "$answer" -eq 42; then
echo "The Answer is $answer"
fi
While this works, it looks a bit clumsy, so the shorthands [ and ] were created as an alternative name and delimiter for test:
answer=42
if [ "$answer" -eq 42 ]; then
echo "The Answer is $answer"
fi
Since [ is a program with the arguments "$answer" (shell-expanded to the value of the variable), -eq, 42 and ], you need to separate all of these with spaces. The following does not work!:
answer=42
if ["$answer" -eq 42]; then
echo "The Answer is $answer"
fi
In order to check the truth value of multiple conditions, we call test multiple times, chaining the results:
if test "$answer" -eq 42 && test "$earth" = "exploded"; then
echo "BOOOM"
fi
or, with the prettier []-syntax:
if [ "$answer" -eq 42 ] && [ "$earth" = "exploded" ]; then
echo "BOOOM"
fi
Depending on your previous knowledge, the && may already be known to you. While it acts as a logical and here, its semantics are slightly different: the command on the left-hand side of the && is executed; if it exited successfully (i.e., exit status is zero), the right-hand side is executed as well, with the exit status of the complete expression being that of the latter. But if the first command failed, the second command will not be executed, and failure is signaled with a non-zero exit status.
Analogously, we can produce a logical or using ||, which short-circuits as well, i.e., stops after the first command that exited successfully.
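A minimal sketch of both short-circuits, using the true and false programs from earlier:

```shell
# The right-hand side of && runs only if the left-hand side succeeded:
true && echo 'left succeeded'    # printed
false && echo 'never printed'    # skipped, overall status is non-zero

# With || it is the other way around: the right-hand side runs only on failure:
false || echo 'left failed'      # printed
true || echo 'never printed'     # skipped
```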
For completeness’ sake, there are also Sequential Lists using ;, which simply execute the commands in order, without exiting early, passing on the exit status of the last command in the list. More advanced are Asynchronous Lists (using a single &) which run commands in the background, (possibly) in parallel, but are not appropriate for use in if, since they always exit with 0.
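A small illustrative sketch of both list types (the sleep duration is arbitrary):

```shell
echo 'first'; echo 'second'   # a sequential list: runs strictly in order

sleep 1 &                     # an asynchronous list: sleep runs in the background
echo 'printed while sleep still runs'
wait                          # block until all background jobs have finished
echo 'sleep is done'
```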
We can also write else blocks as well as else-if blocks. However, typing is hard, and shell syntax is even worse, which is why:
if [ "$answer" -eq 42 ]; then
echo "Answer given in decimal"
else if [ "$answer" -eq 101010 ]; then
echo "Answer given in binary"
fi
doesn’t work, so we must type less and use the keyword elif instead:
if [ "$answer" -eq 42 ]; then
echo "Answer given in decimal"
elif [ "$answer" -eq 101010 ]; then
echo "Answer given in binary"
else
:
fi
Further, if we have an empty body, we cannot just leave it empty; the shell expects something. Luckily, the : serves as a no-op.
The while loop works almost identically to the if construct, with the slight adjustment that the specified command (i.e., most commonly test) is called multiple times:
while [ "$answer" -ne 42 ]; do
echo "Wrong answer... increasing"
answer="$((answer + 1))"
done
This uses Arithmetic Expansion via the $((...)) syntax. Within it we do not need $ to refer to variables, and we can now do pretty complex maths directly from the console, neat!
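For example, directly in an interactive sh session:

```shell
x=7
echo "$(( (x + 3) * 4 ))"   # parentheses and the usual operators work: prints 40
echo "$(( x % 2 ))"         # remainder operator: prints 1
```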
Similar to the while loop, we can use the until loop:
until [ "$answer" -eq 42 ]; do
echo "Wrong answer... increasing"
answer="$((answer + 1))"
done
The for loop is probably the most avant-garde construct of the shell, as it is a range-based for-in loop, unlike the three-expression style found in C:
for x in foo bar baz; do
echo "$x"
done
The syntax is quite easy to pick up, and using the program seq we can also iterate over indices:
for i in $(seq 1 42); do
echo "$i"
done
Since the for loop doesn’t expect to run a program as part of its “head” (unlike if), we need to explicitly ask the shell to do Command Substitution using the $(...) construct, which runs the program with the specified arguments and replaces the expression with its output (not its exit code; again, unlike if). Since seq produces a list of numbers from 1 to 42 inclusive when called as above, i will take precisely these values.
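Command substitution is not limited to for-loop heads; we can just as well capture a pipeline’s output in a variable:

```shell
# Capture the output of a whole pipeline in a variable:
last=$(seq 1 42 | tail -n 1)
echo "the last number is $last"   # prints: the last number is 42
```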
However, unlike our first example, the numbers produced aren’t delimited by spaces but by newlines! Indeed, tabs would’ve worked just as well. The shell does something called Field Splitting here, and, by default, fields are split at space, tab or newline. Again, quoting the standard:
any sequence of &lt;space&gt;, &lt;tab&gt;, or &lt;newline&gt; characters at the beginning or end of the input shall be ignored and any sequence of those characters within the input shall delimit a field.
We can actually modify at which positions fields are split, e.g., for parsing semicolon-delimited CSVs using IFS=';', but this is out of scope for this article :-)
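Just to give a tiny taste, here is a minimal sketch (the field names are invented for illustration): setting IFS only for the read built-in splits a semicolon-delimited line into fields:

```shell
# read splits its input at IFS; IFS=';' applies only to this one command:
printf 'alice;30;berlin\n' | while IFS=';' read -r name age city; do
    printf '%s (%s) lives in %s\n' "$name" "$age" "$city"
done
```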
In some cases you want to check the value of one variable against a whole range of patterns. In many languages this can be done using a switch-case or match construct. Since shell is a language for those who don’t like to type much, we only say case and terminate the construct with esac (“case” reversed).
case "$x" in
foo) echo "x is foo" ;;
bar|baz) echo "x is bar or baz" ;;
*) echo "x ain't no hoopy frood" ;;
esac
We can match everything using the glob character *.
You can see that I quoted the variable x in the body of the example for loop:
for x in foo bar baz; do
echo "$x"
done
In this case there’d have been no difference had I omitted the quotes, but it is often considered good style to use them wherever you can.
To demonstrate the difference, let’s replace the first value (foo) of the list we iterate over by the string “The world is ending” which contains spaces. In order to tell the loop that we consider this one item of the list (and not four), we put quotes around it:
for x in "The world is ending" bar baz; do
printf "Found: %s\n" $x
done
I also replaced the echo with a printf to highlight the issue we will now observe. The output is:
Found: The
Found: world
Found: is
Found: ending
Found: bar
Found: baz
But… didn’t we ask the for loop to consider this as just one item? We did, but I also sneakily removed the quotes around $x, leading to the following chain of executed commands:
printf "Found: %s\n" The world is ending
printf "Found: %s\n" bar
printf "Found: %s\n" baz
That is, the printf is executed with five arguments: the format string (first) plus the four additional strings. However, we only used one format specifier %s and thus expected just one string following the format. This is the culprit here, as printf has a rather unexpected behavior when passed more arguments than the format string allows for: it reuses the format string until all arguments are consumed.
The correct command execution would’ve been with quotes:
printf "Found: %s\n" "The world is ending"
printf "Found: %s\n" "bar"
printf "Found: %s\n" "baz"
This can be achieved by quoting the $x.
Indeed, I recommend quoting all variables by default, and only thinking of it as “when must I omit the quotes” instead of the other way around.
However, there are also single quotes, which we haven’t talked about yet. All strings within double quotes are subject to Word Expansions; that is, we could write:
echo "$answer"
to print the value of the variable answer, since the shell expanded it before passing the resulting string to echo. Sometimes we don’t want things like that to happen, and actually want to print, say, a dollar sign:
echo 'Your life is worth $0.02'
Had we used double quotes here, our shell would’ve been very confused. In fact, in many cases, like the printf format strings above, we could (and possibly should) use single quotes to prevent erroneous expansion(s).
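For example, single-quoting the format string keeps the shell from expanding anything inside it:

```shell
price=3
# Single quotes keep the $ in the format string literal for the shell;
# printf then substitutes only the %s with its argument:
printf 'Coffee costs $%s today\n' "$price"   # prints: Coffee costs $3 today
```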
We can also nest quotes, use escape sequences, etc., but this is again out-of-scope for this article.
We’ve now had a brief look at some of the most simple constructs of the POSIX Shell, but it is, by itself, not that powerful. We need the tools of the UNIX workbench in order to do any useful composition using the shell language.
While using echo is simple, unfortunately, for all more advanced usages the exact behavior of echo differs from platform to platform. Thus, if you do anything more than echoing a simple variable or printing simple text, use printf instead.
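A sketch of the portable pattern: keep the format in a (single-quoted) printf format string and pass everything variable as arguments:

```shell
name='world'
printf 'Hello, %s!\n' "$name"            # portable, unlike echo with escape sequences
printf '%d + %d = %d\n' 1 2 "$((1 + 2))" # prints: 1 + 2 = 3
```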
The tool grep has its origins in the line editor ed, from the editor command g/re/p, meaning: “work globally”, “match by the regular expression given as re”, and “print the resulting lines”.
Spun out as its own command-line tool, we can do just that, without learning The Standard Editor (which is the precursor to ex, precursor to vi, precursor to vim, precursor(?) to nvim). Most usage of grep boils down to learning regular expressions, which is out of scope of this article. However, I want to give some notes that many seem not to be aware of:
- Multiple patterns can be searched at once by repeating the -e flag: grep -e pattern1 -e pattern2
- Use -E to switch to extended regular expressions, which require less quoting.
- Invert the match with -v.

Whatever you do with grep, remember though that it works on lines, due to its heritage from ed.
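The flags above in action, on a small made-up fruit list (the fruit function is just a stand-in for a real input file):

```shell
fruit() { printf 'apple\nbanana\ncherry\n'; }   # stand-in for a real input file

fruit | grep -e apple -e cherry   # two patterns: matches apple and cherry
fruit | grep -E 'an+a'            # extended regexp, no escaping needed: banana
fruit | grep -v a                 # inverted match: only cherry has no 'a'
```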
The stream editor sed also shares a heritage with ed, basically being a simple scriptable version of it. Instead of searching for a pattern and printing the results, we can replace occurrences, delete them, list them, print them, etc.
The most common usage is probably replacement, using the syntax s/regexp/replacement/ with an optional trailing g to replace all matches globally.
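For example, with and without the trailing g:

```shell
echo 'foo bottles of foo' | sed 's/foo/bar/'    # first match per line: bar bottles of foo
echo 'foo bottles of foo' | sed 's/foo/bar/g'   # trailing g, all matches: bar bottles of bar
```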
A specialisation of sed/grep is tr. Instead of replacing one occurrence with another string, we can replace character ranges with other ranges. E.g., to capitalize all the letters in a given text:
echo 'The slow red tiger jumps over the energetic cat.' | tr 'a-z' 'A-Z'
Yielding
THE SLOW RED TIGER JUMPS OVER THE ENERGETIC CAT.
AWK supercharges the feature set of grep and sed by allowing us to execute arbitrary code if a certain pattern is matched. That is, the input is iterated over line by line and split into columns, and you can formulate patterns as well as conditions by referring to single columns or the whole line. This is best understood in action, and, since I cannot describe this any better, I copy this verbatim from the excellent book “The AWK Programming Language” 2 by Alfred V. Aho (The Dragon Book on compiler design), Peter J. Weinberger, and Brian W. Kernighan (“The C Programming Language”):
Useful awk programs are often short, just a line or two. Suppose you
have a file called emp.data
that contains the name, pay
rate in dollars per hour, and number of hours worked for your employees,
one employee per line, like this:
Beth 4.00 0
Dan 3.75 0
Kathy 4.00 10
Mark 5.00 20
Mary 5.50 22
Susie 4.25 18
Now you want to print the name and pay (rate times hours) for everyone who worked more than zero hours. This is the kind of job that awk is meant for, so it’s easy. Just type this command line:
awk '$3 > 0 { print $1, $2 * $3 }' emp.data
You should get this output:
Kathy 40
Mark 100
Mary 121
Susie 76.5
Let’s analyze the program given in the single quotes: the $3 refers to the third column, and thus the pattern matches every line where the employee $1 worked more than 0 hours. In these cases we execute the action given in the {...}, printing the name as well as the pay.
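Actions can also update variables, and the special END pattern runs after the last line; as a sketch, summing the total pay for the same data (fed via a pipe instead of the emp.data file):

```shell
printf 'Beth 4.00 0\nDan 3.75 0\nKathy 4.00 10\nMark 5.00 20\nMary 5.50 22\nSusie 4.25 18\n' |
    awk '$3 > 0 { total += $2 * $3 } END { print "total pay:", total }'
```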
If awk matches patterns against lines in a file, find matches patterns against files in your file system. As with awk, it can execute code when a pattern is matched, for example printing the line count of every file with the extension .c in the current directory, or any subdirectory:
find . -name '*.c' -exec wc -l {} \;
The expression -name '*.c' matches, and the expression -exec wc -l {} ; executes the program wc with the option -l (count lines only), substituting the {} with each matched file. Note that we need to escape the ; since ; is a keyword in the shell language (';' would’ve worked as well, but is one more character to type). This results in, e.g., the following executions:
wc -l foo.c
wc -l src/bar.c
With the output being:
1312 foo.c
161 src/bar.c
This works, but wc is invoked once per matched file, and we get no grand total. Most command-line tools are built “intelligently” though: they change behavior depending on whether they are called with multiple arguments or just one. If we’d run:
wc -l foo.c src/bar.c
We’d get:
1312 foo.c
161 src/bar.c
1473 total
How do we achieve that with find? Well, asking find nicely would be a plus, so we replace the ; with a +, and behold:
find . -name '*.c' -exec wc -l {} +
Since the + is no shell keyword, we don’t need to escape it either, neat!
With this, we can build powerful meta-tools; many of my personal scripts are just wrappers around one powerful find construct. And we don’t need to sin and use the non-POSIX, GNU-grep-specific grep -R option; we can simply use the short:
find . -type f -exec grep pattern {} +
Easy!