Winter 2017 - January to April 2017 - Updated 2018-11-29 14:31 EST
There are many ways to make mistakes in script programming. Here are some warnings about common errors.
Shells are not particularly good about giving helpful error messages when shell scripts contain errors. For example, a missing or non-integer argument to the test command may produce a vague error message:
bash$ test 1 -eq
bash: test: 1: unary operator expected
sh$ test 1 -eq ""
sh: 1: test: Illegal number:
A common mistake when writing a new shell script is to write too many lines of code, run the new script, and then get too many error messages. Because you wrote so many lines, you don’t know which line contains the error.
Create shell scripts a few lines at a time, testing the script after you add each line or two so you know where the errors lie.
Running the script using a shell with the debug options -x or -v set may also be helpful:
$ bash -u -x ./myscript.sh
$ bash -u -v ./myscript.sh
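To see what the -x trace looks like, here is a minimal sketch. It assumes the mktemp command is available; the function name trace_demo and the two-line script it writes are made up for illustration.

```shell
# Hypothetical helper: write a tiny two-line script to a temporary file,
# run it under "sh -x", and show the trace lines mixed with the output.
trace_demo () {
    script=$(mktemp) || return 1
    printf '%s\n' 'greeting=hello' 'echo "$greeting"' >"$script"
    sh -x "$script" 2>&1     # trace lines go to stderr; merge them into stdout
    rm -f "$script"
}
trace_demo
```

Each traced line starts with a plus sign and shows the command after the shell has expanded its variables.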
Don’t forget that shells aren’t designed to do arithmetic. They find and run commands, and it is commands that you must use in if statements.
The if statement below is wrong thinking; it forgets to use the test helper program to compare the numbers:
#!/bin/sh -u
if $# -gt 0 ; then # WRONG WRONG WRONG
echo "Number of arguments is $#"
fi
The above if line will have the shell expand the variable $# into some number and then try to execute that number as a command. The error message will not be very helpful and will depend on the number of arguments:
$ ./myscript.sh a b c
./myscript.sh: 2: ./myscript.sh: 3: not found
$ ./myscript.sh a b c d e f g h
./myscript.sh: 2: ./myscript.sh: 8: not found
You can see the problem if you use the -x option to the shell:
$ bash -ux ./myscript.sh a b c d
+ 4 -gt 0
./myscript.sh: line 2: 4: command not found
The shell if keyword must always be followed by a command name. Always remember to code some command name after if:
if test $# -gt 0 ; then # RIGHT: use test helper command to compare
if [ $# -gt 0 ] ; then # RIGHT (syntactic sugar for test helper)
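Put together as a complete fragment, the corrected test might look like this sketch. The function name report_args is hypothetical; a function is used only so that the argument count is easy to vary.

```shell
# Hypothetical function: report how many arguments it was given.
# The command name after "if" is the [ form of the test helper.
report_args () {
    if [ "$#" -gt 0 ] ; then
        echo "Number of arguments is $#"
    else
        echo "No arguments"
    fi
}
report_args a b c
report_args
```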
Conversely, don’t double up on the command name you use after if:
$ if [ fgrep foo /etc/passwd ] ; then date ; fi # WRONG WRONG WRONG
[: foo: binary operator expected.
The above syntactic-sugar line is equivalent to this (also incorrect) line:
$ if test fgrep foo /etc/passwd ; then date ; fi # WRONG WRONG WRONG
test: foo: binary operator expected.
The command name after the if, above, is test. The test command will see three command-line arguments (fgrep, foo, and /etc/passwd) and complain that the middle one isn’t an operator. You can’t use both commands test and fgrep at the same time.
If you want to test the return status of fgrep, then fgrep must be the command name that immediately follows the if keyword:
$ if fgrep foo /etc/passwd ; then date ; fi # RIGHT
Don’t put command names inside square brackets.
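Here is a minimal sketch of the same idea wrapped in a function. The name found_message is hypothetical, and the sketch uses grep -F, which is the standard spelling of fgrep.

```shell
# Hypothetical function: $1 is the fixed string to look for, $2 is the file.
# The search command itself follows "if"; no test and no square brackets.
found_message () {
    if grep -F -- "$1" "$2" >/dev/null ; then
        echo "found $1"
    else
        echo "no $1"
    fi
}
found_message root /etc/passwd
```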
[...] square bracket syntax for test needs surrounding blanks
The if keyword must always be followed by a command name, and that command name is exactly one square bracket [ in the syntactic-sugar form of the test helper command.
The following code does not work because blanks are missing around the first square bracket, making it into an unknown command named [1 or [!:
$ if [1 -eq 1 ] ; then echo "ALWAYS USE BLANKS" ; fi
sh: [1: command not found
$ if [! -r /etc/passwd ] ; then echo "ALWAYS USE BLANKS" ; fi
sh: [!: command not found
The shell sees [1 and [! as two-character command names that don’t exist. The following incorrect statement fails for the same reason:
$ if [a=b] ; then echo "ALWAYS USE BLANKS" ; fi
bash: [a=b]: command not found
Square brackets are not punctuation! Always use blanks around [ and ]:
$ if [ 1 -eq 1 ] ; then ... # RIGHT! surround with blanks
$ if [ ! -r /etc/passwd ] ; then ... # RIGHT! surround with blanks
The arguments to the test command must always be separate command line arguments. This next line fails because of the missing blank before the required closing square bracket:
$ if [ 1 -eq 1] ; then echo "ALWAYS USE BLANKS" ; fi
[: missing `]'
The [ command is looking for the argument “right square bracket” ] not 1] as its last command line argument. The corrected line uses blanks:
$ if [ 1 -eq 1 ] ; then ... # RIGHT! surround with blanks
Always surround all the test helper command arguments with blanks.
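You can compare the exit statuses of the broken and corrected forms with a sketch like this one. The helper name status_of is made up for illustration; it runs each line in a child sh so that the errors cannot stop the calling script.

```shell
# Hypothetical helper: run one line of shell source in a child sh
# and print the exit status of that child shell.
status_of () {
    if sh -c "$1" 2>/dev/null ; then
        echo 0
    else
        echo "$?"
    fi
}
status_of '[1 -eq 1 ]'     # [1 is not a command name
status_of '[ 1 -eq 1]'     # last argument is 1] so the closing ] is missing
status_of '[ 1 -eq 1 ]'    # blanks around every argument
```

Only the last line prints zero. The first prints 127, the status the shell uses for a command that it cannot find.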
test operators
The test helper command behaves differently depending on the number of arguments you pass to it:
$ test 1 -eq 2 # three arguments: operator in middle
$ test -f file # two arguments: operator on left
$ test string # one argument: -n assumed on left: -n string
If the test command has only one single command line argument, it defaults to using -n as the implied operator (test for non-empty string) on the one argument. The following one-argument tests are always TRUE, though they may not appear that way at first to human eyes:
if test a=b ; then # WRONG! THIS IS TRUE (good return code) !
if [ a=b ] ; then # WRONG! THIS IS TRUE (good return code) !
if test 1=2 ; then # WRONG! THIS IS TRUE (good return code) !
if [ 1=2 ] ; then # WRONG! THIS IS TRUE (good return code) !
if test 0 ; then ... # WRONG! THIS IS TRUE (good return code) !
if [ 0 ] ; then ... # WRONG! THIS IS TRUE (good return code) !
In all the above lines, the test command has only one command line argument (not counting the trailing ] that is always ignored). Since the single argument to test is not the empty string, test returns a good status and the if succeeds. The test command is defaulting to use an implied -n operator on the left. The shell is actually executing these tests for non-empty strings:
if test -n "a=b" ; then # THIS IS ALWAYS TRUE (good return code) !
if [ -n "a=b" ] ; then # THIS IS ALWAYS TRUE (good return code) !
if test -n "1=2" ; then # THIS IS ALWAYS TRUE (good return code) !
if [ -n "1=2" ] ; then # THIS IS ALWAYS TRUE (good return code) !
if test -n "0" ; then ... # THIS IS ALWAYS TRUE (good return code) !
if [ -n "0" ] ; then ... # THIS IS ALWAYS TRUE (good return code) !
All the above tests are always true, because the three-character strings a=b and 1=2 are not empty strings and never will be empty, and the single-character string 0 is also never the empty string.
If you want to perform equality tests, you must separate each argument by blanks so that test sees three separate arguments, not just one:
if test a = b ; then # correct 3-argument syntax
if [ a = b ] ; then # correct 3-argument syntax
if test 1 = 2 ; then # correct 3-argument syntax
if [ 1 = 2 ] ; then # correct 3-argument syntax
Always keep the arguments to test separated by blanks.
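A short sketch makes the difference visible. The helper name truth is hypothetical; it passes its arguments straight to test and prints the result.

```shell
# Hypothetical helper: print TRUE or FALSE for the given test arguments.
truth () {
    if test "$@" ; then echo TRUE ; else echo FALSE ; fi
}
truth a=b       # one argument, implied -n:    TRUE
truth a = b     # three arguments, comparison: FALSE
truth 1=2       # one argument, implied -n:    TRUE
truth 1 = 2     # three arguments, comparison: FALSE
truth 0         # one argument, implied -n:    TRUE
```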
< or > for -lt less or -gt greater
Another common mistake, usually made by programmers accustomed to other programming languages, is to use shell redirection metacharacters < and > instead of the correct operators -lt and -gt in test numeric comparisons. Here are two identical mistakes:
if test 1 > 2 ; then ... # THIS IS WRONG - must use -gt not >
if [ 1 > 2 ] ; then ... # THIS IS WRONG - must use -gt not >
The above two lines have the shell first use redirection (>) to create a file named 2 and redirect the output of the test command into it. (The test command produces no output; the file remains empty.) The test command itself is left with only one single command line argument, the digit 1. With one argument and no operators, the test command returns success if the argument is not the empty string (test -n 1). The string 1 is never empty, so the above test, and the if, always succeeds.
The correct shell scripting form does not use the redirection syntax:
if test 1 -gt 2 ; then ... # right syntax for "greater than"
if [ 1 -gt 2 ] ; then ... # right syntax for "greater than"
Do not use shell redirection metacharacters inside test expressions!
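The stray file is easy to miss, so here is a sketch that runs both forms in a scratch directory and then lists it. It assumes mktemp -d is available; the function name redirect_demo is made up for illustration.

```shell
# Hypothetical demo, run in a subshell so the cd does not affect the caller.
redirect_demo () (
    dir=$(mktemp -d) || exit 1
    cd "$dir" || exit 1
    if [ 1 > 2 ] ; then echo "1 > 2 succeeded" ; fi     # WRONG: redirection
    if [ 1 -gt 2 ] ; then
        echo "1 -gt 2 succeeded"
    else
        echo "1 -gt 2 failed"                           # RIGHT: numeric test
    fi
    ls                      # lists the empty file named 2 made by the mistake
    cd / && rm -rf "$dir"
)
redirect_demo
```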
test string equality operator is = not ==
If you’re a programmer, you’re used to doing equality comparisons using the == operator. In shell programming the test command uses the string comparison operator = (one equals) and not == (two equals):
if [ "$1" = '--help' ] ; then ... # correct syntax uses '='
if [ "$1" == '--help' ] ; then ... # WRONG !
Some shells (e.g. bash) accept the incorrect == operator as well as = to compare strings, but the /bin/sh (a link to /bin/dash) shell on Ubuntu (the CLS) is not one of them:
bash$ [ a = b ] # correct syntax uses one '='
bash$ [ a == b ] # WRONG ! but bash allows it anyway
sh$ [ a = b ] # correct syntax uses one '='
sh$ [ a == b ] # WRONG ! causes error in /bin/sh
sh: 1: [: a: unexpected operator
Always use one single equals = to compare strings.
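A minimal sketch of a portable option check follows. The function name wants_help is hypothetical, and "${1-}" is used (an assumption about how you run the script) so the check also works under the -u option when no argument is given.

```shell
# Hypothetical function: print yes if the first argument is --help.
# A single = is the portable string comparison operator.
wants_help () {
    if [ "${1-}" = '--help' ] ; then
        echo yes
    else
        echo no
    fi
}
wants_help --help
wants_help -h
wants_help
```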
This script below has no argument:
$ ./example.sh
Given the above script command line, inside the script the value of $# (the number of arguments) is zero. The value of the first argument $1 (and all following arguments) is undefined.
These script command lines below both have a single empty or null string argument:
$ ./example.sh ''
$ ./example.sh ""
Given the above script command lines, inside the script the value of $# is one because the script has one argument. The first argument itself $1 is defined but has zero characters in it:
test -z "$1" # this is TRUE inside the script
[ "$1" = '' ] # this is TRUE inside the script
An argument with no characters in it is not the same thing as a missing argument.
An argument that is a space character is not null or empty. These script command lines below all have a single string argument that contains a space character:
$ ./example.sh ' '
$ ./example.sh " "
$ ./example.sh \ # there is a space after the backslash
test -z "$1" # this is FALSE inside the script
[ "$1" = '' ] # this is FALSE inside the script
test -n "$1" # this is TRUE inside the script
[ "$1" = ' ' ] # this is TRUE inside the script
[ "$1" = " " ] # this is TRUE inside the script
[ "$1" = \ ] # this is TRUE inside the script
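Remember the difference between a missing argument, an empty (zero-length) argument, and an argument that contains a space character.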
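One way to tell the cases apart in a script is sketched here. The function name describe_arg is made up for illustration; it checks the argument count first, and only then looks at the first argument.

```shell
# Hypothetical function: classify the first argument.
describe_arg () {
    if [ "$#" -eq 0 ] ; then
        echo "missing"
    elif [ -z "$1" ] ; then
        echo "empty"
    elif [ "$1" = ' ' ] ; then
        echo "one space"
    else
        echo "other"
    fi
}
describe_arg          # missing
describe_arg ''       # empty
describe_arg ' '      # one space
```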
The exit status negation operator ! may be used to the left of any single expression used inside the test command:
if test ! -r "$file" ; then ...
if [ ! -r "$file" ] ; then ...
if test ! -z "$string" ; then ...
if [ ! -z "$string" ] ; then ...
The test command uses the exclamation point operator ! to negate/invert/complement the exit status of a Boolean test. If you combine the negation operator of test with the shell return code negation operator that also uses !, you can end up with confusing or unreadable code:
if ! test ! -r file ; then # CONFUSING
if ! [ ! "abc" != "def" ] ; then # EVEN MORE CONFUSING
Don’t use confusing double-negative logic. Rework the expression to use only a single ! or none at all:
if test -r file ; then ... # same expression as above: readable
if [ "abc" != "def" ] ; then ... # same expression as above: readable
Keep the negation operator as an argument to test; don’t place it before the opening square bracket alias to negate the return code of test. To test if a file is non-existent or exists but is not readable:
if ! [ -r file ] ; then # NO: correct but awkward (do not use)
if [ ! -r file ] ; then # YES: correct and preferred
The shell return code negation operator ! is almost never used to negate the return code of the test command itself. Always use ! as an argument to the test command, inside the square brackets, never outside.
&& or || for -a AND or -o OR
C and Java programmers sometimes use the Boolean operators AND && and OR || inside the test command, where they should be using -a or -o:
if [ $# != 1 -o -z "$1" ] ; then ... # YES: correct shell syntax
if [ $# != 1 || -z "$1" ] ; then ... # NO: incorrect C language syntax
The error messages for this incorrect use look like this:
$ if [ $# != 1 || -z "$1" ] ; then echo hi ; fi # WRONG SYNTAX
[: missing `]'
bash: -z: command not found
The Bourne shell || and && operators separate shell commands in a manner similar to the semicolon ;. You cannot use them inside test expressions.
Use -a and -o to separate Boolean clauses to the test command:
if [ $# != 1 -o -z "$1" ] ; then echo hi ; fi # RIGHT
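As a sketch, the same condition can guard a function that insists on exactly one non-empty argument. The name check_one_arg is hypothetical, "${1-}" is used so the test also works under the -u option, and the argument is assumed to be an ordinary word rather than something that looks like a test operator.

```shell
# Hypothetical function: complain unless given exactly one non-empty argument.
check_one_arg () {
    if [ "$#" != 1 -o -z "${1-}" ] ; then
        echo "need exactly one non-empty argument"
    else
        echo "ok: $1"
    fi
}
check_one_arg
check_one_arg ''
check_one_arg hello
```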
Digression (optional reading):
You can use the shell || and && command separators between individual test commands if you make sure each test command is complete:
if [ $# != 1 ] || [ -z "$1" ] ; then echo hi ; fi # valid but inefficient
Above, the || separates two different and complete test command executions. Rather than using the test command twice, you can simply join them into one using the correct test Boolean operator:
if [ $# != 1 -o -z "$1" ] ; then echo hi ; fi # RIGHT
Less code is better code.
test
The test helper command has six ways to compare numbers and two ways to compare strings. Don’t mix them up. In particular, don’t use the numeric operators to try to compare strings; the error message isn’t very obvious:
$ if [ "$1" -eq "" ] ; then echo "Empty string" ; fi
sh: [: Illegal number:
The string comparison operators are = and !=, not -eq and -ne.
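The two families of operators can give different answers for the same pair of arguments, as this sketch shows. The function name compare_both is made up for illustration, and both arguments are assumed to be decimal integers.

```shell
# Hypothetical function: compare two integers both ways.
compare_both () {
    if [ "$1" -eq "$2" ] ; then
        echo "numerically equal"
    else
        echo "numerically different"
    fi
    if [ "$1" = "$2" ] ; then
        echo "same string"
    else
        echo "different strings"
    fi
}
compare_both 5 05
```

The values 5 and 05 are the same number but not the same string, so pick the operator that matches what you mean to compare.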
test
Boolean logic has some subtle consequences when applied to the operations performed by the test helper command.
-n and -z, = and !=, -eq and -ne
The logical opposite of the test operator -n (is not an empty string) is -z (is an empty string), just as the opposite of = (string equality) is != (string inequality), and the opposite of -eq (integer equality) is -ne (integer not equal). These are all correct opposites.
-lt and -ge
The logical opposite of the test operator -lt (less than) is not -gt (greater than), it is -ge (greater than or equal to). (If you are not younger than your sister, you are either older or the same age.)
The opposite of the test operator -gt is not -lt, it is -le.
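The boundary value is where the wrong opposite shows itself. This sketch compares each number against 2; the function name compare_to_two and its output format are made up for illustration.

```shell
# Hypothetical function: show "not less than 2" next to -ge and -gt.
compare_to_two () {
    if [ ! "$1" -lt 2 ] ; then not_lt=yes ; else not_lt=no ; fi
    if [ "$1" -ge 2 ] ; then ge=yes ; else ge=no ; fi
    if [ "$1" -gt 2 ] ; then gt=yes ; else gt=no ; fi
    echo "$1: not-lt=$not_lt ge=$ge gt=$gt"
}
for n in 1 2 3 ; do
    compare_to_two "$n"
done
# Output:
# 1: not-lt=no ge=no gt=no
# 2: not-lt=yes ge=yes gt=no
# 3: not-lt=yes ge=yes gt=yes
```

The not-lt and ge columns always agree; the gt column differs at 2.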
The test operators -f and -d are not opposites. If a pathname is not a file, it may or may not be a directory. It could be a directory or any number of other special file types under Unix/Linux. (/dev/null is a common example of a pathname that is not a directory or a plain file.)
You cannot replace the test ! -f with -d or vice-versa.
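A sketch that keeps the cases separate might look like this. The function name classify is hypothetical, and the sketch assumes /dev/null is present as it is on Unix/Linux systems.

```shell
# Hypothetical function: say what kind of thing a pathname is.
classify () {
    if [ -f "$1" ] ; then
        echo "plain file"
    elif [ -d "$1" ] ; then
        echo "directory"
    elif [ -e "$1" ] ; then
        echo "exists, but neither a plain file nor a directory"
    else
        echo "missing or inaccessible"
    fi
}
classify /
classify /dev/null
```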
test pathname operators, e.g. ! -r
The test pathname operators all return success (zero) only if the pathname is accessible (all the directories can be traversed) AND the pathname exists AND if it has the given pathname property. This means that the negation/inversion of a pathname operation has to include the possibility that the pathname does not exist or that it can’t be accessed:
if [ -r file ] ; then ... # succeed if pathname is accessible and readable
if [ ! -r file ] ; then ... # succeed if pathname inaccessible, non-existent, or not readable
Inverting the status of most of the pathname operators means that the resulting test might succeed either because the pathname can’t be reached, OR the pathname doesn’t exist, OR because the pathname exists but fails the test. You need to apply more programming logic if you want to know that a pathname actually exists but is not, for example, readable:
if [ -e pathname -a ! -r pathname ] ; then ... # if path exists AND path is *not* readable
Remember that inverting a pathname test may mean the inverted test succeeds because the pathname is not accessible or does not exist!
The opposite of “pathname is readable” is “pathname is not accessible, OR pathname does not exist, OR pathname is not readable”.
test pathname tests
If a test pathname operator (e.g. -r, -w, -x, -f, -d, -s, -e) succeeds, you also know that you have permission to traverse all the directories leading up to it and that the pathname actually exists.
If a test
pathname operator fails, it may also fail because you have no permission to search one of the directories in the pathname, or because the pathname simply doesn’t exist. Without first testing if you can access the pathname and that it actually exists, the following error message is misleading:
if [ ! -r "$path" ] ; then
echo 1>&2 "$0: '$path' is not readable" # POOR ERROR MESSAGE
fi
While it is true that the pathname is not readable, the above error message is incomplete. You might not have permission to traverse all the directories in its pathname, or, the pathname might not even exist. Saying the overall pathname is not readable is true, but it is only part of the truth. A more accurate error message would be:
if [ ! -r "$path" ] ; then
echo 1>&2 "$0: '$path' is inaccessible, missing, or not readable"
fi
If you want to be more specific in your error message about why the pathname is not readable, you need code to test for existence first:
if [ ! -e "$path" ] ; then
    echo 1>&2 "$0: '$path' does not exist or is not accessible by you"
else
    # the pathname exists and is accessible; test readability:
    if [ ! -r "$path" ] ; then
        echo 1>&2 "$0: '$path' exists but is not readable by you"
    fi
fi
The test for readability is now done only if the pathname exists and is accessible; if the test for readability fails, you know the (existing, accessible) pathname item is truly not readable. The error message is more accurate now.
Any time one of the test pathname operator tests fails, be accurate in your error message. State whether the failure is due to a missing or inaccessible pathname, or due to a failure of the actual test being performed on the (existing, accessible) pathname.
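If several scripts need the same check, the logic can be kept in one place. This is only a sketch; the function name explain_access and the wording of its messages are made up for illustration.

```shell
# Hypothetical function: say why a pathname can or cannot be read.
explain_access () {
    if [ ! -e "$1" ] ; then
        echo "'$1' does not exist or is not accessible"
    elif [ ! -r "$1" ] ; then
        echo "'$1' exists but is not readable"
    else
        echo "'$1' is readable"
    fi
}
explain_access /no/such/path
explain_access /
```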
test expressions cloud error message
Be careful in if statements when testing multiple conditions at the same time that you do not make the failure error message unhelpful:
if [ $x -gt 0 -a -f "$file" -a $y -lt 27 -a -n "$string" ] ; then
... do something useful ...
else
echo 1>&2 "$0: Error: ... what do you say here ??? ..."
fi
The ??? error message above would have to say what failed, and there are so many possibilities for failure that the message becomes unreadable. The error would have to read like this: Error: $x is <= 0 or $file is inaccessible, does not exist, or is not a file, or $y is >= 27, or '$string' is a null string. Which failure was it? Such a complex error message is not helpful to the users of your scripts!
Use separate tests and separate error messages for each test condition; don’t bunch them together using Boolean -a or -o operators:
# Split the huge condition into more readable error messages.
# Test each condition separately and exit if any condition fails.
#
if [ $x -le 0 ] ; then
echo 1>&2 "$0: Error: x value $x is <= 0"
exit 1
fi
if [ ! -f "$file" ] ; then
echo 1>&2 "$0: Error: path '$file' is inaccessible, does not exist, or is not a file"
exit 1
fi
if [ $y -ge 27 ] ; then
echo 1>&2 "$0: Error: y value $y is >= 27"
exit 1
fi
if [ -z "$string" ] ; then
echo 1>&2 "$0: Error: string value '$string' is a null string"
exit 1
fi
... all tests passed; now do something useful ...
Less code is better code.
Consider this correct but amateur shell script code:
fgrep "foo" /etc/passwd >/dev/null
if [ $? -eq 0 ] ; then
echo "I found foo in the password file"
fi
The programmer forgot that the if statement can directly test the return code of the command it executes. Calling up the test command to examine the shell variable for the return code of the previous command is superfluous. The “less code” version of the above amateur code is:
if fgrep "foo" /etc/passwd >/dev/null ; then
echo "I found foo in the password file"
fi
A real pro might have read the manual page for fgrep and know that fgrep has a --quiet (-q) option to suppress output, so the pro version becomes:
if fgrep -q "foo" /etc/passwd ; then
echo "I found foo in the password file"
fi
Don’t write more code than you need to. Less code is better code.