Winter 2016 - January to April 2016 - Updated 2019-01-29 15:21 EST
Shell redirection is a powerful way to change where commands read their input from and where they send their output. It applies to every command run by the shell, so once you learn the shell syntax, it works with every command you can type into the shell.
If you want to hear a simple explanation of the power of shell redirection directly from the bearded men who invented it back in the 1970s, watch either of these historic 1982 videos on The UNIX Operating System:
- UNIX: Making Computers More Productive (27 minutes).
- UNIX: Making Computers Easier To Use (23 minutes).
You can redirect the input of a command away from your keyboard, and you can redirect the output of a command away from your screen.
The redirection can be to or from a file using the shell meta-characters ‘>’ or ‘<’ (angle brackets), or it can be to or from a program using the shell meta-character ‘|’ (the pipe symbol).
Commands produce two kinds of output – normal output and error message output – and the shell can redirect each of these separately.
stdout
In the process of output redirection, the shell (not the command) redirects (diverts) most command output that would normally appear on the screen to some other place. The redirection can be either into a file using the file output redirect meta-character ‘>’, or into the input of another command by separating the commands using the pipe meta-character ‘|’:
$ echo hi >outfile # redirect echo output into an output file "outfile"
$ echo hi | wc # redirect echo output into the program "wc"
The normal command output that appears on your screen is called the standard output of the command, abbreviated stdout. It is the expected output that a command generates if everything works. Anything written to stdout can be redirected as shown in the above examples.
stderr
If something goes wrong, commands produce error messages. These error messages are sent to what is called the standard error output of the command, abbreviated stderr. Error messages are almost always sent to your screen, even if you redirect stdout somewhere else:
$ ls nosuchfile >outfile
ls: cannot access nosuchfile: No such file or directory
Standard error output is not subject to simple output redirection, but with extra syntax the shell can redirect it, too, into a file:
$ ls nosuchfile >outfile 2>errors
In programming terms, stdout and stderr outputs are written to different I/O units, but both end up on your screen, by default. The shell can redirect them separately. More on that later.
You need to remember these four things about output redirection:
Redirection is done for the command by the shell, first, before finding and running the command; the shell has no idea whether the command exists or will produce any output.
The shell can only redirect output that is produced by the command. You can only redirect the output that you can see. If there is no visible output on your screen without redirection, adding redirection won’t create any. Re-read this a few times and remember it.
Before you redirect output into a file or a pipe, look at what output is on your screen with no redirection added. If what appears on your screen isn’t right without redirection, adding redirection won’t make it right.
Redirection can only go to one place. You can’t use multiple redirections to send output to multiple places. (See the tee command for a way to send output into multiple files.)
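As a sketch of what tee does (it copies its standard input into each named file and also onto its own standard output, so one stream can land in several places):

```shell
cd "$(mktemp -d)"                 # work in a scratch directory

# tee writes "hello" into copy1 and copy2, and its own stdout is
# redirected into copy3 -- three copies from one stream
echo hello | tee copy1 copy2 >copy3

cat copy1 copy2 copy3             # hello, three times
```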
By default, error messages (called “standard error” or stderr) are not redirected; only “normal output” (called “standard output” or stdout) is redirected. (You can also redirect stderr with more shell syntax; see below.)
Summary of Rules for Output Redirection:
- Redirection is done first, before running the command.
- You can only redirect output that you can (could) see.
- Redirection goes only to one place.
- Only standard output is redirected, by default.
We will discuss each of these rules in detail, below.
>
The shell meta-character right angle bracket ‘>’ signals that the next word on the command line is an output file (not a program) that the shell should create or truncate (set to empty) and make ready to receive the standard output of a command. Standard output is the normal output that appears on your screen when you run the command:
$ echo hi # normal output appears on screen; no redirection
hi
$ echo hi >outfile # output goes into file "outfile", not on screen
$ cat outfile # display the contents of outfile on screen
hi
The space before the angle bracket ‘>’ is sometimes required, so always include it and you won’t go wrong. The space after the ‘>’ is optional and most people omit it:
$ echo hi >outfile # this is the most common usage
$ echo hi > outfile # space is optional after the >
$ echo hi>outfile # AVOID THIS - always put a space before the >
Putting the space in front of ‘>’ makes the command line easier to read.
Output redirection means the shell truncates (makes empty) an existing file. An existing file will have its contents removed:
$ echo hello >out # "hello" goes into the file "out"
$ cat out
hello
$ nosuchcommandxxx >out # "out" is made empty again
sh: nosuchcommandxxx: command not found
$ cat out # command failed -- "out" is still empty
$
The shell always makes the output file empty before trying to run the command. If the command fails or doesn’t produce any standard output, the redirection output file remains empty.
It is the shell that does all redirection, not the command being run.
For output redirection into files, the shell creates or truncates the output file and sets up the redirection, not the command being redirected. The command knows nothing about the redirection – the redirection syntax is removed from the command line by the shell before the command is found and executed:
$ echo one two three # echo has three command line arguments
one two three
$ echo one two three >out # echo still has three arguments
$ cat out
one two three
Shells handle redirection before they go looking for the command name to run. Indeed, redirection happens even if the command is not found or if there is no command at all:
$ >out # file "out" is created or truncated empty
$ wc out
0 0 0 out # shell created an empty file
$ nosuchcommandxxx >out # file "out" is created empty
sh: nosuchcommandxxx: command not found
$ wc out
0 0 0 out # shell created an empty file
The shell creates or truncates the file “out” empty, and then it tries to find and run the nonexistent command and fails. The empty file that was created by the shell remains.
The shell does the output file creation and truncation before it finds or runs the command. This can affect the output of commands that operate on files:
$ mkdir empty
$ cd empty
$ ls -l
total 0 # no files found
$ ls -l >out # shell creates "out" first
$ cat out # display output
total 0
-rw-r--r-- 1 idallen idallen 0 Sep 21 06:02 out
The shell creates the file “out” before it runs the ls command, so the ls command finds the new output file, and the output of ls with redirection is different from the output of ls without it.
Because the shell truncates the redirection output file before the shell looks for and runs the command, you cannot use output redirection files as input to the same command:
$ cp /etc/passwd a
$ sort a >a # WRONG WRONG WRONG! File "a" is truncated to be empty!
$ cat a # shows empty file
$
Above, the shell makes the file “a” empty before it runs the sort command. The sort command sorts an empty file, and the empty output goes into the file “a”, which remains empty.
Never use redirection output files as input files!
Redirection does not create output that wasn’t there without redirection. If you don’t see any output from a command without redirection, adding redirection to the command won’t cause the command to create output.
Adding redirection to a command that generates no output will simply have the shell redirect no output and create an empty file:
$ cp /etc/passwd x # no output visible on standard output
$ cp /etc/passwd x >out # file "out" is created empty
$ cd /tmp # no output visible on standard output
$ cd /tmp >out # file "out" is created empty
$ touch x ; rm x # no output from rm on standard output
$ touch x ; rm x >out # file "out" is created empty
You can only redirect the output that you can see! Run a command without redirection and observe what output the command produces. If you don’t see any output, adding redirection won’t create any.
Output redirection can only go to one place, so adding multiple output redirection to a command line doesn’t do what you might think:
$ date >a >b >c # output goes into file c; a and b are empty
The right-most output file redirection always gets all the output and the other output redirection files to the left are created empty by the shell.
If you redirect output into both a file and a pipe, the file gets all the output and the pipe gets nothing:
$ date >a | cat # all output goes into file "a"; cat shows nothing
See the following section on redirection into programs using “|” pipes.
By default, the shell only redirects the standard output (normal output) of commands. Error messages from commands are not redirected into the output file and still appear directly on your screen:
$ ls -l nosuchfile /etc/passwd
ls: cannot access nosuchfile: No such file or directory
-rw-r--r-- 1 root root 2542 Jun 24 2014 /etc/passwd
$ ls -l nosuchfile /etc/passwd >out # standard output goes into "out"
ls: cannot access nosuchfile: No such file or directory
$ cat out # show contents of "out"
-rw-r--r-- 1 root root 2542 Jun 24 2014 /etc/passwd
Error messages continue to appear on your screen, even with redirection, so that you know the command you are using had an error.
Shells don’t care where in the command line you put the output file redirection. No matter where in the command line you type it, it has the same effect, though most people put it at the end, as in the first example below:
$ echo hi there mom >file # echo has three arguments
$ echo hi there >file mom # echo has three arguments
$ echo hi >file there mom # echo has three arguments
$ echo >file hi there mom # echo has three arguments
$ >file echo hi there mom # echo has three arguments
All the command lines above are equivalent and create the same output file. The redirection syntax is processed by and removed by the shell before the command runs. The redirection syntax is never counted as arguments to a command.
In every case above the echo command sees exactly three command line arguments, and all three arguments “hi”, “there”, and “mom” are redirected into the output file “file”.
The file redirection is found and done first by the shell, then the redirection syntax is removed from the command line before the command is called. The command actually being run doesn’t see any part of the redirection syntax; the number of arguments is not affected.
>>
You can append (add) output to the end of a file, instead of truncating it to empty, using a double right angle bracket “>>”:
$ echo first line >file # file is truncated using single >
$ echo second line >>file # append second line to end of file using >>
$ echo third line >>file # append third line to end of file using >>
$ cat file # display what is in the file
first line
second line
third line
The redirection output file is always truncated (set to empty) before the command runs, unless you use the append syntax “>>” instead of “>”.
Redirection is done by the shell, first, before it finds and runs the command. Things happen in this order on the command line:
1. All redirection is found and done by the shell, no matter where the redirection is typed in the command line. (If a redirection fails, the shell does not run any command.)
2. Every redirection output file is created or truncated to empty (except when appending with “>>”). This truncation happens even if no command executes, and always before any command runs.
3. The shell removes all the redirection syntax from the command line. The command will have no idea that its output is being redirected.
4. The command (if any) executes and may produce output; the shell runs the command only after doing all the redirection.
5. The output from the command (if any) goes into the indicated redirection output file. This happens last. If the command produces no output, the output file will be empty; adding redirection never creates output. Standard error output (error messages) is not redirected; it goes onto your screen.
Explain this sequence of commands:
$ mkdir empty
$ cd empty
$ cp a b
cp: cannot stat 'a': No such file or directory
$ cp a b >a
$ # why is there no error message from cp this time? what is in file a ?
Explain this sequence of commands:
$ date >a
$ cat a
Wed Feb 8 03:01:21 EST 2012
$ cp a b
$ cat b
Wed Feb 8 03:01:21 EST 2012
$ cp a b >a
$ cat b
$ # why is file b empty? what is in file a ?
Explain this sequence of commands:
$ rm
rm: missing operand
$ touch file
$ rm >file
rm: missing operand # why doesn't rm remove "file"?
$ rm nosuchfile
rm: cannot remove 'nosuchfile': No such file or directory
$ rm nosuchfile >nosuchfile
$ # why is there no rm error message here?
Is the file nosuchfile in existence after the last command, above?
How many words are in each of these five output files?
$ echo one two three >out1
$ echo one two >out2 three
$ echo one >out3 two three
$ echo >out4 one two three
$ >out5 echo one two three
What is in each file a, b, and c, after this command line?
$ echo hi >a >b >c
Most commands have two separate output “streams” or “units”, numbered 1 and 2:
- Unit 1 is the standard output (stdout): the normal, expected output of the command.
- Unit 2 is the standard error output (stderr): error messages.
The stdout (stream 1) and stderr (stream 2) outputs mix together on your screen. They look the same on the screen, so you can’t tell by looking at your screen what comes out of a program on stdout and what comes out of a program on stderr.
To show a simple example of stdout and stderr both appearing on your screen, use the ls command and give it one file name that exists and one name that does not exist (and thus causes an error message to be displayed on standard error output):
$ ls /etc/passwd nosuchfile # no redirection used
ls: nosuchfile: No such file or directory # this on screen from stderr
/etc/passwd # this on screen from stdout
Both output streams look the same on your screen. The stderr (error message) output often appears first, before stdout, due to internal I/O buffers used by commands for stdout.
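One way to tell the two streams apart is to discard one of them into the system’s /dev/null “bit bucket” and see what remains; a sketch:

```shell
# Discard stderr (unit 2): only the stdout line remains on screen
ls /etc/passwd nosuchfile 2>/dev/null

# Discard stdout (unit 1): only the stderr message remains on screen
ls /etc/passwd nosuchfile >/dev/null
```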
The default type of output redirection (whether redirecting to files or to programs using pipes) redirects only standard output and lets standard error go, untouched, to your screen:
$ ls /etc/passwd nosuchfile >out # shell redirects only stdout
ls: nosuchfile: No such file or directory # only stderr appears on screen
$ cat out
/etc/passwd # stdout went into the file
Programming information (for programmers only):
- The Standard Output is the Unit 1 output from printf and cout statements in C and C++ programs, and from System.out.print and System.out.println in Java.
- The Standard Error Output is the Unit 2 output from fprintf(stderr,...) and cerr statements in C and C++ programs, and from System.err.print and System.err.println in Java.
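From the shell itself you can send a message out on either unit; a sketch (echo normally writes on unit 1, and “1>&2” sends that output to wherever unit 2 currently goes):

```shell
echo "normal message"          # written to stdout (unit 1)
echo "error message" 1>&2      # written to stderr (unit 2)
```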
2>outfile
Normally, both stdout and stderr appear together on your screen, and redirection only redirects stdout and not stderr.
You can redirect stdout and stderr separately into files using a unit number immediately before the right angle-bracket ‘>’ meta-character:
- >outfile or 1>outfile redirects stdout (unit 1)
- 2>outfile redirects stderr (unit 2)
Put the unit number immediately (no blank) before the ‘>’ meta-character to redirect just that output stream:
$ ls /etc/passwd nosuchfile 2>errors # shell redirects only stderr (unit 2)
/etc/passwd # only stdout (unit 1) appears on screen
$ cat errors
ls: nosuchfile: No such file or directory # stderr unit 2 went into file "errors"
If you don’t use a unit number before the right angle-bracket ‘>’, the default is to assume Unit 1 and redirect just standard output. The default output redirection syntax >foo (no preceding unit number) is a shell shorthand for 1>foo, so that >foo and 1>foo are identical.
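A quick sketch showing that the two spellings behave identically:

```shell
cd "$(mktemp -d)"        # scratch directory

echo hi >with_default    # shorthand form: unit 1 assumed
echo hi 1>with_unit      # explicit unit 1: identical effect

cmp with_default with_unit && echo "files are identical"
```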
You can redirect stdout (unit 1) into one file and stderr (unit 2) into another file using two redirections on the same command line:
$ ls /etc/passwd nosuchfile >out 2>errors # shell redirects each one
$ # nothing appears on screen
$ cat out
/etc/passwd # stdout unit 1 went into "out"
$ cat errors
ls: nosuchfile: No such file or directory # stderr unit 2 went into "errors"
Always use different output file names if you redirect both units. Do not redirect both units into the same output file name; the two outputs will overwrite each other. See the next section.
2>&1
You need the special syntax “2>&1” to redirect both stdout and stderr safely together into a single file. Read the syntax “2>&1” as “send unit 2 to the same place as unit 1”:
$ ls /etc/passwd nosuchfile >both 2>&1 # redirect both into same file
$ # nothing appears on screen
$ cat both
ls: nosuchfile: No such file or directory
/etc/passwd
The order of the redirections >both and 2>&1 on the command line matters! The stdout redirect “>both” (unit 1) must come first (to the left of) the stderr redirect “2>&1” (unit 2), because you must set where unit 1 goes before you send unit 2 “to the same place as unit 1”. Don’t reverse these! Remember: 1 comes before 2.
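As a sketch of what the wrong order does (assuming a Bourne-family shell such as bash, which processes redirections left to right):

```shell
cd "$(mktemp -d)"            # scratch directory

# WRONG ORDER: "2>&1" is processed first, while unit 1 still points at
# the screen, so stderr stays on the screen; then ">wrongorder" moves
# only unit 1 into the file.
ls /etc/passwd nosuchfile 2>&1 >wrongorder

# RIGHT ORDER: unit 1 goes into the file first, then unit 2 follows it.
ls /etc/passwd nosuchfile >rightorder 2>&1

cat wrongorder               # only the stdout line: /etc/passwd
cat rightorder               # both the error message and /etc/passwd
```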
You must use the special syntax “>both 2>&1” to put both stdout and stderr into the same file. Don’t use the following, which is not the same:
$ ls /etc/passwd nosuchfile >wrong 2>wrong # WRONG! DO NOT DO THIS!
$ cat wrong
/etc/passwd
ccess nosuchfile: No such file or directory
The WRONG example above shows how stderr and stdout overwrite each other and produce a mangled output file; don’t do this. Use 2>&1 to send stderr into the same file as stdout.
The modern Bourne shells now have a special shorter syntax for redirecting both stdout and stderr into the same output file:
$ ls /etc/passwd nosuchfile &>both # redirect both into same file
$ # nothing appears on screen
$ cat both
ls: nosuchfile: No such file or directory
/etc/passwd
You can now use either “&>both” or “>both 2>&1”, but only the latter works in every version of the Bourne shell (back to the 1970s!). When writing shell scripts, use the “>both 2>&1” version for maximum portability. Don’t rely on &>both working everywhere.
The most common redirection mistake is to use a redirection output file as a command argument input file. There is an obvious way to get this wrong, and a hidden way to get this wrong:
sort out >out
Suppose you want to sort a file and put the sorted output back into the same file. (The sort command is used as the example program below – anything that reads the content of a file and produces output is at risk.) This is the WRONG way to do it:
$ cp /etc/passwd out
$ sort out >out # WRONG! Redirection output file is used as sort input file!
$ cat out
$ # File is empty!
Here is the problem with the sort out >out command above:
1. First, the shell truncates (empties) the redirection output file “out” and gets it ready to receive the standard output of the command being run: any previous contents of file “out” are lost – truncated – GONE! – before the shell even goes looking for the sort command to run!
2. The shell then finds and runs the sort command with its one file name argument “out”, which is now an empty file.
3. The sort command opens the empty argument file “out” for reading. Sorting an empty file produces no output.
4. The shell has redirected standard output into file “out”, so the “no output” goes into file out; the file remains empty.
Result: File “out” is always empty, no matter what was in it before.
There are two safe and correct ways to do this, one of which depends on a special output file feature of the sort command (that may not be available in other commands):
$ sort out >tmp && mv tmp out # sort into tmp file and rename tmp to out
$ sort -o out out # use special sort output file option
Here is another incorrect example that uses the same redirection output file as an input file. The result is wrong but is not an empty file this time:
$ date >out
$ wc out # count the lines, words, and characters in file "out"
1 6 29 out
$ wc out >out # WRONG! Redirection output file is used as input file!
$ cat out
0 0 0 out # Using wc on an empty file produces zeroes!
Here is the problem with the wc out >out command above:
1. First, the shell truncates (empties) the redirection output file “out” and gets it ready to receive the standard output of the command being run: any previous contents of file “out” are lost – truncated – GONE! – before the shell even goes looking for the wc command to run!
2. The shell then finds and runs the wc command with its one file name argument “out”, which is now an empty file.
3. The wc command opens the empty argument file “out” for reading. It counts the lines, words, and characters of an empty file and produces one line of output: 0 0 0 out
4. The shell has redirected standard output into file “out”, so that one line of output goes into file out. The file shows all zeroes, not the word count of the original date.
Result: File “out” always shows zeroes, not the count of the original content.
Here is the only safe and correct way to do this with wc:
$ wc out >tmp && mv tmp out # do output redirection into tmp file and move it
Other incorrect redirection examples that DO NOT WORK because the redirection output file is being used as an input file:
$ head file >file # ALWAYS creates an EMPTY FILE
$ tail file >file # ALWAYS creates an EMPTY FILE
$ uniq file >file # ALWAYS creates an EMPTY FILE
$ cat file >file # ALWAYS creates an EMPTY FILE
$ sort file >file # ALWAYS creates an EMPTY FILE
$ fgrep 'foo' file >file # ALWAYS creates an EMPTY FILE
$ wc file >file # ALWAYS counts an EMPTY FILE (0 0 0)
$ sum file >file # ALWAYS checksums an EMPTY FILE (0)
...etc...
Do not use a redirection output file as an input to a program or a pipeline! Never use the same file name for both input and redirection output – the shell will truncate the file before the command reads it.
Here are the four rules for Output Redirection again:
- Redirection is done first, before running the command.
- You can only redirect output that you can (could) see.
- Redirection goes only to one place.
- Only standard output is redirected, by default.
Never use a redirection output file as an input file!
Many Unix/Linux commands read input from files, if file pathnames are given on the command line. If no file names are given, these commands usually read from what is called Standard Input (“stdin”), which is usually connected to your keyboard. (You can send EOF by typing ^D (Ctrl-D) to get the command to stop reading your keyboard.)
Here is an example of the nl command reading from a file, then reading from stdin (your keyboard) when no files are supplied:
$ nl /etc/passwd # nl reads content from the file /etc/passwd
[...many lines print here, with line numbers...]
$
$ nl # no files; nl reads standard input (your keyboard)
foo # you type this line and push ENTER
1 foo # this is the line as numbered and output by nl
bar # you type this line and push ENTER
2 bar # this is the line as numbered and output by nl
^D # you signal keyboard EOF by typing ^D (CTRL-D)
$
Examples of commands that may read from pathnames or, if not given any pathnames, from standard input:
less, more, cat, head, tail, sort, wc, grep, fgrep, nl, uniq, etc.
Commands such as the above may read standard input. They will read standard input (which may be your keyboard) only if there are no pathnames to read on the command line:
$ cat foo # cat opens and reads file "foo"; cat completely ignores stdin
$ cat # cat opens and reads standard input = your keyboard; use ^D for EOF
$ tail foo # tail opens and reads "foo"; tail completely ignores stdin
$ tail # tail opens and reads standard input = your keyboard; use ^D for EOF
$ wc foo # wc opens and reads file "foo"; wc completely ignores stdin
$ wc # wc opens and reads standard input = your keyboard; use ^D for EOF
The above is true for all commands that can read from stdin. They only read from stdin if there are no pathnames given on the command line.
To tell a command to stop reading your keyboard, send it an EOF (End-Of-File) indication, usually by typing ^D (Control-D). If you interrupt the command (e.g. by typing ^C), you usually kill the command and it may not produce any output at all.
Not all commands read from standard input, because not all commands read data from files supplied on the command line. Examples of common Unix/Linux commands that don’t read any data from files or standard input:
ls, date, who, pwd, echo, cd, hostname, ps, sleep # etc. NEVER READ DATA from STDIN
All the above commands have in common the fact that they never open any files for reading on the command line. If a command never reads any data from any files, it will never read any data from standard input, and it will never read data from your keyboard or anywhere else.
The Unix/Linux copy command cp obviously reads content from files, but it never reads file data from standard input because, as written, it always has to have both a source and a destination pathname argument. The cp command must always have an input file name; it never reads file data from standard input.
<file
The shell meta-character left angle-bracket ‘<’ signals that the next word on the command line is an input file (not a program) whose content the shell should make available to a command on standard input. Standard Input is the place that many commands read when they don’t have any pathnames to open. The command may or may not actually read the input made available by the shell; the shell can’t know that.
Using the shell meta-character ‘<’ to do input redirection, the shell changes where standard input comes from for a command: it no longer comes from your keyboard but instead from the specified input file.
$ nl # no files; nl reads standard input (your keyboard)
foo # you type this line and push ENTER
1 foo # this is the line as numbered and output by nl
^D # you signal keyboard EOF by typing ^D (CTRL-D)
$
$ nl </etc/passwd # no files; nl reads from standard input (/etc/passwd)
[...many lines print here, with line numbers...]
$
You can only usefully use standard input redirection on a command that would otherwise read your keyboard. If the command doesn’t read your keyboard (standard input) without the redirection, adding the redirection does nothing and is ignored. The redirection only works if, without redirection, the command would read your keyboard.
If (and only if!) a command reads from standard input, the redirected standard input will cause the program to read from whatever file the shell attaches to standard input. Here are examples using the shell to attach files to commands that are all reading standard input:
$ cat file # reads from file "file"
$ cat # reads from stdin (from your keyboard)
$ cat <file # reads from stdin that is now from file "file"
$ cat file <bar # reads from file "file" and ignores stdin file "bar"
$ head file # reads from file "file"
$ head # reads from stdin (from your keyboard)
$ head <file # reads from stdin that is now from file "file"
$ head file <bar # reads from file "file" and ignores stdin file "bar"
$ sort file # reads from file "file"
$ sort # reads from stdin (from your keyboard)
$ sort <file # reads from stdin that is now from file "file"
$ sort file <bar # reads from file "file" and ignores stdin file "bar"
The above is true for all commands that can read from stdin. They only read from stdin if there are no pathnames given on the command line.
The shell does not know which commands will actually read input from standard input; you can attach a file on standard input to any command. A command that ignores standard input will ignore the attached file.
If a command is not reading from standard input, redirecting input into the command will be ignored and do nothing. The shell cannot force a command to read any data from standard input.
For example, the date command and the sleep command never read any data from standard input, and you can’t force them to do so by adding redirection. The redirection is just ignored:
$ date # date never reads stdin
Thu Feb 16 05:48:13 EST 2012
$ date <file # date never reads stdin and ignores <file
Thu Feb 16 05:48:15 EST 2012
$ sleep 10 # sleep never reads stdin
$ sleep 10 <file # sleep never reads stdin; ignores <file
$ sleep <file # sleep never reads stdin; ignores <file
sleep: too few arguments
Many other common commands never read standard input, and so adding input redirection to these commands does nothing useful:
$ ls -l /bin # show pathnames under /bin
$ ls -l /bin <input # no difference; ls never reads stdin
$ cd /bin # change to the /bin directory
$ cd /bin <input # no difference; cd never reads stdin
$ cp foo bar # cp reads data from foo and writes to bar
$ cp foo bar <file # no difference; cp never reads stdin
Commands have to want to read stdin. The shell can’t force it.
Commands that take pathname arguments do not read standard input if any pathnames are present on the command line. If supplied with pathname arguments, the commands always read the pathnames and ignore stdin.
Here are more examples that DO NOT WORK as input redirection because the command was not reading from standard input when redirection was added. The following command lines all ignore standard input, because all the commands have been given file name arguments to read instead:
$ cat file1 <file2 # cat reads from "file1", ignores stdin <file2
$ sort file1 <file2 # sort reads from "file1", ignores stdin <file2
$ head file1 <file2 # head reads from "file1", ignores stdin <file2
$ tail file1 <file2 # tail reads from "file1", ignores stdin <file2
The above is true for all commands that can read from stdin. They only read from stdin if there are no pathnames given on the command line.
If there are pathname arguments on the command line, stdin is not used. In all the above incorrect examples, the shell will open the file file2 and attach it, ready on stdin for the command to read; the command itself will ignore stdin and read from the file1 pathname argument supplied on the command line. The input redirection <file2 attached on standard input is ignored because the command is reading from the pathname argument.
Commands never read both pathname arguments and standard input; it’s one or the other, and command pathname arguments are always used instead of stdin.
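A sketch demonstrating that a pathname argument always wins over redirected stdin:

```shell
cd "$(mktemp -d)"                # scratch directory
echo "from file1" >file1
echo "from file2" >file2

# cat has a pathname argument, so the stdin attached from file2 is
# completely ignored; only file1's content is printed
cat file1 <file2
```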
wc file vs. wc <file
If a file can be supplied as a command line pathname or attached to a command via standard input, what is the difference? Below are the differences between “wc file” and “wc <file”:
We assume we have put the current date and time into a file:
$ date >file
wc file
$ wc file
1 6 29 file
- The wc command has a pathname argument, which means it ignores stdin.
- The wc command reads data from the file “file” that it opens itself.
- It is the wc command that opens the file argument “file”, not the shell.
- Any “cannot open” error message will come from the wc command, not the shell, and will mention the file name given on the command line, e.g.:
$ wc /etc/shadow
wc: /etc/shadow: Permission denied
Note how it is the wc program that issues the error message, above.
wc <file
Rather than giving the command a pathname argument, you might instead redirect input to the command using shell input redirection:
$ wc <file
1 6 29
- The shell opens the file “file”, which means standard input for the wc program will come from the file named “file”.
- The wc command has no pathname arguments, which means it will read from standard input, opened by the shell.
- It is the shell that opens the file “file”, not the wc command.
- Because wc has no file name, it can’t print a file name in the output.
- Any “cannot open” error message will come from the shell, not the wc command, and the shell will be the one mentioning the file name, e.g.:
$ wc </etc/shadow
-bash: /etc/shadow: Permission denied
Note how it is the bash shell that issues the error message, above. Because the shell cannot open the file, it will not even look for or run the wc program. Redirection I/O errors mean that no command will be run.
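A sketch of a failed redirection preventing the command from running at all (assuming the directory /nosuchdir does not exist on your system):

```shell
# The shell must open the redirection output file before running the
# command. If the open fails, the command is never run at all.
echo hi >/nosuchdir/out    # shell prints an error; echo never executes
```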
wc file vs. wc <file
For commands that display their input pathnames in their output, the difference between giving a pathname on the command line or using stdin is more significant. Normally, the pathname is passed to the command, so the command knows the pathname and prints the name in the output:
$ wc -l /etc/passwd
44 /etc/passwd
wc
was passed the file name /etc/passwd
as a command line pathname argument and so wc
has to open the file itself and knows its namewc
command knows the file name, so it prints the name in the outputIf no pathnames are supplied on the command line and all the data comes from standard input, there is no pathname available to the command to indicate in the output:
$ wc -l </etc/passwd
44
/etc/passwd
, which means standard input for the wc
command will come from the file /etc/passwd
wc
command has no pathname arguments, which means it will read from standard input, opened by the shellwc
command has no pathname arguments, which means it does not know the name of the file it is reading from stdin/etc/passwd
, not the wc
commandwc
command doesn’t know the file name; only the shell knows the namewc
does not print any file name; it wasn’t given any file namewc
cannot know the file name, since it didn’t open the fileThe above input redirection trick can be useful to get just the number of lines in a file, without also getting the file name as well:
$ echo "The number of lines is:" ; wc -l /etc/passwd
The number of lines is:
44 /etc/passwd # wrong - "44 /etc/passwd" is not a number
$ echo "The number of lines is:" ; wc -l </etc/passwd
The number of lines is:
44 # correct - just the number, no name
You already know that using an output redirection file as an input file name argument doesn’t work because the file is truncated by the output redirection. The same is true if you use the output redirection file name as an input redirection file name. Don’t do it:
$ cat <myfile >myfile # WRONG! myfile is truncated empty!
$ sort <myfile >myfile # WRONG! myfile is truncated empty!
$ head <myfile >myfile # WRONG! myfile is truncated empty!
$ tr <myfile >myfile # WRONG! myfile is truncated empty!
Given the above, why is myfile not left empty in the following case?
$ wc <myfile >myfile # WRONG! myfile is truncated empty!
$ cat myfile # What is in the file "myfile" now?
Hint: What happens when wc counts nothing? Is there no output?
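A self-contained demonstration of the hint above, using a throw-away file name demo_file (made up for illustration):

```shell
# Create a small file, then misuse it as both input and output of wc.
printf 'hello world\n' > demo_file
wc <demo_file >demo_file    # shell truncates demo_file BEFORE wc reads it
cat demo_file               # shows what wc counted: three zeroes

# wc counting nothing still produces one line of output ("0 0 0"),
# and that one line is what ends up in the truncated file.
wc </dev/null
```

So myfile is not left empty: it ends up containing the one line of counts that wc produced from its (empty) input.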
| (pipes)

Since the shell can redirect both the output of programs and the input of programs, it can connect (redirect) the output of one program directly into the input of another program without using any files in between. This output-to-input redirection is called piping and uses the “pipe” meta-character ‘|’ that is usually located above the backslash key ‘\’ on your keyboard. Using it looks like this:
$ date
Mon Feb 27 06:37:52 EST 2012
$ date | wc # wc counts the output of "date"
1 6 29
Here are three major rules that apply to useful pipes:
- Pipe redirection is done by the shell, first, before file redirection.
- The command on the left of the pipe must produce some standard output.
- The command on the right of the pipe must want to read standard input.
| between commands

The shell meta-character | (“pipe”) is similar to the semicolon ; in that it signals the start of another command on the command line. The pipe is different because the standard output (only stdout; not stderr) of the command on the immediate left of the pipe | is attached/connected (“piped”) to the standard input of the command on the immediate right of the pipe:
$ date
Mon Feb 27 06:37:52 EST 2012
$ date | wc # wc counts the output of "date"
1 6 29
$ echo hi
hi
$ echo hi | wc # wc counts the output of "echo hi"
1 1 3
(Note that the invisible newline character at the end of a line is also counted by wc in the above example.)
It is the shell that is redirecting the standard output of the command on the left into the standard input of the command on the right. As with all redirection, the shell does this redirection before it finds and runs any commands. The commands themselves do not see the redirection.
You can approximate some of the behaviour of a pipe between two commands by using an intermediate file to store the output of the first command before using the second command to read that output:
$ nl /etc/passwd >out # save all the first command's standard output in a file
$ head <out # use the file as standard input for the second command
[...first ten line-numbered lines display here...]
If you use an intermediate file instead of a pipe, the first command has to finish and put all its output into the intermediate file before the shell can find and run the next command to read the file containing the output of the first command. This is true even when, as in the above example, the second command will only use the first few lines of output from the first command. Without using a pipe, the nl command has to line-number the entire password file before we can run the second command to see the first ten lines. Using a pipe, the output from nl flows into head until ten lines have been displayed, then both commands exit:
$ nl /etc/passwd | head # use a pipe instead of a temporary file
If the first command takes a long time to run, using a temporary file means an unnecessary delay. Without using pipes:
$ find / -ls >out # huge output of find has to finish first (slow)
$ less out # now we can display the output of "find"
Using a pipe, the output from find can start to appear in less right away, before the find command has finished generating all the output:
$ find / -ls | less # huge output of find goes directly into "less"
Pipes don’t need to wait for the first command to finish before the second command starts reading the output of the first. The output starts flowing immediately through the pipe because both commands are actually running simultaneously.
The pipe also requires no intermediate file to hold the output of the first command, and so as soon as the command on the left of the pipe starts producing standard output, it goes directly into the standard input of the command on the right.
If the command on the left of the pipe never finishes, the command on the right will read all the input that is currently available and then continue to wait for more input, processing it as soon as it appears.
If the command on the left of the pipe does finish, the command on the right sees an EOF (end-of-file) on the pipe (its standard input). As with EOF from a file, EOF usually means that the command on the right will finish processing, produce its last output, and exit.
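One way to see that both commands in a pipeline really run simultaneously is to put a command that never finishes on the left of the pipe; a minimal sketch:

```shell
# "yes" writes its argument forever; "head" reads only three lines and exits.
# Both run at the same time: head prints its three lines immediately, and
# when head exits, yes is terminated (it gets SIGPIPE writing to the closed
# pipe), so the whole pipeline finishes even though yes alone never would.
yes hello | head -n 3
```

The pipeline prints three "hello" lines and returns to the prompt right away; no intermediate file could ever hold the infinite output of yes.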
As with the semicolon meta-character ;, the shell recognizes pipe characters and splits a command line into piped commands first, before doing file redirection. File redirection happens second (after pipe splitting) and, if present, has precedence over pipe redirection: because the file redirection is done after pipe splitting, it always wins, leaving nothing for the pipe.
$ ls -l | wc # correct - output of ls goes into the pipe
2 11 57
$ ls -l >out | wc # WRONG! - output of ls goes into the file
0 0 0 # wc reads an empty pipe and outputs zeroes
This is why in the above pipe wc has no characters to count from ls:

- The shell sees the file redirection >out on the ls command on the left of the pipe and changes the ls standard output away from the pipe into the file out.
- All the output of ls on the left goes into the file out; nothing is available to go into the pipe.
- The wc command on the right of the pipe counts an empty input from the pipe and outputs zeroes: 0 0 0
Remember: Redirection can only go to one place, and file redirection always wins over pipes, because it is done after pipe splitting:
$ ls /bin >out # all output from ls goes into file "out"
$ ls /bin >out | wc # WRONG! output goes into "out", not into pipe
0 0 0 # wc counts an empty input from the pipe
As with output redirection into files, you can only redirect into a pipe the standard output that you can see. Using redirection never creates output, even when using pipes:
$ ls /bin >out # all output from ls goes into file "out"
$ ls /bin >out | wc # nothing goes into the pipe to "wc"
0 0 0 # wc counts an empty input from the pipe
$ cp /etc/passwd x # no output visible on standard output
$ cp /etc/passwd x | wc # nothing goes into the pipe to "wc"
0 0 0 # wc counts an empty input from the pipe
$ cd /tmp # no output visible on standard output
$ cd /tmp | wc # nothing goes into the pipe to "wc"
0 0 0 # wc counts an empty input from the pipe
$ touch x ; rm x # no output from rm on standard output
$ touch x ; rm x | wc # nothing goes into the pipe to "wc"
0 0 0 # wc counts an empty input from the pipe
You can only redirect output that you can see.
2>&1 with pipes

As with file redirection, pipes only redirect Standard Output (stdout) from commands, not Standard Error Output (stderr). Standard Error Output still goes directly to your screen; it does not go into a pipe:
$ ls /etc/passwd nosuchfile # no redirection used
ls: cannot access nosuchfile: No such file or directory # STDERR unit 2
/etc/passwd # STDOUT unit 1
$ ls /etc/passwd nosuchfile | wc # only stdout is redirected to "wc"
ls: cannot access nosuchfile: No such file or directory # STDERR unit 2
1 1 12 # stdout went into the pipe to "wc"
You need the special syntax “2>&1” to redirect both stdout and stderr into a pipe. Recall that “2>&1” means “redirect standard error to go to the same place as standard output”, so if standard output is already going into a pipe (and remember pipe splitting happens first), “2>&1” will send standard error into the pipe too:
$ ls /etc/passwd nosuchfile 2>&1 | wc # both stdout and stderr redirected
2 10 68 # wc counts both lines from pipe
The “2>&1” above happens after pipe-splitting; it works because pipe-splitting happens first and Standard Output is already redirected into the pipe. It sends Standard Error to the same place, i.e. into the pipe.
|& instead of 2>&1

Some shells (including the BASH shell) allow a “|&” pipe syntax to redirect both stderr and stdout into the pipe. These are equivalent in the BASH shell:
$ ls /etc/passwd nosuchfile 2>&1 | wc # both stdout and stderr redirected (all Bourne-style shells)
$ ls /etc/passwd nosuchfile |& wc # both stdout and stderr redirected (BASH shell only)
Not all shells recognize the “|&” pipe syntax. (The /bin/sh shell on Ubuntu systems does not!) Don’t use the |& syntax inside a shell script; use the standard “2>&1” instead, which works with all Bourne-style shells.
Many Unix/Linux commands can be made to act as filters in pipelines. A filter command has no file name arguments and doesn’t open any files itself. The filter command reads its input lines from its standard input, which is usually connected to a pipe on its left. The filter command writes its output to standard output, which often goes into another pipe and filter command on its right.
With no file name arguments on the command line, filter commands read from standard input and write to standard output. The shell uses pipes to provide redirection for both standard input and standard output:
$ fgrep "/bin/sh" /etc/passwd | sort | head
The fgrep command above is reading from the filename argument /etc/passwd given on the command line. The output of the fgrep command always goes to standard output, which in the above command pipeline means the output goes into the pipe, not onto the screen.
The sort and head commands above have no file name arguments to read. Without file name arguments, each of the commands reads from its standard input, which is set up to be from the pipes created by the shell.
Both sort and head have no file name arguments and are acting as filter commands. (The fgrep command is technically not a filter – it is reading from the supplied pathname argument, not from standard input.)
Lines of input are sent through a pipe into the standard input of a filter command (such as sort and head, above). The filter command reads the lines from the pipe, filters them in some way, and sends the result into another pipe (or perhaps onto your screen, or into an output file with redirection, if the command is the last one in the pipeline).
Filter commands read from standard input (not from a file name) and they write to standard output.
You can only redirect what you can see, so if you use a command to select some lines from a file and then send those lines into a second filter command via a pipe, remember that it is only the selected lines that are being read by that second filter command, not the original file.
Filter commands in pipelines read their input from other commands’ output, through pipes; they don’t read directly from files.
Below is an example that shows how a second fgrep in a pipeline searches for its text pattern in the output of the first fgrep, not in the original file.
In the example below, looking for the word mail in the file /etc/services finds five lines. Looking for the word file in the file /etc/services also finds five lines, but they are a different five lines. There are no lines in that file with both words in them:
$ fgrep 'mail' /etc/services
smtp 25/tcp mail
re-mail-ck 50/tcp # Remote Mail Checking Protocol
re-mail-ck 50/udp
mailq 174/tcp # Mailer transport queue for Zmailer
mailq 174/udp
$ fgrep 'file' /etc/services
remotefs 556/tcp rfs_server rfs # Brunhoff remote filesystem
afs3-fileserver 7000/tcp bbs # file server itself
afs3-fileserver 7000/udp bbs
supfilesrv 871/tcp # SUP server
supfiledbg 1127/tcp # SUP debugging
$ fgrep 'file' /etc/services | fgrep 'mail' # pipeline gives NO OUTPUT !!!
$ fgrep 'mail' /etc/services | fgrep 'file' # pipeline gives NO OUTPUT !!!
The two fgrep pipeline command lines at the end of the above example give no output, because none of the lines that contain the text string file also contain the text string mail, and vice-versa.
In each example pipeline above, the second fgrep is searching for its pattern in the output of the first fgrep, and the second pattern is not in any of the lines output by the first fgrep.
A line in the file would have to contain both text strings mail and file to pass through both fgrep commands in the pipe. The first fgrep selects lines with one text string and then the second fgrep reads the output of the first fgrep and looks for the second text string. Lines must contain both strings to be output.
No lines contain both strings in the example. There is no output.
If we change the second fgrep in the pipeline to select a word that is in the output of the first fgrep, it finds a line to output:
$ fgrep 'mail' /etc/services | fgrep 'Remote'
re-mail-ck 50/tcp # Remote Mail Checking Protocol
$ fgrep 'Remote' /etc/services | fgrep 'mail'
re-mail-ck 50/tcp # Remote Mail Checking Protocol
The output line is the only line from /etc/services that contains both the word mail and the word Remote in it. It doesn’t matter which word you search for first; in both cases, the output is the only line that has both words in it.
Successive filter commands can be used to select lines that contain multiple strings in a line.
ssh break-in attempts in January

We are asked to count the number of times the machine rejected an SSH break-in attempt in the month of January. Here is a practical example showing the use of a filter command that reads from standard input and writes to standard output.
We need to look for lines in the system log file auth.log that contain both the string 'refused connect' and the date string for January.
Here is a sample auth.log input file that we will use in the following example (484 lines): auth.log
This sample file was taken from an actual /var/log/auth.log file.
First, we need to extract from the log file only the lines that indicate a rejected break-in attempt. Since there could be thousands of lines of output in a real system log file, we always pipe the large output into the head command, which limits the output on our screen to only ten lines:
$ fgrep 'refused connect' auth.log | head
Sep 2 02:51:01 refused connect from 61.174.49.108 (61.174.49.108)
Sep 4 09:05:00 refused connect from 193.107.17.72 (193.107.17.72)
Sep 5 03:27:11 refused connect from 61.144.43.235 (61.144.43.235)
Sep 6 05:53:51 refused connect from 122.225.109.208 (122.225.109.208)
Sep 8 06:28:53 refused connect from 116.10.191.180 (116.10.191.180)
Sep 10 15:30:18 refused connect from 122.225.109.105 (122.225.109.105)
Sep 22 12:11:22 refused connect from 211.143.243.35 (211.143.243.35)
Sep 30 04:11:02 refused connect from 220.177.198.39 (220.177.198.39)
Oct 3 01:09:02 refused connect from 61.174.51.235 (61.174.51.235)
Oct 3 19:54:33 refused connect from 117.21.173.35 (117.21.173.35)
$ fgrep 'refused connect' auth.log | wc
100 800 7055
$ fgrep -c 'refused connect' auth.log
100
Looking at the output, we see that every line has the month abbreviation at the start of the line. We only want January dates, so we use the date string 'Jan ' in another fgrep filter to further restrict the output to only lines containing both 'refused connect' and 'Jan '. (Note the trailing blank in the date string.)
$ fgrep 'refused connect' auth.log | fgrep 'Jan ' | head
Jan 2 15:43:42 refused connect from 221.235.188.212 (221.235.188.212)
Jan 2 15:46:46 refused connect from 221.235.188.212 (221.235.188.212)
Jan 2 15:49:48 refused connect from 221.235.188.212 (221.235.188.212)
[... etc ...]
$ fgrep 'refused connect' auth.log | fgrep 'Jan ' | wc
26 208 1948
$ fgrep 'refused connect' auth.log | fgrep -c 'Jan '
26
Below are the functions of the two commands in the above pipeline. The second fgrep command is acting as a filter command, reading Standard Input from a pipe and writing output to Standard Output (to the screen).

- The first fgrep command selects the lines containing the text string 'refused connect' inside the auth.log file. The output of this first command (only lines containing the 'refused connect' string) goes into the first pipe, not onto the screen.
- The second fgrep reads the output of the first fgrep from the pipe and only selects (and counts, using the -c option) lines that also contain the date pattern for January 'Jan ' (with a trailing blank). The lines being selected and counted have to contain both the string 'refused connect' from the first fgrep and the string 'Jan ' from the second fgrep. The output of this second fgrep (a count of lines containing both strings: 26) displays on the screen.

When filtering output by date, always look in the file you are filtering to see what format date is being used on each line. Use the date format found in the file.
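One quick way to follow this advice is to look at the first line of the log before writing the date filter. A small self-contained sketch (the two log lines here are fabricated for illustration; a real /var/log/auth.log would be inspected instead):

```shell
# Fabricate a tiny two-line log so the sketch is self-contained.
printf '%s\n' \
  'Jan  2 15:43:42 refused connect from 221.235.188.212 (221.235.188.212)' \
  'Sep  4 09:05:00 refused connect from 193.107.17.72 (193.107.17.72)' \
  > sample.log

head -n 1 sample.log          # inspect the date format actually used: "Mon DD HH:MM:SS"
fgrep -c 'Jan ' sample.log    # count only the January line, using that format
```

The fgrep -c count here is 1: only the first fabricated line contains 'Jan ' in the format the log actually uses.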
The last (seventh) colon-separated field in the system password file /etc/passwd contains the name of the login shell given to the user when the user logs in:
$ head /etc/passwd
root:x:0:0:root:/root:/bin/bash
daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin
sync:x:4:65534:sync:/bin:/bin/sync
[... etc ...]
(A blank/empty field means use the default shell, which on Linux systems is usually /bin/sh, often a link to /bin/bash.)
In this example we must “Count the number of each kind of shell in /etc/passwd and display the top four results sorted in descending numeric order.”
We will build up the answer iteratively using a pipeline:
Problem 2-A: Extract just the shell field from each line.
Solution 2-A: Use the cut command, which extracts delimiter-separated fields from input lines. Since there could be thousands of lines of output, we pipe the large output into a command that limits the output on our screen to ten lines:
$ cut -d : -f 7 /etc/passwd | head
/bin/bash
/usr/sbin/nologin
/bin/sync
[... etc ...]
We now have a list of shells, in the order that they appear in the password file. On to the next problem: 2-B.
Problem 2-B: Count the identical shells.
Solution 2-B: The uniq command can count adjacent lines in an input file (or from standard input) using the -c option, but the lines have to be adjacent. We can sort the lines to make all the identical shell lines adjacent so that they can be counted, then add uniq -c to count the sorted lines. First we add the sort to the pipeline and check the output, then we add the uniq -c to the pipeline:
$ cut -d : -f 7 /etc/passwd | sort | head
/bin/bash
/bin/bash
/bin/bash
[... etc ...]
$ cut -d : -f 7 /etc/passwd | sort | uniq -c
1170 /bin/bash
23 /bin/false
1 /bin/sh
1 /bin/sync
16 /usr/sbin/nologin
697 /usr/sbin/nologin_lock.sh
The output of uniq -c shows the counts of each shell, but the counts are not sorted in descending order, and there are more than four lines of output. On to the next problem: 2-C.
Problem 2-C: Display the top four most used shells in descending order of use.
Solution 2-C: First we add another sort to the pipeline, using options to sort the count numbers numerically and in descending (reverse) order, then we add a final head command to limit the output to four lines:
$ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr
1170 /bin/bash
697 /usr/sbin/nologin_lock.sh
23 /bin/false
16 /usr/sbin/nologin
1 /bin/sync
1 /bin/sh
$ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr | head -n 4
1170 /bin/bash
697 /usr/sbin/nologin_lock.sh
23 /bin/false
16 /usr/sbin/nologin
Summary:
$ cut -d : -f 7 /etc/passwd | sort | uniq -c | sort -nr | head -n 4
- The cut command picks out colon-delimited field 7 in each line in the password file and sends just those fields (the shell name) into the pipe.
- The first sort command reads the shell names from the pipe, puts all the shell names in sorted ascending order, and sends the sorted names into another pipe.
- The uniq command reads the sorted names from the pipe and counts the number of adjacent names. The output for each unique name is the count followed by the name. The output goes into another pipe.
- The second sort command reads the lines containing the count and the shell name and sorts the lines numerically (using the count field) and in reverse. Those sorted lines go into another pipe.
- The head command reads the sorted lines from the pipe and selects only the first four lines. Only those four lines display on the screen.

In this Example showing the use of multiple filter commands, we use filter commands to find the unique IP addresses used in SSH break-in attempts in January and then count how many times each IP address was used. This Example uses features of the previous two Examples.
As in the first Example above, we need to look for lines in the system log file auth.log that contain both the string 'refused connect' and the date string 'Jan '. Instead of counting all of them together, we need to extract the IP address from each line and count the number of times each IP address appears. Counting occurrences was a feature of the second Example, above.
Here is the solution, using features of both previous Examples:
$ fgrep 'refused connect' auth.log | fgrep 'Jan ' \
| awk '{print $NF}' \
| sort | uniq -c | sort -nr
Below are the functions of the six commands in the above pipeline. Five of the commands are acting as filter commands, reading Standard Input from a pipe and writing output to Standard Output (often, to another pipe, except for the last command that writes on the screen).
This example uses the same sample auth.log input file that we used earlier (484 lines): auth.log

- The first fgrep command selects the lines containing the text string 'refused connect' inside the auth.log file. The output of this first command (only lines containing the 'refused connect' string) goes into the first pipe, not onto the screen.
- The second fgrep reads the output of the first fgrep from the pipe and only selects lines that also contain the date pattern for January 'Jan '. The lines being selected have to contain both the string 'refused connect' from the first fgrep and the string 'Jan ' from the second fgrep. The output of this second fgrep (lines containing both strings) goes into another pipe.
- The awk command reads the selected lines from the pipe. It displays just the last field (NF) on each line, which happens to be the IP address used by the attacker. The awk output (a list of IP addresses, one per line) goes into another pipe. (The list of addresses is not in sorted order; they are in whatever order they appear in the input file.)
- The first sort command reads lines of IP addresses from the pipe. It sorts all the IP addresses together so that uniq can count them, and sends the sort output (the sorted lines) into another pipe.
- The uniq -c command reads the sorted list of IP addresses from the pipe. It counts how many adjacent addresses are the same and sends the uniq output (lines with the count followed by the IP address with that count) into another pipe.
- The sort -nr command reads the lines with the counts and IP addresses from the pipe. It sorts the lines numerically and in reverse (descending) order using the leading count numbers and sends the second sort output (sorted lines, each containing a count and an IP address) onto the screen.

Note the use of two sort commands. The first sort takes an unordered list of IP addresses and sorts them so that all the same IP addresses are together, so that the uniq command can count them. (The uniq command can only count adjacent lines in an input stream.) Without the first sort, the IP addresses wouldn’t be all together and wouldn’t be counted correctly by uniq. The second sort command sorts the output of uniq numerically and in reverse and puts the IP addresses with the largest counts first. Both sort commands are needed.
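Since the sample auth.log file may not be at hand, here is a self-contained sketch of the same six-command pipeline run on three fabricated log lines (the addresses are made up for illustration):

```shell
# Three made-up log lines: address 1.2.3.4 attacks twice, 5.6.7.8 once.
printf '%s\n' \
  'Jan  2 15:43:42 refused connect from 1.2.3.4 (1.2.3.4)' \
  'Jan  3 01:02:03 refused connect from 5.6.7.8 (5.6.7.8)' \
  'Jan  4 09:09:09 refused connect from 1.2.3.4 (1.2.3.4)' \
  > sample.log

fgrep 'refused connect' sample.log | fgrep 'Jan ' \
    | awk '{print $NF}' \
    | sort | uniq -c | sort -nr
# The most frequently seen address, (1.2.3.4) with count 2, is listed first.
```

Note that $NF here is the parenthesized last field of each line, so the counts are of the "(1.2.3.4)"-style addresses.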
Problem: Display only lines 6-10 of the password file.
Solution: Extract the first 10 lines of the file, and from those 10 lines extract just the last five lines, which are lines 6-10. You can use the nl command to add line numbers to the file to confirm your solution.
$ head /etc/passwd | tail -n 5
$ nl /etc/passwd | head | tail -n 5
Problem: Display only the second-to-last line of the password file.
Solution: Extract the last two lines of the file, and from those last two lines extract just the first line, which is the second-to-last line.
$ tail -n 2 /etc/passwd | head -n 1
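The same head-then-tail technique generalizes to displaying any single line N: take the first N lines, then keep only the last of them. A self-contained sketch using a made-up seven-line file:

```shell
# Display only line 6 of a file: take the first six lines,
# then keep just the last of those six, which is line 6.
printf 'one\ntwo\nthree\nfour\nfive\nsix\nseven\n' > demo_lines
head -n 6 demo_lines | tail -n 1    # prints "six"
```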
Problem: Which five (non-hidden) files in current directory are largest:
$ ls -s | sort -nr | head -n 5
The -s option outputs the size of the file in blocks as a number at the start of every line, which makes it easy to sort the lines numerically.
Here is another answer that uses some sort options to pick which field to sort:
$ ls -l | sort -k 5,5nr | head -n 5
If we want to sort by file size in bytes, the size in bytes is the fifth field in the output of ls -l. We have to give sort options that tell it to sort using the fifth field of every line. The above sort command is sorting by the fifth field, numerically, in reverse.
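To see how the -k field-selection options behave, here is a minimal self-contained sketch sorting made-up two-field lines by their second field, numerically and in reverse:

```shell
# -k 2,2nr means: sort key is field 2 (and only field 2),
# compared numerically (n) in reverse order (r).
printf '%s\n' 'apple 3' 'banana 10' 'cherry 2' | sort -k 2,2nr
# "banana 10" sorts first: 10 is compared as a number, not as text
# (a plain text sort would put "10" before "2" and "3").
```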
elinks to fetch and search formatted web pages

For the examples below, we need a program that fetches formatted web pages (or RSS pages) from the Internet. We will use the elinks text web browser with some options.
Because we are typing commands into an interactive shell, we will define a shell alias for elinks and its list of required arguments, to make the examples below shorter to type and display:
$ alias ee='elinks -dump -no-numbering -no-references'
Don’t use aliases inside script files that you write.
Problem: Display the dates of the Midterm tests from the Course Home Page:
$ ee 'http://teaching.idallen.com/cst8207/19w/' | fgrep 'Midterm'
Problem: Display weekly course notes file modify dates:
$ ee 'http://teaching.idallen.com/cst8207/19w/notes/' | fgrep 'notes.txt'
Problem: Display the assignment file modify dates from the Course Notes:
$ ee 'http://teaching.idallen.com/cst8207/19w/notes/' | fgrep 'assignment'
Problem: Display current Ottawa weather temperature:
$ ee 'http://weather.gc.ca/rss/city/on-118_e.xml' | fgrep 'Temperature:'
Problem: Display the current BBC weather for Vancouver:
$ ee 'http://www.bbc.co.uk/weather/6173331' \
| fgrep -A19 'Observations' | tail -n 20
Problem: Display the current Space Weather forecast for Canada:
$ ee 'http://www.spaceweather.gc.ca/forecast-prevision/short-court/sfst-1-eng.php' \
| fgrep 'Current Conditions'
Problem: Display the current phase of the Moon:
$ ee 'http://www.die.net/moon/' \
| fgrep -A2 'Moon Phase' | head -n 3 | tail -n 1
There are many ways to misuse pipes. Here are some common ones.
If a command does read from file names supplied on the command line, it is more efficient to let it open its own file name than to use cat to open the file and feed the data to the command on standard input. (There is less data copying done!)
Do not do this (wasteful of processes and I/O and flags you as a novice):
$ cat /etc/passwd | head # DO NOT DO THIS - INEFFICIENT
$ cat /etc/passwd | sort # DO NOT DO THIS - INEFFICIENT
$ cat /etc/passwd | fgrep 'root:' # DO NOT DO THIS - INEFFICIENT
Do this: Give the file name(s) directly to the commands, like this:
$ head /etc/passwd
$ sort /etc/passwd
$ fgrep 'root:' /etc/passwd
Let commands open their own files; don’t feed them with cat and unnecessary pipes.
If a Unix/Linux command that can open and read the contents of pathnames is not given any pathnames to open, it usually reads input lines from standard input (stdin) instead:
$ wc /etc/passwd # wc reads /etc/passwd, ignores stdin and your keyboard
$ wc # without a file name, wc reads stdin (your keyboard)
If the command is given a pathname, it reads from the pathname and always ignores standard input, even if you try to send it something:
$ date | wc foo # WRONG! wc opens and reads file foo; wc ignores stdin
The above applies to every command that reads file content, e.g.:
$ date | head foo # WRONG! head opens and reads file foo; head ignores stdin
$ date | less foo # WRONG! less opens and reads file foo; less ignores stdin
If you want a command to read stdin, you cannot give it any file name arguments. Commands with file name arguments ignore standard input; they should not be used on the right side of a pipe.
Commands that are ignoring standard input (because they are opening and reading from pathnames on the command line) will always ignore standard input, no matter what silly things you try to send them on standard input:
$ echo hi | head /etc/passwd # WRONG: head has a pathname and ignores stdin
$ echo hi | tail /etc/group # WRONG: tail has a pathname and ignores stdin
$ echo hi | wc .vimrc # WRONG: wc has a pathname and ignores stdin
$ sort a | cat b # WRONG: cat has a pathname and ignores stdin
$ cat a | sort b # WRONG: sort has a pathname and ignores stdin
Standard input is thrown away if it is sent to a command that ignores it. The shell cannot make a command read stdin; it’s up to the command. The command must want to read standard input, and it will only want to read standard input if you leave off all the file names.
Commands that do not open and process the contents of files usually ignore standard input, no matter what silly things you try to send them on standard input. All these commands will never read standard input:
$ echo hi | ls # NO: ls doesn't open files - always ignores stdin
$ echo hi | pwd # NO: pwd doesn't open files - always ignores stdin
$ echo hi | cd # NO: cd doesn't open files - always ignores stdin
$ echo hi | date # NO: date doesn't open files - always ignores stdin
$ echo hi | chmod +x . # NO: chmod doesn't open files - always ignores stdin
$ echo hi | rm foo # NO: rm doesn't open files - always ignores stdin
$ echo hi | rmdir dir # NO: rmdir doesn't open files - always ignores stdin
$ echo hi | echo me # NO: echo doesn't open files - always ignores stdin
$ echo hi | mv a b # NO: mv doesn't open files - always ignores stdin
$ echo hi | ln a b # NO: ln doesn't open files - always ignores stdin
Some commands that open and read file contents only operate on file name arguments and never read stdin:
$ echo hi | cp a b # NO: cp opens arguments - always ignores stdin
Standard input is thrown away if it is sent to a command that ignores it. The shell cannot make a command read stdin; it’s up to the command.
Commands that might read standard input will do so only if no file name arguments are given on the command line. The presence of any file arguments will cause the command to ignore standard input and process the file(s) instead, and that means they cannot be used on the right side of a pipe to read standard input. File name arguments always win over standard input.
Remember: If a file name is given to a command on the command line, the command ignores standard input and only operates on the file name.
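You can watch this rule in action with wc; this is a small sketch, and the file name five.txt is just an example:

```shell
# Create a five-line example file:
printf '1\n2\n3\n4\n5\n' >five.txt

# The file name argument wins: wc counts the file; the piped text is discarded:
echo "this piped input is ignored" | wc -l five.txt

# With no file name argument, wc reads the pipe instead:
echo "this piped input is counted" | wc -l

rm -f five.txt          # clean up the example file
```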
The very long sequence of pipes below is pointless – the last (rightmost) command head has a pathname argument and it will open and read it, ignoring all the standard input coming from all the pipes on the left:
$ fgrep "/bin/sh" /etc/passwd | sort | head /etc/passwd # WRONG!
The head command is ignoring the standard input coming from the pipe and is reading directly from its /etc/passwd filename argument. The fgrep and sort commands are doing a lot of work for nothing, since head is not reading the output of sort coming down the pipe. The head command is reading from the supplied file name argument /etc/passwd instead. File names take precedence over standard input.
The above long but malformed pipeline is equivalent to this (same output):
$ head /etc/passwd
Don’t make the above mistake. Filter commands must not have file name arguments; they must read standard input from the pipe.
If you give a command a file to process, it will ignore standard input, and so a command with a file name must not be used on the right side of any pipe.
The following command line redirection is faulty (an input file on the left is also used as an output file on the right); however, it sometimes works for small files:
$ cat foo bar | tr 'a' 'b' | fgrep "lala" | sort | head >foo # WRONG!
There is a critical race between the first cat command trying to read the data out of file foo before the shell truncates it to zero when launching the head command at the right end of the pipeline. Depending on the system load and the size of the file, cat may or may not get out all the data before the foo file is truncated or altered by the shell in the redirection at the end of the pipeline. Don’t do this.
Don’t depend on long pipelines saving you from bad redirection! Never redirect output into a file that is being used as input in the same command or anywhere in the command pipeline.
- Pipe redirection is done by the shell, first, before file redirection.
- The command on the left of the pipe must produce some standard output.
- The command on the right of the pipe must want to read standard input.
Never use a redirection output file as an input file anywhere in a pipeline!
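If you really do need a command’s output to replace its own input file, one safe sketch (the file names here are examples) is to redirect into a temporary file first and rename it over the original only after the command finishes:

```shell
printf 'banana\n' >foo          # example input file

# WRONG: tr 'a' 'b' <foo >foo  -- the shell truncates foo before tr reads it

# SAFE: redirect into a temporary name, then rename over the original:
tr 'a' 'b' <foo >foo.tmp && mv foo.tmp foo
cat foo                         # now contains "bbnbnb"

rm -f foo                       # clean up the example file
```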
There is only one standard input and one standard output for each command. Each can only be redirected to one other place. You cannot redirect standard input from two different places, nor can you redirect standard output into two different places.
The Bourne shells (including BASH) do not warn you that you are trying to redirect the input of a command from two or more different places (and that only one of the redirections will work – the others will be ignored):
$ wc <a <b <c <d <e
The shell sets up the wc standard input to come from the rightmost file e only.
$ date | wc <file
The date command output into the pipe is wasted (ignored). The shell makes the wc command read its standard input only from the file file, because file redirection is done second and always wins over pipe redirection that is set up first (and then ignored).
The Bourne shells (including BASH) do not warn you that you are trying to redirect the output of a command to two or more different places and that only one of the redirections will work – the others will be ignored:
$ date >a >b >c >d >e
The date output goes only into the rightmost file e. The last redirection into e wins.
$ date >out | wc
      0       0       0
The date output goes into file out. Nothing goes into the pipe and wc outputs zeroes. (File redirection is done second and always wins over pipe redirection.)
Some shells (including the “C” shells, but not the Bourne shells) will try to warn you about silly shell redirection mistakes:
csh% date <a <b <c <d
Ambiguous input redirect.
csh% date | cat <file
Ambiguous input redirect.
csh% date >a >b >c
Ambiguous output redirect.
csh% date >a | wc
Ambiguous output redirect.
The C shells tell you that you can’t redirect stdin or stdout to/from more than one place at the same time. Bourne shells do not tell you – they simply ignore the “extra” redirections and do only the last one of each.
/dev/null
There is a special file on every Unix/Linux system into which you can redirect output that you don’t want to keep or see: /dev/null
The following command generates some error output we don’t like to see:
$ cat * >/tmp/out
cat: course_outlines: Is a directory # errors print on STDERR
cat: jclnotes: Is a directory # errors print on STDERR
cat: labs: Is a directory # errors print on STDERR
cat: notes: Is a directory # errors print on STDERR
We can throw away the errors (stderr, unit 2) into /dev/null:
$ cat * >/tmp/out 2>/dev/null
The file /dev/null never fills up; it just eats and throws away output.
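A quick sketch showing both properties – output into /dev/null vanishes, and reading it back always finds nothing:

```shell
# Writing into /dev/null discards everything; nothing is stored:
seq 100000 >/dev/null

# Reading from /dev/null always finds it empty (zero characters):
wc -c </dev/null
```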
System Administrators: Do not get in the habit of throwing away all the error output of commands! You will also throw away legitimate error messages and nobody will know that these commands are failing.
When used as an input pathname, /dev/null always appears to be empty:
$ wc /dev/null
0 0 0 /dev/null
You can use /dev/null to provide “no input” to a program that would normally read your keyboard:
$ mail -s "Test message" user@example.com </dev/null
$
The mail command reads from standard input; it would normally read your keyboard as the message to send. Redirecting input from /dev/null ensures that there is nothing to read, and mail will send a message with no message body and only a subject line.
This is worth repeating:
People are often misled into thinking that adding redirection to a command will create output that wasn’t there before the redirection was added.
It isn’t so. You can only redirect what you can see.
$ cp /etc/passwd x # no output visible on standard output
$ cp /etc/passwd x >out # file "out" is created empty
$ cp /etc/passwd x | wc # word count counts nothing; output is zeroes
If there was no output on your screen before you added redirection, adding redirection will not create any. You will redirect nothing; no output.
Before you add redirection to a command, look at the output on your screen. If there is no output visible on your screen, why are you bothering to redirect it?
You can only redirect what you can see.
tr – a command that only reads Standard Input
The tr command is one of the few (only?) commands that reads standard input and does not allow any pathnames on the command line – you must always supply input to tr on standard input, either through file input redirection or through a pipe:
$ tr 'abc' 'ABC' <in >out # correct for a single file
$ cat file1 file2 | tr 'abc' 'ABC' >out # correct for multiple files
$ tr 'abc' 'ABC' file1 file2 >out # *** WRONG - ERROR ***
tr: too many arguments
The tr command must always use some kind of Input Redirection to read data. No version of tr accepts pathnames on the command line. All versions of tr only read standard input.
Don’t make the mistake of using a tr output redirection file as its redirection input file. (This doesn’t work for any command.) See Don’t use redirection output file as redirection input file, above.
System V Unix versions of tr demand that character lists appear inside square brackets, e.g. tr '[abc]' '[ABC]'. Berkeley Unix and Linux do not need or use the brackets around the lists.
Problem: convert some selected lower-case letters to upper-case from the “who” command:
$ who | tr 'abc' 'ABC'
Shell question: Are the single quotes required around the two arguments? (Are there any special characters in the arguments that need protection?)
Using POSIX character classes such as [:lower:] and [:upper:], you can use tr to convert a lower-case file of text into upper-case.
Warning: Do not use alphabetic character ranges such as a-z or A-Z in tr or any other commands, since the ranges often contain unexpected characters in the character set collating sequence. For full details, see Internationalization and Collating.
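For example, the POSIX character classes convert case reliably regardless of the collating sequence:

```shell
# Upper-case a string using POSIX classes instead of a-z ranges:
echo 'Hello, World!' | tr '[:lower:]' '[:upper:]'   # prints HELLO, WORLD!
```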
Full-screen keyboard interactive programs such as the VIM text editor do not behave nicely if you redirect their input or output – they really want to be talking to your keyboard and screen; don’t redirect them or try to run them in the background using &. You can hang your terminal if you try.
If you accidentally redirect the input or output of something such as vim, switch screens or log in a second time using a different terminal and find and kill the hung process.
It’s easy to redirect only stdout into a pipe; that’s just the way pipes work. In this example below, only stdout is sent into the line numbering program. The error message sent to stderr bypasses the redirection and goes directly onto the screen:
$ ls /etc/passwd nosuchfile | nl
ls: cannot access nosuchfile: No such file or directory
1 /etc/passwd
It’s also easy to redirect both stdout and stderr into a pipe by sending stderr to the same place as stdout:
$ ls /etc/passwd nosuchfile 2>&1 | nl
1 ls: cannot access nosuchfile: No such file or directory
2 /etc/passwd
How do you redirect only stderr into the pipe, and let stdout bypass the pipe and go directly to the screen? This is tricky; on the left of the pipe you have to swap stdout (attached to the pipe) and stderr (attached to the screen). You need a temporary output unit (I use “3”, below) to record and remember where the screen is (redirect unit 3 to the same place as unit 2: “3>&2”), then redirect stderr into the pipe (redirect unit 2 to the same place as unit 1: “2>&1”), then redirect stdout to the screen (redirect unit 1 to the same place as unit 3: “1>&3”):
$ ls /etc/passwd nosuchfile 3>&2 2>&1 1>&3 | nl
1 ls: cannot access nosuchfile: No such file or directory
/etc/passwd
You seldom need to do this advanced trickery, even inside scripts. But you can do it!