============================================
Unix Shell I/O Redirection (including Pipes)
============================================
-IAN! idallen@idallen.ca

Contents of this file:

 1. Output Redirection
    - standard output and standard error
    - redirection into files
    - throwing away output via /dev/null
    - two mistakes to avoid
 2. Input Redirection
 3. Redirection into programs (Pipes)
 4. Misuse of Redirection
 5. Unique STDIN and STDOUT
 6. Redirection Examples

In the examples below, I use the metacharacter ";" to put multiple
commands on one shell command line:

    $ date ; who ; echo hi ; pwd

These behave as if you had typed each of them on separate lines.

==================
Output Redirection - standard output and standard error
==================

In output redirection, the shell (not the command) diverts (redirects)
most output that would normally appear on the screen to some other place,
either into the input of another command (using a pipe metacharacter)
or into a file (using a file redirect metacharacter).

 * redirection is done by the shell, first, before finding the command;
   the shell has no idea if the command will produce any output
 * you can only redirect the output that you can see (if there is no
   output without redirection, adding redirection won't create any)
 * redirection can only go to *one* place; you can't use multiple
   redirections to send output to multiple places (see the "tee" command)
 * by default, error messages are not redirected; only "normal output"
   is redirected (but you can also redirect error output with more syntax)

The normal (non-error) output on your screen is called the "standard
output" ("stdout") of the command - it is the output from "cout"
statements in C++ programs. The error messages are called the "standard
error output" ("stderr") of the command - it is the output from "cerr"
statements in C++ programs.

Normally, both stdout and stderr appear on your terminal. The shell can
redirect the two outputs individually or together.
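
A minimal sketch of the two output streams, assuming a hypothetical
shell function named "both" (it is not a standard command) that writes
one line to each stream. The ">&2" inside the function is ordinary shell
syntax that sends that one "echo" to standard error. The file names
"out" and "errs" are arbitrary:

```shell
# Hypothetical helper: one line to stdout (unit 1), one line to stderr (unit 2).
both() {
    echo "normal output"          # written to standard output
    echo "error message" >&2      # written to standard error
}

workdir=$(mktemp -d)              # scratch directory for this demonstration

# The shell sends each stream to its own file; nothing reaches the screen.
both >"$workdir/out" 2>"$workdir/errs"

out_text=$(cat "$workdir/out")
err_text=$(cat "$workdir/errs")
echo "file out holds:  $out_text"
echo "file errs holds: $err_text"

rm -r "$workdir"                  # clean up the scratch directory
```

Each file should end up holding only the line written to its own stream.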

The default type of output redirection (whether redirecting to files
or to programs) redirects *only* standard output and lets standard error
go to your terminal:

    $ ls /etc/passwd nosuchfile                # no redirection used
    ls: nosuchfile: No such file or directory  # this on screen from stderr
    /etc/passwd                                # this on screen from stdout

    $ ls /etc/passwd nosuchfile >out           # shell redirects only stdout
    ls: nosuchfile: No such file or directory  # only stderr appears on screen

You can also tell the shell to redirect standard error to a file:

    $ ls /etc/passwd nosuchfile 2>out          # shell redirects only stderr
    /etc/passwd                                # only stdout appears on screen

    $ ls /etc/passwd nosuchfile >out 2>errors  # shell redirects both
    $                                          # nothing appears on screen

----------------------
Redirection into files
----------------------

The shell metacharacter ">" signals that the next word on the command
line is an output file (not a program) that should be truncated (set to
empty) and made ready to receive the standard output of a command:

    $ date >outfile

The spaces between ">" and the file name are optional:

    $ date > outfile

The file is always truncated to empty before the shell finds and runs
the command, unless you double up the character like this:

    $ date >> outfile
    - output is *appended* to outfile; no truncation

An example of output redirection of stdout into a file:

    $ echo hello                      - stdout goes to terminal
    $ echo hello >file ; cat file     - erase file; send stdout to file
    hello
    $ echo there >>file ; cat file    - append stdout to end of file
    hello
    there

It is the shell that creates or truncates the file and sets up the
redirection, not the command being redirected. The command knows nothing
about the redirection - the redirection is removed from the command line
before the command is found and executed:

    $ echo one two three              - echo has three arguments
    one two three
    $ echo one two three >out         - echo still has three arguments

Shells handle redirection before they go looking for the command name
to run.
Indeed, you can have redirection even if the command is not found:

    $ nosuchcommandxxx >out           - file "out" is created empty
    sh: nosuchcommandxxx: command not found
    $ wc out
    0 0 0 out                         - shell created an empty file

The shell creates or truncates the file "out" empty, and then it tries
to find and run the nonexistent command and fails. The empty file remains.
Any existing file will have its contents removed:

    $ echo hello >out ; cat out
    hello
    $ nosuchcommandxxx >out
    sh: nosuchcommandxxx: command not found
    $ wc out
    0 0 0 out                         - shell truncated the file

Redirection is done by the shell before the command is run:

    $ mkdir empty
    $ cd empty
    $ ls -l
    total 0                           # no files found
    $ ls -l >out                      # shell creates "out" first
    $ cat out                         # display output
    total 0
    -rw-r--r-- 1 idallen idallen 0 Sep 21 06:02 out
    $ date >out
    $ ls -l
    total 4
    -rw-r--r-- 1 idallen idallen 29 Sep 21 06:04 out
    $ ls -l >out                      # shell empties "out" first
    $ cat out                         # display output
    total 0
    -rw-r--r-- 1 idallen idallen 0 Sep 21 06:06 out

The shell creates or empties the file "out" before it runs the "ls"
command.

Explain this sequence of commands:

    $ mkdir empty
    $ cd empty
    $ cp a b
    cp: cannot stat `a': No such file or directory
    $ cp a b >a
    $                                 # why is there no error message?

Shells don't care where on or in the command line you do the file
redirection. The file redirection is done by the shell, then the
redirection information is removed from the command line before the
command is called. The command actually being run doesn't see any part
of the redirection syntax; the number of arguments is not affected.
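
One way to check this for yourself is to count the arguments a command
receives. The sketch below assumes a hypothetical shell function named
"count_args" (not a standard command) that prints "$#", the number of
arguments it was given. The output file names are arbitrary:

```shell
# Hypothetical helper: print the number of arguments received.
count_args() {
    echo "$#"
}

workdir=$(mktemp -d)                  # scratch directory for this demonstration

n_plain=$(count_args one two three)   # no file redirection on the command line

# The same command with the redirection placed at the end, in the middle,
# and at the front of the command line.
count_args one two three >"$workdir/end"
count_args one >"$workdir/middle" two three
>"$workdir/front" count_args one two three

n_end=$(cat "$workdir/end")
n_middle=$(cat "$workdir/middle")
n_front=$(cat "$workdir/front")
echo "argument counts: $n_plain $n_end $n_middle $n_front"

rm -r "$workdir"                      # clean up the scratch directory
```

All four counts should be the same, because the shell strips the
redirection before the function ever sees its arguments.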

All the command lines below are equivalent to the shell; in every case
the echo command sees only three arguments and the output is redirected
into "file":

    $ echo hi there mom >file         - echo has three arguments
    $ echo hi there >file mom         - echo has three arguments
    $ echo hi >file there mom         - echo has three arguments
    $ echo >file hi there mom         - echo has three arguments
    $ >file echo hi there mom         - echo has three arguments

Redirection is never counted as arguments to a command.

Examples:

    $ echo hello there
    - shell calls "echo" with two arguments ==> echo(hello,there)
    - "echo" echoes two arguments
    - output appears in default location (standard output is your screen)

    $ echo hello there >file
    - shell creates "file" and diverts standard output into it
    - shell removes ">file" from the command line
    - shell calls "echo" with two arguments ==> echo(hello,there)
      (note NO CHANGE in arguments to "echo" from the previous example)
    - "echo" echoes two arguments
    - standard output is captured in output "file", NOT on your screen

    $ >file echo hello there
    - this is identical to the above example
      (the shell does not care where in the command line you put the
      redirection)
    - standard output is captured in output "file", NOT on your screen
    - you can put the redirection anywhere in the command line!

 * most commands have two separate output "streams"
   - stdout - unit 1 - Standard Output (output via "cout <<" in C++)
   - stderr - unit 2 - Standard Error Output (output via "cerr <<" in C++)

 * redirection is done by the shell, first, before finding the command
   - the shell creates a new empty file or truncates (empties) an
     existing file
   - after doing the redirection, and removing it from the command line,
     the shell finds and executes the command:
       $ mkdir empty ; cd empty ; touch out ; rm out >out ; ls
       $

 * you can only redirect the output that you can see!
   - redirection does not invent new output!
   - if you don't see any output from a command, adding redirection will
     simply have the shell create an empty file (no output)
   Example:
     $ cp /etc/passwd x               # no output
     $ cp /etc/passwd x >out          # file "out" is created empty
     $ touch x ; rm x                 # no output
     $ touch x ; rm x >out            # file "out" is created empty

 * redirection can only go to *one* place
   - the right-most file redirection wins (others create empty files)
   Example:
     $ date >a >b >c      # output goes into file c; a and b are empty

 * redirection to a file wins over redirection into a pipe
   - see the following section on redirection into programs using "|"
     pipes
   - if you redirect into a file and a pipe, the pipe gets nothing
   Example:
     $ date >a | wc       # output goes into file "a"; wc outputs zero

 * the redirection output file is emptied (truncated) unless you append
   via >>
   - the file is emptied *before* the shell looks for and runs the command
   - don't use output redirection files as input to the same command
   Bad Example:
     $ sort a >a          # file "a" is truncated to be empty

 - stdout and stderr mix on your screen (they look the same on the screen)
   so you can't tell by looking at your screen what comes on stdout and
   what comes on stderr
   Example:
     $ date >out ; ls -l out nosuchfile
     ls: nosuchfile: No such file or directory
     -rw-r--r-- 1 idallen idallen 29 Jan 16 06:52 out
 - stderr (error messages) often appears first, before stdout
 - you can redirect stdout and stderr separately using unit numbers
 - stdout is always unit 1 and stderr is always unit 2 (stdin is unit 0)
 - put the unit number immediately (no blank) before the ">" metacharacter:

     $ date >out                       # same as "1>out"; redirect stdout only
     $ ls -l out nosuchfile            # no redirection used yet
     ls: nosuchfile: No such file or directory
     -rw-r--r-- 1 idallen idallen 29 Jan 17 13:01 out
     $ ls -l out nosuchfile >foo       # same as "1>foo"; redirect stdout only
     ls: nosuchfile: No such file or directory
     $ ls -l out nosuchfile 2>errs     # redirect stderr (unit 2) only
     -rw-r--r-- 1 idallen idallen 29 Jan 17 13:01 out
     $ ls -l out nosuchfile >foo 2>errs   # redirect both stdout and stderr
     $ cat foo
     -rw-r--r-- 1 idallen idallen 29 Jan 17 13:01 out
     $ cat errs
     ls: nosuchfile: No such file or directory

   ">foo" (no preceding unit number) is a shell shorthand for "1>foo"
   ">foo" redirects the default unit 1 (stdout) only
   ">foo" and "1>foo" are identical

 - you need a special syntax "2>&1" to redirect both stdout and stderr
   into the same file; read this as "send unit 2 to the same place as
   unit 1":

     $ date >out
     $ ls -l out nosuchfile >foo 2>foo    # *** WRONG ***
     $ ls -l out nosuchfile >foo 2>&1     # correct - redirect stdout and stderr

 - the order of these two redirections matters!
 - the >foo stdout redirect must come first (to the left of stderr 2>&1)
   because you must set where stdout goes before you send stderr to the
   same place
 - all redirection (and file truncation) happens first (done by the shell)
 - the command executes second (the shell executes it after doing
   redirection)
 - the output from the command (if any) happens third, and it goes into
   the indicated redirection output file last

Some bad examples of using output files as input files:

    $ date >out ; sort out >out       # this is bad!
    - first command (date):
      - shell truncates out
      - shell finds and runs date
      - output of date goes into out (1 line)
    - second command (sort):
      - shell truncates out (file out is now EMPTY)
      - shell finds and runs sort
      - sort opens its argument file "out" for reading (an empty file)
      - output from sort (empty) goes into out (empty)
      - out remains empty!

    $ date >out ; wc out >out         # this is bad!
    - first command (date):
      - shell truncates out
      - shell finds and runs date
      - output of date goes into out (1 line)
    - second command (wc):
      - shell truncates out (file out is now EMPTY)
      - shell finds and runs wc
      - wc opens its argument file "out" for reading (an empty file)
      - output from wc (1 line: 0 0 0 out) goes into out
      - out now has one line: 0 0 0 out

--------------------
Throwing away output
--------------------

There is a special file on every Unix system, into which you can redirect
output that you don't want to keep or see: /dev/null

The following command generates some error output we don't like to see:

    $ cat * >/tmp/out
    cat: course_outlines: Is a directory
    cat: jclnotes: Is a directory
    cat: labs: Is a directory
    cat: notes: Is a directory

We can throw away the errors (stderr, unit 2) into /dev/null:

    $ cat * >/tmp/out 2>/dev/null

The file /dev/null never fills up; it just eats output. When used as
an input pathname, it always appears to be empty:

    $ wc /dev/null
    0 0 0 /dev/null

Unix Big Redirection Mistake #1
-------------------------------

Do not use a redirection file as both output and input to a program
(sort is used as the example program here - anything that reads files
and produces output is at risk):

    $ sort a b >a
    - shell truncates file "a" and redirects command output into it
      - original contents of "a" are lost - truncated - GONE! before
        the shell even goes looking for the "sort" command to run!
    - shell finds and calls sort command with two file name arguments
      ==> i.e. sort(a,b)
    - sort command processes contents of file "a" (now an empty file)
    - sort command processes contents of file "b"
    - output has been redirected by the shell to appear in file "a"
    - Result: file "a" gets a sorted copy of "b"; the original contents
      of "a" are lost

Work Around: Use a Temporary Third File

    $ sort a b >c && mv c a
    - the third file safely receives the sorted contents of "a" and "b"
    - the "mv" replaces "a" with the result only if the sort succeeded

Other examples that DO NOT WORK:

    $ head file >file        - creates an EMPTY FILE
    $ tail file >file        - creates an EMPTY FILE
    $ tr a-z A-Z <file >file - creates an EMPTY FILE
    $ wc file >file          - file will always contain: 0 0 0 ...etc...

Never use the same file name for both input and output - the shell will
truncate the file before the command reads it.

Unix Big Redirection Mistake #2
-------------------------------

Do not use a wildcard/glob that picks up the name of the output
redirection file and causes it to become an input file (cat is used as
the example program here - anything that reads files and produces output
is at risk):

    $ cat * >z
    - shell creates "z" and redirects all future standard output into it
    - shell expands wildcards; wildcard "*" includes file "z" that was
      just created by the shell
      (Note: Bourne shells will do the wildcard before the file creation;
      C Shells do the file creation first.)
    - shell finds and calls cat command with all file names as arguments
      ==> e.g. cat(a,b,c,d,e,file1,file2,...etc...,z)
    - cat command processes each argument, opening each file and sending
      the output into file "z"
    - when cat opens file "z", it ends up reading from the top of file
      "z" and writing to the bottom of file "z" at the same time!
    - Result: an infinite loop that fills up the disk drive as "z" gets
      bigger and bigger

Fix #1: Use a hidden file name

    $ cat * >.z
    - uses a hidden file name not matched by the shell "*" wildcard
    - the cat command is not given ".z" as an argument, so no loop occurs

Fix #2 (two ways): Use a file in some other directory

    $ cat * >../z
    $ cat * >/tmp/z
    - redirect output into a file that is not in the current directory so
      that it is not read by the cat command and no loop occurs

=================
Input Redirection
=================

Many Unix commands read input from files, if file names are given on
the command line, and from standard input ("stdin", often your keyboard)
if no file names are given, e.g. less, cat, head, tail, sort, wc, etc.

(Not *all* commands read from standard input. Examples of common commands
that never read from standard input: ls, cp, mv, date, who, echo, etc.)

If (and only if!) a command reads from standard input, you can tell the
shell to use input redirection to change from where standard input comes,
so that it doesn't come from your keyboard:

    $ cat food         - reads from file "food"
    $ cat              - reads from stdin (keyboard)
    $ cat <food        - reads from stdin (redirected from file "food")

The "tr" command does not accept pathnames on its command line; it reads
only standard input:

    $ tr 'a-z' 'A-Z' file1 file2 >out         # *** WRONG - ERROR ***
    tr: too many arguments
    $ cat file1 file2 | tr 'a-z' 'A-Z' >out   # correct for multiple files
    $ tr 'a-z' 'A-Z' <file1 >out              # correct for a single file

Do not redirect the input of full-screen programs such as VIM:
--------------------------------------------------------------

Full-screen keyboard interactive programs such as the VIM text editor
do not behave nicely if you redirect their input or output - they really
want to be talking to your keyboard and screen; don't redirect them.
You can hang your terminal if you try.

=================================
Redirection into programs (Pipes)
=================================

The shell metacharacter "|" ("pipe") signals the start of another command
on the command line.
The standard output (only stdout; not stderr) of the command on the left
of the "|" is connected ("piped") to the standard input of the command
on the right:

    $ date | wc
          1       6      29

You can approximate the behaviour of a pipe using a temporary file for
intermediate storage:

    $ date >out ; wc <out
          1       6      29

If you redirect standard output into a file and also into a pipe, the
file wins:

    $ date >out | wc
    # nothing goes into the pipe; it's all in the file "out"

 * pipe redirection is done by the shell, first, before file redirection
 * as with file output redirection, you can only redirect into a pipe the
   standard output that you can see; redirection never creates output
 * if you want to redirect standard error into a pipe, you have to
   redirect standard error to go to the same place as standard output:
   2>&1
   Example:
     $ ls nosuchfile 2>&1 | wc
 * redirection can only go to *one* place (file redirection always wins)

Examples:

 - combining head and tail to select any set of lines from a file:
     $ head /etc/passwd | tail -5   # last five lines of first ten: lines 6-10
     $ tail /etc/passwd | head -5   # first five lines of last ten lines

 - you can only redirect into a pipe the standard output that you can see
     $ date >a ; rm a               # rm has no output if it works
     $ date >a ; rm a | wc          # nothing to redirect - screen output: 0 0 0

 - to redirect standard error into a pipe, make it appear in the same
   place as standard output (i.e. redirect unit 2 into the pipe too):
     $ ls nosuchfile 2>&1 | wc

 - you can only redirect to one place (files win over pipes)
     $ ls /bin >a | wc              # nothing in pipe - screen output: 0 0 0

-----------------------------------
Misuse of redirection into programs
-----------------------------------

If a Unix command that opens and reads the contents of pathnames is not
given any pathnames to open, it usually reads lines from standard input
(stdin) instead:

    $ wc /etc/passwd   # wc reads /etc/passwd, ignores stdin
    $ wc               # wc reads stdin - your keyboard

If the command is given a pathname, it reads from the pathname and always
ignores standard input, even if you try to send it something:

    $ date | wc        # wc opens and reads standard input, counts date stdout
    $ date | wc foo    # wc opens and reads foo; wc completely ignores stdin

    $ date | sort      # sort opens and reads standard input, sorts date stdout
    $ date | sort foo  # sort opens and reads foo; sort completely ignores stdin

    $ date | head      # head opens and reads standard input, heads date stdout
    $ date | head foo  # head opens and reads foo; head completely ignores stdin

If you want a command to read stdin, you cannot give it any file name
arguments. Commands with file name arguments ignore standard input.

Commands that are ignoring standard input (because they are opening and
reading from pathnames on the command line) will always ignore standard
input, no matter what silly things you try to send them on standard input:

    $ echo hi | head /etc/passwd   # head has a pathname; it ignores stdin
    $ echo hi | tail /etc/group    # tail has a pathname; it ignores stdin
    $ echo hi | wc .vimrc          # wc has a pathname; it ignores stdin
    $ sort a | cat b               # "cat" has a pathname; it ignores stdin
    $ cat a | sort b               # "sort" has a pathname; it ignores stdin

Standard input is thrown away if it is sent to a command that ignores it.
The shell *cannot* make a command read stdin; it's up to the command.
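
The behaviour described above can be checked with a short sketch. It
assumes a scratch file named "foo" created just for the demonstration,
and it uses "head -n 1" (the POSIX spelling of "head -1"):

```shell
workdir=$(mktemp -d)                           # scratch directory
printf 'one\ntwo\nthree\n' >"$workdir/foo"     # a three-line input file

# No pathname: head reads standard input, so it sees the piped "hi".
from_stdin=$(echo hi | head -n 1)

# With a pathname: head opens "foo" and never reads the pipe at all.
from_file=$(echo hi | head -n 1 "$workdir/foo")

echo "without a pathname head printed: $from_stdin"
echo "with a pathname head printed:    $from_file"

rm -r "$workdir"                               # clean up
```

The first result should be the piped text and the second should be the
first line of the file; the "hi" sent down the second pipe is never read.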

Commands that do not open and process the contents of files usually
ignore standard input, no matter what silly things you try to send them
on standard input:

    $ echo hi | ls          # ls doesn't open files, it always ignores stdin
    $ echo hi | pwd         # pwd doesn't open files, it always ignores stdin
    $ echo hi | cd          # cd doesn't open files, it always ignores stdin
    $ echo hi | date        # date doesn't open files, it always ignores stdin
    $ echo hi | chmod +x .  # chmod doesn't open files, it always ignores stdin
    $ echo hi | rm foo      # rm doesn't open files, it always ignores stdin
    $ echo hi | rmdir dir   # rmdir doesn't open files, it always ignores stdin
    $ echo hi | echo me     # echo doesn't open files, it always ignores stdin
    $ echo hi | mv a b      # mv doesn't open files, it always ignores stdin
    $ echo hi | cp a b      # cp always needs at least 2 names and ignores stdin

Standard input is thrown away if it is sent to a command that ignores it.
The shell *cannot* make a command read stdin; it's up to the command.

Example of mis-used redirection:
--------------------------------

The very long sequence of pipes below is pointless - the last (rightmost)
command ("head") has a pathname and will open and read it, ignoring all
the standard input coming from the pipes on the left:

    $ head /etc/passwd | sort | tail -3 | sort -r | head -1 /etc/passwd

The above mal-formed pipeline is equivalent to this (same output):

    $ head -1 /etc/passwd

If you give a command a file to process, it will ignore standard input.

=======================
Unique STDIN and STDOUT
=======================

There is only one standard input and one standard output for each command.
Each can only be redirected to *one* other place. You cannot redirect
standard input from two different places, nor can you redirect standard
output into two different places.
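
When you really do need the same output in two places, the usual tool
is the "tee" command, which copies its standard input to a file and also
to its own standard output. A minimal sketch, assuming two arbitrary
scratch file names "copy1" and "copy2":

```shell
workdir=$(mktemp -d)                  # scratch directory

# echo has one stdout, which goes into the pipe. tee writes what it reads
# into "copy1" and also to its own stdout, which the shell sends to "copy2".
echo hello | tee "$workdir/copy1" >"$workdir/copy2"

copy1=$(cat "$workdir/copy1")
copy2=$(cat "$workdir/copy2")
echo "copy1 holds: $copy1"
echo "copy2 holds: $copy2"

rm -r "$workdir"                      # clean up
```

Each command here still has exactly one standard output; it is the extra
"tee" process that does the duplicating, not the shell.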

The Bourne shells (including bash) do not warn you that you are trying
to redirect the input or output of a command from or to two or more
different places (and that only one of the redirections will work - the
others will be ignored):

    bash$ wc <a <b <c <d <e
    - "wc" reads only from file "e"; only the final input redirection wins

    bash$ date >a >b >c >d >e
    - the "date" output goes into file "e"; the other files are created
      by the shell but are empty because only the final redirection wins

    bash$ date >out | wc
          0       0       0
    - the "date" output goes into file "out"; nothing goes into the pipe
      (file redirection always wins over pipe redirection)

Some shells (including the "C" shells) will try to warn you about silly
shell redirection mistakes:

    csh% wc <a <b <c
    Ambiguous input redirect.
    csh% date >a >b >c
    Ambiguous output redirect.
    csh% date >a | wc
    Ambiguous output redirect.

The C shells tell you that you can't redirect stdin or stdout to/from
more than one place at the same time. Bourne shells do not tell you -
they simply ignore the "extra" redirections and do only the last one
of each.

--------------------
Redirection Examples
--------------------

A command line to convert lower-case to upper-case from the "who" command:

    $ who | tr 'a-z' 'A-Z'

Shell question: Are the single quotes required around the two arguments?
(Are there any special characters in the arguments that need protection?)

Using redirection, you can use a similar command to convert a lower-case
file of text into upper-case.

EXPERIMENT: Why doesn't this convert the file "myfile" to upper-case?

    $ date >myfile
    $ tr 'a-z' 'A-Z' <myfile >myfile
    $ wc myfile
    0 0 0 myfile
    - what happened? Why is the file "myfile" empty after this command
      is run?

What about the following command lines - what is in "myfile" when the
command finishes?

    $ cat <myfile >myfile
    $ sort <myfile >myfile
    $ head <myfile >myfile

Given the above, why is "myfile" not left empty in the following case?

    $ wc <myfile >myfile

The following command line doesn't work because the programmer doesn't
understand the "tr" command syntax:

    $ tr 'a-z' 'A-Z' myfile >new

Why does this generate an error message from "tr"?
(The "tr" command is unusual in its handling of command line pathnames.
RTFM)

The following command line redirection is faulty; however, it sometimes
works for small files:

    $ cat foo bar | tr 'a' 'b' | grep "lala" | sort | head >foo

There is a critical race here: the first "cat" command is trying to read
the data out of "foo" while the shell is truncating "foo" to zero as it
launches the "head" command at the end of the pipeline. Depending on the
system load and the size of the file, "cat" may or may not get out all
the data before the "foo" file is truncated or altered by the shell in
the redirection at the end of the pipeline.

Don't depend on long pipelines saving you from bad redirection! Never
redirect output into a file that is being used as input in the same
command or anywhere in the command pipeline.
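
A safe way to get the effect of rewriting a file "in place" is the same
temporary-file workaround used for Mistake #1: send the pipeline output
to a different file, then rename it. The sketch below is only an
illustration; it uses a simplified pipeline (grep and sort only) and
made-up sample data, and the temporary file name comes from "mktemp":

```shell
workdir=$(mktemp -d)                             # scratch directory
printf 'lala b\nlala a\nother\n' >"$workdir/foo" # made-up sample input

tmp=$(mktemp "$workdir/foo.XXXXXX")              # temporary output file

# The pipeline reads "foo" but writes only to the temporary file, so the
# shell never truncates "foo" while it is still being read.
if grep "lala" "$workdir/foo" | sort >"$tmp"; then
    mv "$tmp" "$workdir/foo"                     # replace foo when finished
else
    rm -f "$tmp"                                 # keep foo if the pipeline failed
fi

result=$(cat "$workdir/foo")
echo "$result"

rm -r "$workdir"                                 # clean up
```

Because the rename happens only after the pipeline has finished, there
is no race between the reader and the truncation.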