Updated: 2016-02-24 04:49 EST
Do not print this assignment on paper!
- On paper, you will miss updates, corrections, and hints added to the online version.
- On paper, you cannot follow any of the hyperlink URLs that lead you to hints and course notes relevant to answering a question.
- On paper, scrolling text boxes will be cut off and not print properly.
15h00 (3pm) Monday February 29, 2016 (start of Week 7)
WARNING: Some inattentive students upload Assignment #5 into the Assignment #4 upload area. Don’t make that mistake! Be exact.
This assignment is based on your weekly Class Notes.
Remember to READ ALL THE WORDS to work effectively and not waste time.
This is an overview of how you are expected to complete this assignment. Read all the words before you start working.
For full marks, follow these directions exactly.
You will create a file system structure in your CLS home directory containing various directories and files. You can use the Checking Program to check your work as you do the tasks.
You can check your work with the Checking Program as often as you like before you submit your final mark. Some task sections below require you to finish the whole section before running the Checking Program; you may not always be able to run the Checking Program successfully after every single task step.
When you are finished the tasks, leave the files and directories in place on the CLS as part of your deliverables. Do not delete any assignment work until after the term is over!
Assignments may be re-marked at any time on the CLS; you must have your term work available on the CLS right until term end.
Since I also do manual marking of student assignments, your final mark may not be the same as the mark submitted using the current version of the Checking Program. I do not guarantee that any version of the Checking Program will find all the errors in your work. Complete your assignments according to the specifications, not according to the incomplete set of the mistakes detected by the Checking Program.
All references to the Source Directory below are to the CLS directory ~idallen/cst8207/16w/assignment05/ and that name starts with a tilde character ~ followed by a user name with no intervening slash. The leading tilde indicates to the shell that the pathname starts with the HOME directory of the account idallen (seven letters).
You do not have permission to list the names of all the files in the Source Directory, but you can access any files whose names you already know.
Before starting the worksheets, read the Readings in the weekly course notes, especially Shell GLOB Patterns and Redirection and Pipes.
See your previous assignments for how best to fill in the worksheets. These worksheets prepare you to do the rest of the tasks listed below. Failure to complete the worksheets will make the rest of this assignment very difficult. Do the worksheets first! Record and save all your worksheet answers for study and quizzes!
Do a Remote Login to the Course Linux Server (CLS). All work in this assignment must be done on the CLS.
Set your PS1 shell prompt.
Use LibreOffice or OpenOffice to complete Worksheet #04 ODT. (View online: Worksheet #04 HTML.) Record and save all your worksheet answers for study and quizzes!
Use LibreOffice or OpenOffice to complete Worksheet #05 ODT. (View online: Worksheet #05 HTML.) Record and save all your worksheet answers for study and quizzes!
Failure to complete the worksheets will make the rest of this assignment very difficult. Do the worksheets first!
You must keep a list of command names used each week and write down what each command does, as described in the List of Commands You Should Know. Without that list to remind you what command names to use, you will find assignments very difficult.
- Create the CST8207-16W directory in your CLS HOME directory.
- Create the Assignments directory in the CST8207-16W directory.
- Create the assignment05 directory in the Assignments directory.
Hint: You can create the entire directory tree above using one single command.
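The one-command hint can be sketched as follows. This is only an illustration, assuming the -p option of mkdir; it builds the tree inside a scratch directory (the variable name demo is hypothetical) rather than in a real HOME directory.

```shell
# Scratch area so the example does not touch a real HOME directory.
demo=$(mktemp -d)

# -p creates any missing parent directories along the path.
mkdir -p "$demo/CST8207-16W/Assignments/assignment05"

# Confirm that all three levels now exist.
find "$demo" -type d
```

Run the same idea from your own HOME directory, without the scratch prefix, to build the real tree.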
This assignment05 directory is called the Base Directory for most pathnames in this assignment. Store your files and answers in this Base Directory, not in your HOME directory or anywhere else.
Run the Checking Program to verify your work so far.
You need to understand Shell GLOB Patterns to do this task.
oldnotes
newnotes
In your HOME directory, create two symbolic links to the old and new course notes for CST8207 using the ln -s command and option and the method described in Copies of the CST8207 Course Notes. (The old notes must be term 15f and the new notes must be term 16w in the pathnames you use.)
Do a long listing of the new oldnotes symlink and verify that it looks similar to this (but the userid and time will differ):
lrwxrwxrwx 1 abcd0001 abcd0001 52 Oct 4 00:00 oldnotes -> /home/idallen/public_html/teaching/cst8207/15f/notes
You should be able to do ls oldnotes | less and see all the course notes file names from last term (15f). If not, remove and redo the symlink.
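A minimal sketch of making and checking a symbolic link, using a scratch directory and a made-up target directory instead of the real course notes pathname (all names here are hypothetical):

```shell
demo=$(mktemp -d)
mkdir "$demo/target"
touch "$demo/target/a.txt" "$demo/target/b.txt"

# ln -s TARGET LINKNAME: the link stores the target pathname as text.
ln -s "$demo/target" "$demo/oldnotes"

# Long listing of the link itself: the mode starts with "l" and an arrow shows the target.
ls -ld "$demo/oldnotes"

# Listing through the link shows the names inside the target directory.
ls "$demo/oldnotes/"
```

If the arrow points at a pathname that does not exist, the second listing fails; remove the link and make it again.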
In your HOME directory, use the ls command with no options and a single shell GLOB pattern to match all pathnames under the symbolic link oldnotes/ that end in .txt and display all the names on your screen. The shell will find 91 pathnames ending in .txt, and the ls command will display those 91 names on your screen. One of the last names on your screen should look exactly like this:
oldnotes/worksheet08.txt
Make sure you see 91 pathnames. (You can use a command pipeline to count the lines and words to be sure you have 91.)
Hints: No pipeline or find command is required to generate the 91 pathnames; just use the ls command with no options and one single GLOB pattern argument starting with the symlink oldnotes/. This use of a GLOB pattern on a command line is illustrated in Copies of the CST8207 Course Notes. The example in the notes uses the given GLOB pattern to generate pathnames to the ls command and count them. Follow the example and display the 91 pathnames on your screen instead of counting them. (Don’t use any redirection.)
textfound.txt
When the ls output on your screen is correct (91 names), redirect the 91 names of output into file textfound.txt under your Base Directory (not under your current HOME directory). The file must contain 91 names, one per line.
Note: The ls command will put each name on a separate line when output is not being sent to your screen.
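The display, count, then redirect sequence can be sketched on a tiny made-up directory. The names notes and found.txt below are hypothetical, and the counts are for this toy directory only, not for the course notes.

```shell
demo=$(mktemp -d)
mkdir "$demo/notes"
touch "$demo/notes/a.txt" "$demo/notes/b.txt" "$demo/notes/c.html"
cd "$demo" || exit 1

# The shell expands notes/*.txt into matching pathnames before ls runs.
ls notes/*.txt

# Count the names with a pipeline before saving anything.
ls notes/*.txt | wc -l

# Redirect once the screen output is right; ls writes one name per line to a file.
ls notes/*.txt >found.txt
wc -l found.txt
```

The pattern is expanded by the shell, not by ls, which is why the same pattern works with any command.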
Still in your HOME directory, use the echo command with a shell GLOB pattern to match all pathnames under oldnotes/ that contain the acronym RTFM anywhere in the file name and display the names on your screen. The shell will find two pathnames, one ending in .html and the other in .txt, and the echo command will display those two names on your screen on one line.
Hints: See the previous Hint. Use only a GLOB pattern.
manfound.txt
When the echo
output on your screen is correct (two names on one line), redirect the output into file manfound.txt
under your Base Directory (not under your current HOME directory). The file must contain two names on one line.
Again in your HOME directory, use the echo command with a shell GLOB pattern to match pathnames under oldnotes/ that contain the digit 2 anywhere in the file name and end in the extension .pdf. The shell will find six pathnames, each ending in .pdf, and the echo command will display those six names on your screen on one (long) line.
When the echo
output on your screen is correct (six names on one long line), change the command name from echo
to ls
and add an option to show the full, long information about the pathnames. You should see six lines on your screen, showing the full file information for each of the six files. One of the lines should look like this:
-rw-r--r-- 2 idallen idallen 25791 Jul 7 2015 oldnotes/2015-2016_CST8207.pdf
pdffound.txt
When the ls output on your screen is correct (six lines), redirect the output into file pdffound.txt under your Base Directory (not under your current HOME directory). The file must contain six lines and approximately 54 words.
Run the Checking Program to verify your work so far.
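The echo-first, ls-second approach for the .pdf names can be tried on a small made-up directory. This is a sketch only; the file names and the output file pdfdemo.txt are hypothetical.

```shell
demo=$(mktemp -d)
mkdir "$demo/notes"
touch "$demo/notes/week2.pdf" "$demo/notes/2015_plan.pdf" \
      "$demo/notes/week3.pdf" "$demo/notes/week2.txt"
cd "$demo" || exit 1

# "*2*" matches a 2 anywhere in the name; ".pdf" anchors the ending.
echo notes/*2*.pdf

# Same pattern, different command: long listing, one line per file.
ls -l notes/*2*.pdf >pdfdemo.txt
wc pdfdemo.txt
```

Checking a pattern with echo costs nothing and shows exactly which arguments the next command would receive.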
As mentioned in Worksheet #03 HTML, choose which text search command you use depending on whether special characters are being used in the search string. You should use the fixed-string fgrep
command to begin with in this introductory course. You will learn regular expressions and the grep
command later in the term. Use fgrep
to begin.
Always verify that the correct output appears on your screen before you redirect the output into a file. You can only redirect what you can see.
Make your Base Directory your current directory for this section.
mypassword.txt
Search for lines containing your login userid in the password file. You should find exactly one line. (For an explanation of what the seven fields are in this line, see man 5 passwd.)
When the output is correct (one line) then redirect the output into file mypassword.txt
in your Base Directory. The file should contain one line.
Search for lines containing a period (dot) character (.) in the file special.txt in the Source Directory.
Hint: A period can be a special character. Choose the right text searching command, as described at the start of this section. The word count of the six lines of correct output should be: 6 42 227
periods.txt
When you have the correct output on your screen, redirect that output into file periods.txt
under your Base Directory. The word count of the file should be the same as above (six lines).
Search for lines containing two asterisk characters (**) in the file special.txt in the Source Directory.
Hint: An asterisk is a special character to the shell. Hide the asterisks so that the shell does not GLOB expand them. Also choose the right text searching command, as described at the start of this section. The word count of the four lines of correct output should be: 4 34 190
asterisks.txt
When you have the correct output on your screen, redirect that output into file asterisks.txt
under your Base Directory. The word count of the file should be the same as above (four lines).
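A small sketch of fixed-string searching for special characters. It uses grep -F, which behaves like the fgrep command named above, on a made-up file; the file content is hypothetical and is not the real special.txt.

```shell
demo=$(mktemp -d)
printf '%s\n' 'no specials here' 'ends with a dot.' \
    'two stars ** inside' 'one star * only' >"$demo/special.txt"

# grep -F treats the search string as plain text,
# so the dot matches only a literal dot.
grep -F '.' "$demo/special.txt"

# Single quotes stop the shell from GLOB-expanding the asterisks.
grep -F '**' "$demo/special.txt"

# Check the count on the screen before redirecting into a file.
grep -F '**' "$demo/special.txt" | wc
```

Without the quotes, the shell would try to expand the asterisks against the names in the current directory before the search command ever ran.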
In your Base Directory, create two more symbolic links to the old and new course notes for CST8207, as you did inside your HOME directory earlier in the assignment.
In Copies of the CST8207 Course Notes, see the example use of fgrep
with shell GLOB patterns to match *.txt
files in these oldnotes
and newnotes
directories. The GLOB pattern easily generates a huge list of file names for fgrep
to search inside.
In the old course notes from last year, use one command to search inside all the .txt files for the word Filezilla (spelled exactly as shown, case-sensitive). Only three lines of text should display, from three files. Each line of text should be preceded by the file name in which it was found. (The word count of the output must be 3 38 324.)
Hint: You will need to use the same GLOB pattern you used earlier in this assignment to match all the .txt files under oldnotes. This time, use the GLOB pattern to make the shell give all the file names to the command that searches for text inside all those files. No pipes are needed to find these lines; use just one command with no options and a single GLOB pattern. If you see more than three lines of output, you are likely using options that make the search case-insensitive. If your word count is wrong because the file names are missing, you are likely using unnecessary pipes to find the files. Don’t do that.
Repeat the above on all the *.txt
files, but add the searching option that ignores case distinctions when matching lines in the files (RTFM). Now, 13 lines are found in six different files.
Hint: These text-searching commands are case-sensitive by default – searching inside files for lines containing abc
won’t find any lines containing ABC
unless you use an option to ignore case distinctions during the search. (What option? RTFM)
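The effect of searching many files at once, with and without case distinctions, can be seen in this sketch. It uses grep -F (the fgrep behaviour) and the -i option on made-up files; the file names and contents are hypothetical.

```shell
demo=$(mktemp -d)
mkdir "$demo/notes"
printf '%s\n' 'Use Filezilla to upload.' >"$demo/notes/a.txt"
printf '%s\n' 'FileZilla is an FTP client.' 'filezilla again' >"$demo/notes/b.txt"
printf '%s\n' 'nothing relevant' >"$demo/notes/c.txt"
cd "$demo" || exit 1

# With several file arguments, each matching line is prefixed by its file name.
grep -F Filezilla notes/*.txt

# -i ignores case distinctions, so more lines match.
grep -F -i Filezilla notes/*.txt
```

If the file-name prefixes are missing from your own output, the search command received its text from a pipe instead of from file name arguments.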
filezilla.txt
When you have the correct output on your screen, redirect that output into file filezilla.txt under your Base Directory.
Run the Checking Program to verify your work so far.
You need to understand Shell GLOB Patterns to do this task.
The “story” here is that a malicious cracker has dumped a bunch of WAREZ files in a directory on the server and has hidden them among thousands of other files. (See https://en.wikipedia.org/wiki/Warez.) Your job is to take a copy of the WAREZ files, and only the WAREZ files, for use in a court case. You must not touch or copy any other files, only the WAREZ files.
There is a directory named warez
under the Source Directory. Hidden (really hidden) deeper under this directory is one single directory containing over 101,000 file names. Be careful about typing ls
in this directory without using any output pagination pipe – the amount of output may flood your terminal window for some time and even a ^C
interrupt may take a minute or two to interrupt the command! One way to avoid flooding your screen is by using ls | wc
to count how many pathnames would be output on your screen before you do just ls
. Be careful!
Find this huge hidden directory and make this huge directory your current directory, so that you can experiment with the GLOB pattern you will need in the following questions.
Hints: This isn’t a maze. There is only one path down to the huge hidden directory inside the warez
directory, though the way is hidden. Remember not to type ls
in this large directory, when you find it, because the output is very large!
Exactly 100 files in this one (huge) directory have names that contain your userid (which must be matched lower-case) followed somewhere later by the text string warez, where warez is case-insensitive and may appear in any combination of upper- and lower-case letters, e.g. warez, Warez, wArez, waREz, etc. Any amount of text may appear before your userid, between your userid and the warez, and after the warez.
Some sample file names for userid abcd0001
might look like these (note that the warez
word must always follow the userid in all the required file names):
PTKabcd0001PTKwAreZkmfGTDDeNTJFZ
zynabcd0001uKVUFOsCXaGFWZPECbYWVFKzynuKWaREZv
HhUtfgYtyGhjJADGekCAkgtZEKsTGKdYZZabcd0001ADGekCwaREZZaFSrXJnxGex
Many of the file names are up to 100 characters long.
warez
Using one single copy command and a single shell GLOB pattern, copy all 100 (exactly 100) of these cracker files (and no others) into a new directory named warez
that you must create in your own Base Directory. Make sure you preserve the modify times of the copied files, as you did in a previous lab. (In this simulation, all the files are empty.)
Hints: Before you try to copy any files, pipe echo with the GLOB pattern into a word count to verify that the pattern matches the right number of file names. The shell must correctly expand the GLOB pattern before you try to use the GLOB pattern in a copy command.
Use a copy command with one shell GLOB pattern to match the 100 file names. The shell can do it all with one copy command using the right GLOB pattern for the source files, as you did in section 4.1 of Worksheet #04 HTML.
Do not use a pipe or find
to select the file names. Use only the copy command with a GLOB pattern for the source files, as you did in section 4.1 of Worksheet #04 HTML.
Do not quote the shell GLOB pattern. Quoting turns off shell GLOB patterns. You want the shell to expand the GLOB pattern for this task! (If you were passing a GLOB pattern as an expression in a find command, you would quote it so that the shell didn’t expand it. That is not what you are doing here.)
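One way to express a case-insensitive word in a shell GLOB pattern is a character class per letter. The sketch below is an illustration under assumptions: it uses the fake userid abcd0001, four made-up file names, and the -p option of cp to preserve modify times.

```shell
demo=$(mktemp -d)
mkdir "$demo/huge" "$demo/warez"
touch "$demo/huge/XXabcd0001YYwArEzZZ" "$demo/huge/abcd0001WAREZ" \
      "$demo/huge/warezQQabcd0001" "$demo/huge/zzzz9999warez"
cd "$demo/huge" || exit 1

# [Ww] matches either case of one letter; repeat for each letter of the word.
# Verify the pattern with echo and a word count before copying anything.
echo *abcd0001*[Ww][Aa][Rr][Ee][Zz]* | wc -w

# -p preserves modify times (and mode) on the copies.
cp -p *abcd0001*[Ww][Aa][Rr][Ee][Zz]* ../warez
ls ../warez | wc -l
```

Here the name with warez before the userid and the name with a different userid are both left behind, because the pattern requires the userid first and the word later.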
copywarez.txt
Put the copy command line that you used into file copywarez.txt
in your Base Directory.
Make sure that the content of the file is exactly the same as the copy command you typed, with no special characters expanded. The number of words in the file should be about four.
Hints: The best way to put this command line in the file is to use a Linux text editor, or you can use the cat
keyboard and EOF method from section 5.5a in Worksheet #05 HTML.
Warning: It is tricky to use echo with redirection to put this command line into the file because the line contains shell metacharacters. You can’t just stick echo on the front of a command line that contains shell metacharacters such as GLOB patterns; the shell will expand all those metacharacters before the echo command runs. You will need special Quoting to make it work. You will need to hide all the shell metacharacters in the command line from the shell. Make sure the command line echoes correctly to the screen before you try to redirect it into the file. You can only redirect what you can see! Use a text editor instead!
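For reference, here is a hedged sketch of two non-interactive ways to store a command line literally. The here-document stands in for the cat keyboard and EOF method so that the example can run unattended; the stored command line and the file names are hypothetical.

```shell
demo=$(mktemp -d)

# Quoting the EOF word makes the shell take the here-document text literally.
cat >"$demo/saved.txt" <<'EOF'
ls /tmp/*.txt | wc -l
EOF

# Single quotes hide the GLOB character and the pipe from the shell,
# so echo receives the whole line as one unchanged argument.
echo 'ls /tmp/*.txt | wc -l' >"$demo/saved2.txt"

# Both files should show the command line exactly as typed.
cat "$demo/saved.txt" "$demo/saved2.txt"
```

A text editor remains the least error-prone choice, because nothing you type in an editor is interpreted by the shell.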
You can check your work by doing a recursive listing of your warez
directory and counting the number of names that were copied.
All the files should have their original modify dates preserved – verify this.
Run the Checking Program to verify your work so far.
You need to understand Shell GLOB Patterns to do this task.
abcd0001.txt
Under the Source Directory there is a name maze
(four letters). What is the absolute path of this maze
under that directory? Put the absolute pathname of this maze
in that directory into a file in your Base Directory with a basename similar to abcd0001.txt
, but use the basename that starts with your own Blackboard userid, not the fake userid abcd0001
. Use your own userid in the file name.
Save the actual absolute pathname, not a shell tilde short-cut for an absolute pathname. (Do not start the name with a tilde.) The file basename must be exactly 12 characters long. The absolute pathname of the maze itself is over 40 characters long.
You will need this maze absolute pathname in several places, below.
Use the GLOB feature to have the shell display on your screen on six lines the six absolute paths of the six file names under the above maze
directory that begin with your userid. (One of the six absolute pathnames will end in abcd0001.txt
where abcd0001
is your own userid.)
Each of the six pathnames should contain seven forward slashes.
Hint: Use the ls
command (no options) with an absolute path shell GLOB pattern as an argument, in a manner similar to how you displayed all the tty
names in section 4.1 of Worksheet #04 HTML.
Use the actual absolute pathname, not a shell tilde short-cut for an absolute pathname. (Do not start the name with a tilde.)
firstmaze.sh
Save the full and exact ls
command line you just used into file firstmaze.sh
in your Base Directory. Pay attention to the file name extension in this file.
Hints: The best way to put this command line in the file is to use a Linux text editor, or you can use the cat
keyboard and EOF method from section 5.5a in Worksheet #05 HTML.
Warning: It is tricky to use echo with redirection to put this command line into the file because the line contains shell metacharacters. You can’t just stick echo on the front of a command line that contains shell metacharacters such as redirection; the shell will do the redirection before the echo command runs. You will need special Quoting to make it work. You will need to hide all the shell metacharacters in the command line from the shell. Make sure the command line echoes correctly to the screen before you try to redirect it into the file. You can only redirect what you can see! Use a text editor instead!
firstmaze.txt
When you have the correct ls
command that generates six lines of output, redirect and save the six lines of ls
output into file firstmaze.txt
in your Base Directory. The file must contain six absolute pathnames, one on each line, each containing your userid.
Save the actual absolute pathnames, not shell tilde short-cuts for absolute pathnames. (Do not start the names with a tilde.) Each of the six pathnames should contain seven forward slashes.
These six pathnames are only six of the many file names in the maze that start with your userid. We need to find them all, in all the sub-directories, too.
find GLOB pattern
Shell GLOB patterns can only look in one directory; they don’t search the entire maze. To find all the files in the maze that start with your userid, we can’t use shell GLOB patterns directly. We need to use that command that searches a directory recursively, and make it use the GLOB pattern. (You have used this command many times already.)
You need to understand Shell GLOB Patterns to do this task. You must know about Finding Files. The shell will not be expanding the GLOB pattern in this task, since you will be quoting them and passing the GLOB pattern to another command for evaluation, but the GLOB pattern metacharacters work the same way to match basenames, as shown in the examples in Finding Files.
We need to hide the GLOB patterns from the shell, since we want to pass the GLOB patterns unchanged to the command we use. Here’s how:
Search for the word quote in the course notes web page on Searching for and finding files by name, size, use, modify time, etc. Read all the paragraphs containing this quote word (search multiple times) and remember the importance of quoting. You will need to know how to do this quoting when you start the finding and searching work for this task on the CLS, below.
quotehow.txt
In the first paragraph you found, above, put the example command line (showing the use of quotes around the *.txt
argument that contains a GLOB character) into file quotehow.txt
in your Base Directory. The file must contain just the example command line text after the e.g.
and it will be one line, three words, 19 characters, according to wc
.
If the count is wrong, look in the file to see what is wrong with the text. Does the file contain exactly the same text as the course notes? If not, edit the file and fix it.
Hints: The best way to put this example line in the file is to use a Linux text editor, or you can use the cat
keyboard and EOF method from section 5.5a in Worksheet #05 HTML.
Warning: It is tricky to use echo with redirection to put this command line into the file because the line contains shell metacharacters. You can’t just stick echo on the front of a command line that contains shell metacharacters such as quotes; the shell will remove the quotes before the echo command runs. You will need special Quoting to make it work. You will need to hide all the shell metacharacters in the command line from the shell. Make sure the command line echoes correctly to the screen before you try to redirect it into the file. You can only redirect what you can see! Use a text editor instead!
Use the absolute pathname of the maze
name in the Source Directory as an argument to ls
along with an option that shows the long information about the pathname. (You already saved this maze pathname in a file, above.)
Use the actual absolute pathname that you saved, not a shell tilde short-cut for an absolute pathname. (Do not start the name with a tilde.) Do not put a trailing slash on the pathname.
Hints: You should see exactly one line of output. You have the right option to ls
if the first word of the output is lrwxrwxrwx
, indicating that maze
is a symbolic link, not a directory.
If the ls
long listing gives you a directory listing full of files instead of one line starting with lrwxrwxrwx
, make sure you are using the right option to ls
and the correct Source Directory path from this assignment and not any previous assignment.
The command you use should use one option and one absolute pathname (with no trailing slash).
We will learn more about symbolic links in a future assignment. For now, note that the maze
symbolic link has an arrow that leads to the same directory maze used in Assignment #03 HTML. (See that assignment for details on the size of this maze.)
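The difference between listing a symbolic link and listing the directory it points to can be seen in this sketch, which assumes the -d option of ls and uses made-up names in a scratch directory:

```shell
demo=$(mktemp -d)
mkdir "$demo/realmaze"
touch "$demo/realmaze/f1" "$demo/realmaze/f2"
ln -s "$demo/realmaze" "$demo/maze"

# A trailing slash makes ls follow the link and list the directory contents.
ls -l "$demo/maze/"

# -d lists the pathname itself; with -l the single line for a symbolic link
# starts with "l" and ends with "-> target".
ls -ld "$demo/maze"
```

One line starting with the letter l means you are looking at the link; several lines mean you are looking inside the target directory.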
lscommand.sh
Save the full and exact ls command line you just used into file lscommand.sh in your Base Directory. Pay attention to the file name extension in this file.
mazeinfo.txt
When you have the correct ls command line that generates one long line of output, redirect and save the output (one line) into file mazeinfo.txt under your Base Directory.
abcd0001
Again, in a manner similar to your previous assignments, you must find files in this maze, using the maze as the starting directory. The symbolic link requires some special handling, because the command that recursively finds files does not follow symbolic link arguments on the command line without using an option. You must choose one of these methods to search this symbolic link maze (choose one):
- Use the option that follows symbolic links only while processing the command line arguments, and do not use the -L option, OR
- Make maze your current directory and then recursively search the current directory. (A current directory can never be a symbolic link – it must be a real directory.)
You will choose one of the previous two starting directory methods to reach the maze when you start searching, below.
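Why the symbolic link needs special handling can be seen in a small sketch. It uses made-up names in a scratch directory and shows only the change-directory method.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/realmaze/sub"
touch "$demo/realmaze/sub/file"
ln -s "$demo/realmaze" "$demo/maze"

# Given a symbolic link as the starting point, find reports only the link itself.
find "$demo/maze" | wc -l

# After cd, the current directory is the real directory, so "." is searched fully.
cd "$demo/maze" || exit 1
find . | wc -l
```

The first count is a single pathname (the link), while the second count includes the directory, its sub-directory, and the file inside it.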
As you know from a previous assignment, this maze
contains many hidden sub-directories. With this maze as a starting directory, and using one of the two above methods, use a single command (no pipes needed) to recursively find all pathnames with a basename that begins with your eight-character userid at the start of the name.
For example, if your userid were abcd0001
then you might match and output pathnames containing basenames such as abcd0001
and abcd0001YYY
but not XXXabcd0001
or XXXabcd0001YYY
or abcdYYYY
where XXX
and YYY
can be any non-empty strings of characters. Your own userid must start every basename.
Your single recursive command should find exactly 23 pathnames.
Hint: You must use a single command (not a pipeline) that is good at Finding Files by a basename pattern to do this. Do not try to use cd
and ls
to find all the files; the maze is really, really big.
Hint: You have previously used this recursive command many times without a pattern for a basename. This task requires you to use a quoted GLOB pattern that matches your userid followed by zero or more characters. The command you use should recursively find exactly 23 pathnames, all containing your userid.
Hint: If you don’t find any pathnames, re-read the section on Methods, above. If you only find a few pathnames, re-read the section on quotes, above.
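A quoted GLOB pattern passed to a recursive search can be sketched on a toy maze. The fake userid abcd0001 and all the directory and file names are hypothetical; the counts apply to this toy tree only.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/maze/.hidden/deep"
touch "$demo/maze/abcd0001" "$demo/maze/.hidden/abcd0001YYY" \
      "$demo/maze/.hidden/deep/XXXabcd0001" "$demo/maze/.hidden/deep/abcdYYYY"
cd "$demo/maze" || exit 1

# The quotes deliver the pattern to find unexpanded; find then matches it
# against the basename of every pathname, including those in hidden directories.
find . -name 'abcd0001*'

# Count the matches (this pipeline is only for checking the example).
find . -name 'abcd0001*' | wc -l
```

Without the quotes, the shell would expand the pattern against the current directory first, and the search would look only for whatever names that expansion produced.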
When you see all 23 pathnames on your screen, take the same single command you used to find the names above and modify it to use the expression that makes the command show the full detailed attribute information about the names (including permissions, owner, size, date, etc.) instead of just the pathname. Use the same command; just remove -print
and add the right expression.
You will know you have the right expression if the output of the command is 23 lines and approximately 256 words (instead of 23 words).
Hint: You know which expression to use from your answers in Worksheet #02 HTML and Worksheet #03 HTML and from reading the detailed attribute information paragraph at the end of Section 2 of the Finding Files notes.
You may want to review using pipes in Worksheet #05 HTML and Redirection and Pipes to do this next item.
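mazefound1.txt
Pipe the detailed output of that command into a sort command and redirect the sorted output into file mazefound1.txt under your Base Directory. The sorted file will still contain exactly the same number of lines and words as you counted, above.
findcom1.sh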
Put the above two-command pipeline with redirection that you just used, into file findcom1.sh
in your Base Directory. Pay attention to the file name extension in this file.
Hints: The best way to put this command line in the file is to use a Linux text editor, or you can use the cat
keyboard and EOF method from section 5.5a in Worksheet #05 HTML.
Warning: It is tricky to use echo with redirection to put this command line into the file because the line contains shell metacharacters. You can’t just stick echo on the front of a command line that contains shell metacharacters such as pipes; the shell will execute the pipes before the echo command runs. You will need special Quoting to make it work. You will need to hide all the shell metacharacters in the command line from the shell. Make sure the command line echoes correctly to the screen before you try to redirect it into the file. You can only redirect what you can see! Use a text editor instead!
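The shape of a two-command pipeline with redirection can be sketched as below. It assumes the -ls expression of find for the detailed attribute output and plain sort for the ordering; the userid, file names, and output file are hypothetical.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/maze/sub"
touch "$demo/maze/abcd0001b" "$demo/maze/sub/abcd0001a"
cd "$demo/maze" || exit 1

# -ls replaces the default -print and shows one detailed line per match.
find . -name 'abcd0001*' -ls

# Two-command pipeline with redirection: sort the detailed lines, then save them.
find . -name 'abcd0001*' -ls | sort >"$demo/found.txt"
wc "$demo/found.txt"
```

Sorting changes only the order of the lines, so the line and word counts of the saved file match the unsorted output.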
abcd0001 anywhere
In this same maze, use a single command (not a pipeline) to recursively find all pathnames with a basename that contains your eight-character userid anywhere in the name.
For example, if your userid were abcd0001
then you might output pathnames containing basenames such as abcd0001
, abcd0001YYY
, XXXabcd0001
, and XXXabcd0001YYY
where XXX
and YYY
can be anything (zero or more characters). Your own userid will be somewhere in every basename.
Your single recursive command should find exactly 47 pathnames.
Hint: See the hints for the previous section. This command line is a simple modification of the previous one.
When you see all 47 pathnames on your screen, take the same single command you used to find the names above and modify it to use again the expression that makes the command show the detailed attribute information about the names, as you did above.
You will know you have the right expression if the output of the command is 47 lines and approximately 535 words (instead of 47 words).
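mazefound2.txt
Pipe the detailed output of that command into a reverse sort and redirect the reverse-sorted output into file mazefound2.txt under your Base Directory. The reverse-sorted file will still contain exactly the same number of lines and words as you counted, above.
findcom2.txt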
Put the above two-command pipeline with redirection that you just used, into file findcom2.txt
in your Base Directory.
Hint: See the hints for the previous section.
Run the Checking Program to verify your work so far.
Somewhere under the warez
directory in the Source Directory you used earlier for the WAREZ problem are exactly three non-empty files whose names contain your userid (lower-case) somewhere (anywhere) in the name. (Most of the other files in the WAREZ directory whose names contain your userid are empty files.)
Use a command to recursively find and display these three non-empty (size larger than zero) files with your userid anywhere in the name.
Hints: What command finds files based on expressions that can include both size and a basename that can be a GLOB-style pattern? You have used this command many times this term. See the end of Worksheet #02 HTML and the “multiple expressions” example in Finding Files.
You will find your userid mentioned inside each file, but because the files are not all Unix/Linux text files, some of the text content may not display correctly on your terminal screen. The less
command is better than cat
when displaying files containing strange (e.g. unprintable) characters, but see also the “show-nonprinting” option to cat
.
When you know the three pathnames, manually copy each of these files (preserving modify times) to a new directory named 3OSfiles
that you must create in your Base Directory.
Since there are only three file names, you can use your mouse to copy-and-paste the three long file names you need to copy, once you know their names. Be careful to use quoting to hide any blanks in the names from the shell.
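Combining a name expression with a size expression, then copying a name that contains blanks, can be sketched as follows. The sketch assumes the -size +0 expression of find and the -p option of cp; the userid and all file names are made up.

```shell
demo=$(mktemp -d)
mkdir -p "$demo/warez/deep" "$demo/3OSfiles"
touch "$demo/warez/deep/abcd0001 empty"
printf 'data\n' >"$demo/warez/deep/xx abcd0001 full"
printf 'data\n' >"$demo/warez/other"

# Both expressions must be true: the basename contains the userid
# AND the size is larger than zero.
find "$demo/warez" -name '*abcd0001*' -size +0

# Quotes keep the blanks in the name from splitting it into several arguments.
cp -p "$demo/warez/deep/xx abcd0001 full" "$demo/3OSfiles"
ls "$demo/3OSfiles"
```

The empty file matches the name but not the size, and the file named other matches the size but not the name, so only one pathname is displayed.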
(Optional advanced use: You can also read this optional material on a better way to use find -exec and xargs.)
unix
windows
macintosh
In your 3OSfiles
directory, determine which operating system created each of the three non-empty files. Rename the Unix/Linux file to be unix
, the Windows file to be windows
and the Macintosh file to be macintosh
.
Hints: In Assignment #02 HTML you used a command that can determine file type to identify the text inside a date.txt
file. You will also find this command listed under Week 02 in the List of Commands in your notebook. Use this command and the notes on Text File Line End Differences to identify the special line endings of the Windows and Macintosh files.
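As a hedged complement to the file-type command, the sketch below builds three tiny sample files and uses od -c to show the actual line-end bytes. The file names a, b, and c are hypothetical; the three line-end conventions are the ones described in Text File Line End Differences.

```shell
demo=$(mktemp -d)
printf 'one\ntwo\n'     >"$demo/a"   # LF line ends (Unix/Linux)
printf 'one\r\ntwo\r\n' >"$demo/b"   # CR+LF line ends (Windows)
printf 'one\rtwo\r'     >"$demo/c"   # CR line ends (classic Macintosh)

# od -c prints each byte, showing \r and \n explicitly.
for f in a b c; do
    echo "== $f"
    od -c "$demo/$f"
done
```

Looking at the bytes this way explains why some of the files may display oddly with cat on a Linux terminal.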
Your instructor will also mark the Base Directory in your account on the due date. Leave everything there on the CLS. Do not delete anything.
Run the Checking Program to verify your work so far.
You need to understand Redirection and Pipes to do this task.
wc
Count the lines, words, and characters in the file services under the /etc directory and put the count in file wc under your Base Directory. (Use the absolute pathname of the services file when you count and do not use any pipes.) The file wc should contain one line containing three numbers and an absolute pathname at the end. There is no file extension on this file; Linux doesn’t care.
Extract just the first line of the same services file and append this one line to the end of the wc file, so that the file wc now has two lines in it (the word count line and the first line of services).
Hint: You know a command that shows lines at the start of a file. Review your work in Worksheet #05 HTML and the notes on Redirection and Pipes.
Append the count of the lines, words, and characters in the file protocols in the /etc directory to the end of file wc, so that the wc file now has three lines in it. (Use the absolute pathname of the protocols file when you count and do not use any pipes.)
Extract just the last line of the same protocols file and append just this one line to the end of the wc file, so that the file wc now has four lines in it.
Hint: You know a command that shows lines at the end of a file. Review your work in Worksheet #05 HTML and the notes on Redirection and Pipes.
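The difference between creating a file with > and appending to it with >> can be sketched on a small made-up file. The names sample.txt and out are hypothetical; the assignment uses different files and you must choose the right pathnames yourself.

```shell
# Hypothetical three-line input file in a scratch directory.
demo=$(mktemp -d)
printf 'first line\nmiddle line\nlast line\n' > "$demo/sample.txt"

wc "$demo/sample.txt"        >  "$demo/out"   # > creates (or truncates) the file
head -n 1 "$demo/sample.txt" >> "$demo/out"   # >> appends to the end
tail -n 1 "$demo/sample.txt" >> "$demo/out"   # >> appends again

cat "$demo/out"
```

If you use > where >> was intended, the earlier lines are lost, because > empties the file before the command writes to it.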
Confirm that the word count of the wc file gives 4 20 140. If you see the right number of lines but the other values differ, go back and re-read all the words in the sentences above, especially the sentences that start with the words “Use the”.
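One reason the character count can differ is that the word count command prints the pathname exactly as you typed it. A small sketch, using a hypothetical sample.txt in a scratch directory, shows the same file counted with an absolute and a relative pathname:

```shell
# Hypothetical two-line file in a scratch directory.
demo=$(mktemp -d)
printf 'alpha beta\ngamma\n' > "$demo/sample.txt"

abs=$(wc "$demo/sample.txt")          # name is printed as the absolute path typed
rel=$(cd "$demo" && wc sample.txt)    # name is printed as the relative path typed
echo "$abs"
echo "$rel"
```

The three numbers are identical, but the two output lines have different lengths, so a file that saves one of these lines will have a different character count than a file that saves the other.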
Run the Checking Program to verify your work so far.
The Course Linux Server is on the open Internet and is under constant attack on its SSH login port. The Denyhosts intrusion protection system locks out attacking IP addresses so that they are refused when they try again. We will find the seven most common refused IP addresses.
You need to understand Redirection and Pipes to do this task, especially the section on Using successive filters in pipes.
The course notes file Selecting Fields with awk explains how to use the command that extracts fields from lines.
Copy the six-command pipeline used in Example 2 given in Using successive filters in pipes, modify it as follows, and then run it:
Change Jan to be the first month of the current academic term (three letters, with a space following).
Hints: Do not change any other parts of the existing six commands in the pipeline. All you need to do is (possibly) change the month and add a seventh filter command. The first line of output must be 751 (223.252.32.73) and the last line (of seven lines) must be 69 (106.51.226.25).
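The general “count and rank” idea behind this kind of pipeline can be sketched on made-up data. The addresses below are documentation-only examples, and this is not the pipeline from the notes; it only shows how sorting, counting duplicates, sorting by count, and keeping the first few lines fit together.

```shell
# Six hypothetical addresses, some repeated; keep only the two most frequent.
top=$(printf '%s\n' 192.0.2.1 192.0.2.2 192.0.2.1 192.0.2.3 192.0.2.1 192.0.2.2 |
    sort |                       # group identical lines together
    uniq -c |                    # collapse each group into "count value"
    sort -nr |                   # largest count first
    head -n 2 |                  # keep only the first two lines
    awk '{ print $1, $2 }')      # tidy the spacing that uniq -c adds
echo "$top"
```

The duplicates must be sorted together before counting, because uniq only collapses lines that are adjacent.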
refused7.sh
When the output is correct, put the new seven-command pipeline you used into file refused7.sh in the Base Directory. You can put it on separate lines with backslashes at the end of each line, as shown in the notes, or you can remove the backslashes and put it all on one long line.
Typing sh -u refused7.sh should print the seven most active attack IP addresses for the one month on your screen. If it doesn’t do this, you haven’t copied the command line correctly. Check it!
You can debug your script file by running it like this:
bash -ux refused7.sh
and making sure you see seven commands execute before the output appears.
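Here is a minimal sketch of a script file holding a pipeline split over several lines with backslashes, and of running it with sh -u. The file name demo.sh and its three-command pipeline are hypothetical and are not the answer to this task. The -u option makes the shell treat any unset variable as an error, and adding -x makes the shell print each command as it executes.

```shell
# Write a hypothetical three-command pipeline into a script file.
# The quoted 'EOF' keeps the backslashes exactly as typed.
demo=$(mktemp -d)
cat > "$demo/demo.sh" <<'EOF'
printf '%s\n' b a b \
    | sort \
    | uniq -c
EOF

# Run the script; -u reports unset variables as errors.
out=$(sh -u "$demo/demo.sh")
echo "$out"

# Run it again with tracing; each traced command is shown with a leading +.
sh -ux "$demo/demo.sh"
```

In the traced run you should see one + line for each of the three commands in the pipeline, in addition to the normal output.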
Edit the refused7.sh file and add to the end of the file, underneath your seven-command pipeline, exactly seven numbered shell comments that explain briefly and in your own words the meaning of each of the seven commands used in the pipeline, using the comment format described below.
Shell script comments start with the number-sign (or hash-tag) character # and extend to the end of the line. The seven numbered comment lines must have a syntax similar to this (though this is the wrong pipeline and wrong comments to use for this task):
last idallen | awk '{ print $3 }' | grep '^[0-9]' | sort | uniq | wc -l
# 1. last idallen: show last login lines only for user idallen
# 2. awk '{ print $3 }': display only third field (IP address)
# 3. grep '^[0-9]': select only lines starting with a digit
# 4. sort: put IP addresses into sorted order
# 5. uniq: throw away duplicate adjacent IP addresses, leaving only unique
# 6. wc -l: count the number of unique IP addresses (number of lines)
Comment Format: Since there are seven commands in your script pipeline, you will need to write exactly seven numbered comment lines to explain them. As you see in the above example, each of the seven comment lines starts at the left margin with the # comment character (no spaces in front), followed by a space, number, a period, space, the pipeline command name and options to which the comment refers, and then your own comment text written in your own words. Each comment text is written in your own words to explain what the command does in the pipeline. Do not copy words; write your own.
Follow the syntax shown in the above example, and use your own words (don’t copy mine). Including the seven comment lines, your refused7.sh file will be at least eight lines long.
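If you want to self-check the comment format, one possible approach is to count the lines that match the described layout. This sketch uses a hypothetical two-command script named demo.sh; the pattern is my own guess at a reasonable check and is not necessarily what the Checking Program uses.

```shell
# A hypothetical script with a two-command pipeline and two numbered comments.
demo=$(mktemp -d)
cat > "$demo/demo.sh" <<'EOF'
date | wc -c
# 1. date: print the current date and time
# 2. wc -c: count the characters in that output
EOF

# Count lines that start with "#", a space, one digit, a period, and a space.
n=$(grep -c '^# [0-9]\. ' "$demo/demo.sh")
echo "$n"
```

For this two-command example the count is 2; for a seven-command pipeline you would expect 7.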
The Course Linux Server is on the open Internet and is under constant attack on its SSH login port. The Denyhosts intrusion protection system locks out attacking IP addresses and logs the event. We will find the month in 2015 with the most locked out IP addresses.
Write a command to count the number of lines containing the string new denied hosts in the denyhosts-2015 log file on the CLS. (This log file is in the same directory as the auth.log file used in the previous item and in most of the Weekly Class Notes.) You should find 2154 matching lines in the file.
Hint: Look at the contents of the log file before you begin. My solution used one command name with no pipes needed. I used an option that counted the number of matching lines, as shown in the weekly course notes.
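Counting matching lines with a single command can be sketched on a small made-up log. The file sample.log and its line format are invented for illustration and deliberately do not match the real log file; look at the real file to see what it contains.

```shell
# A hypothetical five-line log in a scratch directory (NOT the real log format).
demo=$(mktemp -d)
cat > "$demo/sample.log" <<'EOF'
Y2015-M08 new denied hosts: 192.0.2.10
Y2015-M09 new denied hosts: 192.0.2.11
Y2015-M09 daemon restarted
Y2015-M09 new denied hosts: 192.0.2.12
Y2015-M10 new denied hosts: 192.0.2.13
EOF

# grep -c prints the number of matching lines instead of the lines themselves.
count=$(grep -c 'new denied hosts' "$demo/sample.log")
echo "$count"
```

Quoting the search string keeps the blanks in it as part of one argument, so the shell does not treat the later words as file names.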
denycmd1.sh
When the output is correct, put the command line you used to generate the number 2154 into file denycmd1.sh in the Base Directory.
Typing sh -u denycmd1.sh should print the number 2154 on your screen. If it doesn’t do this, you haven’t copied the command line correctly. Check it!
You can debug your script file by running it like this:
bash -ux denycmd1.sh
and making sure you see the correct command execute before the output appears.
Write a command pipeline (using pipes) to count the number of lines containing the string new denied hosts in only September 2015 in the denyhosts-2015 log file on the CLS. You should find 177 matching lines to count and the output should be the number 177.
Hints: The Example 1 given in Using successive filters in pipes explains how you might find some lines in the auth.log file that were created in January. Apply what you learn there to solve this problem. Before you try, look at the denyhosts-2015 file and find out what format it uses to represent the date “September 2015”. You can’t just look for the text “September 2015” in the file; it’s not there. Look into the file to see the actual date format and create a filter command to search for that date format and count the lines. My solution used two command names with one pipe between. The second command used an option that counted the number of matching lines, as shown in the weekly course notes.
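The “filter by date, then count” idea can be sketched as two commands joined by one pipe. The log below is made up, and its Y2015-M09 date format is an invention that deliberately does not match the real file; you still need to look in the real log to find its actual date format.

```shell
# A hypothetical five-line log in a scratch directory (NOT the real log format).
demo=$(mktemp -d)
cat > "$demo/sample.log" <<'EOF'
Y2015-M08 new denied hosts: 192.0.2.10
Y2015-M09 new denied hosts: 192.0.2.11
Y2015-M09 daemon restarted
Y2015-M09 new denied hosts: 192.0.2.12
Y2015-M10 new denied hosts: 192.0.2.13
EOF

# First filter: keep only lines from the wanted month (anchored at line start).
# Second filter: count the lines that also contain the wanted string.
count=$(grep '^Y2015-M09' "$demo/sample.log" | grep -c 'new denied hosts')
echo "$count"
```

Three lines are from the wanted month, but only two of those contain the string, so the count is 2.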
denycmd2.sh
When the command pipeline is correct, put the command pipeline you used to generate the number 177 into file denycmd2.sh in the Base Directory.
Typing sh -u denycmd2.sh should print the number 177 on your screen. If it doesn’t do this, you haven’t copied the command line correctly. Check it!
You can debug your script file by running it like this:
bash -ux denycmd2.sh
and making sure you see the correct commands in the pipeline execute before the output appears.
Using your shell history and the command you used in the previous item, modify and redo the command a few times to manually find the number of denied hosts in each month in 2015. Use this to determine the month with the largest number of denied hosts (390).
Hint: It’s one of the months before June.
denyhosts3.txt
When you find the month with the largest number of denied hosts, put the first five lines and the last five lines of log entries for this month into file denyhosts3.txt in the Base Directory.
Hint: Use a command pipeline to generate the first five lines of log output for this month and save them, then modify the command pipeline slightly to generate the last five lines of log output for this month and append them to the file containing the first five lines. That is your answer. The first five lines should be from the start of the month and the last five lines should be from the end of the month. The word count of this ten-line file should be: 10 100 854 and the sum should be 28039.
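The “first five, then last five” approach can be sketched on generated data. The file sample.log, the word entry, and the output file out.txt are all hypothetical; the point is only the shape of the two pipelines and the use of > followed by >>.

```shell
# Generate a hypothetical 40-line log: "entry 1" through "entry 40".
demo=$(mktemp -d)
seq 1 40 | sed 's/^/entry /' > "$demo/sample.log"

# Select the lines of interest, keep the first five, and create the file.
grep '^entry 2' "$demo/sample.log" | head -n 5 >  "$demo/out.txt"
# Same selection, keep the last five, and append to the same file.
grep '^entry 2' "$demo/sample.log" | tail -n 5 >> "$demo/out.txt"

# Check the result the same way the hint suggests: count it and checksum it.
wc "$demo/out.txt"
sum "$demo/out.txt"
cat "$demo/out.txt"
```

Only the last command in each pipeline changes between the two steps, which is why editing the previous command from your shell history is enough.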
Run the Checking Program to verify your work so far.
That is all the tasks you need to do.
Check your work a final time using the Checking Program below and save the standard output of that program into a file as described below. Submit that file (and only that one file) to Blackboard following the directions below.
When you are done, log out of the CLS before you close your laptop or close the PuTTY window, by using the shell exit command:
$ exit
Summary: Do some tasks, then run the Checking Program to verify your work as you go. You can run the Checking Program as often as you want. When you have the best mark, upload the single file that is the output of the Checking Program to Blackboard.
Since I also do manual marking of student assignments, your final mark may not be the same as the mark submitted using the current version of the Checking Program. I do not guarantee that any version of the Checking Program will find all the errors in your work. Complete your assignments according to the specifications, not according to the incomplete set of the mistakes detected by the Checking Program.
There is a Checking Program named assignment05check in the Source Directory on the CLS. You can execute this program by typing its (long) pathname into the shell as a command name:
$ ~idallen/cst8207/16w/assignment05/assignment05check
You will learn of ways to make this shorter in future assignments.
When you are done, execute the above Checking Program as a command line on the CLS. This program will check your work, assign you a mark, and display the output on your screen.
If the Checking Program is not yet ready, it will say NOT FINISHED YET and DO NOT SUBMIT THIS FILE. No mark is shown; do not submit the file. Wait until the checking program is finished (it gives you a mark) before you save and submit your marks.
You may run the Checking Program as many times as you wish, allowing you to correct mistakes and get the best mark. Some task sections require you to finish the whole section before running the Checking Program at the end; you may not always be able to run the Checking Program successfully after every single task step.
When you are done with this assignment, and you like the mark displayed on your screen by the Checking Program, you must redirect only the standard output of the Checking Program into the text file assignment05.txt in your Base Directory on the CLS, like this:
$ ~idallen/cst8207/16w/assignment05/assignment05check >assignment05.txt
$ less assignment05.txt
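What “only the standard output” means can be sketched with a stand-in command that writes one line to each output stream. The brace group and the file out.txt are hypothetical stand-ins, not the Checking Program.

```shell
# A stand-in command: one line to standard output, one to standard error.
demo=$(mktemp -d)
{ echo 'to stdout'; echo 'to stderr' >&2; } > "$demo/out.txt"

# Only the standard output line was saved; the other line went to the screen.
cat "$demo/out.txt"
```

A plain > redirects standard output only, so any error messages still appear on your terminal and do not end up in the file.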
Use the exact assignment05.txt file name. Check that the file contains the text YOUR MARK for.
Transfer the file assignment05.txt (containing the output from the Checking Program) from the CLS to your local computer. Check that the transferred file contains the text YOUR MARK for.
Submit the assignment05.txt file from your local computer to the correct Assignment area on Blackboard (with the exact name) before the due date:
Attach the assignment05 file from your local computer. Make sure the assignment file has the correct name on your local computer before you attach it. Attach only your assignment05.txt file for upload. Do not attach any other file names.
After you attach the assignment05.txt file on the Upload Assignment page, scroll down to the bottom of the page and use the Submit button to actually upload your attached assignment05.txt file to Blackboard.
Use only Attach File, Browse My Computer on the Upload Assignment page. Do not enter any text into the Write Submission or Add Comments boxes on Blackboard; I do not read them. Use only the Attach File, Browse My Computer section followed by the Submit button. If you need to comment on any assignment submission, send me EMail.
You can revise and upload the file more than once using the Start New button on the Review Submission History page to open a new Upload Assignment page. I only look at the most recent submission.
You must upload the file with the correct name from your local computer; you cannot correct the name as you upload it to Blackboard.
You will also see the Review Submission History page any time you already have an assignment attempt uploaded and you click on the underlined assignment05 link. You can use the Start New button on this page to re-upload your assignment as many times as you like.
You cannot delete an assignment attempt, but you can always upload a new version. I only mark the latest version.
Your instructor may also mark files in your directory in your CLS account after the due date. Leave everything there on the CLS. Do not delete any assignment work from the CLS until after the term is over!
I do not accept any assignment submissions by EMail. Use only the Blackboard Attach File, Browse My Computer. No word processor documents. Plain Text only.
Use the exact file name given above. Upload only one single file of Linux-format plain text, not HTML, not RTF, not MSWord. No fonts, no word-processing. Linux plain text only.
NO EMAIL, WORD PROCESSOR, PDF, RTF, or HTML DOCUMENTS ACCEPTED.
No marks are awarded for submitting under the wrong assignment number or for using the wrong file name. Use the exact 16-character, lower-case name given above.
WARNING: Some inattentive students don’t read all these words. Don’t make that mistake! Be exact.
READ ALL THE WORDS. OH PLEASE, PLEASE, PLEASE READ ALL THE WORDS!