% Unix/Linux Disk Usage, Hard Links, Quotas, Finding Inodes
% Ian! D. Allen -- [www.idallen.com]
% Winter 2017 - January to April 2017 - Updated 2017-01-20 00:48 EST

- [Course Home Page]
- [Course Outline]
- [All Weeks]
- [Plain Text]

Disk Usage
==========

The `du` command counts the number of disk blocks used in files and
directories, and does it recursively for directories. You can turn off
visible directory recursion using the `-s` option to `du`, and then `du` will
show you only the sum total of disk blocks in that directory, including disk
usage in all subdirectories underneath it. The sizes of the directories
themselves (in disk blocks) are included in the directory totals:

    $ rm -rf new
    $ mkdir new
    $ du new
    4       new
    $ date >new/foo
    $ du new
    8       new
    $ date >new/bar
    $ du new
    12      new
    $ mkdir new/dir
    $ du new
    4       new/dir
    16      new
    $ du -s new
    16      new
    $ date >new/dir/foo
    $ du new
    8       new/dir
    20      new
    $ du -s new
    20      new
    $ rm -r new/*
    $ du new
    4       new

Note that even an empty directory takes up some disk space, and some file
systems allocate a minimum number of disk blocks for any non-empty file or
directory (e.g. `4` blocks, above). In this document, we will assume a
minimum of `4` blocks per non-empty file or directory.

An *empty* file, such as created by `touch`, takes *no* disk space, since it
doesn't need any disk blocks:

    $ rm -rf new; mkdir new ; du new
    4       new
    $ touch new/foo ; du new
    4       new

Given these supposedly successful commands and output:

    $ rm -rf new
    $ mkdir new
    $ cp file1 file2 file3 new
    $ du -s new
    132     new

QUESTION: If I removed *all* the files under `new`, how much disk space would
I free up? (This is the same as asking: How many disk blocks are used by all
the files under `new`? You must not include the disk space used by the
directory itself. Only count the file space.)

ANSWER: The total disk space used in `new` (including the directory itself)
is `132` blocks.
We know `4` blocks are used for the directory itself, leaving `132-4=128`
blocks for all the files inside `new`. So `128` blocks would be freed up by
removing all the files under `new`.

Disk Usage (du), Quotas, and Linked Files
=========================================

The `quota` command shows your disk quota, in disk blocks, if enabled. It
also shows you the number of inodes you are using. A *disk quota* is a limit
on how much disk space, and how many inodes, you can use on the system.
(Quotas are enabled on the CLS.)

Linked files (files with multiple names) don't take up extra disk blocks or
inodes, and so they don't affect the output of `du` or `quota`:

    $ rm -rf new ; mkdir new ; date >new/foo
    $ du new ; quota
    8       new
    Disk quotas for user cst8207a (uid 1002):
       Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
             home     780  204800  512000             193       0       0
    $ ln new/foo new/bar
    $ ln new/bar new/abc
    $ ls -i new
    3138961 abc  3138961 bar  3138961 foo
    $ du new ; quota          # does not show any new blocks or inodes
    8       new
    Disk quotas for user cst8207a (uid 1002):
       Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
             home     780  204800  512000             193       0       0
    $ ls -dils new/*
    3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/abc
    3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/bar
    3138961 4 -rw-r--r-- 3 idallen idallen 29 Oct 21 13:28 new/foo

Since `foo`, `bar`, and `abc` are all names for the same disk blocks and the
same inode, `du` and `quota` count the disk space and inode only once.

To actually free up disk space, all three names for inode `3138961` must be
removed.
Only then will `du` and `quota` show a reduction in space:

    $ rm new/abc ; du new ; quota     # does not release disk space
    8       new
    Disk quotas for user cst8207a (uid 1002):
       Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
             home     780  204800  512000             193       0       0
    $ rm new/bar ; du new ; quota     # does not release disk space
    8       new
    Disk quotas for user cst8207a (uid 1002):
       Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
             home     780  204800  512000             193       0       0
    $ rm new/foo ; du new ; quota     # this releases disk space
    4       new
    Disk quotas for user cst8207a (uid 1002):
       Filesystem  blocks   quota   limit   grace   files   quota   limit   grace
             home     776  204800  512000             192       0       0

To release disk space, *all* the names for an inode must be removed.

Finding Linked files
====================

For a file with multiple names (a link count greater than 1), the multiple
names each lead to the same inode number, but those multiple names may appear
in any directory anywhere in the disk partition:

    $ rm -rf new ; mkdir -p new/dir
    $ date >new/a ; date >new/b ; date >new/c
    $ ln new/a new/dir/x
    $ ln new/b new/dir/y
    $ ln new/c new/dir/z
    $ find new -type f -ls
    3138990 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/c
    3138989 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/b
    3138988 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/a
    3138990 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/z
    3138989 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/y
    3138988 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/x

(Recall that the `-ls` expression to `find`, instead of `-print`, shows the
same detailed output as you would see from `ls -dils`. We also use the
`-type f` expression to limit output to only files.)

Above, the file names with the same inode number do not appear together in
the output, because they are in different directories.
It is hard to notice that the two names `new/c` and `new/dir/z` have the same
inode number and so must be the same file, though we can see that they both
have a link count of 2.

The solution is to use `find` and `sort` to make inode numbers sort together
on your screen, to make finding duplicates by eye easier:

    $ find new -type f -ls | sort
    3138988 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/a
    3138988 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/x
    3138989 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/b
    3138989 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/y
    3138990 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/c
    3138990 4 -rw-r--r-- 2 idallen idallen 29 Oct 21 13:47 new/dir/z

Now, the same inode numbers sort together on your screen and it is easier to
see the duplicate inode numbers and know which files are hard links to each
other. We can easily see that `new/c` and `new/dir/z` have the same inode
number and a link count of 2, so they must be the only two names for this
file inode.

Since every file under `new/dir/` has a link count of 2 and a second name in
another directory, removing all the names in `new/dir`, as in
`rm new/dir/*`, would reclaim no disk space. All that would happen is that
the link count of each file would go down from 2 to 1. To reclaim the disk
space, you have to remove *all* the names of the files, including the names
that are in the other directory `new/[abc]`:

    $ du -s new
    20      new              # see how much space is in use
    $ rm new/dir/*           # remove all those file names
    $ du -s new
    20      new              # no change in space used !
    $ rm new/[abc]           # remove the other names for the files
    $ du -s new
    8       new              # only now is the space reclaimed

You can also use an expression to `find` to find files by **inode number** or
by **link count** (RTFM). Newer versions of `find` even have a `-samefile`
expression to find files with the same inode number as a given file.
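
The three `find` expressions mentioned above can be tried out with a small
sketch like the following. It assumes a `find` that supports `-inum`,
`-links`, and `-samefile` (GNU `find` does; older versions may lack
`-samefile`). The scratch directory and the names `a`, `x`, `y`, and `solo`
are made up for illustration and are not part of the examples above.

```shell
#!/bin/sh
# Sketch only: directory and file names here are hypothetical.
demo=$(mktemp -d)              # private scratch directory
mkdir "$demo/dir"
date >"$demo/a"                # one inode, first name
ln "$demo/a" "$demo/dir/x"     # same inode, second name
ln "$demo/a" "$demo/dir/y"     # same inode, third name
date >"$demo/solo"             # a different inode with only one name

# The inode number is the first field of "ls -i" output.
inum=$(ls -i "$demo/a" | awk '{print $1}')

# 1. By inode number: every name under $demo that leads to that inode.
by_inum=$(find "$demo" -inum "$inum" | sort)

# 2. By link count: every regular file with more than one name.
by_links=$(find "$demo" -type f -links +1 | sort)

# 3. By reference file: every name that is the same file as $demo/a.
by_same=$(find "$demo" -samefile "$demo/a" | sort)

# In this small tree all three searches list the same three names.
printf 'by inode number:\n%s\n' "$by_inum"
printf 'by link count:\n%s\n' "$by_links"
printf 'by -samefile:\n%s\n' "$by_same"

rm -rf "$demo"                 # clean up the scratch directory
```

The name `solo` never appears in the output, because its inode has only one
name. In a larger tree the link-count search would list every multiply-linked
file, not just the names of one inode, so the inode-number or `-samefile`
search is the one to use when you want all the names of one particular file.
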
These expressions are very useful to find all the names of files that have
more than one link.

> Names for files can appear in any directory, so if you don't really know
> where all the names for a file are, you may have to search every directory
> in the entire file system! This can take a long time. Usually you have some
> idea where the other name(s) might be, so you don't have to search
> everything.

--

| Ian! D. Allen, BA, MMath - idallen@idallen.ca - Ottawa, Ontario, Canada
| Home Page: http://idallen.com/ Contact Improv: http://contactimprov.ca/
| College professor (Free/Libre GNU+Linux) at: http://teaching.idallen.com/
| Defend digital freedom: http://eff.org/ and have fun: http://fools.ca/

[Plain Text] - plain text version of this page in [Pandoc Markdown] format

[www.idallen.com]: http://www.idallen.com/
[Course Home Page]: ..
[Course Outline]: course_outline.pdf
[All Weeks]: indexcgi.cgi
[Plain Text]: 457_disk_usage.txt
[Pandoc Markdown]: http://johnmacfarlane.net/pandoc/