Hard Link Soft Symbolic Links

Hard links and soft links are known to and used by almost anyone who works at the Unix console. Everyone knows that a soft link can span file systems while a hard link cannot extend beyond its own file system, yet the explanation remains cloudy to most. We go for decades working with Unix and never clear up the issue. This article explains in detail what hard links are, why they are confined to their own file system, and how they affect the existence of a file. But first let's explain soft links, or symbolic links, whichever you wish to call them.

There is a prerequisite to this. Please read the article on files and directory permissions, which explains how a directory maps each file name to an inode number, and how the inode holds all the information about a file except the file's name. I assume that this is clear before continuing.

Soft or Symbolic Link:

I explained in the previous article that a directory contains nothing but file names, and for each file name there is an inode number. This i-node number refers to the i-node structure for the file. One of the fields in the i-node structure, the mode, defines the type of file. When the ls -l command lists a file that is a symbolic link, it shows an l as the first character of the line.

farhad@farhad-desktop:/tmp/test$ ln -s /usr/bin/java java ; ls -l
lrwxrwxrwx 1 farhad farhad 13 2010-12-25 01:36 java -> /usr/bin/java

A note on permissions on symbolic links: they have no effect at all; what counts is the permissions of the destination file. Now, what I just created is an entry in the directory /tmp/test/ that has the name java. In fact I just created a new file with its own new i-node. The type of file recorded in the inode is symbolic link. The contents of this file are the 13 bytes that form the string "/usr/bin/java" (count the characters). Classically those bytes sit in data blocks reserved for the new file, although many file systems (ext4, for example) store a path this short inside the inode itself. Either way, 13 is the size shown in the ls -l output above.
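To check for yourself that a symbolic link holds nothing but a path string, here is a minimal sketch. The temporary directory and the variable names are my own choices for illustration; the link target is the same /usr/bin/java as above, and it does not even need to exist on your machine.

```shell
# Throwaway directory so nothing real is touched (hypothetical location).
dir=$(mktemp -d)

# Create the symbolic link; the target does not have to exist.
ln -s /usr/bin/java "$dir/java"

# readlink prints the path stored in the link without following it.
target=$(readlink "$dir/java")
echo "stored path: $target"

# ${#target} is the length of that string.
echo "length: ${#target}"

# The fifth column of ls -l is the size of the link itself.
size=$(ls -l "$dir/java" | awk '{print $5}')
echo "size reported by ls -l: $size"
```

The length of the stored path and the size reported by ls -l should be the same number.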

Let's show the i-node number for the newly created file /tmp/test/java and the one we had before, /usr/bin/java:

farhad@farhad-desktop:/tmp/test$ ls -i java /usr/bin/java
6426080 java  4206093 /usr/bin/java
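The same comparison can be scripted. This is only a sketch with made-up file names in a temporary directory; it shows that the link and its destination are two files with two different inode numbers.

```shell
dir=$(mktemp -d)
echo "hello" > "$dir/target.txt"
ln -s "$dir/target.txt" "$dir/link"

# ls -i prints the inode number first; -d lists the operand itself.
link_ino=$(ls -id "$dir/link" | awk '{print $1}')
target_ino=$(ls -id "$dir/target.txt" | awk '{print $1}')

echo "link inode:   $link_ino"
echo "target inode: $target_ino"

# test -L is true only when the name is a symbolic link.
if [ -L "$dir/link" ]; then
    echo "link is a symbolic link"
fi
```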

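Let's show this graphically:

[Figure: the directory entry java in /tmp/test refers to inode 6426080, whose data is the string "/usr/bin/java"; that path leads to the entry java in /usr/bin, which refers to inode 4206093.]

I could have saved all the wording and just shown this picture, which took me an hour to draw. The inode 6426080 is of type S_IFLNK, a symbolic link.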

When the file /tmp/test/java is accessed, the kernel sees that it is a symbolic link and reads its contents to find out where the destination file is. If the destination is itself a symbolic link, its contents are read and followed too. This continues until something other than a symbolic link is found. Whether the link is followed at all is program dependent: the programmer chooses to follow it (stat, for example) or to look at the link itself (lstat). Notice that the symbolic link can be deleted without affecting the destination file in any way. They are completely different files, each with its own inode. Deleting the destination likewise leaves the link in place, only now pointing at nothing. That is why the two are allowed to reside on separate file systems: the link stores nothing but a path name, which is resolved each time it is accessed, so neither file system holds a reference into the other's inode table.
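A short sketch of this independence, again with hypothetical names in a temporary directory: cat follows the link, readlink does not, and removing either file leaves the other one alone.

```shell
dir=$(mktemp -d)
echo "hello" > "$dir/target.txt"
ln -s "$dir/target.txt" "$dir/link"

# cat follows the link and reads the destination's data.
via_link=$(cat "$dir/link")

# readlink does not follow; it returns the stored path.
stored=$(readlink "$dir/link")

# Removing the link leaves the destination untouched.
rm "$dir/link"
after_rm=$(cat "$dir/target.txt")

# Removing the destination leaves the link behind, now dangling:
# -L sees the link itself, -e follows it and finds nothing.
ln -s "$dir/target.txt" "$dir/link"
rm "$dir/target.txt"
if [ -L "$dir/link" ] && [ ! -e "$dir/link" ]; then
    dangling=yes
else
    dangling=no
fi
echo "dangling: $dangling"
```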

Hard Links:

It is easy now to differentiate between a soft link and a hard link. Simply put, a hard link refers directly to an existing i-node instead of creating a new i-node for the new directory entry as a soft link does.

Note that a hard link cannot be created across different file systems. Since my /tmp and /usr are on different file systems, trying to create a hard link from /tmp to anything under /usr will fail:

farhad@farhad-desktop:/tmp$ ln /usr/bin/java /tmp/java
ln: creating hard link `/tmp/java' => `/usr/bin/java': Invalid cross-device link

This is the main reason why anyone would read this article :-) They want to know why this is the case and now it will all become clear. To be able to create a hard link to /usr/bin/java I must create a link within the same partition as /usr. But before I actually create the link, look at the ls -il output of the existing /usr/bin/java file:

farhad@farhad-desktop:/$ ls -li /usr/bin/java
4206093 -rwxr-xr-x 1 root  root  38508  2010-09-07   10:35  /usr/bin/java

Note the "1" in the second column of the above listing. I will create a hard link to /usr/bin/java and we'll look at this output again to see what happens.

When we create a hard link we remove the -s option:
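As a minimal sketch, here is the same kind of command run on a scratch file rather than on the real /usr/bin/java. The names original and hardlink are hypothetical, and both live in one temporary directory so that they are certain to be on the same file system.

```shell
dir=$(mktemp -d)
echo "hello" > "$dir/original"

# Same ln command as for a soft link, minus the -s option.
ln "$dir/original" "$dir/hardlink"

# Both names report the same inode number.
orig_ino=$(ls -id "$dir/original" | awk '{print $1}')
link_ino=$(ls -id "$dir/hardlink" | awk '{print $1}')
echo "original: $orig_ino  hardlink: $link_ino"

# The second column of ls -l is the link count stored in the inode.
nlink=$(ls -ld "$dir/original" | awk '{print $2}')
echo "link count: $nlink"
```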

The one thing that is important to understand is that NO NEW inode was created. A hard link only creates a new directory entry to an existing inode. Now let's look again at the output of ls -li on the existing /usr/bin/java:

farhad@farhad-desktop:/$ ls -li /usr/bin/java
4206093 -rwxr-xr-x 2 root  root  38508  2010-09-07   10:35  /usr/bin/java

Notice that the count, which was 1 before we created the hard link, has been incremented to 2. This link count is another field of the i-node. Every time a new hard link is created the count is incremented. Remember in the previous article on permissions I mentioned how a user does not need write access to a file in order to be able to delete it? The real answer has to do with hard links. Deleting a file only removes a directory entry and decrements the link count, so it is the directory that gets written to, not the file. The data blocks of a file and its i-node are only freed by the kernel when the number of hard links in the i-node reaches zero (and no process still has the file open). So if I were to delete the java entry inside the directory /usr/bin, the file's data and its inode would still remain, because the link count would drop back to 1, not zero.
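The effect of deleting one of two names can be sketched like this, with hypothetical file names in a temporary directory. The data stays reachable through the remaining name and the link count goes back to 1.

```shell
dir=$(mktemp -d)
echo "hello" > "$dir/original"
ln "$dir/original" "$dir/hardlink"

# Remove the first name; only a directory entry goes away.
rm "$dir/original"

# The data is still there under the other name.
survivor=$(cat "$dir/hardlink")
echo "contents via hardlink: $survivor"

# The link count in the inode has been decremented.
nlink=$(ls -ld "$dir/hardlink" | awk '{print $2}')
echo "link count: $nlink"
```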

And finally to answer the very famous question of why a hard link cannot cross file systems: the reason has to do with the reference a hard link makes to the i-node. In order to maintain file system integrity, every entry inside a directory file must refer to an inode within the same file system. That's because of the hard link count. This count has to be valid. If you remove an entry from a directory file, the kernel must correctly decrement the hard link count inside the corresponding inode. If the directory entry were on another file system, then you could wipe out that other file system, for example by reformatting it, without the kernel ever seeing the entry removed. What would happen to the link count in the inode? It would incorrectly remain unchanged, and the kernel would never free the inode because its count would never reach zero. This would lead to an inode leak.

The other reason why a hard link cannot cross file systems is that neither a directory entry nor an i-node records which file system it refers to. A directory entry holds a name and an inode number, and inode numbers are only unique within one file system. Likewise, every i-node in the i-list refers to data blocks that reside within the same file system; it cannot hold the address of a data block on another partition. There's no field for that in the i-node. Think of the file system as a street. Each house on the street is a data block. Each inode contains a door number for each house, but does not know the name of the street it is on. So when the kernel looks up a door number in the inode, it assumes that this address is on the same street as the inode itself. This is by design. Since a hard link is a direct reference to an inode number, the kernel assumes that this inode is on the same partition as the directory that holds the entry. There's no other way of knowing on which partition it would be. If a hard link did cross file systems, the inode number stored in the directory entry would be insufficient to find the file.
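In a script you can work with this restriction instead of running into it. The sketch below is only one possible approach and rests on an assumption of mine: that comparing the mount points reported by df -P is a good enough test for "same file system". The function names same_fs and link_or_symlink are hypothetical helpers, not standard commands.

```shell
# same_fs A B: succeed if A and B report the same mount point.
# Assumption: mount points contain no spaces (the last field is used).
same_fs() {
    mp_a=$(df -P "$1" | awk 'NR==2 {print $NF}')
    mp_b=$(df -P "$2" | awk 'NR==2 {print $NF}')
    [ -n "$mp_a" ] && [ "$mp_a" = "$mp_b" ]
}

# link_or_symlink TARGET NEWNAME: make a hard link when both sides are
# on the same file system, otherwise fall back to a symbolic link.
link_or_symlink() {
    if same_fs "$1" "$(dirname "$2")"; then
        ln "$1" "$2"
    else
        ln -s "$1" "$2"
    fi
}

# Usage with scratch files in a temporary directory.
dir=$(mktemp -d)
echo "hello" > "$dir/data"
link_or_symlink "$dir/data" "$dir/alias"
cat "$dir/alias"
```

Pass an absolute path as the target, because the symbolic link fallback stores the path exactly as given.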

I hope that my explanation of hard and symbolic links was satisfactory.


About this Entry

This page contains a single entry by Farhad Saberi published on December 25, 2010 1:16 AM.

Files Directory security setuid sticky bit permissions was the previous entry in this blog.
