Directory Junctions and Volume Shadow Snapshots

I stumbled upon something by accident yesterday and figured it would be a good idea to post about it in case it helps someone else. Before going into that, though, it’s worth briefly covering the various types of links you can have on NTFS.

Hard Links

Same data, but multiple links to it. It only applies to files (because files are what contain data). Think of it like Unix hard links. Hard links can only be made to data in the same volume. That is, you can have hard links from c:\abc.txt to c:\def.txt but you cannot have a hard link to d:\def.txt (because it’s on a different volume).

An interesting side effect of hard links is that because all the links point to the data, deleting any one of those links still leaves the data accessible through the others. The file that was originally created to hold the data is no longer its sole owner. The data is what matters, not the file associated with it (as you can have multiple files – the hard links – associated with the same data). The data is only actually deleted once all links pointing to it are removed. (Each time a link to the data is created, Windows increments a counter in the file table for that data; when a link is removed the counter is decremented.) NTFS limits how many links can point to the same data: as far as I know, a file can have at most 1023 additional hard links on top of its original name.
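To make the link counter concrete, here is a minimal sketch in Python. It assumes a file system that supports hard links (NTFS does) and uses made-up file names inside a temporary folder; `hard_link_demo` is a name I picked for illustration.

```python
import os
import tempfile


def hard_link_demo():
    """Create a file, hard-link it, delete the original name, and
    return the link count seen at each step plus the surviving data."""
    counts = []
    with tempfile.TemporaryDirectory() as tmp:
        original = os.path.join(tmp, "abc.txt")  # hypothetical file names
        second = os.path.join(tmp, "def.txt")

        with open(original, "w") as f:
            f.write("hello")
        counts.append(os.stat(original).st_nlink)  # only one name so far

        os.link(original, second)  # second name for the same data
        counts.append(os.stat(original).st_nlink)

        os.remove(original)  # removes a name, not the data
        with open(second) as f:
            data = f.read()
        counts.append(os.stat(second).st_nlink)
    return counts, data
```

The counter goes up when the second name is added and back down when the first name is removed, and the data stays readable through the remaining name.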

I use hard links at home while taking backups. I use robocopy to back up my folders, and since I copy the whole folder each time I end up with multiple copies of the same file – all taking up unnecessary space. To avoid this it’s possible to be smart and create hard links (instead of copies) for files that are unchanged since the previous backup. This way all your backups point to the same data, and even if the original file is deleted the data remains as long as at least one link still points to it. I came across this idea in my Linux days; on Windows I use the excellent DeLorean Copy for this (to be correct, I use the command-line version).
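The idea can be sketched roughly as below. This is not how DeLorean Copy or robocopy work internally; it is only an illustration, under the assumption that "unchanged" means "byte-for-byte identical to the copy in the previous backup". The function names and folder layout are hypothetical.

```python
import filecmp
import os
import shutil
import tempfile


def backup_with_hard_links(source, previous, destination):
    """Back up `source` into `destination`. Files identical to the copy
    in `previous` (an earlier backup, or None) are hard-linked rather
    than copied. Returns (files_linked, files_copied)."""
    linked = copied = 0
    for root, _dirs, files in os.walk(source):
        rel = os.path.relpath(root, source)
        os.makedirs(os.path.normpath(os.path.join(destination, rel)), exist_ok=True)
        for name in files:
            src = os.path.join(root, name)
            new = os.path.normpath(os.path.join(destination, rel, name))
            old = os.path.normpath(os.path.join(previous, rel, name)) if previous else None
            if old and os.path.isfile(old) and filecmp.cmp(src, old, shallow=False):
                os.link(old, new)  # unchanged: reuse the existing data
                linked += 1
            else:
                shutil.copy2(src, new)  # new or changed: take a real copy
                copied += 1
    return linked, copied


def backup_demo():
    """Two backup runs; the second one links the unchanged file."""
    with tempfile.TemporaryDirectory() as tmp:
        src = os.path.join(tmp, "src")
        os.makedirs(src)
        with open(os.path.join(src, "a.txt"), "w") as f:
            f.write("A")
        first = backup_with_hard_links(src, None, os.path.join(tmp, "backup1"))
        with open(os.path.join(src, "b.txt"), "w") as f:
            f.write("B")
        second = backup_with_hard_links(src, os.path.join(tmp, "backup1"),
                                        os.path.join(tmp, "backup2"))
        shared = os.path.samefile(os.path.join(tmp, "backup1", "a.txt"),
                                  os.path.join(tmp, "backup2", "a.txt"))
    return first, second, shared
```

Both backup folders must be on the same volume for this to work, since hard links cannot cross volumes.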

Worth repeating – hard links are only for files (because files are what contain data). 

Soft Links/ Symbolic Links

The data has a file associated with it (as usual). You create soft links to the file. Unlike a hard link though where everything points to the data, here everything points to the original file containing the data. That original file is still important and if it is deleted all the links become invalid.

Soft links can be for files and folders. The target they point to can be on a different volume or even a network share. Moreover, the path to the target can be absolute or relative (if it’s relative then obviously it can only be to a target on the same volume). Also, the target needn’t even exist when creating the soft link! The target only has to exist when the soft link is actually accessed. (This makes sense when you think of how the target can be a network share that may not always be accessible.) So if a target pointed to by a soft link is deleted, the soft link still exists – it just won’t work. Recreate the deleted target and the soft link works again as usual.
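Here is a small Python sketch of the "target needn’t exist" behaviour. The file names are made up, and on Windows the call to `os.symlink` needs the symbolic link privilege (or Developer Mode on recent Windows 10 and later), so treat it as an illustration rather than something guaranteed to run in every session.

```python
import os
import tempfile


def dangling_symlink_demo():
    """Create a symlink to a target that does not exist yet, then
    create the target. Returns (link exists, target reachable) pairs
    from before and after the target is created."""
    with tempfile.TemporaryDirectory() as tmp:
        target = os.path.join(tmp, "target.txt")  # hypothetical names
        link = os.path.join(tmp, "link.txt")

        os.symlink(target, link)  # allowed even though target is missing
        before = (os.path.lexists(link), os.path.exists(link))

        with open(target, "w") as f:
            f.write("now I exist")
        after = (os.path.lexists(link), os.path.exists(link))
    return before, after
```

`os.path.lexists` checks the link itself while `os.path.exists` follows it, which is why the two disagree while the target is missing.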

Although soft links point to a target and not the actual data, that is transparent to end-users. For an end-user the link appears just like a regular file or folder. Some people move folders from their C: drive to other locations via soft links. Don’t do this for all folders though (for example: ProgramData). Also, apparently symbolic links do not work at boot so the Windows folder can’t be redirected. I have used symbolic links to sync folders outside my Dropbox folder with Dropbox.  

Not everyone can create soft links. Creating them requires the SeCreateSymbolicLinkPrivilege privilege. By default administrators have this privilege; for non-administrators it can be granted via Security Policies.

Directory Junction

A special type of folder soft link is a Directory Junction. A directory junction is like a folder soft link except that it cannot point to network shares. Directory Junctions were introduced in Windows 2000 and make use of an NTFS feature called “reparse points” (as an aside: reparse points are also what OneDrive/ SkyDrive use for selective sync). There are two types of “junctions” possible: directory junctions, which redirect one directory to another; and volume junctions (also known as volume mount points), which redirect a directory to a volume. In both types of junctions the target is an absolute path; it cannot be relative, and it cannot be a network share.

Soft links/ Symbolic links are an evolution of Directory Junctions (though I present the latter as a special case of the former). Soft links/ Symbolic links were introduced in Vista and are baked into the kernel. They behave like *nix symbolic links. Oddly, even though directory junctions had been around since Windows 2000 they were never widely used; it was only once symbolic links were introduced that directory junctions became more common too (in fact Vista and upwards use directory junctions to redirect legacy folders such as “Documents and Settings” to “Users”). Vista also introduced the mklink tool to create directory junctions, soft links, and hard links.
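For reference, mklink takes the link first and the target second, with /D for a directory symbolic link, /J for a directory junction, /H for a hard link, and no switch for a file symbolic link. Since mklink is built into cmd.exe rather than being a separate program, it has to be run via `cmd /c`. The helper below only builds the command line (it does not run anything); its name and the example paths are my own.

```python
def mklink_command(kind, link, target):
    """Build the argument list for creating a link with mklink.

    kind: "file" (file symlink), "dir" (directory symlink),
          "junction" (directory junction) or "hard" (hard link).
    """
    switches = {"file": [], "dir": ["/D"], "junction": ["/J"], "hard": ["/H"]}
    if kind not in switches:
        raise ValueError(f"unknown link kind: {kind!r}")
    # mklink is a cmd.exe built-in, hence the "cmd /c" prefix.
    return ["cmd", "/c", "mklink", *switches[kind], link, target]
```

On a Windows machine the returned list could be handed to `subprocess.run(..., check=True)`, for example `mklink_command("junction", r"C:\TEST\Junction", r"C:\TEST\Folder")`.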

Links and VSS

Moving on to what I stumbled upon yesterday. 

Consider the following:

I have a folder and a symbolic link and a directory junction pointing to that folder.

The folder has a file a.txt in it. Thus both the symbolic link and directory junction too will show this file.

Now say I make a shadow copy of this drive (the C: drive in my case) and mount it someplace (say C:\Shadow).  This location is a shadow copy of the C: drive and as such it too will contain the folder and two links I created above. 

Here’s the catch though …

Say I add a new file b.txt to the folder on C: drive (C:\TEST\Folder). One would expect that file to be present in the two links of the C: drive – obviously – but not in the two links of the shadow C: drive. But that’s just not what happens. The file is present in the two links on the C: drive and also the directory junction of the shadow volume!

This bit me yesterday because for various reasons I had some directory junctions in my C: drive and I was trying to back them up via shadow copies. But the backups kept failing as the files in the shadow copy were in use (because they were pointing to the live volume by virtue of being directory junctions!) and that was an unexpected behavior. After some troubleshooting I realized directory junctions were the culprit. 

Later, I read up on reparse points and VSS and realized that while the reparse point will be backed up via VSS, the backed up location cannot be traversed in the shadow copy. Apparently, if the reparse point target is a separate volume (i.e. a volume junction), that volume will be shadow copied and must be accessed as an independent shadow copy. But if the reparse point is a directory (i.e. a directory junction like in my case) I am not sure what happens. It looks like it isn’t shadow copied, and the directory junction is repointed to the target on the main volume. 

Moral of the story: stick to symbolic links rather than directory junctions? (Update this part once I have a better answer …)

Update: I changed the two links above to point to a folder on another drive (D:). Then I took a snapshot of the C: drive. When I accessed the links in the snapshot, they were still referring to the live D: drive and not a shadow version of it. In fact, no shadow of the D: drive was automatically created when I took a shadow of the C: drive. I tried to create a manual snapshot of the D: drive but that failed because I use TrueCrypt and apparently that doesn’t support snapshots (the error I got was VSS_E_UNEXPECTED_PROVIDER_ERROR).

Next I created two VHDs to simulate two drives. 

I created links from folders on the G: drive to a folder on the H: drive, took snapshots of both drives, and mounted them as c:\ShadowG and c:\ShadowH. Yet when I go to the links in c:\ShadowG they still point to the live volume H: and not the snapshot of H:.

Odd! Even more odd is the fact that now both the directory junction and the symbolic link point to the live volume. So I guess the revised moral of the story is that when using directory junctions or symbolic links, try to refer to the actual target rather than the link/ junction itself. The link/ junction points to the live file system, not the shadow copy.

This also means if you were to mount an older shadow copy to try and retrieve some file you deleted from your Desktop for instance, go and check under \path\to\shadow\Users\... rather than \path\to\shadow\Documents and Settings\... – the latter is a junction and will always point to the live system, not the shadow copy as you’d expect.
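One way to follow that advice in a backup script is to work out the link’s real target yourself and then rebase that target onto wherever the shadow copy of its volume is mounted. The sketch below does only the path arithmetic, using the mount points from my tests above; `to_shadow_path` is a hypothetical helper, and it assumes you already know the target path (for example from `os.readlink`) and that the target’s volume has its own mounted snapshot.

```python
from pathlib import PureWindowsPath


def to_shadow_path(live_target, volume_root, shadow_mount):
    """Rebase a path on a live volume onto that volume's mounted shadow copy.

    live_target:  the real target of the link, e.g. C:\\TEST\\Folder
    volume_root:  root of the volume that was snapshotted, e.g. C:\\
    shadow_mount: where the snapshot is mounted, e.g. C:\\Shadow
    Raises ValueError if live_target is not on volume_root.
    """
    relative = PureWindowsPath(live_target).relative_to(PureWindowsPath(volume_root))
    return str(PureWindowsPath(shadow_mount) / relative)
```

So instead of reading C:\Shadow\TEST\Junction (which lands on the live volume), the script would read `to_shadow_path(r"C:\TEST\Folder", "C:\\", r"C:\Shadow")`, and for the two-VHD case the H: target would be rebased onto c:\ShadowH.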