As part of our general server maintenance, we deploy several tools. One of them is a script that runs a defrag on the disks periodically, and reports back the fragmentation and other disk-related information. Generally, this can help us troubleshoot a server outage before it happens. This time, it failed, and in a really interesting way.
The server running the tool recently started running extremely slowly in the field, and the defrag script was reporting that it couldn't complete. The server's role is that of a security monitor: it creates logs periodically, encrypts and compresses them, and then fires them off to us for review. The idea sounds simple and free of long-term complications, so what caused this disk fragmentation?
The problem was related either to our log, compress, send, verify-receipt, and delete procedure when several kinds of telemetry were being collected at once, or to some update process running in combination with it. Either way, it happened, and the first job is to fix the fragmentation itself.
The defrag screenshot is cropped, but the drive was fragmented like this across the whole volume, from beginning to end. Believe it or not, there is 26% free space on the volume.
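To see how a volume can have a quarter of its space free and still leave a defragmenter stuck, it helps to look at the longest run of free clusters rather than the total. Here is a minimal sketch, assuming the free space bitmap has already been read into a list of booleans where True means the cluster is in use. The function name and the toy data are made up for illustration and are not taken from any of the tools mentioned here.

```python
def free_space_summary(bitmap):
    """Return (free_fraction, longest_free_run) for a cluster bitmap.

    bitmap: sequence of booleans, True = cluster in use (assumed layout).
    """
    free = 0      # total number of free clusters
    longest = 0   # longest contiguous run of free clusters seen so far
    run = 0       # length of the free run currently being scanned
    for used in bitmap:
        if used:
            run = 0
        else:
            free += 1
            run += 1
            longest = max(longest, run)
    fraction = free / len(bitmap) if bitmap else 0.0
    return fraction, longest


# Toy volume: every fourth cluster is free, so 25% is free overall,
# yet no free run is longer than a single cluster.
toy = [i % 4 != 3 for i in range(1000)]
fraction, longest = free_space_summary(toy)
print(f"free: {fraction:.0%}, longest free run: {longest} cluster(s)")
```

A file can only be made contiguous if there is a free run at least as long as the file, which is why the total free percentage says so little in a case like this.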
Now, the Windows Defrag tool works well until you get into this situation. There isn't enough contiguous free space to do a defrag in one pass, and I am about forty passes in. Each pass reduces the fragmentation, but not enough to complete and leave little blue blocks everywhere. Frankly, I am wondering how many passes it will take!
So I decided to build a defragger tool myself, and of course in C#. I had some experience with the defrag DeviceIoControl APIs in C++ from back in the day, when I built device drivers and needed to control them, but could I do this project solely in C# with interop? I took a look at Jeffrey Wall's blog and found some wrappers. I tried them and found that they executed the IO without problems, but they broke when pulling the structures out of the buffers the API returned: Marshal.PtrToStructure was pulling out the UINT64s with the wrong endianness. I fixed that, wrapped the API and functionality in proper classes, and sped the whole thing up by allocating what I needed at init time instead of before every file was processed.
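For reference, the buffer that FSCTL_GET_RETRIEVAL_POINTERS returns is a RETRIEVAL_POINTERS_BUFFER: a 32-bit extent count, a 64-bit starting VCN, and then an array of (NextVcn, Lcn) pairs of 64-bit values. Because the 64-bit fields are 8-byte aligned, four bytes of padding follow the count, which is easy to get wrong when pulling the values out by hand. The sketch below decodes such a buffer from raw bytes, in Python rather than C#, purely to show the layout. It is an illustration and not the wrapper code itself; the function name and the sample bytes are made up.

```python
import struct

# Little-endian layout of RETRIEVAL_POINTERS_BUFFER.
HEADER = struct.Struct("<I4xq")   # ExtentCount, 4 bytes of padding, StartingVcn
EXTENT = struct.Struct("<qq")     # NextVcn, Lcn (both signed 64-bit)


def parse_retrieval_pointers(buf):
    """Decode a buffer into a list of (start_vcn, lcn, length_in_clusters)."""
    count, vcn = HEADER.unpack_from(buf, 0)
    extents = []
    offset = HEADER.size
    for _ in range(count):
        next_vcn, lcn = EXTENT.unpack_from(buf, offset)
        # Each extent runs from the previous VCN up to (not including) NextVcn.
        extents.append((vcn, lcn, next_vcn - vcn))
        vcn = next_vcn
        offset += EXTENT.size
    return extents


# Hand-built sample buffer: two extents, starting at VCN 0.
sample = HEADER.pack(2, 0) + EXTENT.pack(8, 1000) + EXTENT.pack(12, 5000)
for start_vcn, lcn, length in parse_retrieval_pointers(sample):
    print(f"VCN {start_vcn}: {length} clusters at LCN {lcn}")
```

An Lcn of -1 marks an extent with no clusters allocated on disk (sparse or compressed ranges), so a real tool would skip those when deciding what to move.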
I implemented the tool to run in two passes. I wasn't aiming for a general solution; for this problem I decided the best approach was to find the largest fragmented file and move its fragments towards the end of the volume, filling back towards the middle. I then moved on to the other files, working my way down to the smaller ones in the same manner. That left free space in the center. In the second pass I walked from the middle towards the front of the drive, defragging each fragmented file as I went, and then started again from the middle and worked to the end.
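The first pass can be sketched as a simple planning step. This is a rough illustration under several assumptions, not the tool's actual code: file sizes are given in clusters, the planner stops a file from crossing the midpoint of the volume, and it ignores whether the target clusters are currently free, which a real tool would have to check against the volume bitmap before moving anything. The function and file names are hypothetical.

```python
def plan_end_packing(files, total_clusters):
    """Assign each fragmented file a contiguous target at the end of the volume.

    files: list of (name, size_in_clusters) for fragmented files only.
    Returns a list of (name, target_start_lcn), largest file first.
    """
    middle = total_clusters // 2   # assumed boundary: do not pack past the middle
    plan = []
    next_end = total_clusters      # exclusive upper bound of the next target
    for name, size in sorted(files, key=lambda f: f[1], reverse=True):
        start = next_end - size
        if start < middle:
            continue               # would cross the middle; leave for pass two
        plan.append((name, start))
        next_end = start
    return plan


files = [("a.log", 300), ("b.log", 100), ("c.log", 200)]
for name, start in plan_end_packing(files, total_clusters=1000):
    print(f"{name} -> LCN {start}")
```

Handling the largest files first matters because they need the longest free runs, and those runs are hardest to find once smaller files have been scattered into the gaps.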
Out of the "defragger in a day" project came a few tools. The first mimics Contig by Mark Russinovich, but lets me specify not only that a file needs to be contiguous but also where on the disk to place it (beginning, middle, end, and so on). The next is a graph maker that displays the free space bitmap of the drive, in a bitmap no less, which helped me see that the APIs were actually doing things. The last is an analysis tool that describes the fragmentation, for use in telemetry in later projects.
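As a rough idea of what a fragmentation analysis could report, here is a small sketch that counts fragments per file from extent lists. It assumes each file is described by a list of (lcn, length) extents in file order; the report fields and names are invented for illustration and are not the format the actual tool produces.

```python
def count_fragments(extents):
    """Count physically separate pieces in a list of (lcn, length) extents.

    Extents that directly follow one another on disk count as one fragment.
    """
    fragments = 0
    expected_lcn = None   # where the next extent would start if contiguous
    for lcn, length in extents:
        if lcn != expected_lcn:
            fragments += 1
        expected_lcn = lcn + length
    return fragments


def summarize(files):
    """files: dict mapping file name to its extent list. Returns a small report."""
    counts = {name: count_fragments(ext) for name, ext in files.items()}
    fragmented = {name: c for name, c in counts.items() if c > 1}
    return {
        "files": len(counts),
        "fragmented_files": len(fragmented),
        "total_fragments": sum(counts.values()),
        "worst_file": max(fragmented, key=fragmented.get) if fragmented else None,
    }


volume = {
    "a.log": [(100, 10), (110, 5)],           # contiguous on disk: 1 fragment
    "b.log": [(200, 4), (900, 4), (300, 4)],  # 3 separate fragments
}
print(summarize(volume))
```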