
Does Btrfs need defragmentation?

Writer Andrew Mclaughlin

I have just installed Ubuntu 11.10 on the Btrfs filesystem.

Do I really need to defragment files or the whole system?

# btrfs filesystem defragment /pool1


5 Answers

You don't really need to defrag Btrfs filesystems manually.

Yes, Btrfs is COW (copy-on-write), which would imply it fragments files much more than Ext, but this is addressed in several aspects of the design, including the ability to easily defrag the filesystem while it is online. This excerpt provides more detail (emphasis mine):

Automatic defragmentation

COW (copy-on-write) filesystems have many advantages, but they also have some disadvantages, for example fragmentation. Btrfs lays out the data sequentially when files are written to the disk for the first time, but a COW design implies that any subsequent modification to the file must not be written on top of the old data, but be placed in a free block, which will cause fragmentation (RPM databases are a common case of this problem). Additionally, it suffers the fragmentation problems common to all filesystems.

Btrfs already offers alternatives to fight this problem: First, it supports online defragmentation using the command btrfs filesystem defragment. Second, it has a mount option, -o nodatacow, that disables COW for data. Now btrfs adds a third option, the -o autodefrag mount option. This mechanism detects small random writes into files and queues them up for an automatic defrag process, so the filesystem will defragment itself while it's used. It isn't suited to virtualization or big database workloads yet, but works well for smaller files such as rpm, SQLite or bdb databases.
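As a minimal sketch, enabling autodefrag permanently could look like the following /etc/fstab entry (the UUID here is a placeholder; substitute your own root device):

```shell
# /etc/fstab entry (hypothetical UUID):
UUID=xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx  /  btrfs  defaults,autodefrag  0  0

# Or enable it on an already-mounted filesystem without rebooting:
sudo mount -o remount,autodefrag /
```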

So, as long as you mount your filesystems with the autodefrag option and don't plan to run IO-intensive software like a database under significant load, you should be fine.

To check the fragmentation of files, you can use the filefrag utility:

$ find /path -type f -exec filefrag {} + >frag.list
# Now you can use your favourite tools to sort the data

On systemd systems, /var/log/journal/ will probably be the most fragmented. You can also look at ~/.mozilla and other browsers' databases.

To defragment, use:

$ sudo btrfs fi defrag -r /path

The command to recursively defragment a btrfs filesystem, flushing data for each file (-f) and recompressing it with zlib (-czlib), is

sudo btrfs filesystem defragment -r -v -f -czlib / /home

Based on this answer I came up with:

sudo find / -xdev -type f -exec filefrag {} + | sed -En 's/(.+): (\w+) extent.*/\2 \1/p' | sort -nr

This lists the most fragmented files first, with the format:

<num-fragments> <pathname>
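As a quick usage example, here is the same sed/sort pipeline applied to fabricated filefrag output (the paths and extent counts below are invented for illustration):

```shell
# Fake filefrag output for demonstration (paths and counts are made up):
printf '%s\n' \
  '/var/log/journal/system.journal: 412 extents found' \
  '/home/user/vm.img: 97 extents found' \
  '/home/user/notes.txt: 1 extent found' > frag.list

# Extract "<num-fragments> <pathname>" and sort, most fragmented first:
sed -En 's/(.+): ([0-9]+) extents? found/\2 \1/p' frag.list | sort -nr
# Prints:
#   412 /var/log/journal/system.journal
#   97 /home/user/vm.img
#   1 /home/user/notes.txt
```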

From the manpage, as seen by running man btrfs-filesystem, for the usage of btrfs filesystem defragment (or, for short, btrfs fi de):

defragment [options] <file>|<dir> [<file>|<dir>...]
    Defragment file data on a mounted filesystem. Requires kernel 2.6.33 and newer.

Though the interesting part is the warning:

 Warning Defragmenting with Linux kernel versions < 3.9 or ≥ 3.14-rc2 as well as with Linux stable kernel versions ≥ 3.10.31, ≥ 3.12.12 or ≥ 3.13.4 will break up the reflinks of COW data (for example files copied with cp --reflink, snapshots or de-duplicated data). This may cause considerable increase of space usage depending on the broken up reflinks.
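To see what is at stake, here is a small sketch of a reflinked copy (the file names and size are arbitrary; cp --reflink=auto falls back to an ordinary copy on filesystems without reflink support, so the space sharing only materializes on Btrfs or similar):

```shell
# Create a 16 MiB test file (name and size are arbitrary):
dd if=/dev/zero of=original.img bs=1M count=16 status=none

# A reflinked copy shares the original's extents instead of duplicating
# the data on disk; breaking the reflink duplicates those extents again:
cp --reflink=auto original.img clone.img

# Either way, the two files are byte-for-byte identical:
cmp -s original.img clone.img && echo identical
# Prints: identical
```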

I don't know why they did it, but depending on which Linux kernel version you are using, defragmentation might (on affected kernels it will) break all deduplication that may be present in the filesystem. Personally, I find deduplication and CoW to be very important features of btrfs. Losing them to defragmentation is a thought that scares me, because there is no easy way, that I know of, to recreate CoW extents (i.e. deduplication).

For example, I ran duperemove on an almost full 8 TB SATA HDD and it took days. Defragmenting would not only have me do it all over, it would also probably fill up the whole drive in the process by breaking CoW data...

Long story short: regularly running btrfs scrub (which verifies the data on disk against its checksums) or even btrfs balance (which redistributes data across devices, more important for RAID configurations) on the filesystem is probably more important than defragmentation...
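As an illustrative sketch only (the schedule and mount point are placeholders, not from the answers above), such periodic maintenance could be wired up from root's crontab using btrfs scrub and a filtered btrfs balance:

```shell
# Hypothetical root crontab entries (edit with `sudo crontab -e`):
# Scrub the filesystem mounted at / on the 1st of each month at 03:00;
# -B keeps the command in the foreground until the scrub finishes:
0 3 1 * *  /usr/bin/btrfs scrub start -B /
# One hour later, rebalance only data chunks that are less than half full:
0 4 1 * *  /usr/bin/btrfs balance start -dusage=50 /
```

The -dusage=50 filter keeps the balance cheap by touching only mostly-empty data chunks instead of rewriting the whole filesystem.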

See also a related question at Unix&Linux: btrfs — Is it dangerous to defragment subvolume which has readonly snapshots?

For the sake of this topic, I think it is better to clarify that:

Any filesystem != recovery/management utilities

Keep in mind that a filesystem defines how data is organized on the physical hardware; every other activity is done with extra software and utilities that, especially in the GNU/Linux world, are often written by people who are not strictly related to whoever made the filesystem.

This is probably not the case with Btrfs, but sometimes we come close to it, because desktop and enterprise environments can provide different solutions with different grades of reliability.

The answer to your question is "no", but it is a "no" related to the normal behaviour of a generic filesystem, and every utility you can find for this task is simply a different project.

When you need to defrag your filesystem, keep in mind that you are using external software that is not really tied to the life of the filesystem itself, with all the pros and cons that entails.

