ext3, lots of files, and you

Don’t do it.

While ext3 has no technical limit on the number of files per directory (there is one at the filesystem level: see inodes), there is *significant* performance degradation as you add more and more files to a single directory. The most I’ve seen without any user-noticeable sluggishness was about 300,000. Note that this is well beyond the point where you can no longer `ls` and have to resort to `find` and `xargs`. Hitting that point should be your first warning sign.
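For what it’s worth, here is a rough sketch of the `find` and `xargs` approach for a directory that `ls` and shell globs can no longer cope with. The function names are made up for illustration, and `-maxdepth`, `-printf`, `-print0` and `xargs -0 -r` assume GNU findutils.

```bash
# count_files DIR: count regular files directly inside DIR without
# building (and sorting) the whole listing the way a plain `ls` does.
count_files() {
    # Emit one dot per file and count bytes, so odd filenames
    # (spaces, newlines) cannot skew the result.
    find "$1" -maxdepth 1 -type f -printf '.' | wc -c
}

# remove_matching DIR PATTERN: delete files matching PATTERN in batches.
# A glob such as `rm DIR/*.tmp` tends to die with "Argument list too long".
remove_matching() {
    find "$1" -maxdepth 1 -type f -name "$2" -print0 | xargs -0 -r rm -f --
}
```

Something like `count_files /var/www/images` gives you the number to worry about; `remove_matching /var/www/images '*.tmp'` is the batch-friendly way to clean up.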

Approaching 5 million files in one directory, things start to get weird. Creating new files in that directory generates significant load, though resource usage is low. However, statting a specific file (as opposed to a lookup with a glob) is decently fast. As long as you know what you want, it’ll work acceptably fast.

The more files you add, the slower lookup-based operations (new file creation, for example) will go. We’re talking seconds and tens of seconds here, not a few milliseconds more. Operations that are handed an exact filename stay acceptably fast.

The filesystem option dir_index will help, though not hugely once you start getting into millions of files. It’s faster than running without it, but it doesn’t make things magically work. Converting to ext2 is a terrible idea and should not be considered: journals are good things and well worth the (comparatively) extremely slight performance hit.
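If you want to check for the option and turn it on, something along these lines should do. This is a sketch, not gospel: the two function names are invented here, the device must be unmounted for the `e2fsck` pass, and everything needs root.

```bash
# has_dir_index: read `tune2fs -l` output on stdin and succeed if the
# "Filesystem features:" line lists dir_index.
has_dir_index() {
    grep -i '^Filesystem features:' | grep -qw 'dir_index'
}

# enable_dir_index DEVICE: switch the feature on, then have e2fsck
# rebuild (-D) the directories that already exist so they get indexed too.
enable_dir_index() {
    tune2fs -O dir_index "$1" &&
    e2fsck -fD "$1"
}
```

Usage would look like `tune2fs -l /dev/sda1 | has_dir_index || enable_dir_index /dev/sda1`.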

The real solution, however, is to not put that many files into a single directory. Subdirectories are always a good idea (though keep in mind the subdirectory limit: roughly 32k subdirs per dir!). Heck, most code can almost trivially be modified to derive the path from the filename (its first characters, or a hash of it), such as /var/www/images/y/o/yourmom.jpg and /var/www/images/y/i/yipee.jpg. When designing an application, be mindful of the limitations of the underlying OS (and in this case, the filesystem being used).
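The path layout above can be sketched in a few lines of shell. `shard_path` and `store_file` are hypothetical helpers, not part of any tool, and the sketch assumes every filename is at least two characters long and starts with characters that are safe as directory names.

```bash
# shard_path BASE NAME: print BASE/<1st char>/<2nd char>/NAME.
shard_path() {
    base=$1
    name=$2
    first=$(printf '%s' "$name" | cut -c1)
    second=$(printf '%s' "$name" | cut -c2)
    printf '%s/%s/%s/%s\n' "$base" "$first" "$second" "$name"
}

# store_file SRC BASE: move SRC into its sharded location under BASE,
# creating the two directory levels on demand.
store_file() {
    dest=$(shard_path "$2" "$(basename "$1")")
    mkdir -p "$(dirname "$dest")" && mv -- "$1" "$dest"
}
```

`shard_path /var/www/images yourmom.jpg` prints `/var/www/images/y/o/yourmom.jpg`. If your filenames all share a prefix, taking the first characters of a hash of the name (from `md5sum`, say) spreads them out far better than the raw name does.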

Recover an ext3 journal

Symptoms: dmesg is scrolling “journal aborted” messages and the filesystem has gone read-only.

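Give this a go with the filesystem unmounted (may need a rescue environment). It drops the broken journal, checks the filesystem, then creates a fresh journal:
[code lang=”bash”]tune2fs -f -O ^has_journal /dev/sda1
e2fsck -f /dev/sda1
tune2fs -j /dev/sda1[/code]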

Recover ext3 filesystem with missing superblock

[code lang=”bash”]mount: wrong fs type, bad option, bad superblock on /dev/sda1, or too many mounted filesystems[/code]

Usually, this is code for “you’re fucked”. Here’s something you can try, however:

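List the proposed superblocks (filesystem must be unmounted). Do NOT forget the -n, which makes mke2fs only print what it would do instead of creating a new filesystem. The locations it lists are only right if the filesystem was originally created with the same parameters: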
[code lang=”bash”]mke2fs -n /dev/sda1[/code]

fsck the filesystem using one of the backup superblocks from that list (24577 is just an example). Do a read-only pass with the -n switch first, then the real thing:
[code lang=”bash”]fsck -n -b 24577 /dev/sda1
fsck -b 24577 /dev/sda1[/code]
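In case you wonder where a number like 24577 comes from, here is a sketch of the arithmetic. It assumes mke2fs defaults, namely 8 × block size blocks per group and the sparse_super feature (backups only in group 1 and in groups that are powers of 3, 5 and 7). `backup_superblocks` is a made-up helper; when the output of `mke2fs -n` disagrees with it, trust `mke2fs -n`.

```bash
# backup_superblocks BLOCK_SIZE GROUP_COUNT
# Print the block numbers where backup superblocks would be expected,
# under the default-layout assumptions described above.
backup_superblocks() {
    bs=$1
    groups=$2
    per_group=$((bs * 8))
    # With 1 KiB blocks the first data block is block 1, otherwise block 0.
    if [ "$bs" -eq 1024 ]; then first=1; else first=0; fi
    {
        if [ "$groups" -gt 1 ]; then echo 1; fi
        for base in 3 5 7; do
            g=$base
            while [ "$g" -lt "$groups" ]; do
                echo "$g"
                g=$((g * base))
            done
        done
    } | sort -n | while read -r g; do
        echo $((g * per_group + first))
    done
}
```

For a 1 KiB block filesystem with 10 block groups, `backup_superblocks 1024 10` prints 8193, 24577, 40961, 57345 and 73729, one per line; 24577 is group 3 (3 × 8192 + 1). With 4 KiB blocks the first backup sits at 32768 instead.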

If it fails, list the superblock locations the filesystem itself reports and use one of those:
[code lang=”bash”]dumpe2fs /dev/sda1 | grep -i super[/code]

Then again, if using a backup superblock doesn’t work, you’re probably fucked, as originally thought.