Bittorrent database, helpful commands

From: Ty Heath <heath_at_mediadefender.com>
Date: Mon, 21 May 2007 11:37:00 -0700

Since 129.47.9.250 is a Solaris box I'll give you guys some helpful
commands to see what it's doing.

prstat //this is equivalent to top on Linux, but more powerful. The
-L option will give you per-thread CPU usage info

mpstat //per core cpu statistics, there are 32 'cores'

iostat -xtc 1 //advanced i/o info for the hard drives. The important
fields are %w and %b. %w = percent of time that requests are waiting
in the queue for service, %b = percent of time the drive is busy. %w
should be ~0 most of the time, otherwise the disk i/o is bottlenecking.

vmstat 1 //memory info. free = amount of physical memory available
in KB. swap = amount of swap available in KB. If sr (the page scan
rate) is consistently over 200, then there is a memory shortage. If
'r' (the run queue length) is consistently more than 32, the number
of cores, then there is a CPU shortage.

zfs list //to see how much disk space is free on the RAID array

zfs get all zfspool/mysql //to see some important info about the
raid array, like compression ratios etc. You can see that I've
turned access times off (atime). This is something I wish you could
do easily on all file systems. I don't want the system wasting CPU
and disk i/o updating the access time for each file that I access.
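
The vmstat and iostat rules of thumb above can be turned into a quick
check. This is only a sketch in Python: the function names, the 80%
cutoff for "consistently", and the 5% tolerance for %w are assumptions
made for illustration, not anything from Solaris. The samples are
assumed to be numbers already pulled out of the command output.

```python
# Thresholds taken from the notes above.
SR_LIMIT = 200     # vmstat 'sr' consistently above this: memory shortage
CPU_COUNT = 32     # vmstat 'r' consistently above the core count: CPU shortage

# Assumptions for illustration only (not from Solaris documentation).
W_TOLERANCE = 5.0  # treat iostat %w at or below this as "~0"
CONSISTENT = 0.8   # "consistently" = at least 80% of the samples


def consistently_over(samples, limit, fraction=CONSISTENT):
    """Return True if at least `fraction` of `samples` exceed `limit`."""
    if not samples:
        return False
    over = sum(1 for value in samples if value > limit)
    return over / len(samples) >= fraction


def diagnose(sr_samples, r_samples, w_samples):
    """Apply the three rules of thumb and return a list of findings."""
    findings = []
    if consistently_over(sr_samples, SR_LIMIT):
        findings.append("memory shortage")
    if consistently_over(r_samples, CPU_COUNT):
        findings.append("cpu shortage")
    if consistently_over(w_samples, W_TOLERANCE):
        findings.append("disk i/o bottleneck")
    return findings
```

For example, diagnose([250] * 10, [4] * 10, [0] * 10) flags only the
memory shortage, and a single spike in an otherwise quiet run is
ignored.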

I will add some info I've gotten off the web about the new Solaris
ZFS file system we are using on this server.

- Filesystems can be compressed. Unlike a compressed disk image, a
compressed ZFS filesystem is read/write. Moreover, the compression
flag can be turned on and off on the fly. New data will be compressed
(or not) as per the flag, and old data will be left as is.

- Filesystems are nested and making them is as easy as making a
directory. This in itself is not very interesting but combined with
compression, this means that you can effectively turn on compression
for just a subfolder on your drive.

- Every block of data on the disk is checksummed so errors can be
detected during read operations. Many common hard drive failures are
catastrophic, and painfully obvious when they happen. But it is
possible for your data to be corrupted on disk in ways that you, and
the hard disk, will never notice. While checksumming will not allow
you to recover your data, it will let you know when you should go
retrieve a file from your backup.

- Space-efficient and fast snapshots. A snapshot allows you to see
your filesystem as it was some time in the past. ZFS is designed to
snapshot a filesystem in constant time, no matter how much data you
have, or how frequently you snapshot it. Moreover, the snapshot is
very space efficient. Identical blocks are shared between snapshots
and the live filesystem until they are written to. The space required
for snapshots is therefore mostly a function of how quickly your
files change, and not so much how often you make a snapshot. It’s
like version control for your entire computer!

- Automatically growing filesystems. Once you add your disks to the
storage pool, all of their space is available to all of the
filesystems you have. You can reserve space for a filesystem, to
guarantee a minimum amount is available when you need it, and you can
also set quotas. But these are just flags which are easy to change on
the fly. The default for every filesystem is automatically expanding
capacity up to the limit of your storage pool. There are no manual
volume or filesystem resizing operations, ever.

- Dynamic striping of file blocks over all drives in the storage
pool. If you throw 2 drives in your storage pool, then files are
automatically distributed over both disks, making large reads and
writes faster. The disks do not have to be the same size (unlike
usual striping configurations) and you can expand the pool whenever
you want by installing a new disk. New files will stripe over old and
new disks, and the old files will stay where they are. But, when you
modify old files, the changed blocks are spread over all the
available disks again. After adding a new disk, ZFS will get faster
as you use the filesystem!

- Software mirroring with automatic error detection and self-healing.
ZFS also incorporates features traditionally left to software RAID
drivers. You can arrange your disks into mirrored pairs (or triples,
etc), which speeds up data reads, and also protects against single
disk failure. Moreover, since ZFS checksums all data blocks, if one
disk returns bad data, ZFS knows without having to query the other
disk every time. Having identified the problem, it can then access
the failed block from the other disk(s) in the mirror set and return
to you correct data. ZFS then writes the correct data back to the
original disk which failed the checksum. If the data error was a
fluke due to some correctable problem, perhaps a bad sector (which
modern drives can reassign to a new physical location) or just a bad
write, then this will solve the problem. If the disk is really dead,
then ZFS will take it offline and wait for you to replace it.

- Fast resync of mirrors. In the unfortunate circumstance where a
drive does die and you replace it, the resync process is faster with
ZFS. This is because, unlike many other RAID systems, ZFS knows which
blocks on the disk were used, and which blocks were not used. During
resynchronization, ZFS only copies blocks with actual filesystem data
on them to the new disk. So, if your disk pair was only half-full,
then you are back in business twice as fast.

- Software parity RAID that actually works. The most popular parity
RAID system is by far RAID-5, where for every N-1 data blocks, there
is one parity block. The parity block allows you to recover all your
data if any one disk fails, much like mirroring, but without as much
space penalty. There is a seldom discussed problem with RAID-5, known
as the “RAID-5 write hole.” When modifying a single block, you also
have to rewrite the stripe's parity block. If a power or hardware
failure happens between those two writes, the parity no longer matches
the data, and a later disk failure can leave that stripe
unrecoverable. You can fix this in hardware with battery backup
systems, or RAID controllers with non-volatile write caches. The
structure of ZFS is such that you can also solve the problem in
software using a variant of the RAID-5 algorithm called RAID-Z.
RAID-Z behaves much like RAID-5, but has no write hole. Recent ZFS
releases have also added a double parity version of RAID-Z, which
allows you to withstand 2 disk failures at once. We are using
single-parity RAID-Z.

- A stream format which allows you to copy snapshots to other
systems. This feature is a little hard to explain, but it basically
allows you to dump a ZFS filesystem, preserving the snapshot history,
and reload it on another system. This could be used for maintaining a
backup server, or loading a filesystem into another storage pool.

- Highly SMP-friendly design. ZFS is designed to efficiently support
many processes all accessing a filesystem at the same time.

- Nearly unlimited capacity and scalability. We come full circle back
to the capacity issue. For servers which need to manage a large
number of disks, ZFS scales pretty well up from the single-disk
scenario we started with. Sun certainly pushes ZFS on their 48 disk
monster, the Sun Fire X4500.
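
To make the snapshot bullet above more concrete, here is a toy
copy-on-write model in Python. It is not how ZFS is implemented: the
class and method names are hypothetical, and copying the block map in
snapshot() is not constant time the way a real ZFS snapshot is. It only
shows why unchanged blocks cost no extra space.

```python
class ToyFilesystem:
    """Toy model: a filesystem is a map from block number to contents."""

    def __init__(self):
        self.live = {}       # block number -> bytes, the current state
        self.snapshots = {}  # snapshot name -> frozen block map

    def write(self, block_no, data):
        # Copy-on-write: never modify a block in place. The live map is
        # pointed at a new block, and snapshots keep the old one.
        self.live[block_no] = bytes(data)

    def snapshot(self, name):
        # Copies only the references, not the block contents.
        self.snapshots[name] = dict(self.live)

    def shared_blocks(self, name):
        """Count blocks that the snapshot and the live filesystem share."""
        snap = self.snapshots[name]
        return sum(
            1 for block_no, block in snap.items()
            if self.live.get(block_no) is block  # same object, stored once
        )
```

After writing three blocks, taking a snapshot, and rewriting one block,
shared_blocks() reports two shared blocks and the snapshot still holds
the old contents of the third.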
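
The checksumming and self-healing mirror bullets above can be sketched
in a similar way. This is a toy Python model under stated assumptions:
MirroredBlock is a hypothetical name, SHA-256 is picked only because it
is in the standard library, and the "disks" are just in-memory byte
arrays.

```python
import hashlib


def checksum(block: bytes) -> str:
    """Checksum used to verify a block (SHA-256 chosen for illustration)."""
    return hashlib.sha256(block).hexdigest()


class MirroredBlock:
    """One logical block stored as several copies plus its checksum."""

    def __init__(self, data: bytes, copies: int = 2):
        self.expected = checksum(data)
        self.copies = [bytearray(data) for _ in range(copies)]

    def read(self) -> bytes:
        # Try the copies in order. A copy that passes the checksum is
        # returned right away, so the other copies are only consulted
        # when an earlier one turned out to be bad.
        for i, copy in enumerate(self.copies):
            if checksum(bytes(copy)) == self.expected:
                # Every copy before this one failed: rewrite them
                # with the known-good data (self-healing).
                for j in range(i):
                    self.copies[j] = bytearray(copy)
                return bytes(copy)
        raise OSError("all copies failed the checksum; restore from backup")
```

With a single copy the corruption is still detected, but there is
nothing to heal from, which matches the point that checksumming alone
only tells you when to go to your backup.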
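
The parity bullet above boils down to XOR. The sketch below shows plain
single-parity striping in Python, which is the RAID-5 idea and not
RAID-Z itself; the function names are hypothetical.

```python
from functools import reduce


def xor_blocks(blocks):
    """Byte-wise XOR of equal-length blocks."""
    if len({len(block) for block in blocks}) > 1:
        raise ValueError("all blocks in a stripe must have the same length")
    return bytes(
        reduce(lambda a, b: a ^ b, column) for column in zip(*blocks)
    )


def make_stripe(data_blocks):
    """Return the data blocks followed by one parity block."""
    return list(data_blocks) + [xor_blocks(data_blocks)]


def rebuild(stripe, lost_index):
    """Recompute the block at lost_index from the surviving blocks."""
    survivors = [b for i, b in enumerate(stripe) if i != lost_index]
    return xor_blocks(survivors)
```

Any one block of a consistent stripe, data or parity, can be rebuilt
from the others. If a data block is rewritten but the matching parity
write never lands, rebuild() quietly returns wrong data for the other
blocks in the stripe, which is the write hole described above.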

As a side note, ZFS is reportedly going to be the standard file system
on the upcoming Mac OS X 10.5.

Ty
Received on Fri Sep 14 2007 - 10:55:53 BST
