Saturday, October 10, 2009

Increase USB Flash Drive Write Speed

One of the biggest problems with USB flash drives is slow data write speed. This article walks you through a process that may increase your flash stick's write speed.

Okay, first I bought a Transcend 8GB USB flash stick. It initially came formatted with a FAT32 filesystem, so I decided to run a data read/write speed test. Mount the filesystem and execute the following:

# hdparm -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 102 MB in 3.05 seconds = 33.43 MB/sec

$ dd count=100 bs=1M if=/dev/urandom of=/media/disk/test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 29.5112 s, 3.6 MB/s


The disk read speed is good enough, but the write speed is not. That's because most NAND flash drives (the most commonly used flash sticks) have a 128k erase block size, while filesystems usually have a 4k (4096-byte) block size. And here we run into the problem: if the filesystem blocks are not aligned to the flash drive's erase blocks, a single filesystem write can straddle two erase blocks, and the overhead of disk writes increases. So what we can do is align the filesystem properly. A good way to do this is to use a fake geometry of 224 (32*7) heads and 56 (8*7) sectors/track. This produces 12544 (256*49) sectors/cylinder, so every cylinder is 49*128k and every cylinder boundary falls on an erase-block boundary.
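
A quick sanity check of that arithmetic, done in the shell with the same numbers:

$ echo $((224 * 56))                    # sectors per cylinder
12544
$ echo $((12544 * 512))                 # bytes per cylinder (512-byte sectors)
6422528
$ echo $((12544 * 512 / (128 * 1024)))  # whole 128k erase blocks per cylinder
49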

# fdisk -H 224 -S 56 /dev/sdb

Now turn on expert mode in fdisk and force the partition to begin on a 128k boundary. In my case I set the new beginning of data to sector 256 (256 sectors * 512 bytes = 128k). Create as many partitions as you need (I created only one, /dev/sdb1).
Do not forget to save the changes and write the new layout to the flash drive (all data on the flash disk will be lost).
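
For reference, the expert-mode session looks roughly like this (prompts vary between fdisk versions, so treat this as an illustrative sketch rather than an exact transcript):

# fdisk -H 224 -S 56 /dev/sdb
Command (m for help): n            <- create a new primary partition
Command (m for help): x            <- switch to expert mode
Expert command (m for help): b     <- move beginning of data in a partition
Partition number (1-4): 1
New beginning of data: 256         <- sector 256 * 512 bytes = 128k
Expert command (m for help): r     <- return to the main menu
Command (m for help): w            <- write the layout and exit (data is lost!)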
Now it's time to create the filesystem. I used ext4 because there is a way to tell it the stripe width, which keeps the filesystem writes aligned (32 blocks * 4k = 128k):

# mke2fs -t ext4 -E stripe-width=32 -m 0 /dev/sdb1
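
To confirm that the stripe width was recorded in the superblock, tune2fs can read it back (a quick optional check):

# tune2fs -l /dev/sdb1 | grep -i stripe
RAID stripe width:        32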

Now let's mount the filesystem and test the overall performance again:

# hdparm -t /dev/sdb

/dev/sdb:
Timing buffered disk reads: 102 MB in 3.01 seconds = 33.94 MB/sec

$ dd count=100 bs=1M if=/dev/urandom of=/media/disk/test
100+0 records in
100+0 records out
104857600 bytes (105 MB) copied, 17.0403 s, 6.2 MB/s


As we can see, the read performance is almost the same, while the write speed is considerably faster (6.2 MB/s versus 3.6 MB/s).

31 comments:

  1. Hi, thanks for a really interesting article.

    What is the algorithm you are using to calculate the heads and sectors?

    I have a Sandisk 32GB memory card I would like to use this technique on.

  2. The partitioning geometry doesn't depend on the disk size, so you should be OK with

    # fdisk -H 224 -S 56 /dev/sdb

  3. Now if only you could tell me how to do it in WinXP. Much appreciated. At least there is hope that there is a way.

  4. There are some quirks to doing this under Windows XP. Take a look at this nice guide: http://www.ocztechnologyforum.com/forum/showthread.php?48309-Partition-alignment-importance-under-Windows-XP-%2832-bit-and-64-bit%29-why-it-helps-with-stuttering-and-increases-drive-working-life

  5. You're wrong to use /dev/urandom, because this device is very slow. You can really measure your tuning with the following command, for example:

    # dd count=100 bs=1M if=/dev/zero of=/media/disk/test oflag=sync

    It uses /dev/zero, which is extremely fast; just compare the following:
    $ dd count=100 bs=1M if=/dev/urandom of=/dev/null
    $ dd count=100 bs=1M if=/dev/zero of=/dev/null

    Please report back what you actually get.

  6. Thank you for your tip! I ran your test and got:

    dd count=100 bs=1M if=/dev/urandom of=/dev/null
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 11.6463 s, 9.0 MB/s

    dd count=100 bs=1M if=/dev/zero of=/dev/null
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 0.0862834 s, 1.2 GB/s

    Also, I hadn't considered that for a precise test we need to add the `sync` mount option; it disables the write cache. Moreover, today's tests show extremely good performance with an NTFS filesystem using a 64k cluster size, aligned the same way as described in the article (it is basically the same flash drive; a possible mkntfs invocation is sketched at the end of this comment). Take a look at these values:

    $ sudo mount -t ntfs-3g -o sync,user,umask=0002 /dev/sdb1 /mnt/flash/

    $ dd if=/dev/zero bs=1M count=200 of=/mnt/flash/z
    200+0 records in
    200+0 records out
    209715200 bytes (210 MB) copied, 14.7261 s, 14.2 MB/s

    It would be good if someone could run the same tests to confirm these results.
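
    For reference, a 64k-cluster NTFS on an aligned partition can be created with something like this (mkntfs from ntfsprogs/ntfs-3g; -c sets the cluster size in bytes, and the label is only an example):

    # mkntfs -c 65536 -f -L flash /dev/sdb1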

  7. I tried filling all the flash disk space with one NTFS partition and noticed that the speed then dropped to approx. 5.7 MB/s. So there is nothing extraordinary.

  8. What I don't understand is why you haven't made the cylinder size exactly 128K, e.g.:

    # fdisk -H 32 -S 8 /dev/sda

    If you start the partition at cylinder 1, it will begin at the first track, i.e. at 4K; if you start at cylinder 2 or later, it will begin at a multiple of 128K. You can leave the first cylinder (erase block) empty, as now, or put an unaligned filesystem on it as partition 1.
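
    With that geometry every cylinder is exactly one erase block, as a quick shell check confirms:

    $ echo $((32 * 8 * 512))   # heads * sectors/track * bytes/sector
    131072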

  9. Dear Michael,

    Could you please explain what you are doing with the number of heads and sectors? Why are you changing them? Are you trying to get track size to be the same as the filesystem block size?

    And what does the number of heads mean? I know it does not mean physical heads, but what does it mean? Is the track size derived from total disk capacity divided by number of heads divided by number of cylinders divided by number of sectors per track?

    That does not seem to add up, for example:

    # fdisk -l /dev/sda

    Disk /dev/sda: 500.1 GB, 500107862016 bytes
    255 heads, 63 sectors/track, 60801 cylinders
    Units = cylinders of 16065 * 512 = 8225280 bytes

    Device Boot Start End Blocks Id System
    /dev/sda1 * 1 13 104391 83 Linux
    /dev/sda2 14 60801 488279610 8e Linux LVM
    # perl -e 'print 500107862016/255/63/60801,"\n"'
    512.002674878606
    #

    Cheers,
    Aleksey

  10. @Alex: Read "heads" as "tracks/cylinder" in the computer outputs and explanations. Now re-read the third paragraph of the blog above that begins with:

    "The disk read speed is good enough . . . "

    and ends with:

    ". . . 49*128k."

    The key concept is: "adjusting the disk geometry of the flash drive to multiples of the 128k NAND flash erase block." The 4k OS read/write size IS bound by the disk geometry parameters, while the 128k erase block size IS NOT. This way any 4k write operation stays within one 128k erase block.

    You have 16065 sectors/cylinder in total, which is 8225280 bytes/cylinder. 8225280 bytes is not evenly divisible by 128k (= 128*1024 bytes): actually, 8225280/128k = 62.753... You can't even begin to "align" the 4k read size to the 128k erase block size in this case.
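
    The same divisibility check as a one-liner (a non-zero remainder means the geometry can never be 128k-aligned):

    $ echo $((16065 * 512 % (128 * 1024)))
    98816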

    Replies
    1. I want to know the exact difference between a cylinder and an erase block. In the article above, where does the '7' factor in the heads (32*7) come from? Help me at gokhalesushant65@gmail.com; I am in the final year of engineering and we are doing a project related to USB. Also, point me to other relevant articles if possible.

    2. I want to know whether there is any way to change the NAND erase block size. Reply to me at gokhalesushant65@gmail.com.

  11. Thank you. That helps a lot.

    My actual USB stick is 8 GB, like the author's, and I was able to use the same parameters for -H and -S to get a cylinder size that is evenly divisible by 128k.

    (The 500 GB disk was a regular spinning disk, I was just trying to understand the fdisk output as it relates to this issue.)

  12. Big thanks to the author for this post, we tripled our USB flash drive write speed.

  13. Make sure the "sync" option is omitted when mounting flash drives. There are two major disadvantages to the "sync" option on flash drives:
    1. It causes more erase/write cycles on your flash drive, reducing its lifetime.
    2. It makes writes to the flash filesystem very slow (sometimes more than 100 times slower), because it keeps writing and re-writing sectors.

    Use the "async" option when mounting flash drives.

  14. Hello,
    is this applicable to a 16GB USB Stick?

  15. Interesting, but confusing (for a noob like me...)

    1. For the read test you use "/dev/sdb" and for the write test "/media/disk/test". Am I correct that this should also be /dev/sdb?
    2. Shouldn't people be warned NOT to run dd on a stick containing data!?
    3. Can you explain a bit more about the fdisk/expert/begin-of-partition step, please? Do I first create the partition, then hit 'x' for expert, and then change... what? My default says 2048; do I change that to 256? Is that the same for every size of stick?
    4. I tried this and the outcome was exactly the same speed.

  16. Hi Michael,
    my case is a Kingston DT 101 2G 32GB using NTFS.

    I ran fdisk -H 224 -S 56 /dev/sdd, followed the instructions, and got

    Disk /dev/sdd: 31.2 GB, 31221153792 bytes
    224 heads, 56 sectors/track, 4861 cylinders, total 60978816 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0xc3072e18

    Device Boot Start End Blocks Id System
    /dev/sdd1 256 60978815 30489280 7 HPFS/NTFS/exFAT


    After (w)riting to save things, I ran fdisk /dev/sdd to verify and obtained

    Disk /dev/sdd: 31.2 GB, 31221153792 bytes
    44 heads, 24 sectors/track, 57745 cylinders, total 60978816 sectors
    Units = sectors of 1 * 512 = 512 bytes
    Sector size (logical/physical): 512 bytes / 512 bytes
    I/O size (minimum/optimal): 512 bytes / 512 bytes
    Disk identifier: 0xc3072e18

    Device Boot Start End Blocks Id System
    /dev/sdd1 256 60978815 30489280 7 HPFS/NTFS/exFAT

    That is, the heads (44) and sectors (24) have changed!
    I tried several times and the results were the same.

    Is this the expected behaviour? Did this also happen for you?


    As I need to use NTFS (ntfs-3g), what are the necessary parameters when making the filesystem with mkntfs to obtain the correct alignment?

    Regards
    Fabio

  17. Are you really serious with this "guide"?
    It's really impossible to repartition and reformat every USB stick I use ;)
    Some of them don't belong to me, and I use them to pass data to other people...
    Why can Windows (2k, XP, ...) write 8-10 times faster than Linux?
    Linux has a serious problem with USB sticks - yeah! it is unusable...
    I had hoped it would be fixed soon, but I have been waiting about 5 years with no progress.
    Yes, I tried all these tricks with ehci and others, sync and async options, but still no change.
    My write speed on flash disks is about 400 kB/s - various hardware with Linux, the same poor results.
    The same hardware with Windows works fine. But any other work in Windows is much worse than in Linux.
    Writing to the SD card in my (Samsung) mobile connected via USB is almost instant. I cannot find any difference in the mount parameters.
    It is much faster to burn data to a DVD than to copy it to a flash disk. That's really poor and sad...

  18. Great tip. I use Linux to deploy Windows via USB sticks, and this helps a lot. Bookmarked!

  19. This is insane! Using if=/dev/zero, it writes almost in an instant!

    I tried testing this on my Transcend JetFlash 32GB with an ext4 and an xfs filesystem; the results are below. Here are the partitions of my flash drive:

    1st Partition - 1GB (ext4) Label: boot-os
    2nd Partition - 30GB- (xfs) Label: Transcend32G

    This is how I formatted the 1st partition:

    mkfs.ext4 -E stripe-width=32 -m 0 -O ^has_journal,large_file -L boot-os -vvv /dev/sdc1

    mke2fs 1.42 (29-Nov-2011)
    fs_types for mke2fs.conf resolution: 'ext4'
    warning: 224 blocks unused.

    Filesystem label=boot-os
    OS type: Linux
    Block size=4096 (log=2)
    Fragment size=4096 (log=2)
    Stride=0 blocks, Stripe width=32 blocks
    65664 inodes, 262144 blocks
    0 blocks (0.00%) reserved for the super user
    First data block=0
    Maximum filesystem blocks=268435456
    8 block groups
    32768 blocks per group, 32768 fragments per group
    8208 inodes per group
    Superblock backups stored on blocks:
    32768, 98304, 163840, 229376

    Allocating group tables: done
    Writing inode tables: done
    Writing superblocks and filesystem accounting information: done



    The second partition was created like this:

    mkfs.xfs -L Transcend32G /dev/sdc2

    meta-data=/dev/sdc2          isize=256    agcount=4, agsize=1915072 blks
             =                   sectsz=512   attr=2, projid32bit=0
    data     =                   bsize=4096   blocks=7660288, imaxpct=25
             =                   sunit=0      swidth=0 blks
    naming   =version 2          bsize=4096   ascii-ci=0
    log      =internal log       bsize=4096   blocks=3740, version=2
             =                   sectsz=512   sunit=0 blks, lazy-count=1
    realtime =none               extsz=4096   blocks=0, rtextents=0



    Then I tested their read speeds, and they're almost identical:

    N53SV ~ # hdparm -t /dev/sdc1

    /dev/sdc1:
    Timing buffered disk reads: 62 MB in 3.06 seconds = 20.29 MB/sec
    N53SV ~ # hdparm -t /dev/sdc2

    /dev/sdc2:
    Timing buffered disk reads: 62 MB in 3.06 seconds = 20.24 MB/sec



    Now here's the fun part, using if=/dev/zero:

    john@N53SV:~$ dd count=100 bs=1M if=/dev/zero of=/media/Transcend32G/test.xfs;echo;dd count=100 bs=1M if=/dev/zero of=/media/boot-os/test.ext4
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 0.0612922 s, 1.7 GB/s

    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 0.0589694 s, 1.8 GB/s


    Using if=/dev/urandom yields the results below:

    john@N53SV:~$ dd count=100 bs=1M if=/dev/urandom of=/media/Transcend32G/test.xfs;echo;dd count=100 bs=1M if=/dev/urandom of=/media/boot-os/test.ext4
    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 7.88737 s, 13.3 MB/s

    100+0 records in
    100+0 records out
    104857600 bytes (105 MB) copied, 7.90414 s, 13.3 MB/s

  20. hey John,

    try using the sync option when mounting to measure the real speed... it seems that in your case you are measuring the OS write cache.
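
    For instance, something like this (remount with sync just for the measurement, then switch back; the mount point is taken from your output above):

    # mount -o remount,sync /media/Transcend32G
    $ dd count=100 bs=1M if=/dev/zero of=/media/Transcend32G/test
    # mount -o remount,async /media/Transcend32G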

  21. Hi, can you help me please? How can I test read/write speed in Linux?

  22. I have a Kingston DTSE9 32GB USB stick and I've installed a Linux operating system on it (Debian Wheezy)... My question is what I can do to improve the write speed, because it is really bad. Simply browsing a menu will hang the system, and the stick is simply unusable as a hard drive.

    I tried to format it as this tutorial says, but with no success. I would very much appreciate any help, or a "step by step" noob tutorial :)

  23. You should really try using aufs on top of your rootfs if you're running a Linux distribution from a USB stick. Try reading this article https://help.ubuntu.com/community/aufsRootFileSystemOnUsbFlash and see if it helps.

  24. Hi,

    Does this apply to external USB hard disks as well (1TB-sized and so on)?
    Currently I get a 3 MB/s write speed, which is actually slower than I can download files from the Internet.

    Best regards
    denu

  25. Hey denu,

    No, it does not apply to hard disks, since they don't use a similar erase-block technique.

  26. Your guide suggests measuring write speed by writing from /dev/urandom. However, on a fast drive (e.g. USB3, or if one is testing an internal drive) the random number generator can become the bottleneck; on my machine it maxes out at 14 MB/s. It's best, therefore, to use /dev/zero as the source, e.g.:
    dd if=/dev/zero of=tempfile bs=1M count=1024 conv=fdatasync,notrunc

  27. Using /dev/zero to test is actually a bad idea (usually*), as filesystems such as ext4 will store files full of zeros using "sparse blocks", meaning that they don't actually write the data to the disk. (Instead, they just write a small amount of metadata saying, effectively, "this is a 2GB file full of zeros".) This is why John above observed such a fast write speed with /dev/zero -- it wasn't actually writing the data.

    I'd suggest running dd count=100 bs=1M if=/dev/urandom of=/dev/null and checking how fast it goes -- if your drive is slower than this, then you should be fine with urandom as input. Otherwise, the next best bet is to create a 100MB file on a fast storage device (another flash drive?) and use that as the input file, so that you're not generating the randomness on the fly (see the sketch after the footnote below).


    * Of course, if you're using an older/simpler filesystem such as FAT32, that doesn't support sparse blocks, then using /dev/zero is not a problem.
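
    A sketch of that pre-generated-file approach (the paths are only examples):

    $ dd if=/dev/urandom of=/tmp/random100M bs=1M count=100
    $ dd if=/tmp/random100M of=/media/disk/test bs=1M conv=fdatasync,notrunc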

    Replies
    1. I've tested this out, and found that at least with Ubuntu 12.04, following these steps gives no speed increase compared to just creating and formatting the partition with gparted. I also found that the stripe_width parameter to mke2fs didn't make any noticeable difference to write speed for me.

      In fact, I'm not sure that making these changes should even in principle speed up your drive. While it's true that flash drives use large (e.g. 128k) erase-block sizes, my understanding is that they still write data in sector-sized blocks (512 bytes), and therefore aligning your partition to 128k will not make a difference to write speeds, at least on a fresh drive. It should only make a difference once the drive starts erasing blocks of data, which only happens when the drive gets full. (And even then, partition alignment has very little to do with the speed.)

      So I don't think that following these steps will actually make your drive any faster. My suspicion is that the only reason the OP found his drive ran 2x faster than before is that he changed the filesystem from FAT32 to ext4, and ext4 is a much faster filesystem than FAT. (Or, at a minimum, the Linux ext4 driver is faster than the Linux VFAT driver.)

      Postscript: I think I've confirmed this theory. I reformatted with FAT32 on the exact same disk I was testing on before, and found that my write speed went from ~7 MB/s down to ~4.4 MB/s. So, the moral of the story: if you want your drive to work faster (at least under Linux), reformat it with ext4 instead of FAT32.
