Thursday, 27 December 2012

Ubuntu 12.10 on an SSD

After cloning the preloaded disk content to a small old disk, which freed up the 500GB hard disk the W530 came with, I also replaced the original disk with a 256GB SSD. This SSD is sold by Lenovo (P/N: 0A65620) and is a rebranded Samsung PM830 self-encrypting disk featuring FIPS-certified, hardware-based full disk encryption with AES-256. It allows me to set up an ITCS 300 compliant ThinkPad without the performance or battery life penalty imposed by software-based full disk encryption such as LUKS. This post describes the initial steps I took to make the system more SSD-friendly.

Firmware setup

I entered firmware setup and confirmed that the disk is initialized in AHCI mode instead of IDE. AHCI stands for Advanced Host Controller Interface and is the technical standard defining the operation of SATA host bus adapters. It offers features that Parallel ATA (IDE) does not, such as hot-plugging and native command queueing. Under Linux, AHCI is required to use TRIM support - to be covered later. Many systems ship with IDE legacy compatibility configured by default, as Windows XP had installation issues when AHCI was enabled, so it is always useful to double-check this setting. In my case, AHCI was selected by default.

I ran memtest, then returned to the firmware setup and verified that the firmware of my W530 is up to date. Normally, updating the firmware should be one of the first steps, but one has to make sure the memory modules pass the memory check routines before applying any firmware updates - never ever attempt such an update on an unstable system, as you could easily brick your box. In my case the firmware seemed to be up to date.

I made other minor changes such as setting the system time, disabling wake-on-LAN and enabling virtualization (both VT-x and VT-d) - these are not related to the SSD in any way. I also made sure the system boots in UEFI mode instead of BIOS emulation mode.
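Once any Linux system is running, the boot mode can be double-checked from the shell. This is a minimal sketch that relies on the kernel creating /sys/firmware/efi only on a UEFI boot. The function name and its optional directory argument are hypothetical and only there to make dry runs possible.

```shell
#!/bin/sh
# Report whether the running system was booted via UEFI or legacy BIOS.
# The kernel only creates /sys/firmware/efi on a UEFI boot.
# The optional argument (a directory to test) is a hypothetical hook for dry runs.
boot_mode() {
    efi_dir="${1:-/sys/firmware/efi}"
    if [ -d "$efi_dir" ]; then
        echo "UEFI"
    else
        echo "BIOS"
    fi
}

boot_mode
```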

Before booting from the SSD

I booted an Ubuntu 12.10 live USB stick, quickly ran hdparm and tested the disk with smartmontools. S.M.A.R.T. stands for Self-Monitoring, Analysis and Reporting Technology; those who are not familiar with it should definitely read up on the topic.

sudo hdparm -i /dev/sda # you can see 'Model=SAMSUNG MZ7PC256HAFU-000L7, FwRev=CXM72L1Q'
sudo hdparm -I /dev/sda # more detailed info about the disk
sudo apt-get install aptitude # aptitude is my choice of package management frontend on both servers and desktops. May look old school but is very powerful.
sudo aptitude install smartmontools
sudo smartctl -a /dev/sda | less -S
sudo smartctl -t short /dev/sda # wait until the test completes, check progress/result with the command above
sudo smartctl -t long /dev/sda # same as above
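The detailed hdparm -I output also shows whether the drive advertises TRIM. The helper below is only a sketch: the function name is hypothetical, and the sample line is an illustration of the format hdparm typically prints, not output captured from this disk.

```shell
#!/bin/sh
# Read `hdparm -I` output on stdin and report whether TRIM is advertised.
supports_trim() {
    if grep -q 'Data Set Management TRIM supported'; then
        echo "TRIM supported"
    else
        echo "TRIM not reported"
    fi
}

# Typical use on real hardware (needs root):
#   sudo hdparm -I /dev/sda | supports_trim

# Dry run with an illustrative sample line:
printf '   *    Data Set Management TRIM supported (limit 8 blocks)\n' | supports_trim
```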

A very short summary of performance and lifetime considerations for SSDs: unlike traditional rotating hard disks, where the smallest writable unit is usually one sector, SSDs organize their 4k pages into so-called erase blocks. Instead of overwriting a single sector, an SSD must erase and reprogram a full block - typically 512KB or more depending on the SSD. As an immediate consequence, many small writes may add up to a large amount of flash being erased and rewritten. This is called write amplification. Given that each flash cell can only be rewritten a limited number of times (around 5000 for standard MLC NAND flash), reducing the number of writes and the amount of data written will extend the lifetime of your SSD and maintain good performance in the long run. More information on the topic can be found here.
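To put rough numbers on this, here is a back-of-the-envelope sketch. The function names are hypothetical, and the daily write volume and average amplification factor are illustrative assumptions rather than measurements. Real drives use wear levelling and caching, so treat the result as an order of magnitude only.

```shell
#!/bin/sh
# Worst-case write amplification: one small write forces a whole erase block
# to be rewritten. Both sizes are in KiB.
write_amplification() {
    write_kib=$1
    erase_block_kib=$2
    echo $(( erase_block_kib / write_kib ))
}

# Rough lifetime in whole years:
#   capacity * cycles / (daily writes * amplification * 365)
# All inputs are illustrative assumptions, not measurements.
lifetime_years() {
    capacity_gb=$1
    cycles=$2
    daily_gb=$3
    amplification=$4
    echo $(( capacity_gb * cycles / (daily_gb * amplification * 365) ))
}

write_amplification 4 512        # 4 KiB page in a 512 KiB erase block: prints 128
lifetime_years 256 5000 10 10    # 256 GB, 5000 cycles, 10 GB/day, factor 10: prints 35
```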

Before launching the installer, I manually created a GPT partition table and the partitions with gdisk. In order to do so, I first had to enable the universe repository - I launched 'Software Sources' and enabled the second checkbox. Then I installed the gdisk package with aptitude. I decided to create a small 64MB EFI System partition, a 32GB root partition, 4GB of swap and a separate partition for home taking up the remaining unallocated space. Having read up on SSD partitioning and erase block size calculations earlier, I decided to go with partitions aligned to 2048 sectors - nowadays this is the default in gdisk.

sudo gdisk -l /dev/sda # partition listing after partitioning
Number  Start (sector)    End (sector)  Size       Code  Name
   1            2048          133119   64.0 MiB    EF00  EFI System
   2          133120        67241983   32.0 GiB    0700  Linux filesystem
   3        67241984        75630591   4.0 GiB     8200  Linux swap
   4        75630592       500118158   202.4 GiB   0700  Linux filesystem
sudo mkfs.vfat -F 32 -n EFISYS /dev/sda1 # create the fat32 filesystem
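The alignment of the start sectors listed above can be verified with a little arithmetic. This sketch assumes 512-byte logical sectors and a 512 KiB erase block. Both are assumptions about the drive, and the function name is hypothetical.

```shell
#!/bin/sh
# Check whether a partition start sector falls on an erase-block boundary.
# Defaults assume 512-byte sectors and a 512 KiB erase block (assumptions).
is_aligned() {
    start_sector=$1
    sector_bytes=${2:-512}
    erase_block_bytes=${3:-524288}
    erase_block_sectors=$(( erase_block_bytes / sector_bytes ))
    if [ $(( start_sector % erase_block_sectors )) -eq 0 ]; then
        echo "aligned"
    else
        echo "misaligned"
    fi
}

# Start sectors taken from the gdisk listing
for start in 2048 133120 67241984 75630592; do
    printf '%s %s\n' "$start" "$(is_aligned "$start")"
done

is_aligned 63    # the old DOS-era default start sector, for comparison
```

With 512-byte sectors, 2048 sectors is exactly 1 MiB, so every partition that starts on a multiple of 2048 also starts on a 512 KiB boundary.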

Note that I only formatted the EFI System partition, not the other partitions. Then I ran the installer, chose "Something else" at the disk setup menu and mapped the partitions I had created before. Both root and home use the ext4 filesystem. Installation completed in just a few minutes. Note that I did not reboot after installation, but made some additional adjustments to reduce the wear of the SSD. I made sure the ext4 partitions will always be mounted with discard, relatime and an increased commit interval.

  • discard enables TRIM support - the kernel will tell the SSD which blocks are no longer in use, allowing the SSD to optimize and improve both performance and lifetime in the long run.
  • relatime tells the kernel to only update the access time of files (small disk writes) if the previous access time is older than the modification or status change time. Many sources recommend noatime instead, but I stick to relatime since it does not break programs like mutt that rely on comparing atime with mtime. See the related discussion for further insights here.
  • commit=120 - when using the ext4 filesystem, the kernel will sync in-memory cached data to disk every 5 seconds by default. Increasing this to 2 minutes will boost performance and save many small writes. As the ThinkPad is battery backed, I find this a safe option. Even in the case of power loss (or complete kernel lock-ups or crashes) the filesystem will not be damaged thanks to journalling; I only risk losing up to 2 minutes of file modifications.
Many sources suggest adding these options to your fstab, but instead of doing so, one can specify default mount options on the filesystems themselves with the command tune2fs. This way, mounting the partitions externally - e.g. from a livecd - will automatically pick up and use these mount options unless explicitly overridden. Read the manual for mount and tune2fs. In addition to these, I also wanted to mount a tmpfs over /tmp - allowing it to take up to 25% of my RAM.

sudo su
tune2fs -o discard /dev/sda2 # one could also use -E mount_opts=discard which is newer and less documented
tune2fs -o discard /dev/sda4
# relatime is employed by default on Ubuntu 12.10 so no need to set it up
# default commit interval cannot be specified with the -o option, so enable it in fstab
# mount the root partition and edit
mount /dev/sda2 /mnt; nano /mnt/etc/fstab # add commit=120 to root and home
echo "tmpfs        /tmp        tmpfs        nosuid,nodev,size=25%        0 0" >> /mnt/etc/fstab
umount /mnt
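For those who prefer not to edit fstab by hand, the commit option can also be added with a small filter. This is only a sketch of an alternative to the nano step, not what I ran. The function name and the sample fstab entries are hypothetical. Since awk rebuilds the modified lines, the column alignment of those lines collapses to single spaces, which fstab does not mind.

```shell
#!/bin/sh
# Read an fstab on stdin and append commit=120 to the options of every ext4
# entry that does not already set a commit interval. Other lines pass through.
add_commit_option() {
    awk -v opt="commit=120" '
        /^[ \t]*#/ || NF < 4 { print; next }   # keep comments and blank lines
        $3 == "ext4" && $4 !~ /(^|,)commit=/ { $4 = $4 "," opt }
        { print }
    '
}

# Typical use while the new root is still mounted on /mnt:
#   add_commit_option < /mnt/etc/fstab > /tmp/fstab.new
#   (review /tmp/fstab.new, then copy it over /mnt/etc/fstab)

# Dry run on hypothetical entries:
printf '%s\n' \
    '# <file system> <mount point> <type> <options> <dump> <pass>' \
    'UUID=aaaa / ext4 errors=remount-ro 0 1' \
    'UUID=bbbb /home ext4 defaults 0 2' \
    'UUID=cccc none swap sw 0 0' | add_commit_option
```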

After applying these tweaks, I booted into my freshly installed Ubuntu. On the first boot I got a black background with a mouse pointer instead of the lightdm screen. I switched to a virtual terminal (press Ctrl-Alt-F1), logged in, installed aptitude, updated packages and powered off. The next boot worked fine. (From time to time, the black screen came back - I fixed it, and this will be covered in a separate post.)

Post installation tweaks

One additional tweak I strongly recommend is changing the default I/O scheduler used by the Linux kernel. The "Completely Fair Queuing" scheduler is optimized for rotating disks, where seek time is an important factor. Since the seek time of an SSD does not vary with the on-disk location of the data, it is best to avoid the 'CFQ' scheduler and use 'deadline' or 'noop' instead. Many sources suggest hard-coding the scheduler in kernel boot options or scripts, whereas I prefer a dynamic approach that sets 'CFQ' for rotating disks and 'deadline' for non-rotating disks automatically. The following command will create udev rules that automatically assign the right scheduler based on the rotational property of the disk.

cat <<EOF | sudo tee /etc/udev/rules.d/60-io-scheduler.rules
# set deadline scheduler for non-rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="0", ATTR{queue/scheduler}="deadline"

# set cfq scheduler for rotating disks
ACTION=="add|change", KERNEL=="sd[a-z]", ATTR{queue/rotational}=="1", ATTR{queue/scheduler}="cfq"
EOF

Before adding this rule, I recommend checking the current scheduler - in my case it turned out that Ubuntu 12.10 was automatically selecting 'deadline' anyway, so I did not apply the tweak mentioned above on this particular setup.

cat /sys/block/sda/queue/scheduler
noop [deadline] cfq
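The active scheduler is the one shown in square brackets. To check all disks at once, the bracketed name can be extracted and printed next to the rotational flag. This is a small sketch, and the function name is hypothetical.

```shell
#!/bin/sh
# Print the active scheduler, i.e. the name inside the square brackets,
# from the contents of a queue/scheduler file read on stdin.
active_scheduler() {
    sed -n 's/.*\[\([^]]*\)].*/\1/p'
}

# Dry run with the output shown above:
echo 'noop [deadline] cfq' | active_scheduler

# Summary for every sd* disk present on this machine
for dev in /sys/block/sd*; do
    [ -r "$dev/queue/scheduler" ] || continue
    printf '%s rotational=%s scheduler=%s\n' \
        "${dev##*/}" \
        "$(cat "$dev/queue/rotational")" \
        "$(active_scheduler < "$dev/queue/scheduler")"
done
```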

Last but not least, some concerns regarding the swap partition. The old-school approach is to create a swap partition that is at least the size of your memory. I do not want to hibernate - only suspend to RAM - and I feel that 16GB of swap is simply overkill. Many sources recommend leaving out the swap partition completely, but as this laptop is going to be a workhorse, I opted for creating a 4GB swap partition on the SSD. Recent Linux kernels send TRIM commands automatically if supported by the disk the swap partition is located on. Additionally, I tuned sysctl variables to decrease the swappiness and the tendency of the kernel to swap out directory entry and inode caches. I expect these settings to prevent swapping until it is really necessary.

# transient changes until reboot
echo 1 | sudo tee /proc/sys/vm/swappiness
echo 50 | sudo tee /proc/sys/vm/vfs_cache_pressure

# persistent changes
cat <<EOF | sudo tee -a /etc/sysctl.conf
vm.swappiness=1
vm.vfs_cache_pressure=50
EOF
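Appending to /etc/sysctl.conf more than once leaves duplicate lines behind. A small helper can keep the persistent settings idempotent. This is only a sketch: the function name is hypothetical, and the demo runs against a temporary file instead of the real configuration.

```shell
#!/bin/sh
# Set key=value in a sysctl-style config file: update the line if the key is
# already present, append it otherwise. Running it twice changes nothing.
set_conf_value() {
    key=$1
    value=$2
    file=$3
    if grep -q "^$key *=" "$file" 2>/dev/null; then
        # Rewrite through a temporary copy so the original file keeps its permissions
        sed "s|^$key *=.*|$key=$value|" "$file" > "$file.tmp" &&
            cat "$file.tmp" > "$file" &&
            rm -f "$file.tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Demo on a scratch file
conf=$(mktemp)
set_conf_value vm.swappiness 1 "$conf"
set_conf_value vm.vfs_cache_pressure 50 "$conf"
set_conf_value vm.swappiness 1 "$conf"    # second call does not add a duplicate
cat "$conf"
rm -f "$conf"
```

For real use, the function would have to run as root against /etc/sysctl.conf, followed by sudo sysctl -p to load the values.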

As a closing note, end-user behaviour also matters a lot. I try to pay attention to creating transient files under /tmp - if you compile a lot, this makes a real difference. One should find a healthy balance between being SSD-aware and being too paranoid about it. I investigated methods to decrease disk writes caused by syslog - many suggest mounting a tmpfs over /var/log, which would mean all your logs are lost when you reboot, making any kind of audit or post-mortem debugging impossible. I ended up sticking to the commit=120 mount option after some calculations. You should do the math and enjoy your disk. As with any kind of disk, check the SMART attributes from time to time, and make sure you take backups on a regular basis.
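For the periodic SMART check, the wear indicator can be pulled out of the attribute table. This sketch assumes the drive reports an attribute named Wear_Leveling_Count, as Samsung drives commonly do. Other vendors use different names, so check the smartctl -A output of your own disk first. The function name and the sample values are hypothetical.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the normalized value of the
# Wear_Leveling_Count attribute (it starts high and decreases as the flash wears).
wear_remaining() {
    awk '$2 == "Wear_Leveling_Count" { print $4 + 0 }'
}

# Typical use on real hardware (needs root):
#   sudo smartctl -A /dev/sda | wear_remaining

# Dry run with a hypothetical attribute line:
printf '177 Wear_Leveling_Count     0x0013   099   099   000    Pre-fail  Always       -       12\n' | wear_remaining
```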
