Sunday, 21 July 2013

Changing UIDs of service accounts on Linux

About a month ago I was setting up a proprietary identity management solution on headless Linux servers. During the installation process I tried to keep the number of installed packages and libraries to the bare minimum, to create an as-lean-as-possible setup as opposed to the bloated 'typical' deployments.

This goal, in tandem with my choice to use the most up-to-date version of the target Linux distribution, resulted in an 'advanced' installation procedure using the command line. While it was no surprise that the graphical installshield wizard did not start due to missing libraries on the headless server, it surprised me that the non-GUI installation failed each time while setting up service accounts.

I tried various combinations, disabled SELinux, and even created the service account manually before re-running the installer, but it still kept failing. A quick search on the internet did not yield any useful hint, so I investigated further. I enabled debug logging for the installshield wizard, then extracted the embedded Java installer from the blob, and found that various native binaries are unpacked and executed during the installation process.

One of these binaries was linked against an older version of a shared library (libstdc++.so.5) that was not available in any of the packages any more. As a quick and dirty workaround I created a symlink with that name pointing to libstdc++.so.6, and confirmed that this resolved the problem.

The installation was successful, however, it left me with a technical user and group that I did not fancy.


$ grep ldap /etc/passwd /etc/group
/etc/passwd:idsldap:x:502:501::/home/idsldap:/bin/ksh
/etc/group:idsldap:x:501:root

  • First and foremost, I prefer technical accounts to have numeric UIDs from the range reserved for system accounts, as opposed to having them created as normal users. This range varies by distribution: RedHat & SuSE based distros reserve UIDs smaller than 500 for system accounts, while Debian based ones create normal users with the first available UID after 999.
  • If I have the control over the numeric values, I prefer to introduce some semantics into the naming convention, like creating the technical user for an LDAP server with UID 389. This is just a cosmetic point with no real impact, but as I had to change the UID anyway...
  • Last but not least, I very strongly dislike it when the dedicated primary group of a service account uses the same name as the service user but a different numeric ID.
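
These preferences can also be checked mechanically. The following is only a minimal Python sketch, assuming passwd- and group-style lines as input and a system UID limit of 500 (the RedHat/SuSE convention mentioned above). The function name audit_service_account and the system_uid_limit parameter are made up for illustration and are not part of any existing tool.

```python
def audit_service_account(passwd_line, group_line, system_uid_limit=500):
    """Return a list of findings for one service account.

    passwd_line: 'name:x:uid:gid:gecos:home:shell'
    group_line:  'name:x:gid:members'
    system_uid_limit is an assumption; adjust it per distribution.
    """
    name, _, uid, primary_gid = passwd_line.split(':')[:4]
    group_name, _, gid = group_line.split(':')[:3]
    uid, primary_gid, gid = int(uid), int(primary_gid), int(gid)

    findings = []
    # Preference 1: service accounts should live in the system UID range.
    if uid >= system_uid_limit:
        findings.append('UID %d is outside the system range' % uid)
    # Preference 3: the dedicated group should share the numeric ID of the user.
    if group_name == name and primary_gid == gid and gid != uid:
        findings.append('group %s has GID %d but the user has UID %d'
                        % (group_name, gid, uid))
    return findings


findings = audit_service_account(
    'idsldap:x:502:501::/home/idsldap:/bin/ksh',
    'idsldap:x:501:root')
for finding in findings:
    print(finding)
```

Fed with the two lines from the grep output above, the sketch reports both issues. Once UID and GID are both 389 it reports nothing.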

This post will cover the commands I applied to fix the UID and GID of the service user and group. File ownership is automatically fixed, but only for the files under the home directory (and mail spool) of the user, so additional commands were needed to also fix the ownership of files located elsewhere.


$ find / -user idsldap | less
$ usermod -u 389 idsldap
$ find / -user idsldap | less # you see that ownership is only updated under the home directory and mail spool
$ find / -user 502 | less # the old uid is still present on other files
$ find / -user 502 -exec chown idsldap {} \;
$ find / -user 502 | less # almost done, but by default, symbolic links are dereferenced, therefore not affected.
$ find / -user 502 -exec chown -h idsldap {} \;
$ find / -user 502 # none - confirms we are done.

$ find / -group 501 | less
$ groupmod -g 389 idsldap
$ find / -group 501 -exec chgrp -h idsldap {} \;
$ find / -group 501 # confirm we are done
$ find / -group idsldap | less # this gives the same output as the first group related command
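
The same find-then-chown idea can be expressed in Python. This is a rough sketch rather than a replacement for the commands above. The helper names find_owned and reown are hypothetical, and unlike find the walk does not report the starting directory itself.

```python
import os


def find_owned(root, uid=None, gid=None):
    """Yield paths below root whose owner or group matches (like find -user / -group).

    os.lstat is used so that symbolic links are examined themselves,
    not the files they point to.
    """
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            path = os.path.join(dirpath, name)
            st = os.lstat(path)
            if (uid is not None and st.st_uid == uid) or \
               (gid is not None and st.st_gid == gid):
                yield path


def reown(paths, new_uid=-1, new_gid=-1):
    """Change ownership without following symlinks (like chown -h / chgrp -h).

    A value of -1 leaves that ID unchanged. Changing the owner requires root.
    """
    for path in paths:
        os.lchown(path, new_uid, new_gid)
```

Run as root, something like reown(list(find_owned('/', uid=502)), new_uid=389) would correspond to the chown -h step, and reown(list(find_owned('/', gid=501)), new_gid=389) to the chgrp -h step.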

While most people probably would not put any effort into changing UIDs but rather leave it that way or reinstall from scratch, I do believe that this case was worth investing a bit of extra time.

Saturday, 22 June 2013

OpenCL, Python and Ubuntu 12.10 (Part 2)

Using multiple OpenCL devices for computation intensive tasks might turn out to be a bit more challenging. Distributing load between heterogeneous GPUs is straightforward; however, getting the cumulative performance of all the GPUs up to the sum of the individual devices is a bit tricky, at least with the drivers that ship with Ubuntu 12.10.
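
One simple way to put a number on this is to compare the throughput measured with all devices running together against the sum of the throughputs measured with each device running alone. The snippet below is only an illustrative Python sketch. The function name scaling_efficiency is made up, and the figures are hypothetical placeholders, not measurements from my setup.

```python
def scaling_efficiency(individual_rates, combined_rate):
    """Ratio of the measured combined throughput to the sum of the
    throughputs measured with each device running alone."""
    ideal = sum(individual_rates)
    if ideal <= 0:
        raise ValueError('individual rates must sum to a positive value')
    return combined_rate / float(ideal)


# Hypothetical numbers, for illustration only (arbitrary units per second).
alone = [400.0, 250.0]
together = 585.0
print('efficiency: %.2f' % scaling_efficiency(alone, together))
```

A value close to 1.0 means the devices scale as hoped. A clearly lower value hints at a driver or configuration problem.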

The case that motivated this post is tweaking a configuration with multiple AMD GPUs of different generations. There is a lot of conflicting information available on whether an X session is required or not, whether to use crossfire cables or not, and whether dummy-plugs are required or not. It also seems to be uncertain if and how cards of different generations play together.

Having searched the internet for utilities that help in debugging OpenCL related issues I decided to create my own version, that is not extremely chatty but provides a quick overview of the devices recognised along with their most important performance related parameters. The code is based on a script I found on a forum, extended and customized to match my needs.


#!/usr/bin/python

# 2013-04-03 03:35:03 

import sys
import os
import time
import platform
import imp

def getPyOpenCLPath():
    try:
        file, pathname, descr = imp.find_module('pyopencl')
    except:
        pathname = 'Not found'
    return str(pathname)

path = getPyOpenCLPath()

print 'opencl-info @ %s' % time.asctime()
print 'Operating System: %s %s' % (platform.system(), platform.dist())
print 'Python Version: %s (%s)' % (platform.python_version(), platform.architecture()[0])
print 'PyOpenCL Path: %s' % path

if path == 'Not found':
    print 'Exiting' 
    sys.exit()

try:
    import pyopencl
    import pyopencl.version
except:
    print 'Unable to load PyOpenCL! OpenCL not supported?'
    sys.exit()
 
print 'PyOpenCL Version: %s' % pyopencl.VERSION_TEXT

try:
    platforms = pyopencl.get_platforms()
except:
    print 'Cannot get platform.'
    sys.exit()

if len(platforms) == 0:
    print 'No OpenCL platforms found!' 
    sys.exit()

count = 0

for i,p in enumerate(platforms):
    print ''
    print '[cl:%d] %s' % (i, p.name.replace('\x00','').strip())
    for k in ['vendor', 'profile', 'version']:
        print '    %s %s' % ((k + ':').ljust(16), getattr(p,k))
    print ''

    devices = platforms[i].get_devices()
    if len(devices) > 0:
        # Iterate through devices
        for j,d in enumerate(devices):
            count += 1
            print '    [cl:%d:%d] %s' % (i, j, d.name.replace('\x00','').strip())
            print '        type:                %s' % pyopencl.device_type.to_string(d.type)
            print '        memory:              %d MB' % (d.global_mem_size//1024//1024)
            print '        compute units:       %s' % d.max_compute_units
            print '        max clock:           %s MHz' % d.max_clock_frequency
            print '        max work group size: %s' % d.max_work_group_size
            print '        max work item size:  %s' % d.max_work_item_sizes
            # Iterate through device info
            #for name in filter( lambda x: not x.startswith('_'), dir(d)):
            #    try:
            #        print(name + ': '+ str(getattr(d, name)))
            #    except:
            #        print(name + ': (skipped)')

After some trial and error and a bit of tinkering I was able to set up the environment to run fine without an X session, without crossfire cables and without any dummy plug, and to yield optimal performance. Documenting the configuration details is beyond the scope of this post; however, the script above is provided to help others in OpenCL related debugging and optimisation.

Saturday, 18 May 2013

Tracking down TrackPoint issues (Part 2)

This post is a follow-up to Tracking down TrackPoint issues. I noticed a few days ago that hot-plugging a USB mouse "does not work". I normally do not plug a mouse into my thinkpad, so this had not come to my attention before.

After seeing that my cursor would not move after plugging in the mouse, I decided to gather more information, starting with dmesg to verify that the device is connected and recognized. Then I inspected the output of xinput list and found that the mouse was listed as a floating slave. Manually running xinput enable "HID 04b3:3107" fixed the situation: it attached the floating device as a slave to "Virtual core pointer".


⎡ Virtual core pointer                     id=2 [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer               id=4 [slave  pointer  (2)]
⎜   ↳ SynPS/2 Synaptics TouchPad               id=12 [slave  pointer  (2)]
⎜   ↳ TPPS/2 IBM TrackPoint                    id=14 [slave  pointer  (2)]
⎜   ↳ HID 04b3:3107                            id=15 [slave  pointer  (2)]
⎣ Virtual core keyboard                    id=3 [master keyboard (2)]
    ↳ Virtual core XTEST keyboard              id=5 [slave  keyboard (3)]
    ↳ Power Button                             id=6 [slave  keyboard (3)]
    ↳ Video Bus                                id=7 [slave  keyboard (3)]
    ↳ Video Bus                                id=8 [slave  keyboard (3)]
    ↳ Sleep Button                             id=9 [slave  keyboard (3)]
    ↳ Integrated Camera                        id=10 [slave  keyboard (3)]
    ↳ AT Translated Set 2 keyboard             id=11 [slave  keyboard (3)]
    ↳ ThinkPad Extra Buttons                   id=13 [slave  keyboard (3)]

After restarting my X session - logging out and back in - with the mouse already connected I found it to be enabled and working correctly.
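
Spotting floating devices can be automated by scanning the xinput list output. Here is a small Python sketch, assuming the line layout shown in the listing above. The sample text is abridged, and its floating line is reconstructed from the description, not captured output. The name floating_devices is hypothetical.

```python
import re

# Matches lines such as "∼ HID 04b3:3107    id=15 [floating slave]".
# The leading character class strips whitespace and the tree-drawing symbols.
DEVICE_LINE = re.compile(
    r'^[\s⎡⎜⎣↳∼~]*(?P<name>.+?)\s+id=(?P<id>\d+)\s+\[(?P<role>[^\]]+)\]')


def floating_devices(xinput_output):
    """Return (name, id) for every device reported as a floating slave."""
    result = []
    for line in xinput_output.splitlines():
        match = DEVICE_LINE.match(line)
        if match and match.group('role').strip() == 'floating slave':
            result.append((match.group('name'), int(match.group('id'))))
    return result


sample = """\
⎡ Virtual core pointer                     id=2 [master pointer  (3)]
⎜   ↳ TPPS/2 IBM TrackPoint                    id=14 [slave  pointer  (2)]
∼ HID 04b3:3107                            id=15 [floating slave]
"""
print(floating_devices(sample))
```

On a live system the text would come from running xinput list. Each reported name could then be handed to xinput enable.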

My conclusion was that the device got connected but not enabled after hot-plugging. I revisited my custom hotplug command in .local/bin/touchpad-config (see the previous post referred to above). This script was executed on startup, and then whenever a device was added or removed. Obviously, the touchpad and trackpoint related commands do not need to be run when an external mouse is connected. They are also unnecessary when any device is disconnected.

I was searching for a way to detect which device is being added or removed and perform different logic based on that, and came across this example in the GNOME git repository on github. It clearly demonstrated how the device information and the action are handed over to the script, and allowed me to enhance my script with logging so I could see what was happening under the hood.

Eventually, it turned out that my script was only executed a couple of times on startup, but not invoked at all afterwards when the mouse was hot-plugged. The cause was some cruft in my Xorg configuration, left over from exploring the Optimus related tweaks covered in my other posts. I had to change the Option "AutoEnableDevices" "false" setting in xorg.conf.

Although the original issue was fully resolved by the aforementioned change to xorg.conf, the sub-optimal .local/bin/touchpad-config was still bugging me, so I altered it to only execute the commands relevant for the event that triggered the script. I ended up with the following hotplug script, which is working flawlessly in my environment:


#!/bin/sh

action=$2
shift 4
device=$@

log () {
 echo "$1" >> /tmp/hotplug-cmd.log
}

log "$action '$device'"

if [ x"$action" = xadded -o x"$action" = xpresent ]; then
 case $device in
  "SynPS/2 Synaptics TouchPad")
   synclient PalmDetect=1 PalmMinWidth=5 \
    TapButton3=2 HorizTwoFingerScroll=1
   log "configured touchpad"
   ;;
  "TPPS/2 IBM TrackPoint")
   xinput disable "<default pointer>"
   xinput enable "TPPS/2 IBM TrackPoint"
   log "disabled <default pointer>, enabled trackpoint"
   ;;
  *)
   log "nop"
   ;;
 esac
fi

Note that the script above still includes logging which may be safely removed.
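
For readers who prefer Python, the dispatch logic of the script can be sketched as follows. The argument layout is inferred purely from the shell script (action in the second argument, device name from the fifth argument on). The flag names '-t' and '-i' in the example call are assumptions, and the function names are hypothetical. The sketch only returns the commands instead of running them.

```python
def parse_hotplug_args(argv):
    """Mirror the positional handling of the shell script: the action is the
    second argument and the device name is everything after the fourth."""
    if len(argv) < 5:
        raise ValueError('expected at least five arguments, got %d' % len(argv))
    action = argv[1]
    device = ' '.join(argv[4:])
    return action, device


def commands_for(action, device):
    """Return the commands the shell script would run for this event."""
    if action not in ('added', 'present'):
        return []
    if device == 'SynPS/2 Synaptics TouchPad':
        return [['synclient', 'PalmDetect=1', 'PalmMinWidth=5',
                 'TapButton3=2', 'HorizTwoFingerScroll=1']]
    if device == 'TPPS/2 IBM TrackPoint':
        return [['xinput', 'disable', '<default pointer>'],
                ['xinput', 'enable', 'TPPS/2 IBM TrackPoint']]
    return []  # any other device: nothing to do


# '-t' and '-i' are assumed flag names; they are ignored, just as in the shell script.
action, device = parse_hotplug_args(['-t', 'added', '-i', '15', 'HID', '04b3:3107'])
print(action, repr(device), commands_for(action, device))
```

Returning the command lists keeps the decision logic testable without touching the real input devices. Each list could be passed to subprocess.call to actually apply it.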

Monday, 1 April 2013

OpenCL, Python and Ubuntu 12.10

Installing PyOpenCL, the python OpenCL wrapper, on Ubuntu 12.10 could have been just as easy as installing the corresponding package python-pyopencl, but this is currently not possible because of a packaging bug that was reported and confirmed before the release of 12.10 but has not been fixed since. I am not going to comment on this but document my workaround instead.

What is OpenCL?

OpenCL is the short name for Open Computing Language - an open standard that specifies a vendor agnostic language and cross-platform framework for developing code that can be executed on parallel computing enabled platforms, such as GPUs (graphics cards), FPGAs (field programmable gate arrays) or newer generation multicore CPUs.

The most typical use-case is using the graphics device for non-graphics computation tasks. Typical computation intensive tasks can be broken down into multiple sub-tasks that can be executed independently of each other, in parallel. The many purpose-built processing pipelines, special shift register circuits, shaders and vertex processors integrated into modern graphics devices are equipped with a multitude of 256-bit registers, and excel at high-volume calculation of matrix operations, interpolations and special transformations (e.g. Fourier, Laplace). OpenCL enables one to unleash this raw parallel/stream processing power for generic purposes other than gaming.

Currently, OpenCL support is provided by proprietary drivers for Nvidia GPUs, AMD GPUs and CPUs and Intel CPUs. Although these drivers are not open source components, OpenCL, being an open standard, enables one to avoid coding against vendor specific APIs and create software mostly portable across computing platforms. (One well known example of such vendor specific APIs is provided by Nvidia's Compute Unified Device Architecture - CUDA - but recent CUDA versions expose OpenCL interfaces as well.)

At the time of this writing, OpenCL is my preferred platform for computation intensive, time critical applications. It is worth being aware of, and tinkering with for fun and profit.

What is the bug?

In a nutshell, the package cannot be installed: it depends on a non-existent package, namely opencl-icd, which is not available in any of the official or partner repositories and is not provided by any other package either. As the bug was reported and confirmed one month before the final release of 12.10, I am rather surprised it has not been fixed yet, especially as fixing the problematic dependency requires merely one line to be changed in the package metadata.

I have quickly read up on the format of debian packages to make sure I understand the issue correctly and found the following alternatives for solving the current problem:

  • Modify the package python-pyopencl by removing the dependency on opencl-icd. This is a low hanging fruit but would lead to broken packages in the long run, as any update to the package would overwrite my local changes. This issue can be dodged by putting the package on hold.
  • The strategic solution would be to alter the metadata of the packages nvidia-current and fglrx so that these packages would provide opencl-icd as a virtual package. This is not my preferred approach for a local workaround, as it leads to maintenance issues on package updates. The concerns and workarounds of the previous alternative apply here as well.
  • My preferred approach was creating a dummy package opencl-icd that merely depends on the appropriate drivers but does not contain any real file itself, only metadata. This approach will shield me from package update issues.

Creating and installing the dummy package


$ mkdir /tmp/dummy && cd /tmp/dummy
$ tar czf data.tar.gz -T /dev/null # creating an empty archive (GNU tar reads the empty file list from /dev/null)
$ tar tzf data.tar.gz   # this verifies the archive is empty and can be read without errors
$ touch md5sums
$ cat <<EOF > control
Package: opencl-icd
Source: opencl-icd
Version: 1.0.0-1
Architecture: amd64
Maintainer: Tibor Bősze <tibor.boesze@gmail.com>
Installed-Size: 4
Depends: fglrx | nvidia-current | intel-ocl-sdk
Section: universe/libs
Priority: optional
Homepage: http://tuxicate.blogspot.com/
Description: Dummy package
 This is a dummy package to workaround bug 1048036
EOF
$ tar czf control.tar.gz md5sums control
$ echo "2.0" > debian-binary
$ ar rcs opencl-icd_1.0.0-1_amd64.deb debian-binary control.tar.gz data.tar.gz
$ dpkg-deb --verbose --info opencl-icd_1.0.0-1_amd64.deb # verify
$ sudo dpkg -i opencl-icd_1.0.0-1_amd64.deb # install

The package, once installed, will appear under Obsolete and Locally Created Packages in aptitude, so it is easy to track and uninstall if needed. Now, python-pyopencl can be installed and used without issues.
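
The only slightly unusual ingredient of the dummy package is the empty data.tar.gz. The following is a hedged Python sketch, using only the standard tarfile module, that creates such an archive in memory and lists its members. It does not build the .deb itself. The helper names are made up for illustration.

```python
import io
import tarfile


def empty_tar_gz():
    """Return the bytes of a gzip-compressed tar archive with no members."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode='w:gz'):
        pass  # add nothing: the archive stays empty
    return buffer.getvalue()


def list_members(archive_bytes):
    """Return the member names, the equivalent of `tar tzf`."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode='r:gz') as archive:
        return archive.getnames()


data = empty_tar_gz()
print(len(data) > 0, list_members(data))
```

Writing the returned bytes to data.tar.gz would give a file playing the same role as the one created with tar in the listing above.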

One could modify the dependencies (the line starting with Depends:) as required; I have added the Nvidia or AMD Radeon/FireGL graphics drivers, or the Intel OpenCL SDK. The Intel SDK can be downloaded as an rpm from Intel, and converted to a deb package with alien.

Tuesday, 12 February 2013

Optimus and Ubuntu 12.10 (Part 5)

This post is the fifth of a series of posts on tweaking Ubuntu 12.10 to exploit Optimus technology on my Lenovo W530 to the extent I need. Make sure you are familiar with the context and objectives.

As described in Part 1, plymouth has issues when Optimus is enabled on the W530 in EFI mode. The issue around missing usplash/plymouth turned out to be connected to the random order in which the kernel initialized the graphics devices.

Part 3 captures the root cause of the symptom: Usplash is hardcoded to render to /dev/fb0, which is fine if the IGD is initialized first by the kernel, but not good if /dev/fb0 is associated with the DIS framebuffer.

One workaround already explained is blacklisting nouveau, and loading it later, after usplash is started on the IGD framebuffer. I sought more elegant approaches and tinkered with the initramfs to explore the alternatives:

  • Forcing proper module load order in initrd:/etc/modules - I found that this file is only processed after the kernel finished autoloading, and cannot load blacklisted modules.
  • Crafting custom initramfs scripts to take care of module loading early in the boot process - this approach did not prove effective either.
  • Creating an initramfs script to disable DIS proved to be impossible, as debugfs was not mounted at the time of initramfs script execution.
  • Packaging custom udev rules into the initial ramdisk to ensure the desired framebuffer numbering. I settled with this option.

Ensuring consistent framebuffer numbering via udev rules

Traditionally, Unix systems exposed a static set of device files under /dev, but current versions of Linux ship with a device manager that can dynamically populate the /dev directory with nodes for the devices present. It allows persistent naming that is insensitive to the order of hardware initialization or hotplug. This is achieved by udev rules, which match the devices based on their attributes and perform actions such as creating device nodes and symlinks, changing permissions and ownership, or in fact running arbitrary logic.

This section provides a very brief walk-through on how the rules for the current use-case can be created. First, information about the framebuffer devices had to be gathered, and the properties and attributes for the matching part had to be identified.


$ udevadm info -a -n /dev/fb0 | head -n 24

Udevadm info starts with the device specified by the devpath and then
walks up the chain of parent devices. It prints for every device
found, all possible attributes in the udev rules key format.
A rule to match, can be composed by the attributes of the device
and the attributes from one single parent device.

  looking at device '/devices/pci0000:00/0000:00:01.0/0000:01:00.0/graphics/fb0':
    KERNEL=="fb0"
    SUBSYSTEM=="graphics"
    DRIVER==""
    ATTR{pan}=="0,0"
    ATTR{name}=="nouveaufb"
    ATTR{mode}==""
    ATTR{console}==""
    ATTR{blank}==""
    ATTR{modes}=="U:1024x768p-0"
    ATTR{state}=="0"
    ATTR{bits_per_pixel}=="32"
    ATTR{cursor}==""
    ATTR{rotate}=="0"
    ATTR{stride}=="4096"
    ATTR{virtual_size}=="1024,768"

$ udevadm info -a -n /dev/fb1 | sed -n '8,24p'
  looking at device '/devices/pci0000:00/0000:00:02.0/graphics/fb1':
    KERNEL=="fb1"
    SUBSYSTEM=="graphics"
    DRIVER==""
    ATTR{pan}=="0,0"
    ATTR{name}=="inteldrmfb"
    ATTR{mode}==""
    ATTR{console}==""
    ATTR{blank}==""
    ATTR{modes}=="U:1920x1080p-0"
    ATTR{state}=="0"
    ATTR{bits_per_pixel}=="32"
    ATTR{cursor}==""
    ATTR{rotate}=="0"
    ATTR{stride}=="7680"
    ATTR{virtual_size}=="1920,1080"

As can be seen from the listing above, it is sufficient to match devices of the graphics subsystem named by the kernel as "fb0" and "fb1". The attribute "name" can be used to discriminate between the two framebuffers. The goal is to create the device node /dev/fb0 for the intel framebuffer and /dev/fb1 for the nouveau framebuffer. This can be achieved by the following udev rules:


KERNEL=="fb?", SUBSYSTEM=="graphics", ATTR{name}=="nouveaufb", NAME="fb1"
KERNEL=="fb?", SUBSYSTEM=="graphics", ATTR{name}=="inteldrmfb", NAME="fb0"

Normally, one would save these lines to a file in the appropriate folder, say /etc/udev/rules.d/82-explicit-fb-assignment.rules. (The numbering is used to ensure the rule is not overridden, as rules are parsed in lexicographical order.)

In this very case, however, the rule will not yield the desired results. One would see usplash on shutdown, but not on boot. The rule file is located on disk, in the root partition to be precise. It gets mounted after usplash initializes during boot...
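
Independent of where the rule file ends up, the matching logic of the two rules can be illustrated in a few lines of Python. This is merely a sketch that mimics the ATTR{name} match on text in the udevadm info format. The names TARGET_NODE, framebuffer_name and target_node are hypothetical and have nothing to do with udev's implementation.

```python
import re

# Desired node name per framebuffer driver name, as expressed by the two rules.
TARGET_NODE = {'inteldrmfb': 'fb0', 'nouveaufb': 'fb1'}


def framebuffer_name(udevadm_output):
    """Extract the value of the first ATTR{name} line, which belongs to the
    device itself (attributes of parent devices are printed further down)."""
    match = re.search(r'ATTR\{name\}=="([^"]*)"', udevadm_output)
    return match.group(1) if match else None


def target_node(udevadm_output):
    """Return the node name the rules would assign, or None if no rule matches."""
    return TARGET_NODE.get(framebuffer_name(udevadm_output))


sample = '''
    KERNEL=="fb0"
    SUBSYSTEM=="graphics"
    ATTR{name}=="nouveaufb"
'''
print(target_node(sample))
```

The kernel name (fb0 in the sample) plays no role in the decision. Only the driver name does, which is what makes the numbering independent of the initialization order.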

Packaging custom udev rules into initrd

As implied by the nature of the use case, it would not suffice to drop the custom udev rule into /etc/udev/rules.d/; it needs to be included in the initial ramdisk. Customization of the initial ramdisk content is best done via custom hook scripts - they provide a clean, modular way of achieving the current goal while being minimally intrusive to other parts of the system. The hook scripts run whenever a new initrd is created and can contribute files to the ramdisk. They do not become part of the ramdisk themselves.


$ cat <<'EOX' | sudo tee /etc/initramfs-tools/hooks/explicit-fb-assignment
#!/bin/sh -e

PREREQ="udev"

# Output pre-requisites
prereqs()
{
   echo "$PREREQ"
}

case "$1" in
    prereqs)
   prereqs
   exit 0
   ;;
esac


. /usr/share/initramfs-tools/hook-functions

# Create udev rules to control fb1/fb0 assignment
cat > ${DESTDIR}/lib/udev/rules.d/82-explicit-fb-assignment.rules <<EOF
KERNEL=="fb?", SUBSYSTEM=="graphics", ATTR{name}=="nouveaufb", NAME="fb1"
KERNEL=="fb?", SUBSYSTEM=="graphics", ATTR{name}=="inteldrmfb", NAME="fb0"
EOF

EOX

$ sudo chmod +x /etc/initramfs-tools/hooks/explicit-fb-assignment
# create new initrd for the current kernel
$ sudo update-initramfs -c -k $(uname -r)
# verify on next reboot

The listing above provides the commands for creating the hook and generating a new initrd. Having applied this tweak, usplash/plymouth is fully functional and reliable, as /dev/fb0 always refers to the intel framebuffer device.

Troubleshooting

One has to remember that there are 2 udev daemons launched at different stages of the boot process. One runs off the initrd, while the other is spawned after the root filesystem has been mounted. The former uses the rules included in the ramdisk, while the latter reads the rules from /{lib|etc}/udev/rules.d of the root partition, and does not use any rule from the initrd.

Friday, 1 February 2013

Tracking down TrackPoint issues

I occasionally experience strange, inconsistent behavior of the TrackPoint on my Lenovo W530. Sometimes middle button scroll did not work at all; other times it kind of worked, but after scrolling and releasing the middle button, the clipboard's content was pasted at the current caret position. Also, scrolling pdf documents in evince with the TrackPoint simply resulted in pages shaking and vibrating up and down. I also noticed that in these cases the cursor moved during middle button scroll, which was very weird.

Locating the root cause

The symptoms above were not persistent across reboots - they were only present occasionally. Further, restarting the X server typically improved the situation. After a bit of experimenting it turned out that the strange behaviour occurs more frequently when running off a fast SSD; however, even with a rotating disk the symptoms appeared from time to time.

After a bit of googling I used the command xinput to enumerate X input devices and view their settings. I confirmed that wheel emulation was enabled, and configured for button 2. With xev I also confirmed that the middle TrackPoint button indeed fired button 2 events.


$ xinput --list
⎡ Virtual core pointer                        id=2    [master pointer  (3)]
⎜   ↳ Virtual core XTEST pointer              id=4    [slave  pointer  (2)]
⎜   ↳ SynPS/2 Synaptics TouchPad              id=13   [slave  pointer  (2)]
⎜   ↳ <default pointer>                       id=6    [slave  pointer  (2)]
⎣ Virtual core keyboard                       id=3    [master keyboard (2)]
    ↳ Virtual core XTEST keyboard             id=5    [slave  keyboard (3)]
    ↳ Power Button                            id=7    [slave  keyboard (3)]
    ↳ Video Bus                               id=8    [slave  keyboard (3)]
    ↳ Video Bus                               id=9    [slave  keyboard (3)]
    ↳ Sleep Button                            id=10   [slave  keyboard (3)]
    ↳ Integrated Camera                       id=11   [slave  keyboard (3)]
    ↳ AT Translated Set 2 keyboard            id=12   [slave  keyboard (3)]
    ↳ ThinkPad Extra Buttons                  id=14   [slave  keyboard (3)]
∼ TPPS/2 IBM TrackPoint                       id=15   [floating slave]
$ xinput --list-props "TPPS/2 IBM TrackPoint" | grep "Wheel Emulation"
 Evdev Wheel Emulation (425): 1
 Evdev Wheel Emulation Axes (426): 6, 7, 4, 5
 Evdev Wheel Emulation Inertia (427): 10
 Evdev Wheel Emulation Timeout (428): 200
 Evdev Wheel Emulation Button (429): 2
$ xev
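
If these settings need to be checked from a script, the property lines are easy to parse. Below is a minimal Python sketch, assuming the "name (id): value" layout shown in the listing above. The function name parse_props is made up, and the values are kept as raw strings.

```python
import re

# "Evdev Wheel Emulation Button (429): 2" -> name, numeric property id, value
PROP_LINE = re.compile(r'^\s*(?P<name>.+?) \((?P<id>\d+)\):\s*(?P<value>.*)$')


def parse_props(text):
    """Map property names to their raw string values."""
    props = {}
    for line in text.splitlines():
        match = PROP_LINE.match(line)
        if match:
            props[match.group('name')] = match.group('value').strip()
    return props


sample = """\
 Evdev Wheel Emulation (425): 1
 Evdev Wheel Emulation Axes (426): 6, 7, 4, 5
 Evdev Wheel Emulation Timeout (428): 200
 Evdev Wheel Emulation Button (429): 2
"""
props = parse_props(sample)
print(props['Evdev Wheel Emulation'], props['Evdev Wheel Emulation Button'])
```

With the values above, wheel emulation is enabled ('1') and bound to button '2', which matches what I confirmed by hand.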

First I thought I should just disable the paste action on middle button click, by remapping the buttons. The middle button click actions can be completely disabled by the following command:


$ # disable normal middle button action (like paste)
$ xinput set-button-map "TPPS/2 IBM TrackPoint" 1 0 3 4 5 6 7 

This command eliminated the annoying text-pasting behavior whenever the middle button was released, but it solved neither the occasional inability to use middle button scroll nor the shaking/vibrating pages in evince, so I continued investigating the issue...

I drew the conclusion that the error is caused by race conditions around X server start, resulting in an improper sequence of input device initialization. The W530 has 8 logical CPU cores, and my thinkpad is also equipped with a fast SSD - I have already encountered similar issues in other areas, which I recorded in previous posts.

My theory

I revisited the symptoms and tried to find a plausible explanation for the behavior. It seemed like 2 xinput devices were concurrently handling the TrackPoint, to a degree that varied across X restarts.

  • Ability to move the cursor, but inability to use middle mouse scroll - in cases where the 'other' device is actively handling the TrackPoint.
  • Button click events when finishing middle wheel scroll - in case both X input devices are actively interpreting raw button press and release events.
  • Moving cursor during middle button scroll - when both X input devices are interpreting pointer motion with the middle button pressed.
  • Shaking pdf pages - I believe evince has some software level 'page dragging' capability built in, where the drag direction is the opposite of the scroll direction, resulting in the strange vibrating effect. To clarify this assumption, let us imagine what would happen if both X input devices were indeed actively handling the raw TrackPoint events while the pointer is moving upwards with the middle button in a pressed state.
    • The TrackPoint X input device, as wheel emulation is enabled, sends wheel events scrolling the page up.
    • The other X input device, without wheel emulation, simply forwards the cursor movements and the pressed state of the middle button to the application, which activates page dragging. Dragging the current page upwards is equivalent to scrolling down.
    It sounds plausible that this situation could result in a conflict between scrolling up and down at the same time, yielding vertically vibrating pages...

According to an old email thread, the X server automatically adds the input device <default pointer> if there are no configured pointing devices, and it is quite possible that in my case TrackPoint initialization is done after, or in parallel to, the X server checking for configured devices.

Steps to fix it

Looking at the output of xinput --list again, as shown above, made me wonder what a floating slave could mean - according to the GDK3 reference manual this indicates that the device is not attached to any virtual device (master). In the case when middle button scroll was not working at all, I could simply enable the TrackPoint device from the command line, which restored my ability to use middle button scroll but also reproduced the phenomenon of pasting-when-scrolling and vibrating pdf pages. To me this seemed to prove the theory described above. I experimented with various xinput calls and eventually fixed the situation by disabling the <default pointer> device.

After having minimized the list of commands needed to resolve the middle button scroll issue, I decided to restore normal button mapping and re-enable middle button paste. Actually, it does not conflict with middle button scroll at all. Just tapping/clicking the middle button triggers the paste action (or the normal middle button action as defined for the actual application), while keeping it pressed enters wheel emulation mode - and releasing it does not fire the normal click event. The timeout for a normal middle button click is configured to 200 ms, which seems to work fine for me.


$ xinput enable "TPPS/2 IBM TrackPoint"
$ xinput disable "<default pointer>"
$ # quickly click on middle button to paste, hold it down to start scrolling

These are the minimal commands I use to fix the TrackPoint behavior whenever it occurs.
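
The tap-versus-hold behaviour described earlier boils down to a single comparison against the configured timeout. The following Python snippet is a deliberately simplified model of that rule, not the driver's actual code. The function name is made up, and 200 ms is the value from my configuration.

```python
def middle_button_outcome(held_ms, timeout_ms=200):
    """Simplified model: a press shorter than the timeout is delivered as a
    normal middle click; a longer press starts wheel emulation instead."""
    if held_ms < 0:
        raise ValueError('held_ms must not be negative')
    return 'click' if held_ms < timeout_ms else 'scroll'


for held in (80, 350):
    print(held, middle_button_outcome(held))
```

A quick tap of 80 ms is therefore treated as a paste, while holding the button for 350 ms switches to scrolling.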

Friday, 18 January 2013

Optimus and Ubuntu 12.10 (Part 4)

This post is the fourth of a series of posts on tweaking Ubuntu 12.10 to exploit Optimus technology on my Lenovo W530 to the extent I need. Make sure you are familiar with the context, especially the objectives and constraints as described in Part 1, Part 2 and Part 3.

Moving windows from the primary screen to the external VGA screen

As has been explained before, windows cannot cross screen boundaries, and the GNOME desktop cannot span multiple X screens even in the case when these screens belong to the same display (X instance). It should be obvious that this is also the case with screens that belong to different X servers.

The objectives defined in the previous parts require the ability to extend the GNOME desktop to the monitor attached to the external VGA port, to run presentations using two monitors and to clone the primary monitor's content to the external monitor.

In order to meet one of the objectives, one could mirror, or better said clone, the content of one screen to another screen. To be even more precise, only a given portion of the screen has to be cloned to another viewport: the area of the other screen that is displayed by the external monitor. There is a userspace tool called hybrid screenclone to perform exactly this task - find more on this tool below.

Extending the desktop to a screen of the other X server

Thinking further, in order to be able to extend the desktop and thereby meet other objectives, one could set up a mock monitor in the first X server, and then, clone the content to the screen of the other X server so it would show up on the external monitor.

My first approach was to examine the video outputs of the integrated graphics device, where I found VGA[12], which is wired to /dev/null (see Part 2). With various xrandr commands I could force the unused, thus always disconnected output to be configured with a fixed resolution, right of the primary monitor. A screenshot confirmed I was half way through: it contained a black area next to the primary desktop. I could even drag windows onto this black area; however, the desktop would not extend to this portion of the X screen. Also, libreoffice refused to start the slideshow in multi-monitor mode, as it only detected one connected monitor. While the idea was not completely useless, this approach turned out not to be usable in production.

A bit of googling revealed that the author of hybrid screenclone also maintains a patch against the intel video driver which adds a dynamically configurable virtual output - enabling exactly the scenario I was targeting.

Intel driver hack

The listing below takes the reader through the process of creating and installing a package containing the patched version of the intel video driver.


$ mkdir /tmp/foo && cd /tmp/foo # we are going to compile in tmpfs/RAM
$ sudo aptitude build-dep --schedule-only  xserver-xorg-video-intel
$ sudo aptitude # review interactively what is going to be installed
$ apt-get source xserver-xorg-video-intel
$ cd xserver-xorg-video-intel-2.20.9/
$ wget https://raw.github.com/liskin/patches/master/hacks/xserver-xorg-video-intel-2.20.2_virtual_crtc.patch
$ # the newer patch did not match.
$ patch -p1 < xserver-xorg-video-intel-2.20.2_virtual_crtc.patch
$ # now update the version to show this is a patched package
$ # NEVER alter packages without making it clear in the package version!
$ # I prepend this to debian/changelog:
$ mv debian/changelog debian/changelog.old && cat <<EOF > debian/changelog
xserver-xorg-video-intel (2:2.20.9-0ubuntu2+virtual-crtc) quantal; urgency=low

  [ Tibor Bősze ]
  * Add xserver-xorg-video-intel-2.20.2_virtual_crtc.patch

 -- Tibor Bősze <tibor.boesze@gmail.com>  Sun, 13 Jan 2013 03:15:00 +0200

EOF
$ cat debian/changelog.old >> debian/changelog && rm debian/changelog.old
$ # now build and install the package
$ dpkg-buildpackage -b
$ sudo dpkg -i ../xserver-xorg-video-intel_2.20.9-0ubuntu2+virtual-crtc_amd64.deb
$ # as you see, the package version will clearly show that this is a patched package
$ # prevent the package from being updated automatically
$ sudo aptitude hold xserver-xorg-video-intel

After a reboot, the command below will activate the virtual monitor. The graphical display manager will not show any second display; however, creating a screenshot quickly confirms that the second virtual monitor is active, and the desktop correctly extends to it. Also, LibreOffice Impress can use it just fine for running the slideshow in dual monitor mode.


$ xrandr --output LVDS2 --auto --output VIRTUAL --mode 800x600 --right-of LVDS2
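
Before running the command above from a script, it may be worth checking that the VIRTUAL output really exists, since it only appears with the patched driver. The following is a minimal sketch and not part of my setup: the function name has_output is made up for illustration, and the sample listing is an abbreviated, hypothetical stand-in for the output of xrandr -q.

```shell
#!/bin/bash
# has_output NAME: succeed if the xrandr-style listing on stdin contains an
# output called NAME (hypothetical helper, for illustration only).
has_output() {
  awk -v name="$1" '
    $1 == name && ($2 == "connected" || $2 == "disconnected") { found = 1 }
    END { exit !found }
  '
}

# Abbreviated sample listing (assumption); in real use pipe in `xrandr -q`.
sample='Screen 0: minimum 320 x 200, current 1600 x 900, maximum 8192 x 8192
LVDS2 connected 1600x900+0+0 (normal left inverted right x axis y axis)
VGA2 disconnected (normal left inverted right x axis y axis)
VIRTUAL disconnected (normal left inverted right x axis y axis)'

if printf '%s\n' "$sample" | has_output VIRTUAL; then
  echo "VIRTUAL output present"
else
  echo "VIRTUAL output missing - is the patched driver installed?"
fi
```

In real use the check would read `xrandr -q | has_output VIRTUAL` and only then call xrandr with --output VIRTUAL.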

Screenclone

For completeness, the commands to download and compile the tool are listed below.


$ # we are still doing stuff in tmpfs/RAM
$ aptitude install git-core
$ git clone git://github.com/liskin/hybrid-screenclone.git
$ cd hybrid-screenclone && make
g++ -std=c++0x -g -Wall    screenclone.cc  -lpthread -lX11 -lXdamage -lXtst -lXinerama -lXcursor -o screenclone
screenclone.cc:18:33: fatal error: X11/Xcursor/Xcursor.h: No such file or directory
compilation terminated.
make: *** [screenclone] Error 1
$ apt-file search Xcursor.h
libxcursor-dev: /usr/include/X11/Xcursor/Xcursor.h
$ aptitude install libxcursor-dev
$ make
g++ -std=c++0x -g -Wall    screenclone.cc  -lpthread -lX11 -lXdamage -lXtst -lXinerama -lXcursor -o screenclone
screenclone.cc:24:37: fatal error: X11/extensions/Xinerama.h: No such file or directory
compilation terminated.
make: *** [screenclone] Error 1
$ aptitude install libxinerama-dev libxdamage-dev libxtst-dev
$ make
$ mv screenclone ~/optimus/

By default the tool will clone the first screen of the first display to the first screen of the second one, that is, :0.0 to :1.0. This almost completely fits my use case; I only added the parameter -x 1, which limits the content to be copied to the area of the screen that is displayed by the second monitor, the VIRTUAL output in my case.


$ # clone the viewport of the VIRTUAL output from :0.0 to the top left corner of :1.0
$ ~/optimus/screenclone -x 1 

Part 5 will revisit the issue of changing framebuffer number assignment and provide a more elegant solution to fixing usplash.

Wednesday, 16 January 2013

Optimus and Ubuntu 12.10 (Part 3)

This post is the third of a series of posts on tweaking Ubuntu 12.10 to exploit Optimus technology on my Lenovo W530 to the extent I need. Make sure you are familiar with the context, especially the objectives and constraints as described in Part 1 and Part 2.

My previous post on the topic describes how to control the power state of the DIS with vgaswitcheroo, which is part of the stock Ubuntu 12.10 kernel. It explains key terms related to X and also shows alternative ways to use both the DIS and IGD within the same X server.

As the "single X server with 2 screens" approach is not an option until the related bugfix is available in the official repositories, I investigated how a second X server could be used to reach my goals. As a first step I analysed existing solutions in the area. Starting a second X server is the core concept of Bumblebee, which enables rendering on the DIS and then uses VirtualGL to copy the content of individual windows back to the screen handled by the primary X server that uses the IGD only. Unfortunately, this project does not support external monitors when the ports are wired to the DIS only. Also, it currently does not seem to be mature and stable enough for my production ThinkPad. Nevertheless, it gave me a starting point...

Two X servers with one screen each

Starting a separate X instance to handle the DIS enables better isolation/sandboxing but also introduces additional issues.

  • First of all, the primary X instance has to be configured in a way that it will not grab any resources of the DIS, else the secondary instance fails to start up with the message "No screens found".
  • Similarly, the second instance has to be configured explicitly to only use the DIS and related monitors.
  • I used configuration to override the actual connection status of the external VGA port and always enable VGA internally. Without an enabled monitor X would not start up.
  • The desktop cannot be extended to a monitor of the second X instance.
  • One could use a separate window manager on the second X instance - twm is a very lightweight one. I stayed with a naked X for reasons described below.
  • With two X instances, I get a cursor on both displays. Without a pointing device, X would not start either. There are solutions that use a mock input device, but anyway, having two cursors that move in tandem on the two monitors is not critical. With my current configuration, the touchpad only controls the cursor on my primary X, while the trackpoint controls both.
  • Finally, the second X server will run as root, essentially with access control disabled so mortals can open windows on the second X as well.

$ cd ~
$ mkdir optimus && cd optimus
$ cat >>xorg.conf.nouveau<<EOF
Section "Modes"
 Identifier "FallbackModes" # Mode to use if External-VGA is disconnected
 Modeline "1024x768"   65.00  1024 1048 1184 1344  768 771 777 806 -hsync -vsync
EndSection

Section "ServerLayout"
 Identifier "Layout0"
 Screen "Screen0"
 Option "AutoAddDevices" "false"
 Option "AutoEnableDevices" "false"
 Option "AutoAddGPU" "false"
EndSection

Section "Monitor"
 Identifier "External-VGA"
 UseModes "FallbackModes"
 Option "Enable" "true" # always enabled
 Option "PreferredMode" "1024x768"
EndSection

Section "Monitor"
 Identifier "LCD"
 Option "Enable" "false" # always disabled
EndSection

Section "Device"
 Identifier "DIS"
 Driver "nouveau"
 BusID "PCI:1:0:0"
 Option "HWCursor" "true"
 # The numbers in output names change based on whether IGD or DIS is           
 # initialized first by the kernel. This tweak takes care of both cases.
 Option "Monitor-VGA-1" "External-VGA"
 Option "Monitor-VGA-2" "External-VGA"
 Option "Monitor-LVDS-1" "LCD"
 Option "Monitor-LVDS-2" "LCD"
EndSection

Section "Screen"
 Identifier "Screen0"
 Device "DIS"
 Monitor "External-VGA"
 DefaultDepth 24
 SubSection "Display"
  Depth 24
 EndSubSection
EndSection
EOF

$ cat >>xorg.conf.intel<<EOF
Section "ServerLayout"
   Identifier "Layout0"
   Screen "Screen0"
   Option "AutoAddDevices" "true"
   Option "AutoEnableDevices" "true"
   Option "AutoAddGPU" "false"
EndSection

Section "Device"
   Identifier "IGD"
   Driver "intel"
   BusID "PCI:0:2:0"
EndSection

Section "Screen"
   Identifier "Screen0"
   Device "IGD"
   DefaultDepth 24
   SubSection "Display"
      Depth 24
   EndSubSection
EndSection
EOF
$ sudo cp xorg.conf.intel /etc/X11/
$ sudo rm /etc/X11/xorg.conf # this is the link to the 2 screen config
$ sudo ln -s /etc/X11/xorg.conf.intel /etc/X11/xorg.conf
$ sudo service lightdm restart

If there is no explicit default xorg configuration, then the X server will hold both /dev/dri/card[01] in which case the second X instance could not start up.

Starting, disabling and restoring external VGA output

I used the following small script to ensure the DIS is powered on, spawn the X server, and decorate it with a random background. After pressing Enter, X is terminated and the DIS is powered off.


#!/bin/bash

msg() {
 echo "******* $1"
}

msg "Ensuring DIS is powered on."
echo ON | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
msg "Launching X server on display :1."
sudo /usr/bin/X -ac -audit 0 -config /home/tibi/optimus/xorg.conf.nouveau -sharevts -verbose 1 -logverbose 9 -logfile /tmp/Xorg.1.log -nolisten tcp -noreset :1 &

PID=$!
sleep 2
msg "PID is $PID, log goes to /tmp/Xorg.1.log."
# bonus: get a random background
BACKGROUND=$(find /home/tibi/Pictures -maxdepth 1 -name '*.jpg' | sort --random-sort | head -1)
msg "Setting background: $BACKGROUND"
gm display -window root -display :1.0 "$BACKGROUND" # quoted, file names may contain spaces
msg "DONE."

msg "Press Enter to terminate and clean up."
read
msg "Terminating X server..."
sudo kill $PID
sleep 2
echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
msg "Discrete graphics device powered off."
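
The fixed "sleep 2" after launching X is a guess at how long the server needs. A slightly more robust variant could poll for the Unix socket that an X server conventionally creates under /tmp/.X11-unix (X1 for display :1). This is only a sketch under that assumption; wait_for_path is a hypothetical helper name, not something the script above uses.

```shell
#!/bin/bash
# wait_for_path PATH [TIMEOUT_SECONDS]: poll once per second until PATH exists.
# Returns 0 as soon as PATH is there, 1 if the timeout (default 10) expires.
wait_for_path() {
  local path=$1 timeout=${2:-10} waited=0
  while [ ! -e "$path" ]; do
    if [ "$waited" -ge "$timeout" ]; then
      return 1
    fi
    sleep 1
    waited=$((waited + 1))
  done
  return 0
}

# Self-contained demonstration with a temporary file standing in for the socket.
tmp=$(mktemp)
wait_for_path "$tmp" 2 && echo "found $tmp"
rm -f "$tmp"

# In the script above, the first "sleep 2" could then become (assumption):
#   wait_for_path /tmp/.X11-unix/X1 10 || { msg "X did not come up"; exit 1; }
```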

I can safely disable and restore the external display without stopping the X server with the following commands:


### DISABLE
# numbers change across reboot, one of the two will work, other will print an error.
xrandr -d :1 --output VGA-1 --off
xrandr -d :1 --output VGA-2 --off
echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch

### RESTORE
echo ON | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
# numbers change across reboot, one of the two will work, other will print an error.
xrandr -d :1 --output VGA-1 --auto
xrandr -d :1 --output VGA-2 --auto
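
Instead of firing both variants and ignoring one error, the output name could be looked up first. A minimal sketch, assuming the nouveau naming scheme shown in Part 2 (VGA-1 or VGA-2); first_vga_output is a hypothetical helper and the sample listing is abbreviated and made up for illustration.

```shell
#!/bin/bash
# first_vga_output: print the first output named VGA-<n> found in the
# xrandr-style listing on stdin; fail if there is none.
first_vga_output() {
  awk '$1 ~ /^VGA-[0-9]+$/ { print $1; found = 1; exit } END { exit !found }'
}

# Abbreviated sample (assumption); in real use pipe in `xrandr -d :1 -q`.
sample='Screen 0: minimum 320 x 200, current 1024 x 768, maximum 4096 x 4096
LVDS-2 disconnected (normal left inverted right x axis y axis)
VGA-2 connected 1024x768+0+0 (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)'

printf '%s\n' "$sample" | first_vga_output

# Possible use with the commands above:
#   out=$(xrandr -d :1 -q | first_vga_output) && xrandr -d :1 --output "$out" --off
```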

Usplash and plymouth issues

As it has been stated earlier, the issue around missing usplash/plymouth turned out to be connected to the random order in which the kernel initialized the graphics devices.


$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [Quadro K1000M] (rev a1)
$ # framebuffers - the order (0 and 1) changes randomly across reboots
$ cat /proc/fb
0 inteldrmfb
1 nouveaufb

Usplash is hardcoded to render to /dev/fb0, which is fine if the IGD is initialized first by the kernel, but not good if /dev/fb0 is associated with the DIS framebuffer. The kernel boot parameter fbcon=map:1 is of no help in solving this issue. One workaround to mitigate this problem is blacklisting nouveau and loading it later, after usplash is started on the IGD framebuffer.


$ cat <<EOF | sudo tee /etc/modprobe.d/blacklist-nouveau.conf
# blacklist nouveau to force IGD framebuffer to take precedence
blacklist nouveau
EOF
$ sudo update-initramfs -c -k all

### and later during the boot process...

modprobe nouveau # we need modprobe as it was blacklisted during boot
udevadm settle # wait until all udev events are handled
echo OFF > /sys/kernel/debug/vgaswitcheroo/switch

This approach is rather a quick and dirty workaround. Technically, it would be enough - and much more elegant - to merely ensure the proper order of module loading by tinkering with the initramfs scripts; this is subject to further investigation.

About Part 4

Part 4 shows how this second naked X server can be used to achieve the objectives outlined in the previous post.

Monday, 14 January 2013

Optimus and Ubuntu 12.10 (Part 2)

This post is a follow up on Optimus and Ubuntu 12.10. Please make sure you are familiar with the context.

Setting the scope and objectives

This series of posts takes the reader through Nvidia Optimus related tweaks I applied on my Lenovo W530. My goal is to get a very stable system with extended battery life, and the ability to connect an external projector to the VGA port and cover the following use cases:

  • Extend the desktop to the external monitor.
  • Get a cloned output of the primary monitor on the external monitor with panning support - this means that if the external monitor's resolution is smaller, a smaller viewport will follow the mouse and show a cropped clone of the desktop's content around the area of interest.
  • Run LibreOffice presentations with the external monitor showing the current slide and the primary monitor showing the presentation overview, notes and time.
  • Never ever get X freezes or kernel lockups on suspend/resume with or without an external monitor connected.
  • Switching to virtual terminals should always work in a bulletproof manner. The box is a workhorse, cannot allow hiccups.
  • Might sound like a small detail, but a properly displayed usplash/plymouth is also important, not only for cosmetic purposes.

All these features I am able to use today on my T400 (which includes a single Intel graphics device), with the help of either the GNOME GUI or, in special cases, xrandr scripts. However, on the W530 with its two graphics devices, both the external VGA and mini DisplayPort are wired to the nvidia chip, so I will not be able to hook up an external monitor with only the integrated graphics device enabled. The discrete device is a resource hog on one hand, and on the other the W530 is not able to boot properly with only discrete graphics enabled when hardware virtualization support is also enabled.

I took the decision to push myself and tweak Ubuntu 12.10 until my goals and criteria are met. I started looking for existing solutions to get Optimus working, and educating myself on the topic. On the road I took the decision not to use Bumblebee or other unstable/immature components that could cause stability issues or additional risks of X or kernel lockups.

Further, I also set a secondary objective: trying to reach my goals with open-source components only. So, I tried to avoid the proprietary Nvidia driver as much as possible.

All these goals are met by now, with the open source nouveau driver and a set of tricks and tweaks. This writing focuses more on the way of investigating alternatives and achieving the goals as opposed to merely providing the final solution.

Stock Ubuntu with Optimus enabled in firmware

Having applied the lightdm tweak from the earlier post, the blank screen issues disappeared completely even with Optimus enabled in firmware setup. The issue around missing usplash/plymouth turned out to be connected to the random order in which the kernel initialized the graphics devices.


$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [Quadro K1000M] (rev a1)
$ # framebuffers - the order (0 and 1) changes randomly across reboots
$ cat /proc/fb
0 inteldrmfb
1 nouveaufb
$ # vgaswitcheroo state - the order (0 and 1) changes randomly across reboots
$ sudo cat /sys/kernel/debug/vgaswitcheroo/switch
0:IGD:+:Pwr:0000:00:02.0
1:DIS: :Pwr:0000:01:00.0

With Optimus enabled, the X server starts up properly, but external monitors are not recognized out of the box. The power state of the discrete graphics device can be controlled through vgaswitcheroo, but switching between the discrete (DIS) and integrated graphics device (IGD) cannot be performed. Why is that, and what is vgaswitcheroo at all?

vgaswitcheroo

It is a mechanism built into the kernel that allows switching between multiple graphics devices if (and only if) the configuration is equipped with a hardware multiplexer. It is documented not to work with Nvidia Optimus, as Optimus does not incorporate a hardware multiplexer.

Nevertheless, it was loaded automatically (it is part of the stock Ubuntu kernel) and allowed me to change the power state of the discrete graphics device. There are other solutions for changing the power state either automatically or manually, such as acpi_call or bbswitch. Both of these have one thing in common - they are kernel modules not considered stable or mature. So my choice is to go with vgaswitcheroo for power state management, using the commands as shown below:


$ sudo cat /sys/kernel/debug/vgaswitcheroo/switch # display status
$ echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch # power off DIS
$ echo ON | sudo tee /sys/kernel/debug/vgaswitcheroo/switch # power on DIS

I would like to stress again that it is used for power state management only, as actual switching cannot be done with Optimus. One can verify the power state of the discrete device with lspci but also with the power LED of a monitor connected to the VGA port.
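
The power state can also be read back from the switch file itself, which uses the colon-separated layout shown in the listing earlier (index, device, active flag, power state, PCI address). The sketch below is an illustration only: switch_power_state is a hypothetical helper, and the "Off" value in the sample is what I would expect after powering the DIS off rather than output captured for this post.

```shell
#!/bin/bash
# switch_power_state DEVICE: print the power field for DEVICE (IGD or DIS)
# from vgaswitcheroo-style lines on stdin; fail if DEVICE is not listed.
switch_power_state() {
  awk -F: -v dev="$1" '$2 == dev { print $4; found = 1; exit } END { exit !found }'
}

# Sample in the format shown above; the Off state is an assumption.
sample='0:IGD:+:Pwr:0000:00:02.0
1:DIS: :Off:0000:01:00.0'

printf '%s\n' "$sample" | switch_power_state DIS

# Real use (needs root and debugfs mounted):
#   sudo cat /sys/kernel/debug/vgaswitcheroo/switch | switch_power_state DIS
```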

About X servers, displays, screens and monitors

The core concept of Bumblebee is to start up a second X server to enable rendering on the DIS, and then use VirtualGL to copy the content of individual windows back to the screens handled by the primary X server that uses the IGD only. This approach is not necessarily the best one.

Contrary to what many might believe, one single X server can handle many screens and many video cards. Let me take a bottom up approach to define the terms and structure used by X - or at least my understanding thereof.

  1. A Monitor is a graphical output device, such as a projector, an LCD panel, CRT or other kind of monitor. They are physically connected (in a static/persistent or dynamic/transient manner) to one of the video cards.
  2. A Screen is a virtual area where applications can render their windows. A screen can be composed of multiple monitors running on the same video card, each monitor can show a portion/viewport of the screen - they can even overlap or show cloned content. (It should be noted that they can also be fully virtual as in the case of VNC.) This way it is obvious that windows within the same screen can overlap or span multiple monitors. Windows cannot overlap or span multiple screens. Window managers can run independently on each screen.
  3. One instance of X server is associated with a display. It can handle multiple video cards and multiple screens, these screens share input devices such as keyboard and mouse. One screen can be active at a time, and the active screen can be changed with the mouse (although there also existed a small command line utility for this, called switchscreen).
  4. Xinerama is an extension that allows Monitors even from multiple video cards to be combined into one screen. It does not support hardware acceleration nor dynamic reconfiguration.
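
The display and screen terms from the list above are exactly what the DISPLAY variable encodes, in the form host:display.screen, with the screen defaulting to 0 when it is left out. A small sketch to make this concrete; split_display is a hypothetical helper written for this illustration.

```shell
#!/bin/bash
# split_display VALUE: break a DISPLAY value such as ":0.1" into its parts.
split_display() {
  local value=$1 host rest display screen
  host=${value%:*}      # text before the last colon, empty for a local server
  rest=${value##*:}     # "display" or "display.screen"
  display=${rest%%.*}
  case $rest in
    *.*) screen=${rest#*.} ;;
    *)   screen=0 ;;    # no screen given: screen 0 is used
  esac
  printf 'host=%s display=%s screen=%s\n' "$host" "$display" "$screen"
}

split_display ":0.1"   # second screen of the first local X server
split_display ":1"     # first screen of a second local X server
```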

Revisiting the core concept of Bumblebee in light of the information above, one intuitive alternative would be the use of a single X server, with one screen for the IGD, and another screen for the DIS. Monitors can be turned on and off dynamically, and based on that, the screen size can also be adjusted dynamically. One thing that needs to be investigated is the behaviour of the X server when one of the video devices is powered off.

How wiring affects what X identifies

DIS with nouveau driver:

  • LVDS-1 or LVDS-2 which is a dead end and always shows as disconnected
  • VGA-1 or VGA-2 which is the external VGA output wired to the nvidia card
  • DP-1
  • DP-2
  • DP-3

IGD with the intel driver - note the difference in naming, no hyphen is used:

  • LVDS2 or LVDS1 which is wired to the LCD panel
  • VGA2 or VGA1 which would be supported by the card but is a dead end, not wired to anything

One X server with two screens

The configuration below does not address the change of output names caused by the non-deterministic order in which the kernel initializes the graphics devices.


cat <<EOF | sudo tee /etc/X11/xorg.conf.2screens
Section "ServerLayout"
  Identifier "Layout0"
  Screen 0 "Screen0"
  Screen 1 "Screen1" RightOf "Screen0" # two screens side by side
  Option "AutoAddDevices" "true"
  Option "AutoEnableDevices" "true"
  Option "AutoAddGPU" "true"
EndSection

Section "Device"
  Identifier "IGD"
  Driver "intel"
  BusID "PCI:0:2:0" # this can be read from the lspci output above
EndSection

Section "Screen" # IGD screen will not have explicit monitor configuration
  Identifier "Screen0"
  Device "IGD"
  DefaultDepth 24
  SubSection "Display"
    Depth 24
  EndSubSection
EndSection

Section "Monitor"
  Identifier "External-VGA"
  Option "Enable" "true" # will always show as connected, so we have at least one active output
  Option "PreferredMode" "1024x768"
EndSection

Section "Monitor" 
  Identifier "LCD"
  Option "Enable" "false" # this is a dead end - could have used "Ignore" "true"
EndSection

Section "Device"
  Identifier "DIS"
  Driver "nouveau"
  BusID "PCI:1:0:0" # this can be read from the lspci output above
  Option "Monitor-VGA-2" "External-VGA" # connecting output names with monitor config
  Option "Monitor-LVDS-2" "LCD"
  Option "HWCursor" "true"
EndSection

Section "Screen" # DIS screen will only have one active monitor: External-VGA
  Identifier "Screen1"
  Device "DIS"
  Monitor "External-VGA"
  DefaultDepth 24
  SubSection "Display"
      Depth 24
  EndSubSection
EndSection
EOF
$ # now create a link xorg.conf that will point to our actual configuration.
$ sudo ln -s /etc/X11/xorg.conf.2screens /etc/X11/xorg.conf 
$ sudo service lightdm restart # ... and log in
$ DISPLAY=:0.1 xclock # start xclock on screen 1 and smile

The configuration above would work fine. Unfortunately, the mouse could not leave screen 0 when I was testing - it wrapped around, meaning that when I moved the pointer off the right edge of screen 0, it entered on the left side of screen 0 instead of screen 1. This is an upstream bug, marked as "fix released"; however, a working package is not available in the official Ubuntu 12.10 repositories yet.

Powering off the DIS without any preparation while the screen was actively used froze the kernel; however, powering it off after first turning off the output devices seemed stable.


echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
# freezes the kernel, requires hard reset

# if the output is disabled first like this:
xrandr --screen 1 -q # see which devices are connected
xrandr --screen 1 --output VGA-2 --off # switch off connected devices
echo OFF | sudo tee /sys/kernel/debug/vgaswitcheroo/switch
# works fine, and screen can be restored when the devices reactivated.

Unfortunately, this approach is not usable in production until the bug that prevents me from switching screens is fixed.

It should also be noted that when running with a two-screen configuration, windows cannot span or cross screens, and the desktop cannot simply be extended either - without further magic to be covered later. However, any application can be opened on either of the two screens.

Xinerama

This old extension allows one to extend the desktop to screen 1 with a single line of configuration change. The following line has to be added to the ServerLayout section:


Option "Xinerama" "on"

Two downsides made me search for alternatives: the performance with Xinerama is very poor, but even more importantly, Xinerama does not support dynamic configuration changes, so resizing or rearranging the monitors or switching outputs on or off is simply not supported. With the current set of objectives this means Xinerama is not what I am looking for.

About Part 3

Part 3 explains setting up two X servers and various related tweaks to fulfill all objectives listed above.

Wednesday, 9 January 2013

Fixing the visual appearance of 32bit applications on Ubuntu 12.10 64bit

Both Lotus Notes and Lotus Sametime are available as 32 bit applications. Ubuntu 12.10 provides multiarch support out of the box, which refers to the ability of a system to install and run applications of multiple different binary targets on the same system, in this case i386-linux-gnu applications on an amd64-linux-gnu system. Nevertheless, the visual appearance of 32 bit applications is not in line with the look-and-feel of the amd64 ones. This is especially annoying as I am going to have Lotus Notes in front of me on a daily basis.

The following section describes how to find and install the set of 32 bit packages that will resolve the issue as much as possible. First some background information. Lotus Notes uses an Eclipse based framework, and Eclipse uses SWT, its cross platform Standard Widget Toolkit. SWT on Linux uses GTK+, the GIMP Toolkit, as the native backend for rendering widgets. Ubuntu 12.10 provides a desktop environment that mainly uses GTK3, a version of GTK+ available since 2011. GTK3 is not backwards compatible with GTK2, which has been available since 2002, so the two live side by side, and GTK2 and GTK3 themes and rendering engines are installed separately...

To find out more about why the 32bit applications do not render as we would expect, let us install a 32bit GTK2 and GTK3 demo application.


$ sudo aptitude install gtk2.0-examples:i386 gtk-3-examples:i386 --without-recommends
$ gtk3-demo 
Gtk-Message: Failed to load module "overlay-scrollbar"

(gtk3-demo:5164): Gtk-WARNING **: Theme parsing error: gtk-widgets.css:62:17: Theming engine 'unico' not found
Gtk-Message: Failed to load module "canberra-gtk-module"
$ gtk-demo
Gtk-Message: Failed to load module "overlay-scrollbar"

(gtk-demo:4930): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",
Gtk-Message: Failed to load module "canberra-gtk-module"
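
With longer outputs it can help to boil such messages down to a plain list of what is missing. A minimal sketch that only knows the three message shapes shown above; missing_gtk_bits is a hypothetical helper, and the sample text simply repeats lines from the listing.

```shell
#!/bin/bash
# missing_gtk_bits: read GTK messages on stdin and print each missing module
# or theming engine once, prefixed with its kind.
missing_gtk_bits() {
  sed -n \
    -e 's/.*Failed to load module "\([^"]*\)".*/module \1/p' \
    -e "s/.*Theming engine '\([^']*\)' not found.*/engine \1/p" \
    -e 's/.*Unable to locate theme engine in module_path: "\([^"]*\)".*/engine \1/p' \
    | sort -u
}

# Sample lines copied from the output above.
sample='Gtk-Message: Failed to load module "overlay-scrollbar"
(gtk3-demo:5164): Gtk-WARNING **: Theme parsing error: gtk-widgets.css:62:17: Theming engine '"'"'unico'"'"' not found
Gtk-Message: Failed to load module "canberra-gtk-module"
(gtk-demo:4930): Gtk-WARNING **: Unable to locate theme engine in module_path: "murrine",'

printf '%s\n' "$sample" | missing_gtk_bits

# Real use: gtk3-demo 2>&1 | missing_gtk_bits   (close the demo window to finish)
```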

The output shows which modules and theming engines could not be loaded. Now let us try to resolve the issues incrementally.


$ # let us first take a try at installing 32bit overlay-scrollbar
$ aptitude install overlay-scrollbar-gtk2:i386 overlay-scrollbar-gtk3:i386 --without-recommends --simulate
The following NEW packages will be installed:
  overlay-scrollbar-gtk2:i386{b} overlay-scrollbar-gtk3:i386{b} 
0 packages upgraded, 2 newly installed, 0 to remove and 0 not upgraded.
Need to get 79.7 kB of archives. After unpacking 250 kB will be used.
The following packages have unmet dependencies:
 overlay-scrollbar-gtk2:i386 : Depends: overlay-scrollbar:i386 which is a virtual package.
 overlay-scrollbar-gtk3:i386 : Depends: overlay-scrollbar:i386 which is a virtual package.
The following actions will resolve these dependencies:

     Keep the following packages at their current version:
1)     overlay-scrollbar-gtk2:i386 [Not Installed]        
2)     overlay-scrollbar-gtk3:i386 [Not Installed]        

Accept this solution? [Y/n/q/?] q
Abandoning all efforts to resolve these dependencies.
Abort.
$ # as it can be seen, there is a missing dependency, so it cannot be installed.

$ # let us take a try at unico theming engine
$ aptitude install gtk3-engines-unico:i386 --without-recommends --simulate
The following NEW packages will be installed:
  gtk3-engines-unico:i386{b} 
0 packages upgraded, 1 newly installed, 0 to remove and 0 not upgraded.
Need to get 9,066 B of archives. After unpacking 57.3 kB will be used.
The following packages have unmet dependencies:
 gtk3-engines-unico : Conflicts: gtk3-engines-unico:i386 but 1.0.2+r139-0ubuntu2 is to be installed.
 gtk3-engines-unico:i386 : Conflicts: gtk3-engines-unico but 1.0.2+r139-0ubuntu2 is installed.
The following actions will resolve these dependencies:

     Remove the following packages:
1)     gtk3-engines-unico          
2)     light-themes                
3)     ubuntu-artwork              
4)     ubuntu-desktop              

Accept this solution? [Y/n/q/?] q
Abandoning all efforts to resolve these dependencies.
Abort.
$ # This time, there is a conflict: 
$ # the 32bit and 64bit versions of unico are mutually exclusive.

$ # to make the long story short, the following packages could be installed:
$ #   - libcanberra-gtk-module:i386
$ #   - libcanberra-gtk3-module:i386
$ #   - gtk2-engines-murrine:i386
$ sudo aptitude install libcanberra-gtk-module:i386 libcanberra-gtk3-module:i386 gtk2-engines-murrine:i386 --without-recommends

$ # both the output and the visual appearance confirm things got better...
$ gtk-demo 
Gtk-Message: Failed to load module "overlay-scrollbar"
$ gtk3-demo 
Gtk-Message: Failed to load module "overlay-scrollbar"

(gtk3-demo:6039): Gtk-WARNING **: Theme parsing error: gtk-widgets.css:62:17: Theming engine 'unico' not found
$ gtk-demo 
Gtk-Message: Failed to load module "overlay-scrollbar"

After installing the packages as shown above, both Lotus Notes and Lotus Sametime render the widgets much nicer. The theme is not perfectly in line with the out of the box Ubuntu 12.10 theme, but I can live without having the overlay scrollbar in Notes and Sametime.

Tuesday, 8 January 2013

Lotus Notes migration to Ubuntu 12.10 64bit

Having made the first steps in securing the W530 (see previous posts) and having set up further security & compliance related components as dictated by corporate policy, I was ready to start migrating my data from the old T400 ThinkPad to the new box. The most critical part of the data to be migrated is Lotus Notes data. I am proficient with the operating system and all other software components I have on my old laptop, but I am not a Lotus Notes expert at all; it is mostly a black box for me and a potential source of trouble. The old laptop is running Ubuntu 10.04 32bit while I set up the new one with Ubuntu 12.10 64bit as mentioned in previous posts.

Backing up Lotus Notes Data

To clarify the scope, I was migrating my data from a Lotus Notes 8.5.2 environment on Ubuntu 32bit to an 8.5.3 one on 64bit. Additionally, I also decided to clean the cruft accumulated over the last 7 years... I checked for defunct workspace icons and any obsolete local replicas and removed those. Then I sorted all email in my inbox to their appropriate folders (usually, I do this while waiting for connecting flights at airports) and archived all emails from before 2013. I also changed the archive settings to save any new items to a new archive file. After cleaning up, I was ready to migrate my data.

First and foremost, know what you have to migrate. I wanted to migrate the minimum required amount of settings but not less. Lotus Notes stores data in the proprietary Notes Storage Facility format; these files have the nsf or NSF extension. These files are self contained document-oriented databases storing semi structured data - the application logic and view definitions are also incorporated. There are some files like cache.nsf or log.nsf that do not carry valuable information, others will have to be backed up.

My mail file is just a normal nsf file; however, it is encrypted with the private key stored in my ID file. It is good practice to keep a backup of your ID file anyway; I have seen people believe that remembering their Lotus Notes password would be enough to recover a corrupt or deleted ID file...

Workspace definitions and icons are stored in the file desktop8.ndk, which also needs to be backed up. I also made sure to back up my custom signatures.

There are some settings stored in the Eclipse workspace, but I did not migrate any of those. I was trying hard to start with as clean an environment as possible and ended up using the following command to create the backup on the old machine:


# backup all databases except log.nsf, the id file, desktop file and signatures
# run from the home directory so the archive holds relative paths (lotus/notes/...)
cd ~ && find lotus/notes \( -iname '*.nsf' -o -iname '*.id' \
  -o -iname '*.htm' -o -iname 'desktop8.ndk' \) \
  -not -iname 'log.nsf' -print0 | tar cspzf \
/media/SAMSUNG/backup/t400-20130103/lotus_notes_data-sparse.tar.gz --null -T -
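
Before writing the archive, the same find expression can be run with -print to preview what would end up in the backup. The sketch below wraps it in a hypothetical helper, notes_backup_list, and demonstrates it on a throwaway directory with made-up file names so it can be tried without a Notes installation.

```shell
#!/bin/bash
# notes_backup_list DIR: print the files the backup expression would select.
notes_backup_list() {
  find "$1" \( -iname '*.nsf' -o -iname '*.id' \
    -o -iname '*.htm' -o -iname 'desktop8.ndk' \) \
    -not -iname 'log.nsf' -print
}

# Demonstration on a temporary tree (file names are made up).
demo=$(mktemp -d)
mkdir -p "$demo/data"
touch "$demo/data/mail.nsf" "$demo/data/log.nsf" "$demo/data/user.id" \
      "$demo/data/desktop8.ndk" "$demo/data/notes.ini"
notes_backup_list "$demo" | sort   # log.nsf and notes.ini are not listed
rm -rf "$demo"

# Real use: notes_backup_list ~/lotus/notes | less
```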

Installation and migration

On the W530, before restoring the files, I installed Lotus Notes and let it create the data directory, and initialize the ~/lotus/notes/data folder with default files. To do so, I started it, waited until the splash screen disappears and the first application window appears asking me inputs for initial setup. At this very first screen, I pressed cancel to abort and exit. I made sure all Lotus Notes processes terminated - one can use the "Lotus Notes Zap" utility for this.

Then I extracted the data and applied some more tweaks using the commands listed below. Other minor customizations, like setting the theme and default font, were done from within the graphical interface, but they are not worth documenting...


cd ~ # make sure we are in the right directory
tar xzf /media/tibi/SAMSUNG/backup/t400-20130103/lotus_notes_data-sparse.tar.gz
# start Lotus Notes... and choose "office network" as current location, look around, then exit.

chmod -x ~/lotus/notes/data/*.gif # just some cosmetics...
# I use the English language version, but with my preferred date format
cat <<EOF >> ~/lotus/notes/data/notes.ini
DateOrder=YMD
ClockType=24_HOUR
DateSeparator=-
TimeSeparator=:
EOF
# restart notes and verify the date format used in your inbox.
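
One caveat with appending through a here-document is that running the snippet twice leaves duplicate keys in notes.ini. A small sketch of an idempotent alternative follows. The helper set_ini_key is hypothetical (it is not something shipped with Lotus Notes), and it assumes plain Key=Value lines and values without backslashes.

```shell
#!/bin/sh
# set_ini_key FILE KEY VALUE
# Hypothetical helper: replace an existing "KEY=..." line or append a new one.
set_ini_key() {
    file=$1 key=$2 value=$3
    if grep -q "^${key}=" "$file"; then
        # Rewrite the matching line via a temporary file; writing the result
        # back with cat keeps the ownership and mode of the original file.
        tmp=$(mktemp) || return 1
        awk -v k="$key" -v v="$value" '
            index($0, k "=") == 1 { print k "=" v; next }
            { print }
        ' "$file" > "$tmp" && cat "$tmp" > "$file"
        rm -f "$tmp"
    else
        printf '%s=%s\n' "$key" "$value" >> "$file"
    fi
}

# Example use (with Lotus Notes shut down):
#   set_ini_key ~/lotus/notes/data/notes.ini DateOrder YMD
#   set_ini_key ~/lotus/notes/data/notes.ini ClockType 24_HOUR
```

The design choice here is simply "replace if present, append otherwise", so the commands can be re-run after a reinstall without cluttering the file.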

The result is a fully functional install with all the data migrated, however it is lacking a few more tweaks to improve the visual appearance of native widgets. These tweaks are going to be covered in the next post.

Monday, 7 January 2013

Optimus and Ubuntu 12.10

The W530 comes with an integrated Intel GPU and a discrete graphics card from NVIDIA. Out of the box, the firmware is set up to run with Optimus enabled, but it provides options to enable only integrated or only discrete graphics.

I do not want to play any games. This thinkpad is a workhorse. I prefer a very stable system that is able to run on battery for an extended duration, as I am travelling a lot. I also need to be able to use the external VGA port; I am not so keen on using the mini DisplayPort at the moment.

Stock Ubuntu with Optimus enabled

I started using the laptop with the factory settings as far as graphics setup goes. Between Xmas and New Year's Eve I did not have any chance to test it with an external display, but I confirmed both integrated and discrete graphics are active.


$ lspci | grep VGA
00:02.0 VGA compatible controller: Intel Corporation 3rd Gen Core processor Graphics Controller (rev 09)
01:00.0 VGA compatible controller: NVIDIA Corporation GK107 [Quadro K1000M] (rev a1)

Unfortunately, I experienced an occasional blank screen on startup - a black screen with an X mouse pointer, to be precise - that could be worked around by switching to a virtual terminal (e.g. Ctrl-Alt-F1), logging in and restarting lightdm. Also, plymouth randomly failed to show up on startup and shutdown, which was a mostly cosmetic issue, but not a good sign at all. Although this situation was technically manageable, I was looking for a very deterministic, stable configuration, so I changed the firmware settings.

Stock Ubuntu with discrete graphics only

I switched to discrete graphics, which resulted in a garbled screen, and I was not even able to switch to a virtual terminal. This was with the open source nouveau driver. At that time I did not test the proprietary nvidia drivers, but rather quickly changed to integrated graphics in the firmware setup.

Stock Ubuntu with integrated graphics only

I switched to integrated graphics, which sometimes worked, but in the majority of cases I ended up with an error telling me "The system is running in low-graphics mode". Here, too, I was able to switch to a virtual terminal and recover gracefully. After some investigation it turned out the situation was related to a race condition around lightdm that prevented proper graphical startup - see this bug report for further details. With its fast SSD, the W530 is up 4-5 seconds after POST, and is therefore more prone to such race conditions.

In the bug report referred to above I found multiple suggested workarounds mainly focused on local scripts restarting lightdm twice. I have applied a slightly different workaround and optimized it to fit my needs:


$ sudo patch /etc/init/lightdm.conf <<'EOF'
--- /etc/init/lightdm.conf    2012-10-09 17:00:02.000000000 +0200
+++ /tmp/init/lightdm.conf    2013-01-06 00:34:26.061236424 +0100
@@ -42,10 +42,12 @@
         plymouth quit || :
         exit 0
     fi
    fi

+    # line added to prevent race condition resulting in "The system is running in low-graphics mode"
+    sleep 0.5s
    exec lightdm
end script

post-stop script
     if [ "$UPSTART_STOP_EVENTS" = runlevel ]; then
EOF

This tweak introduces a negligible delay at startup, and I have not experienced any graphics issues ever since. With the discrete graphics card disabled in firmware I was able to run the laptop for 12 hours on battery, of which about 3 hours were spent suspended and the other 9 mostly without heavy workloads.
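
Since /etc/init/lightdm.conf decides whether the graphical login comes up at all, a cautious way to apply a change like this is to test the patch first and keep a copy of the original file. The sketch below assumes GNU patch, which supports --dry-run. The helper name safe_patch and the patch file name in the example are made up for illustration.

```shell
#!/bin/sh
# safe_patch TARGET PATCHFILE
# Hypothetical helper: apply PATCHFILE to TARGET only if a dry run succeeds,
# keeping a copy of the unpatched file as TARGET.orig.
safe_patch() {
    target=$1 patchfile=$2
    # --dry-run checks that all hunks apply without touching the file
    patch --dry-run "$target" < "$patchfile" >/dev/null || return 1
    cp -p "$target" "$target.orig" || return 1
    patch "$target" < "$patchfile"
}

# Example use as root, with the diff saved to a file beforehand
# (the file name below is only a placeholder):
#   safe_patch /etc/init/lightdm.conf /tmp/lightdm-sleep.patch
```

If the dry run fails, nothing is modified and no backup is written, so the system stays bootable in its previous state.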

There is one very important potential drawback of this approach. On the W530, both the VGA port and the mini DisplayPort are wired to the nvidia chip - at least this is what I conclude from the user guide and from what I have found on the internet. If this turns out to be the case, I will not be able to hook up an external display with only the integrated card enabled, which would leave me unable to use a projector. We'll see once I get access to an external display to test with.

To be continued...

The current writing is the first of a series of posts on the Optimus, W530 and Ubuntu 12.10 tuple. Make sure you do not miss Part 2!

Sunday, 6 January 2013

Security & compliance related adjustments

After having installed the base system, it was time to go productive. This required migrating my work-related stuff from the T400 box to my new W530. But before copying any sensitive information to the thinkpad, a few security measures had to be taken...

According to ITCS 300, the information I need to store on my thinkpad makes whole disk encryption necessary. The purpose of disk encryption is to protect sensitive information even when an attacker gains physical control over your computer or hard drive. Encrypting critical files or even the whole home partition would not be enough - think about the swap partition, where in-memory data is written to. Even if software based full disk encryption is used, there are cold boot attack methods to gain access to encryption keys or other memory content. This is why Bitlocker or even LUKS based full disk encryption alone does not yield a universal solution.

Firmware setup

I acquired a self encrypting disk featuring FIPS certified hardware based full disk encryption with AES-256. It allows me to set up an ITCS 300 compliant thinkpad without any impact on performance or battery life imposed by software based full disk encryption such as LUKS. Moreover, it is more secure than LUKS, as the encryption key never has to leave the SSD, so it is not even copied to main memory. All one has to do is enable the hard disk password in the firmware menu.

Physical disassembly of the device in order to bypass the hard disk password check would not pose a security risk to the extent present on traditional rotating disks, where replacing the electronic circuit board would simply allow the attacker to gain access to the data without the password. All data is encrypted with a 256 bit key, and the key resides on the SSD controller itself and never leaves it. When a HDD password is set and forgotten, the encryption key is not accessible and cannot be recovered in any way through disassembling the drive - at least this is true as long as the AES encryption key is not stored in plain form after setting the password, but scrambled/encrypted using the password or a hash thereof.

To make a long story short, I set both user and master passwords on the SSD from within the firmware setup menu. For a more detailed explanation of why both user and master passwords were set, or why the firmware setup menu was used, I recommend reading my post on the topic from December 2012.

Additionally, I also enabled the supervisor password of the thinkpad, but decided not to use a power-on password. The way I configured the firmware, one can boot the thinkpad without any password, and even select the boot device from a supervisor-controlled, preconfigured menu. The hard drive password will be asked for anyway, unless booting from another device. Configuring a power-on password identical to the hard disk password would allow the thinkpad to ask for the password only once and re-use it; however, this would decrease rather than increase the security level of the system. There are known attacks worth being aware of that exploit this scenario to gain access to the hard disk password by cracking the power-on password, even if a security chip is present on the motherboard.

Disabling the guest session in lightdm

While this is not a real security concern at all, I opted to disable the guest session by altering the lightdm configuration file as follows:


echo "allow-guest=false" | sudo tee -a /etc/lightdm/lightdm.conf

Honestly, I do not see any benefits of this feature in a corporate environment...
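
Appending to the end of the file only has the intended effect if the last section in lightdm.conf is [SeatDefaults]; as far as I know that is the case for the stock Ubuntu 12.10 file, but it is an assumption worth verifying. A minimal sketch of such a check follows, with guest_disabled being a made-up helper name.

```shell
#!/bin/sh
# guest_disabled FILE
# Hypothetical check: succeeds only if "allow-guest=false" appears
# inside the [SeatDefaults] section of FILE.
guest_disabled() {
    awk '
        /^\[/ { in_seat = ($0 == "[SeatDefaults]") }   # track the current section
        in_seat && /^allow-guest=false[ \t]*$/ { found = 1 }
        END { exit !found }
    ' "$1"
}

# Example use:
#   guest_disabled /etc/lightdm/lightdm.conf && echo "guest session disabled"
```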

Saturday, 5 January 2013

First steps after installing Ubuntu 12.10

After setting up the base system on my W530, I also applied some initial customization to the desktop environment. Obviously, one of the first things was changing the background...

Keyboard shortcuts

There is a set of keyboard shortcuts I got used to. In the majority of cases, shortcuts are uncomplicated to customize - even for multimedia or other buttons. One just presses the Super button to activate the dash, types 'keyb' and hits 'enter' to open the gnome keyboard panel of the control center...

  • First and foremost, under 'Layout Settings' I configured 'English international w/ AltGr dead keys' in addition to my native layout. I use English layout while coding.
  • I selected 'Use same layout for all windows'.
  • Under 'Options', I configured 'Alt-CapsLock' to change the keyboard layout.
  • Back on the keyboard panel I moved to 'Shortcuts' and reconfigured the logout action to be triggered by 'Ctrl-Alt-Backspace'.
  • Then I added a custom action 'Shutdown menu' that triggers the command '/usr/bin/gnome-session-quit --power-off' on 'Ctrl-Alt-Del'. This command displays a menu that lets me select suspend, restart or shutdown. Another option would have been the command '/usr/lib/indicator-session/gtk-logout-helper --shutdown', but this does not allow choices other than shutdown.

I also wanted to bring up the same shutdown menu when the power button is pressed - I got used to it on my other thinkpads and liked it. I tried using xev and acpi_listen but found that pressing the power button does not send any key or ACPI events... However, the unlabelled black button next to my 'mute mic' button was recognized as 'Launch1', so I mapped it to trigger the same command as 'Ctrl-Alt-Del'.

Compiz settings

I quickly installed the package 'compizconfig-settings-manager' to tweak the window manager. I am rather conservative in this area, but applied the following changes:

  • Enabled snapping windows to window edges.
  • Disabled 'show desktop in switcher' in the Unity plugin configuration.
  • Tuned the Grid plugin to enable corner and bottom half placement of windows.
  • Set the launcher behaviour to autohide in the Unity plugin configuration. Note that this property can also be controlled from within System Settings > Appearance > Behavior.

Fine tuning the touchpad

Many properties of the touchpad can be configured via the GUI; however, I also fine-tuned parameters related to palm detection to increase productivity. I prefer sensitive palm detection, as I frequently hit the touchpad with the side of my thumbs during development.

I saved the following script under /home/tibi/.local/bin/touchpad-config:


#!/bin/sh
synclient PalmDetect=1 PalmMinWidth=5 TapButton3=2 HorizTwoFingerScroll=1

Running it made the touchpad behave according to my needs. To make the changes persistent I configured dconf as follows:


gsettings set org.gnome.settings-daemon.peripherals.input-devices hotplug-command /home/tibi/.local/bin/touchpad-config
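
As I understand it, the hotplug command is run by the settings daemon on input device events, so it may be invoked at moments when the touchpad tools are not usable. A slightly more defensive variant of the script could look like the sketch below. This is an illustration under that assumption, not the script I actually use; configure_touchpad is a made-up function name.

```shell
#!/bin/sh
# Hypothetical, more defensive variant of touchpad-config.
configure_touchpad() {
    # Do nothing (successfully) if synclient is not available
    if ! command -v synclient >/dev/null 2>&1; then
        echo "touchpad-config: synclient not found, skipping" >&2
        return 0
    fi
    # Same parameters as in the original one-line script
    synclient PalmDetect=1 PalmMinWidth=5 TapButton3=2 HorizTwoFingerScroll=1
}

configure_touchpad
```

The parameters passed to synclient are unchanged; the only difference is that a missing synclient no longer makes the hotplug command fail.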