OS X Lion Acquitted of Breaking the Web

Please accept my apologies for this article’s overly dramatic headline, but how else can you properly respond to the accusation that “Mac OS X Lion’s scroll breaks the web”?

According to the article’s author, Pablo Villalba, a feature that premiered in OS X 10.7 Lion — overloading the scrolling gesture with “Show previous/next page” commands — leads to interaction mayhem (emphasis his):

> [This behavior] can be disabled with a system-wide setting from Preferences -> Trackpad -> Mouse gestures (disable two-finger swipe), but my problem with it is that it breaks the web with a non-standard behavior, and gives you no JS API to prevent it.

Thankfully, the situation is not quite as gloomy, because Villalba’s claim is flawed:

  1. The swipe-for-page-turn behavior is not non-standard,
  2. it does not break the web, and
  3. disabling it on a website-by-website basis would be the wrong solution.

Just another way to turn a page

Macs equipped with a multi-touch trackpad have offered a two-finger gesture for scrolling window contents for a few years now.

Under OS X 10.7 Lion, this gesture does double-duty for page turning: If the window contents cannot be scrolled (e.g., if the document is fully visible), or if you have scrolled the content as far as it will go, the gesture will switch from scrolling to jumping to the previous or next page in a document.
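
The rule described above can be sketched in a few lines of Python. This is only an illustration, not Apple's implementation: the `ScrollView` class, the `handle_swipe` function, and the mapping of swipe direction to previous or next page are all assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class ScrollView:
    """Hypothetical stand-in for a window's scrollable content."""
    offset: int      # current scroll position
    max_offset: int  # furthest scroll position; 0 if the content is fully visible


def handle_swipe(view: ScrollView, delta: int) -> str:
    """Return the action a two-finger swipe of size `delta` would trigger."""
    if delta == 0:
        return "none"
    # Can the content still move in the direction of the swipe?
    can_scroll = view.offset > 0 if delta < 0 else view.offset < view.max_offset
    if can_scroll:
        # Scroll, but never past either end of the content.
        view.offset = min(max(view.offset + delta, 0), view.max_offset)
        return "scroll"
    # Nothing left to scroll: the same gesture now turns the page.
    # (Which direction means "previous" is an assumption of this sketch.)
    return "previous page" if delta < 0 else "next page"
```

The point of the sketch is that the gesture's meaning depends on state the user cannot always see, namely whether the content has already reached its scroll limit.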

This two-finger-swipe-to-turn-the-page behavior is a system-wide feature in OS X Lion, and is supported by several applications. For example, you can use it to turn pages in a PDF document in Preview, or to step through photos in Aperture.

In principle, it is just another way to trigger a common operation. As such, it complements existing keyboard shortcuts and button clicks, and constitutes perfectly standard behavior on any Mac running OS X Lion.

If going to another webpage in Safari via the two-finger swipe is breaking “the” web, why doesn’t Villalba accuse the Previous/Next Page buttons of the same crime?

Admittedly, using one interaction for triggering different commands is always problematic, because the user needs to understand what command will be triggered based on the software’s current mode of operation.

Consequently, the risk of inadvertently moving away from the current webpage via the two-finger swipe is likely higher than inadvertently clicking the Previous/Next buttons. But this is not the core problem here.

The real problem is that, for applications, the web’s page metaphor does not make sense.

Web apps and the page metaphor don’t mix

The original concept for the World Wide Web was based on hypertext, and the data you would view right inside the browser was mostly that: interlinked text.

For this kind of data, a page is a natural “serving size”, and this fact is reflected in the way browsers let you navigate “pages”.

If you’re dealing with an interactive web application, however, the page metaphor does not make sense. For a web application, its user interface is the only “page”, and browser features for handling pages — including the Previous/Next Page commands — are not only meaningless. They can even interfere with the proper functioning of such an app and also lead to data loss.

The problem, therefore, does not lie in how you step away from a webpage running an app (via clicking a button, pressing a keyboard shortcut, or gesturing on a trackpad), but in the very existence of these page-turning functions.

Keeping things consistent across, and within, apps

To address this problem for his own web application, Villalba would like to disable Safari’s turn-the-page gesture programmatically. He bemoans that, as of yet, there is no way to do this.

As I explained above, though, using this gesture for page turning is not limited to the Safari web browser. It is a system-wide feature in OS X Lion. Therefore, if a user has chosen to use this feature, it should work consistently across, as well as within, applications.

If it were possible to disable the gesture on a website-by-website basis, then that actually would break “the web” for any user who wants to use the gesture.

Instead of letting web developers disable system-features in this manner, I would prefer another solution to the very real problem that Villalba talks about.

An application mode in web browsers

The features and related user interface elements that browsers provide for handling web content as pages are what get in the way when running a web app.

One possible solution, therefore, could be based on a dedicated “application mode” that is supported across different browsers and invoked by a simple, standardized command or tag.

In this mode, the “Previous/Next Page” buttons would be deactivated, forcing the user to stay on the web app’s page. Selecting the Close Window command would present a warning dialog whose text could be customized programmatically, and the user’s response to it passed on to the web app, before it is requested to quit.

Surely there are other functions that web app developers would like to see being enabled or disabled based on this app setting. Also, I’m not sure exactly how these features should be implemented design-wise: How should the application mode status be communicated to the user? Should there be an override for deactivated functions? Etc.

In any case, the key is that the solution is standardized across browsers and web applications, and that it provides a flawless user experience that does not rely on the design and coding skills of the individual web applications’ developers.

Just as importantly, the application mode must not modify any interactions, like scrolling or page selection in a browser, that may be unconventional overall but are perfectly common on a specific computer platform.

More Hotel Room Observations

As I’ve stated before, hotel rooms are amazing places for interaction designers to explore. The people who stay here come from a range of cultural backgrounds, and their technical ability varies greatly. This makes designing user interfaces for this environment a formidable challenge, because all artifacts in a hotel room must be usable by every one of those guests.

Here are some fresh observations from a recent stay in Denver, Colorado, and related, useful design guidelines.

Don’t baffle me with wrong affordances!

One of our first discoveries when we walked into the room was this knob on the nightstand.

Night stand that, right below its top, features a knob that looks exactly like those on the TV console dresser

How neat: Looks like you can pull out a board to extend the nightstand’s surface! Only you can’t.

Although the knob looks identical to these two on a fully functional drawer, …

Front of the TV console dresser with three drawers, the top one of which sports two knobs

… the one on the nightstand does nothing. It’s pure decoration. And awful design, because things that look identical should work in identical manners.

Make devices physically easy to use!

Our room had a balcony that you access through a sliding glass door. Its lock is a poster child for visibility: Not one detail of its inner workings is hidden behind a cover.

Lock on balcony sliding door, consisting of two brackets connected by a U-shaped bolt

Although this makes it easy to understand how the lock works, it is painfully difficult to actually operate it.

The fit of the lock is very tight. Unless you manage to move the door to just the right position in which the friction between the bolt and the holes is minimized, it is simply impossible to pull the lock bolt out of these brackets.

The lack of a proper handle on the bolt only makes matters worse, and we managed to send the thing flying across the room more than once.

Provide useful instructions for non-simple devices!

Not all things in a hotel room are as simple as the balcony lock. Some of them, like this coffee maker, require instructions to make them work.

Small, two-cup coffee maker plus paper cups, coffee pouches, and condiments

Instructions should be easy to find, easy to read, and easy to understand.

The coffee maker’s instructions, however, are hidden on the inside of its lid, and nothing on, or near, the device points to that location.

The “manual” itself uses tiny low-resolution images, which are difficult to decode. For example, compare steps two and three: Can you make out the two disks representing the coffee pads inside the machine?

Lid of the coffee maker, lifted up, and displaying iconic brewing instructions on its inside

Why not place a placard next to the machine to make the instructions as easy to find as possible? This would also provide ample room for bigger, easier-to-decipher images, as well as plain text instructions.

The device’s user interface could also use a bit of attention.

The coffee maker from above with three buttons on the top

Its three buttons would benefit from higher-contrast labels, and coloring the more-important “Stop” button red would make it easier to distinguish between it and the “Brew” buttons.

Clue me in on how this thing works!

Speaking of easy-to-use buttons, this beautiful block of chromed metal is the toilet flush-lever. It does stand out from the off-white ceramics of the bowl, so it’s easy to find.

Its very clean rectangular shape, however, fails to provide any visual clues as to how to operate it.

Chrome toilet flush lever on the side of the tank

To trigger a flush, you need to push down on the rear (i.e., wall-side) end of the lever. A little indentation on the lever would remove any doubts about where to place your finger, and in which direction to move this control.

Help me find stuff quickly!

We prefer to fine-tune our rooms’ temperature, but sometimes it’s difficult to find the respective control panel.

See that little box on the wall to the right of the TV? That is the A/C control panel.

Wall of our hotel room, showing HVAC unit in the far corner and HVAC control panel close by, separated by a desk and TV console

The corresponding heater/AC unit is that large thing towards the far end of this wall, several meters away from the control panel.

It’s almost funny how unfortunately the control panel’s mounting location has been chosen: When you enter the room, it’s hidden behind the narrow column on the wall that you can see near the photo’s right edge.

And when you lie down in bed, it’s hidden behind the huge-screen TV. In fact, my fiancée only noticed the panel when we were about to leave the room to check out …

To close this article on a positive note, something that is essential to us traveling geeks was wonderfully easy to find in this room: Power sockets!

Instead of having to rummage around under desks, behind TVs, or inside fridge closets for precious electrons to recharge iPhone & Co, we found two empty sockets in plain sight.

Base of the desk lamp with the lamp's power switch and two 110V power sockets

Placed in the base of the desk lamp, they are easy to find, convenient to access, and provide power exactly where you will need it.

Yet more (if older) hotel room observations

If you find “hotel room usability” as exciting as I do, you will enjoy reading my observations on a radio alarm clock designed specifically for hotel rooms, a usability problem with hotel safes, an annoying shower curtain, a make-up mirror with a touch user interface, and even something as mundane as an elevator button panel.

Is the Menu of the Future Still a Menu?

The team behind Ubuntu Linux is on a mission to redefine how users issue commands to software applications. In a blog post entitled “Introducing the HUD. Say hello to the future of the menu.”, Mark Shuttleworth explains the approach they are researching.

The concept of seamlessly integrating an “intelligent” command line into a modern graphical user interface has been around for quite a while in the form of utilities like LaunchBar, Alfred, or Quicksilver.

These programs not only let you find files, but also let you search application-specific data like address book contacts, music tracks, and browser bookmarks, and apply meaningful operations to the search results.

While not quite as powerful, the system-wide search features in Windows 7 and Mac OS X also transcend simple searches based on the files’ names or their content.

What is new about the Ubuntu HUD is its scope: Instead of operating on just files and data, the HUD can also find and execute commands from the application’s menu.

Apple uses a similar approach with a text field inside the Help menu, which lets you search the entire menu structure of the currently active application as well as the app’s help file.

OS X's Help menu displaying its search text field, as well as menu items and relevant help file chapters for the current application based on the entered search term

The key difference between the two approaches is that Apple designed the menu search as an extension of the contextual Help system. As such it complements the application’s menu.

In contrast, the Ubuntu team intends their HUD interface to become a menu bar replacement.

> Say hello to the Head-Up Display, or HUD, which will ultimately replace menus in Unity applications.

As much as I am intrigued by the Ubuntu HUD as such, getting rid of the menu metaphor completely — including keyboard shortcuts — is not just unnecessarily drastic. It is short-sighted and misguided for a number of reasons.

For sheer speed, keyboard shortcuts are hard to beat

When graphical user interfaces were in their infancy, keyboard shortcuts were “invented” to allow users to more quickly invoke commonly used menu commands.

Instead of opening a menu and selecting one of its items with the mouse, you press a combination of a special “command” modifier key and one or more additional keys.¹

The key combination for a command is displayed next to its menu item, so you are reminded of it every time you select the command from the menu via a pointing device.

Once you’ve memorized a keyboard shortcut, pressing it takes a fraction of a second.

Compare this to the Undo operation in the “Introducing the HUD to Ubuntu” video (which you can watch embedded in Shuttleworth’s article or on YouTube): At 0:50 minutes into the video, you can observe how the user literally types “undo” into the HUD.

It takes the user about three seconds to issue that command!

This single interaction in the video provides sufficient proof that getting rid of keyboard shortcuts would be a seriously foolish move.

The HUD is modal, a keyboard shortcut isn’t (quite)

To ensure that key presses reach their intended destination (either text entry into a document or text field, or a menu command), the computer needs to be put into a temporary “command” mode when entering keyboard shortcuts.

The machine enters this mode when you press the Command key, and as soon as you let go of the key, it will leave the mode again. When you press the keys that make up a shortcut, Command is merely “first among equals”.

Once you’ve familiarized yourself with a shortcut, you will no longer “build it” key by key — “first, the Command key. Then the Shift key. And now press ‘S’, and there’s Save As…!”

Instead, you will “chord” the command, and it will feel like a single interaction step.

The HUD, in contrast, always requires at least three interaction steps:

  1. Pressing a key (or shortcut) to summon the HUD
  2. Pressing one or more keys to enter the search term
  3. Pressing a key to commit or cancel the selected command

Keyboard shortcuts are non-ambiguous and non-arbitrary

Depending on the search term you enter into the HUD, you may have to make a conscious selection from the list of search results that the HUD presents to you.

If you do have to make a selection beyond accepting or rejecting the “best match” that is automatically pre-selected for you, then that requires another interaction step, possibly consisting of pressing the up or down arrow keys multiple times.

In contrast, a keyboard shortcut always triggers one, and exactly one, command.

Command-P will always print the current document, but the top match that appears in the HUD when you enter a “P” may just as well be “Preferences”.
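
The difference can be made concrete with a small Python sketch. It assumes a simple case-insensitive prefix match, which is almost certainly cruder than the HUD's real matching and ranking; the `SHORTCUTS` table, the `MENU_COMMANDS` list, and the `hud_matches` function are hypothetical names invented for this illustration.

```python
# A shortcut table maps each key combination to exactly one command.
SHORTCUTS = {
    "Cmd+O": "Open",
    "Cmd+S": "Save",
    "Cmd+P": "Print",
    "Cmd+,": "Preferences",
    "Cmd+Q": "Quit",
}

# Hypothetical list of commands that a search field would look through.
MENU_COMMANDS = ["Open", "Paste", "Preferences", "Print", "Quit", "Save"]


def hud_matches(term, commands=MENU_COMMANDS):
    """Return every command whose name starts with the search term."""
    needle = term.lower()
    return [name for name in commands if name.lower().startswith(needle)]


print(SHORTCUTS["Cmd+P"])  # Print
print(hud_matches("p"))    # ['Paste', 'Preferences', 'Print']
print(hud_matches("pri"))  # ['Print']
```

The lookup by shortcut is one step with one possible result, whereas the search term has to grow until the list narrows to a single entry, or the user has to pick from the list.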

Speaking of the match between shortcut keys and commands: Shuttleworth claims that …

> Hotkeys are a sort of mental gymnastics, the HUD is a continuation of mental flow.

At least for basic shortcuts, that claim doesn’t hold water.

When assigning a key to a menu item, developers don’t make random picks. Instead, they carefully choose letters that have a meaningful correlation with the command. Like “Open”, “Save”, “Print”, “Close Window”, or “Quit”.

Note, by the way, how you would start the corresponding HUD search with the exact same letters that are used for the shortcuts!

In cases such as these, where you know exactly which command to execute, the HUD does not offer advantages in terms of learning, recalling, or finding commands.

Some shortcuts use less intuitive letters, like Command-Z for undo, or Command-[comma] for opening an application’s preferences (on the Mac). Once you use them often enough, though, even they will become second nature over time, especially when they are among those commands that are standardized across the entire platform and, thus, trigger the same function in all programs.

Shortcuts and menus leverage motor memory

When you issue a shortcut like Command-Control-F, you will likely not have to consciously place the three fingers on their respective target keys. After you’ve gained enough practice in “chording” the shortcut, your fingers will move into place automatically.

The same holds true when selecting certain commands with the mouse. E.g., on a Mac, the “About [this application]” command’s position in the menu structure is standardized across the operating system.

Therefore, you know (from experience) that your first mouse pointer destination is “somewhere up there in the top-left corner, and it’s the menu directly to the right of that Apple thingy.” Your second destination is the first thing right underneath the menu’s title label.

Even though their positions aren’t standardized as strictly, many other often-used commands (for creating, opening, and saving files, for example) are found in similar locations in every application on the platform.²

Combined with motor memory, selecting a menu item this way can be surprisingly efficient.

Driving around with the mouse? All day!

Here is another quote from Shuttleworth’s article:

> So while there are modes of interaction where it’s nice to sit back and drive around with the mouse, we observe people staying more engaged and more focused on their task when they can keep their hands on the keyboard all the time.

For someone who mainly works with text, that may very well be true.

Watch someone work on non-text data, however, and you’ll observe that, for these people, keeping one hand on the mouse (or graphics tablet, etc.) and another on the keyboard is their standard “mode of interaction”.

This approach is common for any work that involves a continuous mixture of entering keyboard commands and moving objects on the screen, regardless of whether these objects are shapes on a canvas in a graphics editor, audio and MIDI snippets in a track arrange window in a music recording program, or images in a photo management app.

In such cases, you manipulate the on-screen elements with one hand on your pointing device of choice, while using the other hand to enter keyboard shortcuts for copying, pasting, etc., or pressing modifier keys for changing the cursor behavior from dragging to rotation, for example.

Consequently, assuming that keeping both hands on the keyboard at all times is the optimum solution for every type of user is pure nonsense.

For anyone who spends the better part of their working day inside applications like Photoshop, Pro Tools, or Aperture, it would be a nightmare to be forced to use a command line instead of being able to concurrently combine keyboard shortcuts with the extensive use of a pointing device.

The Ubuntu HUD’s content isn’t optimized for its use

When you watch the demo video, keep a close eye on the search results in the HUD. In its current implementation, the HUD provides just a different view on the application’s menu structure. Its output is not optimized for use in this UI control.

For example, at 1:55, the user enters “alic” into the HUD to search for “Alice in Wonderland”. The matching result is listed as “Tools > Bookmarks > Alice in Wonderland”.

What average computer user thinks of an audio track as a bookmark? And how is a bookmark related to tools?

In cases such as this, displaying the full menu structure that encloses the “Alice in Wonderland” track does not add any useful information. In fact, it even makes decoding that search match confusing.

Compare that to how LaunchBar displays a music track: All you see is the track name.

LaunchBar displaying search results as a simple list with a single line of text per item, preceded by an icon indicating the item's data type

The search results’ types are communicated by simple, reasonably intuitive icons. Unlike the Ubuntu HUD, LaunchBar hides the found items’ meaningless taxonomical overhead from the user.

Using the left and right arrow keys, you can further explore the results list. Moving “right” from the “Cyberfunk Acoustic Revenge” album, you get to see all tracks on that album.

List of song titles displayed by LaunchBar after

Digging into the selected track, you are presented with the track’s artist, the album that it’s on, and its genre.


The core difference between the Ubuntu HUD (as it is working now) and LaunchBar is that the former is based on a rigid menu structure, whereas the latter searches arbitrary data, and presents it in a way that is meaningful and highly accessible to the user.

Instead of mapping a flattened menu tree structure into a linear text list as demonstrated in the video, the HUD should display its information from a task perspective by:

  • displaying contextual information when it is useful (“History > Planet Ubuntu” ⇒ “Browser History > Planet Ubuntu”),

  • reducing redundancy (“Bookmarks > Bookmark This Page” ⇒ “Bookmark This Page” or “Edit > Undo Fuzzy Glow” ⇒ “Undo Fuzzy Glow”), and

  • speaking the user’s language (“Tools > Bookmarks > Alice in Wonderland” ⇒ “Play Music: Alice in Wonderland”).
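
As a rough sketch of how such relabeling might work, the three rules above can be expressed as a small Python function. Everything here is hypothetical: the `task_label` function and the two lookup tables are invented for this illustration, and a real implementation would need the application to supply this information instead of hard-coding it.

```python
# Hypothetical hints an application could attach to its menus.
TASK_PREFIXES = {"Tools > Bookmarks": "Play Music: "}  # speak the user's language
CONTEXT_LABELS = {"History": "Browser History"}        # add useful context


def task_label(menu_path: str) -> str:
    """Turn a flattened menu path into a task-oriented label."""
    parts = menu_path.split(" > ")
    parent_path, leaf = " > ".join(parts[:-1]), parts[-1]
    if parent_path in TASK_PREFIXES:
        return TASK_PREFIXES[parent_path] + leaf
    if parent_path in CONTEXT_LABELS:
        return f"{CONTEXT_LABELS[parent_path]} > {leaf}"
    # Default: the enclosing menus add nothing, so show the command only.
    return leaf


for path in [
    "History > Planet Ubuntu",
    "Bookmarks > Bookmark This Page",
    "Edit > Undo Fuzzy Glow",
    "Tools > Bookmarks > Alice in Wonderland",
]:
    print(task_label(path))
```

For the four example paths, this prints the labels suggested in the list above.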

In with the new, but do keep the old!

I applaud Ubuntu’s efforts to come up with new ways of interacting with “The Machine”.

The HUD has the potential to combine the best from advanced search technologies like Apple’s Spotlight, smart command line interfaces like LaunchBar or Quicksilver, and new ways to access an application’s menu commands into a single, extremely powerful, yet usable interface element.

Shuttleworth is probably right when he claims that:

> the HUD is faster than mousing through a menu, and easier to use than hotkeys since you just have to know what you want, not remember a specific key combination.

Conversely, though, the HUD is neither easier to use than a menu nor faster than hotkeys.

To use the HUD effectively, you need to have an understanding of which commands it understands, whereas you can browse (and sometimes search) a menu for the commands and functions it makes available.

And while you have to memorize a hotkey in order to use it effectively, it is much faster to access than entering a command via the HUD.

Therefore, I hope the powers that be at Ubuntu will revise the decision to completely tear out the support for classic menus from their operating system.

Instead, they should let their users decide whether the menu bar is displayed in its standard on-screen location, whether it’s stashed away in the panel, or whether it should be fully hidden from the user’s eye.

Sometimes, new ideas aren’t good enough to fully replace the old ones. But, more often than not, they’re just right to complement and extend them.


  1. The Mac has a dedicated Command key, but other operating systems’ use of the Control key works just as well. For brevity and convenience, I will use the term “Command key” in this article. 

  2. Unless, of course, an app’s developer lacks the expertise required to properly design a native application for a specific operating system or window manager.