OS X Lion Acquitted of Breaking the Web

Please accept my apologies for this article’s overly dramatic headline, but how else can you properly respond to the accusation that “Mac OS X Lion’s scroll breaks the web”?

According to the article’s author, Pablo Villalba, a feature that premiered in OS X 10.7 Lion — overloading the scrolling gesture with “Show previous/next page” commands — leads to interaction mayhem (emphasis his):

> [This behavior] can be disabled with a system-wide setting from Preferences -> Trackpad -> Mouse gestures (disable two-finger swipe), but my problem with it is that it breaks the web with a non-standard behavior, and gives you no JS API to prevent it.

Thankfully, the situation is not quite as gloomy, because Villalba’s claim is flawed:

  1. The swipe-for-page-turn behavior is not non-standard,
  2. it does not break the web, and
  3. disabling it on a website-by-website basis would be the wrong solution.

Just another way to turn a page

Macs equipped with a multi-touch trackpad have offered a two-finger gesture for scrolling window contents for a few years now.

Under OS X 10.7 Lion, this gesture does double-duty for page turning: If the window contents cannot be scrolled (e.g., if the document is fully visible), or if you have scrolled the content as far as it will go, the gesture will switch from scrolling to jumping to the previous or next page in a document.
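To make this double duty concrete, here is a minimal sketch (in TypeScript, purely illustrative and not Apple’s actual implementation) of how a single swipe gesture can be dispatched either to scrolling or to page turning, depending on whether the content has any room left to scroll:

```ts
// Illustrative model of the behavior described above (not Apple's code):
// the same horizontal two-finger swipe either scrolls or turns the page,
// depending on whether the content can still move in that direction.
type SwipeAction = 'scroll' | 'previous-page' | 'next-page';

function actionForSwipe(
  deltaX: number,        // negative: swipe toward the previous page (assumed non-zero)
  scrollLeft: number,    // current horizontal scroll offset
  contentWidth: number,  // total scrollable width
  viewportWidth: number  // visible width
): SwipeAction {
  const atLeftEdge = scrollLeft <= 0;
  const atRightEdge = scrollLeft + viewportWidth >= contentWidth;

  if (deltaX < 0) return atLeftEdge ? 'previous-page' : 'scroll';
  return atRightEdge ? 'next-page' : 'scroll';
}
```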

This two-finger-swipe-to-turn-the-page behavior is a system-wide feature in OS X Lion, and is supported by several applications. For example, you can use it to turn pages in a PDF document in Preview, or to step through photos in Aperture.

In principle, it is just another way to trigger a common operation. As such, it complements existing keyboard shortcuts and button clicks, and constitutes perfectly standard behavior on any Mac running OS X Lion.

If going to another webpage in Safari via the two-finger swipe is breaking “the” web, why doesn’t Villalba accuse the Previous/Next Page buttons of the same crime?

Admittedly, using one interaction for triggering different commands is always problematic, because the user needs to understand what command will be triggered based on the software’s current mode of operation.

Consequently, the risk of inadvertently moving away from the current webpage via the two-finger swipe is likely higher than that of inadvertently clicking the Previous/Next buttons. But this is not the core problem here.

The real problem is that, for applications, the web’s page metaphor does not make sense.

Web apps and the page metaphor don’t mix

The original concept for the World Wide Web was based on hypertext, and the data you would view right inside the browser was mostly that: interlinked text.

For this kind of data, a page is a natural “serving size”, and this fact is reflected in the way browsers let you navigate “pages”.

If you’re dealing with an interactive web application, however, the page metaphor does not make sense. For a web application, its user interface is the only “page”, and browser features for handling pages — including the Previous/Next Page commands — are not only meaningless, but can interfere with the proper functioning of such an app and even lead to data loss.

The problem, therefore, does not lie in how you step away from a webpage running an app — via clicking a button, pressing a keyboard shortcut, or gesturing on a trackpad — but in the very existence of these page-turning functions.

Keeping things consistent across, and within, apps

To address this problem for his own web application, Villalba would like to disable Safari’s turn-the-page gesture programmatically. He bemoans that, as of yet, there is no way to do this.
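For concreteness, the closest a web page can get on its own is something like the following sketch, which uses only standard DOM wheel events (there is no dedicated gesture API in Safari, which is exactly what Villalba bemoans); whether the browser actually honors the cancellation for this gesture is an open question:

```ts
// A per-page attempt to suppress swipe-to-navigate by swallowing horizontal
// overscroll at the document edges. Standard DOM APIs only; this is not a
// Safari-specific gesture API, and Safari may or may not honor it.
window.addEventListener(
  'wheel',
  (event: WheelEvent) => {
    const el = document.scrollingElement;
    if (!el) return;

    const atLeftEdge = el.scrollLeft <= 0 && event.deltaX < 0;
    const atRightEdge =
      el.scrollLeft + el.clientWidth >= el.scrollWidth && event.deltaX > 0;

    // If the page cannot scroll any further, cancel the event so the browser
    // (hopefully) never interprets the overscroll as a page turn.
    if (atLeftEdge || atRightEdge) event.preventDefault();
  },
  { passive: false }
);
```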

As I explained above, though, using this gesture for page turning is not limited to the Safari web browser. It is a system-wide feature in OS X Lion. Therefore, if a user has chosen to use this feature, it should work consistently across, as well as within, applications.

If it were possible to disable the gesture on a website-by-website basis, then that actually would break “the web” for any user who wants to use the gesture.

Instead of letting web developers disable system features in this manner, I would prefer another solution to the very real problem that Villalba talks about.

An application mode in web browsers

The features and related user interface elements that browsers provide for handling web content as pages are what get in the way when running a web app.

One possible solution, therefore, could be based on a dedicated “application mode” that is supported across different browsers and invoked by a simple, standardized command or tag.

In this mode, the “Previous/Next Page” buttons would be deactivated, keeping the user on the web app’s page. Selecting the Close Window command would present a warning dialog whose text could be customized programmatically, and the user’s response would be passed on to the web app before it is asked to quit.
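As a rough sketch of what such a standardized opt-in might look like: the meta tag below is entirely hypothetical and defined by no browser, while the beforeunload handler is an existing mechanism that already covers the “warn before closing” half of the proposal.

```ts
// Hypothetical opt-in, e.g. declared in the page's <head>:
//   <meta name="application-mode" content="on">
// No browser defines such a tag; it merely stands in for the standardized
// "application mode" command proposed above.

// The close-window warning half of the proposal already has a real
// counterpart in the beforeunload event.
window.addEventListener('beforeunload', (event: BeforeUnloadEvent) => {
  if (!hasUnsavedChanges()) return;

  // Ask the browser to show its leave/close confirmation dialog. Browsers
  // display generic text; the proposed application mode would let the app
  // customize it and learn the user's answer before being shut down.
  event.preventDefault();
  event.returnValue = 'You have unsaved work in this application.';
});

// Placeholder for the web app's own dirty-state tracking.
function hasUnsavedChanges(): boolean {
  return true;
}
```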

Surely there are other functions that web app developers would like to see enabled or disabled based on this application setting. Also, I’m not sure exactly how these features should be implemented design-wise: How should the application-mode status be communicated to the user? Should there be an override for deactivated functions? Etc.

In any case, the key is that the solution is standardized across browsers and web applications, and that it provides a flawless user experience that does not rely on the design and coding skills of the individual web applications’ developers.

Just as importantly, the application mode must not modify any interactions — like scrolling or page selection in a browser — that may be unconventional overall, but perfectly common on a specific computer platform.

Is the Menu of the Future Still a Menu?

The team behind Ubuntu Linux is on a mission to redefine how users issue commands to software applications. In a blog post entitled “Introducing the HUD. Say hello to the future of the menu.”, Mark Shuttleworth explains the approach they are researching.

The concept of seamlessly integrating an “intelligent” command line into a modern graphical user interface has been around for quite a while in the form of utilities like LaunchBar, Alfred, or Quicksilver.

These programs not only let you find files, but also let you search application-specific data like address book contacts, music tracks, and browser bookmarks, and apply meaningful operations to the search results.

While not quite as powerful, the system-wide search features in Windows 7 and Mac OS X also transcend simple searches based on the files’ names or their content.

What is new about the Ubuntu HUD is its scope: Instead of operating on just files and data, the HUD can also find and execute commands from the application’s menu.

Apple uses a similar approach with a text field inside the Help menu, which lets you search the entire menu structure of the currently active application as well as the app’s help file.

OS X's Help menu displaying its search text field, as well as menu items and relevant help file chapters for the current application based on the entered search term

The key difference between the two approaches is that Apple designed the menu search as an extension of the contextual Help system. As such it complements the application’s menu.

In contrast, the Ubuntu team intends their HUD interface to become a replacement for the menu.

> Say hello to the Head-Up Display, or HUD, which will ultimately replace menus in Unity applications.

As much as I am intrigued by the Ubuntu HUD as such, getting rid of the menu metaphor completely — including keyboard shortcuts — is not just unnecessarily drastic. It is short-sighted and misguided for a number of reasons.

For sheer speed, keyboard shortcuts are hard to beat

When graphical user interfaces were in their infancy, keyboard shortcuts were “invented” to allow users to more quickly invoke commonly used menu commands.

Instead of opening a menu and selecting one of its items with the mouse, you press a combination of a special “command” modifier key and one or more additional keys.1

The key combination for a command is displayed next to its menu item, so you are reminded of it every time you select the command from the menu via a pointing device.

Once you’ve memorized a keyboard shortcut, pressing it takes a fraction of a second.

Compare this to the Undo operation in the “Introducing the HUD to Ubuntu” video (which you can watch embedded in Shuttleworth’s article or on YouTube): At 0:50 minutes into the video, you can observe how the user literally types “undo” into the HUD.

It takes the user about three seconds to issue that command!

This single interaction in the video provides sufficient proof that getting rid of keyboard shortcuts would be a seriously foolish move.

The HUD is modal, a keyboard shortcut isn’t (quite)

To ensure that key presses reach their intended destination — either text entry into a document or text field, or a menu command — the computer needs to be put into a temporary “command” mode when entering keyboard shortcuts.

The machine enters this mode when you press the Command key, and as soon as you let go of the key, it will leave the mode again. When you press the keys that make up a shortcut, Command is merely “first among equals”.

Once you’ve familiarized yourself with a shortcut, you will no longer “build it” key by key — “first, the Command key. Then the Shift key. And now press ‘S’, and there’s Save As…!”

Instead, you will “chord” the command, and it will feel like a single interaction step.

The HUD, in contrast, always requires at least three interaction steps:

  1. Pressing a key (or shortcut) to summon the HUD
  2. Pressing one or more keys to enter the search term
  3. Pressing a key to commit or cancel the selected command

Keyboard shortcuts are non-ambiguous and non-arbitrary

Depending on the search term you enter into the HUD, you may have to make a conscious selection from the list of search results that the HUD presents to you.

If you do have to make a selection beyond accepting or rejecting the “best match” that is automatically pre-selected for you, then that requires another interaction step, possibly consisting of pressing the up or down arrow keys multiple times.

In contrast, a keyboard shortcut always triggers one, and exactly one, command.

Command-P will always print the current document, but the top match that appears in the HUD when you enter a “P” may just as well be “Preferences”.

Speaking of the match between shortcut keys and commands: Shuttleworth claims that …

> Hotkeys are a sort of mental gymnastics, the HUD is a continuation of mental flow.

At least for basic shortcuts, that claim doesn’t hold water.

When assigning a key to a menu item, developers don’t make random picks. Instead, they carefully choose letters that have a meaningful correlation with the command, like “Open”, “Save”, “Print”, “Close Window”, or “Quit”.

Note, by the way, how you would start the corresponding HUD search with the exact same letters that are used for the shortcuts!

In cases such as these, where you know exactly which command to execute, the HUD does not offer advantages in terms of learning, recalling, or finding commands.

Some shortcuts use less intuitive letters, like Command-Z for undo, or Command-[comma] for opening an application’s preferences (on the Mac). Once you use them often enough, though, even they will become second nature over time, especially when they are among those commands that are standardized across the entire platform and, thus, trigger the same function in all programs.

Shortcuts and menus leverage motor memory

When you issue a shortcut like Command-Control-F, you will likely not have to consciously place the three fingers on their respective target keys. After you’ve gained enough practice in “chording” the shortcut, your fingers will move into place automatically.

The same holds true when selecting certain commands with the mouse. E.g., on a Mac, the “About [this application]” command’s position in the menu structure is standardized across the operating system.

Therefore, you know (from experience) that your first mouse pointer destination is “somewhere up there in the top-left corner, and it’s the menu directly to the right of that Apple thingy.” Your second destination is the first thing right underneath the menu’s title label.

Even though their positions aren’t standardized as strictly, many other often-used commands — for creating, opening, and saving files, for example — are found in similar locations in every application on the platform.2

Combined with motor memory, selecting a menu item this way can be surprisingly efficient.

Driving around with the mouse? All day!

Here is another quote from Shuttleworth’s article:

> So while there are modes of interaction where it’s nice to sit back and drive around with the mouse, we observe people staying more engaged and more focused on their task when they can keep their hands on the keyboard all the time.

For someone who mainly works with text, that may very well be true.

Watch someone work on non-text data, however, and you’ll observe that, for these people, keeping one hand on the mouse (or graphics tablet, etc.) and the other on the keyboard is their standard “mode of interaction”.

This approach is common for any work that involves a continuous mixture of entering keyboard commands and moving objects on the screen, regardless of whether these objects are shapes on a canvas in a graphics editor, audio and MIDI snippets in a track arrange window in a music recording program, or images in a photo management app.

In such cases, you manipulate the on-screen elements with one hand on your pointing device of choice, while using the other hand to enter keyboard shortcuts for copying, pasting, etc., or pressing modifier keys for changing the cursor behavior from dragging to rotation, for example.

Consequently, assuming that keeping both hands on the keyboard at all times is the optimum solution for every type of user is pure nonsense.

For anyone who spends the better part of their working day inside applications like Photoshop, Pro Tools, or Aperture, it would be a nightmare to be forced to use a command line instead of being able to concurrently combine keyboard shortcuts with the extensive use of a pointing device.

The Ubuntu HUD’s content isn’t optimized for its use

When you watch the demo video, keep a close eye on the search results in the HUD. In its current implementation, the HUD provides just a different view on the application’s menu structure. Its output is not optimized for use in this UI control.

For example, at 1:55, the user enters “alic” into the HUD to search for “Alice in Wonderland”. The matching result is listed as “Tools > Bookmarks > Alice in Wonderland”.

What average computer user thinks of an audio track as a bookmark? And how is a bookmark related to tools?

In cases like this, displaying the full menu structure that encloses the “Alice in Wonderland” track does not add any useful information. In fact, it makes decoding that search match outright confusing.

Compare that to how LaunchBar displays a music track: All you see is the track name.

LaunchBar displaying search results as a simple list with a single line of text per item, preceded by an icon indicating the item's data type

The search results’ types are communicated by simple, reasonably intuitive icons. Unlike the Ubuntu HUD, LaunchBar hides the found items’ meaningless taxonomical overhead from the user.

Using the left and right arrow keys, you can further explore the results list. Moving “right” from the “Cyberfunk Acoustic Revenge” album, you get to see all tracks on that album.

List of song titles displayed by LaunchBar after drilling into the selected album

Digging into the selected track, you are presented with the track’s artist, the album that it’s on, and its genre.

LaunchBar displaying further details for the selected track: artist, album, and genre

The core difference between the Ubuntu HUD (as it is working now) and LaunchBar is that the former is based on a rigid menu structure, whereas the latter searches arbitrary data, and presents it in a way that is meaningful and highly accessible to the user.

Instead of mapping a flattened menu tree structure into a linear text list as demonstrated in the video, the HUD should display its information from a task perspective (see the sketch after this list) by:

  • displaying contextual information when it is useful (“History > Planet Ubuntu” ⇒ “Browser History > Planet Ubuntu”),

  • reducing redundancy (“Bookmarks > Bookmark This Page” ⇒ “Bookmark This Page” or “Edit > Undo Fuzzy Glow” ⇒ “Undo Fuzzy Glow”), and

  • speaking the user’s language (“Tools > Bookmarks > Alice in Wonderland” ⇒ “Play Music: Alice in Wonderland”).
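A minimal sketch of such a rewrite, in TypeScript, with made-up rules that merely illustrate the three points above (nothing here reflects an actual Ubuntu implementation):

```ts
// Turn a flattened menu path into a task-oriented label.
function hudLabel(menuPath: string[]): string {
  const leaf = menuPath[menuPath.length - 1];
  const parent = menuPath.length > 1 ? menuPath[menuPath.length - 2] : '';

  // Speak the user's language: a track filed under Bookmarks is something you play.
  if (parent === 'Bookmarks' && menuPath.includes('Tools')) {
    return `Play Music: ${leaf}`;
  }
  // Keep contextual information where it is actually useful.
  if (parent === 'History') {
    return `Browser History > ${leaf}`;
  }
  // Reduce redundancy: drop enclosing menus that add nothing to the command.
  return leaf;
}

// The examples from the list above:
hudLabel(['Tools', 'Bookmarks', 'Alice in Wonderland']); // "Play Music: Alice in Wonderland"
hudLabel(['History', 'Planet Ubuntu']);                  // "Browser History > Planet Ubuntu"
hudLabel(['Bookmarks', 'Bookmark This Page']);           // "Bookmark This Page"
hudLabel(['Edit', 'Undo Fuzzy Glow']);                   // "Undo Fuzzy Glow"
```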

In with the new, but do keep the old!

I applaud Ubuntu’s efforts to come up with new ways of interacting with “The Machine”.

The HUD has the potential to combine the best from advanced search technologies like Apple’s Spotlight, smart command line interfaces like LaunchBar or Quicksilver, and new ways to access an application’s menu commands into a single, extremely powerful, yet usable interface element.

Shuttleworth is probably right when he claims that:

> the HUD is faster than mousing through a menu, and easier to use than hotkeys since you just have to know what you want, not remember a specific key combination.

Conversely, though, the HUD is neither easier to use than a menu nor faster than hotkeys.

To use the HUD effectively, you need to already know which commands exist, whereas you can browse (and sometimes search) a menu for the commands and functions it makes available.

And while you have to memorize a hotkey in order to use it effectively, it is much faster to access than entering a command via the HUD.

Therefore, I hope the powers that be at Ubuntu will revise the decision to completely tear out the support for classic menus from their operating system.

Instead, they should let their users decide whether the menu bar is displayed in its standard on-screen location, whether it’s stashed away in the panel, or whether it should be fully hidden from the user’s eye.

Sometimes, new ideas aren’t good enough to fully replace the old ones. But, more often than not, they’re just right to complement and extend them.


  1. The Mac has a dedicated Command key, but other operating systems’ use of the Control key works just as well. For brevity and convenience, I will use the term “Command key” in this article. 

  2. Unless, of course, an app’s developer lacks the required expertise of properly designing a native application for a specific operating system or window manager. 

The iPhone Mute Switch Conundrum

If you’re at all interested in the mobile phone ecosystem, you will likely have heard about a recent incident at Avery Fisher Hall: The conductor of the New York Philharmonic Orchestra stopped a concert performance, because of an alarm ringing on an iPhone, whose owner thought he had properly silenced the device.

The New York Times has all the details.

The incident received lots of attention on the Internet, with many a commenter accusing the iPhone’s owner — who has kindly been anonymized as “Patron X” — of being simply too stupid to use a “smart” phone.

As a user interface designer, you can view this unfortunate incident as a real-life usability test. And the iPhone failed this test. Yes, the phone failed, and not the user.

The iPhone fails a usability test

Here is a key passage from the New York Times article:

> [Patron X] said he made sure to turn [the iPhone] off before the concert, not realizing that the alarm clock had accidentally been set and would sound even if the phone was in silent mode.
>
> “I didn’t even know phones came with alarms,” the man said.

This user was expecting the mute switch to completely silence the phone.

Since he had just gotten the iPhone the day before the concert, he had to rely on his intuition to make sense of the device’s controls and related on-screen messages.

The mental model he came up with for the mute switch was very simple: “This thing switches the phone’s speaker on or off.”3

His exact thoughts might have been different, but the important part is that he was not expecting the general function of the switch to be any more complex than that.

“It’s ‘all sounds on’ or ‘all sounds off’. Got that!”

This is a perfectly sound expectation of how the mute switch works. Here is the reason why: There is no indication in the iPhone’s UI that would contradict this straightforward expectation. I’ll explain that in a minute.

A conflict of expectations, not a conflict of commands

Marco Arment is among those who disagree.

He argues that the iPhone switch works just as it should, and offers this explanation for what went wrong (emphasis his):

> The user told the iPhone to make noise by either scheduling an alarm or initiating an obviously noise-playing feature in an app.
>
> The user also told the iPhone to be silent with the switch on the side.
>
> The user has issued conflicting commands, and the iPhone can’t obey both.

Alas, he’s wrong. Well, kind of.

Marco takes the point of view of a programmer. From that perspective, there are, indeed, conflicting commands that the designers of the iPhone’s user interface had to foresee and address.

Keep in mind, though, that Patron X was not merely unaware that an alarm had unintentionally been set in the iPhone’s Clock app.4 He claims that he did not even know that such a feature existed on his phone.

The expectation of the user clearly was this: “When I flip this switch, the phone won’t make any noise whatsoever during this concert.”

Therefore, from the user’s perspective, only a single, non-ambiguous command was given: Stay quiet, no matter what!

This is not a case of conflicting commands. Instead, it is a matter of conflicting expectations of how the device operates, or should operate. A classic example of the user’s mental model not matching the one intended by the device’s designers.

Managing user expectations through status visibility

According to page 11 of the iPhone User Guide (iOS 5 version), the “Ring/Silent switch” operates as follows:

> In ring mode, iPhone plays all sounds. In silent mode, iPhone doesn’t ring or play alerts and other sound effects.
>
> Important: Clock alarms, audio apps such as Music, and many games still play sounds through the built-in speaker when iPhone is in silent mode.

Alas, the iPhone’s user interface does not make the user aware of these exceptions at all.

To understand the behavior of the mute switch, you must read the user’s manual. And to apply this understanding, you have to memorize it.

Let’s put this into perspective: A premium “smart” mobile phone, whose user interface is considered by some to be the epitome of user-friendliness, requires you to memorize configuration details because the device does not properly communicate its system status.5

If you take a closer look at the status feedback the iPhone provides with regards to the mute switch, you will find that it does not take alarm settings into account.

When you mute the phone, you feel a single vibration burst, and this info bezel appears on the iPhone’s screen:

Screen of a locked iPhone, displaying the “muted” info bezel (https://uiobservatory.com/media/2012/iPhoneMuteSwitch_MutedBezel.jpg)

Un-mute the iPhone, and you get to see this (without any vibration):

Screen of a locked iPhone, displaying the “un-muted” info bezel (https://uiobservatory.com/media/2012/iPhoneMuteSwitch_UnMutedBezel.jpg)

The only setting I could find that affects this behavior is the Vibrate option in the Sounds settings. Un-check that option, and the vibration feedback is now reversed: the phone vibrates when you un-mute it, and stays quiet when you mute it.

The icons shown above look exactly the same, however, regardless of whether an alarm is set, or not.

There is one indication that an alarm may sound: the clock icon in the status bar. Like the bell icons shown above, however, it has the exact same shape and color regardless of whether any of the active alarms will actually “make a noise”.

Screen of a locked iPhone, with an arrow highlighting the “clock” alarm icon in the status bar (https://uiobservatory.com/media/2012/iPhoneMuteSwitch_AlarmIcon.jpg)

Therefore, if you absolutely, positively must make sure that your iPhone stays quiet while it is muted and the Alarm icon is showing, you will need to open the Clock app and check the list of alarms.

That list, by the way, again fails to provide visual cues as to which alarms have a sound assigned to them, and which are vibration-only. You can only check this by entering edit mode, tapping an alarm, and checking the Sound setting. Seeing a corresponding icon right in the list items would be useful here.

List of alarms in the Clock app on an iPhone

Of course, you can switch the iPhone off completely, but that would make the very existence of the Ring/Silent switch rather moot. It would also prevent you from quickly looking up a piece of information on the iPhone due to the time it takes the iPhone to cold-start. And if you did turn it back on, you might just be surprised by an alarm going off the moment the lock screen appears …

As a third option, you could try to memorize which alarms you have set for which times, and whether any of these will play a sound, or not. That is a “suggestion” that you hear a lot from the commenters attacking Patron X over his mishap: “He should have remembered that he had set an alarm that would go off during the concert.”

I cannot begin to explain how stunned I am by these people’s notion that it is perfectly acceptable for a high-tech 21st-century digital device to offload memorizing system status information to its user in this manner.

When designers do not expect their user’s expectations

Given that the iPhone does not present any status feedback — neither visual, nor audible, nor tactile — that it may play a sound while muted, and that a simple control for a simple function should cause the least possible surprise when operated, I think it is perfectly valid and appropriate for a non-expert user of an iPhone to expect the mute switch to work exactly like Patron X thought it would.

I’m not arguing that this simplistic behavior is the most appropriate or the most useful for average users, mind you!

What I am saying, though, is that it is the most intuitive behavior, i.e., the behavior that a non-expert user would most likely expect to see.

Do tell, little iPhone, what are you up to?

Without hands-on user testing, it is not possible to say which changes to the iPhone’s UI would have been necessary to prevent Patron X from making his mistake.

I have no doubt, though, that more explicit status feedback would have helped immensely.

For example, the “muted” status bezel could display a warning in case an alarm is active that may play a sound.

iPhone “muted” info bezel expanded with “Alarm at 8:00pm” warning text (https://uiobservatory.com/media/2012/iPhoneMuteSwitch_MutedBezelWithAlarmWarning.jpg)

For those of us who toggle the Silent mode with the device in our pockets, the iPhone could play different vibration patterns in response to flipping the mute switch.

I assume that the tactile feedback from physically flipping the switch provides sufficient status information about what that switch has been set to. Hence, I don’t think the reversal of the vibration pattern when de-activating silent-mode vibrations is necessary6.

User-testing notwithstanding, the vibration patterns could be modified to vibrate once when the phone is un-muted, stay quiet when it is muted, but vibrate thrice when the phone is muted while alarms with sounds are active.

In other words, “no vibration” means “no sound”. Either vibration means “sounds will play”, with the triple-vibe effectively operating like an exclamation mark: “Are you trying to silence me? Well, you better BE AWARE THAT I MIGHT PLAY SOUNDS, you know!”
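In code, the suggested feedback logic would be as simple as the following sketch (my proposal, not how any iPhone actually behaves):

```ts
// Proposed mute-switch feedback: stay silent only when the phone really will
// stay silent, and escalate to a triple buzz when "silent" is not quite true.
type VibrationFeedback = 'none' | 'single' | 'triple';

function feedbackForMuteSwitch(
  muted: boolean,
  audibleAlarmPending: boolean
): VibrationFeedback {
  if (!muted) return 'single';              // sounds will play
  if (audibleAlarmPending) return 'triple'; // muted, but an alarm will still sound
  return 'none';                            // truly silent
}
```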

“Least surprises” is good default behavior

When it comes to making decisions about a system’s default behavior, the “Principle of Least Surprise” is a useful guideline.

In my opinion, the “Incident at Avery Fisher Hall” is a sign that Apple’s design of the default behavior of the mute switch violates that principle.

Besides adding more explicit status indicators as outlined above, the — again: default — behavior of the mute switch should be to silence all of the phone’s sounds without any exceptions.

To allow users to customize this behavior for their personal preferences, I would test adding an option — “Play sounds in Silent mode” — to the Notification settings.

Every app that can play sounds while the phone is muted would gain this override option, with its default setting being “Off”.

With a standardized UI for this behavior, the user would only have to learn how to set this option for one application in order to understand how to set it for any and all others.

It would also get rid of the guesswork involved with the current design’s imprecise list of “Clock alarms, audio apps such as Music, and many games” by supporting a clear, unambiguous selection of which apps will blare away even during “silent” mode, and which are muted for good.
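A sketch of that override logic, with the setting name and its default taken from the proposal above rather than from iOS:

```ts
// Proposed per-app override with a default of "Off": while the phone is
// muted, an app may only play sounds if the user has explicitly allowed it.
interface AppSoundSettings {
  name: string;
  playSoundsInSilentMode: boolean; // the proposed Notification-settings option
}

function mayPlaySound(app: AppSoundSettings, phoneMuted: boolean): boolean {
  return !phoneMuted || app.playSoundsInSilentMode;
}

// With the default ("Off"), a Clock alarm stays quiet while the phone is muted:
mayPlaySound({ name: 'Clock', playSoundsInSilentMode: false }, true); // false
// Only an explicit opt-in lets an app break through Silent mode:
mayPlaySound({ name: 'Clock', playSoundsInSilentMode: true }, true);  // true
```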


  3. That, of course, was his mental model before his iPhone caused the interruption of the concert. I’m absolutely sure that Patron X has since revised it. 

  4. Unfortunately, none of the coverage I have read so far addresses how the unintended alarm happened to be set. The fact that it did happen, though, points to a potential UI problem in the Clock application: Why did the user not realize that they had set the alarm? What additional feedback or interactions would have been required to make them aware that they had just set an alarm (presumably) without actually intending to do so? 

  5. Incidentally, “Visibility of system status” is the top-most item on Jakob Nielsen’s list of “Ten Usability Heuristics”. 

  6. In fact, I don’t understand what the point of that reversal is, anyway.