Is the Menu of the Future Still a Menu?

The team behind Ubuntu Linux is on a mission to redefine how users issue commands to software applications. In a blog post entitled “Introducing the HUD. Say hello to the future of the menu.”, Mark Shuttleworth explains the approach they are researching.

The concept of seamlessly integrating an “intelligent” command line into a modern graphical user interface has been around for quite a while in the form of utilities like LaunchBar, Alfred, or Quicksilver.

These programs not only let you find files, but also let you search application-specific data like address book contacts, music tracks, and browser bookmarks, and apply meaningful operations to the search results.

While not quite as powerful, the system-wide search features in Windows 7 and Mac OS X also transcend simple searches based on the files’ names or their content.

What is new about the Ubuntu HUD is its scope: Instead of operating on just files and data, the HUD can also find and execute commands from the application’s menu.

Apple uses a similar approach with a text field inside the Help menu, which lets you search the entire menu structure of the currently active application as well as the app’s help file.

OS X's Help menu displaying its search text field, as well as menu items and relevant help file chapters for the current application based on the entered search term

The key difference between the two approaches is that Apple designed the menu search as an extension of the contextual Help system. As such it complements the application’s menu.

In contrast, the Ubuntu team intends their HUD interface to become a full replacement for the menu bar.

> Say hello to the Head-Up Display, or HUD, which will ultimately replace menus in Unity applications.

As much as I am intrigued by the Ubuntu HUD as such, getting rid of the menu metaphor completely — including keyboard shortcuts — is not just unnecessarily drastic. It is short-sighted and misguided for a number of reasons.

For sheer speed, keyboard shortcuts are hard to beat

When graphical user interfaces were in their infancy, keyboard shortcuts were “invented” to allow users to more quickly invoke commonly used menu commands.

Instead of opening a menu and selecting one of its items with the mouse, you press a combination of a special “command” modifier key and one or more additional keys.1

The key combination for a command is displayed next to its menu item, so you are reminded of it every time you select the command from the menu via a pointing device.

Once you’ve memorized a keyboard shortcut, pressing it takes a fraction of a second.

Compare this to the Undo operation in the “Introducing the HUD to Ubuntu” video (which you can watch embedded in Shuttleworth’s article or on YouTube): At 0:50 minutes into the video, you can observe how the user literally types “undo” into the HUD.

It takes the user about three seconds to issue that command!

This single interaction in the video provides sufficient proof that getting rid of keyboard shortcuts would be a seriously foolish move.

The HUD is modal, a keyboard shortcut isn’t (quite)

To ensure that key presses reach their intended destination — either text entry into a document or text field, or a menu command — the computer needs to be put into a temporary “command” mode when entering keyboard shortcuts.

The machine enters this mode when you press the Command key, and as soon as you let go of the key, it will leave the mode again. When you press the keys that make up a shortcut, Command is merely “first among equals”.

Once you’ve familiarized yourself with a shortcut, you will no longer “build it” key by key — “first, the Command key. Then the Shift key. And now press ‘S’, and there’s Save As…!”

Instead, you will “chord” the command, and it will feel like a single interaction step.

The HUD, in contrast, always requires at least three interaction steps:

  1. Pressing a key (or shortcut) to summon the HUD
  2. Pressing one or more keys to enter the search term
  3. Pressing a key to commit or cancel the selected command

Keyboard shortcuts are non-ambiguous and non-arbitrary

Depending on the search term you enter into the HUD, you may have to make a conscious selection from the list of search results that the HUD presents to you.

If you do have to make a selection beyond accepting or rejecting the “best match” that is automatically pre-selected for you, then that requires another interaction step, possibly consisting of pressing the up or down arrow keys multiple times.

In contrast, a keyboard shortcut always triggers one, and exactly one, command.

Command-P will always print the current document, but the top match that appears in the HUD when you enter a “P” may just as well be “Preferences”.

Speaking of the match between shortcut keys and commands: Shuttleworth claims that …

> Hotkeys are a sort of mental gymnastics, the HUD is a continuation of mental flow.

At least for basic shortcuts, that claim doesn’t hold water.

When assigning a key to a menu item, developers don’t make random picks. Instead, they carefully choose letters that have a meaningful correlation with the command, like “Open”, “Save”, “Print”, “Close Window”, or “Quit”.

Note, by the way, how you would start the corresponding HUD search with the exact same letters that are used for the shortcuts!

In cases such as these, where you know exactly which command to execute, the HUD does not offer advantages in terms of learning, recalling, or finding commands.

Some shortcuts use less intuitive letters, like Command-Z for undo, or Command-[comma] for opening an application’s preferences (on the Mac). Once you use them often enough, though, even they will become second nature over time, especially when they are among those commands that are standardized across the entire platform and, thus, trigger the same function in all programs.

Shortcuts and menus leverage motor memory

When you issue a shortcut like Command-Control-F, you will likely not have to consciously place the three fingers on their respective target keys. After you’ve gained enough practice in “chording” the shortcut, your fingers will move into place automatically.

The same holds true when selecting certain commands with the mouse. E.g., on a Mac, the “About [this application]” command’s position in the menu structure is standardized across the operating system.

Therefore, you know (from experience) that your first mouse pointer destination is “somewhere up there in the top-left corner, and it’s the menu directly to the right of that Apple thingy.” Your second destination is the first thing right underneath the menu’s title label.

Even though their positions aren’t standardized as strictly, many other often-used commands — for creating, opening, and saving files, for example — are found in similar locations in every application on the platform.2

Combined with motor memory, selecting a menu item this way can be surprisingly efficient.

Driving around with the mouse? All day!

Here is another quote from Shuttleworth’s article:

> So while there are modes of interaction where it’s nice to sit back and drive around with the mouse, we observe people staying more engaged and more focused on their task when they can keep their hands on the keyboard all the time.

For someone who mainly works with text, that may very well be true.

Watch someone work on non-text data, however, and you’ll observe that, for these people, keeping one hand on the mouse (or graphics tablet, etc.) and the other on the keyboard is their standard “mode of interaction”.

This approach is common for any work that involves a continuous mixture of entering keyboard commands and moving objects on the screen, regardless of whether these objects are shapes on a canvas in a graphics editor, audio and MIDI snippets in a track arrange window in a music recording program, or images in a photo management app.

In such cases, you manipulate the on-screen elements with one hand on your pointing device of choice, while using the other hand to enter keyboard shortcuts for copying, pasting, etc., or pressing modifier keys for changing the cursor behavior from dragging to rotation, for example.

Consequently, assuming that keeping both hands on the keyboard at all times is the optimum solution for every type of user is pure nonsense.

For anyone who spends the better part of their working day inside applications like Photoshop, Pro Tools, or Aperture, it would be a nightmare to be forced to use a command line instead of being able to concurrently combine keyboard shortcuts with the extensive use of a pointing device.

The Ubuntu HUD’s content isn’t optimized for its use

When you watch the demo video, keep a close eye on the search results in the HUD. In its current implementation, the HUD provides just a different view on the application’s menu structure. Its output is not optimized for use in this UI control.

For example, at 1:55, the user enters “alic” into the HUD to search for “Alice in Wonderland”. The matching result is listed as “Tools > Bookmarks > Alice in Wonderland”.

What average computer user thinks of an audio track as a bookmark? And how is a bookmark related to tools?

In cases like this, displaying the full menu structure that encloses the “Alice in Wonderland” track does not add any useful information. In fact, it even makes decoding that search match confusing.

Compare that to how LaunchBar displays a music track: All you see is the track name.

LaunchBar displaying search results as a simple list with a single line of text per item, preceded by an icon indicating the item's data type

The search results’ types are communicated by simple, reasonably intuitive icons. Unlike the Ubuntu HUD, LaunchBar hides the found items’ meaningless taxonomical overhead from the user.

Using the left and right arrow keys, you can further explore the results list. Moving “right” from the “Cyberfunk Acoustic Revenge” album, you get to see all tracks on that album.

List of song titles displayed by LaunchBar after navigating into the selected album

Digging into the selected track, you are presented with the track’s artist, the album that it’s on, and its genre.

Further details for the selected track displayed by LaunchBar, including its artist, album, and genre

The core difference between the Ubuntu HUD (as it is working now) and LaunchBar is that the former is based on a rigid menu structure, whereas the latter searches arbitrary data, and presents it in a way that is meaningful and highly accessible to the user.

Instead of mapping a flattened menu tree structure into a linear text list as demonstrated in the video, the HUD should display its information from a task perspective (see the sketch after this list) by:

  • displaying contextual information when it is useful (“History > Planet Ubuntu” ⇒ “Browser History > Planet Ubuntu”),

  • reducing redundancy (“Bookmarks > Bookmark This Page” ⇒ “Bookmark This Page” or “Edit > Undo Fuzzy Glow” ⇒ “Undo Fuzzy Glow”), and

  • speaking the user’s language (“Tools > Bookmarks > Alice in Wonderland” ⇒ “Play Music: Alice in Wonderland”).
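
To make the idea concrete, here is a minimal sketch, in Python, of what such a relabeling layer might look like. None of this is Ubuntu’s actual HUD code; the rule tables and the `humanize` function are hypothetical and merely mirror the three suggestions above.

```python
# A hypothetical sketch of a task-oriented relabeling layer for HUD results.
# The rule tables and function below are not part of Ubuntu's HUD; they are
# invented here purely to mirror the three suggestions above.

TASK_REWRITES = {"Tools > Bookmarks > ": "Play Music: "}   # speak the user's language
CONTEXT_REWRITES = {"History > ": "Browser History > "}    # add useful context
REDUNDANT_PREFIXES = ("Edit > ", "Bookmarks > ")           # drop pure menu noise


def humanize(menu_path: str) -> str:
    """Turn a raw 'Menu > Submenu > Item' path into a task-oriented label."""
    for prefix, replacement in TASK_REWRITES.items():
        if menu_path.startswith(prefix):
            return replacement + menu_path[len(prefix):]
    for prefix, replacement in CONTEXT_REWRITES.items():
        if menu_path.startswith(prefix):
            return replacement + menu_path[len(prefix):]
    for prefix in REDUNDANT_PREFIXES:
        if menu_path.startswith(prefix):
            return menu_path[len(prefix):]
    return menu_path


if __name__ == "__main__":
    for path in ("History > Planet Ubuntu",
                 "Bookmarks > Bookmark This Page",
                 "Edit > Undo Fuzzy Glow",
                 "Tools > Bookmarks > Alice in Wonderland"):
        print(f"{path!r:45} -> {humanize(path)!r}")
```

In a real implementation, such rules would of course have to come from the applications themselves (or their metadata) rather than a hard-coded table.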

In with the new, but do keep the old!

I applaud Ubuntu’s efforts to come up with new ways of interacting with “The Machine”.

The HUD has the potential to combine the best from advanced search technologies like Apple’s Spotlight, smart command line interfaces like LaunchBar or Quicksilver, and new ways to access an application’s menu commands into a single, extremely powerful, yet usable interface element.

Shuttleworth is probably right when he claims that:

> the HUD is faster than mousing through a menu, and easier to use than hotkeys since you just have to know what you want, not remember a specific key combination.

Conversely, though, the HUD isn’t easier to use than a menu, nor is it faster than hotkeys.

To use the HUD effectively, you need to know in advance which commands it understands, whereas a menu lets you browse (and sometimes search) for the commands and functions it makes available.

And while you have to memorize a hotkey in order to use it effectively, it is much faster to access than entering a command via the HUD.

Therefore, I hope the powers that be at Ubuntu will revise the decision to completely tear out the support for classic menus from their operating system.

Instead, they should let their users decide whether the menu bar is displayed in its standard on-screen location, whether it’s stashed away in the panel, or whether it should be fully hidden from the user’s eye.

Sometimes, new ideas aren’t good enough to fully replace the old ones. But, more often than not, they’re just right to complement and extend them.


  1. The Mac has a dedicated Command key, but other operating systems’ use of the Control key works just as well. For brevity and convenience, I will use the term “Command key” in this article. 

  2. Unless, of course, an app’s developer lacks the expertise required to properly design a native application for a specific operating system or window manager. 

The iPhone Mute Switch Conundrum

If you’re at all interested in the mobile phone ecosystem, you will likely have heard about a recent incident at Avery Fisher Hall: The conductor of the New York Philharmonic Orchestra stopped a concert performance because of an alarm ringing on an iPhone whose owner thought he had properly silenced the device.

The New York Times has all the details.

The incident received lots of attention on the Internet, with many a commenter accusing the iPhone’s owner — who has kindly been anonymized as “Patron X” — of being simply too stupid to use a “smart” phone.

As a user interface designer, you can view this unfortunate incident as a real-life usability test. And the iPhone failed this test. Yes, the phone failed, and not the user.

The iPhone fails a usability test

Here is a key passage from the New York Times article:

> [Patron X] said he made sure to turn [the iPhone] off before the concert, not realizing that the alarm clock had accidentally been set and would sound even if the phone was in silent mode.
>
> “I didn’t even know phones came with alarms,” the man said.

This user was expecting the mute switch to completely silence the phone.

Since he had just gotten the iPhone the day before the concert, he had to rely on his intuition to make sense of the device’s controls and related on-screen messages.

The mental model he came up with for the mute switch was very simple: “This thing switches the phone’s speaker on or off.”3

His exact thoughts might have been different, but the important part is that he was not expecting the general function of the switch to be any more complex than that.

“It’s ‘all sounds on’ or ‘all sounds off’. Got that!”

This is a perfectly sound expectation of how the mute switch works. Here is the reason why: there is no indication in the iPhone’s UI that would contradict this straightforward expectation. I’ll explain that in a minute.

A conflict of expectations, not a conflict of commands

Marco Arment is among those who disagree.

He argues that the iPhone switch works just as it should, and offers this explanation for what went wrong (emphasis his):

> The user told the iPhone to make noise by either scheduling an alarm or initiating an obviously noise-playing feature in an app.
>
> The user also told the iPhone to be silent with the switch on the side.
>
> The user has issued conflicting commands, and the iPhone can’t obey both.

Alas, he’s wrong. Well, kind of.

Marco takes the point of view of a programmer. From that perspective, there are, indeed, conflicting commands that the designers of the iPhone’s user interface had to foresee and address.

Keep in mind, though, that Patron X was not only unaware that an alarm had been set unintentionally in the iPhone’s Clock app.4 He claims that he did not even know that the feature existed on his phone.

The expectation of the user clearly was this: “When I flip this switch, the phone won’t make any noise whatsoever during this concert.”

Therefore, from the user’s perspective, only a single, non-ambiguous command was given: Stay quiet, no matter what!

This is not a case of conflicting commands. Instead, it is a matter of conflicting expectations of how the device operates, or should operate. A classic example of the user’s mental model not matching the one intended by the device’s designers.

Managing user expectations through status visibility

According to page 11 of the iPhone User Guide (iOS 5 version), the “Ring/Silent switch” operates as follows:

> In ring mode, iPhone plays all sounds. In silent mode, iPhone doesn’t ring or play alerts and other sound effects.
>
> Important: Clock alarms, audio apps such as Music, and many games still play sounds through the built-in speaker when iPhone is in silent mode.

Alas, the iPhone’s user interface does not make the user aware of these exceptions at all.

To understand the behavior of the mute switch, you must read the user’s manual. And to apply this understanding, you have to memorize it.

Let’s put this into perspective: A premium “smart” mobile phone, whose user interface is considered by some to be the epitome of user-friendliness, requires you to memorize configuration details, because the device does not properly communicate its system status.5

If you take a closer look at the status feedback the iPhone provides with regard to the mute switch, you will find that it does not take alarm settings into account.

When you mute the phone, you feel a single vibration burst, and this info bezel appears on the iPhone’s screen:

Screen of a locked iPhone, displaying the “muted” info bezel

Un-mute the iPhone, and you get to see this (without any vibration):

Screen of a locked iPhone, displaying the “un-muted” info bezel

The only setting I could find that affects this behavior is the Vibrate option in the Sounds settings. Un-check that option, and the vibration feedback is now reversed: the phone vibrates when you un-mute it, and stays quiet when you mute it.

The icons shown above look exactly the same, however, regardless of whether an alarm is set, or not.

There is an indication that an alarm may sound: the clock icon in the status bar. Like the bell icons, however, it has the exact same shape and color, regardless of whether any of the active alarms will actually “make a noise”.

Screen of a locked iPhone, with an arrow highlighting the “clock” alarms icon in the status bar

Therefore, if you absolutely, positively must make sure that your iPhone stays quiet while it is muted and the Alarm icon is showing, you will need to open the Clock app and check the list of alarms.

Which list, by the way, again fails to provide visual cues as to which alarms have a sound assigned to them, and which are vibration-only. You can only check this by entering edit mode, tapping an alarm, and checking the Sound setting. Seeing a corresponding icon right in the list items would be useful here.

List of alarms in the Clock app on an iPhone

Of course, you can switch the iPhone off completely, but that would make the very existence of the Ring/Silent switch rather moot. It would also prevent you from quickly looking up a piece of information on the iPhone due to the time it takes the iPhone to cold-start. And if you did do so, you may just be surprised by an alarm going off the moment your log-in screen appears …

As a third option, you could try to memorize which alarms you have set for which times, and whether any of these will play a sound, or not. That is a “suggestion” that you hear a lot from the commenters attacking Patron X over his mishap: “He should have remembered that he had set an alarm that would go off during the concert.”

I cannot begin to explain how stunned I am by these people’s notion that it is perfectly acceptable for a high-tech 21st-century digital device to offload memorizing system status information to its user in this manner.

When designers do not expect their users’ expectations

Given that the iPhone does not present any status feedback — neither visual, nor audible, nor tactile — that it may play a sound while muted, and that a simple control for a simple function should cause the least possible surprise when operated, I think it is perfectly valid and appropriate for a non-expert user of an iPhone to expect the mute switch to work exactly like Patron X thought it would.

I’m not arguing that this simplistic behavior is the most appropriate or the most useful for average users, mind you!

What I am saying, though, is that it is the most intuitive behavior, i.e., the behavior that a non-expert user would most likely expect to see.

Do tell, little iPhone, what are you up to?

Without hands-on user testing, it is not possible to say which changes to the iPhone’s UI would have been necessary to prevent Patron X from making his mistake.

I have no doubt, though, that more explicit status feedback would have helped immensely.

For example, the “muted” status bezel could display a warning in case an alarm is active that may play a sound.

iPhone “muted” info bezel expanded with “Alarm at 8:00pm” warning text

For those of us who toggle Silent mode with the device in our pockets, the iPhone could play different vibration patterns in response to flipping the mute switch.

I assume that the tactile feedback from physically flipping the switch provides sufficient status information about what that switch has been set to. Hence, I don’t think the reversal of the vibration pattern when de-activating silent-mode vibrations is necessary6.

User-testing notwithstanding, the vibration patterns could be modified to vibrate once when the phone is un-muted, stay quiet when it is muted, but vibrate thrice when the phone is muted while alarms with sounds are active.

In other words, “no vibration” means “no sound”. Either vibration pattern means “sounds will play”, with the triple-vibe effectively operating like an exclamation mark: “Are you trying to silence me? Well, you better BE AWARE THAT I MIGHT PLAY SOUNDS, you know!”
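
As a rough illustration, here is a minimal sketch of that decision logic in Python. This is not how iOS works today; the `Alarm` type and the `vibration_pulses` function are invented purely to make the suggested behavior explicit.

```python
# A hypothetical sketch of the proposed vibration feedback for the Ring/Silent
# switch. This is not Apple's implementation; the Alarm type and the
# vibration_pulses function are invented to make the suggested behavior explicit.

from dataclasses import dataclass


@dataclass
class Alarm:
    enabled: bool
    has_sound: bool   # False for vibration-only alarms


def vibration_pulses(muted: bool, alarms: list[Alarm]) -> int:
    """Number of vibration pulses to play right after the switch is flipped."""
    if not muted:
        return 1                                        # sounds will play
    if any(a.enabled and a.has_sound for a in alarms):
        return 3                                        # muted, but an audible alarm is still armed
    return 0                                            # no vibration means no sound


# Example: muting the phone while an audible 8:00 pm alarm is still armed
print(vibration_pulses(muted=True, alarms=[Alarm(enabled=True, has_sound=True)]))  # 3
```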

“Least surprises” is good default behavior

When it comes to making decisions about a system’s default behavior, the “Principle of Least Surprise” is a useful guideline.

In my opinion, the “Incident at Avery Fisher Hall” is a sign that Apple’s design of the default behavior of the mute switch violates that principle.

Besides adding more explicit status indicators as outlined above, the — again: default — behavior of the mute switch should be to silence all of the phone’s sounds without any exceptions.

To allow users to customize this behavior for their personal preferences, I would test adding an option — “Play sounds in Silent mode” — to the Notification settings.

Every app that can play sounds while the phone is muted would gain this override option, with its default setting being “Off”.

With a standardized UI for this behavior, the user would only have to learn how to set this option for one application in order to understand how to set it for any and all others.

It would also get rid of the guesswork involved with the current design’s imprecise list of “Clock alarms, audio apps such as Music, and many games” by supporting a clear, unambiguous selection of which apps will blare away even during “silent” mode, and which are muted for good.
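
Here is a minimal sketch of what such a per-app override could look like, assuming a hypothetical “play_sounds_in_silent_mode” flag in each app’s notification settings; none of these names exist in iOS, they only illustrate the proposal.

```python
# A minimal sketch of the suggested per-app override, assuming a hypothetical
# "play_sounds_in_silent_mode" flag in each app's notification settings.
# None of these names exist in iOS; they only illustrate the proposal.

DEFAULT_OVERRIDE = False   # out of the box, Silent mode silences everything

notification_settings = {
    "Clock": {"play_sounds_in_silent_mode": True},    # the user explicitly opted in
    "Music": {"play_sounds_in_silent_mode": False},
    # apps without an entry fall back to the default
}


def may_play_sound(app: str, phone_muted: bool) -> bool:
    """Decide whether an app is currently allowed to play sound."""
    if not phone_muted:
        return True
    return notification_settings.get(app, {}).get(
        "play_sounds_in_silent_mode", DEFAULT_OVERRIDE)


print(may_play_sound("Clock", phone_muted=True))   # True: the user opted in
print(may_play_sound("Games", phone_muted=True))   # False: stays quiet by default
```

The point is simply that one explicit, per-app flag with a safe default would replace the current guesswork about which categories of apps may break through Silent mode.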


  3. That, of course, was his mental model before his iPhone caused the interruption of the concert. I’m absolutely sure that Patron X has since revised it. 

  4. Unfortunately, none of the coverage I have read so far addresses how the unintended alarm happened to be set. The fact that it did happen, though, points to a potential UI problem in the Clock application: Why did the user not realize that they had set the alarm? What additional feedback or interactions would have been required to make them aware that they had just set an alarm (presumably) without actually intending to do so? 

  5. Incidentally, “Visibility of system status” is the top-most item on Jakob Nielsen’s list of “Ten Usability Heuristics”. 

  6. In fact, I don’t understand what the point of that reversal is, anyway. 

A User-Customizable User Guide

Manuals that ship with hardware appliances often contain instructions in more than one language. Such multi-lingual documentation simplifies logistics and reduces costs, because only a single document needs to be handled during production, and a single product package can be sold in multiple countries.

For the users, however, these manuals come at a price. They can be more tedious to navigate than a single-language document, and they require more storage space.

When I recently unpacked an electronics gadget, the manual inside the box was folded into a dense wad of paper, almost the size of a child’s fist.

Tightly folded wad of paper that makes up the user manual

At first sight, it reminded me of an oversized version of the information flyers that you find in medicine packages — and whose folding pattern is so complex that you can never get it back into the box.

Unexpectedly, the wad unfolded into several individual double-sided sheets, each of which contains instructions in two languages.

Stack of unfolded individual paper sheets making up the user manual

Two perforation lines run from top to bottom along the sheets’ centers.

Close-up of the two parallel perforation lines along the center of a sheet from the manual

Thanks to the perforation, you can rip each sheet in two, so that you end up with a customized manual that only contains instructions in your own language.

Single sheet from the manual in front of a stack of the remaining sheets in other languages

This design is a double-edged sword, because you throw away the majority of the printed document, so that the waste of resources — paper, ink, and energy — is the same as with bound booklets.

Customizing the document in this manner does make it more usable, though, because it reduces its content scope and physical size to exactly what you need.

I wonder if there is a way to apply this approach to larger documents whose contents exceed a single page per language.