Category Archives: multi-touch

Paper: Experimental Study of Stroke Shortcuts for a Touchscreen Keyboard with Gesture-Redundant Keys Removed

Text Entry on Touchscreen Keyboards: Less is More?

When we go from mechanical keyboards to touchscreens we inevitably lose something in the translation. Yet the proliferation of tablets has led to widespread use of graphical keyboards.

You can’t blame people for demanding more efficient text entry techniques. This is the 21st century, after all, and intuitively it seems like we should be able to do better.

While we can’t reproduce that distinctive smell of hot metal from mechanical keys clacking away at a typewriter ribbon, the presence of the touchscreen lets keyboard designers play lots of tricks in pursuit of faster typing performance. Since everything is just pixels on a display it’s easy to introduce non-standard key layouts. You can even slide your finger over the keys to shape-write entire words in a single swipe, as pioneered by Per Ola Kristensson and Shumin Zhai (their SHARK keyboard was the predecessor for Swype and related techniques).

While these types of tricks can yield substantial performance advantages, they also often demand a considerable investment in skill acquisition from the user before significant gains can be realized. In practice, this limits how many people will stick with a new technique long enough to realize such gains. The Dvorak keyboard offers a classic example: the balance of evidence suggests it’s slightly faster than QWERTY, but the high cost of switching to and learning the new layout just isn’t worth it for most people.

In this work, we explored the performance impact of an alternative approach that builds on people’s existing touch-typing skills with the standard QWERTY layout.

And we do this in a manner that is so transparent, most people don’t even realize that anything is different at first glance.

Can you spot the difference?

Snap quiz time

The research-prototype keyboard for touchscreen tablets.

What’s wrong with this keyboard?  Give it a quick once-over. It looks familiar, with the standard QWERTY layout, but do you notice anything unusual? Anything out of place?

Sure, the keys are arranged in a grid rather than the usual staggered key pattern, but that’s not the “key” difference (so to speak). That’s just an artifact of our quick ‘n’ dirty design of this research-prototype keyboard for touchscreen tablets.

Got it figured out?

All right. Pencils down.

Time to check your score. Give yourself:

  • One point if you noticed that there’s no space bar.
  • Two points if you noticed that there’s no Enter key, either.
  • Three points if the lack of a Backspace key gave you palpitations.
  • Four points and a feather in your cap if you caught the Shift key going AWOL as well.

Now, what if I also told you that removing four essential keys from this keyboard, rather than harming performance, actually helps you type faster?

ONE TRICK TO WOO THEM ALL

All we ask of people coming to our touchscreen keyboard is to learn one new trick. After all, we have to make up for the summary removal of Space, Backspace, Shift, and Enter somehow. We accomplish this by augmenting the graphical touchscreen keyboard with stroke shortcuts, i.e., short straight-line finger swipes, as follows (a rough classification sketch appears after the list):

  • Swipe right, starting anywhere on the keyboard, to enter a Space.
  • Swipe left to Backspace.
  • Swipe upwards from any key to enter the corresponding shift-symbol. Swiping up on the a key, for example, enters an uppercase A; stroking up on the 1 key enters the ! symbol; and so on.
  • Swipe diagonally down and to the left for Enter.
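
To make the mapping concrete, here is a minimal sketch of how a touch might be classified as one of the four stroke shortcuts versus an ordinary key tap. The minimum stroke length and the angular sectors below are illustrative assumptions, not the exact values used in our prototype:

    import math

    MIN_STROKE_PX = 40  # assumed threshold: shorter movements are treated as taps

    def classify_touch(x0, y0, x1, y1):
        """Map a touch from finger-down (x0, y0) to finger-up (x1, y1) onto an action."""
        dx, dy = x1 - x0, y1 - y0                        # screen coordinates: +y points down
        if math.hypot(dx, dy) < MIN_STROKE_PX:
            return "TAP"                                 # fall through to normal key lookup
        angle = math.degrees(math.atan2(-dy, dx)) % 360  # 0 = right, 90 = up
        if angle < 45 or angle >= 315:
            return "SPACE"       # swipe right, starting anywhere on the keyboard
        if 135 <= angle < 225:
            return "BACKSPACE"   # swipe left
        if 45 <= angle < 135:
            return "SHIFT"       # enter the shift-symbol of the key under (x0, y0)
        return "ENTER"           # downward strokes; the prototype presumably reserves down-and-left

    print(classify_touch(100, 200, 180, 205))   # a rightward swipe -> SPACE

In the actual keyboard, the key under the starting point of an upward stroke determines which shifted symbol is entered, exactly as described in the list above.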

The stroke-shortcut overlay, shown under a finger resting on the keyboard.

DESIGN PROPERTIES OF A STROKE-AUGMENTED GRAPHICAL KEYBOARD

In addition to possible time-motion efficiencies of the stroke shortcuts themselves, the introduction of these four gestures, and the elimination of the corresponding keys made redundant by them, yields a graphical keyboard with a number of interesting properties:

  • Allowing the user to input stroke gestures for Space, Backspace, and Enter anywhere on the keyboard eliminates fine targeting motions as well as any round-trips necessary for a finger to acquire the corresponding keys.
  • Instead of requiring two separate keystrokes—one to tap Shift and another to tap the key to be shifted—the Shift gesture combines these into a single action: the starting point selects a key, while the stroke direction selects the Shift function itself.
  • Removing these four keys frees an entire row on the keyboard.
  • Almost all of the numeric, punctuation, and special symbols typically relegated to the secondary and tertiary graphical keyboards can then be fit in a logical manner into the freed-up space.
  • Hence, the full set of characters can fit on one keyboard while holding the key size, number of keys, and footprint constant.
  • By having only a primary keyboard, this approach affords an economy of design that simplifies the interface, while offering further potential performance gains via the elimination of keyboard switching costs—and the extra key layouts to learn.
  • Although the strokes might reduce round-trip costs, we expect articulating the stroke gesture itself to take longer than a tap. Thus, we need to test these tradeoffs empirically.

RESULTS AND PRELIMINARY CONCLUSIONS

Our studies demonstrated that overall the removal of four keys—rather than coming at a cost—offers a net benefit.

Specifically, our experiments showed that a stroke keyboard with the gesture-redundant keys removed yielded a 16% performance advantage for input phrases containing mixed-case alphanumeric text and special symbols, without sacrificing error rate. We observed these performance advantages from the first block of trials onward.

Even in the case of entirely lowercase text—that is, in a context where we would not expect to observe a performance benefit because only the Space gesture offers any potential advantage—we found that our new design still performed as well as a standard graphical keyboard. Moreover, people learned the design with remarkable ease: 90% wanted to keep using the method, and 80% believed they typed faster than on their current touchscreen tablet keyboard.

Notably, our studies also revealed that it is necessary to remove the keys to achieve these benefits from the gestural stroke shortcuts. If both the stroke shortcuts and the keys remain in place, user hesitancy about which method to use undermines any potential benefit. Users, of course, also learn to use the gestural shortcuts much more quickly when they offer the only means of achieving a function.

Thus, in this context, less is definitely more in achieving faster performance for touchscreen QWERTY keyboard typing.

The full results are available in the technical paper linked below. The paper contributes a careful study of stroke-augmented keyboards, filling an important gap in the literature and demonstrating the efficacy of a specific design; shows that removing the gesture-redundant keys is a critical design choice; and establishes that stroke shortcuts can be effective in the context of multi-touch typing with both hands, even though previous studies with single-point stylus input had cast doubt on this approach.

Although our studies focus on the immediate end of the usability spectrum (as opposed to longitudinal studies over many input sessions), we believe the rapid returns demonstrated by our results illustrate the potential of this approach to improve touchscreen keyboard performance immediately, while also serving to complement other text-entry techniques such as shape-writing in the future.

Arif, A. S., Pahud, M., Hinckley, K., and Buxton, B., Experimental Study of Stroke Shortcuts for a Touchscreen Keyboard with Gesture-Redundant Keys Removed. In Proc. Graphics Interface 2014 (GI’14). Canadian Information Processing Society, Toronto, Ont., Canada. Montreal, Quebec, Canada, May 7-9, 2014. Received the Michael A. J. Sweeney Award for Best Student Paper. [PDF] [Talk Slides (.pptx)] [Video .MP4] [Video .WMV]

Watch A Touchscreen Keyboard with Gesture-Redundant Keys Removed video on YouTube

Paper: Motion and Context Sensing Techniques for Pen Computing

I continue to believe that stylus input — annotations, sketches, mark-up, and gestures — will be an important aspect of interaction with slate computers in the future, particularly when used effectively and convincingly with multi-modal pen+touch input. It also seems that every couple of years I stumble across an interesting new use or set of techniques for motion sensors, and this year proved to be no exception.

Thus, it should come as no surprise that my latest project has continued to push in this direction, exploring the possibilities for pen interaction when the physical stylus itself is augmented with inertial sensors including three-axis accelerometers, gyros, and magnetometers.

The sensor pen hardware.

In recent years such sensors have become integrated with all manner of gadgets, including smart phones and tablets, and it is increasingly common for microprocessors to include such sensors directly on the die. Hence in my view of the world, we are just at the cusp of sensor-rich stylus devices becoming  commercially feasible, so it is only natural to consider how such sensors afford new interactions, gestures, or context-sensing techniques when integrated directly with an active (powered) stylus on pen-operated devices.

In collaboration with Xiang ‘Anthony’ Chen and Hrvoje Benko I recently published a paper exploring motion-sensing capabilities for electronic styluses, which takes a first look at some techniques for such a device. With some timely help from Tom Blank’s brilliant devices team at Microsoft Research, we built a custom stylus — fully wireless and powered by an AAAA battery — that integrates these sensors.

The techniques range from very simple but clever things, such as reminding the user if they have left the pen behind (a common problem with pen-based devices), to fun new techniques that emulate physical media, such as the gesture of striking a loaded brush against one’s finger, as painters do with water media.

Ink spatter produced by striking a loaded brush against a finger.
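
As a rough illustration of the simplest of these, the pen-left-behind reminder boils down to a check along the following lines. This is only a sketch of one plausible trigger condition, not the logic from our actual implementation:

    def pen_left_behind(tablet_in_motion, pen_in_radio_range, seconds_since_pen_moved,
                        stillness_threshold_s=30):
        """One plausible trigger: the tablet is being carried away while the pen has
        dropped off the radio link or has been lying motionless for a while."""
        pen_abandoned = (not pen_in_radio_range) or (seconds_since_pen_moved >= stillness_threshold_s)
        return tablet_in_motion and pen_abandoned

    # Example: walking off with the tablet while the pen still sits on the desk.
    print(pen_left_behind(tablet_in_motion=True, pen_in_radio_range=True,
                          seconds_since_pen_moved=120))   # -> True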

Check out the video below for an overview of these and some of the other techniques we have come up with so far, or read more about it in the technical paper linked below.

We are continuing to work in this area, and have lots more ideas that go beyond what we were able to accomplish in this first stage of the project, so stay tuned for future developments along these lines.

Hinckley, K., Chen, X., and Benko, H., Motion and Context Sensing Techniques for Pen Computing. In Proc. Graphics Interface 2013 (GI’13). Canadian Information Processing Society, Toronto, Ont., Canada. Regina, Saskatchewan, Canada, May 29-31, 2013. [PDF] [video - MP4].

Watch Motion and Context Sensing Techniques for Pen Computing video on YouTube

GroupTogether — Exploring the Future of a Society of Devices

My latest paper discussing the GroupTogether system just appeared at the 2012 ACM Symposium on User Interface Software & Technology in Cambridge, MA.

GroupTogether video available on YouTube

I’m excited about this work — it really looks hard at what some of the next steps in sensing systems might be, particularly when one starts considering how users can most effectively interact with one another in the context of the rapidly proliferating Society of Devices we are currently witnessing.

I think our paper on the GroupTogether system, in particular, does a really nice job of exploring this with strong theoretical foundations drawn from the sociological literature.

F-formations are the various types of small groups that people form when engaged in a joint activity.

GroupTogether starts by considering the natural small-group behaviors adopted by people who come together to accomplish some joint activity. These small groups can take a variety of distinctive forms, and are known collectively in the sociological literature as f-formations. Think of those distinctive circles of people that form spontaneously at parties: typically they are limited to a maximum of about 5 people, the orientation of the participants clearly defines an area inside the group that is distinct from the rest of the environment outside the group, and there are fairly well-established social protocols for people entering and leaving the group.

A small group of two users as sensed via GroupTogether’s overhead Kinect depth-cameras.

GroupTogether also senses the subtle orientation cues of how users handle and posture their tablet computers. These cues are known as micro-mobility, a communicative strategy that people often employ with physical paper documents, such as when a sales representative orients a document toward you to direct your attention and indicate that it is your turn to sign.

Our system, then, is the first to put small-group f-formations, sensed via overhead Kinect depth-camera tracking, in play simultaneously with the micro-mobility of slate computers, sensed via embedded accelerometers and gyros.

The GroupTogether prototype sensing environment and set-up

GroupTogether uses f-formations to give meaning to the micro-mobility of slate computers. It understands which users have come together in a small group, and which users have not. So you can just tilt your tablet towards a couple of friends standing near you to share content, whereas another person who may be nearby but facing the other way — and thus clearly outside of the social circle of the small group — would not be privy to the transaction. Thus, the techniques lower the barriers to sharing information in small-group settings.
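
In pseudocode terms, the core decision is simple, even if the sensing behind it is not. The sketch below is an illustrative reconstruction of that decision, with made-up data structures and a made-up tilt threshold rather than the actual GroupTogether code:

    TILT_TOWARD_THRESHOLD_DEG = 25   # assumed pitch that counts as "tilting toward the group"

    def share_targets(sender, f_formations, tilt_angle_deg):
        """Decide who receives content when `sender` tilts a tablet.

        f_formations: a list of sets of user ids, one set per small group
        sensed by the overhead depth cameras.
        """
        if tilt_angle_deg < TILT_TOWARD_THRESHOLD_DEG:
            return set()                    # not a deliberate micro-mobility gesture
        for group in f_formations:
            if sender in group:
                return group - {sender}     # share with the rest of the sender's group
        return set()                        # sender is not currently in any f-formation

    # Ann and Bob form a group; Carol stands nearby but faces away.
    groups = [{"ann", "bob"}, {"carol"}]
    print(share_targets("ann", groups, tilt_angle_deg=30))   # -> {'bob'}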

Check out the video to see what these techniques look like in action, as well as to see how the system also considers groupings of people close to situated displays such as electronic whiteboards.

The full text of our scientific paper on GroupTogether and the citation is also available.

My co-author Nic Marquardt was the first author and delivered the talk. Saul Greenberg of the University of Calgary also contributed many great insights to the paper.

Image credits: Nic Marquardt

Paper: Cross-Device Interaction via Micro-mobility and F-formations (“GroupTogether”)

Marquardt, N., Hinckley, K., and Greenberg, S., Cross-Device Interaction via Micro-mobility and F-formations. In ACM UIST 2012 Symposium on User Interface Software and Technology (UIST ’12). ACM, New York, NY, USA. Cambridge, MA, Oct. 7-10, 2012, pp. (TBA). [PDF] [video - WMV]. Known as the GroupTogether system.

See also my post with some further perspective on the GroupTogether project.

Watch the GroupTogether video on YouTube

Paper: Informal Information Gathering Techniques for Active Reading

This is my latest project, which I will present tomorrow (May 9th) at the CHI 2012 Conference on Human Factors in Computing Systems.

I’ll have a longer post up about this project after I return from the conference, but for now enjoy the video. I also link to the PDF of our short paper below which has a nice discussion of the motivation and design rationale for this work.

Above all else, I hope this work makes clear that there is still tons of room for innovation in how we interact with the e-readers and tablet computers of the future– as well as in terms of how we consume and manipulate content to produce new creative works.

Hinckley, K., Bi, X., Pahud, M., and Buxton, B., Informal Information Gathering Techniques for Active Reading. 4pp Note. In Proc. CHI 2012 Conf. on Human Factors in Computing Systems, Austin, TX, May 5-10, 2012. [PDF]

[Watch Informal Information Gathering Techniques for Active Reading on YouTube]

Paper: CodeSpace: Touch + Air Gesture Hybrid Interactions for Supporting Developer Meetings

Bragdon, A., DeLine, R., Hinckley, K., and Morris, M. R., Code Space: Touch + Air Gesture Hybrid Interactions for Supporting Developer Meetings. In Proc. ACM International Conference on Interactive Tabletops and Surfaces (ITS ’11). ACM, New York, NY, USA. Kobe, Japan, November 13-16, 2011, pp. 212-221. [PDF] [video - WMV]. As featured on Engadget and many other online forums.

Watch CodeSpace video on YouTube

Paper: Enhancing Naturalness of Pen-and-Tablet Drawing through Context Sensing

Sun, M., Cao, X., Song, H., Izadi, S., Benko, H., Guimbretiere, F., Ren, X., and Hinckley, K., Enhancing Naturalness of Pen-and-Tablet Drawing through Context Sensing. In Proc. ACM International Conference on Interactive Tabletops and Surfaces (ITS ’11). ACM, New York, NY, USA. Kobe, Japan, November 13-16, 2011, pp. 212-221. [PDF] [video - WMV].

Watch Enhancing Naturalness of Pen through Context Sensing video on YouTube

Classic Post: The Hidden Dimension of Touch

I’ve had a number of conversations with people recently about the new opportunities for mobile user interfaces afforded by the increasingly sophisticated sensors integrated with hand-held devices.

I’ve been doing research on sensors on and off for over twelve years now, and it’s a topic I keep coming back to every few years. The possibilities offered by these sensors have never been more promising. They will increasingly be integrated right on the microchip with all the other specialized computational units, so they are only going to become more widespread, to the point that it will be practically impossible to buy a mobile gadget of any sort that doesn’t contain them. In practical terms there will be no incremental cost to include the sensors; it’s just a matter of smart software to take advantage of them and enrich the user experience.

I continue to be excited about this line of work and think there’s a lot more that could be done to leverage these sensors. In particular, I believe the possibilities afforded by modern high-precision gyroscopes– and their combination with other sensors and input modalities– are not yet well-understood. And I believe the whole area of contextual sensing in general remains rich with untapped possibilities.

I posted about this on my old blog a while back, but I definitely wanted to make this post available here as well, so here it is. If you just want to cut to the chase, I’ve embedded the video demonstration at the bottom of the post.

The Hidden Dimension of Touch

What’s the gesture of one hand zooming?

This might seem like a silly question, but it’s not. The beloved multi-touch pinch gesture is ubiquitous, but it’s almost impossible to articulate with one hand. Need to zoom in on a map, or a web page? Are you using your phone while holding a bunch of shopping bags, or the hand of your toddler?

Well then, you’re a better man than I am if you can zoom in without dropping your darned phone on the pavement. You gotta hold it in one hand, and pinch with the other, and that ties up both hands.  Oh, sure, you can double-tap the thing, but that doesn’t give you much control, and you’ll probably just tap on some link by mistake anyway.

So what do you do? What’s the gesture of one hand zooming?

Well, I found that if you want an answer to that, first you have to break out of the categorical mindset that seems to pervade so much of mainstream thinking, the invisible cubicle walls that we place around our ideas and our creativity without even realizing it. And Exhibit A in the technology world is the touch-is-best-for-everything stance that seems to be the Great Unwritten Rule of Natural User Interfaces these days.

Here’s a hint: The gesture of one hand zooming isn’t a touch-screen gesture.

Well, that’s not completely true either. It’s more than that.

Got any ideas?

- # -

Every so often in my research career I stumble across something that reminds me that this whole research gig is way easier than it seems.

And way harder.

Because I’ve repeatedly found that some of my best ideas were hiding in plain sight. Obvious things. Things I should have thought of five years ago, or ten.

The problem is they’re only obvious in retrospect.

Of course touch is all the rage; every smartphone these days has to have a touchscreen.

But people forget that every smartphone has motion sensors too– accelerometers and gyroscopes and such– that let the device respond to physical movement, such as when you hold your phone in landscape and the display follows suit.

I first prototyped that little automatic screen rotation interaction, by the way, over twelve years ago, so if you don’t like it, you can blame it on me. Come on, admit it, you’ve cussed more than once when you lay down in bed with your smartphone and the darned screen flipped to landscape. It’s ok, let loose your volley of curses. You won’t be judged here.

Because the first step to a solution is admitting you have a problem.

I started thinking hard about all of this- touch and motion sensing, zooming with one hand and automatic screen rotation gone wild– a while back and gradually realized that there’s an interesting new class of gestures for handhelds hiding in plain sight here. And it’s always been there. Any fool– like me, twelve years ago, for example– could have taken the inputs from a touchscreen and the signals from the sensors and started to build out a vocabulary of gestures based on that.

But well, um… nope. Never been explored in any kind of systematic way, as it turns out.

Call it the Hidden Dimension of Touch, if you like: an uncharted continent of gestures just lying there under the surface of your touchscreen, waiting to be discovered.

- # -

So now that we’re surveying this new landscape, let me show you the way to the first landmark, the Gesture of One Hand Zooming:

  • Hold your thumb on the screen, at the point you want to zoom.
  • Tip the device back and forth to zoom in or zoom out.
  • Lift your thumb to stop.

Yep, it’s that simple and that hard.

It’s a cross-modal gesture: that is, a gesture that combines both motion and touch. Touch: hold your thumb at a particular location on the screen. Motion sensing: your phone’s accelerometer senses the tilt of the device, and maps this to the rate of expansion for the zoom.
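
For the curious, the mapping can be sketched in a few lines. The gain, dead band, and zoom limits below are illustrative guesses, not the tuned values from our prototype:

    DEAD_BAND_DEG = 3.0   # ignore tiny tilts so the view doesn't drift
    GAIN = 0.02           # fractional zoom change per degree of tilt, per second

    def update_zoom(zoom, thumb_down, tilt_deg, dt):
        """Advance the zoom factor by one frame of duration dt (seconds).

        tilt_deg is the pitch relative to the pose when the thumb first touched:
        positive = tipped back (zoom in), negative = tipped forward (zoom out).
        """
        if not thumb_down or abs(tilt_deg) < DEAD_BAND_DEG:
            return zoom                                    # lifting the thumb stops the zoom
        rate = GAIN * tilt_deg                             # rate control, centered on the thumb
        return max(0.25, min(8.0, zoom * (1.0 + rate * dt)))

    # Holding the thumb down and tipping back 15 degrees for one second at 60 fps:
    zoom = 1.0
    for _ in range(60):
        zoom = update_zoom(zoom, thumb_down=True, tilt_deg=15.0, dt=1 / 60)
    print(round(zoom, 2))   # the view has zoomed in smoothly around the thumb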

It’s not any faster or more intuitive than pinch-to-zoom.

But, gosh darn it, you can do it with one hand.

One-Handed Zooming

One-Handed Zooming by holding the screen and subtly tilting the device back and forth.

- # -

All right then, what about this problem of your smartphone gone wild in your bed? Ahem. The problem with the automatic screen rotation, that is.

Well, just hold your finger on the screen as you lie down. Or as you pivot the screen to a new viewing orientation.

Call it Pivot-to-Lock, another monument on this new touch-plus-motion landscape: just hold the screen while rotating the device.

Screen Pivot Lock

Lock engaged. Just flip the screen to a new orientation to slip out of the lock. Simple, and fun to use.
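
Here is a rough sketch of the state machine as I think of it; the details below are illustrative assumptions, not the code from our prototype:

    class PivotToLock:
        """Hold a finger on the screen while pivoting the device to lock the display
        orientation; pivot to a fresh orientation (no finger needed) to slip back out."""

        def __init__(self, orientation="portrait"):
            self.display = orientation    # what the UI currently shows
            self.locked_at = None         # physical orientation when the lock engaged

        def on_device_rotated(self, sensed_orientation, finger_down):
            if finger_down:
                self.locked_at = sensed_orientation   # engage (or re-engage) the lock
                return self.display                   # display stays put
            if self.locked_at == sensed_orientation:
                return self.display                   # still lying the same way: stay locked
            self.locked_at = None                     # flipped to a new orientation: unlock
            self.display = sensed_orientation
            return self.display

    p = PivotToLock("portrait")
    print(p.on_device_rotated("landscape", finger_down=True))    # portrait -- lock engaged
    print(p.on_device_rotated("landscape", finger_down=False))   # portrait -- still locked
    print(p.on_device_rotated("portrait", finger_down=False))    # portrait -- lock released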

- # -

Is that it? Is there more?

Sure, there’s a bunch more touch-and-motion gestures that we have experimented with. For example, here’s one more: you can collect bits of content that you encounter on your phone (say, crop out a piece of a picture that you like) just by framing it with your fingers and then flipping the phone back in a quick motion. Here, holding two fingers still plus the flipping motion defines the cross-modal gesture, as demonstrated in our prototype for Windows Phone 7:

Check out the video below to see all of these in action, and some other ideas that we’ve tried out so far.

But there’s something else.

Another perspective. Something completely different from all the examples above.

There’s really two ways to look at interaction with motion sensors.

We can use them to support explicit new gestures– like giving your device a shake, for example– or the phone can use them in a more subtle way, by just sitting there in the background and seeing what the sensors have to say about how the device is being used.  Did the user just pick up the phone? Is the user walking around with the phone? Is the phone sitting flat and motionless on a desk? Yep, you can infer all these things with high confidence.
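
As a toy illustration of this kind of background inference, the variance of recent accelerometer readings already separates some of these states surprisingly well. The thresholds below are illustrative, not taken from any shipping system:

    import statistics

    def device_state(accel_magnitudes_g):
        """Guess what the device is doing from about a second of accelerometer
        magnitudes (in g), using only the variance of the signal."""
        variance = statistics.pvariance(accel_magnitudes_g)
        if variance < 0.0005:
            return "flat and motionless"     # e.g., sitting untouched on a desk
        if variance < 0.02:
            return "held or just picked up"  # small handling motions and hand tremor
        return "in motion (e.g., walking)"   # larger, rhythmic accelerations

    print(device_state([1.00, 1.00, 1.01, 0.99, 1.00] * 10))       # flat and motionless
    print(device_state([1.0, 1.3, 0.7, 1.4, 0.6, 1.2, 0.8] * 7))   # in motion (e.g., walking)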

And we can bring this perspective back to our thinking about combined touch and motion.

Imagine your touchscreen as the surface of a pond on a windless day. Perfectly flat. Smooth.

Motionless.

Now what happens when you set your finger to the surface of that pond?

Motion in Touch

Yep, ripples.

Touch the surface of the pond again, somewhere else. More ripples, expanding from a different spot.

Now take your finger and sweep it along the surface of the water. Another disturbance– a wake in the trail of your finger this time. That’s another pattern. A different pattern.

Touch and motion are inextricably linked. The sensors on these devices– particularly the new generation of low-cost gyroscopes that are making their way onto handhelds– are increasingly sensitive, even to rather subtle motions and vibrations.

When you touch the screen of your device, or place a finger anywhere on the case of your device for that matter, we can get a good sense of how you’re touching it, roughly where you’re touching it, and how you’re holding it.

And all of this can be used to optimize how your device reacts, how it interprets your gestures, how accurately it can respond to you. And maybe some more stuff that nobody even realizes is possible yet.

Frankly, I’m not even sure myself. We’ve probably only just scratched the surface of the possibilities here.

Yeah, there’s a hidden dimension of touch all right, and to be honest I still feel like we’re a long way from surveying all the landmarks of this new world.

But I like what we see so far.

- VIDEO-

Here’s a video of our system in action:

YouTube Video of Touch and Motion Gestures for Mobiles.

- PUBLICATION DETAILS -

Our scientific paper on the work described in this post won an Honorable Mention Award for CHI 2011 Best Paper.  The paper appeared May 9th at the ACM CHI 2011 Conference on Human Factors in Computing Systems in Vancouver, British Columbia, Canada.

Check out the paper for a full and nuanced discussion of this design space, as well as references to a whole bunch of exciting work that has been conducted by other researchers in recent years.

Hinckley, K. and Song, H., Sensor Synaesthesia: Touch in Motion, and Motion in Touch. In Proc. CHI 2011 Conf. on Human Factors in Computing Systems.

The paper was presented at the conference by my co-author Hyunyoung Song of the University of Maryland. Hyunyoung worked with me for her internship at Microsoft Research in the summer of 2010 and her contributions to this project were tremendous– very, very impressive work by a great young researcher.

Book Chapter: Input Technologies and Techniques, 2012 Edition

Hinckley, K., Wigdor, D., Input Technologies and Techniques. Chapter 9 in The Human-Computer Interaction Handbook – Fundamentals, Evolving Technologies and Emerging Applications, Third Edition, ed. by Jacko, J. Published by Taylor & Francis. To appear. [PDF of author's manuscript - not final]

This is an extensive revision of the 2007 and 2002 editions of my book chapter, and with some heavy weight-lifting from my new co-author Daniel Wigdor, it treats direct-touch input devices and techniques in much more depth. Lots of great new stuff. The book will be out in early 2012 or so from Taylor & Francis – keep an eye out for it!

Classic AlpineInker Post #2: Pen + Touch Input in “Manual Deskterity”

Alright, here’s another blast from the not-so-distant past: our exploration of combined pen and touch input on the Microsoft Surface.

And this project was definitely a blast. A lot of fun and creative people got involved, and we tried tons and tons of ideas: many that were stupid, many that were intriguing but wrong, and many cool ones that didn’t even make our demo reel. As is clear from the demo reel, we took a design-oriented approach, meaning that we tried multiple possibilities without focusing too much on which was the “best” design. Or, said another way, I would not advocate putting together a system that has all of the gestures that we explored in this work; but you can’t put together a map if you don’t explore the terrain, and this was most definitely a mapping expedition.

Since I did this original post, I’ve published a more definitive paper on the project called “Pen + Touch = New Tools” which appeared at the ACM UIST 2010 Symposium on User Interface Software and Technology. This is a paper I’m proud of; it really dissects this design space of pen + touch quite nicely. I’ll have to do another post about this work that gets into that next level of design nuances at some point.

I had a blast preparing the talk for this particular paper and to be honest it was probably one of the most entertaining academic talks that I’ve done in recent years. I have a very fun way of presenting this particular material, with help during the talk from a certain Mr. I.M.A. Bigbody:

Mr. I.M.A. Bigbody, a Corporate Denizen of Third Rate, Inc., is exactly the sort of arrogant, prove-it-to-me, you’re-just-wasting-my-time fellow that seems to inhabit every large organization.

Well, Mr. Bigbody surfaces from time to time throughout my talk to needle me about the shortcomings of the pen:

Why the pen? I can type faster than I can write.

Just tell me which is best, touch or pen.

Touch and pen are just new ways to control the mouse, so what’s the big deal?

And in the end, of course, because the good guys always win, Mr. Bigbody gets sacked and the world gets to see just how much potential there is in combined Pen + Touch input for the betterment of mankind.

One other comment about this work before I turn it over to the classic post. We originally did this work on the Microsoft Surface, because at the time this was the only hardware platform available to us where we could have full multi-touch input while also sensing a pen that we could distinguish as a unique type of contact. This is a critical point. If you can’t tell the pen from any other touch (as is currently a limitation of capacitive multi-touch digitizers such as those used on the iPad), it greatly limits the types of pen + touch interactions that a system can support.

These days, though, a number of slates and laptops with pen + touch input are available. The Asus EP121 Windows 7 slate is a noteworthy example; this particular slate contains a Wacom active digitizer for high-quality pen input, and it also includes a second digitizer with two-touch multi-touch input. The really cool thing about it from my perspective is that you can also use Wacom’s multi-touch APIs to support simultaneous pen + touch input on the device. This normally isn’t possible under Windows 7 because Windows turns off touch when the pen comes in range. But it is possible if you use Wacom’s multi-touch API and handle all the touch events yourself, so you can do some cool stuff if you’re willing to work at it.

Which gets us back to the Manual Deskterity demo on the Surface. To be honest, the whole theme in the video about the digital drafting table is a bit of a head fake. I was thinking slates the whole time I was working on the project; it just wasn’t possible to try the ideas in a slate form factor at the time. But that’s definitely where we intended to go with the research. And it’s where we still intend to go, using devices like the Asus EP121 to probe further ahead and see what other issues, techniques, or new possibilities arise.

Because I’m still totally convinced that combined pen and touch is the way of the future. It might not happen now, or two years from now, or even five years from now– but the device of my dreams, the sleek devices that populate my vision of what we’ll be carrying around as the 21st century passes out of the sun-drenched days of its youth– well, they all have a fantastic user experience that incorporates both pen and touch, and everyone just expects things to work that way.

Even Mr. Bigbody.

Manual Deskterity: An Exploration of Simultaneous Pen + Touch Direct Input

With certain obvious multi-touch devices garnering a lot of attention these days, it’s easy to forget that touch does not necessarily make an interface “magically delicious,” as it were. To paraphrase my collaborator Bill Buxton, we have to remember that everything is best for something and worst for something else.

Next week at the annual CHI 2010 Conference on Human Factors in Computing Systems, I’ll be presenting some new research that investigates the little-explored area of simultaneous pen and touch interaction.

Now, what does this really mean? Building on that message, we observe the following:

The future of direct interaction on displays is not about Touch.

Likewise, it is not about the Pen.

Nor is it about direct interaction on displays with Pen OR Touch.

It is about Pen AND Touch, simultaneously, designed such that one complements the other.

That is, we see pen and touch as complementary, not competitive, modalities of interaction. By leveraging people’s natural use of pen and paper, in the real world, we can design innovative new user experiences that exploit the combination of pen and multi-touch input to support non-physical yet natural and compelling interactions.

Our research delves into the question of how one should use pen and touch in interface design. This really boils down to three questions: (1) What is the role of the pen? (2) What is the role of multi-touch? And (3) What is the role of simultaneous pen and touch? The perspective that we have arrived at in our research is the following: the pen writes, touch manipulates, and the combination of pen + touch yields new tools:

I’ve now posted a video of the research on YouTube that shows a bunch of the techniques we explored. We have implemented these on the Microsoft Surface, using a special IR-emitting pen that we constructed. However, you can imagine this technology coming to laptops, tablets, and slates in the near future; the N-Trig hardware on the Dell-XT2, for example, already has this capability, although as a practical matter it is not currently possible to author applications that utilize simultaneous pen and touch. Hence our exploration of the possibilities on the Microsoft Surface.

Manual Deskterity: An Exploration of Simultaneous Pen + Touch Direct Input

The name, of course, is a simple pun on “Manual Dexterity” – in the context of shuffling papers and content on a “digital desk” in our case.  Hence “manual deskterity” would be the metric of efficacy in paper-shuffling and other such activities of manual organization and arrangement of documents in your workspace. This name also has the virtue that it shot a blank on <name your favorite search engine>. Plus I have a weakness for unpronounceable neologisms.

Special thanks to my colleagues (co-authors on the paper) who contributed to the work, particularly Koji Yatani, who contributed many of the novel ideas and techniques and did all of the heavy lifting in terms of the technical implementation:

Koji Yatani (Microsoft Research Intern, Ph.D. from the University of Toronto, and as of 2011 a full-time employee at Microsoft Research’s Beijing lab)

Michel Pahud

Nicole Coddington

Jenny Rodenhouse

Andy Wilson

Hrvoje Benko

Bill Buxton