Technology Trends

Human Computer Interface

Augmented Maps

Printed maps are easy to manipulate and let several users interact at once, but they are static and can be out of date. Computer-based map displays, by contrast, can provide dynamic and more recent information than paper maps, but do not help a group of people communicate. So why not mix them? This is what researchers at England’s University of Cambridge have done with their augmented maps, which add digital graphical information and user interface components to printed maps. Here is how it works: the printed maps are placed on a flat surface; an overhead camera linked to a PC tracks the map via the live video stream; and an overhead projector adds graphical information to the maps. This could be useful for many applications, and the researchers have applied it to a flood simulation of the Cambridge area.


First, here is a diagram showing the whole system and its components (Credit: University of Cambridge, UK).



And below is an augmented map showing the flooded River Cam. “The image browser to the right shows views corresponding to locations and different stages of the flood, while the PDA to the left controls a helicopter unit” (Credit: University of Cambridge, UK).



Here is a description of the system, which was developed by Dr T.W. Drummond, Dr G. Reitmayr and Ethan Eade.


Tom’s demonstration of the dynamic paper map comprises of a camera and a projector looking down at a paper map from above. The system performs interactive tracking of the map on a table top environment using the live video stream captured by the camera. Once the locations of the maps are known, the projector displays extra information directly on the maps.

The system also tracks user interface devices which can be placed on the map and which enable access to information that is linked to locations on the map. A simple physical prop, for example a piece of white card, becomes a selection tool and projection surface at the same time. Images referenced by the location pointed at are displayed in the white card.

So far, Tom and his colleagues have used their system to show how it could be used to monitor a flooding situation in the Cambridge area and how easy it would be to deploy emergency units, such as an helicopter, by controlling it with a PDA.
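To picture the projection step: once the tracker knows where the map sits on the table, a point given in map coordinates has to be converted to projector pixels. The sketch below assumes this is done with a 3x3 planar homography, a common choice for flat surfaces. The description above does not say which method the Cambridge team uses, and the matrix values and function names are invented for illustration.

```python
def apply_homography(h, x, y):
    """Map a point (x, y) on the paper map to projector pixels using a 3x3 matrix h."""
    u = h[0][0] * x + h[0][1] * y + h[0][2]
    v = h[1][0] * x + h[1][1] * y + h[1][2]
    w = h[2][0] * x + h[2][1] * y + h[2][2]
    if w == 0:
        raise ValueError("point maps to infinity")
    return (u / w, v / w)


# Hypothetical homography: scale by 2 and shift by (100, 50).
# A real system would re-estimate this matrix from the camera images.
MAP_TO_PROJECTOR = [
    [2.0, 0.0, 100.0],
    [0.0, 2.0, 50.0],
    [0.0, 0.0, 1.0],
]

flood_marker = apply_homography(MAP_TO_PROJECTOR, 10.0, 20.0)
print(flood_marker)  # (120.0, 90.0)
```

In such a setup, the same mapping would tell the projector where to draw an image on the white card once the card's position on the map is known.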


Now, the researchers want to move out of their labs and build a deployable and mobile system.


You’ll find more information on the project page, with more technical explanations and different images.


For your viewing pleasure, here is a link to a short video (2 minutes and 43 seconds, 25.2 MB) showing the different tools and components of the system.


And if you’re interested in these augmented maps, a technical paper titled “Localisation and Interaction for Augmented Maps” will be published soon. It will appear in the Proceedings of the 4th IEEE and ACM International Symposium on Mixed and Augmented Reality (ISMAR 2005), which will be held in Vienna, Austria, on October 5-8, 2005.


Sources: University of Cambridge, Engineering Department, News & Features, July 7, 2005; and various web sites


Related stories can be found in the following categories.


  • Computers

  • Engineering

  • Human Computer Interface

  • Innovation

  • Vision and Visualization Apps


Augmented Reality For Poultry Plants?

Augmented reality (AR) is a technology that overlays computer-generated objects on the real world. And now, AR is going to be used in poultry processing plants to improve communication between computers and workers. Researchers at Georgia Tech have designed two AR systems that project graphical instructions from an automated inspection system onto birds on a processing line, telling workers which chickens are ‘defective products’ and have to be discarded. For example, some workers will wear see-through head-mounted displays (HMDs), which will allow them to see graphical instructions about a bird and what to do with it. ‘Right now, this inspection is done visually by human screeners, who communicate instructions to trimmers using gestures.’ AR technology should increase the throughput of poultry plants, provided their owners are willing to pay about $3,600 per device.


Here is the introduction of this Georgia Institute of Technology Research News article.


Technology that transfers computer-generated information onto the physical world is being tested for use in poultry plants to improve communication between computers and workers.

Using augmented reality (AR) technology, researchers have designed two systems that project graphical instructions from an automated inspection system onto birds on a processing line. These symbols tell workers how to trim or whether to discard defective products.

Below is a photo of these augmented reality systems (Credit: Gary Meek, for Georgia Tech, on this page). And here is a link to a larger version (1.09 MB).



One augmented reality system developed at Georgia Tech uses a location-tracked, see-through, head-mounted display (foreground) worn by poultry workers. It directly overlays graphical instructions on a trimmer’s view of the birds. A second solution uses a laser scanner, mounted in a fixed location near the processing line, to project graphical instructions (red square on bird illustration) directly onto each bird that requires some action, such as trimming.

But what has motivated researchers to use such a sophisticated technology in poultry plants?


“It’s easy to see this technology working in a poultry plant,” said Blair Macintyre, an assistant professor in the Georgia Tech College of Computing and AR expert. “The question is, ‘What is the best implementation of the technology to satisfy the environmental constraints?’”

Researchers have had to consider that poultry processing plants are typically wet and slippery and have to be thoroughly washed down with high-pressured water streams daily. Also, trimmers need simple, graphical instructions and must have their hands free of any object except a knife for cutting defective bird parts.

This is why they developed two independent AR solutions, without knowing which one the food industry might choose.


“Each solution appears to have advantages and disadvantages,” Macintyre said. One of the greatest benefits that both solutions provide is the potential for advance warning to trimmers of the workload coming down the line, he added. Current practices don’t provide this advantage.

“But our suspicion is that the laser-based system is the more practical in the near term and potentially in the long term,” Macintyre said. “The real disadvantage of the head-mounted system is its cost. Heads-up displays cost about $3,600, but they are getting cheaper. Two years ago, they cost about $7,000 each.”

These AR systems will not be commercially available for several years, and they might not even be successful, for psychological reasons.


“We think these technologies have the potential to be better than current practices,” Macintyre said. “But, two humans working together over time have learned to use non-verbal cues and have developed a smooth communication system. That will be hard to beat at some level.”

Anyway, these AR solutions will be described during the 2005 Annual International Meeting of the American Society of Agricultural Engineers, held on July 17-20 in Tampa, Florida. The research paper, “Augmented Reality Systems Applied to Poultry Grading & Inspection,” will be presented on July 18 at 11:45AM, but is not yet available online.


Finally, for slightly more information, you can visit the Augmented Reality for Poultry Inspection page at the Augmented Environment Lab (AEL).


Sources: Jane Sanders, Georgia Institute of Technology Research News, via EurekAlert!, July 14, 2005; and various web sites


Related stories can be found in the following categories.


  • Food

  • Human Computer Interface

  • Innovation

  • Virtual Reality

  • Vision and Visualization Apps


Texting Is Too Slow? Draw Your Words!

Admit it: typing an SMS on a cell phone takes time, and writing an e-mail on a PDA is only marginally better. But according to the San Jose Mercury News, a researcher at IBM has found a solution to this vexing problem. Instead of typing words on these ridiculously small keyboards, with SHARK, an abbreviation for ShortHand-Aided Rapid Keyboarding, you use a grid and a stylus. The grid appears on the screen of your portable device. You put the stylus on the first letter of the word you want to type. Then you drag the stylus to draw a line connecting all the other letters of the word. When you lift the stylus, the word appears almost magically. With SHARK, you can type between 50 and 80 words per minute, which is almost miraculous. So far, IBM hasn’t decided to release this software as a product. But if enough of you download it while it is free and say you want it, IBM could release it as a paid product within a few months.


Here is the introduction of the Mercury News article.


Humans in their long history have invented only two ways for individuals to produce text: handwriting and typing on a keyboard.

Shumin Zhai, an IBM scientist, may have invented another way: SHARK, an abbreviation for ShortHand-Aided Rapid Keyboarding.

SHARK is intended for writing text with a stylus on small touch-sensitive screens, such as those found in cell phones and personal digital assistants. It uses a radically different approach that is easy to learn and fast.

Here is how the system works. Below is a screen capture of a user finishing typing “The quick brown fox jumps over the lazy dog.” In this capture, the user is moving the stylus to create the word “jumps” (Credit: IBM Almaden Research Center).



If you want to see SHARK in action without downloading it, here is a link to a video demo (4 minutes and 24 seconds, 29.7 MB). The above image comes from this video.


Here is how the Mercury News describes the system.


To write a word, you put the stylus on the first letter of the word and then drag the stylus to draw a line through the alphabet cluster, touching every letter in the word. When you lift up the stylus after hitting the last letter, SHARK figures out what word you want and displays it on the screen.

If SHARK makes a mistake, you tap the word and get a list of the most likely alternatives based on the path you traced through the grid.
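To make the idea concrete, here is a minimal sketch of how a traced path could be matched against a word list. It is not IBM's algorithm: the key layout, the tiny lexicon, and the matching rule (resample both paths to the same number of points and compare them point by point) are all assumptions made for illustration.

```python
import math

# Hypothetical layout: letter -> (column, row) on a small grid. Not IBM's actual layout.
ROWS = ["qwertyuiop", "asdfghjkl", "zxcvbnm"]
KEYS = {ch: (float(col), float(row))
        for row, line in enumerate(ROWS) for col, ch in enumerate(line)}


def word_path(word):
    """Ideal trace for a word: the centers of its letters, in order."""
    return [KEYS[ch] for ch in word]


def resample(path, n=16):
    """Return n points spaced evenly along the polyline `path`."""
    seg = [math.dist(a, b) for a, b in zip(path, path[1:])]
    total = sum(seg)
    if total == 0:
        return [path[0]] * n
    points = []
    for i in range(n):
        target = total * i / (n - 1)
        walked = 0.0
        for (a, b), length in zip(zip(path, path[1:]), seg):
            if length > 0 and walked + length >= target:
                t = (target - walked) / length
                points.append((a[0] + t * (b[0] - a[0]), a[1] + t * (b[1] - a[1])))
                break
            walked += length
        else:
            points.append(path[-1])  # rounding pushed us past the end
    return points


def distance(trace, word, n=16):
    """Mean distance between matching points of the trace and the word's ideal path."""
    a, b = resample(trace, n), resample(word_path(word), n)
    return sum(math.dist(p, q) for p, q in zip(a, b)) / n


def rank_words(trace, lexicon):
    """Candidate words, best match first (like the list of alternatives)."""
    return sorted(lexicon, key=lambda w: distance(trace, w))


LEXICON = ["jumps", "lazy", "quick", "fox", "dog"]
# A slightly wobbly trace that roughly follows j-u-m-p-s on the layout above.
trace = [(6.1, 1.0), (6.0, 0.1), (6.1, 1.9), (8.9, 0.1), (1.1, 1.0)]
print(rank_words(trace, LEXICON)[0])  # jumps
```

The rest of the ranked list plays the role of the alternatives offered when the first guess is wrong.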

You can try the system yourself, and even download a beta version from the IBM SHARK Shorthand web site.


CNET News.com also described the SHARK system last week in “New-age keyboard: Trace, don’t write.”


But for more technical information, here is a link to the recent publications of Shumin Zhai and his colleagues.


In particular, you should read “In Search of Effective Text Input Interfaces for Off the Desktop Computing” (PDF format, 18 pages, 255 KB).


For the moment, the system only works with a database of English words. If IBM ever needs beta testers for a French version, I’m available. Typing text messages is just a nightmare right now…


Sources: Mike Langberg, San Jose Mercury News, July 15, 2005; and various web sites


Related stories can be found in the following categories.


  • Computers

  • Human Computer Interface

  • IBM

  • Innovation

  • Software

  • Technology

  • Wireless


Chips in Human Brains to Control Prosthesis

You probably remember the story which surfaced in May 2005 about monkeys using robotic arms as their own (check here or there to refresh your memory). Now, according to the ANBA press agency, Miguel Nicolelis, the professor of neurology at Duke University who was behind the experiments with the monkeys, wants to go further. He plans to install chips in humans’ brains in order to control prosthetic arms. Of course, there is still some work to do with animals before this kind of surgery can be practiced on humans. But the first surgery in the world to implant a neuro-prosthesis inside a human being is expected to be performed in a Brazilian hospital by 2008.


Here is the introduction of the ANBA report.


The Syrian-Lebanese Hospital, in the southeastern Brazilian city of São Paulo, is going to perform the first surgery in the world for implantation of robotic arms into a human being, to be moved by brain signals. The agreement for realization of the surgery was signed last month with the Santos Dumont Association for Support to Research. The surgery is scheduled to take place in three years.

According to the hospital’s corporate superintendent, Mauricio Ceschin, the technique consists on implanting a microchip into the human brain to translate the nerve pulses into electric pulses, making it possible for the patient to move robotic prosthetics.

Below is a diagram describing how a patient’s brain can control the prosthetics (Credit: Miguel Nicolelis’s Laboratory at Duke’s Center for Neuroengineering).



Of course, it will take time before this technique can be applied to a human.


According to Ceschin, up to the execution of the first surgery for implantation of robotic arms moved by brain signals, the Education and Teaching Institute of the Syrian-Lebanese hospital will have a laboratory turned to research in neuroscience, where new tests will take place before the first surgery.

The superintendent also stated that a team of hospital neurosurgeons is getting ready to apply the new technique. “It will still take between two and three years for tests to be concluded on animals. The doctors must feel secure,” he said.

For slightly more information, you can also read an earlier news release from the Syrian-Lebanese Hospital.


As you can guess, there is no scientific paper available on this subject. But if you want to read the latest research paper about this brain-machine interface, at least for monkeys, The Journal of Neuroscience has published “Cortical Ensemble Adaptation to Represent Velocity of an Artificial Actuator Controlled by a Brain-Machine Interface” (May 11, 2005, Vol. 25, Num. 19, Pages 4681-4693). Here is a link to the abstract.


Monkeys can learn to directly control the movements of an artificial actuator by using a brain-machine interface (BMI) driven by the activity of a sample of cortical neurons. Eventually, they can do so without moving their limbs. Neuronal adaptations underlying the transition from control of the limb to control of the actuator are poorly understood. Here, we show that rapid modifications in neuronal representation of velocity of the hand and actuator occur in multiple cortical areas during the operation of a BMI. Initially, monkeys controlled the actuator by moving a hand-held pole.

As the monkeys started using their cortical activity to control the actuator, the activity of individual neurons and neuronal populations became less representative of the animal’s hand movements while representing the movements of the actuator. As a result of this adaptation, the animals could eventually stop moving their hands yet continue to control the actuator. These results show that, during BMI control, cortical ensembles represent behaviorally significant motor parameters, even if these are not associated with movements of the animal’s own limb.

Sources: Marina Sarruf, ANBA (Brazil Arab News Agency), translated by Mark Ament, July 8, 2005; and various web sites


Related stories can be found in the following categories.


  • Biotechnology

  • Chips

  • Human Computer Interface

  • Medicine

  • Robotics


Surf the Web in Your Car — Hands Free

Because she is concerned about the emerging use of the Internet in cars, Dr. Meirav Taieb-Maimon, from Ben-Gurion University of the Negev in Israel, has designed a new search engine that leaves your hands free. In this article, Discovery News writes that the system uses voice-recognition software, a microphone and speakers. The software itself is composed of three elements: two speech recognition components from Microsoft and a custom piece of software called ‘Maestro.’ When a driver says something such as ‘nearest gas station,’ Maestro converts speech to text, builds a search query and sends it to a search engine. It then converts the results back to spoken instructions for the driver. More research is needed to determine whether the system is safe to use while driving. If it proves to be safe, a ‘Maestro’ might be the Web driver in your next car.


First, how does the system work?


Let’s say a person wants to find a restaurant in Manhattan that has gotten good reviews. First, she would dictate her query by saying, “Restaurants New York City.”

Maestro triggers the speech recognition software to convert the speech to text and then delivers it to a so-called “query builder,” which puts the request in language a search engine such as Google can understand.

The query builder returns the query to Maestro, which then delivers it to a search engine. When the results come back, Maestro sends them to the text to speech component for the driver to hear.
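The flow described above is essentially a four-stage pipeline with Maestro in the middle. Here is a minimal sketch of that coordination. Every function is a hypothetical stand-in: the real system uses Microsoft speech components and a live search engine, and none of these names come from the project.

```python
from urllib.parse import urlencode


def recognize_speech(audio):
    """Stand-in for speech recognition; here the 'audio' is already text."""
    return audio.strip().lower()


def build_query(text):
    """Stand-in for the query builder: turn recognized text into a URL query string."""
    return urlencode({"q": text})


def search(query, engine):
    """Send the query to a search engine; `engine` is any callable returning result titles."""
    return engine(query)


def speak(results):
    """Stand-in for text-to-speech: return the sentence that would be read aloud."""
    if not results:
        return "No results found."
    return "Top result: " + results[0]


def maestro(audio, engine):
    """Coordinate the four steps in the order described above."""
    text = recognize_speech(audio)
    query = build_query(text)
    results = search(query, engine)
    return speak(results)


def fake_engine(query):
    # Hypothetical offline engine so the sketch runs without network access.
    return ["Example Diner"] if "restaurants" in query else []


print(maestro("Restaurants New York City", fake_engine))  # Top result: Example Diner
```

Passing the engine in as a parameter keeps the coordinator independent of any particular search service, which is one plausible reason to have a separate query builder at all.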





The ‘Maestro’ project is the first application developed that can completely search and browse the Internet via voice interface. (Credit: Dani Machlis).

The image above and its accompanying legend come from this article from Allison Kaplan Sommer published by Israel21c.


So when will we have a ‘Maestro’ driving the Web for us?


Taieb-Maimon would like to see more safety tests conducted before such a system finds its way into automobiles.

Currently, she is preparing a study that will compare driver distraction while using the Maestro system to how much a driver is preoccupied while not using the voice-activated search function to how much they are distracted while conversing with another passenger.

As Taieb-Maimon told Israel21c, the use of the Internet in cars is “inevitable,” so it’s better to design a safe system for surfing while driving.


“With more and more people now working out of the office and trying to be productive as they travel, these kinds of systems are being developed and used. People want to use their driving time to work,” she told Israel21c.

Since even hands-free cell phones are banned for drivers in many areas around the world, I wonder what the future holds for a hands-free search engine. Would you use such a system?


Sources: Tracy Staedter, Discovery News, June 30, 2005; Allison Kaplan Sommer, Israel21c, May 29, 2005


Related stories can be found in the following categories.


  • Human Computer Interface

  • Internet

  • Software

  • Transportation

  • Wireless


Ben Franklin’s Ghost Haunts Philadelphia

If you visit the Lights of Liberty Show in Philadelphia, you will not have to pay the $17.76 entrance fee to speak with a virtual Ben Franklin because his ghost is located in the free visitors area. There, you’ll be able to choose from a list of 160 prepared questions or type your own request. And Ben’s image will appear to float in front of you, like a ghost. But don’t worry! In fact, you’ll see a video of Ralph Archbold, an actor who has been portraying Franklin for more than 25 years. And Ben’s ghost will give you the most appropriate of about 800 possible answers from its own database using a technology developed at Carnegie Mellon University and already in use online by some medical firms.


So here are the facts about this exhibit.


An exhibit developed by the Entertainment Technology Center at Carnegie Mellon University (CMU) and now open in Philadelphia at least gives the illusion that the founding father can still keep up his end of a conversation.

Called “Ben Franklin’s Ghost,” it is open across the street from Independence Hall in the visitors center for the Lights of Liberty Show, a sound-and-light walking tour after dark through Independence National Historical Park.





“People who wish to talk with Franklin’s Ghost will find it floating on a large screen above this table, which holds a book containing questions about his life. They can touch the questions that interest them or type in other ones while Franklin answers in real time.” (Credit: CMU Press Release).

How does this work?


Using a Carnegie Mellon-patented technology called Synthetic Interview, visitors can ask questions of Franklin, either by choosing from 160 prepared questions or typing in their own questions based on a list of key words.

Computer software then calls up the most appropriate of about 800 possible answers as performed by actor Ralph Archbold, who has portrayed Franklin in hundreds of appearances in the Philadelphia area over the past 25 years. These digitally recorded images are then displayed using a 150-year-old illusion known as Pepper’s Ghost, which makes Archbold’s image appear to float in front of the visitor like a ghost.
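How might software pick “the most appropriate” answer for a typed question? The release does not say, so the sketch below simply assumes each recorded clip is tagged with key words and the clip sharing the most words with the question wins. The clip names, key words and fallback are invented for illustration and are not taken from Synthetic Interview.

```python
import re

# Hypothetical answer bank: each recorded clip is tagged with key words.
ANSWERS = [
    ({"kite", "lightning", "electricity"}, "clip_kite_experiment"),
    ({"printer", "printing", "almanack"}, "clip_printing_trade"),
    ({"declaration", "independence", "1776"}, "clip_declaration"),
]
FALLBACK = "clip_please_rephrase"


def tokenize(question):
    """Lowercase the question and split it into a set of words."""
    return set(re.findall(r"[a-z0-9]+", question.lower()))


def best_clip(question):
    """Return the clip whose key words overlap most with the question."""
    words = tokenize(question)
    score, clip = max((len(keys & words), name) for keys, name in ANSWERS)
    return clip if score > 0 else FALLBACK


print(best_clip("Did you really fly a kite in a lightning storm?"))  # clip_kite_experiment
```

A question with no known key words falls back to a generic clip, which is one simple way to keep the ghost talking.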

You might think that this technology is only useful for entertaining your kids. But you’d be wrong. This technology, invented and patented by Scott Stevens and Mike Christel, is already used online.


For example, MedRespond, a technology company servicing the healthcare and medical communities, already has started to design and develop Synthetic Interviews for online interactive applications.


If you happen to see Ben Franklin’s Ghost, don’t ask him silly questions, such as which city will host the Olympic Games in 2012 (tip: the answer is London!). Instead, please take some pictures of the ghost in the air and tell me where to find them online. Thanks.


Sources: Various news releases and web sites


Related stories can be found in the following categories.


  • Displays

  • Education

  • Human Computer Interface

  • Software

  • Vision and Visualization Apps


This Robot Understands You in Noisy Environments

The Japanese Humanoid Robotics Project has produced the HRP-2 robot, which is known for dancing and preserving Japanese culture. But now, the HRP-2, which is about 1.6 meters tall and weighs about 60 kilograms, can hear humans and understand them thanks to its sophisticated software and hearing equipment. It uses an array of eight omnidirectional microphones mounted around the robot’s head. Stable speech recognition is obtained by combining information from the microphone array and a camera also mounted on its head, and by isolating and eliminating noises, even from your TV. These hearing capabilities are essential “for helping humans to communicate with robots in real environments by 2025.”


Before going further, here is what the HRP-2, also known as ‘Prométhée,’ looks like (Credit: Kawada Industries, Inc.).



Now here are the technical details provided by Japan’s National Institute of Advanced Industrial Science and Technology (AIST) about the microphone array.


The microphone array consists of eight omnidirectional microphones mounted around the robot’s head. The sound source is located on the basis of difference in times for arrival to individual microphones, and at the same time, a camera mounted at the robot’s head detects, tracks and locates a person giving the vocal instruction.
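Locating a sound from arrival-time differences is easiest to see with just two microphones. The sketch below is a simplification under stated assumptions: a distant source, a single pair of microphones, and the usual speed of sound in air. The robot's eight-microphone processing is not described in enough detail to reproduce, and the function name is mine.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s in air at about 20 °C


def bearing_from_delay(delay_s, mic_spacing_m):
    """Angle of the source, in degrees, measured from straight ahead of a
    two-microphone pair, given the difference in arrival times (far-field)."""
    ratio = SPEED_OF_SOUND * delay_s / mic_spacing_m
    ratio = max(-1.0, min(1.0, ratio))  # clamp to absorb measurement noise
    return math.degrees(math.asin(ratio))


# A sound arriving at both microphones at the same time comes from straight ahead.
print(bearing_from_delay(0.0, 0.2))  # 0.0
```

With more microphones, each pair gives one such constraint, and combining them narrows the direction down.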

Here is the robot’s head with its array of microphones. The red arrows show the positions of the eight microphones (Credit: AIST).



Stable speech recognition is obtained by combining information derived from the microphone array and the camera and by isolating and eliminating noises. Hardware to eliminate noises in real time has been developed and built into a robot, making it possible for a human operator to give robot vocal instructions, and to control IT appliances through a robot, even in a field where multiple noise sources such as TV exist.

It is expected, therefore, that natural communications may be realized in the living environment between a human operator and a humanoid robot through the auditory function of robot.

Please read the AIST document for more details about the voice interface and its hardware and software components. I just want to emphasize that the goal of this project is to allow natural communications between human beings and humanoid robots through the auditory function of the robots, and even in noisy environments.


Sources: Japan’s National Institute of Advanced Industrial Science and Technology (AIST) news release, June 20, 2005; and various web sites


Related stories can be found in the following categories.


  • Future

  • Human Computer Interface

  • Robotics

  • Software


A ‘Misty’ Screen For Trade Shows

In “Foggy screen points the way,” Nature describes a technology invented by a Finnish company named FogScreen. But don’t be fooled by the name: the images are not blurry, even if the screen is made of water. You can even walk through the screen without getting wet because the company uses ‘dry’ fog made of plain water without any chemicals added. The idea behind the technology is similar to the one used by laser shows at musical events. And the real beauty of this innovation is its ease of use. You just replace your conventional screen with a FogScreen, and you’re all set.


Here are the opening paragraphs of the article from Nature.


Forget plasma screens, here’s one made out of nothing but water. Inventors have fashioned an interactive computer display from a curtain of fog.

The FogScreen uses ceiling-mounted air jets to create a vertical, turbulence-free slice of air a few centimetres thick, into which a fine mist of water is pumped. An ordinary projector can be used to display images on the resulting wall of fog.

And you can even click on this wall of fog.


When the projector is hooked up to a normal computer, the FogScreen can function much like the large display from a desktop in a lecture theatre. But, with the help of a laser-scanning system, the FogScreen also allows users to click on the watery screen itself.

Poke a finger at the screen, and the laser beams scanning the surface of the fog are interrupted, allowing the system to detect where you have ‘clicked’.
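As a rough illustration of how an interrupted beam could become a click position, the sketch below assumes a single scanner sitting at the top-left corner of the screen that reports the angle and distance of whatever blocks its beam. That setup, the screen size and the function name are assumptions; the article does not describe FogScreen's actual scanner.

```python
import math


def click_position(angle_deg, distance_m, screen_w=2.0, screen_h=1.5):
    """Convert a scanner reading to (x, y) in metres from the top-left corner.

    The angle is measured downward from the top edge of the screen.
    Returns None if the interruption lies outside the screen area.
    """
    a = math.radians(angle_deg)
    x = distance_m * math.cos(a)  # distance across the screen
    y = distance_m * math.sin(a)  # distance down the screen
    if 0 <= x <= screen_w and 0 <= y <= screen_h:
        return (x, y)
    return None


print(click_position(0.0, 1.0))  # (1.0, 0.0)
```

The resulting coordinates would then be scaled to desktop pixels and handed to the computer as an ordinary mouse click.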

Below is a photograph showing how a FogScreen could be used during a trade show or a cultural event (Credit: FogScreen Inc.)



Here is a link to a larger version of this image (579 KB).


Nature adds that these screens are based on simple technologies.


It looks high-tech, but the FogScreen relies on fairly simple technologies. Ceiling-mounted blowers create vertical sheets of non-turbulent air that flow side-by-side without mixing. High-frequency ultrasound vibrations vaporize water into tiny droplets that are pumped between air flows.

In this page about its technology, FogScreen adds some details — but of course, this is company literature.


The basic components of the screen are a laminar, non-turbulent airflow, and a thin fog screen (or any particles) injected into and inside a laminar flow. Created this way, the fog screen is an internal part of the laminar airflow, and remains thin, crisp, and protected from turbulence.

The fog is made within the device using water and ultrasonic waves. If you hold your hands in the fog flow, the fog feels dry and cool, and your hands do not get wet.

After the screen is formed, images can be projected onto it. The screen can be translucent or fully opaque.

And with two projectors, you can project different images on both sides of the screen.


The technology behind the FogScreen products received U.S. patent number 6,819,487 in November 2004 under the name “Method and apparatus for forming a projection screen or a projection volume.”


Finally, in “Click on air!,” innovations report, from Germany, describes what you would experience at a car show if an automotive company used such a display.


Imagine a stand at a motor show featuring a new convertible. There’s a screen ‘hanging in the air’ with everything you expect on your PC desktop. You can click your way through all the new features of the car just by pointing your finger, and when you’re done you can walk through the screen and on to the next stand.

A last note: I’ve never seen these displays in action. So if you read this note and have already walked through a FogScreen, please leave your comments below. Anyway, it looks like serious fun technology.


Sources: Michael Hopkin, Nature, June 10, 2005; and various websites


Related stories can be found in the following categories.


  • Displays

  • Human Computer Interface

  • Innovation

  • Patents


Play Music By Driving on a Virtual Road

Researchers at the University of Southern California (USC) have designed an interface for non-musicians to play music. This interface, part of the Expression Synthesis Project (ESP), is based on the fact that more people know how to drive a car than how to conduct an orchestra. In “Baby, you can drive my song,” the researchers explain how they converted real musical scores into virtual digital roads. Then, using a steering wheel and foot pedals, you ‘drive’ on this road to interpret the piece of music, becoming a real maestro. Such a system should be demonstrated in a public exhibit by 2008 and become available to everyone in the same time frame.


Here are some details about the ESP project, devised by a team led by Elaine Chew of the USC Viterbi School of Engineering.


ESP “attempts to provide a driving interface for musical expression,” according to Chew’s published description. “The premise of ESP is that driving serves as an effective metaphor for expressive music performance. Not everyone can play an instrument but almost anyone can drive a car. By using a familiar interface, ESP aims to provide a compelling metaphor for expressive performance so as to make high-level expressive decisions accessible to non-experts.”

Created by Chew, Alexandre R.J. François, a research professor in the Viterbi School, and graduate students Jie Liu and Aaron Yang, ESP starts with a piece of music in the Musical Instrument Digital Interface (MIDI) format, one that has been converted from the printed score.

Below is a diagram showing how the system works, from a real musical score to a virtual digital road, and then from this road to real music played by you (Credit: USC Viterbi School of Engineering).



This image comes from this document about the Expression Synthesis Project (PDF format, 2 pages, 658 KB).


Of course, the difficult part is to convert a real musical score into a digital road.


The group is building tools to automate the process of creating such roads, applying artificial intelligence techniques to the analysis of the score. “Having the road build itself will be the most difficult part,” says François.

The road’s turns suggest to the driver when to slow down and speed up. However, the ultimate decision on what to do at each turn is entirely in the driver’s hands (or foot). The foot pedals control both the tempo and the volume of the music. Additionally, buttons mounted on the wheel act as the equivalent of the pedals on the piano, making the notes either sustain or cut off crisply.
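Here is one way the pedal-to-music mapping could look in code. It is only a sketch: the news release says the pedals control tempo and volume but gives no formulas, so the idea that speed sets the tempo and accelerator pressure sets the loudness, along with every constant and function name below, is an assumption.

```python
def step(speed, accelerator, brake, dt=0.1, gain=5.0):
    """Advance the car's speed by one time step; pedal positions are in 0.0..1.0."""
    return max(0.0, speed + gain * (accelerator - brake) * dt)


def tempo_bpm(speed, road_speed=10.0, score_bpm=120.0):
    """Tempo scales with speed: driving at road_speed plays at the score's tempo."""
    return score_bpm * speed / road_speed


def midi_velocity(accelerator):
    """Louder notes as the accelerator is pressed harder (MIDI velocity 1..127)."""
    pressed = max(0.0, min(1.0, accelerator))
    return 1 + round(126 * pressed)


speed = step(10.0, accelerator=1.0, brake=0.0)
print(tempo_bpm(speed), midi_velocity(1.0))  # 126.0 127
```

Flooring the accelerator for a tenth of a second nudges the tempo from 120 to 126 beats per minute, while braking would pull it back down.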

This research work was presented at the 2005 International Conference on New Interfaces for Musical Expression (NIME), held on May 26-28 in Vancouver, Canada.


Here is a link to the paper which was presented at this conference, “ESP: A Driving Interface for Expression Synthesis” (PDF format, 4 pages, 289 KB).


You can also find more information about this project by visiting the Music Computation and Cognition website (but it appears that some links are broken right now) or the USC Integrated Media Systems Center (IMSC).


Finally, on this poster about the project (PDF format, 1 page, 439 KB), you’ll read that the goal is to have an interactive public exhibit in 2008.


Ready to drive an orchestra?


Sources: USC Viterbi School of Engineering news release, May 30, 2005; and various websites


Related stories can be found in the following categories.



  • Engineering

  • Human Computer Interface

  • Innovation

  • Music


A ‘Smart’ Email Software Organizes Your Tasks

You probably receive dozens of emails every day about various aspects of your business or personal life. And because your email program doesn’t understand the relationship between messages, except for the occasional thread, you have to manage your activities by looking through lists of emails. But now, two computer scientists from University College Dublin (UCD) and IBM have developed the Active Email Manager (AEM) and have even filed patents for a ‘smart’ email program. Their prototype can tell work-related tasks, which it assigns to a workflow, from personal email. This software could be integrated into commercial products from IBM within two years.


Here are some details about the project.


A University College Dublin (UCD) scientist has filed a patent application for a new technology that he believes can turn email into a much more effective business tool. US-born Dr Nicholas Kushmerick, a senior lecturer in the Department of Computer Science at UCD, has developed the technology over the past year during his part-time position as visiting scientist on IBM’s Centre for Advanced Studies (CAS) initiative.

Kushmerick developed the technology, known as Active Email Manager (AEM), in concert with New York-based IBM researcher Tessa Lau. Together they developed a machine-learning algorithm that automatically keeps track of tasks and associated emails, in order to build up a work flow for each task.

“The vision is that rather than come in and download all your emails, you could just call up your to do list and manage your activities,” Kushmerick explains.

Now, the two researchers have developed a prototype of the software and are busy testing it. And IBM wants to use the technology in some of its future products.


The technology is currently being appraised by two separate research groups within IBM, with the aim of turning into a commercial product. One of these is the Massachusetts-based product development team that develops IBM’s suite of collaboration software, Lotus Workplace. “There are some pretty intensive discussions going on now to see if we can get enough attention and convince them that our idea is feasible and that they would put it into their product pipeline,” says Kushmerick.

The research work was presented at the 2005 International Conference on Intelligent User Interfaces (IUI 2005), which was held on January 9-12, 2005, in San Diego, California. You can find the abstract of the paper called “Automated Email Activity Management: An Unsupervised Learning Approach” in the 2005 Conference Program.


Many structured activities are managed by email. For instance, a consumer purchasing an item from an e-commerce vendor may receive a message confirming the order, a warning of a delay, and then a shipment notification. Existing email clients do not understand this structure, forcing users to manage their activities by sifting through lists of messages. As a first step to developing email applications that provide high-level support for structured activities, we consider the problem of automatically learning an activity’s structure. We formalize activities as finite-state automata, where states correspond to the status of the process, and transitions represent messages sent between participants. We propose several unsupervised machine learning algorithms in this context, and evaluate them on a collection of e-commerce email.
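To make the finite-state-automaton idea more concrete, here is a minimal Python sketch of the purchase example from the abstract. The state names, the message labels and the `Activity` class are assumptions made for illustration; they do not come from the paper. The transition table is also written by hand here, whereas the paper is about learning such a structure automatically from email.

```python
# Hand-written automaton for the purchase example in the abstract.
# States and message types are hypothetical labels chosen for illustration.
TRANSITIONS = {
    ("start", "order_confirmation"): "ordered",
    ("ordered", "delay_warning"): "delayed",
    ("ordered", "shipment_notification"): "shipped",
    ("delayed", "shipment_notification"): "shipped",
}


class Activity:
    """Tracks the status of one activity as messages arrive."""

    def __init__(self):
        self.state = "start"
        self.history = []

    def receive(self, message_type):
        """Advance the automaton; return False if the message does not fit."""
        key = (self.state, message_type)
        if key in TRANSITIONS:
            self.state = TRANSITIONS[key]
            self.history.append(message_type)
            return True
        return False


def run(messages):
    """Feed a list of message types to a fresh activity and return its state."""
    activity = Activity()
    for message_type in messages:
        activity.receive(message_type)
    return activity.state
```

With such a table, an email client could show the current status of each purchase instead of a flat list of messages.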

Please note that this work received an Honorable Mention for the Outstanding Paper Award at IUI 2005.


For more information, here is a link to the full version of this paper (PDF format, 8 pages, 234 KB), available from Kushmerick’s website.


Finally, you might want to read an article from Technology Research News on this subject, “Software organizes email by task.”


Sources: Brian Skelly, Silicon Republic, Ireland, April 6, 2005; and various websites


Related stories can be found in the following categories.



  • Email

  • Human Computer Interface

  • IBM

  • Patents

  • Software


It’s Time for a Conversation with your Computer

It took almost thirty years to get decent speech recognition programs on our computers. But while they’re good enough to translate our words into characters, they can’t engage in a conversation with us (I must say that some humans can’t either). But according to this article from Technology Research News, things are changing. Computer scientists from Scotland and California have designed a multithreaded system which can anticipate what you’re going to say and can also switch context when you jump from one topic to another. This approach, which could be used in a wide range of applications, is welcome. Unfortunately, these researchers have selected the name “Conversational Interface Architecture” for their system, which leads to the worrisome acronym CIA. Anyway, the first commercial applications should be available within two years. Read more…


Here is a general description of this dialogue management system.


Researchers from Edinburgh University in Scotland and Stanford University have built a dialogue management system that promises to improve verbal communication with computers by giving the machine a sense of the type of phrase a person is likely to say next.

The Conversational Interface Architecture goes beyond the slot-filling dialogue systems commonly used for airline ticket booking systems by tracking multiple conversation threads, said Oliver Lemon, a senior research fellow at Edinburgh University. Slot-filling dialogue systems prompt users to provide topic-specific information and listen for keywords that determine the system’s response to the user.

And here are some details on how this dialogue management works.


The software follows multithreaded conversations — those that switch back and forth between several topics — without having to be programmed, regulates particular topics, and uses this information to improve speech recognition rates, according to the researchers. It also recognizes corrective fragments — phrases that correct something a user has just said — and it allows users to initiate, extend and correct dialogue threads at any time.

The system accomplishes this by tracking different types of utterances, including yes or no answers; who, what, where answers; and corrections like “I meant the office” and “not the tree.”

[Note: An utterance is a complete unit of talk, bounded by silence.]


It’s interesting to note that, by using this analysis of utterances, the system can work with any speech recognition system.


What could we do with such software?


The approach could be used in a wide variety of speech recognition systems including telephone-based information systems, interactive entertainment devices, robots, computer interfaces for the visually impaired, in-car dialogue applications, and speech interfaces for personal computers.

Another question remains: when will such systems be available?


The context-sensitive component of the researchers’ system could be applied to practical applications now, said Lemon. Multithreaded dialogue management could be used practically within two years, he said.

This research work was published last year in the September 2004 issue of ACM Transactions on Computer-Human Interaction (TOCHI), which is a journal (Volume 11, Issue 3, Pages 241-267).


Here is a link to the abstract of this paper named “Multithreaded context for robust conversational interfaces: Context-sensitive speech recognition and interpretation of corrective fragments.” Here is a summary of their results.


In an evaluation of a dialogue system built using this architecture we found that 87.9 percent of recognized utterances were recognized using a context-specific language model, resulting in an 11.5 percent reduction in the overall utterance recognition error rate, and a 13.4 percent reduction in concept error rate. Thus we show that by using context-sensitive recognition based on the predicted type of the user’s next dialogue move, a more flexible dialogue system can also exhibit an improvement in speech recognition performance.
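The idea of context-sensitive recognition can be illustrated with a small Python sketch. It is only a toy under stated assumptions: the move names, the phrase sets standing in for language models and the `pick_hypothesis` function are all hypothetical, and a real system would use statistical language models inside the recognizer rather than set lookups.

```python
# Hypothetical mapping from the system's last dialogue move to the
# type of user utterance expected next.
EXPECTED_REPLY = {
    "yn_question": "yes_no_answer",
    "wh_question": "wh_answer",
}

# Toy stand-ins for context-specific language models: the phrases
# each utterance type is allowed to contain.
LANGUAGE_MODELS = {
    "yes_no_answer": {"yes", "no", "okay"},
    "wh_answer": {"the office", "the tree", "the tower"},
}


def pick_hypothesis(last_system_move, hypotheses):
    """Choose among recognizer hypotheses (ordered best-first, non-empty).

    Prefers the first hypothesis accepted by the context-specific model
    and falls back to the recognizer's top choice when none matches.
    """
    expected = EXPECTED_REPLY.get(last_system_move)
    model = LANGUAGE_MODELS.get(expected, set())
    for text in hypotheses:
        if text.lower() in model:
            return text, expected
    return hypotheses[0], None
```

For example, after a yes-or-no question the sketch prefers “no” over the acoustically similar “know”, which is the kind of gain the predicted dialogue move is meant to provide.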

Sources: Eric Smalley, Technology Research News, April 6/13, 2005; and various websites


Related stories can be found in the following categories.



  • Computers

  • Human Computer Interface

  • Innovation

  • Software


Dancing With Data

Some students are luckier than others — or have more fun. For example, this Stanford University report says that some of the students there may have some hard and physical work to do: dancing. But in exchange, they’re working with sensors, cameras and computers to study how a dancer of the Merce Cunningham Dance Company is moving. This must be exhilarating, especially after finding — and confirming — that he acts as a ‘biomechanical rebel.’


Here is the experience of Jonah Bokaer, a dancer from the Merce Cunningham Dance Company, who was enrolled in the program.


The test subject danced wearing only blue shorts and the 50 silver balls the size of marbles that stuck to his skin, mapping out his physique.

“I know what I think my body is doing. But is it really doing that? I don’t really know, but I’d like to,” he said during a break in the afternoon session at the Motion and Gait Analysis Laboratory at Lucile Packard Children’s Hospital.

A member of the Merce Cunningham modern dance company, Jonah Bokaer said he couldn’t wait to see the results — a digital record of his skeleton’s behavior as it undulates, spins and leaps.





Here is a photograph of Jonah Bokaer equipped with reflective markers for the cameras tracking his dance moves (Credit: Amy Ladd, Stanford University).

His moves are monitored by students of the Anatomy of Movement class which is now in its second year.


“We’re looking upside down, inside out, at the human body,” said course director Amy Ladd, MD, professor of orthopedic surgery. “It’s not the way any single discipline would frame the study of movement.”

Ladd added, “Each project reflects an integration of disciplines spanning the humanities and sciences to portray human movement.” The exercise was part of an extensive series of interdisciplinary art projects that were tied to Cunningham’s performances on campus last week.

So what methods are these students using to analyze a dancer’s movements?


Eight cameras in the lab tracked the motion of the silvery balls on their test subjects: Cunningham dancers Frank and Bokaer and course director Ladd, who also happens to be a trained ballet dancer.

“We thought that the study needed a comparison, and analyzing someone in pointe shoes would be a good contrast,” said Ladd, who has studied ballet for years. “So I reluctantly agreed.”

The cameras sent the data to a computer, operated by motion analysis lab’s engineer Erin Butler. The output includes motion capture of dancers as well as quantitative information.
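As an illustration of the kind of quantitative information that can be derived from marker positions, here is a short Python sketch which computes the angle at a joint from three 3D markers. The function name and the choice of markers are assumptions; the article does not say which measurements the lab actually computes.

```python
import math


def joint_angle(a, b, c):
    """Return the angle in degrees at marker b formed by markers a and c.

    Each marker is an (x, y, z) position, for example hip, knee and ankle
    to get a knee angle (a hypothetical choice of markers).
    """
    ba = [p - q for p, q in zip(a, b)]
    bc = [p - q for p, q in zip(c, b)]
    dot = sum(u * v for u, v in zip(ba, bc))
    norm = math.sqrt(sum(u * u for u in ba)) * math.sqrt(sum(v * v for v in bc))
    if norm == 0:
        raise ValueError("markers must not coincide")
    # Clamp to guard against rounding slightly outside [-1, 1].
    cosine = max(-1.0, min(1.0, dot / norm))
    return math.degrees(math.acos(cosine))
```

Applied to every captured frame, such a function would give a curve of joint angle over time for a given movement.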

But what do you learn from such interdisciplinary projects?


Projects like this, mixing science with art, are challenging to conceptualize, said Ladd. “We’re looking for projects that merge science and art. No one really knows how to do this well yet. It’s a difficult mix. It calls for a philosophical paradigm shift for people who have been trained to think in one realm or the other.”

Here is a link to the other projects at the Anatomy of Movement.


To conclude, this is not the first time that Stanford University has mixed several disciplines, such as arts, sports and science. Check for example this article from Technology Research News, “Sensors track martial arts blows.”


Sources: Rosanne Spector, Stanford University Report, March 16, 2005; and various websites


Related stories can be found in the following categories.



  • Arts

  • Computers

  • Education

  • Human Computer Interface

  • Sensors


Virtual Reality and the Art of Medical Interview

Medical students often learn to ask questions such as “Tell me where it hurts” with live actors who are following prepared scripts. But this is expensive, and the University of Florida (UF) has developed a new way to teach the subtle art of the patient-doctor interview. This news release, “UF’s Virtual Reality ‘Patient’ Teaches Bedside Manners to Medical Students,” tells us more about DIANA, which stands for “DIgital ANimated Avatar” and is a life-sized image of a young woman. Her image, complemented by a simulation of a doctor’s office, is projected in front of a student who can interview her. So far, the method has only been used by about two dozen students, but results are promising. Read more…


Let’s start with the introduction of DIANA.


“DIANA,” which stands for DIgital ANimated Avatar, is a life-sized image of a 19-year-old Caucasian female with a passing resemblance to video game hero Lara Croft. Her image, complete with simulated doctor’s office in the background, is projected onto a wall. Through their interviews with her, medical students can practice not only the right questions to ask to come to an accurate diagnosis but also the less straightforward aspects of human interaction such as gestures and eye contact.

“We want to focus on communication,” said Benjamin Lok, an assistant professor in UF’s Computer and Information Science and Engineering (CISE) department and the lead researcher on the project. “Part of (the interview training) is to get the right answer, but part of it is to learn communication skills.”

The images below show how the whole system works.


In this image, a student is diagnosing DIANA, a ‘virtual’ patient with acute abdominal pain, while the instructor watches. The colored headset is for head tracking (Credit: CISE, University of Florida).

This screenshot shows a close-up of DIANA, the virtual patient, complaining of acute abdominal pain (Credit: CISE, University of Florida). This image, and the other one below, come from the Virtual Objective Structured Clinical Examination (VOSCE) project webpage.

“Head tracking data shows where the medical student is looking during the interview. This student looked mostly at DIANA’s head and thus maintained adequate eye-contact for the scenario.” (Credit: CISE, University of Florida)

Now, here is the current status of this project — and its promising results.


Currently, medical students can practice interviewing skills with “standardized patients,” live actors who are given a script to follow for the interview. However, training the actors can be expensive, and it can be difficult to find sufficiently diverse populations of actors, a factor that can make a subtle difference in the interview process, Lok said. The system, which costs less than $10,000, would help students train for the standardized patient interviews, making those sessions more effective, Lok said.

Seven medical students tested DIANA in August, and another 20 interviewed her in December. After each test, the students rated the realism and usefulness of the interviews on a one-to-10 scale. By December, DIANA’s average rating of 7.2 was nearly identical to the 7.4 average for the live actors.

Of course, this system is not perfect and researchers are working on some of its limitations.


Though those results are promising, DIANA isn’t ready to replace live actors yet, Lok said. She can look up when she is spoken to, look down during pauses, reach out to receive a handshake. But there are many other physical cues in human conversations that can provide information to a doctor and also reassure a patient that the doctor is paying attention, he added.

“There are so many things that you and I do when we talk — I can tell whether your eyes are focusing on me, whether you’re listening, hand gestures, facial gestures, body posture. These are things that the computer can’t do — but we’re working on that,” he said.

For more information about this new computer interface for medical students, you can read this document, “Experiences in Using Immersive Virtual Characters to Educate Medical Communication Skills” (PDF format, 8 pages, 959 KB). The top illustration above comes from this document.


Sources: University of Florida news release, March 12, 2005; and various websites


Related stories can be found in the following categories.



  • Human Computer Interface

  • Medicine

  • Virtual Reality

  • Vision and Visualization Applications


IBM Mouse Helps People with Shaky Hands

A friend of mine who volunteered to help senior citizens use computers once told me that the biggest hurdle was not technical (people can learn throughout their lives) but physical. Many old people have trembling hands which prevent them from using a mouse to point and click on a small icon on a computer screen or a link on a browser page. Now, according to this article from ExtremeTech, IBM has unveiled a mouse adapter which treats these tremors as “noise” and filters out the unintentional movements of the hand. This adapter will also help the ten million people who are affected every year by Essential Tremor, a genetic disorder, and who aren’t necessarily old. It will be sold for about $100. Read more…


Let’s start with some pictures.


Here is Hugh Pearson of Montrose Secam holding one of these mouse adapters (Credit: IBM Research). Here is a link to a high-quality version of the same image (1,960 x 3,008 pixels, 4.5 MB).

And here is another picture of this adapter sitting next to a computer mouse (Credit: Montrose Secam).

Here is how it works.


The new mouse treats the hand tremors as noise, and uses algorithms based on image-stabilization systems used in digital cameras.

[As you can see on the above pictures,] the mouse includes a physical dial to control the sensitivity of the mouse, as well as how quickly the user needs to double-click. Normally, these functions are handled by software controls — which require a mouse to adjust.
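The articles don’t describe IBM’s algorithm in detail, but the general idea of treating a tremor as noise can be sketched with a very simple low-pass filter. The Python example below uses an exponential moving average over pointer positions; this is a stand-in technique chosen for illustration, not the adapter’s actual method, and the `smooth_positions` function and its `alpha` parameter are assumptions.

```python
def smooth_positions(positions, alpha=0.2):
    """Low-pass filter a sequence of (x, y) pointer positions.

    alpha is a hypothetical smoothing factor in (0, 1]: smaller values
    suppress more jitter but make the pointer lag further behind the hand.
    """
    smoothed = []
    sx = sy = None
    for x, y in positions:
        if sx is None:
            # The first sample initializes the filter state.
            sx, sy = float(x), float(y)
        else:
            sx = alpha * x + (1 - alpha) * sx
            sy = alpha * y + (1 - alpha) * sy
        smoothed.append((sx, sy))
    return smoothed
```

In this sketch, `alpha` plays a role comparable to the sensitivity dial mentioned above: turning it down filters more aggressively at the cost of a slower pointer.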

As I wrote above, this inability to precisely use a computer mouse doesn’t affect only the elderly.


Although tremors are usually associated with the elderly, a type of tremor called Essential Tremor is actually a genetic disorder that affects 10 million people per year, according to the International Essential Tremor Foundation (IETF).

This mouse adapter will be distributed by Montrose Secam, a British electronics company. You can buy it now for £67.50, 119.00 euros or $107.00, depending on where you live.


For more information, you can read these two articles from the Mercury News, “Algorithm box smoothes hand tremors on mouse,” and from the San Francisco Chronicle, “Helping hand for those with shaky hands.”


Finally, you might want to read this IBM press release, “Mouse adapter gives computer access to millions of hand tremor sufferers,” which offers additional details and links.


Sources: ExtremeTech Staff, March 14, 2005; Therese Poletti, Mercury News, March 14, 2005; Benjamin Pimentel, San Francisco Chronicle, March 14, 2005; and various websites


Related stories can be found in the following categories.



  • Computers

  • Human Computer Interface

  • Innovation

  • Medicine


Smart Carts Coming Soon to a Retailer Near You?

In this article, eWEEK reports on new smart carts announced by Fujitsu. After entering your shopping list on your Bluetooth-enabled PDA, you’ll go to your supermarket and pick a smart cart which will download your list to the rugged screen (read brat-proof) of its $1,200 unit. The system will lead you around the store, alert you about promotions, show you new recipes and update the shopping list in real time. It can also send messages to the deli or pharmacy sections and tell you when your order is ready. The U-Scan Shopper system allows you to remain anonymous and to receive only regular store promotions. Or you can use a loyalty card and receive targeted ads or recipes based on your shopping history, which will be maintained in the retailer’s databases. The article doesn’t say anything about shoppers who still use paper lists, but I bet these carts are still not smart enough to guess what is written on them. Read more…


Here is the rosy scenario imagined by eWEEK.


With a craving for broiled salmon, Jane quickly sifted through her spice cabinet only to find that her bottle of dill weed was nearly empty.

With a few clicks on her Bluetooth-enabled PDA, she updated her Web shopping list with the dill, and a window opened onscreen suggesting a new salmon recipe. It looked good, so she approved the recipe and her shopping list was instantly updated with all of the necessary items, omitting those that her kitchen already had in stock.

Jane drove the half-mile to her local supermarket where she grabbed a smart cart, scanned her loyalty card and saw her updated shopping list appear in front of her. It had been categorized by aisle, and the cart directed Jane to each item. While she was checking for brown spots on broccoli heads in the produce aisle, her cart signaled the pharmacy to prepare a prescription refill and sent an order for lunch meats to the deli.
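The “categorized by aisle” step in this scenario is easy to picture in code. Here is a minimal Python sketch, assuming a hypothetical `AISLE_OF` lookup table that maps items to aisle numbers; the real U-Scan Shopper data model is not described in the article.

```python
from collections import defaultdict

# Hypothetical store layout: item -> aisle number.
AISLE_OF = {"dill weed": 4, "salmon": 9, "broccoli": 1, "lemons": 1}


def route(shopping_list, aisle_of=AISLE_OF):
    """Group a shopping list by aisle and order the aisles for one pass.

    Returns (plan, unknown): plan is a list of (aisle, items) pairs sorted
    by aisle number, unknown lists items the store layout does not cover.
    """
    by_aisle = defaultdict(list)
    unknown = []
    for item in shopping_list:
        if item in aisle_of:
            by_aisle[aisle_of[item]].append(item)
        else:
            unknown.append(item)
    plan = [(aisle, sorted(items)) for aisle, items in sorted(by_aisle.items())]
    return plan, unknown
```

Sorting by aisle number is the simplest possible ordering; a real store would presumably route the cart according to its floor plan.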


Here is an example of what you’ll see on this smart cart’s screen. (Credit: Fujitsu Transaction Solutions Inc.) Here is a link to a larger version which is customizable by retailers to fit their own needs.


Now, let’s look at Fujitsu’s business model for these smart carts.


Equipping an ordinary shopping cart with Fujitsu’s new U-Scan Shopper unit will cost about $1,200. That price also includes about 60 infrared triggers to be strategically placed along various store shelves to help the cart find its path around the store. Fujitsu is hoping to sell 100 carts at each typical grocery store, according to Vernon Slack, Fujitsu’s director of mobile solutions.

And now, here are more details about the essential part of the system, its screen.


Slack points to more practical advantages of Fujitsu’s design, such as their claim that their cart-handle-mounted unit (it’s literally bolted on) is small enough to allow for a child to sit in the traditional front-compartment, while some rival units are too large.

The “just less than two-pound” unit with the 6.5-inch display is surrounded with a quarter-inch of hardened Mylar plastic making it almost indestructible, even by a curious child, Slack said. The units are also sealed with a polycarbonate cover.

Slack sees the fact that the cart would already have the unit bolted to its bar when the customer arrives as an advantage over smart-cart approaches where the customer has to pick up a pad at customer service or at the entrance and place it in the cart.

The article from eWEEK also looks at the privacy concerns which could be raised by such devices. It also addresses the checkout issues — yes, you’ll still have to pay — and the delivery of personal ads.


Now, I have two questions for you. Would you like to find these smart carts at your local supermarket? And who will profit the most from these systems, the retailers or you?


Sources: Evan Schuman, eWEEK, February 16, 2005; Fujitsu Transaction Solutions Inc.


Related stories can be found in the following categories.



  • Business Intelligence

  • Human Computer Interface

  • Pervasive Computing

  • Wireless

