Andrew D. Wilson is a senior researcher at Microsoft Research who also co-authored another paper that has been read for this class.
Hrvoje Benko is also a researcher at Microsoft Research who focuses on Human-Computer Interaction.
This paper was presented at UIST 2010.
Summary
Hypothesis
The hypothesis in this paper is that it's possible to use multiple depth cameras and projectors to interact with an entire physical room space.
Methods
To show that their hypothesis is feasible, the researchers implemented several components using multiple depth cameras and projectors.
The first component they discussed was their simulated interactive surfaces. They wanted any surface (like a table or a wall) to become interactive by projecting data and objects onto that surface. The surface could then be interacted with through movements captured by the depth cameras.
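To make this more concrete, here is a minimal sketch (my own illustration, not the authors' implementation) of how touch detection on an ordinary surface can work with a depth camera: compare each depth pixel against a stored depth map of the empty surface and treat anything hovering within a thin band just above it as a touch. The function name and thresholds are my own assumptions.

import numpy as np

def detect_touches(depth_frame, surface_depth, near_mm=10, far_mm=40, min_pixels=30):
    """Flag pixels that sit slightly above the stored surface depth map.

    depth_frame, surface_depth: 2D arrays of depth in millimeters.
    near_mm/far_mm: band above the surface that counts as "touching".
    Returns a boolean mask of candidate touch pixels.
    """
    height_above = surface_depth - depth_frame           # positive = closer to the camera than the surface
    mask = (height_above > near_mm) & (height_above < far_mm)
    if mask.sum() < min_pixels:                          # ignore sensor noise
        return np.zeros_like(mask)
    return mask

# Usage idea: capture surface_depth once with the room empty, then call
# detect_touches(frame, surface_depth) on every new depth frame and cluster
# the resulting mask into individual finger or hand contacts.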
The next component was interactions between the user's body and the surfaces. One of these interactions consisted of the user touching an object on one surface and then touching a location on another surface; completing this action moved the selected object between the two locations. Another interaction let users pick up objects from a surface: when an object was picked up, an orange "ball" appeared on the user's hand, allowing them to carry the object to another surface.
The final component they added was spatial menus. A user can hold his or her hand over the menu location for a few seconds to activate it, and the menu is then projected onto the hand.
While there have been systems similar to LightSpace, none have combined all the features that the researchers discussed including the usage of multiple depth cameras.
Their full test consisted of allowing users to interact with the LightSpace prototype at a demo event.
Results
They found that there were multiple occasions on which LightSpace might fail. One was having too many users in the space, which caused slowdowns or interactions that were not handled well. Some interactions also failed because a user's hand was accidentally occluded from the cameras by their head or body.
Some users even developed new ways to interact with the LightSpace system.
Discussion
Systems like LightSpace, in my opinion, are part of the future. Being able to interact with objects in a full 3D space is something like what you see in a science fiction movie. However, I feel like there are many advances the researchers could add to make the system even better.
One is a better tracking system. They mentioned in the paper that they left out 3D hand tracking. New algorithms are being released that allow for quick and easy 3D hand tracking; these could be added for easier, more efficient tracking and even better results.
I think for a system like this to catch on, camera and projection techniques will also have to improve. When a user gets in the way of the camera or projection, the interaction between the user and system is disrupted.
Monday, September 26, 2011
Paper Reading #12: Enabling Beyond-Surface Interactions for Interactive Surface with an Invisible Projection
Li-Wei Chan, Hsiang-Tao, Hui-Shan Kao, Ju-Chun Ko, Home-Ru Lin, Mike Y. Chen are graduate students at the National Taiwan University.
Jane Hsu is a computer science professor at the National Taiwan University.
Yi-Ping Hung is also a professor at the National Taiwan University.
This paper was presented at UIST 2010.
Summary
Hypothesis
The researchers set out to prove that interactions can occur beyond the surface of a touch display. By using invisible infrared projection, users can employ other devices to interact with the scene.
Methods
In order to test the hypothesis, the researchers had to create their own custom table design. The table makes use of multiple IR cameras as well as a color projection layer coupled with an IR projection layer. The color projector produces the image that the human eye sees, while the IR projection layer carries additional data that allows other devices to interact with it.
By providing this IR projection layer, mobile devices can easily determine their orientation relative to the table.
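As an illustration of how a device might work out its pose relative to the table from the invisible markers (my own sketch, not the authors' code), one common approach is to match the marker points detected in the device's camera image against their known positions on the tabletop and recover a homography. The function name and the 5.0-pixel RANSAC threshold are my assumptions.

import numpy as np
import cv2  # OpenCV

def estimate_table_pose(detected_pts, table_pts):
    """Estimate the mapping between the device camera image and the tabletop.

    detected_pts: Nx2 pixel coordinates of IR markers seen by the device camera.
    table_pts:    Nx2 known coordinates of those same markers on the table (e.g. in mm).
    Returns a 3x3 homography that warps camera coordinates onto table coordinates.
    """
    src = np.asarray(detected_pts, dtype=np.float32)
    dst = np.asarray(table_pts, dtype=np.float32)
    H, inliers = cv2.findHomography(src, dst, cv2.RANSAC, 5.0)
    return H

# With H in hand, the device can project its own view center into table
# coordinates, i.e. figure out which part of the map it is pointed at.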
The researchers also provided three different ways to interact with the table.
The first was the i-m-Lamp. The lamp has a pico projector and an IR camera attached to it. When the lamp is pointed at the table, the IR camera detects where it is looking, and the system figures out what to overlay on top of the color image.
The i-m-Flashlight is very similar to the i-m-Lamp, but it provides a more dynamic way to explore the surface (as opposed to the more static lamp).
The final interaction method was the i-m-View: a tablet with a camera that could detect what part of the table it was looking at and display a 3D projection of the given map.
Results
Through initial results, they found that users treated the lamp as a static object (as predicted) and used the flashlight to quickly select objects and display relevant information.
The flashlight was a much more dynamic object for the participants.
The view was interesting from the participants' perspective, but it fell short when a user tried to look at the 3D buildings from certain angles: the view could no longer detect what it was looking at since it couldn't see the table.
Discussion
The system that was created by the researchers could easily find a place in the field of augmented reality in my opinion. In fact, the i-m-view was really an experiment in augmented reality.
I think an interesting path to take this kind of technology would be to use mobile phones to interact with the large table, as well.
I envision a board game (or something similar) where players can share the same view but the mobile device could provide a unique view to the player based on where they're looking on the board.
This kind of technology has many different applications as well. For example, imagine looking at a large phone directory on your table surface. By hovering your phone over the table, it could detect what name you were looking at and present a "Call Number?" message.
Paper Reading #11: Multitoe, High-Precision Interaction with Back-Projected Floors Based on High-Resolution Multi-Touch Input
Thomas Augsten is currently a graduate student working on his masters in IT Systems at the University of Potsdam.
Konstantin Kaefer is currently also a student at the University of Potsdam.
René Meusel is a student at the University of Potsdam.
Caroline Fetzer is a Human-Computer Interaction student at the University of Potsdam.
Dorian Kanitz is a student at the University of Potsdam.
Thomas Stoff is a student at the University of Potsdam.
Torsten Becker is a graduate student at Potsdam and also holds a B.S. in IT Systems engineering.
Christian Holz is a PhD student at Potsdam and has a research interest in Human-Computer Interaction.
Patrick Baudisch is a professor of Computer Science at the University of Potsdam and the head of the Human-Computer Interaction research department.
This paper was presented at UIST 2010.
Summary
Hypothesis
The researchers set out to prove that meaningful interaction can occur through floor-based input and output.
Methods
The actual floor system that they constructed was an FTIR (frustrated total internal reflection) device, created to project images onto the floor and to handle touch input from a user's shoes.
In order to figure out the most natural ways to design the system, the researchers carried out multiple experiments to see how people would naturally interact with floor user interface objects.
The first was an experiment to see "how not to activate a button." Participants were asked to activate one fake button on the floor while not activating another. This test let the researchers see the most preferred way to interact with floor buttons.
Another experiment had participants step on the FTIR surface and choose which buttons on the floor should be highlighted. This allowed the researchers to discover users' conceptual model of a foot press.
Then, to facilitate precise input, participants were asked to choose a preferred "hotspot" on their foot, which was then used to select small, precise targets.
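A rough sketch of how a user-chosen hotspot might be applied (my own illustration, assuming the floor reports a foot contact region plus its orientation; the function name and example offset are made up): store the hotspot as an offset inside the foot outline and map it into floor coordinates on every frame.

import math

def hotspot_to_floor(contact_center, foot_angle_rad, hotspot_offset):
    """Map a per-user hotspot to an exact floor coordinate.

    contact_center: (x, y) center of the foot's contact region on the floor.
    foot_angle_rad: orientation of the foot relative to the floor's x axis.
    hotspot_offset: (forward, sideways) offset of the user's chosen hotspot,
                    measured in the foot's own coordinate frame (e.g. cm).
    """
    fwd, side = hotspot_offset
    cos_a, sin_a = math.cos(foot_angle_rad), math.sin(foot_angle_rad)
    x = contact_center[0] + fwd * cos_a - side * sin_a
    y = contact_center[1] + fwd * sin_a + side * cos_a
    return (x, y)

# E.g. a user who picked the tip of their big toe as the hotspot might
# calibrate roughly hotspot_offset = (12.0, 3.0) once, and every later
# foot press would be resolved to that exact point.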
Various other features were also added to the FTIR including being able to identify users based on shoe prints. They also were able to do more complex recognition such as identifying a walking motion.
Results
In many of the tests they performed, participants often had their own way of doing things. In the "how not to activate a button" test, most preferred to activate a button by tapping and to avoid activation by simply walking over it, though many other actions were used as well.
In a similar fashion, the hotspot test saw many users choose unique hotspots. Eventually the designers decided to simply let users set their own hotspot.
The researchers also successfully implemented the other features previously discussed, such as distinguishing tapping from walking and identifying users by analyzing their shoe soles.
Discussion
This paper was organized as more of a documentation of the process to create a system that recognizes foot input. This organization made the paper more interesting and enjoyable in my opinion.
The important distinction to make about this interesting piece of technology is that foot-based touch screens are a novel technology best suited to a particular niche.
It's not meant to be a substitute for a computer or iPhone, for example.
I can easily see this type of technology employed in future homes, offices, or even restaurants. A smart home could recognize its owner by their footprint and gait and complete tasks based on the user's location. For example, when you get home from work in the afternoon, your house knows from your walking that you're headed to the kitchen and turns on the light. Or if you're hosting a party, a foot gesture could bring up a menu that lets you put on new music.
Wednesday, September 21, 2011
Book Reading: Gang Leader for a Day
I never really ever thought that a sociology book could be interesting. That is until I read Gang Leader for a Day.
One of the things I enjoyed about the book was that it was a real, visceral study of the projects and what goes on in gangs. I feel one of the big mistakes that sociologists make is the sanitization of the data. What I mean is that, even if the study is about a group of people, the sociologists focus on the data alone without looking at the implications behind it.
For example, we see studies all the time about unemployment or homelessness rates. But to us, those are just figures. They don't apply to us. We can't imagine them or picture what they actually mean. I think Venkatesh realized that. As opposed to just asking questions and collecting statistics, he builds relationships and explores what it means to live in the projects and to be involved with the gangs. If sociologists want to study society, what better way is there than to immerse yourself in a given community?
Towards the end of the book, you find that the projects all those people rely on are being torn down. I can't help but think that if the president had gone and lived among those people like Sudhir, some other alternative would have been reached. Instead, the president was probably delivered a nice clean report that detailed the small population and the large amount of gang activity.
You can't study society with data alone. To truly study society, one must engage, interact, and learn about the lives of those within it.
Paper Reading #10: Sensing Foot Gestures from the Pocket
Sensing Foot Gestures from the Pocket
Jeremy Scott is currently a graduate student at MIT but previously earned his bachelor of science at the University of Toronto.
David Dearman is a PhD student at the University of Toronto who is focusing on HCI.
Koji Yatani is also currently a PhD student at the University of Toronto with an interest in HCI.
Khai N. Truong is an associate professor at the University of Toronto involved in HCI.
This paper was presented at UIST 2010.
Summary
Hypothesis
The main hypothesis of this paper was that foot gestures could provide a means of eyes-free input with the gestures being interpreted from a pocket by a cell phone.
Methods
The researchers attempted to prove their hypothesis by completing two studies.
In the first study, they had participants make various foot gestures that were recorded and studied by a camera-based tracking system. The four gestures they tested were dorsiflexion, plantar flexion, heel rotation, and toe rotation. Participants tried to select a target by making the specified gesture.
In the next study, they used a mobile device and its accelerometer to detect foot gestures. Each participant wore three mobile devices: one at the front, one at the side, and one at the back. The user would initiate a foot gesture by "double tapping" their foot before performing the gesture. The two gestures they focused on were heel rotation and plantar rotation.
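To make the delimiter concrete, here is a minimal sketch (my own, with made-up thresholds and function name, not the authors' recognizer) of how a "double tap" of the foot might be picked out of a stream of accelerometer magnitudes before the system starts listening for a gesture.

def find_double_tap(accel_magnitudes, timestamps, spike_threshold=2.5,
                    min_gap_s=0.1, max_gap_s=0.6):
    """Return the time of a double tap, or None if no double tap is found.

    accel_magnitudes: acceleration magnitude per sample (in g), gravity removed.
    timestamps:       matching sample times in seconds.
    A "tap" is a sample whose magnitude exceeds spike_threshold; two taps
    separated by min_gap_s..max_gap_s count as a double tap.
    """
    taps = [t for a, t in zip(accel_magnitudes, timestamps) if a > spike_threshold]
    for first, second in zip(taps, taps[1:]):
        if min_gap_s <= second - first <= max_gap_s:
            return second
    return None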
Results
In the first study, they found that smaller angles were generally more accurate for target selection. They also found that participants preferred heel rotation because it was more comfortable.
In the second study, they found that the phone in the side pocket generally gave much higher success rates. Over all gestures, the side pocket placement of the iPhone gave an accuracy rate of 85.7%.
Based on this high accuracy, they concluded that foot gestures are indeed a viable input for an eyes-free device, and they began to prepare future studies using foot gestures.
Discussion
While this is mainly a reiteration of a point they discussed in the paper, having a foot sensor that could control your phone would be very helpful. For example, if you're sitting in class or standing around having a conversation and you have an incoming call, it'd be very helpful to automatically forward the call to voicemail by making a discreet foot gesture.
A follow-up to this study might be to remove the phone from the detection system. Perhaps a small, unobtrusive sensor could be placed in the user's shoes and interface with the phone. That way, one could get even higher success rates (the sensor would be right at the foot as opposed to on the hip, as in the study). Having the phone act as the sensor is very interesting, but a simple wireless detection sensor wouldn't be obtrusive or complicated either.
Tuesday, September 20, 2011
Paper Reading #9: Jogging over a Distance Between Europe and Australia
Jogging over a Distance between Europe and Australia.
Florian Mueller is currently a visiting scholar at Stanford University. However, he's previously worked at the University of Melbourne and Microsoft Research in Asia.
Frank Vetere is a senior lecturer specializing in Human-Computer Interaction at the University of Melbourne.
Martin R. Gibbs is a lecturer at the University of Melbourne.
Darren Edge is a Microsoft Research Asia researcher who focuses on Human-Computer interaction.
Stefan Agamonlis is currently an assistant director of the Rebecca D. Considine Research Institute but previously he worked as the chief executive and research director of Distance Lab in Scotland.
Jennifer G. Sheridan is currently the director of User Experience at Big Dog Interactive.
This paper was presented at UIST 2010
Summary
Hypothesis
The main idea behind this paper was to see whether an activity like "social jogging" would enhance the experience of the activity, and whether a "technologically augmented social exertion activity" would be possible or worth designing.
Methods
The system the researchers created was composed of a small headset that participants wore, attached to a small cell phone and a heart rate monitor. While running, a participant could talk to their running partner over the phone through the mic. The heart rate monitor added a further social element: before running, participants set a target heart rate, and how near or far the partner's voice sounded was tied to each runner's heart rate relative to that target. For example, if partner A was at half their target heart rate and partner B was at their target heart rate, A's voice would sound behind to B and B's voice would sound ahead to A. This gave the participants a sense of who was "ahead or behind."
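The mapping from heart-rate effort to perceived position can be sketched roughly like this (my own simplification of the idea, not the authors' implementation; the function name and the 10-meter range are assumptions): each runner's effort is their current heart rate divided by their personal target, and the difference in effort decides whether the partner's voice is rendered ahead of or behind you.

def partner_audio_position(my_hr, my_target_hr, partner_hr, partner_target_hr,
                           max_offset_m=10.0):
    """Return where the partner's voice should appear, in meters.

    Positive = partner sounds ahead of me, negative = behind me.
    Effort is heart rate relative to the runner's own target, so two people
    with very different paces can still be "neck and neck".
    """
    my_effort = my_hr / my_target_hr
    partner_effort = partner_hr / partner_target_hr
    offset = (partner_effort - my_effort) * max_offset_m
    return max(-max_offset_m, min(max_offset_m, offset))

# Example matching the summary above: I am at half my target, my partner is at theirs.
# partner_audio_position(my_hr=80, my_target_hr=160, partner_hr=150, partner_target_hr=150)
# -> +5.0, so the partner's voice is rendered ahead of me.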
There were three main design elements the researchers attempted to include: Communication Integration (letting the joggers talk while running regardless of location), Virtual Mapping (letting the joggers tell who is in front and who is behind), and Effort Comprehension (because the heart rate monitor feeds the virtual mapping, each runner can tell how they personally are doing).
Results
To test this system, they had 17 participants go on runs using the Jogging over a Distance system; the paper reports on 14 of those runs. Participants' locations ranged from Australia to Germany, and all participants knew each other before running.
Most of the feedback was very positive. Participants enjoyed being able to talk to their partner while jogging despite being far apart. They also liked that the virtual spatial mapping was tied to target heart rate, noting that they normally can't run with certain partners because different target heart rates mean different paces.
Discussion
Exertion "games" like Jogging over a Distance are, in my opinion, extremely interesting. For me personally, jogging can sometimes be a pain. Running on the treadmill can often be boring and running by oneself outside isn't also that fun either. You often times don't go as far as you usually can if you run by yourself. These types of exertion games make it easier to forget about the work associated with running.
I think they did a great job proving that their system worked by letting the participants freely use the system. Through this they got some very constructive feedback and it let the participants use the device how they personally wanted to use it.
Wednesday, September 14, 2011
Paper Reading #8: Gesture Search, A Tool for Fast Mobile Data Access
Gesture Search, A Tool for Fast Mobile Data Access
This paper was presented at UIST 2010.
Yang Li is a Senior Google Researcher with a PhD in Computer Science from the Chinese Academy of Sciences.
Summary
Hypothesis
The main hypothesis discussed in this paper was that the Gesture Search platform would provide an easy way for users to look up contacts, applications, and other items using simple gestures. The Gesture Search application lets users draw a gesture directly on the screen, and the gesture engine attempts to locate relevant items on the user's phone.
Methods
In order to be flexible and efficient, the researcher implemented several helpful modules to aid in searching.
The first was to recognize multiple possible meanings of a gesture rather than selecting just one. For example, if the user writes something that could either be an "H" or an "A" both alternatives are considered.
Second, the system would recognize frequently searched-for items and rank them higher in the search results.
They also added a component to figure out whether an input was an actual search gesture or simply a regular UI touch event.
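A toy sketch of how the first two of these pieces might fit together (my own illustration, not Google's code; the candidate probabilities, names, and scoring formula are all made up): each drawn gesture yields several candidate letters with probabilities, and each item on the phone is scored by how well its name matches a candidate prefix, weighted by how often the user has opened it before.

from collections import Counter

# Hypothetical data: candidate interpretations of the drawn gestures, and
# a usage count per item accumulated from the user's previous searches.
gesture_candidates = [{"H": 0.6, "A": 0.4}, {"e": 0.7, "c": 0.3}]
usage_counts = Counter({"Helen": 12, "Hector": 3, "Acrobat": 5})

def score_item(name, candidates, usage):
    """Probability that the gesture prefix matches `name`, boosted by usage frequency."""
    prefix_prob = 1.0
    for position, letter_probs in enumerate(candidates):
        if position >= len(name):
            return 0.0
        prefix_prob *= letter_probs.get(name[position], 0.0)
    frequency_boost = 1.0 + usage[name]      # simple additive boost for familiar items
    return prefix_prob * frequency_boost

results = sorted(usage_counts,
                 key=lambda n: score_item(n, gesture_candidates, usage_counts),
                 reverse=True)
# -> "Helen" ranks first: it matches the ambiguous "He" strongly and is opened most often.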
In order to test the complete system, they sent it to various Android users in their office and asked them to use the application. As users used the app, it logged and reported usage information back to a central server, where the data was analyzed. At the end of the test, users were asked to complete a survey on their app usage.
Results
The core group they focused on consisted of people who had used the app for at least a month and used it more than once a week.
They found that users primarily used the app to find contacts on their phone.
They also found that short gestures indeed helped find an item in their phone faster. In fact, 61% of 5,497 queries found the correct item. Many queries were extremely short even with large datasets.
In the survey, user support was generally solid. Again, most users reported using the app mostly for contact search; not many used it for finding and opening applications.
Discussion
This kind of gesture system seems like it would be naturally at home on touch screen devices. It's surprising how long a system like this has taken to arrive on modern devices.
Palm Pilots from the late 90s and early 2000s had gesture input abilities (through their "Graffiti" system).
Perhaps the reason this kind of gesture search now seems so foreign is our continued love affair with the QWERTY keyboard. We've come to associate input on technology devices almost completely with QWERTY keyboards. I believe that's why gesture search is sometimes hard to pick up and not so commonly used.
I feel as if the researchers did a good job getting their point across, though. It seemed that when users got used to using this application, it let them quickly find what they were looking for.
A possible future development for this kind of application would be to include it natively in the OS, as opposed to shipping it as a separate application that must be installed.
Tuesday, September 13, 2011
Paper Reading #7: Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop
Performance Optimizations of Virtual Keyboards for Stroke-Based Text Entry on a Touch-Based Tabletop
Jochen Rick is an assistant professor at Saarland University in the department of Education Technology.
This paper was presented at UIST 2010.
Summary
In this paper, the author discussed the need for alternate input methods on touch-based devices. Currently, tap-based touch keyboards are widely used in many applications. The author presents a comparison of various stroke-based input methods and many different stroke-based keyboard layouts that have been used over time.
Past researchers have used many different models in an attempt to allow users to type with maximum accuracy and efficiency.
The main purpose of this paper is to discover the most efficient stroke-based keyboard layouts and to present various ways to measure them.
In the study, eight adults connected several different nodes together with strokes on the touch device. Through this study, the author was able to ascertain which strokes were easier to make and most often used; with such data, more efficient keyboard layouts can be designed.
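One simple way to compare layouts, sketched below with made-up key coordinates (my own illustration, not the metric from the paper), is to total the distance a stroke must travel across the key centers of a word for each candidate layout: shorter paths generally mean faster stroke entry.

import math

def stroke_length(word, layout):
    """Total Euclidean distance of the stroke path for `word`.

    layout: dict mapping each letter to its key-center (x, y), in key widths.
    """
    points = [layout[ch] for ch in word]
    return sum(math.dist(a, b) for a, b in zip(points, points[1:]))

# Tiny hypothetical layouts covering just the letters of "the":
qwerty_like = {"t": (4.0, 0.0), "h": (5.5, 1.0), "e": (2.0, 0.0)}
optimized   = {"t": (0.0, 0.0), "h": (1.0, 0.0), "e": (1.0, 1.0)}

print(stroke_length("the", qwerty_like))  # longer path
print(stroke_length("the", optimized))    # shorter path -> likely faster strokes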
Discussion
While this paper laid out many good ways to measure the efficiency of stroke-based input, I feel like other touch screen input methods are missing from consideration. While none come to mind at the moment, I feel there are other ways to do input on a touch screen besides tap-based and stroke-based entry.
Also, as with Dvorak, it must be considered whether these complex text input methods are worth learning. If the gain is small (around 12% efficiency), it may not be worth learning and adopting.
That being said, I felt like the study of optimal text input was very interesting and I'd personally enjoy attempting to learn such a system.
Monday, September 12, 2011
Paper Reading #6: TurKit, human computation algorithms on mechanical turk
TurKit, human computation algorithms on mechanical turk
Greg Little, Lydia B. Chilton, Max Goldman, Robert C. Miller
Greg Little is a student and researcher at the CSAIL lab at MIT.
Lydia B. Chilton is a graduate student at the University of Washington who also interned at Microsoft Research.
Max Goldman is a graduate student and researcher at the CSAIL lab at MIT.
Robert C. Miller is an associate professor in the Computer Science department at MIT.
Summary
In this paper, the researchers presented a new toolkit called "TurKit" that works with an existing service known as MTurk (Amazon Mechanical Turk). These tools allow human computation to be incorporated into systems. Human computation is a great way to perform computations that a computer might not be able to do on its own; for example, it might be difficult for a programmed system to recognize what's in a photo. Using TurKit, you can design systems that let humans essentially fill in those blanks.
Specifically, the TurKit researchers added a new API extension to MTurk, the idea of "crash-and-rerun" programming, and an online interface for TurKit.
TurKit allows the results of human computation steps to be saved for later, which also allows for synchronization across various human computation inputs. The API extension added support for working with human computation tasks (also known as human intelligence tasks, or HITs), and it also adds some "fake" multithreading abilities.
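The crash-and-rerun idea can be sketched as a memoization layer over expensive human-computation calls (my own illustration in Python, not TurKit's actual JavaScript API): results of completed HITs are written to a persistent log, so when the script is rerun it replays stored answers instead of re-posting work. The file name, function name, and the post_hit_and_wait helper are all hypothetical.

import json, os

LOG_PATH = "turkit_log.json"  # hypothetical persistent store for finished HITs

def once(key, expensive_call, log_path=LOG_PATH):
    """Run expensive_call only the first time key is seen; replay the stored result afterwards."""
    log = {}
    if os.path.exists(log_path):
        with open(log_path) as f:
            log = json.load(f)
    if key not in log:
        log[key] = expensive_call()        # e.g. post a HIT and block until a worker answers
        with open(log_path, "w") as f:
            json.dump(log, f)              # persist before moving on, so a crash loses nothing
    return log[key]

# On a rerun (even after a crash), completed steps are replayed from the log:
# caption = once("caption-photo-42", lambda: post_hit_and_wait("Describe photo 42"))
# (post_hit_and_wait is a hypothetical helper standing in for the real MTurk call.)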
The researchers had multiple problems they hoped to address with TurKit. First, they wanted to give developers a better interface for MTurk. Second, by cutting down on costly operations and storing intermediate data, it becomes easier to save results for later and to save on computation costs.
A few of the ways they tested TurKit were iterative writing and blurry text recognition. Several outside groups also tested TurKit in various ways, such as psychophysics analysis.
Discussion
The usage of TurKit could be extremely beneficial in many areas. For example, data labeling is a huge problem in many environments: searching for images can be difficult because a user might know what's in a picture but can't normally search by tags. Adding a human-powered labeling system would without a doubt benefit organization.
Also mentioned in the article are the benefits of iterative writing. This is a system that could make Wikipedia much more accessible. Currently, editing Wikipedia is a daunting task, but if TurKit were used, contributors could iterate on articles that need more information or fill in missing details for other articles.
I think the researchers definitely fixed the problems they discussed and they presented a viable product that definitely has potential for many future products.
Thursday, September 8, 2011
Paper Reading #5: A Framework for Robust and Flexible Handling of Inputs with Uncertainty
A Framework for Robust and Flexible Handling of Inputs with Uncertainty
Julia Schwarz, Scott E. Hudson, Jennifer Mankoff, Andrew D. Wilson
Julia Schwarz is a computer science PhD student at Carnegie Mellon University focusing on Human-Computer Interaction.
Scott E. Hudson is a Human-Computer Interaction professor at Carnegie Mellon University.
Jennifer Mankoff is a Human-Computer Interaction professor at Carnegie Mellon University.
Andrew D. Wilson is a researcher at Microsoft who also researched the topic of "Pen + Touch = New Tools."
Summary
In this paper, researchers detailed a new framework for handling input that carries some amount of uncertainty. As touch screens, gestures, and other "uncertain" inputs develop, there is a need for a system that accurately predicts what the user means to do. In this system, a dispatcher sends an event notification to the interactors that have a high selection probability, and based on the selection score, interactors complete their given action. Sometimes multiple interactors fire at once; when that happens, the possible actions are all sent to a mediator, which decides either to run both actions or to choose one of them to run.
An example they used was three tiny buttons and a user's touch input. The user's touch mostly covered two buttons, but one of those buttons was disabled and therefore had a selection score of 0 (its interactor wouldn't fire), so the middle button was correctly activated. They had many other similar examples.
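Here is a stripped-down sketch of that dispatch logic (my own reconstruction of the idea, not the framework's actual API; the data layout and threshold are assumptions): each button scores the uncertain touch, disabled buttons score zero, and a mediator keeps only the most likely action.

def selection_score(button, touch_samples):
    """Fraction of the touch's probability mass that falls on this button."""
    if not button["enabled"]:
        return 0.0
    hits = sum(1 for (x, y) in touch_samples
               if button["x0"] <= x <= button["x1"] and button["y0"] <= y <= button["y1"])
    return hits / len(touch_samples)

def mediate(buttons, touch_samples, threshold=0.2):
    """Fire the single most probable interactor, if any is probable enough."""
    scored = [(selection_score(b, touch_samples), b) for b in buttons]
    score, best = max(scored, key=lambda pair: pair[0])
    return best["action"] if score >= threshold else None

# touch_samples would be points drawn from the touch's spatial distribution,
# e.g. a cloud of samples around the reported finger centroid, so a fat or
# wobbly touch spreads its probability mass across neighboring buttons.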
One way they tested their system was a case study involving users whose motor skills were impaired. In this test, they had users use a system with normal click events and found that users missed their targets approximately 14% of the time. However, using the probabilistic system, researchers found that the users missed their targets only two times combined.
Discussion
The creation of a system that uses probabilistic input is without a doubt needed, and in some cases such systems are already being used in consumer products. For example, the Apple iOS on-screen keyboard uses a probabilistic model of which key the user hit, based not only on touch location but also on the words that could contain that letter (letters that would continue a likely word can receive a higher selection score based on previously typed letters).
I definitely believe the researchers proved their hypothesis, not only by providing convincing examples but also by using an interesting case study. The wrong input was received only twice with the motor-impaired participants. Not only does this have great ramifications for those who aren't motor impaired, but it also has the potential to bring a better, more reliable system to those who have some motor impairment.
Paper Reading #4: Gestalt, Integrated Support for Implementation and Analysis in Machine Learning
Gestalt: Integrated Support for Implementation and Analysis in Machine Learning
Kayur Patel, Naomi Bancroft, Steven M. Drucker, James Fogarty, Andrew J. Ko, James A. Landay.
Kayur Patel is currently a computer science PhD student at the University of Washington.
Naomi Bancroft is currently a computer science undergraduate student at the University of Washington.
Steven M. Drucker is a researcher at Microsoft who focuses in Human Computer Interaction.
James Fogarty is currently an assistant professor of computer science and engineering at the University of Washington.
Andrew J. Ko is an assistant professor of the Information School at the University of Washington.
James A. Landay is also a professor of computer science and engineering at the University of Washington.
This paper was presented at UIST 2010.
Summary
This paper details a system known as Gestalt, a novel system that lets developers incorporate and work with machine learning. Normally, developers shy away from machine learning (although it can be extremely helpful) because testing and creation can be difficult.
The researchers claimed that Gestalt "allows developers to implement a classification pipeline, analyze data as it moves through that pipeline, and easily transition between implementation and analysis." Gestalt is meant to be used as a tool to aid developers in the creation of applications that use machine learning.
One of the main points the authors discussed was Gestalt's ability to work with many different types of data sets. They used two examples throughout the paper: analyzing movie reviews and recognizing gesture marks. Through a set of simple APIs, a developer can quickly and easily plug in their own code to handle the specific data set for the problem at hand.
Gestalt will even allow developers to quickly analyze input data. This can be used, for example, to determine why a certain gesture is failing to be recognized.
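A minimal sketch of what a pluggable classification pipeline with inspectable intermediate data might look like (my own illustration, not Gestalt's real API; the class, step names, and toy classifier are all made up):

class Pipeline:
    """Chain of named steps; every intermediate result is kept for inspection."""
    def __init__(self, steps):
        self.steps = steps              # list of (name, function) pairs
        self.trace = []                 # (name, output) per step, for debugging

    def run(self, example):
        value = example
        self.trace = []
        for name, step in self.steps:
            value = step(value)
            self.trace.append((name, value))   # lets you see where things go wrong
        return value

# Hypothetical movie-review pipeline: the developer plugs in their own steps.
pipeline = Pipeline([
    ("tokenize",  lambda text: text.lower().split()),
    ("featurize", lambda tokens: {"n_words": len(tokens), "has_great": "great" in tokens}),
    ("classify",  lambda feats: "positive" if feats["has_great"] else "negative"),
])
print(pipeline.run("a truly great film"))   # -> "positive"
print(pipeline.trace)                        # inspect each stage's output, e.g. a misclassified review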
The researchers believed that, with Gestalt, developers would be better able to find and fix bugs quickly. To test Gestalt, they had users attempt to locate bugs in a machine learning pipeline using two different systems: a baseline system (a simple setup that executed scripts) and the Gestalt system.
The study found that users located and fixed errors in the machine learning pipeline more easily and efficiently with Gestalt than with the baseline.
Discussion
While this was certainly one of the more complicated papers so far, Gestalt was definitely an interesting system.
The idea of machine learning is very interesting and very useful for many different types of systems. A system like Gestalt would allow developers to become far more creative with some of their software implementations (implementing machine learning as opposed to some rigid set of error-checking code, for example).
While testing a software development platform is always going to be difficult, I wish they had included more information about, or a study of, actually developing a full, complete system on Gestalt (as opposed to tests limited to bug hunting).
Monday, September 5, 2011
Paper Reading #3: Pen + Touch = New Tools
Pen + Touch = New Tools
Ken Hinckley, Koji Yatani, Michel Pahud, Nicole Coddington, Jenny Rodenhouse, Andy Wilson, Hrvoje Benko, Bill Buxton.
Ken Hinckley is a researcher currently employed at Microsoft.
Koji Yatani is a graduate student working toward his PhD at the University of Toronto.
Michel Pahud is currently a senior researcher employed at Microsoft.
Nicole Coddington is a senior interaction designer at HTC who was previously employed at Microsoft.
Jenny Rodenhouse is currently employed at Microsoft as an experience designer for the Xbox system; she previously worked as an experience designer for the mobile division.
Andy Wilson is a senior researcher at Microsoft who focuses on Human Computer Interaction.
Hrvoje Benko works at Microsoft as a researcher and focusing on Adaptive Systems and Interaction.
Bill Buxton is a Principal Researcher at Microsoft in Toronto.
This paper was presented at UIST 2010.
Summary
In this paper, Microsoft researchers investigated the use of pen and paper and how it could apply to a digital device with the addition of touch. In the study, they observed people interacting with physical pages of paper and notebooks in various ways and developed an interface to allow similar interactions. Their main idea with this interface was to let the user write with the pen but still manipulate content by touch. The defining characteristic, however, is that using pen and touch together enables new tools and features.
The first part of the paper discussed their design study with a physical paper notebook. They had people write in a notebook, cut clippings, paste the clippings and complete other various tasks. They noticed several common trends with this physical interaction including specific roles people assigned to the pen and objects, the fact that people would hold clippings temporarily, holding pages while flipping, and several others. With this in mind, they attempted to design their interface to allow for users to interact in a similar manner.
They used a Microsoft Surface device for the physical interface. Their idea was that "the pen writes and touch manipulates." Touching the interface could move objects around or zoom in on them, while using pen and touch together enabled new tools. For example, the stapler allowed items to be grouped into stacks by selecting the items and then tapping them all. They added others such as the X-acto knife, Carbon Copy, and more.
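The "pen writes, touch manipulates" division of labor can be sketched as a simple event router (my own illustration of the idea, not the Surface code; the event shape and return labels are assumptions): touch alone moves things, pen alone inks, and pen used while a touch is holding an object triggers a combined tool.

def route_event(event, touch_is_holding_object):
    """Decide what an input event should do under the pen+touch division of labor.

    event: dict with a "device" key of "pen" or "touch".
    touch_is_holding_object: True while a finger is pinning an object down.
    """
    if event["device"] == "touch":
        return "manipulate"                  # drag, rotate, zoom the object
    if event["device"] == "pen":
        if touch_is_holding_object:
            return "tool"                    # pen + touch together -> a combined tool (cut, copy, ...)
        return "ink"                         # pen alone just writes
    return "ignore"

# e.g. route_event({"device": "pen"}, touch_is_holding_object=True) -> "tool"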
Overall, participants enjoyed the pen-plus-touch experience. Many of them found some of the interactions natural and would use them without even thinking. The problem areas came from specially "designed" gestures that users couldn't figure out intuitively. The researchers believed this combined pen-and-touch interface was a success.
Discussion
The concept of using pen and touch to create a system that mimics natural physical interaction with paper is very interesting. One of the main barriers to using a tablet or laptop to take notes is the difficulty of translating thoughts or drawings into either typed text on the computer or scribbles on a tablet. A scrapbook or notebook program like the one they created would be very helpful and would probably do well in the market. Functional note-taking applications are already extremely popular, and adding new natural gestures would increase their popularity even more.
The main fault of this paper is probably that the details of the user study weren't very evident. It would have been helpful to understand in greater detail how users interacted with the device. The researchers claimed their device fulfilled its goal, but they never discussed the responses to it in detail.
Ken Hinckley, Koji Yatani, Michel Pahud, Nicole Coddington, Jenny Rodenhouse, Andy Wilson, Hrvoje Benko, Bill Buxton.
Ken Hinckley is a researcher currently employed at Microsoft.
Koji Yatani is a graduate student working toward his PhD at the University of Toronto.
Michel Pahud is currently a senior researcher employed at Microsoft.
Nicole Coddington is a senior interaction designer at HTC who was previously employed at Microsoft.
Jenny Rodenhouse is currently employed at Microsoft as an Experience Design for Xbox system but previously worked as an Experience Design for the mobile division.
Andy Wilson is a senior researcher at Microsoft who focuses on Human Computer Interaction.
Hrvoje Benko works at Microsoft as a researcher and focusing on Adaptive Systems and Interaction.
Bill Buxton is a Principal Researcher at Microsoft in Toronto.
This paper was presented at UIST 2010.
Summary
In this paper, Microsoft researchers investigated how people use pen and paper and how those practices might apply to a digital device with the addition of touch. In the study, they observed people interacting with physical pages of paper and notebooks in various ways and developed an interface that allows for similar interaction. Their main idea with the interface was to let the user write with the pen while still interacting through touch. The defining characteristic, however, is that using pen and touch together enables new tools and features.
The first part of the paper discussed their design study with a physical paper notebook. They had people write in a notebook, cut clippings, paste the clippings, and complete various other tasks. They noticed several common trends in this physical interaction, including the specific roles people assigned to the pen and to objects, the way people held clippings temporarily, the way they held pages while flipping, and several others. With these observations in mind, they designed their interface to let users interact in a similar manner.
They used a Microsoft Surface device for the physical interface. Their guiding idea was that "the pen writes and touch manipulates." Touching the interface could move objects around or zoom in on them, while using the pen together with touch allowed them to add new tools. For example, the stapler allowed items to be grouped into a stack by selecting them and then tapping them. They added other tools such as the X-acto knife, Carbon Copy, and more.
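To make that division of labor concrete, here is a minimal sketch (my own illustration, not the authors' code) of how input might be routed by modality: touch alone manipulates, pen alone writes, and the pen used while items are held by touch invokes a combined tool such as the stapler.

```python
# Hypothetical sketch of the "pen writes and touch manipulates" rule.
# Touch alone manipulates objects, pen alone leaves ink, and the pen used
# while items are held by touch invokes a combined tool (e.g. the stapler).

def route_input(modality, held_items):
    """Return the action the interface would take for an input event."""
    if modality == "touch":
        return "manipulate"              # drag, zoom, or hold an object
    if modality == "pen" and held_items:
        return "tool:staple"             # pen + touch -> a new combined tool
    if modality == "pen":
        return "ink"                     # pen alone writes
    return "ignore"

# Example: holding two clippings with touch and then using the pen
# staples them into a single stack.
print(route_input("touch", held_items=[]))                 # manipulate
print(route_input("pen", held_items=["clip1", "clip2"]))   # tool:staple
print(route_input("pen", held_items=[]))                   # ink
```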
Participants overall enjoyed the pen-plus-touch experience. Many of the users found some of the interactions so natural that they would use them without even thinking. The problem area was the specially "designed" gestures, which users could not figure out intuitively. The researchers concluded that the interface combining pen with touch was a success.
Discussion
The concept of using pen and touch to create a system that mimics natural physical interaction with paper is very interesting. One of the main barriers to taking notes on a tablet or laptop is the difficulty of translating thoughts or drawings into either typed text on a computer or scribbles on a tablet. A scrapbook or notebook program like the one they created would be very helpful and would probably do well in the market. Functional note-taking applications are already extremely popular, and adding new natural gestures would make them even more so.
The main fault of this paper is probably that the details of the case study weren't very evident. It would have been helpful to understand in greater detail how users interacted with the device. The researchers claimed their device fulfilled its goal, but they never discussed the users' responses to it in detail.
Thursday, September 1, 2011
Paper Reading #2: Hands-On Math
Hands-On Math: A page-based multi-touch and pen desktop for technical work and problem solving.
Robert Zeleznik, Andrew Bragdon, Ferdi Adeputra, Hsu-Sheng Ko
Robert Zeleznik is currently the director of research in the Computer Graphics Group at Brown University.
Andrew Bragdon is a PhD student studying Computer Science at Brown University.
Ferdi Adeputra is a student at Brown University studying Computer Science.
Hsu-Sheng Ko is a student at Brown University studying Computer Science.
This paper was presented at UIST 2010.
Summary
In this paper, researchers attempted to create a new way for users to interact with math problems without using paper, a whiteboard, or some kind of unintuitive computer program. Their hypothesis was that it is possible to create a device that combines the intuitive, free-flowing use of paper with the computer-aided assistance of a CAS (Computer Algebra System). To create an interface like this, the researchers implemented several different components.
The first component they discussed was Page Management, which allows users to quickly and intuitively create and delete pages to draw on. The next was the Panning Bar: through a gesture, it allowed users to quickly sift through the created pages. Folding was another useful feature; it allowed users to quickly hide or re-open large blocks of text and mathematical equations.
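As a rough illustration of these components (my own sketch under assumed behavior, not the paper's actual implementation), a page manager supporting creation, deletion, panning, and folding might look something like this:

```python
# Hypothetical sketch of Page Management, the Panning Bar, and Folding.
# The data structures and method names here are my own assumptions.

class Notebook:
    def __init__(self):
        self.pages = [[]]        # each page holds a list of content blocks
        self.current = 0

    def new_page(self):
        # Page Management: quickly create a fresh page after the current one.
        self.pages.insert(self.current + 1, [])
        self.current += 1

    def delete_page(self):
        if len(self.pages) > 1:
            self.pages.pop(self.current)
            self.current = min(self.current, len(self.pages) - 1)

    def pan_to(self, index):
        # Panning Bar: jump quickly to any created page.
        self.current = max(0, min(index, len(self.pages) - 1))

    def add_block(self, content):
        self.pages[self.current].append({"content": content, "folded": False})

    def toggle_fold(self, block_index):
        # Folding: hide or re-open a large block of text or equations.
        block = self.pages[self.current][block_index]
        block["folded"] = not block["folded"]

nb = Notebook()
nb.add_block("x**2 + 2*x + 1 = 0")
nb.toggle_fold(0)    # hide the work
nb.new_page()
nb.pan_to(0)         # pan back to the first page
```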
Gestures also played a big part in the interface. Under-the-rock menus were hidden until a user wanted to perform an operation on a figure. Touch-Activated Pen Gestures (TAP gestures) let the user combine the pen with their hand to signal more complex, yet intuitive, gestures. PalmPrints let a user's idle hand control various operations with the tap of a finger. The FingerPose component allowed the system to differentiate between the tip of a finger and the pad of a finger.
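The FingerPose idea lends itself to a simple sketch: one plausible way (my assumption, not necessarily the paper's method) to tell the tip of a finger from the pad is to look at the size of the touch contact.

```python
# Hypothetical FingerPose-style classifier: small contacts look like a
# fingertip, larger ones like the finger pad. The threshold value is an
# illustrative assumption, not a number from the paper.

def classify_contact(contact_area_mm2, tip_threshold_mm2=80.0):
    return "tip" if contact_area_mm2 < tip_threshold_mm2 else "pad"

print(classify_contact(55.0))    # tip -> could mean precise selection
print(classify_contact(130.0))   # pad -> could trigger a different action
```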
The Math components were also very interesting. Users can select individual numbers or even full terms; then, using gestures, they can quickly perform complex mathematical operations, such as factoring a number by pulling it apart.
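Behind a gesture like pulling a number apart, a computer algebra system does the actual factoring. Here is a small sketch of that idea, using SymPy as a stand-in backend; the paper does not say which CAS drives the real interface.

```python
# Sketch of the CAS support behind a "pull apart to factor" gesture,
# using SymPy as an illustrative backend.
from sympy import factor, factorint, sympify

def pull_apart(selection):
    """Factor whatever the user selected: a plain integer or a full term."""
    expr = sympify(selection)
    if expr.is_Integer:
        return factorint(expr)       # e.g. 12 -> {2: 2, 3: 1}
    return factor(expr)

print(pull_apart("12"))               # {2: 2, 3: 1}
print(pull_apart("x**2 + 2*x + 1"))   # (x + 1)**2
```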
They tested their hypothesis and the viability of their system by having students from the university come in and try it. They had them perform various tasks, from creating pages to graphing an equation.
They found that most people were able to pick up the system very quickly. With some of the gestures, the participants had to be prompted to either complete an action a different way or perform the gesture differently. Once they got the hang of it, though, the interface seemed to be easy to use. The researchers confirmed their hypothesis and concluded that a more robust version of Hands-On Math would be a useful tool.
Discussion
In my opinion, the researchers were really on to something with their idea of creating a math system that combines the benefits of paper and computer assistance. They certainly had some very creative and intuitive ideas that they attempted to implement.
My main concern is that some aspects seemed to be missing that might have helped the demo. First, it seemed like there were several glitches that they had to resolve during the tests. Second, a small on-screen tutorial or demo would have been helpful: many people tried a gesture or movement in an awkward manner, which could have been avoided by letting them watch a short demo video before they started.
I could definitely see a piece of technology like this become popular in the future. I'd love to be a user of such a system.