Category: Understanding Gestures

Since I joined the INIT Lab, I have been working on preparing a study for the Understanding Gestures project. The goal of the project is to examine how our previous findings about children’s touch and gesture interactions relate to their cognitive development. Our lab’s previous work has shown that children’s gestures are not recognized as accurately as adults’ gestures and that there are significant differences in articulation features related to gesture production time and geometry. Readers of our prior publications have asked about the cognitive development of the children from whom we collected data, which led us to pursue this project on how children’s cognitive development relates to the way they interact with touchscreen devices. We believe this new information will help us build a more comprehensive understanding of children’s touchscreen interactions.

Cognitive development is a field of study in neuroscience and psychology focusing on children’s development in terms of information processing, problem solving, and decision making [1]. In our Understanding Gestures project, we are mainly concerned with children’s fine motor skills and executive function, both of which vary across the early ages of childhood and between genders. Fine motor skill measures the coordination of small muscles, such as those in the fingers and hands [2]. Executive function measures the ability to focus attention and execute tasks [3]. We plan to measure these two aspects using the NIH Toolbox®, a “comprehensive set of neuro-behavioral measurements that quickly assesses cognitive, emotional, sensory, and motor functions” [4]. The creators of the app, the National Institutes of Health, maintain a representative database for comparing children’s performance on the tasks based on their demographic information (e.g., age and gender). We are excited to be collaborating on this project with Dr. Pavlo Antonenko from the College of Education, and we look forward to drawing connections between children’s touchscreen interactions and their cognitive development from this study.

I am a third-year undergraduate student majoring in Computer Science, and this is my first full semester in the INIT Lab. The process of preparing a study has been challenging but very interesting. I have always wanted to learn how to run a study and have been curious about the work that goes into a research paper. As we prepare for the study, I have conducted in-depth independent research on potential topics to explore regarding children’s cognitive development. I have gained a great sense of accomplishment from playing a role in building the study from scratch, and I look forward to continuing my work on it.

REFERENCES

1. Schacter, Daniel L. (2009). Psychology. Catherine Woods. p. 429. ISBN 978-1-4292-3719-2.

2. Ali, Ajmol, Pigou, Deborah, Clarke, Linda, & McLachlan, Claire. (2017). Review on Motor Skill and Physical Activity in Preschool Children in New Zealand. Advances in Physical Education, 7, 10–26. doi:10.4236/ape.2017.71002.

3. Understood Team. “Understanding Executive Functioning Issues.” Understood.org, www.understood.org/en/learning-attention-issues/child-learning-disabilities/executive-functioning-issues/understanding-executive-functioning-issues.

4. Weintraub, Sandra et al. “Cognition assessment using the NIH Toolbox.” Neurology vol. 80,11 Suppl 3 (2013): S54-64. doi:10.1212/WNL.0b013e3182872ded

Read More

Over the past months, I have continued my work on the Understanding Gestures project by developing a set of new articulation features based on how children make touchscreen gestures. Our prior work has shown that children’s gestures are not recognized as well as adults’ gestures, which led us to investigate further how children’s gestures differ from those of adults. In one of our studies, we computed the values of 22 existing articulation features to better understand these differences. An articulation feature is a quantitative measure of some aspect of the way the user produces a gesture. These features are generally either geometric (such as the total amount of ink used or the area of the bounding box surrounding the gesture) or temporal (such as the total time taken to produce the gesture or the average speed). In that paper, we showed there was a significant effect of age on the values of many of the features, illustrating differences between children’s and adults’ gestures.
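To make these features concrete, here is a minimal Python sketch of how a few of them could be computed, assuming a gesture is represented as a list of (x, y, timestamp) samples. This is an illustration of the idea rather than the exact code used in our analysis:

```python
import math

def path_length(points):
    """Total 'ink' used: sum of distances between consecutive points."""
    return sum(math.dist(points[i - 1][:2], points[i][:2])
               for i in range(1, len(points)))

def bounding_box_area(points):
    """Area of the axis-aligned box that encloses the gesture."""
    xs = [x for x, y, t in points]
    ys = [y for x, y, t in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))

def production_time(points):
    """Elapsed time from the first to the last sampled point."""
    return points[-1][2] - points[0][2]

def average_speed(points):
    """Mean speed over the whole gesture (ink per unit time)."""
    t = production_time(points)
    return path_length(points) / t if t > 0 else 0.0

# Example: a gesture as (x, y, timestamp) samples from one stroke.
gesture = [(0, 0, 0.00), (10, 0, 0.05), (10, 10, 0.12), (0, 10, 0.20)]
print(path_length(gesture), bounding_box_area(gesture),
      production_time(gesture), average_speed(gesture))
```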

Though we found many differences between children’s and adults’ gestures, I noticed several behaviors that were often present in children’s gestures but were not captured by the features we had used. For example, children often do not connect the endpoints of their strokes as well as adults do, as shown in the following “Q” gesture produced by a 5-year-old in one of our studies:

I developed a list of several behaviors like this one that I wanted to capture as new articulation features. For this blog post, I’ll focus on the feature measuring the distance between endpoints of strokes that should be connected, which I’ll call joining error. Using the “D” gesture as an example, the value we would want to compute is the total distance indicated by the orange line below:


To compute this feature, my first idea was to develop an algorithm to detect which stroke endpoints should be joined and then measure the distance between them. However, even though we know what the gestures should look like, creating an algorithm to measure this feature reliably would be a difficult computer vision problem. I considered simply assuming that any two endpoints closer than some threshold should be joined, but this doesn’t work in all cases: some endpoints may fall within the threshold without being meant to join. Many of the features we wanted to compute posed similar challenges, making them difficult to design algorithms for.
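For concreteness, here is a minimal sketch of that naive threshold heuristic (the function name and data layout are hypothetical, not part of our actual pipeline). It pairs any endpoints that happen to be close together, which is exactly why it breaks down:

```python
import math
from itertools import combinations

def naive_join_pairs(endpoints, threshold):
    """Pair up stroke endpoints that fall within `threshold` of each other.

    `endpoints` is a list of (x, y) stroke start/end points. This simple
    heuristic happily pairs endpoints that are close together but were
    never meant to be joined (e.g., the tail of a 'Q' passing near the
    circle), which is why we abandoned this approach.
    """
    pairs = []
    for (i, a), (j, b) in combinations(enumerate(endpoints), 2):
        if math.dist(a, b) < threshold:
            pairs.append((i, j))
    return pairs
```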

Despite this difficulty, I realized that I could easily look at any gesture, see the joining errors, and mark the distances I wanted to measure. Therefore, I decided to manually annotate all the gestures to calculate the new features. Because there were more than 20,000 gestures, I needed to develop a tool to help me complete the annotations in a timely manner.

I created a tool that plots out all the gestures in a given set and allows me to click to mark the features I’m interested in. The program detects the size of the screen and determines how many gestures to put on the screen. Then I can click each pair of endpoints that should be joined for measuring joining error, and my software automatically logs the distance between the points as well as information about the gesture. The following shows a mockup of the program I developed:


I was able to annotate five different features of over 20,000 gestures using my tool in a few weeks, whereas if I had manually examined each gesture individually, it would have probably taken several months. Furthermore, since I was visually inspecting each gesture, I had confidence that I was measuring exactly the quantity I wanted. Working on this project has helped me learn how important it can be to create tools for streamlining work requiring repeated manual intervention.
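As a rough illustration of how a click-to-annotate tool like this can be built, here is a minimal matplotlib sketch, assuming each gesture is stored as a list of strokes of (x, y) points. It is a simplified stand-in for the actual tool, which also lays out many gestures per screen and logs gesture metadata:

```python
import math
import matplotlib.pyplot as plt

# Hypothetical data: one gesture as a list of strokes, each a list of (x, y).
strokes = [[(0, 0), (5, 20), (10, 0)], [(2, 8), (9, 8)]]  # a rough "A"

clicks = []          # endpoint locations picked by the annotator
joining_errors = []  # logged distances

def on_click(event):
    """Record each click; every second click closes a pair and logs it."""
    if event.inaxes is None:
        return
    clicks.append((event.xdata, event.ydata))
    if len(clicks) % 2 == 0:
        a, b = clicks[-2], clicks[-1]
        d = math.dist(a, b)
        joining_errors.append(d)
        event.inaxes.plot([a[0], b[0]], [a[1], b[1]], color="orange")
        event.canvas.draw_idle()
        print(f"joining error: {d:.2f}")

fig, ax = plt.subplots()
for stroke in strokes:
    xs, ys = zip(*stroke)
    ax.plot(xs, ys, marker="o")
ax.set_title("Click pairs of endpoints that should be joined")
fig.canvas.mpl_connect("button_press_event", on_click)
plt.show()
```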

Read More

This summer, I have been working on a project related to the $-family of gesture recognizers. The $-family is a series of simple, fast, and accurate gesture recognizers designed to be accessible to novice programmers. $1 [1] was created by Wobbrock and colleagues, and INIT Lab director Lisa Anthony contributed to later algorithms, including $N [2] and $P [3]. My goal this summer was to implement my own versions of the $-family algorithms, and then to try them out on a new dataset that was collected from adults and children in a different context than previous datasets collected by the INIT lab.

The first step of my work on this project was to understand how the different algorithms in the $-family work. I examined the advantages and limitations of each recognizer by reading the related research papers and experimenting with existing implementations. After studying the recognizers, I created my own implementations of $1 and $P in JavaScript as a web application. I faced several challenges along the way. The first was deciding in what form the gestures to be recognized should be taken as input: predefined point arrays, or user-defined gestures drawn on a canvas. Using this $1 implementation as a reference, I normalized each of the gestures, computed the distance between them, and supported recognition of user-defined gestures drawn on a canvas. While implementing the algorithms, I followed a step-by-step approach so that I could verify each function was working before moving on to recognition. In the process, I learned how much more efficient it is to use a debugger to pinpoint errors in my code than to hunt for problems manually.
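To give a flavor of the core idea, here is a simplified Python sketch of the template-matching step, assuming the candidate and the stored templates have already been resampled to the same number of points. It omits the rotation search and some normalization details of the full $1 algorithm, and the data layout (a list of (name, points) templates) is just for illustration:

```python
import math

def scale_to_square(points, size=250.0):
    """Scale the gesture non-uniformly into a size x size reference square."""
    xs, ys = [p[0] for p in points], [p[1] for p in points]
    w, h = (max(xs) - min(xs)) or 1e-9, (max(ys) - min(ys)) or 1e-9
    return [(x * size / w, y * size / h) for x, y in points]

def translate_to_origin(points):
    """Move the gesture so its centroid sits at (0, 0)."""
    cx = sum(x for x, y in points) / len(points)
    cy = sum(y for x, y in points) / len(points)
    return [(x - cx, y - cy) for x, y in points]

def path_distance(a, b):
    """Average point-to-point distance between two equal-length point lists."""
    return sum(math.dist(p, q) for p, q in zip(a, b)) / len(a)

def recognize(candidate, templates):
    """Return the name of the stored template closest to the candidate."""
    norm = translate_to_origin(scale_to_square(candidate))
    best_name, best_dist = None, float("inf")
    for name, tpl_points in templates:
        d = path_distance(norm, translate_to_origin(scale_to_square(tpl_points)))
        if d < best_dist:
            best_name, best_dist = name, d
    return best_name
```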

After completing the web applications, my next task was to recognize gestures from a dataset provided as XML files. I created another implementation of the $1 recognizer in Python to learn and explore another programming language. I was initially unsure how to read in the gesture data from the XML files, so I had to learn how to parse them. I used the pseudocode presented in the original $1 paper [1] as a guide to implement the algorithm. Resampling the points of each gesture before recognition was challenging, because every gesture needs to end up with the same number of resampled points. To solve the issues I encountered while preprocessing the gestures, I plotted them using Python’s matplotlib library. Not only did visualizing gestures help in that context, it also helped me understand why some gestures were wrongly recognized: they looked more like other gestures than the ones they were intended to be. Solving these errors and arriving at a correct implementation gave me a great sense of achievement. After implementing the recognition algorithms, I learned how to run user-independent recognition experiments in which I systematically varied the number of participants included in the training set, and I ran those experiments to measure the accuracy of the algorithms I implemented. Now, I am working on analyzing articulation features [4] [5] of a new set of gestures to help quantitatively investigate the differences between adults’ and children’s gestures in a new context.
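As an illustration of these preprocessing steps, here is a minimal Python sketch of reading a gesture from an XML log and resampling it. It assumes the common $-family log layout in which each Point element carries X, Y (and optionally T) attributes; the tag and attribute names may need adjusting for other file formats:

```python
import math
import xml.etree.ElementTree as ET

def load_gesture(xml_path):
    """Read one gesture's (x, y) points from an XML log file."""
    root = ET.parse(xml_path).getroot()
    return [(float(pt.get("X")), float(pt.get("Y")))
            for pt in root.iter("Point")]

def resample(points, n=64):
    """Resample a gesture to n points spaced evenly along its path,
    following the RESAMPLE step from the $1 paper."""
    pts = list(points)
    path_len = sum(math.dist(pts[i - 1], pts[i]) for i in range(1, len(pts)))
    interval = path_len / (n - 1)
    new_points, accumulated, i = [pts[0]], 0.0, 1
    while i < len(pts):
        d = math.dist(pts[i - 1], pts[i])
        if d > 0 and accumulated + d >= interval:
            t = (interval - accumulated) / d
            q = (pts[i - 1][0] + t * (pts[i][0] - pts[i - 1][0]),
                 pts[i - 1][1] + t * (pts[i][1] - pts[i - 1][1]))
            new_points.append(q)
            pts.insert(i, q)       # treat q as the start of the next segment
            accumulated = 0.0
        else:
            accumulated += d
        i += 1
    while len(new_points) < n:     # guard against floating-point shortfall
        new_points.append(pts[-1])
    return new_points
```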

I am a final year undergraduate computer science student from MIT, Pune, India working with the INIT lab as an REU student this summer, as part of the UF CISE IMHCI REU program. I have greatly enjoyed my time working in the INIT lab. One thing I have really enjoyed while I’ve been here is related to another project I worked on: interacting with an ocean temperature application on the PufferSphere, which is a large interactive spherical display. Through my experience in the INIT lab, I have been able to closely follow the different stages of the research process. I’ve added to my technical knowledge through improved understanding of gesture recognizers, and I’ve also learned the importance of being clear and concise in scientific writing. I am looking forward to continuing my work on this project and understanding new ways to improve children’s gesture interaction experiences.

References:

[1] Wobbrock, Jacob O., Andrew D. Wilson, and Yang Li. “Gestures without libraries, toolkits or training: a $1 recognizer for user interface prototypes.” Proceedings of the 20th annual ACM symposium on User interface software and technology. ACM, 2007.

[2] Anthony, Lisa, and Jacob O. Wobbrock. “A lightweight multistroke recognizer for user interface prototypes.” Proceedings of Graphics Interface 2010. Canadian Information Processing Society, 2010.

[3] Vatavu, Radu-Daniel, Lisa Anthony, and Jacob O. Wobbrock. “Gestures as point clouds: a $P recognizer for user interface prototypes.” Proceedings of the 14th ACM international conference on Multimodal interaction. ACM, 2012.

[4] Anthony, Lisa, Radu-Daniel Vatavu, and Jacob O. Wobbrock. “Understanding the consistency of users’ pen and finger stroke gesture articulation.” Proceedings of Graphics Interface 2013. Canadian Information Processing Society, 2013.

[5] Shaw, Alex, and Lisa Anthony. “Analyzing the articulation features of children’s touchscreen gestures.” Proceedings of the 18th ACM International Conference on Multimodal Interaction. ACM, 2016.

Read More

In our last post, we shared that we had a paper accepted to the ACM International Conference on Multimodal Interaction (ICMI) 2017, to be held in Glasgow, Scotland, UK. The paper was titled “Comparing Human and Machine Recognition of Children’s Touchscreen Gestures.” We just came back from the conference and are proud to announce that Alex Shaw, the INIT Lab PhD student who first-authored the paper, won Best Student Paper at the conference! Alex is co-advised by Dr. Lisa Anthony from the INIT lab and UF CISE professor Dr. Jaime Ruiz.

Congratulations, Alex!

Read More

In a previous post, we discussed our ongoing work on studying children’s gestures. To get a better idea of the target accuracy for continuing work in gesture recognition, we ran a study comparing human ability to recognize children’s gestures to machine recognition. Our paper, “Comparing Human and Machine Recognition of Children’s Touchscreen Gestures”, quantifies how well children’s gestures were recognized by human viewers and by an automated recognition algorithm. The paper was authored by our project team: me (Alex Shaw), Dr. Jaime Ruiz, and Dr. Lisa Anthony. The abstract of the paper is as follows:

Children’s touchscreen stroke gestures are poorly recognized by existing recognition algorithms, especially compared to adults’ gestures. It seems clear that improved recognition is necessary, but how much is realistic? Human recognition rates may be a good starting point, but no prior work exists establishing an empirical threshold for a target accuracy in recognizing children’s gestures based on human recognition. To this end, we present a crowdsourcing study in which naïve adult viewers recruited via Amazon Mechanical Turk were asked to classify gestures produced by 5- to 10-year-old children. We found a significant difference between human (90.60%) and machine (84.14%) recognition accuracy, over all ages. We also found significant differences between human and machine recognition of gestures of different types: humans perform much better than machines do on letters and numbers versus symbols and shapes. We provide an empirical measure of the accuracy that future machine recognition should aim for, as well as a guide for which categories of gestures have the most room for improvement in automated recognition. Our findings will inform future work on recognition of children’s gestures and improving applications for children.

The camera-ready version of the paper is available here. We will present the paper at the upcoming ACM International Conference on Multimodal Interaction in Glasgow, Scotland. We will post our presentation slides after the conference. See more information at our project website.

Read More

In previous posts, we have discussed our ongoing work on improving recognition of children’s touchscreen gestures. My paper, “Human-Centered Recognition of Children’s Touchscreen Gestures”, was accepted to ICMI 2017’s Doctoral Consortium! The paper focused on my future research plans as I continue to work on my doctorate. Here is the abstract:

Touchscreen gestures are an important method of interaction for both children and adults. Automated recognition algorithms are able to recognize adults’ gestures quite well, but recognition rates for children are much lower. My PhD thesis focuses on analyzing children’s touchscreen gestures, and using the information gained to develop new, child-centered recognition approaches that can recognize children’s gestures with higher accuracy than existing algorithms. This paper describes past and ongoing work toward this end and outlines the next steps in my PhD work.

We will post the camera-ready version of the paper soon. This year’s ICMI conference will be held in Glasgow, Scotland, in November.

I am beginning my 4th year as a PhD student at UF. I think that participating in the doctoral consortium at ICMI will be extremely helpful in continuing to develop a plan for my dissertation. I look forward to attending the conference.

Read More

We are currently continuing our work in gesture recognition by studying how well humans can recognize children’s gestures. We will compare human recognition rates to the rates of the automated recognition algorithms we used in our previous work. This comparison will give us a realistic target accuracy for our future work, which will focus on improving the accuracy of recognizers for children’s gestures using the human recognition rate as the goal.

I am a rising 4th-year Ph.D. student. Working on this project has helped me better understand how humans perceive gestures, which has led to some interesting discoveries about which kinds of gestures humans confuse. I look forward to applying this information to automated algorithms to improve recognition. Along the way, I also learned how to use tools like Qualtrics and Amazon Mechanical Turk.

Read More

In a previous post, we discussed our ongoing work on studying children’s gestures. We analyzed 22 different articulation features in a corpus of children’s and adults’ gestures, and we are pleased to announce that this work has been accepted for publication at the 2016 ACM International Conference on Multimodal Interaction (ICMI). Our paper, “Analyzing the Articulation Features of Children’s Touchscreen Gestures”, describes how children’s gestures differ from ages 5 to 10 and compares them to the features of adults’ gestures. The paper was authored by our project team: me (Alex Shaw) and Dr. Lisa Anthony. The abstract of the paper is as follows:

Children’s touchscreen interaction patterns are generally quite different from those of adults. In particular, it is known that children’s gestures are recognized by existing algorithms with much lower accuracy than those of adults. Previous work has qualitatively and quantitatively analyzed adults’ gestures to promote improved recognition, but this has not been done for children’s gestures in the same systematic manner. We present an analysis of gestures elicited from 24 children (age 5 to 10 years old) and 27 adults in which we calculate geometric, kinematic, and relative articulation features of the gestures. We examine the effect of user age on 22 different gesture metrics to better understand how children’s gesturing abilities and behaviors differ between various age groups. We discuss the implications of our findings and how they will contribute to creating new gesture recognition algorithms tailored specifically for children.

The camera-ready version of the paper is available here. We will present the paper at the upcoming ACM International Conference on Multimodal Interaction in Tokyo, Japan. We will post our presentation slides after the conference. See more information at our project website.

Read More

One of the projects our lab has been working on is a qualitative analysis of the children’s gesture data from our MTAGIC project. We submitted an extended abstract about this work to CHI, and it was accepted! In the extended abstract, we detail the tools we have applied thus far to study the children’s gestures, summarize our findings, and present an outline of our plans for continued work in this area. We will present our work as a poster at the conference. You can view the extended abstract that will be published here.

Read More

In my last post I discussed some work on the $-family of recognizers. Since then, I’ve been working on designing a standalone application for running gesture recognition experiments using the $-family. The application will also allow the user to design custom recognizers. I’ll also be using the $-family to run recognition experiments on the data we have collected in our MTAGIC project. By analyzing the results, we hope to be able to design better recognition algorithms for kids.

Working with the $-family has given me a great introduction to the field of gesture recognition. I have studied these algorithms as well as other work in the field, which will be important knowledge as I continue forward and begin writing my own gesture recognizers.

Read More