Author: Alex Shaw
Over the past months, I have continued my work on the understanding gestures project by developing a set of new articulation features based on how children make touchscreen gestures. Our prior work has shown that children’s gestures are not recognized as well as adults’ gestures, which led us to investigate further how children’s gestures differ from those of adults. In one of our studies, we computed the values of 22 existing articulation features to improve our understanding. An articulation feature is a quantitative measure of some aspect of the way the user creates the gesture. These features are generally either geometric (such as the total amount of ink used or the area of the bounding box surrounding the gesture) or temporal (such as the total time taken to produce the gesture or the average speed). In that paper, we showed there was a significant effect of age on the values of many of the features, illustrating differences between children’s and adults’ gestures.
Though we found many differences between children’s and adults’ gestures, I noticed there were several behaviors that were often present in children’s gestures which were not captured by the features we had used. For example, children often do not connect the endpoints of their strokes as well as adults do, as shown in the following “Q” gesture produced by a 5-year-old in one of our studies:
I developed a list of several behaviors like this one that I wanted to capture as new articulation features. For this blog post, I’ll focus on the feature measuring the distance between endpoints of strokes that should be connected, which I’ll call joining error. Using the “D” gesture as an example, the value we would want to compute is the total distance indicated by the orange line below:
To compute this feature, my first idea was to develop an algorithm to detect which ends of strokes should be joined, then measure the distance between them. However, even though we know what the gestures should look like, creating an algorithm to measure this feature would be a difficult computer vision problem. I thought I could look at the distance between endpoints and assume that if two endpoints were closer together than some threshold, they should be joined. However, this doesn’t work in all cases. What if some endpoints are closer together than the threshold but are not supposed to be joined? Many of the features we wanted to compute posed similar challenges, making it difficult to design algorithms for them.
Despite this difficulty, I realized that I could easily look at any gesture, see the joining errors, and mark the distance I wanted to measure. Therefore, I decided to manually annotate all the gestures to calculate the new features. Because there were more than 20,000 gestures, I needed to develop a tool to help me complete the annotations in a timely manner.
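To make the problem with the threshold idea concrete, here is a minimal Python sketch of that heuristic. It is only an illustration, not code from our project: the function names, the representation of a stroke as a list of (x, y) points, and the threshold values are all assumptions.

```python
import math
from itertools import combinations


def endpoints(stroke):
    """Return the first and last (x, y) points of a stroke."""
    return [stroke[0], stroke[-1]]


def naive_join_candidates(strokes, threshold):
    """Guess which endpoints should be joined using distance alone.

    Any two endpoints (from the same stroke or from different strokes)
    that are closer together than `threshold` are reported as a join.
    Returns a list of (point_a, point_b, distance) tuples.
    """
    all_endpoints = [p for stroke in strokes for p in endpoints(stroke)]
    candidates = []
    for a, b in combinations(all_endpoints, 2):
        d = math.dist(a, b)
        if d < threshold:
            candidates.append((a, b, d))
    return candidates


# A hypothetical "D": a vertical line plus a curve that misses the
# line by 13 units at one end and by 5 units at the other.
d_gesture = [
    [(0, 0), (0, 100)],
    [(12, -5), (60, 50), (3, 104)],
]

# With a threshold of 10, only the smaller gap is detected.
print(naive_join_candidates(d_gesture, threshold=10))
```

Raising the threshold would catch the larger gap, but it would also start pairing endpoints that were never meant to meet, such as the two ends of a short stroke. No single threshold handles both cases.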
I created a tool that plots out all the gestures in a given set and allows me to click to mark the features I’m interested in. The program detects the size of the screen and determines how many gestures to put on the screen. Then I can click each pair of endpoints that should be joined for measuring joining error, and my software automatically logs the distance between the points as well as information about the gesture. The following shows a mockup of the program I developed:
I was able to annotate five different features of over 20,000 gestures using my tool in a few weeks, whereas if I had manually examined each gesture individually, it would have probably taken several months. Furthermore, since I was visually inspecting each gesture, I had confidence that I was measuring exactly the quantity I wanted. Working on this project has helped me learn how important it can be to create tools for streamlining work requiring repeated manual intervention.
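For readers curious about the measurement itself, the sketch below shows one way the joining error could be computed from clicked endpoint pairs. It is a simplified illustration under my own assumptions, not the tool's actual code: the function names, the log fields, the fixed cell size, and the choice to sum the distances over all pairs in a gesture are all hypothetical.

```python
import math


def joining_error(endpoint_pairs):
    """Sum the Euclidean distances between annotated endpoint pairs.

    endpoint_pairs: list of ((x1, y1), (x2, y2)) tuples, one per pair
    of endpoints the annotator clicked as "should be joined".
    """
    return sum(math.dist(a, b) for a, b in endpoint_pairs)


def annotation_record(gesture_id, endpoint_pairs):
    """Build one log record for a gesture (field names are illustrative)."""
    return {
        "gesture_id": gesture_id,
        "num_joins": len(endpoint_pairs),
        "joining_error": joining_error(endpoint_pairs),
    }


def grid_capacity(screen_width, screen_height, cell_size):
    """Number of square gesture cells of side cell_size that fit on screen."""
    return (screen_width // cell_size) * (screen_height // cell_size)


# Two clicked pairs for one hypothetical gesture.
pairs = [((0, 0), (3, 4)), ((10, 10), (10, 16))]
print(annotation_record("D-example", pairs))
print(grid_capacity(1920, 1080, 200))
```

Because the annotator decides which endpoints belong together, the code only has to measure distances, which is the easy part.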
On May 29, I completed and passed my PhD dissertation proposal defense. The proposal defense process can vary widely among institutions and even among departments in the same institution, so in this post I outline the process I followed in the CISE department at UF.
The first step I followed was to create a document outlining my proposed work to help my committee understand my plans. There was no prescribed length or format for the document, but mine was around 60 pages. The document contained information about all the work I’ve done up to this point as a PhD student, as well as an outline of all the work I plan to do before graduating. Preparing the document requires a significant amount of work, so I would recommend planning on spending several months working on it before submitting. The document is a crucial part of the proposal process since your committee will use it as a guide to understand the details of your work that you don’t have time to cover in your presentation.
After completing the document, I sent it to my committee. The committee then had several weeks to review the document while I prepared for the next step, which was to give a 45-minute presentation about my prior work and my plans for my dissertation work, with 15 additional minutes for public questions.
The proposal defense itself was divided into four stages. In the first stage, I gave my 45-minute presentation to my committee as well as members of the public who were interested in attending. In the second stage, which lasted around 15 minutes, both the public audience and my committee members asked questions. In the third stage, the public audience was asked to leave and my committee asked questions in private. This stage lasted around 30 minutes. I found that the questions my committee asked in this stage were more difficult and thorough since my committee wanted to be sure they understood my proposed work. For example, my committee asked not only what I planned to do but how I planned to implement specific parts of my dissertation work. In the final stage of the proposal defense, I was asked to leave the room while the committee deliberated on whether I had passed the proposal. The time taken by the committee to deliberate can vary, but for me it was in the range of 20 to 30 minutes. After my committee finished their discussion, I was brought back into the room and was very excited (and relieved) to learn that I had passed my proposal! My committee offered suggestions and feedback on ways to improve my proposed work. For example, some of my committee members suggested specific algorithms that I had not considered that may be useful for my work.
I am entering my fifth year in the PhD program at UF. Now that I’ve defended my proposal, my next major milestone will be my final dissertation defense, which I plan to complete in December 2019. The proposal process was long and difficult, but it provided me a valuable opportunity to crystallize my plans for my dissertation work. Preparing for my proposal forced me to take a more active role in generating ideas for future directions of my research, and now that I’ve passed my proposal I am expected to take more ownership of my work with less involvement from my advisor.
Based on my experience, here are some tips for preparing for your proposal:
* Read proposal documents from students who have already passed their proposal in your department to help you get an idea of the scope and formatting to use. I used proposals from previous students working in an area similar to mine as a model for my document.
* Give as many practice talks as you can with different people. Consider inviting people from outside your own lab to make sure your talk is understandable to a more general audience. Even your committee will have diverse backgrounds and may not be familiar with some concepts related to your research. Practice talks are also a great time to get a feel for the types of questions you’re likely to get. When I prepared for my presentation, I gave practice talks to friends in other engineering departments to help evaluate how well I was able to explain my work.
* Prepare backup slides to help you answer questions you think you are likely to get.
* Ask your friends and labmates to attend your talk. It helps to see familiar faces and to know you have a lot of support while you’re giving your presentation.
* Bring food and/or coffee for the audience, especially your committee.
* Try not to get too stressed out during your presentation. Ultimately, everyone wants to see you succeed.
If you’re about to propose your dissertation, good luck!
In a previous post, we discussed our ongoing work on studying children’s gestures. To get a better idea of the target accuracy for continuing work in gesture recognition, we ran a study comparing human ability to recognize children’s gestures to machine recognition. Our paper, “Comparing Human and Machine Recognition of Children’s Touchscreen Gestures”, quantifies how well children’s gestures were recognized by human viewers and by an automated recognition algorithm. This paper includes our project team: me (Alex Shaw), Dr. Jaime Ruiz, and Dr. Lisa Anthony. The abstract of the paper is as follows:
Children’s touchscreen stroke gestures are poorly recognized by existing recognition algorithms, especially compared to adults’ gestures. It seems clear that improved recognition is necessary, but how much is realistic? Human recognition rates may be a good starting point, but no prior work exists establishing an empirical threshold for a target accuracy in recognizing children’s gestures based on human recognition. To this end, we present a crowdsourcing study in which naïve adult viewers recruited via Amazon Mechanical Turk were asked to classify gestures produced by 5- to 10-year-old children. We found a significant difference between human (90.60%) and machine (84.14%) recognition accuracy, over all ages. We also found significant differences between human and machine recognition of gestures of different types: humans perform much better than machines do on letters and numbers versus symbols and shapes. We provide an empirical measure of the accuracy that future machine recognition should aim for, as well as a guide for which categories of gestures have the most room for improvement in automated recognition. Our findings will inform future work on recognition of children’s gestures and improving applications for children.
The camera-ready version of the paper is available here. We will present the paper at the upcoming ACM International Conference on Multimodal Interaction in Glasgow, Scotland. We will post our presentation slides after the conference. See more information at our project website.
In previous posts, we have discussed our ongoing work on improving recognition of children’s touchscreen gestures. My paper, “Human-Centered Recognition of Children’s Touchscreen Gestures”, was accepted to ICMI 2017’s Doctoral Consortium! The paper focused on my future research plans as I continue to work on my doctorate. Here is the abstract:
Touchscreen gestures are an important method of interaction for both children and adults. Automated recognition algorithms are able to recognize adults’ gestures quite well, but recognition rates for children are much lower. My PhD thesis focuses on analyzing children’s touchscreen gestures, and using the information gained to develop new, child-centered recognition approaches that can recognize children’s gestures with higher accuracy than existing algorithms. This paper describes past and ongoing work toward this end and outlines the next steps in my PhD work.
We will post the camera-ready version of the paper soon. This year’s ICMI conference will be held in Glasgow, Scotland, in November.
I am beginning my 4th year as a PhD student at UF. I think that participating in the doctoral consortium at ICMI will be extremely helpful in continuing to develop a plan for my dissertation. I look forward to attending the conference.
We are currently continuing our work in gesture recognition by studying how well humans can recognize children’s gestures. We will compare human recognition rates to the rates of the automated recognition algorithms we used in our previous work. This comparison will give us a good target accuracy for our future work, which will focus on improving the accuracy of recognizers for children’s gestures using the human recognition rate as the goal.
I am a rising 4th year Ph.D. student. Working on this project has helped me to better understand the ways that humans perceive gestures, which has led to some interesting discoveries on what kinds of gestures humans confuse. I look forward to applying this information to automated algorithms to improve recognition. Working on this project, I also learned how to use tools like Qualtrics and Amazon Mechanical Turk.
In a previous post, we discussed our ongoing work on studying children’s gestures. We studied a corpus of children’s and adults’ gestures and analyzed 22 different articulation features, and we are pleased to announce that the resulting paper has been accepted for publication at the 2016 ACM International Conference on Multimodal Interaction (ICMI). Our paper, “Analyzing the Articulation Features of Children’s Touchscreen Gestures”, describes how children’s gestures differ from ages 5 to 10, and compares them to the features of adults’ gestures. This paper includes our project team: me (Alex Shaw) and Dr. Lisa Anthony. The abstract of the paper is as follows:
Children’s touchscreen interaction patterns are generally quite different from those of adults. In particular, it is known that children’s gestures are recognized by existing algorithms with much lower accuracy than those of adults. Previous work has qualitatively and quantitatively analyzed adults’ gestures to promote improved recognition, but this has not been done for children’s gestures in the same systematic manner. We present an analysis of gestures elicited from 24 children (age 5 to 10 years old) and 27 adults in which we calculate geometric, kinematic, and relative articulation features of the gestures. We examine the effect of user age on 22 different gesture metrics to better understand how children’s gesturing abilities and behaviors differ between various age groups. We discuss the implications of our findings and how they will contribute to creating new gesture recognition algorithms tailored specifically for children.
The camera-ready version of the paper is available here. We will present the paper at the upcoming ACM International Conference on Multimodal Interaction in Tokyo, Japan. We will post our presentation slides after the conference. See more information at our project website.
One of the projects our lab has been working on has been a qualitative analysis of the children’s gesture data from our MTAGIC project. We submitted an extended abstract about this work to CHI and it was accepted! In the extended abstract, we detail the tools we have applied thus far to study the children’s gestures, and summarize our findings. We also present an outline of our plans for continued work in this area. We will present our work as a poster at the conference. You can view the extended abstract that will be published here.
In our last post we discussed how we were working on replicating analyses from previous studies. We have completed these analyses and written and submitted a paper on our findings. When our paper is accepted, we will post the abstract and announce our findings! Since then, we’ve begun exploring the data in greater detail by looking at how gesture samples differ among different age groups and how other factors, such as handedness, affect gesture articulation. We will also examine the common mistakes recognizers made on kids’ gestures from our study so we can later improve on them.
Working on this study has given me a good understanding of how gesture recognition algorithms work and where their weaknesses lie. I’ve also learned a lot about the kinds of experiments used to test recognizers. I’m excited to be a part of the project and I look forward to diving into the data in more detail.
In my last post I discussed some work on the $-family of recognizers. Since then, I’ve been working on designing a standalone application for running gesture recognition experiments using the $-family. The application will also allow the user to design custom recognizers. I’ll also be using the $-family to run recognition experiments on the data we have collected in our MTAGIC project. By analyzing the results, we hope to be able to design better recognition algorithms for kids.
Working with the $-family has given me a great introduction to the field of gesture recognition. I have studied these algorithms as well as other work in the field, which will be important knowledge as I continue forward and begin writing my own gesture recognizers.
The $-family of recognizers are lightweight, easy-to-implement gesture recognizers that allow for quick development of 2-D gesture-based interfaces. These algorithms are short (fewer than 100 lines of code each), allowing for easy incorporation by developers into new projects. These algorithms currently achieve 98-99% accuracy for recognizing gestures made by adults, but only about 84% accuracy for gestures from kids. Thus, we are working on extending these algorithms so that they can achieve better recognition for children’s gestures. We are currently working on a study to gather a set of gesture data from kids as part of our MTAGIC project. After we collect this data, we will study it and attempt to find ways to improve the algorithms based on our findings. For more information on previous work on the $-family of recognizers, see the links below:
I am a first-year Ph.D. student at the University of Florida studying Computer Engineering. I recently received my B.S. in Computer Science from Auburn University. Working with the $-family of recognizers has given me an excellent introduction to the field of gesture recognition. I have also been able to study the experiments used to verify these gesture recognition algorithms, which has helped me learn about research methods in Human-Centered Computing.