How to Qualitatively Code

As a member of the INIT lab, I’m writing today’s blog post about the qualitative coding of research data. Depending on the type of data, research data can be analyzed either quantitatively or qualitatively. Researchers use quantitative approaches (e.g., means, standard deviations, statistical tests) to quantify certain patterns in numeric data. On the other hand, researchers apply qualitative approaches when they want to extract emerging themes that can help inform users’ motivations and mental models from text-based data (e.g., open-ended survey questions or interview responses). A common approach for analysis of qualitative data is called “coding”. In this blog post, I will discuss the steps for applying qualitative coding to open-ended survey questions.

Qualitative coding is used to determine the categories, relationships, and assumptions that inform the respondents’ view of the world in general, and of the topic in particular [1]. A code is the basic unit of this analysis: a word or phrase that describes a participant’s response. The ultimate aim of qualitative coding is to identify themes and patterns that emerge from the gathered data, which can then support further interpretation [2].

One of the most important aspects of using this method is developing and designing a codebook. A good codebook includes all the codes the researchers are applying, their definitions, and, for each code, an example sentence from the data that it characterizes. For example, a code can be a simple word or phrase such as “ease of use,” with the definition “response expresses the idea that the task is easy to do” and an example response such as: “This method was really easy for me to use; it took me no effort.” Designing a codebook involves several steps. The first step is to identify the dimensions, which are the encompassing themes that capture the ideas expressed in the data. For example, each open-ended question can be used to define a codebook dimension. The next step is to organize the responses pertaining to each dimension. Organizing these responses is an important step in qualitative coding because it makes the data set easier to access and understand.
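To make the structure of a codebook concrete, here is a minimal Python sketch. It is only one possible representation, not the format our lab actually uses: the class names (Code, Dimension), the describe helper, and the survey question used as the dimension name are all hypothetical. The code name, definition, and example come from the text above.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Code:
    """One codebook entry: a short label, its definition, and an example response."""
    name: str
    definition: str
    example: str


@dataclass
class Dimension:
    """An encompassing theme (here, one open-ended question) and its codes."""
    name: str
    codes: List[Code] = field(default_factory=list)


# The dimension name below is a made-up survey question, used for illustration only.
codebook = [
    Dimension(
        name="Q1: Why did you choose this method?",
        codes=[
            Code(
                name="ease of use",
                definition="response expresses the idea that the task is easy to do",
                example="This method was really easy for me to use; it took me no effort.",
            )
        ],
    )
]


def describe(book):
    """Return a readable outline: each dimension followed by its codes and definitions."""
    lines = []
    for dimension in book:
        lines.append(dimension.name)
        for code in dimension.codes:
            lines.append(f"  {code.name}: {code.definition}")
    return lines


for line in describe(codebook):
    print(line)
```

Keeping the definition and example next to each code means a coder can check both before deciding whether the code applies.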

Once the data has been organized, the next step is to develop the codes for each dimension. As I mentioned above, a code is a word or short phrase that captures the main concept or theme of a certain data point. There are many different procedures one can follow in the code creation process. For instance, in our lab, we often have each researcher (coder) independently look through 10% of the data and then come together to agree on the final set of codes. Our codebooks always include corresponding definitions and examples to help us understand when each code should be applied. After the codes for all dimensions have been added to the codebook, our lab usually has each coder independently code a different 10% of participants’ responses across all dimensions, and the inter-rater reliability (IRR) is calculated for this subset of the data. IRR measures how closely the coders agree when they code the same material. Once an acceptable IRR is achieved, the last step of the coding process is for each coder to independently code a subset of the data.
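This post does not say which IRR statistic our lab uses, so the following is only an illustrative sketch. It computes Cohen's kappa, one common agreement measure for two coders, under the simplifying assumption that each response receives exactly one code. The function name cohens_kappa and the sample labels are made up for this example.

```python
from collections import Counter


def cohens_kappa(coder_a, coder_b):
    """Cohen's kappa for two coders who each assigned one code per response."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must code the same non-empty set of responses")
    n = len(coder_a)

    # Observed agreement: fraction of responses given the same code by both coders.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n

    # Chance agreement: based on how often each coder used each code overall.
    counts_a = Counter(coder_a)
    counts_b = Counter(coder_b)
    expected = sum(counts_a[code] * counts_b[code] for code in counts_a) / (n * n)

    if expected == 1.0:
        return 1.0  # both coders used one identical code for every response
    return (observed - expected) / (1 - expected)


# Hypothetical codes assigned by two coders to the same six responses.
coder_a = ["ease of use", "uncomfortable", "ease of use",
           "ease of use", "uncomfortable", "ease of use"]
coder_b = ["ease of use", "uncomfortable", "uncomfortable",
           "ease of use", "uncomfortable", "ease of use"]

kappa = cohens_kappa(coder_a, coder_b)
print(round(kappa, 2))  # 0.67
```

In this made-up example the coders agree on five of six responses, but half of that agreement would be expected by chance, which is why kappa is lower than the raw agreement rate. What counts as an "acceptable" value is a judgment the research team has to make.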

Once the codebook is refined and clearly understood by the coders, they should be able to apply the codes to the data easily, even if the responses are more complex. For example, imagine we have a codebook with two codes, “ease of use” and “uncomfortable,” in which ease of use is defined as “the task is easy to do” and uncomfortable is defined as “the task makes the participant feel uncomfortable to perform.” If a participant’s response was “This task took no effort, but it made me feel awkward and uncomfortable,” the first half would be coded “ease of use,” and the latter half would be coded “uncomfortable.” In short, applying codes to data can be quick and easy if the coders’ codebook is comprehensive and clear. After the coding process, the researchers discuss common or frequent themes in the data and try to understand the big picture of what the qualitative analysis says about the original research questions.
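As a rough illustration of finding frequent themes after coding, the sketch below stores the codes applied to each response and counts how many responses received each code. The participant IDs, the coded data, and the code_frequencies helper are all hypothetical; in practice this kind of tally could just as easily be done in a spreadsheet.

```python
from collections import Counter

# Hypothetical coded data: participant ID -> codes applied to that response.
# P1 corresponds to a response like the one above, which receives both codes.
coded_responses = {
    "P1": ["ease of use", "uncomfortable"],
    "P2": ["ease of use"],
    "P3": ["uncomfortable"],
    "P4": ["ease of use"],
}


def code_frequencies(coded):
    """Count how many responses received each code (at most once per response)."""
    counts = Counter()
    for codes in coded.values():
        counts.update(set(codes))
    return counts


frequencies = code_frequencies(coded_responses)
for code, count in frequencies.most_common():
    print(f"{code}: {count} of {len(coded_responses)} responses")
```

Counts like these only show which themes are common; interpreting what they mean for the research questions is still up to the researchers.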

Participating in a recent lab research project has exposed me to the field of human-computer interaction. This new field has become a topic of interest for me because I find the research intriguing. Moreover, as a third-year Neuroscience student, I have found many relationships between cognitive abilities and individuals’ responses. For example, in my Cognitive Psychology class, Professor Brian Cahill made it clear in his lecture that, according to cognitive psychologists, many individuals do not fully understand why they make decisions; they usually base their decisions on instinctual tendencies or past experiences. In our study, we are asking participants to explain the reasoning behind some of their interaction choices. I think this facet of cognitive science could be why many of the participants in the study responded with statements like “It felt natural.” This phenomenon has contributed greatly to my interest in the field. This research lab has allowed me to learn new skills such as qualitative coding and advanced statistics, and to use computer programs such as Excel. Overall, this experience so far has been rewarding and exciting!



[1] G. McCracken, The Long Interview (Sage University Paper Series on Qualitative Research Methods, No. 13). Newbury Park, CA: Sage, 1988.

[2] T. N. Basit, “Manual or Electronic? The Role of Coding in Qualitative Data Analysis,” Educational Research, vol. 45, no. 2, pp. 143-154, Aug. 2003, doi: 10.1080/0013188032000133548.