Creating a Code Book to Analyze Qualitative Research Findings

Step 2 - How to Create a Code Book for Qualitative Data


The market research budget of a small business owner or a home-based business rarely has room for expensive software to analyze the qualitative data collected for business development. This series of articles provides step-by-step instructions for using an ordinary word processing application to conduct text analysis for qualitative market research. The processes described can be applied to the analysis of qualitative data collected from survey research, focus group sessions, and in-depth interviews.


Links to all of the articles in the series are provided below.

A Beginner's Guide to Do-It-Yourself Qualitative Data Analysis

Step 1. Set Up Table and Column Headings

Step 2. How to Prepare a Code Book for Qualitative Data

Step 3. How to Prepare the Table for Data Triangulation

Step 4. Assign the Codes to Prepare for Data Analysis

Step 5. Perform Sorting with Combined and Isolated Codes

Step 6. How to Perform Code Validation and Merging of Data Tables

Step 7. Advanced Considerations in Qualitative Data Analysis

Code Book Basics

The first step in qualitative data analysis is coding. A code is a label used to tag a concept or a value found in a narrative or text.

Code books include definitions of themes and sub-themes that are used as references for the coding of narrative text.  The themes can be those actually expressed by the respondents (these are called in vivo codes) or those that are constructed or inferred by the researcher.


Each theme and sub-theme is assigned a specific number that can be used to sort the text data and to relocate passages in the narrative text for deeper analysis. Coding improves reliability because it creates structure and agreement about important definitions, constructs, and themes.

Reliability in Coding

Determining which code should be assigned to particular text is not always obvious.

A common reliability problem is that coders or raters do not always code similar passages of text in exactly the same way. Reliability can be improved by using clearly defined categories for coding.

Reliability across two or three coders can be calculated. This inter-rater (or inter-coder) reliability index shows whether the researcher needs to revise the coding scheme. The formula for calculating the inter-rater reliability index is shown below:

  • Reliability = # of code agreements / # of total codes
  • Where:
  1. Total Codes = # of code agreements + # of code disagreements
  2. Code Agreements = instances in which two or more coders chose the same code
  3. Code Disagreements = instances in which two or more coders chose different codes
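
The calculation can be sketched in a few lines of Python. This is only an illustration under stated assumptions: two coders have each assigned exactly one code to the same passages, in the same order. The function name and the sample codes are hypothetical and are not data from the study.

```python
def percent_agreement(coder_a, coder_b):
    """Return agreements / (agreements + disagreements) for two coders."""
    if len(coder_a) != len(coder_b):
        raise ValueError("Both coders must code the same passages")
    if not coder_a:
        raise ValueError("At least one coded passage is required")
    # An agreement is a passage where both coders chose the same code.
    agreements = sum(1 for a, b in zip(coder_a, coder_b) if a == b)
    disagreements = len(coder_a) - agreements
    return agreements / (agreements + disagreements)


# Hypothetical codes assigned to five passages by two coders
coder_a = ["4.05", "4.10", "4.15", "4.20", "4.105"]
coder_b = ["4.05", "4.10", "4.20", "4.20", "4.105"]

reliability = percent_agreement(coder_a, coder_b)
print(f"Reliability = {reliability:.2f}")  # 4 agreements out of 5 codes -> 0.80
```

Note that this index is simple percent agreement. It does not adjust for agreement that would occur by chance.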

Approaches to Coding

Three very different coding strategies exist:

  1. Code-book creation according to theory
  2. Coding by induction (according to “grounded theory”)
  3. Coding by ontological categories

Code-book Creation According to Theory

When taking a theory-based approach to the creation of a code book, the market researcher creates a list of concepts based on those found in the research questions or the hypothesis.  Using analytical frameworks and analysis grids, the researcher works through the narrative and codes the text according to theoretical reasoning.

To learn more about coding by induction (according to grounded theory), or coding by ontological categories, visit this article.

The Theme Code Book - A Reference Table

As shown in the table below, decimal numeric codes identify the themes. Numbering the themes in this manner makes sorting easy during data analysis. The code book table is separate from the data recording table set up in Step 1. It serves as a reference and is not an active table: text data is not entered or sorted in it.

In addition to the definitions of the themes and sub-themes, the code book may contain criteria used for inclusion or exclusion of text instances in the thematic categories.  Careful, logical design of the code book and the indexing structure promotes ease in coding and ultimately in the reporting of findings.
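
For readers who keep their code book in a script rather than a word processor table, one entry with its inclusion and exclusion criteria might be sketched as follows. The class name, field names, definition, and criteria are all hypothetical, written only for illustration. Only the code number and theme label come from the example code book in this article.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class CodeBookEntry:
    code: str        # decimal code, stored as text so "4.10" keeps its trailing zero
    theme: str       # short theme label
    definition: str  # what the theme means
    include_when: List[str] = field(default_factory=list)  # inclusion criteria
    exclude_when: List[str] = field(default_factory=list)  # exclusion criteria


def describe(entry):
    """Format one entry as a short reference card for coders."""
    lines = [f"{entry.code}  {entry.theme}", f"  Definition: {entry.definition}"]
    lines += [f"  Include: {rule}" for rule in entry.include_when]
    lines += [f"  Exclude: {rule}" for rule in entry.exclude_when]
    return "\n".join(lines)


# The definition and criteria below are invented for this sketch.
code_book = {
    "4.15": CodeBookEntry(
        code="4.15",
        theme="Ensemble music strengthens sense of community",
        definition="Respondent links playing in a group to feeling part of a community.",
        include_when=["Mentions belonging or shared purpose in an ensemble"],
        exclude_when=["Comments about solo practice or individual lessons"],
    )
}

print(describe(code_book["4.15"]))
```

Writing the criteria next to each definition gives every coder the same rule to consult when a passage is a borderline fit.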

A recommended format for a code book is conceptually logical and follows the conventional sequential structure of an outline.

Example - Step 2

Excerpt of 3-Level Code Book Table
4.0000   Music in the Barrio (Venezuela)
 4.05  Nucleo - music classes M-F afternoon and Sat. mornings
 4.10  Each one teach one - basic agreement
  4.105 As music students learn, they coach younger students
 4.15  Ensemble music strengthens sense of community
 4.20  Classical music paves way for social change / social justice
  4.205 Self-identity of young musicians forever altered
  4.215 Orchestra elevates social standing and fosters inclusion
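
To illustrate why decimal codes make sorting easy, the sketch below sorts a few coded passages by code and then isolates one theme together with its sub-themes. The code numbers come from the excerpt above. The passage excerpts and the function names are hypothetical, and the upper bound of 4.25 is an assumption based on the 0.05 step between second-level codes.

```python
from decimal import Decimal

# Hypothetical coded passages: (code, excerpt from the narrative text)
segments = [
    ("4.205", "I see myself as a musician now."),
    ("4.05", "We go to the nucleo every afternoon."),
    ("4.215", "People in the neighborhood respect the orchestra."),
    ("4.105", "I help the little ones with their scales."),
]


def sort_by_code(rows):
    """Sort rows numerically by code so sub-themes follow their parent theme."""
    return sorted(rows, key=lambda row: Decimal(row[0]))


def select_range(rows, low, high):
    """Return sorted rows whose code is at least low and below high."""
    lower, upper = Decimal(low), Decimal(high)
    return [row for row in sort_by_code(rows) if lower <= Decimal(row[0]) < upper]


for code, excerpt in sort_by_code(segments):
    print(code, "|", excerpt)

# Theme 4.20 and its sub-themes (4.205, 4.215), assuming the next theme starts at 4.25
social_change = select_range(segments, "4.20", "4.25")
```

Comparing the codes as decimal numbers rather than as text keeps the order correct even when codes have different numbers of digits.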


The next step in the preparation for data analysis is:

Step 3 - How to Prepare Data for Analysis and Triangulation.

