Qualitative data analysis with Excel and NVDA

Áine Kelly-Costello

Notes

This written text aligns roughly with a presentation I gave to the 2020 NVDA Conference on 5 Dec. The text is designed to flow and is not a verbatim transcript, as the recordings contain many more demonstrations of the commands using the example files linked, where you can also hear exactly how NVDA responds. This text also includes a couple of additional references thanks to audience input. I can also update it if anything is unclear, so if you have questions, you’re welcome to contact me.

Resources

Here is a Dropbox folder where you can find audio of this presentation recorded in sections, along with the example text used at various stages of completion. The relevant files are also linked throughout the presentation.

Intro

Hi everyone, I’m Áine, a blind master’s student.

Today I’d like to explain one way of doing qualitative data analysis using Excel and NVDA, though most of this process isn’t screen reader-specific.

I wanted to make this presentation because for my thesis, I needed to thematically code data for transcripts from about ten hours’ worth of interviews, and I was looking around for screen reader-friendly qualitative data analysis software that would work with NVDA and I wasn’t getting anywhere. So this presentation will hopefully be useful to you if you have some text-based data where you want to break it down to a more micro level so you can then label, or code, portions of it in certain ways. We’ll talk about preparing your data for coding, sorting and filtering your coded data and ways to keep track of your progress.

Please note that I’m not using any visual aspects here and that some of the methods described below will not necessarily be optimal if you are more reliant on reading data visually or if you need to show your data to someone who is, though I will mention a couple options for sorting in the latter case which were suggested by audience members.

Finally, I want to make it very clear that I’m certainly no expert in qualitative data analysis methods more broadly, so I’m just speaking on what I’ve learned from my own experience on making the analysis process accessible for me during my thesis.

Cleaning and segmenting your data

Recording

The type of data I’m imagining you have involves text of some kind, so that could be interview transcripts, the answers to open-ended survey questions, text of media articles, etc. Today, I’ll use one short interview transcript as an example. Lainey Feingold interviews blind researcher Joshua Miele for the Disability Visibility Project about his work in the accessible media space and how he got into that. Here’s that example interview online and here’s a Word file with the raw transcript copied and pasted.

The first thing you want to do is get your data nice and clean. So this means removing excess data, retaining any context needed for each data point to stand alone, and segmenting your data so that it’s ready to be coded.

You might be starting with an Excel sheet already, for instance a spreadsheet of survey responses. There are of course data cleaning methods you can use in Excel as well, but today we’ll focus on working in a Word file, as I think that’s sometimes faster when it comes to tidying up transcripts or similarly formatted data. In Word, a hard return entered with the enter key is the equivalent of a new row in Excel, and a tab character starts a new column. To enter these characters in the find and replace box, for example if you would like to insert a new line (and therefore a new row) after every instance of a full stop, you would use ^p for a new line and ^t for a tab.
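By way of illustration, here’s a small Python sketch (not part of the Excel workflow itself) of that same mapping: newlines become rows and tabs become columns when the text is pasted into Excel. The sample rows are made up.

```python
# A minimal sketch of the newline-per-row, tab-per-column mapping:
# each "\n" becomes a new Excel row and each "\t" a new column.
rows = [
    ["ID", "Name", "Data"],
    ["1", "JM", "First paragraph of the answer."],
]

# Joining with tabs and newlines produces text that pastes into
# Excel (or opens from a .txt file) as one cell per field.
tsv = "\n".join("\t".join(row) for row in rows)

print(tsv)
```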

As for how finely you divide your data up, you can code at any level, from paragraph right down to word, but keep in mind that the unit a screen reader will read back in one go is a single cell, so with this method, dividing by word might make seeing the bigger picture difficult. Today we will code by paragraph.

I recommend starting by creating duplicate files of your data. For good measure, distinguish the original file from the copy you’ll be editing by renaming the files clearly so there’s no risk of losing your original data. So let’s have a look at this interview transcript now.

First, let’s see what kind of info we might want to remove. Apart from the question responses which we want to analyse, we have the interview questions, some music cues, and the names of the speakers. All of that needs to come out. But it can also be useful to leave in any context that will help individual data points stand alone. For instance, if a question is simply answered with “yes” or “no”, it can help to note down the question within the cell so that the answer is in context. When I add info in this way I do it in [square brackets].

This interview transcript, lucky for us, is already divided into paragraphs, and we don’t need to do any more segmenting. But if you were going to code by sentence, for instance, you would need to add a hard return after each sentence. You can use the find and replace dialogue for this, accessed with ctrl+h. Replace “. “ with “. ^p”, but beware that you might get false positives like “Dr.” or “e.g.”, and that some sentences will end in an exclamation mark or question mark.
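If you’re comfortable with a scripting language, the same split can be sketched in Python with a regular expression that skips known abbreviations. The sample text and the abbreviation list here are made up; you’d extend the list to suit your own data.

```python
import re

text = "Dr. Miele spoke first. He mentioned braille maps! Then came questions."

# Split on a space that follows ., ! or ?, except where the text just
# before it is a known abbreviation such as "Dr." or "e.g.".
pattern = r"(?<!Dr\.)(?<!e\.g\.)(?<=[.!?]) "
sentences = re.split(pattern, text)

print(sentences)
```

Each entry of the resulting list would then become its own row once pasted into Excel.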

Finally, before transferring data to Excel, it wouldn’t hurt to double check that it doesn’t contain any unwanted blank lines. That’s because they will cause you to have a blank row in your data, and generally speaking, you don’t want any of those. We can also do that using the find and replace dialogue accessed with ctrl+h. We’ll find ^p^p and replace it with ^p, and press alt+a to replace all. Press alt+a again if any replacements are made until you hear that 0 replacements were made.
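For what it’s worth, the repeated ^p^p to ^p replacement can be sketched in Python in one step, collapsing any run of newlines to a single one (the sample text is made up):

```python
import re

text = "First paragraph.\n\n\nSecond paragraph.\n"

# Replace any run of consecutive newlines with a single one: the
# equivalent of repeating the ^p^p -> ^p replacement until 0 remain.
cleaned = re.sub(r"\n{2,}", "\n", text)

print(cleaned)
```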

Setting up your Excel sheet

Recording, start at 7:50

Clean interview transcript

Once the data cleaning and division is done, make a new blank Workbook (or sheet) in Excel. Now you need to pick your column headers. For today, I’m going to pick ID, Name, Notes and Data, and I’ll come back to these.

Now is a good time to put the cursor in cell a1 and press NVDA+shift+c so that NVDA will read your column headers for you. If it doesn’t work, you’ll need to open the document formatting settings, ctrl+NVDA+d, and check the boxes for both tables and row and column headers. The text you’ll want read out on each row is not your ID, in column A, but rather your data. Today that’s going to land in column d. So we’ll put the cursor in d1 and press NVDA+shift+r there.

Next, let’s put the cursor in cell d2, under the data heading, return to our file, select all, copy, and paste our data into excel.

Now we have that pasted data ready for us in the d column, one cell per paragraph.

If you don’t already have a unique identifier for each data point, we’ll go through making a numerical ID column, in this case assigning a number, starting with 1, to each row of data. I’d recommend this as it makes sorting data back into its original order easy later, and it also means each of your possibly long datapoints has a short unique identifier.

So, we’ve just pasted our data into column d. We’ll press ctrl+down, which takes us to the bottom-most data point before a blank row; in our case that’s the end of our data. Note which row the data goes down to: here, that’s row 14. You might recall we labelled column A as ID. We’ll move back over to the ID column now with ctrl+home to land in cell a1.

There are many ways to create this numerical sequence. We’ll do it as follows. Type 1 in a2, press enter, type 2 in a3, press enter. Press f5, which is the “go to” dialogue. Now type a2:a14 to select vertically from a2 down to a14, and press enter. The command you want now is fill series. One way to find it is to press alt+q for the command search box, start typing “fill”, then down arrow to “fill series or pattern”, right arrow to expand, find “series” in this list, press enter, tab through the options to the okay button and press it.
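As an aside, the same numbering that fill series produces can be sketched in Python with enumerate, in case you ever prepare your data with a script first (the sample paragraphs are made up):

```python
# Pair each data paragraph with a numeric ID starting at 1,
# mirroring Excel's fill series down the ID column.
paragraphs = ["First answer.", "Second answer.", "Third answer."]

rows = [(i, text) for i, text in enumerate(paragraphs, start=1)]

print(rows)
```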

We labelled column B as Name. Today, this is redundant, as we’ve only got one interviewee, but if you have ten or a hundred and they are not identified by the ID, this could help.

So, let’s say we want to write JM for Josh Miele all the way down the column, or otherwise fill in any repetitive info. In b2, we’ll type JM, and press enter. Press f5 for the “go to” dialogue and type b2:b14 to select the b column down to where our data stops. Press enter and then ctrl+d to fill down, which means JM, as we wrote in b2, is now copied down the column. This works for formulas too, by the way.

Earlier, we labelled column C “Notes”, for a convenient place to jot down any of those as we code our data and now we’re ready to code.

Coding the data

Excel sheet set up for coding

Recording

I’ve already read this transcript a few times, so I’m going to pick out a bunch of themes we can code it by (this isn’t methodical, just first impressions). We’ll type these themes starting in column E, pressing tab after each to move into the next column. They are: Access to information, Information deficit, Autonomy, Perceptions, Receiving support, Expectations, Stigma, Fitting in, Advocacy, Identity, Community and Pride.

In reality, you might not know which themes you want to code for at first, and if you think of more as you go along, you can either make new columns on the fly or make a note beside the point in question, particularly if you found one that’s a bit of an outlier.

Before we begin, I’m going to turn the reading of cell coordinates off, I personally find they are superfluous here. That option is under document formatting settings, press ctrl+NVDA+d to bring them up.

Now, in order to code a theme as being present in our data, I recommend using a key within easy reach of your left hand; assuming you have the use of both hands, that leaves your right hand free for the arrows. I personally default to using x. So we’ll find column d with our data and arrow down to d2 to read it. Then we’ll arrow right till we find the theme or themes to code it for, and put an x in the appropriate cell.

In coding the data in d2 (see spreadsheet), I noticed another useful theme, autonomy. So I’ll add that on to the end of the themes in the first row.

Of course, we can choose to assign either one or multiple themes to each piece of data. You might also decide to code for primary and secondary themes, for instance by marking primary themes with x and secondary ones with z. By the way, if you do this, it’s easiest if your primary theme indicator comes earlier in the alphabet than your secondary one, as it saves a step when sorting later. Remember you can use the notes column to jot down anything that comes to mind, and here I’ll also give the example of using it to mark which cells I have yet to code. I’ll do that by writing “to code” in the top-most uncoded row, selecting the rest of the column down to the bottom row of data and pressing ctrl+d for fill down.

Now we have some beautifully coded data (well, partially coded in our example). Now is a good time to make another spreadsheet where you copy your raw data, in case you accidentally mess something up in here. To do that, press alt+shift+f1 for a new sheet. I don’t know if there’s a faster way, but to rename that sheet to something meaningful, press alt q, start typing “rename”, arrow down to format, right to expand it, and six presses of up arrow should get you to the rename option, where you press enter and type a new name. The commands to switch sheets are ctrl+pageup and ctrl+pagedown. You might want to give your original data sheet a name too, while you’re at it.

Sorting, grouping and reordering your data

Recording

The spreadsheet so far

In real life, what you do next will depend on your specific needs. You might want to read your data according to one of the themes you’ve just coded for. Perhaps you want to sort by primary and secondary themes, or sort by all the text which is relevant to theme A AND theme B.

All of these options involve the “Sort” dialogue. As an example, we’ll sort by perceptions and then by stigma.

We’ll start by selecting the whole sheet with ctrl+a. Then press alt+q for the command search and type “sort”, down arrow to find the option that says “sort: find values quickly by sorting your data” and press enter. Something odd happens with the focus in this dialogue. You start out on the okay button, and in order to find the “sort by” selection list, you need to press shift+tab once, and then tab, and now instead of landing back on the okay button, you should be on “sort by”. Now down arrow to select the theme to sort by from the list of columns, we’ll select “Perceptions”.

If you just wanted to sort by one theme, tab forward till you find the okay button. Note that the default sort is alphabetical, so if you for instance marked primary themes with x and secondary ones with z, it’ll sort primary before secondary by default, where blank/unmarked cells will come last.

We’re now also going to sort by a second theme, “stigma”. We’ll press alt+a to add another sorting level, press tab, and then select “Stigma” from the list. Then tab forward to the okay button.
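To make the blanks-last behaviour concrete, here’s a Python sketch of the same two-level sort over some made-up rows. Empty cells are mapped past “z” (to “~”) so they sort after both the x and z markers, matching Excel’s default of putting blank cells last.

```python
# Rows coded with "x" in the Perceptions and Stigma columns
# (hypothetical sample, modelled on the spreadsheet layout).
rows = [
    {"ID": 1, "Perceptions": "", "Stigma": "x"},
    {"ID": 2, "Perceptions": "x", "Stigma": "x"},
    {"ID": 3, "Perceptions": "x", "Stigma": ""},
]

# Sort by Perceptions, then Stigma. Blank cells are mapped to "~",
# which sorts after every letter, so coded rows come first.
def sort_key(row):
    return (row["Perceptions"] or "~", row["Stigma"] or "~")

ordered = sorted(rows, key=sort_key)
print([row["ID"] for row in ordered])
```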

We won’t do this now, but if you would like to select all the data cells corresponding to both perception and stigma, for instance to copy them somewhere else, you need to figure out where the theme overlap stops. To find that point, go to the column of the second theme you sorted by, “stigma” in our case. Press ctrl+down and you should be on the last cell coded as “Stigma”. Take a note of its row number (if you turned off cell coordinate reporting before, you’ll want to turn it back on now). Then select the data in any of the usual ways.

If you have used a numerical ID as we have today and at any point you want to return your data to its original unsorted order, all you have to do is select all, press alt+q for command search, type sort, arrow down to “sort ascending” and press enter.

An audience member brought up a useful suggestion, which is that you can also group your sorted data, and this is a visually useful way of delineating it. The main thing to know is that to group a particular range, you select all the rows in that range, press alt+shift+right and select the “rows” radio button. From here it’s also possible to hide (or collapse) your grouped data. Read more about grouping here.

If at some point you realise your columns are in an inconvenient order, for instance if you’d like to select your data with its ID but not the interviewee name column which is currently in between them, reordering columns is straightforward. You just need to select the entire column and press ctrl+x for cut. Don’t be alarmed by the fact that the cells are still showing. Now move the cursor to the column header which is directly to the right of where you want your column to land. Here press ctrl+shift+= (which is ctrl++), and if options come up asking if you want to move cells right, press enter. Otherwise your column should have inserted itself at the desired position by default. Here’s the spreadsheet with the “autonomy” column reordered.

Keyword search and filtering by keyword

Recording, start at 11:04

Another thing you might want to do is show the data that corresponds to a particular keyword that some cells contain. For instance you might have used a particular keyword in a few notes cells, such as “To code” for cells you wanted to come back to. There are two ways to do that. One is to select your data column, press ctrl+f for find, type your keyword, press alt+I which is the find all command and arrow down the list (which unfortunately will probably read you some extraneous info about the file and sheet as well).

The other method is filtering. It’s not the most intuitive but here are the steps. Select your data column as usual, now press ctrl+shift+l, followed by alt+down. Now press f twice. Then type the word (or character) you want to filter by and press enter.
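The filtering idea can be sketched in Python too, keeping only the rows whose Notes cell contains the keyword (the sample rows are made up):

```python
# Keep only rows whose Notes cell contains the keyword, the same
# idea as Excel's text filter (hypothetical sample rows).
rows = [
    {"ID": 1, "Notes": "to code", "Data": "First answer."},
    {"ID": 2, "Notes": "", "Data": "Second answer."},
    {"ID": 3, "Notes": "to code", "Data": "Third answer."},
]

keyword = "to code"
matches = [row for row in rows if keyword in row["Notes"]]

print([row["ID"] for row in matches])
```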

Cross referencing and tracking progress

Recording, start at 18:00

When I was writing my thesis, the next step after initially coding data was making a new file with bullet points that summarised what I found. Now that I was in a new document, it was very useful to have the ID to refer to each data cell by. To jump to a certain ID when the data is in its original order, you can simply use the “go to” dialogue and type the cell coordinate: the column letter followed by the row number, which is the numeric ID plus 1 (because of the header row).

Another thing you might need to do is keep a note of which data points you’ve already vetted for this second stage of analysis, especially as you’ve probably been doing a bunch of sorting and moving them around. I used two different ways of doing this.

One is creating another column called “sorted” and, just as with the themes, marking an x in the appropriate cell once I’d taken care of analysing that data point. This method is useful as it gives you a way to sort by which points you have yet to analyse, but it is also not readily obvious which data points you’ve sorted without constantly arrowing over to this column.

To overcome this, you might like to mark all of your sorted rows as bold, or another font attribute that you can ask NVDA to read. The report font attributes setting is in the document formatting settings, and you could also use the input gestures dialogue to set up a keyboard toggle for it. One thing you need to know about how this setting works is that what gets reported is when the font attribute changes. So let’s say you have three bolded cells in a row and the fourth is not bold: NVDA will announce “bold” on the first one, and then as you arrow through them, it won’t mention bold again till you reach “no bold” on the fourth cell.

Finally, as mentioned by an audience member, there are other ways you could demarcate your sorted data which are more friendly to sighted readers. Grouping, as discussed above, is one such way, and another is colour-coding. I won’t go into that here, but if you know what colour scheme you are after, selecting your cells, pressing alt+q, typing “colour” and arrowing down will bring up the fill colour palette for you to pick from.

Conclusion

I hope this info was useful. My final observation would be that with a combination of persistent googling and trial and error, there’s quite a lot you can accessibly do to manipulate and analyse even large amounts of qualitative data in Excel. Remember to make friends with f5, which is the “go to” dialogue, and also alt+q, which is where you search for commands so you don’t have to remember their baffling multi-lettered key combinations, though of course you can google and memorise any of those that you use regularly, if you prefer. I can’t over-emphasise how much you can learn about Excel through googling, even if it sometimes takes a bit of persistence to find tutorials that don’t hide useful information inside images!

If you have further questions, you can contact me here or on Twitter @ainekc95.
