The Joys of Data Collection
I have spent a significant portion of the past week and a half (probably about 25 hours) trying to put together data for a paper that I’m working on with a friend. It’s an interesting project: it tries to measure the effects of negative political shocks in the Sino-Japanese relationship on the economic fortunes of specific types of firms. We started working on this about a year ago, but it was very difficult to get our hands on the fine-grained company-level data that we needed. We managed to piece together something workable from a number of different databases and wrote up an early version of our findings, but it’s been a continual goal of mine to find better sources. And now that I’m in Tokyo, lo and behold, I’ve stumbled upon some really promising stuff!
Of course, that means that I now have to get these great new sources into a form that suits the question we’re trying to answer. All of the data is in Japanese, so I get to deal with this part (and my friend gets to handle the statistical analysis later). And since we need data on every firm listed on the Tokyo Stock Exchange and the stuff I’ve found is only available in hard copy, I get to spend a lot of time at my computer painstakingly entering numbers and doing somewhat laborious calculations of employee ratios and currency conversions while staring at pages that look like this:
This might seem like a total drag, but it’s actually kind of fun. I don’t often do large-N quantitative work (translation: work that uses numbers to analyze a large number of units such as firms or individuals), but I always feel a certain sense of satisfaction about the … concreteness of it all. So often in the social sciences, we are mired in a web of complex explanations and narratives. And sometimes it’s nice just to be able to deal with numbers and assets, to put them all into a huge spreadsheet and feel like you’ve accomplished something tangible. There’s also a feeling of anticipation, since you do all of this entering and coding without knowing what the answer will be in the end. I’ve been lucky enough to have research assistants to help me with data entry in the past, but in this case, I’m really glad that I’m working with the data myself. You get a much better sense of what you’re working with, and as you become familiar with the data, you come up with ideas about new ways to measure things. Plus, this is a fairly complicated coding scheme, so I’m not sure that I’d trust anyone else to do it.
Of course, after 25 hours, your (my) enthusiasm for this does start to dull a bit. But it usually picks back up after a good night’s sleep.
Anyway, this entry is mostly intended for people who wonder what on earth I do. :) I do things like this sometimes. Not often. But sometimes. My dissertation research is mostly qualitative, which means that I spend a lot of time reading through accounts of events and talking to academics and people in government. So, the two projects are a nice contrast to one another.
In any case, I probably have about 25 more hours of work to do on this before the library book is due back next Wednesday, so I’d better get myself some rest. Things are so busy!