Transcription approaches to multilingual discourse analysis
by Tim Greer, Hokkaido University
This paper is based on the author's ongoing investigation into codeswitching and social identity among multi-ethnic Japanese teenagers. It outlines some of the challenges in transcribing bilingual conversational data through a hybrid ethnographic/CA approach, including the nature and positioning of translations within the text, the use of romanized or kana script for rendering Japanese speech into text and the adoption of other transcript conventions. In addition, the paper will also introduce readers to the transcription software Transana.
Introduction

Transcribing spoken discourse is more than just writing down what is being said. The way the researcher chooses to render spoken data into text can influence not only readers' interpretation of those data, but also the subsequent analysis. When the data are based on dual-medium interaction, such as those in the author's ongoing investigation into codeswitching among multi-ethnic Japanese teenagers, informed transcription decisions are all the more vital.
The aim of this paper, then, is to compare approaches to transcription across various discourse analytic traditions in order to document the rationale behind the conventions adopted by the author in his ethnographic/Applied CA study of codeswitching at an international high school. After briefly describing the context of the wider study, the paper outlines some of the challenges in transcribing bilingual conversational data through a hybrid Conversation Analysis (CA) approach.
From Data Collection to Data Transcription

As mentioned above, the author's wider study is concerned with codeswitching and multi-ethnic identity in an international school. Codeswitching, for the purposes of this paper, can be operationalized as language alternation, or the switching from one language to another either within a conversation or within a single utterance. Following the ethnomethodological tradition of adopting the participants' understandings, a definition of codeswitching according to those who use it daily is reproduced below in Transcript 1. The transcript comes from a conversation in which a group of multi-ethnic teenagers from the international school negotiate the meaning of the word "champon", a Japanese slang term that can refer to mixed bilingual speech. Through the sequential co-construction of meaning by four speakers, Mia rather creatively arrives at the definition "It's like ko iu koto".
While the study of codeswitching and its implications for social identity are the main themes of my wider study, here the purpose of this transcription is to exemplify some of the challenges faced by the analyst in rendering bilingual speech into script. Many of these may become apparent just from the brief taste given in the first transcript.
TRANSCRIPT 1:

The word "champon" appears on a discussion focus sheet the participants are reading before the session begins. Joe is unfamiliar with the word and appeals to the others for clarification.

1. Joe:   champon tte nan dakke
          'What's champon mean again?'
2. Eli:   mazeru
          'Mixed'
3. Mia:   arui wa, (.) um it's like (.)
          'or else, it's like'
4. Erika: English to nihongo o mazeru
          'Mixing English and Japanese'
5. Karen: gucha gucha ni suru
          'Mashing it all together'
6. Joe:   ah hai hai hai hai
          'oh, right right right right'
7. Mia:   It's like koh iu koto
          'It's like this sort of thing.'
8. Karen: h ha
The data I have collected for my qualitative investigation are based on a series of focus group sessions and extended classroom and out-of-class observation. After an initial week of observations in which I familiarized myself with the various groups in the school, I decided to focus on the upper school, from 10th to 12th grade, because students at this level had a more balanced command of both languages, meaning their codeswitching would be less likely to stem from L1 transfer. During a subsequent two-week block of ethnographic observations, I videoed the participants' interaction with their teachers in class and among their peer group during lunch and before and after school. As anticipated, incidents of codeswitching increased outside the classroom, but some in-class switches were also observed. The intensive block was followed by additional weekly observations and a series of focus groups in which I took up issues of language and identity in group discussion.
With over fifty hours of video footage, I have necessarily been selective even before transcribing the data, collecting field notes while recording the conversations and simultaneously annotating them according to my research focus. My initial review of the data followed the CA convention of "unmotivated looking" (ten Have, 1999). I paid particular attention to sequences in which the participants were heard to be somehow indexing their ethnic identities, and further looked at the role of codeswitching in this process. Initially I created a broad index to the tapes, listing the key episodes under coded titles that would be recognizable to me later and flagging those I felt were most significant with a simple system of stars and colours. Later I returned to these key episodes to transcribe them with more detailed CA conventions in order to document the locally emergent meaning of various codeswitches in specific instances of talk-in-interaction. For the focus group sessions, which were data-rich but ultimately researcher-initiated group conversations, I employed a general transcription approach which I coded in vivo for content using QSR's N5 software. At this level I was concerned primarily with content over form, so the transcription followed generally conventional orthographic practices. In other words, I transcribed the participants' talk as if they had written what was said.
The transcription software I have been using in my study is Transana. This tool, available for free at www.transana.org, facilitates the transcription and analysis of audio and video data. On a single screen users can view footage in both video and soundwave form, transcribe the data, and link specific points in the transcript to individual frames in the video. A major advantage of having everything accessible in one place is that the audio and video controls can be operated from the keyboard: instead of continually removing your hands from the keyboard to pause and rewind the video, these actions are available within the program itself, expediting the transcription process considerably. Moreover, the soundwave display allows precise alignment of utterances and facilitates the accurate measurement of such interactional elements as pauses and overlaps. By placing time codes within the transcript, the researcher can have the software automatically highlight the relevant portions of the transcript as the video plays.
Figure 1. A screenshot of the transcription software Transana.
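By way of illustration, the time-code mechanism described above can be sketched in a few lines of Python. This is a toy model of my own devising, not Transana's actual API or data format: each transcript line is paired with the playback offset (in milliseconds) at which it begins, and the line to highlight is simply the most recent one to have started.

```python
# Toy model of time-code-based transcript highlighting
# (illustrative only; not Transana's internal data model).
transcript = [
    (0, "Joe: champon tte nan dakke"),
    (1800, "Eli: mazeru"),
    (3200, "Mia: arui wa, (.) um it's like (.)"),
    (5600, "Erika: English to nihongo o mazeru"),
]

def current_line(transcript, playback_ms):
    """Return the transcript line active at the given playback position (ms)."""
    active = None
    for start_ms, line in transcript:
        if start_ms <= playback_ms:
            active = line  # this line has started; remember it
        else:
            break  # lines are in time order, so no later line has started
    return active

print(current_line(transcript, 4000))  # Mia's turn is the current one
```

The sketch simply makes explicit the lookup that such time codes enable; in Transana the highlighting happens automatically as the video plays.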
In addition, Transana contains analytic tools for organizing video data via the creation of a hierarchy of user-designated keywords. The researcher can select analytically interesting portions of the video and organize them into meaningful groups in a form of qualitative coding. Transana can then search for instances of keywords and retrieve the video clips to which they have been applied. I have found Transana to be an extremely useful tool for transcribing my data and anticipate that its use will extend further into the presentation of those data, through the inclusion of the transcripts in electronic form along with the final dissertation.
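The keyword-based coding process just described can be roughed out as follows. The clip identifiers, timestamps and keywords below are invented for illustration, and the structure is my own simplification, not Transana's actual data model:

```python
# Invented example of keyword-coded video clips
# (a simplification, not Transana's internal representation).
clips = [
    {"id": "lunch-04", "start": "00:12:31", "keywords": {"codeswitching", "identity"}},
    {"id": "class-11", "start": "00:03:05", "keywords": {"codeswitching"}},
    {"id": "focus-02", "start": "00:45:10", "keywords": {"identity", "ethnicity"}},
]

def search(clips, keyword):
    """Return the ids of all clips coded with the given keyword."""
    return [c["id"] for c in clips if keyword in c["keywords"]]

print(search(clips, "codeswitching"))  # ['lunch-04', 'class-11']
```

A search like this is what lets the researcher pull together every coded instance of, say, codeswitching across fifty hours of footage without re-watching the tapes.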
Transcription Considerations from a Conversation Analytic Perspective

At the most detailed level of my research I have chosen to transcribe and analyze my data according to the conversation analytic tradition, a qualitative yet empirical discipline that aims to document participants' social actions through a detailed examination of the turn-by-turn organization and sequencing of their interaction. As such, CA places a heavy emphasis on transcription, not only as a means of orthographically encoding the spoken word, but also as an integral part of the analysis itself. CA adopts the turn as the basic unit of analysis, and most research in the field is based on the "simplest systematics" model of turn-taking (Sacks, Schegloff, & Jefferson, 1974). Accordingly, elements of naturalistic talk such as turn overlaps, gaps and pauses, breathiness and laughter are all taken into account in a CA transcript. For example, rising intonation is represented with an upward arrow (↑) and quieter talk is bracketed by degree marks (°). See the Appendix for a complete list of the transcription conventions employed.
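One side effect of such systematic conventions is that they make a transcript partly machine-readable. As a rough illustration, the toy Python snippet below pulls timed pauses and degree-marked quiet talk out of a Jefferson-style line; the regular expressions are my own and no standard CA parsing tool is implied.

```python
import re

# Toy extractor for two Jeffersonian conventions: timed pauses, e.g. (0.6),
# and quiet speech bracketed by degree signs, e.g. °zasshu?°.
line = "zasshu? (0.6) °zasshu?°"

pauses = re.findall(r"\((\d+\.\d+)\)", line)  # pause lengths in seconds
quiet = re.findall(r"°([^°]+)°", line)        # degree-marked quiet stretches

print(pauses)  # ['0.6']
print(quiet)   # ['zasshu?']
```

Even this crude extraction shows why consistency matters: an analyst (or a script) can only recover pause lengths and volume shifts if the same notation is applied throughout the data.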
An example of how one of my CA-based transcripts appears in practice is given below in Transcript 2. As is evident, the attention to detail and completeness in an attempt to document all aspects of the interaction means that the transcript can become fairly difficult to interpret for those from other disciplines. However, this is a necessary evil in the attempt to obtain a complete, consistent and accurate representation of the spoken word (Wood and Kroger, 2000).
Even issues that are not readily apparent had to be taken into consideration in the transcription process. Initially I had to decide whether to give the participants pseudonyms or simply to call them A or B, as in Transcript 3 below. "Pure" CA aficionados would hold that issues of identity such as name or gender are relevant to the analysis only to the extent that the members orient to them specifically in the localized context of the talk. My position is that the names are known to the participants, and that using them gives readers an understanding the members already share. Likewise I had to decide whether to number according to the turn or the line. Both are acceptable conventions in discourse analysis, and numbering serves as a convenient analyst's tool rather than as a part of the talk-in-interaction itself. I have decided to number turns rather than lines, partly because I provide translations mid-script, so there is a potential for confusion about what is being said and what is being translated.
TRANSCRIPT 2: (Greer, 2003)
The participants have been discussing racial epithets that have been ascribed to them as multi-ethnic Japanese people.

40 Peter:  ore zasshu to iwaretchatta
           'I was called mongrel'
41 Others: ((loud laughter)) ooh ha ha HUH ((jocular hand clap)) that's (a good one)
42 Eli:    ((laughing)) Zasshu? That's like you're a dog or something
           'Mongrel?'
43 Peter:  mm ((casts gaze down at desk))
44 Eli:    ((sees it is perhaps not a laughing matter and changes her tone of voice)) Zasshu? (.6) °Zasshu?°
           'Mongrel?' °'Mongrel?'°
45 Erika:  Nani? Kodomo ni?
           'What? By some kid?'
46 Peter:  Un
           'Yeah'
47 Eli:    How could someone say that?
48 Peter:  ((smiling again)) hee ha hidokunai?
           'Don't you think that's terrible?'
49 Eli:    ((somewhat subdued laughter)) °ha ha ha°
While CA-based transcripts have been criticized for being too detailed, proponents maintain that it is through such fine-grained attention to detail that elements of the interactional work become apparent, and that such "transcripts play a key role in the claim of CA to be a rigorous empirical discipline" (Hutchby & Wooffitt, 1998, p. 92).
CA researchers have overwhelmingly favored Jeffersonian transcription (see Atkinson & Heritage, 1984), but this approach is by no means limited to CA, and its methods are increasingly being used by linguists in other discourse analytic traditions. Many of the conventions I have used are based on Jefferson's transcripts, although I have chosen to adapt some of them in order to facilitate a hybrid methodological approach.
Challenges Inherent in Transcribing Multilingual Data
". . . the potential for transcription to become a politicized tool of linguistic representation is ever-present"
Multilingual data necessitate translation at some level, if only to accommodate a wider readership. This section discusses some of the challenges faced in rendering Japanese talk into English within the bounds of a transcription.
Bucholtz (1999) notes that, in making interpretive and representational decisions about what is transcribed and how the talk is described, the transcriber's own beliefs and expectations about the interaction and the speakers inevitably enter the final transcription. Transcribers must decide whether to render speech to text by conforming to written conventions or to retain links to the original oral discourse, such as accent or dialectal idiosyncrasies. But whether naturalized or denaturalized transcription approaches are adopted, the potential for transcription to become a politicized tool of linguistic representation is ever-present. When the data include multiple languages and consequently require translation, the danger is even more apparent.
In my transcriptions, I have carried out most of the translation myself, possessing the appropriate competence in Japanese to do so, but I have made extensive checks along the way with native-speakers and the participants themselves in order to confirm my interpretations of their intentions. I have largely rendered them into a naturalized translation, in the belief that a detailed literal and syntactic record, such as that adopted by Nishimura (1997), is not warranted for the present analysis. I feel justified in translating Japanese speech to English, my first language, but would hesitate at attempting the reverse process, except at a very basic level. Likewise, I feel that native-speakers of Japanese can be a useful resource in understanding difficult sections of the data, but ultimately it should be a native speaker of English who provides the final translation into English.
I decided to use romaji instead of kana for the Japanese data in order to make them accessible to a wider audience. Japanese is, of course, not normally written in the 26 letters of the Roman alphabet, but as Japanese orthography is not well understood by many potential readers, I adopted its most readable form. In addition, even within Japanese transcripts that retain kana script there are still questions about whether to write elements of talk in katakana, hiragana or kanji, and these choices can all marginally affect the way the reader interprets the transcript.
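As a minimal illustration of the romanization involved, the sketch below maps just the handful of kana needed to render ちゃんぽん into romaji. The mapping table is deliberately tiny and purely illustrative; note that it yields "chanpon", whereas traditional Hepburn romanization writes the syllabic ん as "m" before p, giving the spelling "champon" used in this paper. That difference is itself an instance of the kind of representational choice at issue here.

```python
# Toy kana-to-romaji table covering only ちゃんぽん (illustrative only;
# a real romanizer needs the full syllabary and style decisions).
kana_to_romaji = {"ちゃ": "cha", "ん": "n", "ぽ": "po"}

def romanize(text, table=kana_to_romaji):
    out, i = "", 0
    while i < len(text):
        # Prefer two-character units (digraphs like ちゃ) over single kana.
        for size in (2, 1):
            chunk = text[i:i + size]
            if chunk in table:
                out += table[chunk]
                i += size
                break
        else:
            out += text[i]  # pass through anything unmapped
            i += 1
    return out

print(romanize("ちゃんぽん"))  # chanpon
```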
Finally, an important consideration is the positioning of translations in the transcript. In Transcript 3, Gafaranga and Torras (2002) provide their translations after the segment has ended, perhaps in an effort to keep their transcripts faithful to the original talk. This is another acceptable transcription approach for multilingual data, but it would no doubt work best for relatively short sequences.
Gafaranga and Torras differentiate between the two languages in this sequence by use of italics and bold fonts (English and Castilian) and render inaudible talk with x's. This solution seems to work well for the purposes of their analysis, which perhaps relies less on a detailed consideration of overlap. The simplified version of CA transcription allows novice readers access to their main point, and readers who understand both of the languages in the transcript can follow the conversation in much the same way the participants performed it.
TRANSCRIPT 3: (Gafaranga & Torras, 2002)
Talk takes place in an Irish pub in Barcelona between two customers of Anglo-Saxon origin (A, B) on the one hand, and a Spanish bar attendant (C) on the other.

1.  C: una pinta de Scrumpy
2.  B: y un gin and tonic
3.  C: mil ciento cincuenta
4.  B: OK xxx (.) (to CU1) oh let me get let me xxx the change
5.  A: OK
6.  B: xxx (.) eight twenty-four
7.  A: xxx
8.  B: yeah but I wanna xxx as much as I can (.) xxx (.) look at that (.) I don't need any more
9.  A: OK
10. C: ya está
11. B: sí sí
==============
1.  C: one pint of Scrumpy
2.  B: and one gin and tonic
3.  C: one thousand one hundred and fifty
10. C: is that it
11. B: yes yes
However, as not every turn requires translating, such end-placed translations can become disjointed and confusing. Cashman (2000, 2001, 2002) and Bailey (2000a, 2000b) instead place their translations mid-transcript, which allows the reader to follow the conversation closer to its source and to determine quickly the nature and extent of mid-turn switches.
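The two placements can be contrasted with a small rendering sketch. The helper functions below are my own invention (using turns from Transcript 1 as data), not a tool used in the study:

```python
# Invented helpers contrasting mid-transcript (interleaved) translations
# with translations collected after the segment.
turns = [
    ("Joe", "champon tte nan dakke", "What's champon mean again?"),
    ("Eli", "mazeru", "Mixed"),
]

def interleaved(turns):
    """Place each translation directly under its original turn."""
    lines = []
    for n, (speaker, original, gloss) in enumerate(turns, 1):
        lines.append(f"{n}. {speaker}: {original}")
        lines.append(f"   '{gloss}'")
    return "\n".join(lines)

def post_segment(turns):
    """Collect all translations after the segment, separated by a rule."""
    originals = [f"{n}. {s}: {o}" for n, (s, o, _) in enumerate(turns, 1)]
    glosses = [f"{n}. {s}: '{g}'" for n, (s, _, g) in enumerate(turns, 1)]
    return "\n".join(originals + ["=" * 14] + glosses)

print(interleaved(turns))
print(post_segment(turns))
```

The interleaved form keeps each gloss next to its source turn, which is why it suits data with frequent mid-turn switches; the post-segment form keeps the original talk visually intact, which suits short monolingual-by-turn stretches like Transcript 3.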
One final consideration noticeable in comparing Transcripts 2 and 3 is the font used in each. Most CA researchers write and display their transcripts in some version of the Courier font because, as each letter is the same width, it is easier to align overlapping utterances. Courier also facilitates single-word translations mid-transcript in multilingual discourse analysis. Gafaranga and Torras, on the other hand, have used Times New Roman for their transcript, which again has the benefit of assisting readability for researchers from outside the field. They may also have adopted this approach because their data did not include many overlaps and the translations were provided post-transcript. Either method has its pros and cons, and transcribers must consider their options carefully even in this seemingly minor regard.
Conclusion

Transcription is a complicated and time-consuming process. When using a conversation analytic approach with multilingual discourse data, the researcher's decisions concerning transcript conventions can affect the outcome of the analysis. Matters of naturalized and denaturalized orthography, literal or natural translations, the position and form of the text, and the extent to which non-linguistic features of the interaction are transcribed must all be taken into account and applied consistently to the data.
References

Atkinson, J. M., & Heritage, J. (Eds.). (1984). Structures of social action: Studies in conversation analysis. New York: Cambridge University Press.
Bailey, B. (2000a). Language and negotiation of ethnic/racial identity among Dominican Americans. Language in Society, 29(4), 555-582.
Bailey, B. (2000b). Social/interactional functions of code switching among Dominican Americans. Pragmatics, 10 (2), 165-193.
Bucholtz, M. (1999). The politics of transcription. Journal of Pragmatics, 32, 1439-1465.
Cashman, H. (2000, April 7-9). Constructing a bilingual identity: Conversation analysis of Spanish/English language use in a television interview. Paper presented at the Texas Linguistic Forum: Eighth Annual Symposium about Language and Society, Austin.
Cashman, H. (2001). Doing being bilingual: Language maintenance, language shift and conversational codeswitching in southwest Detroit. Unpublished doctoral dissertation, University of Michigan, Ann Arbor.
Cashman, H. (2002, April). Social context and bilingual conversation: Towards criteria for determining interactional relevance. Paper presented at the 14th Sociolinguistics Symposium, Ghent, Belgium.
Fassnacht, C. & Wood, D. K. (1995-2003). Transana (Version 1.21) [Computer Software]. Madison: The Board of Regents of the University of Wisconsin System.
Gafaranga, J., & Torras, M. C. (2002). Interactional otherness: Towards a redefinition of code-switching. International Journal of Bilingualism, 6 (1), 1-22.
Greer, T. (2003, May). Co-constructing identity: The use of 'haafu' by a group of bilingual multi-ethnic Japanese teenagers. In Proceedings of the 4th International Symposium on Bilingualism, Arizona State University.
Hutchby, I., & Wooffitt, R. (1998). Conversation analysis. Cambridge: Polity Press.
Nishimura, M. (1997). Japanese/English code-switching: Syntax and pragmatics. New York: Peter Lang Publishing.
N5 [Computer Software]. (1996-2003). Melbourne, Australia: QSR International Pty Ltd.
Sacks, H., Schegloff, E., & Jefferson, G. (1974). A simplest systematics for the organization of turn-taking in conversation. Language, 50(4), 696-735.
ten Have, P. (1999). Doing conversation analysis: A practical guide. London: Sage.
Wood, L., & Kroger, R. (2000). Doing discourse analysis: Methods for studying action in talk and text. Thousand Oaks: Sage.