To develop efficient general-purpose second language curricula, designers need to make empirically justified decisions on a host of issues. One vexing issue concerns the selection and incorporation of vocabulary items from the vast lexicon of the language in question: ‘how many’ and ‘which’ lexical items should the curriculum cover? A second, related issue, brought to the fore by recent research on language processing and acquisition (e.g., Ellis, 2012; Jiang & Nekrasova, 2007), is the inclusion of formulaic language in the curriculum. According to one estimate, for instance, 58.6% of spoken and 52.3% of written language is formulaic and therefore highly predictable (Erman & Warren, 2000). Because formulaic chunks and patterns are predictable and, in effect, ‘lexicalized’, incorporating them into the curriculum has been shown to enhance learners’ language acquisition and boost their (native-like) fluency (e.g., Myles, Mitchell & Hooper, 1999). A third issue derives from the current debate over input authenticity in second language education, which revolves around the curricular reliance on language input originally intended for first language (L1) users. In view of these ongoing issues and the indispensable contribution of corpora to second language education (e.g., Römer, 2011), in this presentation I will first describe the process of constructing a multi-million-word Persian language corpus at the University of Maryland to help address these curricular concerns. I will then demonstrate the affordances of this corpus and discuss how corpus-based findings can furnish Persian language curriculum designers with sound empirical justification for tackling the three issues outlined above.
Ellis, N. C. (2012). Formulaic language and second language acquisition: Zipf and the phrasal teddy bear. Annual Review of Applied Linguistics, 32(1), 17–44.
Erman, B., & Warren, B. (2000). The idiom principle and the open choice principle. Text, 20, 29–62.
Jiang, N., & Nekrasova, T. M. (2007). The processing of formulaic sequences by second language speakers. The Modern Language Journal, 91(3), 433–445.
Myles, F., Mitchell, R., & Hooper, J. (1999). Interrogative chunks in French L2: A basis for creative construction? Studies in Second Language Acquisition, 21, 49–80.
Römer, U. (2011). Corpus research applications in second language teaching. Annual Review of Applied Linguistics, 31, 205–225.