Developing a Corpus of Entrepreneurship Emails (COREnE) for Business Courses in Malaysian University Using Integrated Moves Approach


  • Baiti Shauki, Universiti Utara Malaysia
  • Manvender Kaur Sarjit Singh



Corpus analysis, CEFR, Corpus of Entrepreneurship Emails, English for Specific Purposes, Integrated Moves Approach (IMA), Sales emails


This paper describes the design and development of the Learner Entrepreneurship Corpus (LEC), a specialised corpus of pedagogical sales e-mails written by undergraduates at a Malaysian public university. It details the processes involved in data collection and selection, as well as raw data processing, which includes data coding and tagging. The tagging and coding relied on two software tools: the CLAWS tagger and AntConc. The study adopted the Integrated Moves Approach (IMA), which combines structural and syntactic analysis. The paper also discusses the evolution of corpora in Malaysia and their relevance to the field of English for Specific Purposes (ESP). LEC was created to give corpus linguistics researchers a resource for exploring the field, and to give ESP practitioners in Malaysia material for developing locally based ESP curricula and teaching and learning materials. LEC contains approximately 18,000 words from 89 electronic documents retrieved from the eCampus website. At present, LEC is available free of charge on request from the researchers.



How to Cite

Shauki, B., & Sarjit Singh, M. K. (2022). Developing a Corpus of Entrepreneurship Emails (COREnE) for Business Courses in Malaysian University Using Integrated Moves Approach. Sains Humanika, 14(1), 1–9.