To Determine The Effect Of Raters On Science Process Skills Performance Assessment Among Primary School Students

Authors

  • Gopal Krishnan Govindasamy, School of Educational Studies, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia
  • Mohd Ali Samsudin, School of Educational Studies, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia
  • Rozaini Abu Bakar, School of Educational Studies, Universiti Sains Malaysia, 11800 USM, Pulau Pinang, Malaysia

DOI:

https://doi.org/10.11113/sh.v7n1.597

Keywords:

Many-facet Rasch measurement, rater effect, science process skill.

Abstract

This study was conducted to determine whether raters' scores lead to the same decisions for candidates of the same ability, and to reveal the relative severity of different raters. The Many-Facet Rasch Measurement model (MFRM) was used to examine the validity and reliability of the ratings with respect to rater severity and leniency. The study involved five raters and fifty Standard Five examinees from a primary school in Butterworth. Five science process skills were assessed: observing, classifying, measuring, using space-time relationships, and experimenting (identifying variables, formulating hypotheses, and presenting a report). The results suggest that raters of science process skills performance, such as those in the PEKA assessment, can be trained to rate appropriately and consistently; under a system of double marking, assigning different raters to different test takers does not threaten the validity of scores, and in that regard the tests are valid, reliable, and fair.
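For context, the rater-severity analysis described in the abstract rests on the many-facet Rasch model (Linacre, 1989). In its common three-facet rating-scale form, the model can be written as:

```latex
% Three-facet Rasch model: log-odds that examinee n, scored by
% rater j on item i, receives category k rather than category k-1.
\log \left( \frac{P_{nijk}}{P_{nij(k-1)}} \right) = B_n - D_i - C_j - F_k
% B_n : ability of examinee n
% D_i : difficulty of item (science process skill) i
% C_j : severity of rater j
% F_k : difficulty of the step from category k-1 to category k
```

A rater with a larger severity estimate C_j awards lower scores on average: examinees scored by that rater need higher ability B_n to have the same probability of reaching a given category. Estimating C_j on the same scale as B_n is what allows the study to compare the relative severity of its five raters.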

References

Bond, T. G., & Fox, C. M. (2007). Applying the Rasch Model: Fundamental Measurement in the Human Sciences (2nd ed.). New Jersey: Lawrence Erlbaum Associates.

Connor-Linton, J. (1999). Competing communicative styles and crosstalk: A multi-feature analysis. Language in Society, 28(01), 25-56.

Engelhard, G. (1994). Examining rater errors in the assessment of written compositions with a many-faceted Rasch model. Journal of Educational Measurement, 31(2), 93-112.

Engelhard, G., Jr., & Myford, C. M. (2003). Monitoring faculty consultant performance in the Advanced Placement English Literature and Composition Program with a many-faceted Rasch model (College Board Research Report No. 2003-1). New York: College Board.

Engelhard, G. (1992). The measurement of writing ability with a many-faceted Rasch model. Applied Measurement in Education, 5(3), 171-191.

Ericsson, K. A., & Simon, H. A. (1993). Protocol Analysis: Verbal Reports as Data (2nd ed.). Cambridge, MA: MIT Press.

Grobman, L. (2007). Affirming the independent researcher model: Undergraduate research in the humanities. CUR Quarterly, 28(1), 23-28.

Haley, S. M., McHorney, C. A., & Ware, J. E., Jr. (1994). Evaluation of the MOS SF-36 physical functioning scale (PF-10): I. Unidimensionality and reproducibility of the Rasch item scale. Journal of Clinical Epidemiology, 47, 671-684.

Hambleton, R. K., Swaminathan, H., & Rogers, H. J. (1991). Fundamentals Of Item Response Theory. Newbury Park, CA: Sage.

Harlen, W. (1999). The assessment of scientific literacy in the OECD/PISA project. In H. Behrendt, H. Dahncke, R. Duit, W. Gräber, M. Komorek, & A. Kross (Eds.), Research in Science Education – Past, Present and Future (pp. 49-60). Dordrecht, The Netherlands: Kluwer Academic Publishers.

Kane, M. T., Crooks, T., & Cohen, A. (1999). Validating measures of performance. Educational Measurement: Issues and Practice, 18(2), 5-17.

Lee King Siong, Hazita Azman, & Koo Yew Lie. (2010). Investigating the undergraduate experience of assessment in higher education. GEMA Online Journal of Language Studies, 10(1), 17-33.

Linacre, J. M. (1989). Many-facet Rasch measurement. Chicago: MESA Press.

Linacre, J. M. (1997). Guidelines for rating scales. MESA Research Note #2. Retrieved June 24, 2009, from http://www.rasch.org/rn2.htm

Linacre, J. M. (2002). What do infit, outfit, mean-square and standardized mean? Rasch Measurement Transactions, 16, 878.

Linacre, J. M. (2006). Facets Rasch measurement computer program. Chicago: Winsteps.com.

Linacre, J. M. (2011). Winsteps Rasch Measurement Version 3.71 [Software].

Wright, B. D., & Masters, G. N. (1982). Rating Scale Analysis. Chicago: MESA Press.

Linacre, J. M. (2003). Winsteps Version 3.48 [Computer software and manual]. Chicago: Winsteps.com.

Lumley, T. (2005). Assessing Second Language Writing: The Rater's Perspective. Frankfurt: Peter Lang.

McNamara, T. F. (1996). Measuring Second Language Performance. London: Longman.

Merbitz, C., Morris, J., & Grip, J. (1989). Ordinal scales and the foundations of misinference. Archives of Physical Medicine and Rehabilitation, 70, 308-332.

Messick, S. (1996). Validity and washback in language testing. Language Testing, 13(3).

Moon, T., & Callahan, C. (2001). Classroom performance assessment: What should it look like in a standards-based classroom? NASSP Bulletin, 85(62), 48-58.

Noor Lide Abu Kassim. (2007). Using the Rasch measurement model for standard setting of the English Language Placement Test at the IIUM.

Pulakos, E. D. (1986). The development of training programs to increase accuracy on different rating forms. Organizational Behavior and Human Decision Processes, 38, 76-91.

Rasch, G. (1980). Probabilistic Models For Some Intelligence And Attainment Tests. Chicago: University of Chicago Press.

Rezaei, A. R., & Lovorn, M. (2010). Reliability and validity of rubrics for assessment through writing. Assessing Writing, 15(1), 18-39.

Skamp, K. (1998). Primary science and technology: How confident are teachers? Research in Science Education, 21, 290-299.

Upshur, J. A., & Turner, C. E. (1999). Systematic effects in the rating of second-language speaking ability: Test method and learner discourse. Language Testing, 16(1), 82-111.

Wiggins, G. (1993). Educative Assessment: Designing Assessment to Inform and Improve Student Performance. San Francisco, CA.

Wilson, M., & Sloane, K. (2000). From principles to practice: An embedded assessment system. Applied Measurement in Education, 13, 181-208.

Wright, B. D., & Stone, M. H. (1979). Best Test Design. Chicago: MESA Press.

Wright, B. D., & Masters, G. N. (1982). The measurement of knowledge and attitude. Research Memorandum No. 30. Chicago: University of Chicago, MESA Psychometric Laboratory.

Yang Ling Li. (2004). Psychometric Properties of the Volitional Questionnaire. Chicago: University of Illinois at Chicago.

Published

2015-10-11

How to Cite

Govindasamy, G. K., Samsudin, M. A., & Abu Bakar, R. (2015). To Determine The Effect Of Raters On Science Process Skills Performance Assessment Among Primary School Students. Sains Humanika, 7(1). https://doi.org/10.11113/sh.v7n1.597

Issue

Section

Articles