An Investigation of Rater Training in On-line ESL Writing Assessment
Lu Yuan (陆远)
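Abstract:
Rater variability poses a direct threat to the reliability and validity of writing assessment. One strategy for improving scoring reliability is rater training, and on-line marking supplies detailed and timely data to support it. An empirical study of on-line training for raters of the writing section of the Test for English Majors, Band 4 (TEM-4) found that training, to some extent, helps reduce overall differences in rater severity, strengthens the self-consistency of some raters, and narrows raters' overall bias toward the scoring categories. To make training more effective, initial training and ongoing training should be given equal weight.
Keywords: on-line marking; English writing; variability; rater training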
Author: Lu Yuan (陆远)