A Critical Review and Implications of Some Automated Essay Scoring Systems
Liang Maocheng, Wen Qiufang
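Abstract:
Drawing on the essay scoring criteria established in the field of language testing, this paper reviews and compares three representative automated essay scoring systems developed outside China. It identifies problems in these systems in areas such as training, the human scoring methods applied to essays, and the validity of machine scoring, and it analyzes the lessons these systems offer for the independent development of automated essay scoring systems in China.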
Keywords: automated essay scoring; model; scoring criteria; reliability; validity
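Foundation: Funded by a Humanities and Social Sciences Project of the Ministry of Education (No. 06JA740007) and by a major research project of the China Foreign Language Education Research Center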