A question worth answering?

Two important points need signposting here.

First, learning is not revealed at the point of observation. This is well established and frequently commented upon: learning and performance are dissociable. What is more, we know that learning can occur in the absence of any performance gains and, conversely, that substantial changes in performance too often fail to translate into relatively permanent learning.

Second, the irony is that an ‘experience of education’ fails to protect us from such misinformed folly. It may even perpetuate it, in a manner akin to a ‘curse of knowledge’: a misconception among experienced educators that their expertise offers insight. The practice of lesson observation has a lot to answer for.

I value Dylan Wiliam’s sometimes provocative contributions, and I admire the construction of his targeted criticism and the wriggle room he allows himself. Here, the use of “fairly clearly” keeps the crowd-sourced opinionated at arm’s length. At other times I have seen him use ‘capabilities’ over ‘abilities.’

The point to contest, maybe, is the level of clarity. It is, perhaps, not as clear as he reports.

Moreover, educators tend to rate teachers of more-able groups more favourably, and to look more favourably on teachers who teach as they themselves teach.

Several recent studies have pointed to problems with the application of observation instruments in the context of teacher evaluation, in particular significant correlations between teachers’ observation scores and the characteristics of the classes they teach. Dylan Wiliam often refers to the data collected by the Measures of Effective Teaching (MET) project.

Mihaly and McCaffrey (2014) reported negative correlations between teachers’ observation scores and grade level, and Lazarev and Newman (2014) showed that relationships between observation and value-added scores vary by grade and subject.

Whitehurst, Chingos, and Lindquist (2014) report a positive association between the teacher’s average observation score and the class-average pretest score, while Chaplin, Gill, Thompkins, and Miller (2014) report negative correlations between the score and class shares of minority and free lunch-eligible students.

While the nature of these relationships remains complex and possibly unclear, these results can be interpreted as suggesting that teachers may benefit unfairly from being assigned a more able group of pupils.

References

Chaplin, D., Gill, B., Thompkins, A., & Miller, H. (2014). Professional Practice, Student Surveys, and Value-Added: Multiple Measures of Teacher Effectiveness in the Pittsburgh Public Schools. Mathematica Policy Research report.

Lazarev, V., & Newman, D. (2014). Can multifactor models of teaching improve teacher effectiveness measures? Paper presented at the Annual Meeting of AEFP, San Antonio, TX, March 2014.

Mihaly, K., & McCaffrey, D. (2014). Grade-Level Variation in Observational Measures of Teacher Effectiveness. In: Kane, T., Kerr, K., & Pianta, R., eds. Designing Teacher Evaluation Systems: New Guidance from the Measures of Effective Teaching Project. New York: John Wiley & Sons.

Whitehurst, G., Chingos, M., & Lindquist, K. (2014). Evaluating Teachers with Classroom Observations: Lessons Learned in Four Districts. Brown Center on Education Policy at the Brookings Institution.