Differences of Review Success by Course Discipline

Analysis of courses reviewed from 2011 through July 2013 revealed that business courses tended to have the best outcomes. Business courses were the most likely to meet standards in the initial review, followed by education courses. Business courses also had the highest total scores. Courses in the remaining disciplines did not differ significantly from one another.

Relationship between Faculty Developer/Instructor of Reviewed Course and Familiarity with the QM Rubric

The familiarity of faculty developers and instructors with the QM Rubric was examined in relation to the outcome of the initial course review and the amended course review (when needed). In the analysis of the 2011–2013 Rubric, the majority (93.3%) of individuals who submitted courses for review were familiar with the Rubric; only 98 of the 1,492 individuals (6.6%) stated that they were not familiar with it.

In the analyses of the 2011–2013 course reviews, courses submitted by individuals familiar with QM had higher initial scores than courses submitted by individuals who were not familiar with QM (Mann–Whitney U (N = 1,488) = 43,537, p < .001). However, there were no differences in total points after amendment (Mann–Whitney U (N = 1,488) = 61,900, p = .108). (The amendment phase includes interaction with the peer review team.)

Proportion of Rater Agreement by Specific Standards

Measures of reliability are often given when discussing scores such as those assigned using the QM Rubric. The term “reliability” refers to the consistency of results. Inter-rater reliability is a measure of the relationship between scores assigned by different individuals (Hogan, 2007). In its strictest sense, however, inter-rater reliability works under the assumption that reviewers are randomly selected and interchangeable (see Suen, Logan, Neisworth, & Bagnato, 1995). This assumption is not met in QM’s process, in which reviewers may be selected on the basis of their previous experiences or areas of expertise.

The measurement of interest concerning the QM Rubric is the proportion of reviews in which all three raters assigned the same rating to a specific standard (i.e., all three reviewers assessed a standard as met or not yet met). This differs from inter-rater reliability in that it is not an attempt to describe unsystematic variance (see Hallgren, 2012; Liao, Hunt, & Chen, 2010); its purpose is to provide an easily interpretable statistic that allows specific standards to be compared for practical purposes. Thus, in discussing the consistency of results of QM’s reviews, the term proportion of rater agreement is used because it explicitly describes the analyses performed, as opposed to inter-rater reliability, which, strictly speaking, it is not.

One of the primary purposes of analyzing the proportion of rater agreement is to identify specific standards that may require attention to keep the Rubric reflective of the research and fields of practice while remaining workable for a team of inter-institutional, interdisciplinary academic peers. A specific standard for which reviewers
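The group comparison reported above can be made concrete with a short computation. Below is a minimal sketch of a two-sided Mann–Whitney U test; the scores are synthetic and the variable names are assumptions, not QM's actual review data, with group sizes chosen only to sum to the N = 1,488 reported above.

```python
# A hedged sketch of the Mann-Whitney U comparison of initial review scores
# for submitters familiar vs. not familiar with the QM Rubric.
# All scores here are synthetic; only the total N (1,488) echoes the article,
# and the 1,390 / 98 split is an illustrative assumption.
import numpy as np
from scipy.stats import mannwhitneyu

rng = np.random.default_rng(seed=42)

# Hypothetical initial review scores for the two groups of course submitters.
familiar_scores = rng.normal(loc=82, scale=8, size=1390)    # familiar with QM
unfamiliar_scores = rng.normal(loc=78, scale=8, size=98)    # not familiar

# Two-sided Mann-Whitney U test, the nonparametric test named in the text.
u_stat, p_value = mannwhitneyu(familiar_scores, unfamiliar_scores,
                               alternative="two-sided")
print(f"Mann-Whitney U = {u_stat:.0f}, p = {p_value:.4g}")
```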
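The proportion-of-rater-agreement statistic defined in the last subsection is also straightforward to compute. The following is a minimal sketch assuming a tidy table of met/not-met ratings; the column names, standard numbers, and ratings are hypothetical, invented purely for illustration.

```python
# A hedged sketch of "proportion of rater agreement": for each specific
# standard, the share of reviews in which all three reviewers assigned the
# same met / not-yet-met rating. The data are fabricated for illustration.
import pandas as pd

# One row per reviewer rating: review_id identifies the course review,
# standard names the specific standard, met is 1 (met) or 0 (not yet met).
ratings = pd.DataFrame({
    "review_id": [1, 1, 1, 2, 2, 2, 1, 1, 1, 2, 2, 2],
    "standard":  ["1.1"] * 6 + ["3.4"] * 6,
    "met":       [1, 1, 1,  1, 0, 1,  1, 1, 1,  0, 0, 0],
})

def proportion_of_agreement(df: pd.DataFrame) -> pd.Series:
    """Per standard: proportion of reviews in which all reviewers agreed."""
    unanimous = (
        df.groupby(["standard", "review_id"])["met"]
          .nunique()        # 1 unique rating -> unanimous; 2 -> split review
          .eq(1)
    )
    return unanimous.groupby(level="standard").mean()

print(proportion_of_agreement(ratings))
# Expected: 1.1 -> 0.5 (one split review), 3.4 -> 1.0 (both reviews unanimous)
```

A standard with a low proportion of agreement is the kind the text describes as possibly requiring attention during Rubric revision.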