The quality assurance system utilized by Northwest seeks to ensure accountability, support innovation and improvement, and foster professional collaboration. By collecting data for the DESE Annual Performance Report (APR) and Title II reports, the quality assurance system ensures accountability for state and federal compliance requirements. In addition, Northwest gathers data to evaluate innovation and improvement strategies, such as surveys about our recent program redesign, and to support research. Finally, collaborative, data-driven discussion and decision making occur both through regular Quality Assurance Team (QAT) meetings and during Professional Education Unit retreats.
Hallmarks of the data collected by an effective quality assurance system include validity, reliability, fairness, and trustworthiness. This appendix lists examples of how these attributes are demonstrated in our data collection; the exemplars model both our current practice and our goals in measuring these attributes.
Reliability is a key psychometric property of data collection. An assessment is reliable if, when an attribute is measured multiple times by multiple assessors, the resulting scores are similar. Methods to improve reliability include using multiple scorers, training scorers on the use of the assessments, using rubrics, and ensuring that scorers know the content they are scoring. As examples of these methods, all key assessments noted in Table 3 utilize rubrics, and all faculty scoring these assessments have strong content knowledge. While we can assume the key assessments noted in Table 3 have a large degree of reliability, one assessment used across all Northwest programs has had its reliability quantifiably measured: the Missouri Educator Evaluation System (MEES).
The Missouri Educator Evaluation System is an observation form utilized during student teaching. Since Fall 2018 it has been utilized as a teacher certification requirement. This form was developed to assess teacher candidates on all nine Missouri teacher standards. It is used formatively and summatively by each candidate’s cooperating teacher and university supervisor. The scores of the summative assessments by both cooperating teacher and university supervisor are combined as one final score. This combined score is compared to a state benchmark to determine if a candidate has passed the MEES as a certification requirement.
While the MEES currently consists of nine items, each scored on a 0-4 scale, before Fall 2018 the form used a 0-3 scale and measured a total of 16 quality indicators under the nine Missouri teacher standards. While used for teacher certification, these data are also included in assessing Missouri preparation programs as part of the Annual Performance Report (APR). Reliable scoring between cooperating teacher and university supervisor is therefore vital to accurate data for both candidates and programs.
To ensure that the MEES is scored reliably, analyses are conducted on an ongoing basis. For instance, in the 2017-2018 academic year, 158 traditional program completers were assessed with the MEES summative, each by both a cooperating teacher and a university supervisor. A reliability analysis was conducted by calculating the Pearson correlation coefficient between the sum of MEES points awarded by the cooperating teacher and the sum of MEES points awarded by the university supervisor for the same teacher candidates. This analysis found a statistically significant, moderate positive correlation (r = .378, p < .001) between the scores given by the two assessors for the same candidates.
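For illustration, a check of this kind can be run in a few lines, as in the minimal sketch below. It assumes a table with one row per candidate containing each rater's summative MEES total; the file and column names are hypothetical, not the actual dataset.

```python
# Minimal sketch of the inter-rater reliability check described above.
# Assumes one row per candidate with each rater's summative MEES total;
# the file and column names are hypothetical.
import pandas as pd
from scipy.stats import pearsonr

scores = pd.read_csv("mees_summative_2017_2018.csv")  # hypothetical export

r, p = pearsonr(scores["cooperating_teacher_total"],
                scores["university_supervisor_total"])
print(f"Pearson r = {r:.3f}, p = {p:.4g}")  # e.g., r = 0.378, p < .001
```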
While the MEES instrument was designed to safeguard reliability between scorers, another important tool for securing inter-rater reliability is scorer training. In August 2018 and August 2019, the Northwest Field Experience office provided a training titled “Triad Training” for university supervisors, cooperating teachers, teacher candidates, and content supervisors. All attendees were trained by the Field Experience Director, the Assessment Director, and the Northwest RPDC staff. These trainings covered the MEES form and best practices for scoring it.
Together, these methods and the ensuing analyses help ensure the reliability of this measure.
Validity and reliability are typically discussed together. However, a reliable assessment is not necessarily a valid one. Validity, the degree to which an assessment measures what it is intended to measure, is a separate construct that must be assessed as well.
Validity has a variety of types that can be measured, some easier to assess than others. For instance, we can assume that all key assessments for all programs outlined in Table 3 have “face validity”: a content expert could look at an assessment and determine, without a quantitative analysis of any kind, that the assessment should measure what it is designed to measure. Since these key assessments were developed by content-expert faculty, we can assume they have face validity. However, deeper quantitative analysis can be done to establish more quantifiable and verifiable aspects of validity.
An example is content validity, which can be measured in a few steps. First, a group of content experts suggests a list of potential items for the assessment. Then, another group of experts rates each item on how essential it is to measuring what the assessment is intended to measure. From these ratings, a Content Validity Ratio (CVR) is calculated for each item, as shown in the sketch below.
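A minimal sketch of the calculation, using Lawshe's standard formula CVR = (n_e - N/2) / (N/2), where n_e is the number of panelists rating an item "essential" and N is the panel size:

```python
# Lawshe's Content Validity Ratio: CVR = (n_e - N/2) / (N/2), where n_e
# is the number of panelists rating the item "essential" and N is the
# total panel size. CVR ranges from -1 (no one) to +1 (everyone).
def content_validity_ratio(n_essential: int, n_panelists: int) -> float:
    half = n_panelists / 2
    return (n_essential - half) / half

# With a 20-person panel and a .50 retention threshold (the figures used
# at the PEU retreat described later), an item needs at least 15
# "essential" ratings to remain in the assessment.
print(content_validity_ratio(15, 20))  # 0.5 -> retained
print(content_validity_ratio(12, 20))  # 0.2 -> flagged for amendment or deletion
```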
The MEES itself was scored for content validity at the state level by a panel of content experts, which led to improvements for the Fall 2019 semester. An outline of this process can be found here.
While the MEES was used as an assessment during student teaching, it did not become a certification requirement until Fall 2018. This change of use initiated a deeper discussion of how to ensure validity of scoring. Of most concern was how to validly score Missouri Teacher Standards 7, 8, and 9 with an observation form. These standards measure a candidate's ability to use K-12 student assessment data, collaborate professionally, and engage in professional development, all of which are difficult to assess with a classroom-based observational assessment like the MEES. So, Northwest endeavored to develop MEES artifacts that candidates could complete during student teaching and that cooperating teachers and university supervisors could use to assess their candidates on the items focused on Standards 7, 8, and 9.
The development and validity testing of these artifacts involved three main groups of stakeholders:
The QAT met beginning in the Spring 2018 semester with a focus on developing MEES artifacts to demonstrate candidate ability on Missouri Teacher Standards 7, 8 and 9 during student teaching. After a semester of meetings, the QAT settled on:
These artifacts were piloted in Fall 2018 and Spring 2019. During this academic year, a small group of candidates completed the artifacts, and university supervisors scored them on rubrics separate from the MEES summative. Candidates and supervisors were then surveyed for feedback on how effectively these artifacts demonstrated proficiency on Missouri Teacher Standards 7-9.
While this provided useful input, a more structured content validity analysis was conducted in Spring 2019, when a PEU retreat took place with a special focus on analyzing the MEES artifacts. Faculty from across the PEU gathered and scored every item on the MEES artifacts according to how essential it was to assessing a candidate's ability to demonstrate Standards 7-9. Based on the 20 respondents, a content validity ratio (CVR) below .50 was deemed too low for an item to remain in the assessment. Using these methods, the following items were identified as requiring amendment or deletion:
These Spring 2019 results were then revisited at a Triad Training in August 2019, where university supervisors gave feedback on potential changes to the MEES artifacts. At this meeting, the following changes were made:
The instructions for 8.2 may seem nearly identical. However, the newer Professional Development Log no longer contains a question related to whether each experience was positive, negative or neutral.
One other main update came in the Standard 9 artifact: the Parent Interaction Log was changed to a Working Relationship Log. Rather than asking only about interactions with students' parents, the Working Relationship Log requires the following:
Candidates must engage in at least one interaction with each of the following: Student; Family; Colleague; and Community.
This ensures a well-rounded experience regarding professional interactions.
This analysis led to the current versions of the MEES artifacts used in the Fall 2019 semester.
At the end of the Fall 2019 semester, candidates and scorers will be surveyed again regarding these instruments, and another CVR analysis will be conducted to ensure the ongoing validity of these assessments, which are used for both candidate certification and program approval. The MEES artifacts will continue to serve as an exemplar of how Northwest analyzes its key assessments moving forward.
In the spring of 2017, Northwest set out to align university goals, academic goals, and student expectations and to provide evidence of student learning. The key objective was the development of student learning outcomes. The university determined seven main Institutional Level Outcomes (ILOs) to assess for all students:
Through a rolling five-year program review process, in which 20% of all programs are reviewed annually, each program is evaluated in part on student performance on these ILOs. Every student who attends Northwest is assessed on these learning outcomes at some point during their coursework, regardless of major.
After the ILOs were developed, each program was required to develop Program Learning Outcomes (PLOs) aligned with them. Directors and chairs of the departments and schools sent these program-level learning outcomes to the interim Provost in March 2017. The PLOs are specific to each program's vision of what students will have been assessed on by the time they complete that program.
So, in the Spring of 2017, the Dean of the School of Education and other faculty and staff gathered to develop PLOs for programs in the PEU; the results of that development can be found here. These outcomes were attached to identified course rubrics delivered to candidates via the campus LMS, Northwest Online. Candidate work was submitted and scored by faculty, including on the PLOs. Results by program were then calculated and displayed as part of each program's Program Review Dashboard.
While this process was developed with a great deal of time and effort, it is currently being heavily revised. An environmental scan completed by the Associate Director of Assessment and Accreditation identified several shortcomings, including that the PLOs were assessed using a rubric with only two performance levels (Met/Not Met). Faculty also felt that these PLOs did not reflect student performance and success as strongly as they could. So, a Faculty Senate subcommittee was formed to revamp the process. The hope is that once the process gains face validity among faculty, the outcomes will be used more broadly and effectively in program improvement.
Each Northwest academic program goes through a thorough program review at least once every five years. This faculty-driven process gathers data from a variety of sources (including stakeholder perceptions of the curriculum, assessments, market demand, and a review of similar offerings from competitor universities) to inform decisions on whether to keep, refine, or delete academic programs. The process involves program leaders, faculty, and the Associate Provost and Graduate Dean, and suggested changes are brought to the Provost and the University-Wide Educational Leadership Team.
This process has helped the Northwest School of Education make improvements based on data. One change emanating from program review was the faculty recommendation to cease offering the Master's degree in Elementary Education, which had lagging enrollment and an outdated curriculum. Based on the program review, the Dean and faculty supported convening a group to revise the curriculum. The result was an entirely new Master's degree in Curriculum and Instruction, offered in 7-week online courses, which now enrolls more than 250 students per year. Program review helped the School of Education identify weak spots, strengths, and areas of opportunity. The process is collaborative, and the fundamental questions driving this improvement are: “Is this the program we should offer? How do we know? What should we stop doing so that we can make the change?”
Program review also supported the decision to stop offering the Teacher Leader Program in 2017, components of which were integrated into the new Master's in Curriculum and Instruction in 2018. Program review in 2018-2019 included programs in Elementary Education (undergraduate/initial certification), Middle School (undergraduate/initial), and Special Education Cross-Categorical (undergraduate/initial).
Our regional accreditor, the Higher Learning Commission (HLC), found value in the program review process used by Northwest; in its 2018 review, site visitors considered program review a university-wide strength. Because the process is a relatively new institutional norm, HLC suggested that we gather more data about how it drives improvement and change across academic programs. For the School of Education and the Professional Education Unit, the program review process has been valuable, supporting insights and continuous improvement.
All Northwest preparation programs are assessed extensively, most often with the assessments included in the Annual Performance Report (APR): the Missouri Educator Evaluation System (MEES), the Missouri Content Assessment (MoCA), and the First Year Teacher Survey (completed by program completers and their principals). Information about these is included in Table 1. In addition, programs are assessed using program-specific key assessments: course-based assessments developed and scored by faculty, listed in Table 3 for each program. These key assessments are aligned to the MEES, the School of Education Program Level Outcomes (PLOs), Northwest Institutional Learning Outcomes (ILOs), the Missouri Teaching Standards, the AAQEP Standards, and program-specific standards. They are also mapped according to where standards are introduced, reinforced, and applied.
These key assessments have face validity because they are developed and utilized by program faculty. The Quality Assurance Team (QAT) will analyze them for content validity using the methods developed for the MEES artifacts. One key note is that the applied-level key assessments are typically items from the MEES summative; these items have already been analyzed at the state level for content validity and included in many data analyses, due to their inclusion in the APR, as indicated in Table 1.
Fairness in assessment concerns whether measures work well for all candidates. For instance, if a measure determines which candidates are accepted into a teacher preparation program, are candidates from all backgrounds equally likely to pass or fail it? The key to fairness is minimizing bias based on candidate background, race, ethnicity, socioeconomic status, and other factors. While Northwest endeavors to use all assessments fairly and with minimal bias, one of the clearest examples of fairness is the process Northwest undertook regarding the Missouri General Education Assessment (MoGEA).
The Missouri General Education Assessment (MoGEA), developed by Pearson, is a multiple-choice, high-stakes assessment used since 2013 as an admissions requirement for all teacher preparation programs in Missouri. Originally composed of five subtests, it required candidates to exceed a cut score on all five in order to pass. However, research conducted by Edmonds (2014) indicated that the MoGEA was biased against candidates who identified as African American or Hispanic, and that female candidates were less likely to score as well as male candidates. Edmonds found the bias so great that he recommended the MoGEA not be used as an admission requirement for teacher preparation programs. Soon after, the MoGEA was heavily revised and reduced to four subtests.
Another note regarding the MoGEA: while passing all subtests was required for admission to teacher education, preparation programs were given options for which passing scores to use. Pearson suggested a cut score of 220 on a 0-300 scale for each MoGEA subtest. Pearson also analyzed all subtests and calculated Standard Errors of Measurement (SEM); the SEM estimates how much an individual score would be expected to vary due to measurement error alone, giving programs principled flexibility around the recommended cut. Preparation programs could use the score of 220 or a cut score one or two SEMs below or above it, and different preparation programs chose different cut scores.
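As a rough illustration (with placeholder values, not Pearson's published statistics), the sketch below derives the one- and two-SEM cut-score options from a subtest's score spread and reliability using the standard formula SEM = SD × sqrt(1 − reliability):

```python
# Hedged sketch of SEM-adjusted cut-score options. The SD and reliability
# values are placeholders, not Pearson's published figures.
# SEM = SD * sqrt(1 - reliability) estimates score variation from
# measurement error alone.
import math

def standard_error_of_measurement(sd: float, reliability: float) -> float:
    return sd * math.sqrt(1 - reliability)

recommended_cut = 220  # Pearson's suggested cut score on the 0-300 scale
sem = standard_error_of_measurement(sd=30.0, reliability=0.90)  # placeholders

options = {label: round(recommended_cut + k * sem)
           for label, k in [("-2 SEM", -2), ("-1 SEM", -1),
                            ("recommended", 0), ("+1 SEM", 1), ("+2 SEM", 2)]}
print(options)  # e.g., {'-2 SEM': 201, '-1 SEM': 211, 'recommended': 220, ...}
```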
To set its cut scores, Northwest analyzed data by race and ethnicity, identifying the highest cut score on each subtest that candidates of any race or ethnicity were equally likely to attain. By doing this, Northwest attempted to reduce bias. The following cut scores were set as of Fall 2015:
| MoGEA Subtest | NW Cut Score |
| --- | --- |
| Reading | 202 |
| Writing | 193 |
| Mathematics | 220 |
| Science and Social Studies | 204 |
As seen above, Mathematics was the only subtest that used the recommended cut score; Reading, Writing, and Science and Social Studies used cut scores one SEM below the recommendation. Then, in April 2019, DESE issued a memo that again changed the use of the MoGEA.
According to this memo, Missouri educator preparation programs were now allowed to choose any assessment to measure general education knowledge prior to acceptance into a teacher preparation program. This included the MoGEA, the ACT, or other assessments. Similar to the opportunity mentioned above regarding cut scores and candidate race and ethnicity, Northwest endeavored to analyze previous assessment data and identify an assessment strategy that reduced bias.
Northwest has been an active institutional member of MACTE, the Missouri affiliate of AACTE, since 2000. The Executive Board of the Missouri Association for Colleges of Teacher Education (MACTE) suggested that programs choosing the ACT as their general education assessment use a cut score of 20. Northwest analyzed data from education candidates since the Fall 2015 semester, and our research revealed a disparity in achievement between White and non-White candidates:
NW Teacher Ed Candidates: ACT Score by Race/Ethnicity, 2015-2018 (n=1603)

| Group | ACT of 20 or Higher |
| --- | --- |
| White (n=1499) | 80% |
| Non-White (n=104) | 61% |
Given the results above, Northwest determined that the ACT alone could not be used as a fair and unbiased measure for entry into teacher education. Further analysis examined whether the MoGEA, with its modified cut scores, still showed minimal bias.
MoGEA Passage Rates, 2015-2018

| Group | % First Attempt | % Best Attempt |
| --- | --- | --- |
| All Candidates (n=811) | 85% | 95% |
| Non-White (n=33) | 88% | 94% |
From this, we determined that differences in MoGEA passage rates by race and ethnicity were minimal. However, this was still an opportunity to rewrite policy, not only to continue minimizing bias based on race and ethnicity but also to admit more candidates into the program to address the current teacher shortage in Missouri. So, faculty and staff met about the possibilities for policy change. This group included the Dean of the School of Education; the Assistant Director of Teacher Education; the Associate Director of Assessment and Accreditation; Jill Baker, the initial advisor for all elementary education candidates; and Dr. Everett Singleton, who has a scholarly background in the impacts of poverty and socioeconomics on education and educational access.
The group discussed a concept the Associate Director had been considering: using multiple tiers of multiple measures as the general education assessment requirement. The Associate Director believed that using multiple measures in an “or” structure instead of an “and” structure would open up enrollment, ensure equity, and maintain academic rigor. The measures considered were the following:
An analysis was conducted to determine the impact of this policy change if it had been used previously, from 2015-2018. The results were the following:
Cumulative % of Candidates Admitted at Each Proposed Tier, 2015-2018

| Tier | White Education Candidates (n=517) | Non-White Candidates (n=29) |
| --- | --- | --- |
| Tier 1: MoGEA, pass first attempt | 85% | 88% |
| Tier 2: ACT of 20 or higher | 92% | 93% |
| Tier 3: GPA of 3.0 or higher | 98% | 100% |
| Total admitted | 98% | 100% |
As shown above, using progressive tiers, 88% of non-White candidates would have passed the MoGEA on their first attempt. Adding the ACT tier (a score of 20 or higher) raises this rate to 93%. Including a cumulative GPA of 3.0 or higher, every non-White student who applied for teacher education from 2015-2018 would have met the admission requirements. A simplified sketch of this tiered logic appears below.
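This is a minimal sketch of the “or”-structured check, assuming hypothetical record fields; the thresholds follow the tiers above.

```python
# Minimal sketch of the proposed three-tier, "or"-structured admission
# check. Field and function names are hypothetical; thresholds follow
# the tiers described above.
from dataclasses import dataclass
from typing import Optional

@dataclass
class Applicant:
    passed_mogea: bool            # met NW cut scores on all MoGEA subtests
    act_composite: Optional[int]  # None if the applicant has no ACT score
    cumulative_gpa: float

def meets_general_education_requirement(a: Applicant) -> bool:
    if a.passed_mogea:                                         # Tier 1: MoGEA
        return True
    if a.act_composite is not None and a.act_composite >= 20:  # Tier 2: ACT
        return True
    return a.cumulative_gpa >= 3.0                             # Tier 3: GPA

# Admitted via Tier 3 despite not clearing the first two tiers:
print(meets_general_education_requirement(
    Applicant(passed_mogea=False, act_composite=18, cumulative_gpa=3.2)))  # True
```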
The follow-up question was whether these candidates could have passed the Missouri Content Assessment (MoCA) in their content areas. It would be neither helpful nor ethical to admit a greater percentage of candidates to teacher education if they would complete four years of education and then fail their final certification requirement. So, an analysis of 2015-2018 candidate data was conducted on this question as well.
| Fall 2015-2018 | Admitted, Original System (n=568) | Admitted, 3-Tiered System (n=652) | Original = No, 3-Tier = Yes (n=62) |
| --- | --- | --- | --- |
| MoCA Passage Rate | 95.50% | 95.46% | 95.20% |
The table above indicates that 95.50% of candidates admitted under the original, MoGEA-based system passed their MoCA, and candidates who would have been admitted under the new three-tiered system passed at a nearly identical rate (95.46%). Finally, of the 84 students who were not admitted into Northwest teacher education between 2015 and 2018 but would have been admitted under the new system, 62 (74%) still completed a preparation program somewhere and took a MoCA, and 95.20% of them passed. This indicates that a candidate admitted under the new three-tiered system, who had the grit to complete the program, would have the same opportunity to pass the MoCA as a candidate admitted under the old system.
Based on these analyses, the following policy was set forth and approved by COTE:
To be admitted as a teacher candidate, Northwest education students should:
This policy went into effect beginning with the Fall 2019 semester. The hope is that it will minimize bias across all teacher preparation programs at Northwest while ensuring academic rigor. Again, while not every assessment and its uses have been scrutinized this deeply for bias, Northwest holds this analysis as an exemplar and will continue to analyze assessments in this manner, starting broadly with the assessments all candidates encounter.
While reliability, validity, and fairness are concepts typically used to analyze assessments with quantitative data, trustworthiness is used to analyze data collected qualitatively. The goal of trustworthiness is to address the credibility, transferability, dependability, and verification of qualitative data. One of the best uses of this concept is with qualitative student teaching survey data.
In spring 2019, the Assistant Director of Teacher Education led a discussion among candidates who were completing student teaching. He gathered these candidates during a seminar session and described his plan: candidates would complete a brief qualitative survey about their student teaching experiences and how well Northwest had prepared them, and would then have the opportunity to share with the larger group.
Candidates submitted their responses through an online Survey Monkey survey. They were first asked a broad question about how well prepared they felt during student teaching, then asked to summarize their Northwest experience in one word, and finally asked to provide their program name. The written responses were mostly positive, but the group sharing turned negative quickly; candidates with negative experiences were more willing to speak and tended to dominate the conversations.
The qualitative survey results were then downloaded from Survey Monkey. The Associate Director of Assessment and Accreditation reviewed the results and split them according to whether the candidate was from an Elementary, K-12, Secondary, or Middle School program, then coded the results into emergent qualitative themes. The results were shared with faculty and School of Education administration during the fall 2019 Professional Education Unit Retreat.
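A small sketch of that first organizing pass is shown below, assuming a CSV export with hypothetical column names; the actual coding of themes remained a human judgment step.

```python
# Illustrative first pass on the exported survey results: split open-ended
# responses by program level before themes are hand-coded. The file name
# and columns ("program", "response") are hypothetical.
import pandas as pd

responses = pd.read_csv("student_teaching_survey_spring2019.csv")

def program_level(program_name: str) -> str:
    name = program_name.lower()
    if "elementary" in name:
        return "Elementary"
    if "middle" in name:
        return "Middle School"
    if "k-12" in name:
        return "K-12"
    return "Secondary"

responses["level"] = responses["program"].apply(program_level)
for level, group in responses.groupby("level"):
    print(level, len(group), "responses")  # theme coding follows by hand
```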
While these data opened up a variety of new discussions and opportunities for candidate input, there was certainly room to increase trustworthiness. So, the process was repeated in the Fall 2019 semester. Candidates again returned from student teaching and provided qualitative feedback in survey form, but instead of sharing as one large group, candidates shared in small groups at round tables, with faculty leading the separate discussions. Faculty took notes and shared results with the Assistant Director of Teacher Education. This allowed for even deeper dives into how candidates felt about their preparation programs and minimized any overwhelming impact louder candidates might have had.
Also, regarding the analysis of these qualitative survey results, they will no longer be reviewed by one individual first. Instead of the Associate Director of Assessment and Accreditation reviewing results and organizing them into themes, the results will be part of a larger discussion with the Quality Assurance Team (QAT). The QAT, a group of faculty and staff focused on continuous improvement through data use, will break these results into themes with input from a more diverse group and then combine them. The end results will incorporate a variety of viewpoints and programs and therefore be more trustworthy.
The Northwest Quality Assurance System assures accountability, supports innovation and improvement, and fosters professional collaboration, as the examples above illustrate. One of the final goals of the system is using collected data to guide improvement. An example of this was the APR Feedback Request form used in the summer of 2019. Data collected from the Fall 2018 APR and projected data for the Fall 2019 APR were analyzed and shared at the program level; twenty-three programs received their data in graphical form. On these Feedback Request Forms, each program shared its gaps, and the following questions were asked:
The Northwest educator preparation programs use multiple measures of valid and reliable qualitative and quantitative information related to knowledge, skills, professional dispositions, and teaching effectiveness. These criteria are assessed at several key program points: 1) entrance to Northwest, 2) admission to the professional education program, 3) admission to culminating student teaching, 4) graduation/program completion, and 5) entry into the professional setting as a teacher, counselor, or administrator.
To assess program quality and candidate impact, we regularly collect and analyze trustworthy information from diverse stakeholders and multiple perspectives, including faculty, candidates and completers, cooperating teachers and university supervisors for clinical experiences, and school administrators. Excellent data quality has allowed the Northwest quality assurance system to be nationally recognized by the American Association of Colleges for Teacher Education. Data quality and systematic use of stakeholder feedback were instrumental in preparing the evidence that our innovative program redesign merited the 2018 AASCU Christa McAuliffe Award for outstanding quality and innovation in teacher education, for which Northwest was the sole university recipient.
Improvements are expected, and progress is monitored systematically, with review by individuals, committees, and administration. Ongoing research on the programs and their effectiveness is conducted in several ways. Faculty conduct, present, and publish research; faculty, staff, and administration work together on the Quality Assurance Team and attend unit retreats. Policy makers and administrators populate the Council of Teacher Education, which uses data to inform ethical and equitable policy, analyze curriculum, and monitor candidate and program progress; and stakeholders share their voices and analyze data as part of the Professional Advisory Board.