Future of Student Assessment

Welcome to the Future of Student Assessment

Passing informed judgment on the curriculum, the learning and teaching process, and programs is a process of making sense of where schools, teachers, and students stand. Over the past decade, assessment has become a central feature of the educational environment. In 2001, the Society for the Advancement of Excellence in Education published the groundbreaking study, Student Assessment in Canada: Improving the Learning Environment through Effective Evaluation by Alan R. Taylor and Teresita-Salve Tubianosa of Raven Research Associates. Their report examined many important facets of student evaluation. It analyzed and compared provincial and national assessment systems through interviews and surveys with education officials across Canada. The research also reviewed the international context for testing, the role of testing and measurement in the teaching and learning environment, and emerging assessment technologies. A major finding from the study identified a new direction for assessment in the future. Computer-based technology will be the vehicle that delivers advantages in both the assessment OF learning and the assessment FOR learning.

Since 2001, there has been an explosion of interest in technology as a tool for better assessment. In the United States, major test companies and research centers are investigating computer-administered assessments for general or specialized student populations, computer-adaptive methodologies which can adjust to varying levels of competency, and computer scoring of essays and other open-response tasks. Several states plan to deliver all statewide assessments on-line by 2010 and several Canadian provinces are on a similar timetable. Such ambitious goals require significant R & D support and rigorous evaluation.

Context and Pretext of the Institute

Rapidly emerging technology promises a revolutionary shift in the future delivery and management of assessment. (See The Potential of Technology Assisted Student Assessment.) As a result of our initial research and the rapidly unfolding developments since the publication of Student Assessment in Canada, the Society for the Advancement of Excellence in Education has decided to focus on the application of technology to student assessment. To expand significantly upon the research of Drs. Taylor and Tubianosa in this area, we are forming a new initiative: the Technology Assisted Student Assessment Institute (TASA Institute, for short). The purposes of the Institute are:

1. To document trends, leading-edge prototypes, evidence regarding their effectiveness, best practice, and implications for policy in the field of technology-delivered student assessment.

2. To develop a next-generation assessment toolset and process, leveraging the considerable strengths of computer and online technologies.

3. To collaborate with Ministries of Education, school districts, testing agencies and international researchers in the piloting and evaluation of computer assisted assessment models.

4. To serve as a clearinghouse for research and provide a source of expertise to schools, districts, and ministries/departments of education on the design, implementation, and use of computer based assessment.

The Role of Assessment

The Role of Assessment and Evaluation in the Teaching-Learning Environment

Evaluation is a process, not a single event. As such it involves several steps leading to important decisions impacting on students and classroom processes. A model showing this relationship is shown below. As it shows, evaluation is a three-step process. It involves the systematic collection of relevant data, the professional interpretation of what that information means and the act of informed decision-making on the basis of that interpretation. A discussion of each of these stages follows.

1. COLLECTION OF INFORMATION

The systematic collection of relevant data is an important first step in the process of student evaluation. It requires the appropriate planning and design of procedures and instruments for collection of information on student outcomes, either at the formative or summative stage of the process. A discussion of implications for data collection at different levels of the system is next.

» At the Classroom Level
In the case of the classroom teacher, this stage of the process utilizes a variety of measures such as observational techniques, projects, presentations, quizzes and teacher-made tests. Each of these should correspond to what is being taught in the classroom, be “weighted” in some way to reflect importance, and measure the goals of the intended curriculum as prescribed at each respective provincial level in Canada.
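The idea of weighting classroom measures to reflect their importance can be illustrated with a short calculation. The following is a minimal sketch only: the measure names, weights, and scores are hypothetical and are not drawn from the study or from any provincial curriculum.

```python
def weighted_grade(scores, weights):
    """Combine percentage scores from several classroom measures into one grade.

    scores and weights are dicts keyed by measure name.
    Weights are relative and need not sum to 1 or 100.
    """
    missing = set(weights) - set(scores)
    if missing:
        raise ValueError(f"no score recorded for: {sorted(missing)}")
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("weights must sum to a positive number")
    # Weighted average: each score counts in proportion to its weight.
    return sum(scores[name] * w for name, w in weights.items()) / total_weight


# Hypothetical weights chosen by a teacher to reflect importance.
weights = {"observation": 10, "project": 30, "presentation": 20,
           "quizzes": 15, "unit_test": 25}
# Hypothetical percentage scores for one student.
scores = {"observation": 80, "project": 72, "presentation": 90,
          "quizzes": 65, "unit_test": 70}

grade = weighted_grade(scores, weights)
print(round(grade, 2))
```

How the weights are set remains a professional judgment; the calculation only makes that judgment explicit and consistent across students.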

» Program Evaluations
For program level evaluation, this stage usually involves the development and administration of a variety of instruments such as achievement tests, surveys of student attitudes and perceptions, performance assessment procedures, and surveys or observations of classroom practices. Each of these instruments is linked to the prescribed curricula of respective jurisdictions and provides information on outcomes and practices relevant to achievement and instruction. Often achievement measures focus on student outcomes at the end of a range of grade levels in each subject, rather than on those limited to a single course such as Grade 6 mathematics.

» Standardized Tests
Information collected on standardized tests reflects the knowledge and skills considered important from across a number of jurisdictions. These often relate to important life skills including applications of consumer needs, knowledge considered important for an educated citizen, and problem solving skills. This information doesn't cover the entire breadth of a curriculum but focuses on areas considered important, consisting of subsets of common skills and areas of knowledge.

» Credentialing Examinations
Provincial credentialing tests such as subject-specific Grade 11 or 12 examinations collect information corresponding to important curriculum outcomes prescribed at the course level. They are designed to reflect outcomes from the curriculum considered important to examine and are usually articulated to schools through a detailed description of the content and cognitive levels to be examined.

2. INTERPRETATION OF RESULTS

The interpretation of results is a key component in the process of evaluation. It is essential, however, that it take place within an appropriate context. To better understand what results tell us and what they do not, it is important to keep in mind the nature of the audience examined, the design and purpose of the test, and the limitations of the data. A discussion of interpretation at different levels of the system follows.

» Teacher Measures
A major advantage that teachers have in interpreting results from their own measurements is first-hand knowledge of students in their classrooms. Given this contextual understanding they are in a good position to assess how well, for example, Billy or Daljit understands the concepts being examined. However, there are some limitations in using only their own measures for this purpose. For example, one teacher's perception of “excellence” may be quite different than another's. This limitation is best addressed when results from teacher-made tests are combined from time to time with external measures linked to standards and norms.

» Program Assessments
The interpretation of results from program level assessments usually occurs at each of the following reporting levels: school, district, and province. Information provided for this purpose often includes norms and standards associated with each level. The process usually involves a subjective procedure where a committee or panel reaches consensus on how well the jurisdiction under review has met expected and desired levels of achievement, while taking into account the context for learning. A school in a high socioeconomic area, for example, would likely expect higher scores than one in a disadvantaged area. Provinces usually provide schools and districts with procedures and guidelines to help in undertaking this process.

» Standardized Tests
The interpretation of results from standardized tests should include references to both normative comparisons and expectations relative to the nature of the population under review. Through a combined analysis of these, a better understanding of results can be gained. At each level of reporting (the student, classroom, school and district), profiles of strengths and weaknesses can be developed, which can lead to a plan for improvement.

» Credentialing Examinations
Data from credentialing examinations focus on the student level and are then aggregated to school, district and provincial levels. In addition to student scores, summaries usually include results by topic and cognitive level. When results are interpreted at this level, it is common to base the interpretation not only on expectations relative to the nature of the student population but also on how other schools and districts performed. It is important to utilize both references in the process. For example, if only expectations for the population under review were used, then it would be possible to lose sight of objective comparisons. On the other hand, if only normative comparisons were used, important factors impacting on performance, such as socio-economic status, would not be taken into account.
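The use of both references can be sketched in a few lines. This is an illustration under stated assumptions, not a procedure taken from the study: the function name, the idea of a single "expected mean" for the school's population, and all figures are hypothetical.

```python
from statistics import mean


def interpret_school_result(school_mean, expected_mean, comparison_means):
    """Report a school's mean score against two references.

    expected_mean: the result expected given the school's student population
                   (hypothetical; how it is set is a matter of judgment).
    comparison_means: mean scores of other schools or districts (the norm group).
    """
    norm_mean = mean(comparison_means)
    outscored = sum(m < school_mean for m in comparison_means)
    return {
        # Positive: the school did better than expected for its population.
        "vs_expectation": school_mean - expected_mean,
        # Positive: the school did better than the average of the norm group.
        "vs_norm": school_mean - norm_mean,
        # Fraction of comparison schools with a lower mean score.
        "share_of_schools_outscored": outscored / len(comparison_means),
    }


# Hypothetical figures for one school and five comparison schools.
result = interpret_school_result(
    school_mean=64.0,
    expected_mean=60.0,
    comparison_means=[58.0, 62.0, 66.0, 70.0, 74.0],
)
print(result)
```

In this made-up case the school exceeds the expectation for its population while still falling below the norm-group average, which is the kind of picture that either reference alone would miss.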

3. EFFECTIVE DECISION-MAKING

Once results have been interpreted within an appropriate context, areas of strength and weakness should be identified and a growth plan developed. It is incumbent on those making the interpretation to develop a plan for the maintenance of strengths and the improvement of areas of weakness based on findings.

At the student level, decisions may involve a plan for remediation at the formative stage of evaluation or the reporting of a standing at the summative stage. Identification of strengths and weaknesses based on results aggregated to the classroom, school, district or provincial level can provide direction for a number of follow-up activities. Among these may be a change in methodology, the development of resource materials, the allocation of resources, and revision of curriculum content and design.

4. SUMMARIZING THE COMPONENTS OF A BALANCED MODEL

The balanced model proposed includes a variety of measures intended for a variety of purposes. Its components include teacher-made measures, program assessments, credentialing examinations, and standardized tests. A summary of these components is shown in Table 1-1 with a list of the advantages and disadvantages of each.

The above presented the concept of evaluation as a process involving the systematic collection of information, the appropriate interpretation of results, and the act of making informed decisions on the basis of findings. It is contended that there is a role for the components of a balanced evaluation model to play in each step of th

Study of Student Assessment

Introduction to the Study of Student Assessment

Student evaluation is an integral component of the teaching and learning process. It involves the collection and interpretation of information in a systematic fashion, crucial to effective decision-making and essential for a successful learning environment. The process of evaluation is a complex one and requires the collection of information for a variety of purposes ranging from student level, for diagnosis and reporting, to classroom and school levels intended for instructional and program planning. Given such a wide range of purposes, there is an implicit need for the corresponding selection of instruments and designs. Rather than being a single event held at only one level of the system, student and program evaluation should be designed for various levels of the system and administered within a context leading either to confirmation of current practice or to direction for change.

In planning to meet the purposes just discussed, a variety of arguments relative to curriculum quality and student learning patterns can be insightful in looking at the relationship between assessment and the teaching-learning process. It is contended that there is no question that testing, assessment, and evaluation programs contribute to individual growth and to the maximization of individual potential. Increasing awareness of the significance of assessments places evaluation within the core of the educational process and, as Forsyth, Jolliffe, and Stevens (1995) argue, evaluation should “never be an afterthought” (p. 9). Passing informed judgment on the curriculum, the learning and teaching process, and programs is a process of making sense of where schools, teachers, and students stand. Thus, assessment becomes a central feature of the teaching and learning environment.

Critics of practice in this area sometimes focus only at one end of a broad spectrum. At one extreme some contend that teachers cannot be trusted and therefore all examinations intended for grading purposes should be externally developed and administered. At the other extreme, others condemn the use of standardized tests and program wide assessments. An example of the latter position is documented in Standardized Testing – Undermining Equity in Education (Froese, 1999). Froese claims that,

Although standardized tests may be useful for sorting and ranking students, they are inadequate in assessing student learning and development (p. 5).

In contrast to this position, Shanker (1996) contends there is a critical need for national standards to ensure high levels of achievement and to determine how successful educational reforms have been. In an address to the Second Education Summit in Palisades, New York, he stated the following:

High academic standards are what parents, teachers and the public want; support for this is overwhelming. Standards also happen to work. Without them, not much else is going to make a difference in student achievement.

National standards represent a real opportunity for public schools to turn themselves around and win back the confidence of the people we serve. If we can agree on what we want students to learn, we can focus our energies, ideas and resources on helping them achieve. Without standards, we have no clear focus and no way to determine which reform ideas and programs really work.

Source: Al Shanker, President, American Federation of Teachers (1996), Second Education Summit, Palisades, NY

Shanker’s call for standards implies a need for some objective way to determine whether or not they are being met. Standardized tests are the most effective measures to ensure that it is done in an accurate and unbiased way.

Reasons to Assess and Evaluate

It may be helpful to review a number of reasons why we should assess and evaluate prior to establishing a case for multiple measures to use for a variety of purposes. Among these are the following: to inform decision-making, to motivate change, to describe the effectiveness of educational programs, for accountability, and for purposes of certification and promotion. A brief discussion of these reasons follows:

» Testing informs decision-making
Educational tests have the ability to detect reading difficulties, to assess levels of achievement and to identify problem-solving abilities. Teachers use test results to group students or identify areas that need reinforcement or remediation. For example, diagnostic tests often provide a basis for assigning students to various programs, whether it is for students with disabilities, English-language learners, or underachieving students (Heubert & Hauser, 1999). In such cases, testing is used to provide educational opportunities through programs that will benefit the students. Alongside the objectivity of testing is the ability of the tests to discriminate among learners and to provide direction for designing programs to match a learner's strengths and weaknesses.

» Testing motivates change
Assessment results may be viewed as a way to influence individuals involved in the development of learning events and those supporting them to take action. Students who know they are to be tested will do more studying and learn more than would otherwise be the case. As well, teachers and even parents get involved in preparing students for the test. The promise of reward or the threat of sanction will ensure change. Test results inform practice and in this way become a strategy for instructional improvement. As a consequence, setting standards that define what teachers should teach, what students should learn, and holding educators and teachers responsible for meeting these standards is important. The role played by assessment in educational change should be explored not only in terms of its results and products but also by looking at assessment as a process. As a result, answers to the following questions are legitimate expectations. In what ways might the results be used in improving the school system? How can assessment information direct instructional management?

» Assessment describes the effectiveness of educational programs
Tests conducted at the start of the implementation of a particular educational program and after an intervention was made will provide information indicative of the effectiveness of that program. In turn, this set of information might contribute to making adjustments to the program, thus making the program more functional. National assessments (e.g., SAIP in Canada, NAEP in the United States, and APU in England and Wales) and international assessments (e.g., SIMS, SISS and TIMSS) serve to inform the public about how schools and students are performing over time and how they compare with schools in other jurisdictions.
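The comparison of results before and after an intervention can be sketched as a simple gain calculation. This is a minimal illustration with hypothetical scores; it assumes the same students sat comparable tests at both points, and the function name is invented for the example.

```python
from statistics import mean


def mean_gain(pre_scores, post_scores):
    """Average change in score for students tested before and after a program.

    pre_scores and post_scores must list the same students in the same order.
    """
    if len(pre_scores) != len(post_scores):
        raise ValueError("each student needs both a pre-test and a post-test score")
    if not pre_scores:
        raise ValueError("at least one student is required")
    # Per-student gain, then the average across students.
    gains = [post - pre for pre, post in zip(pre_scores, post_scores)]
    return mean(gains)


# Hypothetical percentage scores for four students.
pre = [52, 60, 47, 71]
post = [58, 66, 55, 70]

gain = mean_gain(pre, post)
print(gain)
```

A positive average gain is only indicative. Without a comparison group, it cannot by itself separate the effect of the program from ordinary growth over the same period.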

» Testing acts as a mechanism for accountability
Assessment results can carry consequences such as accountability, funding, technical assistance, or loss of acceleration. One aspect of accountability is providing information to the public. The public is expected to act on the information derived from the tests and to provide interventions that will improve educational quality and student achievement. Conducting national assessments serves to ensure that society is investing in those who have the potential of contributing to the society. Despite debates over standardized tests (see Phelps, 1999), they are rendered significant for their reliability and validity in measuring students' achievement.

» Testing is used for purposes of certification and promotion
Historically, examinations have been effective in terms of their objectivity as a selection mechanism and in ensuring fairness to individuals. Tracking, promotion, and graduation are 'high stakes' decisions involved in large-scale achievement tests (Heubert & Hauser, 1999). These tests place a student in schools, programs, and classes based on their achievement levels. As well, they indicate whether a student will be promoted to the next grade or whether a student will receive a high school diploma. Heubert and Hauser propose that the “value of tests should also be weighed against the use of other information in making high stakes decisions about students”.

» Looking beyond the testing process
The task of coming to a decision about the progress of a learner can be difficult and problematic for people making judgments. While tests may be used to diagnose learning problems, it is possible that remedial programs proposed may result from misinterpretation. Unless certain conditions are met, it is possible that assumptions about learners may be unfair and make no sense at all. Therefore it is essential that appropriate information is collected from a variety of sources and that results are interpreted within an appropriate context.

Even if we can learn something valid about what is happening, assessment information alone does not tell us how or why it is happening and what to do about it. Broadfoot (1996), Cambourne and Turbill (1994), and Woodward (1994) recommend including qualitative data in the evaluation process. If one purpose of giving tests is to benefit the individual learner, considerable attention should be paid not only to the elements of testing (averages, norms, groups, and standard deviations) but also to the recipients of whatever decisions will be made in the aftermath of the testing process.

A responsibility exists to regard the learner as more than a score and to propose programs that will accommodate a learner’s potential based on the interpretation of results from a variety of different measures. Learners and learning will benefit most when information is sought and questions are asked.