GLUTTING'S GUIDE FOR NORM-REFERENCED TEST SCORE INTERPRETATION, USING A SAMPLE PSYCHOLOGICAL REPORT
November 5, 2002
Importance of Norm-Referenced Test Scores
The purpose of this handout is to provide instruction on the interpretation of results from norm-referenced tests. When people think of a teacher's job, they seldom think of it requiring the interpretation of results from standardized tests. However, interpreting such results is actually a very important part of a teacher’s yearly (versus daily) activities.
I know this is true for two reasons. First, after working in the public schools for six years as a school psychologist, I saw how teachers reacted with puzzlement, confusion, and wonder when I presented results from norm-referenced psychological evaluations. Second, I have been teaching long enough at the University of Delaware to have had undergraduates return and take graduate-level measurement classes with me. After a few years of working in the public schools, these teachers see the impact norm-referenced tests have on children - - and they emphasize that someone should have taught them more about norm-referenced test-score interpretation when they were undergraduates!
There is yet another way to demonstrate the importance of norm-referenced test interpretation to classroom teachers. Approximately 15% of all children in the public schools receive special education. To be eligible for special education, federal law (i.e., the Individuals with Disabilities Education Act [IDEA]) specifies that children must receive comprehensive, norm-referenced assessments from Multi-Disciplinary Teams (MDTs). Furthermore, another 5% of the children in the public schools are evaluated by MDTs but do not qualify for special education. Therefore, around 20% of all children in the public schools are evaluated, at one time or another, by MDTs.
Given the large number of children evaluated by MDTs, the odds are approximately 1 in 5 (i.e., 20%) each year that you, a 'regular' education teacher, will refer a child for evaluation. Once you refer a child, you will receive one or more reports about him or her (e.g., a psychologist’s report, an educational diagnostician’s report, etc.). Almost all of the scores in these reports are norm-referenced, and it is the results from these tests that determine whether children: (1) are eligible for special education and (2) are diagnosed as having a handicapping condition such as mental retardation (MR), a learning disability (LD), attention-deficit/hyperactivity disorder (ADHD), conduct disorder (CD), etc. Therefore, as you can see, the norm-referenced assessments conducted by MDTs are 'high stakes' and have a significant impact on the lives of children and the regular-education teachers who instruct them.
Perhaps the best way to learn about norm-referenced test interpretation is to begin with a psychological evaluation. You will see one such psychological report just below. The report is fictitious. The child, the names of his parents, teacher, school, etc. are made up. Otherwise, the report is exactly what you would receive as a classroom teacher.
Read the report carefully. There are four major areas covered in a psychological report: IQ-test results (see the 'WISC-III' section), adaptive-behavior inventory results (see the 'ABAS' section), achievement-test results (see the 'WIAT' section), and social-emotional adjustment results (see the 'ASCA' section).
Try to determine whether the child is performing above average, average, or below average in each of the four areas. You probably will be able to make the determination based on what the psychologist says in the report (i.e., the report’s text presentation). However, look at the section of the report titled 'Synopsis of Formal Test Scores'. It is this section of the report that provides the actual, norm-referenced scores obtained by the child. Look at the test scores themselves and see if you can determine whether the child is performing above average, average, or below average based on the scores alone. You probably will not be able to make the determination without learning more about norm-referenced tests.
Also, as surprising as it may sound to you, the actual test scores and what is said about the test scores in a report (i.e., the report’s text presentation) sometimes do not agree with one another! For this reason, as a classroom teacher, you need to know something about norm-referenced test scores. Otherwise, you will be unable to determine whether the test results accurately portray how a child in your classroom is performing academically.
Once you finish reading the psychological report, the other sections of this document will teach you how to interpret norm-referenced test scores. At times, the document will refer back to the child (Billy) discussed in the psychological report and his test scores.
NOTE
PSYCHOLOGICAL EVALUATION
NAME: William (Billy) Smith          PARENTS: William and Susan Smith
GENDER: Male                         ADDRESS: 411 Hanson Drive
DATE OF BIRTH: 12/12/95
CHRONOLOGICAL AGE: 6-11              TELEPHONE: 807-555-1212
RACE: Anglo
EVALUATION DATES: 11/10/02, 11/12/02, 11/13/02
SCHOOL: Happy Valley Elementary
GRADE: 1
Assessment Procedures:
Wechsler Intelligence Scale for Children-Third Edition (WISC-III), Wechsler Individual Achievement Test-Second Edition (WIAT-II), Adaptive Behavior Assessment System (ABAS), Adjustment Scales for Children and Adolescents (ASCA), Structured Developmental History Interview with Parent, Structured Teacher Interview, Review of School Records, Structured (Time Sampling) Classroom Observation, Unstructured Clinical Interview with Student
WISC-III IQs and Subtest Standard Scores

Full Scale IQ: 65     Verbal Scale IQ: 67     Performance Scale IQ: 68

SUBTESTS                        STANDARD SCORE
Information                     5
Similarities                    3
Arithmetic                      5
Vocabulary                      5
Comprehension                   3
Digit Span                      6
Symbol Search

INDEXES                         STANDARD SCORE
Verbal Comprehension            68
Perceptual Organization         67
Freedom from Distractibility    75
Processing Speed                80
WIAT-II Composites and Subtest Standard Scores

COMPOSITES                      STANDARD SCORE
Reading
Mathematics
Written Language
Oral Language
**Not calculated prior to age 8

SUBTESTS                        STANDARD SCORE
Word Reading
Pseudoword Decoding             60
Numerical Operations
Math Reasoning
Spelling
Written Expression
Listening Comprehension         69
Listening Comprehension         67
Oral Expression
**Not calculated prior to age 8
ABAS Composite and Subtest Standard Scores

                                STANDARD SCORE
Composite

SUBTESTS                        STANDARD SCORE
Communication
Community Use                   70
Functional Academics
Home Living
Health and Safety               70
Leisure
Self-Care
Self-Direction
Social                          60
Work**
**Not Administered
Reason for Referral:
William (Billy) was referred by his classroom teacher, Mrs. Hopkins. Billy tries hard in school, but he is struggling in all academic areas.
History:
Billy is approaching his seventh birthday (age = 6 years, 11 months). He lives with both of his biological parents, William (age 35) and Susan (age 33) Smith. William is an accountant and Susan works as a purchasing agent. Both Mr. and Mrs. Smith are college graduates. Neither reports having experienced learning difficulties in school. Mr. and Mrs. Smith have lived in the same community (Omaha) throughout their lives.
A developmental history was conducted with Mrs. Smith on 11/10/02. Two children besides Billy live in the home: Mary, age 10, and Ann, age 8. Mary and Ann are Billy’s biological siblings. Parent information and a review of school records reveal that both Mary and Ann are doing well in school.
Billy speaks only English, which he has been exposed to since birth and has been speaking since he first began talking. Mrs. Smith’s pregnancy with Billy, and her delivery, were unremarkable. Billy was born through a Cesarean section, as were his two siblings. However, Billy weighed less than 5 1/2 pounds at birth. His one-minute Apgar score was moderately depressed (score = 7), but the five-minute Apgar was in the healthy range (score = 8). Billy has never been hospitalized, and with the exception of measles, he experienced no childhood illnesses. He currently is taking no prescription medications.
A visual screening was conducted by the school nurse on 10-10-02. Results revealed Billy has normal visual acuity. Also, a hearing test was conducted in school by the speech therapist on 10-20-02 and showed normal auditory acuity.
According to his mother, Billy reached his motor milestones (sitting alone, crawling, standing alone, and walking) within the expected age ranges. Mrs. Smith is concerned because he reached his language milestones later than expected (speaking first words and speaking in short sentences). Mrs. Smith describes Billy as a happy, cooperative child who gets along with his two older sisters. There are many children in Billy’s neighborhood. Mrs. Smith also indicated that Billy prefers playing with children younger than himself rather than with either his sisters or children his own age. Billy’s favorite activity is playing with trucks. His favorite food is ice cream.
Billy attended a preschool program at age 4 and a half-day kindergarten program last year. In addition, Billy’s first grade is in the same school (Happy Valley Elementary) as his kindergarten class. A review of school records shows he is maintaining good attendance this year, and he had an excellent attendance record in kindergarten. A teacher interview was completed with Mrs. Hopkins, Billy’s current teacher. Mrs. Hopkins reports that Billy is very well-behaved. Likewise, Billy has an exemplary conduct record. Regarding academic performance, Mrs. Hopkins indicates that Billy tries very hard in class. At the same time, he is struggling and experiencing many academic difficulties. He is having problems with introductory reading and math skills, and in both areas, he is in Mrs. Hopkins's lowest teaching groups. School records show the same pattern of academic performance was present in kindergarten. In October, standardized group-achievement tests were administered to all first graders at Happy Valley Elementary School. Results disclose Billy scored far below average in Reading, Math, and Language.
Several pre-referral interventions were attempted with Billy. For instance, Mrs. Hopkins provides one-to-one instruction whenever possible. Billy receives one-to-one tutoring from a community volunteer twice a week for one-half hour. Likewise, the school has a peer tutoring program. Once a week, Billy works with a fourth-grade student who helps him with sight-word identification.
Current Observations:
Billy was evaluated on two occasions in his school. Physically, he presented as appropriate in height and weight for his age. Billy’s dress was clean, and on each occasion, he was well groomed. It is obvious that Billy is well cared for at home. His articulation was clear, and his vision, hearing, and gross-motor coordination appeared appropriate. He was somewhat nervous about leaving the classroom to work with the examiner. Nevertheless, Billy grew increasingly relaxed as the first test session progressed; he was cooperative; he regularly helped the examiner put away test materials; and he listened attentively to most test directions and questions. Similarly, Billy was equally relaxed and cooperative during the second test session.
One test administered to Billy was the WISC-III. This instrument evaluates a variety of abilities associated with school success, and it is considered to be one of the best predictors of future achievement. The WISC-III does not assess all abilities; for example, it does not measure certain specific mechanical aptitudes that may be important to some occupations and trades. Likewise, it does not measure creativity or how well children get along with others.
The WISC-III provides a progression of scores that can be thought of as forming a triangle. At the top is the Full Scale IQ (FSIQ). This is the best single predictor of school achievement on the WISC-III. Underlying the FSIQ are two scores that permit further distinctions. The first is the Verbal Scale IQ (VIQ). It assesses the ability to think in words and apply language skills and verbal information to solve problems. The second is the Performance IQ (PIQ), which requires fewer verbal skills. It evaluates the ability to think in terms of visual images and manipulate them fluently with relative speed. Another way to think of the PIQ is that it evaluates the ability to organize visually-presented material against a time limit. When there is a difference between the VIQ and PIQ, the VIQ is usually the better predictor of school achievement.
Results from the WISC-III indicate Billy may have difficulty keeping up with peers on most tasks requiring age-appropriate thinking and reasoning. His general cognitive ability is within the lower extreme range of intellectual functioning (WISC-III FSIQ = 65).
Billy's ability to think with words is comparable to his ability to reason without the use of words (VIQ = 67, PIQ = 68). Both Billy’s verbal and nonverbal reasoning abilities are in the lower extreme range and align with his overall ability level.
A personal strength for Billy is his ability to process simple information quickly and efficiently (Processing Speed Index [PSI] = 80). Billy’s PSI was his highest result on the WISC-III. The PSI converts to performance at the ninth percentile. In other words, Billy is able to process simple information more quickly than 9 out of every 100 children his age.
Social and Emotional Functioning
Summary:
OVERVIEW OF NORM-REFERENCED TEST SCORE INTERPRETATION
The direct numerical report of a child’s test performance is the child's raw score (e.g., the number of right answers). Most often, we cannot interpret raw test scores as we do physical measures such as height because raw scores in a psychological report have no true meaning. Likewise, raw scores are NOT measured in equal units along a line. Therefore, the way one can meaningfully talk about test scores is to bring in a referent. There are two major referents for tests: norm-referencing and criterion-referencing. We already discussed both types of referents earlier in the course. Now, as a result of the psychological report for Billy, we will pay particular attention to instruments that facilitate norm-referenced comparisons.
NORM VS. CRITERION-REFERENCED MEASUREMENT
The basic difference between norm- and criterion-referenced tests is their interpretation; that is, how we derive the meaning from a score. Norm-referenced tests are constructed to provide information about the relative status of children. Thus, they facilitate comparisons between a child's score and the score distribution (i.e., mean and standard deviation) of some norm group. As a result, the meaningfulness of these scores depends on:
(1) the extent to which the test user (e.g., psychologist, teacher, parents) is interested in comparing a child to the mean and standard deviation of a norm group; and
(2) the adequacy of the norm group.
ADEQUACY OF THE NORM GROUP IN NORM-REFERENCED TESTING
Before we learn how to interpret Billy’s test scores, we need to learn why the norm group in norm-referenced test interpretations is so important.
The American Psychological Association (APA), the American Educational Research Association (AERA), and the National Council on Measurement in Education (NCME) (1985) clearly state that it is the test publisher's responsibility to develop suitable norms for the groups on whom the test is to be used. There are four major types of norms. Which of the four types of norms is used by a psychologist (or a school district when conducting group testing) can have a radical impact on the interpretation of a child’s test results.
1. National norms. This is the most common norm applied to test scores. Therefore, it is the most important test norm. These norms are almost always reported separately for different age or grade levels. Most group instruments reporting national norms employ reasonably satisfactory norm groups. On the other hand, most individually-administered, clinical instruments used by psychologists, educational diagnosticians, etc., have inadequate national norms. Many have samples that are too small in size, are conducted on regional samples not representative of the country, are insufficiently stratified by age, and are disproportionately composed of Anglo and middle-class children.
The WISC-III, WIAT, ABAS, and ASCA instruments used to evaluate Billy all have very good national norms. The diagnostic decisions made by MDTs should ALWAYS be based on tests that use national norms, and NOT on any of the other norms discussed below.
2. State (also called Regional) Norms. Here, the referent changes from children across the United States to those within a particular state. State norms are confusing. They can sometimes be helpful, however. For instance, if we wanted to compare a child's achievement level to the achievement level of other children within the state of Delaware, we would use a state norm.
Generally, state norms impose problems for interpretation. Let’s talk about an instance where state norms would not be appropriate. In the psychological report for Billy, his overall IQ on the WISC-III was 65. The overall IQ on the WISC-III is referred to as the Full Scale IQ (FSIQ). Billy’s FSIQ of 65 was determined by comparing his performance to other children across the nation (i.e., national norms were used). We would not want to compare him to just children in the state of Delaware (i.e., we would not want to use state norms) because children in Delaware could have higher, or lower, IQs than children in other states. In other words, when we think about children’s intelligence levels, achievement levels, etc., we typically think about how they compare to other children across the nation - - and not how they compare to children just within one state. If MDTs made decisions on the basis of state norms, a child could be identified as mentally retarded based on his 'Delaware' IQ, move across state lines, and perhaps not be retarded in another state. Consequently, as noted above, national norms are to be preferred in the norm-referenced, diagnostic assessments completed by MDTs.
3. Special-Group Norms. For SOME decision-making purposes, special-group norms make sense. For example, when hiring an engineer from a homogeneous pool of applicants who are all engineers applying for a job, a better decision can be made using norms based on a pool of engineers alone, because we get to see how each applicant compares to the 'typical' engineer. Norms based on the general population would probably fail to make the fine-grain distinctions among the engineering applicants that are necessary to make the hiring decision, because engineers are brighter and more educated than the average person.
You may not know this, but the SAT uses special-group norms. Ask yourself: 'Who takes the SAT?' It is only those people who are pretty successful in high school. This is a 'special' group because you actually have to be pretty smart to graduate from high school - - and only those people who do well in high school consider going to college. It is only this latter group of people who take the SAT - - and norms for the SAT are based on this group. In other words, the norm group for the SAT is a 'special' norm group because it represents the top one-half of all students in the United States. Consequently, if you did not score very highly on the SAT, it is not an embarrassment. The reason is because you scored below average in comparison to a special norm group that was above average to begin with!
Another way of saying all of the above is that you could take the SAT (which has above-average, special-group norms) and score below average. You could then take the adult form of the WISC-III (i.e., the WAIS-III) and still score above average on the WAIS-III, because the WAIS-III has national norms!
On the other hand, special-group norms are inappropriate for the educational or diagnostic decisions made by MDTs. For example, you probably would agree that it would be incorrect to interpret Billy’s adaptive-behavior results on the ABAS using norms based only on children with retardation. The reason is because if Billy were to score in the average range for this special group, he would still share more in common with retarded children than with 'regular' education students, simply because he is 'average' only in comparison to children who are retarded.
4. Local Norms. Many educators prefer some intradistrict norm where they can compare children to one another within their school district. These norms are referred to as 'local' norms. The idea behind local norms is that test users can compare specific children to the average in that particular locale. While the use of local norms has some intuitive appeal, the procedure can be misleading when the local test mean deviates sharply from the test's national mean. For example, if the performance in a specific school district is below the national mean, the relative performance of children will be inflated by using local norms.
STANDARD SCORES
As we already know from our earlier lesson on statistics, the basic standard score is the z-score. We also know that once we obtain a z-score, it is a simple process to convert a z-score to a t-score, IQ score, and such.
The basic standard score, the z-score, is defined as follows:
Z = (X - M)/SD
where:
X = a child's raw score on the test,
M = the raw-score mean for a particular norm group,
SD = the raw-score standard deviation for a particular norm group.
The mean for a full set of z-scores is set at zero and the standard deviation is set at 1.0. Stated simply, z-scores are raw scores expressed in standard-deviation units from the mean. Further, we know that a major advantage of standard scores is that they are measured in equal units.
Problem 1
Before we go on to t-scores and other types of standard scores, let’s try a couple of problems where we convert raw scores on a test into z-scores. Assume that a test has a raw-score mean of 62 and a standard deviation of 9. If a child obtains a raw score of 71 on the test, what would her z-score be? Calculate this problem yourself.
Problem 2
Let’s return to the test used in Problem 1 just above. The test has a raw-score mean of 62 and a standard deviation of 9. A second child takes the test and gets a raw score of 53. What is this child’s z-score? Calculate this problem yourself.
I am going to give you a lot of help with problem 2 just above. The correct answer is z = -1.0. The answer shows that z-scores below the mean have negative values. In order to get enough precision when using z-scores, we must use at least one decimal place. This makes z-scores such as -1.0 awkward. Another drawback is that approximately half of all z-scores are negative.
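The two z-score problems above can be checked with a short Python sketch (the function name `z_score` is illustrative, not from any particular library):

```python
def z_score(raw, mean, sd):
    """Express a raw score in standard-deviation units from the norm-group mean."""
    return (raw - mean) / sd

# Problem 1: raw score of 71 on a test with raw-score mean 62 and SD 9
print(z_score(71, 62, 9))   # 1.0

# Problem 2: raw score of 53 on the same test
print(z_score(53, 62, 9))   # -1.0
```

Note that the second result is negative, illustrating the point made above: roughly half of all z-scores fall below the mean.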
Let’s consider again the case of Billy. He obtained a WISC-III FSIQ of 65. This number may not mean much to you yet, but it is a pretty low IQ. It is possible to convert his IQ to a z-score. When you do so, a WISC-III FSIQ of 65 converts to a z-score of -2.33. How would you like to tell Billy’s parents that his IQ was negative! I know I wouldn’t - - and it is for this reason that tests use other metrics.
Another way of saying all of the above is that we can avoid negative scores and decimals by simply using a standard score with a mean sufficiently greater than 0 to avoid minus score values, and a standard deviation sufficiently greater than 1 to make decimals unnecessary.
We already learned the general formula for converting z-scores to other standard scores.
Desired Metric unit = z (SD) + M
For example, Wechsler's intelligence tests (WISC-III, WAIS-III) use this form:
IQ = z (15) + 100
The SAT and GRE use this form:
SAT = z (100) + 500
Many behavior rating scales use t-scores. You can convert z-scores to this form as follows:
t-score = z (10) + 50
Problem 3
Johnny obtains a z-score of -2.0. What numerical value would his score be if we converted it to Wechsler IQ units, SAT units, and T-score units?
IQ = -2.0 (15) + 100
IQ = -30 + 100
IQ = 70
SAT = -2.0 (100) + 500
SAT = -200 + 500
SAT = 300
T score = -2.0 (10) + 50
T = -20 + 50
T = 30
As can be seen from this, IQs, SATs, and T-scores have all the properties of z-scores without the awkwardness resulting from negative scores and decimal points.
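Johnny's three conversions can be reproduced with the general formula z(SD) + M. The snippet below is a minimal sketch (the helper name `from_z` is just for illustration):

```python
def from_z(z, sd, mean):
    """Convert a z-score to another standard-score metric: desired unit = z * SD + M."""
    return z * sd + mean

z = -2.0
print(from_z(z, 15, 100))   # Wechsler IQ: 70.0
print(from_z(z, 100, 500))  # SAT: 300.0
print(from_z(z, 10, 50))    # T-score: 30.0
```

The same function handles any metric in the table later in this handout; only the mean and standard deviation change.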
Problem 4
Mary obtains an IQ of 115 on the WAIS-III, and her SAT score is 650. On which test did she do better?
To find this out, we need to convert both scores to a common unit,the z-score. All we have to do is use the formula for a z-score.
To find the z-score, given a WAIS-III IQ of 115, we:
Z = (115 - 100)/15
Z = 15/15
Z = 1.0
To find the z-score, given a SAT score of 650, we:
Z = (650 - 500)/100
Z = 150/100
Z = 1.5
So, then, on which test did Mary do better?
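You can check your answer by running the same z-score formula in a few lines of Python (a minimal sketch; `to_z` is an illustrative name):

```python
def to_z(score, mean, sd):
    """Convert a standard score back to a z-score using its metric's mean and SD."""
    return (score - mean) / sd

iq_z = to_z(115, 100, 15)    # 1.0 -> one SD above the mean
sat_z = to_z(650, 500, 100)  # 1.5 -> one and a half SDs above the mean
print("SAT" if sat_z > iq_z else "WAIS-III")  # prints SAT
```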
We already discussed other common standard-score metrics during our lesson on statistics. However, they are so important that I will present them again:
__________________________________________________________________
T score (e.g., the ASCA, CDCL, and BRP-2)
- M = 50
- SD = 10

Wechsler IQ units (e.g., the WISC-III, WIAT, WRAT-III, KM-R, and ABAS)
- M = 100
- SD = 15

Wechsler subtest units (e.g., the Information, Similarities, and other subscales of the WISC-III)
- M = 10
- SD = 3

Stanford-Binet IQ units
- M = 100
- SD = 16

GRE
- M = 500
- SD = 100

SAT
- M = (approximately) 430
- SD = 100

NCEs
- M = 50
- SD = 21.06

Stanines
- M = 5
- SD = 2 (actually 1.96)
__________________________________________________________________
RELATIVE-STATUS SCORES
We need to discuss some other types of derived scores (i.e., scores converted from raw scores). There are several types of derived scores that give a child’s relative status. Like standard scores, these other relative-status scores are derived from raw scores. However, these other relative-status scores are not standard scores.
Remember, standard scores present everything in equal units. This means we can add, subtract, multiply, and divide standard scores. We cannot add, subtract, multiply, and divide the other types of relative-status scores.
Besides standard scores, three other types of relative-status scores are commonly used by MDTs: (a) percentiles, (b) grade equivalents, and (c) age equivalents. We will now discuss each type of relative-status score.
Percentiles
A percentile is the point in a score distribution BELOW which a certain percentage of the people fall. Thus, if a person obtains a percentile score of 50, it means that 50 percent of the population falls below this person. Likewise, if a person gets a percentile score of 75, it means that 75 percent of the population falls below this person.
Percentiles are not standard scores. The reason is because percentiles are expressed in ordinal units (ranks). All that the term 'ordinal' means is that the distance between units (i.e., percentile numbers) is not equal. In other words, the distance between the 49th and 50th percentiles is much smaller than the distance between the 1st and 2nd percentiles. The reason is because the 49th and 50th percentiles are near the middle of the bell-shaped curve and the 1st and 2nd percentiles are at one 'tail' of the bell-shaped curve. As strange as it may seem (and I will show you this in class), the distance between the 2nd and 16th percentiles is almost exactly the same distance as that between the 16th and 50th percentiles!
Although widely used, percentiles suffer from two serious limitations. One limitation is that the size of percentile units is not constant in terms of standard-score units. We just covered this limitation above, but I will repeat it again to be thorough. For example, if the distribution of test scores is a normal, bell-shaped curve, the distance between the 90th and 99th percentiles is much greater than the distance between the 50th and 59th percentiles. One standard-score unit change near the mean of a test may alter a percentile score by many units, while a single standard-score unit change at the tail of the distribution may not change the percentile score at all!
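The unequal spacing of percentiles can be verified directly. Assuming a normal distribution, Python's standard library (3.8+) can convert percentile ranks back to z-scores:

```python
from statistics import NormalDist

nd = NormalDist()  # standard normal curve: mean 0, SD 1

# z-scores (standard-score units) corresponding to several percentile ranks
for p in (0.01, 0.02, 0.16, 0.50, 0.59, 0.90, 0.99):
    print(f"{p:4.0%} percentile -> z = {nd.inv_cdf(p):+.2f}")

# The 90th-to-99th gap spans about 1.04 z-units, while the 50th-to-59th
# gap spans only about 0.23 z-units -- the same 9-percentile difference.
```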
A second limitation of percentiles is that gains and losses cannot be compared meaningfully because percentiles are not measured in equal units. Thus, because the units are not equal, you cannot add, subtract, multiply, or divide percentiles.
Percentile scores can be very deceiving!!! Let’s consider the psychological report for a second student, Kelly. Her standard score in mathematics on the WIAT was 86. This score converts to a percentile score of 17. A standard score of 86 is in the Average range of achievement. However, most teachers would say that a child whose mathematics score is at the 17th percentile is having big trouble academically. This simply is not the case! Yes, like her classroom teacher, the psychologist would prefer to see Kelly have a much higher achievement level. However, a score at the 17th percentile is not all that low. Psychologists know this fact. It is not until you are about the 5th percentile, or lower, that the score suggests a need for special education. Because percentiles are misinterpreted so often, I tell graduate students that, in general, it is best not to present them in their psychological reports. (Note: the psychological report did present percentiles for the case of Billy because I wanted to show you the problems they can pose.)
Standard scores are clearly easier to interpret than percentiles. Furthermore, once you know a child’s standard score on a test, it is relatively easy to translate the standard score to a percentile. You can use standard score-to-percentile conversion tables to do this without having to make any calculations of the sort described earlier. In class, I will show you how to use such tables.
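The same translation can be approximated with the normal curve, using only the standard library (a sketch for checking your table work, not a substitute for a test's published conversion tables):

```python
import math

def percentile_from_standard(score, mean=100.0, sd=15.0):
    """Approximate percentile rank of a standard score under a normal curve."""
    z = (score - mean) / sd
    return 100.0 * 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# Kelly's WIAT mathematics standard score of 86 (mean 100, SD 15)
print(percentile_from_standard(86))  # about 17.5 -> reported as the 17th percentile
```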
Age- and Grade-Equivalents
Like percentiles, age- and grade-equivalents are two other types of derived scores. However, percentiles, age-equivalents, and grade-equivalents are not standard scores.
We already know that percentile scores can cause problems for interpretation. The truth of the matter is that age- and grade-equivalents are far worse to interpret than percentiles!
Age equivalents are intended to convey the meaning of test performance in terms of the typical child at a given age. Likewise, grade equivalents attempt to provide information in terms of the typical child at a given grade level.
Grade equivalents are the most common method for reporting results on standardized achievement tests prior to high school (Echternach, 1977). Although grade equivalents are very popular, they also are very problematic. Approximately 20 years ago, the APA, AERA, and NCME proposed that they be banned. Unfortunately, this never occurred.
Age- and grade-equivalents are essentially the same thing, except that age-equivalents compare children to other children who are at the same age level, whereas grade-equivalents compare children to others at their grade level. Therefore, because grade-equivalents are more popular than age-equivalents, the rest of the document will discuss grade equivalents.
Grade-equivalents can be explained best by an example. If a student obtains a raw score on a test that is equal to the median score for all the beginning sixth-graders (September testing) in the norm group, then that student is given a grade-equivalent of 6.0. A student who obtains a score equal to the median score of all beginning fifth-graders is given a grade equivalent of 5.0. If a student should score between these two points, 'interpolation' would be used to determine the grade equivalent. Because most schools run for 10 months, successive months are expressed as decimals. Thus, 5.1 would refer to the average performance of fifth graders in October, 5.2 in November, and so on to 5.9 in June.
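The interpolation just described can be sketched in a few lines. The median raw scores below are hypothetical, invented only to illustrate the mechanics:

```python
def grade_equivalent(raw, medians):
    """Linearly interpolate a grade equivalent from grade-placement medians.

    `medians` maps beginning-of-grade placements (e.g., 5.0 for September of
    fifth grade) to the norm group's median raw score at that placement.
    """
    points = sorted(medians.items())
    for (g_lo, m_lo), (g_hi, m_hi) in zip(points, points[1:]):
        if m_lo <= raw <= m_hi:
            fraction = (raw - m_lo) / (m_hi - m_lo)
            return round(g_lo + fraction * (g_hi - g_lo), 1)
    return None  # outside the normed range: only extrapolation (a guess) remains

# Hypothetical norms: beginning fifth-graders' median raw score is 40,
# beginning sixth-graders' is 50; a raw score of 45 falls halfway between.
print(grade_equivalent(45, {5.0: 40, 6.0: 50}))  # 5.5
```

Returning `None` outside the normed range previews the first limitation discussed below: extreme grade equivalents cannot be established directly and must be extrapolated.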
Limitations of Grade Equivalents
Grade-equivalents have a great deal of intuitive appeal because parents, teachers, as well as many psychologists, think the numbers actually mean something. However, this is not the case. By way of example, most parents, teachers (and many psychologists) would assume that a fifth-grade child who obtains a grade equivalent of 3.2 knows the same amount of reading as a third-grade child who obtains a grade equivalent of 3.2. This simply is not true! The fifth-grade child actually knows more reading! Thus, this short example shows some of the problems associated with grade equivalents.
If you do not believe what I just said about grade equivalents (or even if you do), it would be worthwhile to read the question-and-answer section of 'Hills' Handy Hints' devoted to grade equivalents.
We are now going to discuss the limitations of grade equivalents, but the problems cited for grade equivalents also apply to age equivalents.
Grade equivalents suffer from at least 7 major limitations. I will now present each of these limitations.
1. Grade equivalents for low scores in the low grades and high scores in the high grades cannot be established directly, because they generally are extrapolated from existing observations. This procedure, at best, represents little more than an educated guess.
2. Even in grades where norms exist, it is appropriate for 50% of the children in a classroom to score below their grade level. That is, grade equivalents give us little information about the percentile standing of a person within a class. For example, during a September testing, it is normal for 50% of the children to obtain grade equivalents below their grade placement. This is especially true in the upper grades.
3. Related to problem number 2, grade equivalents tend to exaggerate the significance of small differences and, in this way, tend to encourage the improper use of test scores. Because of the large within-grade variability, it is possible, for example, for a child who is only moderately below the median for his grade to appear on a grade equivalent as much as a year or two below expectancies. (This phenomenon is most likely to occur in the upper grades - - say, in the sixth grade and above.) A comparison of the child's grade equivalent with his percentile rank will make this fact clear. The problem is most evident, say, when a 6th grader obtains a grade equivalent of 1.3. The 1.3 grade equivalent does not mean that the child is functioning on the same level as a child in the third month of first grade. The older child most probably knows more.
4. Grade equivalents are not comparable across subject matter. A 6th-grade student, for example, because of the differences in grade equivalents for various subject matter, can have a grade equivalent of 6.6 in reading and 6.2 in mathematics and yet have a higher standard score (or percentile score) in mathematics! In other words, grade equivalents are an artifact of the particular way the subject-matter area in question is measured on the test AND the way the subject matter is introduced in the curriculum of a particular school district.
5. Grade equivalents assume that growth across years is uniform. The assumption of uniform growth across years is untenable. Developmental psychologists teach us that rate of growth is greater for younger children and that it diminishes as children advance in age. Grade equivalents, however, act as though 1 month of growth in the first grade is the same as 1 month of growth in the 10th grade.
6. Grade equivalents are based on 9- or 10-month school-year metrics. This means grade equivalents assume either that no growth takes place during the summer or that growth during the summer is equal to one month of growth during the school year. There is certainly reason to doubt that these assumptions are true.
7. Finally, grade equivalents have no interpretive value beyond the eighth or ninth grade. They are appropriate only for those subjects that are common to a particular grade level.
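Limitations 3 and 4 lend themselves to a quick numeric illustration. The within-grade spreads below are hypothetical numbers invented for the sketch (real spreads vary by test, grade, and subject), but the arithmetic shows both effects: a moderate percentile deficit inflating into a year-sized GE gap, and a larger GE in one subject masking a lower relative standing.

```python
from statistics import NormalDist

# Hypothetical within-grade GE distributions for sixth graders (invented numbers)
norms = {
    "reading": {"median_ge": 6.0, "sd_ge": 2.0},  # GEs spread widely in reading
    "math":    {"median_ge": 6.0, "sd_ge": 0.4},  # and narrowly in math
}

def z_score(ge, subject):
    """Within-grade standing implied by a grade equivalent."""
    n = norms[subject]
    return (ge - n["median_ge"]) / n["sd_ge"]

# Limitation 3: a moderately below-average reader (25th percentile) can look
# more than a year "behind" once the percentile is re-expressed in GE units.
z25 = NormalDist().inv_cdf(0.25)
print(round(z25 * norms["reading"]["sd_ge"], 1))  # -1.3 GE units below the median

# Limitation 4: GE 6.6 in reading is a *lower* within-grade standing
# than GE 6.2 in math, despite being the larger grade equivalent.
print(z_score(6.6, "reading") < z_score(6.2, "math"))  # True (0.3 SD vs 0.5 SD)
```

This is exactly why comparing the grade equivalent with the percentile rank, as suggested under limitation 3, is so revealing: the percentile tells you where the child actually stands within the grade.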
Grade equivalents remain popular in spite of their inadequacies. Educators are under the impression that such scores are easily and correctly interpreted - an unfortunate assumption. At a minimum, it is appropriate to suggest that grade equivalents never be used alone, without some other type of score such as standard scores or percentile ranks - - and it may not be too dogmatic to suggest that we stop using these scores altogether.
You probably noticed that the psychological report for Billy presented neither age- nor grade-equivalents. The reason, quite simply, is that their metrics are so poor.