As an SLP who routinely conducts speech and language assessments in several settings (e.g., school and private practice), I understand the utility of and the need for standardized speech, language, and literacy tests. However, as an SLP who works with children with dramatically varying degrees of cognition, abilities, and skill-sets, I also highly value supplementing these standardized tests with functional and dynamic assessments, interactions, and observations.
Since significant value is placed on standardized testing by both schools and insurance companies for the purposes of service provision and reimbursement, I wanted to summarize in today’s post the findings of recent articles on this topic. Since my primary interest lies in assessing and treating school-age children, for the purposes of today’s post all of the reviewed articles came directly from the Language, Speech, and Hearing Services in Schools (LSHSS) journal.
We’ve all been there. We’ve all had situations in which students scored on the low end of normal, or had a few subtest scores in the below average range, which still added up to an average total score. We’ve all pored over eligibility requirements trying to figure out whether the student should receive therapy services given the stringent standardized testing criteria in some states/districts.
Of course, as it turns out, the answer is never simple. In 2006, Spaulding, Plante & Farinella set out to examine the assumption: “that children with language impairment will receive low scores on standardized tests, and therefore [those] low scores will accurately identify these children” (61). So they analyzed the data from 43 commercially available child language tests to identify whether evidence exists to support their use in identifying language impairment in children.
It turns out that it does not! Because the psychometric properties of the various tests differ (see the article for specific details), many children with language impairment are overlooked by standardized tests: they score within the average range or do not score low enough to qualify for services. Thus, “the clinical consequence is that a child who truly has a language impairment has a roughly equal chance of being correctly or incorrectly identified, depending on the test that he or she is given.” Furthermore, “even if a child is diagnosed accurately as language impaired at one point in time, future diagnoses may lead to the false perception that the child has recovered, depending on the test(s) that he or she has been given (69).”
Consequently, they created a decision tree (see below) with recommendations for clinicians using standardized testing. They recommend using alternate sources of data (sensitivity and specificity rates) to support accurate identification (available for a small subset of select tests).

The idea behind it is: “if sensitivity and specificity data are strong, and these data were derived from subjects who are comparable to the child tested, then the clinician can be relatively confident in relying on the test score data to aid his or her diagnostic decision. However, if the data are weak, then more caution is warranted and other sources of information on the child’s status might have primacy in making a diagnosis (70).”
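To make that reasoning concrete, here is a minimal sketch of how sensitivity and specificity are calculated and how they might feed into the decision described above. Everything in it is an assumption for illustration: the counts are made up rather than taken from any test manual, the names (`ValidationCounts`, `confidence_in_score`) are hypothetical, and the 0.80 cutoff is just a placeholder threshold, not a value stated in the articles reviewed here.

```python
from dataclasses import dataclass


@dataclass
class ValidationCounts:
    """Hypothetical counts from a test's diagnostic accuracy study."""
    true_positives: int   # children with language impairment who scored below the cutoff
    false_negatives: int  # children with language impairment who scored in the average range
    true_negatives: int   # typically developing children who scored in the average range
    false_positives: int  # typically developing children who scored below the cutoff


def sensitivity(c: ValidationCounts) -> float:
    """Proportion of children WITH impairment that the test correctly flags."""
    return c.true_positives / (c.true_positives + c.false_negatives)


def specificity(c: ValidationCounts) -> float:
    """Proportion of children WITHOUT impairment that the test correctly clears."""
    return c.true_negatives / (c.true_negatives + c.false_positives)


def confidence_in_score(c: ValidationCounts, sample_comparable: bool,
                        threshold: float = 0.80) -> str:
    """Rough paraphrase of the quoted logic; the threshold is an assumed placeholder."""
    strong = sensitivity(c) >= threshold and specificity(c) >= threshold
    if strong and sample_comparable:
        return "relatively confident in the test score"
    return "use caution; give other sources of information primacy"


# Made-up example: a test that misses half of the children with impairment,
# i.e. the "roughly equal chance" situation described earlier in this post.
weak_test = ValidationCounts(true_positives=25, false_negatives=25,
                             true_negatives=45, false_positives=5)
print(f"sensitivity = {sensitivity(weak_test):.2f}")
print(f"specificity = {specificity(weak_test):.2f}")
print(confidence_in_score(weak_test, sample_comparable=True))
```

Note that the second condition matters as much as the first: even strong accuracy figures offer little reassurance if the children in the validation sample are not comparable to the child being tested.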
Fast forward 6 years, and a number of newly revised tests later, in 2012, Spaulding and colleagues set out to “identify various U.S. state education departments’ criteria for determining the severity of language impairment in children, with particular focus on the use of norm-referenced tests” as well as to “determine if norm-referenced tests of child language were developed for the purpose of identifying the severity of children’s language impairment” (176).
They obtained published procedures for severity determinations from available U.S. state education departments, which specified the use of norm-referenced tests, and reviewed the manuals for 45 norm-referenced tests of child language to determine if each test was designed to identify the degree of a child’s language impairment.
What they found out was “the degree of use and cutoff-point criteria for severity determination varied across states. No cutoff-point criteria aligned with the severity cutoff points described within the test manuals. Furthermore, tests that included severity information lacked empirical data on how the severity categories were derived (176).”
Thus they urged SLPs to exercise caution in determining the severity of children’s language impairment via norm-referenced test performance “given the inconsistency in guidelines and lack of empirical data within test manuals to support this use (176)”.
Following the publication of this article, Ireland, Hall-Mills & Millikin issued a response. They pointed out that the “severity of language impairment is only one piece of information considered by a team for the determination of eligibility for special education and related services”. They noted that Spaulding and colleagues left out a host of federal and state guideline requirements and “did not provide an analysis of the regulations governing special education evaluation and criteria for determining eligibility (320).” They also pointed out that “IDEA prohibits the use of ‘any single measure or assessment as the sole criterion’ for determination of disability and requires that IEP teams ‘draw upon information from a variety of sources.’”
They listed a variety of examples from several different state departments of education (FL, NC, VA, etc.), which mandate the use of functional assessments, dynamic assessments, criterion-referenced assessments, etc. for their determination of language therapy eligibility.
But are SLPs across the country appropriately using the federal and state guidelines in order to determine eligibility? While one should certainly hope so, it does not always seem to be the case. To illustrate, in 2013, Betz & colleagues asked 364 SLPs to complete a survey “regarding how frequently they used specific standardized tests when diagnosing suspected specific language impairment (SLI) (133).”
Their purpose was to determine “whether the quality of standardized tests, as measured by the test’s psychometric properties, is related to how frequently the tests are used in clinical practice” (133).
What they found out was that the most frequently used tests were the comprehensive assessments, including the Clinical Evaluation of Language Fundamentals and the Preschool Language Scale, as well as one-word vocabulary tests such as the Peabody Picture Vocabulary Test. Furthermore, the date of publication seemed to be the only factor which affected the frequency of test selection.
They also found out that SLPs frequently did not follow up the comprehensive standardized testing with domain-specific assessments (critical thinking, social communication, etc.) but instead used vocabulary testing as a second measure. They were understandably puzzled by that finding. “The emphasis placed on vocabulary measures is intriguing because although vocabulary is often a weakness in children with SLI (e.g., Stothard et al., 1998), the research to date does not show vocabulary to be more impaired than other language domains in children with SLI (140).”
According to the authors, “perhaps the most discouraging finding of this study was the lack of a correlation between frequency of test use and test accuracy, measured both in terms of sensitivity/specificity and mean difference scores (141).”
If SLPs have not significantly changed their practices since that survey, the above is certainly disheartening, as it implies that rather than being true diagnosticians, SLPs are using whatever is at hand that has been purchased by their department to indiscriminately assess students with suspected speech-language disorders. If that is truly the case, it certainly calls into question the response of Ireland, Hall-Mills & Millikin to Spaulding and colleagues. In other words, though SLPs are aware that they need to comply with state and federal regulations when it comes to unbiased and targeted assessments of children with suspected language disorders, they may not actually be using appropriate standardized testing, much less supplementary informal assessments (e.g., dynamic, narrative, language sampling), in order to administer well-rounded assessments.
So where do we go from here? Well, it’s quite simple really! We already know what the problem is. Based on the above articles we know that:
- Standardized tests possess significant limitations
- They are not used with optimal effectiveness by many SLPs
- They may not be frequently supplemented by relevant and targeted informal assessment measures in order to improve the accuracy of disorder determination and subsequent therapy eligibility
Now that we have identified a problem, we need to develop and consistently implement effective practices to ameliorate it. These include researching the psychometric properties of tests (sample size, sensitivity and specificity, etc.), using domain-specific assessments to supplement the administration of comprehensive testing, and supplementing standardized testing with a plethora of functional assessments.
SLPs can review testing manuals and consult with colleagues when they feel that the standardized testing is underidentifying students with language impairments (e.g., HERE and HERE). They can utilize referral checklists (e.g., HERE) in order to pinpoint the students’ most significant difficulties. Finally, they can develop and consistently implement informal assessment practices (e.g., HERE and HERE) during testing in order to gain a better grasp on their students’ TRUE linguistic functioning.
Stay tuned for the second portion of this post entitled: “What Research Shows About the Functional Relevance of Standardized Speech Tests?” to find out the best practices in the assessment of speech sound disorders in children.
References:
- Spaulding, Plante & Farinella (2006) Eligibility Criteria for Language Impairment: Is the Low End of Normal Always Appropriate?
- Spaulding, Szulga, & Figueroa (2012) Using Norm-Referenced Tests to Determine Severity of Language Impairment in Children: Disconnect Between U.S. Policy Makers and Test Developers
- Ireland, Hall-Mills & Millikin (2012) Appropriate Implementation of Severity Ratings, Regulations, and State Guidance: A Response to “Using Norm-Referenced Tests to Determine Severity of Language Impairment in Children: Disconnect Between U.S. Policy Makers and Test Developers” by Spaulding, Szulga, & Figueroa (2012)
- Betz et al. (2013) Factors Influencing the Selection of Standardized Tests for the Diagnosis of Specific Language Impairment
Punctuation brings written words to life. As we have seen from countless grammar memes, a single punctuation error can convey a completely different meaning.
This explicit instruction of punctuation terminology does significantly improve my students’ understanding of sentence formation. Even my students with mild to moderate intellectual disabilities significantly benefit from understanding how to use periods, commas and question marks in sentences.
This in turn becomes a critical thinking and an executive functions activity. Students need to sift through quite a bit of information to find a website which provides the clearest answers regarding the usage of specific punctuation marks. Here, it’s important for students to locate kid-friendly websites which will provide them with simple but accurate descriptions of how punctuation marks are used. One example of such a website is
In my therapy sessions I spend a significant amount of time improving the literacy skills (reading, spelling, and writing) of language-impaired students. In my work with these students I emphasize goals with a focus on phonics, phonological awareness, encoding (spelling), etc. However, what I have frequently observed in my sessions are significant gaps in the students’ foundational knowledge pertaining to the basics of sound production and letter recognition. Basic examples of these foundational deficiencies involve students not being able to fluently name the letters of the alphabet, understand the difference between vowels and consonants, or fluently engage in sound/letter correspondence tasks (e.g., name a letter and then quickly and accurately identify which sound it makes). Consequently, a significant portion of my sessions involves explicit instruction of the above concepts.
So I do! Furthermore, I can tell you that explicit instruction of metalinguistic vocabulary does significantly improve my students’ understanding of the tasks involved in obtaining literacy competence. Even my students with mild to moderate intellectual disabilities significantly benefit from understanding the meanings of: letters, words, sentences, etc.


Here is the problem though: I only see the above follow-up steps in a small percentage of cases. In the vast majority of cases in which score discrepancies occur, I see the examiners ignoring the weaknesses without follow-up. This of course results in the child not qualifying for services.
So the next time you see a pattern of strengths and weaknesses in testing, even if it amounts to a total average score, I urge you to dig deeper. I urge you to investigate why this pattern is displayed in the first place. The same goes for you, parents! If you are looking at average total scores but seeing unexplained weaknesses in select testing areas, start asking questions! Ask the professional to explain why those deficits are occurring and tell them to dig deeper if you are not satisfied with what you are hearing. All students deserve access to FAPE (Free Appropriate Public Education). This includes access to the appropriate therapies they may need in order to optimally function in the classroom.
“Well, the school did their evaluations and he doesn’t qualify for services,” the parent of a 3.5-year-old, newly admitted private practice client tells me. “I just don’t get it,” she says bemusedly. “It is so obvious to anyone who spends even 10 minutes with him that his language is nowhere near other kids his age!” “How can this happen?” she asks in frustration.


As per NJAC 6A:14-2.5
These delays can be receptive (listening) or expressive (speaking) and need not be based on a total test score but rather on all testing findings, with a minimum of two assessments being performed. A determination of adverse impact in academic and non-academic areas (e.g., social functioning) needs to take place in order for special education and related services to be provided. Additionally, a delay in articulation can serve as a basis for consideration of eligibility as well.
General Language:
Pragmatics/Social Communication
Typically, when asked that question, I recommend that a trained SLP perform a series of tests aimed at determining whether the student presents with reading and writing deficits.


