Topic: "Establishing Validity and Reliability for Locally Developed Instruments"

Topic started by: John Sutton on 5/21/14

We recently presented the webinar Establishing Validity and Reliability for Locally Developed Instruments and would like to continue the conversation.

If you missed the webinar, a recording of it along with the presentation slides will be available here:

Post webinar discussion continued

posted by: Sara Silver on 5/22/2014 9:00 am

Dave, Xin, and John from TEAMS/RMC gave useful ideas on how to establish reliability and validity for locally developed MSP tools. But there are some contextual issues to be considered:
1. MSP local evaluators have little advance time to go through the exercise of establishing reliability and validity prior to actual data collection (e.g., field test, run exploratory factor analysis). They are rarely brought into the planning stages and must often have a tool ready to go at the time of the first PD delivery.
2. Tools to measure teacher content knowledge (CK) are often the first that need to be produced. And here, a pre/post admin design makes reliability and validity correlations moot because if teachers have gained CK, one would NOT want to see strong pre/post correlations. In our experience, local grantees have the greatest difficulty designing CK tools when PD is so specialized to the grant that no standard tools exist in the field. It is incumbent on PD providers to take a larger role here since they determine the content of PD.
3. Finally, local evaluators typically have modest budgets to perform a great many tasks, and in reality, establishing reliability and validity with fidelity is costly.

We welcome TEAMS assistance with these very real "on the ground" factors.

Sara's post

posted by: Leaf Schumann on 5/23/2014 9:43 am

Sara's points are well taken and each of the three presents real issues that must be addressed to the degree possible (or the degree impossible?).

A thought on 2, however: to the degree that a PD participant holds their position (against other participants) between pre and post, might that not offer an opportunity to say something about reliability?
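One way to put a number on "holding position" is a rank correlation between pre and post scores. The sketch below is only an illustration under stated assumptions: the scores are made up, the function names are hypothetical, and Spearman's rank correlation is one possible choice of statistic, not something prescribed in the webinar. A high value would suggest the test orders teachers consistently even when everyone's raw score rises.

```python
def ranks(values):
    """Return 1-based ranks; tied values share the mean of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with the one at position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = average_rank
        i = j + 1
    return result


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))


# Hypothetical CK scores for five teachers (same teacher order in both lists).
pre_scores = [12, 15, 9, 20, 14]
post_scores = [18, 22, 16, 27, 19]

# Every teacher gained, but the relative ordering is what this checks.
print(round(spearman(pre_scores, post_scores), 3))
```

With small PD cohorts the estimate will be noisy, and uneven gains (which a good PD may well produce) will pull it down, so it is probably best read as supporting evidence rather than a reliability coefficient in its own right.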

I agree that the specialized content of these projects means standard tools are generally lacking. This is one of the most nettlesome issues in my view.

Reliability and Validity on a Shoestring.

posted by: Dave Weaver on 5/23/2014 5:52 pm

What I proposed during the webinar is something I have been doing myself for a number of years across multiple projects. I certainly agree that in most cases it is not practical to try to do it all during a single project because of a lack of time and money. It took me several years across several projects to accumulate a collection of reliable measures for a number of constructs that I can easily reuse in surveys for other projects. Each project contributes to my library of constructs for which I have reliable measures. The content validity comes from choosing the constructs that are relevant and valid for the project. However, every project has constructs for which I have no measures. In that case I am faced with creating the items from scratch and doing the reliability testing using the first wave of data collected. I know that others are doing the same thing. If we could share our reliability and validity information at the construct level, we would all be able to save lots of time and be using instruments that are more reliable. Picking the constructs that are valid for your project supports content validity for the instrument.

Regarding the comment about measures of teacher content knowledge: Keep in mind that not every method for determining reliability and validity is appropriate for every situation. The nature of the instrument and the project implementation determine which method would work best. For tests, I generally use the alternate form method, but I will need to defer to Dr. Wang, who is much more versed in this matter.

CK tests--alternate forms

posted by: Sara Silver on 5/27/2014 9:36 am

Dave, I'm curious as to how you administer alternate forms of a CK test to yield a measure of reliability. Can Xin weigh in on this discussion too?

CK reliability

posted by: Xin Wang on 5/27/2014 11:09 am

I agree with Dave that it is hard for evaluators to validate the measures for one single project. That's why it makes sense for evaluators/researchers to share the validity and reliability evidence for the measures they have used and help each other build a collection of reliable and valid instruments. As for the reliability measures for teacher CK, if you plan to conduct the reliability testing using data collected from one test administration, I would go for alternate form or split-half tests. If the sample size is pretty large, you may want to divide your sample into two or more groups and administer different forms to the groups. Another option is to do an internal-consistency test (split-half or Cronbach's alpha), which doesn't rely on data from repeated administrations.
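For anyone who wants to try the internal-consistency options on a single administration, here is a minimal sketch. It assumes item-level scores are available as one row per teacher and one column per item; the data and function names are hypothetical. The split-half version uses an odd/even item split with the Spearman-Brown correction, which is one common convention among several possible splits.

```python
from statistics import variance


def cronbach_alpha(rows):
    """Cronbach's alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                      # number of items
    items = list(zip(*rows))              # transpose: one tuple per item
    item_variance_sum = sum(variance(item) for item in items)
    total_variance = variance([sum(row) for row in rows])
    return (k / (k - 1)) * (1 - item_variance_sum / total_variance)


def pearson(x, y):
    """Pearson correlation of two equal-length lists."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / (sxx * syy) ** 0.5


def split_half(rows):
    """Odd/even split-half reliability with the Spearman-Brown correction."""
    first_half = [sum(row[0::2]) for row in rows]   # items 1, 3, 5, ...
    second_half = [sum(row[1::2]) for row in rows]  # items 2, 4, 6, ...
    r = pearson(first_half, second_half)
    return 2 * r / (1 + r)                # step up to full test length


# Hypothetical right/wrong (1/0) scores: six teachers, four CK items.
scores = [
    [1, 1, 1, 0],
    [1, 0, 1, 1],
    [0, 0, 1, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
    [0, 1, 0, 0],
]

print("alpha:", round(cronbach_alpha(scores), 3))
print("split-half:", round(split_half(scores), 3))
```

Both functions need at least two items and some variation in total scores, otherwise the variance in the denominator is zero. With the small samples typical of a single PD cohort, the resulting coefficients should be treated as rough estimates.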

CK reliability

posted by: Sara Silver on 5/28/2014 9:26 am

Thanks, Xin. Split-half or Cronbach's alpha might be the way to go for projects that require customized CK tests given in a pre/post fashion, and also where time and budget are at a premium.