If DIF is found for many items on the test, the final test scores do not represent the same... Since the effect of missing data on differential item functioning (DIF) assessment has been invest... The ConQuest software provided the analysis model to understand the performance differences between groups i... DIF can be analyzed with specialized software such as DFIT or PARSCALE. I would like to thank the authors of these programs for allowing free access to these packages. A set of functions to perform differential item and test functioning analyses is implemented in the DFIT package.
A new method for estimating differential item functioning (DIF) for... Classical Item Analyzer computes test scale reliability analyses. University of Wisconsin, Laboratory of Experimental Design. Software for analyzing differential item functioning. Improving the assessment of differential item functioning.
Starting from a framework for classifying DIF detection methods and from a comparative overview of the most traditional methods, an R... An item displays DIF when test takers possessing the same amount of an ability or trait, but belonging to different subgroups, do not share the same likelihood of correctly answering the item. The fairness of an item depends directly on the purpose for which a test is being used. For example, a science item that is differentially difficult for women may be judged to be fair in a test designed for certification of science teachers because the item measures a topic that every entry-level science teacher should know. Statistical software for differential item functioning analysis. A computer program for detecting uniform and nonuniform differential item functioning with the Mantel-Haenszel procedure. Batch files can also be generated for handling multiple calibrations in a queue. A comparative study of the bias correction methods for... Windows software that generates IRT parameters and item responses. Analysis of differential item functioning in the depression... The jMetrik software includes psychometric analyses such as CTT, IRT, differential item functioning (DIF), and confirmatory factor analysis (CFA).
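The definition above, and the distinction between uniform and nonuniform DIF, can be illustrated with a small sketch. It assumes a two-parameter logistic (2PL) item response function, which the text does not specify, and all item parameters are hypothetical values chosen for illustration only.

```python
import math

def p_correct(theta, a, b):
    """2PL item response function (an assumed model): P(correct | ability theta)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical item parameters: a = discrimination, b = difficulty.
reference = {"a": 1.0, "b": 0.0}
focal_uniform = {"a": 1.0, "b": 0.5}     # same slope, harder at every ability level
focal_nonuniform = {"a": 0.5, "b": 0.0}  # different slope, gap changes sign

def dif_gap(theta, ref, foc):
    """Reference minus focal probability of success at the same ability."""
    return p_correct(theta, **ref) - p_correct(theta, **foc)

for theta in (-2.0, 0.0, 2.0):
    print(theta,
          round(dif_gap(theta, reference, focal_uniform), 3),
          round(dif_gap(theta, reference, focal_nonuniform), 3))
```

Under these assumed parameters the uniform case favors the reference group at every ability level, while in the nonuniform case the direction of the gap depends on ability.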
Assessing differential item functioning among multiple groups. Differential item functioning analysis with ordinal logistic... Performances based on ability estimation of the methods of... Intuitive and analytical responses, agree-disagree answers, response refusals, socially desirable responding, differential item functioning, and choices among multiple options are considered.
Differential item functioning (DIF) has been increasingly applied in fairness studies in psychometric circles. Software: ERM faculty members have made the following software packages available free for download. We analyzed 95 cognitive reading items, administered to students in 29 European countries. Differential item functioning (DIF) occurs when a test item favors or hinders a characteristic exhibited by group members of a test-taking population. IRT differential item functioning tool... In brief, differential item functioning (DIF) occurs when groups (such as those defined by gender, ethnicity, age, or education) have different probabilities of endorsing a given item on a multi-item scale after controlling for overall scale scores. A program to generate item response vectors (unpublished manuscript). As a result, the Differential Item Functioning Analysis System (DIFAS) was developed to provide a cost-effective and easy-to-use program for conducting many of the common nonparametric DIF detection procedures, as well as several new DIF detection procedures that are not available in other statistical packages. An overview of differential item functioning in multistage computer... With the multiple-file read-in option in WinGen, a user can have multiple groups of examinees and multiple sets of items or tests. Improving the assessment of differential item functioning in... The student edition runs all example command files. Measuring differential item and test functioning across...
The program developed for DIF analysis in CAT was called Computer Adaptive Test-Simultaneous Item Bias (CATSIB; Roussos, 1996) that... The purpose of the proposed research is to create multilevel differential item functioning (DIF) methods and software to increase the accuracy of the detection of DIF. A general framework and an R package for the detection of... Average item scores for subgroups having the same overall score on the test are compared to determine whether the item is measuring in essentially the... NAEP analysis and scaling: differential item functioning. ERIC EJ973374: Comparison of three software programs for...
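The comparison of average item scores for subgroups with the same overall score can be sketched as follows. This is a minimal illustration in the spirit of the standardization approach, not the procedure of any program named above; the function name, the group labels "R" and "F", and the focal-group weighting are assumptions made for the example.

```python
from collections import defaultdict

def standardized_difference(responses, groups, item, focal="F", reference="R"):
    """Focal-weighted mean of (focal - reference) item means at matched totals."""
    # (total score, group) -> [sum of item scores, number of examinees]
    cells = defaultdict(lambda: [0, 0])
    for resp, g in zip(responses, groups):
        cell = cells[(sum(resp), g)]
        cell[0] += resp[item]
        cell[1] += 1

    num, den = 0.0, 0
    for (total, g), (s_f, n_f) in list(cells.items()):
        # Only score levels observed in both groups can be compared.
        if g != focal or (total, reference) not in cells:
            continue
        s_r, n_r = cells[(total, reference)]
        num += n_f * (s_f / n_f - s_r / n_r)
        den += n_f
    return num / den if den else float("nan")

# Toy data: each row is one examinee's 0/1 scores on three items.
responses = [[1, 1, 0], [0, 1, 1], [1, 1, 0], [0, 1, 1], [1, 0, 0], [1, 0, 0]]
groups = ["R", "F", "R", "F", "R", "F"]
print(standardized_difference(responses, groups, item=0))
```

A value near zero suggests that matched examinees in the two groups perform similarly on the item; a negative value means the focal group scores lower than matched reference examinees.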
With the rising concerns over the fairness of language tests, differential item functioning (DIF) has been increasingly applied in bias analysis. Differential item functioning (DIF) is an important issue of interest in psychometrics and educational measurement. We present an ordinal logistic regression model for identi... Programs for differential item functioning (LinkDIF and EZDIF) as described in Applied Psychological Measurement. Factor analysis.
By design, large-scale educational testing programs often have a large proportion of missing data. Detecting differential item functioning using Wald and likelihood... Differential item functioning between ethnic groups in the epidemiological assessment of depression. Paper 2900-2015: Multiple ways to detect differential item... Teresi, Katja Ocepek-Welikson, Marjorie Kleinman, Joseph P... Graphing Tool is a simple spreadsheet to visualize differential item functioning with item... In this study, the performance of the regular maximum likelihood (ML) estimation is compared with two bias... Detection of DIF is one step in the process of gathering score validity evidence. Comparison of three software programs for evaluating... Measurement invariance and differential item functioning. Some of these procedures, such as the Mantel-Haenszel chi... They are helpful to those of us who wish to investigate... However, for rare events data, the maximum likelihood estimation method may be biased and the asymptotic distributions may not be reliable. Differential item functioning (DIF) is a statistical characteristic of an item that shows the extent to which the item might be measuring different abilities for members of separate subgroups.
Assessing differential item functioning in performance assessment. There's a body of research on what is called differential item functioning in... The logistic regression (LR) model for assessing differential item functioning (DIF) is highly dependent on the asymptotic sampling distributions. A thesis submitted to the Graduate School of Natural and Applied Sciences of Middle East Technical University. Differential item functioning (DIF) is the preferred psychometric term for what is otherwise known as item bias. Because of insufficient numbers of students for other demographic characteristics, this was the only comparison made. Differential item functioning analysis with ordinal logistic regression techniques: DIFdetect and difwithpar. Paul K... The item analysis includes proportion, point-biserial, and biserial statistics for all response options. Current methods include classical item analysis, differential item functioning (DIF) analysis, confirmatory factor analysis, item response theory, IRT equating, and nonparametric item response theory. Assessment developers design and construct questionnaires or tests including sets of items that measure, for example, cognition, personality traits, or political views. ERM software, School of Education, UNC Greensboro.
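The proportion, point-biserial, and biserial statistics mentioned above can be computed with the standard textbook formulas. The sketch below is not taken from any of the packages listed here; it handles only the keyed (correct) option of a dichotomously scored item, whereas the software described reports these statistics for every response option. The function name and the toy data are hypothetical.

```python
import math
from statistics import NormalDist, fmean, pstdev

def item_statistics(item_scores, total_scores):
    """Return (proportion correct, point-biserial, biserial) for one 0/1 item.

    Assumes at least one correct and one incorrect response, so 0 < p < 1.
    """
    p = fmean(item_scores)
    q = 1.0 - p
    mean_right = fmean([t for x, t in zip(item_scores, total_scores) if x == 1])
    mean_wrong = fmean([t for x, t in zip(item_scores, total_scores) if x == 0])
    sd_total = pstdev(total_scores)  # population SD of the total scores

    # Point-biserial: Pearson correlation between the 0/1 item and the total.
    r_pb = (mean_right - mean_wrong) / sd_total * math.sqrt(p * q)

    # Biserial: rescale by the normal density at the cut point with area p.
    y = NormalDist().pdf(NormalDist().inv_cdf(p))
    r_bis = r_pb * math.sqrt(p * q) / y
    return p, r_pb, r_bis

item = [1, 1, 0, 0, 1, 0]
total = [9, 7, 4, 3, 8, 5]
print(item_statistics(item, total))
```

The biserial coefficient is always at least as large in magnitude as the point-biserial and can exceed 1 in small or non-normal samples.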
It includes functions to use the Monte Carlo item parameter replication (IPR) approach for obtaining the associated statistical significance test cutoff points. DIF (differential item functioning): in larger testing programs, it is possible to look at how, within a given overall ability level, members of different groups e... The analysis of differential item functioning (DIF) examines whether item responses differ according to characteristics such as language and ethnicity, when people with matching ability levels respond differently to the items. The Differential Item Functioning Analysis Software (Penfield, 2005) and the EASY-DIF software (González et al... A Windows-based item response theory data generator with an equating and differential item functioning simulation guide. This article provides an applied example using SIBTEST statistical software to detect DIF in U... Detecting differential item functioning. X fits an item response model when X are item scores e... DIF analyses are statistical procedures used to determine to what extent the content of an item affects the item endorsement of subgroups of test takers. An introduction to differential item functioning. Differential item functioning (DIF) refers to group differences in performance on a test item that cannot be explained by group differences in the construct targeted.
Performance differences at the measure level are described here as differential item functioning (DIF). In this article, I show how item response models can be used to capture multiple response processes in psychological applications. Recommendations for conducting differential item functioning...
A variety of statistical procedures have been developed to assess DIF in tests of dichotomous items (Hills, 1989). Several methods have been proposed in recent decades for identifying items that function differently between two or more groups of examinees. Modeling multiple response processes in judgment and choice. Using DIFAS (Penfield, 2005), differential item functioning (DIF) analysis was performed comparing males with females using data from sets 1 and 2, which were administered to all examinees. A new method for estimating differential item functioning (DIF) for multiple groups and polytomous items. All of these analyses are useful in evaluating the psychometric quality of an assessment. This simulation study examines item-level differential item functioning (DIF) in the context of international large-scale assessment (ILSA) using a generalized logistic regression approach. Differential item functioning (DIF) has been widely used in healthcare, business management, and educational measurement. The purpose of the present analysis is to use differential item functioning (DIF) to identify differences in the performance of native and immigrant students in PISA 2009 that can be directly related to their responses to particular items.
If you have any comments or questions about any software on this page, contact the author of that specific package. Avoiding bad discrimination in licensing and certification. Comparison of three software programs for evaluating DIF by means... Below are common statistical programs capable of performing the procedures discussed herein. The aim of the study is to examine differential item functioning (DIF). Relatively few studies have examined an item-level approach to measurement equivalence, particularly in settings where a large number of groups is included. The difNLR package uses nonlinear regression to estimate DIF. Differential item functioning, Columbia University Mailman... WinGen provides a dialog input to introduce differential item functioning (DIF) or item parameter drift in the simulated data. Gibbons, PhD, Lance Jolley, MS, and Gerald van Belle, PhD. Judicious application of this methodology by the researchers, however, requires an... Detecting differential item functioning using generalized...
This analysis can be performed by calculating various statistics, one of the most important being the Mantel-Haenszel statistic, which can be computed with software programs. The difR package (May 2020) is a collection of methods to detect dichotomous differential item functioning (DIF), version 5. Software for analyzing differential item functioning using the... Flexible application to many types of selected-response items. University of Massachusetts, Center for Educational Assessment. Analysis of differential item functioning in the depression item bank from the Patient-Reported Outcomes Measurement Information System (PROMIS).
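For readers who want to see what the Mantel-Haenszel statistic involves, the following is a minimal sketch, assuming dichotomous items, total test score as the matching variable, and the usual continuity-corrected chi-square. It does not reproduce the interface of difR or any other package named here; the function name, the group labels, and the toy data are hypothetical.

```python
import math
from collections import defaultdict

def mantel_haenszel(item_scores, total_scores, groups, reference="R"):
    """Return (common odds ratio, delta-scale value, chi-square) for one item."""
    # One 2x2 table per total-score stratum:
    # [ref correct, ref incorrect, focal correct, focal incorrect]
    tables = defaultdict(lambda: [0, 0, 0, 0])
    for x, t, g in zip(item_scores, total_scores, groups):
        idx = (0 if x == 1 else 1) + (0 if g == reference else 2)
        tables[t][idx] += 1

    num = den = sum_a = sum_e = sum_v = 0.0
    for a, b, c, d in tables.values():
        n = a + b + c + d
        if n < 2:
            continue  # a single examinee carries no information
        num += a * d / n
        den += b * c / n
        sum_a += a
        sum_e += (a + b) * (a + c) / n  # expected reference-correct count
        sum_v += (a + b) * (c + d) * (a + c) * (b + d) / (n * n * (n - 1))
    if den == 0 or sum_v == 0:
        raise ValueError("not enough variation to estimate the MH statistics")

    alpha = num / den                     # common odds ratio
    delta = -2.35 * math.log(alpha)       # ETS delta scale
    chi2 = (abs(sum_a - sum_e) - 0.5) ** 2 / sum_v
    return alpha, delta, chi2

item = [1, 1, 1, 0, 1, 0, 0, 0, 1, 1, 0, 1, 0, 0]
total = [5, 5, 5, 5, 5, 5, 5, 5, 3, 3, 3, 3, 3, 3]
group = ["R"] * 4 + ["F"] * 4 + ["R"] * 3 + ["F"] * 3
print(mantel_haenszel(item, total, group))
```

An odds ratio above 1 (a negative delta value) indicates that the item favors the reference group among examinees matched on total score. The chi-square is referred to a chi-square distribution with one degree of freedom.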