Differential item functioning (DIF) analysis is a key component in evaluating the fairness and validity of educational tests. Although it is often assumed that refinement of the matching criterion (i.e., removing items flagged for DIF from the matching score and re-estimating) always yields more accurate DIF results, the actual situation proves more complex. To explore the effectiveness of refinement, we conducted a simulation study with 40 conditions that varied in the amount and pattern of DIF, sample sizes, and ability distributions. We found that the effectiveness of refinement depended heavily on whether DIF was balanced (positive DIF values compensating for negative values) or unbalanced (all in one direction). In balanced conditions, the unrefined method generally produced better results, whereas in unbalanced conditions the opposite was true. In the absence of information about the pattern of DIF, the refined method is probably the better choice: it is only slightly disadvantageous in balanced conditions, whereas the unrefined method can be substantially disadvantageous in certain unbalanced conditions.
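To make the refinement procedure concrete, the following is a minimal illustrative sketch, not the study's actual design: it simulates Rasch-model responses for a reference and a focal group with one item showing unbalanced DIF, estimates a Mantel-Haenszel common log odds ratio per item using an unrefined total-score matching criterion, and then re-estimates after dropping flagged items from the matching score. All settings (20 items, 2,000 examinees per group, a 0.8-logit DIF effect, a 0.4 flagging threshold) are arbitrary choices for the sketch.

```python
import numpy as np

rng = np.random.default_rng(0)

def simulate(n, difficulties, theta_mean=0.0):
    """Generate 0/1 responses under a Rasch model."""
    theta = rng.normal(theta_mean, 1.0, size=n)
    p = 1.0 / (1.0 + np.exp(-(theta[:, None] - difficulties[None, :])))
    return (rng.random((n, difficulties.size)) < p).astype(int)

n_items = 20
b_ref = np.linspace(-1.5, 1.5, n_items)
b_foc = b_ref.copy()
b_foc[0] += 0.8  # unbalanced DIF: item 0 is 0.8 logits harder for the focal group

ref = simulate(2000, b_ref)
foc = simulate(2000, b_foc)

def mh_log_or(ref, foc, match_mask):
    """Mantel-Haenszel common log odds ratio per item, stratifying
    examinees by total score on the items selected by match_mask."""
    out = np.zeros(ref.shape[1])
    s_ref = ref[:, match_mask].sum(axis=1)
    s_foc = foc[:, match_mask].sum(axis=1)
    for j in range(ref.shape[1]):
        num = den = 0.0
        for k in np.union1d(s_ref, s_foc):
            r, f = ref[s_ref == k, j], foc[s_foc == k, j]
            t = r.size + f.size
            a, c = r.sum(), f.sum()       # correct responses: reference, focal
            num += a * (f.size - c) / t   # A*D/T
            den += (r.size - a) * c / t   # B*C/T
        out[j] = np.log(num / den) if num > 0 and den > 0 else 0.0
    return out

# Unrefined: match on the total score over all items.
unrefined = mh_log_or(ref, foc, np.ones(n_items, dtype=bool))

# Refined: drop items flagged in the first pass from the matching
# score and re-estimate (0.4 is an arbitrary threshold for this sketch).
keep = np.abs(unrefined) <= 0.4
refined = mh_log_or(ref, foc, keep)
```

Because only one of the twenty items carries DIF here, the unrefined criterion is only mildly contaminated; the contrast between the two estimates grows as more unbalanced DIF accumulates in the matching score, which is the situation where the abstract reports refinement paying off.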