A series of simulations was carried out to help diagnose and interpret the equating differences found between random and matched (nonrandom) samples for four commonly used equating procedures: Tucker linear observed-score equating, Levine equally reliable linear observed-score equating, equipercentile curvilinear observed-score equating, and IRT curvilinear true-score equating. The results support the theoretically based prediction that observed-score equating methods are more affected by sample variation than are true-score equating methods. They further suggest that matching equating samples on the basis of fallible measures of ability may not be advisable for any conventional equating method except the Tucker method. In addition, the results support a particular hypothesis about IRT equating, suggesting that the use of matched samples cannot be recommended for this equating method either. (72 pp.)