The equal-weights e-rater score showed the same high reliability but a significantly lower correlation with essay length. It is also aligned with the three-factor hierarchical structure (word use, grammar, and discourse) that was discovered in the factor analysis. Both e-rater scores successfully replicate human score differences between countries and prompts.
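As an illustration of what an equal-weights composite means in general, the following minimal sketch standardizes each feature across essays and averages the resulting z-scores with identical weights. It is not the e-rater implementation: the feature names, the toy values, and the choice of z-score standardization are all assumptions made for this example.

```python
from statistics import mean, pstdev


def standardize(values):
    """Convert raw feature values to z-scores (population standard deviation)."""
    m = mean(values)
    s = pstdev(values)
    if s == 0:
        # A constant feature carries no information; map it to zeros.
        return [0.0 for _ in values]
    return [(v - m) / s for v in values]


def equal_weights_scores(feature_columns):
    """Average the standardized features per essay, giving each feature the same weight.

    feature_columns maps a (hypothetical) feature name to a list with one value per essay.
    """
    z_columns = [standardize(col) for col in feature_columns.values()]
    n_essays = len(z_columns[0])
    return [mean(col[i] for col in z_columns) for i in range(n_essays)]


# Toy data for three essays; names and numbers are illustrative only.
features = {
    "grammar": [0.2, 0.5, 0.8],
    "word_use": [1.0, 2.0, 3.0],
    "discourse": [10.0, 20.0, 30.0],
}
scores = equal_weights_scores(features)
```

Because every feature contributes equally after standardization, no single feature (for example, one that tracks essay length closely) can dominate the composite, which is one plausible reading of why such a score would correlate less with length.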