The DMS has shown consistently acceptable reliability, whether using the original 15-item version or the 14-item version, which omits the question about possible future anabolic steroid use. Among male respondents, alpha reliability estimates for the DMS have ranged from .85 to .91, both in published reports and in conference presentations that have not yet been published. When female respondents have completed the DMS, reliability estimates have been above .80. However, coefficient alpha is not the only indicator of acceptable reliability. Corrected item-total correlations of DMS items have ranged from .37 to .65, which is well within the range recommended by Nunnally and Bernstein (1994). Finally, Cafri and Thompson (2004) reported high 7- to 10-day test-retest correlations in a sample of men: .93 for the entire scale, .84 for the muscularity attitudes subscale, and .96 for the muscularity behaviors subscale.
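For readers who want to compute the two reliability indices mentioned above on their own data, the following is a minimal sketch in Python. It assumes the standard textbook formulas for coefficient alpha and for corrected item-total correlations (each item correlated with the sum of the remaining items). The function names and the small response matrix are hypothetical and serve only as illustration; they are not DMS data and are not taken from any of the studies cited.

```python
from math import sqrt
from statistics import mean, variance


def cronbach_alpha(rows):
    """Coefficient alpha for a list of respondents, each a list of item scores."""
    k = len(rows[0])                       # number of items
    items = list(zip(*rows))               # one tuple of scores per item
    item_var_sum = sum(variance(col) for col in items)
    total_var = variance([sum(r) for r in rows])
    return k / (k - 1) * (1 - item_var_sum / total_var)


def pearson(x, y):
    """Pearson correlation between two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def corrected_item_total(rows):
    """Correlate each item with the total of all *other* items."""
    items = list(zip(*rows))
    result = []
    for i, col in enumerate(items):
        rest = [sum(r) - r[i] for r in rows]   # total score excluding item i
        result.append(pearson(col, rest))
    return result


# Hypothetical responses: 5 respondents x 4 items (illustration only).
responses = [
    [1, 2, 1, 2],
    [3, 3, 2, 3],
    [4, 5, 4, 4],
    [2, 2, 3, 2],
    [5, 6, 5, 6],
]

print(round(cronbach_alpha(responses), 2))
print([round(r, 2) for r in corrected_item_total(responses)])
```

The item is removed from the total before correlating so that it does not inflate its own item-total correlation, which is what "corrected" refers to.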
There are several types of scale validity to be considered when evaluating a measurement tool. These include construct validity, concurrent validity, convergent validity, and discriminant validity. Each will be addressed here.
Construct validity can be determined in several ways, including analyses of a scale's factor structure and potential contamination from social desirability biases. With regard to the DMS factor structure, research conducted by McCreary et al. (2004) has shown that, for males, the DMS has two lower-order factors: muscularity-related attitudes and muscle-enhancing behaviors. Those two lower-order factors also load onto a single, higher-order Drive for Muscularity factor for men. For women, the two subscales do not emerge from factor analyses. Thus, for men, researchers can compute separate attitude and behavioral subscale scores as well as an overall DMS score, but for women, only the overall DMS score can be computed. In the only study to explore the association between socially desirable response biases and the DMS, Duggan and McCreary (2004) asked a self-selected sample of heterosexual and homosexual men to complete Paulhus's Balanced Inventory of Desirable Responding in addition to the DMS. There were no significant correlations between the two measures for either group of men.
Concurrent validity assesses the extent to which DMS scores differ between groups that the scale should, in theory, be able to distinguish (i.e., by using a known-groups procedure). Gender is one known-groups comparison with which to test the concurrent validity of the DMS. Gender differences have been found both for the overall DMS scale score and for many of the individual DMS items. Where the differences were significant, men scored higher than women. A second known-groups comparison is between those who weight train and those who do not. While McCreary and Sasse (2000) showed a positive correlation of .24 between DMS scores and the number of times each week the respondents typically engaged in weight training activities, other researchers (e.g., Rutsztein, 2004) have observed that men and women who weight train (either regularly or intermittently) tend to score significantly higher on the overall DMS than those who do not weight train. Finally, a comparison can be made between weight trainers who abuse anabolic-androgenic steroids (AAS) and those who do not. Choi, Pitts, and Grixti (2005) showed that AAS users scored significantly higher on the DMS than the non-AAS-using group.
Convergent validity examines the degree to which the DMS is associated with constructs with which it theoretically should be associated. Research by Cafri and Thompson showed that ratings on the DMS were uncorrelated with ratings on a muscular-based figure silhouette scale. Baxter and von Ranson, however, found that scores on the DMS were positively correlated with scores on a modified version of the Swansea Muscularity Attitudes Questionnaire (i.e., modified to be suitable for use by both men and women). The DMS also should be negatively associated with self-esteem. This has been found in samples of men (e.g., Duggan & McCreary, 2004; McCreary & Sasse, 2000; Jacobs et al., 2004), but not in women. The DMS also should be correlated with various dimensions of personality. Holden et al. (2002) showed that the DMS was positively correlated with appearance orientation in a sample of college males. This study was replicated and extended by Davis et al. (2005), who showed that scores on the DMS were positively associated with neuroticism, self-oriented perfectionism, appearance orientation, and fitness orientation. Finally, scores on the DMS should be correlated with measures of masculine-typed gender role socialization. This has been demonstrated in two studies (Mahalik et al., 2003; McCreary et al., 2005).
Discriminant validity explores the degree to which the DMS is uncorrelated with measures with which it should not, theoretically, be correlated. Because the drive for muscularity is not considered to be the opposite of the Drive for Thinness, DMS scores should not be negatively correlated with scores on measures such as the EAT or EDI. However, because muscle is situated underneath body fat, people who want to show off their muscularity also will need to have a low percentage of body fat. This means that DMS scores should be positively correlated, but only to a small extent, with those from the EAT and EDI. To date, several studies have shown that the correlations between these two types of measure are approximately .30 to .40 (r² = 9% to 16%) (e.g., McCreary & Sasse, 2000).
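The shared-variance figures quoted above follow directly from squaring the correlation coefficient. As a brief sketch (the function name is a hypothetical helper, not part of any scoring procedure for the DMS):

```python
def shared_variance_pct(r):
    """Percentage of variance two measures share, given their correlation r."""
    return round(r ** 2 * 100, 1)


# The correlations reported in the text: .30 and .40.
for r in (0.30, 0.40):
    print(r, shared_variance_pct(r))
```

Even at the upper end of the reported range, the two types of measure leave most of their variance unshared, which is consistent with treating them as related but distinct constructs.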