Note: This very brief analysis is something fun that I mainly wrote to organize some thoughts and explore shower thoughts about something like waist circumference and marijuana use. This is not to be taken as conclusive science. See limitations at the end.
As in a previous post, I am revisiting a project to draft for journal submission. The NHANES data from the Centers for Disease Control and Prevention is a treasure trove of health data, from social factors to biomarkers. It's an amazing study in that the CDC actually physically examines respondents in a mobile examination unit. Also amazing is that the CDC maintains a running probability sample so that researchers can create population estimates. Very cool.
This exploratory analysis was inspired by a previous study where scholars found a negative correlation between BMI and marijuana use, which was somewhat counterintuitive to me. I often imagine people who smoke a lot of marijuana getting the munchies and eating Doritos Locos Tacos six times a day. However, perception is not always reality. That's why data exists!
NHANES has several modules. For our purposes, we will assemble Demographics, Mental Health Screening, Medical Conditions, Drug Use, Alcohol Use, and Sexual Behavior. I am going to conduct analysis of BMI at a later date.
Each module is available in two-year sets. Since we are looking at 2009-2012, the data must be merged and appended with special attention to changes in variables and their coding. As mentioned, NHANES uses a complex probability sample and survey design, so the data are weighted accordingly.
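To make the merge-and-append step concrete, here is a minimal sketch in plain Python. It assumes each module has already been read into a list of records keyed by `SEQN` (the NHANES respondent ID) and that the two-year exam weight is `WTMEC2YR`. The helper names, the `mec_weight_combined` column, and the toy values are my own illustration, not the code used for this post. The weight adjustment follows the usual NHANES guidance of dividing the two-year weight by the number of pooled cycles.

```python
def merge_modules(*modules):
    """Inner-join lists of records on the respondent ID, SEQN."""
    merged = {row["SEQN"]: dict(row) for row in modules[0]}
    for module in modules[1:]:
        by_id = {row["SEQN"]: row for row in module}
        merged = {
            seqn: {**row, **by_id[seqn]}
            for seqn, row in merged.items()
            if seqn in by_id
        }
    return list(merged.values())


def append_cycles(cycles):
    """Stack two-year cycles and rescale the two-year MEC weight."""
    n_cycles = len(cycles)
    combined = []
    for label, rows in cycles.items():
        for row in rows:
            new_row = dict(row)
            new_row["cycle"] = label
            # Pooling k two-year cycles: divide the two-year weight by k.
            new_row["mec_weight_combined"] = row["WTMEC2YR"] / n_cycles
            combined.append(new_row)
    return combined


# Made-up toy records, not real NHANES values.
demo_0910 = [{"SEQN": 1, "WTMEC2YR": 30000.0}, {"SEQN": 2, "WTMEC2YR": 10000.0}]
drug_0910 = [{"SEQN": 1, "used_marijuana": 1}]
demo_1112 = [{"SEQN": 3, "WTMEC2YR": 24000.0}]
drug_1112 = [{"SEQN": 3, "used_marijuana": 0}]

pooled = append_cycles({
    "2009-2010": merge_modules(demo_0910, drug_0910),
    "2011-2012": merge_modules(demo_1112, drug_1112),
})
```

Respondent 2 drops out because they have no drug use record, which is the same mechanism that shrinks the analytic sample when modules are combined.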
A drawback here, however, is that these types of surveys typically do not have a large enough cell size of individuals identifying as a sexual minority. NHANES has a set of questions asking about sexual orientation. However, we are more interested in sexual behavior than in identity. Thus, we use questions from the Sexual Behavior Module to identify men who sleep with men (MSM) and females who sleep with females (FSF).
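A rough sketch of how the behavior groups and their dummy variables could be coded is below. The inputs `gender` and `same_sex_partner` are hypothetical, already-recoded fields; the actual Sexual Behavior Module items differ by cycle and are not reproduced here. Straight men are left out of the dummies so they can serve as the base group.

```python
def behavior_group(gender, same_sex_partner):
    """Classify a respondent by reported behavior, not identity.

    gender: "male" or "female" (hypothetical recode)
    same_sex_partner: True, False, or None when refused or missing
    """
    if gender not in ("male", "female") or same_sex_partner is None:
        return None  # cannot classify; excluded from the analytic sample
    if gender == "male":
        return "MSM" if same_sex_partner else "straight man"
    return "FSF" if same_sex_partner else "straight woman"


def dummy_variables(group):
    """Indicator columns with straight men as the omitted base group."""
    return {name: int(group == name) for name in ("MSM", "FSF", "straight woman")}
```

For example, `dummy_variables(behavior_group("male", False))` gives zeros in every column, which is what makes straight men the reference category.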
We wind up with a sample of 5,048 individuals who took part in both the mental health screening and completed the sexual behavior module.
The data are examined for normality and missing data. We utilize a selector variable for those who were screened for depression at a CDC mobile examination center (MEC) with weighting specific to that module.
To start the exploratory analysis, we look at depression. NHANES uses the PHQ-9 depression questionnaire, which was developed from DSM-IV criteria. It measures depressive symptoms over the two weeks prior to the exam. It is constructed from nine variables and placed on an ordered scale of 1 to 5 (no depression to severe depression).
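Here is one way the scoring could look. In the NHANES depression screener the nine items are `DPQ010` through `DPQ090`, each answered 0 to 3, with 7 and 9 reserved for refused and don't know. I am assuming the 1 to 5 scale corresponds to the standard PHQ-9 severity bands (0-4, 5-9, 10-14, 15-19, 20-27); that mapping is my reading, not something confirmed by the original analysis code.

```python
PHQ9_ITEMS = [f"DPQ{n:03d}" for n in range(10, 100, 10)]  # DPQ010 ... DPQ090


def phq9_total(record):
    """Sum the nine items; return None if any item is missing or refused."""
    scores = [record.get(item) for item in PHQ9_ITEMS]
    if any(score not in (0, 1, 2, 3) for score in scores):
        return None  # 7 (refused), 9 (don't know), or absent
    return sum(scores)


def severity_level(total):
    """Map a 0-27 total onto an ordered 1-5 scale (assumed standard bands)."""
    if total is None:
        return None
    for level, upper in enumerate((4, 9, 14, 19, 27), start=1):
        if total <= upper:
            return level
    raise ValueError(f"PHQ-9 total out of range: {total}")


# Made-up respondent who answers 1 on every item.
respondent = {item: 1 for item in PHQ9_ITEMS}
level = severity_level(phq9_total(respondent))
```

Treating any refused item as a missing total is the conservative choice; other analysts impute or prorate instead.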
We can then examine variation of depressive symptoms among our sample using a two-tailed t-test.
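For readers who want to see the mechanics, below is a bare-bones Welch t statistic in plain Python. It is a simplification: it ignores the survey weights and design entirely, so it is not the design-based test one would report from NHANES, and the two score lists are invented.

```python
import math
from statistics import mean, variance


def welch_t(a, b):
    """Return (t, df) for Welch's unequal-variance two-sample t-test."""
    var_a, var_b = variance(a), variance(b)  # sample variances
    n_a, n_b = len(a), len(b)
    se_sq = var_a / n_a + var_b / n_b
    t = (mean(a) - mean(b)) / math.sqrt(se_sq)
    # Welch-Satterthwaite approximation to the degrees of freedom.
    df = se_sq ** 2 / (
        (var_a / n_a) ** 2 / (n_a - 1) + (var_b / n_b) ** 2 / (n_b - 1)
    )
    return t, df


# Invented depression levels for two groups, for illustration only.
base_group = [1, 2, 3, 4, 5]
comparison_group = [2, 3, 4, 5, 6]
t_stat, dof = welch_t(base_group, comparison_group)
```

The two-tailed p-value comes from comparing the absolute value of `t_stat` against a t distribution with `dof` degrees of freedom, which in practice means reaching for a statistics library.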
We already have an interesting result. Our dummy variables use straight men as a base group. Across the spectrum of depression, MSM are significantly different from the comparison group.
To further explore, we fit a logistic regression with the response variable of clinical depression.
What our initial t-test didn't show us is that MSM, FSF, and straight women are all much more likely than heterosexual men to be classified as depressed once we factor in education, income, age, self-reported days feeling anxious, and marijuana use. Also of interest is the interaction between FSF and marijuana use, which was not present for other groups who use marijuana.
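To show how an interaction term like FSF by marijuana use feeds into a predicted probability, here is a small sketch. The coefficients are hypothetical numbers chosen for illustration, not the fitted estimates from this analysis, and the model is cut down to two predictors plus their interaction.

```python
import math


def logistic(z):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical log-odds coefficients; NOT the estimates from this post.
COEFS = {"intercept": -2.0, "fsf": 0.8, "marijuana": 0.1, "fsf_x_marijuana": 0.6}


def predicted_probability(fsf, marijuana, coefs=COEFS):
    """Probability of depression for 0/1 indicators fsf and marijuana."""
    log_odds = (
        coefs["intercept"]
        + coefs["fsf"] * fsf
        + coefs["marijuana"] * marijuana
        + coefs["fsf_x_marijuana"] * fsf * marijuana
    )
    return logistic(log_odds)


def odds_ratio(coef):
    """Exponentiate a log-odds coefficient to get an odds ratio."""
    return math.exp(coef)


# With an interaction, the marijuana effect differs by group:
or_marijuana_base = odds_ratio(COEFS["marijuana"])
or_marijuana_fsf = odds_ratio(COEFS["marijuana"] + COEFS["fsf_x_marijuana"])
```

The point of the sketch is that the marijuana odds ratio for FSF is built from the main effect plus the interaction coefficient, so it cannot be read off a single row of the regression table.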
We can visualize probability of depression over level of education.
We can visualize in greater detail by sexual behavior.
We can then do the same for household income level.
Interestingly, self-reported days feeling anxious is a stronger predictor of depression among MSM than other groups.
Since NHANES has a lot of data on substance use, I'm interested in who is more likely to smoke marijuana, how frequently, and maybe see what kinds of contexts might predict who smokes weed. We now fit a logistic regression with response variable of current marijuana user.
It's interesting that higher education levels are a strong predictor of marijuana use. FSF, MSM, and straight women are also more likely to smoke marijuana than straight men. Self-reported days feeling anxious is a predictor across the board, with one exception.
This is somewhat surprising to me. This interaction indicates MSMs who report more days feeling anxious are less likely to smoke marijuana while the opposite holds true for FSF, along with straight men and women. If we visualize this broken down by level of education, we get more interesting results.
We can then run an ordinal logistic regression (or, if we ignore the ordering of the categories, a multinomial softmax regression) to examine frequency of marijuana use broken down to every day, weekly, a few times a month, or once a month.
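As a sketch of what the ordinal version does under the hood, the snippet below turns a linear predictor into probabilities for the four frequency categories using the proportional-odds form, where P(Y <= k) = logistic(cutpoint_k - eta). The cutpoints and predictor values are invented. A softmax model would instead estimate a separate set of slopes for each category and would not use cutpoints at all.

```python
import math

# Ordered from least to most frequent use.
CATEGORIES = ["once a month", "a few times a month", "weekly", "every day"]


def logistic(z):
    """Inverse logit: convert log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-z))


def category_probabilities(eta, cutpoints):
    """Proportional-odds model: P(Y <= k) = logistic(cutpoint_k - eta).

    eta is the linear predictor; cutpoints must be increasing and have
    one fewer entry than there are categories.
    """
    cumulative = [logistic(c - eta) for c in cutpoints] + [1.0]
    probs, previous = [], 0.0
    for value in cumulative:
        probs.append(value - previous)  # difference of adjacent cumulatives
        previous = value
    return dict(zip(CATEGORIES, probs))


# Invented cutpoints and predictor values, for illustration only.
CUTPOINTS = [-1.0, 0.0, 1.0]
low = category_probabilities(0.0, CUTPOINTS)
high = category_probabilities(2.0, CUTPOINTS)
```

Raising `eta` shifts probability toward "every day", which is how a single positive coefficient plays out across all four categories in an ordinal model.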
We can tell by the slopes for weekly and a few times a month that there isn't a lot of variation between MSMs and the rest of the sample. However, it's interesting to examine the difference between those who smoke weed every day versus those who smoke it weekly. I do not have an explanation, but I hypothesize that it might have something to do with tolerance to THC.
This exploratory analysis is interesting, but it has its limitations. One is the low cell count for sexual minorities, which makes the complex weighting scheme less robust in providing population estimates. Further, a large swath of respondents refused to answer sexual behavior questions.
Further analysis will likely have to look at groups in the sample with larger cell counts and not focus primarily on sexual minorities. Scholars and researchers are beginning to include more sexual minorities in their studies, but it will be several years before we have enough data to begin to better understand how sexual behavior might play into proclivity to use marijuana or other drugs and how that might affect mental health, anxiety, and depression.