Thanks again for all the comments. I have snatched a little time, so I will have a go at continuing the discussion... These are my current thoughts and reactions. I would be most grateful for further feedback, and especially for any references/texts to back up your ideas, so I can do some further reading to assist my understanding of these critical issues.

In response to Ben Anderson (3/9):

You mention three issues:
(1) How representative is my sample?
(2) What analysis can I do with my sample?
(3) Can I say anything about the larger population from my analyses?

As to representativeness, it's not possible to know how representative my sample is of the target group: Australians who have recently used 'party drugs' (ecstasy, methamphetamine, etc.) AND who use online forums / internet message boards. This subgroup of the wider population who have recently used party drugs is likely to differ on key characteristics, and I can discuss those, but I can't know the biases of this sample in comparison to the whole target group because that group has never been systematically studied.

As for the analysis, my concern is this: although many papers are published in this field where analyses of association (e.g. regression) or difference (e.g. ANOVA) are conducted on samples like mine (not randomly selected), psych and stats textbooks stress the critical assumption of having a sampling frame if one wants to apply the logic of probability. As I understand it, a simple test like a t-test asks, for example: is the difference between scores for (say) males and females large enough that I can reject the null hypothesis that males and females in the wider population do not differ on this specific score, at the 5% (or 1%) significance level (assuming the sample is randomly selected from the wider population)? If I am conducting an exploratory study where little is known about the wider population and I can't randomly sample from it, what use is this t-test?
Surely I should just give the two different scores for males and females. As an exploratory study, this information is still useful, as it provides a starting point for further research.

A reading I found incredibly useful was:

Berk, R. A., & Freedman, D. (2003). Statistical assumptions as empirical commitments. In T. G. Blomberg & S. Cohen (Eds.), Law, punishment, and social control: Essays in honor of Sheldon Messinger (2nd ed., pp. 235-254). New York: Aldine. (Available to read through Google Books.)

One relevant quote is: "the moment that conventional statistical inferences are made from convenience samples, substantive assumptions are made about how the social world operates. Conventional statistical inferences (e.g., formulas for the standard error of a mean, t-tests) depend on the assumption of random sampling. This is not a matter of debate or opinion; it is a matter of mathematical necessity. When applied to convenience samples, the random sampling assumption is not a mere technicality or a minor revision on the periphery; the assumption becomes an integral part of the theory" ... The rest of the article is well worth the read!

In response to Ben Anderson (7/9):

Re confidence intervals: I've definitely read many cautions about applying confidence intervals to purposive/non-random samples. These margins of error are based on specific statistical assumptions that don't appear to hold for non-random samples. So, yes, I agree we need to move away from just looking at the p value toward reporting confidence intervals and effect sizes, but reporting these for convenience samples does not appear to be sound.

An article I read that seeks to "compare the characteristics of a self-selected, convenience sample of men who have sex with men (MSM) recruited through the internet with MSM drawn from a national probability survey in Great Britain" provides an example of this thinking.
They calculate confidence intervals for the probability sample but do not present CIs for the convenience sample, stating: "CIs for the internet percentages were narrow, and are not presented here because they add little to the interpretation of the data from this non-probability sample." I intend to do a similar analysis to compare my sample with a probability sample of Australian 'party drug' users.

Evans, A. R., Wiggins, R. D., Mercer, C. H., Bolding, G. J., & Elford, J. (2007). Men who have sex with men in Great Britain: Comparison of a self-selected internet sample with a national probability sample. Sexually Transmitted Infections, 83(3), 200-205.

In response to jeremy hunsinger:

"Then you have to say... do i want to make inferences about my sample as population, my sample as representative of a larger population, or my sample as representative of the world. each of those three questions will take you toward slightly different answers."

This is a helpful way for me to think about the issue. I have no desire to make inferences about the world or the larger population from my sample; the study was always about recruiting a specific sub-population of drug users that had not been studied before.

"However, if your sample was large enough. you could treat it as a population and subsample it to make inferences amongst its differences."

The sample is N=837, so it is "large enough", but I'm still stumped about what meaning significance tests would have in this case. Rest assured, I have conducted many statistical tests on the data to explore it, and most of them show 'significant' associations or differences, but I'm stuck on how to interpret them.

...continued in part 2 (email was too long!)