Science of Neurosurgical Practice
Practical Statistics: Choosing the Right Test
Video Transcription
We're going to move on a little bit beyond the two-by-two table, but not too much. And at the end of this sort of short talk, we'll come back to the two-by-two table. But we're going to do a little bit more of what people think of as hardcore statistics. So at this point in the day, you're probably thinking, well, we've heard about two-by-two tables, and we've talked about relative risk, and we've talked about odds ratios, and they haven't talked about the t-test. Where's the t-test? Because people think about statistics, and they think about the t-test. So we're going to talk about the t-test and some other stuff now. And I'm going to do it, again, in the context of a clinical question. Not that I'm really particularly interested in dementia, although as I get older, I become more interested. But memantine is a drug that's been of interest in neuro-oncology as well. So that was my motivation for this. So here's my PICO format question, and this is going to be the theme during the course of our talk about higher-level statistics. So for patients with advanced Alzheimer's disease, does memantine versus placebo versus no memantine improve cognitive function or delay time to cognitive decline? So that's my question. And I did a literature search, interested in this drug, and came up with a bunch of articles, but in particular, this one from a reasonably reputable journal. And it turns out that this is a randomized controlled trial with concealed allocation and masked outcome assessment and all of the features that you'd attach to a well-designed and well-run clinical trial. And so it seemed promising that we were going to get some interesting information from this. And so after going through the sort of critical appraisal, I went looking for the two-by-two table, or the equivalent, you know, the data, and this is what I found. So I mean, what is it? 
They're doing this thing called the CIBIC-Plus, which initially I thought was a Honda model, and then they've got these two curves. And it looks like the drug is doing better than placebo, but I don't know. What are all these numbers, and where's the two-by-two table? So this is how we're going to get into these statistics. And this is all a question of variables, okay? And we've already sort of broached this topic of variables, variables meaning quantities that can take on more than one value, they can be variable. And in evidence-based practice, we're interested in two types of variables. We're interested in the independent variable, which has different names depending on what type of study that you're interested in. So it can be the intervention in a therapeutic trial, or a predictor if you're looking at a risk study, or a risk factor. Occasionally, you hear the title explanatory variable, or it can be the result of a diagnostic test if you're doing a diagnostic study. So the independent variable. And then the outcome variable, or the dependent variable, okay, that also can assume different values, and so it's a variable. And all of biostatistics beyond the two-by-two table depends just on identifying the type of variable. That's all there is to it. So we're going to spend a little time doing that. And then all of this kind of stuff that seems like statisticians using technical language to befuddle you just goes away. The fog just disappears. So all this stuff about Wilcoxon tests, and Mann-Whitney tests, and this and that, all will evaporate, okay? So here are the two variables in this study that I was interested in. So tell me something about these variables, okay? I've given you sort of which one is the dependent and which one is the independent. Let me first tell you what I want to know. So there are three key questions that we're going to go over over the course of the next maybe 20 minutes. The first one is, what type of variables are we using, okay? 
What's the type of variable? Once we've identified that, we're going to try and describe the variables in terms of quantitatively, and there are two components to that. There's the central tendency and the dispersion, okay? So we're going to identify the independent and dependent variables, and we're going to decide what type of variables they are. Then we're going to describe them mathematically, and then we're going to ask, and this is where all those terms come in, we're going to ask, what's the way that you assign a measure of association between the two variables, and how do you describe the precision, okay? So we're going to ask those three two-part questions, and at the end of that, all of statistics beyond the two-by-two table will be really clear, okay? So we're going to start with naming the variables, okay, and they're really in a sense two types of variables. There are these things that people call interval or continuous variables, and these are things like temperature or blood pressure or heart rate or time. They can assume any value within an interval, and they have some very neat characteristics that we'll look at, but you sort of get the idea of an interval variable, height, another one, blood pressure, and then categorical or discrete variables, and these can only assume discrete. There's a discrete set of values that they can assume, okay, discrete variables, and they can either be ordinal, where there are multiple ordered sets. So the MRC scale for muscle grading is the one that I think we're most familiar with. So you can only have six assigned categories or values. And there's an order to them, okay, they're ordinal, or nominal, where they're non-ordered categories. One that people talk about a lot is race, okay, and we'll use that as an example. 
And a special subset, the subset that actually is really the most clinically useful often and that EBM just loves is these dichotomous or binary kind of nominal data, so there's yes, no, alive, dead, improved, didn't improve, okay, so that's it. That's the range of variable types, okay, and we'll go over them in a little more detail. So here are the characteristics of an interval set of data, okay, and it's important because it sort of will dictate to you how you describe the data mathematically and then how you operate on it statistically, so for example, blood pressure. So it makes sense to have any value in that range, right, you can have a blood pressure of 103 or 177, so it's interval in that sense, but equally important, the distance between, say, 75 and 50 is exactly the same as the distance between 125 and 100. You can count on that, right, 200 to 150, same as going from 100 to 50, right. So those are the two key characteristics of interval data, and you don't see that in ordinal data, okay, so here you have an order, so one is weaker than three, and three is always weaker than five, you can't mix up the numbers, but you don't know what really the distance between the numbers means. In fact, if you were going, if you wanted to know, in this case, you could maybe make an assessment if you were sort of testing muscle strength with a dynamometer, something that has interval characteristics, you could actually measure resistance or pressure, you would find that you lose 70 to 90% of your strength just going from five to four, okay, so the intervals are not equal, and you can't tell what the distance between the intervals are, there's no obvious way of doing that, okay, but you know just by definition that the distance between one and two is not the same as the distance between three and four, so that's an interval, an ordinal set of data. 
And then nominal, and the easiest one of these, if we go back to our original example of the Bell's palsy, is good, bad, abnormal, normal, improved, didn't improve, lived, died, okay? So they're the three sets, the three types of variables. So now tell me what kind of comparisons you can do for the different ones, so for nominal, ordinal, and interval. Which of those can you say same or different? Can you say same or different for nominal data? Sure. How about for ordinal? For interval? Same or different, can you make that determination? Yeah, so you can do that for any of these, right? That's the very basic. How about greater than or less than for nominal? Not really. How about ordinal? You can. And for interval? Do you see a pattern here? Which is conveying the most information, which type of data? Yeah. So without even thinking, what's going to happen when we say magnitude of difference? You can't do that for nominal or ordinal, but you can for interval, okay? So these are the kind of comparisons that you can do for the different types of data. So let's go back to our original, this kind of theme that we're going to use for this article. What type of data is the intervention, the memantine? Yeah, so that's going to be drug, placebo, yes, no. And how about the outcome? You might have to guess, because like me, you may not be familiar with the CIBIC. Yeah, it sort of looks like it's going to be, so nominal data for the drug, and I would say almost all of the intervention trials of clinical interest are going to have an independent variable that's nominal. That's almost always the case, so we can pretty much assume that. So here's the CIBIC-Plus scale, right, and this is basically the clinician interviews the patient, and then the caregiver has some input and you come up with this seven point scale that looks sort of a little bit like the MRC scale, right? So what kind of scale would you call this? Yeah, that's what I would say too, so we've got a nominal to ordinal 
comparison here, does that make sense, good, so that's really it, we've gone through independent and dependent variable, and what type of, what are the choices you can have for variables, so now we're going to talk about how to describe, we're going to quantitate the variables, and I promise, I know this is really dry, I can't think of a way to make it as exciting as diagnostic tests, but at the end it will be worth it, so we're going to think about how to describe the distribution of the values of these variables, okay, and there are two components to that, the central tendency and the dispersion, and then we'll end up with the statistics, so for nominal data, for example, so as you think about the race distribution in a local high school, okay, and I've sort of arbitrarily chosen this order, okay, but you could just as easily shuffle them around because there's no sense that one is higher or lower or better or worse than the other, right, that's a property of nominal data, okay, and so there's no fixed order, and I'm just going to put some numbers in, okay, so now you tell me, how would you describe for nominal data the central tendency and the dispersion, what would you guess? So these are names that you've learned in third grade, and since nominal data is a little bit of an oddball, the central tendency is one of the three that you've probably never spoken since third grade, how do you describe the central tendency in nominal data? You've got to really dig back, do you want me to tell you or do you want to, what are the possibilities? I'm sorry? Yeah, yeah, so that's how you describe it, the central tendency and the dispersion, and the central tendency is the mode for nominal data, and the dispersion, there are a couple of ways of doing this, but what do people typically do? 
In a sense, exactly, the proportions, the ratios, so the percentage of people that have the most common outcome, and you can percentage them all, okay? So the mode, the most frequently observed category, and the proportion, the percentage, just as you said, okay? So that's how you describe central tendency and dispersion. How about ordinal? This is our example where there's a fixed order, right, but the distance between is really unknowable, it makes no real sense, okay? This is actually closer to the truth. So let's say for this example we'll stick with the MRC scale and you're interested in intensive care polyneuropathy and you've decided as your outcome measure that you're going to measure knee extension, okay? So you're interested in quadriceps strength, and you grade a handful of patients on the MRC scale and here's your data, so this looks like the picture that we had before. Now you want to again describe the central tendency and the dispersion. How are you going to do that? You've only got two choices now for central tendency. What's the right one? Yeah, yeah, exactly, this is the median. You can't use the mean, as we'll see, because the mean is an average, right, and you can't average because you really have no idea where the numbers go on the scale. But you can pick out the median, right, that number which has 50% of the events below and 50% above. And then the dispersion, how do we characterize the dispersion of the data around the median? Also a couple of acceptable ways. You see this all the time when you look at articles, I'm sure. I'm sorry? So the interquartile range, yeah, exactly, that's what most people would do, right? You do 25% above and 25% below the median, so the 25th to 75th percentiles, exactly, so median and this interquartile distance. Some people do median and range, which would be fine, and we saw a talk earlier where Dr. 
Barker was talking about median and 95% intervals. All of those are acceptable and give you a sense of the dispersion of the data, okay? This is for ordinal data. Good, so ordinal data. How about interval? Let's say, now we're back to the Bell's palsy, I'm sorry, but that's what we're doing, and now instead of our outcome measure being improvement or not improvement or better or worse or good or bad or whatever, now let's say you're interested in the time to improvement, and you're going to follow these patients for 10 years, and you're going to see how long it takes for them to improve maximally, okay? So now the scale is days or weeks, and that's a variable that has interval qualities, right, because two weeks is greater than one week, and the distance between two and three weeks is the same as the distance between five and six weeks. So this is an interval set of data. And now, you've already mentioned, how are we going to describe the central tendency for interval data? Yes, sir. And how about the dispersion? I'm sorry, I'm not hearing so well. Standard deviation, right, exactly, mean and standard deviation. And you remember the standard deviation: you take the difference of each value from the mean, square it, take the sum of those divided by the number of data points, and then take the square root of that. You could also, if you wanted, calculate this thing called the standard error, which is the standard deviation divided by the square root of the number of data points. It gives you a sense of the dispersion of the means, the same way that the standard deviation gives you a sense of the dispersion of the data. So either of those is acceptable, people will use either of them, okay? So just to summarize, for nominal data, mode and proportions; ordinal, we do median and percentiles; and interval, mean and standard deviation. Is that okay? I'm thinking if I talk fast, you won't be bored as long, so that's why I'm going this quickly. Here's a neat thing, you can convert 
one of these into another, but only in one direction, what drives that direction, so you can convert interval data into ordinal data, or ordinal into nominal, why can't you go the other way, why can't you convert nominal data into ordinal or interval, why is this a one-way street, so this is kind of an important concept, we're going to ask both questions, we're going to ask why would you want to do it in the first place, and why can't you do it the other way, so interval data contains more information, right, it contains the information that you know the exact distance between two points, and you don't have that in any, so you can't go backwards, you can't add information that you don't have, and just as a preview, this is why, this is the problem when, for example, do you guys get the, I get this like, I think everyone else in the department gets it like every six months, Dr. Harbaugh sends me my Press Ganey reports weekly, so, but you see how they, if you've seen those for yourself or the department, they give you means and standard deviations or percentages, what kind of data is, I don't know if you've seen these, it's, you get rated poor, fair, good, very good, excellent, something like that. What kind of data is that? Yeah, that's ordinal data. And they're telling you means and standard. So they're adding information that they don't have. And we'll talk about why you get into trouble with that. But just the short answer is that a very small difference that they report out, a very small difference in your mean, for example, they could report out as a 20% difference in your rating. And that's completely not right. Here's how you would, and you tell me why. Now I'm going down the scale, like we said. We're going to convert what we said was what was ordinal data, MRC scales. And I'm going to convert this into what type of data? Nominal. Why would I do that? Why would I lose information? Here's how I've decided to do it. 
I've decided that I'm going to make the cut at antigravity strength. And if you have antigravity strength or greater in your quadriceps, that's going to be one group. And if you don't, it's another. Why would you ever want to lose information like that? To make it easier to interpret. Yeah, the ease of interpretation. So now if you have, and oftentimes this is more clinically relevant. Antigravity strength may mean the difference between the ability to walk and not walk. And so although you lose information always when you do that, you often gain in clinical relevance or the ability to interpret data. So that's how you do it. And there's a good reason. And you can percentage that out just like we did when we were doing the Bell's palsy example. And then you can manipulate this data the same way that you do with that data. So anyway, the ease of interpretation is the reason that you might want to go from interval to nominal or ordinal to nominal. And that's done really frequently. You lose information always when you summarize data that way. But it's often easier to interpret. So what did these guys do in their study? What do you think? So they have two variables. Tell me first about which variable is that, the drug or placebo, which is that, dependent or independent? That's our independent variable. And what type of data is it? You said nominal. So are they expressing the nominal data in the way that we suggested they should? They're not actually telling us the mode, but they're giving us the percentages. And you can see what the mode is just by looking at the data. That's sort of what we suggested. How about the CIBIC-Plus score? What type of data, again, is that dependent variable? So it's like the MRC scale. And how do you want to convey ordinal data, the central tendency and the dispersion? Yeah, and they haven't done that, right? They've conveyed this as a mean and a standard error, standard deviation. Why would someone do that? 
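As a rough illustration of the descriptive measures discussed so far (mode and proportions for nominal data, median and interquartile range for ordinal data, mean, standard deviation and standard error for interval data) and of the cut at antigravity strength, here is a minimal Python sketch. The function names, the made-up MRC grades, and the choice of grade 3 as "antigravity" are assumptions for illustration only; none of this comes from the trial being discussed.

```python
from collections import Counter
from math import sqrt
from statistics import mean, median, quantiles, stdev


def describe_nominal(values):
    """Nominal data: the mode and the proportion in each category."""
    counts = Counter(values)
    mode_category, _ = counts.most_common(1)[0]
    proportions = {cat: n / len(values) for cat, n in counts.items()}
    return mode_category, proportions


def describe_ordinal(grades):
    """Ordinal data: the median and the 25th to 75th percentile range."""
    q1, _, q3 = quantiles(grades, n=4, method="inclusive")
    return median(grades), (q1, q3)


def describe_interval(values):
    """Interval data: mean, sample standard deviation (n - 1), standard error."""
    sd = stdev(values)
    return mean(values), sd, sd / sqrt(len(values))


def dichotomize_mrc(grades, cut=3):
    """Ordinal to nominal: split MRC grades at an assumed antigravity cut (grade 3)."""
    return ["antigravity or better" if g >= cut else "less than antigravity"
            for g in grades]


# Hypothetical quadriceps MRC grades for nine patients (made-up data).
mrc = [1, 2, 2, 3, 3, 4, 4, 4, 5]
print(describe_ordinal(mrc))
print(describe_nominal(dichotomize_mrc(mrc)))
```

Going the other way (nominal back to ordinal) is not possible here, because the dichotomized list no longer records which grade each patient had.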
Presumably, they understand what they're doing. I mean, you can always say they didn't know. But why would they do that? Why would someone do that? Well, I don't know. In any event, it's a red flag, because that's not how you're supposed to convey that data. And perhaps the medians were really close to each other, and they really wanted to show a difference. And so they said, ah, the means are different, so I'm going to convey it that way. And that happens all the time, right? Medians are frequently very close together and means further apart. So if you're trying to show a difference, so a red flag. In a big-time journal, right? So this is something that editors ought to catch and don't, and you shouldn't rely on it just because it's published in an all-star journal. So good. We've talked about the types of variables. We've talked about how to convey central tendency and dispersion. And now we're going to actually talk about the statistics. And there are two components to that. There's the measure of association and this measure of precision. So let's look at both of those. So we'll look at a bunch of comparisons. Here's the one that we started the day with, nominal by nominal. So steroids or no steroids, good outcome or poor outcome. And we're going to fill in the box. And we've talked about a bunch of ways to express association. I'm going to use the risk difference, just because we used the risk ratio the last time. But what other ways can you express association with nominal data and the two by two table? We talked about three, but you can give me a whole bunch. Absolute, so that's the risk difference, good. You've calculated all of these earlier today. Odds ratio, number needed to treat. And how would you convey the precision of your data? You can do a 95% confidence interval. Any ratio, any ratio you can calculate a confidence interval on. So that's the story. To move on to the little bit more complicated sets of data now, I'm going to get out of the two by two table a little. 
But not really. So this is still a two by two table. We still have nominal data and nominal data. But now I put little triangles in the boxes. And the reason I've done that is because now I'm going to interpret the triangles as bar graphs. And this is still the same thing. We have the good outcomes in the, I don't know what color that is, blue or gray or whatever, and the bad outcomes in red. So now how are we going to talk about the measure of association for this type of data? We've just said that. What are we looking at? I'm taking away the triangle so it's clearer. How are we going to express the measure of association? We just did this. No tricks here, I promise. I always do a warning if there's a trick question. So risk difference, relative risk, odds ratio. We said we'd do the risk difference, so I'm going to stick with that. I haven't done anything new. And now we're going to talk about the precision. We discussed this. Frequently, people will use a p-value for the test of statistical significance. But that really tells you just what? What does a p-value tell you? Yeah, so that only tells you whether the result that you see is related to chance. It doesn't really tell you anything about precision, OK, the range of possible values. For that, you need the confidence interval. And you can calculate that around a 95% confidence interval around the risk difference. Dr. Smith's tool will do that for you. You probably saw. So we're going to stick with this bar graph sort of thing. This bar graph sort of depiction. And now we're going to describe some different variable relationships. Now we're going to do interval data with nominal. This is what a lot of data in the journals are. You see this all the time. Two interventions and a time to something, for example. So we've done this before. We've shown the triangles. And all I'm going to do now for this Bell's palsy data, this is time to recovery of Bell's palsy. I'm going to flip the graph a little bit. 
That's all I'm doing. So still the same. No, nothing up my sleeve. And now I've made this into interval data. And what type? Nominal data. Still sort of related to the two by two table. But now instead of nominal data on the y-axis, we have interval data. So I haven't done anything unfair yet, right? So how are we going to describe the measure of association for this data? How are we going to look at the difference between central tendency? What are we going to do? How are we going to display that difference? Yeah, yeah, we're going to look at the difference between the means. OK, does that make sense? We've already talked about standard deviation and mean. So I've just put those up. And now I'm going to take the triangles away because they're getting cumbersome and they get really cumbersome if you have 400 patients. So now I've just got the plots. And we can look at the difference in the means. It's the measure of association. And how do you? So now we've got means. How do you calculate the precision or the statistical significance when you've got means? And you just did this in the hands-on exercise for one of the data sets. What statistical tool technique did you use to calculate that? When you were doing the D-dimer stuff, you had to get means and standard deviations. So you do a t-test if you have two categories of nominal data. And what happens if you have more than two? The analysis of variance, the sort of t-test that you do when there are multiple categories. And then the measure of statistical precision, that'll get you a p-value, is the difference in the means with the 95% confidence interval. Is that fair? We're almost done. We've only got one more thing to do. So now I've got an ordinal scale. We're back to the made-up data. I don't know. Maybe you have a treatment for ICU polyneuropathy. Or you're using one of the ordinal scales for facial recovery for Bell's palsy. But now we have nominal by ordinal. 
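The nominal-by-interval comparison described above (difference in the means, with a confidence interval as the measure of precision) can be sketched as follows. This is only an illustration under stated assumptions: the recovery times are made up, the function name is hypothetical, and the interval uses a normal approximation (z = 1.96) rather than the t distribution a real t-test would use, so with small groups like these it will be somewhat too narrow.

```python
from math import sqrt
from statistics import mean, variance


def mean_difference(group_a, group_b, z=1.96):
    """Difference in means with an approximate 95% confidence interval.

    Uses unequal-variance (Welch-style) standard errors and a normal
    approximation for the interval; a proper t-test would replace z
    with a t critical value.
    """
    diff = mean(group_a) - mean(group_b)
    se = sqrt(variance(group_a) / len(group_a) + variance(group_b) / len(group_b))
    t_stat = diff / se
    return diff, (diff - z * se, diff + z * se), t_stat


# Hypothetical weeks to recovery in two groups (made-up data).
steroids = [2, 3, 3, 4, 5, 4, 3, 4]
no_steroids = [4, 5, 6, 5, 7, 6, 5, 6]

diff, ci, t_stat = mean_difference(steroids, no_steroids)
print(f"difference in means: {diff:.2f} weeks, approx. 95% CI {ci[0]:.2f} to {ci[1]:.2f}")
```

If the interval does not include zero, the difference would also be statistically significant at roughly the 5% level, which matches the point made about confidence intervals not crossing zero.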
And how are we going to convey the difference here, the difference between the two tests? What did we say that we looked at? How do we convey again? What did we say we used to convey the central tendency? Yeah. So median and interquartile range. Do you know what these funny looking plots are called? Yeah, these are box and whisker plots. This is the 25-75 range. And that's the median. And that's the full range, excluding outliers, which become really easy to see, because they stand out like that. So I'm going to take away the triangles again. And then as we said, now we can do the difference in the medians. And the test that you used to convey statistical significance. So medians, I guess strictly speaking, and I can see Dr. Barker sort of on the edge of his chair, but I guess strictly speaking, nominal data is non-parametric too. But we usually talk about non-parametric data when we talk about medians. And what's the test that we use to get a p-value, to convey statistical significance with ordinal data? Well, chi-square we've used now. That answer is taken. We use that for nominal data, chi-square, or when they're small numbers, Fisher's exact test. So this is the one with all the weird names. All right, this is the Wilcoxon or Wilcoxon. They've got a bunch of Mann-Whitney, that depending on how many people you want to attribute it to, they've got a bunch of names. And you can, if you want, calculate a difference in the medians. It's incredibly annoying to do, so it's rarely done. But it's possible. So that's it. There are other combinations that you can imagine, right, ordinal by interval, interval by interval. These are uncommon. And again, in clinically relevant medicine, this is not common. Usually, the independent variable is nominal. OK, and usually, it's dichotomous. But you can do these other things. And we will, actually, at the end of the lecture. You know, on the last day, just as a preview. So we'll be talking about time to event and regression. 
So we will get to that, but not right now. So just to summarize, right, so nominal, nominal. We can do a risk difference as the measure of association in a chi-square or a Fisher exact test if the numbers are small. For nominal to ordinal, we do a rank or a median difference. And the Wilcoxon or the sort of ANOVA equivalent, this Kruskal-Wallis test. And then we can do difference in the means if we have interval data. And depending on how many nominal categories there are, we'll do a t-test if there are two, or an analysis of variance if there's more than two. Is that fair? So now we're almost at the finish line. We're going to go back to this study that I started with that sort of stimulated the question in the first place. And the data that they express. And again, so where is the two by two table and how do we interpret this? So here's what they said, OK? So we said this is a nominal to ordinal comparison. And here's the data that they give us. What do you think about that? So we've seen this before. And they provide us with a p-value. Is that helpful? Interpret this for me in light now of whether you would use, whether you would tell someone to use memantine in the setting of moderate to severe Alzheimer's disease with a p-value of 0.03. And at least a difference in the means of what? What's the difference in the means here? It's my test to see if you're awake, because I know you can calculate it. Good. Interpret this for me. What are we missing that we'd like to see? We'd like to see some measure of precision, a confidence interval. And we'd also like to respond to Dr. Barker's question about, is that clinically significant? It's clearly statistically significant. So there's your difference in the means. And as you'd expect, it's not going to cross 0, because it's a statistically significant difference. Is 0.3 units on that scale, that seven point scale? Hard to tell whether that's. And you can't really tell. 
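The summary given above (nominal by nominal, nominal by ordinal, nominal by interval) amounts to a small lookup. The sketch below encodes it in Python; the function name, its parameters, and the exact wording of the returned strings are illustrative assumptions, and it covers only the case the lecture focuses on, where the independent variable is nominal.

```python
def choose_test(outcome_type, n_groups=2, small_counts=False):
    """Suggest a measure of association and a significance test.

    Assumes the independent variable is nominal with n_groups categories.
    outcome_type is the type of the dependent variable:
    "nominal", "ordinal", or "interval".
    """
    if n_groups < 2:
        raise ValueError("need at least two groups to compare")
    if outcome_type == "nominal":
        test = "Fisher's exact test" if small_counts else "chi-square test"
        association = "risk difference (or relative risk, odds ratio)"
    elif outcome_type == "ordinal":
        test = ("Wilcoxon rank-sum (Mann-Whitney) test" if n_groups == 2
                else "Kruskal-Wallis test")
        association = "difference in medians (or ranks)"
    elif outcome_type == "interval":
        test = "t-test" if n_groups == 2 else "analysis of variance (ANOVA)"
        association = "difference in means"
    else:
        raise ValueError(f"unknown outcome type: {outcome_type!r}")
    return {"association": association,
            "precision": "95% confidence interval",
            "test": test}


# The memantine example: drug vs placebo (nominal) against a seven-point scale (ordinal).
print(choose_test("ordinal", n_groups=2))
```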
People are saying no, and it's my inclination as well. But since you don't know the difference between the different intervals, hard to tell. You can't tell. Is a Karnofsky score that goes from 40 to 50 clinically important? Not really, but 60 to 70? That means the difference between independence and not. So that's a characteristic of these ordinal data sets that you can't tell. So anyway, we've gotten all of this stuff done. And we've gotten this information from the study. What are you going to do now? You're stuck. They're done. This is all that the article tells you. And it leaves you not very satisfied and without much direction. What do you suggest? So what I'm going to suggest, and once again, really for medicine people, this is the top tier journal. This is number one in the world. So they mess up frequently. They've left us inadequately informed. And I think when you're inadequately informed by the data, even though they seem to have done it correctly, this business about the means and standard error is not right, but they at least use the right statistical test. But they've left us confused. We don't know how to interpret this data. I would say go back to the two by two table. Look at their data and reinterpret it in light of something you know. Ask all of these questions for something that we can sort of explain. We said we don't know if a 0.3 unit difference is very significant in terms of the CIBIC, whatever, even though we have the range and the standard. So look to their secondary outcomes. And it turns out that they've done this second thing, this Severe Impairment Battery, which also is one of these kind of ordinal scales. But they've divided it now. As we said you could do, they've changed ordinal data into what? Nominal data. Not worse, or worse. So now we've got something that we can do a two by two table with. So now we're on to something that might be clinically relevant. Now we have data that falls into nominal categories. 
So now we've got a nominal by nominal study. OK. And the data is in the article. And now we can interpret it in a way that we can make sense of. So what do you think now? Interpret that for me. Assume that I've done the math right. Because I don't want to find out if I haven't. It's so hard to change these slides. What do you think about that number needed to treat and give it to me in words? Treat five Alzheimer's patients with memantine to prevent one from getting worse. But that number may be as low as one divided by. So it might be as high as 10 needed to treat or might be as low as two. Yeah, I didn't give you the confidence interval for the number needed to treat. But you see how Nick knew what to do there. So the inverse of the confidence limits of the risk difference will give you the confidence interval for the number needed to treat. So good. You have to give five people this drug to prevent one deterioration over the duration of the study. I think it was four months or something. So that's something that you can interpret. That's a clinically relevant outcome measure. And you can think about it. So I would say to summarize, if you can't interpret the data, then change it into a two by two table, which you can almost always interpret. That's it. If you can avoid getting beyond the two by two table, always go for that. It's rigorous. It's appropriate. It's not puny or inappropriate. And it can give you some clinically relevant information.
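The number-needed-to-treat reasoning at the end of the talk (take the risk difference, invert it, and invert its confidence limits to get a range) can be sketched in Python as below. The counts are hypothetical and chosen only so that the number needed to treat comes out at five; they are not the trial's data. The function name is an assumption, and the interval is a simple normal-approximation (Wald) interval.

```python
from math import sqrt


def nnt_with_ci(events_control, n_control, events_treated, n_treated, z=1.96):
    """Number needed to treat and its approximate 95% confidence interval.

    Risk difference = control event rate minus treated event rate.
    The NNT interval is obtained by inverting the limits of the risk
    difference interval, which only makes sense when that interval
    does not include zero.
    """
    p_c = events_control / n_control
    p_t = events_treated / n_treated
    rd = p_c - p_t
    se = sqrt(p_c * (1 - p_c) / n_control + p_t * (1 - p_t) / n_treated)
    rd_low, rd_high = rd - z * se, rd + z * se
    if rd_low <= 0:
        raise ValueError("risk difference interval includes zero; NNT interval undefined")
    return 1 / rd, (1 / rd_high, 1 / rd_low)


# Hypothetical two-by-two table: 50 of 100 worse on placebo, 30 of 100 worse on drug.
nnt, (nnt_low, nnt_high) = nnt_with_ci(50, 100, 30, 100)
print(f"NNT about {nnt:.0f} (approx. 95% CI {nnt_low:.1f} to {nnt_high:.1f})")
```

Note that the larger risk-difference limit gives the smaller number needed to treat, which is why the limits swap places when inverted.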
Video Summary
In this video, the speaker discusses various statistical concepts and their application in clinical trials. They start by talking about different types of variables, such as nominal, ordinal, and interval, and how to describe their central tendency and dispersion. The speaker then explains how to analyze and interpret different combinations of variables, including nominal-nominal, nominal-ordinal, and interval-nominal. They emphasize the importance of using appropriate measures of association and precision, such as risk difference, odds ratio, median difference, and confidence intervals.
The speaker uses an example of a study on memantine in Alzheimer's disease to demonstrate the application of these statistical concepts. They highlight the need for proper interpretation and presentation of data to ensure clarity and meaningful conclusions. They also suggest that if the data is difficult to interpret, it can be converted into a two-by-two table for easier analysis. The video concludes by encouraging viewers to rely on the two-by-two table and clinically relevant outcomes when evaluating study results.
Asset Subtitle
Presented by Michael J. Glantz, MD
Keywords
statistical concepts
clinical trials
variables
central tendency
dispersion
measures of association
precision