
Logistic regression (Episode 9)

Logistic regression is a beautiful tool for modeling a binary dependent variable, although many more complex extensions exist. In this episode, we talk about the generalized linear model family, logit and probit link functions, interpretation, and practicalities.

Resources:
● McCullagh, Peter, and John A. Nelder. Generalized Linear Models. Routledge, 1983.
● Faraway, Julian J. Extending the Linear Model with R: Generalized Linear, Mixed Effects and Nonparametric Regression Models. Chapman and Hall/CRC, 2016. https://julianfaraway.github.io/faraway/ELM/

Transcript

Logistic Regression

[00:00:00] Alexander: Now we are diving much deeper into regression, part three.

Welcome to a new episode, again with Paolo and myself, Alexander. How are you doing today, Paolo?

[00:00:24] Paolo: I’m very good, Alexander. Very excited to be here and start a new episode.

[00:00:30] Alexander: Yeah, we have talked a lot about regression and all kinds of different things in the past. And of course, when we think about regression, we first think about linear regression, where you have continuous outcomes, and that’s great. But unfortunately, or maybe happily, not everything in the world is continuous. One of the very common situations is that you have these binary endpoints: response, no response; buy, didn’t buy; clicked on the link, didn’t click on the link; all these kinds of things. And so if you want to understand what impacts these binary endpoints, you can’t use the usual linear models. But there is a nice trick you can do. Paolo, what’s the trick?

[00:01:25] Paolo: Yes, I would build a logistic regression model. Instead of modeling y directly with the identity function, as in the standard continuous case, there is a trick to map your y, which is binary zero/one, onto the minus-infinity to plus-infinity space. It’s basically done by the logistic function: we transform our zeros and ones onto a continuous scale via the logit function. So we have our logits.
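A minimal sketch of these two functions, assuming Python with numpy (illustrative code only, not the code from the show notes):

import numpy as np

def logistic(z):
    # The S-shaped sigmoid: maps any real number to the interval (0, 1).
    return 1.0 / (1.0 + np.exp(-z))

def logit(p):
    # Inverse of the logistic: maps a probability in (0, 1) to the real line.
    return np.log(p / (1.0 - p))

print(logistic(0.0))         # 0.5
print(logit(0.5))            # 0.0
print(logit(logistic(2.3)))  # 2.3, a round trip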

[00:02:20] Alexander: Yep. And the trick is that, for the original linear regression, you can also think of it from the average perspective. You have this linear regression model for each individual observation, y equals alpha plus beta x. You can also sum it up and look at what the average of y is, and then you still have a linear model. And here it works the same way. Here the y is not continuous, but zero or one. And if you sum up over the zeros and ones, what you get is the proportion, or the number of responders: the number of ones that you have, the number of yeses, whatever it is. So you look into this average. And since everything is zero or one, it’s something between zero and one, and it is the expected value.

So you model, again, the expected value, and the usual way to estimate that is the average of the y’s. And now you need to get that onto something that goes from minus infinity to plus infinity. So what this logit does: it takes this probability, let’s call it p, for having a response of one, takes p divided by one minus p, and then you take the log of that ratio, logit(p) = log(p / (1 - p)). And that is this kind of nice S-shaped curve that goes from minus infinity to plus infinity.

[00:04:29] Paolo: Yeah, which sometimes goes by a different name. For example, if you follow the machine learning community, this is mainly called the sigmoid function.

[00:04:42] Alexander: Ah, okay.

[00:04:42] Paolo: But it’s basically the same thing, just a different label.

[00:04:46] Alexander: Yeah. There are also some other ways to look into it. Here we have this log of the ratio, but you can also use the normal cumulative distribution function as a transformation, and then you get the probit model. Or you can use the complementary log-log of that, or other variants.

And so with that, you get all kinds of different link functions with which you can then model your probability. And on the right side of the model it is again alpha plus beta x, the same as with the original linear model. So it’s a very nice and easy extension.
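To make this concrete, here is a sketch of fitting the same alpha plus beta x model under both the logit and the probit link, assuming Python with numpy and statsmodels installed (the data and coefficients are invented for illustration):

import numpy as np
import statsmodels.api as sm

rng = np.random.default_rng(0)
x = rng.normal(size=500)
p = 1.0 / (1.0 + np.exp(-(-0.5 + 1.2 * x)))  # true model: alpha = -0.5, beta = 1.2
y = rng.binomial(1, p)

X = sm.add_constant(x)  # adds the intercept column for alpha

# The logit link is the default for the binomial family.
logit_fit = sm.GLM(y, X, family=sm.families.Binomial()).fit()
# The same model with a probit link instead.
probit_fit = sm.GLM(y, X, family=sm.families.Binomial(link=sm.families.links.Probit())).fit()

print(logit_fit.params)   # estimates of (alpha, beta) on the log-odds scale
print(probit_fit.params)  # probit coefficients live on a different scale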

[00:05:31] Paolo: Yeah, the probit model is really interesting because it’s, so to say, a threshold model. You are assuming that you have a latent variable underlying your probability, and when this underlying variable is above a certain point, your probability for a one becomes higher. That’s the general concept of the thing. And then it’s quite nice to see that you start from logistic regression, you have the logit link or the probit link, but once you realize this mechanism, you can apply it to many other situations. Then you have the generalized linear model: it’s a way to generalize what you have with the linear model, our linear regression.
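A tiny simulation illustrates the latent-variable view Paolo describes (again just a sketch in Python; alpha, beta, and x are made-up values). With standard normal noise, observing a one whenever the latent variable crosses zero gives exactly the probit probability Phi(alpha + beta x):

import numpy as np
from scipy.stats import norm

rng = np.random.default_rng(1)
alpha, beta, x0 = -0.5, 1.2, 1.0

# Latent variable y* = alpha + beta * x + standard normal noise;
# we observe y = 1 exactly when y* crosses the threshold 0.
y_star = alpha + beta * x0 + rng.normal(size=100_000)
y = (y_star > 0).astype(int)

# Empirical P(y = 1 | x = 1) vs the probit formula Phi(alpha + beta * x).
print(y.mean(), norm.cdf(alpha + beta * x0))  # both close to 0.758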

[00:06:30] Alexander: And one really nice thing about it, with the logit model, is the interpretation of the coefficient. So if you do a little bit of calculus and you solve this equation for beta, you will actually see that beta is the odds ratio for an increase of one in x. And if your covariate is just binary, so you basically have two groups, then you directly see that. That relates very closely to the usual chi-square test that you have in a two-by-two table: you look into the odds ratio again, and so that’s very easy. So you can just say, okay, if you have a beta of one, then you know that is the odds ratio.

[00:07:34] Paolo: Yeah, well, it’s not easy for everyone. For example,

[00:07:39] Alexander: Sorry, I was wrong. It’s the log odds ratio. It’s the log odds ratio, yeah.

[00:07:44] Paolo: So you have the odds ratio if you put the beta in the exponential: e to the power of beta, and this is the odds ratio. Which is not super easy, in the sense that some people need to sit down and think a bit about it, but in the end it’s really powerful. You just need to exponentiate your coefficient and you get the odds ratios. Especially if you have a binary covariate, or covariates with multiple categories, it’s really nice to have this direct interpretation. Of course, some people struggle with it, and I came across one funny paper explaining that maybe for psychologists, or people not accustomed to this kind of model, it may be better to have a linear regression under some conditions, because the interpretation is much easier, and under some circumstances it’s also correct. But in general, the logistic regression model is really powerful.
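For the two-by-two case, a sketch with a hypothetical table (none of these counts come from the episode) shows that exp(beta) from the logistic fit equals the odds ratio computed directly from the table:

import numpy as np
import statsmodels.api as sm

# Hypothetical 2x2 table: group 0 has 40/100 responders, group 1 has 70/100.
y = np.repeat([0, 1, 0, 1], [60, 40, 30, 70])  # outcomes
g = np.repeat([0, 0, 1, 1], [60, 40, 30, 70])  # group indicator

fit = sm.GLM(y, sm.add_constant(g), family=sm.families.Binomial()).fit()
print(np.exp(fit.params[1]))   # exp(beta) -> 3.5

# The same odds ratio straight from the table.
print((70 / 30) / (40 / 60))   # (odds in group 1) / (odds in group 0) = 3.5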

[00:08:59] Alexander: Yeah, so there are a couple of really nice things about it. The first is that it doesn’t matter how you define your outcome. So if you have an outcome, let’s say blue or green, it doesn’t matter whether you call blue one or green one, and the other one zero, because then, instead of beta, you just have minus beta. That translates very easily. That’s, for example, not the case in many other areas: if you look, for example, at relative risks, these actually do change. So let’s say on the one hand you have a 5% chance, and on the other you have, let’s say, a 6% chance.

Then the relative risk here would be 6% over 5%, so about 1.2, whereas if you look at it the other way around, you have 94% over 95%, and that’s pretty close to one. So the interpretation there depends very much on what you call, for example, a success or response or whatever. And that’s important to have in mind with this odds ratio approach.
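A few lines of Python, using the 5% and 6% figures from the conversation, make the contrast explicit: the relative risk depends on which outcome you call the event, while the odds ratio merely inverts, so beta only changes sign:

p1, p2 = 0.05, 0.06

# Relative risk of the event vs relative risk of the non-event:
print(p2 / p1)               # 1.20
print((1 - p2) / (1 - p1))   # ~0.989, close to one

def odds(p):
    return p / (1 - p)

# The odds ratio only flips to its reciprocal when labels are swapped:
print(odds(p2) / odds(p1))           # ~1.213
print(odds(1 - p2) / odds(1 - p1))   # ~0.824 = 1 / 1.213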

[00:10:33] Paolo: It’s easier to interpret. And also it’s a really powerful tool. And we should keep in mind that it’s still a regression, because many papers from the machine learning community say that linear regression is for regression and logistic regression is for classification. This is not completely true.

So we are still in the regression framework, but we are just using the link function to have everything in a convenient way.

[00:11:04] Alexander: Yeah. And now you can do prediction, and with that classification, in the same way actually as what we talked about in the linear regression space. Yeah.

[00:11:16] Paolo: Yeah.

[00:11:16] Alexander: So if you now have a new x, you can just plug it into your function and then you get out a probability. And then based on this probability, you can predict what the outcome will be. And the nice thing is you can even put some kind of cost function on it. So let’s say it doesn’t matter whether you make a false positive or a false negative. Then you can just say: everything that is above 0.5, 50%, we say is likely a one; everything below is likely a zero. But if you want to be more cautious and say, when we predict a one, we really need to be more sure,

you can just say: only where the probability is more than 80% do we predict a one; for all the others we predict a zero. Then we minimize the probability that we predict a one when it’s actually not a one, at the cost that we more often say it’s a zero when in fact it’s a one. Yeah.
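In code, the two decision rules are just different cutoffs on the predicted probabilities (a sketch; the probabilities below are invented for illustration):

import numpy as np

# Suppose these are predicted probabilities from a fitted logistic model.
p_hat = np.array([0.10, 0.45, 0.55, 0.70, 0.85, 0.95])

# Symmetric costs: call everything above 50% a one.
print((p_hat > 0.5).astype(int))  # [0 0 1 1 1 1]

# Cautious rule: only predict a one when we are at least 80% sure,
# trading more false negatives for fewer false positives.
print((p_hat > 0.8).astype(int))  # [0 0 0 0 1 1]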

[00:12:50] Paolo: Yeah, it always depends on the cost of the false positives or false negatives. The cost of missing cases, or maybe of having too many falsely predicted cases in your prediction exercise.

[00:13:07] Alexander: Yeah. So maybe, for example, for all of those that have a one, you actually follow up on them. Maybe these are people that have clicked on an email, and for all those that have a one, you want to set up a phone call with them, since the cost is the investment of the phone call. And you probably want to make sure that you really invest in those that will then actually do something, buy the product or whatever. So yeah, that could be an area where you maybe want to have fewer false positive predictions. But that always depends on the business or the environmental setup. So hopefully you have seen now that this logistic regression works very similarly to linear regression, and you can do about the same things in terms of prediction. All kinds of fun stuff that you could do with linear regression you can do in the same way with this logistic regression: feature selection and whatever. Okay, any final things to add, Paolo?

[00:14:41] Paolo: I don’t think so. I think the basics are covered. As always, we will include some R code and Python code to play with it and get the basics. So have fun; you’ll find it in the show notes.

[00:14:55] Alexander: Thanks so much.