
Regression modeling with a two-level categorical variable

Suppose that Z is a two-level categorical variable such that Z = A or B.

Define

X = 1 if Z = A, and X = 0 otherwise.

 

Then we can use the following regression model, Y=β0+β1X+ϵ

  • β0 = μB (called the baseline)
  • β1 = μA - μB
  • Consequently, β0 + β1 = μA

Since E(Y)=β0+β1X,

if Z = A, X = 1, E(Y)=β0+β1=μA

if Z = B, X = 0, E(Y)=β0=μB
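The relationship between the coefficients and the group means can be checked numerically. The following is a minimal sketch using only the Python standard library; the data values are made up for illustration, and the names `b0` and `b1` are simply the least-squares estimates of β0 and β1.

```python
from statistics import mean

# Made-up toy data: group label Z and response Y.
z = ["A", "A", "A", "B", "B", "B"]
y = [5.0, 6.0, 7.0, 1.0, 2.0, 3.0]

# Dummy coding: X = 1 if Z = A, 0 otherwise (B is the baseline).
x = [1.0 if level == "A" else 0.0 for level in z]

# Closed-form simple linear regression: b1 = Sxy / Sxx, b0 = ybar - b1 * xbar.
x_bar, y_bar = mean(x), mean(y)
sxy = sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
sxx = sum((xi - x_bar) ** 2 for xi in x)
b1 = sxy / sxx
b0 = y_bar - b1 * x_bar

# Sample means of each group, for comparison.
mean_a = mean(v for v, level in zip(y, z) if level == "A")
mean_b = mean(v for v, level in zip(y, z) if level == "B")

print("b0 =", b0, " mean of B =", mean_b)
print("b1 =", b1, " mean of A - mean of B =", mean_a - mean_b)
```

With a single 0/1 predictor, the fitted intercept should coincide with the sample mean of the baseline group and the slope with the difference of the two sample means.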

 

 

Suppose that Z is a three-level categorical variable such that Z = A, B or C. 

Define

X1 = 1 if Z = A, and X1 = 0 otherwise.

X2 = 1 if Z = B, and X2 = 0 otherwise.

 

Then we can use the following regression model, Y=β0+β1X1+β2X2+ϵ

  • β0 = μC (called the baseline)
  • β1 = μA - μC
  • β2 = μB - μC

 

Since E(Y)=β0+β1X1+β2X2,

if Z = A, (X1, X2) = (1, 0), E(Y)=β0+β1=μA

if Z = B, (X1, X2) = (0, 1), E(Y)=β0+β2=μB

if Z = C, (X1, X2) = (0, 0), E(Y)=β0=μC
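One way to convince yourself that these coefficients are the least-squares solution is to plug the group-mean values into the normal equations. The sketch below does this with made-up data and unequal group sizes; the variable names (`rows`, `beta`, `normal_eq`) are illustrative only.

```python
from statistics import mean

# Made-up toy data with unequal group sizes.
z = ["A", "A", "B", "B", "B", "C", "C"]
y = [10.0, 12.0, 7.0, 8.0, 9.0, 3.0, 5.0]

# Design-matrix rows (1, X1, X2): X1 flags level A, X2 flags level B, C is the baseline.
rows = [(1.0, float(level == "A"), float(level == "B")) for level in z]

# Sample mean of Y within each level.
group_mean = {g: mean(v for v, level in zip(y, z) if level == g) for g in "ABC"}

# Candidate coefficients taken from the group means.
beta = [
    group_mean["C"],                    # beta0 = mean of C
    group_mean["A"] - group_mean["C"],  # beta1 = mean of A - mean of C
    group_mean["B"] - group_mean["C"],  # beta2 = mean of B - mean of C
]

# Residuals under these coefficients.
residuals = [yi - sum(b * xj for b, xj in zip(beta, row)) for yi, row in zip(y, rows)]

# Normal equations: every column of the design matrix must be orthogonal to the residuals.
normal_eq = [sum(row[j] * r for row, r in zip(rows, residuals)) for j in range(3)]
print(beta, normal_eq)
```

All three entries of `normal_eq` come out as zero (up to rounding), which is the condition that characterizes the least-squares fit.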

 

 

Two categorical variables

Consider two categorical variables: one with 3 levels (F1, F2, F3) and the other with 2 levels (B1, B2).

Then, the model can be written as Y=β0+β1X1+β2X2+β3X3+ϵ,

where

X1 = 1 if F2, and X1 = 0 otherwise.

X2 = 1 if F3, and X2 = 0 otherwise.

X3 = 1 if B2, and X3 = 0 otherwise.

 

Note that F1 and B1 are the base levels, and μij denotes the mean of Y at the combination (Fi, Bj).

  • β0 = μ11 (mean of the combination of base levels)
  • β1 = μ2j - μ1j for any level Bj (j = 1, 2)
  • β2 = μ3j - μ1j for any level Bj (j = 1, 2)
  • β3 = μi2 - μi1 for any level Fi (i = 1, 2, 3)
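The phrase "for any level" reflects that this model has no interaction: the effect of one factor is the same at every level of the other. A small sketch makes this visible by tabulating E(Y) for all six combinations; the coefficient values are hypothetical and chosen only for illustration.

```python
# Hypothetical coefficient values, chosen only for illustration.
beta0, beta1, beta2, beta3 = 10.0, 2.0, 5.0, -3.0

F_LEVELS = ("F1", "F2", "F3")
B_LEVELS = ("B1", "B2")

def dummies(f, b):
    """Return (X1, X2, X3) for a combination of levels; F1 and B1 are the base levels."""
    return (int(f == "F2"), int(f == "F3"), int(b == "B2"))

def cell_mean(f, b):
    """E(Y) under the model without interaction."""
    x1, x2, x3 = dummies(f, b)
    return beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3

mu = {(f, b): cell_mean(f, b) for f in F_LEVELS for b in B_LEVELS}

# Difference B2 - B1 at each level of F: always beta3.
b_effect = [mu[(f, "B2")] - mu[(f, "B1")] for f in F_LEVELS]

# Difference F2 - F1 at each level of B: always beta1.
f2_effect = [mu[("F2", b)] - mu[("F1", b)] for b in B_LEVELS]

print(mu)
print(b_effect, f2_effect)
```

Every entry of `b_effect` equals β3 and every entry of `f2_effect` equals β1, which is exactly the restriction that the interaction model in the next section relaxes.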

 

Interaction model with two categorical variables 

Consider an extended model as follows:

Y=β0+β1X1+β2X2+β3X3+β4X4+β5X5+ϵ,

where 

X1 = 1 if F2, and X1 = 0 otherwise.

X2 = 1 if F3, and X2 = 0 otherwise.

X3 = 1 if B2, and X3 = 0 otherwise.

X4 = X1X3, and X5 = X2X3.

 

Note that F1 and B1 are the base levels.

  • β0 = μ11 (mean of the combination of base levels)
  • β1 = μ21 - μ11 (at level B1 only)
  • β2 = μ31 - μ11 (at level B1 only)
  • β3 = μ12 - μ11 (at level F1 only)
  • β4 = (μ22 - μ12) - (μ21 - μ11)
  • β5 = (μ32 - μ12) - (μ31 - μ11)

 

For example, at (F2, B1) we have X1 = 1 and X2 = X3 = X4 = X5 = 0, so μ21 = β0 + β1. Since β0 = μ11, we can write β1 = μ21 - μ11.
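The same substitution works for every cell. As a rough check, the sketch below starts from a hypothetical table of cell means (values made up), computes the six coefficients from the formulas in the list, and confirms that the model reproduces each cell mean.

```python
# Hypothetical cell means mu[(i, j)] for the combination (Fi, Bj); values are made up.
mu = {
    (1, 1): 10.0, (1, 2): 12.0,
    (2, 1): 13.0, (2, 2): 18.0,
    (3, 1): 9.0,  (3, 2): 10.0,
}

# Coefficients from the formulas in the list (F1 and B1 are the base levels).
beta0 = mu[1, 1]
beta1 = mu[2, 1] - mu[1, 1]
beta2 = mu[3, 1] - mu[1, 1]
beta3 = mu[1, 2] - mu[1, 1]
beta4 = (mu[2, 2] - mu[1, 2]) - (mu[2, 1] - mu[1, 1])
beta5 = (mu[3, 2] - mu[1, 2]) - (mu[3, 1] - mu[1, 1])

def expected(i, j):
    """E(Y) at (Fi, Bj) under the interaction model."""
    x1, x2, x3 = int(i == 2), int(i == 3), int(j == 2)
    x4, x5 = x1 * x3, x2 * x3
    return beta0 + beta1 * x1 + beta2 * x2 + beta3 * x3 + beta4 * x4 + beta5 * x5

# The model has six parameters for six cells, so it reproduces every cell mean.
reproduced = {cell: expected(*cell) for cell in mu}
print(reproduced == mu)
```

Because there are as many coefficients as cells, the interaction model places no restriction on the cell means; β4 and β5 measure how far the table departs from the parallel pattern of the model without interaction.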

 

 

Example (Two categorical variables with interaction)

 

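Looking at this output, the questions we can ask are as follows.

  1. Is the interaction significant?
    • H0: β4 = β5 = 0 vs. H1: β4 ≠ 0 or β5 ≠ 0
    • In SAS you can check this by adding a test option, but it can also be gauged from the individual t-tests.
  2. To compare against the model without interaction, compare the adjusted R² (R²a) of the two models.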
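Both questions can be sketched in code. The example below is not the SAS analysis from the post: it uses made-up data with two observations per cell and a small hand-written least-squares helper (`ols_sse`, a hypothetical name) so that it runs with the standard library alone. It computes the partial F statistic for H0: β4 = β5 = 0 and the adjusted R² of both models. In practice a statistics package would also report the p-value.

```python
def ols_sse(rows, y):
    """Fit least squares via the normal equations and return the residual sum of squares."""
    p = len(rows[0])
    # Augmented matrix [X'X | X'y].
    a = [
        [sum(r[i] * r[j] for r in rows) for j in range(p)]
        + [sum(r[i] * yi for r, yi in zip(rows, y))]
        for i in range(p)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for col in range(p):
        pivot = max(range(col, p), key=lambda k: abs(a[k][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for k in range(p):
            if k != col:
                factor = a[k][col] / a[col][col]
                a[k] = [u - factor * v for u, v in zip(a[k], a[col])]
    beta = [a[i][p] / a[i][i] for i in range(p)]
    return sum((yi - sum(b * x for b, x in zip(beta, r))) ** 2 for r, yi in zip(rows, y))

# Made-up toy data: two observations per (F, B) combination.
data = {
    ("F1", "B1"): [9.0, 11.0], ("F1", "B2"): [11.0, 13.0],
    ("F2", "B1"): [12.0, 14.0], ("F2", "B2"): [17.0, 19.0],
    ("F3", "B1"): [8.0, 10.0], ("F3", "B2"): [9.0, 11.0],
}

full_rows, reduced_rows, y = [], [], []
for (f, b), values in data.items():
    x1, x2, x3 = float(f == "F2"), float(f == "F3"), float(b == "B2")
    for value in values:
        reduced_rows.append((1.0, x1, x2, x3))                 # no interaction
        full_rows.append((1.0, x1, x2, x3, x1 * x3, x2 * x3))  # with X4, X5
        y.append(value)

n = len(y)
sse_full = ols_sse(full_rows, y)
sse_reduced = ols_sse(reduced_rows, y)
y_bar = sum(y) / n
sst = sum((v - y_bar) ** 2 for v in y)

# Partial F statistic for H0: beta4 = beta5 = 0 (q = 2 restrictions, 6 parameters in the full model).
q, p_full, p_reduced = 2, 6, 4
f_stat = ((sse_reduced - sse_full) / q) / (sse_full / (n - p_full))

def adjusted_r2(sse, n_params):
    """Adjusted R^2 = 1 - [SSE / (n - p)] / [SST / (n - 1)]."""
    return 1.0 - (sse / (n - n_params)) / (sst / (n - 1))

adj_r2_full = adjusted_r2(sse_full, p_full)
adj_r2_reduced = adjusted_r2(sse_reduced, p_reduced)
print("F =", f_stat, "on", (q, n - p_full), "degrees of freedom")
print("adjusted R2:", adj_r2_full, "(with interaction) vs", adj_r2_reduced, "(without)")
```

The F statistic is compared with an F distribution on (2, n - 6) degrees of freedom. Unlike the two separate t-tests, it tests β4 and β5 jointly, which is why a dedicated test is the more direct answer to the first question.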

Conclusion: interaction terms can also be considered when running a regression on categorical variables.

 

 
