1. Description

Course STAT467 Multivariate Analysis

Semester Fall 2017

Content Multivariate Normal Distribution and Its Properties, Part 1*

Notebook Language R

*Notes are mostly from Applied Multivariate Statistical Analysis (Johnson et al.) and Applied Multivariate Statistical Analysis (Härdle et al.)

2. Probability Density Function of the p-dimensional Normal Distribution

$$ X \sim N_p(\mu,\Sigma) $$$$f(x;\mu,\Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}e^{-0.5(x-\mu)'\Sigma^{-1}(x-\mu)}$$

3. Probability Density Function of the 2-dimensional (bivariate) Normal Distribution

$$ \Sigma = \left[ {\begin{array}{cc} \sigma_{11} & \sigma_{12} \\ \sigma_{21} & \sigma_{22} \\ \end{array} } \right] \tag{1} $$$$ \Sigma^{-1} = \frac{1}{\sigma_{22}\sigma_{11} - \sigma_{21} \sigma_{12} } \left[ {\begin{array}{cc} \sigma_{22} & -\sigma_{12} \\ -\sigma_{21} & \sigma_{11} \\ \end{array} } \right] \tag{2} $$

We know that

$$ \sigma_{12} = \sigma_{21} = \rho_{12}\sqrt{\sigma_{22}\sigma_{11}} \tag{3} $$

so

$$ \Sigma^{-1} = \frac{1}{(1-\rho_{12}^2) \sigma_{11} \sigma_{22} } \left[ {\begin{array}{cc} \sigma_{22} & -\rho_{12}\sqrt{\sigma_{22}\sigma_{11}} \\ -\rho_{12}\sqrt{\sigma_{22}\sigma_{11}} & \sigma_{11} \\ \end{array} } \right] \tag{4} $$

Rewriting the quadratic form in the exponent:

$$ (x-\mu)'\Sigma^{-1}(x-\mu) = \frac{1}{(1-\rho_{12}^2) \sigma_{11} \sigma_{22} } \left[ {\begin{array}{cc} x_1-\mu_1 & x_2-\mu_2 \\ \end{array} } \right] \left[ {\begin{array}{cc} \sigma_{22} & -\rho_{12}\sqrt{\sigma_{22}\sigma_{11}} \\ -\rho_{12}\sqrt{\sigma_{22}\sigma_{11}} & \sigma_{11} \\ \end{array} } \right] \left[ {\begin{array}{c} x_1-\mu_1 \\ x_2-\mu_2 \\ \end{array} } \right] $$$$ (x-\mu)'\Sigma^{-1}(x-\mu) = \frac{(x_1-\mu_1)^2 \sigma_{22} + (x_2-\mu_2)^2 \sigma_{11} -2 \rho_{12} \sqrt{\sigma_{11}\sigma_{22}}(x_1-\mu_1)(x_2-\mu_2)}{(1-\rho_{12}^2) \sigma_{11} \sigma_{22}}$$
$$ h(x;\mu,\Sigma)=(x-\mu)'\Sigma^{-1}(x-\mu) = \frac{1}{1-\rho_{12}^2}\Biggl[ \biggl( \frac{x_1-\mu_1}{ \sigma_{1} } \biggr)^2 + \biggl( \frac{x_2-\mu_2}{ \sigma_{2} } \biggr)^2 - 2\rho_{12} \biggl( \frac{x_1-\mu_1}{ \sigma_{1} } \biggr)\biggl( \frac{x_2-\mu_2}{ \sigma_{2} } \biggr) \Biggr]$$

where $\sigma_1 = \sqrt{\sigma_{11}}$ and $\sigma_2 = \sqrt{\sigma_{22}}$.

Hence, since $|\Sigma| = \sigma_{11}\sigma_{22} - \sigma_{12}^2 = (1-\rho_{12}^2)\sigma_{11}\sigma_{22}$,

$$f(x;\mu,\Sigma) = \frac{1}{2\pi \sqrt{(1-\rho_{12}^2) \sigma_{11} \sigma_{22}}}e^{-0.5h(x;\mu,\Sigma)}$$
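
As a quick sanity check of this algebra, the expanded form of $h$ can be compared against the matrix form $(x-\mu)'\Sigma^{-1}(x-\mu)$ for a concrete bivariate example. A minimal sketch in R (the parameter values below are only illustrative, not taken from the notes):

mu = c(1, 2)
s11 = 4; s22 = 9; rho = 0.6 #illustrative variances and correlation
Sigma = matrix(c(s11, rho*sqrt(s11*s22), rho*sqrt(s11*s22), s22), 2, 2)

x = c(2.5, 0.7) #an arbitrary evaluation point

h_matrix = t(x - mu) %*% solve(Sigma) %*% (x - mu) #matrix form of the quadratic term

z1 = (x[1] - mu[1])/sqrt(s11); z2 = (x[2] - mu[2])/sqrt(s22)
h_expanded = (z1^2 + z2^2 - 2*rho*z1*z2)/(1 - rho^2) #expanded bivariate form derived above

print(c(h_matrix, h_expanded)) #the two values should agree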

4. R Function for the p-dimensional Normal Distribution

$$f(x;\mu,\Sigma) = \frac{1}{(2\pi)^{p/2} |\Sigma|^{1/2}}e^{-0.5(x-\mu)'\Sigma^{-1}(x-\mu)}$$

4.1 Writing the density function

Let's partition the above pdf into two pieces:

$$ h(x;\mu,\Sigma)=(x-\mu)'\Sigma^{-1}(x-\mu) $$$$ g(\Sigma) = (2\pi)^{p/2} |\Sigma|^{1/2} $$

Hence

$$ f(x;\mu,\Sigma) = \frac{1}{g(\Sigma)}e^{-0.5h(x;\mu,\Sigma)} $$
In [1]:
MultiNorm = function(x, M, S){
    
    p = length(x) #dimension p (number of variables)
    x = matrix(x, p, 1) #random variable vector
    M = matrix(M, p, 1) #mean vector
    h = t(x-M)%*%solve(S)%*%(x-M) #matrix multiplication that we defined earlier
    g =  ((2*pi)^(p/2)) * sqrt(det(S)) #normalizing constant
    
    f = (1/g)*exp(-0.5*h) #density result
    
    return(f)
}
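
Before moving on, it is worth cross-checking MultiNorm against an established implementation such as mvtnorm::dmvnorm (this assumes the mvtnorm package is installed; the test values below are arbitrary):

library(mvtnorm)

x_test = c(0.3, -1.2)
M_test = c(0, 0)
S_test = matrix(c(2, 0.5, 0.5, 1), 2, 2)

print(MultiNorm(x_test, M_test, S_test)) #our function
print(dmvnorm(x_test, mean = M_test, sigma = S_test)) #reference implementation; should match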

4.2 Data Generation

Let's generate some M and S to test our function. This is NOT an MVN random-vector generator; I will just generate the parameters of an MVN distribution. (This section is more like a bonus section; if you like, you can skip ahead to Section 5.)
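
(For completeness: if we actually wanted to draw samples from an MVN with these parameters, which is not done in these notes, one option would be mvtnorm::rmvnorm(n, mean = M, sigma = S), again assuming the mvtnorm package is available.)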

In [2]:
p = 4
M = runif(p) #p means from 0-1 interval
V = runif(p, 1, 4) #p variances from 1-4 interval

I will not generate the variance-covariance matrix directly. Instead, I will generate a correlation matrix and then obtain the variance-covariance matrix from the variances and the correlation matrix, as shown below. This way, I have more control over the linear dependency between the variables.
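
The conversion relies on the standard relation between the covariance matrix, the variances, and the correlation matrix:

$$ \Sigma = \operatorname{diag}(\sqrt{\sigma_{11}},\ldots,\sqrt{\sigma_{pp}}) \; P \; \operatorname{diag}(\sqrt{\sigma_{11}},\ldots,\sqrt{\sigma_{pp}}) $$

which is exactly what diag(V^(1/2)) %*% P %*% diag(V^(1/2)) computes in the code below.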

Beta

To generate the correlation matrix, I will use the Beta distribution, because its support is the (0, 1) interval and its shape is quite flexible w.r.t. its parameters. If you want strong correlations, generate them from a Beta with a > b; for weak correlations, use b > a. Let's clarify this with an example:

In [3]:
set.seed(108)

print("from Beta(5, 1.5):")
print(rbeta(5, shape1 = 5, shape2 = 1.5)) #values concentrated near 1 (useful for strong correlations)

print("from(Beta(1, 5):")
print(rbeta(5, shape1 = 1,shape2 =5)) #left skewed (we could generate weak correlation)
[1] "from Beta(5, 1.5):"
[1] 0.7916875 0.8389239 0.7875928 0.7617174 0.7688059
[1] "from(Beta(1, 5):"
[1] 0.02286697 0.07138112 0.10971079 0.04275005 0.15488653
In [4]:
set.seed(108)

a = 1.5; b = 8 #we want rather weak correlations

P = diag(0.5, p, p) #diagonal set to 0.5 so that P + t(P) below has 1s on the diagonal (Cor(x, x) = 1)

for(i in 1:(p-1)) for(j in (i+1):p) P[i, j] = rbeta(1, a, b) #fill the upper triangle with Beta-distributed correlations

P = P + t(P) #add P to its transpose to make the correlation matrix symmetric

print(round(P, 2))
     [,1] [,2] [,3] [,4]
[1,] 1.00 0.14 0.11 0.14
[2,] 0.14 1.00 0.16 0.16
[3,] 0.11 0.16 1.00 0.27
[4,] 0.14 0.16 0.27 1.00
In [5]:
S = diag(V^(1/2)) %*% P %*% diag(V^(1/2)) #convert the correlation matrix P to a covariance matrix using the variances
print(round(S, 2))
     [,1] [,2] [,3] [,4]
[1,] 3.71 0.47 0.30 0.29
[2,] 0.47 2.93 0.41 0.28
[3,] 0.30 0.41 2.13 0.41
[4,] 0.29 0.28 0.41 1.06
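
A quick way to verify this conversion is base R's cov2cor(), which should recover the correlation matrix P from S:

print(round(cov2cor(S), 2)) #should match the P printed above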

Let's put it all together and create a parameter generation function for the Multivariate Normal distribution. Its arguments are:
p = (integer, scalar) number of variables
M_interval = (numeric, size-2 vector) interval for the means
V_interval = (numeric, size-2 vector) interval for the variances
correlation = (character, scalar) correlation magnitude of the variables; one of: none, weak, moderate, strong

In [6]:
MVN_prm_genarator = function(p, M_interval, V_interval, correlation = "weak", myseed = 1){
    
    set.seed(myseed)
    
    M = runif(p, M_interval[1], M_interval[2]) 
    V = runif(p, V_interval[1], V_interval[2]) 
    
    if(correlation == "none"){
        S = diag(V) #if no correlation Var-Cov matrix is exactly equal to diagonal variance matrix
    }else{
        if(correlation == "weak"){
            a = 1.5; b = 8
        }else if(correlation == "moderate"){
            a = 2; b = 2
        }else{
            a = 5; b = 1
        }
            
        P = diag(0.5, p, p) #diagonal set to 0.5 so that P + t(P) below has 1s on the diagonal (Cor(x, x) = 1)

        for(i in 1:(p-1)) for(j in (i+1):p) P[i, j] = rbeta(1, a, b) #fill the upper triangle with Beta-distributed correlations

        P = P + t(P) #add P to its transpose to make the correlation matrix symmetric
            
        S = diag(V^(1/2)) %*% P %*% diag(V^(1/2))
    }
    
    return(list(M = M, S = S))
}

Hooray, we created our own parameter generation function for the MVN distribution. With this function we can easily generate MVNs with different characteristics.

In [7]:
MVN_prm_genarator(p = 10, M_interval = c(1, 4), V_interval = c(1, 10), correlation = "modarate")
$M
  1. 1.7965259894263
  2. 2.11637169891037
  3. 2.71856009005569
  4. 3.72462336998433
  5. 1.60504579311237
  6. 3.69516905490309
  7. 3.83402580581605
  8. 2.98239337746054
  9. 2.88734213169664
  10. 1.18535881140269
$S
2.853771 2.680716 4.090341 2.303201 4.618318 3.254936 3.827945 5.109369 3.376741 4.438318
2.680716 2.589011 4.132217 3.218455 3.847530 3.509831 3.605640 4.034688 2.361778 4.128618
4.090341 4.132217 7.183206 5.552207 6.107381 5.665831 6.009846 6.075830 4.092465 6.213174
2.303201 3.218455 5.552207 4.456933 5.763581 3.921614 4.435066 6.250565 4.105067 3.697340
4.618318 3.847530 6.107381 5.763581 7.928573 4.032203 6.919266 8.395188 4.776935 7.607822
3.254936 3.509831 5.665831 3.921614 4.032203 5.479293 5.782852 6.610335 4.841838 6.587352
3.827945 3.605640 6.009846 4.435066 6.919266 5.782852 7.458567 6.324012 2.671361 7.154333
5.109369 4.034688 6.075830 6.250565 8.395188 6.610335 6.324012 9.927155 5.307306 8.895329
3.376741 2.361778 4.092465 4.105067 4.776935 4.841838 2.671361 5.307306 4.420317 4.901810
4.438318 4.128618 6.213174 3.697340 7.607822 6.587352 7.154333 8.895329 4.901810 7.997007
In [8]:
MVN_prm_genarator(p = 3, M_interval = c(-2, 2), V_interval = c(1, 6), correlation = "none")
$M
  1. -0.9379653474316
  2. -0.511504401452839
  3. 0.291413453407586
$S
5.541039 0.00000 0.000000
0.000000 2.00841 0.000000
0.000000 0.00000 5.491948
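
One caveat about this generator: filling the off-diagonal entries with independent Beta draws does not guarantee that P (and therefore S) is positive definite, especially for larger p or stronger correlations, so it is worth checking the eigenvalues of the result. A minimal sketch (the helper name is_valid_cov is only illustrative):

is_valid_cov = function(S, tol = 1e-8){
    symmetric = isTRUE(all.equal(S, t(S))) #a covariance matrix must be symmetric
    eig = eigen(S, symmetric = TRUE, only.values = TRUE)$values
    symmetric && all(eig > tol) #and all of its eigenvalues must be strictly positive
}

prm = MVN_prm_genarator(p = 10, M_interval = c(1, 4), V_interval = c(1, 10), correlation = "strong")
print(is_valid_cov(prm$S)) #FALSE means the draw should be repeated, e.g. with another seed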

5. Plot the 2-dimensional (bivariate) Normal Distribution

In [9]:
library(plot3D)
library(plotly)
In [10]:
plot_mvn = function(MS, x1_interval, x2_interval){

    x1 = seq(x1_interval[1], x1_interval[2], length = 100) #grid for the first variable
    x2 = seq(x2_interval[1], x2_interval[2], length = 100) #grid for the second variable
    Mu = mesh(x1, x2) #100 x 100 grid of (x1, x2) pairs (mesh() is from plot3D)

    x = Mu$x
    y = Mu$y

    z = matrix(0, 100, 100) #density evaluated at every grid point
    for(i in 1:100){
      for(j in 1:100){
        z[i, j] = MultiNorm(x = matrix(c(x[i, j], y[i, j]), 2, 1), M = MS$M, S = MS$S)
      }
    }

    df = list(x1 = x1, x2 = x2, z = z)

    plot_ly(x = df$x1, y = df$x2, z = df$z) %>% add_surface() #interactive 3D surface plot
}

We are going to compare the effect of correlation on the density surface, using four bivariate normal distributions that all have the same means and variances but differ in their correlations.

In [11]:
none = MVN_prm_genarator(p = 2, M_interval = c(0, 0), V_interval = c(10, 10), correlation = "none")
weak = MVN_prm_genarator(p = 2, M_interval = c(0, 0), V_interval = c(10, 10), correlation = "weak")
moderate = MVN_prm_genarator(p = 2, M_interval = c(0, 0), V_interval = c(10, 10), correlation = "moderate")
strong = MVN_prm_genarator(p = 2, M_interval = c(0, 0), V_interval = c(10, 10), correlation = "strong")

Let's print their Variance-Covariance matrices:

In [12]:
print("No covariance:")
print(none$S)

print("Weak covariance:")
print(weak$S)

print("Moderate covariance:")
print(moderate$S)

print("Strong covariance:")
print(strong$S)
[1] "No covariance:"
     [,1] [,2]
[1,]   10    0
[2,]    0   10
[1] "Weak covariance:"
           [,1]       [,2]
[1,] 10.0000000  0.8273305
[2,]  0.8273305 10.0000000
[1] "Moderate covariance:"
          [,1]      [,2]
[1,] 10.000000  3.275025
[2,]  3.275025 10.000000
[1] "Strong covariance:"
          [,1]      [,2]
[1,] 10.000000  6.438026
[2,]  6.438026 10.000000
In [13]:
x1_interval = c(-12, 12)
x2_interval = c(-12, 12)
In [14]:
print("No covariance:")
plot_mvn(none, x1_interval, x2_interval)
[1] "No covariance:"

In [15]:
print("Weak covariance:")
plot_mvn(weak, x1_interval, x2_interval)
[1] "Weak covariance:"

In [16]:
print("Moderate covariance:")
plot_mvn(moderate, x1_interval, x2_interval)
[1] "Moderate covariance:"

In [17]:
print("Strong covariance:")
plot_mvn(strong, x1_interval, x2_interval)
[1] "Strong covariance:"

When the two variables have equal means and variances and no covariance, the surface looks like a symmetric bell. As the covariance increases, the contours in the x1-x2 plane become more elliptic. The reason is quite intuitive: when X1 and X2 are positively correlated, the probability mass accumulates around a line with positive slope in the x1-x2 plane, and around a line with negative slope when the correlation is negative.
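
A contour view makes these elliptical shapes easier to see than the surface plot. A minimal sketch using base R's contour() on the same kind of density grid as plot_mvn, shown here for the "strong" case only:

x1 = seq(-12, 12, length = 100)
x2 = seq(-12, 12, length = 100)

z = matrix(0, 100, 100)
for(i in 1:100){
  for(j in 1:100){
    z[i, j] = MultiNorm(c(x1[i], x2[j]), M = strong$M, S = strong$S)
  }
}

contour(x1, x2, z, xlab = "x1", ylab = "x2", main = "Strong positive covariance") #contours are ellipses tilted along a positive-slope line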