
Principal Component Analysis Derivation


Principal Component Analysis (PCA) is an important technique to understand in the fields of statistics and data science. It is the process of computing the principal components and using them to perform a change of basis on the data. Data in high dimensions is very hard to visualise and understand, and this is where PCA comes to the rescue.

Introduction

Principal Component Analysis (PCA) is a widely used technique for dimensionality reduction of large data sets. It requires some knowledge of linear algebra, such as vector projection, eigenvalues and eigenvectors, Lagrange multipliers, matrix derivatives, and the covariance matrix.

Derivation

Let’s go ahead and get into it. Say we have $N$ vectors $\mathbf{x}_1, \dots, \mathbf{x}_N$, each with dimension $D$. Our goal in PCA is, of course, dimensionality reduction: we want to map the space of dimensionality $D$ onto a space of dimensionality $M$, where $M$ is strictly less than $D$. That’s the point of dimensionality reduction.

It looks like a scary and daunting task, so let’s take a step back and look at a simpler picture. Our objective is to maximise the variance of the projections onto the lower-dimensional space. In other words, we have this $D$-dimensional space that contains all this information and all this data, and we want to reduce its dimensionality in a clever way: we want to project onto a space that preserves as much of the information (the original variation) as possible while reducing the number of dimensions.

Projection

Let’s start with a single dimension ($M = 1$). The projection of a data point $\mathbf{x}_n$ onto a candidate direction $\mathbf{u}_1$ can be written as

$$\mathbf{u}_1^T \mathbf{x}_n$$

where $\mathbf{u}_1$ is a unit vector, so its length is 1, and we can write this as

$$\mathbf{u}_1^T \mathbf{u}_1 = 1$$

Finally, we know that the mean of the projections over all the data is $\mathbf{u}_1^T \bar{\mathbf{x}}$, where $\bar{\mathbf{x}} = \frac{1}{N}\sum_{n=1}^{N} \mathbf{x}_n$; since the mean is a linear operation, it behaves in exactly the same way under the projection.
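To make this concrete, here is a minimal NumPy sketch (the data values and the direction `u` below are made up purely for illustration) that projects a small data set onto a unit vector and checks that the mean of the projections equals the projection of the mean:

```python
import numpy as np

# Toy data: N = 5 points in D = 3 dimensions (hypothetical values).
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.2],
    [2.2, 2.9, 0.3],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 0.1],
])

# A candidate direction, normalised so that u^T u == 1.
u = np.array([1.0, 1.0, 0.0])
u = u / np.linalg.norm(u)

# Projection of every data point onto u: u^T x_n for each n.
projections = X @ u            # shape (N,)

# The mean is linear: mean of projections == projection of the mean.
x_bar = X.mean(axis=0)
print(np.isclose(projections.mean(), u @ x_bar))  # True
```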

Variance

Going back to our objective, our goal is to maximise the variance of the projected data. By the definition of variance,

$$\frac{1}{N}\sum_{n=1}^{N}\left(\mathbf{u}_1^T\mathbf{x}_n - \mathbf{u}_1^T\bar{\mathbf{x}}\right)^2$$

Next, we factor out $\mathbf{u}_1^T$,

$$\frac{1}{N}\sum_{n=1}^{N}\left(\mathbf{u}_1^T(\mathbf{x}_n - \bar{\mathbf{x}})\right)^2$$

And we expand it,

$$\frac{1}{N}\sum_{n=1}^{N}\mathbf{u}_1^T(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})^T\mathbf{u}_1$$

Next,

$$\mathbf{u}_1^T\left[\frac{1}{N}\sum_{n=1}^{N}(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})^T\right]\mathbf{u}_1$$

where

$$S = \frac{1}{N}\sum_{n=1}^{N}(\mathbf{x}_n - \bar{\mathbf{x}})(\mathbf{x}_n - \bar{\mathbf{x}})^T$$

is the closed form of the covariance matrix, so we are then left with

$$\mathbf{u}_1^T S \mathbf{u}_1$$
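As a quick sanity check on this closed form, the following sketch (reusing the same made-up toy data as before) compares the variance computed directly from the definition with $\mathbf{u}_1^T S \mathbf{u}_1$:

```python
import numpy as np

# Same hypothetical toy data and direction as above.
X = np.array([
    [2.5, 2.4, 0.5],
    [0.5, 0.7, 1.2],
    [2.2, 2.9, 0.3],
    [1.9, 2.2, 0.8],
    [3.1, 3.0, 0.1],
])
u = np.array([1.0, 1.0, 0.0])
u = u / np.linalg.norm(u)

x_bar = X.mean(axis=0)
centred = X - x_bar

# Covariance matrix S = (1/N) * sum_n (x_n - x_bar)(x_n - x_bar)^T.
S = centred.T @ centred / X.shape[0]

# Variance of the projections, straight from the definition...
var_direct = np.mean((X @ u - u @ x_bar) ** 2)
# ...and via the closed form u^T S u.
var_closed = u @ S @ u

print(np.isclose(var_direct, var_closed))  # True
```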

Lagrange Multiplier

In this part, we want to maximise the variance of the projections, which as we found is $\mathbf{u}_1^T S \mathbf{u}_1$, subject to the constraint $\mathbf{u}_1^T \mathbf{u}_1 = 1$, i.e. $\mathbf{u}_1$ is a unit vector. Using the power of Lagrange multipliers, we have a new objective function, which looks like

$$\mathbf{u}_1^T S \mathbf{u}_1 + \lambda_1\left(1 - \mathbf{u}_1^T \mathbf{u}_1\right)$$

We are just going to take the derivative of this expression with respect to the vector $\mathbf{u}_1$ and set it to zero, so we get

$$2S\mathbf{u}_1 - 2\lambda_1\mathbf{u}_1 = 0$$

We get

$$S\mathbf{u}_1 = \lambda_1\mathbf{u}_1$$

This means that whatever direction $\mathbf{u}_1$ we choose to project onto has to be an eigenvector of the covariance matrix $S$, because this is exactly the definition of an eigenvector. But there are lots of eigenvectors and eigenvalues, so which eigenvector and which eigenvalue should we use?
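If you want to verify this condition numerically, here is a small sketch, with a hypothetical symmetric matrix standing in for $S$, showing that the eigenpairs returned by NumPy satisfy $S\mathbf{u} = \lambda\mathbf{u}$:

```python
import numpy as np

# A hypothetical symmetric matrix playing the role of the covariance matrix S.
S = np.array([
    [0.79, 0.73, -0.31],
    [0.73, 0.75, -0.26],
    [-0.31, -0.26, 0.15],
])

# eigh is used because S is symmetric; eigenvalues come back in ascending order.
eigvals, eigvecs = np.linalg.eigh(S)

# Check S u = lambda u for every eigenpair.
for lam, u in zip(eigvals, eigvecs.T):
    assert np.allclose(S @ u, lam * u)
print("All eigenpairs satisfy S u = lambda u")
```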

Eigenvectors and Eigenvalues

To figure this out, left-multiply both sides of $S\mathbf{u}_1 = \lambda_1\mathbf{u}_1$ by $\mathbf{u}_1^T$ and use $\mathbf{u}_1^T\mathbf{u}_1 = 1$, so we know that

$$\mathbf{u}_1^T S \mathbf{u}_1 = \lambda_1$$

If we want the maximum value of $\mathbf{u}_1^T S \mathbf{u}_1$, the variance of the projected data, then we should select the dominant (largest) eigenvalue $\lambda_1$ and its corresponding eigenvector.

To be more general, if we want to project the data onto more than just one dimension, we figure out the second biggest eigenvalue and use the eigenvector corresponding to it, and so on. You just go down the line for however many components you want to end up with, as in the sketch below.
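Putting the whole derivation together, a minimal end-to-end PCA sketch might look like the following (the random data and the choice $M = 2$ are arbitrary placeholders):

```python
import numpy as np

rng = np.random.default_rng(0)

N, D, M = 200, 5, 2          # N points in D dimensions, reduced to M dimensions
X = rng.normal(size=(N, D))  # hypothetical data set

# 1. Centre the data and form the covariance matrix S.
x_bar = X.mean(axis=0)
centred = X - x_bar
S = centred.T @ centred / N

# 2. Eigendecomposition of the symmetric matrix S.
eigvals, eigvecs = np.linalg.eigh(S)   # eigenvalues in ascending order

# 3. Keep the eigenvectors belonging to the M largest eigenvalues.
top = np.argsort(eigvals)[::-1][:M]
U = eigvecs[:, top]                    # shape (D, M)

# 4. Project the centred data onto the new M-dimensional basis.
Z = centred @ U                        # shape (N, M)

# The variance captured along each principal component is its eigenvalue.
print("captured variances:", eigvals[top])
print("projected shape:", Z.shape)
```

Note that `np.linalg.eigh` is used rather than `np.linalg.eig` because $S$ is symmetric, which guarantees real eigenvalues and orthogonal eigenvectors.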

Conclusion

In this article, we walked through the moving parts behind Principal Component Analysis (PCA), which I believe will give you some insight into what’s actually happening. I hope you were able to follow along. Stay tuned, bye!


Author: Yang Wang