I'm assuming you determined the eigenvectors from the eig function. What I would recommend to you in the future is to use the eigs function. It computes only the k largest eigenvalues (largest in magnitude by default) together with their associated eigenvectors. This may save computational overhead, because you don't have to compute all of the eigenvalues and associated eigenvectors of your matrix when you only want a subset. You simply supply the covariance matrix of your data to eigs and it returns the k largest eigenvalues and eigenvectors for you.

Now, back to your problem: what you are describing is ultimately Principal Component Analysis. The mechanics behind this would be to compute the covariance matrix of your data and find the eigenvalues and eigenvectors of the computed result. Doing it this way is generally not recommended for large matrices, because explicitly forming the covariance matrix and then computing its eigenvalues and eigenvectors can be numerically unstable. The most canonical way to do this now is via the Singular Value Decomposition of the mean-centred data. Concretely, the columns of the V matrix give you the eigenvectors of the covariance matrix, i.e. the principal components, and the associated eigenvalues are the squares of the singular values on the diagonal of the matrix S, divided by n - 1, where n is the number of observations. Cross Validated has an informative post on why this is preferred, and there is another write-up on the theory behind why the Singular Value Decomposition is used in Principal Component Analysis.

Now let's answer your questions one at a time. MATLAB's eig does not guarantee that the eigenvalues, and therefore the corresponding eigenvector columns, come back in any particular order, so you should treat them as unsorted. If you wish to select the largest k eigenvalues and associated eigenvectors from the output of eig (800 in your example), you'll need to sort the eigenvalues in descending order, rearrange the columns of the eigenvector matrix produced by eig to match, and then select the first k of each. I would not rely on eigs returning its results in sorted order either, so sort those explicitly too when it comes down to it.

In MATLAB, doing what we described above starts with sigma = cov(B), followed by a call to eig on sigma that returns the eigenvector matrix A and the diagonal eigenvalue matrix D.

Note that the sorting is done on the absolute value of the eigenvalues, so that components are ranked by magnitude rather than by sign. For a true covariance matrix this is only a safeguard, because such a matrix is symmetric positive semidefinite and its eigenvalues are non-negative up to round-off. For a general symmetric matrix it matters: if we had a component whose eigenvalue was, say, -10000, that would be a very good indication that the component has some significant meaning to your data, and if we sorted purely on the signed numbers it would be placed near the lower ranks.

The first line of code finds the covariance matrix of B, even though you said it's already stored in sigma, but let's make this reproducible. Next, we find the eigenvalues of your covariance matrix and the associated eigenvectors. Take note that each column of the eigenvector matrix A represents one eigenvector. Specifically, the i-th column of A is the eigenvector that corresponds to the i-th eigenvalue on the diagonal of D. However, the eigenvalues are in a diagonal matrix, so we extract the diagonal with the diag command, sort the values and figure out their ordering, then rearrange A to respect this ordering. I use the second output of sort because its i-th entry tells you where in the unsorted vector the i-th sorted value came from. This is exactly the ordering we need to rearrange the columns of the eigenvector matrix A. It's imperative that you choose 'descend' as the flag so that the largest eigenvalue and associated eigenvector appear first, just like we talked about before.

You can then pluck out the k largest vectors and values by setting k = 800 and taking the first k columns of the rearranged A together with the first k sorted eigenvalues.

It's a well-known fact that the eigenvectors of the covariance matrix are the principal components. Concretely, the first principal component (i.e. the eigenvector associated with the largest eigenvalue) gives you the direction of maximum variability in your data. Each principal component after that accounts for a decreasing amount of variability. It's also good to note that the principal components are mutually orthogonal. The Wikipedia article on Principal Component Analysis has a good illustration of this for two-dimensional data.
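The steps above can be put together in a short MATLAB sketch. It is only an illustration under stated assumptions: the data matrix B is random stand-in data with observations in rows, k is set to 5 instead of 800 so that it fits the small example, and the names vals, ind, Asort, valsSort, Aq, lambda, Bc and lambdaSvd are chosen here for readability. It sorts the output of eig by hand, then cross-checks the result against eigs and against the SVD of the mean-centred data.

```matlab
% Hypothetical stand-in data: 500 observations (rows) of 20 variables (columns).
rng(0);                                  % fixed seed so the sketch is repeatable
B = randn(500, 20) * randn(20, 20);
k = 5;                                   % components to keep (800 in the question)

% Route 1: eig on the covariance matrix, followed by an explicit sort.
sigma = cov(B);                          % 20-by-20 covariance matrix
[A, D] = eig(sigma);                     % columns of A are eigenvectors, diag(D) the eigenvalues
vals = diag(D);                          % pull the eigenvalues out of the diagonal matrix
[~, ind] = sort(abs(vals), 'descend');   % ind(i): index in vals of the i-th largest magnitude
Asort = A(:, ind);                       % reorder the eigenvector columns to match
valsSort = vals(ind);
Aq = Asort(:, 1:k);                      % the k leading eigenvectors
lambda = valsSort(1:k);                  % and their eigenvalues

% Route 2: eigs asks for only the k largest-magnitude eigenpairs.
[Ae, De] = eigs(sigma, k);
[lambdaEigs, order] = sort(diag(De), 'descend');   % sort explicitly, as advised above
Ae = Ae(:, order);

% Route 3: SVD of the mean-centred data, never forming the covariance matrix.
Bc = B - mean(B, 1);                     % centre each column (implicit expansion, R2016b or later)
[~, S, V] = svd(Bc, 'econ');             % singular values are returned in descending order
s = diag(S);
lambdaSvd = s(1:k).^2 / (size(B, 1) - 1);          % eigenvalues of cov(B)

% The routes should agree up to round-off and the sign of each eigenvector.
disp(max(abs(lambda - lambdaSvd)));
disp(max(abs(lambda - lambdaEigs)));
disp(max(max(abs(abs(Aq' * V(:, 1:k)) - eye(k)))));
```

The three displayed numbers should be on the order of round-off. The last check compares eigenvectors through the absolute value of their inner products because an eigenvector is only defined up to sign, so eig and svd may legitimately return opposite directions for the same component.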