Notes on EOF Analysis


I have recently been doing some basic Empirical Orthogonal Function (EOF) analysis of oceanographic data and have found the literature to be rather confusing.  Here I have collected a few notes on the subject, MATLAB code and useful references.  The discussion is very basic and is not meant as an in-depth treatment of EOF analysis.  If you have any corrections to these notes, my contact information is here.


Terminology:

First of all, there is no consistent terminology for EOF analysis.  Several competing, and at times contradictory, terminologies are in use, which can make the literature exceedingly difficult to understand.  I will use "empirical orthogonal functions" or EOFs to refer to the "spatial" patterns that result from an EOF analysis and "expansion coefficients" or ECs to refer to the "temporal" patterns.  In the literature, you will find:
EOFs = principal component loading patterns or, at times, just principal components

ECs = EOF time series, expansion coefficient time series, principal component time series, principal component scores, principal component amplitudes or, at times, just principal components
There is also talk of covariance matrices and communalities.  I will explain what these are later.

Doing EOF analysis in 5 minutes or less:

This is the quickstart to doing EOF analysis.
  1. Put your data into a matrix so that the rows indicate temporal development and the columns are variables or spatial data points.  The temporal relationship between rows is unimportant (i.e., the sampling does not have to be uniform).  The same holds for the spatial relationship between columns.
  2. Detrend the columns of the resulting matrix (at a minimum, remove the mean of each column).  Some EOF routines do this for you, but I prefer to do it separately.
  3. Use singular value decomposition (SVD) to break up your data matrix Z into 3 matrices:
    Z = U * D * V^T
    where U and V are orthonormal and D is diagonal.  Then,
    EOFs = V
    ECs = U * D
    covariance matrix = ECs^T * ECs / (n-1) = D^2 / (n-1)
    communalities matrix = ECs * ECs^T
    where n is the number of rows of Z.
That is really all there is to it.  Each individual EOF is a column of the EOFs matrix, and the corresponding EC is the matching column of the ECs matrix.  I have included MATLAB code that performs step 3 above.  See the references for a more detailed discussion.
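
The quickstart above can be sketched in Python with NumPy.  This is only a minimal sketch and is not the EOF.m routine listed further down: the function name eof_analysis, the synthetic test field, and the use of plain mean removal as a stand-in for detrending are all assumptions made for illustration.

```python
import numpy as np

def eof_analysis(data):
    """EOF analysis of a (n_times, n_points) matrix via SVD.

    Assumes the columns of `data` have already been detrended.
    Returns (eofs, ecs, variances): the columns of `eofs` are the
    spatial patterns, the columns of `ecs` are the matching time
    series, and `variances` is the diagonal of D^2 / (n-1).
    """
    z = np.asarray(data, dtype=float)
    n = z.shape[0]
    # Economy-size SVD: z = u @ diag(d) @ vt
    u, d, vt = np.linalg.svd(z, full_matrices=False)
    eofs = vt.T               # EOFs = V
    ecs = u * d               # ECs = U * D (scales column j of u by d[j])
    variances = d**2 / (n - 1)
    return eofs, ecs, variances

# Hypothetical synthetic field: one oscillating pattern plus noise.
rng = np.random.default_rng(0)
t = np.linspace(0.0, 10.0, 50)          # 50 times (rows)
x = np.linspace(0.0, 1.0, 8)            # 8 spatial points (columns)
field = np.outer(np.sin(t), np.cos(np.pi * x))
field = field + 0.1 * rng.standard_normal(field.shape)

# Simplest possible "detrend": remove the mean of each column.
field = field - field.mean(axis=0)

eofs, ecs, variances = eof_analysis(field)
print("fraction of variance per mode:", variances / variances.sum())
```

Multiplying ecs by the transpose of eofs recovers the detrended data, which is a convenient sanity check on any EOF routine.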

After finishing these calculations, you will probably want to reduce the EOFs and ECs to only those which explain a significant percentage of the overall variance, by keeping only the corresponding columns of the ECs and EOFs matrices.  You then may or may not wish to rotate the EOFs to make the resulting patterns easier to interpret physically.  Finally, there are a number of useful ways to visualize the results of your analysis.  I will not discuss visualization here.  I will also not discuss EOF analysis of several fields.
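
One way to do the truncation is to keep the leading modes until a chosen fraction of the total variance is explained.  The sketch below is Python with NumPy; the function name truncate_modes and the threshold min_fraction are hypothetical, and what counts as a "significant" percentage is a judgment call that depends on the data.

```python
import numpy as np

def truncate_modes(eofs, ecs, variances, min_fraction=0.9):
    """Keep the leading modes that together explain `min_fraction`
    of the total variance.  Assumes `variances` is sorted in
    descending order, as returned by an SVD.
    """
    variances = np.asarray(variances, dtype=float)
    fractions = variances / variances.sum()
    cumulative = np.cumsum(fractions)
    # First index at which the cumulative fraction reaches the threshold.
    k = int(np.searchsorted(cumulative, min_fraction)) + 1
    k = min(k, variances.size)   # guard against round-off at 100%
    return eofs[:, :k], ecs[:, :k], fractions[:k]

# Made-up example: four modes with variances 5, 3, 1, 1.
variances = np.array([5.0, 3.0, 1.0, 1.0])
eofs = np.eye(4)                 # placeholder EOFs (4 spatial points)
ecs = np.ones((6, 4))            # placeholder ECs (6 times)
eofs_k, ecs_k, fractions_k = truncate_modes(eofs, ecs, variances,
                                            min_fraction=0.75)
print(eofs_k.shape, ecs_k.shape, fractions_k)
```

Here the first two modes explain 50% and 30% of the variance, so two columns are kept.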

Rotation of EOFs:

At times, the EOFs that result from the analysis will be difficult to explain in terms of physical forces.  In this case, it is often beneficial to rotate the orthogonal basis you found to one which can be better explained in terms of physical forces.  Upon rotation, you will at least lose the nice property that the ECs belonging to different EOFs are uncorrelated (no cross-correlations).  You will also lose orthonormality of the EOFs matrix if you choose a non-orthogonal transformation of the data.  It is also important to note that these rotations do not use any particular property of the EOFs (such as orthonormality), so if you perform them you essentially reduce EOF analysis to noise reduction via the reduction in the number of EOFs (i.e., the result no longer has much to do with EOF analysis).
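
A small numerical check shows what an orthogonal rotation does to these properties.  The sketch below is Python with NumPy and rests on assumptions: the data are random numbers, only two modes are retained, the rotation is an arbitrary 30 degree planar rotation, and the ECs are rotated with the same matrix as the EOFs so that the reconstruction of the data is unchanged.

```python
import numpy as np

rng = np.random.default_rng(1)
n, k = 40, 2                                   # 40 times, keep 2 modes

# Random data whose columns have different scales, then mean removal.
z = rng.standard_normal((n, 5)) * np.array([3.0, 2.0, 1.0, 0.5, 0.25])
z = z - z.mean(axis=0)

u, d, vt = np.linalg.svd(z, full_matrices=False)
eofs_k = vt[:k].T                              # leading k EOFs
ecs_k = u[:, :k] * d[:k]                       # matching ECs

# Arbitrary orthogonal (planar) rotation by 30 degrees.
theta = np.deg2rad(30.0)
rot = np.array([[np.cos(theta), -np.sin(theta)],
                [np.sin(theta),  np.cos(theta)]])
rotated_eofs = eofs_k @ rot
rotated_ecs = ecs_k @ rot

eof_gram = rotated_eofs.T @ rotated_eofs       # stays the identity
ec_cov_before = ecs_k.T @ ecs_k / (n - 1)      # diagonal
ec_cov_after = rotated_ecs.T @ rotated_ecs / (n - 1)  # off-diagonal terms

print(np.round(eof_gram, 12))
print(np.round(ec_cov_before, 6))
print(np.round(ec_cov_after, 6))
```

With an orthogonal rotation the rotated patterns remain orthonormal, while the covariance matrix of the rotated ECs picks up off-diagonal terms.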

I will only discuss a particularly common orthogonal rotation known as varimax.  It seems to be the most popular and certainly has a logical explanation.  In its standard form it seeks the orthogonal rotation that maximizes the variance of the squared entries within each rotated pattern, so that each pattern is dominated by a few large entries with the rest near zero.  This tends to put the basis closer to the actual data and increases interpretability.

I have included MATLAB code and references for doing varimax rotation.  The code has extensive documentation that represents my best understanding of varimax rotation.
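
For readers without MATLAB, here is a minimal Python/NumPy sketch of raw varimax using the classical pairwise planar rotations.  It is not a port of varimax.m and may differ from it in detail (for example, it applies no row normalization); the function name, the tolerance, and the toy patterns are assumptions made for illustration.

```python
import numpy as np

def varimax(patterns, max_sweeps=50, tol=1e-10):
    """Raw varimax rotation of the columns of a (p, k) matrix.

    Each pair of columns is rotated by the angle that maximizes the
    varimax criterion for that pair; sweeps over all pairs repeat
    until every angle is below `tol`.
    Returns (rotated, rotation) with rotated = patterns @ rotation.
    """
    b = np.array(patterns, dtype=float)
    p, k = b.shape
    rotation = np.eye(k)
    for _ in range(max_sweeps):
        largest_angle = 0.0
        for i in range(k - 1):
            for j in range(i + 1, k):
                x, y = b[:, i], b[:, j]
                u = x**2 - y**2
                v = 2.0 * x * y
                num = 2.0 * np.sum(u * v) - 2.0 * u.sum() * v.sum() / p
                den = np.sum(u**2 - v**2) - (u.sum()**2 - v.sum()**2) / p
                phi = 0.25 * np.arctan2(num, den)   # optimal pair angle
                largest_angle = max(largest_angle, abs(phi))
                c, s = np.cos(phi), np.sin(phi)
                g = np.array([[c, -s], [s, c]])
                b[:, [i, j]] = b[:, [i, j]] @ g
                rotation[:, [i, j]] = rotation[:, [i, j]] @ g
        if largest_angle < tol:
            break
    return b, rotation

# Toy check: two non-overlapping patterns, mixed by a 0.5 rad rotation.
simple = np.array([[1, 0], [1, 0], [1, 0],
                   [0, 1], [0, 1], [0, 1]], dtype=float)
c0, s0 = np.cos(0.5), np.sin(0.5)
mixed = simple @ np.array([[c0, -s0], [s0, c0]])

rotated, rotation = varimax(mixed)
print(np.round(rotated, 6))
```

In this toy case the rotation undoes the mixing and recovers the two non-overlapping patterns.  To apply it to an EOF analysis, one option is to rotate the retained EOFs and then multiply the retained ECs by the same rotation matrix.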

MATLAB Code:

These are a couple of generic routines for EOF analysis and rotation.

EOF.m
varimax.m

References:

The following references were useful to me:
H. Bjornsson and S.A. Venegas.  1997.  A Manual for EOF and SVD Analyses of Climatic Data.  CCGCR Report No. 97-1, Feb. 1997, 52 pages.

Rudolph W. Preisendorfer and Curtis D. Mobley.  1988.  Principal component analysis in meteorology and oceanography.  Elsevier.

M.B. Richman.  1986.  Rotation of principal components.  Journal of Climatology, vol. 6, no. 3, pp. 293-335.

H. von Storch and A. Navarra.  1999.  Analysis of climate variability: applications of statistical techniques: proceedings of an autumn school organized by the Commission of the European Community on Elba from October 30 to November 6, 1993.  Springer.

These references are more technical, but useful nonetheless (links might not function outside of U. of California):
J.D. Horel.  1984.  Complex principal component analysis: theory and examples.  Journal of Climate and Applied Meteorology, vol. 23, no. 12, pp. 1660-73, Dec. 1984.

N.E. Huang.  2001.  Review of empirical mode decomposition.  Proceedings of the SPIE - The International Society for Optical Engineering, vol. 4391, pp. 71-80.

I.T. Jolliffe and M.B. Richman.  1987.  Rotation of principal components: some comments (with reply).  Journal of Climatology, vol. 7, no. 5, pp.  507-20.

M.A. Merrifield and R.T. Guza.  1990.  Detecting propagating signals with complex empirical orthogonal functions: a cautionary note.  Journal of Physical Oceanography, vol. 20, no. 10, pp. 1628-33, Oct. 1990.

F.W. Zwiers.  1999.  The detection of climate change.  In: Anthropogenic climate change, edited by H. von Storch and G. Floser.  Berlin, Germany: Springer-Verlag, pp. 161-206.