Partial Least Squares (PLS)

Partial Least Squares (PLS) is a multivariate statistical technique for examining associations between two multivariate datasets. PLS identifies weighted linear combinations of the original variables in the two datasets that maximally covary with each other. These weighted linear combinations are termed latent variables and are mutually orthogonal, such that the patterns captured by one latent variable are uncorrelated with the patterns captured by any other latent variable. If you’re not familiar with PLS and want to learn more about it, McIntosh & Lobaugh (2004, NeuroImage) and McIntosh & Misic (2013, Annu. Rev. Psychol.) are good resources.
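To make the idea of "weighted linear combinations that maximally covary" concrete, here is a minimal NumPy sketch of one common formulation: a singular value decomposition of the cross-correlation matrix between the two z-scored datasets. This is an illustration only and not the pyls implementation; the function names (zscore, pls_svd) and the simulated data are assumptions made for the example.

```python
import numpy as np


def zscore(a):
    """Standardize each column to zero mean and unit (sample) variance."""
    return (a - a.mean(axis=0)) / a.std(axis=0, ddof=1)


def pls_svd(X, Y):
    """Illustrative PLS via SVD (hypothetical helper, not the pyls API).

    X is (n_samples, p) and Y is (n_samples, q); rows must correspond.
    """
    Xz, Yz = zscore(X), zscore(Y)
    # Cross-correlation matrix between the columns of X and Y (p x q)
    R = Xz.T @ Yz / (X.shape[0] - 1)
    # Columns of U and V are the weights defining each latent variable;
    # each singular value is the covariance captured by that latent variable
    U, s, Vt = np.linalg.svd(R, full_matrices=False)
    V = Vt.T
    # Scores: projection of each dataset onto its weights
    return U, s, V, Xz @ U, Yz @ V


# Simulated data (an assumption for the example): Y partly depends on X
rng = np.random.default_rng(0)
X = rng.standard_normal((50, 6))
Y = X[:, :3] @ rng.standard_normal((3, 4)) + rng.standard_normal((50, 4))

U, s, V, x_scores, y_scores = pls_svd(X, Y)
```

In this sketch the weight vectors are orthonormal (the columns of U are mutually orthogonal, as are the columns of V), and the covariance between the X and Y scores of a given latent variable equals the corresponding singular value.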

pyls is a Python package for performing different types of PLS analyses. Full documentation and code for pyls can be accessed here.

Note that the statistical testing of latent variables implemented in pyls is based on nulls that are generated by randomly permuting the rows of the two datasets included in the PLS analysis. However, if you are running an analysis where the two datasets are two sets of brain maps (e.g., one set of parcellated gene expression maps and another set of parcellated morphological maps), you will need to provide your own set of null models that preserve the spatial autocorrelation inherent to the data (see “Spin Tests” section).
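The difference between the two kinds of nulls can be sketched in plain NumPy. The example below is a simplified illustration and not how pyls implements its permutation test: it permutes the rows of one dataset to break the row correspondence, recomputes the singular values, and derives a p-value per latent variable. The optional perms argument is a hypothetical way to pass in your own resampling indices (for example, indices produced by a spin test) instead of random row permutations; all names here are assumptions for the example.

```python
import numpy as np


def singular_values(X, Y):
    """Singular values of the cross-correlation matrix of X and Y."""
    Xz = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    Yz = (Y - Y.mean(axis=0)) / Y.std(axis=0, ddof=1)
    return np.linalg.svd(Xz.T @ Yz / (X.shape[0] - 1), compute_uv=False)


def permutation_pvalues(X, Y, n_perm=200, perms=None, seed=0):
    """Illustrative permutation test (hypothetical helper, not the pyls API).

    perms, if given, is an (n_rows, n_perm) integer array of row indices,
    e.g. spatially constrained resamplings supplied by the user.
    """
    rng = np.random.default_rng(seed)
    if perms is None:
        # Default: unconstrained random row permutations
        perms = np.column_stack(
            [rng.permutation(Y.shape[0]) for _ in range(n_perm)]
        )
    observed = singular_values(X, Y)
    # Null distribution: singular values after reordering the rows of Y
    null = np.array(
        [singular_values(X, Y[perms[:, i]]) for i in range(perms.shape[1])]
    )
    # Proportion of nulls at least as large as observed (with +1 correction)
    pvals = (1 + (null >= observed).sum(axis=0)) / (1 + null.shape[0])
    return observed, null, pvals


# Simulated data (an assumption for the example): Y strongly depends on X
rng = np.random.default_rng(1)
X = rng.standard_normal((50, 6))
Y = X[:, :3] @ rng.standard_normal((3, 4)) + 0.5 * rng.standard_normal((50, 4))

observed, null, pvals = permutation_pvalues(X, Y, n_perm=200)
```

When the rows are brain regions, unconstrained permutations like the default above ignore spatial autocorrelation, which is why spatially constrained resampling indices would be supplied instead.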