With a large number of variables measuring different aspects of the same theme, we would like to summarize the information in a limited number of components, i.e. linear combinations of the original variables. Among linear dimension reduction techniques, principal component analysis is optimal in at least two ways: principal components extract the maximum of the variability of the original variables, and they are uncorrelated. Unfortunately, they are often difficult to interpret. Moreover, in most applications, only the first principal component is a 'block component', the remaining components being 'difference components', which are more difficult to interpret. The goal of simple component analysis is to replace (or to supplement) principal components with suboptimal but more interpretable 'simple components'. We propose a fast algorithm which seeks the optimal system of components under constraints of simplicity. Thus, in contrast with other techniques such as 'varimax', this approach always provides a simple solution. The optimal simple system is suboptimal compared with principal components: less variability is extracted and the components are correlated. However, if the loss of extracted variability is small and the correlations between components are low, the simple system might be advantageous for practical use. Moreover, our concept of simplicity allows the system to have more than one block component, which also facilitates interpretation. Simplicity is not a guarantee of interpretability. With the help of our algorithm, the user can partly modify an optimal simple system of components to enhance interpretability. In this respect, the ultimate goal of simple component analysis is not to propose a method that leads automatically to a unique solution, but rather to develop tools for assisting the user in his or her choice of an interpretable solution. Finally, we argue that simple components may also make the task of choosing the dimension easier. The methodology is illustrated with a test battery to study the development of neuromotor functions in children and adolescents.
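
The trade-off described in the abstract can be illustrated with a small sketch. It is not the authors' algorithm, and it does not reproduce the paper's criterion for extracted variability; it only computes the variance of a component and the correlation between two components for integer-weight loadings. The correlation matrix `R` is a made-up toy example (two blocks of two variables), and the function names `quad_form`, `component_variance` and `component_correlation` are hypothetical helpers introduced here for illustration.

```python
from math import sqrt

# Toy correlation matrix (an assumption, not data from the paper):
# variables 1-2 form one block, variables 3-4 another.
R = [
    [1.0, 0.8, 0.2, 0.2],
    [0.8, 1.0, 0.2, 0.2],
    [0.2, 0.2, 1.0, 0.8],
    [0.2, 0.2, 0.8, 1.0],
]


def quad_form(R, a, b):
    """Return a' R b for loading vectors a and b."""
    n = len(a)
    return sum(a[i] * R[i][j] * b[j] for i in range(n) for j in range(n))


def component_variance(R, a):
    """Variance of the component with loadings a, after scaling a to unit length."""
    return quad_form(R, a, a) / sum(x * x for x in a)


def component_correlation(R, a, b):
    """Correlation between the components with loadings a and b."""
    return quad_form(R, a, b) / sqrt(quad_form(R, a, a) * quad_form(R, b, b))


# For this particular R, the first two principal components are a block
# component and a difference component.
pc_block = [1, 1, 1, 1]
pc_diff = [1, 1, -1, -1]

# A simple system with two block components, one per group of variables.
block_1 = [1, 1, 0, 0]
block_2 = [0, 0, 1, 1]

if __name__ == "__main__":
    print("PCA variances:", component_variance(R, pc_block),
          component_variance(R, pc_diff))
    print("PCA correlation:", component_correlation(R, pc_block, pc_diff))
    print("Simple variances:", component_variance(R, block_1),
          component_variance(R, block_2))
    print("Simple correlation:", component_correlation(R, block_1, block_2))
```

For this toy matrix, the two principal components have variances of about 2.2 and 1.4 and are uncorrelated, while the two block components each have variance of about 1.8 and a correlation of about 0.22. Each block component reads directly as "the average of one group of variables", at the price of a modest correlation between components.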

Rousson, V; Gasser, T (2004). *Simple component analysis.* Journal of the Royal Statistical Society: Series C (Applied Statistics), 53(4):539-555.

## Additional indexing

| Field | Value |
|---|---|
| Item Type | Journal Article, refereed, original work |
| Communities & Collections | 04 Faculty of Medicine > Epidemiology, Biostatistics and Prevention Institute (EBPI) |
| Dewey Decimal Classification | 610 Medicine & health |
| Language | English |
| Date | 2004 |
| Deposited On | 23 Jun 2009 07:35 |
| Last Modified | 05 Apr 2016 13:16 |
| Publisher | Royal Statistical Society |
| ISSN | 0035-9254 |
| Additional Information | The definitive version is available at www.blackwell-synergy.com |
| Publisher DOI | https://doi.org/10.1111/j.1467-9876.2004.05359.x |
