# Mathematical–statistical problem that has a significant implication on estimation of interval-specific rates of soil-forming processes

Matus, Francisco; Egli, Markus (2020). Mathematical–statistical problem that has a significant implication on estimation of interval-specific rates of soil-forming processes. Journal of Soil Science and Plant Nutrition, 20(1):12-18.

## Abstract

This paper raises a mathematical–statistical problem that can arise in any chronosequence of a Quaternary-based process (e.g. soil mass or clay content accumulation). The problem arises when interval-specific rates of a component f(t) are obtained by dividing the stock by the given age t. This procedure yields only an average rate that assumes linearity, even though a component such as soil organic carbon often tends towards an asymptotic end-value. The same holds true for other parameters (e.g. sedimentation rates, soil horizon thickness). The problem becomes apparent when rates derived by this linear calculation (individual stocks f(t) divided by time t) are plotted as a function of time in circumstances where t and f(t) are not correlated. To illustrate this, we used a random generator function for which t and f(t) are not correlated. We also give examples of chronosequence approaches for soil organic carbon and soil mass, where the real specific rate at a given point corresponds to the slope (first derivative) of the fitted non-linear regression function. For the random generator function, the plot of t versus f(t) showed no significant relationship (R² = 0.0), whereas f(t)·t⁻¹ and t were highly significantly correlated (R² = 0.62). When the processes are not linear, the error incurred by dividing by time instead of using the derivative function ranged between 22% and > 500%. We conclude that plotting non-independent variables in a chronosequence can suggest correlations even where none exist. Datasets that involve time need careful examination, particularly in Quaternary-based studies, since the mass component is not necessarily related to time. The change in a mass component should not be calculated by simply dividing by time, because this over- or under-estimates the real rate (obtained from the derivative function) when the process under study is not linear.
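
The two effects described in the abstract can be reproduced with a small numerical sketch. This is not the authors' code or data: the age range (1 to 20), the stock range (5 to 15), the sample size, the seed, and the saturating curve f(t) = f_max·(1 − exp(−k·t)) with f_max = 10 and k = 0.5 are all assumptions chosen for illustration, and the function names are hypothetical.

```python
import math
import random


def r_squared(x, y):
    """Squared Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def spurious_correlation(n=2000, seed=1):
    """Draw t and f(t) independently; return R^2 of (t, f) and of (t, f/t)."""
    rng = random.Random(seed)
    t = [rng.uniform(1.0, 20.0) for _ in range(n)]   # assumed ages
    f = [rng.uniform(5.0, 15.0) for _ in range(n)]   # assumed stocks, independent of t
    rate = [fi / ti for fi, ti in zip(f, t)]         # "rate" obtained by dividing by age
    return r_squared(t, f), r_squared(t, rate)


def rate_error(t, f_max=10.0, k=0.5):
    """Relative error (%) of the average rate f(t)/t against the derivative f'(t).

    Assumes the saturating curve f(t) = f_max * (1 - exp(-k * t)).
    """
    average = f_max * (1.0 - math.exp(-k * t)) / t
    instantaneous = f_max * k * math.exp(-k * t)
    return 100.0 * (average - instantaneous) / instantaneous


if __name__ == "__main__":
    r2_stock, r2_rate = spurious_correlation()
    print(f"R^2 of f(t) vs t:   {r2_stock:.3f}")
    print(f"R^2 of f(t)/t vs t: {r2_rate:.3f}")
    for age in (1.0, 5.0, 10.0):
        print(f"t = {age:4.1f}: average rate overestimates by {rate_error(age):.0f}%")
```

Because t appears on both axes of the second plot, f(t)/t correlates with t even though the stocks were drawn independently of age. For the assumed saturating curve, the average rate increasingly overestimates the instantaneous rate as the curve approaches its end-value. The exact numbers depend on the assumed ranges and parameters and are not meant to match the values reported in the paper.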
