A while back I wrote a post about Aitchison’s log-ratio transformation. It was about eliminating the dilution effect of an element 1c on elements 1a and 1b by using the ratio 1a/1b or the log-ratio log(1a/1b). The effect of the log-ratio transformation can also be shown very nicely in the spectral domain. Here is an example.
Time Domain
Here is the MATLAB example illustrating the closed-sum problem of a system with three variables, and the use of ratios as well as log-ratios to overcome the problem of spurious correlations between pairs of variables. First, we clear the workspace and choose colors for the plots.
clear, clc, close all
colors = [
    0 114 189
    217 83 25
    237 177 32
    126 47 142
    ]./255;
We are interested in element 1a and element 1b, which are diluted by element 1c. We simply create three variables, with magnitudes measured in milligrams, that contribute to a sediment and vary sinusoidally down core over 1,000 samples (more than in Part 1). Make sure that all absolute amounts are >0.
t = 0.1 : 0.1 : 100; t = t';
Aitchison’s log-ratio transformation only works if all elements are >0. If they are not, we mix positive and negative values when calculating ratios, which gives useless results. In our example we get positive values by adding 5 to each sinusoid (see Part 1 for MATLAB code for graphics).
element1a = sin(2*pi*t/2) + 5;
element1b = sin(2*pi*t/5) + 5;
element1c = 2*sin(2*pi*t/20) + 5;
We now calculate percentages of elements 1a-c, i.e. the ratio of each individual element to the sum of all elements. This process creates closed data: the data are now expressed as proportions that add up to a fixed total of 100 percent, 1a+1b+1c = 100%. Elements 1a and 1b are now affected by the dilution by element 1c. They show a significant sinusoidal long-term trend that is not real, as the first figure shows (see Part 1 for MATLAB code for graphics).
element1a_perc = element1a./...
    (element1a+element1b+element1c);
element1b_perc = element1b./...
    (element1a+element1b+element1c);
element1c_perc = element1c./...
    (element1a+element1b+element1c);
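As a quick numerical check of the closure described above, the following self-contained sketch rebuilds the three elements and compares the correlation between 1a and 1c before and after closure. It is an illustration under the same assumptions as the example, not part of the original script; the names `total`, `closureError`, `rRaw` and `rClosed` are introduced here only for this purpose.

```matlab
% Illustrative sketch: closure and the spurious correlation it creates.
t = (0.1 : 0.1 : 100)';
element1a = sin(2*pi*t/2) + 5;
element1b = sin(2*pi*t/5) + 5;
element1c = 2*sin(2*pi*t/20) + 5;

% Close the data: each element divided by the sum of all elements.
total = element1a + element1b + element1c;
element1a_perc = element1a./total;
element1b_perc = element1b./total;
element1c_perc = element1c./total;

% The three proportions add up to 1 (i.e. 100%) in every sample.
closureError = max(abs(element1a_perc + ...
    element1b_perc + element1c_perc - 1));

% Correlation of 1a with the diluting element 1c, raw versus closed.
rRaw = corrcoef(element1a,element1c);
rClosed = corrcoef(element1a_perc,element1c_perc);

fprintf('Closure error: %g\n',closureError)
fprintf('r(1a,1c) raw: %.3f, closed: %.3f\n',...
    rRaw(1,2),rClosed(1,2))
```

The raw series should be essentially uncorrelated, since they have different periods, whereas the closed series should show a clearly negative correlation that exists only because of the constant sum.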
Build the ratios 1a/1b and 1b/1a, which are both independent of element 1c. Note the reversed sign of the fluctuations and the difference in the amplitudes. The ratios 1a/1b and 1b/1a do not show the trend caused by the dilution effect of element 1c. However, the two curves are not mirror images of each other, i.e. 1a/1b and 1b/1a are not symmetric (see Weltje and Tjallingii 2008, page 426).
ratio12 = element1a_perc./element1b_perc;
ratio21 = element1b_perc./element1a_perc;
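The reason the ratio is independent of element 1c is that the common denominator 1a+1b+1c cancels when one percentage is divided by another. The following self-contained sketch checks this numerically; `total`, `ratioClosed`, `ratioRaw` and `maxDiff` are names chosen here for illustration only.

```matlab
% Illustrative sketch: the closing sum cancels in the ratio.
t = (0.1 : 0.1 : 100)';
element1a = sin(2*pi*t/2) + 5;
element1b = sin(2*pi*t/5) + 5;
element1c = 2*sin(2*pi*t/20) + 5;
total = element1a + element1b + element1c;

% Ratio of the closed data (percentages) ...
ratioClosed = (element1a./total)./(element1b./total);

% ... and ratio of the raw data, which never involves 1c.
ratioRaw = element1a./element1b;

% Both agree to within floating-point rounding.
maxDiff = max(abs(ratioClosed - ratioRaw));
fprintf('Largest difference: %g\n',maxDiff)
```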
Instead, we now calculate log-ratios according to Aitchison (1986, 2003). The log-ratios log(1a/1b) and log(1b/1a) are identical except for the sign. Hence it makes no difference whether we use log(1a/1b) or log(1b/1a), for instance, when running further statistical analysis on the data.
ratio12log = log10(ratio12);
ratio21log = log10(ratio21);
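The symmetry of the log-ratios follows from log(1a/1b) = -log(1b/1a), whereas the deviations of the plain ratios from 1 do not cancel in the same way. A minimal self-contained sketch of this comparison follows; `logAsym` and `ratioAsym` are hypothetical names used only here.

```matlab
% Illustrative sketch: log-ratios are mirror images, ratios are not.
t = (0.1 : 0.1 : 100)';
element1a = sin(2*pi*t/2) + 5;
element1b = sin(2*pi*t/5) + 5;

ratio12 = element1a./element1b;
ratio21 = element1b./element1a;

% Sum of the two log-ratios: zero apart from rounding errors.
logAsym = max(abs(log10(ratio12) + log10(ratio21)));

% Sum of the deviations of the two ratios from 1: clearly not zero.
ratioAsym = max(abs((ratio12 - 1) + (ratio21 - 1)));

fprintf('Log-ratio asymmetry: %g\n',logAsym)
fprintf('Ratio asymmetry:     %g\n',ratioAsym)
```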
Frequency Domain
We now calculate periodograms of the original data, detrended to remove the mean. Without detrending we get useless results when calculating ratios. Each of the three elements shows only the cycle assigned to it, without any influence from the other two.
element1a = detrend(element1a);
element1b = detrend(element1b);
element1c = detrend(element1c);
[Pxx1a,f] = periodogram(element1a,[],...
    length(element1a),10);
[Pxx1b,f] = periodogram(element1b,[],...
    length(element1b),10);
[Pxx1c,f] = periodogram(element1c,[],...
    length(element1c),10);
figure('Position',[550 1000 500 200])
axes('Box','On',...
    'LineWidth',0.6,...
    'FontSize',12,...
    'XLim',[0 1])
line(f,abs(Pxx1a),...
    'LineWidth',1,...
    'Color',colors(1,:))
line(f,abs(Pxx1b),...
    'LineWidth',1,...
    'Color',colors(2,:))
line(f,abs(Pxx1c),...
    'LineWidth',1,...
    'Color',colors(3,:))
legend('1a','1b','1c')
legend boxoff
xlabel('Frequency')
ylabel('Power')
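The cycles assigned to the three elements have periods of 2, 5 and 20, so the spectral peaks are expected at frequencies of 1/2, 1/5 and 1/20. The following self-contained sketch reads the peak frequencies directly from the periodograms instead of from the plot. It assumes the Signal Processing Toolbox for `periodogram`, as the example above does; the names `fs`, `data` and `fPeak` are introduced here for illustration.

```matlab
% Illustrative sketch: locate the spectral peak of each element.
t = (0.1 : 0.1 : 100)';
fs = 10;                       % sampling frequency, 1/0.1
data = [sin(2*pi*t/2) + 5, ...
    sin(2*pi*t/5) + 5, ...
    2*sin(2*pi*t/20) + 5];     % columns: 1a, 1b, 1c

fPeak = zeros(1,3);
for k = 1 : 3
    [Pxx,f] = periodogram(detrend(data(:,k)),[],...
        size(data,1),fs);
    [~,idx] = max(Pxx);        % index of the largest peak
    fPeak(k) = f(idx);
end

fprintf('Peak frequencies 1a, 1b, 1c: %.2f %.2f %.2f\n',fPeak)
```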
Next we calculate the periodograms of the percentages, which show strong closed-sum effects. As we can see, elements 1a and 1b, expressed relative to the sum 1a+1b+1c, now also contain the cycle of the diluting element 1c.
element1a_perc = detrend(element1a_perc);
element1b_perc = detrend(element1b_perc);
[Pxxp1a,f] = periodogram(element1a_perc,[],...
    length(element1a_perc),10);
[Pxxp1b,f] = periodogram(element1b_perc,[],...
    length(element1b_perc),10);
figure('Position',[550 700 500 200])
axes('Box','On',...
    'LineWidth',0.6,...
    'FontSize',12,...
    'XLim',[0 1])
line(f,abs(Pxxp1a),...
    'LineWidth',1,...
    'Color',colors(1,:))
line(f,abs(Pxxp1b),...
    'LineWidth',1,...
    'LineStyle','--',...
    'Color',colors(2,:))
legend('1a/(1a+1b+1c)',...
    '1b/(1a+1b+1c)',...
    'Box','Off'), grid
legend boxoff
xlabel('Frequency')
ylabel('Power')
Periodogram of the ratios. If we form ratios 1a/1b or 1b/1a, the cycle of 1c disappears.
ratio12 = detrend(ratio12);
ratio21 = detrend(ratio21);
[Pxxr12,f] = periodogram(ratio12,[],...
    length(ratio12),10);
[Pxxr21,f] = periodogram(ratio21,[],...
    length(ratio21),10);
figure('Position',[550 400 500 200])
axes('Box','On',...
    'LineWidth',0.6,...
    'FontSize',12,...
    'XLim',[0 1])
line(f,abs(Pxxr12),...
    'LineWidth',1,...
    'Color',colors(1,:))
line(f,abs(Pxxr21),...
    'LineWidth',1,...
    'LineStyle','--',...
    'Color',colors(2,:))
legend('1a/1b',...
    '1b/1a',...
    'Box','Off'), grid
legend boxoff
xlabel('Frequency')
ylabel('Power')
Periodogram of the log-ratios. Again, if we form the log-ratios log(1a/1b) or log(1b/1a), the cycle of 1c disappears. In contrast to the representation in the time domain, the advantage of the logarithm cannot be established here because of the qualitative nature of the periodogram.
ratio12log = detrend(ratio12log);
ratio21log = detrend(ratio21log);
[Pxxlr12,f] = periodogram(ratio12log,[],...
    length(ratio12log),10);
[Pxxlr21,f] = periodogram(ratio21log,[],...
    length(ratio21log),10);
figure('Position',[550 100 500 200])
axes('Box','On',...
    'LineWidth',0.6,...
    'FontSize',12,...
    'XLim',[0 1])
line(f,abs(Pxxlr12),...
    'LineWidth',1,...
    'Color',colors(1,:))
line(f,abs(Pxxlr21),...
    'LineWidth',1,...
    'LineStyle','--',...
    'Color',colors(2,:))
legend('log(1a/1b)',...
    'log(1b/1a)',...
    'Box','Off'), grid
legend boxoff
xlabel('Frequency')
ylabel('Power')
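To put a rough number on what the periodograms show, one can compare the share of the total power that falls on the frequency of element 1c (1/20) in the percentage, the ratio and the log-ratio. This is only a sketch of one possible measure, not part of the original analysis; `signals`, `frac` and `total` are names introduced here, and `periodogram` again assumes the Signal Processing Toolbox.

```matlab
% Illustrative sketch: share of power at the 1c frequency.
t = (0.1 : 0.1 : 100)';
fs = 10;                       % sampling frequency, 1/0.1
element1a = sin(2*pi*t/2) + 5;
element1b = sin(2*pi*t/5) + 5;
element1c = 2*sin(2*pi*t/20) + 5;
total = element1a + element1b + element1c;

% Percentage of 1a, ratio 1a/1b and log-ratio log(1a/1b).
signals = {element1a./total, ...
    element1a./element1b, ...
    log10(element1a./element1b)};

frac = zeros(1,3);
for k = 1 : 3
    [Pxx,f] = periodogram(detrend(signals{k}),[],...
        length(t),fs);
    [~,idx] = min(abs(f - 1/20));   % bin closest to the 1c cycle
    frac(k) = Pxx(idx)/sum(Pxx);    % share of the total power
end

fprintf('Share of power at f = 0.05\n')
fprintf('  percentage: %.4f\n',frac(1))
fprintf('  ratio:      %.4f\n',frac(2))
fprintf('  log-ratio:  %.4f\n',frac(3))
```

The share should be substantial for the percentage and close to zero for both the ratio and the log-ratio, which matches the disappearance of the 1c cycle in the last two periodograms.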
References:
Aitchison, J., 1986, 2003, The Statistical Analysis of Compositional Data. Blackburn Press, 460 pages.
Aitchison, J., 1999, Logratios and natural laws in compositional data analysis, Mathematical Geology, 31, 563-580.
Croudace, I.W., Rothwell, R.G., 2015, Twenty Years of XRF Core Scanning Marine Sediments: What Do Geochemical Proxies Tell Us? in: Croudace, I.W., Rothwell, R.G. (eds.), 2015, Micro-XRF Studies of Sediment Cores, Springer. -> See page 50, “Plotting Core Scanner Data, the Importance of Normalisation and Log-Ratios”.
Davies, S.J., Lamb, H.F., Roberts, S.J., 2015, Micro-XRF Core Scanning in Palaeolimnology: Recent Developments, in: Croudace, I.W., Rothwell, R.G. (eds.), 2015, Micro-XRF Studies of Sediment Cores, Springer, Heidelberg.
Davis, J.C., 2002, Statistics and Data Analysis in Geology, Third Edition, John Wiley & Sons, New York.
Martín-Fernández, J.A., Thió-Henestrosa, S. (Eds.), 2016, Compositional Data Analysis, CoDaWork, L’Escala, Spain, June 2015, Springer Proceedings in Mathematics & Statistics, Volume 187, Springer.
van den Boogaart, K.G., Tolosana-Delgado, R., 2013, Analyzing Compositional Data with R, Use R! Springer, Heidelberg.
Pearson, K., 1897, Mathematical contributions to the theory of evolution. On a form of spurious correlation which may arise when indices are used in the measurement of organs. Proceedings of the Royal Society of London, LX, 489-502.
Weltje, G.J., Tjallingii, R., 2008, Calibration of XRF core scanners for quantitative geochemical logging of sediment cores: Theory and application, Earth and Planetary Science Letters, 274, 423-438.