1 Introduction

When analyzing relaxation and/or diffusion data acquired with various Nuclear Magnetic Resonance (NMR) techniques using the Inverse Laplace Transform (ILT) [1,2,3,4,5,6,7,8,9], smoothing of the data is essential to produce distributions of relaxation times and/or diffusion coefficients. Although some work has been done on non-uniform smoothing, the most common approach is uniform smoothing, i.e., a constant smoothing throughout the processing irrespective of the content of the data. Consequently, the broadening of a peak depends on the peak position or the fraction of the data in which the peak contributes to the attenuating signal [10]. Alternatively, discrete methods have been developed that fit the data to a limited number of components [11,12,13]. However, these methods provide no information on the distributivity of the dataset.

Recently, a method was proposed that combines discrete processing of the dataset with the ILT, namely the Anahess distribution [10]. Here the number of components in the solution is kept to a minimum, which makes it possible to divide the solution into sub-groups that can be transformed and processed as superimposed datasets using the ILT with conditions set by the discrete solution. This approach has been shown to reproduce synthetic distributions better than the ILT alone. An extension of this method has now been developed and is presented here. It aims at finding the expectation values for the fitted discrete components and the corresponding distribution, which should provide a measure for evaluating the quality of the fit. In short, the procedure is as follows: after fitting the exponentially decaying data to a limited number of components, using the Bayesian Information Criterion (BIC) [14] as a stop criterion, a set of residual data or noise from the fit is produced. The residuals are then rearranged with respect to position. That is, the residuals, or groups of residuals, are interchanged at random so that new noise data are produced with the same expectation value and standard deviation as the original set of residuals. A new exponentially decaying data set is then produced from the fitted components and the rearranged noise. This new raw data set is analyzed using the discrete Anahess, resulting in a new set of fitted components. If the noise has an impact on the fitted components, their values will vary between data sets because the noise differs. The procedure is repeated until enough data sets with rearranged residuals have been produced to find an expectation value and a standard deviation for the fitted components.

In the following, we will recapitulate the combination of the discrete Anahess approach with the ILT [10] and present the method for determining the expectation values of the fitted components and distributions.

1.1 The Anahess Approach

In experimental data, the noise, ε, is superimposed on the exponentially decaying signal. Data arising from relaxation and/or diffusion NMR experiments can be described by a multi-exponential decaying signal S(t)

$$S(t_{i})=\sum_{k}A(T_{k})\exp\left(-\frac{t_{i}}{T_{k}}\right)+\varepsilon_{i}$$
(1)

Here A(Tk) is the distribution of intensities to be fitted to the corresponding Tk, which could be a relaxation time or a diffusion coefficient. The most common way to fit the equation above to experimental data is to use the Inverse Laplace Transform initially developed by Provencher [1,2,3]. In that approach, a grid with a fixed number of points is predefined, on which the solution A(Tk) has to be found. Another approach is to use a discrete method where the number of components is limited [11]. In this work, we apply the discrete Anahess approach, where the number of points (components) in the solution is minimized according to a Bayesian Information Criterion and the points are allowed to move anywhere in the space of solutions. As in the ILT routine, a function involving the sum of squared residuals is to be minimized [11, 15]:

$$SS_{res}=\sum_{j=1}^{NX}\left\{a_{0}+\sum_{p=1}^{NCO}a_{p}e^{-t_{j}/T_{p}}-R_{j}\right\}^{2}$$
(2)

where

$$R_{j}=a_{0}+\sum_{p=1}^{NCO}a_{p}e^{-t_{j}/T_{p}}+\varepsilon_{j}$$
(3)

a0 is a baseline offset which may be positive or negative. The data matrix R has the corresponding number of data points NX. The parameters ap, Tp are thus characteristic properties of the component with index p out of all the NCO components. As SSres will decrease with increasing number of fitted components, a stop criterion is needed. The Bayesian information criterion (BIC) [14] is such a criterion. Let n be the number of observed data points, let q be the number of free model parameters, and SSres be the sum of squared residuals. Then if the residuals are normally distributed, the BIC has the form:

$$BIC=n\ln\left(\frac{SS_{res}}{n}\right)+q\ln(n)$$
(4)

In this equation, a good model fit gives a low first term while few model parameters give a low second term. When comparing a set of models, the model with the minimal BIC value is selected.
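As an illustration of Eqs. (2) and (4), the following minimal sketch evaluates the sum of squared residuals and the BIC for a given set of fitted parameters. It is not the Anahess implementation; the function names are hypothetical, and the parameter count q = 2·NCO + 1 (one intensity and one time constant per component plus the baseline) is an assumption for the one-dimensional case that is not stated in the text.

```python
import math


def multi_exp(t, a0, amps, taus):
    """Baseline a0 plus a sum of exponential decays, evaluated at time t."""
    return a0 + sum(a * math.exp(-t / T) for a, T in zip(amps, taus))


def ss_res(times, data, a0, amps, taus):
    """Sum of squared differences between model and data, as in Eq. (2)."""
    return sum((multi_exp(t, a0, amps, taus) - r) ** 2
               for t, r in zip(times, data))


def bic(ss, n, q):
    """BIC for normally distributed residuals, as in Eq. (4)."""
    return n * math.log(ss / n) + q * math.log(n)


def n_free_parameters(nco):
    """Assumed count for 1D data: (a_p, T_p) per component plus baseline a0."""
    return 2 * nco + 1
```

The first term of `bic` decreases as the fit improves, while the second grows with the number of parameters, so adding a component only lowers the BIC when the reduction in SSres outweighs the penalty q ln(n).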

The discrete solution fits a relatively small number of components that provide a satisfactory fit of the raw data. This is due to the discreteness of the fitting routine when using the BIC as a stop criterion [14]. As most data reflect continuous distributions of components, a method for probing the distributivity using the discrete Anahess results has been developed. This is done by applying the ILT, where the results from the discrete Anahess fit are fed into the ILT as initial and restricting conditions [8]. Because of the limited number of components in the Anahess, one may group various regions in the solution and provide a superimposed fit of each group. Consequently, prior to the use of the ILT, the data are grouped and then transformed in t according to the following equation [10]:

$$t^{k} \to t \cdot \left( {\frac{{5T_{k} }}{{t_{max} }}} \right)$$
(5)

where t is the original observation time, Tk is the result from the Anahess fit for component k, and tmax is the longest observation time in the data set. With this approach, one may process data with the discrete Anahess method and probe the distributivity.

1.2 The Method for Determining the Expectation Value and its Standard Deviation

When the exponentially decaying data are fitted using the discrete Anahess [11, 15], a set of residuals is produced. These residuals are the difference between the fitted and the original data. As the noise is the crucial factor affecting the result of the fitting procedure, for both discrete and continuous fitting, we propose to produce new raw data for processing by redistributing the noise as follows: divide the residual noise into several compartments, for both one- and two-dimensional noise, as shown in Fig. 1. The compartments should be small enough that moving them around at random yields a different noise dataset with the same expectation value and standard deviation:

$${noise}^{NEW}({compartment}_{i})={noise}^{ORIGINAL}({compartment}_{j})$$
(6)

where j is a randomly chosen compartment index. Using the fitted components and intensities from the discrete Anahess method, one may then perform a Laplace Transform and produce new data sets in which the noise differs due to the random interchange of the compartments in the original noise data.

Fig. 1 The synthetic Gaussian noise in one (left) and two (right) dimensions. The black dashed lines indicate the separation of the noise into different compartments (color figure online)

$${S}^{NEW}({t}_{i})={\sum }_{p}{a}_{p}\mathit{exp}\left(-\frac{{t}_{i}}{{T}_{p}}\right)+{noise}^{NEW}$$
(7)

These datasets can then be analyzed using the discrete Anahess approach and a variation in the fitted intensities and new relaxation times and/or diffusion coefficients will be found. When this procedure has been repeated until the effect of the noise has been probed properly, it will result in an expectation value for the various components, and these values can then be used as initial and restricting values to probe the distributivity of the original dataset.
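The resampling loop described by Eqs. (6) and (7) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' code: Eq. (6) is read as a random permutation of contiguous compartments (which leaves the mean and standard deviation of the noise unchanged), and the argument `fit` is a hypothetical placeholder for any routine, such as the discrete Anahess, that returns a tuple of fitted parameters for a data set.

```python
import math
import random
import statistics


def permute_compartments(residuals, size, rng):
    """Shuffle contiguous blocks of `size` residuals (permutation reading of Eq. 6)."""
    blocks = [residuals[i:i + size] for i in range(0, len(residuals), size)]
    rng.shuffle(blocks)
    return [x for block in blocks for x in block]


def rebuild_signal(times, amps, taus, noise):
    """Noise-free multi-exponential decay plus the rearranged noise (Eq. 7)."""
    return [sum(a * math.exp(-t / T) for a, T in zip(amps, taus)) + e
            for t, e in zip(times, noise)]


def resample_fits(times, amps, taus, residuals, fit,
                  size=100, repeats=10, seed=0):
    """Refit `repeats` rearranged data sets; return (mean, std) per parameter.

    `fit(times, signal)` is a user-supplied stand-in for the fitting routine
    and must return the fitted parameters in a fixed order.
    """
    rng = random.Random(seed)
    results = []
    for _ in range(repeats):
        noise = permute_compartments(residuals, size, rng)
        results.append(fit(times, rebuild_signal(times, amps, taus, noise)))
    # Transpose so that each column holds one parameter across all refits.
    return [(statistics.mean(col), statistics.stdev(col))
            for col in zip(*results)]
```

The sketch assumes that every refit returns the same number of parameters in the same order; when the number of components changes between refits, the results have to be matched or grouped before averaging.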

This approach assumes that the noise does not depend on observation time and/or applied gradient strength. That is, the discrete Anahess fit returns a Gaussian distribution of residuals with an expectation value of 0 and an insignificant skew [16]. The method also provides a way to check whether the fit to the original raw data is stable, i.e., whether the minimum BIC number is achieved at the same number of components and the variation in the expectation values is acceptable.
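A simple check of the residuals before permuting them could look like the sketch below. The text does not specify which skewness estimator is used; the moment-based definition m3 / m2^1.5 is assumed here, and the tolerances are hypothetical user choices rather than values taken from the paper.

```python
def skewness(values):
    """Moment-based sample skewness, m3 / m2**1.5 (assumed definition)."""
    n = len(values)
    mean = sum(values) / n
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    return m3 / m2 ** 1.5


def residuals_look_symmetric(values, mean_tol, skew_tol):
    """True when both the mean and the skew of the residuals are small."""
    mean = sum(values) / len(values)
    return abs(mean) <= mean_tol and abs(skewness(values)) <= skew_tol
```

Such a check only tests symmetry around zero; it does not test whether the noise level varies with observation time or gradient strength, which the method also assumes.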

Once a set of expectation values is found for a data set, these values can be used as input when probing the distributivity using the ILT procedure. As the expectation values provide a good fit to the raw data, the ILT routine will probe the distributivity around these initial values, resulting in a distribution that is characteristic for the expectation values.

2 Experimental Section

Synthetic data sets in one and two dimensions were used to verify the proposed method for determining the expectation values [10]. The one-dimensional synthetic data set was produced from a distribution located on a grid of 200 T2 values (0.0001–10 s) having three identical but separated peaks. After applying a Laplace Transform to produce a decaying signal, synthetic Gaussian noise was added to mimic the experimental noise from an NMR experiment. The inter-echo spacing (2τ) was set to 0.4 ms and the data set contained 8000 echo points to mimic a Carr–Purcell–Meiboom–Gill (CPMG) decay. The two-dimensional synthetic data set was produced from distributions located on a 64 × 32 grid of T1’s (0.001–10 s) and T2’s (0.0001–10 s). As for the one-dimensional case, synthetic Gaussian noise was added, and the attenuation mimicked a combined Stimulated Echo-CPMG experiment [10].
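For readers who wish to generate a comparable one-dimensional test case, a minimal sketch is given below. Several details are assumptions, since they are not stated in the text: the grid is taken to be logarithmically spaced, the peaks are taken to be Gaussian in log10(T2) with a hypothetical width, and the noise standard deviation is a free parameter.

```python
import math
import random


def log_grid(lo, hi, n):
    """n logarithmically spaced values from lo to hi (log spacing is assumed)."""
    step = (math.log10(hi) - math.log10(lo)) / (n - 1)
    return [10 ** (math.log10(lo) + i * step) for i in range(n)]


def three_peak_distribution(grid, centers, width_decades, area):
    """Identical peaks, Gaussian in log10(T2) (assumed shape), each of area `area`."""
    dist = [0.0] * len(grid)
    for c in centers:
        shape = [math.exp(-0.5 * ((math.log10(T) - math.log10(c))
                                  / width_decades) ** 2) for T in grid]
        total = sum(shape)
        for i, s in enumerate(shape):
            dist[i] += area * s / total
    return dist


def cpmg_decay(grid, dist, n_echoes, echo_spacing, noise_sd, seed=0):
    """Discrete Laplace transform of the distribution plus Gaussian noise."""
    rng = random.Random(seed)
    signal = []
    for i in range(1, n_echoes + 1):
        t = i * echo_spacing  # echo i is taken to occur at i times the echo spacing
        s = sum(A * math.exp(-t / T) for A, T in zip(dist, grid))
        signal.append(s + rng.gauss(0.0, noise_sd))
    return signal
```

With `log_grid(1e-4, 10.0, 200)`, `n_echoes=8000` and `echo_spacing=0.4e-3`, the grid size and echo train match the description above; the peak positions and areas reported in Sect. 3.1 can be passed as `centers` and `area`.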

In addition to the synthetic data sets, real NMR data sets were acquired from a sample of oat flakes. The NMR instrument was a 0.5 Tesla permanent magnet supplied by Advance Magnetic Resonance Ltd. [17], which accommodates samples of 18 mm in diameter. The one-dimensional experiment was a CPMG with an inter-echo spacing of 0.2 ms acquiring 4000 echoes, which was sufficient for the attenuating signal to reach the noise level. The Inversion Recovery (IR)-CPMG experiment was applied to produce two-dimensional data. After the measurement, the oat flakes were dried at 105 °C for 12 h and remeasured with the IR-CPMG experiment. The aim of this procedure was to identify the location of the moisture in the data processed prior to drying; oat flakes were chosen as the sample because the T2 signals from moisture and fat are found to partially overlap.

The discrete Anahess is the processing method that provides the fitted components used to find the expectation values and their standard deviations [18]. The application starts by fitting one component to a data set and calculating its BIC number, then proceeds to two components and finds a new BIC number. If the current BIC number is lower than the previous one, the application increases the number of components to be fitted by one, and a new BIC number is found. This procedure continues until the current BIC number is larger than the previous one. The best fit for the data set is then the NCO that produces the lowest BIC number. This is shown in Fig. 2 for NCO ∈ [3, 7] using the two-dimensional data set from oat flakes (Fig. 9), where the best fit is found at 6 components.
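The stop rule described above can be sketched as a short loop. The callable `fit_ss_res` is a hypothetical stand-in for the fitting routine: it takes a number of components and returns the sum of squared residuals of the best fit with that many components. The number of free parameters per component (two by default, plus one baseline) is an assumption and would differ for two-dimensional data.

```python
import math


def select_nco(fit_ss_res, n_points, params_per_component=2, max_components=20):
    """Increase NCO until the BIC rises; return (best_nco, {nco: bic})."""
    history = {}
    best_nco, best_bic = None, math.inf
    for nco in range(1, max_components + 1):
        ss = fit_ss_res(nco)
        q = params_per_component * nco + 1  # assumed parameter count
        value = n_points * math.log(ss / n_points) + q * math.log(n_points)
        history[nco] = value
        if value > best_bic:
            break  # the BIC increased, so the previous NCO is kept
        if value < best_bic:
            best_nco, best_bic = nco, value
    return best_nco, history
```

Because the loop stops at the first increase, it returns the first local minimum of the BIC; the returned `history` can be plotted against NCO in the manner of Fig. 2.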

Fig. 2 The fitted BIC number as a function of number of components

3 Results and Discussion

In the following, the results from processing of synthetic and real data in one and two dimensions are presented together with the expectation values and standard deviations achieved from the method proposed in Sect. 1.2.

3.1 The Expectation Value for the One-Dimensional Synthetic Data Set

In Fig. 3, the attenuation of the synthetic data set is shown together with the residuals from the discrete Anahess fit, and it resulted in a three-component fit for the lowest BIC number. In the upper right corner of the figure, the noise is plotted and fitted to a Gaussian distribution. The skew of the noise is found to be 0.003, which shows that the distribution is symmetrical around the expectation value 0. In other words, the discrete Anahess fit returns Gaussian noise as residuals, and one may produce new data sets by doing a random permutation of the noise compartments. For this experiment, the 8000 points of noise were divided into 80 compartments of 100 points each, and they were randomly permuted 10 times to provide 10 new raw data sets.

Fig. 3 The attenuation of the three-exponential decaying signal (red curve) arising from a synthetic data distribution. The blue curve corresponds to the fitted residuals. In the upper right corner, the distribution of residuals is shown together with the fitted Gaussian curve (color figure online)

As the 10 datasets were processed, it turned out that the minimum BIC number was achieved at NCO = 3 for all data sets, and one could easily identify the components’ positions throughout the series of fitted data. The results for the three components are found in Table 1, while the synthetic and Anahess distributions are shown in Fig. 4. The Anahess distribution is based on the average values found in Table 1, and one may notice that the Anahess distribution is broadened due to the Gaussian noise that was added to the synthetic distribution when producing the original raw data. The areas of the synthetic peaks were 200 each, and the average T2 values were 2.33, 35.80 and 568.57 ms. All average values in Table 1 were within the standard deviation except for the average intensity of the peak at the longest T2, so there is good agreement between the fitted average values and the synthetic input values.

Table 1 Anahess results

Fig. 4 Comparison between the synthetic and the Anahess distributions produced from the average values for T2 and initial intensities

What is evident from the data in Table 1 is an increasing relative standard deviation as T2 is reduced. For the peak at the shortest T2, the relative standard deviation is 4.2% for T2 and 7.5% for the intensity; for the peak in the middle, it is 1.3% for T2 and 2.2% for the intensity; and for the peak at the longest T2, it is 0.4% for T2 and 0.6% for the intensity. The reason for the decreasing uncertainty of the fitted values as T2 increases is that the number of significant data points to which a component contributes in the attenuating signal increases with T2. For the component fitted to 2.39 ms, the signal has decayed to a fraction of less than 0.01 at 24 ms. With a τ-value of 0.2 ms, this corresponds to 60 data points. Consequently, in a data set of 8000 data points, the component with the shortest T2 contributes to only 0.75% of the data. The component with T2 of 35.38 ms contributes to 5.6% of the data, while the component with T2 of 568.57 ms contributes to 88.9% of the data.
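The arithmetic for the shortest component can be reproduced with the short sketch below. The cut-off time is passed explicitly, because the text quotes it only for the shortest component (24 ms); the echo spacing of 0.4 ms is 2τ with τ = 0.2 ms. The function names are hypothetical.

```python
def echo_count(cutoff_time_ms, echo_spacing_ms):
    """Number of echoes acquired before the cut-off time."""
    return round(cutoff_time_ms / echo_spacing_ms)


def data_fraction(cutoff_time_ms, echo_spacing_ms, n_echoes):
    """Fraction of the echo train to which the component still contributes."""
    return min(echo_count(cutoff_time_ms, echo_spacing_ms), n_echoes) / n_echoes


points = echo_count(24.0, 0.4)             # 60 data points
fraction = data_fraction(24.0, 0.4, 8000)  # 0.0075, i.e. 0.75% of the data
```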

3.2 The Expectation Value for the One-Dimensional Real Data Set

In Fig. 5, the attenuation of the data set arising from oat flakes is shown together with the residuals from the discrete Anahess fit, which resulted in three components at the lowest BIC number. The three components are one isolated component with T2 ~ 0.6 ms and two components rather close to each other with T2 ~ 50 and 180 ms, respectively. In the upper right corner of the figure, the noise is plotted and fitted to a Gaussian distribution. The skew of the noise is found to be 0.03, which, like the skew found for the synthetic Gaussian noise distribution, is insignificant. Thus, one may conclude that the distribution is symmetrical around the expectation value 0.001, and again the discrete Anahess fit returns Gaussian noise as residuals, so one may produce new data sets by doing a random permutation of the noise compartments. In this experiment, the 4000 points of noise were divided into 40 compartments of 100 points each, and they were randomly permuted 20 times to provide 20 new raw data sets.

Fig. 5 The attenuation of a decaying signal (red curve) arising from a CPMG experiment performed on oat flakes. The blue curve corresponds to the fitted residuals. In the upper right corner, the distribution of residuals is shown together with the fitted Gaussian curve (color figure online)

As the 20 datasets were processed, it turned out that the minimum BIC number was achieved at NCO = 3 for all data sets, and one could easily identify the components’ positions throughout the series of fitted data. The results for the three components are shown in Table 2. The isolated component at the shortest T2 is reported with a standard deviation of 9.1% in intensity and 13.3% in T2, while the component with intermediate T2 (47.1 ms) is reported with higher standard deviations (14.2% in intensity and 21.4% in T2). This does not fit the picture of an improved accuracy for the intermediate T2 component as more data points are available. The reason is that the two components at longer T2 do not correspond to two unique components, as the varying noise significantly interferes with the fitted components. The solution is to group the two components together, as shown in Table 2. The population-weighted average of T2 for the two components as a group is then calculated, and the variation of this value between the different fits provides the expectation value and its standard deviation. The standard deviation for the group is then down to 2.5% for the intensity and 3.2% for the T2. Thus, the improved accuracy of the expectation value for the group as compared to the individual components can be used as a criterion for the grouping of components. This knowledge can be applied when probing the distributivity of the data, and components 2 and 3 must be processed as a group. In Fig. 6, the Anahess T2 distribution is shown for the oat flakes, and it is based on the fit of the 20 data sets with different noise produced from the residuals of the fit to the original data. Consequently, the intensity of the peaks and their average T2 values (i.e., peak positions) can be reported with the standard deviations given in Table 2 for component 1 (left peak) and components 2 + 3 (right peak). Alternatively, one may produce 20 data sets by running the experiment 20 times.
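The grouping step can be illustrated with the following sketch. It assumes that the population-weighted average is the intensity-weighted arithmetic mean of T2 and that the group intensity is the sum of the member intensities; the data layout (a list of fits, each a list of (intensity, T2) pairs) is hypothetical.

```python
import statistics


def group_summary(fits, members):
    """Mean and standard deviation of the summed intensity and of the
    intensity-weighted T2 for the components listed in `members`.

    fits    : list of fits, each a list of (intensity, T2) per component
    members : indices of the components treated as one group
    """
    intensities, t2s = [], []
    for fit in fits:
        total = sum(fit[i][0] for i in members)
        weighted_t2 = sum(fit[i][0] * fit[i][1] for i in members) / total
        intensities.append(total)
        t2s.append(weighted_t2)
    return {
        "intensity_mean": statistics.mean(intensities),
        "intensity_std": statistics.stdev(intensities),
        "t2_mean": statistics.mean(t2s),
        "t2_std": statistics.stdev(t2s),
    }
```

Comparing the relative standard deviation of a group, e.g. `group_summary(fits, [1, 2])`, with that of its members, e.g. `group_summary(fits, [1])`, gives the grouping criterion described above: if the group is markedly more stable than its members, the members are better reported together.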

Table 2 Average results from 20 processed datasets

Fig. 6 The Anahess T2 distribution from oat flakes based on the average values for T2 and initial intensities of the oat flakes components

3.3 The Expectation Value for the Two-Dimensional Synthetic Data Set

Figure 7 shows the results from the discrete Anahess fit of the two-dimensional synthetic data set, with a six-component fit for the lowest BIC number. In the right part of the figure, the noise is shown plotted and fitted to a Gaussian distribution. The skew of the noise was found to be −0.003, confirming that the distribution is symmetrical around the expectation value 0 and thus the discrete Anahess fit returns Gaussian noise as residuals. New data sets can then be produced by doing a random permutation of the noise compartments. For this example, the 3000 points of noise were divided into 30 compartments of 100 points each, and they were randomly permuted 10 times to provide 10 new raw data sets.

Fig. 7 The discrete components from the Anahess fit to the left, the resulting residuals to the lower right and the distributions of residuals to the upper right

In Table 3, the average results based on the 10 fitted data sets with varying noise are shown. As for the one-dimensional case on oat flakes, the components can be regrouped into component 4, components 1 + 2 and components 3 + 5 + 6. Without this grouping, the standard deviations for individual components at higher T2’s are higher than the standard deviation for the component with shortest T2, while a lower standard deviation is to be expected due to more data points available for fitting the components of longer T2’s. When grouping the components as shown in Table 3, the standard deviation decreases as T2 of the group increases. The Anahess distribution based on the average values of the components is shown in Fig. 8, and it fits well to the synthetic distribution provided in [10].

Table 3 Average results from 10 processed datasets

Fig. 8 The Anahess T1–T2 synthetic distribution based on the average values for T1, T2 and initial intensities

3.4 The Expectation Value for the Two-Dimensional Real Data Set

Figure 9 shows the results from the discrete Anahess fit for a dataset recorded from a combined IR-CPMG experiment, and the lowest BIC number is found with a six-component fit. In the right part of the figure, the noise is shown plotted and fitted to a Gaussian distribution. The skew of the noise is found to be −0.002, confirming that the distribution is symmetrical around the expectation value 0, and thus the discrete Anahess fit returns Gaussian noise as residuals. New data sets can then be produced by doing a random permutation of the noise compartments. For this example, the 4000 points of noise were divided into 40 compartments of 100 points each, and they were randomly permuted 10 times to provide 10 new raw data sets.

Fig. 9 The discrete components from the Anahess fit to the left, the resulting residuals to the lower right and the distributions of residuals to the upper right

When the 10 new datasets were processed using the discrete Anahess method, the lowest BIC number was found at NCO = 6 for 9 of the 10 datasets, while for one data set it was found at NCO = 5. For this dataset, component 5 in Fig. 9 disappears and the neighboring components (6 and 3) are shifted slightly in position. This is a consequence of the Gaussian noise in combination with the fact that component 5 in Fig. 9 has an intensity of ~ 80 while the noise varies between ± 100. In short, Anahess fits 5 components in 10% of the processed datasets and 6 components in the remaining 90%. Based on the processed data, we can therefore report the 6-component fit shown in Table 4 with a 90% probability. As for the synthetic two-dimensional dataset, we find that the standard deviation decreases as T2 and T1 increase if one regroups the components as component 4, components 1 + 2 + 3 and components 5 + 6. However, when producing the Anahess distribution shown in Fig. 10, it turns out that components 5 + 6 do not produce one peak as components 1 + 2 + 3 do. Thus, the probing of the distributivity indicates that component 5 should be separated from component 6.
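The likelihood of a given number of components follows directly from counting the outcomes of the refits, as in this minimal sketch (the function name is hypothetical). The input is the NCO at the lowest BIC number for each resampled data set.

```python
from collections import Counter


def nco_frequencies(nco_per_fit):
    """Relative frequency of each NCO among the resampled fits."""
    counts = Counter(nco_per_fit)
    total = len(nco_per_fit)
    return {nco: c / total for nco, c in sorted(counts.items())}


# Nine refits with 6 components and one with 5, as reported above.
frequencies = nco_frequencies([6] * 9 + [5])  # {5: 0.1, 6: 0.9}
```

With only 10 resampled data sets, each frequency is resolved in steps of 10%, so more permutations are needed if a finer estimate is required.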

Table 4 Average results from 10 processed datasets

Fig. 10 The Anahess T1–T2 distribution from oat flakes based on the average values for T1, T2 and initial intensities

The finding above provides a new tool using the discrete Anahess processing; for datasets where the noise is significant, as in NMR logging data, the discrete Anahess in combination with a random permutation of the fitted noise (assumed Gaussian) can be used to estimate the likelihood of finding a given number of components, and whether a component should be grouped with other components or not.

In order to establish the location of the moisture and fat signal in the T1–T2 distribution, the sample was dried at 105 °C for 12 h and remeasured using the IR-CPMG sequence. The processing resulted in 5 components as shown in Fig. 11, and the residuals from the discrete Anahess method were used to produce 10 new datasets that were processed. It turned out that five of the processed datasets resulted in 4 components while the other five resulted in 5 components. The T1–T2 distributions are shown in Fig. 12 for the two equally probable results, and the largest variations between the results for NCO = 4 and NCO = 5 are found at the shortest T1’s and T2’s. In particular, the component with the shortest T2 appears at 1 ms for NCO = 4 while it appears at 0.35 ms for NCO = 5. The intensity is approximately the same, around 160. Thus, it is evident that the variation of the noise significantly affects the part of the signal with the smallest number of attenuating data points, and it is not possible to establish the best solution based on the 10 data sets. A new experiment with improved signal-to-noise ratio was therefore conducted, where the number of scans was increased from 64 to 128. All processed datasets then gave the lowest BIC number at 5 components (Fig. 11). The average values and standard deviations are shown in Table 5. A shift in T1’s and T2’s toward shorter values is observed, the large component at the shortest T2 and the highest T1/T2 has been reduced to a small component, and the component at the shortest T2 and the lowest T1/T2, component 4 in Fig. 9, has vanished. Component 4 in Fig. 9 can then most likely be assigned to the moisture, while component 6 is believed to be the tail of the protein signal, which becomes undetectable at the time of the first echo at 0.2 ms because T2 is reduced by the drying. What remains in Fig. 12 is the fat signal, which can be divided into more (long T2) and less (short T2) mobile fat [15, 19, 20]. When correcting for the different number of scans, the total signal from Fig. 12 fits the signal from components 1, 2, 3 and 5 in Fig. 9. This indicates that it could be possible to determine the fat content without drying using this method.

Fig. 11 The discrete components from the Anahess fit to the left, the resulting residuals to the lower right and the distributions of residuals to the upper right

Fig. 12 The Anahess T1–T2 distribution from dried oat flakes either with the lowest BIC at 5 components (left) or at 4 components (right)

Table 5 Average results from 10 processed datasets

4 Conclusion

Provided that a dataset reflects a multi-exponential decay or recovery in one or two dimensions, the discrete Anahess processing tool returns a set of residual data or noise from the fit that can be regarded as symmetric and Gaussian. New raw datasets can then be produced from random permutations of the residuals, and they can be reprocessed using Anahess to find expectation values and standard deviations for T1, T2 and the intensity. The results from the fitted datasets can also be used to find the likelihood of fitting to a certain number of components with the lowest BIC number.