Geophysical methods are used to study the characteristics of geological structures, distinguish their layers, determine the elastic coefficients of each layer, evaluate the dynamic parameters of surface layers, investigate the behavior of surface layers during earthquakes for construction design, and locate resources such as hydrocarbon reservoirs, metal ores, and groundwater.
Ground penetrating radar (GPR) is a geophysical tool with an active source that uses high-frequency electromagnetic waves to study near-surface layers. It was first used in 1956 and has seen increasing use since 1970. GPR instruments have been commercially available since the 1980s and have become popular over the past decade. The GPR method transmits very high frequency (12.5-2300 MHz) electromagnetic waves into the Earth, which are reflected upon contact with various underground materials and the relatively distinct boundaries between them. Such radar reflections arise from contrasts in the electrical properties (conductivity and dielectric constant) of the materials through which the electromagnetic waves pass. The waves propagate through materials with low electrical conductivity but are strongly absorbed by conductive materials such as clay, acidic organic soils, and material saturated with salt water.
The GPR resolution varies with depth, from centimeters at a few meters below the surface to meters at depths of a hundred meters. It also depends on the contrast in electrical properties between the target and its surroundings, the geometry of the target, and the applied bandwidth, among other factors, and can be high enough to distinguish subtle layering in shallow structures or buried objects.
The wavelet transform is widely used in signal processing, especially in image and signal compression and in noise suppression. It also provides an accurate picture of signal properties. In the Fourier transform, each coefficient describes a component that spans all times, so temporary events can only be isolated through phase relationships that cancel or reinforce components over long intervals. Wavelet coefficients, in contrast, describe components that are already local and easy to interpret. The wavelet transform can separate signal components that overlap in space and time through adjustable and adaptive wavelets. The wide choice of discrete wavelets suits adaptive systems, such as digital computers, which adjust themselves to the signals.
Autoregression is based on fitting a suitable model to the data, which yields more information than the original dataset alone. It is important to estimate the autoregressive (AR) model parameters correctly, because appropriate parameters are essential in the iterative parametric methods used to estimate the spectra of random signals. Because autoregressive models are easy to use, a number of useful techniques have been introduced to estimate appropriate model parameters. The method can also be used to study the effects of parameters, or sets of variables, on each other.
The GPR signal recorded at the receiver, X(t), which commonly includes noise, can be represented as (1):
$$X(t) = s(t) * w(t) + n(t) \tag{1}$$
where the reflection series, s(t), is convolved (denoted by *) with the source wavelet, w(t), and the noise, n(t), which has to be removed, is added. Since it is impossible to remove all of the noise, noise suppression is carried out to obtain an X(t) that is as similar to s(t) as possible.
In this study, we use a new automated noise suppression technique in the wavelet and f-x domains to remove random noise from GPR data. First, we discuss autoregression and then introduce the f-x and wavelet domains and the necessary concepts in noise suppression. Finally, we apply these methods to synthetic and real GPR data and compare the results.
3. Wavelet Transform
Wavelets are wave-shaped mathematical functions with zero mean and finite duration, as opposed to sine functions, which theoretically extend to infinity. Wavelet transforms and series have proved efficient in the analysis of a wide range of signals and phenomena. The wavelet expansion is formulated in (2):
$$f(t) = \sum_{k}\sum_{j} a_{j,k}\,\psi_{j,k}(t) \tag{2}$$
where j and k are integer indices, $a_{j,k}$ are the expansion coefficients, and $\psi_{j,k}(t)$ are the orthogonal wavelet expansion functions. The magnitude of the coefficients in (2) decays rapidly with the indices for a wide range of signals. This property, which means that wavelets form an unconditional basis, explains why wavelets are useful in image and signal compression and in noise suppression.
3.1. Discrete Wavelet Transform
Similar to the Fourier transform, the wavelet transform has continuous and discrete forms. A number of issues, namely redundancy, an infinite number of wavelets, and the lack of analytical solutions, make direct use of the continuous wavelet transform difficult. The discrete wavelet transform was introduced to overcome these issues, as it uses orthogonal wavelets (no redundancy) obtained through translation and dilation of an appropriate mother wavelet.
Linearly decomposing a signal or function in the following form allows for better analysis:
$$f(t) = \sum_{\ell} a_{\ell}\,\psi_{\ell}(t) \tag{3}$$
In the wavelet domain, the expansion functions are generated from a single function, $\psi(t)$, called the mother wavelet, by scaling and translation as defined in (4):
$$\psi_{j,k}(t) = 2^{j/2}\,\psi(2^{j}t - k) \tag{4}$$
A rigorous way of studying the wavelet transform is multiresolution analysis, which is defined as a nested sequence of closed subspaces of the Hilbert space $L^2(\mathbb{R})$. In order to use multiresolution analysis, we define a scaling function, similar to the mother wavelet, as
$$\varphi_{j,k}(t) = 2^{j/2}\,\varphi(2^{j}t - k) \tag{5}$$
We then define any f(t) as
$$f(t) = \sum_{k} c_{j_0}(k)\,\varphi_{j_0,k}(t) + \sum_{k}\sum_{j=j_0}^{\infty} d_{j}(k)\,\psi_{j,k}(t) \tag{6}$$
where $c_{j_0}(k)$ and $d_{j}(k)$ are the discrete wavelet transform of f(t), and $j_0$ represents the coarsest scale, whose space is spanned by the $\varphi_{j_0,k}(t)$. For a high enough resolution, the signal samples are very similar to the scaling coefficients. The discrete wavelet transform is similar to the Fourier series but is more flexible and efficient, and, like the Fourier series, it is useful in representing periodic signals. In contrast with the Fourier series, however, it can also be applied to non-periodic signals with excellent results.
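As a minimal illustration of scaling (approximation) and wavelet (detail) coefficients, the following sketch computes one level of the orthonormal Haar discrete wavelet transform and inverts it. The Haar wavelet, the function names, and the sample values are assumptions chosen for brevity; the study itself does not state which mother wavelet was used.

```python
import math

def haar_dwt(x):
    """One level of the orthonormal Haar DWT; len(x) must be even."""
    r = math.sqrt(2.0)
    approx = [(x[i] + x[i + 1]) / r for i in range(0, len(x), 2)]  # scaling coefficients
    detail = [(x[i] - x[i + 1]) / r for i in range(0, len(x), 2)]  # wavelet coefficients
    return approx, detail

def haar_idwt(approx, detail):
    """Invert haar_dwt: rebuild the samples from both coefficient sets."""
    r = math.sqrt(2.0)
    x = []
    for a, d in zip(approx, detail):
        x.append((a + d) / r)
        x.append((a - d) / r)
    return x

# Hypothetical samples, for illustration only.
signal = [4.0, 6.0, 10.0, 12.0, 8.0, 6.0, 5.0, 5.0]
approx, detail = haar_dwt(signal)
rebuilt = haar_idwt(approx, detail)
```

Because the transform is orthonormal, the energy of the samples equals the combined energy of the two coefficient sets, and the samples are recovered exactly up to rounding.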
3.2. Undecimated Discrete Wavelet Transform
The undecimated discrete wavelet transform is not as popular as its decimated counterpart. Figure 1 shows the simplest filter bank of the undecimated discrete wavelet transform. The left-hand side of the diagram in Figure 1 is called the analysis section, and the right-hand side is called the reconstruction (returning) section.
In Figure 1, the signal, S, is first filtered by a high-pass decomposition filter, H, to create the cD1 coefficients, which are then filtered by a high-pass reconstruction filter, H’, to generate the details (D1). S is also filtered by a low-pass decomposition filter to produce the cA1 coefficients. Finally, these coefficients are filtered by a low-pass reconstruction filter, L’, to create the general features, or approximation (A1).
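The filter bank described above can be sketched for a single level with Haar filters and circular boundary handling. Both choices, as well as the function names, are assumptions made for illustration and are not taken from the study. No downsampling is applied, so cA1 and cD1 keep the length of S, and the approximation and details add back to the input.

```python
import math

def swt_haar_level1(s):
    """Analysis section: undecimated Haar filtering with circular wrap-around."""
    n = len(s)
    r = math.sqrt(2.0)
    cA1 = [(s[i] + s[(i + 1) % n]) / r for i in range(n)]  # low-pass branch
    cD1 = [(s[i] - s[(i + 1) % n]) / r for i in range(n)]  # high-pass branch
    return cA1, cD1

def reconstruct_level1(cA1, cD1):
    """Reconstruction section: returns approximation A1 and details D1."""
    n = len(cA1)
    r = 2.0 * math.sqrt(2.0)
    # Index i - 1 wraps to the last sample when i == 0 (circular boundary).
    A1 = [(cA1[i] + cA1[i - 1]) / r for i in range(n)]
    D1 = [(cD1[i] - cD1[i - 1]) / r for i in range(n)]
    return A1, D1

# Hypothetical samples, for illustration only.
S = [1.0, 3.0, 2.0, 5.0, 4.0, 4.0, 0.0, -1.0]
cA1, cD1 = swt_haar_level1(S)
A1, D1 = reconstruct_level1(cA1, cD1)
max_error = max(abs(a + d - s) for a, d, s in zip(A1, D1, S))
```

Here A1 works out to a three-point smoothing of S and D1 to the remainder, so A1 + D1 reproduces S up to rounding.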
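Figure 1. Filter bank of the first order undecimated discrete wavelet transform.
3.3. Random Noise Suppression by Using the Autoregressive Vector Operator in the Wavelet Domain
In order to increase the signal-to-noise ratio (S/N) of a multicomponent signal, we first estimate a vector autoregressive operator, $\mathbf{A}_k$ ($k = 1, \dots, M$), from the noisy data, $\mathbf{d}_n$, where n indexes the samples. The forward noise suppression estimate then follows as (Naghizadeh & Sacchi, 2012)
$$\hat{\mathbf{d}}^{f}_{n} = \sum_{k=1}^{M} \mathbf{A}_{k}\,\mathbf{d}_{n-k} \tag{7}$$
Similarly, the backward noise suppression estimate is given by
$$\hat{\mathbf{d}}^{b}_{n} = \sum_{k=1}^{M} \mathbf{A}^{*}_{k}\,\mathbf{d}_{n+k} \tag{8}$$
where $\mathbf{A}^{*}_{k}$ represents the complex conjugate of the autoregressive operator. The final value of the denoised data, obtained by averaging the forward and backward values, is
$$\hat{\mathbf{d}}_{n} = \tfrac{1}{2}\left(\hat{\mathbf{d}}^{f}_{n} + \hat{\mathbf{d}}^{b}_{n}\right) \tag{9}$$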
First, the noisy signal is transformed into the wavelet domain, where it is filtered with the AR operator. After filtering, the inverse wavelet transform is applied to generate the denoised version of the original signal.
4. The f-x Domain
4.1. General Basics of the Predictive Filter in the f-x Domain
Prediction in the f-x domain is a successful method for removing random noise from seismic data, as linear events can be fully predicted using a Wiener prediction filter. A good prediction filter may also be used to interpolate missing data in the absence of wide gaps; for large gaps, an average of filters should be used. The main problem is obtaining an autocorrelation function for sparse data. The Burg technique has been used for this purpose, since it handles missing data in the same well-known way as it handles the missing end points.
In general, a signal in the time domain is treated as a complex signal in the frequency domain, where we can extract information such as the phase, amplitude, and energy spectra. The energy distribution as a function of frequency is given by (10):
$$E(\omega) = \left|X(\omega)\right|^{2} \tag{10}$$
where X(ω) is the Fourier transform of the time-domain signal.
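The energy distribution over frequency can be illustrated with a direct discrete Fourier transform. The sketch below is a hedged example with an assumed test signal (a cosine with five cycles over 64 samples); the function names are illustrative, and a real workflow would use an FFT routine instead of this O(N²) sum.

```python
import cmath
import math

def dft(x):
    """Direct discrete Fourier transform of a real or complex sequence."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]

def energy_spectrum(x):
    """Squared magnitude of each DFT coefficient."""
    return [abs(c) ** 2 for c in dft(x)]

# Hypothetical test signal: 5 cycles of a cosine over 64 samples.
n = 64
x = [math.cos(2.0 * math.pi * 5 * t / n) for t in range(n)]
E = energy_spectrum(x)

# Strongest bin among the non-negative frequencies.
peak_bin = max(range(n // 2 + 1), key=lambda k: E[k])
```

The energy concentrates in the bin that matches the cosine frequency (and its mirror image), while the phase of the same coefficients carries the timing information.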
Since linear events in the input signal are represented as sinusoids in position in the f-x domain, we discuss the prediction filter in this domain. This method assumes that the traces are composed of delayed impulses, as shown in Figure 2.
Figure 2. Schematics of the delayed traces.
The f-x prediction filter can be widely applied to noise reduction. In this method, we assume that the events in the data are linear. If they are not completely linear, we can divide the seismic section into shorter windows to satisfy the linearity condition. Assume that a seismic trace, u(x, t), is a train of impulses with various amplitudes:
$$u(x,t) = \sum_{j} A_{j}\,\delta\big(t - g_{j}(x)\big) \tag{11}$$
where t is time, x is the lateral position, $A_j$ is the amplitude of the jth impulse, and $g_j(x)$ is the delay function describing the shapes of the events. By applying a Fourier transform with respect to time to (11), we have
$$U(x,\omega) = \sum_{j} A_{j}\,e^{-i\omega g_{j}(x)} \tag{12}$$
where ω is angular frequency. Since exponential functions can be rewritten as sines and cosines, the seismic traces in (12) are in fact a set of sines and cosines as functions of ω and x. Because f-x filters only predict linear data, the events in question have to be linear with respect to x, and therefore the $g_j(x)$ functions must be linear (Canales, 1984).
Assuming linear events, so that each $g_j(x)$ equals $b_j x$ up to a constant delay, U(x, ω) can be written in the frequency domain as (13):
$$U(x,\omega) = \sum_{j} C_{j}\,e^{-i\omega b_{j}x} \tag{13}$$
where $C_j$ is a complex constant determined by the source power and reflection coefficients (it also absorbs any constant delay), and $b_j$ is the slope of the jth linear event. (13) shows that linearity results in a U(x, ω) that, for each event, is a perfect sinusoid in x, which means it is a periodic and predictable function. Therefore, the signal is a predictable exponential function of x in the frequency-space domain.
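The predictability of a linear event can be checked numerically. In the sketch below, a single linear event is sampled at one frequency across equally spaced traces, and each trace is predicted from its neighbor with a one-term complex filter. The numerical values (frequency, slope, trace spacing, constant) are hypothetical, and a section with several events would need a longer filter.

```python
import cmath
import math

def linear_event_spectrum(c, omega, slope, dx, n_traces):
    """Samples of U(x_n, omega) = c * exp(-i * omega * slope * x_n), with x_n = n * dx."""
    return [c * cmath.exp(-1j * omega * slope * dx * n) for n in range(n_traces)]

# Hypothetical values, for illustration only.
omega = 2.0 * math.pi * 50.0   # angular frequency (rad/s)
slope = 0.002                  # event slope b (s/m)
dx = 1.0                       # trace spacing (m)
u = linear_event_spectrum(2.0 + 1.0j, omega, slope, dx, 16)

# One-step prediction filter: U(x_n) = a * U(x_{n-1}) for a single linear event.
a = cmath.exp(-1j * omega * slope * dx)
predicted = [a * u[n - 1] for n in range(1, len(u))]
max_err = max(abs(p - v) for p, v in zip(predicted, u[1:]))
```

The prediction error is at rounding level for the noise-free event, which is the property that f-x prediction filters exploit: the predictable part is kept and the unpredictable random noise is rejected.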
4.2. Wiener Filter
The Wiener filter is an efficient, stable, linear filter that is applied to noisy images. It requires the assumption that both the signal and the noise are second-order stationary processes, and the noise is assumed to have zero mean. The method is based on minimizing the sum of the squared differences between the desired and the actual output signals. The Wiener filter cannot reconstruct frequency components that are contaminated with noise and simply suppresses them. It also cannot neutralize images and is slow. To improve the filter’s speed, we can apply an inverse FFT to obtain the impulse response.
5. Autoregressive Vector
Classic models of time series are divided into stationary and nonstationary. Autoregression (AR) is a class of classic stationary time series models which we discuss in this study.
A limitation of the models considered so far is that they impose a one-way relationship: the predicted variable is affected by the predictor variables, but not vice versa. In many cases, however, the reverse must also be allowed, because the variables affect each other. Such relationships are allowed in the framework of vector autoregression, where all variables are treated symmetrically.
5.2. Mathematical Framework
In autoregressive models, a variable is predicted using a linear combination of its previous values. The term “autoregression” reflects that the variable is regressed against itself. An autoregressive model is defined as
$$y_t = c + \sum_{i=1}^{p} a_i\,y_{t-i} + \varepsilon_t \tag{14}$$
where c is a constant, $\varepsilon_t$ is white noise (with zero mean and variance $\sigma^2$), and the $a_i$ are the model parameters. A model of this form is called an autoregressive model of order p, or AR(p). The structure of the first-order autoregressive model, AR(1), shown in (15), is simple, useful, and applicable to a wide range of problems.
$$y_t = c + a_1\,y_{t-1} + \varepsilon_t \tag{15}$$
where we assume:
1) the residuals have zero mean: $E[\varepsilon_t] = 0$
2) the errors are not autocorrelated: $E[\varepsilon_t\,\varepsilon_s] = 0$ for $t \neq s$
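A first-order autoregressive model can be simulated and its parameters recovered by ordinary least squares, as sketched below. The parameter values, the seed, and the function names are assumptions for illustration; this is not the estimation procedure used in the study.

```python
import random

def simulate_ar1(c, a1, sigma, n, seed=0):
    """Simulate y_t = c + a1 * y_{t-1} + eps_t with Gaussian white noise."""
    rng = random.Random(seed)
    y = [c / (1.0 - a1)]  # start at the stationary mean (requires |a1| < 1)
    for _ in range(n - 1):
        y.append(c + a1 * y[-1] + rng.gauss(0.0, sigma))
    return y

def fit_ar1(y):
    """Least-squares regression of y_t on y_{t-1}; returns (c, a1)."""
    x, z = y[:-1], y[1:]
    n = len(x)
    mx, mz = sum(x) / n, sum(z) / n
    sxx = sum((xi - mx) ** 2 for xi in x)
    sxz = sum((xi - mx) * (zi - mz) for xi, zi in zip(x, z))
    a1 = sxz / sxx
    return mz - a1 * mx, a1

# Hypothetical parameters, for illustration only.
series = simulate_ar1(c=1.0, a1=0.6, sigma=1.0, n=5000)
c_hat, a1_hat = fit_ar1(series)
```

With a few thousand samples the estimate of the coefficient should land close to the value used in the simulation; shorter series give noisier estimates.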
Autoregressive models are remarkably flexible in handling a wide range of time series. Figure 3 shows the main steps of vector autoregressive analysis.
5.3. Interpreting Vector Autoregressive Models
Autoregressive models do not, by themselves, allow us to draw conclusions about causal relationships. This is especially true because they are generally designed to describe time series whose underlying structure is unknown. Causal interpretations require an underlying structural model, such as an economic model. However, autoregression does allow the dynamic relationships between the variables to be interpreted.
In summary, vector autoregression is useful for:
1) Predicting a set of related variables where no explicit interpretation is required.
2) Testing whether or not a variable is useful in predicting another variable.
3) Impulse response analysis, where the response of a variable to an abrupt but temporary change in another variable is analyzed.
4) Forecast error variance decomposition, where a part of the forecast variance of a variable is attributed to the other variables.
Figure 3. Flowchart for the main steps in the autoregressive vector analysis  .
5.4. Random Noise Reduction Using Autoregressive Vector Operator
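In order to increase the signal-to-noise ratio (S/N) in a multicomponent signal, we first calculate the vector autoregressive operator, $\mathbf{A}_k$ ($k = 1, \dots, M$), for the noisy data, $\mathbf{d}_n$. Then, the forward estimate of the denoised data is given as
$$\hat{\mathbf{d}}^{f}_{n} = \sum_{k=1}^{M} \mathbf{A}_{k}\,\mathbf{d}_{n-k}$$
Similarly, the backward estimate of the denoised data is
$$\hat{\mathbf{d}}^{b}_{n} = \sum_{k=1}^{M} \mathbf{A}^{*}_{k}\,\mathbf{d}_{n+k}$$
where $\mathbf{A}^{*}_{k}$ is the complex conjugate of the vector autoregressive operator. The final estimate of the denoised data is the average of the forward and backward estimates:
$$\hat{\mathbf{d}}_{n} = \tfrac{1}{2}\left(\hat{\mathbf{d}}^{f}_{n} + \hat{\mathbf{d}}^{b}_{n}\right)$$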
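The forward, backward, and averaged estimates can be sketched in a simplified scalar form. The example below is not the vector operator of Naghizadeh & Sacchi (2012): it assumes a single component, a first-order operator estimated by least squares, and a hypothetical noisy complex sinusoid standing in for one frequency slice of the data.

```python
import cmath
import random

def fit_ar1_complex(d):
    """Least-squares coefficient a such that d[n] is approximately a * d[n-1]."""
    num = sum(d[n] * d[n - 1].conjugate() for n in range(1, len(d)))
    den = sum(abs(d[n - 1]) ** 2 for n in range(1, len(d)))
    return num / den

def forward_backward_denoise(d):
    """Average the forward and backward one-step predictions of each sample."""
    a = fit_ar1_complex(d)
    out = []
    for n in range(len(d)):
        estimates = []
        if n >= 1:
            estimates.append(a * d[n - 1])              # forward estimate
        if n + 1 < len(d):
            estimates.append(a.conjugate() * d[n + 1])  # backward estimate
        out.append(sum(estimates) / len(estimates))     # end points use one estimate
    return out

# Hypothetical data: a unit-amplitude complex sinusoid plus Gaussian noise.
rng = random.Random(1)
clean = [cmath.exp(-0.3j * n) for n in range(128)]
noisy = [c + complex(rng.gauss(0.0, 0.2), rng.gauss(0.0, 0.2)) for c in clean]
denoised = forward_backward_denoise(noisy)

err_noisy = sum(abs(u - v) ** 2 for u, v in zip(noisy, clean))
err_denoised = sum(abs(u - v) ** 2 for u, v in zip(denoised, clean))
```

Averaging the two predictions combines noise from two different neighboring samples, which is why the squared error of the averaged estimate is expected to be smaller than that of the noisy input for a predictable signal.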
6. Applying Autoregressive Filters to GPR Data in the f-x and Wavelet Domains
6.1. Applying the Method on Synthetic GPR Data
GPR uses electromagnetic waves, emitted and recorded through transmitter and receiver antennas, to determine the depth and trend of anomalies. The waves emitted from the transmitter antenna arrive at the target, are reflected, and are recorded by the receiver. These steps can be simulated to produce synthetic data. Since we aim to study random noise, it has to be added to the resulting section. In order to generate synthetic data, we first assume an Earth model with arbitrary coefficients. Owing to the similarity of electromagnetic and seismic waves, the modeling of synthetic GPR data is similar to that of seismic data. We have used the Ricker wavelet, which is formulated as
$$\omega(t) = \left(1 - 2\pi^{2}f^{2}t^{2}\right)e^{-\pi^{2}f^{2}t^{2}}$$
where ω(t) is the Ricker wavelet, and t and f are the time and the central frequency of the wavelet, respectively. The Ricker wavelet is symmetric in time and has zero mean ($\int_{-\infty}^{\infty}\omega(t)\,dt = 0$).
The synthetic signal can be formulated as
$$X(t) = S(t) * \omega(t) + n(t)$$
where the received signal, S(t), is convolved with the Ricker wavelet, ω(t), and n(t) is the random noise added to the input signal. Figure 4 compares the application of the autoregressive and wavelet autoregressive filters to a noisy section, which was created in the MATLAB environment.
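The construction of a single synthetic trace can be sketched as follows. The study generated its sections in MATLAB; this Python version is only an illustration, and the reflectivity spikes, central frequency, sampling interval, wavelet length, and noise level are all hypothetical values.

```python
import math
import random

def ricker(f, dt, half_len):
    """Ricker wavelet sampled at t = -half_len*dt, ..., +half_len*dt."""
    out = []
    for i in range(-half_len, half_len + 1):
        a = (math.pi * f * i * dt) ** 2
        out.append((1.0 - 2.0 * a) * math.exp(-a))
    return out

def convolve_same(s, w):
    """Linear convolution of s with w, trimmed to len(s) and centred on w."""
    half = len(w) // 2
    full = [0.0] * (len(s) + len(w) - 1)
    for i, si in enumerate(s):
        if si == 0.0:
            continue  # skip empty samples of the sparse series
        for j, wj in enumerate(w):
            full[i + j] += si * wj
    return full[half:half + len(s)]

# Hypothetical model: two reflectors in a 200-sample trace.
reflectivity = [0.0] * 200
reflectivity[50] = 1.0
reflectivity[120] = -0.5

wavelet = ricker(f=100e6, dt=0.5e-9, half_len=30)   # assumed 100 MHz, 0.5 ns
clean = convolve_same(reflectivity, wavelet)

rng = random.Random(0)
noisy = [c + rng.gauss(0.0, 0.1) for c in clean]    # additive Gaussian white noise
```

Repeating this for every trace of an assumed Earth model gives a noisy section on which the filters can be compared against the known noise-free result.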
Here, we created a synthetic GPR section with a sampling interval of 40 ns and Gaussian white noise, as shown in Figure 4(b), to study the efficiency of the autoregressive filter in the wavelet domain. As shown in Figure 4(b), the layers, especially the narrower ones, are to some extent obscured, and the noise has made the boundaries so vague that the bulges and trends of the layers are damaged and noisy throughout the section.
6.2. Applying the Autoregressive Filter on GPR Data in the f-x Domain
We apply the autoregressive filter in the f-x space to denoise the section, as shown in Figure 4(c). As we can see in Figure 4(c), after applying the filter, noise is removed and the layers are more visible; however, many of the boundaries have become murky. Structures with low amplitude are most affected by denoising, as opposed to the high-amplitude structures, which are more visible, and therefore the layers have not been reconstructed efficiently.
Figure 4. (a) Synthetic GPR section; (b) noisy section; (c) filtered by autoregression; (d) filtered by autoregression in the wavelet domain.
6.3. Applying the AR Filter in the Wavelet Domain
Here we have used the undecimated discrete wavelet transform to apply the autoregressive filter to the noisy section in the wavelet domain. We note that the applied filter is linear, which is a great advantage in noise reduction because of the linearity of the wavelet space.
The section denoised with this method is shown in Figure 4(d). The result looks promising, since the layer boundaries are distinguishable and there is a logical, smooth trend throughout the section. Both the low- and high-amplitude structures are better defined than with the autoregressive filter in the f-x domain. We also note that the layer trends are efficiently reconstructed.
6.4. Applying the Method on Real Data
Here, we apply the autoregressive model to real data, first in the f-x domain and then in the wavelet domain, in order to remove random noise, and then compare the results. For better comparison, Figure 6 shows the section from trace #450 onward.
We first suppress the noise in the f-x domain by applying the autoregressive filter to the noisy real data shown in Figure 5(a), where the boundaries and traces (especially after trace #450) are difficult to distinguish and the bulges are lost in the noise. As shown in Figure 5(b), the autoregressive filter in the f-x domain has reduced the noise and the layer trends are visible.
As mentioned before, the noise is random and does not have a specific source. As we can see in Figure 5(a), the right (and especially the lower right) portion of the section is very noisy, as noise has covered all the trends and layers. Figure 5(c) shows the denoised section after applying the autoregressive filter in the wavelet domain. As we can see in Figure 5(c), the trends are well defined, especially in the lower right corner of the section, and the previously vague parts can now be distinguished to a much greater extent. Therefore, the method has successfully reduced the noise.
6.5. Comparing the Application of AR in the Wavelet and f-x Domains
Considering the previous sections on the wavelet and f-x domains, as well as the better performance of the filter in the wavelet domain, here we compare denoising in both domains. By comparing Figure 5(b) and Figure 5(c), we notice the better performance of the filter in denoising the data in the wavelet domain. Noise reduction is more evident especially at the later arrival times, which correspond to greater depths. We note that the boundaries are better represented and more distinguishable. Also, by comparing Figure 6(b) and Figure 6(c), we can see that the wavelet domain has again performed much better in reducing the noise and retrieving the real signal (the layer trends are more visible). Overall, the autoregressive filter in the wavelet domain has done a better job of retrieving the signal, owing to the linearity of the wavelet domain.
Figure 5. The GPR data section; (a) raw data; (b) after applying autoregression in the f-x domain; (c) after applying autoregression in the wavelet domain.
Figure 6. The GPR data section from trace 450 to the end; (a) raw data; (b) after applying autoregression in the f-x domain; (c) after applying autoregression in the wavelet domain.
As discussed above, various factors such as phone networks, power posts, and utility poles contaminate GPR data. Since the goal of this study was to increase the signal-to-noise ratio, a method was chosen that damages the signal as little as possible while reducing the noise. In this study, the noise suppression procedure was applied to both synthetic and real GPR data in the f-x and wavelet domains using an autoregressive filter. As we have seen, noise reduction improves the interpretation of the data, and the autoregressive filter gives good results in both the f-x and wavelet domains. Autoregression in the wavelet domain nevertheless leads to better results than in the f-x domain, owing to the local nature of the wavelet transform and the linearity imposed on the events at different scales.
We should also note that the autoregressive filter is linear, as is the wavelet domain, in contrast with the f-x domain, which is why the filter does not work as well in the f-x domain.
Luetkepohl, H. (2011) Vector Autoregressive Models, Economics Departments, Institutes and Research Centers in the World. Retrieved from IDEAS.