Introduction to Adaptive Fourier Decomposition

The adaptive Fourier decomposition is an adaptive signal decomposition:

  • Compared with the conventional signal decomposition methods, e.g. Fourier decomposition and wavelet decomposition, the AFD uses adaptive orthogonal basis. These adaptive basis can make the decomposition components match the processed signals best and thus can provide good time-frequency resolution without pre-defined basis.

  • Compared with other adaptive decomposition methods, e.g. empirical mode decomposition (EMD), the AFD has the rigorous mathematical foundation, which allowed users further analyze detailed decomposition coefficients and decomposition components.

Chinese introduction blogs related to the basic principles of the AFD and the implementation of the fast AFD can be found in zewang.site.

Basic Idea

The AFD decomposes the processed signal \(g(t)\) to a series of orthogonal signals, which can be expressed as

\[g(t) = \sum_{n=1}^\infty A_n B_n(t),\]

where \(A_n\) is the decomposition coefficient, \(B_n(t)\) is the decomposition basis component, and \(n\) denotes the decomposition level. The decomposition is from \(n=1\) to \(n=\infty\). In each decomposition level, a suitable basis component is generated adaptively to make sure that the corresponding decomposition component matches the processed signal best. Such decomposition process can provide a fast energy convergence. In other words, the AFD applies matching pursuit process to provide a sparse approximation of the processed signal.

It should be noticed that, before using the AFD, the recorded signals should be transferred to their analytic form. This transform can be achieved by the Hilbert transform, which is

\[g(t)=s(t)+\mathcal{H}\left\{s(t)\right\},\]

where \(s(t)\) is the recorded signal.

Decomposition Basis

The AFD uses the Blaschke product, or say the rational orthogonal system, \(\left\{B_n\right\}_{n=1}^\infty\) as its basis where

\[B_n(t)=\frac{\sqrt{1-\left| a_n \right|^2}}{1-\overline{a}_ne^{jt}}\prod_{k=1}^{n-1}\frac{e^{jt}-a_k}{1-\overline{a}_ke^{jt}}.\]

It should be noticed that, in the decomposition level \(n\), the basis component \(B_n\) is only determined by \(\left\{a_k\right\}_{k=1}^n\). Suppose parameters \(a_k\) from \(k=1\) to \(k=n-1\) have been obtained in previous decomposition levels, only \(a_n\) needs to be found in the decomposition level \(n\).

It should be noticed that \(t\) in \(B_n(t)\) does not denote the time sample but denotes the phase. \(t\) in \(B_n(t)\) is like \(\theta\) in \(\sin(\theta)\). Normally, \(t\) is defined from \(0\) to \(2\pi\). According to requirements of specific applications, \(t\) also can be defined in other ranges.

The basis parameter \(a_n\) is defined in the unit circle \(\mathbb{D}\) of the complex plane \(\mathbb{C}\) where \(\mathbb{D}=\left\{ z\in\mathbb{C}: \left| z \right|<1 \right\}\). The effects of \(a_n\) for \(B_n(t)\) are similar as the effects of shift and scaling factors for the wavelet decomposition as shown in the following figure. Moreover, when \(a_1= \cdots =a_n=0\), the basis component of the AFD becomes the basis component of the conventional Fourier decomposition.

_images/AFD_Bn_diff_a.png

Decomposition Process

As mentioned in Basic Idea, the AFD adopts the idea of the matching pursuit to achieve the fast energy convergence. There are different extensions (or say versions) of the AFD. Although these extensions of the AFD all uses the rational orthogonal system and have the same target that is to achieve the fast energy convergence, they apply different decomposition process.

In this toolbox, the following extensions of the AFD have been included.

Core AFD

The core AFD is the fundamental implementation of the AFD. It is an iterative decomposition process. In each decomposition level \(n\), the basis parameter \(a_n\) is searched by

\[a_n = \arg\max_{a\in\mathbb{A}}\left\{ \left| \left< G_n(t),\text{e}_{\left\{ a \right\}}(t) \right> \right| \right\},\]

where \(G_n(t)\), called reduced remainder, is computed from the remainder in the last decomposition level and \(\text{e}_{\left\{ a \right\}}(t)\) is the evaluator of the basis parameter \(a\). \(\left| \left< G_n(t),\text{e}_{\left\{ a \right\}}(t) \right> \right|\) is the extracted energy when using \(a\) as \(a_n\). The detailed evaluation process can be found in “Algorithm of adaptive Fourier decomposition”.

The reduced remainder \(G_n(t)\) can be computed by

\begin{eqnarray} G_n(t) & = & R_{n-1}(t)\prod_{k=1}^{n-1}\frac{1-\overline{a}_d e^{jt}}{e^{jt}-a_d}\\ & = & \left( G_{n-1}(t)-A_{n-1}\text{e}_{\left\{a_{n-1}\right\}}(t) \right)\frac{1-\overline{a}_{n-1}e^{jt}}{e^{jt}-a_{n-1}}. \end{eqnarray}

The evaluator \(\text{e}_{\left\{ a \right\}}(t)\) is defined as

\[\text{e}_{\left\{ a \right\}}(t) = \frac{\sqrt{1-\left|a\right|^2}}{1-\overline{a}e^{jt}}.\]

The set \(\mathbb{A}\) is the searching range of \(a_n\). Normally, it is set as \(\mathbb{A}=\mathbb{D}\). According to requirements of specific applications, \(\mathbb{A}\) also can be set as a subset of \(\mathbb{D}\). It should be noticed that, in real implementation, it is impossible to scan all possible values in \(\mathbb{A}\). Therefore, \(\mathbb{A}\) needs to be discretized. In this toolbox, there are several different ways to discretize \(\mathbb{A}\). Of course, the higher density of points in the discretized \(\mathbb{A}\), the higher accurate optimization results of \(a_n\). The discretized \(\mathbb{A}\) is called search dictionary of \(a_n\).

The basic decomposition process of the core AFD is shown below.

_images/core_AFD.svg

Unwinding AFD

The unwinding AFD is similar as the core AFD but considers the inner function. The inner function is identical with the Blaschke product defined by zeros of the reduced remainder. The inner functions can be considered as a stable oscillations. If the processed signals contain stable oscillations, you may would like to use the unwinding AFD to achieve faster energy convergence.

By considering the inner functions, the decomposition basis components can be described as

\[B_{n,\text{unwinding}}(t)=B_n(t)\prod_{i=1}^nI_i(t),\]

where the inner function \(I_i(t)\) is

\[I_i(t)=\prod_{h=1}^{H_i}\frac{e^{jt}-r_{i,h}}{1-\overline{r}_{i,h}e^{jt}}.\]

The basis parameter \(a_n\) can be searched by

\[a_n = \arg\max_{a\in\mathbb{A}}\left\{ \left| \left< \frac{G_n(t)}{I_n(t)},\text{e}_{\left\{ a \right\}}(t) \right> \right| \right\}.\]

The parameters \(\left\{r_{n,h}\right\}_{h=1}^{H_n}\) are zeros of the reduced remainder. They are defined in \(\mathbb{D}\) and need to satisfy \(G_n(r_{n,1})=G_n(r_{n,2})= \cdots =G_n(r_{n,H_n})=0\). According to the Cauchy formula, this requirement can be represented as \(\left< G_n(t),\frac{1}{1-\overline{r}_{n,h}e^{jt}} \right>=0 \forall h=1,\cdots,H_n\).

In real implementation, \(r_{n,h}\) can be searched iteratively, whcih is similar as the searching process of \(a_n\). The searching process is from \(h=1\) to \(h=H_n\). \(r_{n,h}\) can be searched by solving

\begin{eqnarray} \text{minimize} & \; & \left| \left< G_n(t)\prod_{i=1}^{h-1}\frac{1-\overline{r}_{n,i}e^{jt}}{e^{jt}-r_{n,i}},\frac{1}{1-\overline{r}e^{jt}} \right> \right|\\ \text{subject to} & \; & r\in\mathbb{R} \text{ and } \left| \left< G_n(t),\frac{1}{1-\overline{r}e^{jt}} \right> \right|<\epsilon. \end{eqnarray}

\(\mathbb{R}\) is the searching range of zeros. To simplify the computation, the searching dictionary of zeros is set as the same as the searching dictionary of \(a_n\). Moreover, \(\epsilon\) is threshold to check whether the objective function value is close to 0 and thus should be set as a very small value.

The basic decomposition process of the unwinding AFD is shown below.

_images/unwinding_AFD.svg

Improving Computational Efficiency

As mentioned above, the searching processes of parameters \(a_n\) and \(r_{n,h}\) are the key decomposition steps in the core AFD and the unwinding AFD. They are all based on exhaustive searching, which means that the objective function values are evaluated one by one. As the number of points in the searching dictionary increases, the computational time will increase. To improve the computational efficiency, the fast AFD is proposed. Based on the convolution theory, the computations of objective function values can be simplified by the FFT.

In the fast AFD, the points in the searching dictionaries of \(a_n\) and \(r_{n,h}\) are represented by their amplitudes and phases, which are

\begin{eqnarray} a_n&=&\rho_ne^{j\theta_n}\text{ and }\\ r_{n,h}&=&\alpha_{n,h}e^{j\gamma_{n,h}}. \end{eqnarray}

In the core AFD, suppose \(t\) in the objective function is same as the phase \(theta\) in the searching dictionary, then the searching process of the basis parameter \(a_n\) can be represented as

\[\rho_n,\; \theta_n=\arg\max_{\rho e^{j\theta}\in\mathbb{A}}\left\{ \left| \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ G_n(\theta) \right\} \cdot \mathcal{F}\left\{ \text{e}_{\left\{\rho\right\}}(\theta) \right\} \right\} \right| \right\},\]

where \(\mathcal{F}\) and \(\mathcal{F}^{-1}\) denote the FFT and the inverse FFT.

In the unwnding AFD, suppose \(t\) in the objective function is same as the phase \(theta\) in the searching dictionary, then the searching process of the basis parameter \(a_n\) can be represented as

\[\rho_n,\; \theta_n=\arg\max_{\rho e^{j\theta}\in\mathbb{A}}\left\{ \left| \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ \frac{G_n(\theta)}{I_n(\theta)} \right\} \cdot \mathcal{F}\left\{ \text{e}_{\left\{\rho\right\}}(\theta) \right\} \right\} \right| \right\}.\]

And the zeros can be searched by

\begin{eqnarray} \text{minimize} & \; & \left| \mathcal{F}^{-1}\left\{ \mathcal{F}\left\{ G_n(\gamma) \right\}\cdot\mathcal{F}\left\{ \frac{1}{1-\alpha e^{j\gamma}} \right\} \right\} \right|\\ \text{subject to} & \; & \alpha e^{j\gamma}\in\mathbb{R} \text{ and } \left| \left< G_n(t),\frac{1}{1-\alpha e^{j(t-\gamma)}} \right> \right|<\epsilon. \end{eqnarray}

Although such implementation can significantly improve the computational efficiency, the fast AFD has some limitations:

  • Points in the searching dictionaries must be distributed based on their amplitudes and phases. Users cannot define their own searching dictionaries.

  • The phases of points in the searching dictionaries must be same as the phase of signal, which means that, when users change the phase of signal, the phase of points in the searching dictionary will also be changed.

Multi-channel AFD

The AFD generates its decomposition basis components based on the processed signals. When the single channel AFD decomposes signals channel by channel, different parameters, i.e. \(\left\{a_n\right\}_{n=1}^N\) and \(\left\{\left\{r_{n,h}\right\}_{h=1}^{H_n}\right\}_{n=1}^N\), will be searched for different channels, which leads different sets of basis components for different channels. However, if the processed multi-channel signals are recorded from the same system or contain same components, we would like to use same set of basis components to conduct the decomposition.

Suppose that the processed signal contain total \(C\) channels, then, in the core AFD, the parameters of decomposition components can be searched by

\[a_n = \arg\max_{a\in\mathbb{A}}\left\{ \sum_{c=1}^C\left| \left< G_{n,c}(t),\text{e}_{\left\{ a \right\}}(t) \right> \right| \right\}.\]

In the unwinding AFD, zeros can be searched by

\begin{eqnarray} \text{minimize} & \; & \sum_{c=1}^C\left| \left< G_{n,c}(t)\prod_{i=1}^{h-1}\frac{1-\overline{r}_{n,i}e^{jt}}{e^{jt}-r_{n,i}},\frac{1}{1-\overline{r}e^{jt}} \right> \right|\\ \text{subject to} & \; & r\in\mathbb{R} \text{ and } \sum_{c=1}^C\left| \left< G_{n,c}(t),\frac{1}{1-\overline{r}e^{jt}} \right> \right|<\epsilon. \end{eqnarray}

And the parameters of decomposition components in the unwinding AFD can be searched by

\[a_n = \arg\max_{a\in\mathbb{A}}\left\{ \sum_{c=1}^C\left| \left< \frac{G_{n,c}(t)}{I_{n,c}(t)},\text{e}_{\left\{ a \right\}}(t) \right> \right| \right\}.\]

It should be noticed that,

  • If the processed multi-channel signals do not contain same components or are not suitable to be analyzed by same basis components, the multi-channel AFD cannnot provide good performance.

  • Suppose values of \(t\) are not same for different channels, the values of basis components are different. However, the parameters \(a_n\) and \(r_{n,h}\) are same for all channels.