# Regularization¶

If one model has a misfit equal to the desired tolerance, then there are infinitely many other models that fit the data to the same degree. The challenge is to find the model that has the desired characteristics and is compatible with a priori information. A single model can be selected from an infinite ensemble by measuring the length, or norm, of each model, and then isolating the smallest (or sometimes largest) member. Our goal is to design a norm that embodies our prior knowledge and, when minimized, yields a realistic candidate for the solution of our problem. The norm can penalize deviation from a reference model, spatial derivatives of the model, or some combination of these.

## Tikhonov Regularization¶

Here we define the regularization in terms of a model $m$. In general this should be thought of as $(m - m_\text{ref})$, but the development is otherwise identical:

$R(m) = \int_\Omega \frac{\alpha_x}{2}\left(\frac{\partial m}{\partial x}\right)^2 + \frac{\alpha_y}{2}\left(\frac{\partial m}{\partial y}\right)^2 \,\mathrm{d}v$

Our discrete gradient operator acts on cell centers and gives the derivative on the cell faces, which is not where we want to evaluate this integral. We need to average the values back to the cell centers before we integrate. To avoid introducing a null space, we square first and then average. In 2D, using $ij$ notation, the discretization is:

$\begin{split}R(m) \approx \sum_{ij} \left[\frac{\alpha_x}{2}\left[\left(\frac{m_{i+1,j} - m_{i,j}}{h}\right)^2 + \left(\frac{m_{i,j} - m_{i-1,j}}{h}\right)^2\right] \\ + \frac{\alpha_y}{2}\left[\left(\frac{m_{i,j+1} - m_{i,j}}{h}\right)^2 + \left(\frac{m_{i,j} - m_{i,j-1}}{h}\right)^2\right] \right]h^2\end{split}$

If we let $\mathbf{D}_1$ and $\mathbf{D}_2$ be the derivative matrices in the x and y directions, they are built from Kronecker products:

$\mathbf{D}_1 = \mathbf{I}_2\otimes\mathbf{d}_1$
$\mathbf{D}_2 = \mathbf{d}_2\otimes\mathbf{I}_1$

where $\mathbf{d}_1$ is the one-dimensional difference operator:

$\begin{split}\mathbf{d}_1 = \frac{1}{h} \left[ \begin{array}{cccc} -1 & 1 & & \\ & \ddots & \ddots&\\ & & -1 & 1\end{array} \right]\end{split}$
$R(m) \approx \mathbf{v}^\top \left[\frac{\alpha_x}{2}\mathbf{A}_1 (\mathbf{D}_1 m) \odot (\mathbf{D}_1 m) + \frac{\alpha_y}{2}\mathbf{A}_2 (\mathbf{D}_2 m) \odot (\mathbf{D}_2 m) \right]$
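This discrete expression can be assembled directly with scipy.sparse. The sketch below assumes a uniform $n \times n$ tensor mesh with spacing $h$, and a face-to-cell averaging matrix whose boundary rows (cells touching only one face) take that single face with weight one; both are illustrative assumptions, not SimPEG's exact operators.

```python
import numpy as np
import scipy.sparse as sp

n, h = 4, 1.0                        # hypothetical uniform n-by-n mesh
I = sp.identity(n)

def deriv1d(n, h):
    """1D forward difference: (n-1) faces by n cells."""
    return sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1],
                    shape=(n - 1, n)) / h

def ave1d(n):
    """Average interior faces back to cells: n cells by (n-1) faces.
    Interior cells average their two faces; boundary cells take
    their single face (an assumed boundary treatment)."""
    A = sp.diags([np.ones(n - 1), np.ones(n - 1)], [0, -1],
                 shape=(n, n - 1))
    return sp.diags(1.0 / np.asarray(A.sum(axis=1)).ravel()) @ A

D1 = sp.kron(I, deriv1d(n, h))       # x-derivative: cells -> x-faces
D2 = sp.kron(deriv1d(n, h), I)       # y-derivative: cells -> y-faces
A1 = sp.kron(I, ave1d(n))            # x-faces -> cells
A2 = sp.kron(ave1d(n), I)            # y-faces -> cells

v = np.full(n * n, h ** 2)           # cell volumes
alpha_x = alpha_y = 1.0
m = np.random.randn(n * n)

# v^T [ (alpha_x/2) A1 (D1 m)o(D1 m) + (alpha_y/2) A2 (D2 m)o(D2 m) ]
Rm = v @ (0.5 * alpha_x * A1 @ (D1 @ m) ** 2
          + 0.5 * alpha_y * A2 @ (D2 @ m) ** 2)
```

A constant model gives Rm equal to zero, since only gradients are penalized.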

Recall that the Hadamard product $\odot$ is just point-wise multiplication, equivalent to a diagonal matrix times a vector. Because it is point-wise, the two factors can be interchanged:

$\mathbf{a\odot b} = \text{diag}(\mathbf{a})\mathbf{b} = \text{diag}(\mathbf{b})\mathbf{a} = \mathbf{b\odot a}$

and the corresponding transpose identity also holds (provided the sizes are compatible):

$\mathbf{a}^\top\text{diag}(\mathbf{b}) = \mathbf{b}^\top\text{diag}(\mathbf{a})$
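Both identities are easy to confirm numerically with NumPy:

```python
import numpy as np

a = np.array([1.0, 2.0, 3.0])
b = np.array([4.0, 5.0, 6.0])

# a o b = diag(a) b = diag(b) a: point-wise products commute
assert np.allclose(a * b, np.diag(a) @ b)
assert np.allclose(np.diag(a) @ b, np.diag(b) @ a)

# transpose identity: a^T diag(b) = b^T diag(a)
assert np.allclose(a @ np.diag(b), b @ np.diag(a))
```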

So $R(m)$ simplifies to:

$R(m) \approx \mathbf{m}^\top \left[\frac{\alpha_x}{2}\mathbf{D}_1^\top \text{diag}(\mathbf{A}_1^\top\mathbf{v}) \mathbf{D}_1 + \frac{\alpha_y}{2}\mathbf{D}_2^\top \text{diag}(\mathbf{A}_2^\top \mathbf{v}) \mathbf{D}_2 \right] \mathbf{m}$

We will define W_x as:

$\mathbf{W}_x = \sqrt{\alpha_x}\text{diag}\left(\sqrt{\mathbf{A}_1^\top\mathbf{v}}\right) \mathbf{D}_1$

and then $\mathbf{W}$ as a tall matrix stacking all of the regularization terms:

$\begin{split}\mathbf{W} = \left[ \begin{array}{c} \mathbf{W}_s\\ \mathbf{W}_x\\ \mathbf{W}_y\end{array} \right]\end{split}$

Then we can write

$R(m) \approx \frac{1}{2}\mathbf{m^\top W^\top W m}$
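Stacking the weighting matrices and evaluating the quadratic form reproduces the summation form derived above. This sketch uses the same illustrative uniform-mesh operators as before (not SimPEG's internals), with $\mathbf{W}_s = \sqrt{\alpha_s}\,\text{diag}(\sqrt{\mathbf{v}})$ assumed for the smallness term:

```python
import numpy as np
import scipy.sparse as sp

n, h = 4, 1.0
I = sp.identity(n)
d1 = sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1],
              shape=(n - 1, n)) / h
A = sp.diags([np.ones(n - 1), np.ones(n - 1)], [0, -1],
             shape=(n, n - 1))
A = sp.diags(1.0 / np.asarray(A.sum(axis=1)).ravel()) @ A  # faces -> cells

D1, D2 = sp.kron(I, d1), sp.kron(d1, I)
A1, A2 = sp.kron(I, A), sp.kron(A, I)
v = np.full(n * n, h ** 2)                 # cell volumes
alpha_s, alpha_x, alpha_y = 1e-6, 1.0, 1.0

Ws = np.sqrt(alpha_s) * sp.diags(np.sqrt(v))               # smallness (assumed form)
Wx = np.sqrt(alpha_x) * sp.diags(np.sqrt(A1.T @ v)) @ D1   # W_x as defined above
Wy = np.sqrt(alpha_y) * sp.diags(np.sqrt(A2.T @ v)) @ D2
W = sp.vstack([Ws, Wx, Wy])                # tall stacked matrix

m = np.random.randn(n * n)
R = 0.5 * m @ (W.T @ W @ m)                # R(m) = (1/2) m^T W^T W m
```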

### The API¶

class SimPEG.Regularization.BaseRegularization(mesh=None, nP=None, mapping=None, indActive=None, **kwargs)[source]

Base Regularization Class

This is used to regularize the model space:

reg = Regularization(mesh)

counter = None
mapPair

A SimPEG.Map Class

alias of IdentityMap

mesh = None

A discretize instance.

mref = None

Reference model.

mapping = None

A SimPEG.Map instance.

parent

This is the parent of the regularization.

inv
invProb
reg
opt
prob
survey
W

Full regularization weighting matrix W.

eval(*args, **kwargs)[source]
evalDeriv(*args, **kwargs)[source]

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top W(m-m_\text{ref})}$

So the derivative is straightforward:

$\frac{\partial R(m)}{\partial \mathbf{m}} = \mathbf{W^\top W (m-m_\text{ref})}$
eval2Deriv(*args, **kwargs)[source]

Second derivative

Parameters:
- m (numpy.array) – geophysical model
- v (numpy.array) – vector to multiply

Returns: scipy.sparse.csr_matrix – WtW, or if v is supplied, WtW*v (numpy.ndarray)

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top W(m-m_\text{ref})}$

So the second derivative is straightforward:

$\frac{\partial^2 R(m)}{\partial \mathbf{m}^2} = \mathbf{W^\top W}$
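The gradient and Hessian of this quadratic form can be sanity-checked against finite differences for any weighting matrix (here a random dense stand-in for $\mathbf{W}$):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 6
W = rng.standard_normal((2 * n, n))      # stand-in for the stacked W
m_ref = rng.standard_normal(n)
m = rng.standard_normal(n)

def R(m):
    """R(m) = (1/2) ||W (m - m_ref)||^2"""
    r = W @ (m - m_ref)
    return 0.5 * r @ r

grad = W.T @ W @ (m - m_ref)             # first derivative (evalDeriv)
hess = W.T @ W                           # second derivative (eval2Deriv)

# central finite differences agree with the analytic gradient
eps = 1e-6
fd = np.array([(R(m + eps * e) - R(m - eps * e)) / (2 * eps)
               for e in np.eye(n)])
```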
class SimPEG.Regularization.Simple(mesh, mapping=None, indActive=None, **kwargs)[source]

Simple regularization that does not include length scales in the derivatives.

mrefInSmooth = False

include mref in the smoothness?

alpha_s

Smallness weight

alpha_x

Weight for the first derivative in the x direction

alpha_y

Weight for the first derivative in the y direction

alpha_z

Weight for the first derivative in the z direction

Wsmall

Regularization matrix Wsmall

Wx

Regularization matrix Wx

Wy

Regularization matrix Wy

Wz

Regularization matrix Wz

Wsmooth

Full smoothness regularization matrix W

W

Full regularization matrix W

evalDeriv(*args, **kwargs)[source]

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top W(m-m_\text{ref})}$

So the derivative is straightforward:

$\frac{\partial R(m)}{\partial \mathbf{m}} = \mathbf{W^\top W (m-m_\text{ref})}$
class SimPEG.Regularization.Tikhonov(mesh, mapping=None, indActive=None, **kwargs)[source]

L2 Tikhonov regularization with both smallness and smoothness (first order derivative) contributions.

$\phi_m(\mathbf{m}) = \alpha_s \| W_s (\mathbf{m} - \mathbf{m_{ref}} ) \|^2 + \alpha_x \| W_x \frac{\partial}{\partial x} (\mathbf{m} - \mathbf{m_{ref}} ) \|^2 + \alpha_y \| W_y \frac{\partial}{\partial y} (\mathbf{m} - \mathbf{m_{ref}} ) \|^2 + \alpha_z \| W_z \frac{\partial}{\partial z} (\mathbf{m} - \mathbf{m_{ref}} ) \|^2$

Note that if the keyword argument mrefInSmooth is False, then mref is not included in the smoothness contribution.

Parameters:
- mesh (BaseMesh) – SimPEG mesh
- mapping (IdentityMap) – regularization mapping; takes the model from model space to the quantity you want to regularize
- indActive (numpy.ndarray) – active cell indices, used to reduce the size of differential operators in the definition of a regularization mesh
- mrefInSmooth (bool) – (default False) include mref in the smoothness component?
- alpha_s (float) – (default 1e-6) smallness weight
- alpha_x (float) – (default 1) smoothness weight for the first derivative in the x-direction
- alpha_y (float) – (default 1) smoothness weight for the first derivative in the y-direction
- alpha_z (float) – (default 1) smoothness weight for the first derivative in the z-direction
- alpha_xx (float) – (default 1) smoothness weight for the second derivative in the x-direction
- alpha_yy (float) – (default 1) smoothness weight for the second derivative in the y-direction
- alpha_zz (float) – (default 1) smoothness weight for the second derivative in the z-direction
alpha_s

Smallness weight

alpha_x

Weight for the first derivative in the x direction

alpha_y

Weight for the first derivative in the y direction

alpha_z

Weight for the first derivative in the z direction

alpha_xx

Weight for the second derivative in the x direction

alpha_yy

Weight for the second derivative in the y direction

alpha_zz

Weight for the second derivative in the z direction

Wsmall

Regularization matrix Wsmall

Wx

Regularization matrix Wx

Wy

Regularization matrix Wy

Wz

Regularization matrix Wz

Wxx

Regularization matrix Wxx

Wyy

Regularization matrix Wyy

Wzz

Regularization matrix Wzz

Wsmooth2

Full second-order smoothness regularization matrix

evalDeriv(*args, **kwargs)[source]

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top W(m-m_\text{ref})}$

So the derivative is straightforward:

$\frac{\partial R(m)}{\partial \mathbf{m}} = \mathbf{W^\top W (m-m_\text{ref})}$
eval2Deriv(m, v=None)[source]

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top W(m-m_\text{ref})}$

So the second derivative is straightforward:

$\frac{\partial^2 R(m)}{\partial \mathbf{m}^2} = \mathbf{W^\top W}$
class SimPEG.Regularization.Sparse(mesh, mapping=None, indActive=None, **kwargs)[source]

The regularization is:

$R(m) = \frac{1}{2}\mathbf{(m-m_\text{ref})^\top W^\top R^\top R W(m-m_\text{ref})}$

where $\mathbf{R}$ is a diagonal matrix of IRLS (iteratively reweighted least squares) weights computed from the current model.

So the derivative is straightforward:

$\frac{\partial R(m)}{\partial \mathbf{m}} = \mathbf{W^\top R^\top R W (m-m_\text{ref})}$

The IRLS weights are recomputed after each beta solve. It is strongly recommended to do a few Gauss-Newton iterations before updating them.
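As an illustration of the reweighting idea (not SimPEG's implementation), the standard IRLS surrogate for an $\ell_p$ penalty with $p < 2$ uses weights $r_i = (m_i^2 + \epsilon^2)^{p/2-1}$ recomputed from the previous iterate. The problem sizes, $\beta$, and $\epsilon$ below are arbitrary choices for the sketch:

```python
import numpy as np

rng = np.random.default_rng(1)
n_model, n_data = 20, 12
G = rng.standard_normal((n_data, n_model))   # toy linear forward operator
m_true = np.zeros(n_model)
m_true[[3, 11]] = [1.0, -2.0]                # sparse true model
d = G @ m_true

p, eps, beta = 1.0, 1e-8, 1e-2
m = np.linalg.lstsq(G, d, rcond=None)[0]     # smooth (least-squares) start
for _ in range(30):
    # IRLS weights: quadratic surrogate of sum |m_i|^p at the current model
    r = (m ** 2 + eps ** 2) ** (p / 2 - 1)
    # solve the reweighted normal equations (G^T G + beta R^T R) m = G^T d
    m = np.linalg.solve(G.T @ G + beta * np.diag(r), G.T @ d)
```

Each pass is an ordinary least-squares solve; the growing weights on small entries drive them toward zero, which is why converging the inner Gauss-Newton iterations before each weight update matters.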

Wsmall

Regularization matrix Wsmall

Wx

Regularization matrix Wx

Wy

Regularization matrix Wy

Wz

Regularization matrix Wz

class SimPEG.Regularization.RegularizationMesh(mesh, indActive=None)[source]

Bases: object

Regularization Mesh

This contains the operators used in the regularization. Note that these are not necessarily true differential operators, but are constructed from a SimPEG Mesh.

Parameters:
- mesh (BaseMesh) – problem mesh
- indActive (numpy.array) – bool array, size nC, that is True where we have active cells. Used to reduce the operators so we regularize only on active cells
vol

reduced cell volume vector (numpy.array)

nC

reduced number of cells being regularized (int)

dim

dimension of the regularization mesh: 1, 2, or 3 (int)

aveFx2CC

averaging matrix from active x-faces to active cell centers (scipy.sparse.csr_matrix)

aveCC2Fx

averaging matrix from active cell centers to active x-faces (scipy.sparse.csr_matrix)

aveFy2CC

averaging matrix from active y-faces to active cell centers (scipy.sparse.csr_matrix)

aveCC2Fy

averaging matrix from active cell centers to active y-faces (scipy.sparse.csr_matrix)

aveFz2CC

averaging matrix from active z-faces to active cell centers (scipy.sparse.csr_matrix)

aveCC2Fz

averaging matrix from active cell centers to active z-faces (scipy.sparse.csr_matrix)

cellDiffx

cell-centered difference matrix for active cells in the x-direction (scipy.sparse.csr_matrix)

cellDiffy

cell-centered difference matrix for active cells in the y-direction (scipy.sparse.csr_matrix)

cellDiffz

cell-centered difference matrix for active cells in the z-direction (scipy.sparse.csr_matrix)

faceDiffx

difference matrix for active faces in the x-direction (scipy.sparse.csr_matrix)

faceDiffy

difference matrix for active faces in the y-direction (scipy.sparse.csr_matrix)

faceDiffz

difference matrix for active faces in the z-direction (scipy.sparse.csr_matrix)

cellDiffxStencil

cell-centered difference stencil (no cell lengths included) for active cells in the x-direction (scipy.sparse.csr_matrix)

cellDiffyStencil

cell-centered difference stencil (no cell lengths included) for active cells in the y-direction (scipy.sparse.csr_matrix)

cellDiffzStencil

cell-centered difference stencil (no cell lengths included) for active cells in the z-direction (scipy.sparse.csr_matrix)
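The distinction between the cellDiff* and cellDiff*Stencil operators (cell lengths included or not) can be illustrated on a small 1D tensor mesh; the cell widths below are arbitrary:

```python
import numpy as np
import scipy.sparse as sp

h = np.array([1.0, 1.0, 2.0, 4.0])            # cell widths (arbitrary)
n = h.size

# pure +/-1 stencil, no length scaling (cellDiffxStencil analogue)
stencil = sp.diags([-np.ones(n - 1), np.ones(n - 1)], [0, 1],
                   shape=(n - 1, n), format="csr")

# true difference operator: scale by the distance between cell centers
dx = 0.5 * (h[:-1] + h[1:])
cell_diff = sp.diags(1.0 / dx) @ stencil      # cellDiffx analogue

centers = np.cumsum(h) - 0.5 * h
# applying cell_diff to the cell-center coordinates recovers a slope of 1
```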