寻找特征(finding features)

Image matching: 一个物体在不同的角度拍了2副照片，在这两幅图中找到这个匹配到这个物体。我们通常用局部特征去完成这个任务。

为什么用局部特征呢？

Global representations have major limitations
Instead, describe and match only local regions
Increased robustness to
- Occlusions，这里举了一个人被书本挡住了部分脸的照片，可能在另一个角度去拍这个人的照片，可能这本书就不会遮住。那么怎么在这两幅图中匹配这个人呢？很明显用局部的特征会更有优势。因为如果用全局的话，这本书会带来非常大的误差。
- Articulation，这里举了一个人的脚踝的两张图，第一章人站立，腿和脚之间的夹角为90，第二张图是人在行走的过程中的截图，腿和脚的夹角仍然是90，只是图像相对于原来的坐标系旋转了一个角度而已，那么怎么去匹配呢？
- Intra-category variations，这里举了两个小动物，这两个小动物属于同一个物种，那么怎么去匹配呢？圈出了鼻子的特征。

基本步骤

Find a set of distinctive key points
Define a region around each key point
Extract and normalize the region content
Compute a local descriptor from the normalized region
Match local descriptors

问题点：

在两幅图中独立的寻找到相同的点
正确的找到每一个点在另一幅图中对应的点

不变性 - 几何变换 (Invariance - Geometric Tranformations)

通过几何变换后，两个区域是一样的。

Translation
Similarity
Euclidean
affine
projective

不变性 - 光学变换(Invariance - Photometric Transformations)

比如同一物体，在光线好的时候拍的照片和光线暗的时候拍的照片，匹配这两个物体。

Scaling
Offset

需求

Region extraction need to be repeatable and accurate.
- Invariance to translation, rotation and scale changes
- Robust or covariant to out-of-plane(affine) transformation
- Robust to lighting variances, noise, blur, quantization
Locality: Features are local, so robust to occlusion and clutter.
Quantity: We need a sufficient number of regions to cover the project.
Distinctivenes: The regions should contain “interesting” structure.
Efficiency: Close to real-time performance.

现成的探测器(detector)

Hessian & Harris, [Beaudet ‘78], [Harris ‘88]
Laplacian, DoG, [Lindeberg ‘98], [Lowe ‘99]
Harris-/Hessian-Laplace, [Mikolajczyk & Schmid ‘01]
Harris-/Hessian-Affine, [Mikolajczyk & Schmid ‘04]
EBR and IBR, [Tuytelaars & Van Gool ‘04]
MSER, [Matas ‘02]
Salient Regions, [Kadir & Brady ‘01]
现在很多探测器都是基于上边的探测器的。

定位关键点(keypoint localization)

目标
1. Repeatable detection, 多次探测都会检测到这些关键点，不能这一次检测到，下一次检测不到，也就是要求算法的输出是稳定的。
2. Precise localization，精确定位。
3. Interesting content.

直接一点：找到二维信号的变化就OK了，也就是这些地方不只是一个方向变化，起码两个。换句人话，找到corner就对了。

找角(Finding Corners)

Key property: In the region around a corner, image gradient has two or more dominant directions.
Corners are repeatable and distinctive.

Corners as Distinctive Interest Points

Disign critera
1. We should easily recognize the point by looking through a small window (locality)
2. Shifing the window in any direction should give a large change in intensity (good localization), 只有在在有角的地方才能满足这个条件，所以我们通过找角来找这些关键点。

Harris 探测器公式

Change the intensity for the shift (u, v)

$E(u,v) = \sum_{x, y}w(x, y)\left[I(x + u, y + v) - I(x, y)\right]^{2}$

$w(x, y)$ 为窗函数，可以为矩形窗或高斯窗.

This measure of change can be approximated by

$E(u, v) \approx \ \begin{bmatrix} u & v \end{bmatrix} \ M \ \begin{bmatrix} u \\ v \\ \end{bmatrix}$

where M 是一个2 x 2矩阵，是图像梯度的组合

$M = \sum_{x, y}w(x, y) \ \begin{bmatrix} I_{x}^{2} & I_{x}I_{y} \\ I_{x}I_{y} & I_{y}^{2} \\ \end{bmatrix}$

这个公式中 $x$, $y$ 是针对我们的窗函数覆盖的区域， $I_{x}$ , $I_{y}$ 分别是 $x$, $y$ 的梯度, 公式也可以写作

$M = \ \begin{bmatrix} \sum I_{x}I_{x} & \sum I_{x}I_{y} \\ \sum I_{x}I_{y} & \sum I_{y}I_{y} \\ \end{bmatrix} = \sum \ \begin{bmatrix} I_{x} \\ I_{y} \\ \end{bmatrix} \ \begin{bmatrix} I_{x} & I_{y} \end{bmatrix}$

矩阵 M 的意义

首先，当角是跟坐标轴对其的情况，就是这个角跟坐标系趾间的角度为0.

$M = \ \begin{bmatrix} \sum I_{x}I_{x} & \sum I_{x}I_{y} \\ \sum I_{x}I_{y} & \sum I_{y}I_{y} \\ \end{bmatrix} = \ \begin{bmatrix} \lambda_{1} & 0 \\ 0 & \lambda_{2} \\ \end{bmatrix}$

这就意味着
1. Dominant gradient directions align with x or y axis.
2. If either $\lambda$ is close to 0, then this is not a corner, so look for locations where both are large.
坐标轴跟这个角不对齐, 我们可以对M做特征值分解

$M = R^{-1} \ \begin{bmatrix} \lambda_{1} & 0 \\ 0 & \lambda_{2} \\ \end{bmatrix} \ R$

用几何描述一下就是，一个椭圆，长轴和短轴的长度分别用 $\lambda_{1}$ 和 $\lambda_{2}$ 来表示，而椭圆的方向用 R 来表示。

特征值的意义

当 $\lambda_{1} » \lambda_{2} $ or $\lambda_{2} » \lambda_{1}$ 时，说明这个点位于Edge位置。
当 $\lambda_{1}$ 和 $\lambda_{2}$ 都很大时，E 在任何方向都在增大，说明这个点在corner位置。
当 $\lambda_{1}$ 和 $\lambda_{2}$ 都小小时，E 在任何方向都是常量，说明这个点在平坦的位置。

Corner Response Function

$\theta = \det(M) - \alpha trace(M)^{2} = \lambda_{1}\lambda_{2} - \alpha(\lambda_{1} + \lambda_{2})^{2}$

$trace(M)$ 值得是矩阵M的迹，等于矩阵M的特征值之和。

快速逼近
- 不用计算特征值
- $\alpha$ ,常数（0.04 to 0.06）
$\theta < 0 $, 说明这个点位于Edge位置。
$\theta > 0 $, 说明这个点位于corner位置。
$\theta = 0 $, 说明这个点在平坦位置。

窗函数

$M = \sum_{x, y}w(x, y) \ \begin{bmatrix} I_{x}^{2} & I_{x}I_{y} \\ I_{x}I_{y} & I_{y}^{2} \\ \end{bmatrix}$

矩形窗
- 简单，就是窗内数据的累加
- 问题：没有旋转因子
高斯窗
- 高斯已经做了权重累加的活，因此M可以表示为
- 结果包含旋转因子

总结

$M(\sigma_{I}, \sigma_{D}) = g(\sigma_{I}) * \ \begin{bmatrix} I_{x}^{2}(\sigma_{D}) & I_{x}I_{y}(\sigma_{D}) \\ I_{x}I_{y}(\sigma_{D}) & I_{y}^{2}(\sigma_{D}) \\ \end{bmatrix}$

计算图像的梯度，得到 $I_{x}$ 和 $I_{y}$
计算梯度的平方，得到 $I_{x}^{2}$ ， $I_{y}^{2}$ 和 $I_{x}I_{y}$
计算梯度的平方通过高斯滤波器，得到 $g(I_{x}^{2})$ , $g(I_{y}^{2})$ 和 $g(I_{x}I_{y})$
使用cornerness function - two strong eigenvalues
$\theta = \det\left[M(\sigma_{I}, \sigma_{D})\right] - \ \alpha\left[trace\left(M(\sigma_{I}, \sigma_{D})\right)\right]^{2} = \\\\ g(I_{x}^{2})g(I_{y}^{2}) - \left[g(I_{x}I_{y})\right]^{2} - \ \alpha\left[g(I_{x}^{2}) + g(I_{y}^{2})\right]^{2}$
对 $\theta$ 使用 non-maximum supperssion

性质

Translation invariance
Rotation invariance