Stanford CS131 学习笔记 (一)

这篇笔记是我在学习Prof Fei-Fei Li的机器视觉介绍课程所写的。好记性不如烂笔头:)

课程的课后作业

什么是机器视觉?

机器视觉就是通过算法让计算机能够跟人一样看到事物，并且识别事物。

人：事物 -> 人眼 -> 大脑 -> 识别
机器: 事物 -> 摄像头 -> 计算机程序 -> 识别

机器视觉是一门交叉学科。

机器视觉的目的

To bridge the gap between pixels and meaning.

计算机看到的是像素，怎么把这些像素(二进制数)翻译出来呢?

图像中的信息

Metric 3D infromation, 可以用图像做测量
Semantic information，认识图像中的事物

为什么学习机器视觉，能做什么？

Special effect: shape and motion capture
3D urban（城市)modeling: 如google街景
face detection: 广泛应用了
smile detection：笑脸识别，比人脸识别更进一步
biometrics(生物测定学)：比如怎么识别亚洲人，阿拉伯人，欧洲人，非洲人?
OCR(字符识别)：怎么识别一幅图中的文字
Toys and robots
Mobile visual search: 搜索引擎根据拍的照片搜索出用户想用的信息,现在bing, google,baidu都有相应的服务。
Automotive safety:

In mid 2010 mobileye will launch a world’s first application of full emergency braking for collision mitigation for pedestrians where vision is the key technology for detecting pedestrians.

Vision in supermarket: （总感觉这玩意不靠谱，遮挡了怎么办？）

A smart camera is flush-mounted in the checkout lane, continuously watchcing for items. When an item is detected and recognized, the cachier verfifies the quantity of items that were found under the backet, and continues to close the transaction. The item can remain under the basket, and with laneHwK, you are assured to get paid for it…

Vision-based interactions: 目前PS3，PS4，XBOX的体感游戏。
Vision for robotics, space explortation: 像中国的月球车-玉兔，美国的火星车-探索者