Rich feature hierarchies for accurate object detection and semantic segmentation Tech report(RCNN)

我不是女神ヾ 2022-10-06 03:54

#### Rich feature hierarchies for accurate object detection and semantic segmentation Tech report(RCNN) ####

### Contents ###

*  Rich feature hierarchies for accurate object detection and semantic segmentation Tech report(RCNN)
    
     *  Overview
     *  Object detection with RCNN
        
         *  model design
         *  test-time detection
         *  training
     *  Experimental results
     *  Appendix A: Object proposal transformations
     *  Appendix B: Positive vs. negative examples and softmax
     *  Appendix C: Bounding-box regression

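##### Overview #####

*  ![image-20210610231510096][]
 *  As the figure above shows, RCNN consists of three stages:
    
     *  1. Extract region proposals. This is done with Selective Search, which yields roughly 2k proposals per image.
     *  2. Warp (directly resize) each proposal to the input size the CNN requires, then run it through the network to extract features.
     *  3. Classify each region with SVM classifiers.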

##### Object detection with RCNN #####

###### model design ######

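*  Region proposals
    
     *  Selective Search
 *  Feature Extraction
    
     *  Each proposal is resized to a 227x227 RGB (three-channel) image, regardless of its original size or aspect ratio (Appendix A discusses this).
     *  In addition, before resizing, the proposal box is dilated (expanded) so that the crop includes some surrounding image context.
        
         *  ![Figure][7ac7649c5b44c0aa58604158b2336bd6.png_pic_center]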
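
The dilate-then-warp step can be sketched in a few lines of NumPy. This is only an illustration, not the authors' code: the function name `warp_proposal`, the nearest-neighbour sampling, and the edge-pixel replication are assumptions made to keep the example short (the paper fills regions outside the image with the image mean). The 16 pixels of context at the warped size follow the paper's setting.

```python
import numpy as np

def warp_proposal(image, box, out_size=227, context=16):
    """Dilate `box` for context, then warp the crop to out_size x out_size.

    image: H x W x 3 array; box: (x1, y1, x2, y2) in pixels.
    Hypothetical helper for illustration only.
    """
    h, w = image.shape[:2]
    x1, y1, x2, y2 = box
    # The original box should land on the inner (out_size - 2*context) square,
    # so convert `context` output pixels back into source pixels per axis.
    inner = out_size - 2 * context
    pad_x = context * (x2 - x1) / inner
    pad_y = context * (y2 - y1) / inner
    ex1, ey1 = x1 - pad_x, y1 - pad_y
    ex2, ey2 = x2 + pad_x, y2 + pad_y
    # Source coordinate hit by each output pixel centre (nearest neighbour).
    steps = np.arange(out_size) + 0.5
    xs = np.floor(ex1 + steps * (ex2 - ex1) / out_size).astype(int)
    ys = np.floor(ey1 + steps * (ey2 - ey1) / out_size).astype(int)
    # Assumption: replicate edge pixels where the dilated box leaves the image.
    xs = np.clip(xs, 0, w - 1)
    ys = np.clip(ys, 0, h - 1)
    # Anisotropic scaling: x and y are sampled independently.
    return image[ys[:, None], xs[None, :]]
```

Because the two axes are sampled independently, a wide box and a tall box both end up as the same 227x227 input, which is exactly the "ignore the aspect ratio" behaviour described above.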

###### test-time detection ######

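*  At test time the same pipeline is used. The authors also stress that RCNN was efficient by the standards of the time, for two main reasons:
    
     *  The CNN parameters are shared across all categories.
     *  The feature vectors computed by the CNN are low-dimensional compared with other approaches, so the per-class computation is cheap.
 *  The resulting feature matrix is 2000x4096 and the SVM weight matrix is 4096xN, where N is the number of classes.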
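
The shapes above mean that scoring every region against every class is one matrix product. The sketch below uses random arrays as stand-ins for the real CNN features and trained SVM weights; the variable names and the bias term are assumptions, and N = 20 is used only as an example value (the number of VOC classes).

```python
import numpy as np

rng = np.random.default_rng(0)
num_proposals, feat_dim, num_classes = 2000, 4096, 20  # N = 20 as an example

# Stand-ins: in the real system these come from the CNN and the trained SVMs.
features = rng.standard_normal((num_proposals, feat_dim)).astype(np.float32)
svm_weights = rng.standard_normal((feat_dim, num_classes)).astype(np.float32)
svm_bias = np.zeros(num_classes, dtype=np.float32)

# (2000 x 4096) @ (4096 x N) -> one score per region per class.
scores = features @ svm_weights + svm_bias
best_class = scores.argmax(axis=1)  # highest-scoring class for each region
```

Adding more classes only widens `svm_weights`; the expensive CNN forward pass that produces `features` is unchanged.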

###### training ######

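*  First, the CNN is pre-trained on a large dataset (supervised pre-training on ILSVRC2012).
 *  Next, it is fine-tuned on VOC (domain-specific fine-tuning).
 *  Finally, SVMs are used to classify each region for each class (Appendix B discusses this).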

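##### Experimental results #####

*  Results on VOC2010
    
     *  ![cdfa7f4180363def9e9348c75a65effa.png][]
     *  RCNN BB is the variant that uses bounding-box (BB) regression; it raises mAP by about 3 points.
 *  On ILSVRC2013 the overall trend is the same.
    
     *  ![666f31e0055701605de544c1757f5814.png][]
 *  Next comes an ablation of fine-tuning: rows 1-3 are without fine-tuning and rows 4-6 are with it. Row 7 is row 6 plus BB regression, which also confirms the benefit of BB regression.
    
     *  ![image-20210611001515518][]
 *  Results with different backbones (the CNN part)
    
     *  ![image-20210611001744653][]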

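##### Appendix A: Object proposal transformations #####

*  Two kinds of transformation are considered:
    
     *  1. Enclose the proposal in a square (padding out the short side) and then resize, which avoids distorting the aspect ratio.
     *  2. Warp the proposal directly to the target size, ignoring the aspect ratio. This is the method the paper adopts; it reports that warping with context padding outperformed the alternatives in its pilot experiments.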
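
The difference between the two transformations comes down to which source rectangle is resized. A minimal sketch, with hypothetical helper names `tightest_square` and `warp_scales`, and ignoring context padding and image borders:

```python
def tightest_square(box):
    """Smallest square with the same centre that encloses box (x1, y1, x2, y2)."""
    x1, y1, x2, y2 = box
    side = max(x2 - x1, y2 - y1)
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    return (cx - side / 2, cy - side / 2, cx + side / 2, cy + side / 2)

def warp_scales(box, out_size=227):
    """Per-axis scale factors when a box is resized straight to out_size x out_size."""
    x1, y1, x2, y2 = box
    return out_size / (x2 - x1), out_size / (y2 - y1)

box = (10, 20, 60, 50)            # 50 wide, 30 tall
square = tightest_square(box)     # method 1: square source rectangle
sx, sy = warp_scales(box)         # method 2: unequal scales, object is stretched
ssx, ssy = warp_scales(square)    # method 1: equal scales, aspect ratio kept
```

With method 1 the object keeps its shape but the crop contains extra pixels along the short side; with method 2 every pixel of the crop belongs to the proposal but the object is distorted.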

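##### Appendix B: Positive vs. negative examples and softmax #####

*  First, the definitions:
    
     *  For CNN fine-tuning: a proposal is positive if its IoU with a ground-truth (GT) box is at least 0.5; otherwise it is background.
     *  For SVM training:
        
         *  the ground-truth boxes themselves => positive
         *  IoU < 0.3 => negative
         *  everything else => ignored
 *  Because fine-tuning data is limited, the looser fine-tuning definition brings in many "jittered" examples (proposals with IoU between 0.5 and 1 that are not ground truth), which expands the number of positives by roughly 30x.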
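
The two labelling rules can be written out directly. The sketch below follows the definitions in the paper's Appendix B (fine-tuning: IoU of at least 0.5 is positive; SVM: only ground-truth boxes are positive, IoU below 0.3 is negative, the rest is ignored). It assumes a single object class for brevity, and the function names are illustrative.

```python
def iou(a, b):
    """Intersection over union of two boxes given as (x1, y1, x2, y2)."""
    ax1, ay1, ax2, ay2 = a
    bx1, by1, bx2, by2 = b
    iw = max(0.0, min(ax2, bx2) - max(ax1, bx1))
    ih = max(0.0, min(ay2, by2) - max(ay1, by1))
    inter = iw * ih
    union = (ax2 - ax1) * (ay2 - ay1) + (bx2 - bx1) * (by2 - by1) - inter
    return inter / union if union > 0 else 0.0

def finetune_label(proposal, gt_boxes):
    """Label used for CNN fine-tuning (single-class simplification)."""
    best = max((iou(proposal, g) for g in gt_boxes), default=0.0)
    return "positive" if best >= 0.5 else "background"

def svm_label(proposal, gt_boxes, is_ground_truth=False):
    """Label used for SVM training (single-class simplification)."""
    if is_ground_truth:
        return "positive"          # only the GT boxes themselves are positives
    best = max((iou(proposal, g) for g in gt_boxes), default=0.0)
    return "negative" if best < 0.3 else "ignore"
```

A proposal with IoU 0.8 is therefore a positive for fine-tuning but is ignored when training the SVM; these are the "jittered" examples mentioned above.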

##### Appendix C: Bounding-box regression #####

*  [Bounding-box regression][]
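
For quick reference, the regression targets used in the paper are scale-invariant centre offsets and log-space width/height ratios between a proposal P and its ground-truth box G. The sketch below only encodes and decodes those targets; the regressor that predicts them from CNN features is not shown, and the function names are assumptions.

```python
import math

def to_center(box):
    """Convert (x1, y1, x2, y2) to (centre x, centre y, width, height)."""
    x1, y1, x2, y2 = box
    w, h = x2 - x1, y2 - y1
    return x1 + w / 2, y1 + h / 2, w, h

def regression_targets(proposal, gt):
    """Targets (t_x, t_y, t_w, t_h) that map the proposal onto the GT box."""
    px, py, pw, ph = to_center(proposal)
    gx, gy, gw, gh = to_center(gt)
    return ((gx - px) / pw, (gy - py) / ph,
            math.log(gw / pw), math.log(gh / ph))

def apply_deltas(proposal, deltas):
    """Inverse transform: shift and rescale the proposal by predicted deltas."""
    px, py, pw, ph = to_center(proposal)
    dx, dy, dw, dh = deltas
    cx, cy = px + pw * dx, py + ph * dy
    w, h = pw * math.exp(dw), ph * math.exp(dh)
    return (cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2)
```

Applying `apply_deltas` to the output of `regression_targets` recovers the ground-truth box, which is a convenient sanity check when implementing the regressor.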

[image-20210610231510096]: /images/20221005/c500877bb73f495ba0b835770d59f74f.png
[7ac7649c5b44c0aa58604158b2336bd6.png_pic_center]: /images/20221005/5f492321d1524ba1a49b70f8f3866138.png
[cdfa7f4180363def9e9348c75a65effa.png]: /images/20221005/56c7b92af8b74abca6a9ed561f2141eb.png
[666f31e0055701605de544c1757f5814.png]: /images/20221005/dfa67feeaff2444181db5d734dae07e3.png
[image-20210611001515518]: /images/20221005/f80c4ef656dc4c0dbb68a683ec50ce41.png
[image-20210611001744653]: /images/20221005/a8b6d00ff1bd4d95a0322eb68236f952.png
[Bounding-box regression]: https://blog.csdn.net/symuamua/article/details/116953480
