Paper Notes: "Siamese FC: Fully-Convolutional Siamese Networks for Object Tracking"

曾经终败给现在 2023-02-23 03:51


1. Overview


2. Method Design

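The method exploits the weight sharing of a Siamese network: the reference image and the current frame are passed through the same network to obtain their embedding features, and a correlation match then computes the similarity between the two feature maps, so that the response is largest at the region corresponding to the target. During training the images are cropped and padded, so the centers are always aligned. The inference flow is shown in the figure below:

[Figure: inference flow]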
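For reference, the similarity described above can be written as the score function used in the paper. Here z is the reference (exemplar) image, x is the search image, \varphi is the shared embedding network, * denotes cross-correlation, and b\,\mathbb{1} is a bias that takes the same value b at every location of the score map:

```latex
f(z, x) = \varphi(z) * \varphi(x) + b\,\mathbb{1}
```

Because \varphi is fully convolutional, a search image larger than the reference yields a whole map of scores instead of a single number, one score per candidate offset.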

3. Code Walkthrough


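Step 1. Generating the training labels
The training labels are generated at the size of the given correlation feature map. Positive and negative thresholds are passed in to separate the center from the border; Gaussian smoothing is an alternative way to do this.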

    import numpy as np


    def create_BCELogit_loss_label(label_size, pos_thr, neg_thr, upscale_factor=4):
        # Rescale the thresholds by upscale_factor so they are expressed in label-map cells.
        pos_thr = pos_thr / upscale_factor
        neg_thr = neg_thr / upscale_factor
        # Note that the center might be between pixels if the label size is even.
        center = (label_size - 1) / 2
        line = np.arange(0, label_size)
        line = line - center
        line = line**2
        line = np.expand_dims(line, axis=0)
        # Squared distance of every cell to the center (column offsets + row offsets).
        dist_map = line + line.transpose()
        label = np.zeros([label_size, label_size, 2]).astype(np.float32)
        # Channel 0: target, 1 inside the positive radius and 0 elsewhere.
        label[:, :, 0] = dist_map <= pos_thr**2
        # Channel 1: loss weight, 1 for positives and negatives, 0 for the ring in between.
        label[:, :, 1] = (dist_map <= pos_thr**2) | (dist_map > neg_thr**2)
        return label
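The Gaussian alternative mentioned above could look roughly like the following. This is only a sketch, not code from the repository being walked through: the function name `create_gaussian_label` and the `sigma` parameter (in label-map cells) are assumptions made for illustration.

```python
import numpy as np


def create_gaussian_label(label_size, sigma):
    """Soft label that peaks at the map center and decays with distance.

    `sigma` (in label-map cells) is a hypothetical parameter for this sketch.
    """
    # Same center convention as the hard label: between pixels when label_size is even.
    center = (label_size - 1) / 2
    line = (np.arange(label_size) - center) ** 2
    # Squared Euclidean distance of every cell to the center.
    dist_map = line[None, :] + line[:, None]
    return np.exp(-dist_map / (2.0 * sigma ** 2)).astype(np.float32)


label = create_gaussian_label(label_size=5, sigma=1.0)
print(label.shape)  # (5, 5)
```

Unlike the two-channel hard label, this produces a single soft target in (0, 1] with no separate weight channel, so the neutral ring is replaced by a smooth falloff.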

Step 2. Computing the correlation match between ref and search

    import torch.nn.functional as F


    # Method of the network class; self.match_batchnorm is torch.nn.BatchNorm2d(1).
    def match_corr(self, embed_ref, embed_srch):
        b, c, h, w = embed_srch.shape
        # Fold the batch into the channel dimension and use groups=b so that each
        # search feature is cross-correlated only with its own reference feature.
        match_map = F.conv2d(embed_srch.view(1, b * c, h, w),
                             embed_ref, groups=b)
        # Here we reorder the dimensions to get back the batch dimension.
        match_map = match_map.permute(1, 0, 2, 3)
        match_map = self.match_batchnorm(match_map)  # torch.nn.BatchNorm2d(1)
        if self.upscale:
            match_map = F.interpolate(match_map, self.upsc_size, mode='bilinear',
                                      align_corners=False)
        return match_map
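To make the grouped-convolution trick easier to follow, here is a loop-based NumPy sketch of what the `F.conv2d(..., groups=b)` call computes, leaving out the batch norm and the upscaling. The function name `match_corr_reference` and the toy shapes are assumptions for illustration only.

```python
import numpy as np


def match_corr_reference(embed_ref, embed_srch):
    """Per-sample cross-correlation (no batch norm, no upscaling)."""
    b, c, hr, wr = embed_ref.shape
    _, _, hs, ws = embed_srch.shape
    out_h, out_w = hs - hr + 1, ws - wr + 1
    match_map = np.zeros((b, 1, out_h, out_w), dtype=np.float64)
    for i in range(b):  # each sample is matched only with its own reference
        for y in range(out_h):
            for x in range(out_w):
                window = embed_srch[i, :, y:y + hr, x:x + wr]
                # Sum over channels and spatial positions: one score per offset.
                match_map[i, 0, y, x] = np.sum(window * embed_ref[i])
    return match_map


rng = np.random.default_rng(0)
embed_srch = rng.standard_normal((2, 3, 8, 8))
# Cut each reference out of its own search feature at a known offset.
embed_ref = np.stack([embed_srch[0, :, 1:4, 2:5], embed_srch[1, :, 4:7, 0:3]])
scores = match_corr_reference(embed_ref, embed_srch)
print(scores.shape)  # (2, 1, 6, 6)
```

At the offset where a reference was cut out, the score equals the sum of squares of that reference, which is why a well-trained embedding responds most strongly at the target location.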


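    import torch.nn.functional as F


    def BCELogit_Loss(score_map, labels):
        # labels: (batch, H, W, 2) -> (batch, 1, H, W, 2) to match the score map layout.
        labels = labels.unsqueeze(1)
        # Channel 0 is the target, channel 1 the per-element weight.
        loss = F.binary_cross_entropy_with_logits(score_map, labels[:, :, :, :, 0],
                                                  weight=labels[:, :, :, :, 1],
                                                  reduction='mean')
        return loss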
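The effect of the weight channel may be easier to see in a small NumPy sketch of a weighted binary cross-entropy on logits. The helper `weighted_bce_with_logits` and the 3x3 toy inputs are hypothetical; the sketch assumes the weighted per-element losses are averaged over all elements, which is how `reduction='mean'` is understood to behave here.

```python
import numpy as np


def weighted_bce_with_logits(score_map, target, weight):
    """Mean of weight * BCE(sigmoid(score), target) over all elements."""
    # Numerically stable form of -[t*log(sigmoid(s)) + (1-t)*log(1-sigmoid(s))].
    per_elem = (np.maximum(score_map, 0) - score_map * target
                + np.log1p(np.exp(-np.abs(score_map))))
    return float(np.mean(weight * per_elem))


scores = np.zeros((1, 1, 3, 3))  # logit 0, i.e. probability 0.5 everywhere
target = np.zeros((1, 1, 3, 3))
target[0, 0, 1, 1] = 1.0         # only the center cell is positive
weight = np.ones((1, 1, 3, 3))
weight[0, 0, 0, :] = 0.0         # pretend the top row is the neutral zone
loss = weighted_bce_with_logits(scores, target, weight)
print(loss)  # 6 weighted cells * log(2) / 9 cells in total
```

Cells with weight 0 contribute nothing to the numerator but still count in the denominator of the mean, so a larger neutral zone lowers the loss value without changing the gradient direction for the remaining cells.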



