Face Detection (2)

古城微笑少年丶 2022-06-16 05:36

2. Cascade CNN face detection
This approach uses a cascade of networks for face detection, following the CVPR 2015 paper "A Convolutional Neural Network Cascade for Face Detection". It employs the 12-net, 24-net, and 48-net cascade for face/non-face classification, together with the 12-calibration-net, 24-calibration-net, and 48-calibration-net bounding-box calibration networks for more accurate localization of the face box. The smallest face it can detect is 12 x 12 pixels. Compared with a single-CNN detector, the cascade is much faster and localizes face boxes more accurately, with high detection precision and recall; it achieved the best score on the FDDB benchmark at the time of publication.


Paper download link
GitHub open-source code
GitHub pre-trained models

The author's code is clean and easy to follow, with concise comments. The basic idea of the cascade CNN face detector is as follows:
(1) 12-net: First, a small network taking 12 x 12 image inputs is trained as a face/non-face binary classifier. Its final fully connected layer is converted to a convolutional layer, so the resulting fully convolutional network (12-full-conv-net) accepts input images of arbitrary size; each position of the output feature map gives the probability that the corresponding receptive field contains a face. At detection time, suppose the minimum face size to detect is K x K, e.g. 40 x 40. The input image is first scaled by a factor of 12/K and then fed through the trained 12 x 12 fully convolutional network to obtain the feature map. A threshold filters out low-scoring positions, which removes the vast majority of uninteresting regions while keeping a manageable number of candidate boxes.
(2) 12-calibration-net: A calibration network with 12 x 12 input is trained to correct the face-box boundaries produced by 12-net. It is essentially a 45-class classifier that decides whether the face inside the current window is shifted left/right or up/down, or is too large or too small, covering:
3 translations along x: -0.17, 0, 0.17
3 translations along y: -0.17, 0, 0.17
5 scale factors: 0.83, 0.91, 1.0, 1.1, 1.21
At detection time, every candidate box from 12-net is passed to 12-calibration-net and its position is adjusted according to the predicted class (a worked sketch of this adjustment appears right after this list).
(3) localNMS: Local non-maximum suppression is applied to the candidate boxes corrected by 12-calibration-net, discarding overlapping lower-scoring boxes and keeping the higher-scoring ones.
(4) 24-net: A face classifier network with 24 x 24 input is trained. At test time, each candidate box surviving localNMS is resized to 24 x 24 and fed to 24-net to decide whether it is a face, and a threshold keeps only the higher-scoring candidates.
(5) 24-calibration-net: A boundary calibration classifier with 24 x 24 input is trained in the same way to correct the positions of the boxes kept by 24-net; each candidate region is resized to 24 x 24, and everything else matches 12-calibration-net.
(6) localNMS: Local non-maximum suppression is applied again to the boxes corrected by 24-calibration-net, removing overlapping lower-scoring boxes and keeping the higher-scoring ones.
(7) 48-net: A more accurate face/non-face classifier with 48 x 48 input is trained. At test time, the boxes surviving the previous localNMS are resized to 48 x 48 and fed to 48-net, and only the high-scoring candidates are kept.
(8) globalNMS: Global non-maximum suppression over all candidate boxes from 48-net keeps the best face boxes.
(9) 48-calibration-net: A boundary calibration classifier with 48 x 48 input is trained. At test time, the best boxes from globalNMS are resized to 48 x 48 and their boundaries are calibrated.
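To make the calibration step concrete, here is a small sketch of my own (not the author's code) that decodes a calibration class label into its (s_n, x_n, y_n) pattern and applies the adjustment from the paper: a box (x, y, w, h) becomes (x - x_n*w/s_n, y - y_n*h/s_n, w/s_n, h/s_n). The label ordering (scale-major, then x shift, then y shift) follows the cal_face_* functions listed further down; when several classes score above threshold, the author's code averages their offsets before applying this formula.

    # Sketch of the 45-class calibration decoding; label layout follows the
    # cal_face_* functions below (scale-major, then x shift, then y shift).
    S_PATTERNS = [0.83, 0.91, 1.0, 1.10, 1.21]   # 5 scale adjustments
    XY_PATTERNS = [-0.17, 0.0, 0.17]             # 3 x shifts / 3 y shifts

    def decode_cal_label(label):
        # map a label in 0..44 to its (s_n, x_n, y_n) adjustment pattern
        s_n = S_PATTERNS[label // 9]
        x_n = XY_PATTERNS[(label % 9) // 3]
        y_n = XY_PATTERNS[label % 3]
        return s_n, x_n, y_n

    def calibrate_box(x, y, w, h, label):
        # apply the paper's correction: (x - x_n*w/s_n, y - y_n*h/s_n, w/s_n, h/s_n)
        s_n, x_n, y_n = decode_cal_label(label)
        return x - x_n * w / s_n, y - y_n * h / s_n, w / s_n, h / s_n

    # e.g. label 0 decodes to (0.83, -0.17, -0.17): the window is enlarged by
    # 1/0.83 and shifted, matching the per-label offsets accumulated in cal_face_12c()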

So we first need to train, separately, the face/non-face classifiers 12-net, 24-net, and 48-net, and the boundary calibration networks 12-calibration-net, 24-calibration-net, and 48-calibration-net. At test time, an image pyramid is used for multi-scale detection: any input image passes through 12-net (as 12-full-conv-net) -> 12-calibration-net -> localNMS -> 24-net -> 24-calibration-net -> localNMS -> 48-net -> globalNMS -> 48-calibration-net, and the resulting face boxes are the final detections.
The main difference between localNMS and globalNMS (NMS, non-maximum suppression) is that the former uses only IoU (Intersection over Union), the ratio of the intersection area to the union area, while the latter additionally uses IoM (Intersection over Min-area), the ratio of the intersection area to the smaller of the two box areas, to filter overlapping candidates.
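As a quick illustration of why the IoM test matters (the numbers below are made up): a small candidate box sitting entirely inside a larger detection has a low IoU with it, but its IoM is 1, so globalNMS still removes the duplicate. The 0.2 (IoU) and 0.3 (IoM) cut-offs referenced here match the ones hard-coded in globalNMS() in the script further down.

    # Toy example: a small box nested inside a big one; boxes are (x1, y1, x2, y2)
    big, small = (0, 0, 100, 100), (20, 20, 60, 60)

    x_overlap = max(0, min(big[2], small[2]) - max(big[0], small[0]))   # 40
    y_overlap = max(0, min(big[3], small[3]) - max(big[1], small[1]))   # 40
    inter = x_overlap * y_overlap                                       # 1600
    area_big = (big[2] - big[0]) * (big[3] - big[1])                    # 10000
    area_small = (small[2] - small[0]) * (small[3] - small[1])          # 1600

    iou = float(inter) / (area_big + area_small - inter)   # 0.16, below the 0.2 IoU cut
    iom = float(inter) / min(area_big, area_small)         # 1.0, above the 0.3 IoM cut -> suppressed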
I did not retrain the networks myself and used the author's released models directly. The script face_cascade_fullconv_single_crop_single_image.py under CNN_face_detection-master\face_detection runs detection on a single image; the code is listed below. Note that the minimum face size to detect, min_face_size, can be set according to your needs and directly affects speed: if the faces to detect are fairly large, e.g. 128 x 128 or more, detection runs in real time.

import numpy as np
import cv2
import time
import os
# from operator import itemgetter
from load_model_functions import *
from face_detection_functions import *

# ================== caffe ======================================
# caffe_root = '/home/anson/caffe-master/'  # this file is expected to be in {caffe_root}/examples
import sys
# sys.path.insert(0, caffe_root + 'python')
import caffe

# ================== load models ======================================
net_12c_full_conv, net_12_cal, net_24c, net_24_cal, net_48c, net_48_cal = \
    load_face_models(loadNet=True)
nets = (net_12c_full_conv, net_12_cal, net_24c, net_24_cal, net_48c, net_48_cal)

start_time = time.time()

read_img_name = 'C:/Users/Administrator/Desktop/caffe/matlab/demo/1.jpg'
img = cv2.imread(read_img_name)     # BGR
print img.shape

min_face_size = 48  # minimum face size to detect; smaller values catch smaller faces but slow detection down quickly
stride = 5          # stride in pixels (not actually used)

# caffe_image = np.true_divide(img, 255)  # convert to caffe style (0~1 BGR)
# caffe_image = caffe_image[:, :, (2, 1, 0)]
img_forward = np.array(img, dtype=np.float32)
img_forward -= np.array((104, 117, 123))    # subtract BGR mean pixel values

rectangles = detect_faces_net(nets, img_forward, min_face_size, stride, True, 2, 0.05)
for rectangle in rectangles:    # draw rectangles
    cv2.rectangle(img, (rectangle[0], rectangle[1]), (rectangle[2], rectangle[3]), (255, 0, 0), 2)

end_time = time.time()
print 'aver_time = ', (end_time - start_time) * 1000, 'ms'

cv2.imshow('test img', img)
cv2.waitKey(0)
cv2.destroyAllWindows()
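The way min_face_size trades recall for speed follows directly from find_initial_scale() in face_detection_functions.py listed below: the image is first downscaled by min_face_size/12 before the 12-net sweep, and each further pyramid level shrinks it by scale_factor again. Here is a rough sketch of the resulting pyramid; the 1280 x 720 input resolution is just an assumed example, only the scaling logic comes from the author's code.

    # Rough sketch of the 12-net image pyramid implied by find_initial_scale() /
    # resize_image() below; the input resolution here is an assumed example.
    def pyramid_sizes(width, height, min_face_size, scale_factor=2.0, net_kind=12):
        scale = float(min_face_size) / net_kind     # initial downscale factor
        sizes = []
        while height / scale > net_kind and width / scale > net_kind:
            sizes.append((int(width / scale), int(height / scale)))
            scale *= scale_factor
        return sizes

    # pyramid_sizes(1280, 720, 48) -> [(320, 180), (160, 90), (80, 45), (40, 22)]
    # with min_face_size = 12 the first level would be the full 1280 x 720 image,
    # so the 12-net would have to score far more windows.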

The detect_faces_net() function used above is implemented in face_detection_functions.py under CNN_face_detection-master\face_detection, which contains the whole cascade detection pipeline. The original is well structured with clear comments; I modified it slightly for my own needs, and the resulting code is listed below:

import numpy as np
import cv2
import time
from operator import itemgetter

# ================== caffe ======================================
# caffe_root = '/home/anson/caffe-master/'  # this file is expected to be in {caffe_root}/examples
import sys
# sys.path.insert(0, caffe_root + 'python')
import caffe

def find_initial_scale(net_kind, min_face_size):
    '''
    :param net_kind: what kind of net (12, 24, or 48)
    :param min_face_size: minimum face size
    :return: returns scale factor
    '''
    return float(min_face_size) / net_kind

def resize_image(img, scale):
    '''
    :param img: original img
    :param scale: scale factor
    :return: resized image
    '''
    height, width, channels = img.shape
    new_height = int(height / scale)    # resized new height
    new_width = int(width / scale)      # resized new width
    new_dim = (new_width, new_height)
    img_resized = cv2.resize(img, new_dim)      # resized image
    return img_resized

def draw_rectangle(net_kind, img, face):
    '''
    :param net_kind: what kind of net (12, 24, or 48)
    :param img: image to draw on
    :param face: # list of info. in format [x, y, scale]
    :return: nothing
    '''
    x = face[0]
    y = face[1]
    scale = face[2]
    original_x = int(x * scale)     # corresponding x and y at original image
    original_y = int(y * scale)
    original_x_br = int(x * scale + net_kind * scale)   # bottom right x and y
    original_y_br = int(y * scale + net_kind * scale)
    cv2.rectangle(img, (original_x, original_y), (original_x_br, original_y_br), (255, 0, 0), 2)

def IoU(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence, current_scale]
    :param rect_2: list in format [x21, y21, x22, y22, confidence, current_scale]
    :return: returns IoU ratio (intersection over union) of two rectangles
    '''
    x11 = rect_1[0]     # first rectangle top left x
    y11 = rect_1[1]     # first rectangle top left y
    x12 = rect_1[2]     # first rectangle bottom right x
    y12 = rect_1[3]     # first rectangle bottom right y
    x21 = rect_2[0]     # second rectangle top left x
    y21 = rect_2[1]     # second rectangle top left y
    x22 = rect_2[2]     # second rectangle bottom right x
    y22 = rect_2[3]     # second rectangle bottom right y
    x_overlap = max(0, min(x12, x22) - max(x11, x21))
    y_overlap = max(0, min(y12, y22) - max(y11, y21))
    intersection = x_overlap * y_overlap
    union = (x12 - x11) * (y12 - y11) + (x22 - x21) * (y22 - y21) - intersection
    return float(intersection) / union

def IoM(rect_1, rect_2):
    '''
    :param rect_1: list in format [x11, y11, x12, y12, confidence, current_scale]
    :param rect_2: list in format [x21, y21, x22, y22, confidence, current_scale]
    :return: returns IoM ratio (intersection over min-area) of two rectangles
    '''
    x11 = rect_1[0]     # first rectangle top left x
    y11 = rect_1[1]     # first rectangle top left y
    x12 = rect_1[2]     # first rectangle bottom right x
    y12 = rect_1[3]     # first rectangle bottom right y
    x21 = rect_2[0]     # second rectangle top left x
    y21 = rect_2[1]     # second rectangle top left y
    x22 = rect_2[2]     # second rectangle bottom right x
    y22 = rect_2[3]     # second rectangle bottom right y
    x_overlap = max(0, min(x12, x22) - max(x11, x21))
    y_overlap = max(0, min(y12, y22) - max(y11, y21))
    intersection = x_overlap * y_overlap
    rect1_area = (y12 - y11) * (x12 - x11)
    rect2_area = (y22 - y21) * (x22 - x21)
    min_area = min(rect1_area, rect2_area)
    return float(intersection) / min_area

def localNMS(rectangles):
    '''
    :param rectangles: list of rectangles, which are lists in format [x11, y11, x12, y12, confidence, current_scale],
                       sorted from highest confidence to smallest
    :return: list of rectangles after local NMS
    '''
    result_rectangles = rectangles[:]   # list to return
    number_of_rects = len(result_rectangles)
    threshold = 0.3     # threshold of IoU of two rectangles
    cur_rect = 0
    while cur_rect < number_of_rects - 1:   # start from first element to second last element
        rects_to_compare = number_of_rects - cur_rect - 1   # elements after current element to compare
        cur_rect_to_compare = cur_rect + 1  # start comparing with element after current
        while rects_to_compare > 0:     # while there is at least one element after current to compare
            if (IoU(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= threshold) \
                    and (result_rectangles[cur_rect][5] == result_rectangles[cur_rect_to_compare][5]):  # scale is same
                del result_rectangles[cur_rect_to_compare]      # delete the rectangle
                number_of_rects -= 1
            else:
                cur_rect_to_compare += 1    # skip to next rectangle
            rects_to_compare -= 1
        cur_rect += 1   # finished comparing for current rectangle
    return result_rectangles

def globalNMS(rectangles):
    '''
    :param rectangles: list of rectangles, which are lists in format [x11, y11, x12, y12, confidence, current_scale],
                       sorted from highest confidence to smallest
    :return: list of rectangles after global NMS
    '''
    result_rectangles = rectangles[:]   # list to return
    number_of_rects = len(result_rectangles)
    threshold = 0.3     # threshold of IoM of two rectangles
    cur_rect = 0
    while cur_rect < number_of_rects - 1:   # start from first element to second last element
        rects_to_compare = number_of_rects - cur_rect - 1   # elements after current element to compare
        cur_rect_to_compare = cur_rect + 1  # start comparing with element after current
        while rects_to_compare > 0:     # while there is at least one element after current to compare
            if IoU(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= 0.2 \
                    or ((IoM(result_rectangles[cur_rect], result_rectangles[cur_rect_to_compare]) >= threshold)):
                # and (result_rectangles[cur_rect_to_compare][5] < 0.85)):  # if IoU ratio is higher than threshold  # 10/12=0.8333?
                del result_rectangles[cur_rect_to_compare]      # delete the rectangle
                number_of_rects -= 1
            else:
                cur_rect_to_compare += 1    # skip to next rectangle
            rects_to_compare -= 1
        cur_rect += 1   # finished comparing for current rectangle
    return result_rectangles

# ====== Below functions (12cal ~ 48cal) take images in style of caffe (0~1 BGR) ======
def detect_face_12c(net_12c_full_conv, img, min_face_size, stride,
                    multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    :param img: image to detect faces
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to find faces under multiple scales or not
    :param scale_factor: scale to apply for pyramid
    :param threshold: score of patch must be above this value to pass to next net
    :return: list of rectangles after global NMS
    '''
    net_kind = 12
    rectangles = []     # list of rectangles [x11, y11, x12, y12, confidence, current_scale] (corresponding to original image)
    current_scale = find_initial_scale(net_kind, min_face_size)    # find initial scale
    caffe_img_resized = resize_image(img, current_scale)           # resized initial caffe image
    current_height, current_width, channels = caffe_img_resized.shape
    while current_height > net_kind and current_width > net_kind:
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))  # switch from H x W x C to C x H x W
        # shape for input (data blob is N x C x H x W), set data
        net_12c_full_conv.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12c_full_conv.blobs['data'].data[...] = caffe_img_resized_CHW
        # run net and take argmax for prediction
        net_12c_full_conv.forward()
        out = net_12c_full_conv.blobs['prob'].data[0][1, :, :]
        # print out.shape
        out_height, out_width = out.shape
        # the fully convolutional 12-net scores a 12 x 12 window every 2 pixels of the
        # resized image, hence the factor of 2 when mapping back to the original image
        for current_y in range(0, out_height):
            for current_x in range(0, out_width):
                # total_windows += 1
                confidence = out[current_y, current_x]  # left index is y, right index is x (starting from 0)
                if confidence >= threshold:
                    current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                                         int(2*current_x*current_scale + net_kind*current_scale),
                                         int(2*current_y*current_scale + net_kind*current_scale),
                                         confidence, current_scale]     # find corresponding patch on image
                    rectangles.append(current_rectangle)
        if multiScale is False:
            break
        else:
            caffe_img_resized = resize_image(caffe_img_resized, scale_factor)
            current_scale *= scale_factor
            current_height, current_width, channels = caffe_img_resized.shape
    return rectangles

def cal_face_12c(net_12_cal, caffe_img, rectangles):
    '''
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = caffe_img.shape
    result = []
    all_cropped_caffe_img = []

    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2]     # crop image
        all_cropped_caffe_img.append(cropped_caffe_img)

    if len(all_cropped_caffe_img) == 0:
        return []

    output_all = net_12_cal.predict(all_cropped_caffe_img)      # predict through caffe

    for cur_rect in range(len(rectangles)):
        cur_rectangle = rectangles[cur_rect]
        output = output_all[cur_rect]
        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))
        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                # reverse, so that it ranks from large to small
    return result

def detect_face_24c(net_24c, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    result = []
    all_cropped_caffe_img = []

    for cur_rectangle in rectangles:
        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]
        cropped_caffe_img = caffe_img[y1:y2, x1:x2]     # crop image
        all_cropped_caffe_img.append(cropped_caffe_img)

    if len(all_cropped_caffe_img) == 0:
        return []

    prediction_all = net_24c.predict(all_cropped_caffe_img)    # predict through caffe

    for cur_rect in range(len(rectangles)):
        confidence = prediction_all[cur_rect][1]
        if confidence > 0.05:
            cur_rectangle = rectangles[cur_rect]
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    return result

def cal_face_24c(net_24_cal, caffe_img, rectangles):
    '''
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = caffe_img.shape
    result = []
    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2]     # crop image
        output = net_24_cal.predict([cropped_caffe_img])    # predict through caffe
        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))
        result.append(cur_result)
    return result

def detect_face_48c(net_48c, caffe_img, rectangles):
    '''
    :param caffe_img: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    result = []
    all_cropped_caffe_img = []
    for cur_rectangle in rectangles:
        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]
        cropped_caffe_img = caffe_img[y1:y2, x1:x2]     # crop image
        all_cropped_caffe_img.append(cropped_caffe_img)

        prediction = net_48c.predict([cropped_caffe_img])   # predict through caffe
        confidence = prediction[0][1]

        if confidence > 0.3:
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                # reverse, so that it ranks from large to small
    return result

def cal_face_48c(net_48_cal, caffe_img, rectangles):
    '''
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = caffe_img.shape
    result = []
    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = caffe_img[original_y1:original_y2, original_x1:original_x2]     # crop image
        output = net_48_cal.predict([cropped_caffe_img])    # predict through caffe
        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - 1.1 * original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + 1.1 * original_h / s_change))
        result.append(cur_result)
    return result

def detect_faces(nets, img_forward, caffe_image, min_face_size, stride,
                 multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Complete flow of face cascade detection
    :param nets: 6 nets as a tuple
    :param img_forward: image in normal style after subtracting mean pixel value
    :param caffe_image: image in style of caffe (0~1 BGR)
    :param min_face_size:
    :param stride:
    :param multiScale:
    :param scale_factor:
    :param threshold:
    :return: list of rectangles
    '''
    net_12c_full_conv = nets[0]
    net_12_cal = nets[1]
    net_24c = nets[2]
    net_24_cal = nets[3]
    net_48c = nets[4]
    net_48_cal = nets[5]

    rectangles = detect_face_12c(net_12c_full_conv, img_forward, min_face_size,
                                 stride, multiScale, scale_factor, threshold)   # detect faces
    rectangles = cal_face_12c(net_12_cal, caffe_image, rectangles)      # calibration
    rectangles = localNMS(rectangles)   # apply local NMS
    rectangles = detect_face_24c(net_24c, caffe_image, rectangles)
    rectangles = cal_face_24c(net_24_cal, caffe_image, rectangles)      # calibration
    rectangles = localNMS(rectangles)   # apply local NMS
    rectangles = detect_face_48c(net_48c, caffe_image, rectangles)
    rectangles = globalNMS(rectangles)  # apply global NMS
    rectangles = cal_face_48c(net_48_cal, caffe_image, rectangles)      # calibration

    return rectangles

# ========== Adjusts net to take one crop of image only during test time ==========
# ====== Below functions take images in normal style after subtracting mean pixel value ======
def detect_face_12c_net(net_12c_full_conv, img_forward, min_face_size, stride,
                        multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Adjusts net to take one crop of image only during test time
    :param img: image in caffe style to detect faces
    :param min_face_size: minimum face size to detect (in pixels)
    :param stride: stride (in pixels)
    :param multiScale: whether to find faces under multiple scales or not
    :param scale_factor: scale to apply for pyramid
    :param threshold: score of patch must be above this value to pass to next net
    :return: list of rectangles after global NMS
    '''
    net_kind = 12
    rectangles = []     # list of rectangles [x11, y11, x12, y12, confidence, current_scale] (corresponding to original image)
    current_scale = find_initial_scale(net_kind, min_face_size)    # find initial scale
    caffe_img_resized = resize_image(img_forward, current_scale)   # resized initial caffe image
    current_height, current_width, channels = caffe_img_resized.shape
    # print "Shape after resizing : " + str(caffe_img_resized.shape)
    while current_height > net_kind and current_width > net_kind:
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))  # switch from H x W x C to C x H x W
        # shape for input (data blob is N x C x H x W), set data
        net_12c_full_conv.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12c_full_conv.blobs['data'].data[...] = caffe_img_resized_CHW
        # run net and take argmax for prediction
        net_12c_full_conv.forward()
        out = net_12c_full_conv.blobs['prob'].data[0][1, :, :]
        # print out.shape
        out_height, out_width = out.shape

        # threshold = 0.02
        # idx = out[:, :] >= threshold
        # out[idx] = 1
        # idx = out[:, :] < threshold
        # out[idx] = 0
        # cv2.imshow('img', out*255)
        # cv2.waitKey(0)

        # print "Shape of output after resizing " + str(caffe_img_resized.shape) + " : " + str(out.shape)
        for current_y in range(0, out_height):
            for current_x in range(0, out_width):
                # total_windows += 1
                confidence = out[current_y, current_x]  # left index is y, right index is x (starting from 0)
                if confidence >= threshold:
                    current_rectangle = [int(2*current_x*current_scale), int(2*current_y*current_scale),
                                         int(2*current_x*current_scale + net_kind*current_scale),
                                         int(2*current_y*current_scale + net_kind*current_scale),
                                         confidence, current_scale]     # find corresponding patch on image
                    rectangles.append(current_rectangle)
        if multiScale is False:
            break
        else:
            caffe_img_resized = resize_image(caffe_img_resized, scale_factor)
            current_scale *= scale_factor
            current_height, current_width, channels = caffe_img_resized.shape
    return rectangles

def cal_face_12c_net(net_12_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2]   # crop image
        caffe_img_resized = cv2.resize(cropped_caffe_img, (12, 12))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_12_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_12_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_12_cal.forward()
        output = net_12_cal.blobs['prob'].data
        # output = net_12_cal.predict([cropped_caffe_img])   # predict through caffe

        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))
        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                # reverse, so that it ranks from large to small
    return result

def detect_face_24c_net(net_24c, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param caffe_img: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    result = []
    for cur_rectangle in rectangles:
        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]
        cropped_caffe_img = img_forward[y1:y2, x1:x2]   # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (24, 24))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_24c.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_24c.blobs['data'].data[...] = caffe_img_resized_CHW
        net_24c.forward()
        prediction = net_24c.blobs['prob'].data
        confidence = prediction[0][1]

        if confidence > 0.9:    # 0.05:
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    return result

def cal_face_24c_net(net_24_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2]   # crop image
        caffe_img_resized = cv2.resize(cropped_caffe_img, (24, 24))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_24_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_24_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_24_cal.forward()
        output = net_24_cal.blobs['prob'].data

        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + original_h / s_change))
        result.append(cur_result)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                # reverse, so that it ranks from large to small
    return result

def detect_face_48c_net(net_48c, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param caffe_img: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    result = []
    for cur_rectangle in rectangles:
        x1 = cur_rectangle[0]
        y1 = cur_rectangle[1]
        x2 = cur_rectangle[2]
        y2 = cur_rectangle[3]
        cropped_caffe_img = img_forward[y1:y2, x1:x2]   # crop image

        caffe_img_resized = cv2.resize(cropped_caffe_img, (48, 48))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_48c.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_48c.blobs['data'].data[...] = caffe_img_resized_CHW
        net_48c.forward()
        prediction = net_48c.blobs['prob'].data
        confidence = prediction[0][1]

        if confidence > 0.95:   # 0.1:
            cur_rectangle[4] = confidence
            result.append(cur_rectangle)

    result = sorted(result, key=itemgetter(4), reverse=True)    # sort rectangles according to confidence
                                                                # reverse, so that it ranks from large to small
                                                                # after sorting and globalNMS the best faces will be detected
    return result

def cal_face_48c_net(net_48_cal, img_forward, rectangles):
    '''
    Adjusts net to take one crop of image only during test time
    :param caffe_image: image in caffe style to detect faces
    :param rectangles: rectangles in form [x11, y11, x12, y12, confidence, current_scale]
    :return: rectangles after calibration
    '''
    height, width, channels = img_forward.shape
    result = []
    for cur_rectangle in rectangles:
        original_x1 = cur_rectangle[0]
        original_y1 = cur_rectangle[1]
        original_x2 = cur_rectangle[2]
        original_y2 = cur_rectangle[3]
        original_w = original_x2 - original_x1
        original_h = original_y2 - original_y1

        cropped_caffe_img = img_forward[original_y1:original_y2, original_x1:original_x2]   # crop image
        caffe_img_resized = cv2.resize(cropped_caffe_img, (48, 48))
        caffe_img_resized_CHW = caffe_img_resized.transpose((2, 0, 1))
        net_48_cal.blobs['data'].reshape(1, *caffe_img_resized_CHW.shape)
        net_48_cal.blobs['data'].data[...] = caffe_img_resized_CHW
        net_48_cal.forward()
        output = net_48_cal.blobs['prob'].data

        prediction = output[0]      # (44, 1) ndarray
        threshold = 0.1
        indices = np.nonzero(prediction > threshold)[0]     # ndarray of indices where prediction is larger than threshold
        number_of_cals = len(indices)   # number of calibrations larger than threshold
        if number_of_cals == 0:     # if no calibration is needed, check next rectangle
            result.append(cur_rectangle)
            continue
        total_s_change = 0
        total_x_change = 0
        total_y_change = 0
        for current_cal in range(number_of_cals):   # accumulate changes, and calculate average
            cal_label = int(indices[current_cal])   # should be number in 0~44
            if (cal_label >= 0) and (cal_label <= 8):       # decide s change
                total_s_change += 0.83
            elif (cal_label >= 9) and (cal_label <= 17):
                total_s_change += 0.91
            elif (cal_label >= 18) and (cal_label <= 26):
                total_s_change += 1.0
            elif (cal_label >= 27) and (cal_label <= 35):
                total_s_change += 1.10
            else:
                total_s_change += 1.21
            if cal_label % 9 <= 2:      # decide x change
                total_x_change += -0.17
            elif (cal_label % 9 >= 6) and (cal_label % 9 <= 8):     # ignore case when 3<=x<=5, since adding 0 doesn't change
                total_x_change += 0.17
            if cal_label % 3 == 0:      # decide y change
                total_y_change += -0.17
            elif cal_label % 3 == 2:    # ignore case when 1, since adding 0 doesn't change
                total_y_change += 0.17
        s_change = total_s_change / number_of_cals      # calculate average
        x_change = total_x_change / number_of_cals
        y_change = total_y_change / number_of_cals
        cur_result = cur_rectangle      # inherit format and last two attributes from original rectangle
        cur_result[0] = int(max(0, original_x1 - original_w * x_change / s_change))
        cur_result[1] = int(max(0, original_y1 - 1.1 * original_h * y_change / s_change))
        cur_result[2] = int(min(width, cur_result[0] + original_w / s_change))
        cur_result[3] = int(min(height, cur_result[1] + 1.1 * original_h / s_change))
        result.append(cur_result)
    return result

def detect_faces_net(nets, img_forward, min_face_size, stride,
                     multiScale=False, scale_factor=1.414, threshold=0.05):
    '''
    Complete flow of face cascade detection
    :param nets: 6 nets as a tuple
    :param img_forward: image in normal style after subtracting mean pixel value
    :param min_face_size:
    :param stride:
    :param multiScale:
    :param scale_factor:
    :param threshold:
    :return: list of rectangles
    '''
    net_12c_full_conv = nets[0]
    net_12_cal = nets[1]
    net_24c = nets[2]
    net_24_cal = nets[3]
    net_48c = nets[4]
    net_48_cal = nets[5]

    rectangles = detect_face_12c_net(net_12c_full_conv, img_forward, min_face_size,
                                     stride, multiScale, scale_factor, threshold)   # detect faces
    rectangles = cal_face_12c_net(net_12_cal, img_forward, rectangles)      # calibration
    rectangles = localNMS(rectangles)   # apply local NMS
    rectangles = detect_face_24c_net(net_24c, img_forward, rectangles)
    rectangles = cal_face_24c_net(net_24_cal, img_forward, rectangles)      # calibration
    rectangles = localNMS(rectangles)   # apply local NMS
    rectangles = detect_face_48c_net(net_48c, img_forward, rectangles)
    rectangles = globalNMS(rectangles)  # apply global NMS
    rectangles = cal_face_48c_net(net_48_cal, img_forward, rectangles)      # calibration

    return rectangles
