cropper fixed width/height cropping: deep learning image preprocessing with scaling and cropping in Python

「爱情、让人受尽委屈。」 2023-01-05 12:39

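When deep learning is used for image tasks, the network input size is usually fixed. In recent work involving text detection, the preprocessing resize shrank the original image, which blurred the text and caused detection to fail. I then tried scaling and cropping with overlap: first scale the original image to a chosen size, then crop it into tiles that match the network input.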
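To make the blurring effect concrete, here is a minimal sketch that estimates how tall a character remains after the longest side of an image is scaled to a target size. The function name `text_height_after_resize` and the 24 px character height are illustrative assumptions, not measurements from the original experiments.

```python
def text_height_after_resize(image_max_dim, target_size, glyph_height):
    """Approximate glyph height in pixels after scaling the longest side to target_size."""
    scale = target_size / image_max_dim
    return glyph_height * scale


# Hypothetical example: a 1280 px image containing 24 px tall characters.
direct = text_height_after_resize(1280, 352, 24)  # shrink straight to the network input
tiled = text_height_after_resize(1280, 800, 24)   # shrink to 800 first, then crop 352x352 tiles
print(round(direct, 1), round(tiled, 1))
```

Under these assumed numbers the characters end up roughly 6.6 px tall with a direct resize and 15 px tall when the image is scaled to 800 and then cropped, which is why cropping after a milder resize keeps the text more legible.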

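Now to the point. I use yolov3 with a 352x352x3 network input, while the input images range from a few hundred to over a thousand pixels per side, so the originals have to be resized. At first I simply scaled and padded them, and the detection mAP was very low. Analysis showed that the text in some of the 352x352 inputs was already badly blurred, so direct scaling is not viable. The improved scheme is as follows: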

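  1. If the longest side of the original image is greater than 1000, resize to 800x800, then crop nine 352x352 tiles with an overlap of 128 pixels.
  2. If the longest side is greater than 500 and at most 1000, resize to 608x608 (the size the code uses; 2 x 352 - 608 gives the 96-pixel overlap), then crop four 352x352 tiles with an overlap of 96 pixels.
  3. If the longest side is at most 500, resize to 352x352.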
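The three rules can be sketched as a small planning function that returns the canvas size and the tile start offsets. The names `plan_tiles`, `overlap`, and `NET_INPUT` are hypothetical helpers introduced for illustration. The middle case uses a 608 px canvas, as the script does, because two 352 px tiles on a 608 px canvas overlap by exactly 96 px.

```python
NET_INPUT = 352  # network input size used in the article


def plan_tiles(max_dim):
    """Return (canvas_size, offsets) for an image whose longest side is max_dim."""
    if max_dim > 1000:
        canvas, per_side = 800, 3
    elif max_dim > 500:
        canvas, per_side = 608, 2
    else:
        return NET_INPUT, [0]
    # Evenly spaced tiles: the first starts at 0, the last ends at the canvas edge.
    stride = (canvas - NET_INPUT) // (per_side - 1)
    return canvas, [i * stride for i in range(per_side)]


def overlap(offsets):
    """Pixels shared by two neighbouring tiles (0 when there is a single tile)."""
    if len(offsets) < 2:
        return 0
    return NET_INPUT - (offsets[1] - offsets[0])


for dim in (1280, 720, 400):
    canvas, offsets = plan_tiles(dim)
    print(dim, canvas, offsets, overlap(offsets))
```

The same offsets are used for rows and columns, so three offsets give a 3x3 grid of nine tiles and two offsets give a 2x2 grid of four.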

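The Python implementation follows. It uses the PIL and opencv libraries to scale and crop every image in a directory, and covers the following:

  1. Traversing the files in a directory
  2. Loading and saving images with opencv
  3. Scaling and cropping with opencv
  4. Displaying images with PIL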

    import numpy as np
    from PIL import Image
    import cv2
    import os

    # Take an image in BGR channel order (as loaded by opencv) and display it
    def img_show(img):
        img_rgb = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)  # NumPy array in RGB order
        print(type(img_rgb), img_rgb.shape)
        pil_img = Image.fromarray(img_rgb)
        pil_img.show()

    # Scale the image proportionally and pad the remainder with black
    def img_resize(cvImageSrc, net_size, width, height):
        if width == net_size and height == net_size:
            return cvImageSrc
        # When the width and height scale factors are nearly equal, resize directly
        if abs(net_size / width - net_size / height) < 0.001:
            return cv2.resize(cvImageSrc, (net_size, net_size))
        # Otherwise fit the longer side to net_size and scale the other side by the same factor
        if net_size / width < net_size / height:
            new_w = net_size
            new_h = max(1, (height * net_size) / width)
        else:
            new_h = net_size
            new_w = max(1, (width * net_size) / height)
        # Round to the nearest integer, then force even sizes
        net_w = int(new_w + 0.5)
        net_h = int(new_h + 0.5)
        if net_w % 2 == 1:
            net_w = net_w - 1
        if net_h % 2 == 1:
            net_h = net_h - 1
        net_w = max(2, net_w)
        net_h = max(2, net_h)
        det_matROI = cv2.resize(cvImageSrc, (net_w, net_h))
        # Paste the scaled image into the centre of a black net_size x net_size canvas
        det_mat = np.zeros((net_size, net_size, 3), dtype="uint8")
        base_w = (net_size - net_w) // 2
        base_h = (net_size - net_h) // 2
        det_mat[base_h:base_h + net_h, base_w:base_w + net_w, :] = det_matROI
        return det_mat

    baseRoot = "/Users/lemonhe/Documents/CNN/dataset/01-data/"
    rootdir = baseRoot + "dataset_test"
    file_names = os.listdir(rootdir)  # list every file and directory in the folder
    print(len(file_names))
    count = 0

    threshold1 = 1000
    threshold2 = 500

    for name in file_names:
        path = os.path.join(rootdir, name)
        print(path)
        if not os.path.isfile(path):
            continue
        img = cv2.imread(path)
        if img is None:
            print("this is nonetype")
            continue
        height, width, channel = img.shape  # image dimensions
        print(height, width, channel)
        max_dim = max(height, width)
        #img_show(img)
        if max_dim > threshold1:
            # 800x800 canvas: 3x3 tiles of 352x352, stride 224, overlap 128
            det_mat = img_resize(img, 800, width, height)
            offsets = [0, 224, 448]
        elif max_dim > threshold2:
            # 608x608 canvas: 2x2 tiles of 352x352, stride 256, overlap 96
            det_mat = img_resize(img, 608, width, height)
            offsets = [0, 256]
        else:
            # Small image: a single 352x352 input, no cropping
            det_mat = img_resize(img, 352, width, height)
            offsets = None
        if offsets is None:
            img_show(det_mat)
            path_352 = baseRoot + "test1/img" + str(count) + ".jpg"
            cv2.imwrite(path_352, np.uint8(det_mat))
        else:
            # File suffix is _<row><column>, e.g. _11 for the top-left tile
            for r, top in enumerate(offsets, start=1):
                for c, left in enumerate(offsets, start=1):
                    tile = np.uint8(det_mat[top:top + 352, left:left + 352, :])
                    tile_path = baseRoot + "test1/img" + str(count) + "_" + str(r) + str(c) + ".jpg"
                    cv2.imwrite(tile_path, tile)
        count = count + 1
        print(count)

Running the Python script gives the following result:

451477b4bb8904cd37488f2aaa2c24dd.png

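The input image is shown below. Its resolution is 719x1280, and after cropping the script outputs nine 352x352 sub-images, which completes the image preprocessing.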

3883bf6eb5914f4f10c1afa5b5a80326.png
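For the 719x1280 example, the proportional scaling and black padding can be worked out without opencv. The sketch below is an assumption-laden illustration: `letterbox_geometry` is a hypothetical helper, 719 is taken to be the width, and the padding is assumed to be split evenly around the scaled image after its size is forced to be even.

```python
def letterbox_geometry(width, height, net_size):
    """Return (scaled_w, scaled_h, pad_left, pad_top) for an aspect-preserving fit."""
    scale = min(net_size / width, net_size / height)
    # Round to the nearest integer, then drop to an even size.
    scaled_w = int(width * scale + 0.5)
    scaled_h = int(height * scale + 0.5)
    scaled_w = max(2, scaled_w - scaled_w % 2)
    scaled_h = max(2, scaled_h - scaled_h % 2)
    # Centre the scaled image on the square canvas (assumed even padding).
    pad_left = (net_size - scaled_w) // 2
    pad_top = (net_size - scaled_h) // 2
    return scaled_w, scaled_h, pad_left, pad_top


# Longest side 1280 > 1000, so the image goes onto the 800 px canvas.
print(letterbox_geometry(719, 1280, 800))
```

With these assumptions the image is scaled to 448x800 and centred with 176 black columns on each side, so the left and right tile columns consist partly of padding.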
