|
这些图片尺寸并不相同,因为血涂片和细胞图像是基于人、测试方法、图片方向不同而不同的。让我们总结我们的训练数据集的统计信息来决定最佳的图像尺寸(牢记,我们根本不会碰测试数据集)。
import cv2from concurrent import futuresimport threading-
def get_img_shape_parallel(idx, img, total_imgs): if idx % 5000 == 0 or idx == (total_imgs - 1): print('{}: working on img num: {}'.format(threading.current_thread().name, idx)) return cv2.imread(img).shape ex = futures.ThreadPoolExecutor(max_workers=None)data_inp = [(idx, img, len(train_files)) for idx, img in enumerate(train_files)]print('Starting Img shape computation:')train_img_dims_map = ex.map(get_img_shape_parallel, [record[0] for record in data_inp], [record[1] for record in data_inp], [record[2] for record in data_inp])train_img_dims = list(train_img_dims_map)print('Min Dimensions:', np.min(train_img_dims, axis=0)) print('Avg Dimensions:', np.mean(train_img_dims, axis=0))print('Median Dimensions:', np.median(train_img_dims, axis=0))print('Max Dimensions:', np.max(train_img_dims, axis=0))-
-
# OutputStarting Img shape computation:ThreadPoolExecutor-0_0: working on img num: 0ThreadPoolExecutor-0_17: working on img num: 5000ThreadPoolExecutor-0_15: working on img num: 10000ThreadPoolExecutor-0_1: working on img num: 15000ThreadPoolExecutor-0_7: working on img num: 17360Min Dimensions: [46 46 3]Avg Dimensions: [132.77311215 132.45757733 3.]Median Dimensions: [130. 130. 3.]Max Dimensions: [385 394 3]
我们应用并行处理来加速图像读取,并且基于汇总统计结果,我们将每幅图片的尺寸重新调整到 125x125 像素。让我们载入我们所有的图像并重新调整它们为这些固定尺寸。
IMG_DIMS = (125, 125)-
def get_img_data_parallel(idx, img, total_imgs): if idx % 5000 == 0 or idx == (total_imgs - 1): print('{}: working on img num: {}'.format(threading.current_thread().name, idx)) img = cv2.imread(img) img = cv2.resize(img, dsize=IMG_DIMS, interpolation=cv2.INTER_CUBIC) img = np.array(img, dtype=np.float32) return img-
ex = futures.ThreadPoolExecutor(max_workers=None)train_data_inp = [(idx, img, len(train_files)) for idx, img in enumerate(train_files)]val_data_inp = [(idx, img, len(val_files)) for idx, img in enumerate(val_files)]test_data_inp = [(idx, img, len(test_files)) for idx, img in enumerate(test_files)]-
print('Loading Train Images:')train_data_map = ex.map(get_img_data_parallel, [record[0] for record in train_data_inp], [record[1] for record in train_data_inp], [record[2] for record in train_data_inp])train_data = np.array(list(train_data_map))-
print('nLoading Validation Images:')val_data_map = ex.map(get_img_data_parallel, [record[0] for record in val_data_inp], [record[1] for record in val_data_inp], [record[2] for record in val_data_inp])val_data = np.array(list(val_data_map))-
print('nLoading Test Images:')test_data_map = ex.map(get_img_data_parallel, [record[0] for record in test_data_inp], [record[1] for record in test_data_inp], [record[2] for record in test_data_inp])test_data = np.array(list(test_data_map))-
train_data.shape, val_data.shape, test_data.shape -
-
# OutputLoading Train Images:ThreadPoolExecutor-1_0: working on img num: 0ThreadPoolExecutor-1_12: working on img num: 5000ThreadPoolExecutor-1_6: working on img num: 10000ThreadPoolExecutor-1_10: working on img num: 15000ThreadPoolExecutor-1_3: working on img num: 17360-
Loading Validation Images:ThreadPoolExecutor-1_13: working on img num: 0ThreadPoolExecutor-1_18: working on img num: 1928-
Loading Test Images:ThreadPoolExecutor-1_5: working on img num: 0ThreadPoolExecutor-1_19: working on img num: 5000ThreadPoolExecutor-1_8: working on img num: 8267((17361, 125, 125, 3), (1929, 125, 125, 3), (8268, 125, 125, 3))
(编辑:湖南网)
【声明】本站内容均来自网络,其相关言论仅代表作者个人观点,不代表本站立场。若无意侵犯到您的权利,请及时与联系站长删除相关内容!
|