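Training on your own dataset with yolov7-u7 for image segmentation

vvipi, published December 19, 2022

# I. Preface

The object detection side of yolov7 has plenty of users and tutorials are easy to find, while the image segmentation module (seg) gets far less attention. After a lot of searching I found that the u7 branch of yolov7 can do image segmentation, and it took me several days of stumbling around to get training to run. To avoid forgetting the details, and to help anyone who lands here, this post records the pitfalls I hit.

Keywords: image segmentation, instance segmentation, YOLOv7, YOLOv7-u7, YOLOv7-seg, training on your own dataset

# II. Setup

## 1. Download YOLOv7-u7

In git bash, run `git clone -b u7 https://github.com/WongKinYiu/yolov7` to download the u7 branch, which is the branch used for image segmentation.

If you cannot reach GitHub, use `git clone -b u7 https://gitee.com/monkeycc/yolov7.git`, a mirror on gitee.

## 2. Install pytorch

Open cmd and run `nvidia-smi` to check the current driver version. If the Driver Version value is below 400, update the graphics driver first. Note the CUDA Version value, which determines the build you will download next.

Go to the [pytorch website](https://pytorch.org/) and choose the torch build, your operating system, conda or pip, the language, and the CUDA version. The CUDA version you choose should be less than or equal to the CUDA Version reported by `nvidia-smi`.

Copy the generated command and run it in cmd to start the installation.

For example: `pip3 install torch torchvision torchaudio --extra-index-url https://download.pytorch.org/whl/cu117`

## 3. Install detectron2

```
git clone https://github.com/facebookresearch/detectron2.git
cd detectron2
python setup.py build
```

## 4. Install the other dependencies

Run `pip install -r requirements.txt`, or just start the program and install whatever package each error says is missing.

# III. Training on your own dataset

## 1. Download the weights file yolov7-seg.pt

Download link: [https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-seg.pt](https://github.com/WongKinYiu/yolov7/releases/download/v0.1/yolov7-seg.pt)

Place it in the seg folder.

## 2. Prepare images and labels

### 2.1 Reference directory layout

`images/train` holds the training images, and `labels/train` holds the matching txt label files.

`images/val` holds the validation images, and `labels/val` holds the matching txt label files.

```
├──seg
   └──data
      ├──custom.yaml
      └──custom
         ├──train_list.txt
         ├──val_list.txt
         ├──images
         |  └──train
         |  └──val
         └──labels
            └──train
            └──val
```

### 2.2 Annotate the images

Annotate with labelme: click Create Polygons and draw a polygon around each object.

labelme saves each polygon annotation as a json file, which has to be converted into the txt format used for training. Each output line is a class id followed by the polygon's x y coordinates, normalized by the image width and height. The script is:

```
import json
import os

# Folder that holds the labelme .json files; the .txt labels are written next to them.
LABEL_PATH = 'D:\\yolov7-u7\\seg\\data\\custom\\labels\\train'

name2id = {
    'floor': 0,  # change to match your class names
}


def convert(size, points):
    """Normalize polygon points to [0, 1] by image width and height."""
    dw = 1. / size[0]
    dh = 1. / size[1]
    result = []
    for point in points:
        result.append(point[0] * dw)
        result.append(point[1] * dh)
    return result


def decode_json(json_folder_path, json_name):
    # os.path.join adds the path separator that plain string concatenation would drop
    txt_name = os.path.join(LABEL_PATH, json_name[0:-5] + '.txt')
    json_path = os.path.join(json_folder_path, json_name)
    with open(json_path, 'r', encoding='utf-8') as f:
        data = json.load(f)

    img_w = data['imageWidth']
    img_h = data['imageHeight']

    with open(txt_name, 'w') as txt_file:
        for i in data['shapes']:
            label_name = i['label']
            if i['shape_type'] == 'polygon':
                xys = convert((img_w, img_h), i['points'])
                # one line per polygon: class id, then normalized x y pairs
                txt_file.write(str(name2id[label_name]) + " " + " ".join([str(a) for a in xys]) + '\n')


if __name__ == '__main__':
    json_folder_path = LABEL_PATH
    json_names = [f for f in os.listdir(json_folder_path) if f.endswith('.json')]
    for j in json_names:
        decode_json(json_folder_path, j)
```

### 2.3 Script for generating the file lists

The following script writes the paths of all image files into train_list.txt and val_list.txt:

```
import os

TRAIN_IMG_DIR = 'seg/data/custom/images/train/'
TRAIN_IMG_LIST = 'seg/data/custom/train_list.txt'
VAL_IMG_DIR = 'seg/data/custom/images/val/'
VAL_IMG_LIST = 'seg/data/custom/val_list.txt'


def write_img_list(img_dir, list_path):
    # keep only .jpg and .bmp images
    names = [f for f in os.listdir(img_dir) if f.lower().endswith(('.jpg', '.bmp'))]
    with open(list_path, "w", encoding='utf-8') as f:
        for filename in names:
            f.write(img_dir + filename + '\n')


def generate_img_list():
    write_img_list(TRAIN_IMG_DIR, TRAIN_IMG_LIST)
    write_img_list(VAL_IMG_DIR, VAL_IMG_LIST)


if __name__ == '__main__':
    generate_img_list()
```

## 3. Modify the configuration files

Copy `seg/data/coco.yaml`, rename it to `custom.yaml`, and edit it into the following form, adjusting the directories and class names to your own setup. For the test set I simply reuse the validation files to save time.

```
# Train/val/test sets as 1) dir: path/to/imgs, 2) file: path/to/imgs.txt, or 3) list: [path/to/imgs1, path/to/imgs2, ..]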
path: D:\\yolov7-u7\\seg\\data\\custom\\  # dataset root dir
train: train_list.txt  # train images
val: val_list.txt  # val images
test: val_list.txt  # test images

# Classes
names:
  0: floor
```

Modify `seg/models/segment/yolov7-seg.yaml`. I set nc to the number of target classes plus one, that is, the targets plus one background class. For example, if you only need to recognize a single class such as person, set nc=2. Note that YOLO-style configs normally define nc as the number of classes with no separate background entry, and the class count may well be taken from `custom.yaml` at training time, so treat this value as the configuration that worked for me and not as a strict requirement.

```
# Parameters
nc: 2  # number of classes
depth_multiple: 1.0  # model depth multiple
width_multiple: 1.0  # layer channel multiple
anchors:
  - [12,16, 19,36, 40,28]  # P3/8
  - [36,75, 76,55, 72,146]  # P4/16
  - [142,110, 192,243, 459,401]  # P5/32
```

## 4. Modify the arguments in train.py

weights: set this to the pretrained weights downloaded earlier, `yolov7-seg.pt`. You can also leave it empty and train from scratch, but pretrained weights generally give better results.

cfg: the path to yolov7-seg.yaml.

data: the path to custom.yaml, that is, our custom dataset.

batch-size: how many images are fed to the GPU at once. Set it according to your graphics card, usually as a power of 2. A good card can go up to 16 and train faster. If you get an error, lower it and try 8, 4, 2 in turn until it fits.

workers: the number of CPU threads used for data loading. This also depends on your machine: raise it if the machine can take it, lower it if you get errors.

```
parser = argparse.ArgumentParser()
parser.add_argument('--weights', type=str, default=ROOT / 'yolov7-seg.pt', help='initial weights path')
parser.add_argument('--cfg', type=str, default=ROOT / 'models/segment/yolov7-seg.yaml', help='model.yaml path')
parser.add_argument('--data', type=str, default=ROOT / 'data/custom.yaml', help='dataset.yaml path')
parser.add_argument('--batch-size', type=int, default=2, help='total batch size for all GPUs, -1 for autobatch')
parser.add_argument('--workers', type=int, default=4, help='max dataloader workers (per RANK in DDP mode)')
```

With everything in place, run `train.py` to start training.

# IV. Checking the training results

Use seg/segment/predict.py to check the results.

## 1. Prepare some test images

Create any folder, for example `seg/data/images/`, and put a few images to be segmented in it.

## 2. Modify predict.py

weights: set this to the `best.pt` produced by our training run.

source: set this to the image folder, `data/images`.

```
parser = argparse.ArgumentParser()
parser.add_argument('--weights', nargs='+', type=str, default=ROOT / 'best.pt', help='model path(s)')
parser.add_argument('--source', type=str, default=ROOT / 'data/images', help='file/dir/URL/glob, 0 for webcam')
```

Run `predict.py` to see the segmentation results.

# V. Pitfalls

## 1. Choosing the right train.py

Do not train with the `train.py` in the main directory; use `/seg/segment/train.py` instead.

Training with the main-directory train.py produced the following error, presumably because its loss function does not match the output of the segmentation model:

```
Traceback (most recent call last):
  File "d:\gitserver\machineLearning\yolov7-u7\seg\train.py", line 633, in <module>
    main(opt)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\train.py", line 526, in main
    train(opt.hyp, opt, device, callbacks)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\train.py", line 308, in train
    loss, loss_items = compute_loss(pred, targets.to(device))  # loss scaled by batch_size
  File "d:\gitserver\machineLearning\yolov7-u7\seg\utils\loss.py", line 125, in __call__
    tcls, tbox, indices, anchors = self.build_targets(p, targets)  # targets
  File "d:\gitserver\machineLearning\yolov7-u7\seg\utils\loss.py", line 198, in build_targets
    anchors, shape = self.anchors[i], p[i].shape
AttributeError: 'list' object has no attribute 'shape'
```

In an issue on GitHub I found someone pointing out that /seg/segment/train.py is the script to use for training.

## 2. torch.cuda.OutOfMemoryError

The error output was:

```
Traceback (most recent call last):
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 683, in <module>
    main(opt)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 577, in main
    train(opt.hyp, opt, device, callbacks)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 321, in train
    pred = model(imgs)  # forward
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 300, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 212, in _forward_once
    x = m(x)  # run
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\common.py", line 99, in forward
    return self.act(self.bn(self.conv(x)))
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\batchnorm.py", line 171, in forward
    return F.batch_norm(
  File "D:\Anaconda3\lib\site-packages\torch\nn\functional.py", line 2450, in batch_norm
    return torch.batch_norm(
torch.cuda.OutOfMemoryError: CUDA out of memory. Tried to allocate 50.00 MiB (GPU 0; 4.00 GiB total capacity; 3.45 GiB already allocated; 0 bytes free; 3.48 GiB reserved in total by PyTorch) If reserved memory is >> allocated memory try setting max_split_size_mb to avoid fragmentation. See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF
```

When `torch.cuda.OutOfMemoryError` appears during training, the GPU most likely does not have enough memory. Lowering the batch-size parameter helps; with 4 GB of VRAM I had to go down to 2 before training ran normally.

`parser.add_argument('--batch-size', type=int, default=2, help='total batch size for all GPUs, -1 for autobatch')`

## 3. meshgrid() got multiple values for keyword argument 'indexing'

This error during training may be caused by a PyTorch version issue. Searching turned up people whose error was the opposite of mine, which gave me the idea of changing `return _VF.meshgrid(tensors, **kwargs, indexing = 'ij')` in the offending `functional.py` to `return _VF.meshgrid(tensors, **kwargs)`, and to my surprise that fixed it. The traceback shows why: `meshgrid` already forwards `indexing=indexing` to `_meshgrid`, so `kwargs` contains `indexing`, and the hard-coded `indexing = 'ij'` passes it a second time. The hard-coded argument is probably a leftover from a workaround meant for a different PyTorch version.

```
Traceback (most recent call last):
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 683, in <module>
    main(opt)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 577, in main
    train(opt.hyp, opt, device, callbacks)
  File "d:\gitserver\machineLearning\yolov7-u7\seg\segment\train.py", line 373, in train
    results, maps, _ = validate.run(data_dict,
  File "D:\Anaconda3\lib\site-packages\torch\autograd\grad_mode.py", line 27, in decorate_context
    return func(*args, **kwargs)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\segment\val.py", line 255, in run
    out, train_out = model(im)  # if training else model(im, augment=augment, val=True)  # inference, loss
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 300, in forward
    return self._forward_once(x, profile, visualize)  # single-scale inference, train
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 212, in _forward_once
    x = m(x)  # run
  File "D:\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 179, in forward
    x = self.detect(self, x)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 122, in forward
    self.grid[i], self.anchor_grid[i] = self._make_grid(nx, ny, i)
  File "D:\gitserver\machineLearning\yolov7-u7\seg\models\yolo.py", line 143, in _make_grid
    yv, xv = torch.meshgrid(y, x, indexing='ij') if torch_1_10 else torch.meshgrid(y, x)  # torch>=0.7 compatibility
  File "D:\Anaconda3\lib\site-packages\torch\functional.py", line 489, in meshgrid
    return _meshgrid(*tensors, indexing=indexing)
  File "D:\Anaconda3\lib\site-packages\torch\functional.py", line 504, in _meshgrid
    return _VF.meshgrid(tensors, **kwargs, indexing = 'ij')  # type: ignore[attr-defined]
TypeError: meshgrid() got multiple values for keyword argument 'indexing'
```
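
The `_make_grid` line in the traceback above switches on a `torch_1_10` flag to decide whether to pass `indexing='ij'` to `torch.meshgrid`. The snippet below is a rough sketch of that kind of version gate, written so the decision lives in your own code and `functional.py` inside site-packages stays untouched. It is not the repository's actual implementation: `parse_version` and `meshgrid_kwargs` are hypothetical helper names, and the sketch assumes the `indexing` argument is accepted from PyTorch 1.10 onward.

```python
import re


def parse_version(version: str) -> tuple:
    """Extract (major, minor) from a version string such as '1.13.1+cu117'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"unrecognized version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def meshgrid_kwargs(torch_version: str) -> dict:
    """Keyword arguments to pass to torch.meshgrid for the given version.

    Assumption: 'indexing' is supported from 1.10 onward; older versions
    get no extra keyword at all.
    """
    if parse_version(torch_version) >= (1, 10):
        return {"indexing": "ij"}
    return {}


if __name__ == "__main__":
    for version in ("1.9.0", "1.13.1+cu117"):
        print(version, meshgrid_kwargs(version))
```

In a script that imports torch, the call would then look like `torch.meshgrid(y, x, **meshgrid_kwargs(torch.__version__))`, which passes `indexing` exactly once on newer versions and not at all on older ones.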