
yolov5: splitting an annotated dataset (with complete, runnable Python code)

Posted on 2024-9-10 14:31:41
Problem description

I am preparing to train my own model with yolov5. I took a downloaded open-source dataset, re-annotated it to my own requirements, and now need to split it.

Problem analysis

Splitting a dataset comes down to two things: shuffle the data first, then divide it into a training set, a validation set and a test set by a fixed ratio. The ratio I use here is 7:1:2.

Steps

1. Shuffle the dataset

The dataset consists of images and annotation files, so the two kinds of file have to be bound together before shuffling. After reading the file names, bind them with the zip function. Both listings are sorted first, because os.listdir returns names in arbitrary order; this assumes every image has an annotation file with the same base name.

    each_class_image = []
    each_class_label = []
    # sorted() keeps image i and label i aligned before zipping
    for image in sorted(os.listdir(file_path)):
        each_class_image.append(image)
    for label in sorted(os.listdir(xml_path)):
        each_class_label.append(label)
    data = list(zip(each_class_image, each_class_label))

Then shuffle the pairs and separate them back into two lists.

    random.shuffle(data)
    each_class_image, each_class_label = zip(*data)

2. Split the two lists by the chosen ratio

Use three lists each for the image names and the annotation file names.

    total = len(each_class_image)
    train_images = each_class_image[0:int(train_rate * total)]
    val_images = each_class_image[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_images = each_class_image[int((train_rate + val_rate) * total):]
    train_labels = each_class_label[0:int(train_rate * total)]
    val_labels = each_class_label[int(train_rate * total):int((train_rate + val_rate) * total)]
    test_labels = each_class_label[int((train_rate + val_rate) * total):]

3. Create the folders locally and save each split

Each file is copied into train, val or test, under an images or labels subfolder. That completes the split.

    for image in train_images:
        # print(image)
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'train' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in train_labels:
        # print(label)
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'train' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in val_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'val' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in val_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'val' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

    for image in test_images:
        old_path = file_path + '/' + image
        new_path1 = new_file_path + '/' + 'test' + '/' + 'images'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + image
        shutil.copy(old_path, new_path)

    for label in test_labels:
        old_path = xml_path + '/' + label
        new_path1 = new_file_path + '/' + 'test' + '/' + 'labels'
        if not os.path.exists(new_path1):
            os.makedirs(new_path1)
        new_path = new_path1 + '/' + label
        shutil.copy(old_path, new_path)

Results

Run the single Python file directly. When it finishes, check the output folder: the images and annotation files are shuffled and still correspond one-to-one.

Complete code

    import os
    import shutil
    import random

    random.seed(0)  # fixed seed so the split is reproducible


    def split_data(file_path, xml_path, new_file_path, train_rate, val_rate, test_rate):
        # test_rate is not used directly: the test set takes whatever remains
        each_class_image = []
        each_class_label = []
        # sorted() keeps image i and label i aligned; os.listdir order is arbitrary
        for image in sorted(os.listdir(file_path)):
            each_class_image.append(image)
        for label in sorted(os.listdir(xml_path)):
            each_class_label.append(label)
        data = list(zip(each_class_image, each_class_label))
        total = len(each_class_image)
        random.shuffle(data)
        each_class_image, each_class_label = zip(*data)

        train_images = each_class_image[0:int(train_rate * total)]
        val_images = each_class_image[int(train_rate * total):int((train_rate + val_rate) * total)]
        test_images = each_class_image[int((train_rate + val_rate) * total):]
        train_labels = each_class_label[0:int(train_rate * total)]
        val_labels = each_class_label[int(train_rate * total):int((train_rate + val_rate) * total)]
        test_labels = each_class_label[int((train_rate + val_rate) * total):]

        for image in train_images:
            print(image)
            old_path = file_path + '/' + image
            new_path1 = new_file_path + '/' + 'train' + '/' + 'images'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + image
            shutil.copy(old_path, new_path)

        for label in train_labels:
            print(label)
            old_path = xml_path + '/' + label
            new_path1 = new_file_path + '/' + 'train' + '/' + 'labels'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + label
            shutil.copy(old_path, new_path)

        for image in val_images:
            old_path = file_path + '/' + image
            new_path1 = new_file_path + '/' + 'val' + '/' + 'images'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + image
            shutil.copy(old_path, new_path)

        for label in val_labels:
            old_path = xml_path + '/' + label
            new_path1 = new_file_path + '/' + 'val' + '/' + 'labels'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + label
            shutil.copy(old_path, new_path)

        for image in test_images:
            old_path = file_path + '/' + image
            new_path1 = new_file_path + '/' + 'test' + '/' + 'images'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + image
            shutil.copy(old_path, new_path)

        for label in test_labels:
            old_path = xml_path + '/' + label
            new_path1 = new_file_path + '/' + 'test' + '/' + 'labels'
            if not os.path.exists(new_path1):
                os.makedirs(new_path1)
            new_path = new_path1 + '/' + label
            shutil.copy(old_path, new_path)


    if __name__ == '__main__':
        file_path = "D:/Files/dataSet/drone_images"
        xml_path = 'D:/Files/dataSet/drone_labels'
        new_file_path = "D:/Files/dataSet/droneData"
        split_data(file_path, xml_path, new_file_path, train_rate=0.7, val_rate=0.1, test_rate=0.2)
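The split only works if every image stays paired with its own annotation file. One way to make that explicit is to match files by base name instead of by their position in the directory listing. The sketch below is not from the original post: `pair_by_stem` and `split_pairs` are hypothetical helper names, and it uses `round()` rather than `int()` for the cut points, on the assumption that rounding is preferable when `train_rate + val_rate` carries floating-point error.

```python
import os
import random


def pair_by_stem(image_names, label_names):
    """Pair each image with the label file that shares its base name."""
    labels = {os.path.splitext(name)[0]: name for name in label_names}
    pairs = []
    for image in sorted(image_names):
        stem = os.path.splitext(image)[0]
        if stem in labels:  # images without a matching label are skipped
            pairs.append((image, labels[stem]))
    return pairs


def split_pairs(pairs, train_rate=0.7, val_rate=0.1, seed=0):
    """Shuffle the pairs, then cut them into train, val and test lists."""
    pairs = list(pairs)
    random.Random(seed).shuffle(pairs)  # local generator, reproducible
    total = len(pairs)
    train_end = round(train_rate * total)
    val_end = round((train_rate + val_rate) * total)
    return pairs[:train_end], pairs[train_end:val_end], pairs[val_end:]


if __name__ == "__main__":
    # Made-up file names, listed in mismatched order on purpose
    images = [f"img_{i}.jpg" for i in range(10)]
    labels = [f"img_{i}.xml" for i in reversed(range(10))]
    train, val, test = split_pairs(pair_by_stem(images, labels))
    print(len(train), len(val), len(test))
```

Because each pair travels through the shuffle as a single tuple, the image and label lists cannot drift apart, and an image with no annotation is dropped rather than silently paired with the wrong file.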