Train a custom model with exported data from the Intel® Geti™ platform#
You can export an annotated dataset from the Intel® Geti™ platform and use it to train your own custom algorithm. The following example shows how to do that for a single-class object detection model. We use PyTorch Lightning to train a Faster R-CNN model.
Preparation#
Export the dataset in COCO format, and unzip it to a directory named data.
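The code snippets below assume the following imports. This is a minimal sketch; you may need to adjust it to match your PyTorch, torchvision and PyTorch Lightning versions.

import random
from pathlib import Path
from typing import Union

import pytorch_lightning as pl
import torch
import torchvision
from torch.utils.data import DataLoader, random_split
from torchvision.datasets import CocoDetection
from torchvision.models.detection import FasterRCNN_ResNet50_FPN_Weights
from torchvision.models.detection.faster_rcnn import FastRCNNPredictor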
Create PyTorch Dataset#
COCO annotations are stored in a JSON file. You can find it in data/annotations/instances_default.json.
There are five root items in the JSON file:
Licenses
Info
Categories
Images
Annotations
In this tutorial, we will focus on images and annotations. Each entry in images contains the filename of the image and an image_id. Every annotation contains the following items: id, image_id, category_id, segmentation, area, bbox and iscrowd. See the COCO format documentation for more details.
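As a quick sanity check, you can load the annotation file and inspect its structure. This is a minimal sketch, assuming the dataset was unzipped to data as described above:

import json

with open("data/annotations/instances_default.json") as annotation_file:
    coco = json.load(annotation_file)

# Top-level keys: licenses, info, categories, images, annotations
print(coco.keys())
# A single annotation with id, image_id, category_id, segmentation, area, bbox, iscrowd
print(coco["annotations"][0])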
Since we use single-label detection, the category_id is always 1 and we do not have segmentation annotations. The annotation field we are interested in is bbox. The COCO format specifies bbox elements as [x, y, width, height]. The Faster R-CNN network expects bounding box annotations in [x1, y1, x2, y2] format, so you need to create a function to convert the boxes.
def convert_boxes(annotations: list) -> list:
    """
    Convert annotation boxes in COCO format with absolute coordinates [x, y, width, height]
    to boxes in [x1, y1, x2, y2] format.

    :param annotations: annotations for one image, in COCO format
    :return: list of converted boxes
    """
    boxes = []
    for annotation in annotations:
        xmin, ymin, width, height = annotation["bbox"]
        xmax = xmin + width
        ymax = ymin + height
        boxes.append([xmin, ymin, xmax, ymax])
    return boxes
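For example, a COCO box [10, 20, 30, 40] (x = 10, y = 20, width = 30, height = 40) is converted to [10, 20, 40, 60]:

boxes = convert_boxes([{"bbox": [10, 20, 30, 40]}])
print(boxes)  # [[10, 20, 40, 60]]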
TorchVision contains a CocoDetection dataset for loading data in COCO format, so we do not have to write our own data loading code. We just override the __getitem__ method to return boxes in the format expected by the network:
class CustomCocoDetection(CocoDetection):
    """
    Custom CocoDetection dataset that returns boxes in [x1, y1, x2, y2] format
    """

    def __getitem__(self, index):
        image, annotations = super().__getitem__(index)
        boxes = convert_boxes(annotations)
        categories = [annotation["category_id"] for annotation in annotations]
        if len(boxes) > 0:
            labels = {
                "boxes": torch.as_tensor(boxes),
                # For this single-class dataset every category_id is 1
                "labels": torch.as_tensor([1 for _ in range(len(boxes))]),
            }
        else:
            # Images without annotations are filtered out later in the DataModule
            labels = []
        return image, labels
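With the exported dataset unzipped to data, you can instantiate the dataset and inspect a single sample. A minimal sketch, assuming the first sample contains at least one annotation:

dataset = CustomCocoDetection(
    root="data/images/default",
    annFile="data/annotations/instances_default.json",
)
image, target = dataset[0]
print(target["boxes"])   # tensor of shape (num_boxes, 4) in [x1, y1, x2, y2] format
print(target["labels"])  # tensor of ones, one entry per box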
Create PyTorch Lightning DataModule and Model#
You need to create a PyTorch Lightning DataModule which contains DataLoaders for the CustomCocoDetection Dataset.
class DataModule(pl.LightningDataModule):
    """
    DataModule for the CustomCocoDetection dataset

    :param data_dir: Path to directory which contains subdirectories "annotations" and "images"
    :param batch_size: Batch size
    """

    def __init__(self, data_dir: Union[str, Path], batch_size: int):
        super().__init__()
        self.data_dir = Path(data_dir)
        self.batch_size = batch_size

    def setup(self, stage=None):
        # Fix the seed so the train/validation split is reproducible
        random.seed(1.414213)
        ds = CustomCocoDetection(
            root=self.data_dir / "images/default",
            annFile=self.data_dir / "annotations/instances_default.json",
        )
        # Skip images without annotations
        ds = [item for item in ds if len(item[1]) > 0]
        # Use 80% of the dataset as training data, the rest for validation
        train_size = int(len(ds) * 0.8)
        val_size = len(ds) - train_size
        self.dataset_train, self.dataset_val = random_split(ds, [train_size, val_size])

    def train_dataloader(self):
        return DataLoader(
            self.dataset_train,
            batch_size=self.batch_size,
            shuffle=True,
            # Images in a batch can have different numbers of boxes, so collate
            # into tuples of (images, targets) instead of stacking tensors
            collate_fn=lambda x: tuple(zip(*x)),
        )
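The setup method also creates a validation split, but only the training data is consumed in this example. If you want Lightning to run validation as well, a val_dataloader can be added in the same way. This is a sketch, not part of the original example, and it also requires a validation_step on the model below:

    # Additional method inside the DataModule class
    def val_dataloader(self):
        return DataLoader(
            self.dataset_val,
            batch_size=self.batch_size,
            shuffle=False,
            collate_fn=lambda x: tuple(zip(*x)),
        )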
The PyTorch Lightning module wraps the Faster R-CNN model. When targets are provided in training mode, the torchvision model returns a dictionary of losses, which the training step sums into a single scalar. In this simple example, we only provide a training step; see the PyTorch Lightning documentation for more options.
class DetectionModel(pl.LightningModule):
    def __init__(self):
        super().__init__()
        # Start from Faster R-CNN weights pretrained on COCO
        self.weights = FasterRCNN_ResNet50_FPN_Weights.COCO_V1
        model = torchvision.models.detection.fasterrcnn_resnet50_fpn(
            weights=self.weights
        )
        num_classes = 2  # 1 class (dog) + background
        # Replace the pretrained box predictor head with one for our number of classes
        in_features = model.roi_heads.box_predictor.cls_score.in_features
        model.roi_heads.box_predictor = FastRCNNPredictor(in_features, num_classes)
        self._model = model

    def forward(self, x):
        images, targets = x
        # Apply the preprocessing transforms that belong to the pretrained weights
        preprocess = self.weights.transforms()
        images = list(preprocess(image) for image in images)
        targets = [{k: v for k, v in t.items()} for t in targets]
        return self._model(images, targets)

    def configure_optimizers(self):
        optimizer = torch.optim.Adam(self._model.parameters())
        return optimizer

    def training_step(self, batch, batch_idx):
        # In training mode the model returns a dictionary of losses
        loss_dict = self.forward(batch)
        total_loss = sum(loss for loss in loss_dict.values())
        # Print the combined loss so training progress is visible
        print(total_loss)
        return total_loss
Train the model#
To train the model, you need to create a DetectionModel() and a DataModule() instance, together with a Trainer. The Trainer.fit() method trains the model with the provided dataset, and save_checkpoint() saves the model checkpoint so you can use it later.
model = DetectionModel()
data = DataModule(data_dir="data", batch_size=4)
trainer = pl.Trainer(max_epochs=5)
trainer.fit(model, data)
trainer.save_checkpoint("model.pth")
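The saved checkpoint can be loaded back later to continue working with the trained model, for example with Lightning's load_from_checkpoint. A minimal sketch:

# Restore the trained DetectionModel from the saved checkpoint
model = DetectionModel.load_from_checkpoint("model.pth")
model.eval()

Note that the forward method defined above expects an (images, targets) tuple, so for pure inference you may want to call the wrapped torchvision model directly.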