Labels Management#

Important

For any anomaly project, you cannot edit the labels once they have been created.

When you create a project, you must define labels to complete project creation. After the project is created, you can modify or extend the labels in the Labels view inside your project.

There might be several reasons why you would want to edit the labels:

  1. While annotating a dataset, you discover a class that you did not plan for, whether or not it appeared before, e.g. images in your dataset showed buildings, but at the time you either did not need to annotate them or could not recognize them.

  2. You want to refine existing classes with subcategories that you did not foresee needing, e.g. you identified an animal kingdom group, mammals, and you need to place the dog and cat labels in the mammals group and add another label, parrot.

  3. You want to label a new concept (group), e.g. you labeled cars of every brand with a single “car” label and now want to extend your model to learn the car's brand and model, so you create a new group “Z Brand” and add the labels “Model Y” and “Model X” to it.

  4. You want to remove a label, or merge it with a new or existing label, e.g. if you no longer want the model to recognize an object, you can remove the label connected to that object. Likewise, you may predefine a large number of labels without knowing which ones will actually occur; you can later remove the labels that were never used for annotation.
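The editing scenarios above can be pictured with a small in-memory model. This is only an illustrative sketch: `LabelGroup`, `add_label`, and `remove_label` are hypothetical names invented here, not part of the Intel® Geti™ platform or its SDK.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class LabelGroup:
    """Hypothetical container: a named group holding label names."""
    name: str
    labels: List[str] = field(default_factory=list)


def add_label(group: LabelGroup, label: str) -> None:
    """Add a label to the group, rejecting duplicates."""
    if label in group.labels:
        raise ValueError(f"label {label!r} already exists in group {group.name!r}")
    group.labels.append(label)


def remove_label(group: LabelGroup, label: str) -> None:
    """Remove a label from the group; raise if it is not there."""
    if label not in group.labels:
        raise ValueError(f"label {label!r} not found in group {group.name!r}")
    group.labels.remove(label)


# Scenario 3: create a new group and add labels to it.
z_brand = LabelGroup("Z Brand")
add_label(z_brand, "Model Y")
add_label(z_brand, "Model X")

# Scenario 4: remove a label that was never used for annotation.
remove_label(z_brand, "Model X")
```

In the platform itself these operations are performed in the Labels view; the sketch only shows the shape of the data being edited.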

Label definition#

In the realm of artificial intelligence and machine learning, a Label signifies an object of interest that a project aims to identify or classify. For instance, in object detection projects, if the goal is to recognize humans within images, “Human” could be designated as a label. A label is characterized by several attributes:

  • Name: A descriptive identifier for the label, such as “Human”.

  • Color: A visual marker used to differentiate this label from others in visual representations.

  • Hotkey: A keyboard shortcut assigned to this label for quick annotation by the user.
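As a rough illustration, the three attributes can be modeled as a small record. The `Label` class and `find_hotkey_conflicts` helper below are hypothetical and are not the platform's API; the hex color format and the idea of checking for duplicate hotkeys are assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class Label:
    """Hypothetical label record with the three attributes listed above."""
    name: str                      # descriptive identifier, e.g. "Human"
    color: str                     # assumed format: hex string such as "#FF0000"
    hotkey: Optional[str] = None   # keyboard shortcut; None if unassigned


def find_hotkey_conflicts(labels: Iterable[Label]) -> Dict[str, List[str]]:
    """Return hotkeys (case-insensitive) that are assigned to more than one label."""
    by_key: Dict[str, List[str]] = {}
    for label in labels:
        if label.hotkey is None:
            continue
        by_key.setdefault(label.hotkey.lower(), []).append(label.name)
    return {key: names for key, names in by_key.items() if len(names) > 1}


labels = [
    Label("Human", "#FF0000", "h"),
    Label("Vehicle", "#00FF00", "v"),
    Label("Horse", "#0000FF", "H"),
]
conflicts = find_hotkey_conflicts(labels)
```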

Note

In addition to labels defined by users, the Intel® Geti™ platform will also add an Empty Label definition to the project depending on the project type. The empty label is used to identify instances where the object of interest is absent from the data.

Empty Label#

In machine learning, particularly in tasks like segmentation and detection, the presence of images without any objects of interest (negative samples) can significantly contribute to the robustness of an algorithm. The “Empty label” concept is introduced to explicitly mark these negative samples, informing the system that an image lacks the objects of interest. This inclusion enriches the training dataset, allowing the algorithm to better distinguish between the presence and absence of target objects.

For segmentation and detection projects, where annotations are specific to regions within an image, it’s crucial for the model to learn from both positive samples (images with objects of interest) and negative samples. Assigning an empty label to images without target objects ensures that these negative samples are recognized and utilized during training.

The principle of an empty label extends to Multi select (multilabel) classification projects as well. Unlike single-label scenarios where an image is associated with one specific label, multilabel classifications can have images tagged with anywhere from zero to many labels. Here, the empty label serves a critical function in categorizing images that do not fit any of the defined classes, supporting the model in understanding the absence of relevant labels.

Not all project types incorporate the empty label concept:

  • Single select Classification: Projects with a multiclass framework (selecting exactly one option among many) do not accommodate an empty label, because every image must be assigned one of the available classes. To include a “none of the above” option, such projects would need to adopt a multilabel approach.

  • Anomaly Type Projects: These projects, focused on identifying deviations from normal patterns, do not use empty labels due to their distinct objectives and methodologies.
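The rules in this section can be summarized as a simple lookup. The project type strings below are hypothetical identifiers chosen for the example, not values used by the platform.

```python
# Project types that get an empty label, per the description above.
EMPTY_LABEL_PROJECT_TYPES = {
    "detection",
    "segmentation",
    "classification_multi_label",
}

# Project types that do not use an empty label.
NO_EMPTY_LABEL_PROJECT_TYPES = {
    "classification_single_label",
    "anomaly",
}


def uses_empty_label(project_type: str) -> bool:
    """Return True if the given (hypothetical) project type uses an empty label."""
    if project_type in EMPTY_LABEL_PROJECT_TYPES:
        return True
    if project_type in NO_EMPTY_LABEL_PROJECT_TYPES:
        return False
    raise ValueError(f"unknown project type: {project_type!r}")
```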

Dealing with new labels and concept drift in the Intel® Geti™ platform#

When you add a new label in the Intel® Geti™ platform, this can trigger a series of processes that ensure the validity and relevance of previously annotated images. The platform has a built-in mechanism to determine whether, and to what extent, already annotated images need to be reconsidered in light of the newly added label.

This is particularly pertinent in the case of multilabel classification, where the newly added label could potentially be relevant to some of the pre-existing annotated data. Consequently, these media are earmarked as ‘to revisit’, requiring reassessment.

This status remains in place until you re-submit the annotation via the user interface, with or without changes. For training purposes, media marked as ‘to revisit’ can still be included in the training set. However, the algorithm deliberately disregards any label whose state is still uncertain.
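The ‘to revisit’ lifecycle described above can be sketched as follows. This is a simplified model under stated assumptions: `AnnotatedMedia`, `add_project_label`, and `resubmit` are hypothetical names, and the platform's actual logic for deciding which media need revisiting may be more selective than marking every annotated item.

```python
from dataclasses import dataclass, field
from typing import Iterable, Set


@dataclass
class AnnotatedMedia:
    """Hypothetical record for one annotated image in a multilabel project."""
    media_id: str
    labels: Set[str] = field(default_factory=set)      # labels confirmed present
    to_revisit: Set[str] = field(default_factory=set)  # labels whose state is unknown


def add_project_label(media_items: Iterable[AnnotatedMedia], new_label: str) -> None:
    """Assumption: a new label makes every existing annotation uncertain for that label."""
    for media in media_items:
        media.to_revisit.add(new_label)


def resubmit(media: AnnotatedMedia, labels: Iterable[str]) -> None:
    """Re-submitting the annotation, with or without changes, clears the status."""
    media.labels = set(labels)
    media.to_revisit.clear()


shirt = AnnotatedMedia("img-001", {"shirt"})
add_project_label([shirt], "vintage")   # shirt is now marked 'to revisit' for "vintage"
resubmit(shirt, {"shirt"})              # re-submitted unchanged; status is cleared
```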

To illustrate, consider a scenario where you are working on a clothing classification problem and decide to add a new label for “vintage”. If you train a model before revisiting the annotated dataset, the loss function in OTX (OpenVINO™ Training Extensions) will not treat the “vintage” label as either present or absent in a given image, but will instead skip over it entirely.
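The effect of skipping a label can be shown with a generic masked binary cross-entropy. This is not the OTX implementation, only a minimal sketch of the idea; the function name `masked_bce` and the `known` flag list are assumptions made for the example.

```python
import math
from typing import Sequence


def masked_bce(
    probs: Sequence[float],
    targets: Sequence[int],
    known: Sequence[bool],
) -> float:
    """Mean binary cross-entropy over labels whose state is known.

    Labels with known[i] == False (for example, still 'to revisit')
    contribute nothing to the loss, whatever the model predicts for them.
    """
    eps = 1e-12  # clamp probabilities to avoid log(0)
    total = 0.0
    count = 0
    for p, t, k in zip(probs, targets, known):
        if not k:
            continue
        p = min(max(p, eps), 1.0 - eps)
        total += -(t * math.log(p) + (1 - t) * math.log(1.0 - p))
        count += 1
    return total / count if count else 0.0


# Label order: ["shirt", "dress", "vintage"]; "vintage" has not been revisited yet.
targets = [1, 0, 0]           # the "vintage" target is a placeholder and is ignored
known = [True, True, False]
loss_a = masked_bce([0.9, 0.2, 0.7], targets, known)
loss_b = masked_bce([0.9, 0.2, 0.1], targets, known)
```

Here `loss_a` and `loss_b` are equal, because the prediction for the masked “vintage” label never enters the sum.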

In MLOps terms, this feature helps combat concept drift, keeping the model robust and adaptable as labeling needs evolve.