Tests Management#

The goal of testing is to evaluate how well the model performs on data it has not seen before, in order to simulate how it will behave in real-world scenarios. Therefore, it is important to ensure that the test data is representative of the data the model is likely to encounter in the real world.

Testing is important for AI models for several reasons. It allows you to:

  • assess the accuracy of the model (see the sketch after this list)

  • identify any areas where the model is not performing well

  • improve the model by tweaking the algorithms or adding more data
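
As a minimal illustration of the first point, the sketch below measures accuracy by comparing model predictions against ground-truth labels on a held-out test set. The `model_predict` function, the labels, and the image IDs are hypothetical placeholders, not part of this platform.

```python
# Minimal sketch: measuring accuracy on held-out test data.
# `model_predict`, the labels, and the image IDs are hypothetical
# stand-ins, not an API of this platform.

def accuracy(predictions, ground_truth):
    """Fraction of predictions that match the ground-truth labels."""
    assert len(predictions) == len(ground_truth)
    correct = sum(p == g for p, g in zip(predictions, ground_truth))
    return correct / len(ground_truth)

# Hypothetical test set: (image_id, true_label) pairs the model
# never saw during training.
test_set = [("img_001", "cat"), ("img_002", "dog"), ("img_003", "cat")]

# Stand-in for running the trained model on one image.
def model_predict(image_id):
    return {"img_001": "cat", "img_002": "cat", "img_003": "cat"}[image_id]

preds = [model_predict(image_id) for image_id, _ in test_set]
truth = [label for _, label in test_set]
print(f"accuracy: {accuracy(preds, truth):.0%}")  # -> accuracy: 67%
```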

Run Tests#

On the Tests screen, you can perform two types of tests:

  • single image test

  • tests on testing sets

Single Image Test#

To run a single image test, you do not need a testing set; however, you do need a trained model. Simply upload an image and get a prediction right away.

To get a prediction for a single image, click Upload in the single image test box. Once you upload the image, you will immediately get the prediction, provided that you have a trained model. You can also change the canvas settings to improve the view of the predictions once the image is uploaded.

Tests on Testing Sets#

Note

Before you can run this test, you need a trained model and an annotated testing set.

When you press Run test, a dialog box for configuring the test appears. There, choose a model and its version to run against a selected testing set. Once you run the test, wait for the model test job to finish. When it is completed, click on the test result of the executed test to see its details.

You will see the properties of the test:

  • Model

  • Architecture

  • Testing set name

  • Creation date

  • Number of labels

  • Number of images/frames

  • Score of the model

Below the properties section, you can see two buckets that separate the predictions based on the model score. The default score threshold is 50%, but you can change it by moving the slider. The right bucket displays the media items above the specified model score, and the left bucket displays the media items below it.

You can sort the media items by model score in ascending or descending order, or filter them by a label selection, as in the sketch below.
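
The bucketing, sorting, and filtering described above can be pictured with a short sketch. The 50% default threshold follows the description; the record structure, the field names, and which bucket an item exactly at the threshold falls into are illustrative assumptions.

```python
# Illustrative sketch of the two score buckets described above.
# The record structure and field names are assumptions; only the
# thresholding behavior follows the documentation.

predictions = [
    {"item": "frame_01.jpg", "label": "car",    "score": 0.92},
    {"item": "frame_02.jpg", "label": "person", "score": 0.41},
    {"item": "frame_03.jpg", "label": "car",    "score": 0.67},
    {"item": "frame_04.jpg", "label": "person", "score": 0.18},
]

threshold = 0.50  # default threshold; moving the slider changes this value

# Right bucket: items at or above the threshold; left bucket: below it.
# (Which side an item exactly at the threshold lands on is an assumption.)
right_bucket = [p for p in predictions if p["score"] >= threshold]
left_bucket = [p for p in predictions if p["score"] < threshold]

# Sort a bucket by model score, descending (set reverse=False for ascending).
right_bucket.sort(key=lambda p: p["score"], reverse=True)

# Filter by a label selection, e.g. only "car" predictions.
cars = [p for p in predictions if p["label"] == "car"]

print([p["item"] for p in right_bucket])  # ['frame_01.jpg', 'frame_03.jpg']
print([p["item"] for p in left_bucket])   # ['frame_02.jpg', 'frame_04.jpg']
print([p["item"] for p in cars])          # ['frame_01.jpg', 'frame_03.jpg']
```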