
Simpler solutions for complex image classification problems
CVB Polimago, part of the Common Vision Blox machine vision toolkit from STEMMER IMAGING, can be used in many of the same applications as neural networks. It uses ridge regression, a supervised learning method, for search and classification in machine vision applications. Supervised learning, in this context, means that prior knowledge is built into the training images: the user marks typical examples of the features to be classified with a region of interest. This allows the CVB Polimago algorithm to generate a function that produces the desired output.
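To make the idea of ridge regression concrete, here is a minimal, generic sketch in plain Python. It is not the CVB Polimago implementation, whose internals are not described here. It only shows the textbook technique: choose weights w that minimise ||Xw - y||^2 + lam * ||w||^2, which amounts to solving (X^T X + lam * I) w = X^T y. The function names, the toy feature vectors and the value of lam are all assumptions made for illustration.

```python
# Generic ridge regression sketch (NOT the CVB Polimago implementation).
# Solves the regularised normal equations (X^T X + lam * I) w = X^T y.

def ridge_fit(X, y, lam):
    """Return weights w minimising ||Xw - y||^2 + lam * ||w||^2."""
    n = len(X[0])
    # Build A = X^T X + lam * I and b = X^T y.
    A = [[sum(row[i] * row[j] for row in X) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    b = [sum(row[i] * t for row, t in zip(X, y)) for i in range(n)]
    # Gaussian elimination with partial pivoting.
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(A[r][col]))
        A[col], A[pivot] = A[pivot], A[col]
        b[col], b[pivot] = b[pivot], b[col]
        for r in range(col + 1, n):
            factor = A[r][col] / A[col][col]
            for c in range(col, n):
                A[r][c] -= factor * A[col][c]
            b[r] -= factor * b[col]
    # Back substitution.
    w = [0.0] * n
    for i in reversed(range(n)):
        s = sum(A[i][j] * w[j] for j in range(i + 1, n))
        w[i] = (b[i] - s) / A[i][i]
    return w


def ridge_predict(w, x):
    """Linear score; here its sign is read as the predicted class."""
    return sum(wi * xi for wi, xi in zip(w, x))


# Hypothetical toy data: [bias, feature_1, feature_2] per labelled example.
# Labels +1 / -1 stand in for two classes marked by the user.
X = [[1.0, 0.9, 0.1], [1.0, 0.8, 0.2], [1.0, 0.2, 0.9], [1.0, 0.1, 0.8]]
y = [1.0, 1.0, -1.0, -1.0]

w = ridge_fit(X, y, lam=0.1)
score_a = ridge_predict(w, [1.0, 0.85, 0.15])  # resembles the +1 examples
score_b = ridge_predict(w, [1.0, 0.15, 0.85])  # resembles the -1 examples
print(score_a > 0, score_b < 0)
```

The penalty term lam keeps the weights small, which is one reason ridge regression can behave sensibly with few training examples. How a real tool extracts features from image regions is a separate question that this sketch does not address.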
Crucially, CVB Polimago typically requires just 20 to 100 training images, whereas CNNs could require 500 training images per class as well as 500 good ones. As an example, for an OCR application with alphanumerics A-Z and 0-9, a CNN could require 36 x 500 = 18,000 training images.
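The OCR arithmetic can be checked with a few lines of Python. The CNN figure of 500 images per class comes from the paragraph above. The comparison range of 20 to 50 images per class is the one quoted for recent CVB Polimago applications elsewhere in this article, and applying it to a 36-class alphanumeric set is an assumption made purely for illustration.

```python
import string

# 26 upper-case letters plus 10 digits gives the 36 OCR classes.
classes = string.ascii_uppercase + string.digits
num_classes = len(classes)

cnn_images_per_class = 500                  # figure quoted in the text
cnn_total = num_classes * cnn_images_per_class

# Assumed per-class range, borrowed from the application examples.
polimago_low = num_classes * 20
polimago_high = num_classes * 50

print(num_classes, cnn_total)               # 36 18000
print(polimago_low, polimago_high)          # 720 1800
```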
Smaller training sets bring a number of significant benefits. In particular, training times for Polimago will be much shorter than for CNNs: Polimago has typically been trained in 5 to 20 minutes, compared to hours for a CNN. It can also take a long time simply to label the required features in the larger training sets needed for CNNs. If an iterative training process is required to evaluate different parameters, CNNs become even more unwieldy.
CVB Polimago is designed to run on a CPU and achieves speeds comparable to a CNN using GPU acceleration. Since Polimago does not require a GPU, it can be used on a compact PC rather than a 19-inch rack-mounted PC. Typical execution times for a Polimago search are of the order of tens of milliseconds, which is comparable to GPU-accelerated neural network speeds, while classification-only tasks run much faster, often in under a millisecond.
CVB Polimago is available at significantly lower cost than many commercial CNN tools, and has the added benefit that in CVB 2019 (due for release Q3 2019) it will also be available for Linux (on Intel and ARM platforms) for the first time. This means that it can also be used in embedded vision applications.
Recent applications for CVB Polimago have included OCR, detecting incomplete biscuits, checking whether coated nuts are completely coated, and classifying bolts that are similar in size and shape. In each of these applications, training sets of between 20 and 50 images per class were required, with classification execution times of around 7 ms. For an application involving the identification of chicken wings, thighs, drumsticks and breasts, 14 training images were used in each set, with a classification time of 1 ms.