The 7 steps to create a successful computer vision PoC

There are many pitfalls and challenges to realizing the value of computer vision in a business setting – from solving the wrong problem to not collecting enough data. In this article, I will review seven key steps to making your computer vision PoC (Proof of Concept) a success.

The seven steps to creating a successful computer vision PoC:

  1. Identify the business problem
  2. Define the success criteria
  3. Determine the appropriate computer vision techniques
  4. Collect and label training and test images
  5. Train and evaluate model
  6. Deploy and test
  7. Iterate on the solution

Step 1: Identify the business problem

Many vision projects fail to achieve value because they address the wrong problem. To be successful, a vision project should have a clear business goal and benefit. You should be able to describe the goal and benefit in one to two sentences. For example, the goal may be to reduce the number of defective products that leave a factory or reduce the number of stockouts on a store shelf. If the business problem or benefits are unclear, consider spending time with business partners who are close to the problem.

Think about how to quantify the business benefit of solving the problem. A precise return on investment calculation is not necessary at this stage. What is important is estimating the order of magnitude of the benefit. Think about the cost that may be avoided or revenue that may be gained if the problem can be solved. The better thought out the business impact, the easier it will be to get funding and approval to move your proof of concept into production. One way to estimate the scale of the potential benefit is to define three buckets of relative size and ask a business partner to help classify the problem into one of the buckets. For instance, $50k-$100k, $200k-$500k, and $1M-$2M.
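As a rough illustration of this kind of order-of-magnitude estimate, here is a minimal sketch. The bucket names, the `estimate_annual_benefit` formula, and every input number are hypothetical assumptions made up for the example; your own business partners will have better inputs.

```python
# Hypothetical bucket boundaries in dollars, taken from the example ranges above.
BUCKETS = [
    ("small", 50_000, 100_000),
    ("medium", 200_000, 500_000),
    ("large", 1_000_000, 2_000_000),
]


def estimate_annual_benefit(units_per_year, defect_rate, share_caught, cost_per_defect):
    """Rough avoided cost per year: defects produced x share caught x cost of each."""
    return units_per_year * defect_rate * share_caught * cost_per_defect


def nearest_bucket(benefit):
    """Return the name of the bucket whose range is closest to the estimate."""
    def distance(bucket):
        _, low, high = bucket
        if low <= benefit <= high:
            return 0
        return min(abs(benefit - low), abs(benefit - high))

    return min(BUCKETS, key=distance)[0]


# Made-up inputs: 1M units a year, 1% defective, half of those caught, $60 per defect.
benefit = estimate_annual_benefit(
    units_per_year=1_000_000, defect_rate=0.01, share_caught=0.5, cost_per_defect=60
)
print(f"Estimated benefit: ${benefit:,.0f} ({nearest_bucket(benefit)} bucket)")
```

The point is not the exact figure but which bucket the problem lands in, which is usually enough to decide whether a PoC is worth funding.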

Another important consideration is how technology fits into the overall context of the business process. It is rare that a vision solution will completely replace an existing process. It is much more likely that this technology will improve the existing process by improving quality or enabling new insights. If a person is part of the existing process, consider how technology may enable them to gain more focus in their work, work more efficiently, or even work more safely.

Step 2: Define the success criteria

If you spend time thinking about the business process and how to quantify the business benefit, then this step should be straightforward. The goal here is to translate the business outcome into simple success criteria that can be used to measure the effectiveness of the solution.

For example, if the goal is to reduce the number of defective parts that leave a factory, then success can be measured by the number of additional defective parts that can be identified with the vision system. It is important at this stage to be realistic in setting goals for your PoC. Remember that the purpose of a PoC is to prove the feasibility of your project, not achieve production-level performance.
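One way to make such a criterion concrete is to write it down as a small check. The sketch below is only an illustration: the `PocResult` fields, the 40% target, and the counts are hypothetical and not taken from any real project.

```python
from dataclasses import dataclass


@dataclass
class PocResult:
    total_defects: int               # all defective parts in the evaluation sample
    defects_caught_today: int        # caught by the existing process alone
    defects_caught_with_vision: int  # caught once the vision system is added


def additional_defects_caught(result):
    """Number of extra defective parts identified thanks to the vision system."""
    return result.defects_caught_with_vision - result.defects_caught_today


def meets_criterion(result, min_share_of_escapes):
    """True if the vision system catches at least the target share of defects that escape today."""
    escaping = result.total_defects - result.defects_caught_today
    if escaping == 0:
        return False  # nothing escapes in this sample, so there is nothing to improve
    return additional_defects_caught(result) / escaping >= min_share_of_escapes


result = PocResult(total_defects=100, defects_caught_today=60, defects_caught_with_vision=80)
print(additional_defects_caught(result))
print(meets_criterion(result, min_share_of_escapes=0.4))
```

A modest target such as this keeps the PoC focused on feasibility rather than production-level performance.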

Step 3: Determine the appropriate computer vision techniques

Before collecting data or selecting an algorithm, it is important to identify the techniques that best match the problem. Identifying the right techniques up front will clarify your data requirements and enable your team to focus during the execution phase.

There are many computer vision techniques available today. Each technique is associated with specific algorithms and performance metrics. Classification and object detection are two of the most common and versatile computer vision techniques.


Classification

In classification problems, the goal is to categorize the entire image into one or more distinct classes. This can be the right tool to reach for if your goal is to classify a product as defective using images that each capture a single product. A good example of computer vision classification is the evaluation of chest X-rays based on the degree of damage caused by disease.
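The phrase "one or more distinct classes" hides a practical choice: whether each image gets exactly one label or may get several. The sketch below shows one common way to handle each case, assuming a model that outputs one raw score per class. The class names and scores are hypothetical.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1 (one label per image)."""
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def sigmoid(score):
    """Turn one raw score into an independent probability (several labels per image)."""
    return 1.0 / (1.0 + math.exp(-score))


def predict_single_label(scores, class_names):
    """Pick the single most likely class."""
    probs = softmax(scores)
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best]


def predict_multi_label(scores, class_names, threshold=0.5):
    """Keep every class whose independent probability clears the threshold."""
    return [name for name, s in zip(class_names, scores) if sigmoid(s) >= threshold]


classes = ["ok", "scratch", "dent"]
raw_scores = [-1.5, 2.1, 1.3]  # made-up model outputs for one image

print(predict_single_label(raw_scores, classes))
print(predict_multi_label(raw_scores, classes))
```

Deciding between the two up front matters because it changes how the images need to be labeled.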

Object detection

Object detection is a great technique if the aim is to identify multiple objects in a single image. Object detection combines localization and classification into a single pipeline. These algorithms effectively draw a bounding box around the relevant object and classify the object based on a set of predefined classes. Object detection is commonly used in several applications including detection of faces in family photos and detection of license plates using toll road cameras.
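Because a detection is both a box and a class, it is usually scored on both. A widely used measure of box overlap is intersection over union (IoU). The sketch below is a minimal version, assuming boxes are given as (x_min, y_min, x_max, y_max) in pixels. The 0.5 threshold is a common convention rather than a requirement, and the example boxes are made up.

```python
def iou(box_a, box_b):
    """Intersection over union of two (x_min, y_min, x_max, y_max) boxes."""
    ax1, ay1, ax2, ay2 = box_a
    bx1, by1, bx2, by2 = box_b
    inter_w = max(0, min(ax2, bx2) - max(ax1, bx1))
    inter_h = max(0, min(ay2, by2) - max(ay1, by1))
    intersection = inter_w * inter_h
    area_a = (ax2 - ax1) * (ay2 - ay1)
    area_b = (bx2 - bx1) * (by2 - by1)
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0


def is_correct_detection(prediction, truth, iou_threshold=0.5):
    """A detection counts only if the class matches and the boxes overlap enough."""
    same_class = prediction["label"] == truth["label"]
    return same_class and iou(prediction["box"], truth["box"]) >= iou_threshold


truth = {"label": "license_plate", "box": (100, 100, 200, 150)}
prediction = {"label": "license_plate", "box": (110, 105, 210, 155)}

print(f"IoU: {iou(prediction['box'], truth['box']):.2f}")
print(is_correct_detection(prediction, truth))
```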

Step 4: Collect and label training and test images

If you plan to use a deep learning model for classification or object detection, you will likely need to collect data to train your model. Many deep learning models are available pre-trained to detect or classify a multitude of common everyday objects such as cars, people, and bicycles. If your scenario focuses on one of these common objects, then you may be able to simply download and deploy a pre-trained model. Otherwise, you will need to collect and label data to train your model.

Select a camera

Start by selecting a camera. Consider factors such as zoom, color, and number of pixels when selecting a camera. The human eye can be a good sanity check. If you are not able to identify the subject of your PoC in images from your camera, it is likely that an algorithm will also struggle.

Collect images

The next step is to capture a diverse set of images that focus on the topic of the PoC. Variety and quantity are the keys to collecting a good dataset for model training. Plan to collect at least 100-200 examples of each object class relevant to the problem. This should be enough to get an idea of the difficulty of the problem. In terms of diversity, try to capture images of the subject from multiple angles and under various lighting conditions that represent the range of operating conditions for the scenario.
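A quick script can confirm that each class has enough examples and set aside test images before training. The sketch below is one possible approach, assuming your dataset is a list of (image path, label) pairs. The file names, class names, and 20% test fraction are hypothetical.

```python
import random
from collections import Counter, defaultdict

MIN_PER_CLASS = 100  # lower end of the 100-200 guideline above


def underrepresented(labels, minimum=MIN_PER_CLASS):
    """Return the classes that have fewer examples than the minimum."""
    counts = Counter(labels)
    return {name: n for name, n in counts.items() if n < minimum}


def stratified_split(samples, test_fraction=0.2, seed=0):
    """Split (path, label) pairs class by class so both sets keep the same class mix."""
    by_class = defaultdict(list)
    for path, label in samples:
        by_class[label].append((path, label))

    rng = random.Random(seed)
    train, test = [], []
    for items in by_class.values():
        rng.shuffle(items)
        n_test = max(1, round(len(items) * test_fraction))
        test.extend(items[:n_test])
        train.extend(items[n_test:])
    return train, test


# Made-up dataset: 150 images of good parts and 60 images of defective parts.
samples = [(f"img_{i:04d}.jpg", "ok") for i in range(150)]
samples += [(f"img_{i:04d}.jpg", "defect") for i in range(150, 210)]

print(underrepresented([label for _, label in samples]))
train, test = stratified_split(samples)
print(len(train), len(test))
```

Here the check flags that the defect class is still short of the guideline, which is a signal to keep collecting before reading too much into the results.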

Label the images

Once you have collected a good set of images, you will need to label them. Several tools exist to facilitate the labeling process. These include open-source tools such as labelImg and commercial tools such as Azure Machine Learning, which support image classification and object detection labeling. For large labeling projects of 500+ images, I recommend selecting a labeling tool that supports workflow management and quality reviews. These features are essential to ensure quality and efficiency in the labeling process.
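Even with a good tool, a simple automated check can catch labeling slips early. The sketch below parses an annotation in Pascal VOC XML, one of the formats labelImg can save, and flags unknown class names and empty boxes. The sample annotation, the allowed label set, and the helper names are hypothetical.

```python
import xml.etree.ElementTree as ET

# Hypothetical annotation for one image, written in Pascal VOC XML.
SAMPLE_XML = """
<annotation>
  <filename>part_0001.jpg</filename>
  <object>
    <name>scratch</name>
    <bndbox><xmin>10</xmin><ymin>20</ymin><xmax>110</xmax><ymax>90</ymax></bndbox>
  </object>
  <object>
    <name>Scratch</name>
    <bndbox><xmin>200</xmin><ymin>40</ymin><xmax>200</xmax><ymax>95</ymax></bndbox>
  </object>
</annotation>
"""


def parse_voc(xml_text):
    """Return a list of {'label', 'box'} dicts, with boxes as (xmin, ymin, xmax, ymax)."""
    root = ET.fromstring(xml_text.strip())
    boxes = []
    for obj in root.findall("object"):
        bndbox = obj.find("bndbox")
        corners = tuple(int(bndbox.findtext(tag)) for tag in ("xmin", "ymin", "xmax", "ymax"))
        boxes.append({"label": obj.findtext("name").strip(), "box": corners})
    return boxes


def label_issues(boxes, allowed_labels):
    """Flag class names outside the agreed set and boxes with no area."""
    issues = []
    for entry in boxes:
        x_min, y_min, x_max, y_max = entry["box"]
        if entry["label"] not in allowed_labels:
            issues.append(f"unknown label: {entry['label']}")
        if x_max <= x_min or y_max <= y_min:
            issues.append(f"empty box: {entry['box']}")
    return issues


boxes = parse_voc(SAMPLE_XML)
issues = label_issues(boxes, allowed_labels={"scratch", "dent"})
for issue in issues:
    print(issue)
```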

Step 5: Train and evaluate model

Once you have a good set of labeled images, you are ready to shift to model training. Assuming you are using a deep learning computer vision model, the next step is to use transfer learning to train your model. In transfer learning, a pre-trained model is repurposed for a new scenario by freezing the pre-trained layers of the neural network and retraining only the final classification (softmax) layer. Transfer learning is a fast and effective way to train deep learning vision models with limited data. Other techniques such as fine-tuning, which also updates some or all of the pre-trained layers, can offer better performance but have larger data requirements.
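To make the idea of freezing everything except the final layer concrete, here is a deliberately tiny, dependency-free sketch. It is not how you would train a real vision model, where a deep learning framework and a genuinely pre-trained network would do this work. The fixed `FROZEN_WEIGHTS` stand in for the pre-trained layers, the four-pixel "images" are synthetic, and all names and hyperparameters are assumptions for illustration.

```python
import math
import random

# Stand-in for a pre-trained backbone. These weights are frozen: nothing below updates them.
FROZEN_WEIGHTS = [[0.5, -0.5, 0.5, -0.5], [0.25, 0.25, 0.25, 0.25]]


def extract_features(image):
    """Frozen feature extractor: a fixed linear map followed by ReLU."""
    return [max(0.0, sum(w * p for w, p in zip(row, image))) for row in FROZEN_WEIGHTS]


def softmax(scores):
    top = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def head_scores(x, weights, biases):
    return [sum(w * v for w, v in zip(row, x)) + b for row, b in zip(weights, biases)]


def train_head(features, labels, n_classes, lr=0.1, epochs=100):
    """Train only the final softmax layer with plain stochastic gradient descent."""
    weights = [[0.0] * len(features[0]) for _ in range(n_classes)]
    biases = [0.0] * n_classes
    for _ in range(epochs):
        for x, y in zip(features, labels):
            probs = softmax(head_scores(x, weights, biases))
            for c in range(n_classes):
                grad = probs[c] - (1.0 if c == y else 0.0)  # cross-entropy gradient for score c
                weights[c] = [w - lr * grad * v for w, v in zip(weights[c], x)]
                biases[c] -= lr * grad
    return weights, biases


def predict(image, weights, biases):
    scores = head_scores(extract_features(image), weights, biases)
    return max(range(len(scores)), key=scores.__getitem__)


def make_dataset(n_per_class, seed):
    """Synthetic 4-pixel 'images': class 0 is bright-dark striped, class 1 is the inverse."""
    rng = random.Random(seed)
    patterns = {0: [1, 0, 1, 0], 1: [0, 1, 0, 1]}
    data = [([p + rng.uniform(-0.1, 0.1) for p in patterns[label]], label)
            for label in patterns for _ in range(n_per_class)]
    rng.shuffle(data)
    return data


train = make_dataset(20, seed=0)
test = make_dataset(20, seed=1)

# Features are computed once because the backbone never changes.
train_features = [extract_features(image) for image, _ in train]
weights, biases = train_head(train_features, [label for _, label in train], n_classes=2)

accuracy = sum(predict(image, weights, biases) == label for image, label in test) / len(test)
print(f"test accuracy: {accuracy:.2f}")
```

Because the frozen layers never change, their outputs can be computed once and reused in every epoch, which is a large part of why this style of training is fast.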

Several tutorials and code repositories exist that support transfer learning. These include open source as well as commercial options. A couple are listed below for reference:

Step 6: Deploy and test

Once your model is trained, you are ready to deploy it. Tools such as Azure Percept make deploying AI models at the edge easier than ever. For a walkthrough guide on how to deploy a model on an edge device using Azure Percept, follow the steps outlined in the link below.

With your model deployed, you are ready to interact with it in a real-world environment. At this point you can start to solicit feedback from potential end users and start thinking about the next steps.

Step 7: Iterate on the solution

It is rare that a model works perfectly on the first attempt. Developing a machine learning model is an iterative process of trial and error. If your model is not performing well, there are many steps you can take to improve performance.

Below are some great places to start troubleshooting:

  • Experiment with learning rate: The wrong learning rate can make it difficult for a neural network to converge, leading to poor performance.
  • Check that your training labels are correct and consistent: Inconsistent labels can lead to unexpected results and make it difficult for a neural network to learn a good set of features for classification or object detection.
  • Consider collecting more training data: Look for ways to increase diversity and decrease sources of bias that may limit model performance.
  • Consider experimenting with alternative algorithms: Algorithms vary in terms of inference speed, pre-training data source, and ability to accurately detect objects.
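The learning rate point in the list above is easy to see on a toy problem. The sketch below runs plain gradient descent on f(w) = w², not on a neural network, with three hypothetical learning rates chosen only to show the effect: too small barely moves, too large diverges.

```python
def gradient_descent(learning_rate, steps=50, start=5.0):
    """Minimize f(w) = w**2, whose gradient is 2*w, and return the final loss."""
    w = start
    for _ in range(steps):
        w -= learning_rate * 2 * w
    return w ** 2


for lr in (0.001, 0.1, 1.1):
    print(f"learning rate {lr}: final loss {gradient_descent(lr):.4g}")
```

Real training curves are noisier, but the same pattern applies: if the loss stalls or blows up, trying learning rates an order of magnitude apart is a cheap first experiment.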

This is a good time to revisit the success criteria for your model. Compare your model’s performance with the KPIs that you defined earlier. Think about any gaps between the PoC's performance and those KPIs, and start to outline steps you can take to further improve performance. These steps will help highlight the path to turning your PoC into a production solution.

Can you think of a business problem that can be improved with AI? If so, there has never been a better time to build a PoC. If you follow the key steps outlined in this article, you will be well on your way to a successful project.

If you need any help, contact us to speak with an expert at Neal Analytics. From training models in the cloud to deploying solutions at the edge, we have helped many clients to build and deploy AI solutions in their businesses.