I wanted to build my own vision model for a few reasons:
- I wanted to learn how
- In my limited experience with OpenALPR, it looked like it was missing some license plates that seemed fairly readable to my eyes – could I possibly do better training my own model?
- Because of the way it is built, I knew I wouldn’t be able to make OpenALPR run faster on my Pi by offloading image processing to a VPU like the Myriad X in my OAK-D camera.
- The gen2-license-plate-recognition example provided by Luxonis, built from Intel’s Model Zoo, does not work well with Ontario license plates
The first step was building a library of images to train a model with. I sorted through hundreds of images I’d taken on rides in September and selected 65 where the photos were clear and there were license plates in the frame. As this was my first attempt, I wasn’t going to worry about sub-optimal situations (out of focus, low light, overexposed, etc.). I then had to annotate the images: draw boxes around the license plates in the photos and “tag” them as plates. I looked at a couple of tools. I started with Microsoft’s VoTT, but ended up using LabelImg. LabelImg was efficient, with great keyboard shortcuts, and it used the mouse scroll wheel to control zoom, which was great for labeling small details in larger photos.
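For anyone curious what those annotations turn into, here is a minimal sketch of converting one box into the one-line-per-box text format that YOLO training expects (class id, then box centre and size as fractions of the image size). It assumes the annotations were saved as Pascal VOC XML, which is one of the formats LabelImg can write; the post does not say which format was actually used. The class name `plate`, the 4000x3000 image size, and the function name are all made up for illustration.

```python
import xml.etree.ElementTree as ET


def voc_to_yolo_lines(xml_text, class_names):
    """Convert one Pascal VOC annotation (XML text) into YOLO label lines.

    Each returned line is "class_id x_center y_center width height",
    with the four box values normalised to the 0..1 range.
    """
    root = ET.fromstring(xml_text)
    size = root.find("size")
    img_w = float(size.find("width").text)
    img_h = float(size.find("height").text)

    lines = []
    for obj in root.iter("object"):
        class_id = class_names.index(obj.find("name").text)
        box = obj.find("bndbox")
        xmin = float(box.find("xmin").text)
        ymin = float(box.find("ymin").text)
        xmax = float(box.find("xmax").text)
        ymax = float(box.find("ymax").text)

        # YOLO wants the box centre and size, relative to the image size.
        x_center = (xmin + xmax) / 2 / img_w
        y_center = (ymin + ymax) / 2 / img_h
        width = (xmax - xmin) / img_w
        height = (ymax - ymin) / img_h
        lines.append(
            f"{class_id} {x_center:.6f} {y_center:.6f} {width:.6f} {height:.6f}"
        )
    return lines


# Hypothetical annotation: one plate in the middle of a 4000x3000 photo.
SAMPLE_XML = """<annotation>
  <size><width>4000</width><height>3000</height><depth>3</depth></size>
  <object>
    <name>plate</name>
    <bndbox><xmin>1900</xmin><ymin>1450</ymin><xmax>2100</xmax><ymax>1550</ymax></bndbox>
  </object>
</annotation>"""

print(voc_to_yolo_lines(SAMPLE_XML, ["plate"]))
# prints ['0 0.500000 0.500000 0.050000 0.033333']
```

The sample box covers only 5% of the image width, which gives a feel for how small a plate is in a full-resolution photo.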
I then tried one tutorial after another, and struggled to get them to work. Many examples were set up to run on Google Colab. I found that when I followed their instructions and got to the part where I was actually training the model, Colab would time out. Colab is only intended for short interactive sessions, so perhaps it wouldn’t work for me because I was working with higher resolution images, which would take more computing time.
What I ended up doing was manually running the steps in the Train_YoloV3.ipynb notebook from pysource, directly in the console. As my home PCs don’t have dedicated GPUs, I set up a p3.2xlarge Amazon EC2 instance to run the training. If memory serves, training against those 65 images, using the settings from that tutorial, took a couple of hours.
I took the model I created from my September rides and tested it against images from my October rides. I was surprised by how well it worked.
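The check above was done by eye. One common way to put a number on “how well it worked” is intersection over union (IoU) between a predicted box and a hand-drawn one. The sketch below is not something I ran for this post; the function name and the `(xmin, ymin, xmax, ymax)` pixel box layout are assumptions made for illustration.

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    # Corners of the overlapping rectangle, if there is one.
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])

    # Clamp to zero so boxes that do not overlap get no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# Hypothetical boxes: a prediction shifted half a box width from the annotation.
print(iou((0, 0, 10, 10), (5, 0, 15, 10)))
```

An IoU of 1.0 means the boxes match exactly and 0.0 means they do not overlap at all; a threshold such as 0.5 is often used to decide whether a detection counts as a hit.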
Since training that model, I’ve been on the lookout for an NVIDIA video card I can use for training at home. It’s hard to know for sure, but it seems it wouldn’t take long to recoup the cost of a GPU compared with training on an EC2 instance in the cloud, and I can always resell a GPU. I’ve tried training a few times on the fastest CPU I have in the house (a Ryzen 5 3400G), and it just doesn’t seem feasible. I haven’t seen a cheap GPU option, and prices just seem to have been going up since I started looking in November.
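The “recoup the cost” reasoning can be sketched as simple arithmetic: the GPU pays for itself once the cloud bill for the same training hours would exceed what the card costs after resale. The numbers below are placeholders, not real GPU prices or EC2 rates, and the sketch ignores electricity and the chance that a home GPU trains slower than the cloud instance.

```python
def break_even_training_hours(gpu_price, resale_value, cloud_rate_per_hour):
    """Hours of cloud training that cost as much as owning the GPU.

    The net cost of ownership is taken as purchase price minus resale value.
    """
    if cloud_rate_per_hour <= 0:
        raise ValueError("cloud_rate_per_hour must be positive")
    net_cost = gpu_price - resale_value
    return net_cost / cloud_rate_per_hour


# Placeholder figures only: a 600 card resold for 300, versus 3.00 per cloud hour.
print(break_even_training_hours(600.0, 300.0, 3.0))
# prints 100.0
```

With a training run taking a couple of hours, those placeholder figures would work out to roughly fifty runs before the card pays for itself.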
I don’t have usable code or a useful model to share at this point. For now, I’m mostly learning and trying to figure out the process.