I wanted to build my own vision model for a few reasons:
- I wanted to learn how
- In my limited experience with OpenALPR, it seemed to miss some license plates that looked fairly readable to my eyes - could I do better by training my own model?
- Because of the way it is built, I knew I wouldn’t be able to get OpenALPR to run faster on my Pi by offloading image processing to a VPU like the Myriad X in my Oak D camera.
- The gen2-license-plate-recognition example provided by Luxonis, built from Intel’s Model Zoo, does not work well with Ontario license plates.
The first step was building a library of images to train a model with. I sorted through hundreds of images I’d taken on rides in September, and selected 65 where the photos were clear and there were license plates in the frame. As this was my first attempt, I wasn’t going to worry about sub-optimal situations (out of focus, low light, overexposed, etc…).

I then had to annotate the images - draw boxes around the license plates in the photos, and “tag” them as plates. I looked at a couple of tools - I started with Microsoft’s VoTT, but ended up using labelImg. labelImg was efficient, with great keyboard shortcuts, and used the mouse scroll wheel to control zoom, which was great for labeling small details in larger photos.
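By default, labelImg writes one Pascal VOC-style XML file per image, with each box stored as an `<object>` containing a `name` tag and a `bndbox` with pixel coordinates. As a rough sketch (the file path and "plate" label here are just my assumptions, not output from my actual dataset), a few lines of Python can pull the boxes back out, which is handy for sanity-checking your annotations before training:

```python
import xml.etree.ElementTree as ET

def load_boxes(annotation_path):
    """Parse a labelImg Pascal VOC XML file and return its bounding boxes."""
    root = ET.parse(annotation_path).getroot()
    boxes = []
    for obj in root.iter("object"):
        bb = obj.find("bndbox")
        boxes.append({
            "label": obj.findtext("name"),  # e.g. "plate"
            "xmin": int(float(bb.findtext("xmin"))),
            "ymin": int(float(bb.findtext("ymin"))),
            "xmax": int(float(bb.findtext("xmax"))),
            "ymax": int(float(bb.findtext("ymax"))),
        })
    return boxes

# Hypothetical path - labelImg saves one XML alongside each image.
print(load_boxes("ride_photos/IMG_0042.xml"))
```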