Smart Dashcam for Bicycles – Part 7: Training A Vision Model

I wanted to build my own vision model for a few reasons:

  1. I wanted to learn how
  2. In my limited experience with OpenALPR, it seemed to miss some license plates that looked fairly readable to my eyes – could I do better by training my own model?
  3. Because of the way OpenALPR is built, I knew I wouldn’t be able to make it run faster on my Pi by offloading image processing to a VPU like the Myriad X in my Oak-D camera.
  4. The gen2-license-plate-recognition example provided by Luxonis, built from Intel’s Model Zoo, does not work well with Ontario license plates.

The first step was building a library of images to train a model with. I sorted through hundreds of images I’d taken on rides in September and selected 65 where the photos were clear and there were license plates in the frame. As this was my first attempt, I wasn’t going to worry about sub-optimal situations (out of focus, low light, over-exposed, etc.). I then had to annotate the images – draw boxes around the license plates in the photos and “tag” them as plates. I looked at a couple of tools – I started with Microsoft’s VoTT, but ended up using labelImg. labelImg was efficient, with great keyboard shortcuts, and used the mouse scroll wheel to control zoom, which was great for labeling small details in larger photos.
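For reference, labelImg can save annotations in the YOLO text format – one label file per image, one line per box, holding a class index plus the box centre and size normalized to the image dimensions. Here’s a minimal sketch of decoding such a line back into pixel coordinates; the function name and the single “plate” class are just illustrative:

def yolo_to_pixels(line: str, img_w: int, img_h: int):
    # One YOLO label line: <class_index> <x_center> <y_center> <width> <height>,
    # with all four box values normalized to the range [0, 1].
    cls, xc, yc, w, h = line.split()
    xc, yc = float(xc) * img_w, float(yc) * img_h
    w, h = float(w) * img_w, float(h) * img_h
    left, top = xc - w / 2, yc - h / 2
    return int(cls), (round(left), round(top), round(w), round(h))

# e.g. a single "plate" box (class 0) in a 4056x3040 frame:
print(yolo_to_pixels("0 0.51 0.62 0.04 0.02", 4056, 3040))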

I then tried one tutorial after another and struggled to get them to work. Many examples were set up to run on Google Colab. I found that when I followed these instructions and got to the part where I was actually training the model, Colab would time out. Colab is only intended for short interactive sessions – perhaps it didn’t work for me because I was working with higher-resolution images, which take more computing time.

What I ended up doing was manually running the steps from the Train_YoloV3.ipynb notebook from pysource, straight in the console. As my home PCs don’t have dedicated GPUs, I set up a p3.2xlarge Amazon EC2 instance to run the training. If memory serves, training against those 65 images, using the settings from that tutorial, took a couple of hours.

I took the model I created from my September rides and then tested it against images from my October rides – I was surprised how well it worked.

My YOLOv3 model running on the Oak-D

Since training that model, I’ve been on the lookout for an NVIDIA video card I can use for training at home. It’s hard to know for sure, but it seems it wouldn’t take long to recoup the cost of a GPU versus training on EC2 instances in the cloud, and I can always resell a GPU. I’ve tried training a few times with the fastest CPU I have in the house (a Ryzen 5 3400G), and it just isn’t feasible. I haven’t seen a cheap GPU option, and prices have only climbed since I started looking in November.

I don’t have usable code or a useful model to share yet – at this point, I’m mostly learning and trying to figure out the process.

Printing and Binding an ePub eBook

I wanted a hard copy of an eBook I had that is out of print. There are many resources out there for binding books, and many recommend using acid-free PVA glue. I can’t speak to how it compares to other glues, but “Aleene’s Tacky Glue” is a PVA glue, available acid-free, and it was stocked at craft stores in my area.

This post will focus on prepping an eBook for print. US Letter is the common paper size here, but it’s too big for a book, so I decided to print 4 pages per US Letter sheet – 2 pages per side, each 5.5″ wide by 8.5″ tall.

First, I loaded the book into Calibre, opened it, and printed it to PDF. For this exercise, I used Ian Fleming’s Casino Royale, which is out of copyright in Canada.

Calibre Print to PDF

Next, I had to re-arrange the pages. If we just print 2 pages per side, duplex, page 4 will end up on the back of page 1. We want page 2 on the back of page 1, so we need to reorder the PDF following the pattern 1, 3, 4, 2, 5, 7, 8, 6, and so on – a small script that generates this ordering follows the illustration below. This LibreOffice spreadsheet might help: Pages.ods

Illustration of required page ordering
Pages have to be re-ordered for regular duplex printing – page 2 has to be on the back of page 1!
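Rather than typing that page list out by hand, something like this Python sketch can generate it (the function name is my own, and it assumes you pad the PDF with blank pages to a multiple of four if the count doesn’t work out):

def collation_order(first, last):
    # For duplex 2-up printing, each group of four consecutive pages
    # (n, n+1, n+2, n+3) must be emitted as n, n+2, n+3, n+1, so that
    # page n+1 lands on the back of page n after the sheets are cut.
    order = []
    for n in range(first, last + 1, 4):
        order.extend(p for p in (n, n + 2, n + 3, n + 1) if p <= last)
    return order

print(collation_order(1, 8))                        # [1, 3, 4, 2, 5, 7, 8, 6]
print(" ".join(map(str, collation_order(1, 12))))   # "1 3 4 2 5 7 8 6 9 11 12 10"

The joined string is in the shape pdftk expects for its cat operation; start the range at 2 instead of 1 to skip a cover page, as in the command below.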

PDFTK is a great tool for re-ordering PDFs. I re-ordered the book, skipping the first page, with PDFTK as follows:
pdftk Casino\ Royale.pdf cat 2 4 5 3 6 8 9 7 10 12 13 11 14 16 17 15 18 20 21 19 22 24 25 23 26 28 29 27 30 32 33 31 34 36 37 35 38 40 41 39 42 44 45 43 46 48 49 47 50 52 53 51 54 56 57 55 58 60 61 59 62 64 65 63 66 68 69 67 70 72 73 71 74 76 77 75 78 80 81 79 82 84 85 83 86 88 89 87 90 92 93 91 94 96 97 95 98 100 101 99 102 104 105 103 106 108 109 107 110 112 113 111 114 116 117 115 118 120 121 119 122 124 125 123 126 128 129 127 130 132 133 131 134 136 137 135 138 140 141 139 142 144 145 143 146 148 149 147 150 152 153 151 154 156 output collated.pdf

Next, I used a tool called pdfjam to fit 2 pages per side:
pdfjam collated.pdf -o collated-2perpagealternate.pdf --nup 2x1 --landscape

I sent this PDF to my local printer and had them cut the pages in half for me. With this output, I bound the book, roughly following a YouTube tutorial. My copy turned out OK, but it feels like it would take a few more attempts to get something as sturdy as a commercially bound book.

Smart Dashcam for Bicycles – Part 6: Experimenting With A New Camera Platform

One of the features I have in mind for my bicycle dashcam is license plate recognition. In parts 1, 2 and 3, I experimented with the OpenALPR license plate recognition library and a couple of different Pi cameras. I encountered a few challenges:

  • Image quality challenges: out-of-focus images, warped images due to the “rolling shutter” of the Pi camera
  • Field of view: capturing more than just the license plate
  • Speed: Only able to process 1 image every 8 seconds on my Pi 3

I acquired the Luxonis Oak-D AI-accelerated camera because its different image sensors could potentially address my image quality challenges, its stereo vision/depth sensing offered interesting capabilities, and its AI acceleration could increase processing speed. This spring, I mounted it to my bike and started capturing images on my rides.

I had issues with my Pi 3 – it would stop running after a minute or two. I suspect it had been damaged by vibration on previous rides, strapped to my bike rack. I acquired a new Pi 4 and was up and running again.

Initially, with the Oak-D setup, I had a lot of the same image quality problems I’d had with the Pi camera v1 and v2 modules – lots of out-of-focus images, because the camera just kept trying to focus, which is a hard problem when photographing moving traffic on a bumpy bicycle ride. My application would also crash – this turned out to be due to filling buffers: I was writing more data to my USB thumb drive than it could handle. I ended up getting acceptable results by reducing my capture speed to 2 fps, recording at 4056×3040, turning auto-focus off, locking the focus at its 120 setting, and setting the scene mode to Sports, via the DepthAI API as follows:

rgb.setFps(2)
rgb.initialControl.setManualFocus(120)
rgb.initialControl.setSceneMode(dai.CameraControl.SceneMode.SPORTS)
rgb.initialControl.setAutoFocusMode(dai.RawCameraControl.AutoFocusMode.OFF)
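For context, here’s a minimal sketch of how those settings might sit in a full DepthAI pipeline. The stream name and the frame-handling loop are my own illustrative choices, not the code from my dashcam:

import depthai as dai

pipeline = dai.Pipeline()

# 12 MP colour camera (4056x3040), with the settings discussed above
rgb = pipeline.create(dai.node.ColorCamera)
rgb.setResolution(dai.ColorCameraProperties.SensorResolution.THE_12_MP)
rgb.setFps(2)
rgb.initialControl.setManualFocus(120)
rgb.initialControl.setSceneMode(dai.CameraControl.SceneMode.SPORTS)
rgb.initialControl.setAutoFocusMode(dai.RawCameraControl.AutoFocusMode.OFF)

# Ship full-resolution ISP frames back to the host
xout = pipeline.create(dai.node.XLinkOut)
xout.setStreamName("rgb")  # stream name is arbitrary
rgb.isp.link(xout.input)

with dai.Device(pipeline) as device:
    q = device.getOutputQueue(name="rgb", maxSize=4, blocking=False)
    while True:
        frame = q.get().getCvFrame()  # BGR numpy array (requires OpenCV)
        # write `frame` to disk here, throttled to what the USB drive can sustain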

With these settings, images are focused in the narrow range where it’s possible to read a license plate – when cars are too far back, the plates are impossible to read anyway, so it doesn’t matter if they’re out of focus. Luxonis will soon launch a model with fixed-focus cameras, which should further improve image quality in high-vibration environments. I hope to try it out in the future.

I wanted to build a library of images I could later use to test against various machine vision models, and potentially to train my own. I posted the question on the Luxonis Discord channel, and their team directed me to their gen2-record-replay code sample. This code allows you to record imagery and later play it back against a model – exactly what I needed. So I started to collect imagery on my next few rides.