OCR

From: Peter (BOUGHTONP) 6 Apr 2019 22:34
To: CHYRON (DSMITHHFX) 8 of 11
It's a photo taken with a digital compact, so there's a degree of noise and slight gradient, but there's no reason why it shouldn't be 99% OCR-able.

For example, attached is a crop of the row that gave "Q 124 on eel" - on its own it produces "124 97 2el", and in the first image (fixed horizontal/verticals, but gridlines still present and no brightness/contrast changes), it came closest with "0 124 on 221".

From: Peter (BOUGHTONP) 6 Apr 2019 23:03
To: ALL 9 of 11
I had the thought of forgetting about OCR and searching for what I actually want, i.e. "image to spreadsheet conversion", which came up with this: https://online2pdf.com/convert-jpg-to-excel

The formatting it produced was all over the place, but it did a good job on the numbers - a handful of mistakes, mostly with zeroes. There were a couple of incorrect numbers (161->151 and 77->17), which showed up because the totals didn't match, but compared to Tesseract it was brilliant.
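
For anyone wanting to automate the totals check, here is a minimal sketch. It assumes each row has already been parsed into a list of values plus the total printed on the sheet. The function name, the row layout and the sample figures are all made up for illustration; only the two misreads (161->151 and 77->17) come from the actual results.

```python
def find_total_mismatches(rows):
    """Return the indices of rows whose values do not add up to the stated total.

    Each row is a (values, stated_total) pair; this layout is an assumption,
    not the format the converter actually produced.
    """
    return [i for i, (values, total) in enumerate(rows) if sum(values) != total]


# Illustrative rows only: the same made-up row three times, once correct
# and once with each of the two misreads mentioned in the post.
rows = [
    ([161, 77, 12], 250),  # read correctly
    ([151, 77, 12], 250),  # 161 misread as 151
    ([161, 17, 12], 250),  # 77 misread as 17
]

print(find_total_mismatches(rows))  # rows 1 and 2 need a manual look
```

This only flags which rows to recheck by eye. It can't say which number in the row is wrong, and it will miss two errors in the same row that happen to cancel out.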

Happy Peter -> :)

From: CHYRON (DSMITHHFX) 6 Apr 2019 23:23
To: Peter (BOUGHTONP) 10 of 11
I've had good luck with online OCR, though I've not tried it for Excel.
From: Peter (BOUGHTONP) 6 Apr 2019 23:37
To: CHYRON (DSMITHHFX) 11 of 11
I'm guessing it's mostly just regular OCR that uses tabs wherever there's more than a single space, although the file I got back did have merged cells with a dozen spaces for some of the rows, suggesting buggy, overcomplicated logic.
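
To be clear about the rule I'm guessing at, it would be something like the sketch below. This is purely my assumption about what the site does, not its actual code, and the function name and sample line are invented.

```python
import re


def spaces_to_tabs(line):
    """Turn runs of two or more spaces into a tab (a guessed column break).

    A single space is left alone, so it stays inside the cell.
    """
    return re.sub(r" {2,}", "\t", line.strip())


# Made-up OCR output line with uneven gaps between columns.
cells = spaces_to_tabs("Week 1   124  97    221").split("\t")
print(cells)  # ['Week 1', '124', '97', '221']
```

A rule this simple would never produce merged cells full of spaces, which is why I suspect they're doing something more elaborate.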

We need to set Stallman on them all.