How do you encode your paper scans?

Atemu@lemmy.ml · 1 year ago

How do you encode your paper scans?

Saigonauticon@voltage.vn · 1 year ago

Oh, it’s common in my country to use a smartphone to ‘scan’ documents by actually just taking a lousy photo of them. It’s so prevalent that when you tell someone to do a scan they usually do this instead.

I bought a cheap Canon scanner for $50 and it’s pretty perfect for legal documents. A little slow, maybe. I use SANE, then do lossy compression too.

In rare situations I’d then post-process the PDF to even worse quality using Ghostscript, for example when a foreign visa application form requires a scan of a really long document but doesn’t accept files over 2MB.
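For anyone wanting to try that Ghostscript step, here is a minimal sketch, assuming `gs` is installed and on the PATH. The helper names `shrink_pdf_command` and `shrink_pdf` and the file names are made up for illustration, and this is not necessarily the exact invocation used above. `-dPDFSETTINGS=/screen` is Ghostscript's most aggressive preset; `/ebook` is a milder one.

```python
import subprocess

# Ghostscript's documented pdfwrite quality presets, smallest output first.
PRESETS = ("screen", "ebook", "printer", "prepress", "default")


def shrink_pdf_command(src, dst, preset="screen"):
    """Build the Ghostscript command that rewrites src as a smaller PDF at dst."""
    if preset not in PRESETS:
        raise ValueError(f"unknown preset: {preset!r}")
    return [
        "gs",
        "-sDEVICE=pdfwrite",          # write a PDF
        f"-dPDFSETTINGS=/{preset}",   # downsample/recompress images per preset
        "-dNOPAUSE", "-dBATCH", "-dQUIET",
        f"-sOutputFile={dst}",
        src,
    ]


def shrink_pdf(src, dst, preset="screen"):
    """Run Ghostscript; raises CalledProcessError if gs exits with an error."""
    subprocess.run(shrink_pdf_command(src, dst, preset), check=True)

# Hypothetical usage: shrink_pdf("scan.pdf", "scan-small.pdf", preset="ebook")
```

If the result is still over the limit, dropping from `ebook` to `screen` is the obvious next thing to try; whether the text stays legible depends on the scan.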

Atemu@lemmy.ml · 1 year ago

> I use SANE, then do lossy compression too.

Well, what kind of lossy compression? JPEG?

IME, JPEG looks quite terrible for text documents, even at q=95.

Saigonauticon@voltage.vn · 1 year ago

Yeah, just JPEG. Always comes out perfectly legible.

dpflug@hachyderm.io · 1 year ago

@Atemu @Saigonauticon I just use grayscale PNGs, myself. optipng usually takes them down to a decent size.

Atemu@lemmy.ml · 1 year ago

Hmm, I’m using grayscale PNGs as my baseline here. A 150dpi scan is about 1.3MiB.

A WebP of similar quality (for the purposes of text documents) is about 1/4 of that.

kyle@infosec.pub · 1 year ago

You could also try adjusting the contrast a bit. I use an app called Genius Scan, which increases the contrast of the scanned image to reduce the number of bits needed per pixel. This reduces the size of the file quite a bit, although it obviously isn’t a true representation of the scanned document. The TextCleaner script for ImageMagick looks like it’s doing something similar.
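To illustrate why pushing the contrast helps, here is a rough sketch using only the Python standard library. It builds a fake noisy grayscale "page" (an assumption, not a real scan), thresholds it to pure black and white, and compares how well each version compresses with zlib, the same DEFLATE algorithm PNG uses. The function names and the cutoff of 128 are illustrative, and this is not a claim about how Genius Scan or TextCleaner work internally.

```python
import random
import zlib


def synthetic_page(width=200, height=200, seed=0):
    """Fake grayscale scan: light noisy 'paper' with dark noisy 'text' blocks."""
    rng = random.Random(seed)
    pixels = bytearray()
    for y in range(height):
        text_row = (y // 10) % 2 == 1          # every other 10-pixel band has "text"
        for x in range(width):
            dark = text_row and (x // 5) % 2 == 0
            base = 40 if dark else 220
            noise = rng.randint(-15, 15)        # sensor noise on every pixel
            pixels.append(max(0, min(255, base + noise)))
    return bytes(pixels)


def threshold(pixels, cutoff=128):
    """Maximum contrast: every pixel becomes pure black (0) or pure white (255)."""
    return bytes(255 if p >= cutoff else 0 for p in pixels)


def compressed_size(pixels):
    """Size in bytes after DEFLATE at the highest compression level."""
    return len(zlib.compress(pixels, 9))


page = synthetic_page()
print("noisy grayscale:", compressed_size(page), "bytes")
print("thresholded:    ", compressed_size(threshold(page)), "bytes")
```

The noise is what costs bits: once it is flattened away, long runs of identical pixels compress far better. The trade-off is the one mentioned above, since faint pencil marks, stamps, or photos on the page may not survive the threshold.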

dpflug@hachyderm.io · 1 year ago

@Atemu WebP is much better, as long as your target reader(s) support it.

Atemu@lemmy.ml · 1 year ago

Yes, as I said.

As also mentioned in the post, I need a solution for multiple pages. An image, no matter the format, only represents a single page, and WebPs don’t go into PDFs.

Sifr Moja@mastodon.social · 1 year ago

@Saigonauticon @Atemu A scanner is a camera. Why complicate things?