/Δ/ - Book scanning


File: 1493634771349.gif (1.25 MB, 600x450, scan.gif)

Book scanning Unspecified 2017/05/01 (Mon) 10:32:51 No.59

Not sure if it belongs on this board, but I'm looking for some advice on scanning books.

Recently I ordered a book through the library that seemed to exist nowhere online or in a digital format. It was about 120 pages, so I thought it might not be too difficult to scan the whole thing. It took about an hour with an overhead scanner, but I was able to get the book into a shoddy (but readable) PDF. For good measure, I used Acrobat to crop some pages and sanitize the data, which took another half hour. So, what could I do differently? Do you have any experience with this process? What do you use?

>Scanners & Setups
I had access to an overhead scanner, which seemed like it would be a bit speedier than a flatbed scanner. The drawbacks were that my fingers sometimes ended up in the images, the page detection sometimes caused weird effects on the page, and the resolution was generally lower.

>Formatting
Are there alternatives to PDF/Adobe software? Is there any way of automating the page cropping, the page rotating, etc.? How do I make the scan look "clean"?

>DRM & Sharing
If I were to post my scans, what measures should I take? How do I find the people/places that might want this book? Do I just send it off to libgen?

>Anything else?
Is OCR worth it?

In the process of writing this post I found a guide online (pic related) as well as some helpful discussions on the libgen forums. I'll keep lain updated on my experiments.

Ishikawa 2017/05/01 (Mon) 10:35:28 No.60

I too wish to know. I want to scan some books I bought, and release the scans online.
As far as I know, the alternatives to PDF are PNG, JPG, and DjVu.

Kaneda 2017/05/02 (Tue) 07:07:46 No.66

>>59
Did you just borrow an overhead scanner? Where can I look for one?

If it's not on libgen you should contribute. It's real easy, and I'm sure that for most people a shoddy version is better than none.

Also, can you share this guide you spoke of?
The technique seen in your attached image seems like it would make rotating/cropping etc. easier, since the pages and camera stay in the same place. Any edit done on one page can be replicated exactly for every other one.
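
To sketch what "replicate the same edit on every page" could look like: the snippet below builds one identical ImageMagick crop call per page. It is only an illustration under some assumptions: ImageMagick is installed and exposes the `convert` command (newer installs may call it `magick`), the pages are PNG files in one folder, and the function names and the crop box numbers are made up for the example.

```python
import subprocess
from pathlib import Path


def crop_command(src, dst, width, height, x, y):
    """Build an ImageMagick call that cuts one fixed rectangle out of a page."""
    geometry = f"{width}x{height}+{x}+{y}"
    # +repage drops the old canvas offset so the cropped page starts at (0, 0).
    return ["convert", str(src), "-crop", geometry, "+repage", str(dst)]


def crop_all(src_dir, dst_dir, width, height, x, y, run=False):
    """Build the same crop call for every PNG in src_dir, in filename order.

    With run=False (the default) nothing is executed; the commands are
    only returned so they can be inspected first.
    """
    src_dir, dst_dir = Path(src_dir), Path(dst_dir)
    commands = [
        crop_command(page, dst_dir / page.name, width, height, x, y)
        for page in sorted(src_dir.glob("*.png"))
    ]
    if run:
        dst_dir.mkdir(parents=True, exist_ok=True)
        for cmd in commands:
            subprocess.run(cmd, check=True)
    return commands
```

The idea would be to find the crop box once on a single page in an image editor, then call something like `crop_all("raw", "cropped", 1200, 1800, 100, 50, run=True)` for the whole book. This only works if the camera and the book really did not move between shots.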

Undisclosed 2017/05/02 (Tue) 18:44:28 No.68

File: 1493750668200.jpg (14.84 KB, 308x308, overhead.jpg)

>>66
The scanner was free to use at the local university library. I imagine a hacker-space or printshop might potentially have one as well.

I was too ashamed to post an instructables link in my first post on μ but here it is: http://www.instructables.com/id/Bargain-Price-Book-Scanner-From-A-Cardboard-Box/

Unidentified 2017/05/02 (Tue) 22:54:03 No.72

>>68
That guide was awesome, I need to try that myself.

Hideo Kuze 2017/05/03 (Wed) 01:49:50 No.75

Is OCR not soykaf anymore? I really don't get the holdup on this relatively simple AI. I guess it's mostly proprietary as well -_-

John Doe 2017/10/11 (Wed) 14:46:16 No.631

File: 1507733176911.jpg (151.63 KB, 500x375, scanner1.jpg)

For more book scanning methods, with designs to suit different skills and budgets, look here [0]

[0] https://www.diybookscanner.org/

Prof. Hodgeson 2019/06/07 (Fri) 00:40:23 No.819

>Scanners
I use a basic ass HP scanner. It's annoying but it works, and I don't need to stand there in the library for half an hour scanning.

>Formatting
Use DjVu. If you can only scan to PDF (which applies to most scanners), convert using pdf2djvu. PDF is not intended as a raster format, whereas DjVu is designed specifically for scanned documents. This is what libgen recommends as well.
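
A rough sketch of the pdf2djvu step, wrapped in Python so it can be looped over several files. It assumes pdf2djvu is installed and on the PATH; the function names and the choice to write the .djvu next to the PDF are just for illustration. `-o` sets the output file and `-d` sets the resolution in dpi.

```python
import subprocess
from pathlib import Path


def pdf2djvu_command(pdf_path, dpi=None):
    """Build a pdf2djvu call that writes book.djvu next to book.pdf."""
    pdf_path = Path(pdf_path)
    cmd = ["pdf2djvu", "-o", str(pdf_path.with_suffix(".djvu"))]
    if dpi is not None:
        cmd += ["-d", str(dpi)]  # resolution in dots per inch
    cmd.append(str(pdf_path))
    return cmd


def convert(pdf_path, dpi=None):
    """Run the conversion; raises CalledProcessError if pdf2djvu fails."""
    subprocess.run(pdf2djvu_command(pdf_path, dpi), check=True)
```

For example, `convert("book.pdf", dpi=300)` would be the equivalent of typing `pdf2djvu -o book.djvu -d 300 book.pdf` in a shell.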

>DRM & Sharing
libgen is a good place to start. Probably also seed it on libgen once it gets included in a torrent.

>is OCR worth it?
idk

Unidentified 2019/06/07 (Fri) 01:59:35 No.820

> ocr

It's definitely worth it. You still need to read it through and spell/grammar check, but if you OCR first, before you even begin reading, it's not as dry as read->OCR->read. GNU has an open source OCR if you don't care for freeware (https://www.gnu.org/software/ocrad/). It's relatively easy to make your own too, if you're into that sort of thing (https://www.nist.gov/node/1298471/emnist-dataset).
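
One way the spell-check pass could be narrowed down, as a minimal sketch: list the words in the OCR output that are not in a word list, so you know where to look first. The function name and the tiny word list in the example are made up; a real pass would need a proper dictionary for the book's language, and names or jargon will show up as false alarms.

```python
import re


def suspicious_words(ocr_text, known_words):
    """Return words from the OCR text that are not in known_words.

    Comparison ignores case; each unknown word is reported once,
    in order of first appearance.
    """
    known = {w.lower() for w in known_words}
    flagged = []
    seen = set()
    for word in re.findall(r"[A-Za-z']+", ocr_text):
        key = word.lower()
        if key not in known and key not in seen:
            seen.add(key)
            flagged.append(word)
    return flagged


# Hypothetical example: typical OCR confusions ("i" read as "l", "m" as "rn").
words = {"the", "quick", "brown", "fox", "jumps", "over", "lazy", "dog"}
print(suspicious_words("The qulck brown fox jurnps over the lazy dog", words))
# prints ['qulck', 'jurnps']
```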