arisuchan

/Δ/ - diy/projects

make it. create it. do-it-yourself. hardware, software, and community projects.

Kalyx ######


File: 1493634771349.gif (1.25 MB, 600x450, scan.gif)

 No.59

Not sure if it belongs on this board, but I'm looking for some advice on scanning books.

Recently I ordered a book through the library that seemed to exist nowhere online or in a digital format. It was about 120 pages, so I thought it might not be too difficult to scan the whole thing. It took about an hour with an overhead scanner, but I was able to get the book into a shoddy (but readable) PDF. For good measure, I used Acrobat to crop some pages and sanitize the data, which took another half hour. So, what could I do differently? Do you have any experience with this process? What do you use?

>Scanners & Setups

I had access to an overhead scanner, which seemed like it would be a bit speedier than a flatbed scanner. The drawbacks: my fingers sometimes ended up in the images, the page detection sometimes caused weird effects on the page, and the resolution was generally lower.

>Formatting

Are there alternatives to pdf/adobe software? Is there any way of automating the page cropping, the page rotating, etc? How do I make the scan look "clean"?

>DRM & Sharing

If I were to post my scans, what measures should I take? How do I find the people/places that might want this book? Do I just send it off to libgen?

>Anything else?

Is OCR worth it?

In the process of writing this post I found a guide online (pic related) as well as some helpful discussions on the libgen forums. I'll keep lain updated on my experiments.

 No.60

I too wish to know. I want to scan some books I bought, and release the scans online.
Alternatives to PDF, as far as I know so far, are PNG, JPG, and DjVu.

 No.66

>>59
Did you just borrow an overhead scanner? Where can I look for one?

If it's not on libgen you should contribute, it's real easy, and I'm sure that for most people a shoddy version is better than none.

Also, can you share this guide you spoke of?
The technique seen in your attached image seems like it would make rotating/cropping etc. easier, since the pages and camera stay in a fixed position. Any edit done on one page can be replicated exactly for every other one.
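
A minimal sketch of that idea, in Python with only the standard library. Pages are modeled as plain lists of pixel rows (a toy stand-in for real image data), and the names `PageEdit`, `apply_edit`, and `apply_to_all` are made up for illustration. A real pipeline would presumably do the same thing with an image library; nothing here comes from the guide itself.

```python
from dataclasses import dataclass
from typing import List

Page = List[List[int]]  # one page as rows of grayscale pixel values (toy stand-in for an image)


@dataclass(frozen=True)
class PageEdit:
    """One edit recipe, worked out on a single page and reused for all others."""
    quarter_turns: int  # number of 90-degree clockwise rotations
    left: int           # crop box in pixels, applied after rotation
    top: int
    right: int          # exclusive
    bottom: int         # exclusive


def rotate_clockwise(page: Page) -> Page:
    """Rotate a page 90 degrees clockwise."""
    return [list(row) for row in zip(*page[::-1])]


def apply_edit(page: Page, edit: PageEdit) -> Page:
    """Rotate, then crop, a single page according to the shared recipe."""
    for _ in range(edit.quarter_turns % 4):
        page = rotate_clockwise(page)
    return [row[edit.left:edit.right] for row in page[edit.top:edit.bottom]]


def apply_to_all(pages: List[Page], edit: PageEdit) -> List[Page]:
    """With camera and book fixed in place, the same recipe fits every page."""
    return [apply_edit(page, edit) for page in pages]
```

The recipe only has to be tuned once, on a single test page, and the rest of the book goes through unchanged. This only holds as long as nothing in the rig moves between shots.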

 No.68

File: 1493750668200.jpg (14.84 KB, 308x308, overhead.jpg)

>>66
The scanner was free to use at the local university library. I imagine a hacker-space or printshop might potentially have one as well.

I was too ashamed to post an instructables link in my first post on μ but here it is: http://www.instructables.com/id/Bargain-Price-Book-Scanner-From-A-Cardboard-Box/

 No.72

>>68
That guide was awesome, I need to try that myself.

 No.75

Is OCR not soykaf anymore? I really don't get the holdup on this relatively simple AI. I guess it's mostly proprietary as well -_-

 No.631

File: 1507733176911.jpg (151.63 KB, 500x375, scanner1.jpg)

For more book-scanning methods, with designs to suit different skill levels and budgets, look here [0].


[0] https://www.diybookscanner.org/

 No.819

>Scanners
I use a basic-ass HP scanner. It's annoying but it works, and I don't need to stand there in the library for half an hour scanning.

>Formatting

Use DjVu. If your scanner can only produce PDFs (which applies to most scanners), convert them with pdf2djvu. PDF is not intended as a raster format, while DjVu is designed specifically for scanned documents. This is what libgen recommends as well.
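
A rough sketch of the conversion step, wrapped in Python so it can be looped over a folder of scans. It assumes pdf2djvu is installed and on PATH, and it uses only the `-o` option to name the output file; everything else stays at the tool's defaults. The helper names `pdf2djvu_command` and `convert` are made up for this example.

```python
import shutil
import subprocess
from pathlib import Path
from typing import List


def pdf2djvu_command(pdf_path: str) -> List[str]:
    """Build the pdf2djvu call for one file, writing the .djvu next to the input."""
    source = Path(pdf_path)
    target = source.with_suffix(".djvu")
    # "-o" names the output file; all other settings are left at pdf2djvu's defaults.
    return ["pdf2djvu", "-o", str(target), str(source)]


def convert(pdf_path: str) -> Path:
    """Run the conversion; raises if pdf2djvu is missing or exits with an error."""
    if shutil.which("pdf2djvu") is None:
        raise FileNotFoundError("pdf2djvu not found on PATH")
    command = pdf2djvu_command(pdf_path)
    subprocess.run(command, check=True)
    return Path(command[2])
```

Calling `convert("book.pdf")` should leave a `book.djvu` beside the original. It is worth comparing file size and legibility before throwing the PDF away.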

>DRM & Sharing

libgen is a good place to start. Probably also seed it on libgen once it gets included in a torrent.

>is OCR worth it?

idk

 No.820

> ocr

It's definitely worth it. You still need to read it through and spell/grammar check, but if you OCR first before you even begin reading, it's not as dry as read->ocr->read. GNU has an open source OCR if you don't care for freeware (https://www.gnu.org/software/ocrad/). It's relatively easy to make your own too, if you're into that sort of thing (https://www.nist.gov/node/1298471/emnist-dataset).
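
To take some of the tedium out of the spell-check pass, here is a small sketch in Python that flags OCR output worth a manual look. It is not tied to any particular OCR engine. The function name `suspect_words` and the tiny word list in the usage note are assumptions for illustration; in practice you would load a proper dictionary file.

```python
import re
from typing import Iterable, List, Set

WORD = re.compile(r"[A-Za-z0-9']+")


def suspect_words(ocr_text: str, known_words: Iterable[str]) -> List[str]:
    """Return tokens worth a manual look, in order of first appearance."""
    known: Set[str] = {w.lower() for w in known_words}
    seen: Set[str] = set()
    flagged: List[str] = []
    for token in WORD.findall(ocr_text):
        lowered = token.lower()
        if lowered in seen:
            continue  # report each distinct token only once
        seen.add(lowered)
        has_digit = any(ch.isdigit() for ch in token)
        has_letter = any(ch.isalpha() for ch in token)
        # Letters mixed with digits ("b00k") are a typical OCR confusion;
        # pure numbers (page numbers, years) are left alone.
        if (has_digit and has_letter) or (has_letter and lowered not in known):
            flagged.append(token)
    return flagged
```

For example, `suspect_words("The b00k was fine, the bock too", ["the", "was", "fine", "too", "book"])` flags `b00k` and `bock` and leaves the rest alone. It only narrows down where to look; it won't catch a wrong word that happens to be spelled correctly.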


