11-09-2012, 11:22 PM   #3
Heaneisismich

What I need:

1. Members who can select the text from the PDF file, recreate it in Word format, and index it so that it becomes searchable through Google.
2. Ways to make the process better and easier, for scanning as well as indexing.


May Allah Rabbul 'Izzat put Barakah in your time and reward your efforts.

1. I have started something similar at http://hayatus-sahabah.tumblr.com/ and I would suggest creating a couple of Tumblr blogs for each volume.

This is partly because of site reliability issues, but also because a blogging/CMS platform like Tumblr or WordPress will let this crowdsourced effort be carried out efficiently.

2. I would suggest looking into http://www.diybookscanner.org/. It's basically using a digital camera to take a picture of each page of the book instead of using a flatbed scanner.

A traditional flatbed scanner is too slow and laborious. A basic DIY setup using a digital camera can easily scan an entire 900-page book in a few hours.
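To put that timing in perspective, here is a rough back-of-the-envelope check in Python. The 3-hour figure is only my assumption for "a few hours", and the function name is made up for illustration.

```python
def seconds_per_page(pages: int, hours: float) -> float:
    """Average time budget per page for one scanning session."""
    if pages <= 0:
        raise ValueError("pages must be positive")
    return hours * 3600 / pages


# Assumption: "a few hours" is taken as 3 hours for a 900-page book.
budget = seconds_per_page(900, 3)
print(f"{budget:.0f} seconds per page")
```

So even at a relaxed pace there is time to turn the page, check the framing and press the shutter.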

I have experimented with a similar setup and was quite pleased with the results. The software for processing is fairly straightforward to use and the results are, Alhamdulillah, very good.

Ideally you would want to use some Optical Character Recognition (OCR) software to recognise all the text, then proofread and edit it, so that you end up with the actual text of the book and not just images of each page. That is what you have already covered in point 1.
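Raw OCR output usually keeps the line breaks and end-of-line hyphens of the printed page, which makes proofreading slower. Below is a minimal sketch of a clean-up step in Python, using only the standard library. It does not do the OCR itself; it assumes you already have the recognised text as a string from whatever OCR tool you use. The function name `clean_ocr_text` is just something I made up for this example.

```python
import re


def clean_ocr_text(raw: str) -> str:
    """Tidy raw OCR output so it is easier to proofread."""
    # Re-join words split by a hyphen at the end of a line ("exam-\nple").
    # Caveat: this also joins genuinely hyphenated words that happen to
    # break at a line end, so the proofreader should still check them.
    text = re.sub(r"(\w)-\n(\w)", r"\1\2", raw)
    # Turn single line breaks inside a paragraph into spaces,
    # but keep blank lines as paragraph breaks.
    text = re.sub(r"(?<!\n)\n(?!\n)", " ", text)
    # Collapse runs of spaces and tabs into one space.
    text = re.sub(r"[ \t]+", " ", text)
    return text.strip()


sample = "The com-\npanions were\npatient.\n\nNext  paragraph."
print(clean_ocr_text(sample))
```

The cleaned text can then be pasted into Word or a blog post for the actual proofreading, which still has to be done by a person.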
