pdf2html is a module that converts PDF files to HTML pages using Apache Tika. It also generates thumbnail images for PDF files using Apache PDFBox.
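As a rough illustration of the two steps described above, the following Python sketch shells out to the Tika and PDFBox command-line jars. It is not the module's own code. The jar file names, the function names, and the default DPI are assumptions made for this example, and the `PDFToImage` options follow the PDFBox 2.x command-line tool (PDFBox 3.x uses a different command syntax).

```python
import subprocess
from pathlib import Path

# Hypothetical jar locations (assumption): point these at your own downloads.
TIKA_JAR = "tika-app.jar"
PDFBOX_JAR = "pdfbox-app.jar"


def tika_html_command(pdf_path, tika_jar=TIKA_JAR):
    """Build the tika-app command that prints the PDF as HTML on stdout."""
    return ["java", "-jar", tika_jar, "--html", str(pdf_path)]


def pdfbox_thumbnail_command(pdf_path, prefix, pdfbox_jar=PDFBOX_JAR, dpi=36):
    """Build the PDFBox 2.x PDFToImage command that renders page 1 as a PNG.

    A low DPI keeps the rendered image small enough to act as a thumbnail.
    """
    return [
        "java", "-jar", pdfbox_jar, "PDFToImage",
        "-format", "png",
        "-startPage", "1",
        "-endPage", "1",
        "-dpi", str(dpi),
        "-outputPrefix", str(prefix),
        str(pdf_path),
    ]


def pdf_to_html(pdf_path, html_path, tika_jar=TIKA_JAR):
    """Run Tika and save the HTML it writes to stdout into html_path."""
    result = subprocess.run(
        tika_html_command(pdf_path, tika_jar),
        capture_output=True,
        check=True,  # raise CalledProcessError if Java or Tika fails
    )
    Path(html_path).write_bytes(result.stdout)


def pdf_thumbnail(pdf_path, prefix, pdfbox_jar=PDFBOX_JAR, dpi=36):
    """Run PDFBox to write a PNG of the first page, named from the prefix."""
    subprocess.run(
        pdfbox_thumbnail_command(pdf_path, prefix, pdfbox_jar, dpi),
        check=True,
    )
```

With Java and both jars available, `pdf_to_html("report.pdf", "report.html")` followed by `pdf_thumbnail("report.pdf", "report-thumb")` would produce the HTML page and a first-page image. Keeping command construction separate from execution makes the commands easy to inspect or log before running them.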
It’s designed to let you preview many different image formats ... When a free PDF editor comes equipped with the best OCR software, it can scan paper documents and convert them into digital documents ...