How can I repair pdf-to-epub conversions?
Some pdfs show only blank pages on the Kindle, so I'm trying to convert from pdf to epub and eventually to mobi. This one is a simple single-column pdf with a good text layer and only 1 image.
Different conversion software has different bugs.
So the original pdf has the best text layer, and the Calibre epub has the best cover. But the Calibre epub has paragraph (p) breaks every few lines, with erratic spacing, while the other epubs have chapter breaks every 2 pages, erratic font-size changes, and probably OCR errors.
I can't just find and remove all the p-breaks, because that would remove the necessary ones along with the extra ones.
I'm not sure how to overwrite a bad text layer with a good one.
I also can't open all 3 epubs at the same time, which I'd need to do in order to splice together 1 epub from the best sections of all 3.
P.S. The original pdf was left-justified, rather than double-justified, so most lines triggered line-unwrapping. The original pdf has tabs at the beginning of each paragraph.
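Since every real paragraph in the original starts with a tab, that tab might be usable as the marker for which breaks to keep. Below is a rough Python sketch of the idea, not a tested fix for this book. It assumes the pdf's text layer has been exported to plain text with the leading tabs still intact (many extractors turn them into spaces, in which case the `indent` argument would need changing). The function names `reflow` and `to_xhtml` are made up for illustration.

```python
import html


def reflow(text: str, indent: str = "\t") -> list[str]:
    """Join wrapped lines into paragraphs.

    A new paragraph starts only at a line beginning with `indent`.
    Blank lines are ignored, because they may just be page breaks.
    """
    paragraphs: list[str] = []
    current: list[str] = []
    for raw in text.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines instead of treating them as breaks
        if raw.startswith(indent) and current:
            # Indented line: close the previous paragraph first.
            paragraphs.append(" ".join(current))
            current = []
        current.append(line)
    if current:
        paragraphs.append(" ".join(current))
    return paragraphs


def to_xhtml(paragraphs: list[str]) -> str:
    """Wrap each paragraph in a <p> element, escaping special characters."""
    return "\n".join(f"<p>{html.escape(p)}</p>" for p in paragraphs)


if __name__ == "__main__":
    sample = "\tFirst line of one\nparagraph continues\n\n\tSecond paragraph\nends here\n"
    print(to_xhtml(reflow(sample)))
```

The resulting `<p>` elements could then be pasted into the body of the epub's XHTML in place of the badly broken text. Words hyphenated across line ends are not rejoined by this sketch.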