I had a book so badly formatted that I had to go back to the raw TXT and start over.
That file was clean ASCII with CR-LFs separating paragraphs (#1 and #2).
I added it to Calibre and converted to epub using the default TXT input preferences. I didn't see any knobs to turn that would help.
Some H2 and H3 assumptions were made, and the text was split in unexpected places. For example (#3), the TOC text from the TXT file was converted into roughly the first 42 files, with 2 or 3 lines per file, while the bulk of the text ended up in the last 4 files. Because the TOC appears twice in the ASCII text file, it was converted twice, driving up the number of 2-line files.
Q1 - Why were so many 2- or 3-line files created? What conversion logic decides what becomes an H2 or H3 and where a separate file starts?
Q2 - Is there a better way to add and convert TXT files?
Q3 - RegEx will clean up or fix a lot. For example, turning
<p>6</p> into <h1>Chapter 6</h1>
is easy enough, but it can still be a lot of fiddly work. Are there any options or plugins that might help?
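To make the Q3 example concrete, here is a rough sketch in plain Python of the kind of substitution I mean. The function name and the sample text are made up for illustration, and it assumes a chapter number always sits alone in its own paragraph. I'd expect the same pattern and replacement to work in a regex search and replace, but I haven't checked every tool.

```python
import re

# Matches a paragraph that contains nothing but a number, e.g. <p>6</p>.
# Assumption: chapter numbers are the only paragraphs that look like this.
BARE_NUMBER_PARAGRAPH = re.compile(r"<p>\s*(\d+)\s*</p>")


def promote_chapter_numbers(html: str) -> str:
    """Hypothetical helper: turn bare-number paragraphs into chapter headings."""
    return BARE_NUMBER_PARAGRAPH.sub(r"<h1>Chapter \1</h1>", html)


sample = "<p>6</p>\n<p>It was a dark night.</p>"
print(promote_chapter_numbers(sample))
```

Paragraphs that contain a number plus other text are left alone, which is the point of matching the whole paragraph. It would still wrongly promote any other stand-alone number, such as a page number, so the result needs checking by eye.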
Thanks