Hi,
I have been using ebook-convert to go from a docx file to an epub file.
Looking at the resulting XHTML in the epub, I see that each paragraph is split into several <p></p> elements, one per line of text (see the example below). This is a problem because each line then behaves as an independent paragraph when I change the font size in my epub reader.
I tried saving the docx file with both LibreOffice (Ubuntu) and Word (OS X), but the behavior is the same.
Ideally, I would like each paragraph to be contained in a single <p> </p> element so that the text reflows properly when the font size changes.
Is this a problem with the docx format, or is there a workaround? I thought of applying a regexp to the XHTML to merge consecutive <p> elements, but it is not that simple, because in some cases the original document really does intend the lines to stay separate (e.g. in poetry).
An example of what my xhtml looks like is:
<p class="calibre1">In the bosom of one of those spacious coves which indent the eastern</p>
<p class="calibre1">shore of the Hudson, at that broad expansion of the river denominated</p>
<p class="calibre1">by the ancient Dutch navigators the Tappan Zee, and where they always</p>
<p class="calibre1">prudently shortened sail and implored the protection of St. Nicholas</p>
<p class="calibre1">when they crossed, there lies a small market town or rural port, which</p>
<p class="calibre1">by some is called Greensburgh, but which is more generally and properly</p>
<p class="calibre1">known by the name of Tarry Town. This name was given, we are told, in</p>
<p class="calibre1">former days, by the good housewives of the adjacent country, from the</p>
<p class="calibre1">inveterate propensity of their husbands to linger about the village</p>
<p class="calibre1">tavern on market days. Be that as it may, I do not vouch for the fact,</p>
<p class="calibre1">but merely advert to it, for the sake of being precise and authentic.</p>
<p class="calibre1">Not far from this village, perhaps about two miles, there is a little</p>
<p class="calibre1">valley or rather lap of land among high hills, which is one of the</p>
<p class="calibre1">quietest places in the whole world. A small brook glides through it,</p>
<p class="calibre1">with just murmur enough to lull one to repose; and the occasional</p>
<p class="calibre1">whistle of a quail or tapping of a woodpecker is almost the only sound</p>
<p class="calibre1">that ever breaks in upon the uniform tranquillity.</p>
<p class="block">*</p>
<p class="calibre1">I recollect that, when a stripling, my first exploit in</p>
<p class="calibre1">squirrel-shooting was in a grove of tall walnut-trees that shades one</p>
<p class="calibre1">side of the valley. I had wandered into it at noontime, when all nature</p>
<p class="calibre1">is peculiarly quiet, and was startled by the roar of my own gun, as it</p>
<p class="calibre1">broke the Sabbath stillness around and was prolonged and reverberated</p>
<p class="calibre1">by the angry echoes. If ever I should wish for a retreat whither I might</p>
<p class="calibre1">steal from the world and its distractions, and dream quietly away the</p>
<p class="calibre1">remnant of a troubled life, I know of none more promising than this</p>
<p class="calibre1">little valley.</p>
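To make the regexp idea concrete, here is a minimal sketch of what I have in mind, in Python. It assumes every wrapped line is a single-line <p> with the same class ("calibre1" in my sample) and that any other line, such as the <p class="block"> separator, marks a paragraph boundary. The function name merge_wrapped_paragraphs and its wrapped_class parameter are my own invention, not part of calibre.

```python
import re

# Matches one single-line paragraph as it appears in the sample above.
P_LINE = re.compile(r'<p class="(?P<cls>[^"]+)">(?P<text>.*?)</p>\s*$')


def merge_wrapped_paragraphs(xhtml, wrapped_class="calibre1"):
    """Join runs of consecutive <p class=wrapped_class> lines into one <p>.

    Assumption: any line that is not such a paragraph ends the current run.
    """
    out = []
    run = []  # text fragments of the paragraph currently being merged

    def flush():
        # Emit the fragments collected so far as one paragraph.
        if run:
            out.append('<p class="%s">%s</p>' % (wrapped_class, " ".join(run)))
            run.clear()

    for line in xhtml.splitlines():
        m = P_LINE.match(line.strip())
        if m and m.group("cls") == wrapped_class:
            run.append(m.group("text").strip())
        else:
            flush()
            out.append(line)
    flush()
    return "\n".join(out)
```

This is exactly where the approach falls short: it merges every consecutive run, so a genuine paragraph break between two "calibre1" lines is lost, and poetry would be collapsed into a single paragraph unless it happens to use a different class.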