Python - Remove empty lines from HTML, except in <code></code> blocks
I'm using Editorial to write posts for a WordPress blog in Markdown.
The Markdown parser outputs the HTML code perfectly, and Editorial's embedded viewer shows the result with the expected format and style.
When I paste the HTML into the WordPress mobile editor, it shows the text in the wrong format, with empty lines.
For example:
    # header
    hello world, **this markdown!**

    other markdown paragraph!.
is parsed to:
    <h1>header</h1>

    <p>hello world, <strong>this markdown!</strong></p>

    <p>other markdown paragraph!. </p>
which is shown in the viewer as expected.
The WordPress mobile app, on the other hand, shows the same HTML code with empty lines between the elements.
I think the WordPress CSS sheet has the margins of paragraphs and headers configured to put one empty line above and one empty line below. I cannot modify the CSS, so the brute-force solution is to remove the blank lines between paragraphs in the HTML code. It works fine, but the process is tedious.
So I want to use the powerful tools of Editorial to build a workflow that automates the process.
The goal is to write a Python script which takes the generated HTML and erases the empty lines, being careful not to erase the empty lines located in code blocks, which are source code examples.
I'm thinking of a solution that uses regular expressions to find the empty lines and skip the code blocks, but I'm pretty new to Python and its libraries, and the code snippets I have tried didn't work.
Can anybody provide me with an example of how to achieve this, or a general guideline so I can write it myself?
Thanks.
PS: Posting this kind of question without an example/source code of what I have tried is a bad idea, I know, but my Python code is a noob-messy-bunch of code without sense, so I decided not to post it.
Let's assume you have loaded your HTML text into a variable (html):
    html = """
    html

    html
    html
    code-start
    code

    code
    code
    code-end
    """

    new_html = ""
    is_code = False
    for line in html.split('\n'):
        # disable the empty-line remover when a code block starts
        if line == 'code-start':
            is_code = True
        # keep the line if it is inside a code block or is not empty
        if is_code or line != '':
            new_html += line + '\n'
        # re-enable the empty-line remover when the code block ends
        if line == 'code-end':
            is_code = False
    print(new_html)
Of course, you have to replace code-start and code-end with valid HTML tags.
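To illustrate that replacement, here is a minimal sketch of the same loop wrapped in a function that watches for real tags. It assumes the code blocks in the generated HTML are delimited by `<pre>` ... `</pre>` and that lines end with `\n`; the function name `strip_blank_lines` and its `open_tag`/`close_tag` parameters are made up for this example. Unlike the loop above, it also treats whitespace-only lines as empty.

```python
def strip_blank_lines(html, open_tag='<pre', close_tag='</pre>'):
    """Drop blank lines from html, except between open_tag and close_tag."""
    kept = []
    in_code = False
    for line in html.split('\n'):
        # A code block starts on this line: stop removing blank lines.
        if open_tag in line:
            in_code = True
        # Keep everything inside code; outside, keep only non-blank lines.
        if in_code or line.strip():
            kept.append(line)
        # The code block ends on this line: resume removing blank lines.
        if close_tag in line:
            in_code = False
    return '\n'.join(kept)


# Hypothetical sample: two blocks of text and one code block with a blank line.
sample = ("<h1>header</h1>\n\n<p>text</p>\n\n"
          "<pre><code>a = 1\n\nb = 2\n</code></pre>\n\n<p>end</p>\n")
print(strip_blank_lines(sample))
```

If your Markdown parser emits a different wrapper for code (for example a bare `<code>` element), pass the matching tags as `open_tag` and `close_tag`.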
This quick and dirty approach should help you.
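The regular-expression idea from the question can be sketched as well, using only the standard `re` module. This is one possible approach, not a robust HTML parser: it assumes code blocks are wrapped in `<pre>` ... `</pre>`, are not nested, and that lines end with `\n`. The names `CODE_BLOCK`, `BLANK_LINES` and `remove_blank_lines` are invented for the example.

```python
import re

# Matches a whole <pre>...</pre> block, including its tags and any attributes.
CODE_BLOCK = re.compile(r'(<pre\b.*?</pre>)', re.DOTALL | re.IGNORECASE)
# Matches a line break followed by one or more blank (whitespace-only) lines.
BLANK_LINES = re.compile(r'\n(?:[ \t]*\n)+')


def remove_blank_lines(html):
    # Because the pattern has a capturing group, re.split keeps the code
    # blocks in the result: they land at the odd indexes of the list.
    parts = CODE_BLOCK.split(html)
    for i in range(0, len(parts), 2):
        # Even indexes hold the text outside code blocks: collapse blank lines.
        parts[i] = BLANK_LINES.sub('\n', parts[i])
    return ''.join(parts)


# Hypothetical sample: two blocks of text and one code block with a blank line.
sample = ("<h1>header</h1>\n\n<p>text</p>\n\n"
          "<pre><code>a = 1\n\nb = 2\n</code></pre>\n\n<p>end</p>\n")
print(remove_blank_lines(sample))
```

Splitting on the code blocks first means the blank-line pattern never sees their contents, so the empty lines inside source code examples are left untouched.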