Python - remove empty lines from HTML, except in <code></code> blocks


I'm using Editorial to write posts for my WordPress blog in Markdown.

The Markdown parser outputs the HTML code perfectly, and Editorial's embedded viewer shows the result with the expected format and style.
But when I paste the HTML into the WordPress mobile editor, it shows the text in the wrong format, with extra empty lines.

For example:

    # header
    hello world, **this markdown!**

    other markdown paragraph!.

is parsed to:

    <h1>header</h1>

    <p>hello world, <strong>this markdown!</strong></p>

    <p>other markdown paragraph!. </p>

which is shown in the viewer as:

[screenshot: the result rendered in the Editorial viewer]

which is what I expected.

The WordPress mobile app, on the other hand, shows the HTML code as:

[screenshot: the same HTML in the WordPress mobile app]

As you can see, there are empty lines.

I think the WordPress CSS has the margins of paragraphs and headers configured to put one empty line above and one empty line below. I cannot modify the CSS, so my brute-force solution is to remove the blank lines between paragraphs in the HTML code. It works fine, but the process is tedious.

So I want to use the powerful tools of Editorial to build a workflow that automates the process.
The goal is to write a Python script which takes the generated HTML and erases the empty lines, being careful not to erase empty lines located inside code blocks, which are source code examples.

I'm thinking of a solution that uses regular expressions to find the empty lines and skip the code blocks, but I'm pretty new to Python and its libraries, and the code snippets I have tried didn't work.

Can anybody provide me with an example of how to achieve this, or a general guideline so I can write it myself?

Thanks.

PS: Posting this kind of question without an example or the source code of what I have tried is a bad idea, I know, but my Python code is a noob's messy bunch of code without sense, so I decided not to post it.

Let's assume you have already loaded the HTML text into a variable (html):

    html = """
    html
    html

    html

    code-start
    code
    code

    code
    code-end

    """

    new_html = ""
    is_code = False
    for line in html.split('\n'):
        # disable the empty line remover when code starts
        if line == 'code-start':
            is_code = True
        # keep the line if it is inside code or is not empty
        if is_code or line != '':
            new_html += line + '\n'
        # enable the empty line remover again when code ends
        if line == 'code-end':
            is_code = False

    print(new_html)

Of course, you have to replace code-start and code-end with valid HTML tags.
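As a rough sketch of that substitution, the same loop could look for real tags instead of the placeholder markers. This assumes the Markdown parser wraps code blocks in lowercase <pre> ... </pre> tags (not confirmed by the question), and the function name and the open_tag / close_tag parameters are made up for illustration:

```python
def strip_blank_lines_by_line(html, open_tag='<pre', close_tag='</pre>'):
    """Drop blank lines except between open_tag and close_tag.

    Hypothetical helper: the tag names are an assumption about what the
    Markdown parser emits, and the match is case-sensitive.
    """
    kept = []
    in_code = False
    for line in html.split('\n'):
        # Entering a code block: stop removing blank lines.
        if open_tag in line:
            in_code = True
        # Keep every line inside code, and every non-blank line outside.
        if in_code or line.strip():
            kept.append(line)
        # Leaving the code block: resume removing blank lines.
        if close_tag in line:
            in_code = False
    return '\n'.join(kept)
```

Because the tags are searched for with `in`, this also works when the opening tag carries attributes or shares a line with the first line of code. Note that a trailing newline at the end of the input is dropped along with the other blank lines.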

This quick and dirty approach should help you.
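Since the question mentions regular expressions, here is a minimal sketch of that route using only the standard re module. It assumes code lives in <pre> ... </pre> or <code> ... </code> elements that are not nested inside elements of the same name. The names PROTECTED_OR_BLANK and strip_blank_lines are invented for this example:

```python
import re

# One pattern with two alternatives, tried left to right at each position:
#   group 1: a whole <pre>...</pre> or <code>...</code> element
#   otherwise: a line containing only spaces, tabs or a carriage return
# DOTALL lets "." cross newlines; MULTILINE makes "^" match at line starts.
PROTECTED_OR_BLANK = re.compile(
    r'(<pre\b.*?</pre>|<code\b.*?</code>)'
    r'|^[ \t\r]*\n',
    re.DOTALL | re.IGNORECASE | re.MULTILINE)


def strip_blank_lines(html):
    """Remove blank lines that are outside <pre>/<code> elements."""
    # A protected element is put back unchanged; a blank line (group 1
    # is None in that case) is replaced with nothing.
    return PROTECTED_OR_BLANK.sub(lambda m: m.group(1) or '', html)
```

Matching the protected elements in the same pass means blank lines inside them are consumed as part of the element and are never seen by the blank-line alternative. For HTML that is more irregular than typical Markdown output, a real HTML parser would be a safer choice than a regular expression.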

