python - Excluding web links with specific extensions in web scraper


I need to keep my web scraper from printing links that end in .od, .jpg, .pdf or .mp3.

Here's the if statement:

if link in linklist():
    print link

Is there a library in Python that does that? I know of "regex", but I'm not the greatest user of it.

Assuming link is a path, you can do the following:

import os

if os.path.splitext(link)[1] not in ['.jpg', '.pdf', '.mp3']:
    print link

The function splitext takes a path and returns a tuple containing the path without the extension, followed by the extension. For example:

>>> os.path.splitext('http://www.example.com/path/to/filename.ext')
('http://www.example.com/path/to/filename', '.ext')

So if you split the link with this function, you can check whether the last element of the tuple is a member of a list/set/tuple containing a blacklist of extensions.
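Real links often carry a query string or fragment, use upper-case extensions, or have no path at all, and calling splitext on the raw URL misreads those cases: a bare 'http://www.example.com' yields '.com' as its "extension". A minimal sketch of one way to handle this, assuming Python 3 and the standard urllib.parse module, is shown below. The names EXCLUDED_EXTENSIONS and is_excluded and the sample links are hypothetical, chosen only for illustration.

```python
import os
from urllib.parse import urlparse

# Hypothetical blocklist; add any other extensions you want to skip.
EXCLUDED_EXTENSIONS = {".jpg", ".pdf", ".mp3"}


def is_excluded(link, excluded=EXCLUDED_EXTENSIONS):
    """Return True if the path part of the link ends in an excluded extension."""
    # Look only at the path, so "?query" and "#fragment" are ignored.
    path = urlparse(link).path
    # Lower-case the extension so ".JPG" matches ".jpg".
    extension = os.path.splitext(path)[1].lower()
    return extension in excluded


# Made-up sample links standing in for the scraper's results.
links = [
    "http://www.example.com/page.html",
    "http://www.example.com/photo.JPG",
    "http://www.example.com/report.pdf?download=1",
    "http://www.example.com/about",
]

kept = [link for link in links if not is_excluded(link)]
for link in kept:
    print(link)
```

A set is used for the blocklist because membership tests on it stay fast as the list of extensions grows; a list or tuple would work the same way for a handful of entries.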

