Extracting web page URLs with Python regular expressions

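import re
import urllib.request

url = "http://www.open-open.com"
# Download the page; read() returns bytes in Python 3, so decode to text.
s = urllib.request.urlopen(url).read().decode("utf-8", errors="replace")
# Remove all spaces (note: this also strips spaces inside link text).
ss = s.replace(" ", "")
# Find every complete <a ... href=...>...</a> element, ignoring case.
urls = re.findall(r"<a.*?href=.*?</a>", ss, re.I)
for i in urls:
    print(i)
else:
    # A for loop's else branch runs once the loop ends without a break.
    print('this is over')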
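
The script above collects whole <a>...</a> elements rather than bare URLs. If only the link targets are wanted, one possible approach is to capture the quoted value of the href attribute. The following is a minimal sketch, not the original author's method: the names HREF_RE, extract_hrefs and sample are hypothetical, it runs on an inline HTML string instead of a downloaded page, and it assumes href values are quoted. Regular expressions are fragile on real-world HTML, so a parser such as the standard library's html.parser is usually the safer choice.

```python
import re

# Hypothetical pattern (an assumption, not from the original script):
# matches an <a> tag and captures the quoted value of its href attribute.
HREF_RE = re.compile(r'<a\b[^>]*?\bhref\s*=\s*(["\'])(.*?)\1', re.I | re.S)


def extract_hrefs(html):
    """Return the href values of all <a> tags in html, in document order."""
    return [m.group(2) for m in HREF_RE.finditer(html)]


# Inline sample so the sketch runs without network access.
sample = '''<p><a href="http://www.open-open.com/lib">Lib</a>
<A class="x" HREF='/news'>News</A> <abbr title="t">no link</abbr></p>'''

for link in extract_hrefs(sample):
    print(link)
```

The \b after <a keeps tags such as <abbr> from matching, and the backreference \1 makes the closing quote match the opening one.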