urllib.parse covers two areas: URL parsing and URL quoting.
The URL parsing functions focus on splitting a URL string into its components, or on combining URL components back into a URL string.
urllib.parse.urlparse(urlstring, scheme='', allow_fragments=True)
>>> from urllib.parse import urlparse
>>> o = urlparse('http://www.cwi.nl:80/%7Eguido/Python.html')
>>> o
ParseResult(scheme='http', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
params='', query='', fragment='')
>>> o.scheme
'http'
>>> o.port
80
>>> o.geturl()
'http://www.cwi.nl:80/%7Eguido/Python.html'
>>> from urllib.parse import urlparse
>>> urlparse('//www.cwi.nl:80/%7Eguido/Python.html')
ParseResult(scheme='', netloc='www.cwi.nl:80', path='/%7Eguido/Python.html',
params='', query='', fragment='')
>>> urlparse('www.cwi.nl/%7Eguido/Python.html')
ParseResult(scheme='', netloc='', path='www.cwi.nl/%7Eguido/Python.html',
params='', query='', fragment='')
>>> urlparse('help/Python.html')
ParseResult(scheme='', netloc='', path='help/Python.html', params='',
query='', fragment='')
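urllib.parse.urlsplit(urlstring, scheme='', allow_fragments=True)
Used the same way as urlparse. The difference is that urlsplit does not split params out of the path, so its result has five fields instead of six.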
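A minimal sketch of how the two functions differ on the same input. The example.com URL and its ";type=a" parameter are made up for illustration and are not from the original notes.

```python
from urllib.parse import urlparse, urlsplit

# Hypothetical URL whose last path segment carries ";" parameters.
url = "http://www.example.com/path;type=a?q=1#top"

parsed = urlparse(url)  # 6 fields: params is split out of the path
split = urlsplit(url)   # 5 fields: ";type=a" stays inside the path

print(parsed.path, parsed.params)  # /path type=a
print(split.path)                  # /path;type=a
print(len(parsed), len(split))     # 6 5
```

If the URLs you handle never use ";" parameters, the two results differ only by the extra empty params field.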
Combining URLs
urllib.parse.urlencode() builds a query string from a mapping or a sequence of two-element tuples.
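A short sketch of urlencode in use. The parameter names and the example.com search URL are assumptions chosen for illustration.

```python
from urllib.parse import urlencode

# Hypothetical query parameters; values are converted to strings and quoted.
params = {"q": "python url", "page": 2}
query = urlencode(params)
print(query)  # q=python+url&page=2

# doseq=True expands a sequence value into one key=value pair per element.
multi = urlencode({"tag": ["a", "b"]}, doseq=True)
print(multi)  # tag=a&tag=b

# The result is only the query string; attach it to a URL yourself.
full_url = "http://www.example.com/search?" + query
print(full_url)
```

By default spaces become "+" because urlencode quotes values with quote_plus.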
urllib.parse.urlunsplit(parts)
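A brief sketch of urlunsplit as the inverse of urlsplit: split a URL, change some components, and put it back together. The replacement path and the "lang=en" query are hypothetical values.

```python
from urllib.parse import urlsplit, urlunsplit

parts = urlsplit("http://www.cwi.nl:80/%7Eguido/Python.html")

# SplitResult is a named tuple, so _replace returns a modified copy.
new_parts = parts._replace(path="/%7Eguido/FAQ.html", query="lang=en")
rebuilt = urlunsplit(new_parts)
print(rebuilt)  # http://www.cwi.nl:80/%7Eguido/FAQ.html?lang=en

# Any 5-item iterable works: (scheme, netloc, path, query, fragment).
manual = urlunsplit(("https", "www.example.com", "/a/b", "x=1", "frag"))
print(manual)  # https://www.example.com/a/b?x=1#frag
```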
urllib.parse.urljoin(base, url, allow_fragments=True)
>>> from urllib.parse import urljoin
>>> urljoin('http://www.cwi.nl/%7Eguido/Python.html', 'FAQ.html')
'http://www.cwi.nl/%7Eguido/FAQ.html'
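A few more cases, sketched to show how the form of the second argument changes the result. The relative references used here ("../index.html", "/docs/") are illustrative choices, not taken from the original notes.

```python
from urllib.parse import urljoin

base = "http://www.cwi.nl/%7Eguido/Python.html"

# A plain relative name replaces the last path segment of the base.
sibling = urljoin(base, "FAQ.html")
# ".." climbs one directory before appending.
parent = urljoin(base, "../index.html")
# A leading "/" replaces the whole path but keeps scheme and host.
rooted = urljoin(base, "/docs/")
# A leading "//" replaces the host as well and keeps only the scheme.
other_host = urljoin(base, "//www.python.org/%7Eguido")

print(sibling)     # http://www.cwi.nl/%7Eguido/FAQ.html
print(parent)      # http://www.cwi.nl/index.html
print(rooted)      # http://www.cwi.nl/docs/
print(other_host)  # http://www.python.org/%7Eguido
```

Because a "/" or "//" prefix discards parts of the base, it is worth checking the second argument before joining when it comes from untrusted input.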