Metadata-Version: 2.1
Name: rfc3986
Version: 2.0.0
Summary: Validating URI References per RFC 3986
Home-page: http://rfc3986.readthedocs.io
Author: Ian Stapleton Cordasco
Author-email: graffatcolmingov@gmail.com
License: Apache 2.0
Platform: UNKNOWN
Classifier: Development Status :: 5 - Production/Stable
Classifier: Intended Audience :: Developers
Classifier: License :: OSI Approved :: Apache Software License
Classifier: Natural Language :: English
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3 :: Only
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Python :: 3.8
Classifier: Programming Language :: Python :: 3.9
Classifier: Programming Language :: Python :: 3.10
Classifier: Programming Language :: Python :: Implementation :: CPython
Requires-Python: >=3.7
Description-Content-Type: text/x-rst
License-File: LICENSE
Provides-Extra: idna2008
Requires-Dist: idna ; extra == 'idna2008'

rfc3986
=======

A Python implementation of `RFC 3986`_ including validation and authority
parsing.

Installation
------------

Use pip to install ``rfc3986`` like so::

    pip install rfc3986

License
-------

`Apache License Version 2.0`_

Example Usage
-------------

The following are the two most common use cases envisioned for ``rfc3986``.

Replacing ``urlparse``
``````````````````````

To parse a URI and receive something very similar to the standard library's
``urllib.parse.urlparse``:

.. code-block:: python

    from rfc3986 import urlparse

    ssh = urlparse('ssh://user@git.openstack.org:29418/openstack/glance.git')
    print(ssh.scheme)  # => ssh
    print(ssh.userinfo)  # => user
    print(ssh.params)  # => None
    print(ssh.port)  # => 29418

To create a copy of it with new pieces you can use ``copy_with``:

.. code-block:: python

    new_ssh = ssh.copy_with(
        scheme='https',
        userinfo='',
        port=443,
        path='/openstack/glance'
    )
    print(new_ssh.scheme)  # => https
    print(new_ssh.userinfo)  # => None
    # etc.

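
For comparison, here is a rough sketch of the same steps using only the
standard library's ``urllib.parse.urlparse``. The names ``std_urlparse``,
``std_ssh`` and ``std_https`` are chosen for this illustration and are not part
of ``rfc3986``. The standard library result is a named tuple, so its
``_replace`` method stands in for ``copy_with``, and the authority has to be
rebuilt by hand as ``netloc``:

.. code-block:: python

    from urllib.parse import urlparse as std_urlparse

    std_ssh = std_urlparse(
        'ssh://user@git.openstack.org:29418/openstack/glance.git'
    )
    print(std_ssh.scheme)  # => ssh
    print(std_ssh.username)  # => user
    print(std_ssh.hostname)  # => git.openstack.org
    print(std_ssh.port)  # => 29418

    # No ``copy_with`` here: replace the tuple fields directly and spell out
    # the whole authority (host and port, without userinfo) as ``netloc``.
    std_https = std_ssh._replace(
        scheme='https',
        netloc='git.openstack.org:443',
        path='/openstack/glance',
    )
    print(std_https.geturl())  # => https://git.openstack.org:443/openstack/glance
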
Strictly Parsing a URI and Applying Validation
``````````````````````````````````````````````

To parse a URI into a convenient named tuple, use ``uri_reference``:

.. code-block:: python

    from rfc3986 import uri_reference

    example = uri_reference('http://example.com')
    email = uri_reference('mailto:user@domain.com')
    ssh = uri_reference('ssh://user@git.openstack.org:29418/openstack/keystone.git')

With a parsed URI you can access data about the components:

.. code-block:: python

    print(example.scheme)  # => http
    print(email.path)  # => user@domain.com
    print(ssh.userinfo)  # => user
    print(ssh.host)  # => git.openstack.org
    print(ssh.port)  # => 29418

It can also parse URIs with unicode present:

.. code-block:: python

    uni = uri_reference(b'http://httpbin.org/get?utf8=\xe2\x98\x83')  # ☃
    print(uni.query)  # => utf8=%E2%98%83

With a parsed URI you can also validate it:

.. code-block:: python

    import subprocess

    if ssh.is_valid():
        subprocess.call(['git', 'clone', ssh.unsplit()])

You can also take a parsed URI and normalize it:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM')
    print(mangled.scheme)  # => hTTp
    print(mangled.authority)  # => exAMPLe.COM

    normal = mangled.normalize()
    print(normal.scheme)  # => http
    print(normal.authority)  # => example.com

But these two URIs are (functionally) equivalent:

.. code-block:: python

    import webbrowser

    if normal == mangled:
        webbrowser.open(normal.unsplit())

Your paths, queries, and fragments are safe with us though:

.. code-block:: python

    mangled = uri_reference('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')
    normal = mangled.normalize()
    assert normal == 'hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth'
    assert normal == 'http://example.com/Some/reallY/biZZare/pAth'
    assert normal != 'http://example.com/some/really/bizzare/path'

If you do not actually need a real reference object and just want to normalize
your URI:

.. code-block:: python

    from rfc3986 import normalize_uri

    assert (normalize_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth') ==
            'http://example.com/Some/reallY/biZZare/pAth')

You can also very simply validate a URI:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('hTTp://exAMPLe.COM/Some/reallY/biZZare/pAth')

Requiring Components
~~~~~~~~~~~~~~~~~~~~

You can validate that a particular string is a valid URI and require
independent components:

.. code-block:: python

    from rfc3986 import is_valid_uri

    assert is_valid_uri('http://localhost:8774/v2/resource',
                        require_scheme=True,
                        require_authority=True,
                        require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    assert is_valid_uri('mailto:user@example.com', require_authority=True) is False

If you have an instance of a ``URIReference``, you can pass the same arguments
to ``URIReference.is_valid``, e.g.,

.. code-block:: python

    from rfc3986 import uri_reference

    http = uri_reference('http://localhost:8774/v2/resource')
    assert http.is_valid(require_scheme=True,
                         require_authority=True,
                         require_path=True)

    # Assert that a mailto URI is invalid if you require an authority
    # component
    mailto = uri_reference('mailto:user@example.com')
    assert mailto.is_valid(require_authority=True) is False

Alternatives
------------

- rfc3987

  This is a direct competitor to this library, with extra features,
  licensed under the GPL.

- uritools

  This can parse URIs in the manner of RFC 3986 but provides no validation and
  only recently added Python 3 support.

- Standard library's ``urlparse``/``urllib.parse``

  The functions in these libraries can only split a URI (valid or not) and
  provide no validation.

Contributing
------------

This project follows and enforces the Python Software Foundation's Code of
Conduct.

If you would like to contribute but do not have a bug or feature in mind, feel
free to email Ian and find out how you can help.

The git repository for this project is maintained at
https://github.com/python-hyper/rfc3986

.. _RFC 3986: https://datatracker.ietf.org/doc/html/rfc3986/
.. _Apache License Version 2.0: https://www.apache.org/licenses/LICENSE-2.0