From @christianbundy on Fri Jun 07 2019 20:28:05 GMT+0000 (UTC)
I was wrong, thanks for correcting me! It looks like we’re complying with everything the standard says we must do; we’re just going against what it says we should do. (Obligatory RFC 2119.)
A host identified by a registered name is a sequence of characters
usually intended for lookup within a locally defined host or service
name registry, though the URI’s scheme-specific semantics may require
that a specific registry (or fixed name table) be used instead. The
most common name registry mechanism is the Domain Name System (DNS).
A registered name intended for lookup in the DNS uses the syntax
defined in Section 3.5 of [RFC1034] and Section 2.1 of [RFC1123].
Such a name consists of a sequence of domain labels separated by “.”,
each domain label starting and ending with an alphanumeric character
and possibly also containing “-” characters. The rightmost domain
label of a fully qualified domain name in DNS may be followed by a
single “.” and should be if it is necessary to distinguish between
the complete domain name and some local domain.
[…] URI producers should use names
that conform to the DNS syntax, even when use of DNS is not
immediately apparent, and should limit these names to no more than
255 characters in length.
<domain> ::= <subdomain> | " "
<subdomain> ::= <label> | <subdomain> "." <label>
The labels must follow the rules for ARPANET host names. They must
start with a letter, end with a letter or digit, and have as interior
characters only letters, digits, and hyphen. There are also some
restrictions on the length. Labels must be 63 characters or less.
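
To make the label rules quoted above concrete, here is a minimal Python sketch of a check for the recommended DNS syntax. The function name `is_dns_hostname` and the sample hosts are hypothetical, and the 64-character hexadecimal key is an assumption about what a Dat host looks like. It follows the relaxed wording in the RFC 3986 quote (labels start and end with an alphanumeric character) rather than the stricter letter-first rule in RFC 1034.

```python
import re

# One DNS label: 1-63 characters, alphanumeric at both ends,
# with only letters, digits, and hyphens in between.
LABEL_RE = re.compile(r"[A-Za-z0-9](?:[A-Za-z0-9-]{0,61}[A-Za-z0-9])?")

def is_dns_hostname(host: str) -> bool:
    """Return True if `host` follows the DNS syntax that RFC 3986 recommends."""
    if host.endswith("."):
        host = host[:-1]  # a single trailing dot is allowed on a fully qualified name
    if not host or len(host) > 255:
        return False
    return all(LABEL_RE.fullmatch(label) for label in host.split("."))

# Hypothetical hosts for illustration only; the key below is made up.
key = "ab" * 32  # 64 hex characters, one more than a single label allows
print(is_dns_hostname("example.com"))  # True
print(is_dns_hostname(key))            # False: the label is longer than 63 characters
print(is_dns_hostname("example+5"))    # False: "+" is not allowed in a label
```

Under these assumptions, a host fails the recommended syntax for either of the two reasons discussed here: a single label longer than 63 characters, or a “+” in the name.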
TL;DR: Each label in the host should be 63 or fewer characters of alphanumerics and hyphens (and the whole host no more than 255 characters), and if you’re using the host for a DNS lookup then you must conform to that syntax. We aren’t required to use the 63-char limit or remove the “+”, but we should be mindful that what we’re doing is not recommended:
- SHOULD NOT This phrase, or the phrase “NOT RECOMMENDED” mean that
there may exist valid reasons in particular circumstances when the
particular behavior is acceptable or even useful, but the full
implications should be understood and the case carefully weighed
before implementing any behavior described with this label.
In this case the “full implications” are mostly that URI parsers may not recognize Dat URIs or parse them as links, which is the reason I opened this issue (see https://github.com/markdown-it/mdurl/issues/2). Technically our URI scheme doesn’t violate any standards, but it has some properties that aren’t recommended, and those hurt interoperability with URI parsers that don’t support our edge case.
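
The reason this is technically allowed is that the generic grammar in RFC 3986 defines a registered name as any sequence of unreserved characters, percent-encoded octets, and sub-delimiters, and “+” is one of the sub-delimiters. A rough Python sketch of that grammar, covering only registered names and not IP literals, might look like the following. The function name `is_valid_reg_name` and the sample hosts are hypothetical.

```python
import re

# RFC 3986: reg-name = *( unreserved / pct-encoded / sub-delims )
UNRESERVED = r"A-Za-z0-9\-._~"   # ALPHA / DIGIT / "-" / "." / "_" / "~"
SUB_DELIMS = r"!$&'()*+,;="      # includes "+"
REG_NAME_RE = re.compile(
    rf"(?:[{UNRESERVED}{SUB_DELIMS}]|%[0-9A-Fa-f]{{2}})*"
)

def is_valid_reg_name(host: str) -> bool:
    """Return True if `host` matches the generic reg-name grammar."""
    return REG_NAME_RE.fullmatch(host) is not None

# Hypothetical hosts for illustration only.
print(is_valid_reg_name("example.com"))  # True
print(is_valid_reg_name("abc123+5"))     # True: "+" is a permitted sub-delimiter
print(is_valid_reg_name("bad/host"))     # False: "/" ends the authority component
```

A parser that accepts only DNS-style hostnames is stricter than this grammar, which is where the interoperability gap comes from.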