diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml
index d84deb735..16fe3ee92 100644
--- a/specifications/xpath-functions-40/src/function-catalog.xml
+++ b/specifications/xpath-functions-40/src/function-catalog.xml
@@ -27534,6 +27534,8 @@ declare function some(
any backslashes (\
), replace them with forward
slashes (/
).
Strip off the fragment identifier and any query:
+If the ^(.*)#([^#]*)$
,
the
Attempt to identify the scheme:
+If the ^[a-zA-Z][:|].*$
:
If the unc-path
option is true
, and the ^//[^/].*$
, then the scheme is file
- and the
Now that the scheme, if there is one, has been identified,
+ determine if the URI is hierarchical:
+If the true
if /
and false
otherwise.
true
if /
and
+ false
otherwise.
+ If file
and ^//*([a-zA-Z]:.*)$
,
- the ^///*([^/]+)(/.*)?$
then the
Then examine the remaining parts of the string.
-If the If the Otherwise: If the scheme is not known or is known to be Otherwise, if the Finally, if the ^//*([a-zA-Z]:.*)$
,
+ unc-path
option is true
, and the
+ ^//[^/].*$
, then the
+ scheme is file
, the file
+ and the ^//*([a-zA-Z]:.*)$
,
the ^///*([^/]+)(/.*)?$
then the ^///*([^/]+)?(/.*)?$
, the
If the ^(([^@]*)@)(.*)(:([^:]*))?$
,
@@ -27657,23 +27673,23 @@ declare function some(
Similar care must be taken to match the port because an IPv6/IPvFuture address may contain a colon.
-If the ^(([^@]*)@)?(\[[^\]]*\])(:([^:]*))?$
,
- then the
If the
Otherwise, if the ^(([^@]*)@)?([^:]+)(:([^:]*))?$
,
- then the
the
Otherwise, the
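The two patterns above can be tried in order, bracketed form first, so that the colons inside an IPv6/IPvFuture literal are never mistaken for a port separator. The following Python sketch illustrates this; the function name `split_authority`, the mapping of capture groups 2, 3 and 5 to userinfo, host and port, and the fallback when neither pattern matches are assumptions made for illustration, not text taken from the specification.

```python
import re

# Patterns copied from the text; the bracketed form is tried first.
BRACKETED = re.compile(r"^(([^@]*)@)?(\[[^\]]*\])(:([^:]*))?$")
PLAIN = re.compile(r"^(([^@]*)@)?([^:]+)(:([^:]*))?$")

def split_authority(authority):
    """Return (userinfo, host, port); parts that are absent are None."""
    match = BRACKETED.match(authority) or PLAIN.match(authority)
    if match is None:
        # Assumption: if neither pattern matches, keep the whole string as the host.
        return (None, authority, None)
    # Assumed group mapping: 2 = userinfo, 3 = host, 5 = port.
    return (match.group(2), match.group(3), match.group(5))
```

In this sketch the host of a bracketed address keeps its square brackets, because group 3 of the first pattern includes them.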
If the omit-default-ports
option is true
, the port
is discarded and set to the empty sequence if the port number is the same
@@ -27697,20 +27713,8 @@ declare function some(
separator and applying
Applying +
) with spaces and all occurrences of
- %[a-fA-F0-9][a-fA-F0-9]
with a single character with the
- codepoint represented by the two digit hexadecimal number that
- follows the %
character. In other words, "A%42C"
becomes
- "ABC"
If there are any occurrences of %
followed
- by up to two characters that are not hexadecimal digits, they are
- replaced by the character sequence 0xef
, 0xbf
, 0xbd
- (that is, 0xfffd
, the Unicode replacement character, in UTF-8).
- After replacing all of the percent-escaped characters, the character sequence is
- interpreted as UTF-8 to get the string. In other words "A%XYC%Z%F0%9F%92%A9"
becomes
- "A�C�💩"
. 0xfffd
.
Applying fn:decode-from-uri
on the string.
The query-separator
option.
@@ -28292,20 +28296,26 @@ path with an explicit file:
scheme.
The components are derived from the contents of the $parts
map in the following way:
If the scheme
key is present in the map, the URI begins
- with the value of that key. A URI is considered to be non-hierarchical
- if either the hierarchical
key is present in the
- $parts
map with the value
- false()
or if the scheme is known to be non-hierarchical.
- (In other words, schemes are hierarchical by default.)
If the scheme
is file
and the unc-path
- option is true
, the scheme is delimited by a trailing :////
,
- otherwise, if the URI is non-hierarchical, the scheme is delimited by
- a trailing :
. For all other schemes, it is delimited by
- a trailing ://
. Exactly which schemes are known to be
- non-hierarchical is
-
If the scheme
key is present in the map,
+ the URI begins with the value of that key. A URI is considered to be
+ non-hierarchical if either the hierarchical
key
+ is present in the $parts
map with the value
+ false()
or if the scheme is known to be
+ non-hierarchical. (In other words, schemes are hierarchical by
+ default.)
If the scheme
is
+ known to be non-hierarchical, it is delimited by a trailing
+ :
.
Otherwise, if the scheme
is file
and the unc-path
+ option is true
, the scheme is delimited by a trailing :////
.
Otherwise, the scheme is delimited by
+ a trailing ://
.
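The delimiter rule above can be sketched in Python as follows. This is an illustration, not the specification's algorithm: the function name `scheme_delimiter`, the dict standing in for the `$parts` map, and the `unc_path` parameter standing in for the `unc-path` option are all hypothetical. It also assumes that the non-hierarchical test defined above (either `hierarchical` is `false()` or the scheme is known to be non-hierarchical) is what selects the `:` delimiter, and it compares scheme names case-sensitively.

```python
# Schemes that must be treated as non-hierarchical; an implementation may know more.
NON_HIERARCHICAL = {"jar", "mailto", "news", "tag", "tel", "urn"}

def scheme_delimiter(parts, unc_path=False):
    """Return the scheme followed by its delimiter, or "" if there is no scheme."""
    scheme = parts.get("scheme")
    if scheme is None:
        return ""
    # Assumption: an explicit hierarchical=False counts as non-hierarchical too.
    non_hierarchical = parts.get("hierarchical") is False or scheme in NON_HIERARCHICAL
    if non_hierarchical:
        return scheme + ":"
    if scheme == "file" and unc_path:
        return scheme + ":////"
    return scheme + "://"
```

The order of the checks matters: the non-hierarchical case is tested before the `file` and `unc-path` case, matching the order of the rules in the revised text.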
For simplicity of exposition, we take the
userinfo
, host
, and
@@ -28501,4 +28511,4 @@ path with an explicit file:
scheme.
Some URI schemes are hierarchical and some are non-hierarchical.
+ Implementations must treat the following schemes as non-hierarchical:
+ jar
, mailto
, news
, tag
,
+ tel
, and urn
. Whether additional schemes
+ are known to be non-hierarchical
+
The structured representation of a URI is described by the
@@ -3312,8 +3321,6 @@ It is recommended that implementers consult
The parts of this structure are:
Parsed and unescaped path segments.
-query-segments
+query-parameters
Parsed and unescaped query terms
The segmented forms of the path and query parameters provide
- convenient access to commonly used information. They’re represented
- in the map as arrays, instead of sequences, just for the convenience
- of serializing the structure.
+ convenient access to commonly used information.
The path, if there is one, is tokenized on “/” characters and
- each segment is unesaped. Consider the URI http://example.com/path/to/a%2fb
. The path portion has to be returned as /path/to/a%2fb
because
+ each segment is unescaped (as per the fn:decode-from-uri
function). Consider the URI
+ http://example.com/path/to/a%2fb
.
+ The path portion has to be returned as /path/to/a%2fb
because
decoding the %2f
would change the nature of the path.
- The unescaped form is easily accessible from the path-segments array:
Note that the presence or absence of a leading slash on the path will affect whether or not the array begins with an empty string.
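The order of operations (split first, then decode each segment) is what keeps an escaped slash inside its segment. A minimal Python sketch, assuming `urllib.parse.unquote` as a stand-in for `fn:decode-from-uri` (the two may differ on malformed escapes) and using the hypothetical helper name `path_segments`:

```python
from urllib.parse import unquote

def path_segments(path):
    # Split on "/" first, then decode each segment, so an escaped
    # slash (%2f) stays inside its segment instead of creating a new one.
    return [unquote(segment) for segment in path.split("/")]

print(path_segments("/path/to/a%2fb"))  # ['', 'path', 'to', 'a/b']
print(path_segments("path/to/a%2fb"))   # ['path', 'to', 'a/b']
```

The first call shows the leading empty string produced by a leading slash; the second shows the same path without it.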
-The query parameters are similarly decoded. Consider the URI:
+The query parameters are decoded into a map. Consider the URI:
http://example.com/path?a=1&b=2%264&a=3
.
- Here the decoded form in the query-segments gives quick access to
- the parameter values:
Note that both keys and values are unescaped and that it’s an array
- of maps because key values can be repeated, as seen for a
+ The decoded form in the query-parameters is the following map:
Note that both keys and values are unescaped. If a key
+ is repeated in the query string, the map will contain a
+ sequence of values for that key, as seen for a
in this example.
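The same decoding can be approximated in Python with the standard library. This is only an analogy, offered as a sketch: `urllib.parse.parse_qs` is not the specification's algorithm, the helper name `query_parameters` is hypothetical, and Python stores every value as a list, including the value of a key that appears only once.

```python
from urllib.parse import parse_qs

def query_parameters(query):
    # parse_qs splits on "&" and "=", then unescapes keys and values;
    # values for a repeated key accumulate in order of appearance.
    return parse_qs(query, keep_blank_values=True)

print(query_parameters("a=1&b=2%264&a=3"))
# {'a': ['1', '3'], 'b': ['2&4']}
```

The escaped ampersand (`%26`) in the value of `b` is decoded only after the query has been split, so it does not introduce a third parameter.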