From b5619e34635ec9f73d6aa3217a8f16feb81004ed Mon Sep 17 00:00:00 2001 From: Michael Kay Date: Fri, 15 Nov 2024 12:11:24 +0000 Subject: [PATCH] Minor corrections and refinements --- .../src/function-catalog.xml | 155 ++++++------------ .../src/xpath-functions.xml | 65 ++++++-- 2 files changed, 93 insertions(+), 127 deletions(-) diff --git a/specifications/xpath-functions-40/src/function-catalog.xml b/specifications/xpath-functions-40/src/function-catalog.xml index 6dbb80df3..326440c17 100644 --- a/specifications/xpath-functions-40/src/function-catalog.xml +++ b/specifications/xpath-functions-40/src/function-catalog.xml @@ -25476,25 +25476,26 @@ return json-to-xml($json, $options)]]> Names are output as lexical QNames, in the same form as they would appear if serialized using the XML output method. The result may - contain a namespace prefix: note that the output will not contain any information - enabling such prefixes to be resolved to a namespace URI. + contain a namespace prefix, but the output will not contain any information + enabling such a prefix to be resolved to a namespace URI. Namespace URIs in element and attribute names are discarded; only the local - names are output. If this leads to duplicate keys in a context where the names - must be unique, then the setting is ignored and "eqname" is used instead. + names are output. Names in a namespace are output in the form "Q{uri}local". Names in no namespace are output using the local name alone. - Element names in the default namespace of the top-level element node - (the node supplied in the $elements argument), and attribute names - in no namespace, are output using the local name alone. - All other names are output in the format "Q{uri}local", or Q{}local - in the case of a no-namespace element name where this is not the default. + An element name is output as a local name alone if either (a) it is + a top-level element and is in no namespace, or (b) it is in the same namespace as its + parent element. 
An attribute name is output as a local name alone if it is in no namespace. + All other names are output in the format "Q{uri}local" if in a namespace, + or "Q{}local" if in no namespace. "Top-level" here means that the element + is one that appears explicitly in the sequence of elements passed in the first argument, + as distinct from a descendant element. A mapping from element names to layout names, used to override the default formatting rules for a particular element name. map(xs:QName, enum("empty", "empty-plus", "simple", "simple-plus", "list", "list-plus", - "record", "sequence", "mixed", "xml", "html", "xhtml")) + "record", "sequence", "mixed", "xml")) map{} @@ -25504,7 +25505,7 @@ return json-to-xml($json, $options)]]>

The principles for conversion from elements to maps are described - in specref ref="xml-to-json-mappings"/>.

+ in .

In general, an element node maps to a key-value pair in which the key represents the element name, and the @@ -25516,117 +25517,53 @@ return json-to-xml($json, $options)]]>

The representation of other kinds of node depends on the layout chosen for its parent element.

- -

Strings are escaped as follows:

- - -

Any occurrence of backslash is replaced by \\

-
- - -

Any occurrence of quotation mark, backspace, form-feed, newline, carriage return, or tab is - replaced by \", \b, \f, \n, \r, or \t respectively;

-
- - -

Any solidus ("/") is - replaced by \/ if the escape-solidus option is set to - true (its default value) but is output as "/" if the option is set - to false;

-
- - -

Any other codepoint in the range 1-31 or 127-159 is replaced by an escape in - the form \uHHHH where HHHH is the upper-case hexadecimal representation of the codepoint value.

-
-

A dynamic error is raised if the selected layout rules require atomization of an element that does not have a typed value (typically because it has been validated against an element-only content model): .

- + - items-to-json(()) - 'null' - - - items-to-json(12) - '12' - - - items-to-json((12, "December")) - '[12,"December"]' + elements-to-maps(()) + () - items-to-json(true()) - 'true' + bar")/*)]]> + { "foo": "bar" } + + + + + + + ")/*)]]> + { "list": [ + { "@value": "1" }, + { "@value": "2" } + ] } + + + + Jane + Smith + + ")/*)]]> + { "name": { + "first": "Jane", + "last": "Smith" + } +} - + diff --git a/specifications/xpath-functions-40/src/xpath-functions.xml b/specifications/xpath-functions-40/src/xpath-functions.xml index 567eee04b..c4323a070 100644 --- a/specifications/xpath-functions-40/src/xpath-functions.xml +++ b/specifications/xpath-functions-40/src/xpath-functions.xml @@ -7719,6 +7719,13 @@ return Converting Elements to Maps + + + A new function fn:elements-to-maps is provided for converting XDM trees + to maps suitable for serialization as JSON. Unlike the fn:xml-to-json function + retained from 3.1, this can handle arbitrary XML as input. + +

The fn:elements-to-maps function converts XML element nodes to maps, in a form suitable for serialization as JSON. This section describes the mappings used by this function.

@@ -7751,7 +7758,7 @@ return
- + Element Layouts @@ -7814,7 +7821,7 @@ return

This specification defines a number of named mappings, called layouts, and allows the layout for a particular - element to be selected in three different ways:

+ element to be selected in four different ways:

The layout to be used for a specific element name can be explicitly selected in the options @@ -7822,12 +7829,13 @@ return

In the absence of an explicit selection, if the data has been schema-validated, the layout is inferred from the content model for the element type as defined in the schema.

-

Otherwise (that is, when the data is untyped and no specific layout has been selected), +

When the data is untyped and no specific layout has been selected, a default layout is chosen based on the properties of the individual element instance.

- +
+

If the uniform option is set to true, then the same layout will be used for all elements with a given name. This means that all elements need to be - examined before any element is processed. The layout chosen is the first one (in the order of + examined before any element is converted. The layout chosen is the first one (in the order of presentation in the following sections) whose match predicate matches every element with the relevant name.
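The selection rule for the uniform option can be sketched as follows. This is a minimal illustration, not the specification's algorithm: the element representation (plain dictionaries), the function name `choose_uniform_layouts`, and the layout predicates are all hypothetical placeholders. Only the "first layout, in presentation order, whose predicate matches every element with that name" logic comes from the text above.

```python
from collections import defaultdict

def choose_uniform_layouts(elements, layouts):
    """Pick one layout per element name.

    elements: iterable of dicts with a "name" key (hypothetical model).
    layouts:  ordered list of (layout_name, predicate) pairs, in the
              order of presentation; predicates are placeholders.
    """
    # All elements must be examined before any element is converted.
    by_name = defaultdict(list)
    for element in elements:
        by_name[element["name"]].append(element)

    chosen = {}
    for name, group in by_name.items():
        for layout_name, matches in layouts:
            # The first layout whose predicate matches EVERY element wins.
            if all(matches(e) for e in group):
                chosen[name] = layout_name
                break
    return chosen

# Hypothetical predicates, for illustration only.
example_layouts = [
    ("empty", lambda e: not e["text"] and not e["children"]),
    ("simple", lambda e: not e["children"]),
    ("mixed", lambda e: True),
]
example_elements = [
    {"name": "a", "text": "", "children": []},
    {"name": "a", "text": "x", "children": []},
    {"name": "b", "text": "", "children": []},
]
result = choose_uniform_layouts(example_elements, example_layouts)
```

In this sketch one `a` element has text, so `a` cannot use the placeholder "empty" predicate and falls through to the next layout that matches both instances.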

@@ -8610,19 +8618,40 @@ return
Element and Attribute Names

The name-format option gives control over how element and attribute names are formatted. - The default is to use a lexical QName, as if the name were being serialized using the XML output method. - This option is only suitable if the use of namespace prefixes is regular and predictable, since there - is no information in the result map to enable prefixes to be associated with namespace URIs.

- -

There is also an option attribute-marker allowing attribute names to be distinguished - in the output from element names. By default, attribute names are prefixed with the character - "@".

- -

Whichever format of names is chosen, if the rules for the selected layout result in an output - map having two entries with the same key, the conflict is resolved by adding a unique suffix to - each such key. The suffix takes the form "[N]" where N - is an integer, allocated sequentially starting at 1. For example if there are two entries - with the key author, they are renamed author[1] and author[2].

+ For element names there are four options:

+ + +

The default option (which may be explicitly requested by specifying "name-format": "default") + retains the namespace URI for any element that either (a) is the top-level element of a tree being + converted, or (b) has a name that is in a different namespace from its parent element. In such cases + the format "Q{uri}local" is used. For other elements, the name is output using the + local part of the element name alone. For attributes, the form "Q{uri}local" is used + for an attribute in a namespace, and the local name alone is used for a no-namespace name. + Namespace prefixes are not retained.

+

The option eqname uses the format "Q{uri}local" for all + element and attribute names that are in a namespace, or the local name alone for all names + that are not in a namespace.

+

The option local discards all namespace information: all elements and attributes + are output using the local name alone.

+

The option lexical outputs element and attribute names in the form that + would be used if the tree were serialized using the XML output method. If the name has a prefix, + the prefix is retained in the output. However, the output contains no information that enables the + prefix to be associated with a namespace URI, so this format is suitable only when prefixes + in the input documents are used predictably.

+
+ + + +
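The four name-format options for element names can be sketched as a small function. This is an illustrative reading of the rules, not a reference implementation. The function name and parameters are hypothetical, the empty string stands for "no namespace", and `parent_uri=None` is used here to mark a top-level element. For the default option the sketch follows the function-catalog wording (a top-level element in no namespace is output as a local name alone) and assumes a top-level element is never compared with any parent it may have.

```python
def format_element_name(local, uri="", prefix="", parent_uri=None,
                        name_format="default"):
    """Format an element name under one of the four name-format options.

    uri == ""          -> the name is in no namespace (assumed encoding)
    parent_uri is None -> the element is top-level (assumed encoding)
    """
    if name_format == "lexical":
        # Prefix retained; no way to recover the namespace URI afterwards.
        return f"{prefix}:{local}" if prefix else local
    if name_format == "local":
        # All namespace information is discarded.
        return local
    if name_format == "eqname":
        return f"Q{{{uri}}}{local}" if uri else local
    if name_format == "default":
        top_level = parent_uri is None
        if (top_level and uri == "") or (not top_level and uri == parent_uri):
            return local
        # Yields "Q{uri}local", or "Q{}local" for a no-namespace name.
        return f"Q{{{uri}}}{local}"
    raise ValueError(f"unknown name-format: {name_format}")

NS = "http://example.com/ns"  # hypothetical namespace URI
top = format_element_name("doc", NS)
child_same = format_element_name("para", NS, parent_uri=NS)
child_none = format_element_name("note", "", parent_uri=NS)
```

The attribute-marker prefix is deliberately not modeled here, since the text does not say how it combines with the "Q{uri}local" form.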

Attribute names in the output are typically prefixed with the character "@". + The option attribute-marker allows this to be changed to a different + prefix or none.

+ +

Whichever format of names is chosen, if the rules for the selected layout would result in an output + map having two entries with the same key, the conflict is resolved by combining these + entries into an array. For example, if name-format is set to local + then the element ]]> becomes either + { "data": { "@val": ["3", "4"] } } or (because attribute order is unpredictable) + { "data": { "@val": ["4", "3"] } }.
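The duplicate-key rule can be illustrated with a short sketch. It assumes the entries for one output map are available as (key, value) pairs. The helper name `build_map` is hypothetical, and Python dictionaries and lists stand in for XDM maps and arrays.

```python
def build_map(entries):
    """Combine (key, value) pairs into a map, merging duplicate keys.

    Keys that occur once keep their single value; keys that occur more
    than once have their values combined into an array (a Python list).
    """
    grouped = {}
    for key, value in entries:
        grouped.setdefault(key, []).append(value)
    return {
        key: values[0] if len(values) == 1 else values
        for key, values in grouped.items()
    }

# Two attributes whose names collide once namespaces are discarded.
merged = build_map([("@val", "3"), ("@val", "4"), ("@id", "x")])
```

The order of values in the combined array follows the order in which the entries arrive, which for attributes is not predictable.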

Attributes in the xsi namespace (http://www.w3.org/2001/XMLSchema-instance) are discarded.