Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Passing key into update/remove process xslt for iso 19139 to fix issue with updating/deleting resources with same url #7431

Merged
Show file tree
Hide file tree
Changes from all commits
Commits
Show all changes
21 commits
Select commit Hold shift + click to select a range
acc3540
Passing the index into remove process xslt for iso 19139
wangf1122 Oct 12, 2023
f4a8b82
Updated resource logic to use 2 new field called idx and hash and rev…
ianwallen Oct 27, 2023
af6ad78
Add missing <xsl:copy> that was causing bugs.
ianwallen Oct 30, 2023
4879f57
Merge pull request #15 from ianwallen/main.prevent.delete.multiple.on…
wangf1122 Oct 30, 2023
158a898
Merge remote-tracking branch 'upstream/main' into main.prevent.delete…
wangf1122 Oct 30, 2023
bdc8c67
unit test fix
wangf1122 Nov 28, 2023
5322776
Merge branch 'geonetwork:main' into main.prevent.delete.multiple.onli…
wangf1122 Jan 10, 2024
fba52da
Fix issues based on code review.
ianwallen Jan 15, 2024
e5e6f92
Merge pull request #16 from ianwallen/main.prevent.delete.multiple.on…
wangf1122 Jan 15, 2024
396dcec
Merge branch 'geonetwork:main' into main.prevent.delete.multiple.onli…
wangf1122 Jan 23, 2024
e6df4e6
Update schemas/iso19139/src/main/plugin/iso19139/process/thumbnail-re…
wangf1122 Jan 23, 2024
cf723fe
Update schemas/iso19139/src/main/plugin/iso19139/process/onlinesrc-ad…
wangf1122 Jan 23, 2024
8aa2802
Update schemas/iso19139/src/main/plugin/iso19139/process/onlinesrc-re…
wangf1122 Jan 23, 2024
8eb4c1d
Update schemas/iso19139/src/main/plugin/iso19139/process/onlinesrc-re…
wangf1122 Jan 23, 2024
f1975e3
Update schemas/iso19139/src/main/plugin/iso19139/process/thumbnail-re…
wangf1122 Jan 23, 2024
2ae75c4
Merge branch 'geonetwork:main' into main.prevent.delete.multiple.onli…
wangf1122 Feb 14, 2024
892a8b4
Add hash and index to search indexing. Remove thumbnail hash and inde…
wangf1122 Feb 19, 2024
90f7b0a
Fix unit test
wangf1122 Feb 20, 2024
b9f1d5d
Remove unwanted method in XslUtils
wangf1122 Feb 20, 2024
fff34db
fix unit test
wangf1122 Feb 21, 2024
d630326
code review comment
wangf1122 Mar 5, 2024
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
108 changes: 62 additions & 46 deletions schemas/iso19139/src/main/plugin/iso19139/extract-relations.xsl
Original file line number Diff line number Diff line change
Expand Up @@ -34,6 +34,8 @@
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:util="java:org.fao.geonet.util.XslUtil"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for pointing it out. I just realized this namespace is used. I have placed it back.

xmlns:digestUtils="java:org.apache.commons.codec.digest.DigestUtils"
xmlns:exslt="http://exslt.org/common"
xmlns:gn-fn-rel="http://geonetwork-opensource.org/xsl/functions/relations"
version="2.0"
exclude-result-prefixes="#all">
Expand Down Expand Up @@ -85,7 +87,7 @@
</xsl:template>

<!-- Relation contained in the metadata record has to be returned
It could be document or thumbnails
It could be a document or thumbnails
-->
<xsl:template mode="relation"
match="metadata[gmd:MD_Metadata or *[contains(@gco:isoType, 'MD_Metadata')]]"
Expand All @@ -97,69 +99,83 @@
</xsl:call-template>
</xsl:variable>

<xsl:if test="count(*/descendant::*[name(.) = 'gmd:graphicOverview']/*) > 0">
<xsl:if test="count(*//gmd:graphicOverview) > 0">
<thumbnails>
<xsl:for-each select="*/descendant::*[name(.) = 'gmd:graphicOverview']/*">
<xsl:for-each select="*//gmd:graphicOverview">
<item>
<id>
<xsl:value-of select="gmd:fileName/gco:CharacterString"/>
<xsl:value-of select="gmd:MD_BrowseGraphic/gmd:fileName/gco:CharacterString"/>
</id>
<idx>
<xsl:value-of select="position()"/>
</idx>
<hash>
<xsl:value-of select="digestUtils:md5Hex(string(exslt:node-set(normalize-space(.))))"/>
</hash>
<url>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:fileName"/>
select="gmd:MD_BrowseGraphic/gmd:fileName"/>
</url>
<title>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:fileDescription"/>
select="gmd:MD_BrowseGraphic/gmd:fileDescription"/>
</title>
<type>thumbnail</type>
</item>
</xsl:for-each>
</thumbnails>
</xsl:if>

<xsl:if test="count(*/descendant::*[name(.) = 'gmd:onLine']/*[gmd:linkage/gmd:URL!='']) > 0">
<xsl:if test="count(*//gmd:MD_DigitalTransferOptions/gmd:onLine[gmd:CI_OnlineResource[gmd:linkage/gmd:URL!='']]) > 0">
<onlines>
<xsl:for-each select="*/descendant::*[name(.) = 'gmd:onLine']/*[gmd:linkage/gmd:URL!='']">
<item>
<xsl:variable name="langCode">
<xsl:value-of select="concat('#', upper-case(util:twoCharLangCode($lang, 'EN')))"/>
</xsl:variable>
<xsl:variable name="url" select="gmd:linkage/gmd:URL"/>
<id>
<xsl:value-of select="$url"/>
</id>
<title>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:name"/>
</title>
<url>
<value lang="{$mainLanguage}">
<xsl:for-each select="*//gmd:MD_DigitalTransferOptions/gmd:onLine">
<xsl:if test="gmd:CI_OnlineResource[gmd:linkage/gmd:URL!='']">
<item>
<xsl:variable name="langCode">
<xsl:value-of select="concat('#', upper-case(util:twoCharLangCode($lang, 'EN')))"/>
</xsl:variable>
<xsl:variable name="url" select="gmd:CI_OnlineResource/gmd:linkage/gmd:URL"/>
<id>
<xsl:value-of select="$url"/>
</value>
</url>
<function>
<xsl:value-of select="gmd:function/*/@codeListValue"/>
</function>
<applicationProfile>
<xsl:value-of select="gmd:applicationProfile/*/text()"/>
</applicationProfile>
<description>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:description"/>
</description>
<protocol>
<xsl:value-of select="gn-fn-rel:translate(gmd:protocol, $langCode)"/>
</protocol>
<mimeType>
<xsl:value-of select="if (*/gmx:MimeFileType)
then */gmx:MimeFileType/@type
else if (starts-with(gmd:protocol/gco:CharacterString, 'WWW:DOWNLOAD:'))
then replace(gmd:protocol/gco:CharacterString, 'WWW:DOWNLOAD:', '')
else ''"/>
</mimeType>
<type>onlinesrc</type>
</item>
</id>
<idx>
<xsl:value-of select="position()"/>
</idx>
<hash>
<xsl:value-of select="digestUtils:md5Hex(string(exslt:node-set(normalize-space(.))))"/>
</hash>
<title>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:CI_OnlineResource/gmd:name"/>
</title>
<url>
<value lang="{$mainLanguage}">
<xsl:value-of select="$url"/>
</value>
</url>
<function>
<xsl:value-of select="gmd:CI_OnlineResource/gmd:function/*/@codeListValue"/>
</function>
<applicationProfile>
<xsl:value-of select="gmd:CI_OnlineResource/gmd:applicationProfile/*/text()"/>
</applicationProfile>
<description>
<xsl:apply-templates mode="get-iso19139-localized-string"
select="gmd:CI_OnlineResource/gmd:description"/>
</description>
<protocol>
<xsl:value-of select="gn-fn-rel:translate(gmd:CI_OnlineResource/gmd:protocol, $langCode)"/>
</protocol>
<mimeType>
<xsl:value-of select="if (gmd:CI_OnlineResource/*/gmx:MimeFileType)
then gmd:CI_OnlineResource/*/gmx:MimeFileType/@type
else if (starts-with(gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString, 'WWW:DOWNLOAD:'))
then replace(gmd:CI_OnlineResource/gmd:protocol/gco:CharacterString, 'WWW:DOWNLOAD:', '')
else ''"/>
</mimeType>
<type>onlinesrc</type>
</item>
</xsl:if>
</xsl:for-each>
</onlines>
</xsl:if>
Expand Down
Original file line number Diff line number Diff line change
Expand Up @@ -33,6 +33,8 @@
xmlns:xlink="http://www.w3.org/1999/xlink"
xmlns:gn-fn-index="http://geonetwork-opensource.org/xsl/functions/index"
xmlns:index="java:org.fao.geonet.kernel.search.EsSearchManager"
xmlns:digestUtils="java:org.apache.commons.codec.digest.DigestUtils"
xmlns:exslt="http://exslt.org/common"
xmlns:util="java:org.fao.geonet.util.XslUtil"
xmlns:date-util="java:org.fao.geonet.utils.DateUtil"
xmlns:daobs="http://daobs.org"
Expand Down Expand Up @@ -1105,6 +1107,8 @@
<atomfeed><xsl:value-of select="gmd:linkage/gmd:URL"/></atomfeed>
</xsl:if>
<link type="object">{
"hash": "<xsl:value-of select="digestUtils:md5Hex(string(exslt:node-set(normalize-space(.))))"/>",
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Working on a new indexing module, I think we could simplify this to digestUtils:md5Hex(normalize-space(.)) which should produce the same hash or is there a particular case where we need exslt and the string function?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fxprunayre you are correct that this seems cleaner.

"idx": <xsl:value-of select="position()"/>,
"protocol":"<xsl:value-of select="util:escapeForJson((gmd:protocol/*/text())[1])"/>",
"mimeType":"<xsl:value-of select="if (*/gmx:MimeFileType)
then util:escapeForJson(*/gmx:MimeFileType/@type)
Expand Down
133 changes: 93 additions & 40 deletions schemas/iso19139/src/main/plugin/iso19139/process/onlinesrc-add.xsl
Original file line number Diff line number Diff line change
Expand Up @@ -25,11 +25,20 @@
<!--
Processing to insert or update an online resource element.
Insert is made in first transferOptions found.

Note: It assumes that it will be adding new items in
the first /gmd:distributionInfo
and first /gmd:MD_Distribution
and first /gmd:transferOptions
-->
<xsl:stylesheet xmlns:gmd="http://www.isotc211.org/2005/gmd"
xmlns:gco="http://www.isotc211.org/2005/gco"
xmlns:gmx="http://www.isotc211.org/2005/gmx"
xmlns:xsl="http://www.w3.org/1999/XSL/Transform"
xmlns:xs="http://www.w3.org/2001/XMLSchema"
xmlns:digestUtils="java:org.apache.commons.codec.digest.DigestUtils"
xmlns:exslt="http://exslt.org/common"
wangf1122 marked this conversation as resolved.
Show resolved Hide resolved
exclude-result-prefixes="#all"
version="2.0">

<!-- Main properties for the link.
Expand All @@ -52,17 +61,29 @@ Insert is made in first transferOptions found.
in this one. -->
<xsl:param name="extra_metadata_uuid"/>

<!-- Target element to update. The key is based on the concatenation
of URL+Protocol+Name -->
<xsl:param name="updateKey"/>
<!-- Target element to update.
updateKey is used to identify the resource name to be updated - it is for backwards compatibility. Will not be used if resourceHash is set.
The key is based on the concatenation of URL+Protocol+Name
resourceHash is hash value of the object to be removed which will ensure the correct value is removed. It will override the usage of updateKey
resourceIdx is the index location of the object to be removed - can be used when duplicate entries exists to ensure the correct one is removed.
-->

<xsl:param name="updateKey" select="''"/>
<xsl:param name="resourceHash" select="''"/>
<xsl:param name="resourceIdx" select="''"/>

<xsl:variable name="update_flag">
<xsl:value-of select="boolean($updateKey != '' or $resourceHash != '' or $resourceIdx != '')"/>
</xsl:variable>

<xsl:variable name="mainLang">
<xsl:value-of
select="(gmd:MD_Metadata|*[@gco:isoType='gmd:MD_Metadata'])/gmd:language/gmd:LanguageCode/@codeListValue"/>
</xsl:variable>

<xsl:template match="gmd:MD_Metadata|*[@gco:isoType='gmd:MD_Metadata']">
<!-- Add new gmd:onLine and consider cases where parent elements don't exist -->
<!-- <gmd:distributionInfo> does not exist-->
<xsl:template match="gmd:MD_Metadata[not(gmd:distributionInfo) and $update_flag = false()]|*[@gco:isoType='gmd:MD_Metadata' and not(gmd:distributionInfo) and $update_flag = false()]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates
Expand All @@ -86,41 +107,15 @@ Insert is made in first transferOptions found.

<gmd:distributionInfo>
<gmd:MD_Distribution>
<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/gmd:distributionFormat"/>
<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/gmd:distributor"/>
<gmd:transferOptions>
<gmd:MD_DigitalTransferOptions>
<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/
gmd:transferOptions[1]/gmd:MD_DigitalTransferOptions/gmd:unitsOfDistribution"/>
<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/
gmd:transferOptions[1]/gmd:MD_DigitalTransferOptions/gmd:transferSize"/>
<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/
gmd:transferOptions[1]/gmd:MD_DigitalTransferOptions/gmd:onLine"/>


<xsl:if test="$updateKey = ''">
<xsl:call-template name="createOnlineSrc"/>
</xsl:if>

<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/
gmd:transferOptions[1]/gmd:MD_DigitalTransferOptions/gmd:offLine"/>
<xsl:call-template name="createOnlineSrc"/>
</gmd:MD_DigitalTransferOptions>
</gmd:transferOptions>


<xsl:apply-templates
select="gmd:distributionInfo/gmd:MD_Distribution/
gmd:transferOptions[position() > 1]"/>

</gmd:MD_Distribution>
</gmd:distributionInfo>


<xsl:apply-templates
select="gmd:dataQualityInfo|
gmd:portrayalCatalogueInfo|
Expand All @@ -135,14 +130,72 @@ Insert is made in first transferOptions found.
</xsl:copy>
</xsl:template>

<!-- <gmd:MD_Distribution> does not exist-->
<xsl:template match="*/gmd:distributionInfo[not(gmd:MD_Distribution) and $update_flag = false() and position() = 1]">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
<gmd:MD_Distribution>
<gmd:transferOptions>
<gmd:MD_DigitalTransferOptions>
<xsl:call-template name="createOnlineSrc"/>
</gmd:MD_DigitalTransferOptions>
</gmd:transferOptions>
</gmd:MD_Distribution>
</xsl:copy>
</xsl:template>

<!-- <gmd:transferOptions> does not exist-->
<xsl:template match="*/gmd:distributionInfo[1]/gmd:MD_Distribution[not(gmd:transferOptions) and $update_flag = false() and position() = 1]">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
<gmd:transferOptions>
<gmd:MD_DigitalTransferOptions>
<xsl:call-template name="createOnlineSrc"/>
</gmd:MD_DigitalTransferOptions>
</gmd:transferOptions>
</xsl:copy>
</xsl:template>

<!-- <gmd:MD_DigitalTransferOptions> does not exist-->
<xsl:template match="*/gmd:distributionInfo[1]/gmd:MD_Distribution[1]/gmd:transferOptions[not(gmd:MD_DigitalTransferOptions) and $update_flag = false() and position() = 1]">
<xsl:copy>
<xsl:apply-templates select="node()|@*"/>
<gmd:MD_DigitalTransferOptions>
<xsl:call-template name="createOnlineSrc"/>
</gmd:MD_DigitalTransferOptions>
</xsl:copy>
</xsl:template>

<!-- Add new gmd:gmd:onLine-->
<xsl:template match="*/gmd:distributionInfo[1]/gmd:MD_Distribution[1]/gmd:transferOptions[1]/gmd:MD_DigitalTransferOptions[$update_flag = false() and position() = 1]">
<xsl:copy>
<xsl:apply-templates select="@*"/>
<xsl:apply-templates
select="gmd:unitsOfDistribution|
gmd:transferSize|
gmd:onLine"/>

<xsl:call-template name="createOnlineSrc"/>

<xsl:apply-templates select="gmd:offLine"/>
</xsl:copy>
</xsl:template>

<!-- Updating the link matching the update key. -->
<xsl:template match="gmd:onLine[$updateKey != '' and
normalize-space($updateKey) = concat(
gmd:CI_OnlineResource/gmd:linkage/gmd:URL,
gmd:CI_OnlineResource/gmd:protocol/*,
gmd:CI_OnlineResource/gmd:name/gco:CharacterString)
]">
<!-- End of inserting gmd:onLine -->


<!-- Updating the gmd:onLine based on update parameters -->
<!-- Note: first part of the match needs to match the xsl:for-each select from extract-relations.xsl in order to get the position() to match -->
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would write a comment explaining how it works, something like this:

If provided resourceHash and resourceIdx matches the online resource to update using that fields. When not provided (backwards compatibility) uses the updateKey to match the online resource to update.

That would facilitate the maintenance in the long term, as the condition is not that simple.

Please if you can also update the other xslt files with similar conditions.

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have added some extra comment explain what the resourceIdx and resourceHash for both add and remove xsl

<!-- The unique identifier is marked with resourceIdx which is the position index and resourceHash which is hash code of the current node (combination of url, resource name, and description) -->
<xsl:template
match="*//gmd:MD_DigitalTransferOptions/gmd:onLine
[gmd:CI_OnlineResource[gmd:linkage/gmd:URL!=''] and ($resourceIdx = '' or position() = xs:integer($resourceIdx))]
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The position here is computed in the context of *//gmd:MD_DigitalTransferOptions/gmd:onLine and during indexing with the context of https://github.com/geonetwork/core-geonetwork/blob/main/schemas/iso19139/src/main/plugin/iso19139/index-fields/index.xsl#L1125-L1126 so the resourceIdx is not the same. Removal does not work when we have more than one transferOptions block.

Maybe we should simplify the process with the hash only support?

  <xsl:template
    match="*//gmd:MD_DigitalTransferOptions/gmd:onLine[digestUtils:md5Hex(normalize-space(.)) = $resourceHash]"
    priority="2">
  </xsl:template>

This was also not applied to 115-3 https://github.com/geonetwork/core-geonetwork/blob/main/schemas/iso19115-3.2018/src/main/plugin/iso19115-3.2018/process/onlinesrc-remove.xsl

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@fxprunayre
We have recently encountered issues where data was supplied by a third party which has all the resources in separate transferOptions block and it was noticed that it was not possible to edit (and possibly delete) data from these other blocks. So you are correct that there are issues.

Maybe we should simplify the process with the hash only support?

  • Do you mean ignore the position? The idea of also checking the position is the following

    • block1
    • block2
    • block3
    • block4 (Same as block 1 so same hash value)

    Then if someone click delete on block4 then it would delete block4 even with duplicate hash.
    If we only use the hash then it could delete block1 or block4.
    I know GN does not allow duplicate data to be entered however when getting data from third party, we don't really know what is being provided.
    So position/index was the original goal and the hash was added later as a double check to ensure it did not mess up on the position/index.

    We also have a request from users to be able to order record positions how they like. I think will be easier done if we keep position/index as we should be able to perform a swap based on position? Not really sure as we have not had time to look into this feature yet.

  • Do you mean ignore the url check? If so then unchanged metadata 101 schema will stop working if we remove the url check. This was kept for backward compatibility.

Copy link
Contributor

@ianwallen ianwallen Jul 29, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This was also not applied to 115-3

Our intent was to add it but did not get around to it. Now that there are a couple bugs need to be fix. Probably once these are fixed then we can add support to 115-3

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes indeed it depends a bit on how people organize things in ISO distribution blocks or not.

From use cases I have here, I would say hash of (URL+name+protocol) should be unique but user can always create duplicates in different block. We can keep it backward compatible like it is now but if relying on the position, we need to compute a position using the same context (currently the context in indexing is different).

Will think about it, if any better options.

[($resourceHash != '' or ($updateKey != '' and normalize-space($updateKey) = concat(
gmd:CI_OnlineResource/gmd:linkage/gmd:URL,
gmd:CI_OnlineResource/gmd:protocol/*,
gmd:CI_OnlineResource/gmd:name/gco:CharacterString)))
and ($resourceHash = '' or digestUtils:md5Hex(string(exslt:node-set(normalize-space(.)))) = $resourceHash)]"
priority="2">
<xsl:call-template name="createOnlineSrc"/>
</xsl:template>

Expand Down Expand Up @@ -170,7 +223,7 @@ Insert is made in first transferOptions found.

<xsl:if test="$url">
<!-- In case the protocol is an OGC protocol
the name parameter may contains a list of layers
the name parameter may contain a list of layers
separated by comma.
In that case on one online element is added per
layer/featureType.
Expand Down
Loading