Merge branch 'develop' into 10517-dataset-types #10517
pdurbin committed Aug 6, 2024
2 parents dd7541f + ac32b00 commit f2e1fa0
Showing 17 changed files with 261 additions and 52 deletions.
32 changes: 32 additions & 0 deletions .github/workflows/check_property_files.yml
@@ -0,0 +1,32 @@
name: "Properties Check"
on:
  pull_request:
    paths:
      - "src/**/*.properties"
      - "scripts/api/data/metadatablocks/*"
jobs:
  duplicate_keys:
    name: Duplicate Keys
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Run duplicates detection script
        shell: bash
        run: tests/check_duplicate_properties.sh

  metadata_blocks_properties:
    name: Metadata Blocks Properties
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - name: Setup GraalVM + Native Image
        uses: graalvm/setup-graalvm@v1
        with:
          github-token: ${{ secrets.GITHUB_TOKEN }}
          java-version: '21'
          distribution: 'graalvm-community'
      - name: Setup JBang
        uses: jbangdev/setup-jbang@main
      - name: Run metadata block properties verification script
        shell: bash
        run: tests/verify_mdb_properties.sh
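
To get a feel for which changed files would trigger this workflow, the two `paths:` filters can be approximated in shell. This is only a sketch: the function name `triggers_properties_check` is hypothetical, and GitHub's real filter matching has more rules than the two patterns handled here.

```shell
# Hypothetical helper: approximate the workflow's two `paths:` filters for a
# single repository-relative path. This is NOT GitHub's actual matcher.
triggers_properties_check() {
  path="$1"
  case "$path" in
    # "src/**/*.properties": any .properties file at any depth under src/
    src/*.properties)
      return 0 ;;
    # "scripts/api/data/metadatablocks/*": direct children only, so reject
    # an empty remainder or anything containing a further "/"
    scripts/api/data/metadatablocks/*)
      rest="${path#scripts/api/data/metadatablocks/}"
      case "$rest" in
        ""|*/*) return 1 ;;
        *)      return 0 ;;
      esac ;;
  esac
  return 1
}
```

For example, `triggers_properties_check src/main/java/propertyFiles/codeMeta20.properties` succeeds, while a path under `doc/` does not.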
23 changes: 3 additions & 20 deletions .github/workflows/shellspec.yml
@@ -24,28 +24,11 @@ jobs:
run: |
cd tests/shell
shellspec
shellspec-centos7:
name: "CentOS 7"
shellspec-rocky9:
name: "RockyLinux 9"
runs-on: ubuntu-latest
container:
image: centos:7
steps:
- uses: actions/checkout@v2
- name: Install shellspec
run: |
curl -fsSL https://github.com/shellspec/shellspec/releases/download/${{ env.SHELLSPEC_VERSION }}/shellspec-dist.tar.gz | tar -xz -C /usr/share
ln -s /usr/share/shellspec/shellspec /usr/bin/shellspec
- name: Install dependencies
run: yum install -y ed
- name: Run shellspec
run: |
cd tests/shell
shellspec
shellspec-rocky8:
name: "RockyLinux 8"
runs-on: ubuntu-latest
container:
image: rockylinux/rockylinux:8
image: rockylinux/rockylinux:9
steps:
- uses: actions/checkout@v2
- name: Install shellspec
3 changes: 3 additions & 0 deletions doc/release-notes/make-data-count-.md
@@ -0,0 +1,3 @@
### Counter Processor 1.05 Support

This release includes support for counter-processor-1.05 for processing Make Data Count metrics. If you are running Make Data Count support, you should reinstall/reconfigure counter-processor as described in the latest Guides. (For existing installations, note that counter-processor-1.05 requires Python 3, so you will need to follow the full counter-processor install. Also note that if you configure the new version the same way, it will reprocess the days in the current month when it is first run. This is normal and will not affect the metrics in Dataverse.)
2 changes: 1 addition & 1 deletion doc/sphinx-guides/source/_static/util/counter_daily.sh
@@ -1,6 +1,6 @@
#! /bin/bash

COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-0.1.04"
COUNTER_PROCESSOR_DIRECTORY="/usr/local/counter-processor-1.05"
MDC_LOG_DIRECTORY="/usr/local/payara6/glassfish/domains/domain1/logs/mdc"

# counter_daily.sh
8 changes: 4 additions & 4 deletions doc/sphinx-guides/source/admin/make-data-count.rst
@@ -16,7 +16,7 @@ Architecture

Dataverse installations who would like support for Make Data Count must install `Counter Processor`_, a Python project created by California Digital Library (CDL) which is part of the Make Data Count project and which runs the software in production as part of their `DASH`_ data sharing platform.

.. _Counter Processor: https://github.com/CDLUC3/counter-processor
.. _Counter Processor: https://github.com/gdcc/counter-processor
.. _DASH: https://cdluc3.github.io/dash/

The diagram below shows how Counter Processor interacts with your Dataverse installation and the DataCite hub, once configured. Dataverse installations using Handles rather than DOIs should note the limitations in the next section of this page.
@@ -84,9 +84,9 @@ Configure Counter Processor

* Change to the directory where you installed Counter Processor.

* ``cd /usr/local/counter-processor-0.1.04``
* ``cd /usr/local/counter-processor-1.05``

* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-0.1.04``.
* Download :download:`counter-processor-config.yaml <../_static/admin/counter-processor-config.yaml>` to ``/usr/local/counter-processor-1.05``.

* Edit the config file and pay particular attention to the FIXME lines.

@@ -99,7 +99,7 @@ Soon we will be setting up a cron job to run nightly but we start with a single

* Change to the directory where you installed Counter Processor.

* ``cd /usr/local/counter-processor-0.1.04``
* ``cd /usr/local/counter-processor-1.05``

* If you are running Counter Processor for the first time in the middle of a month, you will need to create blank log files for the previous days. e.g.:

5 changes: 5 additions & 0 deletions doc/sphinx-guides/source/container/dev-usage.rst
@@ -341,6 +341,11 @@ The steps below describe options to enable the later in different IDEs.

**IMPORTANT**: This tool uses a Bash shell script and is thus limited to Mac and Linux OS.

Exploring the Database
----------------------

See :ref:`db-name-creds` in the Developer Guide.

Using a Debugger
----------------

6 changes: 4 additions & 2 deletions doc/sphinx-guides/source/contributor/documentation.md
@@ -74,8 +74,10 @@ In some parts of the documentation, graphs are rendered as images using the Sphi

Building the guides requires the ``dot`` executable from GraphViz.

This requires having `GraphViz <https://graphviz.org>`_ installed and either having ``dot`` on the path or
`adding options to the make call <https://groups.google.com/forum/#!topic/sphinx-users/yXgNey_0M3I>`_.
This requires having [GraphViz](https://graphviz.org) installed and either having ``dot`` on the path or
[adding options to the `make` call](https://groups.google.com/forum/#!topic/sphinx-users/yXgNey_0M3I).

On a Mac we recommend installing GraphViz through [Homebrew](<https://brew.sh>). Once you have Homebrew installed and configured to work with your shell, you can type `brew install graphviz`.
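
Before building the guides, it can help to confirm that `dot` is actually on the path. A minimal sketch in shell follows; the function name `require_cmd` is made up for illustration and is not part of the Dataverse build.

```shell
# Hypothetical helper: succeed only if the named executable is on PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "Missing required command: $1" >&2
  return 1
}

# Check for GraphViz's dot without aborting the shell if it is absent.
require_cmd dot || echo "Install GraphViz first (e.g. 'brew install graphviz' on a Mac)."
```

The same helper could be reused for other tools the build depends on, such as `make`.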

### Editing and Building the Guides

10 changes: 8 additions & 2 deletions doc/sphinx-guides/source/developers/dev-environment.rst
@@ -32,14 +32,20 @@ Install Java

The Dataverse Software requires Java 17.

On Mac and Windows, we suggest downloading OpenJDK from https://adoptium.net (formerly `AdoptOpenJDK <https://adoptopenjdk.net>`_) or `SDKMAN <https://sdkman.io>`_.
On Mac and Windows, we suggest using `SDKMAN <https://sdkman.io>`_ to install Temurin (Eclipse's name for its OpenJDK distribution). Type ``sdk install java 17`` and then hit the "tab" key until you get to a version that ends with ``-tem`` and then hit enter.

Alternatively you can download Temurin from https://adoptium.net (formerly `AdoptOpenJDK <https://adoptopenjdk.net>`_).

On Linux, you are welcome to use the OpenJDK available from package managers.

Install Maven
~~~~~~~~~~~~~

Follow instructions at https://maven.apache.org
If you are using SDKMAN, run this command:

``sdk install maven``

Otherwise, follow instructions at https://maven.apache.org.

Install and Start Docker
~~~~~~~~~~~~~~~~~~~~~~~~
6 changes: 3 additions & 3 deletions doc/sphinx-guides/source/developers/make-data-count.rst
@@ -1,7 +1,7 @@
Make Data Count
===============

Support for Make Data Count is a feature of the Dataverse Software that is described in the :doc:`/admin/make-data-count` section of the Admin Guide. In order for developers to work on the feature, they must install Counter Processor, a Python 3 application, as described below. Counter Processor can be found at https://github.com/CDLUC3/counter-processor
Support for Make Data Count is a feature of the Dataverse Software that is described in the :doc:`/admin/make-data-count` section of the Admin Guide. In order for developers to work on the feature, they must install Counter Processor, a Python 3 application, as described below. Counter Processor can be found at https://github.com/gdcc/counter-processor

.. contents:: |toctitle|
:local:
@@ -49,7 +49,7 @@ Once you are done with your configuration, you can run Counter Processor like th

``su - counter``

``cd /usr/local/counter-processor-0.1.04``
``cd /usr/local/counter-processor-1.05``

``CONFIG_FILE=counter-processor-config.yaml python39 main.py``

@@ -82,7 +82,7 @@ Second, if you are also sending your SUSHI report to Make Data Count, you will n

``curl -H "Authorization: Bearer $JSON_WEB_TOKEN" -X DELETE https://$MDC_SERVER/reports/$REPORT_ID``

To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-0.1.04/tmp/datacite_response_body.txt``
To get the ``REPORT_ID``, look at the logs generated in ``/usr/local/counter-processor-1.05/tmp/datacite_response_body.txt``

To read more about the Make Data Count api, see https://github.com/datacite/sashimi

54 changes: 47 additions & 7 deletions doc/sphinx-guides/source/developers/tips.rst
@@ -94,23 +94,63 @@ Then configure the JVM option mentioned in :ref:`install-imagemagick` to the pat
Database Schema Exploration
---------------------------

With over 100 tables, the Dataverse Software PostgreSQL database ("dvndb") can be somewhat daunting for newcomers. Here are some tips for coming up to speed. (See also the :doc:`sql-upgrade-scripts` section.)
With over 100 tables, the Dataverse PostgreSQL database can be somewhat daunting for newcomers. Here are some tips for coming up to speed. (See also the :doc:`sql-upgrade-scripts` section.)

.. _db-name-creds:

Database Name and Credentials
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

The default database name and credentials depend on how you set up your dev environment.

.. list-table::
   :header-rows: 1
   :align: left

   * - MPCONFIG Key
     - Docker
     - Classic
   * - dataverse.db.name
     - ``dataverse``
     - ``dvndb``
   * - dataverse.db.user
     - ``dataverse``
     - ``dvnapp``
   * - dataverse.db.password
     - ``secret``
     - ``secret``

Here's an example of using these credentials from within the PostgreSQL container (see :doc:`/container/index`):

.. code-block:: bash

    pdurbin@beamish dataverse % docker exec -it postgres-1 bash
    root@postgres:/# export PGPASSWORD=secret
    root@postgres:/# psql -h localhost -U dataverse dataverse
    psql (16.3 (Debian 16.3-1.pgdg120+1))
    Type "help" for help.
    dataverse=# select id,alias from dataverse limit 1;
     id | alias
    ----+-------
      1 | root
    (1 row)

See also :ref:`database-persistence` in the Installation Guide.
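
The defaults in the table above can also be captured in a small helper, which is handy when switching between the Docker and classic setups. This is a sketch only: the function name `psql_command` is hypothetical, and it assumes PostgreSQL is reachable on ``localhost`` as in the example above.

```shell
# Hypothetical helper: print (but do not run) a psql command line using the
# default database name and user for the given dev setup.
psql_command() {
  case "$1" in
    docker)  name=dataverse; user=dataverse ;;
    classic) name=dvndb;     user=dvnapp ;;
    *)       echo "unknown setup: $1" >&2; return 1 ;;
  esac
  # The default password is "secret" for both setups, per the table above.
  echo "PGPASSWORD=secret psql -h localhost -U $user $name"
}

psql_command docker
# prints: PGPASSWORD=secret psql -h localhost -U dataverse dataverse
```

Printing the command rather than running it keeps the sketch safe to try on a machine where no database is running.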

pgAdmin
~~~~~~~~
~~~~~~~

Back in the :doc:`classic-dev-env` section, we had you install pgAdmin, which can help you explore the tables and execute SQL commands. It's also listed in the :doc:`tools` section.
If you followed the :doc:`classic-dev-env` section, we had you install pgAdmin, which can help you explore the tables and execute SQL commands. It's also listed in the :doc:`tools` section.

SchemaSpy
~~~~~~~~~

SchemaSpy is a tool that creates a website of entity-relationship diagrams based on your database.

As part of our build process for running integration tests against the latest code in the "develop" branch, we drop the database on the "phoenix" server, recreate the database by deploying the latest war file, and run SchemaSpy to create the following site: http://phoenix.dataverse.org/schemaspy/latest/relationships.html
We periodically run SchemaSpy and publish the output: https://guides.dataverse.org/en/6.2/schemaspy/index.html

To run this command on your laptop, download SchemaSpy and take a look at the syntax in ``scripts/deploy/phoenix.dataverse.org/post``

To read more about the phoenix server, see the :doc:`testing` section.
To run SchemaSpy locally, take a look at the syntax in ``scripts/deploy/phoenix.dataverse.org/post``.

Deploying With ``asadmin``
--------------------------
16 changes: 8 additions & 8 deletions doc/sphinx-guides/source/installation/prerequisites.rst
@@ -428,7 +428,7 @@ firewalled from your Dataverse installation host).
Counter Processor
-----------------

Counter Processor is required to enable Make Data Count metrics in a Dataverse installation. See the :doc:`/admin/make-data-count` section of the Admin Guide for a description of this feature. Counter Processor is open source and we will be downloading it from https://github.com/CDLUC3/counter-processor
Counter Processor is required to enable Make Data Count metrics in a Dataverse installation. See the :doc:`/admin/make-data-count` section of the Admin Guide for a description of this feature. Counter Processor is open source and we will be downloading it from https://github.com/gdcc/counter-processor

Installing Counter Processor
============================
@@ -438,9 +438,9 @@ A scripted installation using Ansible is mentioned in the :doc:`/developers/make
As root, download and install Counter Processor::

    cd /usr/local
    wget https://github.com/CDLUC3/counter-processor/archive/v0.1.04.tar.gz
    tar xvfz v0.1.04.tar.gz
    cd /usr/local/counter-processor-0.1.04
    wget https://github.com/gdcc/counter-processor/archive/refs/tags/v1.05.tar.gz
    tar xvfz v1.05.tar.gz
    cd /usr/local/counter-processor-1.05
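
Because the version number now appears in the download URL, the tarball name, and the install directory, it may be convenient to derive all three from one value. The sketch below assumes the pattern the commands above follow (the archive for tag ``vX`` unpacks into ``counter-processor-X``); the function names are hypothetical.

```shell
# Hypothetical helpers: derive the download URL and install directory from a
# Counter Processor version such as "1.05" (a leading "v" is tolerated).
counter_processor_url() {
  echo "https://github.com/gdcc/counter-processor/archive/refs/tags/v${1#v}.tar.gz"
}

counter_processor_dir() {
  echo "/usr/local/counter-processor-${1#v}"
}

# Example (as root), mirroring the commands above:
#   cd /usr/local
#   wget "$(counter_processor_url 1.05)"
#   tar xvfz v1.05.tar.gz
#   cd "$(counter_processor_dir 1.05)"
```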

Installing GeoLite Country Database
===================================
@@ -451,7 +451,7 @@ The process required to sign up, download the database, and to configure automat

As root, change to the Counter Processor directory you just created, download the GeoLite2-Country tarball from MaxMind, untar it, and copy the geoip database into place::

    <download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-0.1.04 directory>
    <download or move the GeoLite2-Country.tar.gz to the /usr/local/counter-processor-1.05 directory>
    tar xvfz GeoLite2-Country.tar.gz
    cp GeoLite2-Country_*/GeoLite2-Country.mmdb maxmind_geoip

@@ -461,12 +461,12 @@ Creating a counter User
As root, create a "counter" user and change ownership of Counter Processor directory to this new user::

    useradd counter
    chown -R counter:counter /usr/local/counter-processor-0.1.04
    chown -R counter:counter /usr/local/counter-processor-1.05

Installing Counter Processor Python Requirements
================================================

Counter Processor version 0.1.04 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.
Counter Processor version 1.05 requires Python 3.7 or higher. This version of Python is available in many operating systems, and is purportedly available for RHEL7 or CentOS 7 via Red Hat Software Collections. Alternately, one may compile it from source.

The following commands are intended to be run as root but we are aware that Pythonistas might prefer fancy virtualenv or similar setups. Pull requests are welcome to improve these steps!

@@ -477,7 +477,7 @@ Install Python 3.9::
Install Counter Processor Python requirements::

    python3.9 -m ensurepip
    cd /usr/local/counter-processor-0.1.04
    cd /usr/local/counter-processor-1.05
    pip3 install -r requirements.txt

See the :doc:`/admin/make-data-count` section of the Admin Guide for how to configure and run Counter Processor.
@@ -117,10 +117,10 @@ public class DatasetMetrics implements Serializable {
* For an example of sending various metric types (total-dataset-requests,
* unique-dataset-investigations, etc) for a given month (2018-04) per
* country (DK, US, etc.) see
* https://github.com/CDLUC3/counter-processor/blob/5ce045a09931fb680a32edcc561f88a407cccc8d/good_test.json#L893
* https://github.com/gdcc/counter-processor/blob/5ce045a09931fb680a32edcc561f88a407cccc8d/good_test.json#L893
*
* counter-processor uses GeoLite2 for IP lookups according to their
* https://github.com/CDLUC3/counter-processor#download-the-free-ip-to-geolocation-database
* https://github.com/gdcc/counter-processor#download-the-free-ip-to-geolocation-database
*/
@Column(nullable = true)
private String countryCode;
@@ -27,15 +27,15 @@
* How to Make Your Data Count July 10th, 2018).
*
* The recommended starting point to implement Make Data Count is
* https://github.com/CDLUC3/Make-Data-Count/blob/master/getting-started.md
* https://github.com/gdcc/Make-Data-Count/blob/master/getting-started.md
* which specifically recommends reading the "COUNTER Code of Practice for
* Research Data" mentioned in the user facing docs.
*
* Make Data Count was first implemented in DASH. Here's an example dataset:
* https://dash.ucmerced.edu/stash/dataset/doi:10.6071/M3RP49
*
* For processing logs we could try DASH's
* https://github.com/CDLUC3/counter-processor
* https://github.com/gdcc/counter-processor
*
* Next, DataOne implemented it, and you can see an example dataset here:
* https://search.dataone.org/view/doi:10.5063/F1Z899CZ
3 changes: 2 additions & 1 deletion src/main/java/propertyFiles/codeMeta20.properties
@@ -1,5 +1,6 @@
metadatablock.name=codeMeta20
metadatablock.displayName=Software Metadata (CodeMeta 2.0)
metadatablock.displayName=Software Metadata (CodeMeta v2.0)
metadatablock.displayFacet=Software
datasetfieldtype.codeVersion.title=Software Version
datasetfieldtype.codeVersion.description=Version of the software instance, usually following some convention like SemVer etc.
datasetfieldtype.codeVersion.watermark=e.g. 0.2.1 or 1.3 or 2021.1 etc
37 changes: 37 additions & 0 deletions tests/check_duplicate_properties.sh
@@ -0,0 +1,37 @@
#!/bin/bash

# This script will check Java *.properties files within the src dir for duplicates
# and print logs with file annotations about it.

set -euo pipefail

FAIL=0

while IFS= read -r -d '' FILE; do

    # Scan the whole file for duplicates
    FILTER=$(grep -a -v -E "^(#.*|\s*$)" "$FILE" | cut -d"=" -f1 | sort | uniq -c | tr -s " " | { grep -vs "^ 1 " || true; })

    # If there are any duplicates present, analyse further to point people to the source
    if [ -n "$FILTER" ]; then
        FAIL=1

        echo "::group::$FILE"
        for KEY in $(echo "$FILTER" | cut -d" " -f3); do
            # Find duplicate lines' numbers by grepping for the KEY and cutting the number from the output
            DUPLICATE_LINES=$(grep -n -E -e "^$KEY=" "$FILE" | cut -d":" -f1)
            # Join the found line numbers for better error log
            DUPLICATE_NUMBERS=$(echo "$DUPLICATE_LINES" | paste -sd ',')

            # This form will make Github annotate the lines in the PR that changes the properties file
            for LINE_NUMBER in $DUPLICATE_LINES; do
                echo "::error file=$FILE,line=$LINE_NUMBER::Found duplicate for key '$KEY' in lines $DUPLICATE_NUMBERS"
            done
        done
        echo "::endgroup::"
    fi
done < <( find "$(git rev-parse --show-toplevel)" -wholename "*/src/*.properties" -print0 )

if [ "$FAIL" -eq 1 ]; then
    exit 1
fi
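
To see what the duplicate detection reports, the core pipeline can be tried against a small throwaway file. The sketch below is a simplification, not the script itself: it swaps the `uniq -c` plus filter step for `uniq -d`, wraps the pipeline in a hypothetical function `duplicate_keys`, and uses made-up sample keys.

```shell
# Hypothetical helper: print each key that occurs more than once in a
# .properties file, one per line. Comment lines and blank lines are ignored.
duplicate_keys() {
  grep -a -v -E '^(#.*|[[:space:]]*)$' "$1" | cut -d"=" -f1 | sort | uniq -d
}

# Build a sample file in which app.title is defined twice.
sample=$(mktemp)
printf '%s\n' '# a comment' 'app.title=One' 'app.name=Two' '' 'app.title=Three' > "$sample"

duplicate_keys "$sample"
# prints: app.title

rm -f "$sample"
```

Unlike the script above, this sketch does not report line numbers or emit GitHub annotations; it only lists the offending keys.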
1 change: 1 addition & 0 deletions tests/shell/spec/spec_helper.sh
@@ -22,3 +22,4 @@ spec_helper_configure() {
# Available functions: import, before_each, after_each, before_all, after_all
: import 'support/custom_matcher'
}
