Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Changing things via the server/region/zone settings endpoint might break the format of the settings #979

Open
skateman opened this issue Dec 11, 2020 · 7 comments

Comments

@skateman
Copy link
Member

skateman commented Dec 11, 2020

Basically a format conversion problem, the settings are coming from YAML but presented as JSON are bringing in some conversion artifacts that are described here.

@miq-bot assign @gtanzillo

We should probably discuss this on our next UI/API meeting...

See also ManageIQ/manageiq#17201

@gtanzillo
Copy link
Member

gtanzillo commented Dec 16, 2020

@kbrock and I looked at config/settings.yml. As an experiment, we did the following:

orig=YAML.load_file("./config/settings.yml")
File.write("./config/settings2.yml", YAML.dump(orig))

s2=YAML.load_file("./config/settings2.yml")
s3=JSON.parse(s2.to_json, :symbolize_names => true)
File.write("./config/settings3.yml", YAML.dump(s3))
(git:kasparov) ~/work/ibm/manageiq$ diff -wb config/settings{,2}.yml
55c55
<     :reindex_schedule: "1 * * * *"
---
>     :reindex_schedule: 1 * * * *
59c59
<     :vacuum_schedule: "0 2 * * 6"
---
>     :vacuum_schedule: 0 2 * * 6
98d97
<
104,107d102
< # provider specific settings are nested here, but they are in the provider repos
< # e.g.:
< #  :ems_<provider_name>:
< #    :use_feature: false
1052c1047
<     :retry_interval: 15 # in seconds
---
>     :retry_interval: 15
1173c1168
<       :chargeback_generation_time_utc: 01:00:00
---
>       :chargeback_generation_time_utc: 3600

Without ignoring whitespace, there were about 30 other differences.

diff config/settings{2,3}.yml
985c985
<       :name: :used_swap_percent_gt_value
---
>       :name: used_swap_percent_gt_value
991c991
<       :name: :used_swap_percent_lt_value
---
>       :name: used_swap_percent_lt_value
1086c1086
<       :poll_method: :normal
---
>       :poll_method: normal
1116,1117c1116,1117
<         :dequeue_method: :drb
<         :poll_method: :normal
---
>         :dequeue_method: drb
>         :poll_method: normal
1123c1123
<           :poll_method: :escalate
---
>           :poll_method: escalate
1129c1129
<         :poll_method: :escalate
---
>         :poll_method: escalate
1136c1136
<           :poll_method: :normal
---
>           :poll_method: normal
1138c1138
<           :dequeue_method: :sql
---
>           :dequeue_method: SQL

Note: Here the main differences are due to values that were originally symbols are now strings and whitespaces.

@kbrock
Copy link
Member

kbrock commented Dec 16, 2020

Process

Our settings is stored in yaml and edited in json:

where data format note
fs or db yaml
ruby ruby objects converted from yaml
ruby json converted to json, prepping for send
browser json user edits
ruby json sent back up
ruby ruby objects converted from json
ruby yaml converted from ruby objects
db yaml deltas stored

yaml and yaml2 are not the same, even when no edits are made.
This is why we are having this discussion.

Where changes occurred

We looked at the 2 serialization processes:

  1. yaml => ruby => yaml
  2. ruby => json => ruby

I ran GT's script on settings.yml for all the repos. I am unsure whether there are other files that we use for populating settings, but this looked pretty complete to me. We learned that both of the serialization processes changed the values.

I feel the original note conflated the two issues.
If yaml can't do a round trip, it is not fair to blame this on json, especially when the original yaml file was not properly formatted.
Json does introduce issues, but the number of cases is minimal.

from ruby to json back to ruby objects

Changes introduced:

  • regular expressions got screwed up
    • nuage (4), azure (9), redfish (1)
  • symbols values are converted to strings
    • manageiq (9)

These are both native ruby types, so it makes sense that javascript centric json would not know how to handle them.

from yaml to ruby back to yaml

Changes introduced:

  • comments removed
    • azure, nuage, amazon, manageiq
  • indentation fixed
    • azure, openshift, api, nuage, amazon, openstack,
  • quotes changed from " to '
    • azure, openstack
  • quotes removed
    • lenovo, openstack, manageiq (cron strings), ovirt
  • comments were removed
    • azure, manageiq, (others too - sorry dropped the ball on this one)
  • _ in numerical values were removed
    • openshift, api
  • regular expression back slash fixed (e.g.: \A => \\A)
    • openstack, ovirt
  • blank lines removed
    • redfish
  • time convrted to number of seconds (01:00:00 => 3600)
    • manageiq

Of note, if I ran the serialization and deserialization process on the output file to another output file, it ended up with the same file. so at least it is deterministic. All hashes and arrays were in the same order.

These changes do not loose data but do alter it. Most of these changes are fixing errors in the original yaml files.

Proposal

We have a few ways to solve this:

  • keep everything as yaml and have the javascript handle serialization/deserialization
  • chang the settings so they are json compliant.
  • change the way we serialize the json to not be lossy with the data types we do use.

I propose the second option:

  • reformat yml files to be valid
  • use strings instead of symbols
  • use strings instead of regular expressions

verbose:

Modify all settings files so the yaml deserialization and serialization process produces the same output as the original input. This is basically done by running the yaml only portion of the script above and replacing the original scripts. I am not happy about loosing the comments but our current yaml process does loose them, and all the other changes are fixing indentation and escaping issues.

Modify manageiq in a few spots to handle string or symbolic values. It is possible that the code will work unchanged.
this will be backwards compatible, but we will change the settings files to use the string instead.

Modifying plugins to convert strings to regular expressions where they are used (i.e.: Regexp.new(value)). This is the biggest change but it is possible that all plugins already use only a single block of common manageiq core code.

We need to ensure there are no other settings files are also used.
We need to perform a potentially slow migration to convert yaml stored in the database to comply with these changes.

@gtanzillo I did not see the SQL change that you saw. Maybe my files are outdated or something strange is happengin in terminal encoding.

@gtanzillo gtanzillo added the bug label Dec 17, 2020
@kbrock
Copy link
Member

kbrock commented Dec 17, 2020

Notes from @gtanzillo and @kbrock:

  • yaml to yaml being the same is not necessary. we care about ruby objects comparison to ruby objects.
  • json is the only one we need to have serialize and deserialize produce the same value.
  • we store regexp as strings and as regular expressions (ruby objects) - it is only the places we store them as regular expressions that are an issue.
  • the yaml changes to regular expressions were only an issue when the string version of regular expressions is stored.
  • keenan hypothesizes that converting the symbolized values to strings may just work
  • It looks like all the regular expression objects are used for the same purpose and are processed in core and would need one line of code to fix
  • changes to our code for both problems are backwards compatible and can be made before we actually decide to change the value stored in the database

@kbrock
Copy link
Member

kbrock commented Dec 27, 2020

update:

@Fryguy
Copy link
Member

Fryguy commented Jan 4, 2021

Moving forward, I wonder if we should change our settings.yml to settings.json... This way we can't introduce non-JSON stuff.

@skateman
Copy link
Member Author

skateman commented Feb 8, 2021

@kbrock any update on this?

@miq-bot
Copy link
Member

miq-bot commented Feb 27, 2023

This issue has been automatically marked as stale because it has not been updated for at least 3 months.

If you can still reproduce this issue on the current release or on master, please reply with all of the information you have about it in order to keep the issue open.

Thank you for all your contributions! More information about the ManageIQ triage process can be found in the triage process documentation.

@Fryguy Fryguy removed the stale label Mar 2, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

6 participants