Release first working memtester diag (#1)

Release first working memtester diag with the following features:

* Implement first parser
* Implement first fully working diag
* Add requirements.txt
* Forward args to memtester
* Add Dockerfile to ease testing of different memtester versions
* Add more memtester versions to the support list
* Remove src files from container
* Install build tools explicitly
* Add some content to README.md
* Fix workdir
* Fix parsing errors for some FAILURE messages
* Separate scripts from future tests
* Add support for logging memtester's raw output
* Improve README.md
* Add unit tests
* Move log file reporting
* Improve README.md
* Add license header
* Report status and result for the test run
* Fix copyright info according to recommendations
* Get rid of the sudo advice
* Remove stray semicolon
* Rename FLR token
* Annotate dataclasses
* Improve argparse help message
* Use proper cmd argument splitting
* Type-annotate MemtesterObserver.run
* Use Measurement instead of MeasurementSeries for bad addresses
* Fail the diag for unknown memtester versions by default
* Clarify test type in logs
* Specify Python version requirement
* Turn comments into docstrings

Signed-off-by: Goshik92 <igorsemenov@google.com>
opencomputeproject · Dec 5, 2023 · 86afdaa
1 parent ab50e45
Showing 10 changed files with 785 additions and 1 deletion.
diff --git a/Dockerfile b/Dockerfile
@@ -0,0 +1,15 @@
+# syntax=docker/dockerfile:1
+FROM ubuntu:23.10
+WORKDIR /home/root
+RUN apt-get update && apt-get install -y --no-install-recommends \
+    python3 pip wget build-essential
+COPY requirements.txt ./
+RUN pip install -r requirements.txt --break-system-packages
+ARG MT_VERSION=4.6.0 # Version of memtester to build and run
+ENV MT_NAME=memtester-${MT_VERSION}
+ENV MT_ARCHIVE=${MT_NAME}.tar.gz
+RUN wget https://pyropus.ca./software/memtester/old-versions/${MT_ARCHIVE} && \
+    tar -xf ${MT_ARCHIVE} && cd ${MT_NAME} && make && mv memtester .. && \
+    cd .. && rm -r ${MT_NAME} && rm ${MT_ARCHIVE}
+COPY src/ ./
+CMD python3 main.py --mt_args="100m 3" --mt_path=./memtester
diff --git a/README.md b/README.md
@@ -1 +1,79 @@
-# ocp-diag-memtester
+# Memtester diagnostic
+
+[Memtester](https://linux.die.net/man/8/memtester) is a Linux utility that tests RAM.
+This repo contains Python scripts that turn `memtester` into an OCP-compliant diagnostic by parsing its output using [SLY](https://sly.readthedocs.io/en/latest/), a Python implementation of [Lex](https://en.wikipedia.org/wiki/Lex_(software)) and [Yacc](https://en.wikipedia.org/wiki/Yacc).
+Parsing happens at runtime, so the diag can report partial results while the memory test is still running.
+
+Currently, Python >=3.11 is required to run this diag.
+
+## Basic usage
+To run the diag, do the following:
+1. Install `memtester` on your DUT. Make sure the version you have is [supported](#supported-versions).
+2. Install all the dependencies specified in `requirements.txt`.
+3. Run the diag with parameters required for your use case.
+
+On Debian-based operating systems, the procedure above may look as follows:
+```
+apt install memtester
+git clone https://github.com/opencomputeproject/ocp-diag-memtester.git
+cd ocp-diag-memtester
+python3 -m venv .
+source bin/activate
+pip install -r requirements.txt
+python3 src/main.py --mt_args="100M 3"
+```
+
+In the last command, `mt_args` specifies arguments that the diag will pass to `memtester`.
+The value above will make `memtester` reserve 100 megabytes of RAM and test it three times.
+For more info on the parameters you can pass to `memtester`, run `man memtester`.
+
+## Supported versions
+The output `memtester` produces may vary significantly across different versions.
+This means that it is hard to write a parser that works properly with all of them.
+Thus, only a few versions are currently supported:
+
+| Version of memtester | Status |
+|-|-|
+| 4.6.0 | Full support. No known issues. |
+| 4.5.1 | Full support. No known issues. |
+| 4.5.0 | Full support. No known issues. |
+| 4.4.0 | Memtester reports memory errors for a normally working system. Error messages appear in unexpected places. No plans to support this version. |
+| 4.3.0 and earlier | Source code does not compile. No plans to support these versions. |
+
+## Running custom `memtester`
+It is possible to run this diag with a custom version of `memtester`. To do that, install your custom version in the location
+of your choice and pass this location to the diag as follows (the path must include the name of the executable):
+
+```
+python3 src/main.py --mt_args="100M 3" --mt_path="/my/favorite/location/memtester"
+```
+
+For this diag to work properly, the output format of your `memtester` must match
+that of one of the supported versions.
+
+## Running unit tests
+Execute the following command in the repo's root directory to run the diag's unit tests:
+```
+python -m unittest
+```
+
+## Testing older memtester versions using Docker
+If you need to test this diag with an older memtester version, you can use the `Dockerfile` in the root directory of this repo.
+This `Dockerfile` installs all the dependencies the diag needs. It also downloads the requested version of `memtester` from its author's website and builds it.
+You can use the following command to build and run the container:
+
+```
+docker build -t ocp_memtester --build-arg="MT_VERSION=<version>" . && docker run --rm -t ocp_memtester
+```
+
+In the command above, replace `<version>` with the version of `memtester` you want to test. For example:
+
+```
+docker build -t ocp_memtester --build-arg="MT_VERSION=4.5.1" . && docker run --rm -t ocp_memtester
+```
+
+Note: if you run into permission issues when running Docker, please refer to [this article](https://docs.docker.com/engine/install/linux-postinstall).
+
+The list of available `memtester` versions can be found on [this page](https://pyropus.ca./software/memtester/old-versions).
+
+Since there is no Python parser for OCP-compliant output yet, the container does not do automatic diag validation, so you have to manually confirm whether the version you chose works correctly.
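
Note on the README above: `--mt_args` packs memtester's arguments into a single string, and `src/main.py` in this commit splits that string with `shlex.split` before forwarding it. The block below is a minimal sketch of that behavior, together with the `--mt_permit_unknown_versions` toggle. It rebuilds only these two options, and `build_parser` and `memtester_argv` are hypothetical helper names that do not exist in the repo.

```python
import argparse
import shlex


def build_parser() -> argparse.ArgumentParser:
    """Rebuild only the two options discussed here (hypothetical helper)."""
    parser = argparse.ArgumentParser(description="OCP-compliant memtester wrapper.")
    parser.add_argument("--mt_args", default="")
    # BooleanOptionalAction also generates a --no-mt_permit_unknown_versions flag.
    parser.add_argument("--mt_permit_unknown_versions", default=False,
                        action=argparse.BooleanOptionalAction)
    return parser


def memtester_argv(mt_args: str) -> list[str]:
    """Split the packed argument string using POSIX shell quoting rules."""
    return shlex.split(mt_args)


args = build_parser().parse_args(["--mt_args=100M 3"])
print(memtester_argv(args.mt_args))     # ['100M', '3']
print(args.mt_permit_unknown_versions)  # False
```

Because the flag defaults to `False`, a run against an untested memtester version ends with the ERROR status unless `--mt_permit_unknown_versions` is passed explicitly.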
diff --git a/requirements.txt b/requirements.txt
@@ -0,0 +1,3 @@
+ocptv>=0.1.6
+sh>=2.0.6
+sly>=0.5
diff --git a/src/__init__.py b/src/__init__.py
diff --git a/src/main.py b/src/main.py
@@ -0,0 +1,125 @@
+# (c) OCP Test & Validation
+# (c) Google LLC
+# 
+# Use of this source code is governed by an MIT-style
+# license that can be found in the LICENSE file or at
+# https://opensource.org/licenses/MIT.
+
+import ocptv.output as tv
+import argparse
+import sh
+import re
+import os
+import shlex
+import contextlib
+from memtester_parsing import MemtesterObserver
+
+def main():
+    parser = argparse.ArgumentParser(description="OCP-compliant memtester wrapper.")
+    parser.add_argument("--mt_path", default="memtester",
+        help="Path to memtester executable")
+    parser.add_argument("--mt_args", default="",
+        help="Memtester arguments packed into a string. See `man memtester` for more detail")
+    parser.add_argument("--mt_log_filename", default=None,
+        help="Path to the log file for dumping raw memtester output (both STDOUT and STDERR)")
+    parser.add_argument("--mt_permit_unknown_versions", default=False,
+        help="If specified, unsupported memtester versions will not trigger the ERROR status",
+        action=argparse.BooleanOptionalAction)
+    args = parser.parse_args()
+
+    run = tv.TestRun(name="memtester", version="1.0")
+    dut = tv.Dut(id=sh.hostid().strip(), name=sh.hostname().strip())
+    observer = MemtesterObserver()
+
+    # Use ExitStack() to avoid deep nesting
+    with contextlib.ExitStack() as estack:
+        estack.enter_context(run.scope(dut=dut))
+        estack.enter_context((step := run.add_step("run-memtester")).scope())
+
+        # Check if a supported version of memtester is installed in the system
+        def version_callback(version, is_known):
+            step.add_log(tv.LogSeverity.INFO, "Memtester v{} was found".format(version))
+            if not is_known:
+                m = "This version of memtester was not tested. Expect parsing errors"
+                step.add_log(tv.LogSeverity.WARNING, m)
+                if not args.mt_permit_unknown_versions:
+                    m = "The diag is configured to fail with unknown memtester versions"
+                    step.add_error(symptom="memtester-unknown-version", message=m)
+                    raise tv.TestRunError(
+                        status=tv.TestStatus.ERROR,
+                        result=tv.TestResult.NOT_APPLICABLE)
+        observer.callbacks.version_ready = version_callback
+
+        # Report loop results
+        def loop_callback(loop):
+            tests = loop.failed_tests()
+            if len(tests) == 0:
+                m = "Loop #{} finished with success".format(loop.index)
+                step.add_log(tv.LogSeverity.INFO, m)
+            else:
+                # Log failed tests
+                names = ", ".join([t.name for t in tests])
+                m = "Loop #{} failed {} tests: {}".format(loop.index, len(tests), names)
+                step.add_error(symptom="loop-failure", message=m)
+
+                # Record failed addresses
+                for test in tests:
+                    for r in test.result:
+                        step.add_measurement(name="bad-address", value=r.addr)
+        observer.callbacks.loop_ready = loop_callback
+
+        # Report individual tests to make long runs more responsive
+        def test_callback(test):
+            m = "Memory test '{}' {}"
+            if test.passed():
+                step.add_log(tv.LogSeverity.INFO, m.format(test.name, "passed"))
+            else:
+                step.add_log(tv.LogSeverity.ERROR, m.format(test.name, "failed"))
+        observer.callbacks.test_ready = test_callback
+
+        # Report diagnosis
+        def run_callback(failed_loop_count):
+            if failed_loop_count == 0:
+                step.add_diagnosis(tv.DiagnosisType.PASS, verdict="memtester-passed")
+            else:
+                step.add_diagnosis(tv.DiagnosisType.FAIL, verdict="memtester-failed")
+                raise tv.TestRunError(status=tv.TestStatus.COMPLETE, result=tv.TestResult.FAIL)
+        observer.callbacks.run_ready = run_callback
+
+        # Report parsing errors (supposed to only happen with unknown memtester versions)
+        def parsing_error_callback(desc):
+            step.add_error(symptom="memtester-parsing-error", message=desc)
+            raise tv.TestRunError(status=tv.TestStatus.ERROR, result=tv.TestResult.NOT_APPLICABLE)
+        observer.callbacks.parsing_error = parsing_error_callback
+
+        # Log raw memtester output if necessary
+        if args.mt_log_filename:
+            estack.enter_context(log_file := open(args.mt_log_filename, "w"))
+            def line_ready_callback(line):
+                log_file.write(line)
+            observer.callbacks.line_ready = line_ready_callback
+
+        # Run memtester (finally!)
+        try:
+            mt_cmd = sh.Command(args.mt_path)
+            mt_args = shlex.split(args.mt_args)
+            with contextlib.redirect_stderr(None): # Magic to silence sh lib
+                # For _ok_code see https://linux.die.net/man/8/memtester
+                observer.run(mt_cmd(*mt_args,
+                    _err_to_out=True, _iter=True, _ok_code=(0, 2, 4, 2 | 4)))
+        except sh.CommandNotFound as e:
+            m = "Memtester not found: {}".format(e)
+            step.add_error(symptom="memtester-not-found", message=m)
+        except sh.ErrorReturnCode as e:
+            m = "Memtester returned code {}".format(e.exit_code)
+            step.add_error(symptom="memtester-error-code", message=m)
+            step.add_log(tv.LogSeverity.ERROR, "Memtester error description: {}".format(e))
+
+        # Link the raw output file at the end, so that it is complete when linked
+        if args.mt_log_filename:
+            name = os.path.basename(args.mt_log_filename)
+            uri = "file://" + os.path.abspath(args.mt_log_filename)
+            step.add_file(name=name, uri=uri, is_snapshot=False)
+
+if __name__ == '__main__':
+    main()
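
Note on `_ok_code=(0, 2, 4, 2 | 4)` in `src/main.py`: the tuple follows the exit-code bit flags documented in the memtester man page, where 0x01 means an allocation, locking, or invocation error, 0x02 means a failure in the stuck address test, and 0x04 means a failure in one of the other tests. The block below is a minimal sketch of decoding those bits. `MemtesterExit`, `describe_exit_code`, and `is_test_outcome` are hypothetical names and are not part of this commit.

```python
from enum import IntFlag


class MemtesterExit(IntFlag):
    """Exit-code bits as listed in the memtester man page (names are hypothetical)."""
    SETUP_ERROR = 0x01    # error allocating or locking memory, or invocation error
    STUCK_ADDRESS = 0x02  # error during the stuck address test
    OTHER_TEST = 0x04     # error during one of the other tests


# Bits that describe a memory test failure rather than a failure to run at all.
TEST_FAILURE_BITS = int(MemtesterExit.STUCK_ADDRESS | MemtesterExit.OTHER_TEST)


def describe_exit_code(code: int) -> list[str]:
    """Return the names of all flags set in the exit code."""
    return [flag.name for flag in MemtesterExit if code & flag]


def is_test_outcome(code: int) -> bool:
    """True when the code contains no bits other than test-failure bits."""
    return code & ~TEST_FAILURE_BITS == 0


print(describe_exit_code(6))                          # ['STUCK_ADDRESS', 'OTHER_TEST']
print([c for c in range(8) if is_test_outcome(c)])    # [0, 2, 4, 6]
```

Under this reading, codes 0, 2, 4, and 6 mean memtester ran to completion and its output can be parsed for pass or fail results, while any code with the 0x01 bit set falls outside `_ok_code` and surfaces as `sh.ErrorReturnCode`.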