Don't do a loadgen release from dev branch, add python 3.12 and 3.13 to loadgen test, exclude power-checker from auto format (#1994)

* Update generate_final_report.py

* Fix sdxl (#1911)

* Fix typo in fid_score.py, fail_safe for SDXL short runs

* [Automated Commit] Format Codebase

* Fix typo in fid_score.py, fail_safe for SDXL short runs

* Fix dlrmv2 reference implementation | Update run_local.sh

* Fixes for filtering invalid results

* [Automated Commit] Format Codebase

* Update preprocess_submission.py

* Added an option to pass in sample_ids.txt for SDXL accuracy check

* [Automated Commit] Format Codebase

* Update accuracy_coco.py

* [Automated Commit] Format Codebase

* Fix typo

* Don't use default for sample_ids.txt

* Update requirements.txt (#1907)

Updating the pip packages

* Fix a bug in preprocess_submission

* Update submission_checker.py | Removed TEST05

* Fix to SDXL accuracy output

* Added exists checks for rmtree in preprocess_submission script

* [Automated Commit] Format Codebase

* Delete .github/workflows/format.yml

* Delete .github/scripts directory

* Update build_wheels.yml | Added src distribution

* Update VERSION.txt

* Update build_wheels.yml

* Update VERSION.txt

* Update pyproject.toml

* Increment version to 4.1.26

* Update MANIFEST.in

* Increment version to 4.1.27

* Update pyproject.toml

* Increment version to 4.1.28

* Update build_wheels.yml

* Update VERSION.txt

* Update accuracy_coco.py

* Making sdxl run thread safe

* Create format.yml | Run format on push instead of PR

* Update backend_pytorch.py | Fix lock usage

* Upgrade loadgen version to 5.0 (#1962)

* Fix loadgen build for version numbers having "0" (#1967)

* Fix loadgen build for version numbers having "0"

* Update test-resnet50.yml

* Update test-retinanet.yml

* Update test-bert.yml

* Increment version to 5.0.1

* Fix Dockerfile for 405B (#1960)

Co-authored-by: Miro <mirhodak@amd.com>

* Add llama3 metrics + remove llama3-99.9 (#1973)

* Fix submission checker for v5.0 rgat (#1974)

* Fix submission checker for v5.0 rgat

* Update submission_checker.py | Updates for v5.0

* [Automated Commit] Format Codebase

* Update submission_checker.py | Fixes latency_constraints for v5.0

* [Automated Commit] Format Codebase

---------

Co-authored-by: mlcommons-bot <mlcommons-bot@users.noreply.github.com>

* Fix test05 seeds missing error for v5.0 submission checker (#1976)

* Fix llama3-405B docker workflow and performance sample count (#1978)

* Fix llama3-405B docker workflow

* Fix the performance sample count from 8312 to 8313

* More fixes

* Increment version to 5.0.2

* Fix submission generation for v5.0 (#1981)

* Fix submission checker for v5.0 rgat

* Fix accuracy pattern for rgat, report-generator for v5.0

* More minor fixes for llama3.1-405b (#1983)

* More minor fixes

* Fix indentation for stats report

* Remove unused rgat files (#1961)

Co-authored-by: Miro <mirhodak@amd.com>

* Update docker GPU, avoid long build time (#1966)

Co-authored-by: Miro <mirhodak@amd.com>

* Require equal issue mode for R-GAT (#1968)

* Require equal issue mode for R-GAT

* Add equal issue note in readme

---------

Co-authored-by: Miro <mirhodak@amd.com>

* Increment version to 5.0.3

* Docs update for r-gat (#1969)

* Fixes #1648, restrict loadgen uncommitted error message to within the loadgen directory

* Update test-rnnt.yml (#1688)

Stopping the github action for rnnt

* Added docs init

Added github action for website publish

Update benchmark documentation

Update publish.yaml

Update publish.yaml

Update benchmark documentation

Improved the submission documentation

Fix taskname

Removed unused images

* Fix benchmark URLs

* Fix links

* Add _full variation to run commands

* Added script flow diagram

* Added docker setup command for CM, extra run options

* Added support for docker options in the docs

* Added --quiet to the CM run_cmds in docs

* Fix the test query count for cm commands

* Support ctuning-cpp implementation

* Added commands for mobilenet models

* Docs cleanup

* Docs cleanup

* Added separate files for dataset and models in the docs

* Remove redundant tab in the docs

* Fixes some WIP models in the docs

* Use the official docs page for CM installation

* Fix the deadlink in docs

* Fix indentation issue in docs

* Added dockerinfo for nvidia implementation

* Added run options for gptj

* Added execution environment tabs

* Cleanup of the docs

* Cleanup of the docs

* Reordered the sections of the docs page

* Removed an unnecessary heading in the docs

* Fixes the commands for datacenter

* Fix the build --sdist for loadgen

* Fixes #1761, llama2 and mixtral runtime error on CPU systems

* Added mixtral to the benchmark list, improved benchmark docs

* Update docs for MLPerf inference v4.1

* Update docs for MLPerf inference v4.1

* Fix typo

* Gave direct link to implementation readmes

* Added tables detailing implementations

* Update vision README.md, split the frameworks into separate rows

* Update README.md

* pointed links to specific frameworks

* pointed links to specific frameworks

* Update Submission_Guidelines.md

* Update Submission_Guidelines.md

* Update Submission_Guidelines.md

* api support llama2

* Added request module and reduced max token len

* Fix for llama2 api server

* Update SUT_API offline to work for OpenAI

* Update SUT_API.py

* Minor fixes

* Fix json import in SUT_API.py

* Fix llama2 token length

* Added model name verification with server

* clean temp files

* support num_workers in LLAMA2 SUTs

* Remove batching from Offline SUT_API.py

* Update SUT_API.py

* Minor fixes for llama2 API

* Fix for llama2 API

* removed table of contents

* enabled llama2-nvidia + vllm-NM : WIP

* enabled dlrm for intel

* lowercased implementation

* added raw data input

* corrected data download commands

* renamed filename

* changes for bert and vllm

* documentation to work on custom repo and branch

* benchmark index page update

* enabled sdxl for nvidia and intel

* updated vllm server run cmd

* benchmark page information addition

* fix indentation issue

* Added submission categories

* update submission page - generate submission with or w/o using CM for benchmarking

* Updated kits dataset documentation

* Updated model parameters

* updated information

* updated non cm based benchmark

* added info about hf password

* added links to model and access tokens

* Updated reference results structure tree

* submission docs cleanup

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* added generic stubs deepsparse

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info

* Some cleanups for benchmark info (FID and CLIP data added)

* typo fix for bert deepsparse framework

* added min system requirements for models

* fixed code version

* changes for displaying reference and intel implementation tip

* added reference to installation page

* updated neural magic documentation

* Added links to the install page, redirect benchmarks page

* added tips about batch size and dataset for nvidia llama2

* fix conditions logic

* modified tips and additional run cmds

* sentence corrections

* Minor fix for the documentation

* fixed bug in deepsparse generic model stubs + styling

* added more information to stubs

* Added SCC24 readme, support reproducibility in the docs

* Made clear the custom CM repo URL format

* Support conditional implementation, setup and run tips

* Support rocm for sdxl

* Fix _short tag support

* Fix install URL

* Expose bfloat16 and float16 options for sdxl

* Expose download model to host option for sdxl

* IndySCC24 documentation added

* Improve the SCC24 docs

* Improve the support of short variation

* Improved the indyscc24 documentation

* Updated scc run commands

* removed test_query_count option for scc

* Remove scc24 in the main docs

* Remove scc24 in the main docs

* Fix docs: indentation issue on the submission page

* generalised code for skipping test query count

* Fixes for SCC24 docs

* Fix scenario text in main.py

* Fix links for scc24

* Fix links for scc24

* Improve the general docs

* Fix links for scc24

* Use float16 in scc24 doc

* Improve scc24 docs

* Improve scc24 docs

* Use float16 in scc24 doc

* fixed command bug

* Fix typo in docs

* Fix typo in docs

* Remove unnecessary indentation in docs

* initial commit for tip - native run CUDA

* Updated tip

* added docker_cm_repo_branch to more run option - docker

* Update docs for IndySCC24

* Support custom repo branch and owner for final report generation

* enabled amd implementation for llama2

* updates for amd - docs

* Fix scenarios in docs page

* formatted the files to pass the gh action

* scenarios -> fixed_scenarios in docs

* [Automated Commit] Format Codebase

* Update indyscc24-bert.md

* Update scc24.md

* updated tip for reference implementation (#1912)

* [Automated Commit] Format Codebase

* fix for run suffix (#1913)

* [Automated Commit] Format Codebase

* Update for adding submission flow diagram

* Added submission flow diagram

* Update scc24.md

* changes in submission documentation (#1946)

* update results category (#1947)

* changes for adding rgat to docs (#1965)

* Update index.md | Added R-GAT details (WIP)

* Update index.md

* Create system_requirements.yml

* Update system_requirements.yml

* Update system_requirements.yml

* Update system_requirements.yml

---------

Co-authored-by: anandhu-eng <anandhukicks@gmail.com>
Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
Co-authored-by: Pablo Gonzalez <pablo.gonzalez@factored.ai>
Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com>
Co-authored-by: Miro <mirhodak@amd.com>

* [Automated Commit] Format Codebase

* Update automated run command section - R-GAT (#1970)

* Update automated run command section

* add cm commands for model and dataset downloads

* Update README.md

* Update cm run cmds

---------

Co-authored-by: Miro <mirhodak@amd.com>

* Unify llama3 names to llama3.1-405b (#1982)

* Unify llama3 names to llama3.1-405b

* Set mlperf.conf name to llama3_1-405b

* Increment version to 5.0.4

* Create test-rgat.yml (#1984)

* Create test-rgat.yml

* Update test-rgat.yml

* Update test-rgat.yml

---------

Co-authored-by: Miro <mirhodak@amd.com>

* Update compliance test table (#1987)

Co-authored-by: Miro <mirhodak@amd.com>

* Create benchmark-checklist.md for r-gat (#1985)

* Create benchmark-checklist.md for r-gat

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

* Update benchmark-checklist.md

---------

Co-authored-by: Miro <mirhodak@amd.com>

* Increment version to 5.0.5

* Added python3.12, 3.13 to loadgen test

* Update format.yml | Don't format power_checker being synced from power-dev repo

* Update index.md | Update accuracy for r-gat

* Update benchmark-checklist.md for r-gat

* Update CM commands in R-GAT README.md

* Update README.md

* Create reset-branch.yml

* Create auto-update-dev.yml

* Tested and fixed SDXL README (#1997)

* Update SDXL README.md, improved CM commands

* Update README.md | Fix SDXL model download path

* Update README.md | Added cm command for downloading coco2014 size.50

* Update README.md | Fix SDXL calibration download command

* Update SDXL README.md

* Update README.md

* Update preprocess_submission.py

* Update README.md

* Update README.md | added the outdirname in the CM command

* Update README.md | added the outdirname in the CM Command

* include cm commands - accuracy and calibration

* Update README.md

* Update README.md | added the outdirname in the CM command

* Update README.md| added outdirname in the CM command

* Support audit.conf with static mlperf.conf

* Support audit.conf with static mlperf.conf

* [Automated Commit] Format Codebase

* Update test_settings_internal.cc | Fix conf_type usage

* Update test_settings_internal.cc

* Fixes to submission checker

* [Automated Commit] Format Codebase

* [Automated Commit] Format Codebase

* Update submission_checker.py | Fix rgat performance_sample_count

* Update evaluate-accuracy.py | Fixes #2008

* Update index.md

* Update index.md

* Update index.md

* Update submission generation steps (WIP)

* add submission generation graphs for local sync and through github repo (#2016)

* add graphs for local sync and through github repo

* Update index.md

* Update index.md

* Update index.md

* Update index.md

* Update index.md

* Update index.md

* Fixes to submission generation docs

* Fixes to submission generation docs

* Added link to the expected results folder structure

* add docs for llama3 + inference version upgrade (#2020)

* add docs for llama3 + inference version upgrade

* add output path and hf token

* Update CM run commands for llama3_1-405b (#2019)

* Update CM run commands for llama3_1-405b

* Update cm commands for llama3

* add information about hf tokens

* Fixes the submission README

* Update README.md

* Create test-submission-generation.yml

* Update test-submission-generation.yml

* Clean invalid model results in preprocess_submission script

* [Automated Commit] Format Codebase

* Fixes the submission README

* Update README.md

* Update README.md

* Update test-submission-generation.yml

---------

Co-authored-by: arjunsuresh <arjunsuresh@users.noreply.github.com>
Co-authored-by: Zhihan Jiang <68881590+nvzhihanj@users.noreply.github.com>
Co-authored-by: pgmpablo157321 <pgmpablo157321@users.noreply.github.com>
Co-authored-by: Miro <mirhodak@amd.com>
Co-authored-by: Pablo Gonzalez <pablo.gonzalez@factored.ai>
Co-authored-by: mlcommons-bot <mlcommons-bot@users.noreply.github.com>
Co-authored-by: mrmhodak <mrmhodak@users.noreply.github.com>
Co-authored-by: anandhu-eng <anandhukicks@gmail.com>
Co-authored-by: ANANDHU S <71482562+anandhu-eng@users.noreply.github.com>
Co-authored-by: Michael Goin <michael@neuralmagic.com>
Co-authored-by: Mitchelle Rasquinha <80070689+mrasquinha-g@users.noreply.github.com>
Co-authored-by: sahilavaran <139779393+sahilavaran@users.noreply.github.com>
13 people authored Jan 7, 2025
1 parent b9f22d6 commit d6c3a8d
Showing 27 changed files with 535 additions and 247 deletions.
34 changes: 34 additions & 0 deletions .github/workflows/auto-update-dev.yml
@@ -0,0 +1,34 @@
name: Auto-Update Dev Branch from Master

on:
  push:
    branches:
      - master  # Trigger workflow on commits to the 'master' branch

jobs:
  update-main:
    runs-on: ubuntu-latest
    permissions:
      contents: write  # Required to push to protected branches

    steps:
      - name: Checkout Dev Branch
        uses: actions/checkout@v4
        with:
          ref: dev
          fetch-depth: 0
          ssh-key: ${{ secrets.DEPLOY_KEY }}

      - name: Configure Git User
        run: |
          git config user.name "github-actions"
          git config user.email "github-actions@github.com"

      - name: Merge master into dev
        run: |
          git fetch origin master:master
          git merge --no-ff master -m "Auto-merge updates from master branch"

      - name: Push Changes to Dev
        run: |
          git push origin dev
1 change: 0 additions & 1 deletion .github/workflows/build_wheels.yml
@@ -7,7 +7,6 @@ on:
     branches:
       - master
       - loadgen-release
-      - dev
     paths:
       - loadgen/**
2 changes: 1 addition & 1 deletion .github/workflows/format.yml
@@ -38,7 +38,7 @@ jobs:
           for FILE in $(git diff --name-only $filter | grep -E '.*\.py$')
           do
             # Check if the file still exists in the working tree
-            if [ -f "$FILE" ]; then
+            if [ -f "$FILE" ] && [ "$FILE" != "tools/submission/power/power_checker.py" ]; then
               autopep8 --in-place -a "$FILE"
               git add "$FILE"
             fi
42 changes: 42 additions & 0 deletions .github/workflows/reset-branch.yml
@@ -0,0 +1,42 @@
name: Reset Current Branch to Upstream After Squash Merge

on:
  workflow_dispatch:
    inputs:
      branch:
        description: 'Branch to reset (leave blank for current branch)'
        required: false
        default: 'dev'

jobs:
  reset-branch:
    runs-on: ubuntu-latest

    steps:
      - name: Checkout Repository
        uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: Detect Current Branch
        if: ${{ inputs.branch == '' }}
        run: echo "branch=$(git rev-parse --abbrev-ref HEAD)" >> $GITHUB_ENV

      - name: Use Input Branch
        if: ${{ inputs.branch != '' }}
        run: echo "branch=${{ inputs.branch }}" >> $GITHUB_ENV

      - name: Add Upstream Remote
        run: |
          git remote add upstream https://github.com/mlcommons/inference.git
          git fetch upstream

      - name: Reset Branch to Upstream
        if: success()
        run: |
          git checkout ${{ env.branch }}
          git reset --hard upstream/${{ env.branch }}

      - name: Force Push to Origin
        if: success()
        run: |
          git push origin ${{ env.branch }} --force-with-lease
2 changes: 1 addition & 1 deletion .github/workflows/test-loadgen.yml
@@ -21,7 +21,7 @@ jobs:
     strategy:
       fail-fast: false
       matrix:
-        python-version: ["3.7", "3.8", "3.9", "3.10", "3.11"]
+        python-version: ["3.8", "3.9", "3.10", "3.11", "3.12", "3.13"]

     steps:
       - uses: actions/checkout@v3
52 changes: 52 additions & 0 deletions .github/workflows/test-submission-generation.yml
@@ -0,0 +1,52 @@
# This workflow will test the submission generation using MLPerf Automation

name: CM based Submission Generation

on:
  pull_request:
    branches: [ "master", "dev" ]
    paths:
      - '.github/workflows/test-submission-generation.yml'
      - '**'
      - '!**.md'

jobs:
  submission_generation:
    runs-on: ${{ matrix.os }}
    strategy:
      fail-fast: false
      matrix:
        os: [ubuntu-latest, windows-latest, macos-latest]
        python-version: [ "3.12" ]
        division: ["closed", "open", "closed-open"]
        category: ["datacenter", "edge"]
        case: ["closed"]
        action: ["run", "docker"]
        exclude:
          - os: macos-latest
          - os: windows-latest
          - category: "edge"

    steps:
      - uses: actions/checkout@v4
      - name: Set up Python ${{ matrix.python-version }}
        uses: actions/setup-python@v3
        with:
          python-version: ${{ matrix.python-version }}
      - name: Install dependencies
        run: |
          pip install cm4mlops
      - name: Pull repo where test cases are uploaded
        run: |
          git clone -b submission-generation-examples https://github.com/mlcommons/inference.git submission_generation_examples
      - name: Run Submission Generation - ${{ matrix.case }} ${{ matrix.action }} ${{ matrix.category }} ${{ matrix.division }}
        continue-on-error: true
        run: |
          if [ "${{ matrix.case }}" == "closed" ]; then
            description="Test submission - contains closed edge and datacenter"
          elif [ "${{ matrix.case }}" == "closed-power" ]; then
            description="Test submission - contains closed-power edge and datacenter results"
          fi
          # Dynamically set the log group to simulate a dynamic step name
          echo "::group::$description"
          cm ${{ matrix.action }} script --tags=generate,inference,submission --adr.compiler.tags=gcc --version=v5.0 --clean --preprocess_submission=yes --submission_base_dir=mysubmissions --results_dir=$PWD/submission_generation_tests/${{ matrix.case }}/ --run-checker --submitter=MLCommons --tar=yes --division=${{ matrix.division }} --env.CM_DETERMINE_MEMORY_CONFIGURATION=yes --quiet
          cm ${{ matrix.action }} script --tags=run,submission,checker --submitter_id_off=mysubmitter_id --tar=yes --submission_dir=mysubmissions/submissions --submission_tar_file=mysubmission.tar.gz
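The `echo "::group::$description"` line above relies on a GitHub Actions workflow command: anything printed between `::group::` and `::endgroup::` is collapsed in the job log under the given title, which lets a step carry a dynamically built heading. A minimal sketch (the `::endgroup::` marker shown here is how a group is normally closed; the workflow above leaves the group open until the step ends):

```shell
# Collapsible log group with a dynamic title, as used in the workflow step.
description="Test submission - contains closed edge and datacenter"
echo "::group::$description"   # opens the collapsed section in the Actions log
echo "running submission checker..."
echo "::endgroup::"            # closes the section
```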
41 changes: 41 additions & 0 deletions docs/benchmarks/language/get-llama3_1-405b-data.md
@@ -0,0 +1,41 @@
---
hide:
- toc
---

# Text Summarization using LLAMA3.1-405b

## Dataset

The benchmark implementation run command will automatically download the validation and calibration datasets and do the necessary preprocessing. If you want to download only the datasets, use the commands below.

=== "Validation"

    ### Get Validation Dataset
    ```
    cm run script --tags=get,dataset,mlperf,inference,llama3,_validation --outdirname=<path to download> -j
    ```

=== "Calibration"

    ### Get Calibration Dataset
    ```
    cm run script --tags=get,dataset,mlperf,inference,llama3,_calibration --outdirname=<path to download> -j
    ```

## Model

The benchmark implementation run command will automatically download the required model and do the necessary conversions. If you want to download only the official model, use the commands below.

Get the Official MLPerf LLAMA3.1-405b Model

=== "Pytorch"

    ### Pytorch
    ```
    cm run script --tags=get,ml-model,llama3 --outdirname=<path to download> --hf_token=<huggingface access token> -j
    ```

!!! tip

    Downloading the llama3.1-405B model from Hugging Face requires an [**access token**](https://huggingface.co/settings/tokens), which can be generated for your account. Additionally, ensure that your account has access to the [llama3.1-405B](https://huggingface.co/meta-llama/Llama-3.1-405B-Instruct) model.

13 changes: 13 additions & 0 deletions docs/benchmarks/language/llama3_1-405b.md
@@ -0,0 +1,13 @@
---
hide:
- toc
---

# Text Summarization using LLAMA3_1-405b

=== "MLCommons-Python"

    ## MLPerf Reference Implementation in Python

    {{ mlperf_inference_implementation_readme (4, "llama3_1-405b-99", "reference", devices=["CPU","CUDA"]) }}

    {{ mlperf_inference_implementation_readme (4, "llama3_1-405b-99.9", "reference", devices=["CPU","CUDA"]) }}
2 changes: 1 addition & 1 deletion docs/index.md
@@ -163,7 +163,7 @@ The currently valid [MLPerf Inference Benchmarks](index_gh.md) as of MLPerf infe
     - **Dataset Size**: 788,379
     - **QSL Size**: 788,379
     - **Number of Parameters**:
-    - **Reference Model Accuracy**: ACC = ?
+    - **Reference Model Accuracy**: ACC = 72.86%
     - **Server Scenario Latency Constraint**: N/A
     - **Equal Issue mode**: True
     - **High accuracy variant**: No