Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Make matrix mod 2 conversion to numpy faster & some semantic fixes #39152

Open
wants to merge 1 commit into
base: develop
Choose a base branch
from

Conversation

user202729
Copy link
Contributor

@user202729 user202729 commented Dec 18, 2024

As in the title. Plus a few minor changes as needed.

Reuses numpy_util module from #38834 for the utility function…
in retrospect it might have been placed in the wrong place. (?)

📝 Checklist

  • The title is concise and informative.
  • The description explains in detail what this PR is about.
  • I have linked a relevant issue or discussion.
  • I have created tests covering the changes.
  • I have updated the documentation and checked the documentation preview.

@user202729 user202729 force-pushed the faster-to-numpy branch 2 times, most recently from 97b8c50 to bfb90b9 Compare December 18, 2024 06:43
@@ -303,7 +304,7 @@ def process_block(block, src_in_lines, file_optional_tags, venv_explainer=''):
got = re.sub(r'(doctest:warning).*^( *DeprecationWarning:)',
r'\1...\n\2',
got, 1, re.DOTALL | re.MULTILINE)
got = got.splitlines() # got can't be the empty string
got = textwrap.dedent(got).splitlines() # got can't be the empty string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously the fix made by the script would look like

sage: b = numpy.array(a); b
array([[ 0,  1,  2,  3],
[ 4,  5,  6,  7],
[ 8,  9, 10, 11]])

because each line is individually lstrip-ed. With the change it becomes

sage: b = numpy.array(a); b
array([[ 0,  1,  2,  3],
       [ 4,  5,  6,  7],
       [ 8,  9, 10, 11]])

Copy link

github-actions bot commented Dec 18, 2024

Documentation preview for this PR (built with commit b1d65a8; changes) is ready! 🎉
This preview will update shortly after each push to this PR.

"""
if copy is not _MISSING:
from sage.misc.superseded import deprecation
deprecation(39152, "passing copy argument to numpy() is deprecated")
Copy link
Contributor Author

@user202729 user202729 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I decide to deprecate this feature because:

  • copy=False was never supported in the first place (it always copies)
  • copy=False implicitly copies is incompatible with numpy 2.0 interface where np.array(..., copy=False) raises ValueError if a copy is made
  • copy=* doesn't work in numpy-based matrices (even after Fix matrix coercion with numpy 2.1 #38683 )
  • it seems dangerous to expose the internal array (which will change on mutation on the original object, which requires implementation to use the exact dtype otherwise user visible change will be seen)

20000
"""
from ..modules.numpy_util import mzd_matrix_to_numpy
return mzd_matrix_to_numpy(<uintptr_t>self._entries, dtype)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This method was the original plan (it overrides numpy() method of parent to provide a fast path)

return self._matrix_numpy.copy()
else:
return Matrix_dense.numpy(self, dtype=dtype)
return np.array(self._matrix_numpy, dtype=dtype)
Copy link
Contributor Author

@user202729 user202729 Dec 18, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Simplification.

Also the old code made a call to generic Matrix_dense.numpy(self, dtype=dtype) method which is obviously much slower than this.

+ the old code documentation is in fact incorrect because the __array__ method of Matrix_numpy_dense class is not overridden, so the method is not called inside np.array(...). Of course it works because of the generic (slow) implementation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what's wrong with flintlib/flint#2027 so that it "prevents testing" ?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, nothing. I just mean that I need a version of flint after that pull request to allow testing, and I haven't gotten around to figure out how to install it from source yet (since latest version on conda-forge didn't have that pull request merged)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in Sage we have a workaround (some macro magic) for this Flint issue installed. (Flint still hasn't released a version with this fix merged)

@user202729 user202729 changed the title Make matrix mod 2 conversion to numpy faster Make matrix mod 2 conversion to numpy faster & some semantic fixes Dec 18, 2024
@user202729 user202729 marked this pull request as draft December 28, 2024 10:05
@user202729 user202729 marked this pull request as ready for review December 28, 2024 14:50
@user202729
Copy link
Contributor Author

I tried to install numpy 2.1.0 with pip (in conda environment) and monkey-patch flint headers to change the I.

The new tests pass, but when do a --all test there are a bunch of

    ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
    ImportError: numpy.core.multiarray failed to import
    AttributeError: _ARRAY_API not found

How do I install numpy 2.1.0 properly? ( @antonio-rojas ?)

It would be easier if we can just make it one of the checks for the CI, although it has significant overhead, especially on pull requests that doesn't touch the numpy part.

@dimpase
Copy link
Member

dimpase commented Dec 29, 2024

to allow unfixed flint, one can do a bit of C macro hackery (a branch doing this by mkoeppe is floating somewhere, I can't find it afk) - is it what you refer to as monkey-patching?

@user202729
Copy link
Contributor Author

user202729 commented Jan 1, 2025

@dimpase Do you know why fpylll doesn't work with numpy==2.0.0 (/ what's expected time frame for it to be supported / or if there's any way around it?)

@dimpase
Copy link
Member

dimpase commented Jan 1, 2025

no idea, but surely fpylll works with numpy 2.

vbraun pushed a commit to vbraun/sage that referenced this pull request Jan 1, 2025
    
As in the title. Estimated 50× speedup.


### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [x] I have updated the documentation and checked the documentation
preview.

(Reuses the `numpy_util` module from sagemath#38834 for the utility function)


Reverse direction: sagemath#39152
    
URL: sagemath#39219
Reported by: user202729
Reviewer(s): Travis Scrimshaw
vbraun pushed a commit to vbraun/sage that referenced this pull request Jan 3, 2025
    
As in the title. Estimated 50× speedup.


### 📝 Checklist

<!-- Put an `x` in all the boxes that apply. -->

- [x] The title is concise and informative.
- [x] The description explains in detail what this PR is about.
- [x] I have linked a relevant issue or discussion.
- [x] I have created tests covering the changes.
- [x] I have updated the documentation and checked the documentation
preview.

(Reuses the `numpy_util` module from sagemath#38834 for the utility function)


Reverse direction: sagemath#39152
    
URL: sagemath#39219
Reported by: user202729
Reviewer(s): Travis Scrimshaw
@antonio-rojas
Copy link
Contributor

The new tests pass, but when do a --all test there are a bunch of

    ValueError: numpy.dtype size changed, may indicate binary incompatibility. Expected 96 from C header, got 88 from PyObject
    ImportError: numpy.core.multiarray failed to import
    AttributeError: _ARRAY_API not found

How do I install numpy 2.1.0 properly? ( @antonio-rojas ?)

I know nothing about conda, but upgrading numpy requires a rebuild of everything that uses its C API.

@user202729
Copy link
Contributor Author

user202729 commented Jan 4, 2025

Nice, conda-forge rebuilt fpylll with numpy 2.0. Should be alright to just use conda-provided fpylll then. (https://github.com/conda-forge/fpylll-feedstock/pull/32)

@user202729
Copy link
Contributor Author

Alright, I tested this on numpy 2 and everything passes — with irrelevant failures fixed by #39242 .

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants