Libcint #113
…in STO3G. Errors are ~ 1e-7.
…ch of stuff: (1) float/double implicit slowdown (2) parallelize over tiles (3) control number of rys_roots/c2s code we store on tiles
…checking if introduced any bugs, comparing different basis sets, molecules, etc.
Input<Vector<float>> mat;
Input<Vector<int>> shls_slice;
Input<Vector<int>> ao_loc;
Input<Vector<int>> atm;
Input<Vector<int>> bas;
Input<Vector<float>> env;
Input<Vector<int>> natm;
Input<Vector<int>> nbas;
Input<Vector<int>> which_integral;
Input<Vector<int>> comp;
Similarly sizes here
cintopt = None

# type
float32 = "#define dtype float" in open("libcint.c", "r").read()
I assume this is not going to be called a lot? It will be expensive, and after ~1000 calls will fail due to running out of file handles.
Suggest a @memoized function called e.g. libcint_is_float32(), which closes the file.
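A minimal sketch of what that could look like, assuming the standard library's functools.lru_cache stands in for @memoized and that the path is passed in as a parameter (the default "libcint.c" mirrors the snippet above; the parameter itself is an assumption for illustration):

```python
from functools import lru_cache


@lru_cache(maxsize=None)
def libcint_is_float32(path="libcint.c"):
    """Return True if the C source at `path` contains "#define dtype float".

    The result is cached per path, so the file is read at most once.
    """
    # The `with` block closes the file handle as soon as the read finishes.
    with open(path, "r") as f:
        return "#define dtype float" in f.read()
```

Call sites would then use `float32 = libcint_is_float32()` instead of opening the file inline.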
Or just compute it up at line 19, where you are already interacting with the filesystem.
I assume this is not going to be called a lot?
Once per integral type during tracing (i.e. at most 8 times). I'd think of this as increasing compile time from 2min to 2min 100ms. We can refactor to do it once, which makes it 2min 10ms.
from tessellate_ipu import create_ipu_tile_primitive, ipu_cycle_count, tile_map, tile_put_sharded, tile_put_replicated
vertex_filename = osp.join(osp.dirname(__file__), "libcint.cpp")
#mat, shls_slice, ao_loc, atm, bas, env
grad = create_ipu_tile_primitive(
Again, probably want to do this only once?
cintopt = None
prescreen = lib.c_null_ptr()
ic(intor_name, 'GTOnr2e_fill_'+aosym, "GTOnr2e_fill_drv")
ic = icecream? Probably remove?
Done.
# type
float32 = "#define dtype float" in open("libcint.c", "r").read()
if float32:
    out = out.astype(np.float32)
    env = env.astype(np.float32)
There's a lot of duplicated code here - and a lot of it is slow - so it feels as if there might be an opportunity to dedup.
There's a lot of duplicated code here - and a lot of it is slow - so it feels as if there might be an opportunity to dedup.
Sure, let's make those refactors. The heavy compute part is the 20k lines of C. Most of the above is the easy 1% optimizations I was hoping to off-load.
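One possible shape for the dedup, as a rough sketch only: a single helper that applies the float32 cast to whichever floating-point arrays a call site passes in. The name cast_if_float32 and its signature are hypothetical, not something already in this PR, and integer arrays such as atm and bas are assumed to be left out by the caller.

```python
import numpy as np


def cast_if_float32(use_float32, *arrays):
    """Return `arrays` cast to np.float32 when `use_float32` is True.

    When `use_float32` is False the inputs are returned unchanged.
    """
    if not use_float32:
        return arrays
    # astype returns a new array, so the caller's originals are not modified.
    return tuple(np.asarray(a).astype(np.float32) for a in arrays)


# Example call site, replacing the repeated `if float32:` blocks.
out, env = cast_if_float32(True, np.zeros(4), np.ones(3))
```

Each integral wrapper would then need one line for the cast rather than its own copy of the branch.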
Co-authored-by: Andrew Fitzgibbon <awf@graphcore.ai>
…aining long term solution.
Got all integrals (except grad of kinetic) passing test case. I'd like to get this on main branch so we can start