Is your feature request related to a problem? Please describe.
Many of my algorithms execute a series of CUB functions with varying num_items values, and allocating d_temp_storage for each call is cost-prohibitive. I cannot predict num_items ahead of time, but I can guarantee a tight upper bound on it. I currently query temp_storage_bytes with this bound and allocate a single block of that size to reuse for all subsequent CUB calls. This works in practice but, as far as I can tell, is not formally guaranteed to work.
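For concreteness, the pattern described above might look roughly like the following. This is only a minimal sketch: `cub::DeviceReduce::Sum` stands in for whichever CUB functions are actually used, and the bound `max_num_items`, the list of sizes, and the `CUDA_CHECK` helper are hypothetical names chosen for illustration.

```cuda
#include <cub/cub.cuh>
#include <cuda_runtime.h>

#include <cstddef>
#include <cstdio>
#include <cstdlib>

// Hypothetical helper: abort on any CUDA error instead of ignoring it.
#define CUDA_CHECK(call)                                                   \
  do {                                                                     \
    cudaError_t err_ = (call);                                             \
    if (err_ != cudaSuccess) {                                             \
      std::fprintf(stderr, "CUDA error: %s\n", cudaGetErrorString(err_));  \
      std::exit(EXIT_FAILURE);                                             \
    }                                                                      \
  } while (0)

int main() {
  // Assumed tight upper bound on num_items (illustrative value).
  const int max_num_items = 1 << 20;

  int* d_in = nullptr;
  int* d_out = nullptr;
  CUDA_CHECK(cudaMalloc(&d_in, max_num_items * sizeof(int)));
  CUDA_CHECK(cudaMalloc(&d_out, sizeof(int)));
  CUDA_CHECK(cudaMemset(d_in, 0, max_num_items * sizeof(int)));

  // Size query: when d_temp_storage is null, CUB does no work and only
  // writes the required size into the temp_storage_bytes argument.
  std::size_t max_temp_storage_bytes = 0;
  CUDA_CHECK(cub::DeviceReduce::Sum(nullptr, max_temp_storage_bytes,
                                    d_in, d_out, max_num_items));

  // Allocate once, sized for the bound.
  void* d_temp_storage = nullptr;
  CUDA_CHECK(cudaMalloc(&d_temp_storage, max_temp_storage_bytes));

  // Reuse the same block for smaller inputs. This step relies on the
  // requirement for num_items <= max_num_items never exceeding the
  // requirement queried at the bound, which is the property this issue
  // asks to have guaranteed.
  const int sizes[] = {1000, 250000, max_num_items};
  for (int num_items : sizes) {
    // The size is taken by reference, so pass a copy of the capacity.
    std::size_t temp_storage_bytes = max_temp_storage_bytes;
    CUDA_CHECK(cub::DeviceReduce::Sum(d_temp_storage, temp_storage_bytes,
                                      d_in, d_out, num_items));
  }
  CUDA_CHECK(cudaDeviceSynchronize());

  CUDA_CHECK(cudaFree(d_temp_storage));
  CUDA_CHECK(cudaFree(d_out));
  CUDA_CHECK(cudaFree(d_in));
  return 0;
}
```

The single allocation is sized from one query at the bound, so every later call with a smaller num_items is only safe if its true requirement does not exceed that queried size.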
Describe the solution you'd like
Guarantee that, for the same function and the same template parameters, the output temp_storage_bytes value is a monotonically non-decreasing function of num_items.
Describe alternatives you've considered
Nvidians: don't suggest workarounds and alternative solutions before reading internal discussion on Slack cdd-cub.
Additional context
No response
Thanks for submitting this issue - the CCCL team has been notified and we'll get back to you as soon as we can!
In the meantime, feel free to add any relevant information to this issue.
Is this a duplicate?
Area
CUB