how to modularize PredictionEnsemble.bootstrap()
#801
Replies: 5 comments
-
@aaronspring, is this still relevant? I'm just going through GitHub notifications now, learning to use them since my email inbox gets piled up with GitHub stuff. To reiterate an exchange with Steve Yeager, which is my goal for modularizing the bootstrapping methods:
-
I think this is still relevant. Currently we just have my do-it-all bootstrap function. Before breaking it down into smaller units, I think we need to address my post above.
-
I think we first need a plan for how the final API should look. My proposal: the goal is to get
skill = hindcast.verify(...)  # no iteration
hindcast_resampled = hindcast.resample_iterations(replace=True, iterations=10, blocksize=maxlead)  # return iterations also for uninitialized if present
resampled_skill = hindcast_resampled.verify()
p_or_ci = resampled_skill.get_p_value()
# shortcut
p = hindcast.resample_iterations(replace=True, iterations=10).verify(...).get_p_value()
skill.where(p).plot()  # mask and plot
For the above implementation we only need to add
I think we should have this clean BEFORE implementing this feature into
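A toy sketch of the resample, then verify, then p-value chain proposed above, in plain Python rather than climpred. Everything here is an assumption for illustration: the helper names `resample_indices`, `verify_resampled` and `p_value` are hypothetical, MSE stands in for the metric, and the p-value is taken as the fraction of iterations in which the uninitialized reference is at least as good as the initialized forecast.

```python
import random
from statistics import mean


def mse(forecast, observed):
    """Mean squared error over the init dimension (lower is better)."""
    return mean((f - o) ** 2 for f, o in zip(forecast, observed))


def resample_indices(n, iterations, rng):
    """Step 1: draw init indices with replacement, one list per iteration."""
    return [[rng.randrange(n) for _ in range(n)] for _ in range(iterations)]


def verify_resampled(forecast, observed, index_sets):
    """Step 2: compute the metric separately for every resampled iteration."""
    return [
        mse([forecast[i] for i in idx], [observed[i] for i in idx])
        for idx in index_sets
    ]


def p_value(initialized_skill, reference_skill):
    """Step 3: fraction of iterations where the reference is at least as good."""
    hits = sum(r <= s for s, r in zip(initialized_skill, reference_skill))
    return hits / len(initialized_skill)


rng = random.Random(0)
n_inits = 50
observed = [rng.gauss(0.0, 1.0) for _ in range(n_inits)]
initialized = [o + rng.gauss(0.0, 0.3) for o in observed]      # tracks obs
uninitialized = [rng.gauss(0.0, 1.0) for _ in range(n_inits)]  # unrelated to obs

skill = mse(initialized, observed)  # from the original data, no iteration
index_sets = resample_indices(n_inits, iterations=200, rng=rng)
resampled_skill = verify_resampled(initialized, observed, index_sets)
resampled_reference = verify_resampled(uninitialized, observed, index_sets)
p = p_value(resampled_skill, resampled_reference)
```

The same index sets are reused for the initialized and uninitialized forecasts so that each iteration compares like with like. Whether the real implementation should share indices this way is a design choice, not something settled in this thread.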
-
All this resampling works nicely for
-
Why not speed this up by doing:
# if resample_dim == 'init' and 'init' in dim and metric != 'pearson_r'
resample_dim = 'init'
skill = he.generate_uninitialized().verify(metric=metric, comparison='m2o', dim=[], reference=references, alignment=alignment)
resampled = resample_func(skill, iterations, dim=resample_dim).mean(dim)
resampled.sst.plot(hue='iteration', col='skill')
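A small sketch of why this shortcut can work, assuming the metric is a plain average of per-init terms (MSE here). In that case, resampling the per-init errors and averaging gives the same number as resampling the data and recomputing the metric. The variable names are made up for the illustration, and nothing below calls climpred.

```python
import random
from statistics import mean

rng = random.Random(1)
n_inits = 20
observed = [rng.gauss(0.0, 1.0) for _ in range(n_inits)]
forecast = [o + rng.gauss(0.0, 0.5) for o in observed]

# One bootstrap draw of the init dimension (with replacement).
idx = [rng.randrange(n_inits) for _ in range(n_inits)]

# Slow path: resample forecast and observations, then compute MSE over init.
slow = mean((forecast[i] - observed[i]) ** 2 for i in idx)

# Fast path: compute the per-init squared error once (the dim=[] step),
# then resample those values and average over init.
per_init_error = [(f - o) ** 2 for f, o in zip(forecast, observed)]
fast = mean(per_init_error[i] for i in idx)
```

A correlation coefficient does not decompose into independent per-init terms, because it needs the means and variances of the resampled pairs jointly. That is presumably why the condition above excludes `pearson_r`.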
-
We need bootstrapping to get a measure of significance, but skill we just want to calculate from the original (not resampled) data.
We agree that, from a computational point of view, we first resample and then calculate skill on the resampled data. How do we want to implement this in our classes?
I see two ways with iterations=100:
pmb to calc p value there
pro: ?
con: new created object, easy to rerun pm and pmb separately
pro: only one object
con: annoying to remember first iteration original data
Comments:
pm.bootstrap anyways but rather pm.resample(iterations)
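To make the trade-off between "new created object" and "only one object" concrete, here is a minimal sketch with hypothetical toy classes (`ToyEnsemble`, `ToyResampled`) that are not part of climpred. It assumes that in the single-object variant the original data sits at iteration 0, followed by the resampled copies.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class ToyResampled:
    """Hypothetical container: one tuple of per-init values per iteration."""
    iterations: tuple


@dataclass(frozen=True)
class ToyEnsemble:
    """Hypothetical stand-in for a prediction ensemble: one value per init."""
    values: tuple

    def _draws(self, iterations, seed):
        # Resample the init dimension with replacement, once per iteration.
        rng = random.Random(seed)
        n = len(self.values)
        for _ in range(iterations):
            yield tuple(self.values[rng.randrange(n)] for _ in range(n))

    def resample(self, iterations, seed=0):
        """Way 1: return a new object and leave this one untouched."""
        return ToyResampled(tuple(self._draws(iterations, seed)))

    def resample_keep_original(self, iterations, seed=0):
        """Way 2: a single object whose iteration 0 is the original data."""
        return ToyResampled((self.values,) + tuple(self._draws(iterations, seed)))


pm = ToyEnsemble(values=(1.0, 2.0, 3.0, 4.0))
pmb = pm.resample(iterations=100)                     # pm and pmb live side by side
combined = pm.resample_keep_original(iterations=100)  # original at index 0
```

With the first way, skill comes from `pm` and significance from `pmb`. With the second, every downstream step has to remember that `combined.iterations[0]` is special.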
Your opinions? @bradyrx @ahuang11