I'm proposing a full reorganization of the documentation based around the Divio system. Instead of a gallery and user guide, we would have the following. All of these sections will use the Ensaio data.
Getting started
Installing
A taste of Verde: Quick showcase of using Verde to generate a grid and a profile from a dataset (rough sketch after this list). Use a Cartesian dataset or just run it on lon,lat with a warning that this isn't the best way to do it. The main goal is to get people excited about Verde, so we don't need to explain a lot here. No weights, block means, CV, etc. At the end, point readers to the tutorial.
Citing Verde
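To make the scope concrete, here's a rough sketch of what the showcase could look like. The data and variable names are synthetic placeholders standing in for the Ensaio dataset, and the exact keyword arguments may differ a bit between Verde versions:

```python
import numpy as np
import verde as vd

# Fake scattered observations (the real page would load an Ensaio dataset)
rng = np.random.default_rng(42)
longitude = rng.uniform(-45, -40, 500)
latitude = rng.uniform(-25, -20, 500)
data = np.sin(np.radians(longitude)) + np.cos(np.radians(latitude))

# Fit a spline directly on lon,lat (with the caveat that projecting first
# is the better practice, as the tutorial will explain)
spline = vd.Spline()
spline.fit((longitude, latitude), data)

# Generate a regular grid and a profile between two points
grid = spline.grid(spacing=0.1, data_names=["dummy_data"])
profile = spline.profile(point1=(-45, -22), point2=(-40, -22), size=100)
print(grid)
print(profile.head())
```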
Tutorial
The tutorial should have several parts that grow in complexity and are meant to be followed in order by someone new to Verde. They can include some notes with further reading but shouldn't try to explain how things work under the hood or anything more advanced than what each one is aiming to tackle (leave that for other sections).
Creating your first grid: Show how to make a grid in Cartesian coordinates using BlockReduce, a projection, and Spline with default parameters (first sketch after this list). Go over how to add metadata to the grid and save it to netCDF. Use the bathymetry data.
Grids and profiles in geographic coordinates: Show how to pass a projection to the grid and profile methods to get geographic grids/profiles. How to edit the metadata to get proper names. Use the same data. After this, do all of the following tutorials using projections.
Using data weights and uncertainties: How to use weights in BlockMean/BlockReduce and Spline. First show how to manually add a weight to avoid fitting a point (use BlockReduce). Then show how to use uncertainties to add weights to BlockMean and Spline. Use the vertical GPS data which has weights.
Chaining operations: Use Chain to build a pipeline with BlockMean and Spline for the GPS data. Show how to access each individual step. This is important for the cross-validation section.
Evaluating interpolations through cross-validation: How to use cross-validation in Verde. Only use the blocked versions. Start with train_test_split, then show BlockKFold and cross_val_score. Explain why a Chain is needed and link to resources on leakage. Use the GPS data (see the chaining/cross-validation sketch after this list).
Selecting optimal spline parameters: How to do a grid search to find the best spline damping. Use the BlockKFold from the previous section in a loop (sketch after this list). Use the GPS data.
Interpolating 2D and 3D vectors: How to use Vector with a Chain of BlockMean and Spline to grid the 3 components at once. Then how to use VectorSpline2D on the horizontal components. Use the GPS data.
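A rough sketch of what the first two tutorials could boil down to. Everything below uses synthetic stand-in data and placeholder names (the real pages would load the Ensaio bathymetry):

```python
import numpy as np
import pyproj
import verde as vd

# Placeholder data; the tutorial would load the Ensaio bathymetry instead
rng = np.random.default_rng(0)
longitude = rng.uniform(-60, -55, 2000)
latitude = rng.uniform(10, 15, 2000)
bathymetry = -3000 + 500 * np.sin(np.radians(5 * longitude))

# Project to Cartesian coordinates before gridding
projection = pyproj.Proj(proj="merc", lat_ts=latitude.mean())
easting, northing = projection(longitude, latitude)

# Decimate with a blocked median to avoid aliasing, then fit a spline
reducer = vd.BlockReduce(reduction=np.median, spacing=5e3)
(easting, northing), depth = reducer.filter((easting, northing), bathymetry)
spline = vd.Spline()
spline.fit((easting, northing), depth)

# Grid back in geographic coordinates by passing the projection
grid = spline.grid(
    region=(-60, -55, 10, 15),
    spacing=2 / 60,
    projection=projection,
    dims=("latitude", "longitude"),
    data_names=["bathymetry_m"],
)

# Metadata edits and saving to netCDF would follow
grid.bathymetry_m.attrs["units"] = "meters"
grid.to_netcdf("bathymetry.nc")
```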
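Something like this for the chaining and cross-validation tutorials (again synthetic stand-ins for the GPS velocities and uncertainties):

```python
import numpy as np
import pyproj
import verde as vd

# Placeholder GPS-style data with 1-sigma uncertainties
rng = np.random.default_rng(1)
longitude = rng.uniform(-124, -115, 1000)
latitude = rng.uniform(32, 42, 1000)
velocity = rng.normal(0, 5e-3, longitude.size)
uncertainty = rng.uniform(1e-4, 1e-3, longitude.size)
weights = 1 / uncertainty**2

projection = pyproj.Proj(proj="merc", lat_ts=latitude.mean())
coordinates = projection(longitude, latitude)

# Pipeline: blocked weighted mean followed by a damped spline
chain = vd.Chain([
    ("mean", vd.BlockMean(spacing=10e3, uncertainty=True)),
    ("spline", vd.Spline(damping=1e-10)),
])

# Blocked train/test split to avoid inflated scores from spatial leakage
train, test = vd.train_test_split(
    coordinates, velocity, weights=weights, spacing=10e3, random_state=0
)
chain.fit(*train)
print("R² on the test set:", chain.score(*test))

# Individual steps stay accessible (named_steps dictionary, like sklearn pipelines)
print(chain.named_steps["spline"])

# Blocked K-fold cross-validation
kfold = vd.BlockKFold(spacing=10e3, n_splits=5, shuffle=True, random_state=0)
scores = vd.cross_val_score(chain, coordinates, velocity, weights=weights, cv=kfold)
print("Mean cross-validated R²:", np.mean(scores))
```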
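And for the parameter selection tutorial, a plain loop over candidate dampings scored with the blocked K-fold (same kind of placeholder data):

```python
import numpy as np
import pyproj
import verde as vd

# Placeholder GPS-style data, as in the previous sketch
rng = np.random.default_rng(1)
longitude = rng.uniform(-124, -115, 1000)
latitude = rng.uniform(32, 42, 1000)
velocity = rng.normal(0, 5e-3, longitude.size)
weights = 1 / rng.uniform(1e-4, 1e-3, longitude.size) ** 2
coordinates = pyproj.Proj(proj="merc", lat_ts=latitude.mean())(longitude, latitude)

kfold = vd.BlockKFold(spacing=10e3, n_splits=5, shuffle=True, random_state=0)
dampings = [None, 1e-12, 1e-10, 1e-8, 1e-6]
scores = []
for damping in dampings:
    chain = vd.Chain([
        ("mean", vd.BlockMean(spacing=10e3, uncertainty=True)),
        ("spline", vd.Spline(damping=damping)),
    ])
    score = np.mean(
        vd.cross_val_score(chain, coordinates, velocity, weights=weights, cv=kfold)
    )
    scores.append(score)

best = int(np.argmax(scores))
print("Best damping:", dampings[best], "with mean R² =", scores[best])
```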
How to
These should be short and to the point, focused on the problem presented. They can assume that people have done the tutorials.
Decimate large datasets: How to turn a large dataset into a smaller one with BlockReduce. Particularly when data are oversampled along tracks. The bathymetry data is good for this.
Interpolate large datasets: For more than ~10k points the Spline is too heavy. Say to use KNeighbors instead and show an example using it on a full lidar dataset (maybe the volcano one) with no BlockReduce (sketch after this list). Try doing CV to find the number of neighbors.
Project a grid: Make a synthetic lon,lat grid and project it to polar coordinates (sketch after this list).
Select points inside a region: How to use vd.inside to index points inside a region.
Mask grid points too far from data points: Use verde.distance_mask. The Trail island dataset + KNeighbors is a good combination for this (see the masking sketch after this list).
Mask grid points outside of the data convex hull: Use convexhull_mask. Maybe use the Bushveld height data for this.
Estimate and remove a polynomial trend: Fit a trend, remove it from the point data, and grid the trend (rough sketch covering this and the blocking how-tos after this list).
Calculate statistics on spatial bins: Use BlockReduce to calculate standard deviation within blocks on volcano lidar as a measure of roughness.
Split point data into spatial blocks: Run block_split on pretty much any dataset and show how to loop over the blocks.
Split point data into rolling windows: Run rolling_window to split the data and loop over the windows.
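Sketch for the large-dataset how-to, assuming KNeighbors plus a quick blocked cross-validation loop to pick the number of neighbors (synthetic stand-in for the lidar data; the dataset choice is still open):

```python
import numpy as np
import verde as vd

# Placeholder for a large lidar-style dataset (hundreds of thousands of points)
rng = np.random.default_rng(2)
easting = rng.uniform(0, 10e3, 200_000)
northing = rng.uniform(0, 10e3, 200_000)
elevation = np.sin(easting / 1e3) * np.cos(northing / 1e3)

# Pick the number of neighbors with blocked cross-validation
kfold = vd.BlockKFold(spacing=1e3, n_splits=5, shuffle=True, random_state=0)
scores = {}
for k in [1, 5, 10, 20]:
    gridder = vd.KNeighbors(k=k)
    scores[k] = np.mean(
        vd.cross_val_score(gridder, (easting, northing), elevation, cv=kfold)
    )
best_k = max(scores, key=scores.get)

# Grid the full dataset with the best k, no BlockReduce needed
grid = vd.KNeighbors(k=best_k).fit((easting, northing), elevation).grid(spacing=50)
print(grid)
```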
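For projecting a grid, something along these lines (a polar stereographic projection is just an example choice here, and the project_grid keyword defaults should be double-checked):

```python
import numpy as np
import pyproj
import verde as vd
import xarray as xr

# Build a synthetic lon,lat grid
longitude = np.linspace(-70, -60, 101)
latitude = np.linspace(-40, -30, 101)
lon2d, lat2d = np.meshgrid(longitude, latitude)
values = np.sin(np.radians(lon2d)) * np.cos(np.radians(lat2d))
grid = xr.DataArray(
    values,
    coords={"latitude": latitude, "longitude": longitude},
    dims=("latitude", "longitude"),
    name="synthetic",
)

# Project it to Cartesian coordinates (polar stereographic as an example)
projection = pyproj.Proj(proj="stere", lat_0=-90, lat_ts=-71)
projected = vd.project_grid(grid, projection)
print(projected)
```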
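The selection and masking how-tos could share a structure like this. I'm assuming the convex hull function is spelled convexhull_mask; worth double-checking against the API reference:

```python
import numpy as np
import verde as vd

# Synthetic scattered data standing in for the real datasets
rng = np.random.default_rng(3)
easting = rng.uniform(0, 100e3, 1000)
northing = rng.uniform(0, 100e3, 1000)
data = np.hypot(easting, northing)

# Select only points that fall inside a sub-region
region = (20e3, 80e3, 20e3, 80e3)
are_inside = vd.inside((easting, northing), region)
print(are_inside.sum(), "points inside", region)

# Grid the data, then mask nodes that are too far from any observation
grid = vd.KNeighbors(k=10).fit((easting, northing), data).grid(spacing=1e3)
masked_distance = vd.distance_mask((easting, northing), maxdist=5e3, grid=grid)

# Alternatively, mask nodes outside the convex hull of the data
masked_hull = vd.convexhull_mask((easting, northing), grid=grid)
```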
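Trend removal and the blocking how-tos could look roughly like this (synthetic data again; the rolling_window indexing pattern should be checked against the current docstring):

```python
import numpy as np
import verde as vd

rng = np.random.default_rng(4)
easting = rng.uniform(0, 50e3, 2000)
northing = rng.uniform(0, 50e3, 2000)
data = 1e-4 * easting - 2e-4 * northing + rng.normal(0, 0.5, easting.size)

# Fit and remove a 1st-degree polynomial trend
trend = vd.Trend(degree=1)
trend.fit((easting, northing), data)
residuals = data - trend.predict((easting, northing))

# Blocked statistics: standard deviation per block as a roughness proxy
reducer = vd.BlockReduce(reduction=np.std, spacing=5e3)
block_coords, block_std = reducer.filter((easting, northing), data)

# Split the points into blocks and loop over them using the labels
block_centers, labels = vd.block_split((easting, northing), spacing=10e3)
for label in np.unique(labels):
    in_block = labels == label
    # work on easting[in_block], northing[in_block], data[in_block]

# Rolling (overlapping) windows: each entry of `indices` can be used to
# index the original arrays
window_coords, indices = vd.rolling_window(
    (easting, northing), size=20e3, spacing=10e3
)
for index in indices.ravel():
    window_data = data[index]
```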
Explanations
These are meant to explain how things work and why they are that way.
Weights and uncertainties in data decimation: How weights work in BlockMean and BlockReduce. This actually goes into detail about what each of them means, not just how to pass the weights.
Adjust spacing or region in grid coordinates: How these work and what it looks like when they are changed (small sketch after this list).
Grid-node and pixel registration: Explain the difference and what each looks like when we make a grid (covered by the same sketch after this list).
How spline interpolation works: Theory behind the spline interpolation. Build the Jacobian matrix and solve the linear system by hand (sketch after this list). Show how to make a prediction.
Conventions and definitions: List of conventions and definitions used throughout the project.
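For the registration and spacing/region-adjustment pages, a tiny example makes the differences obvious:

```python
import verde as vd

region = (0, 10, 0, 10)

# Grid-node vs pixel registration with the same region and spacing
node_coords = vd.grid_coordinates(region, spacing=1)
pixel_coords = vd.grid_coordinates(region, spacing=1, pixel_register=True)
print(node_coords[0].shape)   # (11, 11): nodes fall on the region boundaries
print(pixel_coords[0].shape)  # (10, 10): one fewer point per dimension

# When the spacing doesn't divide the region exactly, either the spacing
# or the region gets adjusted to compensate
adjusted_spacing = vd.grid_coordinates(region, spacing=3, adjust="spacing")
adjusted_region = vd.grid_coordinates(region, spacing=3, adjust="region")
```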
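For the spline explanation, the by-hand version could be as short as this. The Green's function below, r**2 * (ln(r) - 1), is the biharmonic form I believe our Cartesian Spline uses (Sandwell, 1987); the page should state the exact expression from the code/paper:

```python
import numpy as np

def greens_function(east, north):
    "Biharmonic spline Green's function, with the r=0 singularity set to 0."
    distance = np.hypot(east, north)
    result = np.zeros_like(distance)
    nonzero = distance > 0
    result[nonzero] = distance[nonzero] ** 2 * (np.log(distance[nonzero]) - 1)
    return result

# Synthetic observations, with forces located at the data points
rng = np.random.default_rng(5)
easting = rng.uniform(0, 100, 50)
northing = rng.uniform(0, 100, 50)
data = np.sin(easting / 20) + np.cos(northing / 20)

# Jacobian: one row per data point, one column per force
jacobian = greens_function(
    easting[:, np.newaxis] - easting[np.newaxis, :],
    northing[:, np.newaxis] - northing[np.newaxis, :],
)
forces = np.linalg.lstsq(jacobian, data, rcond=None)[0]

# Predicting at new points is a matrix-vector product with a new Jacobian
east_new = np.linspace(0, 100, 11)
north_new = np.full(east_new.shape, 50.0)
prediction = greens_function(
    east_new[:, np.newaxis] - easting[np.newaxis, :],
    north_new[:, np.newaxis] - northing[np.newaxis, :],
) @ forces
print(prediction)
```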
Reference documentation
What we already have.
API
References
Changelog
Version compatibility
Documentation for other versions
A lot of the existing docs can be repurposed for this. The main things I would change are the datasets used, plus updating the writing and using more notes/hints/etc.
This can be done in parts, one section at a time. When it's all done, we can delete the sphinx-gallery parts and remove the sample data.
I saw that talk a while ago and it stayed in the back of my mind all this time. I think I've finally digested it enough to come up with this plan. Hope it works!
The first tutorial of the new documentation structure (see #433). Trying to keep it as simple as possible: how to generate a grid from some data. No cross-validation or other fancy things if it can be avoided.