[SKETCH] Add xarray reader support for item endpoints #2

Closed
wants to merge 11 commits into from

abarciauskas-bgse

This PR is just the start of what I thought we might need to implement for MS PC titiler with both xarray and rasterio support. I was scoping this work to just the item endpoint and for tiling either COGs or NetCDF.

High level, what needs to happen is that when a tiles request comes in to the items endpoint /collections/{collection_id}/items/{item_id}, it is required to have the assets parameter, and depending on the media type of that asset, the item will be tiled using either the XarrayReader or rasterio Reader.

In this PR:

  1. When the request comes in, the ItemId path dependency makes a STAC API request to set the item argument to the tiles endpoint.
  2. The inner tile function uses the StacReader to determine the asset's media type (assuming just one asset for now) and then selects which asset reader to use
  3. (Not yet in this PR): handle sending the right set of parameters to each asset reader
  4. (Not yet in this PR): Re-implement parts of the ZarrReader we need to complete the XarrayReader
  5. (Not yet in this PR): Testing
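
A minimal sketch of the media-type dispatch described in step 2, in plain Python with no titiler or rio-tiler imports. The helper name `select_reader_name` and the two media-type sets are assumptions for illustration (the strings are the ones commonly seen in STAC asset metadata); the actual PR code may organize this differently.

```python
from typing import Dict

# Assumed media types; adjust to whatever the target STAC API returns.
COG_TYPES = {"image/tiff; application=geotiff; profile=cloud-optimized"}
NETCDF_TYPES = {"application/x-netcdf", "application/netcdf"}


def select_reader_name(item: Dict, asset_name: str) -> str:
    """Return "rasterio" or "xarray" for the named asset (hypothetical helper)."""
    assets = item.get("assets", {})
    if asset_name not in assets:
        raise KeyError(f"asset {asset_name!r} not found in item {item.get('id')!r}")

    media_type = assets[asset_name].get("type", "")
    if media_type in COG_TYPES:
        return "rasterio"
    if media_type in NETCDF_TYPES:
        return "xarray"
    raise ValueError(f"unsupported media type {media_type!r} for asset {asset_name!r}")
```

The returned name would then be mapped to the actual reader class inside the tile function, which keeps the dispatch testable without opening any data.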

Questions

  • there are 2 levels of reader here - StacReader which then determines the asset reader (Xarray or Rasterio) - should we instead call this StacBackend like titiler-cmr has CmrBackend?

Todos in this PR can be seen in the files, but should also include handling of multiple assets.

Later PRs - we could create issues for these if we un-fork this repo:

  • info and tilejson endpoints for items
  • handle multiple assets ("band math")
  • design and implement collections endpoint which accepts STAC query parameters and works for both COG collections (mosaic) and NetCDF collections, where those collections have a single xarray-readable (virtual zarr) endpoint or the query parameters return a single item

@vincentsarago I opened this PR so you could take a look and provide some feedback if it's on the right path. But I then realized we need the collections endpoint more than this items endpoint -- for the STAC explorer to work, it will be using STAC queries not item ids -- so I will start looking into that.

@vincentsarago
Member

But I then realized we need the collections endpoint more than this items endpoint -- for the STAC explorer to work, it will be using STAC queries not item ids -- so I will start looking into that.

I think we will use the /collections/{collection_id} endpoint, yes. If we don't want/need the Item endpoint, we won't need a proper STAC reader but a simpler one like https://github.com/stac-utils/titiler-pgstac/blob/main/titiler/pgstac/mosaic.py#L60-L135 which will be used in the Backend (https://github.com/stac-utils/titiler-pgstac/blob/main/titiler/pgstac/mosaic.py#L139)

@abarciauskas-bgse
Author

@vincentsarago I started taking a look at how the current /collections/{collection_id} endpoint works in this repo (titiler-stacapi) and I think it is similar to the links you shared, but the STACAPIBackend of course relies on a STAC query (in STACAPIBackend#get_assets) rather than a pgstac query.

To get the functionality we want in this repo - that is, a STAC API query which creates tiles using, conditionally, either the XarrayReader or rio_tiler mosaic_reader - I think what is needed is the following:

  1. Keep assets_for_tile and get_assets within the STACAPIBackend as-is, but then in the tile method, there should be a condition to use the XarrayReader or the mosaic_reader. To do this...
  2. Use the STACReader _get_asset_info function to assert that all assets returned from the STAC query are the same type and in the set of VALID_TYPES.
    a. if the type is NetCDF, only tile the first item and asset, since we don't have mosaic-ing for multiple NetCDFs yet
  3. If the type is NetCDF, use the XarrayReader, if the type is COG use rio_tiler.mosaic_reader
  4. Move the with rasterio.Env(**env): context manager to inside the conditional on asset type in the tile method of STACAPIBackend as we don't need the rasterio context manager for the XarrayReader
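
The steps above could be sketched roughly as follows. This is plain Python under stated assumptions: `plan_tile_read` is a hypothetical helper, the contents of `VALID_TYPES` are guessed, and assets are represented as simple dicts with a `type` key rather than whatever `_get_asset_info` actually returns.

```python
from typing import Dict, List, Tuple

COG = "image/tiff; application=geotiff; profile=cloud-optimized"
NETCDF = "application/x-netcdf"
VALID_TYPES = {COG, NETCDF}  # assumed contents, for illustration only


def plan_tile_read(assets: List[Dict]) -> Tuple[str, List[Dict]]:
    """Pick a read strategy for the assets returned by a STAC query.

    Returns ("xarray", [first_asset]) for NetCDF, since mosaicking multiple
    NetCDF files is not supported yet, or ("mosaic", all_assets) for COGs.
    """
    if not assets:
        raise ValueError("STAC query returned no assets")

    types = {asset.get("type") for asset in assets}
    if len(types) != 1:
        raise ValueError(f"mixed asset types are not supported: {sorted(map(str, types))}")

    (media_type,) = types
    if media_type not in VALID_TYPES:
        raise ValueError(f"unsupported asset type: {media_type!r}")

    if media_type == NETCDF:
        return "xarray", assets[:1]
    return "mosaic", assets
```

In the tile method, only the "mosaic" branch would need to be wrapped in `with rasterio.Env(**env):`, which is the point of step 4.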

I'm not sure the purpose behind reader_params or backend_params, but we will need some way, like DatasetParams, to set all the optional XarrayReader params. I would implement something like DatasetParams but for Xarray arguments.

This titiler-stacapi code also has all the timing middleware - I think it would be interesting to keep this but should check that it is working.

@vincentsarago let me know if ⬆️ makes sense to you.

As far as the items endpoints go, I think we could leave this draft PR open for now and / or remove those item endpoints from the repo until we get them working for MS PC API.

@abarciauskas-bgse
Author

@vincentsarago 👍🏽 that makes sense to me, was that in response to any part of my previous comment? Because if it was

I'm not sure the purpose behind reader_params or backend_params, but we will need some way, like DatasetParams, to set all the optional XarrayReader params. I would implement something like DatasetParams but for Xarray arguments.

We will need additional parameters to use in xarray.open_dataset, specifically variable, but also possibly others (e.g. https://github.com/developmentseed/titiler-xarray/blob/dev/titiler/xarray/reader.py#L200-L206)
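
As a rough illustration of a DatasetParams-like container for Xarray arguments, here is a plain dataclass sketch. The class name `XarrayParams` and the `reader_kwargs` method are hypothetical; `variable` comes from the discussion above, while `group` and `decode_times` are just examples of other options that might be needed. In the real app this would presumably be a FastAPI dependency with query parameters, as DatasetParams is.

```python
from dataclasses import asdict, dataclass
from typing import Any, Dict, Optional


@dataclass
class XarrayParams:
    """Hypothetical container for options forwarded to the Xarray reader."""

    variable: str                 # name of the data variable to tile
    group: Optional[str] = None   # example optional setting
    decode_times: bool = True     # example optional setting

    def reader_kwargs(self) -> Dict[str, Any]:
        """Return only the options that were actually set (drops None values)."""
        return {key: value for key, value in asdict(self).items() if value is not None}
```

For example, `XarrayParams(variable="tas").reader_kwargs()` yields a dict containing `variable` and `decode_times` but no `group` key, so unset options never reach the reader.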

@abarciauskas-bgse
Author

Closing this as we are working on the collections endpoint for now.

2 participants