Representation of coordinates #15
Maybe we can start with something intermediate? What I have in mind is a simple 2D (lat/lon) binning of occurrence data. It's not quite sites, but it would make it easy to get data in a way where |
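To make the binning idea concrete, here is a minimal sketch in plain Julia. The function name `bin_occurrences`, the `cellsize` keyword, and the sample coordinates are all hypothetical and not part of GBIF.jl or SpatialEcology; it only assumes that latitudes and longitudes arrive as two vectors of numbers in degrees.

```julia
# Hypothetical helper (not part of any package): count occurrences per
# lat/lon grid cell. `cellsize` is in degrees, and each cell is identified
# by the coordinates of its south-west corner.
function bin_occurrences(lats, lons; cellsize = 1.0)
    counts = Dict{Tuple{Float64,Float64},Int}()
    for (lat, lon) in zip(lats, lons)
        cell = (floor(lat / cellsize) * cellsize,
                floor(lon / cellsize) * cellsize)
        counts[cell] = get(counts, cell, 0) + 1
    end
    return counts
end

# Made-up example points, roughly within Great Britain
lats = [51.5, 51.7, 52.2, 55.9]
lons = [-0.1, -0.4, 0.1, -3.2]
bins = bin_occurrences(lats, lons)
```

Each key of `bins` could then play the role of a site, with the count as its abundance.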
Yup, there is no doubt that the spatial (coordinate) types internally should be explicitly geographical types consistent with the framework being developed at JuliaGeo. To explain how it is now: the With regards to I'd also like to add a SiteFields type consistent with https://github.com/JuliaGeo/GeoInterface.jl for polygon sites (e.g. countries - this is intended to replace Shapefiles, I think). Not quite sure what you mean by binning - but the |
Got it. I had some issues navigating the types. I'd like to maybe have one of the |
It should be - here is some spaghetti code that does the trick, an
using GBIF, DataFrames
uk_birds_query = Dict(
    "taxonKey" => 5231190,
    "country" => "GB",
    "hasCoordinate" => true,
    "year" => 2015)
# SpatialEcology also defines occurrences - any good idea for a better name to give it in SpatialEcology?
# It is the occurrences of a single species in the data set.
uk_birds = GBIF.occurrences(uk_birds_query)
uk_birds.query["limit"] = 200
complete!(uk_birds)
# a function to extract field values, from your dataframe.jl
loc(o::Occurrences, f::Symbol) = map(x -> getfield(x, f) === nothing ? NA : getfield(x, f), o)
# get the relevant fields
long = loc(uk_birds, :longitude)
lat = loc(uk_birds, :latitude)
sites = ((x,y)->"$(x)_$(y)").(long, lat) #an identifier for unique sites
abun = loc(uk_birds, :individualCount)
species = loc(uk_birds, :species)
# construct a DataFrame and consolidate all duplicated point occurrences
occ = DataFrame(sites = sites, abun = abun, species = species)
occ = by(occ, [:sites, :species]) do df
    sum(df[:abun][isfinite.(df[:abun])])
end
# I am going to throw out the abundance information for now, as Assemblage types don't allow for NA abundances
# to keep it in, I should:
#occ = DataFrame(sites = occ[:sites], abun = occ[:x1], species = occ[:species])
# construct a DataFrame in the Phylocom format
occ = DataFrame(sites = occ[:sites], abun = 1, species = occ[:species])
# construct a DataFrame of coordinates
coords = DataFrame(sites = sites, long = long, lat = lat)
unique!(coords, :sites)
using SpatialEcology
birds = Assemblage(occ, coords)
using Plots
plot(birds, aspect_ratio = 1.5, alpha = 0.3)
Perhaps we could define stubs for all the types we use in an EcoBase package imported by all the other packages, so SpatialEcology wouldn't need to depend on GBIF.jl to define a constructor that takes |
The constructor should also extract the taxonomic information and site information from the |
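As a rough illustration of the stub idea, the sketch below puts three modules in one file to stand in for three packages. All names here (`EcoBaseSketch`, `AbstractOccurrences`, `sitenames`, `speciesnames`, `Records`, `richness`) are hypothetical and chosen for the example; none of them is the actual API of EcoBase, GBIF.jl or SpatialEcology, and the records are made up.

```julia
module EcoBaseSketch            # hypothetical stand-in for the proposed EcoBase
    abstract type AbstractOccurrences end
    # Generic function stubs that concrete types are expected to extend
    function sitenames end
    function speciesnames end
    export AbstractOccurrences, sitenames, speciesnames
end

module DataSourceSketch         # stand-in for a data package such as GBIF.jl
    using ..EcoBaseSketch
    struct Records <: AbstractOccurrences
        sites::Vector{String}
        species::Vector{String}
    end
    EcoBaseSketch.sitenames(r::Records) = r.sites
    EcoBaseSketch.speciesnames(r::Records) = r.species
end

module AnalysisSketch           # stand-in for an analysis package
    using ..EcoBaseSketch
    # Depends only on the shared interface, not on the data package
    richness(o::AbstractOccurrences) = length(unique(speciesnames(o)))
end

records = DataSourceSketch.Records(
    ["a", "a", "b"],
    ["Turdus merula", "Parus major", "Turdus merula"])
n = AnalysisSketch.richness(records)
```

The point of the layout is that `AnalysisSketch` never mentions `DataSourceSketch`: both only import the small shared module.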
Oh I agree that SpatialEcology should not depend on GBIF -- it should be the other way around, the way it is for DataFrames in GBIF. All I need is to declare a method on my side to return an object in the correct format for SpatialEcology. I think in the comments of the code snippet above you make a good point about namespaces. I like the way it's done in R, which forces you to be explicit when calling functions from another namespace. But to answer the question, what about writing a new method for |
The example uses :Lat and :Lon, and I don't think this is the best solution. What about using https://github.com/JuliaGeo/Geodesy.jl objects instead? If the user has standard column names (:latitude and :longitude would be mandatory, :projection assumed wgs84, and :altitude being optional), then the spatial coordinates can be manipulated as a single object instead of two columns.