Justin Clark-Casey edited this page Jun 4, 2018 · 17 revisions

Introduction

This is a master page for information on all the technical aspects of publishing and consuming Bioschemas markup. This will grow over time. Please feel free to edit or contact anybody in the Bioschemas technical group for more information or help.

Topics

Format

Bioschemas strongly recommends using JSON-LD to publish markup, as also recommended by Google. schema.org also allows RDFa and Microdata, but standardizing on JSON-LD allows Bioschemas example markup to be simpler and more consistent. JSON-LD also separates its markup from the page HTML, which may be better for scientific sites publishing large volumes of markup that may change relatively infrequently compared to the human-readable webpage.
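As an illustration of how JSON-LD stays separate from the page HTML, the following minimal Python sketch wraps a schema.org object in a standalone script element that can be dropped into any page. The function name `to_jsonld_script` and the example Dataset values are hypothetical and not part of Bioschemas itself.

```python
import json


def to_jsonld_script(entity: dict) -> str:
    """Serialize a schema.org entity as a JSON-LD <script> element.

    Hypothetical helper for illustration; not an official Bioschemas tool.
    """
    doc = {"@context": "https://schema.org", **entity}
    body = json.dumps(doc, indent=2)
    # Escape "</" so that a string value cannot close the script element early.
    # "\/" is a valid JSON escape for "/", so the parsed data is unchanged.
    body = body.replace("</", "<\\/")
    return f'<script type="application/ld+json">\n{body}\n</script>'


# Hypothetical example values, assumed for this sketch only.
example = {
    "@type": "Dataset",
    "name": "Example protein dataset",
    "url": "https://example.org/datasets/1",
}
print(to_jsonld_script(example))
```

Because the markup lives in its own element, it can be regenerated or replaced without touching the human-readable part of the page.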

Publishing Bioschemas markup

Publishing markup is no good if consuming applications cannot find it or parse it efficiently. Here are some general recommendations and pointers.

Make Bioschemas markup reachable from the site's sitemap.xml

In principle, you can publish Bioschemas markup on any of your webpages, much like any other schema.org markup. This will always work for general search engines such as Bing and Google, as and when they process Bioschemas markup. However, so that life-sciences-specific search engines and other applications can also find your markup, we strongly recommend that all of it be reachable by crawling the website's sitemap.xml.
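To see what a consuming application would work from, here is a minimal Python sketch that lists the page URLs in a sitemap. It assumes a plain `urlset` sitemap using the standard sitemaps.org namespace; the function name `sitemap_urls` and the example URLs are hypothetical, and sitemap index files and fetching over HTTP are left out.

```python
import xml.etree.ElementTree as ET

# Namespace defined by the sitemaps.org protocol.
SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(sitemap_xml: str) -> list[str]:
    """Return every <loc> URL found in a sitemap document."""
    root = ET.fromstring(sitemap_xml)
    return [
        loc.text.strip()
        for loc in root.findall(".//sm:loc", SITEMAP_NS)
        if loc.text
    ]


# Hypothetical sitemap content, assumed for this sketch only.
sample = """
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.org/markup/datasets.html</loc></url>
  <url><loc>https://example.org/markup/events.html</loc></url>
</urlset>
"""
for url in sitemap_urls(sample):
    print(url)
```

Any page that does not appear in this list, directly or through links a crawler is willing to follow, may never be visited by a specialist crawler.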

You can aggregate Bioschemas markup and put it on any page.

There is no need to publish Bioschemas markup for a particular object (Dataset, Event, BioChemEntity, etc.) on the same page that humans use for that entity. In fact, markup can live anywhere on your site, and markup for many objects can be aggregated on the same page. So one strategy for making your markup findable is to publish it on only a few webpages and make those reachable via your site's sitemap.xml, as detailed above.
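One possible way to aggregate markup for several objects is a single JSON-LD document whose `@graph` array holds all of them. The Python sketch below assumes this approach; the function name `aggregate_jsonld` and the example objects are hypothetical, and other layouts (such as several separate script elements on one page) would serve equally well.

```python
import json


def aggregate_jsonld(entities) -> dict:
    """Combine several schema.org objects into one JSON-LD document.

    Uses the JSON-LD "@graph" keyword so that all objects share one context.
    """
    return {"@context": "https://schema.org", "@graph": list(entities)}


# Hypothetical objects, assumed for this sketch only.
objects = [
    {"@type": "Dataset", "@id": "https://example.org/datasets/1", "name": "Example dataset"},
    {"@type": "Event", "@id": "https://example.org/events/7", "name": "Example workshop"},
]
print(json.dumps(aggregate_jsonld(objects), indent=2))
```

Giving each object an `@id` that points back to its human-readable page keeps the aggregated markup linked to the entity it describes.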

Don't publish Bioschemas markup dynamically

We strongly recommend that you publish Bioschemas markup in statically rendered webpages rather than generating it dynamically, for example with JavaScript. This lets applications find your markup without having to render the entire webpage in a headless browser, an expensive operation for which life sciences projects do not have the resources.
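The following minimal Python sketch shows why static markup is cheap to consume: a plain HTML parser from the standard library is enough to pull out every JSON-LD block, with no browser involved. The class and function names are hypothetical, and the sketch assumes each block contains well-formed JSON.

```python
import json
from html.parser import HTMLParser


class JsonLdExtractor(HTMLParser):
    """Collect the parsed contents of <script type="application/ld+json"> elements."""

    def __init__(self):
        super().__init__()
        self._in_jsonld = False
        self._chunks = []
        self.documents = []

    def handle_starttag(self, tag, attrs):
        if tag == "script" and dict(attrs).get("type") == "application/ld+json":
            self._in_jsonld = True
            self._chunks = []

    def handle_data(self, data):
        # Script content may arrive in several pieces, so buffer it.
        if self._in_jsonld:
            self._chunks.append(data)

    def handle_endtag(self, tag):
        if tag == "script" and self._in_jsonld:
            self.documents.append(json.loads("".join(self._chunks)))
            self._in_jsonld = False


def extract_jsonld(html: str) -> list:
    """Return all JSON-LD documents embedded in a static HTML page."""
    parser = JsonLdExtractor()
    parser.feed(html)
    parser.close()
    return parser.documents


# Hypothetical page content, assumed for this sketch only.
page = """<html><head>
<script type="application/ld+json">
{"@context": "https://schema.org", "@type": "Event", "name": "Example workshop"}
</script>
</head><body><p>Human-readable content</p></body></html>"""
print(extract_jsonld(page))
```

If the same markup were injected by JavaScript after page load, this parser would find nothing, and the consumer would need a full rendering step for every page.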

Links

Adding profile specific relations to BioChemEntity and DataRecord

Future ideas

  • The Bioschemas common crawl: how to access a common crawl, for applications that need a large amount of collected Bioschemas markup but don't want to operate their own crawling facilities. This could either be collected by commoncrawl.org or published by a search engine that gathers this information anyway, such as Buzzbang.