diff --git a/_images/action_call_sequence_diagram.svg b/_images/action_call_sequence_diagram.svg
new file mode 100644
index 00000000..ce962c91
--- /dev/null
+++ b/_images/action_call_sequence_diagram.svg
@@ -0,0 +1,1498 @@
+SVG text labels (markup not preserved). Participants: User; Interface; Framework; Plugin. Messages, in document order: plugin action (file paths, parameters); get plugin; get action; load artifacts (file paths); artifacts; call action (artifacts); validate input (artifacts, parameters); transform (artifacts); transformed input; execute action (transformed input, parameters); perform bioinformatics; results; transform (results); transformed results; write /data/ (transformed results); write /provenance/ (artifacts, parameters); result artifacts; save (artifacts); artifact file paths; artifact file paths.
diff --git a/_images/complex_component_diagram.svg b/_images/complex_component_diagram.svg
new file mode 100644
index 00000000..fdf870db
--- /dev/null
+++ b/_images/complex_component_diagram.svg
@@ -0,0 +1,124 @@
+SVG text labels (markup not preserved): Framework <qiime2>; Interfaces; Plugins; q2-types; q2-dada2; q2cli; q2galaxy; SDK <qiime2.sdk>; Artifact API <qiime2.plugins>;
+q2-feature-table; Plugin API <qiime2.plugin>; Internal API <qiime2.core>
diff --git a/_images/simple_component_diagram.svg b/_images/simple_component_diagram.svg
new file mode 100644
index 00000000..7ebb274e
--- /dev/null
+++ b/_images/simple_component_diagram.svg
@@ -0,0 +1,71 @@
+SVG text labels (markup not preserved): Framework <qiime2>; Plugins; Interfaces
diff --git a/_sources/bibliography.md b/_sources/back-matter/bibliography.md
similarity index 100%
rename from _sources/bibliography.md
rename to _sources/back-matter/bibliography.md
diff --git a/_sources/back-matter/genindex.md b/_sources/back-matter/genindex.md
new file mode 100644
index 00000000..1630f897
--- /dev/null
+++ b/_sources/back-matter/genindex.md
@@ -0,0 +1 @@
+# Index
\ No newline at end of file
diff --git a/_sources/back-matter/glossary.md b/_sources/back-matter/glossary.md
new file mode 100644
index 00000000..0fc5fe02
--- /dev/null
+++ b/_sources/back-matter/glossary.md
@@ -0,0 +1,103 @@
+# Glossary
+
+```{glossary}
+Action
+    A generic term to describe a concrete {term}`method`, {term}`visualizer`, or {term}`pipeline`.
+    Actions accept parameters and/or files ({term}`artifacts ` or {term}`metadata`) as input, and generate some kind of output.
+
+Archive
+    The directory structure of a QIIME 2 {term}`Result`.
+    Contains *at least* a root directory (named by {term}`UUID`) and a ``VERSION`` file within that directory.
+
+Artifact
+    A QIIME 2 {term}`Result` that contains data to operate on.
+
+Deployment
+    An installation of QIIME 2 as well as zero-or-more {term}`interfaces ` and {term}`plugins `.
+    The collection of interfaces and plugins in a deployment can be defined by a {term}`distribution` of QIIME 2.
+
+Directory Format
+    A string that represents a particular layout of files and/or directories as well as how their contents will be structured.
+
+Distribution
+    A collection of QIIME 2 plugins that are designed to be installed together.
+    These are generally grouped by a theme. For example, the Amplicon Distribution provides a collection of plugins for analysis of microbiome amplicon data, while the Shotgun Distribution provides a collection of plugins for analysis of microbiome shotgun metagenomics data.
+    When a distribution is installed, that particular installation of QIIME 2 is an example of a {term}`deployment`.
+
+Format
+    A string that represents a particular file format.
+
+Framework
+    The engine of orchestration that enables QIIME 2 to function together as a cohesive unit.
+
+Identifier
+    A unique value that denotes an individual sample or feature.
+
+Identity
+    Distinguishes a piece of data. QIIME 2 does not consider a rename (like UNIX ``mv``) to change identity; however, re-running a command would change identity.
+
+Input
+    Data provided to an {term}`action`. Can be an {term}`artifact` or {term}`metadata`.
+
+Interface
+    A user-interface responsible for coordinating user-specified intent into {term}`framework`-driven action.
+
+Metadata
+    Columnar data for annotating additional values to existing data. Operates along Sample IDs or Feature IDs.
+
+Method
+    A method accepts some combination of QIIME 2 {term}`artifacts ` and {term}`parameters ` as {term}`input`, and produces one or more QIIME 2 artifacts as {term}`output`.
+
+Output
+    Objects returned by an {term}`action`. Can be {term}`artifact(s) ` or {term}`visualization(s) `.
+
+Parameter
+    A value that alters the behavior of an {term}`action`.
+
+Payload
+    Data that is meant for primary consumption or interpretation (in contrast to *metadata* which may be useful retrospectively, but is not primarily useful).
+ +Pipeline + A pipeline accepts some combination of QIIME 2 {term}`artifacts ` and {term}`parameters ` as {term}`input`, and produces one or more QIIME 2 {term}`artifacts ` and/or {term}`visualizations ` as {term}`output`. + +Plugin + A discrete module that registers some form of additional functionality with the {term}`framework`, including new {term}`methods `, {term}`visualizers `, {term}`formats `, or {term}`transformers `. + +Primitive Type + A {term}`type` that is used to communicate parameters to an {term}`interface`. These are predefined by the {term}`framework` and cannot be extended. + +Result + A generic term for either a {term}`Visualization` or an {term}`Artifact`. + +Provenance + Data describing how an analysis was performed, captured automatically whenever users perform a QIIME 2 {term}`Action`. + Provenance information describes the host system, the computing environment, Actions performed, parameters passed, primary sources cited, and more. + +Semantic Type + A {term}`type` that is used to classify {term}`artifacts` and how they can be used. + These types may be extended by {term}`plugins`. + +Transformer + A function registered on the {term}`framework` capable of converting data in one {term}`format` into data of another {term}`format`. + +Type + A term that is used to represent several different ideas in QIIME 2, and which is therefore ambiguous when used on its own. + More specific terms are *file type*, *semantic type*, and *data type*. See [](types-of-types) for more information. + +UUID + Universally Unique IDentifier, in the context of QIIME 2, almost certainly refers to a *Version 4* UUID, which is a randomly generated ID. + See this [RFC](https://tools.ietf.org/html/rfc4122) or this [wikipedia entry](https://en.wikipedia.org/wiki/Universally_unique_identifier) for details. + +View + A particular representation of data. This includes on-disk formats and in-memory data structures (objects). 
+ +Visualization + A QIIME 2 {term}`Result` that contains an interactive visualization. + +Visualization (Type) + The {term}`type` of a {term}`visualization`. + There are no subtyping relations between this type and any other (it is a singleton) and cannot be extended (because it is a singleton). + +Visualizer + A visualizer accepts some combination of QIIME 2 {term}`artifacts ` and {term}`parameters ` as {term}`input`, and produces exactly one {term}`visualization` as {term}`output`. +``` diff --git a/_sources/framework/explanations/architecture.md b/_sources/framework/explanations/architecture.md new file mode 100644 index 00000000..db2d6262 --- /dev/null +++ b/_sources/framework/explanations/architecture.md @@ -0,0 +1,104 @@ +(q2-architecture-overview)= +# QIIME 2 architecture overview + +The goal of this document is to give the reader a high-level understanding of the components of QIIME 2, and how they are inter-related. + + +At the highest level, there are three kinds of components in QIIME 2: + + - The interfaces, which are responsible for translating user intent into action. + - The framework, whose behavior and purpose will be described in further detail below. + - The plugins, which define *all* domain-specific functionality. + +```{figure} ../images/simple_component_diagram.svg +:name: Box and Arrow diagram of QIIME 2 components + +**Box and Arrow diagram of QIIME 2 components.** +Interfaces only interact with plugins through the framework, which will invoke plugin behavior as needed. +Solid arrows are direct dependency. +Dash-dotted arrows are a deferred dependency (via entry-point). +``` + +The above diagram illustrates the most important restriction of the architecture. +Interfaces cannot and *should not* have any particular knowledge about plugins ahead of time. +Instead they must request that information from the framework, which provides a high-level description of all of the actions available via SDK (Software Development Kit) objects. 
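The restriction described above can be sketched with a toy registry. This is not the `qiime2.sdk` API: `ToyAction`, `ToyFramework`, and `build_cli_commands` are hypothetical names, used only to show an interface generating its commands from a framework-provided description rather than from knowledge of any plugin.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ToyAction:
    """Hypothetical stand-in for an SDK object describing one action."""
    name: str
    description: str
    function: Callable


@dataclass
class ToyFramework:
    """Hypothetical registry: plugins register here, interfaces only read."""
    plugins: Dict[str, Dict[str, ToyAction]] = field(default_factory=dict)

    def register(self, plugin_name: str, action: ToyAction) -> None:
        self.plugins.setdefault(plugin_name, {})[action.name] = action

    def describe(self) -> Dict[str, List[str]]:
        # The high-level description that is handed to interfaces.
        return {name: sorted(actions) for name, actions in self.plugins.items()}


def build_cli_commands(framework: ToyFramework) -> List[str]:
    """A toy 'interface' that derives command names without importing plugins."""
    return [f"{plugin} {action.replace('_', '-')}"
            for plugin, actions in sorted(framework.describe().items())
            for action in actions]


framework = ToyFramework()
# Each 'plugin' registers itself and knows nothing about any interface.
framework.register("feature-table", ToyAction("rarefy", "Rarefy a table", lambda t: t))
framework.register("diversity", ToyAction("beta_phylogenetic", "Beta diversity", lambda t: t))

commands = build_cli_commands(framework)
print(commands)
```

A second toy interface (for example, one that builds a menu instead of command names) could be written against `describe()` alone, which is the point of the decoupling.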
+
+At first glance, this restriction may seem onerous.
+However, because interfaces cannot communicate directly with plugins, they also never need to coordinate with them.
+This means interfaces are entirely decoupled from the plugins and, more importantly, plugins are always decoupled from interfaces.
+A developer of a plugin does not need to concern themselves with providing any interface-specific functionality, meaning development of both plugins and interfaces can be done in parallel.
+Only changes in the framework itself require coordination between the other two component types.
+This key constraint, coupled with a set of semantically rich SDK objects, allows *multiple* kinds of interfaces to be dynamically generated.
+This allows QIIME 2 to adapt its UI to both the audience and the task at hand.
+
+## Detailed Component Diagram
+A more complete version of the above figure is found below:
+
+```{figure} ../images/complex_component_diagram.svg
+:name: Detailed Box and Arrow diagram of QIIME 2 components
+
+**Detailed Box and Arrow diagram of QIIME 2 components.**
+Solid arrows are a direct dependency.
+Dash-dotted arrows are a deferred dependency (via entry-point).
+Dashed rounded boxes surrounding other components indicate a group of like-components.
+The larger gray box indicates a nested component, containing sub-components.
+Text within angle-brackets (`<>`) indicates a Python package/import name.
+```
+
+Here we observe that interfaces use a particular sub-component of the framework called the SDK.
+We also see that one of the interfaces is built into the framework itself (the Artifact API); however, it is not any more privileged than any of the other interfaces, and none of the other interfaces use it directly.
+
+Looking now at the plugins, we see that they use a sub-component of the framework called the Plugin API.
+This is responsible for constructing and registering the relevant SDK objects for use by interfaces.
+We also see that plugins can depend on other plugins. + +At this point the rough picture of how an interface uses a plugin can be seen. +Plugins are loaded by the framework's SDK via an entry-point (more on that later). +This in turn causes the plugin code to interact with the Plugin API, which constructs SDK objects. +These SDK objects are then introspected and manipulated by any number of Interfaces. + +## Following A Command Through QIIME 2 + +To get a better idea of where the responsibility of these components starts and ends, we can look at a sequence diagram describing the execution of an action by a user. + +```{figure} ../images/action_call_sequence_diagram.svg +:name: UML Sequence Diagram of an action-call in QIIME 2 + +**UML Sequence Diagram of an action-call in QIIME 2** +This diagram is read from top to bottom, which indicates the passage of some non-specific amount of time. +Components are vertical columns. +An activated state of a component is indicated by a narrow box. +Components can perform actions either upon other components, or upon themselves. +These actions are denoted with a solid arrow pointing at the actor in question. +The label indicates what action is performed and, when provided, parenthesis indicate some kind of argument that is provided. +Not all arguments are enumerated for brevity. +Results of an action are denoted with a dashed arrow and are usually labeled with the result's name. +``` + +This figure has four components: a User, an Interface, the Framework, and a Plugin. +We see first, a User invoking some action with some files and parameters. +The Interface receives this and is activated. +It locates the plugin and the action requested from the Framework, receiving SDK objects (not shown). +Then it loads the provided files as QIIME 2 Artifacts. +It is then ready to call the action (an SDK object) with the User's artifacts and parameters. 
+
+The Framework then provides some input validation (it is much faster to check that the data provided will work for the action requested than to fail halfway through a very long process, requiring the User to start over).
+The Framework then identifies what format the data should be in, and invokes relevant code defined by a Plugin (though not necessarily the same one) for converting that data.
+Finally, with validated input and a compatible format, the data is provided to the plugin to perform whatever task the User intended.
+
+Once finished, the Plugin returns the results and the Framework will again convert that data into a particular format for storage, using code defined by a Plugin.
+The Framework then writes that data into an archive (as `/data/`) and records what steps just occurred (in `/provenance/`).
+This completed archive is now an artifact and is returned to the Interface.
+The Interface decides to save the artifact to a file and then returns the file path to the User.
+
+## Summary
+
+In this example we see that the activation of each component is strictly nested.
+It forms a sort of "onion of responsibility" between the component layers.
+We also note that the Interface waits for the task to finish before becoming inactive; there are other modes of calling actions which are asynchronous and can be used instead.
+In either case, we see that each component is successively responsible for fewer tasks, which become more specific as we move to the right.
+
+The end result is:
+ - The Interface need only care about communicating with the User.
+ - The Plugin need only care about manipulating data to some effect.
+ - The Framework concerns itself with coordinating the overall effort and recording the data surrounding the action.
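The division of responsibility summarized above can be sketched as a toy call sequence. This is an illustration under stated assumptions, not QIIME 2's implementation: `run_action`, its arguments, and the dictionary standing in for an archive are all hypothetical.

```python
from typing import Any, Callable, Dict


def run_action(function: Callable[..., Any],
               inputs: Dict[str, Any],
               expected_types: Dict[str, type],
               to_view: Callable[[Any], Any],
               from_view: Callable[[Any], Any]) -> Dict[str, Any]:
    """Toy framework call: validate, transform, execute, transform, record."""
    steps = []

    # Validate first, so bad input fails before any expensive work starts.
    for name, value in inputs.items():
        if not isinstance(value, expected_types[name]):
            raise TypeError(f"{name!r} must be {expected_types[name].__name__}")
    steps.append("validate input")

    # Transform stored data into the view the plugin function expects.
    views = {name: to_view(value) for name, value in inputs.items()}
    steps.append("transform input")

    # Execute the plugin-defined function.
    result = function(**views)
    steps.append("execute action")

    # Transform the result back into a storage format.
    data = from_view(result)
    steps.append("transform results")

    # The toy 'archive' pairs the payload with a record of what happened.
    return {"data": data, "provenance": steps}


def total(values: list) -> int:
    """Stand-in for plugin code: it only manipulates data."""
    return sum(values)


archive = run_action(
    function=total,
    inputs={"values": "1,2,3"},
    expected_types={"values": str},
    to_view=lambda text: [int(item) for item in text.split(",")],
    from_view=str,
)
print(archive["data"], archive["provenance"])
```

Note that `total` never sees the stored text form of its input and never writes anything to the archive; in this sketch, everything outside the computation belongs to the surrounding function.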
diff --git a/_sources/plugins/explanations/actions.md b/_sources/plugins/explanations/actions.md
new file mode 100644
index 00000000..217a937b
--- /dev/null
+++ b/_sources/plugins/explanations/actions.md
@@ -0,0 +1,18 @@
+# Types of QIIME 2 Actions
+
+A QIIME 2 plugin `action` is any operation that accepts parameters and files ({term}`artifact` or metadata) as input, and generates some type of output.
+`Actions` are interpreted as "commands" by QIIME 2 interfaces and come in three flavors:
+
+1. A `Method` accepts some combination of QIIME 2 `Artifacts` and `Parameters` as input, and produces one or more QIIME 2 artifacts as output.
+These output artifacts could subsequently be used as input to other QIIME 2 `Methods` or `Visualizers`.
+`Methods` can produce intermediate or terminal outputs in a QIIME 2 analysis.
+For example, [the `rarefy` method defined in the `q2-feature-table` plugin](https://github.com/qiime2/q2-feature-table/blob/9c79fab38fc30c775cba453713703bfa177770b0/q2_feature_table/plugin_setup.py#L35) accepts a feature table artifact and sampling depth as input and produces a rarefied feature table artifact as output.
+This rarefied feature table artifact could then be used in another analysis, such as alpha diversity calculations provided by the `alpha` method in `q2-diversity`.
+
+2. A `Visualizer` is similar to a `Method` in that it accepts some combination of QIIME 2 `Artifacts` and `Parameters` as input.
+In contrast to a method, a visualizer produces exactly one `Visualization` as output.
+Visualizations, by definition, cannot be used as input to other QIIME 2 methods or visualizers.
+Thus, visualizers can only produce terminal output in a QIIME 2 analysis.
+
+3. A `Pipeline` accepts some combination of QIIME 2 `Artifacts` and `Parameters` as input and produces one or more artifacts and/or visualizations as output.
+It does so by incorporating one or more `methods` and/or `visualizers` into a single registered `action`.
\ No newline at end of file diff --git a/_sources/plugins/explanations/transformers.md b/_sources/plugins/explanations/transformers.md new file mode 100644 index 00000000..65cc55da --- /dev/null +++ b/_sources/plugins/explanations/transformers.md @@ -0,0 +1,37 @@ +(transformer-explanation)= +# Transformers +`Transformers` are functions for converting `Artifact Classes` into data types to be consumed by Python functions that are registered as `Actions`. +These transformers are typically defined along with the semantic types for which they are designed, and [`q2-types`](https://github.com/qiime2/q2-types) provides a number of common types and associated transformers. +Plugins can also define semantic types and/or transformers. + +## How are transformers used by a plugin? +`Transformers` are not called directly at any time within a plugin. +Transformations are handled by the QIIME 2 framework, as long as the appropriate `transformers` are registered. +The framework identifies the *input* `Artifact` source format for the transformation as the format registered to the `Artifact`'s semantic type, and it identifies the transformation's destination type based on the functional annotation associated with the input in an `Action`'s registered function. +When *output* is generated by a `Method`, the framework identifies the source type for the transformation as the registered function's output annotation, and the destination format for the output `Artifact` from the output's semantic type defined in the `Action`'s registration. + +For example, we can see how functional annotations define input and output formats in `q2_diversity.beta_phylogenetic`: + +```python +def beta_phylogenetic(table: biom.Table, + phylogeny: skbio.TreeNode, + metric: str)-> skbio.DistanceMatrix: +``` + +This function requires `biom.Table` and `skbio.TreeNode` objects as input, and produces an `skbio.DistanceMatrix` object as output. 
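As a rough illustration of how annotations like these can be read programmatically, the sketch below uses only the standard library. The classes `Table`, `TreeNode`, and `DistanceMatrix` are hypothetical stand-ins for the `biom` and `skbio` classes so that the example runs without those packages, and `view_types` is an assumed helper, not a QIIME 2 function.

```python
import inspect
from typing import get_type_hints


class Table:
    """Hypothetical stand-in for biom.Table."""


class TreeNode:
    """Hypothetical stand-in for skbio.TreeNode."""


class DistanceMatrix:
    """Hypothetical stand-in for skbio.DistanceMatrix."""


def beta_phylogenetic(table: Table,
                      phylogeny: TreeNode,
                      metric: str) -> DistanceMatrix:
    return DistanceMatrix()


def view_types(function):
    """Return ({argument name: annotated type}, annotated return type)."""
    hints = get_type_hints(function)
    output = hints.pop("return", None)
    parameters = inspect.signature(function).parameters
    return {name: hints.get(name) for name in parameters}, output


input_views, output_view = view_types(beta_phylogenetic)
print(input_views)
print(output_view)
```

With the input and output views known, a framework can look up a transformer from an artifact's stored format to each input view, and from the output view to a storage format.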
+ +We can examine the first few lines of the `Action` registration for this function to determine the semantic types of these input and output objects: + +```python +plugin.pipelines.register_function( + function=q2_diversity.beta_phylogenetic, + inputs={'table': FeatureTable[Frequency], + 'phylogeny': Phylogeny[Rooted]}, + parameters={'metric': Str % Choices(beta.phylogenetic_metrics())}, + outputs=[('distance_matrix', DistanceMatrix)], +``` + + +This pair of code examples illustrate that the `biom.Table` object used in `beta_phylogenetic` begins its life as a `FeatureTable[Frequency]` artifact, and the `skbio.TreeNode` comes from a `Phylogeny[Rooted]` artifact. +The output `skbio.DistanceMatrix` generated by `beta_phylogenetic` must be coerced to become a `DistanceMatrix` artifact. +The QIIME 2 framework takes care of all of those conversions for you, provided the appropriate transformers have been defined and registered ([which you can learn how to do here](howto-create-register-transformer)). \ No newline at end of file diff --git a/_sources/plugins/explanations/types-of-types.md b/_sources/plugins/explanations/types-of-types.md index 7e47e2c2..397ea308 100644 --- a/_sources/plugins/explanations/types-of-types.md +++ b/_sources/plugins/explanations/types-of-types.md @@ -4,9 +4,7 @@ The term _type_ is overloaded with a few different concepts. The goal of this Explanation article is to disambiguate how it's used in QIIME 2. To achieve this, we'll discuss two ways that it's commonly used, and then introduce a third way that it's used less frequently but which is important to QIIME 2. -The three kinds of types that are used in QIIME 2 are **file types (more frequently referred to file formats in QIIME 2)**, **data types**, and **semantic types**. - - +The three kinds of types that are used in QIIME 2 are **file types**, **data types**, and **semantic types**. 
````{margin} ```{admonition} Video @@ -20,16 +18,16 @@ For example, newick is a file type that is used for storing phylogenetic trees. Files are used most commonly for archiving data when it's not actively in use. Data types refer to how data is represented in a computer's memory (i.e., RAM) while it's actively in use, such as the data structure or object class that a file is loaded into. -For example, if you are adding a root to an unrooted phylogenetic tree (a concept discussed in Part 2 of this book), you may use a tool like IQTree2. -You would provide a path to the file containing the unrooted phylogenetic tree to IQTree2, and IQTree2 would load that tree into some object in the computer's memory to work on it. -The object that IQTree2 uses internally to represent the phylogenetic tree is synonymous with _data type_, as used here. -The kind of object that is used is a decision made by the developers of IQTree2 based on available functionality, efficiency for an operation they plan to carry out, their familiarity with the object, or something else. -If IQTree2 successfully completes the requested rooting operation, it could then write the resulting tree from its internal data type into a new newick-formatted file on the hard disk, and exit. +For example, if you are adding a root to an unrooted phylogenetic tree, you may use a tool like [IQ-Tree](http://www.iqtree.org/). +You would provide a path to the file containing the unrooted phylogenetic tree to IQ-Tree, and IQ-Tree would load that tree into some object in the computer's memory to work on it. +The object that IQ-Tree uses internally to represent the phylogenetic tree is synonymous with _data type_, as used here. +The kind of object that is used is a decision made by the developers of IQ-Tree based on available functionality, efficiency for an operation they plan to carry out, their familiarity with the object, or something else. 
+If IQ-Tree successfully completes the requested rooting operation, it could then write the resulting tree from its internal data type into a new newick-formatted file on the hard disk, and exit. -One thing to notice from this example is that there are at least three *independent* choices being made by the developer regarding "types": what file type to use as input, what data type to use internally, and what file type to use as output. -Software users, shouldn't need to know or care about what data types are used internally by a program. -They just care about what file types are used as input and output. -Software developers, on the other hand, should care a lot about what data types are used by their program: choosing an appropriate tpye can have huge impacts on the performance of the software, for example. +One thing to notice from this example is that there are at least three *independent* choices being made by the developer regarding *types*: what file type to use as input, what data type to use internally, and what file type to use as output. +Users of command line software, like IQ-Tree, shouldn't need to know or care about what data types are used internally by a program. +They just need to know what file types are used as input and output. +Software developers, on the other hand, should care a lot about what data types are used by their program: choosing an appropriate type can have huge impacts on the performance of the software, for example. ## Semantic types The third _type_ that is important in QIIME 2 is the semantic type of data. @@ -47,8 +45,7 @@ I'll typically check on the job for a few minutes, to make sure that it seems to I may then leave, with the hope that the job completes over the weekend and I'll have data to work with on Monday morning. It's very frustrating to come in Monday morning and find out that my job failed just a few minutes after I left on Friday for a reason that I could have quickly addressed had I known in time. 
- -```{warning} +```{note} There's actually a worse outcome than a delayed error from a computer program when inappropriate input is provided. When a program fails and provides an error message to the user, whether or not that error message helps the user solve the problem, the program has failed loudly. Something went wrong, and it told the user about it. @@ -78,6 +75,6 @@ Our example plugin `q2-dwq2` defines a action called `duplicate_table` which [ta The function registered to this action [declares that it will "view" the input `table` as a `pd.DataFrame`, and also return the output as a `pd.DataFrame`](https://github.com/caporaso-lab/q2-dwq2/blob/e8fe1e5b32bfc2a331d48611b3a70b0fa5b19165/q2_dwq2/plugin_setup.py#L32). File types are associated with semantic types when [Artifact Classes are defined](https://github.com/qiime2/q2-types/blob/e25f9355958755343977e037bbe39110cfb56a63/q2_types/feature_table/_type.py#L42). -Each kind of type discussed here represents different information about the data: how it's stored on disk (file type), how it's used by a function (it's data type), and what it represents (it's semantic type). +Each kind of type discussed here represents different information about the data: how it's stored on disk (file type), how it's used by a function (its data type), and what it represents (its semantic type). The motivation for creating QIIME 2's semantic type system was to avoid issues that can arise from providing inappropriate data to actions. The semantic type system also helps users and developers better understand the intent of QIIME 2 actions by assigning meaning to the input and output, and allows for the discovery of new potentially relevant QIIME 2 actions. 
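The three kinds of types can be placed side by side in a small sketch. Everything here is illustrative: the tab-separated text stands in for a file type, the dictionary for a data type, and the string label for a semantic type; `load_counts` and `check_semantic_type` are hypothetical helpers rather than QIIME 2 functions.

```python
import io

# File type: tab-separated text, as it might be stored on disk.
tsv_text = "sample\tcount\nS1\t10\nS2\t25\n"


def load_counts(handle) -> dict:
    """Data type: load the file into an in-memory dict chosen by the developer."""
    next(handle)  # skip the header row
    return {sample: int(count)
            for sample, count in (line.rstrip("\n").split("\t") for line in handle)}


def check_semantic_type(declared: str, accepted: set) -> None:
    """Semantic type: reject data whose meaning does not fit, before any work starts."""
    if declared not in accepted:
        raise TypeError(f"{declared} is not one of {sorted(accepted)}")


def total_count(counts: dict) -> int:
    return sum(counts.values())


declared_type = "FeatureTable[Frequency]"
check_semantic_type(declared_type, {"FeatureTable[Frequency]"})  # passes, so work can start
counts = load_counts(io.StringIO(tsv_text))
total = total_count(counts)
print(counts, total)
```

Because the semantic check runs before the data is even loaded, input with the wrong meaning (for example, a label such as `Phylogeny[Rooted]`) fails immediately and loudly rather than partway through a long job.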
\ No newline at end of file diff --git a/_sources/plugins/how-to-guides/artifact-collections-as-io.md b/_sources/plugins/how-to-guides/artifact-collections-as-io.md new file mode 100644 index 00000000..ff73cc8b --- /dev/null +++ b/_sources/plugins/how-to-guides/artifact-collections-as-io.md @@ -0,0 +1,152 @@ +(howto-artifact-collections-io)= +# Use Artifact Collections as Action inputs or outputs + +Commands in QIIME 2 can take collections of Artifacts as singular inputs or return collections of Artifacts as singular outputs. +They may also take collections of primitives as single parameters. + +## Registering an Action that Takes an Input Collection + +Input or parameter collections can be in the form of lists or dictionaries. +For inputs the type annotation for function registration is the QIIME 2 semantic type of the Artifacts expected. +For parameters it is just the type of the parameter. + +An example of registering collection inputs and parameters is shown below. +For a list input, the syntax is `"List[SemanticType]"` and for a dictionary it is `"Collection[SemanticType]"`. + +```python +dummy_plugin.methods.register_function( + function=example, + inputs={ + 'int_list': List[SingleInt], + 'int_dict': Collection[SingleInt], + }, + parameters={ + 'bool_list': List[Bool], + 'bool_dict': Collection[Bool] + }, + outputs=[ + ('return', Collection[SingleInt]), + ], + name='Example', + description=('Example collection method') +) +``` + +In the actual function definition, the type annotation is the view type of the Artifacts and does NOT contain the collection type annotation. +The fact that the annotations indicate the view type and not the semantic type is due to the fact that by the time we reach the actual method `int_list` will be a list of integers and not `SingleInt` Artifacts. 
+
+```python
+def example(int_list: int,
+            int_dict: int,
+            bool_list: bool,
+            bool_dict: bool) -> int:
+    return int_list + list(int_dict.values())  # list.extend() would return None
+```
+
+```{warning}
+The fact that function definitions do not contain the collection type annotations is an implementation detail that may change in the future.
+We will provide advance warning of this backward incompatible change on the QIIME 2 Forum at least one release prior to the change, if this does end up changing.
+```
+
+## Registering an Action that Returns an Output Collection
+
+Returning an output collection works much the same as returning anything else in QIIME 2.
+Using the same example method as earlier, you register your return as a Collection of the type of Artifact you are returning.
+
+```python
+dummy_plugin.methods.register_function(
+    function=example,
+    inputs={
+        'int_list': List[SingleInt],
+        'int_dict': Collection[SingleInt],
+    },
+    parameters={
+        'bool_list': List[Bool],
+        'bool_dict': Collection[Bool]
+    },
+    outputs=[
+        ('return', Collection[SingleInt]),
+    ],
+    name='Example',
+    description=('Example collection method')
+)
+```
+
+The return type annotation on the action itself is still the view type of the Artifacts within the collection.
+
+```python
+def example(int_list: int, int_dict: int, bool_list: bool, bool_dict: bool) -> int:
+    return int_list + list(int_dict.values())
+```
+
+In this instance, the returned value is a list, but it could also have been a dict.
+The actual QIIME 2 Result you get is a `ResultCollection` object which is essentially a wrapper around a dictionary.
+If the original return is a list, the `ResultCollection` uses the list indices as keys.
+
+## Using Collections
+
+### Using Collections with the command line interface (CLI)
+
+In the CLI, output collections require an output path to a directory that does not exist yet.
+The directory will be created, and the Artifacts in the collection will be written to the directory along with a `.order` file that lists the order of the Artifacts in the collection.
+
+These collections can then be used as inputs to new actions by simply passing that directory as the input path.
+You can also create a new directory yourself and place artifacts in it manually to use as an input collection.
+This directory may or may not have a `.order` file.
+If it does not contain a `.order` file, the artifacts in the directory will be loaded in whatever order the file system presents them.
+
+De-facto collections of parameters and inputs may also be created on the CLI by simply passing the corresponding argument multiple times.
+For example, the following will create a collection of `foo.qza` and `bar.qza` for the `ints` input.
+
+```bash
+qiime plugin action --i-ints foo.qza --i-ints bar.qza
+```
+
+The collection will be loaded in the order the arguments are presented on the command line, so in this case it is `[foo, bar]` if `ints` wants a list or `{'0': foo, '1': bar}` if it wants a dict.
+
+You may also explicitly key the values as follows:
+
+```bash
+qiime plugin action --i-ints foo:foo.qza --i-ints bar:bar.qza
+```
+
+As you might imagine, this would look like `{'foo': foo, 'bar': bar}` internally if `ints` wanted a dict.
+If `ints` wanted a list, it would just strip the keys and be `[foo, bar]` again.
+
+### Using Collections with the Python API
+
+When working through QIIME 2's Python API, you can pass in a list or a dict and it follows the same rules as the CLI.
+Internally, QIIME 2 will turn it into the collection type it needs.
+If it needs a dict but you gave it a list, it will use list indices as keys.
+If it needs a list but you gave it a dict, it will strip the keys and make a list of the values.
+
+### The `ResultCollection` object
+
+```{note}
+The following content will be moved to the *References* section as API documentation.
+``` + +QIIME 2 outputs collections in the form of `ResultCollection` objects. +On the CLI, these objects are handled internally, but in the Python API they must be interacted with directly. +Fortunately, these objects are very simple. + +A `ResultCollection` is basically a simple wrapper around a dictionary that can be referenced through its `collection` attribute. + +#### __init__ +Instantiating a `ResultCollection` object without any arguments will create a `ResultCollection` with an empty dictionary as its collection. +Instantiating a `ResultCollection` with a dictionary as its argument will create a `ResultCollection` with that dictionary as its collection. +Instantiating a `ResultCollection` with any other iterable will enumerate the iterable and use the indices as keys to the dictionary that is used as the collection. + +#### load +You can load a directory of Artifacts (an output collection from CLI for example) into a `ResultCollection` by calling `ResultCollection.load('path to directory')`. +If this directory contains a `.order` file, the Artifacts will be loaded in the order specified in the `.order` file. +Otherwise they will be loaded in the order the OS presents them in (not defined by us). +The names of the files will be used as the keys to the Artifacts. + +#### save +You can save your `ResultCollection` to disk by calling `ResultCollection.save('path to destination')` where the destination is a directory that does not exist yet. +This will save all Artifacts in the collection to .qzas in the directory using their key as their name. +It will also create a `.order` file in the directory that lists the keys in the collection in order. + +Other than these methods, you may set and read values on a `ResultCollection` just the same as a dictionary, you may also call keys, values, and items on a `ResultCollection` in the same way as a dictionary. 
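The constructor rules described above can be sketched with a small stand-in class. This is a toy illustration only, not QIIME 2's actual `ResultCollection`: the class name `ToyResultCollection` is hypothetical, and the use of string indices as keys is an assumption drawn from the `{'0': foo, '1': bar}` example earlier in this document.

```python
class ToyResultCollection:
    """Hypothetical stand-in that mimics the documented constructor rules."""

    def __init__(self, collection=None):
        if collection is None:
            # No argument: start with an empty dictionary.
            self.collection = {}
        elif isinstance(collection, dict):
            # A dictionary is used as the collection directly (copied here).
            self.collection = dict(collection)
        else:
            # Any other iterable: enumerate it and use the indices as keys.
            # Assumption: keys are strings, as in the CLI example.
            self.collection = {str(i): v for i, v in enumerate(collection)}

    def __getitem__(self, key):
        return self.collection[key]

    def __setitem__(self, key, value):
        self.collection[key] = value

    def keys(self):
        return self.collection.keys()

    def values(self):
        return self.collection.values()

    def items(self):
        return self.collection.items()


from_list = ToyResultCollection(['foo', 'bar'])
from_dict = ToyResultCollection({'a': 'foo', 'b': 'bar'})
print(list(from_list.keys()))
print(list(from_dict.items()))
```

With the real class, the values would be Artifacts rather than strings.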
+The validate method also exists on `ResultCollection` objects and will validate all Artifacts that are part of the collection. diff --git a/_sources/plugins/how-to-guides/create-register-method.md b/_sources/plugins/how-to-guides/create-register-method.md new file mode 100644 index 00000000..3e19ebf5 --- /dev/null +++ b/_sources/plugins/how-to-guides/create-register-method.md @@ -0,0 +1,103 @@ +(howto-create-register-method)= +# Create and register a Method + + +A `method` accepts some combination of QIIME 2 `artifacts` and parameters as input, and produces one or more QIIME 2 artifacts as output. +These output artifacts could subsequently be used as input to other QIIME 2 `methods` or `visualizers`. + +## Create a function to register as a Method + +A function that can be registered as a `Method` will have a Python 3 API, and the inputs and outputs for that function will be annotated with their data types using [mypy](http://mypy-lang.org/) syntax. +mypy annotation does not impact functionality (though the syntax is new to Python 3), so these can be added to existing functions in your Python 3 software project. +An example is [`q2_diversity.pcoa`](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/_ordination.py#L23C1-L24C71), which takes an `skbio.DistanceMatrix` and an `int` as input, and produces an `skbio.OrdinationResults` as output. +The signature for this function is: + +```python +def pcoa(distance_matrix: skbio.DistanceMatrix, + number_of_dimensions: int = None) -> skbio.OrdinationResults: +``` + + +As far as QIIME is concerned, it doesn’t matter what happens inside this function (as long as it adheres to the contract defined by the signature regarding the input and output types). +For example, `q2_diversity.pcoa` is making some calls to the `skbio` API, but it could be doing anything, including making system calls (if your plugin is wrapping a command line application), executing an R library, etc. 
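
As a minimal sketch of the point that annotations describe a function's contract without changing its behavior, the following uses only the standard library. The function `scale_values` is a hypothetical example and is not part of QIIME 2 or q2-diversity; how the framework itself consumes annotations is not shown here.

```python
from typing import List, get_type_hints


def scale_values(values: List[float], factor: int = 2) -> List[float]:
    # The body can do anything, as long as it honors the annotated contract.
    return [v * factor for v in values]


# Annotations do not alter how the function runs...
result = scale_values([1.0, 2.5])

# ...but tooling can read them to learn the input and output data types.
hints = get_type_hints(scale_values)
print(result)
print(hints)
```

A function written this way can be annotated after the fact, which is why existing Python 3 functions are usually straightforward to prepare for registration.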
+ +(howto-register-method)= +## Register the Method +Once you have a function that you’d like to register as a `Method`, and you’ve instantiated your `Plugin` object, you are ready to register that function as a `Method`. +This will likely be done in the file where the `Plugin` object was instantiated, as it will use that instance (which will be referred to as `plugin` in the following examples). + +We register a `Method` by calling `plugin.methods.register_function` as follows (see the original source [here](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/plugin_setup.py#L192)). + +```python +from q2_types import DistanceMatrix, PCoAResults +from qiime2.plugin import Int, Range, Citations + +import q2_diversity + + +citations = Citations.load('citations.bib', package='q2_diversity') + + +plugin.methods.register_function( + function=q2_diversity.pcoa, + inputs={'distance_matrix': DistanceMatrix}, + parameters={ + 'number_of_dimensions': Int % Range(1, None) + }, + outputs=[('pcoa', PCoAResults)], + input_descriptions={ + 'distance_matrix': ('The distance matrix on which PCoA should be ' + 'computed.') + }, + parameter_descriptions={ + 'number_of_dimensions': "Dimensions to reduce the distance matrix to. " + "This number determines how many " + "eigenvectors and eigenvalues are returned, " + "and influences the choice of algorithm used " + "to compute them. " + "By default, uses the default " + "eigendecomposition method, SciPy's eigh, " + "which computes all eigenvectors " + "and eigenvalues in an exact manner. For very " + "large matrices, this is expected to be slow. " + "If a value is specified for this parameter, " + "then the fast, heuristic " + "eigendecomposition algorithm fsvd " + "is used, which only computes and returns the " + "number of dimensions specified, but suffers " + "some degree of accuracy loss, the magnitude " + "of which varies across different datasets."
+ }, + output_descriptions={'pcoa': 'The resulting PCoA matrix.'}, + name='Principal Coordinate Analysis', + description=("Apply principal coordinate analysis."), + citations=[citations['legendrelegendre'], + citations['halko2011']] +) +``` + + +The values being provided are: +- `function`: The function to be registered as a method. +- `inputs`: A dictionary indicating the parameter name and its `semantic type`, for each input `Artifact`. +These semantic types differ from the data types that you provided in your `mypy` annotation of the input, as `semantic types` describe the data, whereas the data types indicate the structure of the data. +(See {ref}`types-of-types` for more detail on the difference between data types and semantic types.) +In the example above we’re indicating that the `distance_matrix` parameter must be a `DistanceMatrix`. + Notice that the keys in inputs map directly to the parameter names in `q2_diversity.pcoa`. + - `parameters`: A dictionary indicating the parameter name and its primitive type, for each input Parameter. + These parameters are primitive values (i.e., non-`Artifacts`). + In the example above, we’re indicating that `number_of_dimensions` should be an integer greater than or equal to 1. + - `outputs`: A list of tuples indicating each output name and its semantic type. + - `input_descriptions`: A dictionary containing input artifact names and their corresponding descriptions. + This information is used by interfaces to instruct users how to use each specific input artifact. + - `parameter_descriptions`: A dictionary containing parameter names and their corresponding descriptions. + This information is used by interfaces to instruct users how to use each specific input parameter.
+ You should not include any default parameter values in these descriptions, as these will generally be added automatically by an interface. + - `output_descriptions`: A dictionary containing output artifact names and their corresponding descriptions. + This information is used by interfaces to inform users what each specific output artifact will be. + - `name`: A human-readable name for the Method. + This may be presented to users in interfaces. +- `description`: A human-readable description of the Method. +This may be presented to users in interfaces. +- `citations`: A list of bibtex-formatted citations. +These are provided in a separate `citations.bib` file, loaded via the `Citations` API, and accessed here by using their bibtex indices as keys. diff --git a/_sources/plugins/how-to-guides/create-register-pipeline.md b/_sources/plugins/how-to-guides/create-register-pipeline.md new file mode 100644 index 00000000..864e1a1c --- /dev/null +++ b/_sources/plugins/how-to-guides/create-register-pipeline.md @@ -0,0 +1,123 @@ +(howto-create-register-pipeline)= +# Create and register a pipeline +A `Pipeline` accepts some combination of QIIME 2 `Artifacts` and parameters as input, and produces one or more QIIME 2 artifacts and/or `Visualizations` as output. +This is accomplished by stitching together one or more `Methods` and/or `Visualizers` into a single `Pipeline`. + +## Create a function to register as a Pipeline + +Defining a function that can be registered as a `Pipeline` is very similar to defining one that can be registered as a `Method` with a few distinctions. + +First, `Pipelines` do not use function annotations and instead receive `Artifact` objects as input and return `Artifact` and/or `Visualization` objects as output. + +Second, `Pipelines` must have `ctx` as their first parameter, which provides the following API: + - `ctx.get_action(plugin: str, action: str)`: returns a *sub-action* that can be called like a normal Artifact API call. 
+ - `ctx.make_artifact(type, view, view_type=None)`: this has the same behavior as `Artifact.import_data`. It is wrapped by `ctx` for pipeline book-keeping. + +Let's take a look at [`q2_diversity.core_metrics`](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/_core_metrics.py#L10) for an example of a function that we can register as a `Pipeline`: + +```python +def core_metrics(ctx, table, sampling_depth, metadata, n_jobs=1): + rarefy = ctx.get_action('feature_table', 'rarefy') + alpha = ctx.get_action('diversity', 'alpha') + beta = ctx.get_action('diversity', 'beta') + pcoa = ctx.get_action('diversity', 'pcoa') + emperor_plot = ctx.get_action('emperor', 'plot') + + results = [] + rarefied_table, = rarefy(table=table, sampling_depth=sampling_depth) + results.append(rarefied_table) + + for metric in 'observed_otus', 'shannon', 'pielou_e': + results += alpha(table=rarefied_table, metric=metric) + + dms = [] + for metric in 'jaccard', 'braycurtis': + beta_results = beta(table=rarefied_table, metric=metric, n_jobs=n_jobs) + results += beta_results + dms += beta_results + + pcoas = [] + for dm in dms: + pcoa_results = pcoa(distance_matrix=dm) + results += pcoa_results + pcoas += pcoa_results + + for pcoa in pcoas: + results += emperor_plot(pcoa=pcoa, metadata=metadata) + + return tuple(results) +``` + +## Registering the Pipeline + +Registering `Pipelines` is the same as registering `Methods`, with a few exceptions. + +First, we register a `Pipeline` by calling `plugin.pipelines.register_function`. + +Second, `Visualizations` produced as an output are listed in `outputs` as a `tuple` with `Visualization` as the second value. +For example, `('jaccard_emperor', Visualization)`.
+A description of this output should be included in `output_descriptions` + +Citations do not need to be added for the pipeline unless unique citations are required for the pipeline that are not appropriate for the underlying `Methods` and `Visualizers` that it calls. +Citations for these underlying actions are automatically logged in citation provenance for this pipeline. + +As an example for registering a `Pipeline`, we can look at `q2_diversity.core_metrics` (find the original source [here](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/plugin_setup.py#L494)): + +```python +plugin.pipelines.register_function( + function=q2_diversity.core_metrics, + inputs={ + 'table': FeatureTable[Frequency], + }, + parameters={ + 'sampling_depth': Int % Range(1, None), + 'metadata': Metadata, + 'n_jobs': Int % Range(0, None), + }, + outputs=[ + ('rarefied_table', FeatureTable[Frequency]), + ('observed_otus_vector', SampleData[AlphaDiversity]), + ('shannon_vector', SampleData[AlphaDiversity]), + ('evenness_vector', SampleData[AlphaDiversity]), + ('jaccard_distance_matrix', DistanceMatrix), + ('bray_curtis_distance_matrix', DistanceMatrix), + ('jaccard_pcoa_results', PCoAResults), + ('bray_curtis_pcoa_results', PCoAResults), + ('jaccard_emperor', Visualization), + ('bray_curtis_emperor', Visualization), + ], + input_descriptions={ + 'table': 'The feature table containing the samples over which ' + 'diversity metrics should be computed.', + }, + parameter_descriptions={ + 'sampling_depth': 'The total frequency that each sample should be ' + 'rarefied to prior to computing diversity metrics.', + 'metadata': 'The sample metadata to use in the emperor plots.', + 'n_jobs': '[beta methods only] - %s' % sklearn_n_jobs_description + }, + output_descriptions={ + 'rarefied_table': 'The resulting rarefied feature table.', + 'observed_otus_vector': 'Vector of Observed OTUs values by sample.', + 'shannon_vector': 'Vector of Shannon diversity 
values by sample.', + 'evenness_vector': 'Vector of Pielou\'s evenness values by sample.', + 'jaccard_distance_matrix': + 'Matrix of Jaccard distances between pairs of samples.', + 'bray_curtis_distance_matrix': + 'Matrix of Bray-Curtis distances between pairs of samples.', + 'jaccard_pcoa_results': + 'PCoA matrix computed from Jaccard distances between samples.', + 'bray_curtis_pcoa_results': + 'PCoA matrix computed from Bray-Curtis distances between samples.', + 'jaccard_emperor': + 'Emperor plot of the PCoA matrix computed from Jaccard.', + 'bray_curtis_emperor': + 'Emperor plot of the PCoA matrix computed from Bray-Curtis.', + }, + name='Core diversity metrics (non-phylogenetic)', + description=("Applies a collection of diversity metrics " + "(non-phylogenetic) to a feature table.") +) +``` + +See the text describing [registering methods](howto-register-method) for a description of these values. \ No newline at end of file diff --git a/_sources/plugins/how-to-guides/create-register-transformer.md b/_sources/plugins/how-to-guides/create-register-transformer.md new file mode 100644 index 00000000..59802ad0 --- /dev/null +++ b/_sources/plugins/how-to-guides/create-register-transformer.md @@ -0,0 +1,32 @@ +(howto-create-register-transformer)= +# Creating and registering a Transformer + +Transformers are often short Python functions that convert one file format or data type to another file format or data type. +These functions are never directly called by users or developers, so by convention they don't get informative function names (as the annotations of the input and output provide complete detail on what they do). + +Here are two example `transformers` that are [defined and registered in `q2-types`](https://github.com/qiime2/q2-types/blob/e25f9355958755343977e037bbe39110cfb56a63/q2_types/distance_matrix/_transformer.py#L16): + +```python +import skbio + +from ..plugin_setup import plugin +from .
+import LSMatFormat + + +@plugin.register_transformer +def _1(data: skbio.DistanceMatrix) -> LSMatFormat: + ff = LSMatFormat() + with ff.open() as fh: + data.write(fh, format='lsmat') + return ff + + +@plugin.register_transformer +def _2(ff: LSMatFormat) -> skbio.DistanceMatrix: + return skbio.DistanceMatrix.read(str(ff), format='lsmat', verify=False) +``` + + +These transformers define how an `skbio.DistanceMatrix` object is transformed into an `LSMatFormat` object (the underlying format of the data in a `DistanceMatrix` artifact class, defined [here in q2-types](https://github.com/qiime2/q2-types/blob/e25f9355958755343977e037bbe39110cfb56a63/q2_types/distance_matrix/_format.py#L15), and registered to the `DistanceMatrix` semantic type [here](https://github.com/qiime2/q2-types/blob/e25f9355958755343977e037bbe39110cfb56a63/q2_types/distance_matrix/_type.py#L18)). +The transformers are registered using the `@plugin.register_transformer` decorator. \ No newline at end of file diff --git a/_sources/plugins/how-to-guides/create-register-visualizer.md b/_sources/plugins/how-to-guides/create-register-visualizer.md new file mode 100644 index 00000000..b64a4b87 --- /dev/null +++ b/_sources/plugins/how-to-guides/create-register-visualizer.md @@ -0,0 +1,62 @@ +(howto-create-register-visualizer)= +# Create and register a visualizer + +A `Visualizer` accepts some combination of QIIME 2 `Artifacts` and parameters as input, and produces exactly one `Visualization` as output. +Visualizations are visual representations of analytical results (e.g., plots, statistical results, summary tables) and by definition cannot be used as input to other QIIME 2 methods or visualizers. +Thus, visualizers can only produce terminal output in a QIIME 2 analysis. + +## Create a function to register as a Visualizer + +Defining a function that can be registered as a `Visualizer` is very similar to defining one that can be registered as a `Method` with a few additional requirements. 
+ +First, the first parameter to this function must be `output_dir`. +This parameter should be annotated with type `str`. + +Next, at least one `index.*` file must be written to `output_dir` by the function. +This index file will provide the starting point for your users to explore the `Visualization` object that is generated by the `Visualizer`. +Index files with different extensions can be created by the function (e.g., `index.html`, `index.tsv`, `index.png`), but at least one must be created. +You can write whatever files you want to `output_dir`, including tables, graphics, and textual descriptions of the results, but you should expect that your users will want to find those files through your index file(s). +If your function does create many different files, an `index.html` containing links to those files is likely to be helpful. + +Finally, the function cannot return anything, and its return type should be annotated as `None`. + +[`q2_diversity.alpha_group_significance`](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/_alpha/_visualizer.py#L31) is an example of a function that can be registered as a `Visualizer`. +In addition to its `output_dir`, it takes alpha diversity results in a `pandas.Series` and sample metadata in a `qiime2.Metadata` object and creates several different files (figures and tables) that are linked and/or presented in an `index.html` file. +The signature of this function is: + +```python +def alpha_group_significance(output_dir: str, + alpha_diversity: pd.Series, + metadata: qiime2.Metadata) -> None: +``` + +## Register the Visualizer + +Registering `Visualizers` is the same as registering `Methods`, with two exceptions. + +First, you call `plugin.visualizers.register_function` to register a `Visualizer`. + +Next, you do not provide `outputs` or `output_descriptions` when making this call, as `Visualizers`, by definition, only return a single visualization. 
+Since the visualization output path is a required parameter, you do not include this in an `outputs` list (it would be the same for every `Visualizer` that was ever registered, so it is added automatically). + +Registering `q2_diversity.alpha_group_significance` as a `Visualizer` looks like the following (find the original source [here](https://github.com/qiime2/q2-diversity/blob/99a0ccaaec14838b95845dbfe57f874d092b65c7/q2_diversity/plugin_setup.py#L648)): + +```python +plugin.visualizers.register_function( + function=q2_diversity.alpha_group_significance, + inputs={'alpha_diversity': SampleData[AlphaDiversity]}, + parameters={'metadata': Metadata}, + input_descriptions={ + 'alpha_diversity': 'Vector of alpha diversity values by sample.' + }, + parameter_descriptions={ + 'metadata': 'The sample metadata.' + }, + name='Alpha diversity comparisons', + description=("Visually and statistically compare groups of alpha diversity" + " values."), + citations=[citations['kruskal1952use']] +) +``` + +See the text describing [registering methods](howto-register-method) for a description of these values. diff --git a/_sources/plugins/how-to-guides/register-a-plugin.md b/_sources/plugins/how-to-guides/register-a-plugin.md new file mode 100644 index 00000000..ecbe1f8c --- /dev/null +++ b/_sources/plugins/how-to-guides/register-a-plugin.md @@ -0,0 +1,101 @@ +(howto-register-plugin)= +# Register a QIIME 2 plugin + +This document will describe how to `register` a plugin, allowing this plugin to interact with the QIIME 2 framework. + +## Overview + +There are several high-level steps to registering a QIIME 2 plugin: + +1. A QIIME 2 plugin must define one or more Python 3 functions that will be accessible through QIIME. +2. The plugin must be a Python 3 package that can be installed with `setuptools`. +3. The plugin must then instantiate a `qiime2.plugin.Plugin` object and define some information including the name of the plugin and its URL. 
In the plugin package’s `setup.py` file, this instance will be defined as an `entry point`. +4. The plugin must then register its functions as QIIME 2 Actions, which will be accessible to users through any of the QIIME 2 interfaces. +5. Optionally, the plugin could be distributed through [Anaconda](https://anaconda.org/) or [pypi](https://pypi.org/) as that will simplify installation for QIIME 2 users. + +These steps are covered in detail below. + +Writing a simple QIIME 2 plugin should be a straightforward process. +For example, the [`q2-emperor`](https://github.com/qiime2/q2-emperor) plugin, which connects Emperor to QIIME 2, is written in a little over 100 lines of code (excluding unit tests and assets). +It is a standalone plugin that defines how and which functionality in Emperor should be accessible through QIIME 2. +Plugins will vary in their complexity. +For example, a plugin that defines a lot of new functionality would likely be quite a bit bigger. +[q2-diversity](https://github.com/qiime2/q2-diversity) is a good example of this. +Unlike `q2-emperor`, there is some specific functionality (and associated unit tests) defined in this project, and it depends on several other Python 3 compatible libraries. + +Before starting to write a plugin, you should install QIIME 2 and some plugins to familiarize yourself with the system and to provide a means for testing your plugin. + +## Instantiating a plugin + +The next step is to instantiate a QIIME 2 `Plugin` object. 
+This might look like the following: + +```python +from qiime2.plugin import Plugin +import q2_diversity + +plugin = Plugin( + name='diversity', + version=q2_diversity.__version__, + website='https://github.com/qiime2/q2-diversity', + package='q2_diversity', + description=('This QIIME 2 plugin supports metrics for calculating ' + 'and exploring community alpha and beta diversity through ' + 'statistics and visualizations in the context of sample ' + 'metadata.'), + short_description='Plugin for exploring community diversity.', +) +``` + +This will provide QIIME with essential information about your `Plugin`. + +The `name` parameter is the name that users will use to access your plugin from within different QIIME 2 interfaces. +It should be a "command-line-friendly" name, so should not contain spaces or punctuation. +(Avoiding uppercase characters and using dashes (`-`) instead of underscores (`_`) are preferable in the plugin `name`, but not required). + +`version` should be the version number of your package (the same that is used in its `setup.py`). + +`website` should be the page where you'd like end users to refer for more information about your package. + +`package` should be the Python package name for your plugin. + +`description` should give a brief description of this plugin's functionality. +This will be displayed when that plugin's help documentation is accessed via the QIIME 2 framework. + +`short_description` should give a very brief description of this plugin's functionality. +This will be displayed when the QIIME 2 help documentation is accessed. + +While not shown in the previous example, plugin developers can optionally provide the following parameters to `qiime2.plugin.Plugin`: + +* `citations`: A list of bibtex-formatted citations. +These are provided in a separate `citations.bib` file, loaded via the `Citations` API, and accessed by using their bibtex indices as keys. 
+Citations can be listed during plugin or action registration, or both, but will usually only be listed for individual actions unless a single reference is appropriate for all actions in that plugin. +`q2-diversity` has no such plugin-wide citation listed here. + +* `user_support_text`: free text describing how users should get help with the plugin (e.g. +issue tracker, StackOverflow tag, mailing list, etc.). +If not provided, users are referred to the `website` for support. +Plugin developers are free to support their plugins on the QIIME 2 Forum, so you can include that URL as the `user_support_text` for your plugin. +If you do that, you should get in the habit of monitoring the QIIME 2 Forum for technical support questions. + +The `Plugin` object can live anywhere in your project, but by convention it will be in a file called `plugin_setup.py`. +You can see a complete working example in q2-dwq2 [here](https://github.com/caporaso-lab/q2-dwq2/blob/e8fe1e5b32bfc2a331d48611b3a70b0fa5b19165/q2_dwq2/plugin_setup.py#L21). + + +## Defining your plugin object as an entry point + +Finally, you need to tell QIIME where to find your instantiated `Plugin` object. +This is done by defining it as an `entry_point` in your project's `setup.py` file. +In `q2-diversity`, this is done as follows: + +```python +setup( + ... + entry_points={ + 'qiime2.plugins': ['q2-diversity=q2_diversity.plugin_setup:plugin'] + } +) +``` + +The relevant key in the `entry_points` dictionary will be `'qiime2.plugins'`, and the value will be a single-element list containing a string formatted as `<distribution name>=<import path>:<instance name>`. +`<distribution name>` is the name of the Python package distribution (matching the value passed for `name` in this call to `setup`); `<import path>` is the import path for the `Plugin` instance you created above; and `<instance name>` is the name for the `Plugin` instance you created above.
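
To make the three parts of the entry point string concrete, here is a minimal sketch that splits the `q2-diversity` value shown above. The helper `split_entry_point` is hypothetical and exists only for illustration; packaging tools do their own parsing, and this is not how QIIME 2 or `setuptools` implement it.

```python
def split_entry_point(spec: str):
    """Split 'name=import.path:attribute' into its three parts."""
    # Everything before '=' is the name given to the entry point.
    name, _, target = spec.partition('=')
    # The target is an import path and an attribute, separated by ':'.
    module_path, _, attribute = target.partition(':')
    return name.strip(), module_path.strip(), attribute.strip()


parts = split_entry_point('q2-diversity=q2_diversity.plugin_setup:plugin')
print(parts)
```

Read this way, the string says: import `q2_diversity.plugin_setup` and look up the object called `plugin` in it.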
diff --git a/bibliography.html b/back-matter/bibliography.html similarity index 57% rename from bibliography.html rename to back-matter/bibliography.html index f6fcd2b9..504ee906 100644 --- a/bibliography.html +++ b/back-matter/bibliography.html @@ -19,55 +19,56 @@ - - - + + + - - - - + + + + - - - - - - - - + + + + + + + + - - - - - - - - - - - - + + + + + + + + + + + + - + - + - - - - - - + + + + + + + @@ -104,7 +105,7 @@
@@ -163,40 +164,52 @@ @@ -251,7 +264,7 @@ -
    diff --git a/back-matter/genindex.html b/back-matter/genindex.html new file mode 100644 index 00000000..c468e76e --- /dev/null +++ b/back-matter/genindex.html @@ -0,0 +1,461 @@ + + + + + + + + + + + + Index — Developing with QIIME 2 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
    Index#

    + + \ No newline at end of file diff --git a/back-matter/glossary.html b/back-matter/glossary.html new file mode 100644 index 00000000..80034587 --- /dev/null +++ b/back-matter/glossary.html @@ -0,0 +1,543 @@ + + + + + + + + + + + + Glossary — Developing with QIIME 2 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

    Glossary#

    +
    +
    Action#

    A generic term to describe a concrete method, visualizer, or pipeline. +Actions accept parameters and/or files (artifacts or metadata) as input, and generate some kind of output.

    +
    +
    Archive#

    The directory structure of a QIIME 2 Result. +Contains at least a root directory (named by UUID) and a VERSION file within that directory.

    +
    +
    Artifact#

    A QIIME 2 Result that contains data to operate on.

    +
    +
    Deployment#

    An installation of QIIME 2 as well as zero-or-more interfaces and plugins. +The collection of interfaces and plugins in a deployment can be defined by a distribution of QIIME 2.

    +
    +
    Directory Format#

    A string that represents a particular layout of files and/or directories as well as how their contents will be structured.

    +
    +
    Distribution#

    A collection of QIIME 2 plugins that are designed to be installed together. +These are generally grouped by a theme. For example, the Amplicon Distribution provides a collection of plugins for analysis of microbiome amplicon data, while the Shotgun Distribution provides a collection of plugins for analysis of microbiome shotgun metagenomics data. +When a distribution is installed, that particular installation of QIIME 2 is an example of a deployment.

    +
    +
    Format#

    A string that represents a particular file format.

    +
    +
    Framework#

    The engine of orchestration that enables QIIME 2 to function together as a cohesive unit.

    +
    +
    Identifier#

    A unique value that denotes an individual sample or feature.

    +
    +
    Identity#

    Distinguishes a piece of data. QIIME 2 does not consider a rename (like UNIX mv) to change identity; however, re-running a command would change identity.

    +
    +
    Input#

    Data provided to an action. Can be an artifact or metadata.

    +
    +
    Interface#

    A user-interface responsible for coordinating user-specified intent into framework-driven action.

    +
    +
    Metadata#

    Columnar data for annotating additional values to existing data. Operates along Sample IDs or Feature IDs.

    +
    +
    Method#

    A method accepts some combination of QIIME 2 artifacts and parameters as input, and produces one or more QIIME 2 artifacts as output.

    +
    +
    Output#

    Objects returned by an action. Can be artifact(s) or visualization(s).

    +
    +
    Parameter#

    A value that alters the behavior of an action.

    +
    +
    Payload#

    Data that is meant for primary consumption or interpretation (in contrast to metadata which may be useful retrospectively, but is not primarily useful).

    +
    +
    Pipeline#

    A pipeline accepts some combination of QIIME 2 artifacts and parameters as input, and produces one or more QIIME 2 artifacts and/or visualizations as output.

    +
    +
    Plugin#

    A discrete module that registers some form of additional functionality with the framework, including new methods, visualizers, formats, or transformers.

    +
    +
    Primitive Type#

    A type that is used to communicate parameters to an interface. These are predefined by the framework and cannot be extended.

    +
    +
    Result#

    A generic term for either a Visualization or an Artifact.

    +
    +
    Provenance#

    Data describing how an analysis was performed, captured automatically whenever users perform a QIIME 2 Action. +Provenance information describes the host system, the computing environment, Actions performed, parameters passed, primary sources cited, and more.

    +
    +
    Semantic Type#

    A type that is used to classify artifacts and how they can be used. +These types may be extended by plugins.

    +
    +
    Transformer#

    A function registered on the framework capable of converting data in one format into data of another format.

    +
    +
    Type#

    A term that is used to represent several different ideas in QIIME 2, and which is therefore ambiguous when used on its own. +More specific terms are file type, semantic type, and data type. See Semantic types, data types, and file formats for more information.

    +
    +
    UUID#

    Universally Unique IDentifier. In the context of QIIME 2, this almost certainly refers to a Version 4 UUID, which is a randomly generated ID. +See this RFC or this Wikipedia entry for details.
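As a small illustration of the Version 4 UUID described in this entry, Python's standard `uuid` module can generate one. This is only a sketch of what a randomly generated identifier looks like; the helper name `new_identity` is hypothetical and is not part of QIIME 2.

```python
import uuid


def new_identity() -> uuid.UUID:
    """Return a randomly generated (Version 4) UUID."""
    return uuid.uuid4()


first = new_identity()
second = new_identity()

# Each call draws fresh random bits, so two results are, in practice, never equal.
print(first)
print(first.version)
print(first != second)
```

Because each value is drawn at random, producing a result a second time yields a new identifier, while renaming a file leaves the identifier stored inside it untouched.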

    +
    +
    View#

    A particular representation of data. This includes on-disk formats and in-memory data structures (objects).

    +
    +
    Visualization#

    A QIIME 2 Result that contains an interactive visualization.

    +
    +
    Visualization (Type)#

    The type of a visualization. +There are no subtyping relations between this type and any other, and it cannot be extended (because it is a singleton).

    +
    +
    Visualizer#

    A visualizer accepts some combination of QIIME 2 artifacts and parameters as input, and produces exactly one visualization as output.

    +
    +
    + + \ No newline at end of file diff --git a/ci/intro.html b/ci/intro.html index b58eec23..a3e513d8 100644 --- a/ci/intro.html +++ b/ci/intro.html @@ -67,8 +67,8 @@ - - + + @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    @@ -383,20 +395,20 @@

    Introduction to CI and Release Process

    previous

    -

    Introduction to the Framework

    +

    QIIME 2 architecture overview

    next

    -

    List of works cited

    +

    Glossary

    diff --git a/docs/developer-documentation.html b/docs/developer-documentation.html index cfd3d9d3..f627530e 100644 --- a/docs/developer-documentation.html +++ b/docs/developer-documentation.html @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    diff --git a/docs/user-documentation.html b/docs/user-documentation.html index b2262860..4f8a3c50 100644 --- a/docs/user-documentation.html +++ b/docs/user-documentation.html @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    diff --git a/framework/explanations/architecture.html b/framework/explanations/architecture.html new file mode 100644 index 00000000..8473c6cc --- /dev/null +++ b/framework/explanations/architecture.html @@ -0,0 +1,590 @@ + + + + + + + + + + + + QIIME 2 architecture overview — Developing with QIIME 2 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

    QIIME 2 architecture overview

    + +
    + +
    +
    + + + + +
    + +
    +

    QIIME 2 architecture overview#

    +

    The goal of this document is to give the reader a high-level understanding of the components of QIIME 2, and how they are inter-related.

    +

    At the highest level, there are three kinds of components in QIIME 2:

    +
      +
    • The interfaces, which are responsible for translating user intent into action.

    • +
    • The framework, whose behavior and purpose will be described in further detail below.

    • +
    • The plugins, which define all domain-specific functionality.

    • +
    +
    +../../_images/simple_component_diagram.svg
    +

    Fig. 1 Box and Arrow diagram of QIIME 2 components. +Interfaces only interact with plugins through the framework, which will invoke plugin behavior as needed. +Solid arrows are a direct dependency. +Dash-dotted arrows are a deferred dependency (via entry-point).#

    +
    +
    +

    The above diagram illustrates the most important restriction of the architecture. +Interfaces cannot and should not have any particular knowledge about plugins ahead of time. +Instead they must request that information from the framework, which provides a high-level description of all of the actions available via SDK (Software Development Kit) objects.

    +

    At first glance, this restriction may seem onerous. +However, because interfaces cannot communicate directly with plugins, they also never need to coordinate with them. +This means interfaces are entirely decoupled from the plugins and, more importantly, plugins are always decoupled from interfaces. +A plugin developer does not need to provide any interface-specific functionality, so plugins and interfaces can be developed in parallel. +Only changes in the framework itself require coordination between the other two component types. +This key constraint, coupled with a set of semantically rich SDK objects, allows multiple kinds of interfaces to be dynamically generated. +This allows QIIME 2 to adapt its UI to both the audience and the task at hand.
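The decoupling described above can be sketched with a toy registry. This is not QIIME 2 code: `ToyFramework`, `ActionDescription`, and `build_cli_commands` are hypothetical names invented for illustration, standing in for the framework, an SDK object, and an interface respectively.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List


@dataclass
class ActionDescription:
    """Hypothetical stand-in for an SDK object describing one action."""
    plugin: str
    name: str
    parameters: List[str]
    function: Callable


@dataclass
class ToyFramework:
    """Hypothetical registry sitting between plugins and interfaces."""
    actions: Dict[str, ActionDescription] = field(default_factory=dict)

    def register(self, plugin, name, parameters, function):
        key = f"{plugin}.{name}"
        self.actions[key] = ActionDescription(plugin, name, parameters, function)

    def describe_actions(self):
        return list(self.actions.values())


# Plugin side: registers with the framework and knows nothing about interfaces.
framework = ToyFramework()
framework.register("toy-plugin", "double", ["value"], lambda value: value * 2)


# Interface side: builds its commands purely from the framework's descriptions.
def build_cli_commands(fw):
    commands = {}
    for action in fw.describe_actions():
        flags = " ".join(f"--{p}" for p in action.parameters)
        commands[f"{action.plugin} {action.name}"] = flags
    return commands


commands = build_cli_commands(framework)
print(commands)
```

A second interface (a graphical one, say) could walk the same descriptions and render forms instead of flags, without the plugin changing at all.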

    +
    +

    Detailed Component Diagram#

    +

    A more complete version of the above figure is found below:

    +
    +../../_images/complex_component_diagram.svg
    +

    Fig. 2 Detailed Box and Arrow diagram of QIIME 2 components. +Solid arrows are a direct dependency. +Dash-dotted arrows are a deferred dependency (via entry-point). +Dashed rounded boxes surrounding other components indicate a group of like-components. +The larger gray box indicates a nested component, containing sub-components. +Text within angle-brackets (<>) indicates a Python package/import name.#

    +
    +
    +

    Here we observe that interfaces use a particular sub-component of the framework called the SDK. +We also see that one of the interfaces is built into the framework itself (the Artifact API). However, it is no more privileged than any of the other interfaces, and none of the other interfaces use it directly.

    +

    Looking now at the plugins we see that they use a sub-component of the framework called the Plugin API. +This is responsible for constructing and registering the relevant SDK objects for use by interfaces. +We also see that plugins can depend on other plugins.

    +

    At this point the rough picture of how an interface uses a plugin can be seen. +Plugins are loaded by the framework’s SDK via an entry-point (more on that later). +This in turn causes the plugin code to interact with the Plugin API, which constructs SDK objects. +These SDK objects are then introspected and manipulated by any number of Interfaces.
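Entry points are a standard Python packaging feature, and the deferred loading described above can be sketched with the standard library's `importlib.metadata`. The helper `discover_entry_points` is hypothetical, and the group name `"qiime2.plugins"` is an assumption used for illustration; this is not the framework's own loading code.

```python
from importlib.metadata import entry_points


def discover_entry_points(group):
    """Return {name: entry point} for one group, without importing any plugin code."""
    all_eps = entry_points()
    if hasattr(all_eps, "select"):      # Python 3.10 and later
        selected = all_eps.select(group=group)
    else:                               # Python 3.8 / 3.9 return a plain dict
        selected = all_eps.get(group, [])
    return {ep.name: ep for ep in selected}


# The group name is an assumption for illustration.
found = discover_entry_points("qiime2.plugins")
for name, ep in sorted(found.items()):
    # ep.load() would import the plugin module. It is deliberately not called
    # here, which is what makes the dependency "deferred".
    print(name, "->", ep.value)
```

On a machine with no matching packages installed, the loop simply prints nothing.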

    +
    +
    +

    Following A Command Through QIIME 2#

    +

    To get a better idea of where the responsibility of these components starts and ends, we can look at a sequence diagram describing the execution of an action by a user.

    +
    +../../_images/action_call_sequence_diagram.svg
    +

    Fig. 3 UML Sequence Diagram of an action-call in QIIME 2. +This diagram is read from top to bottom, which indicates the passage of some non-specific amount of time. +Components are vertical columns. +An activated state of a component is indicated by a narrow box. +Components can perform actions either upon other components, or upon themselves. +These actions are denoted with a solid arrow pointing at the component being acted upon. +The label indicates what action is performed and, when provided, parentheses indicate some kind of argument that is provided. +For brevity, not all arguments are enumerated. +Results of an action are denoted with a dashed arrow and are usually labeled with the result’s name.#

    +
    +
    +

    This figure has four components: a User, an Interface, the Framework, and a Plugin. +First, we see a User invoking some action with some files and parameters. +The Interface receives this and is activated. +It locates the plugin and the action requested from the Framework, receiving SDK objects (not shown). +Then it loads the provided files as QIIME 2 Artifacts. +It is then ready to call the action (an SDK object) with the User’s artifacts and parameters.

    +

    The Framework then provides some input validation (it is much faster to check that the data provided will work for the action requested than to fail halfway through a very long process, requiring the User to start over). +The Framework then identifies what format the data should be in, and invokes relevant code defined by a Plugin (though not necessarily the same one) for converting that data. +Finally, with validated input and a compatible format, the data is provided to the plugin to perform whatever task the User intended.

    +

    Once finished, the Plugin returns the results and the Framework will again convert that data into a particular format for storage using a Plugin. +The Framework then writes that data into an archive (as /data/) and records what steps just occurred (in /provenance/). +This completed archive is now an artifact and is returned to the Interface. +The Interface decides to save the artifact to a file and then returns that to the User.
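The Framework's part of this sequence (validate, transform, execute, transform back, record) can be sketched as a single function. Everything here is a toy under stated assumptions: artifacts are plain dictionaries, and `call_action`, the `"Numbers"` type, and the two converters are hypothetical names that do not correspond to QIIME 2's actual API or archive layout.

```python
import uuid


def call_action(action, transformers, artifacts, parameters):
    """Toy version of the Framework's part of the sequence diagram."""
    # 1. Validate input before doing any expensive work.
    for name, artifact in artifacts.items():
        expected = action["inputs"][name]
        if artifact["type"] != expected:
            raise TypeError(f"{name}: expected {expected}, got {artifact['type']}")

    # 2. Transform stored data into the view the plugin function wants.
    views = {name: transformers["to_view"](a["data"]) for name, a in artifacts.items()}

    # 3. Execute the plugin's function.
    result = action["function"](**views, **parameters)

    # 4. Transform the result back for storage, then record data and provenance.
    return {
        "uuid": str(uuid.uuid4()),
        "type": action["output"],
        "data": transformers["to_storage"](result),
        "provenance": {
            "action": action["name"],
            "parameters": dict(parameters),
            "inputs": {name: a["uuid"] for name, a in artifacts.items()},
        },
    }


# Plugin side: a plain function plus two converters (all hypothetical).
action = {
    "name": "scale",
    "inputs": {"table": "Numbers"},
    "output": "Numbers",
    "function": lambda table, factor: [x * factor for x in table],
}
transformers = {
    "to_view": lambda text: [int(x) for x in text.split(",")],
    "to_storage": lambda values: ",".join(str(v) for v in values),
}

# Interface side: load an artifact and call the action.
table = {"uuid": str(uuid.uuid4()), "type": "Numbers", "data": "1,2,3"}
result = call_action(action, transformers, {"table": table}, {"factor": 10})
print(result["data"])
print(result["provenance"])
```

Note that the plugin function never sees storage formats or provenance; it receives a list and returns a list, and the surrounding bookkeeping belongs to the framework layer.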

    +
    +
    +

    Summary#

    +

    In this example we see that the activation of each component is strictly nested. +It forms a sort of “onion of responsibility” between the component layers. +We also note that the Interface waits for the task to finish before becoming inactive; there are other modes of calling actions which are asynchronous and can be used instead. +In either case, we see that each component is successively responsible for fewer tasks, which become more specific as we move to the right.

    +

    The end result is:

    +
      +
    • The Interface need only care about communicating with the User.

    • +
    • The Plugin need only care about manipulating data to some effect.

    • +
    • The Framework concerns itself with coordinating the overall effort and recording the data surrounding the action.

    • +
    +
    +
    + + \ No newline at end of file diff --git a/framework/intro.html b/framework/intro.html index 4ba6e854..39701ef7 100644 --- a/framework/intro.html +++ b/framework/intro.html @@ -67,7 +67,7 @@ - + @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    @@ -350,6 +362,8 @@

    Introduction to the Framework

    Introduction to the Framework#

    +
    +

    Types of QIIME 2 Actions

    + +
    +
    + +
    +
    +
    + + + + +
    + +
    +

    Types of QIIME 2 Actions#

    +

    A QIIME 2 plugin action is any operation that accepts parameters and files (artifacts or metadata) as input, and generates some type of output. +Actions are interpreted as “commands” by QIIME 2 interfaces and come in three flavors:

    +
      +
    1. A Method accepts some combination of QIIME 2 Artifacts and Parameters as input, and produces one or more QIIME 2 artifacts as output. +These output artifacts could subsequently be used as input to other QIIME 2 Methods or Visualizers. +Methods can produce intermediate or terminal outputs in a QIIME 2 analysis. +For example, the rarefy method defined in the q2-feature-table plugin accepts a feature table artifact and sampling depth as input and produces a rarefied feature table artifact as output. +This rarefied feature table artifact could then be used in another analysis, such as alpha diversity calculations provided by the alpha method in q2-diversity.

    2. A Visualizer is similar to a Method in that it accepts some combination of QIIME 2 Artifacts and Parameters as input. +In contrast to a method, a visualizer produces exactly one Visualization as output. +Visualizations, by definition, cannot be used as input to other QIIME 2 methods or visualizers. +Thus, visualizers can only produce terminal output in a QIIME 2 analysis.

    3. A Pipeline accepts some combination of QIIME 2 Artifacts and Parameters as input and produces one or more artifacts and/or visualizations as output. +It does so by incorporating one or more methods and/or visualizers into a single registered action.
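The relationship between the three kinds of action can be sketched with plain Python functions. This is a toy under stated assumptions: artifacts and visualizations are modelled as dictionaries, and `cap_counts`, `summarize`, and `cap_and_summarize` are hypothetical functions, not actions from any real plugin.

```python
def cap_counts(table, depth):
    """Toy method: returns a new artifact-like dict that later steps can consume."""
    capped = {sample: min(count, depth) for sample, count in table["data"].items()}
    return {"kind": "artifact", "data": capped}


def summarize(table):
    """Toy visualizer: returns exactly one visualization-like dict (terminal output)."""
    lines = [f"{sample}: {count}" for sample, count in sorted(table["data"].items())]
    return {"kind": "visualization", "report": "\n".join(lines)}


def cap_and_summarize(table, depth):
    """Toy pipeline: chains a method and a visualizer, returning both kinds of output."""
    capped = cap_counts(table, depth)
    return capped, summarize(capped)


table = {"kind": "artifact", "data": {"s1": 12, "s2": 5}}
capped, viz = cap_and_summarize(table, depth=10)
print(capped["data"])
print(viz["report"])
```

The method's output can feed further steps, the visualizer's output cannot, and the pipeline is only a composition of the two.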

    +
    + + \ No newline at end of file diff --git a/plugins/explanations/intro.html b/plugins/explanations/intro.html index 1918e5fc..b381ab27 100644 --- a/plugins/explanations/intro.html +++ b/plugins/explanations/intro.html @@ -68,7 +68,7 @@ - + @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    @@ -357,6 +369,7 @@

    Explanations

    @@ -392,12 +405,12 @@

    Explanations

    previous

    -

    Set up your development environment

    +

    Use Artifact Collections as Action inputs or outputs

    + + + + + + + + + Transformers — Developing with QIIME 2 + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +

    Transformers

    + +
    +
    + +
    +

    Contents

    +
    + +
    +
    +
    + + + + +
    + +
    +

    Transformers#

    +

    Transformers are functions for converting Artifact Classes into data types to be consumed by Python functions that are registered as Actions. +These transformers are typically defined along with the semantic types for which they are designed, and q2-types provides a number of common types and associated transformers. +Plugins can also define semantic types and/or transformers.

    +
    +

    How are transformers used by a plugin?#

    +

    Transformers are not called directly at any time within a plugin. +Transformations are handled by the QIIME 2 framework, as long as the appropriate transformers are registered. +The framework identifies the input Artifact source format for the transformation as the format registered to the Artifact’s semantic type, and it identifies the transformation’s destination type based on the functional annotation associated with the input in an Action’s registered function. +When output is generated by a Method, the framework identifies the source type for the transformation as the registered function’s output annotation, and the destination format for the output Artifact from the output’s semantic type defined in the Action’s registration.

    +

    For example, we can see how functional annotations define input and output formats in q2_diversity.beta_phylogenetic:

    +
    def beta_phylogenetic(table: biom.Table,
    +                      phylogeny: skbio.TreeNode,
    +                      metric: str) -> skbio.DistanceMatrix:
    +
    +
    +

    This function requires biom.Table and skbio.TreeNode objects as input, and produces an skbio.DistanceMatrix object as output.

    +

    We can examine the first few lines of the Action registration for this function to determine the semantic types of these input and output objects:

    +
    plugin.pipelines.register_function(
    +    function=q2_diversity.beta_phylogenetic,
    +    inputs={'table': FeatureTable[Frequency],
    +            'phylogeny': Phylogeny[Rooted]},
    +    parameters={'metric': Str % Choices(beta.phylogenetic_metrics())},
    +    outputs=[('distance_matrix', DistanceMatrix)],
    +
    +
    +

    This pair of code examples illustrates that the biom.Table object used in beta_phylogenetic begins its life as a FeatureTable[Frequency] artifact, and the skbio.TreeNode comes from a Phylogeny[Rooted] artifact. +The output skbio.DistanceMatrix generated by beta_phylogenetic must be coerced to become a DistanceMatrix artifact. +The QIIME 2 framework takes care of all of those conversions for you, provided the appropriate transformers have been defined and registered (which you can learn how to do here).
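The idea that annotations drive the choice of transformer can be sketched with the standard library's `inspect` module. This is a toy lookup table, not the framework's implementation: the module-level `register_transformer` decorator, `CsvFormat`, `total_plus_one`, and `run` are all hypothetical names defined only for this example.

```python
import inspect

TRANSFORMERS = {}


def register_transformer(func):
    """Record a converter under the (source, destination) pair read from its annotations."""
    sig = inspect.signature(func)
    (param,) = sig.parameters.values()
    TRANSFORMERS[(param.annotation, sig.return_annotation)] = func
    return func


class CsvFormat(str):
    """Hypothetical stored format, modelled as a string of comma-separated integers."""


@register_transformer
def _csv_to_list(data: CsvFormat) -> list:
    return [int(x) for x in data.split(",")]


@register_transformer
def _list_to_csv(values: list) -> CsvFormat:
    return CsvFormat(",".join(str(v) for v in values))


def total_plus_one(table: list) -> list:
    """A plugin-style function that only ever sees in-memory views."""
    return [sum(table) + 1]


def run(function, stored_input, storage_format):
    """Pick transformers from the function's annotations, as a framework might."""
    sig = inspect.signature(function)
    (param,) = sig.parameters.values()
    view = TRANSFORMERS[(storage_format, param.annotation)](stored_input)
    result = function(view)
    return TRANSFORMERS[(sig.return_annotation, storage_format)](result)


stored = run(total_plus_one, CsvFormat("1,2,3"), CsvFormat)
print(stored)
```

The function being run never calls a transformer itself; the lookup is keyed entirely on the stored format and on the function's input and output annotations.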

    +
    +
    + + \ No newline at end of file diff --git a/plugins/explanations/types-of-types.html b/plugins/explanations/types-of-types.html index a36843e7..15357759 100644 --- a/plugins/explanations/types-of-types.html +++ b/plugins/explanations/types-of-types.html @@ -67,7 +67,7 @@ - + @@ -172,10 +172,17 @@
  • Tutorials: Developing a QIIME 2 Plugin
  • How To Guides
  • Explanations
  • References
      @@ -189,7 +196,10 @@

    Framework Internals

    CI Internals

    Back Matter

    @@ -365,7 +377,7 @@

    Contents

    The term type is overloaded with a few different concepts. The goal of this Explanation article is to disambiguate how it’s used in QIIME 2. To achieve this, we’ll discuss two ways that it’s commonly used, and then introduce a third way that it’s used less frequently but which is important to QIIME 2. -The three kinds of types that are used in QIIME 2 are file types (more frequently referred to file formats in QIIME 2), data types, and semantic types.

    +The three kinds of types that are used in QIIME 2 are file types, data types, and semantic types.