GitHub code navigation

GitHub code navigation helps you read, navigate, and understand code by linking the definition of a named symbol (such as a class or method) to its references, and each reference back to the symbol's definition. GitHub has developed two code navigation approaches:

  • Search-based: searches all definitions and references across a repository to find symbols with a given name
  • Precise: resolves definitions and references based on the set of classes, functions, and imported definitions at a given point in your code

Search-based code navigation is implemented using the Tree-sitter parser ecosystem. A few languages support precise code navigation, built with stack graphs.
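To make the difference between the two approaches concrete, here is a toy sketch in Python. It does not reflect how Tree-sitter or stack graphs work internally; the index, the import table, and both function names are hypothetical and exist only to show how the result sets differ.

```python
# Hypothetical toy index: (file, symbol name, line) for each definition.
DEFINITIONS = [
    ("cat.py", "noise", 3),
    ("dog.py", "noise", 3),
    ("main.py", "main", 5),
]

# Hypothetical import table: which file each imported name comes from.
IMPORTS = {"main.py": {"noise": "cat.py"}}


def search_based(name):
    """Return every definition in the repository with a matching name."""
    return [(f, line) for f, n, line in DEFINITIONS if n == name]


def precise(name, from_file):
    """Return only the definition that `name` resolves to in `from_file`."""
    # Fall back to the referencing file itself if the name was not imported.
    target = IMPORTS.get(from_file, {}).get(name, from_file)
    return [(f, line) for f, n, line in DEFINITIONS if n == name and f == target]


print(search_based("noise"))        # [('cat.py', 3), ('dog.py', 3)]
print(precise("noise", "main.py"))  # [('cat.py', 3)]
```

Search-based lookup returns both candidates because it matches on the name alone, while the precise lookup uses what is in scope at the reference to pick one.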

For more information, see "Navigating code on GitHub."

Supported languages

Code navigation is supported for the following languages:

Language          Search-based  Precise
Bash              ✔️            ✖️
C#                ✔️            ✖️
C++               ✔️            ✖️
CodeQL            ✔️            ✖️
Elixir            ✔️            ✖️
Go                ✔️            ✖️
JSX               ✔️            ✖️
Java              ✔️            ✖️
JavaScript        ✔️            ✖️
Lua               ✔️            ✖️
PHP               ✔️            ✖️
Protocol Buffers  ✔️            ✖️
Python            ✔️            ✔️
R                 ✔️            ✖️
Ruby              ✔️            ✖️
Rust              ✔️            ✖️
Scala             ✔️            ✖️
Starlark          ✔️            ✖️
Swift             ✔️            ✖️
TypeScript        ✔️            ✔️

If your programming language is not listed, you can help us add it.

Adding code navigation for a new language

To add code navigation for a new language, you must follow these steps:

  1. Add the language to Linguist.
  2. Define a Tree-sitter parser for the language.
  3. Write tags queries.
  4. Write fully-qualified name queries (if applicable).
  5. Open an issue in this repo.

For details, see below.

Note

Adding a language is at the discretion of GitHub. We may not add every language. Common reasons to reject language support include an immature Tree-sitter parser, excessive resources required to parse, or low use on GitHub.

Add the language to Linguist

First, the language must be added to Linguist. Linguist is the source of truth for all languages on GitHub.

To check whether the language exists in Linguist, search the languages.yml file. If your language is not included in Linguist, follow the contribution guidelines to get it added.
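If you have a local copy of languages.yml, the check can be scripted. The sketch below assumes that each language appears as an unquoted top-level key at the start of a line (for example `Python:`); the function name and the sample text are hypothetical, and a real YAML parser would be more robust.

```python
import re

# Small hypothetical excerpt in the shape of Linguist's languages.yml.
SAMPLE = """\
Python:
  type: programming
  extensions:
  - ".py"
Go:
  type: programming
"""


def language_in_linguist(languages_yml_text, language):
    """Return True if `language` appears as a top-level key in the YAML text."""
    # Top-level keys start at column 0 and end with a colon.
    pattern = re.compile(r"^" + re.escape(language) + r":\s*$", re.MULTILINE)
    return bool(pattern.search(languages_yml_text))


print(language_in_linguist(SAMPLE, "Go"))   # True
print(language_in_linguist(SAMPLE, "Zig"))  # False
```

To run this against the real file, read its contents into a string and pass it as the first argument.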

Tree-sitter parser

Next, we require a mature, well-maintained Tree-sitter parser for the language. The parser must be published as a Rust crate on crates.io.

Most popular programming languages already have a Tree-sitter grammar, but if you need to create one, you can review the documentation for creating a new parser.

Tags query

Once the language has a Tree-sitter parser, you need to write tags queries to extract the structure of the code for navigation. A tags query is a Scheme-like expression that matches nodes in the abstract syntax tree generated by the Tree-sitter parser and extracts a symbol from each match. You can look at existing Tree-sitter parsers for inspiration. Parsers usually contain a file called tags.scm with tags queries (for example, see the JavaScript tags queries). Additionally, Tree-sitter has documentation about using tags queries for code navigation.

GitHub code navigation supports extracting definitions for these types of symbols:

Category        Tag
Class           @definition.class
Constant        @definition.constant
Enum            @definition.enum
Enum variant    @definition.enum_variant
Field           @definition.field
Function        @definition.function
Implementation  @definition.implementation
Interface       @definition.interface
Macro           @definition.macro
Method          @definition.method
Module          @definition.module
Struct          @definition.struct
Trait           @definition.trait
Type            @definition.type
Union           @definition.union

Additionally, references to function or method calls can be extracted as @reference.call.

Not all programming languages support all of these symbol types. Your tags queries should include only the types that make sense for your programming language.
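Before you submit a tags.scm file, it can help to check that it only uses the capture names listed above. The following is a minimal sketch, not a tool that GitHub provides: the function name and the example query are hypothetical, and comment stripping is deliberately naive (it treats everything after a `;` as a comment, even inside a string).

```python
import re

# Capture names taken from the table above, plus @reference.call.
SUPPORTED_TAGS = {
    "definition.class", "definition.constant", "definition.enum",
    "definition.enum_variant", "definition.field", "definition.function",
    "definition.implementation", "definition.interface", "definition.macro",
    "definition.method", "definition.module", "definition.struct",
    "definition.trait", "definition.type", "definition.union",
    "reference.call",
}

# Matches captures such as @definition.class or @reference.call.
TAG_PATTERN = re.compile(r"@((?:definition|reference)\.[A-Za-z_]+)")

# Hypothetical query text for illustration only.
EXAMPLE_QUERY = """
(class_declaration name: (identifier) @name) @definition.class
(call_expression function: (identifier) @name) @reference.call
(lambda_expression) @definition.lambda  ; not in the table above
"""


def unsupported_tags(query_text):
    """Return definition/reference captures that are not in SUPPORTED_TAGS."""
    found = set()
    for line in query_text.splitlines():
        code = line.split(";", 1)[0]  # drop trailing comments
        found.update(TAG_PATTERN.findall(code))
    return sorted(found - SUPPORTED_TAGS)


print(unsupported_tags(EXAMPLE_QUERY))  # ['definition.lambda']
```

Helper captures such as @name are ignored because the pattern only looks at the definition and reference prefixes.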

Fully-qualified names

For languages that support defining functions, methods, or other entities within another structure, GitHub code navigation supports extracting fully-qualified names. Fully-qualified names are used to improve code navigation as well as the relevance of search results.

Here is an example from our Java extractor. The following Java code defines a class named Cat that has a method named noise:

public class Cat {
  public String noise() {
    return "meow";
  }
}

Our tags queries extract @definition.class and @definition.method and tag the identifiers with @name:

(class_declaration name: (identifier) @name) @definition.class

(method_declaration name: (identifier) @name) @definition.method

The extracted identifier names are used to prefix the method name (noise) with its container's name (Cat), resulting in the fully-qualified name Cat::noise.
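The prefixing step can be sketched as follows. This is an illustration of the idea, not the actual extractor: the Tag class, the qualified_names function, and the byte offsets are all hypothetical, and containment is decided purely by whether one tagged node's range encloses another's.

```python
from dataclasses import dataclass


@dataclass
class Tag:
    kind: str   # capture name, e.g. "definition.class"
    name: str   # text captured by @name
    start: int  # byte offset where the tagged node begins
    end: int    # byte offset where the tagged node ends


def qualified_names(tags, separator="::"):
    """Prefix each tag's name with the names of all tags that enclose it."""
    result = []
    for tag in tags:
        containers = [
            t for t in tags
            if (t.start, t.end) != (tag.start, tag.end)
            and t.start <= tag.start and tag.end <= t.end
        ]
        containers.sort(key=lambda t: t.start)  # outermost container first
        result.append(separator.join([c.name for c in containers] + [tag.name]))
    return result


# Illustrative offsets for the Cat class and its noise method.
tags = [
    Tag("definition.class", "Cat", 0, 70),
    Tag("definition.method", "noise", 21, 68),
]
print(qualified_names(tags))  # ['Cat', 'Cat::noise']
```

Because the method's range lies inside the class's range, its name picks up the Cat prefix.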

However, not all languages define nested items inside their container. For example, Go has methods, but they are defined separately from the struct they belong to:

type Cat struct {}

func (c Cat) Noise() string {
    return "meow"
}

To implement fully-qualified names for languages like Go, GitHub code navigation adds a @scope capture name:

(method_declaration
  receiver: (parameter_list (parameter_declaration type: (type_identifier) @scope))
  name: (field_identifier) @name
) @definition.method

Our extractor uses the @scope capture to create the fully-qualified name Cat.Noise.

If your language supports nested entities that are defined separately, include a @scope capture for best results with GitHub code navigation.
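When a @scope capture is present, building the name is a matter of joining the two captured strings. The sketch below assumes the captures for one match arrive as a dictionary keyed by capture name; that shape, the qualify function, and the default separator are hypothetical and chosen only to reproduce the Go example above.

```python
def qualify(captures, separator="."):
    """Join the @scope capture (if any) with the @name capture."""
    name = captures["name"]
    scope = captures.get("scope")
    return f"{scope}{separator}{name}" if scope else name


# Captures from the Go method above: receiver type as @scope, method as @name.
print(qualify({"scope": "Cat", "name": "Noise"}))  # Cat.Noise

# A plain function has no receiver, so there is no @scope capture.
print(qualify({"name": "main"}))  # main
```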

File a request to add your language

Finally, create an issue in this repository. We will evaluate adding the parser to the code search indexing system.

License

This project is licensed under the terms of the MIT open source license. Please refer to the license for the full terms.

Maintainers

This project is maintained by members of the GitHub code search team.

Support

Please file an issue for support. See SUPPORT.md for details.

Acknowledgments

GitHub code navigation is made possible by the Tree-sitter ecosystem and all the Tree-sitter parser maintainers. Thank you!