From f5c4671bfbad96bf346bd7e9a21fc4317b4959df Mon Sep 17 00:00:00 2001 From: Indrajith K L Date: Sat, 3 Dec 2022 17:00:20 +0530 Subject: Adds most of the tools --- ctags/man/ctags-client-tools.7.html | 869 ++++++++++++++++++++++++++++++++++++ 1 file changed, 869 insertions(+) create mode 100644 ctags/man/ctags-client-tools.7.html (limited to 'ctags/man/ctags-client-tools.7.html') diff --git a/ctags/man/ctags-client-tools.7.html b/ctags/man/ctags-client-tools.7.html new file mode 100644 index 0000000..1ddb80c --- /dev/null +++ b/ctags/man/ctags-client-tools.7.html @@ -0,0 +1,869 @@ + + + + + + +ctags-client-tools + + + +
+ +

ctags-client-tools

+

Hints for developing a tool using ctags command and tags output

+ +++ + + + + + + + +
Version:5.9.0
Manual group:Universal Ctags
Manual section:7
+
+

SYNOPSIS

+
+
ctags [options] [file(s)]
+
etags [options] [file(s)]
+
+
+
+

DESCRIPTION

+

Client tool means a tool running the ctags command +and/or reading a tags file generated by ctags command. +This man page gathers hints for people who develop client tools.

+
+
+

PSEUDO-TAGS

+

Pseudo-tags, stored in a tag file, indicate how +ctags generated the tags file: whether the +tags file is sorted or not, which version of tags file format is used, +the name of tags generator, and so on. The opposite term for +pseudo-tags is regular-tags. A regular-tag is for a language +object in an input file. A pseudo-tag is for the tags file +itself. Client tools may use pseudo-tags as reference for processing +regular-tags.

+

A pseudo-tag is stored in a tags file in the same format as +regular-tags as described in tags(5), except that pseudo-tag names +are prefixed with "!_". For the general information about +pseudo-tags, see "TAG FILE INFORMATION" in tags(5).

+

An example of a pseudo tag:

+
+!_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/
+
+

The value, "2", associated with the pseudo tag "TAG_PROGRAM_NAME", is +used in the field for input file. The description, "Derived from +Exuberant Ctags", is used in the field for pattern.

+

Universal Ctags extends the naming scheme of the classical pseudo-tags +available in Exuberant Ctags for emitting language specific +information as pseudo tags:

+
+!_{pseudo-tag-name}!{language-name}     {associated-value}      /{description}/
+
+

The language-name is appended to the pseudo-tag name with a separator, "!".

+

An example of pseudo tag with a language suffix:

+
+!_TAG_KIND_DESCRIPTION!C        f,function      /function definitions/
+
+

This pseudo-tag says "the function kind of C language is enabled +when generating this tags file." --pseudo-tags is the option for +enabling/disabling individual pseudo-tags. When enabling/disabling a +pseudo tag with the option, specify the tag name only +"TAG_KIND_DESCRIPTION", without the prefix ("!_") or the suffix ("!C").

+
+

Options for Pseudo-tags

+
+
--extras=+p (or --extras=+{pseudo})
+

Forces writing pseudo-tags.

+

ctags emits pseudo-tags by default when writing tags +to a regular file (e.g. "tags'.) However, when specifying -o - +or -f - for writing tags to standard output, +ctags doesn't emit pseudo-tags. --extras=+p or +--extras=+{pseudo} will force pseudo-tags to be written.

+
+
--list-pseudo-tags
+

Lists available types of pseudo-tags and shows whether they are enabled or disabled.

+

Running ctags with --list-pseudo-tags option +lists available pseudo-tags. Some of pseudo-tags newly introduced +in Universal Ctags project are disabled by default. Use +--pseudo-tags=... to enable them.

+
+
--pseudo-tags=[+|-]names|*
+

Specifies a list of pseudo-tag types to include in the output.

+

The parameters are a set of pseudo tag names. Valid pseudo tag names +can be listed with --list-pseudo-tags. Surround each name in the set +with braces, like "{TAG_PROGRAM_AUTHOR}". You don't have to include the "!_" +pseudo tag prefix when specifying a name in the option argument for --pseudo-tags= +option.

+

pseudo-tags don't have a notation using one-letter flags.

+

If a name is preceded by either the '+' or '-' characters, that +tags's effect has been added or removed. Otherwise the names replace +any current settings. All entries are included if '*' is given.

+
+
--fields=+E (or --fields=+{extras})
+

Attach "extras:pseudo" field to pseudo-tags.

+

An example of pseudo tags with the field:

+
+!_TAG_PROGRAM_NAME      Universal Ctags /Derived from Exuberant Ctags/  extras:pseudo
+
+

If the name of a normal tag in a tag file starts with "!_", a +client tool cannot distinguish whether the tag is a regular-tag or +pseudo-tag. The fields attached with this option help the tool +distinguish them.

+
+
+
+
+

List of notable pseudo-tags

+

Running ctags with --list-pseudo-tags option lists available types +of pseudo-tags with short descriptions. This subsection shows hints +for using notable ones.

+
+
TAG_EXTRA_DESCRIPTION (new in Universal Ctags)
+

Indicates the names and descriptions of enabled extras:

+
+!_TAG_EXTRA_DESCRIPTION       {extra-name}    /description/
+!_TAG_EXTRA_DESCRIPTION!{language-name}       {extra-name}    /description/
+
+

If your tool relies on some extra tags (extras), refer to +the pseudo-tags of this type. A tool can reject the tags file that +doesn't include expected extras, and raise an error in an early +stage of processing.

+

An example of the pseudo-tags:

+
+$ ctags --extras=+p --pseudo-tags='{TAG_EXTRA_DESCRIPTION}' -o - input.c
+!_TAG_EXTRA_DESCRIPTION       anonymous       /Include tags for non-named objects like lambda/
+!_TAG_EXTRA_DESCRIPTION       fileScope       /Include tags of file scope/
+!_TAG_EXTRA_DESCRIPTION       pseudo  /Include pseudo tags/
+!_TAG_EXTRA_DESCRIPTION       subparser       /Include tags generated by subparsers/
+...
+
+

A client tool can know "{anonymous}", "{fileScope}", "{pseudo}", +and "{subparser}" extras are enabled from the output.

+
+
TAG_FIELD_DESCRIPTION (new in Universal Ctags)
+

Indicates the names and descriptions of enabled fields:

+
+!_TAG_FIELD_DESCRIPTION       {field-name}    /description/
+!_TAG_FIELD_DESCRIPTION!{language-name}       {field-name}    /description/
+
+

If your tool relies on some fields, refer to the pseudo-tags of +this type. A tool can reject a tags file that doesn't include +expected fields, and raise an error in an early stage of +processing.

+

An example of the pseudo-tags:

+
+$ ctags --fields-C=+'{macrodef}' --extras=+p --pseudo-tags='{TAG_FIELD_DESCRIPTION}' -o - input.c
+!_TAG_FIELD_DESCRIPTION       file    /File-restricted scoping/
+!_TAG_FIELD_DESCRIPTION       input   /input file/
+!_TAG_FIELD_DESCRIPTION       name    /tag name/
+!_TAG_FIELD_DESCRIPTION       pattern /pattern/
+!_TAG_FIELD_DESCRIPTION       typeref /Type and name of a variable or typedef/
+!_TAG_FIELD_DESCRIPTION!C     macrodef        /macro definition/
+...
+
+

A client tool can know "{file}", "{input}", "{name}", "{pattern}", +and "{typeref}" fields are enabled from the output. +The fields are common in languages. In addition to the common fields, +the tool can known "{macrodef}" field of C language is also enabled.

+
+
TAG_FILE_ENCODING (new in Universal Ctags)
+
TBW
+
TAG_FILE_FORMAT
+
See also tags(5).
+
TAG_FILE_SORTED
+
See also tags(5).
+
TAG_KIND_DESCRIPTION (new in Universal Ctags)
+

Indicates the names and descriptions of enabled kinds:

+
+!_TAG_KIND_DESCRIPTION!{language-name}        {kind-letter},{kind-name}       /description/
+
+

If your tool relies on some kinds, refer to the pseudo-tags of +this type. A tool can reject the tags file that doesn't include +expected kinds, and raise an error in an early stage of +processing.

+

Kinds are language specific, so a language name is always +appended to the tag name as suffix.

+

An example of the pseudo-tags:

+
+$ ctags --extras=+p --kinds-C=vfm --pseudo-tags='{TAG_KIND_DESCRIPTION}' -o - input.c
+!_TAG_KIND_DESCRIPTION!C      f,function      /function definitions/
+!_TAG_KIND_DESCRIPTION!C      m,member        /struct, and union members/
+!_TAG_KIND_DESCRIPTION!C      v,variable      /variable definitions/
+...
+
+

A client tool can know "{function}", "{member}", and "{variable}" +kinds of C language are enabled from the output.

+
+
TAG_KIND_SEPARATOR (new in Universal Ctags)
+
TBW
+
TAG_OUTPUT_EXCMD (new in Universal Ctags)
+
Indicates the specified type of EX command with --excmd option.
+
TAG_OUTPUT_FILESEP (new in Universal Ctags)
+
TBW
+
TAG_OUTPUT_MODE (new in Universal Ctags)
+
TBW
+
TAG_PATTERN_LENGTH_LIMIT (new in Universal Ctags)
+
TBW
+
TAG_PROC_CWD (new in Universal Ctags)
+

Indicates the working directory of ctags during processing.

+

This pseudo-tag helps a client tool solve the absolute paths for +the input files for tag entries even when they are tagged with +relative paths.

+

An example of the pseudo-tags:

+
+$ cat tags
+!_TAG_PROC_CWD        /tmp/   //
+main  input.c /^int main (void) { return 0; }$/;"     f       typeref:typename:int
+...
+
+

From the regular tag for "main", the client tool can know the +"main" is at "input.c". However, it is a relative path. So if the +directory where ctags run and the directory +where the client tool runs are different, the client tool cannot +find "input.c" from the file system. In that case, +TAG_PROC_CWD gives the tool a hint; "input.c" may be at "/tmp".

+
+
TAG_PROGRAM_NAME
+
TBW
+
TAG_ROLE_DESCRIPTION (new in Universal Ctags)
+

Indicates the names and descriptions of enabled roles:

+
+!_TAG_ROLE_DESCRIPTION!{language-name}!{kind-name}    {role-name}     /description/
+
+

If your tool relies on some roles, refer to the pseudo-tags of +this type. Note that a role owned by a disabled kind is not listed +even if the role itself is enabled.

+
+
+
+
+
+

REDUNDANT-KINDS

+

TBW

+
+
+

MULTIPLE-LANGUAGES FOR AN INPUT FILE

+

Universal ctags can run multiple parsers. +That means a parser, which supports multiple parsers, may output tags for +different languages. language/l field can be used to show the language +for each tag.

+
+$ cat /tmp/foo.html
+<html>
+<script>var x = 1</script>
+<h1>title</h1>
+</html>
+$ ./ctags -o - --extras=+g /tmp/foo.html
+title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h
+x       /tmp/foo.html   /var x = 1/;"   v
+$ ./ctags -o - --extras=+g --fields=+l /tmp/foo.html
+title   /tmp/foo.html   /^  <h1>title<\/h1>$/;" h       language:HTML
+x       /tmp/foo.html   /var x = 1/;"   v       language:JavaScript
+
+
+
+

UTILIZING READTAGS

+

See readtags(1) to know how to use readtags. This section is for discussing +some notable topics for client tools.

+
+

Build Filter/Sorter Expressions

+

Certain escape sequences in expressions are recognized by readtags. For +example, when searching for a tag that matches a\?b, if using a filter +expression like '(eq? $name "a\?b")', since \? is translated into a +single ? by readtags, it actually searches for a?b.

+

Another problem is if a single quote appear in filter expressions (which is +also wrapped by single quotes), it terminates the expression, producing broken +expressions, and may even cause unintended shell injection. Single quotes can +be escaped using '"'"'.

+

So, client tools need to:

+
    +
  • Replace \ by \\
  • +
  • Replace ' by '"'"'
  • +
+

inside the expressions. If the expression also contains strings, " in the +strings needs to be replaced by \".

+

Client tools written in Lisp could build the expression using lists. prin1 +(in Common Lisp style Lisps) and write (in Scheme style Lisps) can +translate the list into a string that can be directly used. For example, in +EmacsLisp:

+
+

System Message: WARNING/2 (ctags-client-tools.7.rst, line 308)

+

Cannot analyze code. No Pygments lexer found for "EmacsLisp".

+
+.. code-block:: EmacsLisp
+
+   (let ((name "hi"))
+     (prin1 `(eq? $name ,name)))
+   => "(eq\\? $name "hi")"
+
+
+
+

The "?" is escaped, and readtags can handle it. Scheme style Lisps should do +proper escaping so the expression readtags gets is just the expression passed +into write. Common Lisp style Lisps may produce unrecognized escape +sequences by readtags, like \#. Readtags provides some aliases for these +Lisps:

+
    +
  • Use true for #t.
  • +
  • Use false for #f.
  • +
  • Use nil or () for ().
  • +
  • Use (string->regexp "PATTERN") for #/PATTERN/. Use +(string->regexp "PATTERN" :case-fold true) for #/PATTERN/i. Notice +that string->regexp doesn't require escaping "/" in the pattern.
  • +
+

Notice that even when the client tool uses this method, ' still needs to be +replaced by '"'"' to prevent broken expressions and shell injection.

+

Another thing to notice is that missing fields are represented by #f, and +applying string operators to them will produce an error. You should always +check if a field is missing before applying string operators. See the +"Filtering" section in readtags(1) to know how to do this. Run "readtags -H +filter" to see which operators take string arguments.

+
+
+

Parse Readtags Output

+

In the output of readtags, tabs can appear in all field values (e.g., the tag +name itself could contain tabs), which makes it hard to split the line into +fields. Client tools should use the -E option, which keeps the escape +sequences in the tags file, so the only field that could contain tabs is the +pattern field.

+

The pattern field could:

+
    +
  • Use a line number. It will look like number;" (e.g. 10;").
  • +
  • Use a search pattern. It will look like /pattern/;" or ?pattern?;". +Notice that the search pattern could contain tabs.
  • +
  • Combine these two, like number;/pattern/;" or number;?pattern?;".
  • +
+

These are true for tags files using extended format, which is the default one. +The legacy format (i.e. --format=1) doesn't include the semicolons. It's +old and barely used, so we won't discuss it here.

+

Client tools could split the line using the following steps:

+
    +
  • Find the first 2 tabs in the line, so we get the name and input field.
  • +
  • From the 2nd tab:
      +
    • If a / follows, then the pattern delimiter is /.
    • +
    • If a ? follows, then the pattern delimiter is ?.
    • +
    • If a number follows, then:
        +
      • If a ;/ follows the number, then the delimiter is /.
      • +
      • If a ;? follows the number, then the delimiter is ?.
      • +
      • If a ;" follows the number, then the field uses only line number, and +there's no pattern delimiter (since there's no regex pattern). In this +case the pattern field ends at the 3rd tab.
      • +
      +
    • +
    +
  • +
  • After the opening delimiter, find the next unescaped pattern delimiter, and +that's the closing delimiter. It will be followed by ;" and then a tab. +That's the end of the pattern field. By "unescaped pattern delimiter", we +mean there's an even number (including 0) of backslashes before it.
  • +
  • From here, split the rest of the line into fields by tabs.
  • +
+

Then, the escape sequences in fields other than the pattern field should be +translated. See "Proposal" in tags(5) to know about all the escape sequences.

+
+
+

Make Use of the Pattern Field

+

The pattern field specifies how to find a tag in its source file. The code +generating this field seems to have a long history, so there are some pitfalls +and it's a bit hard to handle. A client tool could simply require the line: +field and jump to the line it specifies, to avoid using the pattern field. But +anyway, we'll discuss how to make the best use of it here.

+

You should take the words here merely as suggestions, and not standards. A +client tool could definitely develop better (or simpler) ways to use the +pattern field.

+

From the last section, we know the pattern field could contain a line number +and a search pattern. When it only contains the line number, handling it is +easy: you simply go to that line.

+

The search pattern resembles an EX command, but as we'll see later, it's +actually not a valid one, so some manual work are required to process it.

+

The search pattern could look like /pat/, called "forward search pattern", +or ?pat?, called "backward search pattern". Using a search pattern means +even if the source file is updated, as long as the part containing the tag +doesn't change, we could still locate the tag correctly by searching.

+

When the pattern field only contains the search pattern, you just search for +it. The search direction (forward/backward) doesn't matter, as it's decided +solely by whether the -B option is enabled, and not the actual context. You +could always start the search from say the beginning of the file.

+

When both the search pattern and the line number are presented, you could make +good use of the line number, by going to the line first, then searching for the +nearest occurence of the pattern. A way to do this is to search both forward +and backward for the pattern, and when there is a occurence on both sides, go +to the nearer one.

+

What's good about this is when there are multiple identical lines in the source +file (e.g. the COMMON block in Fortran), this could help us find the correct +one, even after the source file is updated and the tag position is shifted by a +few lines.

+

Now let's discuss how to search for the pattern. After you trim the / or +? around it, the pattern resembles a regex pattern. It should be a regex +pattern, as required by being a valid EX command, but it's actually not, as +you'll see below.

+

It could begin with a ^, which means the pattern starts from the beginning +of a line. It could also end with an unescaped $ which means the pattern +ends at the end of a line. Let's keep this information, and trim them too.

+

Now the remaining part is the actual string containing the tag. Some characters +are escaped:

+
    +
  • \.
  • +
  • $, but only at the end of the string.
  • +
  • /, but only in forward search patterns.
  • +
  • ?, but only in backward search patterns.
  • +
+

You need to unescape these to get the literal string. Now you could convert +this literal string to a regexp that matches it (by escaping, like +re.escape in Python or regexp-quote in Elisp), and assemble it with +^ or $ if the pattern originally has it, and finally search for the tag +using this regexp.

+
+
+

Remark: About a Previous Format of the Pattern Field

+

In some earlier versions of Universal Ctags, the line number in the pattern +field is the actual line number minus one, for forward search patterns; or plus +one, for backward search patterns. The idea is to resemble an EX command: you +go to the line, then search forward/backward for the pattern, and you can +always find the correct one. But this denies the purpose of using a search +pattern: to tolerate file updates. For example, the tag is at line 50, +according to this scheme, the pattern field should be:

+
+49;/pat/;"
+
+

Then let's assume that some code above are removed, and the tag is now at +line 45. Now you can't find it if you search forward from line 49.

+

Due to this reason, Universal Ctags turns to use the actual line number. A +client tool could distinguish them by the TAG_OUTPUT_EXCMD pseudo tag, it's +"combine" for the old scheme, and "combineV2" for the present scheme. But +probably there's no need to treat them differently, since "search for the +nearest occurence from the line" gives good result on both schemes.

+
+
+
+

JSON OUTPUT

+

Universal Ctags supports JSON (strictly +speaking JSON Lines) output format if the +ctags executable is built with libjansson. JSON output goes to +standard output by default.

+
+

Format

+

Each JSON line represents a tag.

+
+$ ctags --extras=+p --output-format=json --fields=-s input.py
+{"_type": "ptag", "name": "JSON_OUTPUT_VERSION", "path": "0.0", "pattern": "in development"}
+{"_type": "ptag", "name": "TAG_FILE_SORTED", "path": "1", "pattern": "0=unsorted, 1=sorted, 2=foldcase"}
+...
+{"_type": "tag", "name": "Klass", "path": "/tmp/input.py", "pattern": "/^class Klass:$/", "language": "Python", "kind": "class"}
+{"_type": "tag", "name": "method", "path": "/tmp/input.py", "pattern": "/^    def method(self):$/", "language": "Python", "kind": "member", "scope": "Klass", "scopeKind": "class"}
+...
+
+

A key not starting with _ is mapped to a field of ctags. +"--output-format=json --list-fields" options list the fields.

+

A key starting with _ represents meta information of the JSON +line. Currently only _type key is used. If the value for the key +is tag, the JSON line represents a normal tag. If the value is +ptag, the line represents a pseudo-tag.

+

The output format can be changed in the +future. JSON_OUTPUT_VERSION pseudo-tag provides a change +client-tools to handle the changes. Current version is "0.0". A +client-tool can extract the version with path key from the +pseudo-tag.

+

The JSON output format is newly designed and has no limitation found +in the default tags file format.

+
    +
  • The values for kind key are represented in long-name flags. +No one-letter is here.
  • +
  • Scope names and scope kinds have distinguished keys: scope and scopeKind. +They are combined in the default tags file format.
  • +
+
+
+

Data type used in a field

+

Values for the most of all keys are represented in JSON string type. +However, some of them are represented in string, integer, and/or boolean type.

+

"--output-format=json --list-fields" options show What kind of data type +used in a field of JSON.

+
+$ ctags --output-format=json --list-fields
+#LETTER NAME           ENABLED LANGUAGE         JSTYPE FIXED DESCRIPTION
+F       input          yes     NONE             s--    no    input file
+...
+P       pattern        yes     NONE             s-b    no    pattern
+...
+f       file           yes     NONE             --b    no    File-restricted scoping
+...
+e       end            no      NONE             -i-    no    end lines of various items
+...
+
+

JSTYPE column shows the data types.

+
+
's'
+
string
+
'i'
+
integer
+
'b'
+
boolean (true or false)
+
+

For an example, the value for pattern field of ctags takes a string or a boolean value.

+
+
+
+

SEE ALSO

+

ctags(1), ctags-lang-python(7), ctags-incompatibilities(7), tags(5), readtags(1)

+
+
+ + -- cgit v1.2.3