The interface of the CINTIL-Treebank Online Searcher is simple.
(1) To help you get started, examples are provided at three levels of difficulty: simple, complex and advanced.
(2) There is a text box where you must type the syntactic pattern you want to search for.
(3) You can tick the option to show the POS tags in the trees.
(4) You can choose the number of results returned (between 1 and 20 sentences).
(5) Once the search results are returned, use the navigation buttons and arrows to browse the next results.
(6) To view the tree, place the cursor on the sentence you want and click.
(7) The syntactic tree corresponding to the sentence will appear below.
(8) The dependency tree corresponding to the sentence will appear beneath the syntactic tree.
Searching by linguistic tags
To start the search by linguistic tags, you must know the tags and syntax for searching.
The tagsets used in the annotation of CINTIL-Treebank are available for quick reference under the tab "Tagsets" at the top of this panel.
The table below presents the syntax and symbols used for searching in the CINTIL-Treebank. In the search by linguistic tags, tags should always be capitalized.
|Pattern||Meaning||Example|
|A << B||A dominates B||NP << N|
|A >> B||A is dominated by B||V >> VP|
|A < B||A immediately dominates B||PP < P|
|A > B||A is immediately dominated by B||CONJ > NP|
|A $ B||A is a sister of B||NP $ CONJ|
|A .. B||A precedes B||P .. POSS-M|
|A . B||A immediately precedes B||CONJ . VP|
|A ,, B||A follows B||CARD ,, VP|
|A , B||A immediately follows B||D-SP , NP-C|
|A <<, B||B is a leftmost descendant of A||VP <<, P|
|A <<- B||B is a rightmost descendant of A||PP <<- N|
|A >>, B||A is a leftmost descendant of B||ADV >>, S|
|A >>- B||A is a rightmost descendant of B||S >>- VP|
|A <, B||B is the first child of A||PP <, P|
|A >, B||A is the first child of B||V >, VP|
|A <- B||B is the last child of A||PP <- NP-C|
|A >- B||A is the last child of B||CARD >- D-SP|
|A <i B||B is the ith child of A||NP-C <1 D-SP|
|A >i B||A is the ith child of B||ADV >1 ADVP|
|A <: B||B is the only child of A||NP-C <: N|
|A >: B||A is the only child of B||N >: NP|
|A <<# B||B is a head of phrase A||D-SP <<# CARD|
|A <# B||B is the immediate head of phrase A||NP <# N|
|@A||All tags that have string A||@NP|
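To make a few of the relations in the table concrete, the following Python sketch checks `A << B`, `A < B` and `A $ B` on a toy tree. It is only an illustration of what the relations mean, not the searcher's implementation: the nested-tuple tree format, the example sentence and the function names are all assumptions made for this example.

```python
# A toy constituency tree as nested tuples: (label, child, child, ...).
# Leaves are plain strings (the words). The tree is hypothetical and is
# not taken from the CINTIL-Treebank.
tree = ("S",
        ("NP", ("D", "the"), ("N", "cat")),
        ("VP", ("V", "sleeps")))

def children(node):
    """Return the child subtrees of a node (words are skipped)."""
    return [c for c in node[1:] if isinstance(c, tuple)]

def subtrees(node):
    """Yield the node itself and every subtree below it."""
    yield node
    for child in children(node):
        yield from subtrees(child)

def immediately_dominates(tree, a, b):
    """A < B: some node labelled a has a child labelled b."""
    return any(n[0] == a and any(c[0] == b for c in children(n))
               for n in subtrees(tree))

def dominates(tree, a, b):
    """A << B: some node labelled a has a descendant labelled b."""
    for n in subtrees(tree):
        if n[0] == a:
            for c in children(n):
                if any(d[0] == b for d in subtrees(c)):
                    return True
    return False

def sisters(tree, a, b):
    """A $ B: two different nodes labelled a and b share a parent."""
    for n in subtrees(tree):
        kids = children(n)
        for i, x in enumerate(kids):
            for j, y in enumerate(kids):
                if i != j and x[0] == a and y[0] == b:
                    return True
    return False
```

In this toy tree, `dominates(tree, "S", "N")` holds while `immediately_dominates(tree, "S", "N")` does not, because N sits two levels below S. This is the difference between `<<` and `<` in the table.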
Searching by regular expressions
It is possible to search with regular expressions. The usual notational conventions are followed:
- Alternatives are introduced by the | (vertical bar) character:
NP|VP matches all parse trees with a noun phrase and all parse trees with a verbal phrase.
- There are three ways of expressing iteration.
The .* (full stop + star) combination matches any character zero or more times, provided the expression is enclosed in slashes:
/NP.*/ matches any parse tree with an NP tag, for example: NP, NP-C, NP-M and NP-SJ.
- To delimit the beginning and end of a tag, you can use the special characters ^ and $. This type of search is useful when you want to find parse trees whose tags combine grammatical tags and semantic roles, provided the expression is enclosed in slashes:
/^NP.*.ARG1$/ matches any parse tree with a tag that begins with NP, has any tag in the middle, and ends with ARG1, which indicates the semantic role of first argument, for example: NP-DO-ARG1 and NP-SJ-ARG1.
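The two patterns above can be tried out offline with Python's `re` module, as in the sketch below. This assumes that the searcher's regular expressions behave like Python's for these simple cases, which may not hold in every detail. The tag list is assembled from the examples in this section, and the helper name `matching_tags` is hypothetical.

```python
import re

# Tags taken from the examples in this section, plus VP and PP as
# non-matching controls.
tags = ["NP", "NP-C", "NP-M", "NP-SJ", "NP-DO-ARG1", "NP-SJ-ARG1", "VP", "PP"]

def matching_tags(pattern, tags):
    """Return the tags in which the regular expression is found."""
    regex = re.compile(pattern)
    return [t for t in tags if regex.search(t)]

# NP followed by any characters: every NP-based tag is selected.
print(matching_tags(r"NP.*", tags))

# Anchored pattern: starts with NP and ends with ARG1.
print(matching_tags(r"^NP.*.ARG1$", tags))
```

Note that the single full stop before ARG1 stands for any one character, so it is what matches the hyphen in NP-DO-ARG1.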
Searching by words
The search can also be performed on the leaves of the trees, where the words are.
To find any word, type it in the text box. For example:
Click the "Search" button and all sentences where the word occurs will be returned.
The search by words depends on how they are spelled in the treebank. The word can be written in either upper or lower case.
To improve the search we can try words with different spellings. For example:
Searching by sentence identifier
All sentences in the CINTIL-Treebank have a unique number identifier. The identifier is shown when the sentence is returned on the screen.
The user can use this number to find sentences directly in the CINTIL-Treebank. To search for a sentence by its identifier, the user must make a note of the number returned with the sentence. The search uses the pattern "ID:". For example,
ID:102 will select the sentence with identifier 102 in the CINTIL-Treebank. To visualize the parse tree, just click on the sentence.
Searching for non-matching trees
The CINTIL-Treebank Searcher provides an option to find parse trees that do not contain a given pattern.
To use this search option, type the word "INV" followed by a colon ":" before the pattern. The parse trees in which the pattern is not found are then returned as the result. For example,
INV:VP will select all sentences that do not have verbal phrases.
To visualize the parse tree just click on the sentence.
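The meaning of the "ID:" and "INV:" prefixes can be sketched in Python as follows. This is a simplified model for illustration only, not the searcher's code: the two-sentence corpus, its identifiers and the function names are hypothetical, and only single-tag patterns are handled.

```python
# Hypothetical corpus: identifier -> tree as nested tuples
# (label, child, child, ...). These are not real treebank sentences.
corpus = {
    101: ("S", ("NP", ("N", "Maria")), ("VP", ("V", "sorriu"))),
    102: ("NP", ("D", "o"), ("N", "livro")),
}

def labels(node):
    """Collect every node label in a tree; words (strings) are skipped."""
    found = {node[0]}
    for child in node[1:]:
        if isinstance(child, tuple):
            found |= labels(child)
    return found

def run_query(query, corpus):
    """Return the sorted identifiers selected by a simplified query."""
    if query.startswith("ID:"):
        # ID:n selects the sentence whose identifier is n, if it exists.
        wanted = int(query[len("ID:"):])
        return [wanted] if wanted in corpus else []
    if query.startswith("INV:"):
        # INV:TAG selects the trees in which TAG does not occur.
        tag = query[len("INV:"):]
        return sorted(i for i, t in corpus.items() if tag not in labels(t))
    # A plain tag selects the trees in which it occurs.
    return sorted(i for i, t in corpus.items() if query in labels(t))
```

With this toy corpus, `run_query("VP", corpus)` and `run_query("INV:VP", corpus)` select complementary sets of sentences, which is the behaviour described above for the non-matching search.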