intent package

Subpackages

Submodules

intent.subcommands module

intent.trees module

class intent.trees.Count[source]

Bases: object

inc(n=1)[source]
val()[source]
class intent.trees.DepEdge(head=None, dep=None, type=None, pos=None)[source]

Bases: object

Container object for holding the head/child, and dependency type.

class intent.trees.DepTree(label, children=(), id=None, type=None, word_index=None, pos=None)[source]

Bases: intent.trees.IdTree

copy()[source]
delete(promote=True)[source]

Parameters:propagate – Whether or not to delete “empty” nonterminals. Defaults to false, since DepTrees don’t have the same notion of nonterminal/terminal.
find_heads(term)[source]
find_index(idx)[source]
find_terminal(term)[source]
findall_indices(idx)[source]
classmethod fromstring(tree_string, id_base='', stype='stanford', **kwargs)[source]

Read a dependency tree from a string in one of several formats.

Parameters:
  • tree_string (str) – String to parse
  • id_base (str) – ID string on which to base the IDs in this tree.
  • stype – The format of the string to parse...
pos_list()[source]

Return type:list[POSToken]

classmethod root()[source]
similar(other)[source]
span()[source]
stanford_str(separator=' ')[source]

Return a string representation in the Stanford parser format.

structurally_eq(other)[source]
subtrees(filter=None, include_root=False)[source]

Override the parent class’s subtrees finder so that, by default, the root is not included.

Parameters:
  • filter
  • include_root
Returns:

list of DepTrees

Return type:

list[DepTree]

to_conll(lowercase=False, clean_token=False, match_punc=False, multiple_heads=False, unk_pos='_')[source]

Return a string in CoNLL format (see http://ilk.uvt.nl/conll/, under “Data Format”).
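
To give a feel for the target format, the sketch below writes CoNLL-X style rows: ten tab-separated columns (ID, FORM, LEMMA, CPOSTAG, POSTAG, FEATS, HEAD, DEPREL, PHEAD, PDEPREL), with `_` standing in for unknown values. The `Token` record and the `rows_to_conll` helper are hypothetical and not part of intent.trees, and which columns the real method fills in is an assumption.

```python
from collections import namedtuple

# Hypothetical token record, used only for this illustration.
Token = namedtuple("Token", "index form pos head deprel")


def rows_to_conll(tokens, lowercase=False, unk_pos="_"):
    """Render tokens as CoNLL-X rows, one token per line."""
    lines = []
    for tok in tokens:
        form = tok.form.lower() if lowercase else tok.form
        pos = tok.pos or unk_pos  # fall back to the placeholder tag
        cols = [
            str(tok.index),     # ID
            form,               # FORM
            "_",                # LEMMA (not tracked in this sketch)
            pos,                # CPOSTAG
            pos,                # POSTAG
            "_",                # FEATS
            str(tok.head),      # HEAD (0 means the root)
            tok.deprel or "_",  # DEPREL
            "_",                # PHEAD
            "_",                # PDEPREL
        ]
        lines.append("\t".join(cols))
    return "\n".join(lines)


tokens = [
    Token(1, "The", "DT", 2, "det"),
    Token(2, "dog", "NN", 3, "nsubj"),
    Token(3, "barks", None, 0, "root"),  # no POS tag, so unk_pos is used
]
conll = rows_to_conll(tokens, lowercase=True)
print(conll)
```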

to_indices()[source]

Return a representation of the deptree as just a list of (head, child) indices.

word_index
exception intent.trees.DepTreeProjectionError[source]

Bases: intent.trees.TreeProjectionError

class intent.trees.IdTree(label, children=None, id=None)[source]

Bases: nltk.tree.ParentedTree

This is a tree that inherits from NLTK’s tree implementation, but assigns IDs that can be used in writing out the Xigt format.

ancestors()[source]

Return type:list[DepTree]

assign_ids(id_base='')[source]

Assign IDs to the elements of the tree, using the “id_base” string as a leading element. Example: an id_base of ‘ds’ would result in ‘ds1’, ‘ds2’, etc.

Parameters:id_base (str) – base from which to build the IDs
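
The ID scheme can be sketched without NLTK on a nested `(label, children)` tuple. This is only an illustration: the `assign_ids` function below is a hypothetical stand-in, and numbering the nodes in pre-order starting from the root is an assumption about the traversal order.

```python
from itertools import count


def assign_ids(tree, id_base=""):
    """Return {id: label} for a nested (label, [children]) tree.

    Nodes are numbered in pre-order (an assumption for this sketch).
    """
    counter = count(1)
    ids = {}

    def visit(node):
        label, children = node
        ids["{}{}".format(id_base, next(counter))] = label
        for child in children:
            visit(child)

    visit(tree)
    return ids


tree = ("S", [("NP", [("DT", []), ("NN", [])]), ("VP", [("VBZ", [])])])
ids = assign_ids(tree, id_base="ds")
print(ids)
```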
copy()[source]

Perform a deep copy

Return type:IdTree
delete(propagate=True, promote=False)[source]

Delete self from parent.

Parameters:
  • propagate (bool) – If true, then delete parents that are made empty by this deletion.
  • promote (bool) – If true, then promote the children of this node to be children of the parent.
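
The interplay of the two flags can be illustrated on a minimal parented tree. The `Node` class below is a hypothetical stand-in for IdTree, and the details (promoted children take the deleted node's position; propagation climbs until a non-empty parent is found) are assumptions rather than the library's confirmed behaviour.

```python
class Node:
    """Minimal parented tree node; a stand-in for IdTree, not the real class."""

    def __init__(self, label, children=()):
        self.label = label
        self.parent = None
        self.children = []
        for child in children:
            self.add(child)

    def add(self, child, position=None):
        child.parent = self
        if position is None:
            self.children.append(child)
        else:
            self.children.insert(position, child)

    def delete(self, propagate=True, promote=False):
        parent = self.parent
        if parent is None:
            return  # the root has no parent to be deleted from
        pos = parent.children.index(self)
        parent.children.pop(pos)
        self.parent = None
        if promote:
            # Children take the deleted node's place, keeping their order.
            for offset, child in enumerate(list(self.children)):
                parent.add(child, pos + offset)
            self.children = []
        if propagate and not parent.children:
            # The parent was made empty by this deletion, so remove it too.
            parent.delete(propagate=True, promote=False)


nn = Node("NN")
np_node = Node("NP", [nn])
vp = Node("VP", [Node("VBZ")])
root = Node("S", [np_node, vp])

nn.delete(propagate=True)   # NP becomes empty and is removed as well
after_propagate = [c.label for c in root.children]

vp.delete(promote=True)     # VBZ moves up to become a child of S
after_promote = [c.label for c in root.children]
```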
depth()[source]
find(filter)[source]
find_index(idx)[source]
find_start_index(idx)[source]
find_stop_index(idx)[source]
findall(filter)[source]
classmethod fromstring(tree_string, id_base='', **kwargs)[source]
Parameters:
  • tree_string – String of a phrase structure tree in PTB format.
  • id_base
  • kwargs

Return type:IdTree

indices_labels()[source]

Iterate through the tree, and return the list of (label, head, child) tuples.

insert_by_span(t)[source]
insert_sibling(t)[source]
is_preterminal()[source]

Check whether or not the given node is a preterminal (its height should be == 2)
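
The height test follows NLTK's convention, in which a tree whose children are all leaves has height 2. A small sketch of that convention on nested `(label, children)` tuples, with plain strings as leaves; the `height` and `is_preterminal` helpers here are illustrative and independent of the real classes.

```python
def height(node):
    """NLTK-style height: a leaf counts as 1, a tree as 1 + its tallest child."""
    if isinstance(node, str):
        return 1
    _label, children = node
    return 1 + max((height(child) for child in children), default=0)


def is_preterminal(node):
    """A preterminal is a tree (not a leaf) whose height is exactly 2."""
    return not isinstance(node, str) and height(node) == 2


dt = ("DT", ["the"])
np_tree = ("NP", [dt, ("NN", ["dog"])])
```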

lprint()[source]
merge(i, j, unify_children=True)[source]

Merge the node indices i and j

Parameters:
  • i (int) –
  • j (int) –
nonterminals()[source]
preterminals()[source]
promote()[source]

Delete this node and promote its children

replace(t)[source]

Replace this node in its parent with t

Parameters:t – The tree to replace this instance with
similar(other)[source]

Test whether two trees are equivalent, ignoring labels

Parameters:other
Returns:
span(caller=None)[source]

Return the span of indices covered by this node.

spanlength()[source]
swap(i, j)[source]

Swap the node indices i and j.

Parameters:
  • i (int) –
  • j (int) –

tagged_words()[source]
exception intent.trees.NoAlignmentProvidedError[source]

Bases: intent.trees.TreeProjectionError

exception intent.trees.PhraseTreeError[source]

Bases: intent.trees.TreeError

class intent.trees.Terminal(label, index=None)[source]

Bases: object

copy()[source]
similar(other)[source]
span(caller=None)[source]
exception intent.trees.TreeError[source]

Bases: Exception

exception intent.trees.TreeMergeError[source]

Bases: intent.trees.TreeProjectionError

exception intent.trees.TreeProjectionError[source]

Bases: Exception

intent.trees.aln_indices(tokens)[source]
intent.trees.build_dep_edges(edges)[source]
intent.trees.contains(t, s_sup, s_sub)[source]
intent.trees.fix_tree_parents(t, preceding_parent=None)[source]

For some reason, node parents get broken during tree projection reordering. This function walks the tree and reassigns each node’s parent to reflect the top-down view.

Parameters:t – Input Tree
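
The idea of a top-down repair can be sketched as a simple recursive walk. The `Node` class and `fix_parents` function are hypothetical stand-ins; the real function operates on the package's own tree classes.

```python
class Node:
    """Bare tree node with an explicit parent pointer (illustration only)."""

    def __init__(self, label, children=()):
        self.label = label
        self.children = list(children)
        self.parent = None


def fix_parents(node, preceding_parent=None):
    """Reassign parent pointers so they agree with the top-down child lists."""
    node.parent = preceding_parent
    for child in node.children:
        fix_parents(child, node)


nn = Node("NN")
np_node = Node("NP", [nn])
root = Node("S", [np_node])

nn.parent = root          # simulate a parent pointer broken by reordering
fix_parents(root)         # NN's parent is NP again
```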
intent.trees.get_ancestors(t)[source]
intent.trees.get_dep_edges(string, stype='stanford')[source]
Parameters:string – A string representation of the dependency tree produced by the Stanford parser.
Returns:List of DepEdges
Return type:list[DepEdge]
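
For orientation, Stanford typed dependencies are usually printed one per line as `type(head-index, dependent-index)`. The sketch below parses that plain form with a regular expression. The `Edge` tuple and `parse_stanford_edges` are hypothetical, and the sketch makes no claim about how get_dep_edges itself is implemented or which other stype values it accepts.

```python
import re
from collections import namedtuple

# Hypothetical edge record; the real DepEdge class has its own fields.
Edge = namedtuple("Edge", "type head head_index dep dep_index")

# type(head-N, dep-M); the word parts may themselves contain hyphens or commas.
EDGE_RE = re.compile(
    r"^(?P<type>[^(\s]+)\((?P<head>.+)-(?P<hi>\d+), (?P<dep>.+)-(?P<di>\d+)\)$"
)


def parse_stanford_edges(string):
    """Parse lines such as 'nsubj(barks-3, dog-2)' into Edge tuples."""
    edges = []
    for line in string.splitlines():
        m = EDGE_RE.match(line.strip())
        if m:  # silently skip blank or unrecognised lines
            edges.append(Edge(m.group("type"),
                              m.group("head"), int(m.group("hi")),
                              m.group("dep"), int(m.group("di"))))
    return edges


text = "det(dog-2, The-1)\nnsubj(barks-3, dog-2)\nroot(ROOT-0, barks-3)"
edges = parse_stanford_edges(text)
```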
intent.trees.lowest_common_ancestor(t1, t2)[source]
intent.trees.paren_level_contents(string, f=<function <lambda>>, i=None)[source]

Tail-recursive way to parse a matched set of parens

Parameters:
  • string (str) –
  • f
  • init_open_parens
intent.trees.project_ds(src_t, tgt_w, aln)[source]
Our DS projection algorithm is similar to the projection algorithms described in (Hwa et al. 2002) and (Quirk et al. 2005). It has four steps:

  1. Copy the English DS, and remove all the unaligned English words from the DS.

  2. Replace each English word in the DS with the corresponding target words. If an English word x aligns to several target words, make several copies of the node for x, one copy for each such target word. The copies will all be siblings in the DS. If a target word aligns to multiple English words, after Step 2 the target word will have several copies in the resulting DS.

  3. Keep only the copy that is closest to the root and remove all the other copies.

  4. Attach unaligned target words to the DS using the heuristics described in (Quirk et al. 2005).

Parameters:
  • src_t (DepTree) – Source (English) tree to project from
  • tgt_w (RGWordTier) – Set of target (non-English) words to use for projection
  • aln (Alignment) – list of [(src, tgt)] index pairs (src == English)
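
Steps 1 to 3 can be sketched on plain index maps. This is a simplified illustration, not the library's implementation: `project_ds_sketch` is a hypothetical function, the English tree is given as a `{child: head}` map with 0 as the root, each English word is assumed to align to at most one target word, children of a removed unaligned word are assumed to attach to its nearest aligned ancestor, and Step 4 is omitted.

```python
def project_ds_sketch(eng_heads, aln):
    """Project English dependencies onto target indices.

    eng_heads: {english_child: english_head}, with 0 as the root.
    aln: list of (english_index, target_index) pairs.
    Returns a sorted list of (target_head, target_child) pairs.
    """
    tgt_of = dict(aln)  # assumes at most one target word per English word

    def aligned_head(e):
        # Step 1: skip unaligned English words by climbing towards the root.
        h = eng_heads[e]
        while h != 0 and h not in tgt_of:
            h = eng_heads[h]
        return h

    def depth(e):
        # Distance from the root, counting only aligned words.
        d = 0
        while e != 0:
            e = aligned_head(e)
            d += 1
        return d

    best = {}  # target index -> (depth, target head index)
    for e, t in tgt_of.items():
        h = aligned_head(e)
        # Step 2: replace the English word and its head by their target words.
        candidate = (depth(e), tgt_of[h] if h != 0 else 0)
        # Step 3: of several copies of t, keep the one closest to the root.
        if t not in best or candidate[0] < best[t][0]:
            best[t] = candidate
    return sorted((head, child) for child, (_, head) in best.items())


# "the(1) dog(2) barks(3) loudly(4)": "the" is unaligned, and both
# "barks" and "loudly" align to target word 2.
eng_heads = {1: 2, 2: 3, 3: 0, 4: 3}
aln = [(2, 1), (3, 2), (4, 2)]
projected = project_ds_sketch(eng_heads, aln)
print(projected)
```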
intent.trees.project_ps(src_t, tgt_w, aln)[source]
  1. Copy the English PS, and remove all unaligned English words.

  2. Replace each English word with the corresponding target words.

    • If an English word x aligns to several target words, make copies of the node, one copy for each such word. The copies will all be siblings.
  3. Start from the root of the projected PS and, for each node x with more than one child, reorder each pair of x’s children until they are in the correct order.

    • Let y_i and y_j be two children of x
    • Spans are:
      • S_i = [a_i,b_i]
      • S_j = [a_j,b_j]
    • Reordering y_i and y_j gives four scenarios:
      • S_i and S_j don’t overlap.
        • Put y_i before y_j if a_i < a_j
        • Put y_i after y_j if a_i > a_j
      • S_i is contained within S_j
        • Remove y_i and promote its children
      • S_j is contained within S_i
        • Remove y_j and promote its children
      • S_i and S_j overlap, but neither contains the other.
        • Remove both, promote their children
        • If they are both leaf nodes with the same span, merge them. (IN+DT, for example)
  4. Reattach unaligned words.
    • For each unaligned word x:
      • Find closest left and right aligned neighbor
      • Attach x to the lowest common ancestor of the two.
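
The span comparison at the heart of Step 3 can be sketched as a small classifier. The `compare_spans` function is hypothetical, spans are assumed to be closed intervals `(a, b)` with `a <= b`, and identical spans are reported separately here because the merge rule applies only when both nodes are leaves.

```python
def compare_spans(s_i, s_j):
    """Classify two closed spans (a, b) into the reordering scenarios."""
    a_i, b_i = s_i
    a_j, b_j = s_j
    if b_i < a_j or b_j < a_i:
        # No overlap: order the two children by their start index.
        return "i_before_j" if a_i < a_j else "i_after_j"
    if s_i == s_j:
        return "same_span"        # candidates for merging if both are leaves
    if a_j <= a_i and b_i <= b_j:
        return "i_in_j"           # remove y_i and promote its children
    if a_i <= a_j and b_j <= b_i:
        return "j_in_i"           # remove y_j and promote its children
    return "partial_overlap"      # remove both and promote their children


print(compare_spans((1, 2), (3, 4)))  # i_before_j
print(compare_spans((2, 3), (1, 5)))  # i_in_j
print(compare_spans((1, 3), (2, 5)))  # partial_overlap
```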
intent.trees.read_conll_file(path)[source]

:rtype : list[DepTree]

intent.trees.reorder_tree(t, prev_t_list=[])[source]

Recursively reorder a tree.

Parameters:t
intent.trees.to_conll(ds, words, lowercase=False, clean_token=False, match_punc=False, multiple_heads=False, unk_pos='_', tagmap=None)[source]

Return a string in CoNLL format (see http://ilk.uvt.nl/conll/, under “Data Format”).

Parameters:ds (DepTree) –

Module contents