Leopar main page

Leopar's dev page

More info

InriaGForge

Leopar is hosted on InriaGForge as a subproject of the Semagramme project.

The companion matrix

To use the full set of filters, you must have the companion matrix.

Raw grammar

If you use a raw (uncompiled) grammar (a file with the gram extension), the companion matrix is described in a separate file with the cmp extension.

For instance, the frig grammar can be checked out with the command:

svn checkout svn://scm.gforge.inria.fr/svnroot/semagramme/frig/frig/frigram

You can then download the corresponding companion file into the frigram subdirectory with the command below:

wget http://wikilligramme.loria.fr/extern/frigram.cmp -O frigram/frigram.cmp

Then tell leopar to use this grammar:

leopar -grammar frigram/frigram.gram -save

Compiled grammar

If you use a compiled grammar (a file with the cgram extension), the companion matrix is packed inside the file.

WARNING: With the current Linux version of the frig package, loading a compiled grammar fails with a segmentation fault. Until the problem is fixed, please use the method described in the Raw grammar section above.

Lexgram

Lexicon/grammar crossing matrix

  • build: the make_lexgram_matrix job in Jenkins (this job logically depends on the update_lexgram_matrix_cc job, but the dependency is not triggered automatically, to avoid running the cluster needlessly)
  • download
cd /tmp
rm -rf lexgram.matrix
wget ftp://backup:backup@tarski.loria.fr/frig/lexgram.matrix
  • leopar configuration
leopar -lexgram_matrix /tmp/lexgram.matrix -save
  • usage
leopar -dev_tools

Companion matrix computation

The companion matrix takes a long time to compute (around 100 hours for a grammar of 3000 trees). Here is a way to compute it on a cluster. Each line of the matrix corresponds to a polarity in the grammar and can be computed independently, so we divide the computation into subsets of lines. We suppose below that the grammar is defined in a file g.gram.

Compute subsets

Suppose that we want to compute it with 100 resources. The lines of the matrix are split into 100 subsets, and the command

leopar -comp _0_100_

builds the part of the matrix for the first subset. The result is stored locally in a file named:

constr_0_100.partial

So to build the full matrix, we use the commands:

leopar -comp _0_100_
...
leopar -comp _99_100_

This produces 100 files with the partial extension.
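
Typing the 100 commands by hand is tedious, so the sequence can be generated. The following is a minimal sketch, not part of leopar: the helper names subset_args and run_all_subsets are hypothetical, and the only thing taken from the commands above is the argument pattern _i_N_ for i from 0 to N-1. It runs the subsets one after the other on a single machine; on a cluster, each printed argument would instead be passed to a separate job, in whatever way the local scheduler expects.

```shell
#!/bin/sh
# Hypothetical helper: print the -comp argument of each of the N subsets,
# one per line: _0_N_, _1_N_, ..., _(N-1)_N_.
subset_args() {
    n=$1
    i=0
    while [ "$i" -lt "$n" ]; do
        printf '_%s_%s_\n' "$i" "$n"
        i=$((i + 1))
    done
}

# Hypothetical helper: compute every subset sequentially on the local
# machine, stopping at the first leopar call that fails.
# The arguments contain only digits and underscores, so the unquoted
# command substitution splits them safely.
run_all_subsets() {
    for arg in $(subset_args "$1"); do
        leopar -comp "$arg" || return 1
    done
}

# Usage: run_all_subsets 100
```

Calling subset_args 100 on its own only prints the arguments, which is a convenient way to feed a job scheduler without running anything locally.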

Put subsets together

Then the command

leopar -comp collect

puts all partial files together and produces a (compressed) matrix in a suitable format for use in leopar's parsing. The compressed matrix is stored in a file with the same name as the input grammar but with the cmp extension.

NB: The previous command can be used only if the set of local partial files is a partition of the set of polarities of the grammar. In particular, remove any partial files left over from a previous companion computation before collecting.
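
The partition requirement can be checked before collecting. The sketch below is an illustration only: the helper name check_partials is hypothetical, and it assumes that subset i of N is always written to constr_i_N.partial, generalising the single file name constr_0_100.partial given above. It verifies that all N expected files are present and that no other file with the partial extension sits in the directory.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if directory $2 (default: current
# directory) holds exactly the N files constr_0_N.partial ...
# constr_(N-1)_N.partial and no other *.partial file.
# The file naming pattern is an assumption, see the text above.
check_partials() {
    n=$1
    dir=${2:-.}
    i=0
    while [ "$i" -lt "$n" ]; do
        if [ ! -f "$dir/constr_${i}_${n}.partial" ]; then
            echo "missing: constr_${i}_${n}.partial" >&2
            return 1
        fi
        i=$((i + 1))
    done
    # Count every .partial file; extras (for instance leftovers from an
    # earlier computation) make the count differ from N.
    count=0
    for f in "$dir"/*.partial; do
        [ -e "$f" ] && count=$((count + 1))
    done
    if [ "$count" -ne "$n" ]; then
        echo "unexpected partial files: found $count, expected $n" >&2
        return 1
    fi
}

# Usage: check_partials 100 && leopar -comp collect
```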

Install it

To “install” the matrix, put the file g.cmp in the same directory as g.gram.