# Documentation of the GR graph format

**Grew** uses a dedicated file format, called the **gr graph format**, to describe graphs. It can be used as both an input and an output format for **Grew**.

Files using this format are expected to use the `.gr` file extension.

Nodes and edges of graphs are defined relative to a feature domain and a label domain, which are themselves described in a **GRS** file.

## Features

A **feature** is a pair (*feature_name*, *feature_value*) and is written with the `=` symbol. For instance, the feature (cat,v) is written:

cat = v

It's always possible to write feature values with surrounding double quotes:

lemma = "accuser"

Moreover, double quotes are required if the feature value is not an identifier (as defined above):

lemma = "accusé"
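
The quoting rule can be sketched in Python as follows. This is only an illustration: the `format_feature` helper is hypothetical, and the identifier pattern used here (ASCII letters, digits and underscores, not starting with a digit) is an assumption, since the authoritative definition of an identifier is the one used by Grew. Only string values are covered.

```python
import re

# Assumed identifier pattern; Grew's own definition may differ.
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def format_feature(name, value):
    """Render a (name, value) pair, quoting the value unless it looks like an identifier."""
    if IDENTIFIER.fullmatch(value):
        return f"{name} = {value}"
    # Values that are not identifiers must be surrounded by double quotes.
    return f'{name} = "{value}"'


print(format_feature("cat", "v"))
print(format_feature("lemma", "accusé"))
```

Quoting every value unconditionally would also be valid, since double quotes are always allowed.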

Feature values can be numbers (this supposes that the GRS file declares the corresponding feature name as a numerical feature):

x = 12

z = 12.34

An example of a numerical feature is the feature named **position**, which is used in dependency structures to describe linear word order.

## Feature structures

A **feature structure** is a set of features with different feature names.
It is written between brackets, with `,` as the separator between features:

[cat = V, lemma = "accuser"]

## Nodes

A **node** definition consists of an identifier and a feature structure:

P_23 [cat = V, lemma = "accuser"]

The feature structure can be empty:

X []

## Edges

An **edge** definition is given by two identifiers separated by a label declaration surrounded by the symbols **-[** and **]->**.

X -[obj]-> P_23

## Graphs

The input syntax for a graph uses the keyword **graph**.
The description is enclosed in braces and contains a set of node and edge definitions separated by semicolons.

For instance, the code below corresponds to the graph above:

graph { A [phon="Elle", lemma="il", cat=PRO ]; B [phon="pense", lemma="penser", cat=V, m=ind ]; B -[suj]-> A; C [phon="venir", lemma="venir", cat=V, m=inf ]; B -[obj]-> C }
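
As a rough illustration of how the pieces fit together, the Python sketch below assembles a graph description from node and edge definitions. The helper names (`format_node`, `format_edge`, `format_graph`) are hypothetical and not part of Grew; feature values are assumed to be already quoted by the caller where needed.

```python
def format_node(node_id, features):
    """Render a node definition: an identifier followed by a feature structure."""
    body = ", ".join(f"{name}={value}" for name, value in features.items())
    return f"{node_id} [{body}]"


def format_edge(source, label, target):
    """Render an edge definition: source -[label]-> target."""
    return f"{source} -[{label}]-> {target}"


def format_graph(definitions):
    """Wrap node and edge definitions in the graph keyword, separated by semicolons."""
    return "graph { " + "; ".join(definitions) + " }"


text = format_graph([
    format_node("A", {"phon": '"Elle"', "cat": "PRO"}),
    format_node("B", {"phon": '"pense"', "cat": "V"}),
    format_edge("B", "suj", "A"),
])
print(text)
```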

## Dependency structures

**Grew** often manipulates dependency structures, so it is possible to draw graphs this way.
However, drawing a sensible dependency structure requires word order.
The position of a word in the structure is described by the **position** feature.

The position value can be given after the identifier, between parentheses; hence the two lines below are equivalent:

A (0) [phon="Elle", lemma="il", cat=pro ]

A [phon="Elle", lemma="il", cat=pro , position=0]
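
The equivalence can be made concrete with a small Python sketch that rewrites the shorthand into an explicit **position** feature. The `expand_position` function is hypothetical, and the regular expression assumes word-character identifiers and non-negative numeric positions; it is not Grew's actual parser.

```python
import re

# Assumed shape of a node definition with a position: ID (NUMBER) [features]
NODE_WITH_POSITION = re.compile(r"(\w+)\s*\((\d+(?:\.\d+)?)\)\s*\[(.*)\]")


def expand_position(node_def):
    """Rewrite 'ID (n) [...]' as 'ID [..., position=n]'; leave other input unchanged."""
    match = NODE_WITH_POSITION.fullmatch(node_def.strip())
    if match is None:
        return node_def
    node_id, position, body = match.groups()
    body = body.strip()
    if body:
        return f"{node_id} [{body}, position={position}]"
    return f"{node_id} [position={position}]"


print(expand_position('A (0) [phon="Elle", lemma="il", cat=pro ]'))
```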

The graph above becomes:

graph { A (0) [phon="Elle", lemma="il", cat=PRO ]; B (1) [phon="pense", lemma="penser", cat=V, m=ind ]; B -[suj]-> A; C (2) [phon="venir", lemma="venir", cat=V, m=inf ]; B -[obj]-> C }

and it can be displayed with dep2pict:

## Consistency

To be well-formed, a graph must satisfy the following properties:

- all node identifiers are different;
- node identifiers in an edge definition refer to a previously defined node in the same graph;
- the same edge (with the same source node, target node and label) must not be defined twice.
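
The three properties above can be checked mechanically. The Python sketch below is one possible way to do so and is not Grew's implementation: the `check_graph` function and the tuple representation of definitions (`("node", id)` and `("edge", source, label, target)`, listed in file order) are assumptions made for the example.

```python
def check_graph(definitions):
    """Return a list of well-formedness errors for definitions given in file order."""
    errors = []
    seen_nodes = set()
    seen_edges = set()
    for definition in definitions:
        if definition[0] == "node":
            _, node_id = definition
            if node_id in seen_nodes:
                errors.append(f"duplicate node identifier: {node_id}")
            seen_nodes.add(node_id)
        else:
            _, source, label, target = definition
            # Both endpoints must have been defined earlier in the same graph.
            for endpoint in (source, target):
                if endpoint not in seen_nodes:
                    errors.append(f"edge refers to undefined node: {endpoint}")
            edge = (source, label, target)
            if edge in seen_edges:
                errors.append(f"duplicate edge: {source} -[{label}]-> {target}")
            seen_edges.add(edge)
    return errors


well_formed = [("node", "A"), ("node", "B"), ("edge", "B", "suj", "A")]
ill_formed = [("node", "A"), ("node", "A"), ("edge", "A", "obj", "C"), ("edge", "A", "obj", "C")]
print(check_graph(well_formed))
print(check_graph(ill_formed))
```

Collecting all errors in a list, rather than stopping at the first one, makes it easier to report every problem in a file at once.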