Comments on the EDIFACT Syntax

EDIFACT files consist of

  • segments,

  • fields and

  • components.

Segments can be thought of as rows, fields as columns, and components as parts of a column. A segment always begins with a segment identifier and ends with a terminating character. Example: DTM+200:20060414:102'

The string DTM is the segment identifier; the single quote (') is the terminating character. After a segment ends, either a new segment must begin or no further data may follow.
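As a minimal sketch of this rule, the following Python snippet cuts raw EDIFACT text into segments at the default terminating character. The function name split_segments is hypothetical, and the sketch assumes that line breaks between segments are not significant and that the terminating character never occurs inside the data.

```python
def split_segments(data: str, terminator: str = "'") -> list[str]:
    """Split raw EDIFACT text into segments at the terminating character.

    Simplification: a terminator occurring inside the data (escaped) is
    not handled here.
    """
    # Assumption: line breaks around a segment are layout only, so drop them.
    parts = (part.strip("\r\n") for part in data.split(terminator))
    # The text after the final terminator is empty and is discarded.
    return [part for part in parts if part]


segments = split_segments("DTM+200:20060414:102'GID+2+00000005+00000005'")
print(segments)                    # ['DTM+200:20060414:102', 'GID+2+00000005+00000005']
print([s[:3] for s in segments])   # ['DTM', 'GID']
```

The first three characters of each segment are its identifier, which is what the second print statement extracts.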

The fields in a segment are separated by a metacharacter. This is, by default, the plus sign (+). Example: GID+2+00000005+00000005'

The segment consists of four field values: GID, 2, 00000005 and 00000005.
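The field split can be sketched in a few lines of Python. The function name split_fields is hypothetical, the segment is assumed to have had its terminating character removed already, and escaped separators are not handled.

```python
def split_fields(segment: str, separator: str = "+") -> list[str]:
    """Split one segment (without its terminating character) into field values."""
    return segment.split(separator)


fields = split_fields("GID+2+00000005+00000005")
print(fields)       # ['GID', '2', '00000005', '00000005']
print(len(fields))  # 4
```

Following the counting used in this text, the segment identifier is the first field value.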

The components of a field are separated by a metacharacter. This is, by default, the colon (:). Example: UNH+IFTMIN:D:95B:UN:SUTC+1'

The second field in the segment consists of the following five components: IFTMIN, D, 95B, UN and SUTC.
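Fields and components together give a two-level structure, which the following Python sketch represents as a list of lists. The name parse_segment is hypothetical, and the sketch assumes the default separators and no escaped characters.

```python
def parse_segment(segment: str, field_sep: str = "+", component_sep: str = ":") -> list[list[str]]:
    """Return one list of components per field of the segment."""
    return [field.split(component_sep) for field in segment.split(field_sep)]


parsed = parse_segment("UNH+IFTMIN:D:95B:UN:SUTC+1")
print(parsed[0])  # ['UNH']
print(parsed[1])  # ['IFTMIN', 'D', '95B', 'UN', 'SUTC']
print(parsed[2])  # ['1']
```

A field without a component separator simply becomes a list with a single component.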

The characters used as segment terminator, field separator, and component separator can be defined in the UNA segment of an EDIFACT file. The UNA segment is a special case: it declares the characters that separate or escape segments and data in the rest of the file. This segment is optional. If it is absent, the default settings apply. If a UNA segment is present, it must be at the very beginning of the document.
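A minimal sketch of reading these settings in Python follows. It relies on the fixed UNA layout from the EDIFACT syntax rules, which is not spelled out in this text: the three letters UNA are followed by exactly six characters, in the order component separator, field separator, decimal mark, release (escape) character, a reserved character, and segment terminator. The names Delimiters and read_delimiters are hypothetical.

```python
from typing import NamedTuple


class Delimiters(NamedTuple):
    """Special characters of an EDIFACT file, initialised to the defaults."""
    component: str = ":"
    field: str = "+"
    decimal: str = "."
    release: str = "?"
    segment_end: str = "'"


def read_delimiters(data: str) -> Delimiters:
    """Return the characters declared by a leading UNA segment, else the defaults."""
    if data.startswith("UNA") and len(data) >= 9:
        # The six characters after "UNA": component separator, field separator,
        # decimal mark, release character, reserved (usually a space), terminator.
        component, field, decimal, release, _reserved, segment_end = data[3:9]
        return Delimiters(component, field, decimal, release, segment_end)
    return Delimiters()


print(read_delimiters("UNA:+.? 'UNH+IFTMIN:D:95B:UN:SUTC+1'").field)  # +
print(read_delimiters("UNA*|.? ~GID|2~").segment_end)                 # ~
print(read_delimiters("GID+2'") == Delimiters())                      # True
```

A parser would call read_delimiters first and pass the result to its segment, field, and component splitting steps instead of hard-coding the defaults.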

EDIFACT uses the following compression methods to keep files small.

  • Blank fields are indicated by an additional field separator. Example: GID+++00000005'

The segment consists of 4 fields. Fields 2 and 3 are empty and will be skipped. The same mechanism is applied to components.

  • Blank fields at the end of a segment are omitted entirely: the segment terminator is placed directly after the last non-blank field. Example: GID+2'

The segment actually consists of 4 fields, but fields 3 and 4 are empty. The segment end after the second field indicates that all remaining fields of the segment are empty.
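Both compression rules can be sketched in Python as a pair of functions: one restores omitted trailing fields when reading, the other drops them when writing. The names expand_fields and compress_fields are hypothetical, and the expected field count is assumed to be known from the message definition.

```python
def expand_fields(segment: str, expected: int, separator: str = "+") -> list[str]:
    """Split a segment and restore omitted trailing fields as empty strings."""
    fields = segment.split(separator)
    # Consecutive separators already yield empty strings for blank inner fields;
    # only fields cut off by an early segment end have to be padded.
    fields += [""] * (expected - len(fields))
    return fields


def compress_fields(fields: list[str], separator: str = "+") -> str:
    """Join field values, dropping blank fields at the end of the segment."""
    trimmed = list(fields)
    while trimmed and trimmed[-1] == "":
        trimmed.pop()
    return separator.join(trimmed)


print(expand_fields("GID+++00000005", 4))               # ['GID', '', '', '00000005']
print(expand_fields("GID+2", 4))                        # ['GID', '2', '', '']
print(compress_fields(["GID", "2", "", ""]))            # GID+2
print(compress_fields(["GID", "", "", "00000005"]))     # GID+++00000005
```

The same two functions work for components if the component separator is passed instead of the field separator.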