Goal:
EDI documents are flat files in which delimiters separate data elements and segments. Unlike XML, which has self-describing nodes, an EDI document must follow strict rules about the position of these delimiters to be parsed correctly. This blog post takes an in-depth look at the delimiters found in an EDI document and at how the R2 engine discovers them during disassembly.
X12:
The X12 standard defines the following delimiters, which are discovered from the fixed-length ISA segment as follows:
- Data Element Separator: the 4th character of the ISA segment. It separates simple data elements and composite data structures. Commonly used: *
- Component Element Separator: the value of ISA16. It separates the simple data elements within a composite data structure. Commonly used: :
- Repetition Separator: ISA11 is reused to carry the repetition separator. It separates repeated occurrences of a simple data element or a composite data structure. Commonly used: ^
- Note: check the PAM settings to verify that ISA11 is used for this delimiter. If ISA11 is not configured as the repetition separator in PAM, the fields can contain NO repetitions.
- Segment Separator: the 106th character of the ISA segment (which is fixed length). It separates segments. Commonly used: ~
- Segment Separator Suffix: the value present between the ISA segment and the GS segment that follows it. It makes the interchange more readable. R2 supports “None”, “CR”, “LF” and “CR LF”
Example:
The following ISA and GS segments illustrate the above:
ISA*00*00000ISA02*00*11111ISA04*01*2222222222ISA06*01*3333333333ISA08*670123*0123*^*00401*111121891*0*I*:~
GS*AA*GS02*GS03*45670123*01234567*892*X*00306~
In the above example:
- Data Element Separator = “*” (4th character of ISA)
- Component Element Separator = “:” (ISA16)
- Repetition Separator = “^” (ISA11)
- Segment Separator = “~” (106th character of ISA)
- Segment Separator Suffix = “CR LF”
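The discovery steps above can be sketched in Python. This is only an illustration of the positional rules described in this post, not the R2 engine's actual code; the names `X12Delimiters` and `discover_x12_delimiters` are made up for the example.

```python
from dataclasses import dataclass

ISA_LENGTH = 106  # the ISA segment is fixed length, segment separator included


@dataclass
class X12Delimiters:
    data_element: str
    repetition: str
    component: str
    segment: str
    suffix: str  # "", "\r", "\n" or "\r\n"


def discover_x12_delimiters(interchange: str) -> X12Delimiters:
    """Read the delimiters from the ISA segment at the start of an interchange."""
    if not interchange.startswith("ISA") or len(interchange) < ISA_LENGTH:
        raise ValueError("interchange does not start with a complete ISA segment")

    data_element = interchange[3]               # 4th character
    segment = interchange[ISA_LENGTH - 1]       # 106th character

    # Split everything before the segment separator into "ISA" + ISA01..ISA16.
    fields = interchange[:ISA_LENGTH - 1].split(data_element)
    if len(fields) != 17:
        raise ValueError("ISA segment does not contain 16 data elements")

    # The suffix is whatever sits between the ISA terminator and the next segment.
    rest = interchange[ISA_LENGTH:]
    suffix = next((s for s in ("\r\n", "\r", "\n") if rest.startswith(s)), "")

    return X12Delimiters(
        data_element=data_element,
        repetition=fields[11],   # ISA11
        component=fields[16],    # ISA16
        segment=segment,
        suffix=suffix,
    )
```

Fed the ISA and GS segments from the example, joined by CR LF, the function returns the five values listed above.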
EDIFACT:
EDIFACT dedicates an entire segment to defining the separators: UNA, the Service String Advice, which is the first segment of the interchange. Its function is to define the characters selected for use as delimiters and indicators in the rest of the interchange. When transmitted, the UNA appears immediately before the Interchange Header (UNB) segment. It begins with the upper-case characters UNA, immediately followed by the six characters the sender has selected as separators, in sequence.
Sample Segment: UNA:+.?*'
In the above stream:
- The fourth character ‘:’ is the Component Element Separator
- The fifth character ‘+’ is the Data Element Separator
- The sixth character ‘.’ is the Decimal Notation
- The seventh character ‘?’ is the Release Character
- The eighth character ‘*’ is the Repetition Separator
- The ninth character ‘'’ is the Segment Separator
- The character between UNA and UNB is used as the Segment Separator Suffix. As with X12, R2 supports “None”, “CR”, “LF” and “CR LF” for EDIFACT as well
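As a rough sketch, the positional reading of the UNA segment described above could look like this in Python. The type and function names are hypothetical and the code only mirrors the rules in this post; it is not how R2 implements them.

```python
from typing import NamedTuple


class EdifactDelimiters(NamedTuple):
    component: str
    data_element: str
    decimal: str
    release: str
    repetition: str
    segment: str
    suffix: str  # "", "\r", "\n" or "\r\n"


def parse_una(interchange: str) -> EdifactDelimiters:
    """Read the six service characters that follow the literal 'UNA'."""
    if not interchange.startswith("UNA") or len(interchange) < 9:
        raise ValueError("no complete UNA segment at the start of the interchange")

    service_chars = interchange[3:9]   # characters 4 to 9

    # Anything between UNA and UNB is treated as the segment separator suffix.
    rest = interchange[9:]
    suffix = next((s for s in ("\r\n", "\r", "\n") if rest.startswith(s)), "")

    return EdifactDelimiters(*service_chars, suffix)
```

For the sample segment, `parse_una("UNA:+.?*'\r\nUNB+...")` yields `:` as component separator, `+` as data element separator and `'` as segment separator, with a CR LF suffix.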
Exception Handling:
- UNA is an optional segment. If it is not present in an interchange, the default values from the pipeline component properties are used. If these are not set either, the interchange is suspended and the error is logged in the Event Viewer
- Some users don’t use release indicators in the payload and consequently don’t set one in the UNA segment. The standard requires that a space character be inserted in that position. Importantly, the space is not to be interpreted as the release indicator; it indicates that releasing is not used in the interchange.
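The two rules above can be illustrated with a small Python sketch. The exception class stands in for suspending the interchange and logging the error; it and the function names are assumptions made for this example only.

```python
from typing import Optional


class InterchangeSuspended(Exception):
    """Stand-in for suspending the interchange and logging to the Event Viewer."""


def resolve_separators(interchange: str, pipeline_defaults: Optional[str]) -> str:
    """Return the six separator characters, in UNA order."""
    if interchange.startswith("UNA") and len(interchange) >= 9:
        return interchange[3:9]            # UNA present: it wins
    if pipeline_defaults and len(pipeline_defaults) == 6:
        return pipeline_defaults           # UNA absent: fall back to the defaults
    raise InterchangeSuspended("no UNA segment and no default separators configured")


def release_indicator(separators: str) -> Optional[str]:
    """The 4th separator; a space means releasing is not used, so return None."""
    release = separators[3]
    return None if release == " " else release
```

Returning `None` for a space keeps later parsing code from ever treating the space as an escape character.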
Pipeline Component Properties:
- The highlighted section of the EDI Receive Pipeline properties is used for EDIFACT documents when the UNA segment is not present in the interchange. The format is a comma-separated list of hex values in the order Component Element Separator, Data Element Separator, Decimal Notation, Release Indicator, Repetition Separator, Segment Separator, Suffix 1 and Suffix 2
- The default EDIFACT separators are 0x3A, 0x2B, 0x2C, 0x3F, 0x20, 0x27
- These can be set to any values, provided they are in the allowed range and follow the format described above.
Allowed Range:
Both X12 and EDIFACT delimiters have to be ASCII characters. That means they fall in the range of 0-127 (Decimal) or 00-7F (HEX)
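To make the property format and the range rule concrete, here is a minimal Python sketch that turns a comma-separated list of hex values into characters and rejects anything outside 00-7F. The function name is hypothetical, and the check that six to eight values are present is an assumption based on the field order given above.

```python
from typing import List


def parse_separator_property(value: str) -> List[str]:
    """Convert e.g. '0x3A, 0x2B, ...' into a list of separator characters."""
    chars = []
    for token in value.split(","):
        code = int(token.strip(), 16)      # accepts "0x3A" as well as "3A"
        if not 0x00 <= code <= 0x7F:
            raise ValueError(f"{token.strip()} is outside the ASCII range 00-7F")
        chars.append(chr(code))
    # Six separators are expected, optionally followed by Suffix 1 and Suffix 2.
    if not 6 <= len(chars) <= 8:
        raise ValueError("expected 6 separators plus up to 2 suffix values")
    return chars
```

Applied to the default string, this gives `:`, `+`, `,`, `?`, a space and `'`, so by default the decimal notation is a comma and the repetition separator is a space.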
X12 vs. EDIFACT:
The following table outlines the similarities and differences between X12 and EDIFACT
| Separator Type | EDIFACT Field | EDIFACT Optionality | X12 Field | X12 Optionality |
|---|---|---|---|---|
| Component Element Separator | UNA1 | Mandatory | ISA16 | Mandatory |
| Data Element Separator | UNA2 | Mandatory | Implicit: character preceding ISA1 (4th character) | Mandatory |
| Decimal | UNA3 | Mandatory | N/A | N/A |
| Release | UNA4 | Mandatory | N/A | N/A |
| Repetition | UNA5 | Mandatory | ISA11 | Optional |
| Segment (along with 2 optional suffix fields) | UNA6 | Mandatory | Implicit: character following ISA16 (106th character) | Required |
End Note:
I hope this post has been useful to you. As always, your comments and questions are welcome. This is Mohsin signing off!
Mohsin Kalam
Hi Mohsin,
Great article. Must be read by all EDI + R2 practitioners.
Thanks!
Comment by Leonid Ganeline — November 5, 2007 @ 7:41 am
Thank you Mohsin. EDIFACT files often have suffix1 (CR) and suffix2 (LF); I had worked this out in practice, but this article confirms my finding.
Comment by Fabio Aluisi — January 12, 2008 @ 5:20 pm
Hello, Is it at all possible to distinguish or better yet, promote a repeatable segment (e.g., TDS01) to use in an orchestration for inclusion in a Decision shape? I can not get this to work for whatever reason and had read that repeatable segments (TDS01 - TDS04) were not allowed to become promoted or distinguished.
Comment by Matt Moore — February 28, 2008 @ 8:48 am
Hi Mohsin,
I am very new to BizTalk, so kindly let me know how to convert EDIFACT to XML using BTS.
Thanks in Advance
Comment by Ramjeet — February 29, 2008 @ 2:43 am
The out-of-the-box EDI Receive Pipeline will convert an EDIFACT document to XML. You need to do the following:
1. Create a receive port and a receive location, and add the EDIReceive pipeline to the receive location
2. Create a send port with Pass Thru Transmit that subscribes to the receive port name created in step 1
3. Deploy the schema that the document points to. R2 ships with 8k+ schemas, which you can find under “Program Files\Microsoft BizTalk Server 2006\XSD_Schema\EDI\MicrosoftEdiXSDTemplates.exe”. You will have to unzip this file
4. Drop the EDI instance into the file receive location and you will get the XML output from the send port
Hope this helps
Mohsin
Comment by Mohsin Kalam — February 29, 2008 @ 2:24 pm
If you are talking about promoting a property through the PropertySchema, this is not supported by the EDI Receive Pipeline at this time. However, we have identified this and are looking into a possible solution for the upcoming release.
Thanks
Mohsin
Comment by Mohsin Kalam — February 29, 2008 @ 2:26 pm
Hi Mohsin,
That’s great. I have set up the pipeline and created the send and receive ports, but later on I also have to map the generated XML to SQL. That is a later step, but could you please let me know how to map the received EDI to the XML format I want?
Thanks in Advance
Comment by Ramjeet — March 3, 2008 @ 6:39 am
Thanks for the response, Mohsin. I’m actually able to promote the TDS01 property directly in the x12.810 Microsoft schema, but I can’t seem to use it in the conditional logic of the Decision shape in my orchestration. This is different from promotion through the PropertySchema, no? How can one perform business logic (i.e., decisions) against these various EDI X12 schema fields? Business rules, perhaps? Thank you, and pardon any ignorance here; I’m still new to the BizTalk product.
Comment by Matt Moore — March 5, 2008 @ 7:26 am
Hi,
I tried these steps you outlined.
1. Create a receive port, receive location and add EDIReceive Pipeline to your receive location
2. Create a send port with Pass Thru Transmit and subscribe to the Receive Port name create in step 1
3. Deploy the schema this document points to. R2 ships with 8k+ schemas and you can find them under “Program Files\Microsoft BizTalk Server 2006\XSD_Schema\EDI\MicrosoftEdiXSDTemplates.exe”. You would have to unzip this file
4. Drop in the EDI instance in the file receive location and you will have the XML output from the send port
With a document containing
UNA:+.? '
UNB+UNOA:3+9313938000631:ZZ+9311803999998:ZZ+080225:0023+530'
UNH+52600001+ORDERS:D:01B:UN:EAN010'
BGM+220+TESTTEST1021+9'
DTM+137:20071102:102'
DTM+2:20071104:102'
UNT+5+52600001'
UNZ+1+530'
I deployed the schema EFACT_D01B_ORDERS.xsd
but had to rename the root in the xsd from EFACT_D01B_ORDERS to EFACT_D01B_ORDERS_EAN010
And I always receive the error:
Error encountered during parsing. The Edifact transaction set with id '52600001' contained in interchange (without group) with id '530', with sender id '9313938000631', receiver id '9311803999998' is being suspended with following errors:
Error: 1 (Segment level error)
SegmentID: UNT
Position in TS: 5
15: Use of segment, data-type or segment not supported in this position
If I look in tracking at the message body, it has been transformed to XML
52600001ORDERSD01BUNEAN010220TESTTEST1021913720071102102220071104102552600001
but it never makes it to the output
Any ideas?
Comment by Steve McGrath — March 6, 2008 @ 5:43 pm
Hi Mohsin..
I am still not clear on the repetition separator in ISA11 in version 4020 and above.
I can’t work out how this separator is used in an EDI file.
Can you give an example of an EDI file in which it is used, to make this clear?
I am confused between ISA11 and ISA16.
An example EDI file would be very helpful.
Looking forward to your response.
tc ..bye..
Comment by Nitin — May 1, 2008 @ 4:52 am
Hello,
Can you please explain the repetition separator
that is used in the ISA11 position above version 4010? Please give an example of an EDI file where it is used so that it is clear to everyone.
thks & Regards
Nitin Pandey
Comment by Nitin — May 6, 2008 @ 2:33 am
The repetition separator is used to separate repetitions of data elements. Repetitions of component elements are not allowed, and documents containing them are suspended. Whether a field may repeat is determined by its Min/Max Occurs settings in the schema.
Min Occurs = 0 -> Element is optional
Min Occurs = 1 -> Element is required
Min Occurs = 5, Max Occurs = 10 -> Element is required and should repeat at least 5 times but no more than 10
For X12, this field is specified as ISA11. For EDIFACT, this field is specified as UNA5.
Receive:
During document parsing, if the disassembler encounters a repetition separator in a data element, it checks the Min/Max Occurs settings for that field in the schema. If repetition is allowed, the document is processed normally; if not, the document is suspended.
Send:
During serialization, the XML node of a data element is checked against the schema to see whether it can repeat. If it can, the assembler inserts the repetition separator between repetitions of the data element.
Here is how it looks in EDI and XML format.
Suppose we have a node in the schema as follows:
BEG01: Min Occurs = 2, Max Occurs = 4
ST*810*1234
BEG*01^01*BEG02
SE*3*1234
which will be transformed into XML during parsing as follows:
810
1234
01
01
BEG02
3
1234
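The receive-side check can also be pictured with a short Python sketch. It is only an illustration of the rule described above, not the disassembler's code; the function name, the `occurs` dictionary and the default of (0, 1) for elements without an entry are assumptions for the example.

```python
from typing import Dict, List, Tuple


def split_repetitions(
    segment: str,
    data_sep: str,
    rep_sep: str,
    occurs: Dict[str, Tuple[int, int]],
) -> Dict[str, List[str]]:
    """Split a segment into elements and validate repetition counts.

    `occurs` maps an element name such as "BEG01" to (min occurs, max occurs).
    """
    tag, *elements = segment.split(data_sep)
    result = {}
    for position, raw in enumerate(elements, start=1):
        name = f"{tag}{position:02d}"
        values = raw.split(rep_sep) if raw else []
        low, high = occurs.get(name, (0, 1))   # assumed default: optional, no repeats
        if not low <= len(values) <= high:
            # Stands in for suspending the document.
            raise ValueError(f"{name}: {len(values)} occurrences, expected {low}..{high}")
        result[name] = values
    return result
```

With the schema node above, `split_repetitions("BEG*01^01*BEG02", "*", "^", {"BEG01": (2, 4)})` accepts the segment and returns two values for BEG01 and one for BEG02.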
Hope this helps
Mohsin Kalam
Comment by Mohsin Kalam — May 27, 2008 @ 3:44 pm