Length

The length is the second part of every encoding, coming immediately after the identifier. Like the identifier, it occupies an integral number of octets, at least one. Because it conveys only a single piece of information, it is easy to spread it over as many octets as necessary. It has three variants: short, long and indefinite.

The short and long variants state directly where the contents of the encoding end, which is why they are sometimes called the definite variants. The third one, indefinite, does not refer directly to the end of the contents; it uses a special bit pattern to mark it.

The short length field format is as follows:

0 L L L L L L L

That means it can be used to encode contents up to 127 octets long.

The long length variant uses the same seven bits to give the size, in octets, of the length field itself. The length is then carried by the following n octets, encoded most significant octet first (big-endian).

octet #1        1 n n n n n n n     (0 < n < 127)
octet #2        L L L L L L L L
...
octet #(n+1)    L L L L L L L L
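The two definite variants can be put together in a few lines of Python. This is only a minimal sketch: the function name `encode_length` is an assumption made for illustration, and the sketch always picks the shortest possible form.

```python
def encode_length(length: int) -> bytes:
    """Encode a definite length as length octets (short or long variant)."""
    if length < 0:
        raise ValueError("length must be non-negative")
    if length <= 127:
        # Short variant: top bit is 0, the low seven bits hold the length.
        return bytes([length])
    # Long variant: the first octet is 1 followed by n in seven bits,
    # then n octets carrying the length, most significant octet first.
    n = (length.bit_length() + 7) // 8
    if n > 126:
        raise ValueError("length does not fit into the long variant")
    return bytes([0x80 | n]) + length.to_bytes(n, "big")


if __name__ == "__main__":
    for value in (5, 127, 128, 300):
        print(value, encode_length(value).hex(" "))
```

For example, a length of 300 does not fit into seven bits, so it is written as the three octets 82 01 2C: one octet saying "two length octets follow", then 300 in big-endian order.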

You may now ask why tags and lengths are encoded differently. This reflects their different requirements. Tags are rarely large numbers, whereas the length can vary widely. Tags are only compared, but lengths are also added. It is also good to have such a frequently accessed number octet-aligned, so that no bit manipulation is needed to put it together.
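The benefit of octet alignment shows when reading a length back. In the sketch below (the name `decode_definite_length` and its return convention are assumptions, not part of any standard API), the long variant is recovered with a single big-endian conversion and no per-octet shifting or masking. Only the two definite variants are handled here.

```python
from typing import Tuple


def decode_definite_length(data: bytes, offset: int = 0) -> Tuple[int, int]:
    """Return (length, offset of the first contents octet)."""
    first = data[offset]
    if first < 0x80:
        # Short variant: the octet itself is the length.
        return first, offset + 1
    n = first & 0x7F  # number of length octets that follow
    if n == 0 or n == 127:
        raise ValueError("not a short or long variant length")
    end = offset + 1 + n
    if end > len(data):
        raise ValueError("truncated length field")
    # The n octets are whole, big-endian octets, so one conversion suffices.
    return int.from_bytes(data[offset + 1:end], "big"), end


if __name__ == "__main__":
    print(decode_definite_length(b"\x82\x01\x2c"))
```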

In theory there is an upper limit on the length of contents that the long variant can express: with at most 126 length octets (1008 bits), it is 2^1008 - 1 octets. For the unlikely case that this is not enough, a special encoding is provided: the indefinite length. It starts with a length octet very similar to the long variant, but with the number of following octets equal to 0:

1 0 0 0 0 0 0 0

This indicates that an end-of-contents marker has to be found in order to locate the next identifier. The marker looks as follows:

octet #1        0 0 0 0 0 0 0 0
octet #2        0 0 0 0 0 0 0 0

As you can see, this is identical to the encoding of a value with primitive encoding, universal tag zero and zero length. However, universal tag zero is not allocated to any ASN.1 type. Note that the indefinite length can be used only for constructed contents: because such contents are a series of complete encodings, the end-of-contents marker can always be recognised.
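The following sketch illustrates why the marker stays recognisable inside constructed contents: the reader steps over one complete nested encoding at a time, so two zero octets that happen to sit inside some nested contents are never mistaken for the marker. The function name `skip_encoding` is hypothetical, and the sketch assumes single-octet identifiers (tag numbers below 31) to stay short.

```python
EOC = b"\x00\x00"  # end-of-contents marker


def skip_encoding(data: bytes, offset: int = 0) -> int:
    """Return the offset just past the encoding that starts at `offset`."""
    if offset + 2 > len(data):
        raise ValueError("truncated encoding")
    identifier = data[offset]
    if identifier & 0x1F == 0x1F:
        # Assumption of this sketch: only single-octet identifiers.
        raise NotImplementedError("multi-octet identifiers are not handled")
    first = data[offset + 1]
    pos = offset + 2
    if first < 0x80:
        # Short variant.
        end = pos + first
    elif first & 0x7F:
        # Long variant: n length octets, big-endian.
        n = first & 0x7F
        end = pos + n + int.from_bytes(data[pos:pos + n], "big")
    else:
        # Indefinite variant: allowed only for constructed encodings (bit 6 set).
        if not identifier & 0x20:
            raise ValueError("indefinite length on a primitive encoding")
        # Step over whole nested encodings until the marker is next.
        while data[pos:pos + 2] != EOC:
            pos = skip_encoding(data, pos)
        return pos + 2
    if end > len(data):
        raise ValueError("truncated contents")
    return end


if __name__ == "__main__":
    # Outer indefinite SEQUENCE holding an INTEGER and an inner indefinite
    # SEQUENCE whose OCTET STRING contents are themselves two zero octets.
    sample = bytes.fromhex("30 80 02 01 05 30 80 04 02 00 00 00 00 00 00")
    print(skip_encoding(sample))
```

In the sample, the first pair of zero octets belongs to the contents of the OCTET STRING and is skipped as part of that encoding; only the second and third pairs are read as markers, closing the inner and the outer sequence respectively.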