
RE: BER versus ASCII (was: RE:)



On Thu, 6 Dec 2001, Subrata Goswami wrote:

> >That is a good description. Does not the MIB specify what type of value an
> >attribute should have ? I would think it is redundant information.

It does; however, a few points:

1. You don't always have the MIB definition available when you're decoding
a message.

2. The agent and manager may for some reason disagree on what type of
value it is supposed to be: particularly if you use names instead of
OIDs to reference object definitions (see below), but also because one of
them may simply be wrong.

3. The ASN.1 CHOICE data type (which, while not widespread in SNMP, is
used in one of the older basic data types -- namely NetworkAddress, though
with only one "choice" available) allows different data types to be
associated with the same object.  I don't know of any object data types
aside from NetworkAddress that are defined as CHOICEs, but I do believe
one of the items on the objectives list for SMIng is discriminated union
functionality, which amounts to the same thing.

4. Unfortunately the DISPLAY-HINT format allows for the possibility of a
one-way transformation from internal representation to "human readable"
representation.  Which do you use on the wire?  If you're doing ASCII,
then you're probably going to want the formatted version, which isn't
particularly useful except in display to the user.  But for some data
types you would have to transform the value one way or another.  Take the
DateAndTime textual convention, for example.  It stores each of the
month, day, hour, etc. as a single octet whose numeric value (NOT an
ASCII character) is the value of that field (i.e., the octet value "0x00"
in the appropriate position equals midnight).  These amount to non-ASCII
characters that you would have to either print in hex or escape somehow
-- but then you have the problem of figuring out whether "0x00" means 4
characters ('0', 'x', '0' and '0') or the one octet with all zero bits.
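
To make the DateAndTime point concrete, here is a minimal sketch in
Python.  It assumes the 8-octet form of DateAndTime (no time zone) and a
simplified rendering in the spirit of the DISPLAY-HINT; the function names
are mine and purely illustrative.

```python
import struct

def encode_date_and_time(year, month, day, hour, minute, second, deci=0):
    """Pack the 8-octet form: 2 octets of year, then 1 octet per field."""
    return struct.pack(">HBBBBBB", year, month, day, hour, minute, second, deci)

def render_date_and_time(octets):
    """Turn the raw octets into the human-readable form (one-way for display)."""
    year, month, day, hour, minute, second, deci = struct.unpack(">HBBBBBB", octets)
    return f"{year}-{month}-{day},{hour}:{minute}:{second}.{deci}"

# Midnight on 6 Dec 2001: the hour, minute and second octets are all 0x00.
raw = encode_date_and_time(2001, 12, 6, 0, 0, 0)
text = render_date_and_time(raw)

# The ambiguity described above: the four characters '0','x','0','0'
# are not the same thing as the single octet with all zero bits.
four_characters = "0x00".encode("ascii")
one_octet = bytes([0x00])
```

Here raw holds eight octets, several of them zero, none of which is a
printable ASCII character, while four_characters and one_octet differ in
both length and content.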

> >Well, that is not exactly true.  Object names are unique as are the OID's.
> >Otherwise the SNMP manager would have a hard time resolving an object
> >name to OID. Have you come across any situation where the same object
> >name corresponds to different OIDs in the same MIB ?

There is no requirement that object names be unique.  There is no
authority for assigning SNMP MIB namespaces the way IANA assigns branches
in the OID tree.  It is completely possible to have two different MIBs
define unrelated objects with the same name.  In IETF MIBs it is avoided,
partly by naming conventions, and it is most likely something the editors
check for before allowing a document to be published on the standards
track.

You don't have the luxury of a uniqueness assumption when handling two
arbitrary vendors' MIBs.  IANA assigns them a branch of the OID space
under 'enterprises'; beyond that, it's entirely in the hands of that
enterprise.

You might also have a copy of an unofficial version of a MIB, where
objects have been defined under the 'experimental' branch until such time
as an IANA assignment has been made.  The object names won't necessarily
change with the "official" OID assignments -- depends on whether or not
the MIB author had the foresight to prefix all the object names with
something like "exp".
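
A small sketch of the collision problem, in Python.  The two vendor
tables, the object name "fanSpeed" and the enterprise numbers are made-up
placeholders for illustration, not taken from any real MIB.

```python
# Hypothetical objects from two unrelated vendors (placeholder OIDs).
VENDOR_A = {"fanSpeed": "1.3.6.1.4.1.99991.1.1"}
VENDOR_B = {"fanSpeed": "1.3.6.1.4.1.99992.7.3"}

def merge_by_name(*mibs):
    """Index objects by name; report names that map to more than one OID."""
    merged, collisions = {}, []
    for mib in mibs:
        for name, oid in mib.items():
            if name in merged and merged[name] != oid:
                collisions.append((name, merged[name], oid))
            merged[name] = oid  # the later definition silently wins
    return merged, collisions

def merge_by_oid(*mibs):
    """Index objects by OID; distinct branches cannot collide."""
    return {oid: name for mib in mibs for name, oid in mib.items()}

by_name, collisions = merge_by_name(VENDOR_A, VENDOR_B)
by_oid = merge_by_oid(VENDOR_A, VENDOR_B)
```

Keyed by name, one of the two definitions is lost; keyed by OID, both
survive because each vendor stays inside its own branch.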


> > Agreed. What I am saying is BER really needed ? It appears more as an
> > obfuscation layer. The MIB definition (in ASN.1) contains all the relevant
> > information about an attribute and the type of values it is allowed to
> > have.

In order to answer the question of whether or not something is "needed"
you have to consider requirements.  If efficient use of CPU cycles
encoding/decoding messages and efficient use of network bandwidth are not
an issue for a particular protocol, ASCII may be perfectly reasonable.

[Note: I think really what you're advocating is a "plain text" format and
not specifically an "ASCII" format.]

> > Given that I think if agent understands the MIB definition (in ASN.1 form)
> > then BER is not required.  For example we can simplify the varbind to
> > look as (s y s D e s c r, m y - b o x) rather than
> > (06 08 2B 06 01 02 01 01 01 01 00, 04 06 m y - b o x)

And what do you do as an agent if you receive...

{i f T y p e . 1, A S C I I S t r i n g}

...?  (That is, a non-integer value sent erroneously by one party or the
other).  True, I suppose you can see that the first character is not a
digit and return an error.  But it becomes expensive to do type checking
by string matching because you have to scan the entire value -- every
value -- for invalid characters and such.  It's much, much faster to
simply look at a single byte that says you're decoding an integer.
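
The difference can be sketched as follows, in Python.  The tag value 0x02
is the BER tag for INTEGER; the function names and the plain-text rule
(optional minus sign followed by digits) are assumptions made for the
example.

```python
BER_INTEGER_TAG = 0x02  # universal tag for INTEGER

def is_integer_by_tag(encoded: bytes) -> bool:
    """One comparison on the first octet, however long the value is."""
    return len(encoded) > 0 and encoded[0] == BER_INTEGER_TAG

def is_integer_by_scan(text: str) -> bool:
    """Plain-text check: every character has to be inspected."""
    if text.startswith("-"):
        text = text[1:]
    if not text:
        return False
    return all("0" <= ch <= "9" for ch in text)

ber_integer = bytes.fromhex("02 01 06")   # INTEGER, length 1, value 6
ber_string = b"\x04\x06my-box"            # OCTET STRING, length 6
```

Note that a value such as "12345x" passes a first-character test and is
only rejected once the scan reaches its last character.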

> > The OID for sysDescr.0 is 1.3.6.1.2.1.1.1.0, which also can be used
> > instead of sysDescr.
> > That way each time an attribute value is exchanged, we do not send its
> > type along.
> > To shorten integers, a simpler encoding can be adopted. The delimiters can
> > be some byte (e.g. 00).

> > Would doing something like this imply agents spend more CPU cycles ?

Yes.  Delimited data is more expensive to parse than knowing the number of
bytes to grab ahead of time.  You have to scan for the delimiter, keeping
track of "escaped" (or quoted, etc.) versions of the delimiters and such
until you find the end of the piece you're parsing, extract that bit, ad
nauseam.
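
For comparison, a rough sketch of both approaches in Python.  Both wire
formats here are hypothetical: a one-octet length prefix on one side, and
a null-byte delimiter with backslash as the escape character on the other.

```python
def read_length_prefixed(buf: bytes, pos: int = 0):
    """Read one field whose first octet gives its length; return (field, next_pos)."""
    length = buf[pos]
    end = pos + 1 + length
    if end > len(buf):
        raise ValueError("field runs past end of buffer")
    return buf[pos + 1:end], end

def read_delimited(buf: bytes, pos: int = 0, delim: int = 0x00, esc: int = 0x5C):
    """Read one field ending at delim; esc makes the next octet literal."""
    out = bytearray()
    i = pos
    while i < len(buf):          # every octet has to be examined
        octet = buf[i]
        if octet == esc:
            if i + 1 >= len(buf):
                raise ValueError("dangling escape")
            out.append(buf[i + 1])
            i += 2
        elif octet == delim:
            return bytes(out), i + 1
        else:
            out.append(octet)
            i += 1
    raise ValueError("missing delimiter")

field, nxt = read_length_prefixed(b"\x06my-box\x01!")
plain, after = read_delimited(b"my-box\x00rest")
```

The length-prefixed reader does one bounds check and one slice; the
delimited reader loops over the whole field and still has to decide what
to do when the delimiter never arrives.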

Delimited data also causes problems in less robust implementations that
are lazy about bounds checking.  Imagine, if you will, a format that
-does- use the null-byte as a delimiter character, and an implementation
that lazily does an sprintf(), sscanf(), or other operation on the string,
assuming that the delimiter is there.  Bad stuff happens.  This, I
imagine, is why most "plain text" format protocols have a lot more
syntactic structure...  like opening and closing tags in XML or HTML, or
having every field quoted, separated by a comma, with
quotes-inside-strings escaped with backslash and such in CSV files.

A lot of "extra" data goes into a plain text format protocol in order to
give it a well defined structure that can be parsed, expensively, without
running into big problems (or, at least, being able to hopefully recover
if you do).  This stuff adds up very quickly.  You can save bandwidth by
utilizing compression techniques (gzip or what have you), but that again
costs a lot of CPU time.

BER isn't an obfuscation layer.  I would describe it more as a means of
data compression.  Everything, beyond having each item start with a single
octet 'tag' specifying the data type, and one or more octets indicating
field length, generally gets encoded in the fewest number of bytes
possible in BER, assuming a reasonable implementation.  Some
implementations are lazier than others.  The first two of the following
are legal encodings of the integer value 1, for example; the third pads
the contents with leading zero octets, which lazy encoders sometimes emit
and lenient decoders accept, although as I read the integer rules the
contents are supposed to use the fewest octets possible:

02 01 01
02 84 00 00 00 01 01
02 84 00 00 00 04 00 00 00 01

Most implementations tend to be smart enough to encode in the fewest
number of bytes possible while still following the [Tag][Length][Contents]
structure.
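
As an illustration of the [Tag][Length][Contents] structure, here is a
minimal Python sketch of an INTEGER encoder and decoder.  The function
names are mine, the encoder only handles short-form lengths, and the
decoder is deliberately lenient: it accepts long-form lengths and padded
contents rather than rejecting them.

```python
def encode_integer(value: int) -> bytes:
    """Encode value as a BER INTEGER using the fewest contents octets."""
    bits = value.bit_length() if value >= 0 else (~value).bit_length()
    size = bits // 8 + 1  # one extra bit is needed for the sign
    contents = value.to_bytes(size, "big", signed=True)
    if len(contents) > 127:
        raise ValueError("this sketch only emits short-form lengths")
    return bytes([0x02, len(contents)]) + contents

def decode_integer(encoded: bytes) -> int:
    """Decode a BER INTEGER with a short-form or long-form length."""
    if len(encoded) < 2 or encoded[0] != 0x02:
        raise ValueError("not an INTEGER")
    first = encoded[1]
    if first == 0x80:
        raise ValueError("indefinite length is not valid here")
    if first < 0x80:
        length, start = first, 2            # short form: length in one octet
    else:
        count = first & 0x7F                # long form: count of length octets
        length = int.from_bytes(encoded[2:2 + count], "big")
        start = 2 + count
    contents = encoded[start:start + length]
    if length == 0 or len(contents) != length:
        raise ValueError("empty or truncated contents")
    return int.from_bytes(contents, "big", signed=True)

samples = [
    bytes.fromhex("02 01 01"),
    bytes.fromhex("02 84 00 00 00 01 01"),
    bytes.fromhex("02 84 00 00 00 04 00 00 00 01"),
]
decoded = [decode_integer(s) for s in samples]
```

All three byte strings decode to the same value with this lenient
decoder, while encode_integer(1) produces only the three-octet form.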

At any rate... I apologize for rambling on so long.  I hope the
information has been useful.  I do recommend reading the BER
specification.  It's really pretty short (the entire X.209 document is
only 25 pages, and most of that is whitespace, definitions, and examples).