
OOPS Honesty About AgentX

  1) Please read all the way through before you put on your flame
  2) FYI, The most likely discussion/options points are prefixed with
     DISCUSSION.  There are also 3 PROPOSALS tagged below as well.
     However, please see #1 above.
  3) The material herein applies mostly to Get-Object-PDU and
     Get-Object-Response-PDU messages, though elements of it will
     apply to all.
  4) This material is not for the faint of heart.  It requires
     extensive knowledge of SNMP, AgentX and OOPS.

Many have brought up potential OOPS impacts on AgentX (and Randy said
something like "lets be honest about [OOPS] impact on AgentX", hence
the subject).  Though I have not yet reviewed every aspect of the
situation in intense detail, I do agree that there are some
issues.  Much of the software I am involved with makes heavy use of
AgentX, so I don't want it to be unusable either.  Without change,
AgentX certainly won't be able to help as much toward some of the
goals behind the OOPS work.  Without change, however, some of the
benefits will still be gained.  The important thing is that we find a
solution that will allow at least some gain to be achieved until
subagent protocols can catch up.

This is actually a more generic issue than just subagent technologies.
Specifically, when a new OOPS operation comes into an agent which does
not have mib-instrumentation level changes to support the filtering
and other features provided by the OOPS protocol, what should an agent
do with the incoming request?


  Let's start with a diagram:

                       |  |
                 SNMPv2|  |OOPS
                       V  V
                     |Agent |
                   /     |    \
                  /      |     \
        +--------+  +--------+  +-----+
        |Subagent|  |Internal|  |Proxy|   ...
        |        |  |  API   |  |     |
        +--------+  +--------+  +-----+

        |         Shared Table        |

  This depicts a worst-case scenario: One table is shared across
  multiple methods of access.  Specifically, data may be accessed by
  subagent protocols (such as AgentX), by internal API calls, by
  proxying, and by ...  I'll refer to these methods below as
  "accessors".  It is important to note that different rows in a common
  table may require access through different methods to the data
  contained within.  (XXX: columns and wild-carding).

  When SNMPv2 PDUs (or SNMPv1 PDUs) are being processed by the master
  agent, the master agent simply divides up the request and queries the
  appropriate table-data accessors.  The master agent must carefully
  control processing between these possible multiple access points to
  ensure that GETNEXT operations are properly lexicographically sorted
  when returned in the RESPONSE message.  This means that if the rows
  are allocated as follows:

     row1   subagent1
     row2   internal
     row3   subagent1
     row4   proxy

  The master agent must properly query accessors to return data in the
  RESPONSE messages in exactly this order.  Any other order violates
  GETNEXT PDU processing.  One possible way of organizing calling
  information within the master agent is heavily discussed in the AgentX
  protocol document, so it won't be reiterated here.
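
  As a rough illustration of the ordering constraint (not the
  organization described in the AgentX document), here is a minimal
  Python sketch.  The names make_accessor, master_getnext and walk are
  hypothetical, each accessor is a toy dict, and the row numbers from
  the allocation above stand in for instance OIDs.  The master agent
  asks every accessor for its next instance and returns the
  lexicographically smallest one.

```python
from typing import Callable, Dict, List, Optional, Tuple

Oid = Tuple[int, ...]
Hit = Tuple[Oid, str]
Accessor = Callable[[Oid], Optional[Hit]]


def make_accessor(rows: Dict[Oid, str]) -> Accessor:
    """Toy accessor backed by a dict; stands in for a subagent, API, or proxy."""
    def getnext(after: Oid) -> Optional[Hit]:
        # Python compares tuples lexicographically, like OIDs.
        later = sorted(oid for oid in rows if oid > after)
        return (later[0], rows[later[0]]) if later else None
    return getnext


def master_getnext(accessors: List[Accessor], after: Oid) -> Optional[Hit]:
    """Ask every accessor for its next instance; keep the smallest OID."""
    hits = [hit for hit in (acc(after) for acc in accessors) if hit is not None]
    return min(hits, key=lambda hit: hit[0]) if hits else None


def walk(accessors: List[Accessor]) -> List[Hit]:
    """Repeat GETNEXT from the start of the table until nothing is left."""
    rows: List[Hit] = []
    current: Oid = ()
    while True:
        hit = master_getnext(accessors, current)
        if hit is None:
            return rows
        rows.append(hit)
        current = hit[0]


# Row allocation from the example above (row numbers used as instance OIDs).
subagent1 = make_accessor({(1,): "subagent1", (3,): "subagent1"})
internal = make_accessor({(2,): "internal"})
proxy = make_accessor({(4,): "proxy"})
```

  Calling walk([proxy, subagent1, internal]) visits the rows in
  row1..row4 order no matter how the accessors are listed, at the cost
  of querying every accessor on every step.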

  For an OOPS Get-Object-PDU request, things change a bit.  Rows within
  the Get-Object-PDU are logically organized together within the packet,
  and selected elements of the data are returned to the caller
  (which may or may not contain a complete index set).

  In an agent where the accessor mechanism is any of the above (Internal
  API, Subagent, or Proxy (or ...)) and the accessor functionality
  doesn't understand OOPS optimized access to the data
  storage/functionality, some method of translation must be done.  This
  means an agent has two choices:

  1) attempt to support the request via an internal OOPS->GETNEXT
     conversion
  2) don't support OOPS requests to that table.

  #2 above should probably be discussed first.  If an agent implements
  OOPS, should all objects be accessible under the OOPS PDUs?  Or
  should an agent be allowed to return data only when the underlying
  mechanism supports the needed advanced notions?  Personally, I'm
  more in favor of #2 since it allows for incremental improvements to
  an agent and doesn't require that an agent do a massive update to
  its internal infrastructure.

  Doing an OOPS->GETNEXT internal conversion shouldn't be a huge amount
  of work.  The problem is that it effectively embeds some management
  code into the agent to do the row-wise data collection needed to
  return the appropriate OOPS response.  The cursor field returned in
  the Get-Object-Response-PDU can merely be an encoded OID indicating
  where to restart the GETNEXT traversal when the next request comes
  in.  But what does this mean for the agent (quick summary)?

  a) the agent contains internal management-like code to do data
     collection across older internal APIs and across subagents and
  b) The filtering and data selection still get applied to the
     resulting collected rows.
  c) The overall packet sizes returned on the network should still be
     significantly smaller, due to b) and due to the more efficient
     Get-Object-Response-PDU encoding (over the RESPONSE encoding from
     GETNEXT/GETBULK counterparts).
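
  The conversion summarized above can be sketched roughly as follows.
  This is only an illustration under stated assumptions: the function
  names, the dotted-string cursor encoding, and the row-at-a-time
  getnext_row callback are all hypothetical (a real GETNEXT works per
  variable, not per row), and the filter is an arbitrary Python
  predicate rather than the OOPS filter encoding.

```python
from typing import Callable, Dict, List, Optional, Tuple

Oid = Tuple[int, ...]
Row = Dict[str, object]
GetNextRow = Callable[[Oid], Optional[Tuple[Oid, Row]]]


def encode_cursor(oid: Oid) -> str:
    """Hypothetical cursor: the dotted OID at which to resume the walk."""
    return ".".join(str(n) for n in oid)


def decode_cursor(cursor: str) -> Oid:
    return tuple(int(n) for n in cursor.split(".")) if cursor else ()


def get_object_via_getnext(
    getnext_row: GetNextRow,
    cursor: str,
    row_filter: Callable[[Row], bool],
    columns: List[str],
    max_rows: int,
) -> Tuple[List[Row], str]:
    """Collect rows with GETNEXT-style calls, then filter and select columns."""
    current = decode_cursor(cursor)
    selected: List[Row] = []
    while len(selected) < max_rows:
        hit = getnext_row(current)
        if hit is None:
            return selected, ""  # empty cursor: nothing left to walk
        current, row = hit
        if row_filter(row):
            selected.append({c: row[c] for c in columns if c in row})
    return selected, encode_cursor(current)


# Toy table standing in for an older accessor that only supports GETNEXT.
TABLE: Dict[Oid, Row] = {
    (1,): {"name": "eth0", "status": "up"},
    (2,): {"name": "eth1", "status": "down"},
    (3,): {"name": "eth2", "status": "up"},
}


def toy_getnext_row(after: Oid) -> Optional[Tuple[Oid, Row]]:
    later = sorted(oid for oid in TABLE if oid > after)
    return (later[0], TABLE[later[0]]) if later else None
```

  All three rows are still fetched from the accessor, but only the
  matching rows and requested columns would go out on the wire, which
  is the saving described in b) and c).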

  I was originally thinking that an OOPS-knowledgeable master agent could
  make clever use of the cursor field by encoding a particular subagent
  "id" into the cursor such that the master agent could walk one
  subagent at a time and not have to worry about interleaving row
  results, as it has had to do in the past.  There are two problems with
  this:

    1) Currently, cursors are supposed to be reusable forever, even
       after the master agent reboots.  This causes problems, since
       subagents would then need to be uniquely identified.

    2) subagent 1 can allocate an index, then deallocate it and then
       subagent 3 can reallocate the same index later.  If the row
       ordering of Get-Object-Response-PDU replies must be consistent
       for all time, then there is no way to create a cursor which is
       not based at least in part on a GETNEXT OID across all subagents,
       which defeats half the purpose.  I.e., if ordering must be
       preserved at all times and subagents are allowed to switch data
       from one subagent to the next, there is no way for a master agent
       to guarantee the ordering returned between subagents.
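
  Problem 2 can be seen with a toy model.  In this hypothetical sketch
  (the function name and data layout are made up for illustration), a
  master agent walks one subagent at a time in subagent-id order.  When
  index 3 is deallocated by subagent 1 and reallocated by subagent 3,
  the order in which the same set of rows comes back changes.

```python
from typing import Dict, List, Set


def walk_by_subagent(subagents: Dict[int, Set[int]]) -> List[int]:
    """Walk one subagent at a time (by subagent id), with no interleaving."""
    order: List[int] = []
    for subagent_id in sorted(subagents):
        order.extend(sorted(subagents[subagent_id]))
    return order


# Before: subagent 1 owns indexes 1 and 3, subagent 3 owns index 2.
before = {1: {1, 3}, 3: {2}}
# After: subagent 1 released index 3 and subagent 3 reallocated it.
after = {1: {1}, 3: {2, 3}}

order_before = walk_by_subagent(before)
order_after = walk_by_subagent(after)
```

  Here order_before is [1, 3, 2] while order_after is [1, 2, 3], so a
  per-subagent cursor cannot promise a consistent row order over time.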


    PROPOSAL: Drop the requirement that cursors must be valid for all
    time.  I think the infinite lifetime will cause only harm.  It's
    unlikely managers will need (note I didn't use "want") to keep
    cursor data around forever; it's much more likely they'll only
    use cursors to continue traversal in follow-on Get-Object-PDU
    requests.  So, I'd like to replace the requirement that cursors
    remain valid forever with a requirement that they remain valid
    until the next time the agent reboots.  I think this is a more
    reasonable expectation to impose on an agent.

    PROPOSAL: Discard the requirement that rows must be returned in a
    dependable order.  The more I think about it, the less sure I am
    why I was so determined to put ordering in the document at all.
    If I recall, I wrote that requirement in before cursors were put
    in place in the PDUs, and thus ordering was needed to ensure that
    the skip-objects and max-return-objects fields were usable at
    all.  The important thing is that data not be skipped and that
    duplicates not be returned (though the latter is less important
    than the former, IMHO).  Since cursors are required to provide
    this functionality, I don't see the need to keep the requirement
    that the data be returned in the same order from one
    Get-Object-PDU based walk to the next.

  However, even with these two proposals it is still impossible to
  design a cursor, returned by a master agent for use with subagents,
  that doesn't basically encode the exact GETNEXT-style OID
  semantics.  This is ok, however.  In the future,
  subagent technologies will hopefully incorporate the newer ideas
  behind the OOPS proposal and thus we'll gain an advantage in the

  Subagents don't break indexes into pieces, which makes it difficult
  for a master agent without MIB table knowledge to properly construct
  Get-Object-Response-PDU packets which require that index encodings
  be separated out.

    PROPOSAL: Add a CHOICE element to the index encodings that allows
    master agents to return an OID for an index which can't be broken
    down.  ASN.1-wise, this would mean modifying the ElementSpecifier
    to change the index-number range from 0..4294967295 (which was
    really unnecessarily large in the first place) to -1..2147483647
    such that a value of -1 would indicate the data portion of the
    DataList would be the raw OID instance identifier (with a 0.0
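
  A minimal sketch of how a master agent might apply this proposal,
  under loud assumptions: encode_index and the index_lengths lookup are
  hypothetical, index components are treated as fixed-length, and
  numbering components from 1 is my own choice for the example rather
  than something taken from the OOPS draft.  Only the -1 marker comes
  from the proposal text.

```python
from typing import Dict, List, Tuple

Oid = Tuple[int, ...]
RAW_OID_INDEX = -1  # proposed marker: data is the raw OID instance identifier


def encode_index(
    table: Oid,
    instance: Oid,
    index_lengths: Dict[Oid, List[int]],
) -> List[Tuple[int, Oid]]:
    """Return (index-number, data) pairs for one row's instance identifier."""
    lengths = index_lengths.get(table)
    if lengths is None or sum(lengths) != len(instance):
        # No usable MIB knowledge for this table: hand back the raw OID.
        return [(RAW_OID_INDEX, instance)]
    pieces: List[Tuple[int, Oid]] = []
    pos = 0
    for number, length in enumerate(lengths, start=1):
        pieces.append((number, instance[pos:pos + length]))
        pos += length
    return pieces


# Hypothetical tables: the master agent only knows the layout of one of them.
KNOWN_TABLE: Oid = (1, 3, 6, 1, 4, 1, 99999, 1)
UNKNOWN_TABLE: Oid = (1, 3, 6, 1, 4, 1, 99999, 2)
LAYOUTS = {KNOWN_TABLE: [1, 4]}  # e.g. an integer index plus a 4-octet address

split = encode_index(KNOWN_TABLE, (7, 10, 0, 0, 1), LAYOUTS)
raw = encode_index(UNKNOWN_TABLE, (7, 10, 0, 0, 1), LAYOUTS)
```

  A manager receiving index-number -1 would know it has to parse the
  instance identifier itself, as it does today with GETNEXT results.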

In summary, there are definitely OOPS issues with respect to subagent
protocols.  The proposals above help alleviate some of the problems.
In the meantime, thoughts on the above would be appreciated.

Wes Hardaker
Network Associates Laboratories