
Re: Comments on draft-ietf-snmpconf-pm-02




Frank,

  Thanks for the questions and feedback. My comments are inline.

Frank Strauss wrote:
> 
> Several *Index object types have integer range restrictions
> (0..65535). Why at all? Why 64k?

2 reasons:

  1) SMICng requires range restrictions on index objects. If I didn't
     put them there people would complain that it didn't compile
     cleanly. Sometimes it's better to just submit :-)
  2) It's useful to declare that 0 is a valid value. Otherwise this
     would be an FAQ.

Unfortunately, when you put in a range, sometimes you have to make an
arbitrary decision on one end of the range. That's what the 65535
represents (a number larger than reasonably possible). One of the
chief benefits of policy-based management is that it reduces the
number of management decisions that need to be made (or, in plain
English, there won't be that many policies), so a limit of 64K
seems reasonable.

> Why is pmPolicyFilter restricted to a maximum size of 65535 octets?
> If there's a good reason for this size, why is pmPolicyAction not
> restricted?

Because the SMI restricts octet strings to 64K (65535 octets). It was
my intent to be consistent across both objects, but I missed
pmPolicyAction.

> Why is pmPolicyFilterMaxLatency of type Integer32 (signed) but restricted
> to 0..2147483647? Why not just Unsigned32?

Good idea.

> What is the purpose of pmPolicyActionMaxLatency? At first sight, it
> seems like an action is called exactly when a filter is evaluated being
> true. Thus the action latency simply depends on pmPolicyFilterMaxLatency
> and the result of the filter evaluation.

This architecture allows the agent to implement them coupled (more
obvious and simpler to implement) or decoupled (provides more
scalability for larger systems).
  Where coupled means "if (filter) then (action)"
  Where decoupled means:
        Once every filterMaxLatency period
          Evaluate filter and keep a list of matching elements
        Once every actionMaxLatency period
          Evaluate action for every element currently in matching list

Here's the argument for the decoupled mode:

  Usually we think of the actions being more complex and costly to run
  than the filters because they execute set requests, but sometimes
  the filter is more expensive than the action. For example:
     "Select the several trunk ports on a high-density access
     concentrator and configure the line protocol"
    In this case, I might be able to re-assert the correct line
    protocol every 30 seconds since it only touches several ports but
    I'd be unwilling to search through all ports more than once every 30
    minutes.

So, treat FilterMaxLatency as the maximum amount of time you are
willing to wait for an element to be "discovered" as a target of a
policy (or to be discarded if it no longer matches).
Treat ActionMaxLatency as the maximum amount of time you are willing
to leave an element without re-asserting the desired state.

When implementing coupled mode, you simply run the
if/then statement no less frequently than dictated by
MIN(filterMaxLatency, actionMaxLatency).
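
To make the two modes concrete, here is a rough sketch in C of how an
agent might decide what is due. The struct, the field names and the
three functions are made up for illustration (nothing like them is in
the draft), and times are assumed to be in seconds.

```c
#include <stdbool.h>

/* Hypothetical per-policy scheduling state; names are illustrative only. */
typedef struct {
    unsigned filter_max_latency;  /* seconds, cf. pmPolicyFilterMaxLatency */
    unsigned action_max_latency;  /* seconds, cf. pmPolicyActionMaxLatency */
    unsigned last_filter_run;     /* time of the last filter pass */
    unsigned last_action_run;     /* time of the last action pass */
} policy_sched;

/* Coupled mode: one period for the whole if/then statement. */
unsigned coupled_period(const policy_sched *p)
{
    return p->filter_max_latency < p->action_max_latency
         ? p->filter_max_latency
         : p->action_max_latency;
}

/* Decoupled mode: is a filter pass due at time `now`? */
bool filter_due(const policy_sched *p, unsigned now)
{
    return now - p->last_filter_run >= p->filter_max_latency;
}

/* Decoupled mode: is an action pass over the matching list due? */
bool action_due(const policy_sched *p, unsigned now)
{
    return now - p->last_action_run >= p->action_max_latency;
}
```

With the trunk port example (filter every 30 minutes, action every 30
seconds), coupled mode would have to run the expensive filter every 30
seconds, while decoupled mode only re-runs the action that often.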

> The pmPolicyPrecedence DESCRIPTION confuses me completely. ;-) Furthermore,
> it says `These values must be unique on the local policy system...' which
> is difficult to ensure, I think.
>
> It is not possible to lookup pmPolicyTable rows by pmPolicyGroup,
> since pmPolicyGroup is no INDEX element. However this might be useful.

At the last interim meeting we agreed that each of the several
proposals for grouping and priority (these objects were one) were
flawed and that we were taking them off the table until we could get
it right.

> The pmPolicyMatches DESCRIPTION says `The number of policies ...' while
> it means the number of elements. However, I do not see much value in
> this object type. When a manager reads this variable, it does not know
> at which stage in policy evaluation this value is current.

Yes, it should say "The number of elements ...".

Once filterMaxLatency time has passed, you know this value will be
current.

Regarding how valuable this is, this object is intended to:
  A) Be a first-level quality-control indicator for the policy author
     (did it match roughly the number of elements expected?)
  B) Be helpful to a technician (how much effect will disabling this
     policy have?)

> The pmElementTypeRegOIDPrefix has a DESCRIPTION that explains the
> separation of the prefix and the instance identifying part, which
> consists of N sub-identifiers. This is what I guess is
> correct. However, the text describes that "$1" can be used in place of
> any (single?!) decimal sub-identifier, e.g. in section 6.2.1.
> 
> I think "$1" is not a good name, since there cannot be a "$2". How
> about "*"?

This is work in progress. One way is to have each subid be a separate
parameter $X. The other way is for them to be one parameter and supply
tools for breaking them apart. I have to try some real-world examples,
but I think that $1, $2, $3, ... will be vastly preferable.
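
As a rough sketch of the "one parameter per subid" option, assuming an
OID is held as an array of sub-identifiers: the helper name
split_instance and its signature are hypothetical, not something the
draft defines.

```c
#include <stddef.h>

/*
 * Hypothetical helper: if `oid` starts with `prefix`, copy the trailing
 * instance sub-identifiers into params[] so that params[0] plays the
 * role of $1, params[1] of $2, and so on.
 * Returns the number of parameters, or -1 if the prefix does not match
 * or params[] is too small.
 */
int split_instance(const unsigned long *prefix, size_t prefix_len,
                   const unsigned long *oid, size_t oid_len,
                   unsigned long *params, size_t max_params)
{
    size_t i, n;

    if (oid_len < prefix_len)
        return -1;
    for (i = 0; i < prefix_len; i++)
        if (oid[i] != prefix[i])
            return -1;

    n = oid_len - prefix_len;
    if (n > max_params)
        return -1;
    for (i = 0; i < n; i++)
        params[i] = oid[prefix_len + i];
    return (int)n;
}
```

For example, with the ifIndex prefix 1.3.6.1.2.1.2.2.1.1 and the
instance 1.3.6.1.2.1.2.2.1.1.17, this yields a single parameter, so $1
would be 17. A table with a two-part index would yield $1 and $2.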

> It's explicitly stated that `no state is remembered from the previous
> invocation' in filters and actions. So: how can counters be evaluated
> by a filter in a meaningful way?

I'll answer "how" below, but first I'd like to discuss why we
don't want to remember state - the reasons are complicated and
vitally important.

In this discussion, let's say that we have 3 policies (1-3) and 5
elements (A-E). A particular execution context would be a pairing of
one of each (policy 2 on element D is 2D). In the most obvious case,
we would evaluate 1A, 1B, ..., 1E, 2A, ..., 2E, 3A, ..., 3E, one at a
time and in order. However, this might not always be the case.

First of all, we want the scheduler to be free to execute these
contexts in any order. This freedom provides simplicity and greater
scalability because it can execute multiple policies in parallel or
suspend a blocked execution while others continue. Therefore
context 2C can't assume the previous invocation was context 2B.
  (This is the rationale for not using static variables).

About the only thing we COULD rationally do is have context 2C keep
state from the last execution of context 2C.

Each combination of policy and element is a separate thread of
execution, so we could use threads, but we don't want to have to
support P*E threads (we must allow P*E to be a very large number,
say P=100 and E=10,000, i.e. 1,000,000 contexts). So threads are out.

We could automatically remember the last state of all automatic
(local) variables, but this could be a huge burden, especially
considering that many scripts will have multiple local variables but
few will need to remember any from invocation to invocation.


However, there may be times that we need to remember state from one
invocation of a context to the next. Here are 3 ways (sorry for the
shorthand):

A) Scratchpad MIB:
   Sparse table with one read-create column

   scratchpadPolicyIndex Integer32,
   scratchpadElement     RowPointer,
   scratchpadVariable    Integer32, -- allows multiple vars per context
   scratchpadValue       OCTET STRING  -- read-create,
                                       -- the only accessible column
   INDEX { scratchpadPolicyIndex, scratchpadElement,
           scratchpadVariable }
   scratchpadValue.policyIndex.elementIndex.variableIndex

   If, for ifIndex #17, we retrieved inoctets = 49 and outoctets = 35
   for policy #3, we could store:

     scratchpadValue.3.`ifIndex.17`.1 = 49
     scratchpadValue.3.`ifIndex.17`.2 = 35
   
B) Scratchpad accessor function
   
   void setscratchpad(int varIndex, char *value)
   char *getscratchpad(int varIndex, char *value)

   These functions implicitly know the policyIndex and element and
   store the value internally, associated with this context.

   varIndex lets you save multiple variables per context
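
   A rough sketch of how these accessors might look, under some loud
   assumptions: the scheduler tells the library which context is
   running through a hypothetical set_current_context() hook, storage
   is a small fixed-size table, and the getter is simplified to return
   the stored string rather than take a second argument.

```c
#include <stdlib.h>
#include <string.h>

#define MAX_SLOTS 64   /* arbitrary capacity for this sketch */

typedef struct {
    int   used;
    int   policy, element, var;   /* key: context plus varIndex */
    char *value;                  /* private copy of the stored string */
} slot;

static slot slots[MAX_SLOTS];
static int cur_policy, cur_element;

/* Hypothetical hook: the scheduler calls this before running a script. */
void set_current_context(int policy, int element)
{
    cur_policy = policy;
    cur_element = element;
}

/* Find the slot for (current context, var); optionally claim a free one. */
static slot *find_slot(int var, int create)
{
    slot *free_slot = NULL;
    int i;

    for (i = 0; i < MAX_SLOTS; i++) {
        slot *s = &slots[i];
        if (s->used && s->policy == cur_policy &&
            s->element == cur_element && s->var == var)
            return s;
        if (!s->used && free_slot == NULL)
            free_slot = s;
    }
    if (create && free_slot != NULL) {
        free_slot->used = 1;
        free_slot->policy = cur_policy;
        free_slot->element = cur_element;
        free_slot->var = var;
        free_slot->value = NULL;
        return free_slot;
    }
    return NULL;
}

void setscratchpad(int varIndex, const char *value)
{
    slot *s = find_slot(varIndex, 1);
    char *copy;

    if (s == NULL)
        return;                 /* table full: dropped in this sketch */
    copy = malloc(strlen(value) + 1);
    if (copy == NULL)
        return;
    strcpy(copy, value);
    free(s->value);             /* free(NULL) is a no-op */
    s->value = copy;
}

/* Returns the stored string, or NULL if this context saved nothing. */
const char *getscratchpad(int varIndex)
{
    slot *s = find_slot(varIndex, 0);
    return s ? s->value : NULL;
}
```

   The script itself never names the policy or the element, so context
   2C cannot read or clobber what context 2B saved.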

C) Counter specific accessor function

   int deltaValue(int counterValue)

   This accessor function remembers the counterValue last passed to it
   by this context and returns the difference between the new value
   and the remembered one.
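
   A minimal sketch of such a function, with several assumptions of my
   own: it uses unsigned 32-bit values (rather than int) so that a
   single Counter32 wrap still gives the right answer, it keeps one
   remembered value per context, it returns 0 on the first call, and
   the running context is supplied by a hypothetical
   delta_set_context() hook.

```c
#include <stdint.h>

#define MAX_CONTEXTS 64   /* arbitrary capacity for this sketch */

typedef struct {
    int      used;
    int      policy, element;   /* the execution context */
    uint32_t last;              /* counter value seen last time */
} delta_state;

static delta_state states[MAX_CONTEXTS];
static int dv_policy, dv_element;

/* Hypothetical hook: the scheduler calls this before running a script. */
void delta_set_context(int policy, int element)
{
    dv_policy = policy;
    dv_element = element;
}

uint32_t deltaValue(uint32_t counterValue)
{
    int i, free_i = -1;

    for (i = 0; i < MAX_CONTEXTS; i++) {
        delta_state *s = &states[i];
        if (s->used && s->policy == dv_policy && s->element == dv_element) {
            /* Unsigned subtraction is modulo 2^32, so one wrap is fine. */
            uint32_t delta = counterValue - s->last;
            s->last = counterValue;
            return delta;
        }
        if (!s->used && free_i < 0)
            free_i = i;
    }
    if (free_i >= 0) {
        states[free_i].used = 1;
        states[free_i].policy = dv_policy;
        states[free_i].element = dv_element;
        states[free_i].last = counterValue;
    }
    return 0;   /* first call from this context: nothing to compare with */
}
```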


Finally, let's keep some rigor in our requirements analysis (i.e. a
little skepticism). I can't imagine a filter or action script making a
meaningful policy-based decision based on the 'instantaneous' delta
value derived from 2 counter values (whether they were collected 2
minutes apart or 2 hours apart). It seems that at least some smoothing
is required.
  What type of policies would depend on traffic or error rates?
  What types of rates would they need? smoothed? peak?
  Where is the best place to get those rates? (another MIB perhaps?)
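
Just to illustrate what "smoothed" could mean, here is one common
technique, an exponentially weighted moving average, sketched in C.
This is my own example, not something the draft proposes; the type and
function names and the choice of alpha are all assumptions.

```c
/* Hypothetical smoothing state; `primed` is 0 until the first sample. */
typedef struct {
    double value;
    int    primed;
} ewma;

/*
 * Fold one rate sample into the average. alpha is in (0, 1]:
 * larger values track new samples faster, smaller values smooth more.
 */
double ewma_update(ewma *e, double sample, double alpha)
{
    if (!e->primed) {
        e->value = sample;      /* first sample seeds the average */
        e->primed = 1;
    } else {
        e->value = alpha * sample + (1.0 - alpha) * e->value;
    }
    return e->value;
}
```

Note that this needs remembered state per context too, which is part of
why I wonder whether another MIB is the better place for it.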

> The DESCRIPTION of pmElementTypeRegName says it's a `description'.
> I guess it should be a `name'.

I think you're right.

> The MIB module contains various comments. This is no good idea, since
> this information is not accessible. Important information should be
> places in DESCRIPTIONs. Unimportent information should be removed.

Well, the way it is now is best for someone reading the MIB (and
admittedly leaves some high-level stuff out for someone reading MIB
objects through a MIB browser). Personally I think the latter is a
dying breed.

We can:
1. Leave it how it is.
2. Move it into the table description.
3. Duplicate it into both comment and table description.

I don't care that much - let's see what the consensus is.

> The pmCapabilitiesTable represents a new way to express agent
> capabilities, but with a limited scope. I think it is not a good idea.

Agent capabilities describe the capabilities of the instrumentation
(i.e. the SNMP agent), are static, and are not accessible by SNMP
retrievals.

The pmCapabilitiesTable describes the capabilities of the managed
system; its contents are dynamic and are accessible by SNMP retrievals
(as well as by the capMatch accessor function).

In particular, you need this functionality if you want your policy
to know if diffserv is currently supported by the card plugged into
slot #3.

Jon wrote the current iteration of the capabilities table so I'll let
him field your questions about it.



Regarding the rest of your valuable feedback below, I'll fix those
items in the next draft.

> Some terminology is used not very consequently and not explained
> clearly. Sections 5.2 and 5.3 introduce the term `script', while this
> mean `filter' and `action'. The term `address' is introduced in 5.1
> but not used at other places, e.g. where "$1" is explained in 6.2.1
> or in the pmElementTypeRegOIDPrefix DESCRIPTION.
> 
> The grammar in section 6.1 containes weired quotes for the
> unary_operator and char rules.
> 
> Section 6.2.1 contains the conditional sentence `...when the policy
> MIB module resides on the same system as the manages elements.' This
> conflicts with section 4 which says `These policies are executed on
> managed devices, where the objects live [...] and where operations on
> these objects will be performed.
> 
> getstring() (section 6.2.2) returns a string without a length
> (probably NULL terminated). This means that no arbitrary octet strings
> (containing zero bytes) can be handled.
> 
> Section 6.2.5 is named strcmp() but describes strncmp().
> 
> Accessor function names look quite arbitrary to me. Some use only
> lowercase letters, others are mixed. Some use `_', others don't.
> strcmp() is named after a well-known libc function, lc_strcmp()
> is not (strcasecmp()).
> 
> What happens if "$1" appears in an action, but not in the filter?
> 
> Examples of useful filter/action pairs should be added.


Thanks,
Steve