[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: snmpconf-pm-04 notes continued




Wes,

  Sorry I've backburnered your note for too long. Now that I have the
draft out I've been able to finish my response. Note that a lot of the
fixed things are in the 05 draft already.

Wes Hardaker wrote:

> ***
> 22) The most blunt of my statements today:  The pmNotificationRegTable
>     is redundant with the existence of the NOTIFICATION-MIB and, IMHO,
>     should be dropped from the mib entirely (but, obviously, the
>     notifications definitions themselves should remain).  Duplication
>     of standardized work is a bad thing.

I need to reconstruct my rationale on this. If I can't justify it, I'll
delete it.

> 23) The mib is sprinkled with text in the form of comments that
>     describe the table(s) below them.  This is a pet peeve of mine, as
>     I frequently use mib parsing tools to browse through mib
>     descriptions where I'd expect that information to be properly
>     kept.  In particular, I'd like to suggest that the text preceding
>     the following mib items should be moved into the items themselves:
> 
>        pmRoleESTable [this one has to be done carefully, as it
>                      describes two tables not just one]
>        pmElementTypeRegTable
>        pmCapabilitiesTable
>        pmDebuggingTable
>        pmNotificatioonRegTable [unless this table is dropped; see above]
> 
>     I'd actually suggest a note be put into the -bcp document as well
>     describing that comments should never be used to describe
>     functionality of objects, but rather to mark things to remember, etc.

The problem with this is that often it inhibits readability for those
reading the document the regular way. At the beginning of a non-trivial
table I like to describe why it is there and a high-level overview of
how it works.

What I've done is fixed up all the comments in MIB, ensuring that every
piece of "specification" that was in a comment was also in the MIB
objects and in cases where there was text that wasn't appropriate for a
high-level intro, I delete the text. I think you'll be happy with the
results although I didn't do the notification table yet pending my
getting my act together on your comment #22.

> 24) the capMatch function definition is not clear as to what the OID
>     should be.  After reading everything, I'm guessing that it should
>     be an oid corresponding to a pmCapabilitiesType definition, but I
>     could easily accidentally read it as possibly a
>     pmCapabilitiesSubType OID or even a row pointer into the
>     pmCapabilitiesTable itself.
> 
>     (while I'm at it, the capabilities table seems almost redundant
>     with respect to the sysORTable.  There is a lot of overlap and it
>     took me a while to figure out why both are needed: the table
>     defined here references mib accessible changes, while the
>     sysORTable contains only a pointer to a mib object.  I'd be
>     tempted to write a sysORTable agent-capabilities mib table instead)

I agree that the oid needs to be specified. 

I think there are a number of issues regarding the capabilitiesTable.
I'll be writing about them soon in another note.

> 25) section 5.3 implies that an action is always to be run after a
>     filter has evaluated to true, but 5.2 dictates that it's only to
>     be executed within the time frame of policyFilterActionMaxLatency.
>     It's possible that the action will not be run given a filter's
>     output if the corresponding value of policyFilterActionMaxLatency
>     is greater than twice the of pmPolicyFilterMaxLatency.  Acutally,
>     the whole issue of those two objects is rather odd.  The way I see
>     it, these two objects allow for two different management functions
>     to be run separetly:
> 
>       1) runs a filter at least as frequently as
>          pmPolicyFilterMaxLatency and remembers the results.
>       2) runs a set of actions based on the results from #1 above with
>          at least the frequency of policyFilterActionMaxLatency.

Yes. The first controls how urgently you update the set membership (of
the set of matching elements). The second controls how urgently you
enforce the policy on members of the set.

Another very relevant thing is the rule that states that when an element
first matches, the action is run immediately. The fact that actions run
periodically isn't designed to deal with changing set membership. It's
there to make sure that if a MIB variable changes from the policy, that
it will be set back in less than policyActionMaxLatency seconds. This
makes your "odd" scenarios below seem less odd.

Some elements appear and disappear more frequently. This suggests a
lower filterMaxLatency.
Some elements are thrown out of policy more frequently by outside
influence. This suggests a lower actionMaxLatency.
Some elements cannot withstand to be out of policy for more than brief
moments. This suggests a lower actionMaxLatency.
Some elements are expensive to run the filter on. This suggests a higher
filterMaxLatency.
Some elements are expensive to run the action on. This suggests a higher
actionMaxLatency.

>     The odd thing is that these items allow you to configure the row
>     in very odd ways, like the example described above where
> 
>       policyFilterActionMaxLatency > pmPolicyFilterMaxLatency*2
> 
>     which might produce both cases of an action not running for every
>     filter check, and a case where an action may not get run at all
>     (if a filter evaluates to true then false during the second
>     reading).  (I guess 5.3 also states it must be run immediately
>     upon the first discovery of a filter that returns true, but future
>     invocations may not happen).

The "immediate" clause is key here. Also, note that if it disappears and
comes back, it will be run immediately the second time as well.

>     Another odd state would be where
> 
>       policyFilterActionMaxLatency < pmPolicyFilterMaxLatency*2
> 
>     which would mean that the action would run even though a recent
>     check hadn't been performed to make sure it should be run (which,
>     I guess, is not *that* odd).
> 
>     So if an implementation didn't want to separate these two phases
>     of collection and responding, they should do filtering/action
>     running based on min(policyFilterActionMaxLatency,
>     pmPolicyFilterMaxLatency)?

Yes.

> ***
> 26) setRowStatus() comes up with a random index based on maxTries, but
>     is somewhat inefficient if some knowledge of the table exists
>     already.  I'd really like to specify a starting value as well, in
>     case the table contents were exceptionally large (and I'd default
>     to 0 if I knew nothing about the contents).  If I need to create 5
>     rows, I'd like to start my 2nd-5th calls searching with the last
>     assigned value from the previous runs.

Sounds like a good idea. I've added it.

> ***
> 27) Its still unclear to me when in the API (if ever) you're allowed
>     to reference oids by textual name.  I'm gathering that never is
>     the case for all APIs?  If so, I'd replace the examples (like
>     11.1.2) that reference textual oids with numeric ones (even if you
>     state earlier than some management apps may convert them for you
>     before uploading the code).

I want the reader to understand what the example is attempting to do.
Further, we explicitely suggest that users be given the ability to
program scripts with descriptors (which are translated before script
installation). The fact that the over-the-wire representation does not
incude descriptors is VERY explicitely spelled out right after the
example.

> 28) 11.1.2.2
>         type is the type of the value parameter and will be set to
>         one of the values for DataType Constants.
>      ->
>         type will be the type of the value parameter and will be set to
>         one of the values for DataType Constants.

Done.

> ***
> 29) snmpsend() should really really really pass back the errindex (into
>     a new parameter).  Efficiently parsing through a huge varbind list
>     to find the source of the error would be otherwise impossible.

Haven't had a chance to take a look at this closely yet, but I will.

> ***
> 30) I'd suggest a new function called "logDebugString()" that would
>     log a string to the pmDebuggingTable if the row was configured
>     with debugging turned on (otherwise it'll result in essentially a
>     noop()).

I'm still thinking about this.

> 31) It's kind of odd that you declare the constants as "const" even
>     though the language doesn't allow the "const" token.  I'd be
>     happier if the document just declared that all constants can be
>     treated as an integer and simple list the name and value pairs
>     without trying to translate them into C-like statements at all.

It's just a notational convenience understood by many, immediately
preceeded by text clearly spelling out that const isn't in the language
. There was an extensive discussion about this in Knoxville in which a
roomfull of people convinced themselves this is the best thing so I
think there is clear consensus to do it this way.

> ***
> 32) I think all the constants should be matched with the similar BER
>     tag encodings where possible (just for ease-of-implementation
>     issues).  For instance, a string shouldn't be const "2" but rather
>     "4".  Similarly, the error constants like NosuchObject should
>     probably be "128" instead of "21", etc.  Non-snmp related
>     constants should then be moved further away from the most likely
>     places for the SNMP protocol spec to declare future additions (bad
>     parameter might move to 32767 or something).

I'll see what I can do with this.

> 33) setScratchpad(varName, scope, value) should have its arguments
>     reordered to (scope, varName, value).  Just makes more sense,
>     IMHO, to have name and value next to each other.

This makes sense. I'll make the change.

> 34) Many functions like getScratchpad, for example, don't state what
>     will be returned in variables like getScratchpad's value argument
>     when an error has occurred.  I'd suggest explicitly stating that
>     the value is unchanged from before the call, or that it is
>     undefined (I'd prefer it remain the same for consistency of
>     programming).

I agree with your proposed semantics. I'll try to track each down and
fix them.

> **
> 35) Throughout the document strings and integers are frequently mixed
>     and shouldn't be unless this is legal (which from my reading, I
>     don't think it is).  The "Scratchpad Usage Examples" section, for
>     instance, has the return values of the getScratchpad call being
>     '== 55' for example, when it should really be '== "55"'.
> 
>     I'd be somewhat happier as a scripting programmer if the values
>     were indeed inter-usable (automatically converted as needed), as
>     perl's variables are (since you reference it earlier in the
>     document).

I agree. You'll be happy to see that we now have that.

> 36) oidncmp() has an argument "int n" that isn't defined in the text.
>     I assume its the length against which to compare the oids in
>     question.  So if I want to compare 2 oids to their completion, do
>     I:
> 
>     a) oidncmp(oid1, oid2, 0)
>     b) oidncmp(oid1, oid2, 128)
>     c) int o1, o2, otmp;
>        o1 = oidlength(oid1);
>        o2 = oidlength(oid2);
>        if (o1 > o2)
>          otmp = o1;
>        else
>          otmp = o2;
>        oidncmp(oid1, oid2, otmp);

I'll fix it.


> 37) subidwrite() and subid() says their  "int n" argument must
>     start at "0".  I'd suggest replacing the argument instead with
>     "unsigned int n", which solves the problem the correct way (and
>     forces a complier check).  In fact, I think all the function
>     definitions could probably use a once over for this type of
>     checking.  The "int subid" of subidwrite(), for example, is
>     another case where it should probably be an unsigned int (since an
>     oid can't have negative values in it).

I was trying to say indexing doesn't start at 1, that zero names the
first element. unsigned doesn't say that.

> 38) A note should be put in subidwrite() saying that appending is
>     possible through the string concatinator operator (+).

Good idea.

> 39) the definition of oidsplice() is somewhat ambiguous.
>     Specifically, "oidsplice replaces n subidentifiers in oid1 with
>     all of the subidentifiers from oid2 starting at the m'th
>     subidentifier of oid1".  So, does this mean that:
> 
>       oidsplice(".1.2.3.4", 2, ".5.6.7.8.9", 1)
> 
>     will result in ".1.5.6.7.8.9.3.4"?  Thats how I read it, but I
>     don't know if thats correct.

Yes, it's correct. I've reworded it a bit to make it more clear.

> 40) library accessor functions uses atoi, but I'd rather have strtoul
>     around instead if possible.

They are both unnecessary now.

> 41) library accessor functions defines random, but doesn't state a
>     minimum acceptability of random-ness provided by the function.
>     Just a thought.

It's supposed to use the POSIX standard random() function.

> 42) Why are memcmp() and memmove() in the library accessor functions?
>     We have no pointers?

They're historical, but gone now.

> 43) substr() would be a good type of function to add to the list of
>     functions.

It's there now.

> 44) You should probably import UTF8String from SYSAPPL-MIB::Utf8String
>     instead of defining your own.  Though you don't put a size
>     restriction on yours, but yet your display hint has one???

The length restriction is a problem for us. We also had that problem
with SNMPAdminString.

The text of the definition itself came from SNMPAdminString which has 
a more extensive definition of a UTF8 String than SysApplMIB. The 2 
differences are:
1. The definition doesn't restrict it to administrative information
2. There is no length restriction.

> ***
> 45) You're tables are frequently indexed by arbitrary integer numbers,
>     which is another pet-peeve of mine.  This number has no meaning
>     what so ever and iterating through a table is always required if I
>     need information.  I'd much rather that things be indexed by a
>     string where possible.  For example, the pmPolicyIndex should be
>     "pmPolicyName" instead and be an snmpAdminString (IMHO, of
>     course).  Most of the other tables could use this type of indexing
>     change as well.  I'd be happy to come up with a list if the idea
>     is acceptable.

There are many considerations to take into account in making these
decisions. The current indexing scheme reflects these many
considerations. In some circumstances that has resulted in string/oid
indexes, in other cases it hasn't.

My modus operandi is to try to use a "natural" index if it exists. In
the case of policies, there is no natural index. Trying to fake one
leads to multiple-manager problems.

OK, there is one place where another rule took precedence. The rule: I
hate OID indexes. The pmElementTypeRegTable is a short table so scanning
cost wasn't really an issue and it gave me an option to avoid the OID
index (I used an integer index). In other tables where the table was
long I used the natural OID index. If this is really important, I'm not
violently opposed to changing the pmElementTypeRegTable.

> ***
> 46) If pmPolicyPrecedence values must be unique, then I *highly*
>     recommend making that the index (I know, I'm contradicting what
>     I'm suggesting in #45 above but now the number would have
>     significance).

They are no longer unique.

(I don't think you were contradicting yourself - I think your pet peeve
is against non-natural (unnatural?) indexes, not against integer
indexes.)

> 47) A flow diagram or state table describing pmPolicyPrecedence is
>     greatly needed.

Agreed.

> 48) Any reason why pmPolicyDescription can only be 255?  I'd be
>     tempted to allow larger sizes (and implementations can reduce the
>     length and document it in their agent-capabilities statements).

It seems like a reasonable limit. But if I hadn't put it there someone
would've said "Any reason why pmPolicyDescription doesn't have a size
clause?" There's those who want to put size limits on everything and I
keep them in mind when (I think) it's not harmful, but then I get the
opposite questions from others. I wish I could be left out of the
debate.

> 49) pmPolicyMatches: "The number of elements that are currently
>     matched by the associated pmPolicyFilter".  When is this number
>     determined?  At the time of the last filter run or does the filter
>     run right now to come up with the number?

At the time of the last filter run.

> 50) I don't understand the difference between
>     pmPolicyAbnormalTerminations and pmPolicyExecutionErrors?

I've updated them, try reading them again.

> 51) Why are pmPolicyAbnormalTerminations and pmPolicyMatches writable
>     (and Gauges for that matter)???

Typo. Fixed.

> ***
> 52) pmPolicyStatus should indicate if any of the previous row columns
>     are mandatory to set before coming active?  Certainly some objects
>     need setting, like the pmPolicyPrecedence object before it can
>     become operational.  Default values might be needed for some of
>     the other objects as well (like the latency objects).

I'll work on this.

> 53) pmPolicyCodeText has a type "The code" in the middle of the
>     description that probably is either supposed to be removed or
>     added to the beginning of the next sentence.

Fixed.

> ***
> 54) pmElementTypeRegTable should be indexed by the
>     pmElementTypeRegOIDPrefix OID in question to help managers
>     determine if an entry already exists for that object.
> 
>     Additionally, I'd suggest that the index should point to the table
>     object and a second "column number" object be created to indicate
>     what column should be used when searching for sub-objects.  This
>     would allow guaranteed uniqueness of the pmElementTypeRegOIDPrefix

Let me think about both of these.

> 55) pmElementTypeRegMaxLatency should give a pointer as to what to do
>     after a new element is discovered.

The RegTable and RegOIDPrefix objects contain that info.

> ***
> 56) pmCapabilitiesTable should be indexed by the type and/or subtype
>     for faster traversal when you're looking for a particular
>     capability.  (The type object really needs a better description,
>     since I have no idea what its value should indicate other than it
>     has been "assigned by IANA".)

This needs work.

> 57) pmTrackingElementToPolicyElement:
>    "The element this policy is configuring" -> um...
>      "The element that is configured by the policy defined in this row."
> 
>     I don't really like that either, though, but it does need to
>     change, IMHO.

I've reworded it.

> 58) Why define "off" in pmTrackingElementToPolicyStatus (and
>     pmTrackingPolicyToElementStatus) at all?  Why not just say that
>     the row doesn't exist unless the policy is on.  The enum value
>     will *never* get used.

I'm thinking about it.

> 59) Can we force creation of rows into the
>     pmTrackingElementToPolicyTable by setting the
>     pmTrackingElemenToPolicyStatus object to "forceOff"?  IE, so its
>     definition is remembered when it comes active via the calendar
>     later on?

Yes. I'll clarify this in the text.

> ***
> 60) most of these tables could use a "StorageType" object.
> 
>     (note, the suggested use of the StorageType object should probably
>     be mentioned in the -bcp document as well).

Yes, it's been on the todo list for a while. They're now in the draft.

> 61) A minimum/maximum number of support entries in the
>     pmDebuggingTable should be defined somewhere (the maximum being a
>     mib object returned by an object in the agent).  Also, roll over
>     of the log entries is done by, what, time?  A time index would
>     probably be a good thing for the table as well.

I'll look into it.

> 62) notification registration table coments, step 4:
>     downloaded -> uploaded & download -> upload

Done


Thanks again for all the feedback.


Steve