
snmpconf Interim Meeting Notes Day Two




The following are the interim meeting notes for day two of the
SNMPCONF working group interim meeting that took place in Pittsburgh
PA on August 4 and 5, 2000.

If you have comments or clarifications, please send them to
moulton@snmp.com.  Please submit any changes you have by
mid-day Friday September 8.  I will merge changes and
submit the minutes to the IETF on Friday, September 8, and resubmit
them to this list if there are significant changes.


SNMPCONF 8/5/00  9 AM

Jon Saperia was the first moderator for the day.  He started the
meeting by stating that he would like to pick a few issues that are 
solvable so that we have some progress by the end of the day.

Joel Halpern (now Policy Framework WG Co-Chair) has agreed to lead a
discussion on what we mean by policy.


Lists of items that we may tackle today (not in any order):

    Review of Policy Override                 (90 mins)
    Capabilities table in BCP                 (scratch)
    Schedule and Time                         (45 mins)
    testing and debug                         (deferred)
        what about diagnostic information - what might be required
        for larger scale scripts
    fate sharing and groups - do groups have special meaning in eval
        possible use of groups with precedence, groups could be
        used to collect all members of one class (all QoS) or
        in order to pick a member of a class (best security option
        from the list)                        (90 mins)
    diffserv policy module                    (5 mins)

    planning next steps                       (30 mins)
    discussion of policy examples             (45 mins)


David Partain asked from the floor if there is any chance of getting
closure on the language issue.  He would like to know whether he can
take what is there and run with it, or whether we are going to spin our
wheels on it a little bit more.  This affects many other issues.  It
would be nice to have some resolution.

The comment was made that some of yesterday's difficulty was that we
were trying too hard to reach closure on everything in the absence of
examples.

Jon Saperia stated that there does not seem to be a consensus to add 
language to today's topics.

There was a discussion that resulted in the table above.

Andy Bierman requested that we discuss putting mechanisms in the MIB
to help debug when configuring arbitrarily large systems.   He does not
want to schedule time at this point though.


Topic 1: Discussion of Policy Examples.  Led by Joel Halpern.

Joel:  At this point I am trying to collect suggestions as to what
policy is.  What kinds of problems are we trying to solve?

Many dimensions were brought up, including

.  Creating a persistent change (sticks across reboot) versus a 
   temporary change.

.  Cause a change on all entities running a particular application

.  Selection based on vendor, model, software/firmware versions,
   installed capabilities.

.  Selection based on resource availability, disk usage, cpu usage,
   forwarding capability.

.  Selection based on utilization

.  Selection based on role.

A specific case was mentioned, where one has a configuration that
one wants to apply to N different systems, but each is slightly 
different.  Sometimes the selection substitutions are easy and
sometimes they are hard.  An example: IP address/netmask substitution
is easier than distributing a configuration that is thresholding
on a temporary storage area on a variety of different systems.  This
is quite different on NT vs. Unix, or on Cisco vs. Nortel.  We need
to hide that from an operator who might not be completely aware of
the implications.

Subsequent discussion brought up many issues. Many vendors
currently have proprietary solutions for doing these types of things,
where we are trying to evolve an interoperable multi-vendor
environment.  It is normal to provision for what you care about, and
services/customers that don't receive any preference get the
leftovers.  The opposite approach is to limit by service (say let
video have as a maximum 30% of the bandwidth), but many will want to
do this.  It was mentioned that this approach may have multiple ways
of being provided and that stating the policy goal in this way may
lead to an under-constrained solution.

Other possible requirements were brought up:

.  The ability to specify roles to the system about how to react over time
   to configuration or state changes.

.  For security reasons, the ability to disable all ports that
   are not explicitly enabled.

.  Avoid multiple scripts for minor device variations.

.  Enable coherence of configuration across devices.

   Systems such as diffserv or QoS are distributed by nature and for 
   them to work you need to have global coherence - while each device may 
   be slightly different or use a slightly different style it should be 
   pointing in the correct direction.

.  Well constrained programming environment in devices for configurable 
   behavior.

.  Be able to change state of running activities or processes.

.  Minimize the number of distinct OS versions: either fix or report.
   Not every box runs the same set of features, and there is
   version skew/quality skew/feature skew based on device.

   One can't really have only one version, even in things like routers.
   One may have reasons for running different versions, such as some 
   features working better in one version or other.  The script may 
   need to take notice of these differences.  You may also want to 
   run an older version for internal reasons, so the script may need
   to down-rev new entities.  Scripts should be able to notice the 
   mismatch and report it somewhere - possibly causing a download 
   itself or possibly leaving that for some other code or a human.

.  Find and repair misfits.

.  Be able to drop all traffic of a given nature (say Microsoft
   Workgroup) on interfaces facing the internet.

.  Apply "standard" route filtering at all "appropriate" places
   (possibly trap on exceptions, rather than filter them).

.  Set VPN information based on a user "login".

.  Create "access ports" with just public internet access (no
   intranet).

.  Change connectivity of a port or resource for a time based on
   payment, and be able to cancel such connectivity either at
   end of time period or end of usage ("I'm done, now").

   Suppose you are in a hotel that offers fee-based Ethernet access.
   You may want to restrict access to the site where you pay the fee.
   You want to provision a particular port to be available until noon
   the next day.  You can also cancel availability.

   One needs to configure to the mission, and then abstract 
   to SNMP operations.

.  Hiding complexity.

.  Abstract access to type-specific and complex instance information.

.  Specify roles to the system how to react over time to configuration
   or state changes -- decompose the configuration.
   
A warning about the diffserv arena:  There is seldom a crossing between 
"user wants service level" and "service level x means this and that".

.  Disable a BGP operation over a scheduled period.

.  Enable coherence of configuration across devices.  They cannot be
   consistent, but must be coherent.

.  Use subnet as the parameter, for both values and conditions.  
   May be several approaches.

No one objected that anything was out of scope.  We need to encourage
people to add more.  This list is not in any order of priority.  
Nor is the list a working list for this working group.  

[break]

Topic 2: Schedule and Time.  Presentation by Thippanna Hongal
         (hongal@yagosys.com)

The slide presentation touched on several topics, such as RFC2445 
(Internet Calendaring and Scheduling Core Object Specification),
local time objects and explicit dates.  Several possible methods
for dealing with time issues (accessor functions, duration, 
predefined scalar variables, explicit start and end date, 
and using RFC2591 (schedule MIB) with duration extensions) were discussed.  
The slides are available.

The presentation was well received, with the following comments made:

.  One needs to have a software flipflop.  If something is not done,
   then do it (so that you have idempotency).

.  This can be made a little richer by adding semantics to the accessor
   functions so that duration is automatically handled.

.  Change the single letter to H for Thursday, or use digits for days.
   (the example in question used MTWTF).

.  The accessor function needs to be richer to allow for some sort of 
   between operation (between 9 and 5 rather than a trigger happening 
   at 9 and 5).

.  When using the schedule MIB, you would need to create a MIB entry
   that would point back to the entry that would be started.  
   Actually you would need two entries in the schedule MIB, one for
   starting (setting the admin status to enable) and one for stopping 
   (setting the admin status to disable).  It may be possible
   to add an augments clause to the schedule MIB to pack the 
   information into one entry.

.  One needs to make sure that the forward and backward pointers
   stay in sync.  To do this, there should be a one-to-one pairing.
   Having two schedule entries pointing to one ifAdminStatus would be
   difficult to maintain.

.  If we use the schedule MIB, we need to bear in mind that it has
   up to one minute latency after one presses the go button.

.  One could group things into a policy filter result.  You could make
   it so that the expressions are evaluated synchronously based on a
   max latency timer.  You can make it so that when the result column is
   retrieved it causes an evaluation. 

.  One would like a policy that says if an action follows in a certain
   time frame, page someone, otherwise turn on a blue light in the
   machine room.  

.  If you have both a trigger column and a pointer to time stuff,
   things will still work if a schedule tool on a network management
   station or elsewhere can set the object on a system without a concept 
   of time.

   The general idea is that you have a trigger button that when twiddled
   causes an edge-like action.  You also have a time-based set of
   arguments that cause it to happen based on time, and you have a
   coupling that allows the time-based stuff to diddle the trigger
   (synchronous and asynchronous).  

   Another way to do that would be to have that trigger something in the
   event MIB from DISMAN to do the discrimination.  

   In summary, we are talking about augmenting the schedule MIB so
   that we have the start and stop times rather than having multiple
   entries.  This is a proposed change.
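
   As an illustration only, the "between" accessor and the software
   flipflop mentioned in the comments above might be sketched as
   follows.  This is Python rather than the policy language, and the
   names in_window, desired_admin_status and apply_status, as well as
   the 9-to-5 window, are assumptions made for the example, not
   anything the group has agreed on.

```python
from datetime import datetime, time

# Days as digits (0 = Monday ... 6 = Sunday), as suggested in the comments,
# so that Tuesday and Thursday cannot be confused.
WEEKDAYS = frozenset({0, 1, 2, 3, 4})


def in_window(now, start, end, days=WEEKDAYS):
    """Return True while `now` is on one of `days` and start <= time < end."""
    return now.weekday() in days and start <= now.time() < end


def desired_admin_status(now, start=time(9, 0), end=time(17, 0)):
    """Level-style result: the same answer however often it is evaluated."""
    return "enable" if in_window(now, start, end) else "disable"


def apply_status(current, now):
    """Software flipflop: report a change only if one is actually needed."""
    wanted = desired_admin_status(now)
    return wanted, wanted != current
```

   Because the result is a level rather than an edge, evaluating it
   repeatedly inside the window is harmless, which is the idempotency
   asked for above.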

[lunch break]

Topic 3: Report of the Accessor Function Design Team.

.  Functions are designed in groups
.  All functions of a group must be implemented to have the group advertised.
.  We are silent about the point that you may implement less than the
   whole group and not advertise the group.  i.e., we do not prohibit
   partial implementation but we do explicitly prohibit advertising a partial
   implementation.
.  Groups are exposed via MIB objects in the capabilities table,
   except the mandatory core group, which is explicitly implicit
   (which may need exposure in the capabilities table anyway, for
   versioning).
.  Accessor functions are always defined in information modules
.  Groups are identified by Object ID
.  Accessor functions are identified by name and are not exposed individually
.  Vendors are allowed to define their own groups in their own name spaces
.  We have an accessor function capMatch(group)
.  We do not support a functionExists(accessor function)
.  Versioning is done at the group, not the accessor level (capabilities
   subtype)
.  We may have more issues with versioning.
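
The advertising rule above (a group is advertised only when every
function in it is implemented) can be sketched roughly as follows.
This is a Python illustration only; the OIDs, the function names and
the helper names advertised_groups and cap_match are hypothetical.

```python
# Hypothetical group definitions: group OID -> names of its accessor functions.
GROUP_DEFS = {
    "1.3.6.1.4.1.99999.1": {"getVar", "setVar"},
    "1.3.6.1.4.1.99999.2": {"timeNow", "between"},
}


def advertised_groups(group_defs, implemented):
    """Return the OIDs of groups whose functions are ALL implemented.

    A partially implemented group is allowed to exist on the box,
    but it never shows up in the advertised list.
    """
    return sorted(oid for oid, funcs in group_defs.items()
                  if funcs <= implemented)


def cap_match(group_oid, group_defs, implemented):
    """Group-level test; there is deliberately no per-function test."""
    return group_oid in advertised_groups(group_defs, implemented)


# Example: "between" is missing, so the second group is not advertised.
IMPLEMENTED = {"getVar", "setVar", "timeNow"}
```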

Following the presentation, there was a discussion about naming,
name spaces, and versioning, summarized below.

.  Libraries (groups) should be backwards compatible (can only add functions,
   cannot remove functions).  The "you cannot change count
   or type of arguments to a function" rule was mentioned.

.  Names are assumed to be unique within the standards name-space; 
   names within a vendor name space should have a prefix attached based
   on some sort of vendor-specific information (enterprise number-type
   information) and the vendor will be responsible for ensuring
   the uniqueness of the function names.  The prepending cannot
   be done via dots, as this has a meaning within C-like languages,
   which is a constraint we have due to our restriction to extant
   languages.

.  Function groups can be identified by OID and invoked by name.

.  The semantics of capMatch should be such that capMatch(1.0)
   means that 1.0 functions are available, and capMatch(2.0) means
   that 1.0 and 2.0 functions are available.  However, since
   the name space and signature of the functions are unique, it
   can be argued that you don't need versioning.

.  We have to make sure that we never standardize two groups with
   the same name, and that no two standard groups contain identical
   function names.
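
One possible reading of the versioning points above, sketched in
Python for illustration.  Representing versions as tuples and
signatures as tuples of type names is an assumption of this sketch,
as are the names cap_match_version and is_backward_compatible.

```python
def cap_match_version(requested, implemented):
    """True if the implemented group version covers the requested one.

    Versions are (major, minor) tuples, so an agent at (2, 0) also
    answers True for (1, 0), matching the semantics described above.
    """
    return requested <= implemented


def is_backward_compatible(old, new):
    """Check the library rule: functions may be added, never removed,
    and the count and type of arguments may not change.

    `old` and `new` map function name -> tuple of argument type names.
    """
    return all(name in new and new[name] == sig
               for name, sig in old.items())
```

If every revision of a group passes the second check, the first check
adds little, which is the argument made above that versioning may not
be needed at all.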

[End of design team report and hash]


Topic 4: Diffserv Policy MIB Report, by David Partain.

First thing we discovered is that there was some logic that was not
exactly right in the -02 draft.  In discussion with Joel Halpern, 
just like in the diffserv MIB, we want to build up a linked list of
elements that are acted on.   To know where the linked list ends, we
have to do a linked list traversal.  To fix this, we put in an
"associative index".   So diffservPolicy #1 means the whole list of
things.  We added a linked list of all of the lists.

How do you instantiate this stuff?  Joel H came up with a start table
where you provide the instance information that is needed (ifIndex,
direction) for that policy and tell it which policy to start.  We added
a table indexed by (ifIndex, direction).  You push a button (set an
integer index to, say, 3), and this instantiates a policy.

In the diffserv working group, Joel suggested the same things, as a result 
of which changes were made to their MIB.   This way we no longer need to
take each revision of their draft and clean it up.  Whether or not Joel 
and others can push this through the diffserv working group is an open 
question. 

Harrie found a couple of things that were bugs.   As a part of this,
we realized that if the instance and policy information go into a 
separate table in the diffserv MIB, then it is likely that the
diffserv policy MIB may no longer be needed.  

This provides an interesting lesson on how to model this stuff
template-wise.  We now understand diffserv a lot better.


Jon Saperia raised several points:

.  Have you written this up?  We should have a policy section in the 
   BCP document.

.  Also, could you tell the working group if we are omitting this MIB.

.  This working group has been asked by the IPSEC group for some help in
   writing up the MIB, etc., to configure IPSEC.


Brief Presentation on instance/object fanout, by Jeff Case.

Jeff put up a drawing showing various ways of traveling the MIBs
to get to instance information.  There were three basic routes:

1  NMS -> diffserv MIB
2  NMS -> Policy MIB -> diffserv MIB
3  NMS -> Policy MIB -> QOS Policy MIB -> diffserv MIB

The Policy MIB gives you fanout on the number of instances.  The
QoS MIB gives you fanout on the number of objects.  Going through
both, you get m*n fanout.  Two points of interest:

.  the first fanout (Policy MIB) is more interesting as you may
   get a much larger fanout (for example 1000s to several).
.  if the accessor functions are allowed to go off board
   (to another managed element) then some of the policy
   instrumentation can be located elsewhere.  All of our work so far 
   has assumed that the QoS MIB, Policy MIB and diffserv MIB are all on
   the same managed system.  If our accessor functions can do an
   SNMP get, then the Policy MIB or QoS MIB can be colocated at the
   agent, the manager, or a mid-level manager.
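
To make the m*n fanout concrete, here is a small Python sketch.  The
function name expand and the sample object names are hypothetical;
they only stand in for the instances selected by the Policy MIB and
the objects selected by the QoS MIB.

```python
from itertools import product


def expand(instances, objects):
    """Return one (object, instance) pair per set that would be issued.

    `instances` models the fanout provided by the Policy MIB (m entries),
    `objects` the fanout provided by the QoS MIB (n entries).
    """
    return [(obj, inst) for inst, obj in product(instances, objects)]


# Example: 3 interfaces and 2 objects give 3 * 2 = 6 individual sets.
SETS = expand(range(1, 4), ["objA", "objB"])
```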


Topic 5: Policy Override Discussion, led by Steve Waldbusser.

Steve Waldbusser:  This is the general area of policy conflicts.  I am
uncomfortable with the term "conflicts".  It implies things that happen
behind our backs.

When policies have relationships, sometimes the policy writer wants to
specify enough information to have the agent know what to do when two
or more policies might apply.

You might have related sets of policies that need arbitration:

   if capMatch(802) configure 802
   if capMatch(diffServ) configure DiffServ

You may not want both happening at the same time if there is a
conflict.  Some approaches one might use are:

.  Pick one set based on capabilities or other state
.  Use defined configuration for non-matching elements
.  Use a default configuration

You also need to have a defined state when a policy goes out of effect 
(rather than leave policy in limbo).  

You may have completely different policies, with different domains,
goals, and human authors, yet they diddle the same object.  Which one
wins?

Everything falls into one of these two groups:

.  Related sets of policies that need arbitration
.  Unrelated sets of policies


We might use two objects to arbitrate related policies

.  pmPolicyGroup
.  pmPolicyGroupPriority

and one object to arbitrate instance conflicts

.  pmPolicyConflictPriority

Walter Weiss had two issues with this approach:

.  If we have two alternatives, which one do I select?
.  If we have two alternatives, which do I do first?

Jeff Case broke it down into the following cases:

.  Policy 1 modifies object/instance a, policy 2 modifies object/instance b:
   no problem.
.  Policy 1 modifies object/instance a and b, policy 2 modifies a
   and b, and there is a perfect overlap.  You only do the one with 
   highest precedence.
.  Policy 1 modifies object/instance a and b, policy 2 modifies a
   and b, and you want a way to say "by the way, I want you to do
   both a and b or neither a nor b".  You need groups for this
   case.
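
The first two cases can be sketched in a few lines of Python.  This is
an illustration only: the function name arbitrate, the tuple layout,
and the conventions that a larger number means higher precedence and
that the first policy listed wins a tie are all assumptions.  The
third case (both or neither) is not covered; it needs the group
mechanism.

```python
def arbitrate(policies):
    """Pick, per object/instance, the value from the highest precedence.

    `policies` is a list of (name, precedence, {target: value}).
    Returns {target: (winning policy name, value)}.
    """
    winners = {}
    for name, precedence, sets in policies:
        for target, value in sets.items():
            current = winners.get(target)
            # Strictly greater, so the earlier policy keeps a tie.
            if current is None or precedence > current[0]:
                winners[target] = (precedence, name, value)
    return {target: (name, value)
            for target, (_, name, value) in winners.items()}
```

With a perfect overlap this selects one whole policy.  With a partial
overlap it would apply pieces of two policies, which is exactly the
situation the atomicity objection below is about.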

Joel Halpern disagreed with this approach, as the policy is an atomic 
thing.  You cannot do half a policy.

Two approaches to this were discussed.  One might use two policies 
with different precedences, or one might use a group in a logical 
switch statement.   Joel Halpern liked the logical composition, as 
using precedence is more likely to lead to programmer error.  He 
went on to say that there is no good way to catch overlapping 
policies; they'll just have to diagnose it on the box.  We need 
to say we thought about it.  He thinks a precedence value is about as 
good as we can get.  There may be a corollary in the execution
environment: if we have several sets from a policy action, you need to
check the precedence to make sure the sets are ordered correctly.

Jon Saperia made the point that policies that simultaneously control 
various MIB objects are difficult to implement.   You put two things 
in a group because their actions could conflict.  If these two things 
match on the same instance then the box burns up.

The observation was made that you cannot make sure that policy writers 
do groups correctly.  From policy group, it is important that you be able 
to detect all potential policy conflicts.  The problem is when you are not
coordinating your policy writers.

Walter Weiss:   Do we have the scripts or policies invoke other scripts or
policies?  It seems to me that for many of the conflict issues this
may be the ideal level of sophistication.  We are playing games on 
criteria.  More often than not, what is more important is circumstances 
rather than conflict.

Joel Halpern:  In most cases, policy conflicts are caused by underspecified
conditions.  "It can't happen!  Well, unfortunately it did."  It
is messy when unrelated sets of policies conflict.

The discussion over the next few minutes focused on policy
interactions and the need for human supervision to deal with
interaction problems.  Concerns were expressed about how to
handle a partial policy implementation due to partial failures
and whether or not we should be able to do policy unwinding (like
SNMP sets are handled now).   The problem of having initial
configuration policies, subsequent incremental policies, and
interaction resolution between the two was mentioned.  The point
that there are cases where we don't want the agent to choose
randomly was made.

A policy's set of actions can be done starting at the beginning and
ending at the end.  This is unlike how we do PDU processing.  A
more SNMP-set-like set of operations (must all succeed or they fail)
will not work, as most policies aren't going to work on a
larger machine (since some sets will almost always fail).  We are going
to have to live with this.

Sometimes we want a partial policy execution to succeed, sometimes we 
want it to fail (roll back). For example, we may have a policy that 
says provide power to the oxygen generator if possible.  Another 
is "when on emergency generator shut down as much as you can".  I 
don't think that, because we cannot shut down the oxygen generator, we 
can't shut down any outlets.   I believe you will have times when you say 
"do as much as you can", and times when you say "do all or none".  You 
want to have both kinds of semantics expressible.  We have a mechanism 
for doing both but we don't understand the computational complexity yet.
When you want partial results, put them in separate groups, otherwise
put them in same group.  You could have an accessor function that says 
"do all or none", or separated into separate unrelated actions.
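
The two semantics can be contrasted with a small Python sketch.  It
is illustrative only: run_policy, shut_down, the outlet names and the
use of RuntimeError to signal a failed set are all assumptions, and
the rollback is a simple snapshot rather than anything SNMP-specific.

```python
def shut_down(outlet):
    """Build an action that turns one outlet off (hypothetical example)."""
    def action(state):
        if outlet == "oxygen":
            # Models a set that the device refuses.
            raise RuntimeError("cannot shut down " + outlet)
        state[outlet] = "off"
    return action


def run_policy(actions, state, all_or_none):
    """Apply actions in order; return a list of per-action success flags.

    all_or_none=False: "do as much as you can", failures are skipped.
    all_or_none=True:  the first failure restores the original state
                       and stops.
    """
    snapshot = dict(state)
    results = []
    for action in actions:
        try:
            action(state)
            results.append(True)
        except RuntimeError:
            results.append(False)
            if all_or_none:
                state.clear()
                state.update(snapshot)
                return results
    return results
```

In the emergency generator example, the best-effort mode turns off
every outlet except the oxygen generator, while the all-or-none mode
leaves everything as it was.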


Meeting wrapup.

How do people feel about having the interim at the end of the IETF?

Great to minimize travel, but we are toastier than toast.

We made a great deal of progress between last interim and this interim.

Proposed dates for the next interim are Oct 12/13 and Oct 19/20.
Knoxville has been generally agreed upon as the location.  We'll take
the dates issue to the list.

The Policy MIB document has a lot of stuff that has been affected.
The policy module is probably going to go down to a tiny set of objects.

Harrie says he wants to wait to see what diffserv is going to do.

Diffserv is probably going to decide in the next two weeks.

We've made some changes that require that we change the BCP document.  
My goal is to republish before the end of September.

Thanks to SNMP Research and Ericsson for sponsoring this interim
meeting.  


---
Administrivia

Since there are no blue sheets for an interim meeting, an
attendance list must be submitted.  Many thanks to Shawn
Routhier, both for his excellent notes, and for the attendance list.

Attendance list for meeting
David Partain                   David.Partain@ericsson.com
Steve Moulton                   moulton@snmp.com
Jon Saperia                     saperia@jdscons.com
Matt White                      mwhite@torrentnet.com
Harrie Hazewinkel               harrie@covalent.net
Walter Weiss                    wweiss@ellacoy.com
Thippanna Hongal                hongal@riverstonenet.com
David T. Perkins                dperkins@dsperkins.com
Omar Cherkaoui                  cherkaoui.omar@ugam.ca
Kwok Ho Chan                    khchan@nortelnetworks.com
Zhifeng Xiao                    zhifeng@cs.mcgill.ca
Joel Halpern                    joel@longsys.com
Rob Frye                        rfrye@longsys.com
Mike MacFaden                   mrm@riverstonenet.com
Bert Wijnen                     bwijnene@lucent.com
Shawn A. Routhier               sar@epilogue.com
Andy Bierman                    abierman@cisco.com
Chris Elliott                   chelliot@cisco.com
Dan Romascanu                   dromasca@avaya.com
Dale Francisco                  dfrancis@cisco.com
Barr Hibbs                      rbhibbs@ultradns.com
Jeff Case                       case@snmp.com
Assaf Zeira                     assafz@p-cube.com
Steve Waldbusser                waldbusser@nextbeacon.com
David Harrington                dbh@enterasys.com