
Re: snmpconf-pm-04 notes




Following up on the discussion-related issues:

>  "Policies always express a notion of:
>     if (an object has certain characteristics) then (apply an operation(s) to
>     an object)"                                            ^^
>     ^^
Steve> What if it said: "Policies are intended to express a notion
Steve> of:" and then we could use the original pseudo-code, which is
Steve> designed to show the intent of the policy-based management
Steve> model rather than the limits of what *could* be done.

What I didn't like about the original statement is that it implied to
me that operations always apply to the objects that triggered them,
which is hardly the case.  If anything, that's probably counter to how
they'd typically be used: the operation is more likely to apply to an
object near the one that triggered the action.  Anyway, you can
do what you want of course.  The notion of applying multiple actions
is less important (IMHO) than the notion of applying the action to a
possibly different object.
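
To make the distinction concrete, here is a minimal sketch in which the
condition is tested on one element and the action lands on a different,
related one.  Every name in it (Policy, select_target, the interface
fields) is hypothetical and invented for illustration; none of it comes
from the draft.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List

# Hypothetical element representation: a plain dict of attributes.
Element = Dict[str, object]


@dataclass
class Policy:
    # Tested against the element that triggers the policy.
    condition: Callable[[Element], bool]
    # Chooses the element the action is applied to (may differ from the trigger).
    select_target: Callable[[Element, List[Element]], Element]
    # Operation applied to the selected target.
    action: Callable[[Element], None]


def run_policy(policy: Policy, elements: List[Element]) -> List[Element]:
    """Evaluate the policy on every element; return the targets acted on."""
    touched = []
    for elem in elements:
        if policy.condition(elem):
            target = policy.select_target(elem, elements)
            policy.action(target)
            touched.append(target)
    return touched


# Example: when an interface is down, enable its backup interface.
elements = [
    {"name": "if1", "oper_up": False, "backup": "if2", "admin_up": True},
    {"name": "if2", "oper_up": True, "backup": None, "admin_up": False},
]


def find_backup(elem: Element, all_elems: List[Element]) -> Element:
    return next(e for e in all_elems if e["name"] == elem["backup"])


def enable(elem: Element) -> None:
    elem["admin_up"] = True


policy = Policy(
    condition=lambda e: not e["oper_up"] and e["backup"] is not None,
    select_target=find_backup,
    action=enable,
)
touched = run_policy(policy, elements)
```

Here if1 triggers the policy but if2 is the element that gets modified,
which is the case the "apply an operation to an object" wording glossed
over.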

> 4) section 4:
>   Repeated use of the word 'download' is used to say that the manager
>   downloads policies to the managed devices.  IMHO, this is
>   "uploading".

Steve> Very interesting. You probably say uploading because the code
Steve> is being "pushed" from here to there. I say it is downloading
Steve> because the code is going from server to client (central to
Steve> distributed).

Ahh.  To me it has always implied a direction associated with who is
performing the action.  The m-w.com dictionary defines it as:

   to transfer (information) from a microcomputer to a remote computer
   usually with a modem

which is just horribly out of date and could be interpreted either way
(though I'd argue that is directional).

However,

Steve> I'm not sure I agree that it is "uploading" but it's clear that
Steve> it's not clear. I'm thinking using the word "install"
Steve> instead. What do you think?

That works.

>> 5) section 4:
>> There are security issues with not uploading policies to a managed
>> object until they are discovered.  This issue comes up repeatedly in
>> the document.  During the time that the policy is not in place it
>> obviously can't be enforced (which could be serious if financial or
>> highly secure information was transmitted (EG) during that time
>> window).  In theory, you'd think, that this shouldn't happen that
>> often and the window would be small.  But what if the
>> pmNewRoleNotification isn't received by the management station?

Steve> I agree. I'll make some mention of this as appropriate (at least in the
Steve> security section). How's this text?:

Steve> -- Note that using this algorithm to avoid installing "unnecessary"
Steve> -- policies may result in delays in having the policy available when
Steve> -- the policy becomes necessary. This delay could become extensive if
Steve> -- an interruption of communications prevents the notification from
Steve> -- being delivered and/or the policy from being downloaded, causing
Steve> -- the system to not be in compliance with policy for a period of
Steve> -- time. In particular, if the policy is enforcing security rules,
Steve> -- this could open up security vulnerabilities during this period of
Steve> -- time.

That looks good.  2 comments:

1) please don't put it in the MIB as a comment (see issues discussed
   in my second mail message with respect to putting functionality
   related discussions in MIB comments)

2) download -> installed    ;-)

> ***
> 9) 5.3 and others
>    If syntax or processing errors occur, the action will terminate
>    immediately for this element.  I think failures need to be dealt
>    with in a better way.  The document repeatedly references failing
>    actions and that processing stops.
> 
>    First and most importantly, I think a failure notification needs to
>    be sent out (if configured to do so), as there are security
>    implications with policies that fail to run properly.

Steve> For each policy, there's a rollup of the total number of
Steve> instances of the policy that are failing now (a gauge,
Steve> pmPolicyAbnormalTerminations).  Also, when I send an updated
Steve> draft next week, check out pmTrackingPolicyToElementInfo which
Steve> contains info about errors and exceptions on a
Steve> per-execution-context basis (i.e. per policy/element pair). Of
Steve> course, there's also the debugging table.

Steve> So we've made it easy to poll but haven't achieved sub-second
Steve> notification (only possible with a notification). And it's here
Steve> that I'm a bit reluctant to say that a notification is a
Steve> requirement. To be honest, I really don't have a sense of this
Steve> yet. One more notification isn't that big a deal but I'm most
Steve> worried about getting into ratholes about how we can make sure
Steve> that floods of notifications don't occur.

Steve> All we would absolutely need would be to have a notification
Steve> that is sent whenever the abnormalTerminations gauge goes from
Steve> zero to non-zero and no more than 1 notification per 60 seconds
Steve> (or some similar non-configurable constant). If we could keep
Steve> it simple like this it might be worth it.
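
The rule Steve describes (notify only on a zero to non-zero transition
of the gauge, and at most once per 60 seconds) might be sketched as
follows.  The class and method names are hypothetical, and the sketch
assumes that a transition arriving inside the 60-second window is simply
dropped rather than deferred, which the text above does not specify.

```python
from typing import Optional


class TerminationNotifier:
    """Decides when to notify based on samples of an abnormal-terminations gauge."""

    # Fixed minimum spacing between notifications, in seconds
    # (the "non-configurable constant" from the proposal).
    MIN_INTERVAL = 60.0

    def __init__(self) -> None:
        self._last_gauge = 0
        self._last_sent: Optional[float] = None

    def update(self, gauge: int, now: float) -> bool:
        """Record a gauge sample taken at time `now`; return True if a
        notification should be sent for it."""
        rising = self._last_gauge == 0 and gauge > 0
        self._last_gauge = gauge
        if not rising:
            return False
        if self._last_sent is not None and now - self._last_sent < self.MIN_INTERVAL:
            # Assumption: suppressed transitions are dropped, not queued.
            return False
        self._last_sent = now
        return True


notifier = TerminationNotifier()
samples = [(0, 0.0), (2, 1.0), (3, 2.0), (0, 3.0), (1, 10.0), (0, 20.0), (1, 70.0)]
results = [notifier.update(gauge, now) for gauge, now in samples]
```

With these samples a notification is sent at t=1 and t=70; the
transition at t=10 is suppressed because it falls within 60 seconds of
the previous notification.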

IMHO, it is a serious security issue if actions can fail or run
incorrectly with no way to inform an administrator immediately.
You're right that the counter objects exist, but (IMHO) that's not
enough.  I don't want to have to rely on a management station polling
50,000 boxes to determine which ones are failing.  The chances of
polling frequently enough to ensure that my policies are being
implemented properly everywhere are slim.  HP-OV, the last time I ran
it, defaulted to polling objects only every 30 or 60 minutes, and I
couldn't decrease that because a polling pass was taking 20 minutes in
the first place (granted, it was a slow box).  I'd much rather have a
trap sent warning me of such a serious problem.

The other possibility is to configure backup actions to be used when a
primary action fails.  This is what many other policy-related
architectures are trying to do (the one I'm most familiar with is the
ipsec-policy WG, which is providing a list of backup actions to take).
In theory, then, you could implement a backup action that sent a trap,
though I think defining a specific notification to handle failures is
a better way to go.
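
As a rough illustration of the backup-action idea, the sketch below
tries a list of actions in order and falls through to the next one on
failure, with the last entry standing in for "send a trap".  All names
are hypothetical, and send_trap only records a message; it is not a real
SNMP call, and this is not how the ipsec-policy work specifies it.

```python
from typing import Callable, List, Optional

sent_traps: List[str] = []


def run_with_backups(actions: List[Callable[[str], None]], element: str) -> Optional[int]:
    """Try each action in order on `element`.

    Returns the index of the first action that completes without raising,
    or None if every action fails.
    """
    for index, action in enumerate(actions):
        try:
            action(element)
            return index
        except Exception:
            # This action failed; fall through to the next backup.
            continue
    return None


def primary(element: str) -> None:
    # Simulates a policy action that hits a syntax or processing error.
    raise RuntimeError("processing error")


def send_trap(element: str) -> None:
    # Stand-in for emitting a failure notification; only records the event.
    sent_traps.append(f"policy action failed on {element}")


result = run_with_backups([primary, send_trap], "ifIndex.3")
```

Here the primary action fails, so the backup at index 1 runs and
records the failure.  A dedicated failure notification would make this
behavior the default instead of something each policy has to configure.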

> 16) repeatedly, starting in 11.1.1.1:
> 
>      The agent will retrieve the instance in the same SNMP context
>      in which the element resides.
> 
>    What exactly does this mean?  In the SNMP notion of contexts, an
>    item can exist in multiple contexts.

Steve> It's supposed to mean the context in which the element was discovered
Steve> (by the pmElementTypeRegTable).

It's already been too long since I read it.  I'll have to go rescan it
now, as this one is already hazy in my head.

-- 
Wes Hardaker
NAI Labs
Network Associates