RE: Re: sFlow Datagram Extensibility

From: Peter Phaal (peter_phaal@inmon.com)
Date: 09/13/02

    Marc,

    I've made changes to the documents based on your recommendations. The latest
    versions are:
    http://www.sflow.org/drafts/draft3/SFLOW-DATAGRAM.txt
    http://www.sflow.org/drafts/draft3/SFLOW-STRUCTS.txt

    I've also placed links to them from the sFlow.org documents page so that the
    latest proposals will always be easy to find.

    > To support this, I think it should be documented that any
    > software decoding
    > an sFlow packet, should always use the encoded length
    > information, and not
    > assume that a structure is of a particular length, since
    > structures may
    > grow. This should apply to standard structures as well,
    > since presumably
    > they could also be extended as a result of an update to the standard.

    I added the following note to SFLOW-DATAGRAM.txt:

    Note: sFlow implementors are permitted to extend structures at the end
          without changing structure numbers. Any changes that would
          alter or invalidate fields in published structure definitions
          must be implemented using a new structure number. This
          policy allows additional data to be added to structures while
          still maintaining backward compatibility. Applications receiving
          sFlow data must always use the opaque length information when
          decoding opaque<> structures.
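
    As an illustration of the note above, here is a minimal Python sketch of a
    receiver that relies on the encoded length rather than an assumed structure
    size. The function names and the two-integer "example" structure are
    hypothetical and are not taken from the drafts; the only assumption about
    the wire format is standard XDR opaque<> encoding (4-byte big-endian
    length, data padded to a 4-byte boundary).

```python
import struct

def read_opaque(buf, offset):
    """Read one XDR opaque<> value starting at offset.

    Returns (body, next_offset). The encoded length, not any assumed
    structure size, decides where the next item begins.
    """
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    end = start + length
    if end > len(buf):
        raise ValueError("opaque length runs past end of datagram")
    padded = (length + 3) & ~3  # XDR pads opaque data to a multiple of 4
    return buf[start:end], start + padded

def decode_example(body):
    """Decode a hypothetical structure originally defined as two unsigned ints.

    Bytes after the fields this decoder knows about are treated as a later
    extension and ignored instead of being reported as an error.
    """
    if len(body) < 8:
        raise ValueError("structure shorter than its published definition")
    a, b = struct.unpack_from(">II", body, 0)
    return {"a": a, "b": b, "ignored_bytes": len(body) - 8}
```

    A decoder written this way keeps working when a sender appends a third
    field to the structure: the extra four bytes are skipped and the next item
    in the datagram is still found at the right offset.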

    > You might want to explicitly specify whether data_format
    > values need to be
    > unique across different structure types. In other words, can
    > a single value
    > be used to identify a flow structure, a counter structure,
    > and a sample
    > structure, or should values not be reused in that manner?

    I added this description to SFLOW-DATAGRAM.txt:

        There are currently three opaque structures in which data_formats
        are used:
           1. sample_data
           2. counter_data
           3. flow_data

         Structure format numbers may be re-used within each of these contexts.
         For example, an (inmon,1) data_format could identify a particular
         set of counters when used to describe counter_data, but refer to
         a set of flow attributes when used to describe flow_data.
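
    One way a collector might honour this rule is to key its decoder table on
    the context as well as the data_format. The sketch below is only an
    illustration: the registry, the helper names, and the descriptive strings
    are hypothetical, and the data_format is shown as an (enterprise, number)
    pair as in the example above.

```python
# Hypothetical decoder registry keyed by (context, enterprise, format number).
DECODERS = {}

def register(context, enterprise, number, description):
    """Record what a data_format means within one context."""
    DECODERS[(context, enterprise, number)] = description

def lookup(context, enterprise, number):
    """Return the meaning of a data_format in the given context, or None."""
    return DECODERS.get((context, enterprise, number))

# The same (inmon,1) data_format is registered twice without conflict,
# because each registration applies to a different context.
register("counter_data", "inmon", 1, "a particular set of counters")
register("flow_data", "inmon", 1, "a set of flow attributes")
```

    A lookup that returns None corresponds to an unrecognised structure, which
    a receiver can skip using its opaque length.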

    >
    > You might consider renaming counter_block to counter_record
    > for consistency
    > with the other structures.

    Done.

    > In the data structures document, I think it would be good to have some
    > documentation about how the structures should be filled in,
    > particularly
    > when not all information is available. For example, for
    > extended_switch, if
    > either the source or destination VLAN information is not
    > available, should
    > the corresponding fields be set to zero? Likewise for
    > extended_user, I
    > presume it's acceptable to encode a zero-length string if one
    > of the user
    > ids is not available.

    How about this clarification in SFLOW-STRUCTS.txt?

       The following values should be used for fields that are
       unknown (unless otherwise indicated in the structure
       definitions).
          - Unknown integer value. Use a value of 0 to indicate that
            a value is unknown.
          - Unknown counter. Use the maximum counter value to indicate
            that the counter is not available. Within any given sFlow
            session a particular counter must be always available, or
            always unavailable. An available counter may temporarily
            have the max value just before it rolls to zero. This is
            permitted.
          - Unknown string. Use the zero length empty string.
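
    The counter rule needs a little state on the receiving side, because a
    maximum value can mean either "unavailable" or "about to roll over". A
    minimal sketch of one possible collector-side interpretation follows. The
    class and its names are hypothetical, and the 32-bit default width is an
    assumption; wider counters would pass a different width.

```python
class CounterTracker:
    """Per-session tracker that separates unavailable counters from real ones.

    Assumption: once a counter has reported any value other than the maximum
    during a session, it is an available counter, so a later maximum value is
    a genuine reading taken just before rollover.
    """

    def __init__(self, width_bits=32):
        self.max_value = (1 << width_bits) - 1
        self.available = set()

    def observe(self, name, value):
        """Return the counter value, or None if it appears to be unavailable."""
        if value != self.max_value:
            self.available.add(name)
            return value
        # Maximum value: trust it only for counters already known to be live.
        return value if name in self.available else None
```

    This heuristic treats a counter whose very first report is the maximum
    value as unavailable, which is the one case it cannot distinguish.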

    > In the extended_router documentation, it is not clearly
    > specified whether
    > the mask fields' format is a bit mask or a count of bits.

    Modified the structure definition as follows:

    struct extended_router {
       address nexthop; /* IP address of next hop router */
       unsigned int src_mask; /* Source address prefix mask
                                   (expressed as number of bits) */
       unsigned int dst_mask; /* Destination address prefix mask
                                   (expressed as number of bits) */
    }
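
    Since the masks are now counts of bits, a collector that wants the
    familiar dotted form has to convert. A small sketch, assuming IPv4
    addresses only (the helper name is hypothetical):

```python
import ipaddress

def prefix_len_to_netmask(bits):
    """Convert a prefix length (number of mask bits) to a dotted IPv4 netmask."""
    if not 0 <= bits <= 32:
        raise ValueError("IPv4 prefix length must be between 0 and 32")
    mask = (0xFFFFFFFF << (32 - bits)) & 0xFFFFFFFF
    return str(ipaddress.IPv4Address(mask))
```

    For example, a src_mask of 24 corresponds to 255.255.255.0.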

    > For flow_sample.drops, I think it would be good to clarify
    > the documentation
    > with regard to what kind of packets are being counted (i.e.
    > are they only
    > sFlow packet drops that are being counted?).

    Added the following text to SFLOW-DATAGRAM.txt:

     unsigned int drops; /* Number of times a packet marked to be
                            sampled was dropped due to lack of
                            resources. A high drop rate indicates
                            that the management agent is unable to
                            process samples as fast as they are
                            being generated by hardware. Increasing
                            sampling_rate will reduce the drop
                            rate. */

    > Should the ETHERNET-ISO8023 enum be named ETHERNET-ISO88023 instead?

    Good catch.

    > In flow_sample, the input and output fields have special
    > values to represent
    > the case where the interface is "unknown". If packets originating or
    > terminating at the switch itself are sampled, then one of the
    > two interface
    > fields will not apply. I'm wondering if it might be good to have an
    > additional special "none" value to indicate this, rather than
    > using the
    > "unknown" value, which might wind up getting used for other
    > cases as well.

    How about this change? It creates a third category that lets us capture the
    reason that a packet was discarded. Can you think of other reason codes?

       unsigned int output; /* SNMP ifIndex of output interface.
                               0 if interface is not known.
                               The most significant 2 bits are used
                               to indicate the format of the
                               30 bit value.
                                 format = 0 single destination
                                            interface, value is ifIndex
                                            of the interface.
                                 format = 1 packet discarded, value is
                                            a reason code. Currently
                                            the following codes are
                                            defined.
                                              0 = unknown
                                              1 = ACL
                                              2 = no buffer space
                                              3 = RED
                                              4 = no route to dest.
                                 format = 2 multiple destination
                                            interfaces, value is the
                                            number of interfaces. A
                                            value of 0 indicates an
                                            unknown number greater
                                            than 1.

                               Examples:
                                  0x00000002 indicates ifIndex = 2
                                  0x00000000 ifIndex unknown.
                                  0x40000001 packet discarded because
                                             of ACL.
                                  0x80000007 indicates a packet sent
                                             to 7 interfaces.
                                  0x80000000 indicates a packet sent
                                             to an unknown number of
                                             interfaces greater than
                                             1. */

    This additional information could be very useful for identifying
    connectivity/performance problems.
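
    To make the proposed encoding concrete, here is a short Python sketch of
    how a collector might interpret the output field. The function name and
    the wording of the returned strings are hypothetical; the bit layout and
    the reason codes are the ones listed in the comment above.

```python
# Reason codes currently defined for format = 1 (packet discarded).
DISCARD_REASONS = {
    0: "unknown",
    1: "ACL",
    2: "no buffer space",
    3: "RED",
    4: "no route to dest.",
}

def decode_output(output):
    """Split the 32-bit output field into its 2-bit format and 30-bit value."""
    fmt = (output >> 30) & 0x3
    value = output & 0x3FFFFFFF
    if fmt == 0:
        return "ifIndex unknown" if value == 0 else "ifIndex = %d" % value
    if fmt == 1:
        reason = DISCARD_REASONS.get(value, "undefined reason code %d" % value)
        return "packet discarded: " + reason
    if fmt == 2:
        if value == 0:
            return "sent to an unknown number of interfaces greater than 1"
        return "sent to %d interfaces" % value
    return "undefined format 3"
```

    Format 3 is not assigned in the proposal, so the sketch reports it rather
    than guessing at a meaning.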

    > In the extended_user data, there is an issue of what character set and
    > encoding the user ids are expressed in. I'm sure there will
    > be contexts in
    > which they will not be in ASCII. In an ideal world, I'd just
    > say these
    > should be encoded in UTF-8, but agents may receive the data
    > in different
    > encodings, and it seems better for the agents not to need to
    > delve into
    > character set translations. Therefore, I think it would be a
    > good idea to
    > be able to include information about the character set of
    > each user id (for
    > each field independently). This may assist a collector in
    > being able to
    > properly display the ids or map them into different character
    > sets. For
    > character set issues, see RFCs 2277 and 2978. RFC 2978
    > defines a scheme for
    > registering character sets and encodings (collectively dubbed
    > "charsets").
    > The registry contents can be found at
    > http://www.iana.org/assignments/character-sets. Fortunately,
    > the registry
    > includes a "MIBenum" integer for each charset. I propose
    > that these values
    > be used to identify the charset for each user id string, with
    > the reserved
    > value zero being used to indicate that the charset is
    > unknown. So, for
    > example, if an agent knows that a user id is in UTF-8, the
    > MIBenum value
    > would be 106. UTF-8 could probably be considered the
    > preferred charset, if
    > the agent is able to obtain the data in different charsets.

    Good suggestion. I implemented it in SFLOW-STRUCTS.txt as follows:

    struct extended_user {
       unsigned int charset; /* MIBEnum value of character set used to
                                encode user information - See RFC 2978
                                Where possible UTF-8 encoding
                                (MIBEnum=106) should be used. */
       string src_user<>;    /* User ID associated with packet source */
       string dst_user<>;    /* User ID associated with packet destination */
    }
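
    A rough sketch of how a collector could use the charset field when
    decoding this structure is shown below. It assumes standard XDR string
    encoding (4-byte big-endian length, data padded to a 4-byte boundary). The
    function names are hypothetical, and the table maps only three MIBenum
    values from the IANA registry (3 = US-ASCII, 4 = ISO-8859-1, 106 = UTF-8);
    a real collector would likely need more.

```python
import struct

# Partial, illustrative mapping from IANA MIBenum values to Python codec names.
MIBENUM_TO_CODEC = {3: "ascii", 4: "latin-1", 106: "utf-8"}

def read_xdr_string(buf, offset):
    """Read one XDR string<>; return (raw_bytes, next_offset)."""
    (length,) = struct.unpack_from(">I", buf, offset)
    start = offset + 4
    raw = buf[start:start + length]
    if len(raw) != length:
        raise ValueError("string length runs past end of structure")
    return raw, start + ((length + 3) & ~3)

def decode_extended_user(body):
    """Return (charset, src_user, dst_user) from an extended_user body.

    User IDs are returned as text when the charset is recognised, and as
    raw bytes when it is 0 (unknown) or not in the table.
    """
    (charset,) = struct.unpack_from(">I", body, 0)
    src_raw, offset = read_xdr_string(body, 4)
    dst_raw, offset = read_xdr_string(body, offset)
    codec = MIBENUM_TO_CODEC.get(charset)

    def to_text(raw):
        return raw.decode(codec, errors="replace") if codec else raw

    return charset, to_text(src_raw), to_text(dst_raw)
```

    Returning raw bytes for an unknown charset leaves the choice of how to
    display the ID to the application, and a missing user ID simply arrives as
    a zero-length string.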

    Regards,
    Peter
    ----------------------
    Peter Phaal
    InMon Corp.

    Peter_Phaal@inmon.com


