RE: Re: sFlow Datagram Extensibility

From: Peter Phaal (peter_phaal@inmon.com)
Date: 09/12/02


    Marc,

    Thank you for your comments. I've incorporated your suggestions in a new version of
    the packet specification (attached below).

    > I think there's an additional change that is worth making as well, which would be
    > to add length information for each overall sample. This would provide for the
    > possibility of extending each sample structure (e.g. if it's determined that a
    > new field is needed within a flow_sample -- see my explanation below regarding
    > this type of extensibility), and it would also provide the ability to add new
    > sample types while maintaining backward compatibility. So, what I propose might
    > look something like this:
    >
    > struct sample_record {
    > sample_types sample_type; /* Specifies the type of sample data */
    > opaque sample_data<>; /* A sample structure, such as flow_sample or
    > counters_sample */
    > }
    >
    > struct sample_datagram_v5 {
    > ... /* same as before */
    > sample_record samples<>; /* An array of sample_records */
    > }
    >
    > Note that I haven't used the "enterprise" extension mechanism here, although you
    > could if you want to keep it consistent with the other structures.

    Good idea. It also makes sense to use the same "enterprise" mechanism for
    defining structure types, since that would allow vendor-specific data to be
    specified at this level.

    > Sorry if I wasn't clear enough about my intent here. I didn't mean to imply
    > that an sFlow agent could extend an existing structure in any way it pleased.
    > Rather, my intent was that the organization that had defined a particular
    > structure could later define an extended version of that same structure with some
    > additional fields added to the end. Existing fields in the structure would not
    > be allowed to be modified, only new ones could be added to the end. The
    > presence of the length information would allow a collector to determine which
    > version of a structure was in use, so that it could take advantage of the
    > additional fields if it had been updated to understand them. Collectors which
    > did not understand the new fields would continue to use the earlier set of fields,
    > and would skip over the new fields, because a properly implemented collector
    > would always use the supplied length information to advance from one structure
    > to the next (whether the structures were known to it or not). This should allow
    > the incremental evolution of structures, where appropriate, without the overhead
    > of introducing a new structure with its associated type information.

    Agreed. Vendors would be allowed to extend their own structures, but not the
    standard structures or those of other vendors. Keeping a single authority
    for each structure definition ensures that there won't be any clashes on the
    extensions.
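
To make the skipping behaviour concrete, here is a minimal Python sketch of how a collector might walk a sequence of length-prefixed records. It assumes standard XDR encoding (big-endian 4-byte integers, variable-length opaque data padded to a 4-byte boundary) and that the array count preceding the records has already been consumed. The names `iter_records` and `pack_record` are hypothetical and not part of any sFlow implementation.

```python
import struct


def pack_record(data_format, payload):
    """Encode one record: 4-byte format, 4-byte length, payload, XDR padding."""
    pad = -len(payload) % 4
    return struct.pack(">II", data_format, len(payload)) + payload + b"\x00" * pad


def iter_records(buf):
    """Yield (data_format, payload) for each record in buf.

    The supplied length is always used to advance to the next record,
    so records with unknown formats or extra trailing fields are
    stepped over safely.
    """
    offset = 0
    while offset < len(buf):
        if offset + 8 > len(buf):
            raise ValueError("truncated record header")
        data_format, length = struct.unpack_from(">II", buf, offset)
        offset += 8
        end = offset + length
        if end > len(buf):
            raise ValueError("record length exceeds buffer")
        yield data_format, buf[offset:end]
        # Variable-length opaque data is padded to a multiple of 4 bytes.
        offset = end + (-length % 4)


# Formats 1 and 2 stand for known structures; 99 is an unknown one.
KNOWN_FORMATS = {1, 2}
datagram_body = (
    pack_record(1, b"abcd") + pack_record(99, b"xyzzy") + pack_record(2, b"")
)
understood = [f for f, _ in iter_records(datagram_body) if f in KNOWN_FORMATS]
```

A collector that only understands the earlier fields of an extended structure would decode the prefix of the payload it knows about and ignore the remaining bytes, since the outer length already tells it where the next record starts.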

    > I hadn't thought about just encoding the last part of the enterprise OID as a
    > simple integer, since I guess I was thinking that in theory, the enterprise id
    > could be multiple OID components, but I guess that's not likely to happen anytime
    > soon. The way you've proposed to encode it is more compact, which is good.
    > The OUI-based approach I suggested would be even more compact, using only a
    > single integer for the combined OUI+format value. If one wished to gamble that
    > they're not going to use up more than 24 bits worth of enterprise ids any time
    > soon (they've used a little less than 1% of that range so far), then one could
    > use 24 bits of enterprise id plus an 8-bit format id. Of course, either of
    > these approaches would restrict one to 256 formats per global id, which feels
    > sort of small, but I'm not sure we'd be likely to hit that limit, in practice.

    The current data_format definition does provide a very large (excessive?)
    name space. It allows 2^32-1 enterprises and 2^32-1 structures per
    enterprise.

    Currently the largest enterprise number assignment is 14609:
    http://www.iana.org/assignments/enterprise-numbers

    A 24 bit enterprise and 8 bit struct number does seem a little constraining.
    How about 20 bits for the enterprise and 12 bits for the structure? This
    would allow for over a million enterprises and allows each enterprise over
    4000 structures. I don't see hitting either of these limits any time soon.
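
As an illustration of the 20/12 split, the following Python sketch packs an enterprise number and a structure number into a single 32-bit data_format value and unpacks it again. The function names `encode_data_format` and `decode_data_format` are hypothetical helpers chosen for this example.

```python
ENTERPRISE_BITS = 20
FORMAT_BITS = 12
MAX_ENTERPRISE = (1 << ENTERPRISE_BITS) - 1  # 1048575
MAX_FORMAT = (1 << FORMAT_BITS) - 1          # 4095


def encode_data_format(enterprise, structure):
    """Place the enterprise in the top 20 bits and the structure in the low 12."""
    if not 0 <= enterprise <= MAX_ENTERPRISE:
        raise ValueError("enterprise does not fit in 20 bits")
    if not 0 <= structure <= MAX_FORMAT:
        raise ValueError("structure number does not fit in 12 bits")
    return (enterprise << FORMAT_BITS) | structure


def decode_data_format(value):
    """Return (enterprise, structure) from a 32-bit data_format value."""
    return value >> FORMAT_BITS, value & MAX_FORMAT


# The largest enterprise number quoted above still fits comfortably.
packed = encode_data_format(14609, 1)
unpacked = decode_data_format(packed)
```

With this layout, enterprise 0 yields values 0 through 4095, so a data_format below 4096 would always denote a standard structure.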

    > I'm thinking that it might be good to make a distinction between standard
    > and vendor extension structures. To do this, the standard structures could
    > use enterprise id zero (which is a reserved id), rather than using InMon's id.

    Good idea. See definition of data_format below.

    > I noticed the addition of the sampled_ethernet format. Since the data
    > structures no longer use a union to restrict a flow sample to being represented
    > by a single structure, I think there need to be some guidelines on how the
    > sampled_* structures should be used. Should sampled_ethernet be provided along
    > with sampled_ipv4? Should the other sampled_* structures not be included if
    > sampled_header is provided? When should the sampled_ethernet and sampled_ip*
    > structures be used?

    I agree that this should be clarified. How about adding the following
    comment to the standard structures file?

    /* Flow Data types

       A flow_sample must contain packet header information. The
       preferred format for reporting packet header information is
       the sampled_header. However, if the packet header is not
       available to the sampling process then one or more of
       sampled_ethernet, sampled_ipv4, sampled_ipv6 may be used.

       enterprise = 0 refers to standard sFlow structures. An
       sFlow implementor should use the standard structures
       where possible, even if they can only be partially
       populated. Vendor specific structures are allowed, but
       should only be used to supplement the existing
       structures, or to carry information that hasn't yet
       been standardized. */

    Does this sufficiently clarify the issue?

    Regards,
    Peter
    ----------------------
    Peter Phaal
    InMon Corp.

    Peter_Phaal@inmon.com

    --------
    /* Proposed sFlow Datagram Version 5 (draft 2) */

    /* Revision History
       - version 5:
             Adds MPLS extensions.
             Removes limit on packet header size.
             Adds host field to URL extension and clarifies url_direction.
             Adds NAT support.
             Adds vendor specific extensions.
             Splits sFlow datagram definition from flow/counter data definitions.
             Adds length information to data fields.
             Adds length information to sample types.
       - version 4 adds support for BGP communities
       - version 3 adds support for extended_url information
    */

    /* Address types */

    typedef opaque ip_v4[4];
    typedef opaque ip_v6[16];

    enum address_type {
       IP_V4 = 1,
       IP_V6 = 2
    }

    union address (address_type type) {
       case IP_V4:
         ip_v4;
       case IP_V6:
         ip_v6;
    }

    /* Data Format
         The data_format uniquely identifies the format of an opaque structure in
         the sFlow specification. A data_format is constructed as follows:
           - The most significant 20 bits correspond to the SMI Private Enterprise
             Code of the entity responsible for the structure definition. A value
             of zero is used to denote standard structures defined by sflow.org.
           - The least significant 12 bits are a structure format number assigned
             by the enterprise that should uniquely identify the format of the
             structure.

         Enterprises are encouraged to publish structure definitions in XDR format to
         www.sflow.org */

    typedef unsigned int data_format;

    /* sFlowDataSource encoded as follows:
         The most significant byte of the source_id is used to indicate the type
         of sFlowDataSource (0 = ifIndex, 1 = smonVlanDataSource,
         2 = entPhysicalEntry) and the lower three bytes contain the relevant
         index value. */

    typedef unsigned int sflow_data_source;

    struct flow_record {
       data_format flow_format; /* The format of flow_data */
       opaque flow_data<>; /* Flow data uniquely defined
                                           by the flow_format */
    }

    /* Format of a single flow sample
         enterprise = 0
         format = 1 */

    struct flow_sample {
       unsigned int sequence_number; /* Incremented with each flow sample
                                           generated by this source_id */
       sflow_data_source source_id; /* sFlowDataSource */
       unsigned int sampling_rate; /* sFlowPacketSamplingRate */
       unsigned int sample_pool; /* Total number of packets that could have
                                           been sampled (i.e. packets skipped by
                                           sampling process + total number of
                                           samples) */
       unsigned int drops; /* Number of times a packet was dropped due to
                                           lack of resources */

       unsigned int input; /* SNMP ifIndex of input interface.
                                            0 if interface is not known. */
       unsigned int output; /* SNMP ifIndex of output interface,
                                            0 if interface is not known.
                                            Set most significant bit to indicate
                                            multiple destination interfaces
                                            (i.e. in case of broadcast or multicast)
                                            and set lower order bits to indicate
                                            number of destination interfaces.
                                            Examples:
                                               0x00000002 indicates ifIndex = 2
                                               0x00000000 ifIndex unknown.
                                               0x80000007 indicates a packet sent
                                                           to 7 interfaces.
                                               0x80000000 indicates a packet sent
                                                           to an unknown number of
                                                           interfaces greater than
                                                           1. */

       flow_record flow_records<>; /* Information about a sampled packet */
    }

    struct counter_block {
       data_format counter_format; /* The format of counters */
       opaque counters<>; /* A block of counters uniquely defined
                                          by the enterprise,format pair
                                          Enterprises are encouraged to publish
                                          structure definitions in XDR format to
                                          www.sflow.org */
    }

    /* Format of a single counter sample
         enterprise = 0
         format = 2 */

    struct counters_sample {
       unsigned int sequence_number; /* Incremented with each counter sample
                                          generated by this source_id */
       sflow_data_source source_id; /* sFlowDataSource */
       unsigned int sampling_interval; /* sFlowCounterSamplingInterval */
       counter_block counters<>; /* Counters polled for this source */
    }

    /* Format of a sample datagram */

    struct sample_record {
       data_format sample_type; /* Specifies the format of sample_data */
       opaque sample_data<>; /* A structure corresponding to the sample_type */
    }

    struct sample_datagram_v5 {
       address agent_address; /* IP address of sampling agent,
                                         sFlowAgentAddress. */
       unsigned int sequence_number; /* Incremented with each sample datagram
                                         generated */
       unsigned int uptime; /* Current time (in milliseconds since device
                                         last booted). Should be set as close to
                                         datagram transmission time as possible.*/
       sample_record samples<>; /* An array of sample records */
    }

    enum datagram_version {
       VERSION5 = 5
    }

    union sample_datagram_type (datagram_version version) {
       case VERSION5:
          sample_datagram_v5 datagram;
    }

    struct sample_datagram {
       sample_datagram_type version;
    }
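
For reference, the source_id and output encodings described in the comments of the specification above could be decoded along the following lines. This is only a sketch in Python; `decode_data_source` and `decode_output` are hypothetical names, and returning None for "unknown" is an assumption made for the example.

```python
SOURCE_TYPES = {0: "ifIndex", 1: "smonVlanDataSource", 2: "entPhysicalEntry"}


def decode_data_source(source_id):
    """Split source_id into (type, index): top byte is the type, low 3 bytes the index."""
    return source_id >> 24, source_id & 0x00FFFFFF


def decode_output(output):
    """Interpret the flow_sample output field.

    Returns ("single", ifIndex) or ("multiple", count). The second item
    is None when the interface, or the number of interfaces, is unknown.
    """
    if output & 0x80000000:
        count = output & 0x7FFFFFFF
        return "multiple", count or None
    return "single", output or None


# The four examples listed in the output field comment:
examples = [decode_output(v) for v in (0x00000002, 0x00000000, 0x80000007, 0x80000000)]
```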


