<?xml version="1.0" encoding="UTF-8"?>
<!-- edited with XMLSPY v5 rel. 3 U (http://www.xmlspy.com)
     by Daniel M Kohn (private) -->
<!DOCTYPE rfc SYSTEM "rfc2629.dtd" [
<!ENTITY rfc2119 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.2119.xml">
<!ENTITY rfc3611 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3611.xml">
<!ENTITY rfc4566 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.4566.xml">
<!ENTITY rfc3550 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.3550.xml">
<!ENTITY rfc6390 PUBLIC "" "http://xml.resource.org/public/rfc/bibxml/reference.RFC.6390.xml">
<!ENTITY I-D.ietf-avtcore-monarch PUBLIC "" "http://xml.resource.org/public/rfc/bibxml3/reference.I-D.ietf-avtcore-monarch.xml">
]>


<rfc category="std" docName="draft-ietf-xrblock-rtcp-xr-concsec-03.txt"ipr="trust200902">
<?rfc strict="yes" ?>
<?rfc toc="yes"?>
<?rfc tocdepth="4"?>
<?rfc symrefs="yes"?>
<?rfc sortrefs="yes" ?>
<?rfc compact="yes" ?>
<?rfc subcompact="yes" ?>
  <front>
    <title abbrev="RTCP XR Concealed Seconds">RTP Control Protocol (RTCP)
    Extended Report (XR) Block for Concealed Seconds Metric Reporting</title>

    <author fullname="Glen Zorn" initials="G." surname="Zorn">
      <organization>Network Zen</organization>

      <address>
        <postal>
          <street>227/358 Thanon Sanphawut</street>

          <city>Bang Na</city>
          
          <region>Bangkok</region>

          <code>10260</code>

          <country>Thailand</country>
        </postal>

        <phone>+66 (0) 909-0201060</phone>

        <email>glenzorn@gmail.com</email>
      </address>
    </author>

    <author fullname="Qin Wu" initials="Q." surname="Wu">
      <organization>Huawei</organization>

      <address>
        <postal>
          <street>101 Software Avenue, Yuhua District</street>

          <city>Nanjing</city>

          <region>Jiangsu</region>

          <code>210012</code>

          <country>China</country>
        </postal>

        <email>sunseawq@huawei.com</email>
      </address>
    </author>
    
    <author fullname="Alan Clark" initials="A." surname="Clark">
      <organization abbrev="Telchemy">Telchemy Incorporated</organization>

      <address>
        <postal>
          <street>2905 Premiere Parkway, Suite 280</street>

          <city>Duluth</city>

          <region>GA</region>

          <code>30097</code>

          <country>USA</country>
        </postal>

        <email>alan.d.clark@telchemy.com</email>
      </address>
    </author>
    
    <author fullname="Claire Bi" initials="C." surname="Bi">
      <organization abbrev="STTRI">Shanghai Research Institure of China
      Telecom Corporation Limited</organization>

      <address>
        <postal>
          <street>No.1835,South Pudong Road</street>

          <city>Shanghai</city>

          <code>200122</code>

          <country>China</country>
        </postal>

        <email>bijy@sttri.com.cn</email>
      </address>
    </author>


    <date year="2012" />

    <area>Real-time Applications and Infrastructure Area</area>

    <workgroup>Audio/Video Transport Working Group</workgroup>

    <keyword>RFC</keyword>

    <keyword>Request for Comments</keyword>

    <keyword>I-D</keyword>

    <keyword>Internet-Draft</keyword>

    <keyword>Real Time Control Protocol</keyword>

    <abstract>
      <t>This document defines an RTP Control Protocol(RTCP) Extended Report
      (XR) Block that allows the reporting of Concealed Seconds metrics,
      primarily for audio applications of RTP.</t>
    </abstract>
  </front>

  <middle>
    <section anchor="intro" title="Introduction">
      <section title="Concealed Seconds Block">
        <t>This draft defines a new block type to augment those defined in
        <xref target="RFC3611"></xref>, for use primarily in audio
        applications of RTP.</t>

        <t>At any instant, the audio output at a receiver may be classified as
        either 'normal' or 'concealed'. 'Normal' refers to playout of audio
        payload received from the remote end, and also includes locally
        generated signals such as announcements, tones and comfort noise.
        'Concealed' refers to playout of locally-generated signals used to
        mask the impact of network impairments such as lost packets or to
        reduce the audibility of jitter buffer adaptations.<list>
            <t>Editor’s Note: For video applications, the output at a receiver
            should also be classified as either normal or concealed. Should
            this paragraph be clear about this?</t>
          </list></t>

        <t>The new block type provides metrics for concealment. Specifically,
        the first metric (Unimpaired Seconds) reports the number of whole
        seconds occupied only with normal playout of data which the receiver
        obtained from the sender's stream. The second metric (Concealed
        Seconds) reports the number of whole seconds during which the receiver
        played out any locally-generated media data. A third metric (Severely
        Concealed Seconds (SCS)) reports the number of whole seconds during
        which the receiver played out locally-generated data for longer than SCS
        Threshold milliseconds.</t>

        <t>The metric belongs to the class of transport-related end system
        metrics defined in Wu, Hunt &amp; Arden <xref target="I-D.ietf-avtcore-monarch"></xref>.</t>
      </section>

      <section title="RTCP and RTCP XR Reports">
        <t>The use of RTCP for reporting is defined in Schulzrinne et al.&nbsp;
        <xref target="RFC3550"></xref>.
        Freidman, et al.&nbsp;<xref target="RFC3611"></xref> defines an
        extensible structure for reporting using an RTCP Extended Report (XR).
        This draft defines a new Extended Report block that MUST be used as
        specified in RFC 3550 and RFC 3611.</t>
      </section>

      <section title="Performance Metrics Framework">
        <t>The Performance Metrics Framework <xref target="RFC6390"></xref>
        provides guidance on the definition and specification of performance
        metrics. The RTP Monitoring Architecture <xref
        target="I-D.ietf-avtcore-monarch"></xref> provides guidelines for formatting RTCP XR blocks. 
        The Metrics Block described in this document are in
        accordance with the guidelines in <xref target="RFC6390"></xref> and
        <xref target="I-D.ietf-avtcore-monarch"></xref>.</t>
      </section>

      <section title="Applicability">
        <t>This metric is primarily applicable to audio applications of RTP.
        </t>

        <t>EDITOR'S NOTE: are there metrics for concealment of transport
        errors for video?</t>

        <t>Editor’s Note: note that with video it is possible to use RTP based
        retransmission and also FEC (e.g. COP3) - typically these would only
        be used with IPTV as this is less delay sensitive than interactive
        services.</t>
      </section>
    </section>

    <section title="Terminology">
      <section title="Standards Language">
        <t>The key words "MUST", "MUST NOT", "REQUIRED", "SHALL", "SHALL NOT",
        "SHOULD", "SHOULD NOT", "RECOMMENDED", "MAY", and "OPTIONAL" in this
        document are to be interpreted as described in <xref
        target="RFC2119">RFC 2119</xref>.</t>

        <t>In addition, the following terms are defined:</t>

        <t><list style="hanging">
            <t><vspace blankLines="1" />Editor’s Note: For Video loss
            concealment, at least the following four methods are used,i.e.,
            Frame freeze,inter-frame extrapolation, interpolation, Noise
            insertation, should this section consider giving definition of
            these four methods for video loss concealment?<vspace
            blankLines="1" /></t>
          </list></t>
      </section>
    </section>

    <section title="Concealment Seconds Block">
      <t>This block provides a description of potentially audible
      impairments due to lost and discarded packets at the endpoint, expressed
      on a time basis analogous to a traditional PSTN T1/E1 errored seconds
      metric.<list>
          <t>Editor’s Note: Should impairment also cover video
          application?</t>
        </list></t>

      <t>The following metrics are based on successive one second intervals as
      declared by a local clock. This local clock does NOT need to be
      synchronized to any external time reference. The starting time of this
      clock is unspecified. Note that this implies that the same loss pattern
      could result in slightly different count values, depending on where the
      losses occur relative to the particular one-second demarcation points.
      For example, two loss events occurring 50ms apart could result in either
      one concealed second or two, depending on the particular 1000 ms
      boundaries used.</t>

      <t>The seconds in this sub-block are not necessarily calendar seconds.
      At the tail end of a session, periods of time of less than 1000ms shall
      be incorporated into these counts if they exceed 500ms and shall be
      disregarded if they are less than 500ms.</t>

      <section title="Report Block Structure">
        <t>
          Concealed Seconds Metrics Block
<figure title="Figure 1: Report Block Structure">
            <artwork align="center">
 0                   1                   2                   3
 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 2 3 4 5 6 7 8 9 0 1 
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|    BT=NCS     | I |plc|Rserved|       block length=4          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                         SSRC of Source                        |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Unimpaired Seconds                         |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                    Concealed Seconds                          |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Severely Concealed Seconds    |    RESERVED   | SCS Threshold |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
</artwork>
          </figure></t>
      </section>

      <section title="Definition of Fields in Concealed Seconds Metrics Block">
        <t><list style="hanging">
            <t hangText="Block type (BT): 8 bits"><vspace blankLines="1" />A
            Concealed Seconds Metrics Report Block is identified by the
            constant &lt;NCS&gt;. <vspace blankLines="1" />[Note to RFC Editor: please
            replace &lt;NCS&gt; with the IANA provided RTCP XR block type for this
            block.] <vspace blankLines="1" /></t>

            <t hangText="Interval Metric flag (I): 2 bit"><vspace
            blankLines="1" />This field is used to indicate whether the
            Concealed Seconds metrics are Sampled, Interval or Cumulative
            metrics, that is, whether the reported values applies to the most
            recent measurement interval duration between successive metrics
            reports (I=10) (the Interval Duration) or to the accumulation
            period characteristic of cumulative measurements (I=11) (the
            Cumulative Duration) or is a sampled instantaneous value (I=01)
            (Sampled Value). <vspace blankLines="1" /></t>

            <t hangText="Packet Loss Concealment Method (plc): 2 bits"><vspace
            blankLines="1" />This field is used to identify the packet loss
            concealment method in use at the receiver, according to the
            following code: <list>
                <t>bits 014-015<list>
                    <t>0 = silence insertion</t>

                    <t>1 = simple replay, no attenuation</t>

                    <t>2 = simple replay, with attenuation</t>

                    <t>3 = enhanced</t>

                    <t>Other values reserved<list>
                        <t>Editor’s Note 1 : In the packet loss concealment
                        methods,”Enhanced” is defines as one new Packet loss
                        Concealment method? However it is not clear what this
                        packet loss concealment method looks like?</t>

                        <t>Editor’s Note 2: For Video loss concealment, there
                        are a range of methods used, for example:<list>
                            <t>(i) Frame freeze In this case the impaired video
                            frame is not displayed and the previously displayed
                            frame is hence “frozen” for the duration of the loss
                            event</t>

                            <t>(ii) Inter-frame extrapolation If an area of the
                            video frame is damaged by loss, the same area from
                            the previous frame(s) can be used to estimate what
                            the missing pixels would have been. This can work
                            well in a scene with no motion but can be very
                            noticeable if there is significant movement from one
                            frame to another. Simple decoders may simply re-use
                            the pixels that were in the missing area, more
                            complex decoders may try to use several frames to do
                            a more complex extrapolation.</t>

                            <t>(iii) Interpolation A decoder may use the
                            undamaged pixels in the image to estimate what the
                            missing block of image should have</t>

                            <t>(iv) Noise insertion A decoder may insert random
                            pixel values - which would generally be less
                            noticeable than a blank rectangle in the image.</t>
                          </list>Therefore more text required in the future
                        draft to discuss Techniques for Video Loss Concealment
                        method in this document.</t>
                      </list></t>
                  </list></t>
              </list><vspace blankLines="1" /></t>

            <t hangText="Reserved (resv): 4 bits"><vspace
            blankLines="1" />These bits are reserved. They SHOULD be set to
            zero by senders and MUST be ignored by receivers.<vspace
            blankLines="1" /></t>

            <t hangText="Block Length: 16 bits"><vspace blankLines="1" /> The
            length of this report block in 32-bit words, minus one. For the
            Delay block, the block length is equal to 4.<vspace
            blankLines="1" /></t>

            <t hangText="SSRC of source: 32 bits"><vspace blankLines="1" />As
            defined in Section 4.1 of RFC 3611.<vspace
            blankLines="1" /></t>

            <t hangText="Unimpaired Seconds: 32 bits"><vspace
            blankLines="1" />A count of the number of unimpaired Seconds that
            have occurred.<vspace blankLines="1" />An unimpaired Second is
            defined as a continuous period of 1000ms during which no frame
            loss or discard due to late arrival has occurred. Every second in
            a session must be classified as either OK or Concealed.
            <vspace blankLines="1" />
            If voice activity detection
            <xref target="VAD"></xref> is used, normal playout of comfort noise or other silence
            concealment signals during periods of talker silence SHALL be counted as unimpaired
            seconds.
            <list>
                <t>Editor’s Note: It should be clear that voice activity detection does not apply
                to video.</t>
              </list></t>

            <t>If the measured value exceeds 0xFFFFFFFD, the value 0xFFFFFFFE
            SHOULD be reported to indicate an over-range measurement. If the
            measurement is unavailable, the value 0xFFFFFFFF SHOULD be
            reported. <vspace blankLines="1" /></t>

            <t hangText="Concealed Seconds: 32 bits"><vspace
            blankLines="1" />A count of the number of Concealed Seconds that
            have occurred.<vspace blankLines="1" />A Concealed Second is
            defined as a continuous period of 1000ms during which any frame
            loss or discard due to late arrival has occurred. <vspace
            blankLines="1" /> Equivalently, a concealed second is one in which
            some Loss-type concealment has occurred. Buffer adjustment-type
            concealment SHALL NOT cause Concealed Seconds to be incremented,
            with the following exception. An implementation MAY cause
            Concealed Seconds to be incremented for 'emergency' buffer
            adjustments made during talkspurts. <vspace
            blankLines="1" />Loss-type concealment is reactive insertion or
            deletion of samples in the audio playout stream due to effective
            frame loss at the audio decoder. "Effective frame loss" is the
            event in which a frame of coded audio is simply not present at the
            audio decoder when required. In this case, substitute audio
            samples are generally formed, at the decoder or elsewhere, to
            reduce audible impairment. <vspace blankLines="1" />Buffer
            Adjustment-type concealment is proactive or controlled insertion
            or deletion of samples in the audio playout stream due to jitter
            buffer adaptation, re-sizing or re-centering decisions within the
            endpoint. <vspace blankLines="1" />Because this insertion is
            controlled, rather than occurring randomly in response to losses,
            it is typically less audible than loss-type concealment. For
            example, jitter buffer adaptation events may be constrained to
            occur during periods of talker silence, in which case only silence
            duration is affected, or sophisticated time-stretching methods for
            insertion/deletion during favorable periods in active speech may
            be employed. For these reasons, buffer adjustment-type concealment
            MAY be exempted from inclusion in calculations of Concealed
            Seconds and Severely Concealed Seconds. <list>
                <t>Editor’s Note: In this document, two kind of concealments
                are defined: a. Loss-type concealment b. Buffer
                Adjustment-type concealment Loss-type concealment is
                applicable to both audio and video. However Buffer
                Adjustment-type concealment is usually applied to audio.
                Should this section be clear about this?</t>
              </list><vspace blankLines="1" />However, an implementation
            SHOULD include buffer-type concealment in counts of Concealed
            Seconds and Severely Concealed Seconds if the event occurs at an
            'inopportune' moment, with an emergency or large, immediate
            adaptation during active speech, or for unsophisticated adaptation
            during speech without regard for the underlying signal, in which
            cases the assumption of low-audibility cannot hold. In other
            words, jitter buffer adaptation events which may be presumed to be
            audible SHOULD be included in Concealed Seconds and Severely
            Concealed Seconds counts.</t>

            <t>Concealment events which cannot be classified as Buffer
            Adjustment- type MUST be classified as Loss-type.</t>

            <t>For clarification, the count of Concealed Seconds MUST include
            the count of Severely Concealed Seconds.</t>

            <t>If the measured value exceeds 0xFFFFFFFD, the value 0xFFFFFFFE
            SHOULD be reported to indicate an over-range measurement. If the
            measurement is unavailable, the value 0xFFFFFFFF SHOULD be
            reported.
            <vspace blankLines="1"/>
            </t>

            <t hangText="Severely Concealed Seconds: 16 bits"><vspace
            blankLines="1" />A count of the number of Severely Concealed
            Seconds.<vspace blankLines="1" />A Severely Concealed Second is
            defined as a non-overlapping period of 1000 ms during which the
            cumulative amount of time that has been subject to frame loss or
            discard due to late arrival, exceeds the SCS Threshold. <vspace
            blankLines="1" />If the measured value exceeds 0xFFFD, the value
            0xFFFE SHOULD be reported to indicate an over-range measurement.
            If the measurement is unavailable, the value 0xFFFF SHOULD be
            reported. <vspace blankLines="1" /></t>

            <t hangText="Reserved: 8 bits"><vspace blankLines="1" />These bits
            are reserved. They SHOULD be set to zero by senders and MUST be
            ignored by receivers. <vspace blankLines="1" /></t>

            <t hangText="SCS Threshold: 8 bits"><vspace blankLines="1" />The
            SCS Threshold defines the amount of time corresponding to lost or
            discarded frames that must occur within a one second period in
            order for the second to be classified as a Severely Concealed
            Second. This is expressed in milliseconds and hence can represent
            a range of 0.1 to 25.5 percent loss or discard. <vspace
            blankLines="1" />A default threshold of 50ms (5% effective frame
            loss per second) is suggested. <vspace blankLines="1" /></t>
          </list></t>
      </section>
    </section>

    <section title="SDP Signaling">
      <t>RFC 3611 defines the use of
      Session Description Protocol (SDP, <xref target="RFC4566"/>
      for signaling the use of XR blocks. XR blocks
      MAY be used without prior signaling.</t>

      <section title="SDP rtcp-xr-attrib Attribute Extension">
        <t>This section augments the SDP 
        attribute "rtcp-xr" defined in Section 5.1 of RFC 3611 by
        providing an additional value of "xr-format" to signal the use of the
        report block defined in this document.</t>

        <t>The SDP attribute for the block has an additional optional
        paremeter, "thresh", used to supply a value for the SCS Threshold
        parameter. If this parameter is present, the RTP system receiving the
        SDP SHOULD use this value for the current session. If the parameter is
        not present, the RTP system SHOULD use a locally configured value.</t>

        <figure>
          <artwork>
xr-format =/ xr-conc-sec-block

xr-conc-sec-block = "conc-sec" ["=" thresh]

thresh      = 1*DIGIT          ; threshold for SCS (ms)
DIGIT          = &lt;as defined in Section 3.4 of [RFC5234]&gt;
</artwork>
        </figure>
      </section>

      <section title="Offer/Answer Usage">
        <t>When SDP is used in offer-answer context, the SDP Offer/Answer
        usage defined in Section 5.2 of RFC 3611 applies.</t>
      </section>
    </section>

    <section title="IANA Considerations">
      <t>New block types for RTCP XR are subject to IANA registration. For
      general guidelines on IANA considerations for RTCP XR, refer to Section 6 of RFC 3611.</t>

      <section title="New RTCP XR Block Type value">
        <t>This document assigns the block type value &lt;NCS&gt;
        in the IANA "RTCP XR
        Block Type Registry" to the "Concealed Seconds Metrics Block".</t>

        <t>[Note to RFC Editor: please replace &lt;NCS&gt; with the IANA provided RTCP
        XR block type for this block.]</t>
      </section>

      <section title="New RTCP XR SDP Parameter">
        <t>This document also registers a new parameter "conc-sec" in the
        "RTCP XR SDP Parameters Registry".</t>
      </section>

      <section title="Contact information for registrations">
        <t><figure>
            <artwork>
   The contact information for the registrations is:

   Alan Clark (alan.d.clark@telchemy.com)

   2905 Premiere Parkway, Suite 280
   Duluth, GA  30097
   USA
</artwork>
          </figure><vspace blankLines="1" /></t>
      </section>
    </section>

    <section title="Security Considerations">
      <t>It is believed that this proposed RTCP XR report block introduces no
      new security considerations beyond those described in RFC 3611.
      This block does not provide per-packet
      statistics so the risk to confidentiality documented in Section 7,
      paragraph 3 of <xref target="RFC3611"></xref> does not apply.</t>
    </section>

    <section title="Contributors">
      <t>Geoff Hunt wrote the initial draft of this document.</t>
    </section>

    <section title="Acknowledgements">
      <t>The authors gratefully acknowledge reviews and feedback provided by
      Bruce Adams, Philip Arden, Amit Arora, Bob Biskner, Kevin Connor, Claus
      Dahm, Randy Ethier, Roni Even, Jim Frauenthal, Albert Higashi, Tom Hock,
      Shane Holthaus, Paul Jones, Rajesh Kumar, Keith Lantz, Mohamed Mostafa,
      Amy Pendleton, Colin Perkins, Mike Ramalho, Ravi Raviraj, Albrecht
      Schwarz, Tom Taylor, and Hideaki Yamada.</t>
    </section>
  </middle>

  <back>
    <references title="Normative References">
      &rfc2119;
      &rfc3611;
      &rfc4566;
      &rfc3550;
    </references>

    <references title="Informative References">
      &I-D.ietf-avtcore-monarch;
      &rfc6390;

      <reference anchor="VAD">
        <front>
          <title>http://en.wikipedia.org/wiki/Voice_activity_detection</title>

          <author>
            <organization></organization>
          </author>

          <date />
        </front>
      </reference>
    </references>

  </back>
</rfc>
