Kubo Ryosuke

URL and Timestamp of MPEG-DASH Segments

Japanese Page

Introduction

The MPD, the manifest file of MPEG-DASH, describes segment URLs and other attributes, much like an HLS playlist.
An HLS playlist is a simple list of URLs and attributes and is very easy to read. However, an MPD has a complex structure and is more difficult to read than an HLS playlist.
In this post, I describe how segments are represented in an MPD.

MPD Examples

You can find the Reference Client and many DASH streams at the DASH Industry Forum.

https://reference.dashif.org/

BaseURL

If the MPD has a BaseURL tag, segment URLs are resolved as relative paths against the value of that tag.

<BaseURL>http://localhost/mystream/hd/</BaseURL>

BaseURL itself can contain a URL relative to the MPD.

<BaseURL>./hd/</BaseURL>

If all segments are concatenated into a single file, BaseURL can contain the full URL of that file.

<BaseURL>http://localhost/mystream/video.mp4</BaseURL>

In this case, the player downloads the segments with HTTP range requests.
Using SegmentBase or SegmentList/SegmentURL, which are described below, the MPD can tell the player the byte range of each segment.
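
As a minimal sketch of this URL resolution (the MPD URL and segment name are made-up values for illustration), a relative BaseURL is first resolved against the MPD URL, and the segment path is then resolved against that base:

from urllib.parse import urljoin

# Hypothetical values for illustration only.
mpd_url = "http://localhost/mystream/manifest.mpd"
base_url = "./hd/"                 # value of the <BaseURL> tag (relative form)
segment_path = "segment-1.mp4"     # a segment path taken from the MPD

# Resolve BaseURL against the MPD URL, then the segment path against the result.
resolved_base = urljoin(mpd_url, base_url)
segment_url = urljoin(resolved_base, segment_path)

print(resolved_base)  # http://localhost/mystream/hd/
print(segment_url)    # http://localhost/mystream/hd/segment-1.mp4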

SegmentTemplate

With the live profile, the SegmentTemplate tag is usually used.
This tag can be used for both live streams (dynamic type MPD) and VOD streams (static type MPD).
ABEMA, our video streaming service, uses the SegmentTemplate tag for all streams.

media attribute and initialization attribute

The SegmentTemplate@media and SegmentTemplate@initialization attributes indicate the media segments (MP4 containing moof and mdat) and the initialization segment (MP4 containing ftyp and moov), respectively.
However, you must resolve the template identifiers enclosed in $ that are embedded in those values.

<SegmentTemplate 
    media="$RepresentationID$/$Time$.mp4"
    initialization="$RepresentationID$/init.mp4">

There are 5 kinds of template identifiers. In most cases, $RepresentationID$ together with $Time$ or $Number$ is used.
The $RepresentationID$ identifier is replaced with the value of Representation@id.
In the following sample, there are 2 Representation tags at the same level as the SegmentTemplate tag.

<AdaptationSet contentType="video" mimeType="video/mp4" segmentAlignment="true">
    <SegmentTemplate timescale="90000" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4" />
    <Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive" />
    <Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive" />
</AdaptationSet>

Substituting video-hd and video-sd into $RepresentationID$ in the @initialization attribute gives the paths video-hd/init.mp4 and video-sd/init.mp4. These are the initialization segments of HD and SD, respectively.
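
As a rough sketch of this substitution (my own code, not taken from any particular player), template identifiers can be resolved with a simple string replacement; the regular expression below also accepts an optional width format such as $Number%05d$:

import re

def fill_template(template, representation_id=None, number=None, time=None, bandwidth=None):
    # Resolve $RepresentationID$, $Number$, $Time$, $Bandwidth$ and the $$ escape.
    # A simplified sketch, not a complete implementation of the DASH template rules.
    values = {
        "RepresentationID": representation_id,
        "Number": number,
        "Time": time,
        "Bandwidth": bandwidth,
    }

    def replace(match):
        name, width = match.group(1), match.group(2)
        if name == "":                      # "$$" is an escaped dollar sign
            return "$"
        value = values[name]
        return ("%0" + width + "d") % value if width else str(value)

    return re.sub(r"\$(RepresentationID|Number|Time|Bandwidth|)(?:%0(\d+)d)?\$", replace, template)

print(fill_template("$RepresentationID$/init.mp4", representation_id="video-hd"))
# -> video-hd/init.mp4
print(fill_template("$RepresentationID$/$Time$.mp4", representation_id="video-sd", time=11771760))
# -> video-sd/11771760.mp4
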
On the other hand, what does the $Time$ identifier indicate?

With SegmentTimeline

The SegmentTimeline tag enumerates the relative time and duration of each segment as follows:

<?xml version="1.0" encoding="utf-8"?>
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" availabilityStartTime="1970-01-01T00:00:00Z" profiles="urn:mpeg:dash:profile:isoff-live:2011" type="dynamic" minBufferTime="PT5.000000S" publishTime="2021-10-28T13:07:58Z" minimumUpdatePeriod="PT5.000000S" timeShiftBufferDepth="PT60.000000S" suggestedPresentationDelay="PT15.000000S">
  <BaseURL>http://localhost/mystream/</BaseURL>
  <Period id="1" start="PT1609426800S">
    <AdaptationSet mimeType="video/mp4" segmentAlignment="true">
      <SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4">
        <SegmentTimeline>
          <S d="357357" t="11771760" />
          <S d="360360" r="3"/>
          <S d="357357" />
        </SegmentTimeline>
      </SegmentTemplate>
      <Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive" />
      <Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive" />
    </AdaptationSet>
    <AdaptationSet mimeType="audio/mp4" segmentAlignment="true">
      <SegmentTemplate timescale="48000" presentationTimeOffset="5752947" media="$RepresentationID$/$Time$.mp4" initialization="$RepresentationID$/init.mp4">
        <SegmentTimeline>
          <S d="191488" t="6278272"/>
          <S d="192512" r="4"/>
        </SegmentTimeline>
      </SegmentTemplate>
      <Representation id="audio-high" bandwidth="190000">
        <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2" />
      </Representation>
      <Representation id="audio-low" bandwidth="64000">
        <AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="1" />
      </Representation>
    </AdaptationSet>
  </Period>
</MPD>

The SegmentTimeline tag contains S tags, each of which describes one segment.
If an S tag has a non-zero @r attribute, @r is the repeat count of that S tag. (A negative value means the repetition is open-ended.) For example, r="3" means 3 more segments; in other words, there are 4 consecutive segments with the same duration.
The S@d attribute is the duration of the segment and the S@t attribute is the earliest timestamp of the segment.
The second and subsequent segments can omit the @t attribute; in that case its value is the sum of the previous segment's @t and @d.
Substituting those timestamps into the $Time$ template gives the media segment URLs.
For example, the earliest segments in the sample above are http://localhost/mystream/video-hd/11771760.mp4 and http://localhost/mystream/video-sd/11771760.mp4.
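
As a rough sketch of how a player could expand this timeline into segment start times and URLs (my own simplified code, using the values from the video SegmentTimeline above):

# Expand the video SegmentTimeline of the sample MPD into (t, d) pairs
# and build the $Time$-based media URLs. Simplified illustration only.
s_elements = [
    {"t": 11771760, "d": 357357},   # <S d="357357" t="11771760" />
    {"d": 360360, "r": 3},          # <S d="360360" r="3" />
    {"d": 357357},                  # <S d="357357" />
]

def expand_timeline(s_elements):
    segments = []
    current_t = 0
    for s in s_elements:
        current_t = s.get("t", current_t)     # @t may be omitted after the first S
        for _ in range(s.get("r", 0) + 1):    # @r repeats the same duration r more times
            segments.append((current_t, s["d"]))
            current_t += s["d"]
    return segments

base_url = "http://localhost/mystream/"
media = "$RepresentationID$/$Time$.mp4"

for t, d in expand_timeline(s_elements):
    path = media.replace("$RepresentationID$", "video-hd").replace("$Time$", str(t))
    print(t, d, base_url + path)
# First URL printed: http://localhost/mystream/video-hd/11771760.mp4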

S@t and S@d are always integer values. Dividing them by the SegmentTemplate@timescale attribute gives the time and duration in seconds, respectively.
So, in the sample above, the duration of the earliest video segment is 357357 / 90000 = 3.97 seconds, and its timestamp is 11771760 / 90000 = 130.797 seconds. These values match the corresponding values in the MP4 boxes.
Then, subtracting SegmentTemplate@presentationTimeOffset from S@t gives the elapsed time from the period start.
For the sample above, the elapsed time of the latest video segment from the period start is given by the following expression.

(11771760 + 357357 + 360360 * 4 - 10786776) / 90000 = 30.93 seconds

Adding Period@start to MPD@availabilityStartTime gives the absolute time of the period start.
So, in the sample above, the period start is 2020-12-31T15:00:00Z and the composition time of the latest video segment is 2020-12-31T15:00:30.930Z.
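
Putting these steps together, a short sketch using only the attribute values from the sample MPD above computes the absolute time of the latest video segment:

from datetime import datetime, timedelta, timezone

# Values taken from the sample MPD above.
availability_start_time = datetime(1970, 1, 1, tzinfo=timezone.utc)  # MPD@availabilityStartTime
period_start = 1609426800             # Period@start, PT1609426800S
timescale = 90000                     # SegmentTemplate@timescale
presentation_time_offset = 10786776   # SegmentTemplate@presentationTimeOffset
latest_t = 11771760 + 357357 + 360360 * 4   # S@t of the latest video segment

# Elapsed time of the latest segment from the period start, in seconds.
elapsed = (latest_t - presentation_time_offset) / timescale   # 30.93...

# Absolute period start and absolute composition time of the latest segment.
period_start_abs = availability_start_time + timedelta(seconds=period_start)
latest_segment_abs = period_start_abs + timedelta(seconds=elapsed)

print(period_start_abs.isoformat())    # 2020-12-31T15:00:00+00:00
print(latest_segment_abs.isoformat())  # 2020-12-31T15:00:30.930900+00:00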

Only SegmentTemplate

As mentioned above, you can use SegmentTimeline to describe the timestamp (composition time) and duration of each segment.
On the other hand, if all segments have the same duration, SegmentTimeline is not mandatory.

<AdaptationSet contentType="video" mimeType="video/mp4" segmentAlignment="true">
    <SegmentTemplate duration="2" startNumber="1000" initialization="$RepresentationID$/init.mp4" media="$RepresentationID$/$Number$.mp4" />
    <Representation id="video-300k" bandwidth="300000" codecs="avc1.64001e" frameRate="30" height="360" width="640" />
</AdaptationSet>

In the sample above, SegmentTemplate@duration is the segment duration, so every segment is 2 seconds long.
The @media attribute contains the $Number$ template, which is the index of the segment starting from @startNumber.
So the segment URLs of the above MPD are the following:

video-300k/1000.mp4 --- earliest 2 seconds
video-300k/1001.mp4 --- 2nd 2 seconds
video-300k/1002.mp4 --- 3rd 2 seconds
    :
video-300k/1099.mp4 --- 100th 2 seconds
    :
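
A minimal sketch of the $Number$ arithmetic, assuming the attributes of the sample above (duration="2", startNumber="1000", and the default timescale of 1):

def number_segment(media_template, representation_id, index, start_number=1000, duration=2):
    # Return URL and time range of the index-th segment (0-based), assuming a
    # fixed duration and a $Number$-based media template. Illustration only.
    number = start_number + index
    url = (media_template
           .replace("$RepresentationID$", representation_id)
           .replace("$Number$", str(number)))
    return url, index * duration, (index + 1) * duration

media = "$RepresentationID$/$Number$.mp4"
print(number_segment(media, "video-300k", 0))   # ('video-300k/1000.mp4', 0, 2)
print(number_segment(media, "video-300k", 99))  # ('video-300k/1099.mp4', 198, 200)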

The examples in this post use the $Time$ template with SegmentTimeline or the $Number$ template without SegmentTimeline.
However, these are not the only combinations: the $Number$ template can also be used together with SegmentTimeline.

Relation with Representation

In the samples above, the SegmentTemplate tag and the Representation tags are placed at the same level. However, a SegmentTemplate tag can also be placed inside each Representation tag.

<Representation id="video-hd" bandwidth="2000000" frameRate="30000/1001" height="720" width="1280" scanType="progressive">
    <SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="hd/$Time$.mp4" initialization="hd/init.mp4">
        ...
    </SegmentTemplate>
</Representation>
<Representation id="video-sd" bandwidth="1000000" frameRate="30000/1001" height="480" width="854" scanType="progressive">
    <SegmentTemplate timescale="90000" presentationTimeOffset="10786776" media="sd/$Time$.mp4" initialization="sd/init.mp4">
        ...
    </SegmentTemplate>
</Representation>

SegmentList, Initialization and SegmentURL

SegmentURL tags hold the URL or byte range of each segment.
For example, you can place an Initialization tag and SegmentURL tags inside a SegmentList tag as follows:

<SegmentList>
    <Initialization sourceURL="init.mp4" />
    <SegmentURL media="0.mp4" />
    <SegmentURL media="1.mp4" />
    <SegmentURL media="2.mp4" />
</SegmentList>

This is similar to an HLS playlist and easy to understand.

SegmentBase

If BaseURL holds the URL of a single MP4 file and the MPD has neither SegmentList nor SegmentTemplate, the player has no way of knowing the byte ranges of the segments.
The SegmentBase@indexRange attribute indicates the byte range of the index information (e.g. the sidx box) that describes the byte range of each segment.

<BaseURL>http://localhost/sample.mp4</BaseURL>
<SegmentBase indexRange="896-1730"/>

With this example, the player first sends a range request for bytes 896-1730. The response contains the timestamp and byte offset of each segment, and the player then sends a range request for each segment.
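
As a rough sketch of what the player does with that index (my own code, covering only the common case and assuming the indexRange ends exactly at the end of the sidx box), the following fetches the index with a range request and derives the byte range of each subsegment:

import struct
import urllib.request

url = "http://localhost/sample.mp4"
index_range = (896, 1730)  # SegmentBase@indexRange from the example above

# 1. Fetch only the index with an HTTP range request.
req = urllib.request.Request(url, headers={"Range": "bytes=%d-%d" % index_range})
data = urllib.request.urlopen(req).read()

# 2. Parse the sidx box (ISO/IEC 14496-12 SegmentIndexBox), common case only.
box_size, box_type = struct.unpack(">I4s", data[:8])
assert box_type == b"sidx"
version = data[8]
reference_id, timescale = struct.unpack(">II", data[12:20])
if version == 0:
    earliest_time, first_offset = struct.unpack(">II", data[20:28])
    pos = 28
else:
    earliest_time, first_offset = struct.unpack(">QQ", data[20:36])
    pos = 36
reference_count = struct.unpack(">H", data[pos + 2:pos + 4])[0]
pos += 4

# 3. Each reference gives the size and duration of one subsegment.
offset = index_range[1] + 1 + first_offset  # first media byte after the index
time = earliest_time
for _ in range(reference_count):
    ref, duration, _sap = struct.unpack(">III", data[pos:pos + 12])
    pos += 12
    size = ref & 0x7FFFFFFF                 # top bit is reference_type
    print("time %.3fs, bytes %d-%d" % (time / timescale, offset, offset + size - 1))
    offset += size
    time += duration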

Conclusion

An MPD can be written in many different ways, which makes it difficult to understand at first.
This post described the major ways segments are represented in an MPD.

