Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Shaka wrongly parses pts and dts values from mpegts stream after ffmpeg #1426

Open
acris5 opened this issue Aug 27, 2024 · 10 comments
Open

Shaka wrongly parses pts and dts values from mpegts stream after ffmpeg #1426

acris5 opened this issue Aug 27, 2024 · 10 comments
Labels
flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this priority: P2 Smaller impact or easy workaround type: bug Something isn't working correctly
Milestone

Comments

@acris5
Copy link

acris5 commented Aug 27, 2024

System info

Operating System: docker
Shaka Packager Version: b5c2cb8

Issue and steps to reproduce the problem

I got http mpegts stream and decode it with ffmpeg and then pass to packager.
ffmpeg -re -i http://172.17.0.1:8008/2222 -map "0:v:0" -map "0:a:0" -map "0:s?" -map "0:d?" -forced-idr 1 -force_key_frames "expr:gte(t,n_forced*2)" -sc_threshold 0 -c:v:0 libx264 -c:a aac -c:s copy -copyts -f mpegts pipe: >/tmp/pipes/pipe1

packager
'in=/shaka-packager/pipes/pipe1,stream=video,init_segment=/shaka-results/bitrate_1/video_init.mp4,segment_template=/shaka-results/bitrate_1/video_$Number$.m4s,playlist_name=/shaka-results/bitrate_1/video.m3u8'
'in=/shaka-packager/pipes/pipe1,stream=audio,lang=ru,init_segment=/shaka-results/bitrate_1/audio_init.mp4,segment_template=/shaka-results/bitrate_1/audio_$Number$.m4s,playlist_name=/shaka-results/bitrate_1/audio.m3u8'
'in=/shaka-packager/pipes/pipe1,stream=text,format=ttml+mp4,init_segment=/shaka-results/bitrate_1/text_init.mp4,segment_template=/shaka-results/bitrate_1/text_$Number$.m4s,lang=ru,playlist_name=/shaka-results/bitrate_1/text.m3u8'
--max_hd_pixels 8294400 --hls_master_playlist_output /shaka-results/demo_master.m3u8 --mpd_output /shaka-results/manifest.mpd --hls_playlist_type LIVE
--segment_duration 2 --min_buffer_time 4 --suggested_presentation_delay 10 --time_shift_buffer_depth 12 --allow_approximate_segment_timeline --preserved_segments_outside_live_window 20

What is the expected result?
HLS and Dash output should contain video audio and teletext, but sometime it lacks some streams.

I noticed that it wrongly parses pts and dts and gets following different values:

2024-08-27T06:35:26.928729959Z TS Packet size=139 pts=-4160261398 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:26.928733417Z TS Packet size=10866 pts=4429674000 dts=4429670400 data_alignment_indicator=0
2024-08-27T06:35:26.928736669Z TS Packet size=2695 pts=4429659343 dts=-9223372036854775808 data_alignment_indicator=0
2024-08-27T06:35:26.928740049Z TS Packet size=139 pts=-4160258691 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:26.928743360Z TS Packet size=31525 pts=4429692000 dts=4429674000 data_alignment_indicator=0
2024-08-27T06:35:27.071983621Z TS Packet size=18825 pts=4429684800 dts=4429677600 data_alignment_indicator=0
2024-08-27T06:35:27.072142371Z TS Packet size=18 pts=126000 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:27.072172674Z TS Packet size=18 pts=126000 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:27.072190385Z TS Packet size=139 pts=-4160253277 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:27.072205791Z TS Packet size=139 pts=-4160250569 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:27.072216067Z TS Packet size=10122 pts=4429681200 dts=-9223372036854775808 data_alignment_indicator=0
2024-08-27T06:35:27.072322665Z TS Packet size=2910 pts=4429674703 dts=-9223372036854775808 data_alignment_indicator=0
2024-08-27T06:35:27.072339553Z TS Packet size=139 pts=-4160247862 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T06:35:27.072345231Z TS Packet size=13549 pts=4429688400 dts=4429684800 data_alignment_indicator=0
2024-08-27T06:35:27.214525312Z TS Packet size=43954 pts=4429706400 dts=4429688400 data_alignment_indicator=0

While ffprobe shows that there are normal values instead:

ffprobe -v 0 -show_entries packet=pts,dts,size,codec_type  demo/pipes/pipe1 

[PACKET]
codec_type=subtitle
pts=4449578454
dts=4449578454
size=139
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=subtitle
pts=4449578454
dts=4449578454
size=139
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=video
pts=4449636000
dts=4449628800
size=22721
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=subtitle
pts=4449600054
dts=4449600054
size=139
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=subtitle
pts=4449600054
dts=4449600054
size=139
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=video
pts=4449632400
dts=4449632400
size=15829
[SIDE_DATA]
[/SIDE_DATA]
[/PACKET]
[PACKET]
codec_type=data
pts=126000
dts=126000
size=18
[SIDE_DATA]
@acris5
Copy link
Author

acris5 commented Aug 27, 2024

As I learned problem is somewhere in the following function

// Given that |time| is coded using 33 bits,
// UnrollTimestamp returns the corresponding unrolled timestamp.
// The unrolled timestamp is defined by:
// |time| + k * (2 ^ 33)
// where k is estimated so that the unrolled timestamp
// is as close as possible to |previous_unrolled_time|.
static int64_t UnrollTimestamp(int64_t previous_unrolled_time, int64_t time) {
  // Mpeg2 TS timestamps have an accuracy of 33 bits.
  const int nbits = 33;

  // |timestamp| has a precision of |nbits|
  // so make sure the highest bits are set to 0.
  DCHECK_EQ((time >> nbits), 0);

  // Consider 3 possibilities to estimate the missing high bits of |time|.
  int64_t previous_unrolled_time_high = (previous_unrolled_time >> nbits);
  int64_t time0 = ((previous_unrolled_time_high - 1) << nbits) | time;
  int64_t time1 = ((previous_unrolled_time_high + 0) << nbits) | time;
  int64_t time2 = ((previous_unrolled_time_high + 1) << nbits) | time;

  // Select the min absolute difference with the current time
  // so as to ensure time continuity.
  int64_t diff0 = time0 - previous_unrolled_time;
  int64_t diff1 = time1 - previous_unrolled_time;
  int64_t diff2 = time2 - previous_unrolled_time;
  if (diff0 < 0)
    diff0 = -diff0;
  if (diff1 < 0)
    diff1 = -diff1;
  if (diff2 < 0)
    diff2 = -diff2;

  int64_t unrolled_time;
  int64_t min_diff;
  if (diff1 < diff0) {
    unrolled_time = time1;
    min_diff = diff1;
  } else {
    unrolled_time = time0;
    min_diff = diff0;
  }
  if (diff2 < min_diff)
    unrolled_time = time2;
  std::cout<<"time: "<<time<<", unrolled: "<<unrolled_time<<std::endl;
  return unrolled_time;

2024-08-27T09:20:12.087333044Z time: 5319075769, unrolled: -3270858823
2024-08-27T09:20:12.087336357Z TS Packet size=139 pts=-3270858823 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T09:20:12.087337907Z time: 5319078476, unrolled: -3270856116
2024-08-27T09:20:12.087339549Z TS Packet size=139 pts=-3270856116 dts=-9223372036854775808 data_alignment_indicator=1
2024-08-27T09:20:12.087341980Z time: 5319063823, unrolled: 5319063823
2024-08-27T09:20:12.087354873Z TS Packet size=2794 pts=5319063823 dts=-9223372036854775808 data_alignment_indicator=0

@acris5
Copy link
Author

acris5 commented Aug 27, 2024

Ffmpeg generates several packages with 126000 pts at start and then starts passing real pts
input 126000 5474206969 leads to negative result

@acris5
Copy link
Author

acris5 commented Aug 28, 2024

And I am not sure if it is correct to parse pts as signed int values.

@zaki699
Copy link
Contributor

zaki699 commented Sep 5, 2024

Hi @acris5,

PESContext in FFMPEG also uses signed integer type for storing the PTS/DTS.

typedef struct PESContext {
    int pid;
    int pcr_pid; /**< if -1 then all packets containing PCR are considered */
    int stream_type;
    MpegTSContext *ts;
    AVFormatContext *stream;
    AVStream *st;
    AVStream *sub_st; /**< stream for the embedded AC3 stream in HDMV TrueHD */
    enum MpegTSState state;
    /* used to get the format */
    int data_index;
    int flags; /**< copied to the AVPacket flags */
    int PES_packet_length;
    int pes_header_size;
    int extended_stream_id;
    uint8_t stream_id;
    int64_t pts, dts;
    int64_t ts_packet_pos; /**< position of first TS packet of this PES packet */
    uint8_t header[MAX_PES_HEADER_SIZE];
    AVBufferRef *buffer;
    SLConfigDescr sl;
    int merged_st;
} PESContext;

The issue must come from somewhere else.

@acris5
Copy link
Author

acris5 commented Oct 31, 2024

I mean that UnrollTimestamp(126000, 5474206969) returns negative value -3115727623
when fixed function should return 5474206969

@acris5
Copy link
Author

acris5 commented Oct 31, 2024

static int64_t UnrollTimestampFix(int64_t previous_unrolled_time, int64_t time) { 
  // Mpeg2 TS timestamps have an accuracy of 33 bits.
  const int nbits = 33;

  // |timestamp| has a precision of |nbits|
  // so make sure the highest bits are set to 0.


  DCHECK_EQ((time >> nbits), 0);
  //126000 5474206969 leads to negative result
  //time0: -2394030290 time1: 6195904302 time2: 14785838894
  //diff0: 2394156290 diff1: 6195778302 diff2: 14785712894
  //time: 6195904302, unrolled: -2394030290, previos: 126000
  if (time > 0 && time > previous_unrolled_time) return time; //fix because we don't want to get wrong time at start
  // Consider 3 possibilities to estimate the missing high bits of |time|.
  int64_t previous_unrolled_time_high = (previous_unrolled_time >> nbits);
  int64_t time0 = ((previous_unrolled_time_high - 1) << nbits) | time;
  int64_t time1 = ((previous_unrolled_time_high + 0) << nbits) | time;
  int64_t time2 = ((previous_unrolled_time_high + 1) << nbits) | time;

  // Select the min absolute difference with the current time
  // so as to ensure time continuity.
  int64_t diff0 = time0 - previous_unrolled_time;
  int64_t diff1 = time1 - previous_unrolled_time;
  int64_t diff2 = time2 - previous_unrolled_time;
  if (diff0 < 0)
    diff0 = -diff0;
  if (diff1 < 0)
    diff1 = -diff1;
  if (diff2 < 0)
    diff2 = -diff2;

  int64_t unrolled_time;
  int64_t min_diff;
  if (diff1 < diff0) {
    unrolled_time = time1;
    min_diff = diff1;
  } else {
    unrolled_time = time0;
    min_diff = diff0;
  }
  if (diff2 < min_diff)
    unrolled_time = time2;
  
  return unrolled_time;
}

@zaki699
Copy link
Contributor

zaki699 commented Oct 31, 2024

static uint64_t UnrollTimestampFix(uint64_t previous_unrolled_time, uint64_t time) { 
  // MPEG-2 TS timestamps have an accuracy of 33 bits.
  const int nbits = 33;
  const uint64_t max_timestamp = (1ULL << nbits);
  const uint64_t half_max = max_timestamp / 2;

  // Ensure the highest bits are set to 0.
  DCHECK_EQ((time >> nbits), 0);

  // Calculate the modulo of the previous unrolled time
  uint64_t previous_mod = previous_unrolled_time % max_timestamp;

  // If time is greater than previous_mod and the difference is less than half the max_timestamp,
  // assume no wrap-around has occurred.
  if (time > previous_mod && (time - previous_mod) < half_max) {
    return (previous_unrolled_time & ~(max_timestamp - 1)) + time;
  }

  // Otherwise, determine the correct wrap-around by finding the closest unrolled time
  uint64_t base = previous_unrolled_time & ~(max_timestamp - 1);

  // Candidates: base - max_timestamp, base, base + max_timestamp
  // Ensure that base - max_timestamp does not underflow
  uint64_t time0 = (base >= max_timestamp) ? (base - max_timestamp) + time : time;
  uint64_t time1 = base + time;
  uint64_t time2 = base + max_timestamp + time;

  // Compute differences
  uint64_t diff0 = (time0 > previous_unrolled_time) ? (time0 - previous_unrolled_time) : (previous_unrolled_time - time0);
  uint64_t diff1 = (time1 > previous_unrolled_time) ? (time1 - previous_unrolled_time) : (previous_unrolled_time - time1);
  uint64_t diff2 = (time2 > previous_unrolled_time) ? (time2 - previous_unrolled_time) : (previous_unrolled_time - time2);

  // Find the candidate with the smallest difference
  uint64_t unrolled_time = time1;
  uint64_t min_diff = diff1;

  if (diff0 < min_diff) {
    unrolled_time = time0;
    min_diff = diff0;
  }

  if (diff2 < min_diff) {
    unrolled_time = time2;
  }

  return unrolled_time;
}


struct TSPacket {
    size_t size;
    uint64_t pts;
    uint64_t dts;
    bool data_alignment_indicator;
    // ... other fields ...
};

I wanted to suggest switching from int64_t to uint64_t for our timestamp handling in the UnrollTimestampFix function. Using uint64_t makes a lot of sense since timestamps are always positive (MPEG-2 TS), and it helps prevent those pesky negative values we’ve been seeing.

Update UnrollTimestampFix: Change the function to use uint64_t instead of int64_t.
Check Related Code: Make sure other parts of the code, like the dts field, also use uint64_t. Right now, some of them are still int64_t, which is why we’re getting negative numbers when we assign large unsigned values.

If we decide to stick with int64_t:
We’ll need to be extra careful to handle any potential overflow or underflow to avoid those negative timestamps. It’s doable, but it adds a bit more complexity to our code.

@joeyparrish
Copy link
Member

Are there legitimate cases for negative DTS/PTS? Particularly, small negative values in the first GOP? If so, we might need to stick with int64_t and just be careful about overflow. And this could be true for any format supported by Packager, not just TS, if internally we use the same structures for all.

If there are no legitimate negative timestamps in any formats we care about, I think uint64_t could be reasonable.

@cosmin, what do you think?

@joeyparrish joeyparrish added type: bug Something isn't working correctly flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this priority: P2 Smaller impact or easy workaround labels Oct 31, 2024
@github-actions github-actions bot added this to the v3.3 milestone Oct 31, 2024
@cosmin
Copy link
Contributor

cosmin commented Nov 1, 2024

Some containers do not support negative timestamps, but that's a limitation of those containers. Negative DTS is needed for any samples that have a composition time offset and a PTS of 0 for example. A common case that I've always run into is HEVC with open GOP and decode-able leading B frames (which have only forward references).

@zaki699
Copy link
Contributor

zaki699 commented Nov 4, 2024

Hi @cosmin @joeyparrish,

Maybe we can implement a conditional Assignment Based on Container Support:

Approach: Before assigning DTS and PTS to a container, check if it supports negative timestamps. If not, adjust the timestamps accordingly.

  • Identify Container Capabilities: Determine whether the target container supports negative timestamps.
  • Adjust Timestamps Accordingly:
  • If Supported: Assign DTS and PTS as-is.
  • If Not Supported: Offset the timestamps to ensure they are non-negative or adjust the unrolling logic to prevent negative values.
#include <cstdint>
#include <cassert>
#include <cstdio>

// Enumeration to represent different container formats
enum class ContainerFormat {
    MPEG2_TS,
    MP4,
    WebM,
    CustomSigned, // Formats that support negative DTS
    // Add other formats as needed
};

// Structure to hold TS Packet information
struct TSPacket {
    size_t size;
    int64_t pts; // Using int64_t to allow negative values where necessary
    int64_t dts;
    bool data_alignment_indicator;
    ContainerFormat format;
    // ... other fields ...
};

// Function to check if a container supports negative timestamps
bool ContainerSupportsNegativeTimestamps(ContainerFormat format) {
    switch (format) {
        case ContainerFormat::CustomSigned:
            return true;
        case ContainerFormat::MPEG2_TS:
        case ContainerFormat::MP4:
        case ContainerFormat::WebM:
            return false;
        // Handle other formats as needed
        default:
            return false;
    }
}

// Modified UnrollTimestampFix that handles signed timestamps
static int64_t UnrollTimestampFix(int64_t previous_unrolled_time, int64_t time, ContainerFormat format) { 
    const int nbits = 33;
    const int64_t max_timestamp = (1LL << nbits);
    const int64_t half_max = max_timestamp / 2;

    // Ensure the highest bits are set to 0.
    assert((time >> nbits) == 0);

    int64_t previous_mod = previous_unrolled_time % max_timestamp;
    if (previous_mod < 0) previous_mod += max_timestamp; // Ensure positive modulo

    if (time > previous_mod && (time - previous_mod) < half_max) {
        return (previous_unrolled_time & ~(max_timestamp - 1)) + time;
    }

    int64_t base = previous_unrolled_time & ~(max_timestamp - 1);
    int64_t time0 = (base >= max_timestamp) ? (base - max_timestamp) + time : time;
    int64_t time1 = base + time;
    int64_t time2 = base + max_timestamp + time;

    // Compute absolute differences
    int64_t diff0 = std::abs(time0 - previous_unrolled_time);
    int64_t diff1 = std::abs(time1 - previous_unrolled_time);
    int64_t diff2 = std::abs(time2 - previous_unrolled_time);

    // Select the closest unrolled time
    int64_t unrolled_time = time1;
    int64_t min_diff = diff1;

    if (diff0 < min_diff) {
        unrolled_time = time0;
        min_diff = diff0;
    }

    if (diff2 < min_diff) {
        unrolled_time = time2;
    }

    // If the container does not support negative timestamps, ensure unrolled_time is non-negative
    if (!ContainerSupportsNegativeTimestamps(format) && unrolled_time < 0) {
        // Adjust unrolled_time to be non-negative
        // For example, by adding max_timestamp until it's non-negative
        while (unrolled_time < 0) {
            unrolled_time += max_timestamp;
        }
    }

    return unrolled_time;
}

// Example usage
int main() {
    // Example packets
    TSPacket packets[] = {
        {139, 5319075769, 0, true, ContainerFormat::MPEG2_TS},
        {139, 5319078476, 0, true, ContainerFormat::MPEG2_TS},
        {2794, 5319063823, 0, false, ContainerFormat::MPEG2_TS},
        // Add more packets as needed
    };

    int64_t previous_unrolled_ts = 0;

    for(auto& packet : packets) {
        // Unroll timestamps
        int64_t unrolled = UnrollTimestampFix(previous_unrolled_ts, packet.pts, packet.format);
        packet.pts = unrolled;
        packet.dts = unrolled; // Or assign based on your logic

        // For containers that do not support negative DTS, adjust as needed
        if (!ContainerSupportsNegativeTimestamps(packet.format) && packet.dts < 0) {
            // Example adjustment: Offset by a large positive value
            packet.dts += (1LL << 33);
        }

        // Log the packet information
        printf("Container: %d, size: %zu, pts: %" PRId64 ", dts: %" PRId64 ", data_alignment_indicator: %d\n",
               static_cast<int>(packet.format),
               packet.size, packet.pts, packet.dts, packet.data_alignment_indicator);

        // Update previous_unrolled_time
        previous_unrolled_ts = unrolled;
    }

    return 0;
}

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
flag: seeking PR We are actively seeking PRs for this; we do not currently expect the core team will resolve this priority: P2 Smaller impact or easy workaround type: bug Something isn't working correctly
Projects
None yet
Development

No branches or pull requests

5 participants
@cosmin @joeyparrish @acris5 @zaki699 and others