Discussion:
[Bug 780682] New: compositor: corrupts output when input caps change while running
"GStreamer" (GNOME Bugzilla)
2017-03-29 12:34:53 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

Bug ID: 780682
Summary: compositor: corrupts output when input caps change
while running
Classification: Platform
Product: GStreamer
Version: git master
OS: Linux
Status: NEW
Severity: normal
Priority: Normal
Component: gst-plugins-bad
Assignee: gstreamer-***@lists.freedesktop.org
Reporter: ***@tolabaki.gr
QA Contact: gstreamer-***@lists.freedesktop.org
GNOME version: ---

Consider the following pipeline:

gst-launch-1.0 -v videotestsrc is-live=true !
video/x-raw,width=10,height=20,format=I420 ! compositor name=c ! glimagesink
multifilesrc location=%04d.bmp loop=true caps=image/bmp,framerate=10/1 !
avdec_bmp ! videoconvert ! videorate ! c.

...where the .bmp files have been created with:
gst-launch-1.0 videotestsrc num-buffers=10 ! video/x-raw,width=320,height=240 !
avenc_bmp ! multifilesink location=%04d.bmp
gst-launch-1.0 videotestsrc num-buffers=10 ! video/x-raw,width=640,height=480 !
avenc_bmp ! multifilesink location=%04d.bmp index=10
(so there should be a total of 20 images, 10 at 320x240 and 10 at 640x480)

Now at the beginning, the caps of avdec_bmp::src have width=320,height=240 and
therefore the output of compositor is configured to be at 320x240 as well. As
soon as the 11th image is loaded, the caps now have width=640,height=480 and
compositor is reconfigured to output @ 640x480. As soon as this happens, the
output becomes corrupt (!) momentarily, while valgrind shows a couple of
invalid reads around video-converter code:

==26291== Thread 4 c:src:
==26291== Invalid read of size 16
==26291== at 0x4028AC0: ??? (in /run/user/1000/orcexec.x2M1gv (deleted))
==26291== by 0x72DAA68: do_unpack_lines (video-converter.c:2785)
==26291== by 0x72DABF5: gst_line_cache_get_lines (video-converter.c:617)
==26291== by 0x72DAD03: convert_generic_task (video-converter.c:3072)
==26291== by 0x72D6658: gst_parallelized_task_runner_run
(video-converter.c:297)
==26291== by 0x72D93E6: video_converter_generic (video-converter.c:3170)
==26291== by 0x7A290D5: gst_compositor_pad_prepare_frame (compositor.c:572)
==26291== by 0x7E439BA: gst_aggregator_iterate_sinkpads
(gstaggregator.c:380)
==26291== by 0x7C382A9: gst_video_aggregator_do_aggregate
(gstvideoaggregator.c:1364)
==26291== by 0x7C382A9: gst_video_aggregator_aggregate
(gstvideoaggregator.c:1582)
==26291== by 0x7E46784: gst_aggregator_aggregate_func (gstaggregator.c:824)
==26291== by 0x4EE29F8: gst_task_func (gsttask.c:335)
==26291== by 0x5B3DAED: g_thread_pool_thread_proxy (gthreadpool.c:307)

The problem seems to be that compositor chooses input buffers from its queue
based on timestamps and never checks if the buffer it has chosen actually
matches the configured format, so momentarily it picks a 320x240 frame,
thinking that it is a 640x480 one, and sometimes also vice versa when the
resolution drops.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-03-29 12:38:50 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

Sebastian Dröge (slomo) <***@coaxion.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@coaxion.net

--- Comment #1 from Sebastian Dröge (slomo) <***@coaxion.net> ---
That would also be solved by https://bugzilla.gnome.org/show_bug.cgi?id=779945
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-03-29 16:55:21 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #2 from George Kiagiadakis <***@tolabaki.gr> ---
Hmm, theoretically it should be solved, you are right. I just tested, though,
with the patches from Olivier's branch and it doesn't seem to make any
difference...
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-03-30 14:47:07 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #3 from George Kiagiadakis <***@tolabaki.gr> ---
So, it looks like the actual problem in my case is not that the caps event is
not being processed in the src pad thread. The race in bug 779945 happens when
the queue is empty, which is not the case here.

The actual problem is that the framerate of the image stream in my example is
smaller than the framerate of the videotestsrc stream and this is what happens:

The last 320x240 buffer in c:sink_1 has pts=0.8s, while the first 640x480 one
has pts=0.9s
Compositor generates an output buffer (320x240) with t=0.8s, using the last
320x240 input buffer in the queue.
Right after that, it processes the caps event that was serialized and changes
the format to 640x480.
Next, it tries to generate the next output frame, which must have pts=0.833s,
since the output framerate is 30/1.
It considers the 640x480 frame, but notices that its PTS is 0.9s, so it keeps
it in the queue for later and continues using the previous frame (320x240).
Boom.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-03-30 15:56:55 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #4 from George Kiagiadakis <***@tolabaki.gr> ---
Doesn't seem very easy to fix. Basically we need to make sure that event
processing doesn't happen until the next buffer can be picked, but since the
events don't have timestamps, it's not trivial. Also, event processing happens
in GstAggregator, while the frame picking logic is in GstVideoAggregator, which
complicates things. Ideas welcome.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-03-30 16:49:14 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

Olivier Crête <***@ocrete.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@ocrete.ca

--- Comment #5 from Olivier Crête <***@ocrete.ca> ---
Seems like a bug in GstVideoAggregator, which should keep the event until it
picks the relevant buffer.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-04 08:51:50 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #6 from George Kiagiadakis <***@tolabaki.gr> ---
Created attachment 349218
--> https://bugzilla.gnome.org/attachment.cgi?id=349218&action=edit
videoaggregator: delay using new caps from a sink pad until the next buffer in
the queue is taken
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-04 12:06:05 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

George Kiagiadakis <***@tolabaki.gr> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #349218|0 |1
is obsolete| |

--- Comment #7 from George Kiagiadakis <***@tolabaki.gr> ---
Created attachment 349231
--> https://bugzilla.gnome.org/attachment.cgi?id=349231&action=edit
videoaggregator: delay using new caps from a sink pad until the next buffer in
the queue is taken

Added an additional check to fix a regression caught by the compositor unit
test.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-05 20:17:39 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #8 from Olivier Crête <***@ocrete.ca> ---
I wonder if this shouldnt be somehow done by the code in bug #776931 ..
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-06 08:08:50 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #9 from George Kiagiadakis <***@tolabaki.gr> ---
(In reply to Olivier Crête from comment #8)
Post by "GStreamer" (GNOME Bugzilla)
I wonder if this shouldnt be somehow done by the code in bug #776931 ..
No, it isn't done there. The attachment in bug 776931 basically moves code
around, the way I see it. I did try it, but it didn't make any difference on
this issue.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-20 16:17:20 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

George Kiagiadakis <***@tolabaki.gr> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #349231|0 |1
is obsolete| |

--- Comment #10 from George Kiagiadakis <***@tolabaki.gr> ---
Created attachment 350148
--> https://bugzilla.gnome.org/attachment.cgi?id=350148&action=edit
videoaggregator: delay using new caps from a sink pad until the next buffer in
the queue is taken

I noticed that the previous version of the patch had a bug. If _fill_queues()
returned GST_FLOW_NEEDS_DATA and the timeout variable of
gst_video_aggregator_aggregate() was TRUE, then reconfiguration would not
happen.

This is a fixed version of the patch. I simply don't use the return value from
_fill_queues() to determine if there is need for reconfiguration, I check the
src pad flags instead.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-21 10:51:52 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #11 from Sebastian Dröge (slomo) <***@coaxion.net> ---
Not sure if we should get this in before 1.12 now, Olivier what do you think?
It should probably also happen in combination with bug #776931, which is about
similar problems.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-24 18:38:48 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #12 from Olivier Crête <***@ocrete.ca> ---
I'd keen on pushing this, but #776931 seems like it needs another iteration to
fix the comments. I'd also be keen to push my agg-neg-2 branch, which I just
attached as #781673.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-25 09:25:36 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #13 from Sebastian Dröge (slomo) <***@coaxion.net> ---
I'd say now is a bit late, I'm planning 1.11.92 in two days, and then only
really important bugfixes and 1.12.0 hopefully next week.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-25 15:13:58 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #14 from Olivier Crête <***@ocrete.ca> ---
Thinking about this again, I think this one should go in before 1.12 as it
fixes a real observable bug, the others are more future looking.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-05-09 12:48:56 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

--- Comment #15 from Sebastian Dröge (slomo) <***@coaxion.net> ---
Olivier, please merge this now then if it looks all ok to you. Not for 1.12
yet, let's give it some testing time in master first.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-05-20 14:24:26 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

Olivier Crête <***@ocrete.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
Target Milestone|git master |1.13.1

--- Comment #16 from Olivier Crête <***@ocrete.ca> ---
Merged as:

commit 39d2526c34d2185c9d42a95ede8171b2d47fed18
Author: George Kiagiadakis <***@collabora.com>
Date: Tue Apr 4 11:25:43 2017 +0300

videoaggregator: delay using new caps from a sink pad until the next buffer
in the queue is taken

When caps changes while streaming, the new caps was getting processed
immediately in videoaggregator, but the next buffer in the queue that
corresponds to this new caps was not necessarily being used immediately,
which resulted sometimes in using an old buffer with new caps. Of course
there used to be a separate buffer_vinfo for mapping the buffer with its
own caps, but in compositor the GstVideoConverter was still using wrong
info and resulted in invalid reads and corrupt output.

This approach here is more safe. We delay using the new caps
until we actually select the next buffer in the queue for use.
This way we also eliminate the need for buffer_vinfo, since the
pad->info is always in sync with the format of the selected buffer.

https://bugzilla.gnome.org/show_bug.cgi?id=780682
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-05-20 14:24:48 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=780682

Olivier Crête <***@ocrete.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #350148|none |committed
status| |
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Loading...