Discussion:
[Bug 778830] New: v4l2dec: Fix race when going from PAUSED to READY
Add Reply
"GStreamer" (GNOME Bugzilla)
2017-02-17 13:26:34 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Bug ID: 778830
Summary: v4l2dec: Fix race when going from PAUSED to READY
Classification: Platform
Product: GStreamer
Version: unspecified
OS: All
Status: NEW
Severity: normal
Priority: Normal
Component: gst-plugins-good
Assignee: gstreamer-***@lists.freedesktop.org
Reporter: ***@gnome.org
QA Contact: gstreamer-***@lists.freedesktop.org
GNOME version: ---

See commit message
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-17 13:26:39 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #1 from Thibault Saunier <***@gnome.org> ---
Created attachment 346071
--> https://bugzilla.gnome.org/attachment.cgi?id=346071&action=edit
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t
validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:


ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop:
assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null
repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

The taken solution in that patch is to delay the setting of
`v4l2dec->process` to the time where the processing loop is
actually started.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-17 13:27:10 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@ndufresne.ca
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-17 15:03:30 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Nicolas Dufresne (stormer) <***@ndufresne.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346071|none |needs-work
status| |

--- Comment #2 from Nicolas Dufresne (stormer) <***@ndufresne.ca> ---
Review of attachment 346071:
--> (https://bugzilla.gnome.org/review?bug=778830&attachment=346071)

Have you tested some aggressive seeks (where we don't wait for the seek to
complete) ? This patch needs work since it make
gst_v4l2_video_dec_loop_stopped() and some other code useless. It's all
complicated, and I never found a good solution. What happens is that sometimes,
the pad task will stop before gst_v4l2_video_dec_loop() is called. In this
case, we ended up with bugs, since the handle_frame() should return FLUSHING.

I think I was confused when I wrote this. Later Philipp Zabel found that we can
read and sync on the pad task state. This is being used in _finish() function.
I think the right solution is to completely remove this processing state and to
use the pad state instead to detect early flush (while _process() is being
called).
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-17 15:09:01 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830
Have you tested some aggressive seeks (where we don't wait for the seek to complete) ? This patch needs work since it make gst_v4l2_video_dec_loop_stopped() and some other code useless.
Yeah, I ran the validate testsuite with all the scrub seeking tests with that
patch and there seem to be no regression at all (and the testsuite results are
pretty good on the platform, nice work!).
It's all complicated, and I never found a good solution. What happens is that sometimes, the pad task will stop before gst_v4l2_video_dec_loop() is called. In this case, we ended up with bugs, since the handle_frame() should return FLUSHING.
My patch is about that case in the case of going back to READY, I am not sure
what is your concerns about seeking here.
I think I was confused when I wrote this. Later Philipp Zabel found that we can read and sync on the pad task state. This is being used in _finish() function. I think the right solution is to completely remove this processing state and to use the pad state instead to detect early flush (while _process() is being called).
That makes sense, will give that a try.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 19:47:21 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gnome.org
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 19:47:26 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346071|needs-work |none
status| |
Attachment #346071|0 |1
is obsolete| |

--- Comment #4 from Thibault Saunier <***@gnome.org> ---
Created attachment 346389
--> https://bugzilla.gnome.org/attachment.cgi?id=346389&action=edit
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t
validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:


ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop:
assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null
repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

Here I am removing the whole ->processing logic to base it all on the
GstTask state to avoid duplicating the knowledge.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 19:47:58 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #5 from Thibault Saunier <***@gnome.org> ---
Created attachment 346390
--> https://bugzilla.gnome.org/attachment.cgi?id=346390&action=edit
pad: Add API to get the current state of a task

Avoiding the user to need to deal with the locking himself etc.

API:
gst_pad_task_get_state
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 19:55:34 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #6 from Tim-Philipp Müller <***@zen.co.uk> ---
Comment on attachment 346390
--> https://bugzilla.gnome.org/attachment.cgi?id=346390
pad: Add API to get the current state of a task
+ * set, GST_TASK_STOPPED is returned.
not task -> no task
@pad' -> pad's
+GstTaskState
+gst_pad_get_task_state (GstPad * pad)
+{
+ ...
+ g_return_val_if_fail (GST_IS_PAD (pad), FALSE);
Should probably return a GstTaskState and not a boolean :)
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 20:05:06 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346390|0 |1
is obsolete| |

--- Comment #7 from Thibault Saunier <***@gnome.org> ---
Created attachment 346392
--> https://bugzilla.gnome.org/attachment.cgi?id=346392&action=edit
Since V1:

- Fix documenation typos
- Return a GstTaskStatus on assertions
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-21 20:05:55 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346392|Since V1: |pad: Add API to get the
description| |current state of a task
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-22 08:50:04 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Nicolas Dufresne (stormer) <***@ndufresne.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346389|none |needs-work
status| |

--- Comment #8 from Nicolas Dufresne (stormer) <***@ndufresne.ca> ---
Review of attachment 346389:
--> (https://bugzilla.gnome.org/review?bug=778830&attachment=346389)

It would also be nice to make a version with an internal version of
pad_get_state(), so we could merge in 1.10.

::: sys/v4l2/gstv4l2videodec.c
@@ +478,3 @@
/* When flushing, decoding thread may never run */
+ if (gst_pad_get_task_state (GST_VIDEO_DECODER_SRC_PAD (self)) !=
+ GST_TASK_STOPPED) {

Is that really possible in the task implementation ? When I read
gst_task_func(), the state will always be stopped when this is called. That's
because we take a ref in start_task, and will only leave gst_task_func() when
the state is STOPPED.

What this is trying to do, is to catch the case were we flush before our task
function get called. Because we have no guaranty to be called once. In that
case, the output_flow won't be set properly.

Maybe with a small rework, we could manage to initially set the output_flow to
FLUSHING, and get away with that ?
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-22 08:52:07 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #9 from Nicolas Dufresne (stormer) <***@ndufresne.ca> ---
Review of attachment 346392:
--> (https://bugzilla.gnome.org/review?bug=778830&attachment=346392)

::: gst/gstpad.c
@@ +6090,3 @@
+ task = GST_PAD_TASK (pad);
+ if (task == NULL)
+ goto no_task;

I think the goto is not really needed here. Just set ret = GST_TASK_STOPPED,
and then else ...

::: gst/gstpad.h
@@ +1418,3 @@
gboolean gst_pad_pause_task (GstPad *pad);
gboolean gst_pad_stop_task (GstPad *pad);
+GstTaskState gst_pad_get_task_state (GstPad *pad);

Does not seem to align with other declarations.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-02-24 19:21:48 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346392|none |committed
status| |

--- Comment #10 from Thibault Saunier <***@gnome.org> ---
Comment on attachment 346392
--> https://bugzilla.gnome.org/attachment.cgi?id=346392
pad: Add API to get the current state of a task

commit 6326c764ee85fa17f1d7838f6e4d5824e07e3132
Author: Thibault Saunier <***@osg.samsung.com>
Date: Fri Feb 17 15:48:17 2017 -0300

pad: Add API to get the current state of a task

Avoiding the user to need to deal with the locking himself etc.

API:
gst_pad_task_get_state

https://bugzilla.gnome.org/show_bug.cgi?id=778830
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-07 00:37:48 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #11 from Nicolas Dufresne (stormer) <***@ndufresne.ca> ---
Are you still looking into this issue ?
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-07 03:03:29 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

--- Comment #12 from Thibault Saunier <***@gnome.org> ---
Yes, I have an updated version of the patch which seems to work well on odroid
and wanted to check if I could test with vivid, gonna propose tomorrow.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-07 13:20:53 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #346389|needs-work |none
status| |
Attachment #346389|0 |1
is obsolete| |

--- Comment #13 from Thibault Saunier <***@gnome.org> ---
Created attachment 349475
--> https://bugzilla.gnome.org/attachment.cgi?id=349475&action=edit
v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t
validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:


ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop:
assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null
repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

Here I am removing the whole ->processing logic to base it all on the
GstTask state to avoid duplicating the knowledge.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-07 13:56:05 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Nicolas Dufresne (stormer) <***@ndufresne.ca> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #349475|none |accepted-commit_now
status| |

--- Comment #14 from Nicolas Dufresne (stormer) <***@ndufresne.ca> ---
Review of attachment 349475:
--> (https://bugzilla.gnome.org/review?bug=778830&attachment=349475)

Loogs good to me. I'm quite happy the we are finally removing some of the
atomic state.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2017-04-07 18:05:52 UTC
Reply
Permalink
Raw Message
https://bugzilla.gnome.org/show_bug.cgi?id=778830

Thibault Saunier <***@gnome.org> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |FIXED
Target Milestone|git master |1.11.91

--- Comment #15 from Thibault Saunier <***@gnome.org> ---
commit 7b7a809818a0fa77dee1e340b42473a1e6e9ea8a
Author: Thibault Saunier <***@osg.samsung.com>
Date: Fri Feb 17 10:01:08 2017 -0300

v4l2dec: Fix race when going from PAUSED to READY

Running `gst-validate-launcher -t
validate.file.playback.change_state_intensive.vorbis_vp8_1_webm`
on odroid XU4 (s5p-mfc v4l2 driver) often leads to:


ERROR:../subprojects/gst-plugins-good/sys/v4l2/gstv4l2videodec.c:215:gst_v4l2_video_dec_stop:
assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

This happens when the following race happens:

- T0: Main thread
- T1: Upstream streaming thread
- T2. v4l2dec processing thread)

[The decoder is in PAUSED state]

T0. The validate scenario runs `Executing (36/40) set-state: state=null
repeat=40`
T1- The decoder handles a frame
T2- A decoded frame is push downstream
T2- Downstream returns FLUSHING as it is already flushing changing state
T2- The decoder stops its processing thread and sets `->processing = FALSE`
T1- The decoder handles another frame
T1- `->process` is FALSE so the decoder restarts its streaming thread
T0- In v4l2dec-> stop the processing thread is stopped
NOTE: At this point the processing thread loop never started.
T0- assertion failed: (g_atomic_int_get (&self->processing) == FALSE)

Here I am removing the whole ->processing logic to base it all on the
GstTask state to avoid duplicating the knowledge.

https://bugzilla.gnome.org/show_bug.cgi?id=778830
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Loading...