Discussion:
[Bug 795195] New: gst_vaapi_texture_glx_put_surface_unlocked is blocked 75-80% of the time
"GStreamer" (GNOME Bugzilla)
2018-04-12 11:12:48 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

Bug ID: 795195
Summary: gst_vaapi_texture_glx_put_surface_unlocked is blocked
75-80% of the time
Classification: Platform
Product: GStreamer
Version: 1.14.0
OS: Linux
Status: NEW
Severity: normal
Priority: Normal
Component: gstreamer-vaapi
Assignee: gstreamer-***@lists.freedesktop.org
Reporter: ***@canonical.com
QA Contact: gstreamer-***@lists.freedesktop.org
CC: ***@gmail.com, ***@igalia.com
GNOME version: ---

gst_vaapi_texture_glx_put_surface_unlocked is blocked 75-80% of the time.

This means if you need more than 20-25% CPU to decode and render a video,
you're going to miss frames.

I noticed this with two different machines. While they both spend around 75% of
their (real) time blocked in gst_vaapi_texture_glx_put_surface_unlocked, only
the one with the lower CPU requirement (10%) was able to render the video
smoothly. The other machine which requires 40% CPU time to play the same video
stutters badly, probably because 40%+75% > 100%.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 11:14:17 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #1 from Daniel van Vugt <***@canonical.com> ---
Created attachment 370849
--> https://bugzilla.gnome.org/attachment.cgi?id=370849&action=edit
kabylake-rt.prof.pdf

Real time profile from Kaby Lake.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 11:14:50 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #2 from Daniel van Vugt <***@canonical.com> ---
Created attachment 370850
--> https://bugzilla.gnome.org/attachment.cgi?id=370850&action=edit
haswell-rt.prof.pdf

Real time profile from the Haswell.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 11:16:55 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #3 from Daniel van Vugt <***@canonical.com> ---
Note: On platforms that use EGL instead of GLX, there is no problem at all.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 11:18:26 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #4 from Daniel van Vugt <***@canonical.com> ---
Also, decoding to ximagesink avoids the problem and is much smoother than
decoding to glimagesink (!)
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 12:12:13 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

Will Cooke <***@canonical.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@canonical.com
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 12:28:59 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

David Regev <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-12 18:29:13 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

Julien Isorce <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-13 09:54:43 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #5 from Daniel van Vugt <***@canonical.com> ---
I've now got a working patch to solve this bug and will polish it up for review
next week.

Although I seem to have incorrectly assumed that fixing this bug would solve my
own totem playback issues. Seems it doesn't, but the improved efficiency will
still be worth having.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-16 10:29:48 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #6 from Daniel van Vugt <***@canonical.com> ---
Created attachment 370982
--> https://bugzilla.gnome.org/attachment.cgi?id=370982&action=edit
fix-795195-v1.patch

Here's a much simpler fix than what I was working with last week...
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-17 09:05:39 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

Víctor Manuel Jáquez Leal <***@igalia.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #370982|none |reviewed
status| |

--- Comment #7 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
Review of attachment 370982:
--> (https://bugzilla.gnome.org/review?bug=795195&attachment=370982)

Thanks for the patch Daniel.

::: gst-libs/gst/vaapi/gstvaapitexture_glx.c
@@ +362,3 @@

+ /* Ensure vaPutSurface has really finished populating the pixmap */
+ glXWaitX ();

A couple comments:

1. This patch is focused only on Intel driver, where vaSyncSurfce boils down to
DRM_IOCTL_I915_GEM_SET_DOMAIN ioctl (there are many comments in the intertubes
complaining about the latency of this ioctl)

2. We need to test this patch on gallium driver, particularly in radeon, when
decoding H.264, since it is where vaSyncSurface also flushes the pending
operations.

3. Perhaps this is not the correct approach to this bug, perhaps the bug is in
vaapi-intel-driver or even in the kernel. Thus, we would be hiding the bug

4. This patch replacing vaSyncSurface with glXWaitX looks to me similar like
adding a sleep in the code :D in order to avoid a race condition.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-17 09:15:18 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #8 from Daniel van Vugt <***@canonical.com> ---
I can't help with gallium. Although I finally got hold of a radeon machine
recently, I haven't yet got vaapi working with it. So that's a whole different
issue.

While I'm not completely sure the patch is correct, I don't think comparing it
to adding a sleep is correct either. vaSyncSurface is sleeping dramatically
more than glXWaitX does. That's the bug.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-17 09:34:09 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #9 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
(In reply to Daniel van Vugt from comment #8)
Post by "GStreamer" (GNOME Bugzilla)
While I'm not completely sure the patch is correct, I don't think comparing
it to adding a sleep is correct either.
Agree. I'm sorry. I shouldn't made that comparison.
Post by "GStreamer" (GNOME Bugzilla)
vaSyncSurface is sleeping
dramatically more than glXWaitX does. That's the bug.
AFAIK, the bug is that the kernel's driver is taking too much to flush the
pending operations on a single surface.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-17 09:51:39 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #10 from Daniel van Vugt <***@canonical.com> ---
I'm not really sure.

Maybe it would be helpful to forget this bug for a moment and have a look at
what motivated it -> bug 795325
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-17 15:03:22 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #11 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
I asked in #intel-gfx

I just c&p the chat for sake of a real fix in intel-vaapi-driver

ceyusa hi.. I have seen a couple complains about the latency when calling
DRM_IOCTL_I915_GEM_SET_DOMAIN.. do you have some info in this regard?
mkuoppal ceyusa: changing the domain will take time as we might need to wait
for rendering to finish and flush
ceyusa mkuoppal: thanks
ceyusa out of curiosity: do you have any reported bug related?
* ickle lost irc history
ickle ceyusa: related to what?
mkuoppal set domain latency
ceyusa that
ickle ceyusa: friends don't let friends use set-domain
ceyusa ickle: hehehe
ickle if it weren't for random swapping, we could just let userspace handle
cache domains
ceyusa because in va, vaSyncSurface boils down to that ioctl
ickle right, that's just a wait-for-rendering and manipulate cpu caches
ickle SyncSurface should just be GEM_WAIT to avoid the cpu cache
manipulation side-effects
ickle and preferably have userspace fences to avoid touching the kernel
until it wants to sleep+wait
ceyusa agree
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-18 01:19:32 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

--- Comment #12 from Daniel van Vugt <***@canonical.com> ---
Does that mean I should move this to an intel-vaapi-driver bug on github?
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-04-18 12:52:39 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

Daniel van Vugt <***@canonical.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Attachment #370982|0 |1
is obsolete| |

--- Comment #13 from Daniel van Vugt <***@canonical.com> ---
Comment on attachment 370982
--> https://bugzilla.gnome.org/attachment.cgi?id=370982
fix-795195-v1.patch

I'm going to obsolete the glXWaitX patch. It's only right in theory. Whereas in
practice it's possibly wrong if vaPutSurface is too low-level to be affected by
it.

Plus I think frame rates are slightly reduced using that patch, suggesting it's
not syncing the right thing.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-11-03 15:53:53 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=795195

GStreamer system administrator <***@gstreamer.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |OBSOLETE

--- Comment #14 from GStreamer system administrator <***@gstreamer.net> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance:
https://gitlab.freedesktop.org/gstreamer/gstreamer-vaapi/issues/90.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Loading...