Discussion:
[Bug 797152] New: vaapi decode plugins leak dmabuf fds when transitioning from PLAYING to NULL state
"GStreamer" (GNOME Bugzilla)
2018-09-16 19:05:51 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

Bug ID: 797152
Summary: vaapi decode plugins leak dmabuf fds when
transitioning from PLAYING to NULL state
Classification: Platform
Product: GStreamer
Version: 1.14.x
OS: Linux
Status: NEW
Severity: normal
Priority: Normal
Component: gstreamer-vaapi
Assignee: gstreamer-***@lists.freedesktop.org
Reporter: ***@gmail.com
QA Contact: gstreamer-***@lists.freedesktop.org
CC: ***@gmail.com, ***@igalia.com
GNOME version: ---

Created attachment 373668
--> https://bugzilla.gnome.org/attachment.cgi?id=373668&action=edit
Reproduction of the described issue

When a pipeline containing a VAAPI decode element transitions from PLAYING to
NULL state, it seems two dmabuf fds are leaked (when using EGL as the GL
platform and dmabuf as the transfer mechanism from decode element to sink).

A minimal pipeline that reproduces the issue is the following:
videotestsrc ! x264enc ! video/x-h264 ! vaapih264dec !
video/x-raw(memory:DMABuf) ! fakesink

See the attached code which creates this pipeline and repeatedly sets it to
PLAYING and NULL. Each time, the number of open dmabuf fds is printed. The
output is as follows:

count of dmabuf fds: 0
setting pipeline state to PLAYING
count of dmabuf fds: 4
setting pipeline state to NULL
count of dmabuf fds: 2
setting pipeline state to PLAYING
count of dmabuf fds: 6
setting pipeline state to NULL
count of dmabuf fds: 4
setting pipeline state to PLAYING
count of dmabuf fds: 8
setting pipeline state to NULL
after exiting main loop
count of dmabuf fds: 6
after cleanup
count of dmabuf fds: 6

Each time the pipeline transitions to PLAYING, four dmabuf fds seem to be
opened. When transitioning to NULL, only two are closed/freed. Additionally,
when the program exits and the gstreamer elements are cleaned up (after three
cycles), the fds still seem to be left open.

I have an application at hand where it is necessary to repeatedly stop and
start the stream. This issue causes memory leakage and, eventually, program
crashing due to too many open files. I have also tried to more gracefully stop
the pipeline by sending an EOS event and only transitioning to NULL state after
receiving the event on the pipeline's bus. However, the results are similar -
leakage happens even then. Is there something regarding the dmabufs that has to
be taken account during such transitions, or is this simply a bug in
gstreamer-vaapi?

Even though the attached code uses a home-baked function that counts the open
dmabuf fds, the behaviour can be verified by e.g. watching the output of lsof:
lsof -p $(pidof repro) | grep -c dmabuf

I would appreciate any input on how to resolve the issue (or if you can confirm
it's an actual bug in gstreamer-vaapi or some other element).

At least the following elements exhibit the behaviour:
vaapih264dec
vaapih265dec
vaapivp8dec

Hardware:
Intel i5-8250U
Intel UHD Graphics 620

OS:
Arch Linux
gstreamer version 1.14.2
gstreamer-vaapi version 1.14.2
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 03:02:20 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

Julien Isorce <***@gmail.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@gmail.com
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 06:45:21 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

Juha Leino <***@optofidelity.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
CC| |***@optofidelity.com
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 12:13:27 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

Víctor Manuel Jáquez Leal <***@igalia.com> changed:

What |Removed |Added
----------------------------------------------------------------------------
Summary|vaapi decode plugins leak |vaapidec: leak of dmabuf
|dmabuf fds when |fds when transitioning from
|transitioning from PLAYING |PLAYING to NULL state
|to NULL state |
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 12:16:26 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #1 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
(In reply to Sakari Kapanen from comment #0)
Created attachment 373668 [details]
I have an application at hand where it is necessary to repeatedly stop and
start the stream. This issue causes memory leakage and, eventually, program
crashing due to too many open files. I have also tried to more gracefully
stop the pipeline by sending an EOS event and only transitioning to NULL
state after receiving the event on the pipeline's bus. However, the results
are similar - leakage happens even then. Is there something regarding the
dmabufs that has to be taken account during such transitions, or is this
simply a bug in gstreamer-vaapi?
It looks like a bug, indeed.
Even though the attached code uses a home-baked function that counts the
open dmabuf fds, the behaviour can be verified by e.g. watching the output
lsof -p $(pidof repro) | grep -c dmabuf
I would appreciate any input on how to resolve the issue (or if you can
confirm it's an actual bug in gstreamer-vaapi or some other element).
Tracing the created buffers and detect those which aren't deallocated and
deallocate them at the right moment.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 12:29:26 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #2 from Sakari Kapanen <***@gmail.com> ---
(In reply to Víctor Manuel Jáquez Leal from comment #1)

Thank you for the quick response.
Post by "GStreamer" (GNOME Bugzilla)
(In reply to Sakari Kapanen from comment #0)
Post by "GStreamer" (GNOME Bugzilla)
Even though the attached code uses a home-baked function that counts the
open dmabuf fds, the behaviour can be verified by e.g. watching the output
lsof -p $(pidof repro) | grep -c dmabuf
I would appreciate any input on how to resolve the issue (or if you can
confirm it's an actual bug in gstreamer-vaapi or some other element).
Tracing the created buffers and detect those which aren't deallocated and
deallocate them at the right moment.
Sounds sensible to me. Would you mind kindly elaborating what is the best way
to trace the buffers (or how you would do it)? Would it be via GST_DEBUG logs
or something else?

Also, it would be helpful to have some pointers to where dmabuf
allocation/deallocation takes place in the VAAPI decode elements. I am pretty
new to the gstreamer-vaapi module and some starting point would be very helpful
in the investigation.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 12:59:47 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #3 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
(In reply to Sakari Kapanen from comment #2)
Post by "GStreamer" (GNOME Bugzilla)
Would you mind kindly elaborating what is the best
way to trace the buffers (or how you would do it)? Would it be via GST_DEBUG
logs or something else?
Yes, using those that are already in the code, or adding more as you need. Also
I would check the leak tracer tools:

https://www.collabora.com/news-and-blog/blog/2016/06/20/gstreamer-leaks-tracer/

Check the creation and destruction of buffers on logging fdmemory:5
also the dup of the fd in gstreamer-vaapi gst/vaapi/gstvaapivideomemory.c
Post by "GStreamer" (GNOME Bugzilla)
Also, it would be helpful to have some pointers to where dmabuf
allocation/deallocation takes place in the VAAPI decode elements. I am
pretty new to the gstreamer-vaapi module and some starting point would be
very helpful in the investigation.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 18:36:28 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #4 from Sakari Kapanen <***@gmail.com> ---
Created attachment 373676
--> https://bugzilla.gnome.org/attachment.cgi?id=373676&action=edit
Output dump with log data from gstreamer-vaapi

Ok, so I got further in the investigation. I modified the repro a bit, made it
print the open dmabuf fd numbers and do just one cycle of PLAYING->NULL. I also
added some debug prints inside gstreamer-vaapi and enabled
GST_DEBUG="fdmemory:5,vaapivideomemory:5,2".

So it seems it is the original (not dup'ed) fds that are leaked.
gstreamer-vaapi in fact tries to release them in
gst_vaapi_buffer_proxy_release_handle, but the call to vaReleaseBufferHandle
there fails, returning the error code VA_STATUS_ERROR_INVALID_BUFFER (btw, this
only seems to be reported at GST_DEBUG=5 level by default which is kind of a
noisy level).

I digged a bit deeper inside the intel-vaapi-driver. I added some debug prints
to see where i965_ReleaseBufferHandle fails. It's the if condition here:
https://github.com/intel/intel-vaapi-driver/blob/master/src/i965_drv_video.c#L6671
that fails and returns the error code. So indeed the proxy->va_buf handle in
the call to vaReleaseBufferHandle seems to be invalid somehow.

If I detect the VA_STATUS_ERROR_INVALID_BUFFER in
gst_vaapi_buffer_proxy_release_handle and manually call
close(proxy->va_info.handle) if that occurs, the dmabufs are properly freed (of
course). However, this only is a band-aid to the issue - it would be good to
find where the va_buf ID goes invalid. Were you able to reproduce the issue?
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 19:26:24 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #5 from Sakari Kapanen <***@gmail.com> ---
Interestingly, if I disable this call to gst_vaapi_buffer_proxy_release_data:
https://github.com/GStreamer/gstreamer-vaapi/blob/b4d6a3b11338c188b2c6d47533732c0e0f28c39d/gst/vaapi/gstvaapivideomemory.c#L1026
then the fds are closed by vaReleaseBufferHandle in
gst_vaapi_buffer_proxy_release_handle just fine.

If I'm reading the code correctly, the gst_vaapi_buffer_proxy_release_data call
seems to destroy the underlying VAAPI image object using vaDestroyImage, which
seems to also destroy the underlying buffer:
https://github.com/intel/intel-vaapi-driver/blob/master/src/i965_drv_video.c#L4985

Therefore, the later call to vaReleaseBufferHandle (when freeing the resources
allocated by the pipeline) fails because on the VAAPI driver side, the
underlying buffer is already destroyed. Does this make sense at all? I hope I
followed the code correctly.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-17 19:47:12 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #6 from Sakari Kapanen <***@gmail.com> ---
(Sorry for the consecutive comments)
It seems to be a similar issue to what's discussed here:
https://bugzilla.gnome.org/show_bug.cgi?id=755072#c38

The difference is that now the image is released by the call to
gst_vaapi_buffer_proxy_release_data in the end of gst_vaapi_dmabuf_memory_new.
(The destroy_func of the dmabuf_proxy is assigned here:
https://github.com/GStreamer/gstreamer-vaapi/blob/master/gst-libs/gst/vaapi/gstvaapisurface_drm.c#L41
)
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-18 13:29:00 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #7 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
Thanks a lot for your work, Sakari.

If I recall correctly, in order to use those buffers for decoding, the
allocated derive image must be released, otherwise the driver couldn't write on
the surface.

I don't know if that changed in the intel-vaapi-driver, or how's the situation
with media-driver or amd state-tracker.

Could you post a patch, removing that data release and run tests with it?
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-18 13:38:02 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #8 from Sakari Kapanen <***@gmail.com> ---
(In reply to Víctor Manuel Jáquez Leal from comment #7)
Post by "GStreamer" (GNOME Bugzilla)
If I recall correctly, in order to use those buffers for decoding, the
allocated derive image must be released, otherwise the driver couldn't write
on the surface.
That's what I also gathered from the comments in the code.
Post by "GStreamer" (GNOME Bugzilla)
I don't know if that changed in the intel-vaapi-driver, or how's the
situation with media-driver or amd state-tracker.
I could reproduce the same leak on the media-driver at least (normally running
intel-vaapi-driver though).
Post by "GStreamer" (GNOME Bugzilla)
Could you post a patch, removing that data release and run tests with it?
Removing the data release broke the actual application I'm working on - no
decoded frames could be received and displayed. So that doesn't seem to be the
way to go at least.

I'm not quite confident with writing a patch for this issue since the solution
is not likely going to be as simple as removing the data release. Despite, it
seems to me that the solution would be something along the lines of
gstreamer-vaapi taking ownership of that dmabuf fd (since the VAAPI driver
won't know about that handle or buffer anymore since the buffer is destroyed on
VAAPI side by the data release call) and taking care of closing it later.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-09-18 15:47:48 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

--- Comment #9 from Víctor Manuel Jáquez Leal <***@igalia.com> ---
(In reply to Sakari Kapanen from comment #8)
Post by "GStreamer" (GNOME Bugzilla)
(In reply to Víctor Manuel Jáquez Leal from comment #7)
Post by "GStreamer" (GNOME Bugzilla)
Could you post a patch, removing that data release and run tests with it?
Removing the data release broke the actual application I'm working on - no
decoded frames could be received and displayed. So that doesn't seem to be
the way to go at least.
Ok. That confirms the issue.
Post by "GStreamer" (GNOME Bugzilla)
I'm not quite confident with writing a patch for this issue since the
solution is not likely going to be as simple as removing the data release.
Despite, it seems to me that the solution would be something along the lines
of gstreamer-vaapi taking ownership of that dmabuf fd (since the VAAPI
driver won't know about that handle or buffer anymore since the buffer is
destroyed on VAAPI side by the data release call) and taking care of closing
it later.
As far as I understand, the solution goes in a different directions. The usage
of vaAcquireBufferHandle() and vaReleaseBufferHandle() is being deprecated
because they hide details of the dmabuf required for negotiation, specially
with other devices (other GPU or so).

Check https://bugzilla.gnome.org/show_bug.cgi?id=779146#c55 for the replacement
of it.

Nowadays, only vaExportSurfaceHandle() works for... exporting... there's still
a need for importing handles onto surfaces API.

Anyway, the fix for this issue is to replace all the machinery for dmabufs for
pipeline downstream with this new libva API. This API is supported by Intel and
AMD backends.

Or more nicely, we should keep this logic only for old implementations of
libva.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
"GStreamer" (GNOME Bugzilla)
2018-11-03 15:55:46 UTC
Permalink
https://bugzilla.gnome.org/show_bug.cgi?id=797152

GStreamer system administrator <***@gstreamer.net> changed:

What |Removed |Added
----------------------------------------------------------------------------
Status|NEW |RESOLVED
Resolution|--- |OBSOLETE

--- Comment #10 from GStreamer system administrator <***@gstreamer.net> ---
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been
closed from further activity.

You can subscribe and participate further through the new bug through this link
to our GitLab instance:
https://gitlab.freedesktop.org/gstreamer/gstreamer-vaapi/issues/106.
--
You are receiving this mail because:
You are the QA Contact for the bug.
You are the assignee for the bug.
Loading...