7

composites happen at 60Hz until the browser window is closed after showing an ar...

 2 years ago
source link: https://bugzilla.mozilla.org/show_bug.cgi?id=1742797
Go to the source link to view the article. You can view the picture content, updated content and better typesetting reading experience. If the link is broken, please click the button below to view the snapshot at that time.
neoserver,ios ssh client
Open Bug 1742797 Opened 2 months ago Updated 5 hours ago

composites happen at 60Hz until the browser window is closed after showing an arrowpanel

Categories

(Core :: Graphics: Layers, defect)

Tracking

(REOPENED bug with no priority)

REOPENED

People

(Reporter: florian, Assigned: hiro)

References

(Depends on 1 open bug, Blocks 2 open bugs, Regressed 1 open bug)

Details

(Keywords: power, Whiteboard: [qf:p1:resource])

I found steps to reproduce for a case that causes composites to happen at 60Hz until the browser window is closed.

  1. Preparation: in a first browser window, start a download (eg. http://test-debit.free.fr/10485760.rnd), let it progress for a few seconds, then cancel it. This makes the downloads toolbar icon appear on all browser windows.
  2. Open a new browser window, and close the first browser window.
  3. Start the profiler.
  4. Click the downloads toolbar icon, this makes the downloads panel appear. Wait for a few seconds.
  5. Dismiss the downloads panel. Wait for a few seconds.
  6. (Optional) Open a new browser window, and close the browser window that was opened at step 2.
  7. Capture the profile.

I have been able to reproduce consistently on the current Nightly on 2 Windows machines. On my fast-ish development laptop (https://share.firefox.dev/3nPqRkF) and on the Quantum reference laptop (https://share.firefox.dev/30Vk6VH). I have not been able to reproduce on Mac (and I have not tried on Linux yet).

In the profiles, when we composite at 60Hz for no obvious reason, the Screenshots track of the downloads panel is black. Reopening the downloads panel makes the composition rate go down to normal.

After step 2, composition happens normally. After step 4 it still happens normally. After step 5 (hiding the downloads panel), it happens at 60Hz. After step 6 it goes back to normal.

Emilio, Markus, could you please forward this needinfo to someone who can figure this out? I'm not sure if this is in cross platform code or if it's Windows specific.

Flags: needinfo?(mstange.moz)
Flags: needinfo?(emilio)

(In reply to Florian Quèze [:florian] from comment #0)

I have not been able to reproduce on Mac (and I have not tried on Linux yet).

On Linux I can't reproduce the 60Hz composite. The one part of the behavior that exists on Linux too is that the panel has its own Screenshots track that ends only when closing the browser window it was attached to. But there's no black screenshots on that track. https://share.firefox.dev/3cMamj0

OS: Unspecified → Windows
Hardware: Unspecified → Desktop

I couldn't repro either on Linux... Maybe Sotaro knows what might cause this on Windows? Maybe windows keeps sending us paint events? Maybe his work on occlusion helps popups as well, unclear.

Flags: needinfo?(emilio) → needinfo?(sotaro.ikeda.g)

I currently don't have a Windows machine to reproduce.

Flags: needinfo?(mstange.moz)

Florian, can you check if this is a regression?

Flags: needinfo?(florian)

:florian, can you check FPS with debug overlay? It could be checked with the following prefs

  • gfx.webrender.debug.profiler-ui:FPS
  • gfx.webrender.debug.profiler:true

I saw the a lot of "Paint" markers in profiler. But FPS of debug overlay was very low.

It seems like handling of profiler marker is not good. "Paint" marker id set even when rendering does not happen. Rendering happens only when aRender==true.
https://searchfox.org/mozilla-central/rev/7fe9421af35256a95acc4620e9e0b76df7867287/gfx/webrender_bindings/RenderThread.cpp#489

Flags: needinfo?(sotaro.ikeda.g)

:mstange, can you comment to "Paint" marker of comment 5?

Flags: needinfo?(mstange.moz)

(In reply to Sotaro Ikeda [:sotaro] from comment #5)

:florian, can you check FPS with debug overlay? It could be checked with the following prefs

  • gfx.webrender.debug.profiler-ui:FPS
  • gfx.webrender.debug.profiler:true

I saw the a lot of "Paint" markers in profiler. But FPS of debug overlay was very low.

The debug overlay disappears at the same time as the panel when I close the panel, so I can't see what it would show when the bug occurs.

It seems like handling of profiler marker is not good. "Paint" marker id set even when rendering does not happen. Rendering happens only when aRender==true.
https://searchfox.org/mozilla-central/rev/7fe9421af35256a95acc4620e9e0b76df7867287/gfx/webrender_bindings/RenderThread.cpp#489

I don't know if that marker is accurate or misleading. The main problem in this bug is that activity is happening at 60Hz (eg. the PVsyncBridge::Msg_NotifyVsync IPCs are also a problem).

Flags: needinfo?(florian)

FYI the criticality of this issue will increase as we ship bug 1733587 since the panel will now open automatically for new downloads.

(In reply to Sotaro Ikeda [:sotaro] from comment #6)

:mstange, can you comment to "Paint" marker of comment 5?

I agree it could be better - we should emit a "Renderer Update #WIN" marker instead of a "Composite" marker if aRender is false. However, as Florian said, the real issue still stands: We are waking up at 60Hz. We should stop listening for vsync and let the CPU sleep.

Flags: needinfo?(mstange.moz)

The Downloads panel has fade-out effect when it closes: https://searchfox.org/mozilla-central/rev/65d4d3399afa79c8de5a0cc11752d2ba7c31edc1/toolkit/content/xul.css#319-326

This opacity animation runs on the compositor. It is possible that we never stop the compositor-side animation if the panel is closed, maybe because we never get to an aRender == true composite when the animation is stopped.

Hiro, could you take a look?

Flags: needinfo?(hikezoe.birchill)

Looks like I filed this four years ago, in bug 1413763, and then lost track of it. To answer Hiro's question from back then:

(In reply to Hiroyuki Ikezoe (:hiro) from comment #2)

How do we hide the panel?
If we do hide by setting visiblity:hidden, the transform animation keep
running.

The panel is hidden by making its nsView hidden and by closing the view's nsIWidget. The panel widget has its own compositor.

It is possible that making the hidden panel display:none would stop the compositor animation. This is currently being investigated in bug 1714846.

Depends on: 1714846
Flags: needinfo?(hikezoe.birchill)

(In reply to Jeff Muizelaar [:jrmuizel] from comment #4)

Florian, can you check if this is a regression?

I can reproduce on the 2020-05-22 build, which is the oldest build where the profiler works.

It looks like the animation (in fact it's two transtions) in question has a finite duration (0.18s), so it sounds odd that the animation keeps running on the compositor.

(In reply to Hiroyuki Ikezoe (:hiro) from comment #15)

It looks like the animation (in fact it's two transtions) in question has a finite duration (0.18s), so it sounds odd that the animation keeps running on the compositor.

Never mind about this comment. I did forget that we use fill-mode: both for transitions to avoid glitches on the compositor, thus it keeps running until an explicit transaction which makes the transitions stop happens.

The severity field is not set for this bug.
:jimm, could you have a look please?

For more information, please visit auto_nag documentation.

Flags: needinfo?(jmathies)
Severity: -- → S4
Flags: needinfo?(jmathies)

I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile. I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here... but I've no idea if there's anything else I'm missing.

Flags: needinfo?(florian)

(In reply to :Gijs (he/him) from comment #18)

I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile.

I guess I forgot to say "look in the Renderer or compositor thread". Your profile has composites until you close the browser window (the parent process has a DOMWindowClose event when the composite markers stop) https://share.firefox.dev/3s2NcO4

I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here...

I'm curious what made you think this. We create a screenshot marker for every composite, and won't trigger any if there's no composite happening.

Flags: needinfo?(florian)

(In reply to Florian Quèze [:florian] from comment #19)

(In reply to :Gijs (he/him) from comment #18)

I think this is fixed by bug 1714846 - Florian, can you confirm? I'm not particularly convinced because I don't see any composite markers at all, so I assume the steps in comment 0 miss, say, which profiler preset you're meant to use or where to look for the markers? https://share.firefox.dev/3EVl5nu is my profile.

I guess I forgot to say "look in the Renderer or compositor thread". Your profile has composites until you close the browser window (the parent process has a DOMWindowClose event when the composite markers stop) https://share.firefox.dev/3s2NcO4

So does that mean this is fixed or did we expect the patch to mean that the composites stop when the panel is closed even while that window remains open? If the latter, who is in a good position to investigate why that doesn't happen? If the former, should we close this bug?

I did already figure out I had to turn off screenshots because otherwise those trigger markers for the screenshots that seemed like they'd interfere with actually determining what was going on here...

I'm curious what made you think this. We create a screenshot marker for every composite, and won't trigger any if there's no composite happening.

The composite markers for screenshots seem to appear even when there is no composite marker on the compositor thread - compare https://share.firefox.dev/3IOceqd and https://share.firefox.dev/3pYsTid (different views on the same profile (with screenshots this time) . Perhaps that's a separate bug?

Flags: needinfo?(florian)

(In reply to :Gijs (he/him) from comment #20)

The composite markers for screenshots seem to appear even when there is no composite marker on the compositor thread - compare https://share.firefox.dev/3IOceqd and https://share.firefox.dev/3pYsTid (different views on the same profile (with screenshots this time) . Perhaps that's a separate bug?

Note that you can Cmd-click threads and get an integrated marker chart of multiple threads: https://share.firefox.dev/3p0PlYM (I also applied a search filter for "composit")
The screenshot markers span the distance between two consecutive composites of the same window.

Ah, and in this particular case, it looks like we're getting "Composite" markers for the hidden panel, but no updated "CompositorScreenshot" markers - instead, the panel's synthesized "CompositorScreenshot" marker spans the full duration of 4.6 seconds until the parent window is closed. So it seems like we're triggering a composite but then canceling compositing somewhere in the pipeline before we capture a new screenshot. Likely because something notices that the window is closed. But the frequent "Composite" and "CompositeToTarget" markers indicate that we still wake up the CPU. And that's the part that this bug is about.

(In reply to :Gijs (he/him) from comment #20)

So does that mean this is fixed or did we expect the patch to mean that the composites stop when the panel is closed even while that window remains open? If the latter, who is in a good position to investigate why that doesn't happen?

We expect composites for the panel to stop when the panel is closed. Given the previous comments (especially comment 16), I would say Hiro would be in a good position to keep investigating.

Flags: needinfo?(florian)

Hiro, based on the last few comments, can you take a look at what's happening in current nightly that keeps us compositing when the downloads panel has been closed, and its content slot has been marked display: none? (I just checked and that change does now take effect - though the downloads panel itself seems to still be display: -moz-popup, it's unclear to me if that's expected - Emilio, did you see that when working on bug 1714846?)

Flags: needinfo?(hikezoe.birchill)

Needinfo for comment #24, sorry for the spam.

Flags: needinfo?(emilio)

Yeah, that's kind of expected. In the past when mentioning this to Neil he mentioned that the current design of popups not being display: none when closed was to avoid recreating native widgets over and over for e.g., the context menu. Maybe the trade-offs are different now?

Flags: needinfo?(emilio)

The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...

Flags: needinfo?(hikezoe.birchill)

I am assuming what Gijs wanted to know is why the animation keeps running after closing the popup. The answer is;

  1. The animation is a CSS transition running on the compositor, we let CSS transitions keep running on the compositor until an explicit notification comes from the main-thread
  2. Though this is a guess, in the popup case the notification is tried to be sent after popup closed thus any display list building stuff won't happen thus the notification will never be arrived to the compositor

So, I guess setting transition:none just before closing the popup would solve this issue? (IIRC there's a such popup blah event)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #27)

The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...

FWIW, this sounds very similar to bug 1739161, where the element is made zero size. In that testcase, the animation ends properly only when we take EndEmptyTransaction painting path, which is only used with retained display lists.
Retained display lists are not used for popups, so this might be related.

(In reply to Miko Mynttinen [:miko] from comment #29)

(In reply to Hiroyuki Ikezoe (:hiro) from comment #27)

The animation in question here is on this downloadsPanel, right? As far as I can tell, it's the animation keeps running on the compositor and I can't see display:none style on the element after the popup was closed. Oh now I see what you meant, yeah the display style is: -moz-popup...

FWIW, this sounds very similar to bug 1739161, where the element is made zero size. In that testcase, the animation ends properly only when we take EndEmptyTransaction painting path, which is only used with retained display lists.
Retained display lists are not used for popups, so this might be related.

That's a different, a problem about bug 1739161 is happening on the main-thread, it keeps restyling on the main-thread, on the other hand in this bug the animation isn't restyled on the main-thread.

I figured out a way to solve this. The way is to call ClearCachedWebrenderResources when the popup gets hidden. I was initially thinking that it's not a good way because of resource recreation cost what Emilio commented in comment 26, but we've already clear the resources when recreating the popup. So I am mostly 100% sure it's okay.

Flags: needinfo?(gijskruitbosch+bugs)

I'm a bit confused. I think this happens for every panel? Both based on the conversation here (that seems to suggest that this happens due to the transition that is on the root of the panel/popup node, which isn't being display: none'd), and because that's what this profile seems to show, AFAICT: https://share.firefox.dev/3sl7By8 , and because that's what the dupe (bug 1413763) says.

But the steps here make me think this is somehow DL-panel specific? Was that intentional when filing it?

Given this isn't DL-panel specific, I'm going to unblock bug 1733587 as it's unrelated, really? People in general are much more likely to open "any arrow panel" either before or after the changes on nightly right now.

AIUI, the outgoing opacity transition is intentional, so I don't think frontend and/or the toolkit css and custom element code can fix this, despite the attempt in bug 1714846, and we might need a fix like the one in attachment 9255857 [details] [diff] [review].

Flags: needinfo?(gijskruitbosch+bugs) → needinfo?(florian)

(In reply to :Gijs (he/him) from comment #32)

I'm a bit confused. I think this happens for every panel? Both based on the conversation here (that seems to suggest that this happens due to the transition that is on the root of the panel/popup node, which isn't being display: none'd), and because that's what this profile seems to show, AFAICT: https://share.firefox.dev/3sl7By8 , and because that's what the dupe (bug 1413763) says.

But the steps here make me think this is somehow DL-panel specific? Was that intentional when filing it?

It was intentional, but it was me being confused. I tried a bunch of other panels at the time and they didn't reproduce. Testing a bit more today, I can reproduce with the hamburger panel and the control center / identity panels. I can't reproduce with the profiler or pocket panels.

AIUI, the outgoing opacity transition is intentional, so I don't think frontend and/or the toolkit css and custom element code can fix this

I don't think front-end code can avoid the bug without removing the opacity & transform transitions entirely.

we might need a fix like the one in attachment 9255857 [details] [diff] [review].

Here is a profile with this attachment applied: https://share.firefox.dev/3pd4033 It confirms that this change does fix the bug.

We might be able to write a simple test for this using ChromeUtils.vsyncEnabled() before/after the hamburger panel has been shown and hidden.

Flags: needinfo?(florian)
Summary: composites happen at 60Hz until the browser window is closed after showing the downloads panel. → composites happen at 60Hz until the browser window is closed after showing an arrowpanel

(In reply to Florian Quèze [:florian] from comment #33)

We might be able to write a simple test for this using ChromeUtils.vsyncEnabled() before/after the hamburger panel has been shown and hidden.

This seems to just always return true for me on Nightly, on Windows, when run from the browser console, both when the panel is open and when it's not. How is it supposed to help here? And, is there an alternative (perhaps using profiler markers) ?

Flags: needinfo?(florian)

(In reply to :Gijs (he/him) from comment #34)

(In reply to Florian Quèze [:florian] from comment #33)

We might be able to write a simple test for this using ChromeUtils.vsyncEnabled() before/after the hamburger panel has been shown and hidden.

This seems to just always return true for me on Nightly, on Windows, when run from the browser console, both when the panel is open and when it's not.

This always returns true because the browser console uses a CSS animation to make the cursor blink. When testing locally I used a setTimeout to workaround it. Eg. run setTimeout(() => console.log(ChromeUtils.vsyncEnabled()), 1000); in the console and quickly move the focus away from the console to make the cursor disappear.

Flags: needinfo?(florian)

Setting a needinfo for Florian to come back to this when he's back from PTO, as I didn't really get very far. The patch I attached works, as far as I can tell, and the test passes but (a) it's kinda ugly and (b) the test also passes before the patch, so that's not much use. Despite Florian's extensive help, I didn't get very far trying to figure out why - I tried to use the profiler to find out what was happening in the automated test, but it doesn't seem to include profiles of the gpu process, even though when I check if the GPU process is running by using requestProcInfo it is indeed running. If I try to collect shutdown profiles by failing the test, the browser crashes, without useful stacks. So I'm a bit lost, and I'm hoping Florian is in a better position to pick this up (but if anyone else has ideas, I'm sure they'd be appreciated too - we should really try to get this fixed sooner rather than later, considering the potential perf and power usage impacts).

Flags: needinfo?(florian)
Assignee: nobody → hikezoe.birchill
Status: NEW → ASSIGNED

I managed to make the test work (on Linux with xul.panel-animations.enabled=true). So what I've noticed is it seems the issue only happens when the popup is closed by a native mouse click event. And I am now unsure though, I've changed to use the download panel instead of the browser menu.

(Though I didn't want to assign this bug to me, but moz-phab forcibly did it)

Assignee: hikezoe.birchill → nobody
Status: ASSIGNED → NEW
Assignee: nobody → hikezoe.birchill
Status: NEW → ASSIGNED

(In reply to :Gijs (back Jan 4, 2022; he/him) from comment #37)

Setting a needinfo for Florian to come back to this when he's back from PTO, as I didn't really get very far. The patch I attached works, as far as I can tell, and the test passes but (a) it's kinda ugly and (b) the test also passes before the patch, so that's not much use.

Given that Hiro managed to make the test work, do you still need my help?

I tried yesterday Hiro's version of the test on my local Windows build. The test passes with the layout/xul/nsMenuPopupFrame.cpp fix applied, and fails without. So it looks like it's testing what we expect.

I forced the test to fail to have a full profile of it including the GPU process: https://share.firefox.dev/3HvPUQA
One thing I noticed in this profile is that the initial waitForCondition lasted 3.6s, waiting for vsync to be disabled at the beginning of the test. This is due to vsync notifications being observed by a content process for an about:blank document (https://share.firefox.dev/3zl1Th3), I don't have a good understanding of what's causing that yet, but it's something to figure out in a separate bug.

Flags: needinfo?(florian)
Flags: needinfo?(gijskruitbosch+bugs)

(In reply to Florian Quèze [:florian] from comment #39)

One thing I noticed in this profile is that the initial waitForCondition lasted 3.6s, waiting for vsync to be disabled at the beginning of the test. This is due to vsync notifications being observed by a content process for an about:blank document (https://share.firefox.dev/3zl1Th3), I don't have a good understanding of what's causing that yet, but it's something to figure out in a separate bug.

I filed bug 1748466 for this.

Okay, I take over this.

Flags: needinfo?(gijskruitbosch+bugs)
Attachment #9256448 - Attachment description: WIP: Bug 1742797 - ensure we stop compositing after closing panels → Bug 1742797 - Discard WebRender resources when popup hides. r?sotaro!,florian!

Do'h. mView in nsMenuPopupFrame is a dangling pointer... I will figure out a way to avoid the situation.

Flags: needinfo?(hikezoe.birchill)

Move if (auto* widget = GetWidget()) {} block before mView gets destroyed by nsIFrame::DestroyFrom.

Status: ASSIGNED → RESOLVED
Closed: 13 days ago
Resolution: --- → FIXED
Target Milestone: --- → 97 Branch
Status: RESOLVED → REOPENED
Flags: needinfo?(hikezoe.birchill)
Resolution: FIXED → ---
Target Milestone: 97 Branch → ---

This is really bad for a range of reason. Bumping the priority. Graphics team might have some thoughts here.

Severity: S4 → S2
Whiteboard: [qf:p1:resource]
No longer regressions: 1748808
See Also: → 1748808

I did investigate two test failures, bug 1748788 and bug 1748808. Now I am pretty sure bug 1748808 is unrelated to this bug. Bug 1748788 is the only one remaining issue to land this bug.

To performance team, I will not spend time for bug 1748788.

Flags: needinfo?(hikezoe.birchill)

A way to avoid bug 1748788 I can think of is to add a new route to invoke ClearAnimationResources and use it in HidePopup, it should just work.

You need to log in before you can comment on or make changes to this bug.

About Joyk


Aggregate valuable and interesting links.
Joyk means Joy of geeK