this post was submitted on 13 Dec 2024
280 points (99.3% liked)

Open Source

[–] [email protected] 23 points 1 month ago (2 children)

This is probably the best solution I've found so far.

Unfortunately, even this is no match for the user-hostile design of, say, Microsoft Copilot, which hides content that has scrolled off screen, so that content is invisible in the output. That's no fault of this extension: it actually DOES capture the data, and it's not the extension's fault that the website intentionally obscures itself. Funnily enough, if I open the resulting HTML file in Lynx, I can read the hidden text, no problem. LOL.

[–] [email protected] 16 points 1 month ago

I was on a site that did that and was confused why my text search wasn't finding much. Thanks devs for breaking basic browser features.

[–] [email protected] 6 points 1 month ago (1 children)

Actually that might not have been done to deliberately disrupt your flow. Culling elements that are outside of the viewport is a technique used to reduce the amount of memory the browser consumes.

[–] [email protected] 2 points 1 month ago (1 children)

...which should be used only when the browser is running out of memory.

[–] [email protected] 1 points 1 month ago

Well... that would make sense. But it's much, much easier to just do it preemptively. The browser APIs for checking how much memory is available are quite limited, afaik. Also, if there are too many elements, the browser has to do more work when interacting with the page (i.e. on every rendered frame), wasting slightly more power and, in extreme cases, even lagging.

For what it's worth, I, as a web developer, have done it too on a couple of occasions (in my case it was absolutely necessary when working with a 10K × 10K table, way above what a browser is designed to handle).
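
For anyone curious what that culling looks like in practice, here is a minimal sketch of the row-windowing arithmetic, assuming fixed-height rows. The `visibleRange` function, the `VisibleRange` type, and the `overscan` parameter are all hypothetical names made up for illustration; real virtualization libraries differ in the details.

```typescript
// Hypothetical helper: decide which rows of a very tall list need real DOM nodes.
// Assumes every row has the same fixed height in pixels.
interface VisibleRange {
  first: number; // index of the first row to render (inclusive)
  last: number;  // index of the last row to render (inclusive)
}

function visibleRange(
  scrollTop: number,      // how far the container is scrolled, in px
  viewportHeight: number, // visible height of the container, in px
  rowHeight: number,      // fixed height of one row, in px
  rowCount: number,       // total number of rows in the data set
  overscan: number = 2,   // extra rows kept above and below to reduce flicker
): VisibleRange {
  // Nothing to render: return an empty range (last < first).
  if (rowCount <= 0 || rowHeight <= 0) return { first: 0, last: -1 };

  // Rows that intersect the viewport.
  const firstVisible = Math.floor(scrollTop / rowHeight);
  const lastVisible = Math.ceil((scrollTop + viewportHeight) / rowHeight) - 1;

  // Widen by the overscan margin, then clamp to the valid index range.
  const first = Math.max(0, firstVisible - overscan);
  const last = Math.min(rowCount - 1, lastVisible + overscan);
  return { first, last };
}
```

With 10,000 rows of 20 px in a 600 px viewport, only a few dozen rows exist in the DOM at any moment. Everything outside the returned range is simply not there, which is also why in-page text search and page-saving tools can miss it.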