this post was submitted on 25 Apr 2024
85 points (92.9% liked)

Technology


Personally found this article highly interesting, it's very much worth a read. Have included the full article with images and links in the spoiler below 🌻

Tap me to read Full article here

Researchers from Delft University of Technology plan to amplify their BitTorrent client "Tribler" with decentralized AI-powered search. A new demo shows that generative AI models make it possible to search for content in novel ways, without restriction. The ultimate goal of the research project is to shift the Internet's power balance from governments and large corporations back to consumers.

Twenty-five years ago, peer-to-peer file-sharing took the Internet by storm.

The ability to search for and share content with complete strangers was nothing short of a revolution.

In the years that followed, media consumption swiftly moved online. This usually involved content shared without permission, but pirate pioneers ultimately paved the way for new business models.

The original ‘pirate’ ethos has long since gone. There are still plenty of unauthorized sites and services, but few today concern themselves with decentralization and similar technical advances; centralized streaming is the new king with money as the main motivator.

AI Meets BitTorrent

There are areas where innovation and technological progress still lead today, mostly centered around artificial intelligence. Every month, numerous new tools and services appear online, as developers embrace what many see as unlimited potential.

How these developments will shape the future is unknown, but they have many rightsholders spooked. Interestingly, an ‘old’ research group that was already active during BitTorrent’s heyday is now using AI to amplify its technology.

Researchers from the Tribler research group at Delft University of Technology have been working on their Tribler torrent client for nearly two decades. They decentralized search, removing the need for torrent sites, and implemented ‘anonymity’ by adding an onion routing layer to file transfers.

Many millions of euros have been spent on the Tribler research project over the years. Its main goal is to advance decentralized technology, not to benefit corporations, but to empower the public at large.

“Our entire research portfolio is driven by idealism. We aim to remove power from companies, governments, and AI in order to shift all this power to self-sovereign citizens,” the Tribler team explains.

Decentralized AI-powered Search

Not every technological advancement has been broadly embraced yet, but Tribler has just released a new paper and a proof of concept that they see as a turning point for decentralized AI implementations, one with a direct BitTorrent link.

The scientific paper proposes a new framework titled “De-DSI”, which stands for Decentralised Differentiable Search Index. Without going into technical details, this essentially combines decentralized large language models (LLMs), which can be stored by peers, with decentralized search.

This means that people can use decentralized AI-powered search to find content in a pool of information that’s stored across peers. For example, one can ask “find a magnet link for the Pirate Bay documentary,” which should return a magnet link for TPB-AFK, without mentioning it by name.

This entire process relies on information shared by users. There are no central servers involved at all, making it impossible for outsiders to control.

Endless Possibilities, Limited Use

While this sounds exciting, the current demo version is not yet built into the Tribler client. Associate Professor Dr. Johan Pouwelse, leader of the university’s Tribler Lab, explains that it’s just a proof of concept with a very limited dataset and AI capabilities.

“For this demo, we trained an end-to-end generative Transformer on a small dataset that comprises YouTube URLs, magnet links, and Bitcoin wallet addresses. Those identifiers are each annotated with a title and represent links to movie trailers, CC-licensed music, and BTC addresses of independent artists,” Pouwelse says.
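To make the shape of that dataset concrete, here is a minimal sketch of how such title-annotated identifiers could be represented. The `Entry` class, the `identifier_kind` helper, and all sample values are hypothetical placeholders for illustration, not taken from the actual De-DSI dataset or code.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One training pair: a human-readable title and the identifier it maps to."""
    title: str
    identifier: str


def identifier_kind(identifier: str) -> str:
    """Classify an identifier by its prefix (a rough heuristic, not a validator)."""
    if identifier.startswith("magnet:?"):
        return "magnet"
    if identifier.startswith(("https://www.youtube.com/", "https://youtu.be/")):
        return "youtube"
    # Mainnet Bitcoin addresses begin with "1", "3", or "bc1".
    if identifier.startswith(("bc1", "1", "3")):
        return "bitcoin"
    return "unknown"


# Made-up sample entries; every identifier below is a placeholder.
SAMPLE_ENTRIES = [
    Entry("CC-licensed album (placeholder)", "magnet:?xt=urn:btih:" + "0" * 40),
    Entry("Movie trailer (placeholder)", "https://www.youtube.com/watch?v=EXAMPLE"),
    Entry("Independent artist wallet (placeholder)", "bc1qexampleplaceholder"),
]

KINDS = [identifier_kind(e.identifier) for e in SAMPLE_ENTRIES]
```

A generative model trained on pairs like these learns to emit the identifier when given a query related to the title, which is what allows a search to succeed without naming the title exactly.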

We tried some basic searches with mixed results. That makes sense since there’s only limited content, but it can find magnet links and videos without directly naming the title. That said, it’s certainly not yet as powerful as other AI tools.

In essence, De-DSI operates by sharing the workload of training large language models on lists of document identifiers. Every peer in the network specializes in a subset of data, which other peers in the network can retrieve to come up with the best search result.
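The idea of peers specializing in a subset of the data can be illustrated with a toy sketch. This is not the De-DSI implementation: the real system uses Transformer models to generate identifiers, while this example substitutes a simple word-overlap score as a stand-in. The `Peer` class, the `search` function, and the sample shards are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple


def tokens(text: str) -> set:
    """Lowercase whitespace tokenization; a stand-in for a real model's input."""
    return set(text.lower().split())


@dataclass
class Peer:
    """A peer that only knows about its own shard of title -> identifier pairs."""
    name: str
    shard: Dict[str, str]

    def answer(self, query: str) -> Tuple[float, Optional[str]]:
        """Return (confidence, identifier) for the best match in this peer's shard."""
        q = tokens(query)
        best: Tuple[float, Optional[str]] = (0.0, None)
        for title, identifier in self.shard.items():
            t = tokens(title)
            union = q | t
            # Jaccard similarity stands in for the model's confidence.
            score = len(q & t) / len(union) if union else 0.0
            if score > best[0]:
                best = (score, identifier)
        return best


def search(peers: List[Peer], query: str) -> Optional[str]:
    """Ask every peer and keep the most confident answer, if any peer has one."""
    answers = [peer.answer(query) for peer in peers]
    score, identifier = max(answers, key=lambda a: a[0], default=(0.0, None))
    return identifier if score > 0 else None


# Placeholder shards; the magnet links are dummy values, not real hashes.
MAGNET_A = "magnet:?xt=urn:btih:" + "a" * 40
MAGNET_B = "magnet:?xt=urn:btih:" + "b" * 40
PEERS = [
    Peer("peer_a", {"pirate bay documentary tpb afk": MAGNET_A}),
    Peer("peer_b", {"creative commons jazz album": MAGNET_B}),
]

RESULT = search(PEERS, "find a magnet link for the pirate bay documentary")
```

In this sketch only `peer_a` has a title overlapping the query, so its identifier wins. The point is the structure: no single peer holds the full index, and the querying node combines the peers' answers.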

A Global Human Brain to Fight Torrent Spam and Censors

The proof of concept shows that the technology is sound. However, it will take some time before it’s integrated into the Tribler torrent client. The current goal is to have an experimental decentralized-AI version of Tribler ready at the end of the year.

While the researchers see this as a technological breakthrough, it doesn’t mean that things will improve for users right away. AI-powered search will be slower to start with and, if people know what they’re searching for, it offers little benefit.

Through trial and error, the researchers ultimately hope to improve things though, with a “global brain” for humanity as the ultimate goal.

Most torrent users are not looking for that, at the moment, but Pouwelse says that they could also use decentralized machine learning to fight spam, offer personal recommendations, and to optimize torrent metadata. These are concrete and usable use cases.

The main drive of the researchers is to make technology work for the public at large, without the need for large corporations or a central government to control it.

“The battle royale for Internet control is heating up,” Pouwelse says, in a Pirate Bay-esque fashion.

“Driven by our idealism we will iteratively take away their power and give it back to citizens. We started 18 years ago and will take decades more. We should not give up on fixing The Internet, just because it is hard.”

The very limited De-DSI proof of concept and all related code are available on Hugging Face. All technological details are available in the associated paper. The latest Tribler version, which is fully decentralized without AI, can be found on the official project page.

top 9 comments
[–] [email protected] 13 points 6 months ago (3 children)

Sounds like it's just an LLM for DHT indexing...

[–] [email protected] 10 points 6 months ago

It is, but I can see a few use cases that could make it useful. Namely, it can look for common scam/virus patterns to filter more effectively and offer better content suggestions. There are also cases to be made for more descriptive indexing and content identification: lots of torrents have particularly bad naming schemes or misspellings that make finding the content somewhat more difficult or involved.

[–] [email protected] 3 points 6 months ago

Indeed. I suppose it's a good idea in theory and should be helpful though

[–] [email protected] 3 points 6 months ago* (last edited 6 months ago) (1 children)

Yep, there's no "intelligence" involved at all. Just data and statistical computation.

[–] [email protected] 2 points 6 months ago

And moreover DHT search engines already exist.

[–] [email protected] 6 points 6 months ago

An excellent proof of concept.

[–] [email protected] 0 points 6 months ago

It may be useful for something like ed2k too, to fight poisoning (remember all those horse porn files instead of what you were looking for, except for cases where you were looking for horse porn but found Star Wars).

[–] [email protected] 0 points 6 months ago (1 children)

No way it could be abused to deanonymize people.

[–] [email protected] 10 points 6 months ago

DHT is an identifying protocol by design; it is how people find you to send/receive data. If your connection to the swarm is anonymized, there really isn't a ton the AI is going to be able to do that isn't already happening with traditional methods.