BermudaHighball

joined 1 year ago
[–] BermudaHighball 1 point 1 week ago

Yes, I think so. I'll definitely use the example for downloading some of the files (.torrent, metadata file) once I have some items. But first I need to find all the items ever uploaded.
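
For what it's worth, here's the rough shape of what I'm planning for that part, using the internetarchive Python library (the identifier is just a placeholder):

```python
# Rough sketch using the internetarchive library (pip install internetarchive).
# "some-item-identifier" is a placeholder, not a real item.
from internetarchive import get_item

item = get_item("some-item-identifier")

# Full metadata record for the item, as a dict
print(item.metadata)

# Download only the Archive-generated torrent for the item,
# skipping the item's content files
item.download(glob_pattern="*_archive.torrent", verbose=True)
```

The `*_archive.torrent` glob works because the Archive generates a `<identifier>_archive.torrent` file for each item.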

[–] BermudaHighball 2 points 1 week ago (2 children)

Thank you for the tips. I'm actually interested in enumerating the metadata of every "item" ever uploaded, where "item" is as the API page defines it. For example, one item = one ID:

> Archive.org is made up of “items”. An item is a logical “thing” that we represent on one web page on archive.org. An item can be considered as a group of files that deserve their own metadata.

You did prompt me to look at the API docs again, though, and I think I found something that enumerates all item names and, as a bonus, keeps you updated when changes are made: https://archive.org/developers/changes.html
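
In case anyone else wants to try it, here's the shape of the loop I have in mind. It's only a sketch: the endpoint path, parameter names, and response fields are my assumptions from skimming that page, so check changes.html for the real ones, and you'll need your own IA S3 keys:

```python
# Sketch of walking the changes feed to enumerate item identifiers.
# CAVEAT: the endpoint path, parameter names, and response fields here
# are assumptions; https://archive.org/developers/changes.html has the
# authoritative details. Requires IA S3 keys.
import requests

ENDPOINT = "https://be.archive.org/services/changes"  # assumed path
ACCESS, SECRET = "YOUR_ACCESS_KEY", "YOUR_SECRET_KEY"

token = None
while True:
    data = {"access": ACCESS, "secret": SECRET}
    if token:
        data["token"] = token        # resume from the previous page
    else:
        data["start_date"] = "0"     # assumed: start from the very beginning
    resp = requests.post(ENDPOINT, data=data, timeout=60)
    resp.raise_for_status()
    page = resp.json()
    for change in page.get("changes", []):
        print(change["identifier"])  # one item name per change entry
    token = page.get("next_token")   # assumed pagination field
    if not token:
        break                        # caught up; poll again later for updates
```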

We'll see how much progress I can make. It might take a while to get through all the millions of them.

24 points | submitted 1 week ago* (last edited 6 days ago) by BermudaHighball to c/[email protected]

I'd love to know if anyone's aware of a bulk metadata export feature or repository. I would like to have a copy of the metadata and .torrent files of all items.

I guess one way is to use the CLI, but that relies on knowing which item you want, and I don't know if there's a way to get a list of all items.
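
The closest I've found so far is the search API, which the internetarchive Python library wraps. It lists identifiers matching a query (the collection below is just an example), but that still isn't "everything ever uploaded":

```python
# Sketch: list identifiers matching a search query with the
# internetarchive library; the collection is only an example query.
from internetarchive import search_items

for result in search_items("collection:opensource_movies"):
    print(result["identifier"])
```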

I believe downloading via BitTorrent and seeding back is a win-win: it bolsters the Archive's resilience while easing server strain. I'll be seeding the items I download.

Edit: If you want to enumerate all item names in the entire archive.org repository, take a look at https://archive.org/developers/changes.html. This will do that for you!


It seems like 6 or 7 years ago there was research into new forms of storage, using crystals or DNA, that promised ultra-high-density storage. I know the read/write speeds were not very fast, but I thought by now there would be more progress in the area. Apparently in 2021 a team managed to store a 16 GB file in DNA. In the last month, a company (Biomemory) started letting you store 1 KB of data in DNA for $1,000, but if you want to read it back, you have to send it to them. I don't understand why you would use that today.

I wonder if it will ever be viable for us to have DNA readers/writers... but I also wonder if there are other new types of data storage coming up that might be just as good.

If you know anything about the DNA research or other new storage forms, what do you think is the most promising one?


In the past, most software I used was paid and proprietary and would have some sort of limitation that I would try to get around by any means possible. Sometimes that meant resetting my computer's clock or disabling the internet; other times, downloading a patch.

But in the past few years I've stopped using those things and have focused only on free and open-source software (FOSS) to fulfill my needs. I hardly have to worry about privacy problems or about locking down a program that phones home. I might be missing out on some things that commercial software delivers, but I'm hardly aware of what they are anymore. The trend seems to be for commercial software providers to migrate toward online or service models where the company does all the computing. I'm opposed to that, since they can take away your service at any time.

What do you do?