The best programming language for automating things is Python. Python is easy and comes with a lot of modules that let you do just about anything. I guarantee you that once you start automating stuff it'll become like a drug, and you'll just "automate it" whenever you have anything repetitive.
And BTW, one of the main uses of Python is web scraping.
https://musicbrainz.org/doc/MusicBrainz_API
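For reference, that API can be queried straight from the shell with curl and jq. Here is a minimal sketch, assuming a recording search by artist and title; the search terms and the User-Agent string are placeholders, not anything from the thread:

```bash
#!/usr/bin/env bash
# Minimal sketch: search MusicBrainz for a recording and print its MBID and title.
# ARTIST, TITLE and the User-Agent are placeholders.
ARTIST="Some Artist"
TITLE="Some Song"

curl -s --get "https://musicbrainz.org/ws/2/recording" \
  --data-urlencode "query=recording:\"${TITLE}\" AND artist:\"${ARTIST}\"" \
  --data-urlencode "fmt=json" \
  -A "my-automation-script/0.1 ( contact@example.com )" \
  | jq -r '.recordings[0] | [.id, .title] | @tsv'
```

MusicBrainz asks for a descriptive User-Agent on its web service, hence the `-A` flag.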
The best language for automation is the one you know best. The second best is one you have to learn.
I think you could do this in bash with youtube-dl.
Indeed. While my bash-fu is rudimentary at best, I don't think Bash can be used for web scraping? But I think he could use RSS to get the posts, then extract the YouTube links with a regex, use the dump feature of yt-dlp* to get the video category, title, etc., and parse the JSON with jq. Then it's probably just a matter of using curl to do the API calls, and voilà - roughly the sketch below.
*yt-dlp is better maintained than youtube-dl, or so I heard.
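For what it's worth, a rough sketch of that pipeline could look something like this, assuming yt-dlp and jq are installed; the feed URL is a placeholder:

```bash
#!/usr/bin/env bash
# Rough sketch of the pipeline above: fetch an RSS feed, pull out YouTube links
# with a regex, dump each video's metadata with yt-dlp, and parse it with jq.
# The feed URL is a placeholder.
FEED_URL="https://example.com/feed.rss"

curl -s "$FEED_URL" \
  | grep -Eo 'https://(www\.)?youtube\.com/watch\?v=[A-Za-z0-9_-]{11}' \
  | sort -u \
  | while read -r url; do
      # Dump the video's metadata as JSON without downloading anything.
      meta=$(yt-dlp --dump-json --skip-download "$url") || continue

      title=$(jq -r '.title' <<< "$meta")
      category=$(jq -r '.categories[0] // "unknown"' <<< "$meta")

      echo "$title [$category] -> $url"
      # From here it's just curl calls against whatever API you need.
    done
```

The regex-over-RSS step is the fragile part; yt-dlp and jq do the heavy lifting.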
Using bash, I built two scrapers for a website that hosts images and videos.
They're educational, I swear! /s
I looked through the HTML and figured out regexes for their media. The scripts parse all the links on the thumbnail pages and then load the corresponding primary pages with curl. On those pages, they use wget to grab the file. Some additional pattern matching names the file after the title of the post.
It's probably convoluted, but you can accomplish a lot in bash if you want to - the sketch below gives a rough idea of the shape.
Man, there's something really wrong with Lemmy lately. I only got the notification for your comment eight days after you sent it. It's the third time this has happened, but this must be the longest a notification has taken to reach me.
Yes, there's a discussion about this on my instance. Someone there provided a link to where this was getting addressed. Some aspects of federation have been broken for a bit.
https://github.com/LemmyNet/lemmy/issues/4288#issuecomment-1878442186
Hope it gets fixed soon.
Seems like it. My inbox had five replies yesterday (after more than a week of only local replies). Today, even more. Yesterday the GUI was partially broken; today it looks normal.
I find Python difficult - no idea why, it just doesn't feel right. I've tried a few times but have never been able to do anything useful with it - that's why it's not in my list above. It does seem, though, that my proposed project and development "style" are best suited to Python. Maybe it's time to try again.
If you work in bash and don't like Python, maybe it's too strict for you. Look into Ruby; it was inspired by Perl. I found it more to my style, in that there are many correct solutions rather than one implied correct solution.