this post was submitted on 25 Oct 2024

Rust Programming

Hi! I'm trying to learn Rust. As a small project, I'm building a web scraper that will scrape some content and either rebuild it with a static site generator or use it to make POST requests.

I'm still at a very early stage and don't know much yet. The simplest error handling strategy I know is using match with Result.

To my eyes, this syntax looks correct, but it also seems like a lot of lines for a simple HTTP request.

I know the reqwest docs suggest handling errors with the ? operator, which I haven't learned yet, so I'm just using what I know for now.

// Assumes Html is the type from the scraper crate.
use scraper::Html;

fn get_document(permalink: String) -> Html {
    // Send a blocking GET request; panic if the request itself fails.
    let html_content_result = reqwest::blocking::get(&permalink);
    let html_content = match html_content_result {
        Ok(response) => response,
        Err(error) => panic!("There was an error making the request: {:?}", error),
    };

    // Read the response body as text; panic if the body cannot be read.
    let html_content_text_result = html_content.text();
    let html_content_text = match html_content_text_result {
        Ok(text) => text,
        Err(error) => panic!(
            "There was an error getting the html text from the content of response: {:?}",
            error
        ),
    };

    // Parse the text into a document and return it.
    Html::parse_document(&html_content_text)
}

As I understand it, this is what I'm doing here: I make an HTTP request; if I get a Response, I try to get the text out of the response body, otherwise I handle the error by panicking with a custom message. Getting the text out of the response body is another step that requires error handling, so I use the match expression again to get the text out and handle the possible error. (In what circumstances can extracting the text of a response body fail?)

Then I can finally parse the document and return it!

I wonder if this is a correct and understandable way of doing what I have in mind.

Do you think this would be a suitable project for someone who is at chapter 7 of the Rust book? I feel like I actually need to build something before I keep going with the theory!

[–] [email protected] 3 points 3 weeks ago* (last edited 3 weeks ago) (1 children)

This will work in general. One point of improvement: right now, if the request fails, the panic will cause your whole program to crash. You could change your function to return a Result<Html, SomeErrorType> instead, and handle errors more gracefully in the place where your function is called (e.g. ignoring pages that returned an error and continuing with the rest).

Look into anyhow for an easy-to-use error handling crate, which lets you return an anyhow::Result<Html>.
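
For what it's worth, here is a minimal sketch of that idea using only match. It is not the real scraper: to run without network access or extra crates it assumes a few hypothetical stand-ins. FetchError plays the role of SomeErrorType (a hand-written type rather than anyhow), Document stands in for Html, and fetch_text fakes the HTTP call by failing for any URL that contains "bad".

```rust
use std::fmt;

// Hypothetical error type standing in for "SomeErrorType".
#[derive(Debug, PartialEq)]
enum FetchError {
    Request(String),
}

impl fmt::Display for FetchError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            FetchError::Request(url) => write!(f, "request to {} failed", url),
        }
    }
}

// Hypothetical stand-in for scraper's Html type.
#[derive(Debug, PartialEq)]
struct Document {
    html: String,
}

// Fake HTTP call (an assumption for this sketch): URLs containing "bad" fail.
fn fetch_text(url: &str) -> Result<String, FetchError> {
    if url.contains("bad") {
        Err(FetchError::Request(url.to_string()))
    } else {
        Ok(format!("<html><body>{}</body></html>", url))
    }
}

// Hands the error back to the caller instead of panicking.
fn get_document(permalink: &str) -> Result<Document, FetchError> {
    match fetch_text(permalink) {
        Ok(text) => Ok(Document { html: text }),
        Err(error) => Err(error),
    }
}

// The caller decides what a failure means: here, log it and skip the page.
fn collect_documents(urls: &[&str]) -> Vec<Document> {
    let mut documents = Vec::new();
    for url in urls {
        match get_document(url) {
            Ok(document) => documents.push(document),
            Err(error) => eprintln!("skipping page: {}", error),
        }
    }
    documents
}

fn main() {
    let urls = [
        "https://example.com/a",
        "https://example.com/bad",
        "https://example.com/b",
    ];
    let documents = collect_documents(&urls);
    println!("{} of {} pages fetched", documents.len(), urls.len());
}
```

The point of the design is that get_document no longer decides that a failed request is fatal. One bad page gets logged and skipped, and the rest of the run continues.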

[–] [email protected] 2 points 3 weeks ago

They did say they haven’t learned the ? operator (that’s chapter 9 of the Rust book), so this approach might be better once they get there.
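
As a small preview for when that chapter comes up, the sketch below shows the same function written twice, once with match and once with ?. It uses string-to-number parsing from the standard library instead of reqwest so it runs without any crates; the function names are made up for illustration.

```rust
use std::num::ParseIntError;

// Written with match: on Err, return early and pass the error to the caller.
fn double_with_match(input: &str) -> Result<i32, ParseIntError> {
    let number = match input.trim().parse::<i32>() {
        Ok(number) => number,
        Err(error) => return Err(error),
    };
    Ok(number * 2)
}

// Written with ?: the early return on Err is handled by the operator.
fn double_with_question_mark(input: &str) -> Result<i32, ParseIntError> {
    let number = input.trim().parse::<i32>()?;
    Ok(number * 2)
}

fn main() {
    println!("{:?}", double_with_match("21"));
    println!("{:?}", double_with_question_mark("21"));
    println!("{:?}", double_with_question_mark("abc"));
}
```

When the error type of the expression matches the function's return type, as it does here, the two versions behave the same. Note that ? only works inside a function that itself returns a Result (or another compatible type), which is why it goes hand in hand with changing the function signature.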