this post was submitted on 26 Dec 2023
10 points (70.8% liked)

Technology

34914 readers
211 users here now

This is the official technology community of Lemmy.ml for all news related to creation and use of technology, and to facilitate civil, meaningful discussion around it.


Ask in DM before posting product reviews or ads. All such posts otherwise are subject to removal.


Rules:

1: All Lemmy rules apply

2: Do not post low effort posts

3: NEVER post naziped*gore stuff

4: Always post article URLs or their archived version URLs as sources, NOT screenshots. Help the blind users.

5: personal rants of Big Tech CEOs like Elon Musk are unwelcome (does not include posts about their companies affecting wide range of people)

6: no advertisement posts unless verified as legitimate and non-exploitative/non-consumerist

7: crypto related posts, unless essential, are disallowed

founded 5 years ago
MODERATORS
 

I got interested in this question a few years ago, when I started writing about the “denominator problem”. A great deal of social media research focuses on finding unwanted behavior – mis/disinformation, hate speech – on platforms. This isn’t that hard to do: search for “white genocide” or “ivermectin” and count the results. Indeed, a lot of eye-catching research does just this – consider Avaaz’s August 2020 report about COVID misinformation. It reports 3.8 billion views of COVID misinfo in a year, which is a very big number. But it’s a numerator without a denominator – Facebook generates dozens or hundreds of views a day for each of its 3 billion users – 3.8 billion views is actually a very small number, contextualized with a denominator.

The paper this post describes can be found here
Abstract:

YouTube is one of the largest, most important communication platforms in the world, but while there is a great deal of research about the site, many of its fundamental characteristics remain unknown. To better understand YouTube as a whole, we created a random sample of videos using a new method. Through a description of the sample’s metadata, we provide answers to many essential questions about, for example, the distribution of views, comments, likes, subscribers, and categories. Our method also allows us to estimate the total number of publicly visible videos on YouTube and its growth over time. To learn more about video content, we hand-coded a subsample to answer questions like how many are primarily music, video games, or still images. Finally, we processed the videos’ audio using language detection software to determine the distribution of spoken languages. In providing basic information about YouTube as a whole, we not only learn more about an influential platform, but also provide baseline context against which samples in more focused studies can be compared.

all 2 comments
sorted by: hot top controversial new old