Wikipedia talk:Wikipedia Signpost/2023-05-22/Recent research

Page contents not supported in other languages.
From Wikipedia, the free encyclopedia

Discuss this story

  • those closer to the community’s core are assumed to be more cooperative – hmm... – Joe (talk) 09:08, 22 May 2023 (UTC)[reply]
  • I would like to read the Gatekeeping article. Does anyone have a preprint? Sandizer (talk) 15:59, 22 May 2023 (UTC)[reply]
    @Sandizer: It's accessible via The Wikipedia Library. – Rhain (he/him) 23:37, 22 May 2023 (UTC)[reply]
  • I feel like the game theory paper is interesting but it also feels like it could be empirically tested to see if its findings hold up in reality because like Joe Roe I'm not sure their statrting assumptions are correct. Best, Barkeep49 (talk) 16:35, 22 May 2023 (UTC)[reply]
    Well, the authors actually did extensive empirical testing, on a dataset spanning more than a decade of edits (as I already briefly mentioned in the review but should probably have expanded upon). From the abstract:

    We validate the model’s prediction through an empirical analysis, by studying the interactions of 219,811 distinct contributors that co-produced 864 Wikipedia articles over a decade. The analysis and empirical results suggest that the factor that determines who ends up owning content is the ratio between one’s cooperative/competitive orientation (estimated based on whether a core or peripheral community member) and the contributor’s creator/curator activity profile (proxied through average edit size per sentence). Namely, under the governance mechanisms, the fractional content that is eventually owned by a contributor is higher for curators that have a competitive orientation.

    But I guess what you and Joe are concerned about is rather what is often called construct validity - i.e., do these metrics defined in the paper really capture what we would commonly perceive as an editor's cooperativeness vs. competitiveness, say? And that's a valid question here. (Of course all models are wrong and it's virtually always possible to pick on some shortcoming where this kind of modeling doesn't fully reflect reality. One would need to ask whether and how much it affects the overall conclusions.)
    Relatedly, while the paper's data availability statement claims that "All relevant data are within the paper and its Supporting information files," the dataset provided there appears to be very incomplete. For example, it does not seem to contain the information necessary to calculate the creator/curator metric, and only contains data on 16,383 users instead of the aforementioned 219,811 (i.e. consists of 16,384 rows including the header, a suspiciously round number). It is also anonymized, meaning that one can't evaluate construct validity by inspecting the ratings for some concrete editors in the dataset manually.
    Regards, HaeB (talk) 18:16, 23 May 2023 (UTC)[reply]
  • I wonder how good the definition of "owned" is. If I for example say "that's a fair enough copy edit which doesn't change meaning, does the other person "own" the sentence that says exactly the same thing as I added. Talpedia 16:18, 6 June 2023 (UTC)[reply]