User talk:Suffusion of Yellow/Archive 10

From Wikipedia, the free encyclopedia

Lower than expected recall

I have 1,279 things on my mind today so I'll mention this before I forget. Could you look at this guy's article? I'm on Discord. Daniel Quinlan (talk) 22:37, 29 December 2023 (UTC)

@Daniel Quinlan: Took care of it, thanks. Suffusion of Yellow (talk) 22:48, 29 December 2023 (UTC)
Would it be possible to do something similar as a new thing focused on administrator talk pages? I'm not sure about the best actions to take, but it seems like a strong indicator for initial edits. Daniel Quinlan (talk) 00:05, 30 December 2023 (UTC)
Might be worth a try. I'll email you in a day or two. Suffusion of Yellow (talk) 03:25, 31 December 2023 (UTC)
@Daniel Quinlan: Sent you an email. Suffusion of Yellow (talk) 22:13, 31 December 2023 (UTC)

Filter 1242

I've restored that filter (with some modifications); it targets a very active LTA who edits across several large broad ranges. There should be limited collateral because of the AND condition on those ranges. I'm not on the mailing list, so I'm not sure what concern you are talking about. OhNoitsJamie Talk 17:56, 1 January 2024 (UTC)

@Ohnoitsjamie: The concern EggRoll97 raised on the mailing list was fixed with Special:AbuseFilter/history/1242/diff/prev/30471. But I'd feel more comfortable without so many common words there; that regex still matches about 500,000 titles. (Protip: If Special:Search times out when looking for page titles, just download the whole list; it's only about 300 MB). Certainly, your latest update reduces the FPs quite a bit, but best to keep an eye on this. Suffusion of Yellow (talk) 21:16, 1 January 2024 (UTC)
Is it possible to view profiling for individual filters? OhNoitsJamie Talk 22:01, 1 January 2024 (UTC)
Not in any way that I know of. AFAIK the best that's publicly available is the "Of the last X actions, this filter has matched..." at the top of each filter page, which isn't all that helpful because it only shows the average, not the worst case. Users with the right kind of logstash access (e.g. MusikAnimal) can view a log of "slow filters" but I don't know to what extent it's proactively monitored. Suffusion of Yellow (talk) 22:11, 1 January 2024 (UTC)
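The local title check mentioned in this thread could be sketched roughly as follows. This is only an illustration: the function names and the sample titles are hypothetical, the dump path is whatever file was downloaded, and Python's `re` engine is not identical to the PCRE engine that AbuseFilter uses, so counts for unusual patterns may differ.

```python
import gzip
import re
from typing import Iterable


def count_title_matches(titles: Iterable[str], pattern: str) -> int:
    """Count titles in which the pattern matches anywhere (case-insensitive)."""
    rx = re.compile(pattern, re.IGNORECASE)
    return sum(1 for title in titles if rx.search(title))


def count_in_dump(path: str, pattern: str) -> int:
    """Stream a gzip-compressed list of titles (one per line) from a local file.

    `path` is an assumption: it should point at an already downloaded title list.
    """
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        return count_title_matches((line.rstrip("\n") for line in handle), pattern)


# Hypothetical in-memory sample standing in for the real title list
SAMPLE_TITLES = ["Foo_bar", "The_Long_Dark_Tea-Time_of_the_Soul", "Bar_(law)", "Xyzzy"]
```

Calling `count_title_matches(SAMPLE_TITLES, "bar")` gives a quick sense of how broad a pattern is before it goes into a filter; streaming the file line by line keeps memory use low even for a list of several hundred megabytes.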

Improving filter 397

Hi Suffusion of Yellow. I noticed that filter 397 was matching on a fair number of innocuous words and it also looked like some words could be added so I did some pretty extensive work to improve the filter and make it more maintainable. It could be made a bit faster by not computing the second ccnorm() result unless it's needed, but I want to see how fast the revised version is before I bother doing that. Please let me know if you have any concerns, questions, or feedback. Daniel Quinlan (talk) 05:07, 10 January 2024 (UTC)

Thanks! I wouldn't worry about performance (within reason) for non-mainspace filters, so long as you check the namespace before you do anything expensive. The majority of other filters will short-circuit at page_namespace == 0, so the total runtime will be low. One thing about 397 (hist · log) is that it almost always overlaps with 803 (hist · log). So all that work seems like a bit of a wasted effort. It might be worth looking for other filters to merge some of those regexes into. 384 (hist · log) and 260 (hist · log) haven't had major additions in years.
One more thing: if you're going to go through the trouble of commenting each part of the regex, might it make sense to use extended syntax:
regex := "(?x)
    fo+ba+[rz] # foobar and variants
   |xy+zz+y # xyzzy ...
";
That way there's no need to list every part twice.
Lastly, I might have already plugged this to you, but I built User:Suffusion of Yellow/FilterDebugger for exactly this sort of work. It won't work if you're using any really tricky regex (possessive repeats, recursion, etc.) but should be good enough for this. Suffusion of Yellow (talk) 00:07, 11 January 2024 (UTC)
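For readers unfamiliar with the extended (free-spacing) syntax suggested above, here is a minimal Python sketch of the same idea. It assumes Python's `re` module, which also accepts an inline `(?x)` flag at the start of a pattern; the `matches` helper is hypothetical and the two alternatives are the placeholder patterns from the comment, not anything taken from filter 397.

```python
import re

# (?x) enables free-spacing mode: unescaped whitespace in the pattern is
# ignored, and "#" starts a comment that runs to the end of the line.
PATTERN = re.compile(r"""(?x)
    fo+ba+[rz]   # foobar and variants
  | xy+zz+y      # xyzzy and variants
""")


def matches(text: str) -> bool:
    """Return True if either alternative occurs anywhere in the text."""
    return PATTERN.search(text) is not None
```

One caveat of this mode is that a literal space or `#` inside the pattern has to be escaped (or placed in a character class), otherwise it is silently dropped or treated as the start of a comment.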
Thanks. It looks like the performance is the same as before because so few edits make it past the initial conditions (as expected).
I wasn't sure if free-spacing was supported, but that might come in handy. I'm hoping to revise several filters based on abuse expressions and 397 seemed like a relatively easy place to start. 384 looks like a good next stop and I suspect most if not all of the patterns can be shared.
I'll try to check out FilterDebugger sometime soon too. Daniel Quinlan (talk) 05:05, 11 January 2024 (UTC)
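The two performance ideas in this thread (checking the namespace before anything expensive, and computing a second normalization only when the first one fails to match) can be sketched in Python as below. Everything here is an assumption made for illustration: the namespace number, both normalization functions, and the pattern are stand-ins, not the actual conditions of filter 397 or the real ccnorm().

```python
import re

TARGET_NAMESPACE = 3  # hypothetical namespace this sketch applies to
PATTERN = re.compile(r"fo+ba+[rz]")  # placeholder pattern
calls = {"first": 0, "second": 0}  # counters showing which steps actually ran


def normalize_first(text: str) -> str:
    """Stand-in for a first, costly normalization pass."""
    calls["first"] += 1
    return text.lower()


def normalize_second(text: str) -> str:
    """Stand-in for a second pass that also strips all whitespace."""
    calls["second"] += 1
    return "".join(text.lower().split())


def filter_matches(page_namespace: int, added_text: str) -> bool:
    # Cheap check first: edits in other namespaces never reach the regex work.
    if page_namespace != TARGET_NAMESPACE:
        return False
    if PATTERN.search(normalize_first(added_text)):
        return True
    # The second normalization is computed only when the first did not match.
    return PATTERN.search(normalize_second(added_text)) is not None
```

With this ordering, an edit outside the target namespace costs a single integer comparison, which is consistent with the observation above that overall runtime stays the same when few edits get past the initial conditions.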

Frivolous topic

I have to wonder— is your username by any chance a reference to the I Ching calculator in The Long Dark Tea-Time of the Soul? 🌺 Cremastra (talk) 21:10, 10 January 2024 (UTC)

But of course. :-) Suffusion of Yellow (talk) 22:36, 15 January 2024 (UTC)

User script request

Hi. Now that I'm an EFM, I realize that it would be helpful to have a script that linked each warning/disallow message name (like "abusefilter-disallowed-WPWP-extendedconfirmed" on Special:AbuseFilter/1258 for example) to the relevant message (MediaWiki:abusefilter-disallowed-WPWP-extendedconfirmed). Any chance you would be interested in writing such a script? If not I can take a crack at it but you seem to have cornered the market on edit filter-related scripts so I figured I'd ask. Thanks, --DannyS712 (talk) 20:36, 18 January 2024 (UTC)

@DannyS712: I'm not really all that active right now; plus I have maybe 20 or so unfinished scripts that I'd rather get to first. If you want this done, go for it! Suffusion of Yellow (talk) 01:56, 25 January 2024 (UTC)
@Suffusion of Yellow okay, User:DannyS712/AbuseFilterMessageLinks.js --DannyS712 (talk) 09:06, 25 January 2024 (UTC)
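The mapping described in the request (message name to its page in the MediaWiki: namespace) is simple enough to sketch. The following Python is not DannyS712's script, which is a JavaScript user script; the function names and the base URL are assumptions chosen for illustration.

```python
from urllib.parse import quote

WIKI_BASE = "https://en.wikipedia.org/wiki/"  # assumed base URL for this sketch


def message_page_title(message_name: str) -> str:
    """Page title holding the text of a warning/disallow message."""
    return "MediaWiki:" + message_name


def message_page_url(message_name: str) -> str:
    """Full URL for the message page, with spaces written as underscores."""
    title = message_page_title(message_name).replace(" ", "_")
    # Keep ":" and "/" readable; percent-encode anything else that needs it.
    return WIKI_BASE + quote(title, safe=":/")
```

For the example in the request, `message_page_title("abusefilter-disallowed-WPWP-extendedconfirmed")` yields the page name quoted there, `MediaWiki:abusefilter-disallowed-WPWP-extendedconfirmed`.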

Improving Filter 1045

Hi Suffusion of Yellow, do you think the additions of {{Close paraphrasing}} could also be added to line 9? Thanks Nobody (talk) 14:34, 7 February 2024 (UTC)

@1AmNobody24: Thanks, done. Suffusion of Yellow (talk) 22:20, 10 February 2024 (UTC)

Scripts++ Newsletter – Issue 24

MediaWiki message delivery (talk) 02:37, 1 March 2024 (UTC)

FilterDebugger

It looks like the implementation of ccnorm doesn't properly map '¡' to i. Special:AbuseLog/36337996 is currently shown as not a match. 0xDeadbeef→∞ (talk to me) 12:53, 11 November 2023 (UTC)

@0xDeadbeef: Thanks; looks like mw:Equivset has finally been updated! I'll rebuild with the newer version. Suffusion of Yellow (talk) 19:23, 12 November 2023 (UTC)
Should be fixed now. Suffusion of Yellow (talk) 19:40, 12 November 2023 (UTC)
works now, thanks! 0xDeadbeef→∞ (talk to me) 07:07, 13 November 2023 (UTC)
You might also want to keep an eye on phab:T357855. 1234qwer1234qwer4 20:30, 26 February 2024 (UTC)
Thanks, subscribed! Suffusion of Yellow (talk) 02:52, 1 March 2024 (UTC)
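To illustrate why a missing table entry makes a normalized match fail, here is a toy Python sketch of confusable-character normalization. It is not the real ccnorm or Equivset: the table is a tiny hypothetical sample (only the '¡' to i mapping comes from the report above), and the canonical forms used by the real data may differ.

```python
# Tiny hypothetical equivalence table; the real Equivset data is far larger.
EQUIV = {
    "¡": "i",  # the mapping reported as missing in this thread
    "0": "o",  # illustrative entry only
    "@": "a",  # illustrative entry only
}


def normalize_confusables(text: str) -> str:
    """Lowercase the text and replace each character found in the table."""
    return "".join(EQUIV.get(ch, ch) for ch in text.lower())


def normalized_contains(text: str, needle: str) -> bool:
    """Check for a plain substring after normalizing the text."""
    return needle in normalize_confusables(text)
```

If the '¡' entry were absent from the table, `normalized_contains("V¡K¡NG", "viking")` would return False, which mirrors a debugger built on older data reporting a logged hit as not a match.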