
Wikipedia talk:Bots/Requests for approval/Archive 9



I've opened a discussion concerning whether it really is a good idea to have this task around. Headbomb {talk / contribs / physics / books} 04:29, 21 June 2011 (UTC)

Snaevar-bot BRFA

I am requesting to reopen the BRFA for Snaevar-bot. In answer to Hellknowz's question, my bot is a manual one and uses the -async parameter, running in the main article space. Finally, the request needs to be changed to reflect that the bot currently runs on nn.wiki, not is.wiki.--Snaevar (talk) 18:51, 4 July 2011 (UTC)

Replacing start dates and end dates in certain templates with {{Start date}}

Discussion of my request for review at Wikipedia:Bots/Requests for approval/Snotbot 6 has stalled with nothing added after my post on 11 May. What now? Andy Mabbett (User:Pigsonthewing); Andy's talk; Andy's edits 14:32, 5 July 2011 (UTC)

Removing "Requests to add a task to an already-approved bot"

BRFA discussion transclusions

An issue was brought up at TTObot BRFA, which I have myself wondered about. Basically, quoting: "The bot has been moved from "Current requests for approval" with the authorization of a trial, making it appear no longer up for discussion." So why are we not transcluding BRFAs post-trial? Without gathering empirical statistics, I would think this is when the most comments and feedback would occur. I understand the main concern here is page size and loading times. But I don't think this has stopped other noticeboards. After all, low participation is one of BRFA's major problems. Lately, there are not as many open BRFAs and the ones under prolonged marination need to be closed anyway. So I propose we transclude at least "trial complete" BRFA discussions and possibly "in trial" ones. —  HELLKNOWZ  ▎TALK 15:15, 1 August 2011 (UTC)

If we want to transclude both "in trial" and "trial complete" discussions, all that need be done is change {{BRFA}}; if we want to treat the two differently, we would need to add a new value for parameter 3 (e.g. "Trial complete") and adjust all relevant documentation, bots, and scripts. I for one see no particular reason here to transclude one but not the other of the two trial states, as discussion "closes" neither during nor after a trial.
If anyone wants to preview what the page would look like, simply copy WP:BRFA and change "Trial" to "Open" in all instances of {{BRFA}}. Anomie 20:06, 1 August 2011 (UTC)
See a comment about the current instructions at BRFA here. --68.127.234.159 (talk) 21:23, 10 August 2011 (UTC)

I'm just going to be bold and do it, since there have been no objections for almost 2 weeks. We only have 11 BRFAs right now, so it won't bloat the page that much. —  HELLKNOWZ  ▎TALK 15:03, 12 August 2011 (UTC)

WikiTransBot

Problems with DASHBot Fair-Use Resize Bot Task

I had previously discussed this with DASHBot's operator User:Tim1357 last month [2] but have not received an adequate reply. I also mentioned this issue at WikiProject Album where it was suggested I come here.

The bot's Approval states that it should be reducing the image width to 325 px. However, the bot's last run on 5th September 2011 [3] confirms that the bot is not functioning correctly. memphisto 14:41, 19 September 2011 (UTC)

Lightbot

I have blocked Lightbot for deployment of unapproved functionality — replacement of already manually converted units with calls to the {{convert}} template — behind misleading edit summaries. Last five edits: [4][5][6][7][8].

Headbomb is insisting that he approved this functionality, and seems unperturbed by his inability to point to any record of relevant requests, discussion, consensus and approval. Unless one credits that Headbomb would twist the truth to defend a friend, one can only conclude that Headbomb has taken to unilaterally approving bot functionality behind the scenes without community disclosure or input. Either way, Headbomb is the wrong BAG member to be acting on this. I am asking for other BAG members to step in and get this back on track.

Discussion is at User talk:Lightmouse#Dynamic conversion deemed bug. and User talk:Lightbot#Messages.

Hesperian 11:31, 7 October 2011 (UTC)

Hesperian is being completely irrational about this and is making ludicrous and misleading claims. I was extremely hostile to Lightbot and Lightmouse in general, and required Lightbot to demonstrate its edits with trials that went on for weeks, and I personally reviewed thousands of edits to make sure it fell within WP:BOTPOL and consensus. Let's examine those 5 sample edits.
    • [9] - This one it shouldn't have done per WP:COSMETICBOT
    • [10] - Fixed a ² instead of 2
    • [11] - Fixed a bunch of conversions with bad characters (such as ² instead of 2), and a few bad conversions. (This edit is however not clean, since the bot also did a few improper conversions, such as changing 186 m → 190 m)
    • [12] - Fixed an overly precise conversion
    • [13] - Brings the conversion in line with our recommendations for precision. I suppose it's a tad questionable since converting "about 3 miles" into "4.8 km" or "5 km" is up to taste, but it's hardly outside of scope.
Lightmouse is fully aware of those issues by now. He always listened to feedback, with quick replies and adjustments to the bot. But none of this matters to Hesperian, who is simply hellbent on keeping the bot blocked, refuses to listen to reason, and will wikilawyer his will onto the Wikipedian community, despite being clearly WP:INVOLVED in the situation. Headbomb {talk / contribs / physics / books} 11:57, 7 October 2011 (UTC)


<shrug> You can see why I am asking for a fair-minded BAG member with a modicum of decorum to pick this up. The facts of the matter are:

  1. Lightbot was running functionality for which there is no record, anywhere, that it was ever proposed, discussed or approved.
  2. That functionality was hidden behind misleading edit summaries.
  3. I and others have stated reasons why we hold that functionality to be harmful; I am happy to provide those reasons again, when a BAG member who isn't Headbomb is ready to hear them.

Hesperian 12:14, 7 October 2011 (UTC)

Existence or editing

Hi guys,

I enjoy writing in Python and also enjoy editing Wikipedia, and it's natural to want to bring those two things together. A quick question first about the bot approval process - does it apply to all bots, or just ones that make edits? For example, if I, to explore how the API works, write a Python script that, once a day, checks a list of pages (a 'super' watchlist) and then emails me if they've changed - would that count as a bot that needed approval and its own account, or would that be a separate thing? Failedwizard (talk) 13:33, 7 October 2011 (UTC)
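
For illustration, a minimal sketch of the kind of read-only "super watchlist" described above, assuming the standard MediaWiki API; the page list, mail settings, and check interval are placeholders, not recommendations:

    # Poll each watched page's latest revision timestamp via the API and
    # e-mail a notice when it changes. Read-only: no edits are made.
    import smtplib
    import time
    from email.mime.text import MIMEText

    import requests

    API = "https://en.wikipedia.org/w/api.php"
    PAGES = ["Example", "Wikipedia:Sandbox"]   # placeholder watchlist
    HEADERS = {"User-Agent": "SuperWatchlistSketch/0.1 (contact: me@example.org)"}

    def latest_timestamp(title):
        """Return the timestamp of the most recent revision of a page."""
        r = requests.get(API, params={
            "action": "query", "prop": "revisions", "titles": title,
            "rvprop": "timestamp", "rvlimit": 1, "format": "json",
        }, headers=HEADERS)
        page = next(iter(r.json()["query"]["pages"].values()))
        return page["revisions"][0]["timestamp"]

    def notify(title, timestamp):
        """Send a one-line e-mail; SMTP host and addresses are placeholders."""
        msg = MIMEText("%s was edited at %s" % (title, timestamp))
        msg["Subject"] = "Watched page changed: %s" % title
        msg["From"] = msg["To"] = "me@example.org"
        with smtplib.SMTP("localhost") as server:
            server.send_message(msg)

    last_seen = {}
    while True:                                # one pass per day
        for title in PAGES:
            ts = latest_timestamp(title)
            if last_seen.get(title) not in (None, ts):
                notify(title, ts)
            last_seen[title] = ts
            time.sleep(1)                      # read-only, but still be gentle
        time.sleep(24 * 60 * 60)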

To put it another way for clarity - my understanding is that code that accesses the wiki without editing is not governed by the bot policy, and I wonder if it's covered by another policy? Failedwizard (talk) 15:59, 7 October 2011 (UTC)
There is no way for us to detect whether you are running some sort of read-only bot, so trying to make policy about it would be pointless. You might wind up in trouble if your read-only bot attracts the attention of the Wikimedia sysadmins (no relation to on-wiki admins), but that has nothing to do with us here. Anomie 19:23, 7 October 2011 (UTC)
Thanks Anomie, no chance of me making a blip in the wiki-traffic :) - but it's nice to know. Quick follow-up question... is there any sort of bot-writer mentor program? Failedwizard (talk) 19:40, 7 October 2011 (UTC)
Not really. For questions on how to use the API, you could ask at the mediawiki-api mailing list or WP:BON; for questions about running bots here on enwiki, you could ask at WP:BON; for questions about doing something in your language of choice, find a support forum for that language or try Stack Overflow. Anomie 20:53, 7 October 2011 (UTC)

over Process

So I had a somewhat philosophical question about bots, and bot approvals. I'm currently using AWB to diffuse a few of the Magic: The Gathering set categories. Since I still have a flagged bot account with AWB access that is currently dormant, it came to my mind that I could just let my bot do it; however, I was hesitant without approval. It seems overly bureaucratic to put in a request for ~99 edits, but at the same time I see the side that we can't just have bots running wild. I think what I'm trying to get at is that maybe there should be some kind of exemption so that small-scale (<100 edits) but repetitive tasks, handled by an experienced editor and with presumed consensus, can be run on flagged accounts without formal approval. Thoughts? Crazynas t 22:42, 4 October 2011 (UTC)

Updated User:CrazynasBot#Explanation of Current Use to reflect this. Crazynas t 20:20, 25 October 2011 (UTC)

Recently I denied Snotbot 8, on the basis of a lack of consensus for the task to run at this time. Snottywong has requested that I reconsider my decision, and as such I would like some input from BAG/the bot community. Primarily, my decision was based on this discussion, which would indicate to me that those at WikiProject Articles for creation don't particularly support this bot ("I really don't like the idea of bots declining submissions, even if only for "quick-fail" criteria." User:The Earwig). As well as this, I don't think the discussion in the BRFA adequately addressed the concerns raised by the IP address. I don't think any benefit would have come from keeping the BRFA open or having another trial. In this case, I felt that the consensus was not very clear, and as such the BRFA was not the correct place to continue the discussion. Particularly regarding the WP:BITE concerns raised, I feel that it needed wider community input and discussion, such as at the village pump. Any thoughts from others - am I being too harsh/conservative? --Chris 02:40, 15 October 2011 (UTC)

Endorse Decline I echo The Earwig's point. AfC is a primary contact point between new, inexperienced users and Wikipedia as a whole. To have a bot, let alone one called Snotbot, come into the mix is not a good idea. Especially since often the person that declines a request, even a quick fail request, is later asked to help the new user fix the problem. A bot can't offer that kind of service. Sven Manguard Wha? 11:36, 15 October 2011 (UTC)
Chris, your analysis of the consensus looks sound to me, especially with regard to the issues of "unreferenced articles" (the bot declining requests on other grounds appears less controversial). I think there needs to be further discussion before this task could be approved. I note Snottywong's point that he thought - having been given the OK for a trial - that BAG had already given the task the go ahead. Accordingly, he spent time working on the bot that he might not have done had he realised that consensus for the task was still regarded as unclear. I think that needs addressing. It might be worth clarifying the position with regard to when trials will be approved so that other bot operators don't feel that their efforts have gone to waste needlessly. WJBscribe (talk) 11:46, 15 October 2011 (UTC)
Overturn pending community discussion of the task. There is no reason the BRFA couldn't remain open while additional community discussion is sought; while discussion could be sought to support possible reopening of the request, the fact that it is currently closed could have a chilling effect. In particular, I'd want to see whether current AFC human reviewers find the bot useful and whether they would be willing to stalk the bot's talk page and otherwise "supervise" it to some extent. It may be that the community will end up not supporting the bot after all, but then we'll know instead of just guessing based only on a proposal that has since become outdated. Anomie 12:56, 15 October 2011 (UTC)
The thought also occurs, the bot could also watch its talk page notices for some period of time and call for human assistance if there appears to be a reply. Anomie 12:56, 15 October 2011 (UTC)
Sven and Anomie: Note that the bot's talk page is a redirect to my own talk page, and so I will receive and respond to any requests for human assistance, comments, or questions about a declined submission. —SW— comment 17:01, 15 October 2011 (UTC)

The comments at Wikipedia talk:WikiProject Articles for creation/2013 6#AfC bot discussion seem to show that there is a level of support for the task, and as such I have reopened the BRFA. I will leave it to another BAG member to decide what needs to be done next. The other point, however, regarding the trial, is perfectly valid and I think reflects a flaw in BAG's current design/operations. It stems from the fact that, generally speaking, members of BAG make decisions based on their own judgment, and there is little discussion between BAG members regarding decisions (apart from occasionally on the IRC channel). In most cases this is fine; however, in cases like this, where judgment may differ, BAG members' decisions can obviously come into conflict or contradict each other. I think it might be an idea to look into possibly implementing some BAG guidelines, to avoid conflicts like this in the future (the guidelines could also help for setting out some other unwritten rules among BAG, such as when to speedy approve a request). --Chris 11:25, 17 October 2011 (UTC)

That's probably a good idea in principle, but given such low BAG activity I'm not sure how well it will work out. I've already called out for more BAG opinions on that BRFA, but I got none until your closure. I'd be happy to have more input, but that so rarely happens. —  HELLKNOWZ  ▎TALK 11:45, 17 October 2011 (UTC)
  • I don't think we need to get more bureaucratic over the decline/overriding of same, and mention of "wheel-war" sets off all kinds of alarm bells. One good thing about BAG is that it rarely ends up in crazy-long AN/ANI/RfX debates. If the bot as proposed is questionable, but you still think the core principle is a good idea, then it'd be best to think about how to re-frame a fresh proposal.
I dislike the core idea of auto-declining, because I see AFC as a possible solution to the horrors of excess templating/warning of our precious new users, which is due to excess automation (IMHO) instead of personal help. If the WP:ACTRIAL consensus hadn't been shafted by WMF, then we could have made progress in that direction. Further automating AFC - particularly for 'declines' - defeats that objective.  Chzz  ►  08:39, 17 November 2011 (UTC)

Ganeshbot

Can a BAG member please review this task and approve it for trial? Thanks. Ganeshk (talk) 01:48, 21 October 2011 (UTC)

Automating own account

I've been using Wiki.java to do small-scale automated tasks a few times before. Now, for the cleanup of WP:IEP, I may need to do some minor edits to a large number of pages (~500 each time). These edits will mainly involve tagging and untagging pages. I do use a throttle (5 seconds per edit), even though I have the noratelimit userright (from the acctcreator usergroup). When I used to do small-scale edits, I used to check every edit, but now it may not be possible. I'm basically posting here to:

  • Ask for permission to do a large volume of automated edits
  • Ask if I don't have to check all the edits (I certainly will check some of them)
  • Ask what throttle I should set
  • Ask if I should create another account for this (I do have a legit sock that I can use, User:Manishsock, or I could create another account)

Thanks, ManishEarthTalkStalk 04:05, 4 November 2011 (UTC)

It is generally considered that if you choose not to check every edit yourself, then you need a separate bot account and a BRFA because you have lost a degree of oversight.
I don't know how easy it is to review edits in Wiki.java, but in (say) AWB it's not difficult to glance at the diff before pressing save, and you can still achieve rates of ~8 edits per minute.
Otherwise, feel free to file a BRFA and, if it's a simple request with clear consensus, usually a trial can be awarded early to keep things moving with full approval to follow. - Jarry1250 [Weasel? Discuss.] 09:48, 4 November 2011 (UTC)
Wiki.java doesn't let you check edits (it's only a framework). I normally check them by opening all the edits in my contribs. My code (which uses wiki.java) allows me to set chunks, so it will wait for me to check the contribs every 10 edits or so. But still, for doing a thousand pages (and not just once, but possibly multiple times), I will probably need a bot flag. Ok, I'll do the BRFA and create the account if approved. ManishEarthTalkStalk 16:22, 4 November 2011 (UTC)
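
A rough sketch of the chunk-and-throttle pattern described above; save_page() is a hypothetical placeholder for whatever framework call actually performs the edit (Wiki.java in this case), and the numbers simply mirror the 5-second throttle and 10-edit chunks mentioned:

    # Pause every CHUNK_SIZE edits so the operator can review their recent
    # contributions before the run continues.
    import time

    THROTTLE_SECONDS = 5   # the 5-second throttle mentioned above
    CHUNK_SIZE = 10        # pause for a manual check every 10 edits

    def save_page(title, new_text, summary):
        # Hypothetical placeholder for the real framework edit call;
        # here it only logs what would be saved.
        print("Would save %r with summary %r" % (title, summary))

    def run(tasks):
        """tasks: an iterable of (title, new_text, summary) tuples."""
        for i, (title, new_text, summary) in enumerate(tasks, start=1):
            save_page(title, new_text, summary)
            time.sleep(THROTTLE_SECONDS)
            if i % CHUNK_SIZE == 0:
                input("Review the last %d edits in your contributions, then "
                      "press Enter to continue..." % CHUNK_SIZE)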

Fbot 9

I just saw an image tagged for non-free reduce via this task Wikipedia:Bots/Requests for approval/Fbot 9 that was just approved today. While the general idea of a bot to tag images larger than a certain size is a good idea, there was no notification at WT:NFC, where in the past we have been rather against using automated bot tools to tag images (once tagged, reduction by bot is OK as per DASHBot 9, however). While the rationale for the bot task is sensible, the problem is that there are legitimate cases to use images larger than the 160k-ish limit the bot suggests; even a non-free image at 1 megapixel could be potentially legitimate under NFCC#3a, and a bot will not know that.

I do note that Sven (who supported the bot) did bring to WT:NFC a discussion about aligning the appropriate image size to 160k per DASHBot (see [14]) but this was not in light of this tagging bot. Again, as in this discussion, our past consensus on NFC is not to have bots attempting to maintain image size.

I would recommend that the bot task be put on hold until we can resolve this with NFC policy. We do not want another BetaCommand situation coming about again, and while we're not talking about deletion of images, this needs a few additional things, such as either a flag to prevent the bot from tagging images with a specific oversized template, or having a warning about how to prevent the image from being resized, etc. --MASEM (t) 06:02, 29 November 2011 (UTC)

I've seen a number of unwarranted tags and started to reverse a few. I suggest we wind back all the new tags and remand for further discussion. The bot has added a lot of counterproductive clutter, which would be destructive to the encyclopedia if someone actually followed up by reducing the images without thoughtful consideration. - Wikidemon (talk) 06:39, 29 November 2011 (UTC)
None of the three you undid were unwarranted. Sven Manguard Wha? 11:40, 29 November 2011 (UTC)
All three images were just fine - and your mildly WP:POINTy reversion of my edits to prove the bot correct wasn't terribly helpful. All three images, which I uploaded, were very slightly above the arbitrary pixel size limit and not remotely in the territory of any real copyright concerns (and hence, reducing them does not help Wikipedia's free content mission). One is a free photograph of a non-free wall mural, in its setting showing a large field of grass below and sky above. You cropped out some of the grass to fit the recommended size. But the non-free image of the artwork already took up far less than one third of the frame, and doesn't remotely replace the original, deprive the artist of his commercial opportunities to win art commissions, etc. Cropping some of the (free image of) grass out of the bottom of the photograph is completely orthogonal to the question of copyrights and fair use, and arguably degrades however slightly the encyclopedic value of the photograph. Another example is a historical beer label that contains a copyrighted image that is well within the suggested limit, plus text on either side. Shrinking and resampling doesn't do any good nonfree use-wise, but it does degrade the legibility. Again, reducing an image of a beer label in an article about the historical significance of the beer label doesn't change the need for the original, it doesn't help the opportunity of commercial artists to sell their work, etc. It's just a lot of sloppy rule-based busywork WP:CREEP-by-robot. - Wikidemon (talk) 18:34, 29 November 2011 (UTC)
Re: the edits to File:Seasons mural view.jpg, just...wow. I don't think the crop detracted from the image, but Wikidemon is right in that doing so in this case had absolutely nothing to do with the image being nonfree, because the size of the nonfree content within the image was not affected at all. This is a good example of why bots aren't trusted to do this kind of work; even people apply rules sometimes without regard to whether such application actually furthers the purpose of the rule. Merely looking at the absolute size of the file without regard to what portion of the image is actually nonfree content is clearly the wrong way to go. postdlf (talk) 20:29, 29 November 2011 (UTC)
I was going to say "revert it", but the old version has been revdeled already. It most certainly should not have been revdeled already, seven days have not passed. Sven Manguard Wha? 04:40, 2 December 2011 (UTC)

It would be useful if the bot linked directly to the "Image resolution" section of WP:NFC. Colin°Talk 09:14, 29 November 2011 (UTC)

The number of cases where files shouldn't be resized is tiny. In actuality, a good deal of people think that their pet files deserve to be larger because they are important or have text or whatnot. That's not the case. I've already seen one person complain that their non-free file should be kept large, when in actuality there was no legitimate reason to do so. Text should be put into the description page if it's unreadable, and there are very, very few reasons to overrule the size restraints (mind you DASHBot sizes things bigger than the policy states they should be). I'm against shutting the task off. For the limited number of bad cases, they can be discussed on a case by case basis; many of them will wind up getting resized anyways, I think. Sven Manguard Wha? 11:33, 29 November 2011 (UTC)
Especially with the way guidelines are written in Wikipedia (unworkable if taken literally, usually workable as actually applied by humans), I think that any "policing by robots" of literal interpretation of rules/guidelines is a bad idea. I think that a recognition of this is slowly emerging. North8000 (talk) 13:02, 29 November 2011 (UTC)
Even if the number of cases where an image should be retained at a size greater than N (N being 165,000 pixels) is very small, and even though DASHbot 9 does reduce a tagged image to below that size, having a bot tag any image larger than N is problematic. First, and importantly, we have avoided absolute codification of what the expected image size is, with only the recognition that about 0.1 MP is typical for most cases. It is much the same as trying to codify the number of non-free images one is allowed to have: the situation is judged on an article-by-article basis, and as soon as you set an upper bound, people will work towards maximizing it. In other words, if you set the size to 165,000 via this bot, people will upload images that just meet this size, even though in the past they could work with smaller images without a problem (for example, I think the size was prompted by 1:1 aspect images like album covers at 400x400, except most covers have been at 300x300 or less; if we codify this via a bot, we will see editors upload 400x400 album covers even when 300x300 is OK). Second, specifying N for a task like this so close to the "generally accepted largest size" will have people complaining about trivial enforcement of the NFCC, potentially leading us to a BetaCommand bot scenario again (albeit the operator here isn't also a problem).
I'm not saying that this is a bad task, but the current execution is not in line with how we at NFC deal with image sizes. It needs one or more things to be in play:
  • A means for an image uploader to opt the image out of being checked, via a specific template. Importantly, this puts the image into a category that can be human-reviewed to make sure that having a higher-resolution image is justified.
  • A means by which an editor can remove the non-free reduce tag if they disagree with it, with enough of a delay before DASHbot 9 gets hold of the task to perform the reduction.
  • A warning template for, say, 7 days before the image is actually tagged with non-free reduce, or perhaps placement into a human-reviewed category so that a human actually applies the non-free reduce tag.
  • A size limit N that is much larger than just 160,000 pixels; I'd be more comfortable with something around 300,000 or higher, so as to avoid complaints of potentially trivial enforcement. --MASEM (t) 14:09, 29 November 2011 (UTC)
Agree with Masem's concerns. There are a number of cases where reduction is inappropriate -- for example, screenshots of games for old 8-bit machines with limited colour palettes (eg 1, 2, 3), where part of the value of the screenshot is to convey how this limitation is reflected.
I agree that a more explicit machine-readable "do not reduce" template could well be a good thing. But at the moment, the bot has no way to judge this, and no way even to take account of specific free-form comments that may have been made in the "low resolution?" section of the image use rationale.
We therefore need to be careful not to allow the actions the bot has already taken to eventually (when followed by automatic resizing, and then automatic revision deletion) lead to any unrecoverable information loss. Jheald (talk) 15:10, 29 November 2011 (UTC)
I think a delay between the two bot tasks is probably the way to go here. - Jarry1250 [Weasel? Discuss.] 18:48, 29 November 2011 (UTC)
I will add, from this discussion at User_talk:Fastily#Fbot_issue.2C_maybe_minor, that Fbot should work with the {{bots}} opt-out process (there may have been a snafu with that one image, but it looks like Fastily corrected it). I think, however, that's not all that needs to happen before this task should be (re)started. --MASEM (t) 18:54, 29 November 2011 (UTC)
As with the whole Beta image tagging debacle, there's a distinction to be drawn between new uploads and longstanding nonfree use. If we decide as a community that all images must be < N in size, unless the uploader specifically requests an exception, we can do that and the editors can deal with it in an orderly way. However, if we tag tens or hundreds of thousands of existing images, notifying only the editor who uploaded them years ago and is now long gone, there's no effective process for sorting through the tags and we have another bot indiscriminately hacking away at the encyclopedia. Plus we have unintended consequences whenever mass edits are done without considering specific context. One I just caught is free photographs of nonfree 3-dimensional objects. The photo as a whole is nonfree, but the nonfree object depicted in the photo is typically not an issue. Images >160K isn't exactly a problem screaming for a hasty fix. There's plenty of time to setup an orderly procedure, if it's even worth it. - Wikidemon (talk) 19:03, 29 November 2011 (UTC)
Wow guys, thanks for not notifying me of this discussion! :o I'll put the task on hold until we can obtain a clearer consensus on the use of bots in WP:NFCC#3b enforcement. -FASTILY (TALK) 19:45, 29 November 2011 (UTC)
Wait, no, someone did... Facepalm. Just didn't read it until after I posted... -FASTILY (TALK) 19:47, 29 November 2011 (UTC)
Lemme follow something. What's wrong with configuring the bot to simply flag the images? Perhaps the wording used should make clear that the image might not comply with NFCC, but ultimately it's a human judgement call. I don't see anything wrong with flagging possible issues and marking them for review. Der Wohltemperierte Fuchs(talk) 20:12, 29 November 2011 (UTC)
The problem is that with Fbot 9 and DASHbot 9, we have two robots: one that tags oversize images with non-free reduce, and one that resizes images that have non-free reduce on them. They do operate at different time scales, meaning that were someone to catch the tag and disagree with it, they could have it removed; but as I can't easily tell at what time DASHbot operates (beyond nightly runs), the amount of time a user has to remove a disputed tag will be variable, possibly as short as a day. Tagging an image to be reviewed by a human to make the discretionary call that non-free reduce should apply, or forcing a reasonable time frame (two weeks) before DASHbot gets involved, would help out a lot to prevent this from being seen as an assembly-line process. But there are larger issues at play too (the maximum size aspect). --MASEM (t) 21:25, 29 November 2011 (UTC)

I think this bot as currently implemented is a good idea, assuming that people are aware that it's OK to undo a tagging when the bot makes an inevitable bad guess, like here (and that was only a bad guess because bots have no way of parsing a deletion discussion that came to the conclusion that the existing image size was OK.) I saw a few other images I watch tagged, and the tags were all correct. I've got no reason to presume either Fastily or the bot will edit-war over these tags. 28bytes (talk) 20:13, 29 November 2011 (UTC)

If the point is to generate a worklist for editor review, having the bot generate a page (like User:Fbot 9/nonfree images for possible reduction) with file links, pixel size, and links of articles that use the file would be less obtrusive than tagging the image file. postdlf (talk) 20:22, 29 November 2011 (UTC)

There is still an element of human review—the admin deleting the orphaned revision. The resizing could always be reverted if the admin finds it improper. — Train2104 (talk • contribs) 21:33, 29 November 2011 (UTC)

Let me make it clear after thinking about this: there are two issues at play.

  • The first is simple: the process. I've described that we effectively have two bots that, unless caught, will reduce all images larger than N to smaller than N. Without a point for human intervention this can be a problem. However, I have seen Fastily state that Fbot should now respect the {{bots}} opt-out, but I would like to see:
    • Adding a date parameter to non-free reduce which Fbot 9 can fill in, or otherwise have it filled in by one of the bots that fills in the date for tags like {{cn}}.
    • Have DASHbot avoid reducing images under two weeks past that given date.
Doing this dispels concerns I have about the process. It still is "automatic" between the two bots, but there's enough chance for humans to intervene to stop it.
  • The second is the size picked, and that's where I have a bit more concern from an NFC enforcement standpoint. I have absolutely no problems with DASHbot having the 160k target for its images, but I do have concerns with Fbot using the same number. I don't know what the "right" number is for FBot, but it would be at least double; some might feel it should be higher. Again, this doesn't stop a human editor from saying that a 200k pixel image is too large and tagging it themselves; they are taking responsibility to assure that they know the image is too large. FBot can't. I'd like to have some discussion of what number Fbot should use - possibly based on a current inventory of "oversized" images out there now - to see what is the best choice to use. --MASEM (t) 21:44, 29 November 2011 (UTC)
  • Here's the problem: each file is different. Some files may be large in pixels, but have a low dpi. Others may be large but only contain a small amount of copyrighted material (such as an Internet Explorer window that displays a Wikipedia article). If you reduce that, then people can barely tell what the browser looks like! This is a task that is best left for humans to do, but if the tagging were to instead add a category such as "Non-free files that may fail NFCC criteria 3", that would make more sense. /ƒETCHCOMMS/ 22:29, 29 November 2011 (UTC)
I think that there is work for 2 bots here, but there needs to be a human between them.
  • Fbot should place images larger than N in a category and/or list them on a page for human review, if it edits the image page at all it needs to leave an edit summary linking to an explanatory page. It should also place a note on the talk page of all pages that use the flagged image, that links to the same explanatory page. The explanatory page needs to explicitly say what has happened and that it was done by a bot and so might be wrong, and that nothing will happen until a human reviews the image. It should also make it explicit (including a link) where you can comment that you think the bot is wrong and why (possibly the list page or the image talk page).
  • Once a human has reviewed the image they should either tag it for DASHbot or tag it as an exception to the general rule (if there are no legal issues, then this tag should categorise the image). If there has been any discussion about the image then it should be archived to the image talk page (or the archive linked to from the image talk page), and/or allow a parameter from the don't-reduce template to the location of the archive.
  • If the possibly too big tag is removed without the DASHbot or don't-reduce tags being applied then either Fbot needs to retag it or flag it somewhere for further review. I'm not sure on this point.
  • If an image with the don't-reduce template on is replaced by a new version that is larger than the tagged image then there needs to be a way of catching this. Maybe also do this if the new version is the same size?
If a significant backlog develops we can look again about linking the automation, of course.
Also worth checking for (but certainly not automatically reducing) are cases where the "low resolution" field contains something like "no" (I've seen this) or other phrases that seem similar (I'm sure those who do a lot of NFCC work will have seen many more than I have). Whether this is a job for Fbot or not I don't know; need to ask someone who knows more about bot coding than I do. Thryduulf (talk) 00:20, 30 November 2011 (UTC)
How many images are we talking about, and do we have community buy-in for editors to follow a bot on its rounds at whatever sequence and speed the bot will be running to check every image it tags? If so, what system will they use to make sure the images are all reviewed and don't get resized without review. If not, we don't really have a system. - Wikidemon (talk) 01:27, 30 November 2011 (UTC)
It would be helpful to this discussion to create a histogram of all non-free image files in buckets of pixel count (say, in 50k groups), so as to get an idea of how many images are "large" and what "large" actually is, to define a boundary where human review should really be called into play. I know that personally, considering no other factors on the image, a 500k or greater image is one where I would ask "is this #3b compliant?" and look more closely at for review, but I don't know if that desire would be there for a 250k pixel image. If we all agree that DASHbot's 165k reduction size is fair game, we need to figure out the best bound where we would want human review to tag for further reduction, and knowing what we have now may be the best way to start. --MASEM (t) 14:30, 30 November 2011 (UTC)
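
A minimal sketch of the suggested histogram, assuming the (width, height) of each non-free file has already been gathered (e.g. from a database report or API dump); the sizes list here is placeholder data:

    # Bucket non-free images by pixel count in 50,000-pixel groups.
    from collections import Counter

    sizes = [(400, 400), (720, 480), (300, 300)]   # placeholder (width, height) data
    BUCKET = 50_000

    counts = Counter((w * h) // BUCKET for w, h in sizes)
    for bucket in sorted(counts):
        low, high = bucket * BUCKET, (bucket + 1) * BUCKET - 1
        print("%7d-%7d px: %d files" % (low, high, counts[bucket]))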

The fixed limit of 160k does not work for all types of images. CD cover art is about 5 by 5 inches; an 80 dpi image is 400 by 400 pixels or 160k. Magazines may be 8 by 11 inches; a 45 dpi image is 360 x 495 or 178k. The bots assume that an 80 dpi CD cover is OK but a 50 dpi magazine is verboten. The image must be large enough to see the information that the "non-free use rationale" justifies. When the bot runs on the Non-free content page it will tag an allowed example image that has been there for over 5 years. The Bill Ripken baseball card is 384 by 534 pixels or 205k. -- SWTPC6800 (talk) 02:52, 30 November 2011 (UTC)


Response from Sven Manguard

I'll be blunt: I think you're all terribly wrong about what does and does not deserve to be exempt from resizing. I see a good deal of bad arguments, and bad examples of files that shouldn't be reduced, both here and on Fastily's talk page. I also think that people here want to have things both ways in that they want to keep the official standard for non-free images at 100,000 pixels, but that things slightly over 160,000 shouldn't be a problem. Pick a standard and stick with it please. I personally advocate for the 160,000 standard, but that didn't get much love above.

Additionally, I think that the concern about two bots and zero humans having input is incorrect; an admin does have to check over the resize before performing the revdel. Let's be honest here too, the only admins that would be doing this are the dozen or so that work in file related admin tasks, so they know what they're doing.

That being said, and assuming Fastily says yes, I think that I'd be okay with modifying the task so that Fbot instead of tagging them lists them in a category for manual review. I'll warn you in advance that very few people actually care about this (or files at all) enough to clear that category, and I'm one of them, so chances are a lot of things will still get resized; at the same time though, I do ever so much better reductions than the bot, because I crop out borders, blank space, etc., and also use a much less lossy compression algorithm and a higher dpi (96 vs 72) than DASHbot does.

Do note that I'm saying this with gritted teeth, because it's going to translate into dozens or hundreds of hours of my time to fix this. Sven Manguard Wha? 04:13, 30 November 2011 (UTC)

As to the issue of time, which was brought up in one of the discussions on the bot, the rules currently state that after the image is reduced, it waits for seven days before getting revdeled. I'm opposed to skipping the seven day period entirely (although I did propose getting rid of it months ago), and I'm opposed to elongating it. Seven days is plenty of time. Sven Manguard Wha? 08:15, 30 November 2011 (UTC)
Sven, can I suggest you take a deep breath, and take a step down from your high horse for a moment? The bottom line here is that whether an image is (say) 300x300 or 500x500 is in most cases utterly marginal to the legal fair-use position -- which is why over the years WT:NFC has been intensely relaxed over the question of over-sized images, wherever there's been even a vaguely good reason for them. So long as an image isn't thousands x thousands, there is really no particular urgency here.
I note that when this was recently discussed at WT:NFC, Masem (quite reasonably IMO) suggested a period of one year before necessarily deleting old revisions. Even Hammersoft suggested six months. The point is that for the good of the project wherever possible actions should be taken in a way that is inclusive of the community, so that anyone in the community can see what's happened, and feel that they have the chance to challenge the image modification and open it for discussion. But revision deletion essentially leaves anyone who's not an admin with a fait accompli, making the change revertible only with considerable tiresomeness, removing even the clue that such a change had been made at all for the passer-by to assess.
This is why Masem's stricture in that discussion that a change should be clearly allowed to stabilise first, before anyone goes even near rev-del is a good one. And that goes even more so if the on-article appearance of the image has been changed -- e.g. by the cropping you are so proud of above. As should be clear from Wikidemon's nettled reaction above, changing a carefully considered crop without discussion or consultation with the original uploader can be particularly sensitive -- more sensitive than just changing the resolution. So in such cases whisking the old image away to the memory hole in the bare minimum of time may be a particularly bad idea.
I appreciate that you want to improve the encyclopedia. But can I suggest this is really really small beer compared to, say, any widescale appropriation of historical images with real-world current commercial value in violation of NFCC #2. That is something with the potential to really do real damage to WP's reputation for fair dealing and doing the right thing, and is the kind of thing that ought to be audited incredibly tightly. This in contrast simply is utterly utterly marginal. Jheald (talk) 12:59, 30 November 2011 (UTC)
Your opening was so incredibly rude that I questioned whether or not I wanted to read the rest of it. Nonetheless I persevered (you didn't get any less rude), and I disagree totally with all but your last paragraph. There is no point in resizing images at all if there's going to be a year or six months' time in between when the file is resized and when the revdel happens. There are other ways of handling the situation. Consider the following:
We could have that, or a modified version of that, placed on images that have been resized. Unreasonably delaying something in the hope that the community will get involved isn't feasible. At least in my experience, the only time anyone who doesn't work in files ever cares about what goes on in the namespace is when something happens to one of their uploads, and my experience has also shown that in many cases shouting comes long before trying to understand why what was done was done. It's sad, but you learn to deal with it. Sven Manguard Wha? 17:05, 30 November 2011 (UTC)
If we are employing automatic tools to reduce non-free images and the only notice of this reduction is on the image page itself, it may take a while for that change to be noticed by anyone if the uploader has long left the project. Even notification on the uploader's page or the article page wouldn't help in that respect. Having some reasonable period of time in which a resized image can still be caught as one that should not have been resized is needed even with human involvement in the tagging. That's why a period of 6 months to a year is a reasonable length for images. Heck, I'd be OK with 3 months even, but any shorter, along with the fact that we have automatic tools involved, is just the catalyst needed for another BetaCommandBot/Resolution reconciliation problem like the one that plagued us in 2007/2008. --MASEM (t) 18:49, 30 November 2011 (UTC)
I note that this has nothing to do with revdel, though as Jheald states, there are concerns with it.
This strictly has to do with image sizing and as I note: between FBot 9 and DASHbot 9, it is a completely automatic process with no human check save for someone removing the non-free reduce tag before DASHbot gets to it.
Note that I'm not complaining about DASHbot operating once non-free reduce has been placed on an image, nor do I have a complaint if there's enough time between Fbot 9 adding the tag and DASHbot's reduction for the interested party to remove that tag (but this is where Jheald's point on revdel comes into play, since even if the interested party missed the message, they have some reasonable length of time to request admin help to revert it).
Importantly, I'm not saying that a human has to do the work of rescaling; Dashbot does just fine there. It's only determining if rescaling is really needed that should have a human review, or should have long-enough time period for editors to react and correct if we leave it to automatic tools. --MASEM (t) 14:36, 30 November 2011 (UTC)
Do you agree with my statement above "That being said, and assuming Fastily says yes, I think that I'd be okay with modifying the task so that Fbot instead of tagging them lists them in a category for manual review." and/or the template above (it's big and yellow and I'm not copying it here)? If so, we can resolve this with a minimum of additional unneeded hostility. Too much of that has happened already. Sven Manguard Wha? 17:07, 30 November 2011 (UTC)
I have no problem with a human coming into play for approving reduction. I also can support a solution where no human is involved as long as the period between the tagging and the reduction by automatic bots is a known, fixed quantity; the human solution is not required if the time in this solution is appropriately long enough (2 weeks seems about right). --MASEM (t) 18:49, 30 November 2011 (UTC)
What about a solution where a human is involved, but no bot? There seem to be a lot of solutions proposed here, but I'm not sure we have a problem. Is the current human-based system of looking and tagging so utterly hopeless that we need this bot? Maybe a more thorough use of the Template:Non-free reviewed could help, so non-free images, like Flickr images on Commons, do get a human check.-- Patrick, oѺ 00:03, 1 December 2011 (UTC)
Before Fbot 9 started, that technically was the case, but just not in any organized manner. If you saw a non-free image you felt was too large, you tagged it "non-free reduce" and DASHBot 9 would take care of it if a human volunteer didn't get to it first. However, with over 400,000 non-free images on WP and growing, there's no way to have a practical human review of each and every image. Thus, the idea behind Fbot 9 is sound, just the parameters and process it used rubbed against current human-based practice. --MASEM (t) 00:18, 1 December 2011 (UTC)
While all this is being sorted out, can we at least modify the template stuck on the affected pages (false positives) to let the poor content provider understand what has been proposed? If you look at one of my files, which was re-uploaded 3 years ago after a speedy deletion for a perceived violation - a file that was squeaky clean then - File:Hawk Mill, Shaw 0014.png, you will see that it has been tagged with the following wording:

This non-free media file should be replaced with a smaller version to comply with Wikipedia's non-free content policy and United States copyright law. According to Wikipedia's policy for non-free content, the amount of copyrighted work used under fair use should be as little as possible. In particular, non-free media on Wikipedia should not be usable as substitutes for the original work. A high-resolution non-free image is questionable fair use and may be deleted per Wikipedia's copyright policy. The size of an image may be reduced in an image editing program or by saving and re-uploading a suitably sized thumbnail. Once a reduced version of this file has been uploaded, please replace this template with {-{Non-free reduced|17:06, 1 December 2011 (UTC)}-}.

Which doesn't reflect the conversation here, gives the wrong advice and is threatening. It doesn't say that the file has been tagged for resizing, just that it is on a hit list. Nowhere does it say what size triggers the bot, and the onus of the action is put on the uploader when actually doing nothing is an acceptable option. If I am right, it should say:

This non-free media file has been tagged as unnecessarily large and will be automatically reduced in size to comply with Wikipedia's non-free content policy and United States copyright law. According to Wikipedia's policy for non-free content, the amount of copyrighted work used under fair use should be as little as possible. In particular, non-free media on Wikipedia should not be usable as substitutes for the original work. A high-resolution non-free image is questionable fair use and may be deleted per Wikipedia's copyright policy. Currently files larger than 160,000 pixels (405x405) will be adjusted downwards. The size of an image may be reduced in an image editing program or by saving and re-uploading a suitably sized thumbnail, or by ignoring this message and allowing a remote program (bot) called DASHBot 9 to do the task. In rare cases tag the image {-{bot|Rbot=ignore}-} or {-{Rbot|stop|For manual attention|The specific reason such as: avoiding moiré patterns}-}. Once a reduced version of this file has been uploaded, please replace this template with {-{Non-free reduced|17:06, 1 December 2011 (UTC)}-}.

The default behaviour must comply with policy - but not attempt to reschedule the lives of prolific content providers, who must constantly protect every page they have ever written against these changes of POV. The 405 squared rule was a new one on me, and in my case will just introduce moiré distortions in the images that I have uploaded. Don't get me wrong - I am strongly in favour of ditching the junk, and would love a bot to follow every edit I do and correct my mistakes - but the tags do have to be clear and there must be a simple way to request they are reversed. --ClemRutter (talk) 17:06, 1 December 2011 (UTC)

Here's my two bits; I know almost everyone's aware of the background info I'll include here, but bear with me.

  • The only *rule* is "images should be rescaled as small as possible to still be useful as identified by their rationale." No collection of bots is capable of correctly implementing that rule. If bots are involved in identifying images for downscaling we need to carefully avoid generating large numbers of false positives.
  • The *primary guideline* for identifying images which should be able to remain useful if downscaled is "images where one dimension exceeds 1,000 pixels, or where the image size approaches 1.0 megapixels or more, will likely require a closer review to assure that the image needs that level of resolution."
  • The currently-stated *ideal for most common image uses* is roughly 100K px. This is good for most uses but seriously unrealistic for many others. Bots can try to enforce rules but definitely should not be going around trying to enforce ideals.
  • Under the assumption that there's a bot going around automatically downscaling images which have been tagged for downscaling, we can have two thresholds: one above which images will automatically be tagged for reduction (and which thus will be reduced unless there's human intervention) and one above which images will be put into a category for humans to review and decide which to tag for reduction.
  • I think most everybody would be OK with the bot automatically tagging images which exceed the primary guideline's "1000px in one dimension/ 1 megapixel" rule as long as there was a way people could, after "closer review," manually prevent reduction.
  • I can see some legitimate arguments for a lower threshold for automatic tagging, all the way down to a threshold of ~300,000 pixels or so- captures of 640x480 computer displays, of NTSC and PAL TV and DVDs, etc should normally not be at full resolution, even though they come in well under the primary guideline. But the lower you go the more of a mess we have with false positives, and below ~300K px that will be totally unacceptable.
  • If you have too low a threshold for putting images in a human-reviewed category, you'll exhaust the time and patience of those going through them and tagging.
  • My proposal: automatic tagging at >=345,600px (DVD video still); human-reviewed category for >240,000px (avoid having to have them sift through previously-downscaled 3:2 images @600x400 and EGA-res graphics). Yes, some images lower than 240K px are problematic, but let humans tag those manually while going through the encyclopedia. Prodicus (talk) 22:07, 1 December 2011 (UTC)
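
For concreteness, a small sketch of the two-threshold scheme just proposed; the numbers are the proposal's, not settled policy:

    # Tag automatically at or above a DVD-frame pixel count, queue for human
    # review above 240,000 px, otherwise leave alone.
    AUTO_TAG_AT = 720 * 480        # 345,600 px (a DVD video still)
    REVIEW_ABOVE = 240_000         # just above a 600x400 (3:2) image

    def classify(width, height):
        pixels = width * height
        if pixels >= AUTO_TAG_AT:
            return "tag {{non-free reduce}} automatically"
        if pixels > REVIEW_ABOVE:
            return "list in a human-review category"
        return "leave alone"

    print(classify(720, 480))      # -> tag {{non-free reduce}} automatically
    print(classify(640, 400))      # -> list in a human-review category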


  • No Quoting "Especially with the way guidelines are written in Wikipedia (unworkable if taken literally, usually workable as actually applied by humans) I think that any "policing by robots" of literal interpretation of rules/guidelines is a bad idea". Don't approve bots for this purpose. Real human beings worked on these articles, real human beings have to make the judgement calls. A bot can, effectively, silence consensus; a bad bot can alienate thousands in the time it would take a single person to alienate even one. Bots make bad enforcers of gray policies. Randomcommenter (talk) 23:21, 1 December 2011 (UTC)
Just chiming in here, I really don't like the idea of bots talking to each other. Even with a delay between the actions, there's no guarantee that it will ever be human-reviewed, and, due to the ninja-ness of bots, the tagging may go unnoticed thereafter. I feel that FBot should tag images with a different template (or the same template with different parameters), which clearly indicates that the image was tagged by the bot, and puts it in a category ignored by DASHBot. Humans can then move it to the normal category with the normal template if they feel that Fbot was justified.
Or, just replace the task with a toolserver script that generates a list of all images greater than xyz size which are not tagged, and provide this script to the community. ManishEarthTalkStalk 04:58, 2 December 2011 (UTC)
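
A rough sketch of the kind of list-generating script suggested above, using the public API rather than the toolserver databases; the category name and the pixel threshold are assumptions, not settled values:

    # Walk a category of non-free files and print any file over a pixel
    # threshold that is not already tagged with {{non-free reduce}}.
    import requests

    API = "https://en.wikipedia.org/w/api.php"
    THRESHOLD = 160_000                       # pixels; placeholder value
    CATEGORY = "Category:All non-free media"  # assumed tracking category

    params = {
        "action": "query", "format": "json",
        "generator": "categorymembers", "gcmtitle": CATEGORY,
        "gcmnamespace": 6, "gcmlimit": 500,
        "prop": "imageinfo|templates",
        "iiprop": "size",
        "tltemplates": "Template:Non-free reduce", "tllimit": 500,
    }
    session = requests.Session()
    session.headers["User-Agent"] = "OversizeLister/0.1 (contact: me@example.org)"

    cont = {}
    while True:
        data = session.get(API, params={**params, **cont}).json()
        for page in data.get("query", {}).get("pages", {}).values():
            info = page.get("imageinfo", [{}])[0]
            pixels = info.get("width", 0) * info.get("height", 0)
            already_tagged = bool(page.get("templates"))
            if pixels > THRESHOLD and not already_tagged:
                print(page["title"], pixels)
        if "continue" not in data:
            break
        cont = data["continue"]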

DPL bot 2

I just wish to note here, that I think that Wikipedia:Bots/Requests for approval/DPL bot 2 was rushed through too quickly, without time for appropriate checking and evaluation.

I know that some brief testing was done, and it does indeed show some were fixed. What it cannot do though, is show how many people were irritated by alerts over something that is, frankly, fairly trivial.

There were many points raised in an earlier discussion, Wikipedia talk:Disambiguation pages with links#Update and Request for Comment - User dablink notification. I have a feeling that these notifications will irritate quite a large number of users; I think we should be cautious. A single added disambig link isn't a problem - I realise the notes are supposed to be friendly, but I don't think they're particularly clear - and a message of this type from a bot will look like some kind of 'warning' regardless of the intent.  Chzz  ►  06:47, 17 November 2011 (UTC)

Then why didn't you raise any concerns on the BRFA? From my evaluation, Wikipedia_talk:Disambiguation_pages_with_links#Update_and_Request_for_Comment_-_User_dablink_notification showed that there was support for the bot. --Chris 07:50, 17 November 2011 (UTC)
Quite simply, because I didn't notice it; it began on the 10th, was trialled on the 14th, and approved on the 17th. I'd been monitoring the discussion on that DAB talk page, but considering all the concerns raised there, I assumed it was still up for debate; I didn't notice that a bot was starting up until I happened to see it on a user talk page and investigated where it had been approved. If that's just my fault for not seeing the proposal, then so be it; I just wanted to comment here, so that I could state I have some concerns and reservations - this seemed the most appropriate place, now that it's already approved.  Chzz  ►  08:31, 17 November 2011 (UTC)
Ok, then let's focus on your concerns. Admittedly, the sample size for the trial was very small; however, I think it showed the bot does have some merit, and unless there is a large number of complaints I think the bot should stay approved. That said, your concerns also have merit. Firstly, the notice the bot sends. That is fairly easy to fix/change. Do you have any suggestions as to what would be more appropriate/clearer/friendlier? Secondly, the irritation factor. I think the impact of this can be reduced by making it easier to opt out. At the moment the bot respects {{bots}}; however, it should be easier and more user-friendly to opt out (e.g. a note could be placed at the end of the message, informing users how to opt out). Likewise, steps could be taken to reduce the level of "spam". Perhaps only leaving messages for users if they have not received a message from the bot within X days? Do you have any suggestions on how you'd like to see this dealt with? I'll drop a line at JaGa's talk page as well. --Chris 09:25, 17 November 2011 (UTC)
Those are all good ideas. I'll have a think, and respond more ASAP.  Chzz  ►  10:55, 17 November 2011 (UTC)
I disagree with the proposition that "a single added disambig link isn't a problem". We built up a collection of well over a million such links primarily through the addition of a single added disambig link repeated many times over, and hundreds more are added every day. We have nevertheless been able to cull these numbers through an arduous slog that I'm sure most of us would not want to continue replicating. bd2412 T 14:30, 17 November 2011 (UTC)
I think Chzz makes a valid point. While the discussion at WT:DAB did indicate support for the task, it also only involved 3 or 4 editors. I'm not exactly sure what my position on the bot is just yet, but I believe that such a task does need to be performed with great care to avoid becoming very annoying. It's also somewhat unprecedented, unless I'm unaware of the precedent; what other bots are out there which will post an automated warning on the talk page of any user (including experienced editors, not just noobs) when they make an editorial mistake, e.g. when the only guideline they've violated is WP:MOS? This could become a slippery slope. What other mistakes would we be allowed to automatically template the regulars for? Improperly formatted citation templates? Creating an article with no categories? Adding external links to the body of the article? If not, then why are these problems less important than dab links, to the point that we can't also warn people about them?
Again, I'm not taking a position one way or another, but I think it would benefit everyone to have a much wider discussion about this task than 3 or 4 users. And, having the discussion at WT:DPL (a very obscure location) pretty much guarantees that you're going to mostly get participation from editors for whom dab links are a pet peeve, or a very important issue. This may not represent the general attitude among most or all editors. My vote would be for a proper RfC or at least a VP thread. —SW— spout 15:30, 17 November 2011 (UTC)
I think experienced editors know when they have created an article with no categories, and can see when a template is broken. The insidious thing about disambiguation links is that they can be made very thoughtlessly. Someone types in an article that so-and-so died in the spring, and thinking of a lovely springtime, links spring without realizing that the term is ambiguous and could lead the reader to think that so-and-so died in a natural aquifer. If I made that sort of mistake - as I'm sure I do from time to time - I'd want to be told about it, as I, having added the link, would probably be in the best position to fix it quickly. Wouldn't you? bd2412 T 16:01, 17 November 2011 (UTC)

"Trivial" I also have to object to this. Anything that degrades the user experience is a problem, and should be fixed. Even something as simple as clicking a wikilink to British and getting a disambig instead of an article about British people is a negative user experience. But even more importantly, it often isn't obvious which article the author intended, which is indeed serious, because then a reader hits a dead end in their navigation. If we can get the author to let us know what they meant while they're still around, the article benefits. Also, by asking willing editors to clean up dablinks they were simply unaware of, that frees up resources on the DPL project to tackle the truly difficult dablinks instead of fixing a hundred links to British. (There are still over 650,000 links to disambiguation pages to fix.)

"Some were fixed" I'd say raising a 10% fix rate for the control group to 60% for the messaged group is a phenomenal result!

The message Regarding the message, I'm happy to make any changes. I wanted to make it as short as possible, not feel pushy, and give a quick explanation of what a disambig is. I would welcome any help in making it more coherent.

Irritation to users In the original brfa, I explained why I think the messages would be more irritating to Maintainers than Content Creators. Accordingly, I've done what I can to avoid messaging Maintainers, and have made DPL bot exclusion compliant.

Not knowing how many were irritated I doubt that. Wikipedians are not very good at dishing out praise, but if you upset them, you can bet you'll hear about it! The response has been very mild so far. I am worried about filling people's talk pages up with messages, though. I want to strike while the iron's hot, and ask the user for help while the work is fresh in their memory, but I don't want to make people feel harassed. There's always opt-out, of course, and I'm open to suggestions. --JaGatalk 15:56, 17 November 2011 (UTC)

Templating the regulars This isn't a warning template, and new editors are excluded, so I'm not sure that essay really applies. I'd hate to throw out this opportunity for fear of antagonizing some editors who can simply opt out of the messages. I see your point though, and agree we need to tread carefully.

Dangerous precedent MOS mistakes are different in that they don't block the user from gaining knowledge. In the brfa, I used an example of The Feynman Lectures on Physics, which has linked to Magnetic resonance since February 15. Now, imagine a user is reading about these lectures, and wanting to learn more about this thing called magnetic resonance (very possible, these lectures were intended to introduce non-physics majors to physics). The user clicks the link, and what happens? They get a "Did you mean..." page! Well, heck, they aren't going to know. They'll be confused, and probably drop that line of inquiry. So this isn't just a "pet peeve" issue. It's an attempt to make the information we offer to the public as complete and accessible as possible. --JaGatalk 16:21, 17 November 2011 (UTC)

I agree with most of what both of you are saying, but I still think that this would benefit from more discussion. The discussion that ChrisG linked to above was not a proper discussion. It's like claiming you have consensus for removing the "delete" function on Wikipedia because you had a supportive discussion on WT:ARS. This is a more complex issue than it appears on the surface, and I think the Wikipedia community deserves a chance to comment on this before getting bombarded with templated warnings on their talk page for relatively minor MOS violations. —SW— babble 19:43, 17 November 2011 (UTC)
I'm fine with more discussion, but I do need to point out they're neither warnings nor merely MOS violations. The messages based on WP:INTDABLINK, now, those could be called MOS violation messages, and I'm going to remove those from DPL bot's purview. They caused some confusion among recipients, so I'm looking forward to taking them out. --JaGatalk 20:55, 17 November 2011 (UTC)
You can't "violate" the manual of style. The fact that you think such a thing is possible makes me very uneasy, and indicates that you are missing the point that is being raised here. Gigs (talk) 15:19, 2 December 2011 (UTC)

Status

Just to make it clear, I'm not going to run the bot while this discussion is going on. --JaGatalk 19:59, 17 November 2011 (UTC)

That's extremely understanding of you. I really only wanted to make a little comment; I'm sorry I appear to have stirred up a whole hornet's nest. I know full well that your intentions are great, and really I'm just hoping to avoid possible irritation. The suggestions Chris G made for improving it sound excellent - i.e. making the message a bit clearer, making opt-out very easy, and not bombarding the same user too often. I do also think it opens a wider question - but that's no bad thing; there are other comparable things we could tell users about. We could think about telling users when they add dead links, or deprecated/malformed templates, or multiple links in one section, or...well, all kinds of stuff, probably. Whether we should could be an interesting debate. I suppose mostly, I just have a feeling that certain experienced and good editors will explode if someone dares tell 'em about these things; but I could be pleasantly wrong.
With regards to the message; let's look at an example: [15]
My first thought is that "this message is to let you know about" is a bit unnecessary.
Also, I think that saying "Yelena Bondarenko (BYuT politician) was linked to People's Deputy" is a bit confusing, because it sounds like it's a redirect or something - rather than one (of many) wikilinks within that article.
So I wonder if it could be e.g.
Hi, In Yelena Bondarenko (BYuT politician), you recently added a disambiguation link to People's Deputy.(check | fix with Dab solver) Such links are almost always unintended, because the target page is merely a list of "Did you mean..." article titles. For more information, see the FAQ or drop a line at the DPL WikiProject.
The immediate problem there is, dealing with several links, I imagine.
And then, I think we need a "To stop receiving these messages, <do this>" - where 'do this' is something very easy and clever which I haven't quite thought of yet; possibly a pre-loaded edit to an opt-out page, or something.
And what about this 'don't notify anyone too often', JaGa; could you think about implementing that? To keep track of all users you've <recently> notified (where <recently> is a variable, such as 30 days)?
Note, I'm just throwing out ideas here; I hope others can improve on them!  Chzz  ►  19:19, 18 November 2011 (UTC)
Minor comment. The link in the description to "disambiguation link" doesn't really clarify what a "disambiguation link" is; it merely directs one to Wikipedia:Disambiguation, which provides a lengthy (and perhaps overwhelming) list of all things disambiguation-related. In my mind it would be preferable if this link pointed to something more specific like WP:INTDABLINK. I also feel the use of the term "disambiguation link" verges on being wiki-jargon; "links to disambiguation pages" is much clearer to someone who isn't familiar with the term. And if we did use this wording instead, a link to Wikipedia:Disambiguation from "disambiguation pages" would be more appropriate. France3470 (talk) 19:40, 18 November 2011 (UTC)
Sorry if it's been discussed, but if this is random, an opt-in to receive a message every time you create a dab link might be popular as well. "Click here to add your name to the list." As long as people realize it's ok to remove the messages from their talk pages and that there's no social stigma involved, you'll probably see quite a few people signing up. Some people really don't want the world to know they've made an error (however minor it may be), while some people are perfectionists and will embrace your messages. Give them the option, and it'll be smooth sailing. Best of luck with the project, PhnomPencil (talk) 21:43, 18 November 2011 (UTC)
These are all good suggestions. @France3470: you're right about the "disambiguation link", er, link. I'm not sure what would be better, but I do know it could be better. We'll work on it - maybe even put something into the FAQ to link to. @Chzz: I really like the "don't notify too often" idea. I'm going to figure out how to do that, definitely. 30 days is high - I want to contact the editor while the work is still reasonably fresh in their memory, and avoid overwhelming editors with huge messages (which a month's worth of dablinks would do in many cases). Also, on the message, I might go with the current format for multiple dablinks (the minority) and your format for one or two dablinks (the vast majority). I certainly want to reduce the real estate DPL bot gobbles up, and that would help. @PhnomPencil: I like the opt-in daily message idea, and also the idea of letting people know it's OK to remove the messages. The opt-in coding would take some time, though, so I would want to hold off on that for now.--JaGatalk 21:51, 18 November 2011 (UTC)

Changes implemented

OK, here are the new features (you can see samples at User:JaGa/Sandbox):

  • Messages are not sent for most WP:INTDABLINK links. To accomplish this, I've exempted hatnotes, See also sections, and all dablinks coming from other disambiguation pages (a rough sketch of this kind of exemption check appears after this list).
  • If a user has been messaged, they can't receive another message until 5 days have passed. (So if they received a message on Monday, the earliest they'll receive another will be Saturday.)
  • Several message tweaks:
    • Created an "inline style" that is used if the user has added no more than three dablinks to a single article. Standard list style otherwise.
    • Cleaned up and condensed the language more or less along Chzz's and France3470's suggestions.
    • Added an opt-out explanation link.
    • Added a "it's OK to delete this message" note.

--JaGatalk 16:31, 3 December 2011 (UTC)
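For illustration, the exemption described in the first bullet above might look something like the minimal Python sketch below. The template list, the source_is_dab flag, and the function name are hypothetical placeholders rather than DPL bot's actual implementation, and a real check would need to handle many more template variants.

<syntaxhighlight lang="python">
import re

# Hatnote templates treated as intentional links; the list is illustrative, not exhaustive.
HATNOTE_TEMPLATES = ("about", "for", "other uses", "redirect", "see also", "distinguish", "hatnote")


def link_is_exempt(article_wikitext, target, source_is_dab):
    """Return True if a link to `target` should not trigger a notification.

    Rough approximation of the exemptions described above: links placed by
    hatnote templates, links in the "See also" section, and any link coming
    from a page that is itself a disambiguation page.
    """
    if source_is_dab:
        return True

    # Links inside hatnote-style templates, e.g. {{about|...|Target}}
    for template in HATNOTE_TEMPLATES:
        pattern = r"\{\{\s*" + re.escape(template) + r"\b[^}]*" + re.escape(target)
        if re.search(pattern, article_wikitext, re.IGNORECASE):
            return True

    # Links in the "See also" section (from that heading to the next level-2 heading).
    see_also = re.search(r"==\s*See also\s*==(.*?)(?:\n==[^=]|\Z)",
                         article_wikitext, re.IGNORECASE | re.DOTALL)
    if see_also and re.search(r"\[\[\s*" + re.escape(target), see_also.group(1), re.IGNORECASE):
        return True

    return False
</syntaxhighlight>

The idea, per the bullet above, is that links appearing in hatnotes, in See also sections, or on other disambiguation pages are usually intentional and so should not generate a message.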

It's been running well against my Sandbox for a few days now, so I plan on resuming normal service on Tuesday. Thanks all for the help; I think the changes have improved the bot. --JaGatalk 17:09, 5 December 2011 (UTC)

DPL Bot

DPL bot (User:DPL bot) was used at Captain Bill Bellamy MC and, although it largely did what it was supposed to, it also replaced every [] and . with #, leaving the article in rather a mess. I suggest this bot be examined again. SonofSetanta (talk) 13:20, 2 December 2011 (UTC)

DPL bot was only responsible for the tagging. The problem came from Dab Solver, but the bug seems to have been fixed. I'll look into this more. --JaGatalk 15:44, 2 December 2011 (UTC)

Requesting approval for non-bot edits

It seems to me that the situation is unclear with respect to the growing number of situations where non-bot editors are instructed to seek BAG approval for large-scale tasks; see WP:MASSCREATION and now possibly Wikipedia:Arbitration/Requests/Case/Betacommand 3/Workshop#Large scale editing prohibited except under conditions.

I don't think too much needs to change, though. We'd want a different version of Wikipedia:Bots/Requests for approval/InputInit that didn't ask for "Operator", "Exclusion compliant", or "Already has a bot flag", and instead asked "Manual or Script-assisted" and for programming language/source code "if script-assisted". Then we'd want a new version of Wikipedia:Bots/Requests for approval/Instructions for bot operators. Would we want new categories too, instead of Category:Open Wikipedia bot requests for approval and such? What should they be called? And then, of course, I'd have to update AnomieBOT.

Any other thoughts? Anomie 20:51, 6 December 2011 (UTC)

I don't think we need go overboard. These are going to be infrequent and can probably be dealt with on a case-by-case basis IMHO. - Jarry1250 [Weasel? Discuss.] 21:30, 6 December 2011 (UTC)
The big problem I have is that AnomieBOT has issues trying to deal with a non-bot BRFA. Anomie 03:39, 7 December 2011 (UTC)
Ah, I see. What problems does it have? - Jarry1250 [Weasel? Discuss.] 10:53, 7 December 2011 (UTC)
If the person deletes "Operator", or fills it in with "n/a" or something, it complains. And then it will complain that the 'bot' is editing its own BRFA. And then it will complain that the user is editing before being approved for trial. Having some sort of flag in the BRFA to say "this is a non-bot request for approval" can fix all that. The rest is just cleaning things up for the 'customer', so to speak. Anomie 12:08, 7 December 2011 (UTC)
I suppose, as you say, the best resolution for now would be some kind of zero-output template. - Jarry1250 [Weasel? Discuss.] 12:10, 7 December 2011 (UTC)
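To make the "flag in the BRFA" idea above concrete, here is a rough sketch (in Python, purely for illustration; it is not AnomieBOT's actual code, and the marker template and check names are hypothetical) of how a maintenance bot could detect a zero-output marker on a BRFA page and skip its bot-account-specific complaints.

<syntaxhighlight lang="python">
import re

# Hypothetical zero-output marker; the real template name would be decided on-wiki.
NON_BOT_MARKER = re.compile(r"\{\{\s*Non-bot request\s*[|}]", re.IGNORECASE)


def is_non_bot_request(brfa_wikitext):
    """Detect the hypothetical marker template on a BRFA page."""
    return bool(NON_BOT_MARKER.search(brfa_wikitext))


def checks_to_run(brfa_wikitext):
    """Pick which automated checks apply to this request (check names are illustrative)."""
    common = ["notify_operator_of_status_changes", "archive_when_closed"]
    if is_non_bot_request(brfa_wikitext):
        return common  # skip the complaints that only make sense for bot accounts
    return common + ["require_operator_field",
                     "warn_if_bot_edits_own_brfa",
                     "warn_if_editing_before_trial"]
</syntaxhighlight>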

Blow up a wiki

This brfa created a bit of a problem, mainly because the bot had messed up a lot of edits in a 5,000-edit trial, and there was a bit of a dispute over the need to nuke the edits.
I feel that all this could have been avoided if the bot had worked on a mirror wiki containing the pages from the affected category. I propose that we make this an optional process to limit collateral damage from bot tests. There are three different ways I can think of doing this:

  • Whenever there is a large trial, ask the bot op to set up their own wiki on their own computer, use Special:Export and Special:Import/Import.php to get the relevant pages (at least a chunk of them) onto the wiki, and run the trial on that wiki. Note that the bot will still query enwiki/toolserver for data, but all edits will be made to the temporary wiki. Of course, if the op does not have Apache+PHP+MySQL installed, this will be a long process, so it's better only to suggest this to users. This also has the other downside that community members cannot check the edits.
  • Set up a few test wikis (ten or so), giving BAG members crat rights. These wikis can be rationed out for a bot test if the BAG/user feels that a brfa needs one. In this case, the BAG will give importer rights to the op and bot rights to the bot on the wiki. The user will then import the relevant pages and run the trial (again, data will still be fetched from enwiki, but edits will be done to the mirror wiki). If the bot messes up, it's not a pressing issue to fix it, as the edits aren't live. The community can check these at their leisure. Once the trial gets over, it should be an easy matter to delete all pages from the wiki (nuke the imports and bot edits, or use an extension that wipes the wiki; not too hard to write).
  • Create a toolserver tool where the bot can 'login', and edit pages. The toolserver will store the edits and display diffs for each bot. The tool will basically have a dummy API, with only login, edit, revert, and move options (one could also have the API redirect other fetch-type requests to the enwp API). In the other two methods, because of the fetch-from-enwiki-edit-testwiki thing, there can be discrepancies in the edits (for example, if someone added some content on enwiki after the import but before the trial, then it will look like the bot added the content after it makes an edit). Here, that issue doesn't come into play.

For a bot op, implementing these changes isn't too much work; all you have to do is encapsulate your edit and login methods, and have it edit the relevant wiki depending on some flag.
I doubt that it will be too much trouble for the WMF to set up the test wikis (I don't know about this, though), and cleaning up after a trial isn't that hard.
Note that all of this should be optional for the bot op, it's mainly a way for the bot op to run a trial without having to worry about collateral damage.

Just wanted to put this idea out there, make of it what you will! ManishEarthTalkStalk 05:48, 16 December 2011 (UTC)
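As a rough illustration of the "encapsulate your edit and login methods ... depending on some flag" point above, here is a minimal Python sketch using the MediaWiki action API. The test-wiki URL is a hypothetical local mirror, reads deliberately still come from enwiki as described in the proposal, and a real bot would additionally need to log in, handle errors, and respect rate limits.

<syntaxhighlight lang="python">
import requests


class WikiClient:
    """Tiny edit wrapper: one flag decides which wiki receives the writes."""

    LIVE_API = "https://en.wikipedia.org/w/api.php"
    TEST_API = "http://localhost/testwiki/api.php"  # hypothetical mirror wiki

    def __init__(self, use_test_wiki):
        self.api = self.TEST_API if use_test_wiki else self.LIVE_API
        self.session = requests.Session()

    def fetch_text(self, title):
        # Reads still come from enwiki, per the proposal above.
        r = self.session.get(self.LIVE_API, params={
            "action": "query", "prop": "revisions", "rvprop": "content",
            "rvslots": "main", "titles": title,
            "format": "json", "formatversion": "2",
        })
        page = r.json()["query"]["pages"][0]
        return page["revisions"][0]["slots"]["main"]["content"]

    def save_text(self, title, text, summary):
        # Writes go to whichever wiki the flag selected.
        token = self.session.get(self.api, params={
            "action": "query", "meta": "tokens", "type": "csrf", "format": "json",
        }).json()["query"]["tokens"]["csrftoken"]
        self.session.post(self.api, data={
            "action": "edit", "title": title, "text": text,
            "summary": summary, "token": token, "format": "json",
        })
</syntaxhighlight>

The rest of the bot's code would call fetch_text/save_text without caring where the edit ends up, so switching a trial onto a mirror becomes a one-flag change.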

Bot operators are responsible for fixing their edits. I think not just for me, but for a lot of operators, putting bad edits out is not just a pain, it's an embarrassment. One way of avoiding this embarrassment is a lot of pre-edit trialling, looking at what it would have done to a given page to check regexes, etc. Another is to set up a personal wiki. But there's no *need* to if you can't or don't want to: this is a wiki and the BRFA process is forgiving; if you fix the problems with your edits, all previous problems are quickly forgotten. And I would stress that most bot operators' opening gambit isn't a 5000+ page trial. In other words, I think the present system provides lots of testing options, and time and energy need not be devoted to creating more IMHO. - Jarry1250 [Weasel? Discuss.] 08:52, 16 December 2011 (UTC)
Nice ideas (although the first one at least seems unworkable to me), but I don't really see them as being necessary. As Jarry says, a 5,000 edit trial is not something BRFA normally asks for, and in the case you pointed to it only occurred due to a miscommunication, where the bot operator didn't understand what was being asked for by BAG. In general, due to the small size of trials and (certainly in most cases) how easily problems can be reverted, I don't see the issue with doing them to the actual wiki. In the past when we have wanted a dry run done, the results are generally just posted in the bot's or operator's userspace. Doing trials on the actual pages can also have the benefit of alerting users watching the relevant pages to the bot task, if they have not already been informed. - Kingpin13 (talk) 18:30, 3 January 2012 (UTC)