Wikipedia:Administrators' noticeboard/CXT

  • WP:AN/CXT

CXT refers to the content translation tool.

List of outstanding pages

Content translator tool creating nonsense pages

Initial Discussion

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Could admins (and editors) keep an eye on this feed, please?

Alternate view at the NewPages feed. xaosflux Talk 14:35, 26 July 2016 (UTC)

An editor was blocked today for creating too many nonsense articles using this tool (Link). Apparently, this superb piece of software has been rolled out on Wikipedia with the intent of making it easier for people to translate articles into a Wiki that an article currently doesn't exist in. Apparently 100,000 articles have been "written" using this tool; let's hope that most of them were better than the utter rubbish that the tool is flooding enwiki with. Astonishingly, a number of Wikimedia people were celebrating this mass creation of crap.

Why? Because this "tool" has had the effect, certainly at enwiki, of creating large amounts of nonsense - either unsourced, non-notable, or in completely unreadable gibberish. Here are the last 500 edits tagged with it, though this is unrepresentative as many of the articles created have been speedy deleted or redirected on the spot.

This tool is utterly useless. It is creating masses of work for new page patrollers, editors and admins. It needs to be turned off, or at least only made available to experienced editors (clicking the link above, you will see some perfectly acceptable edits by people such as Rosiestep). Black Kite (talk) 21:20, 25 July 2016 (UTC)[reply]

Definitely needs some fixes like warning users if there is already a page at the target to prevent things like this unhelpful "rewrite" of the Hannah Arendt page [1]. There also should be automatic conversions of parameters in citation templates to avoid the red mess like the bottom of this page [2]. Maybe access to the tool should be given out like some of the user rights at WP:PERM? ---- Patar knight - chat/contributions 21:52, 25 July 2016 (UTC)
  • No. You need to actually disable it. Anyone can copy/paste from google translate. They shouldn't. Machine translations are dangerous because as well as producing horrible Yoda-speak that's incredibly time consuming to clean up, they can also misrepresent or even invert the meaning of the original text.—S Marshall T/C 21:59, 25 July 2016 (UTC)[reply]
  • Absolutely. Here is the link to the tool, by the way. "By providing a more fluent experience, translators can spend their time creating high-quality content that reads naturally in their language." Hahahaha. Further, there's no bar on creating articles in the wrong language. I deleted Polidaktili earlier, which was a copy of the English version of Polydactyly. And a number of articles are (bad) versions of ones we already have, under different names. Black Kite (talk) 22:03, 25 July 2016 (UTC)[reply]
  • I agree that it should at the very least be restricted to people who can demonstrate a use for it. I can see it would be genuinely helpful for bilingual editors to bring a rough draft across with all the citations in place already, but it shouldn't be used to bulk-create gibberish. Also, it probably ought to output into some kind of draft space until the articles are in a fit state to move across, rather than directly into article space. With my cynical hat on, the WMF probably got a grant for this and need to demonstrate to the donor that it's actually being used, regardless of how much hassle it creates for the peons. ‑ Iridescent 22:22, 25 July 2016 (UTC)[reply]
  • (edit conflict) I notice that the blocked editor has over 3,000 edits on es-wiki but only 8 on en-wiki before s/he started to use the Content Translator tool last month. It seems likely that s/he is not a native English speaker. If this tool is to be used at all it should only be used by people with good enough English to be able to clean up the result. That could be achieved by making its use subject to the granting of a user right, like AWB; but I am inclined to agree with S Marshall that it should be disabled as causing more trouble, disruption and garbled articles than it is worth in extra content. This seems typical of the WMF's "quantity rather than quality" attitude, which is arguably appropriate for smaller WPs, but not here.
Does anyone know what translation algorithm is used - is it Google Translate, or have they developed their own? JohnCD (talk) 22:30, 25 July 2016 (UTC)[reply]
I believe that strictly speaking we also need to go through all those creations adding {{translated}} to the respective talk pages, which will automatically sort them into correct "pages translated from the xxx wikipedia" categories. I expect we'll be needing a bot.—S Marshall T/C 22:39, 25 July 2016 (UTC)[reply]
I have browsed some entries and while some are not problematic at all, my overall impression is that this tool (and its usage) gives the expression "shit storm" a whole new meaning. Kleuske (talk) 22:48, 25 July 2016 (UTC)
They are tagged and can be tracked; this utility isn't coming "from" enwiki - we could possibly stop these edits with the edit filter (or stop them for new users...) but we would need consensus about how to deal with these first. -- xaosflux Talk 23:24, 25 July 2016 (UTC)

Amir E. Aharoni please read this discussion. -- Magioladitis (talk) 23:34, 25 July 2016 (UTC)

  • I posted over on mediawiki about this too, not sure if we will get anyone. -- xaosflux Talk 23:35, 25 July 2016 (UTC)
    • Thanks for bringing this to our attention. Please allow us some time to go through the conversation to understand the problems being reported. --Runa Bhattacharjee (WMF) (talk) 04:15, 26 July 2016 (UTC)
  • Anyone reading this may also be interested in phab:T138711 - a request to "Enable Machine Translation in English in the content translation tool". — xaosflux Talk 00:22, 26 July 2016 (UTC)[reply]
    I put a "community consensus needed" tag on that phab request; removing it will at least require someone to justify why consensus isn't needed. -- xaosflux Talk 04:30, 26 July 2016 (UTC)

I would just mass-delete everything created with the tool and block edits from it if the WMF doesn't say anything within a couple days. That might get their attention. If they then make a fuss, I would agree to reverse things just as soon as they agree to help clean up the mess they've made. Based on the past history of the WMF's "response" (if you can call it that) to community requests, polite requests tend to be ignored. --71.110.8.102 (talk) 02:20, 26 July 2016 (UTC)[reply]

  • I'm glad to see people realizing the gravity of this problem. Ever since receiving an e-mail inviting me to use the translation tool—which is a modified version of Google translate—to create articles on fr.wikipedia, I have felt deeply disrespected. When I was more active on the project, one of my major areas of effort was Pages needing translation into English, which perpetually has a massive backlog of bad translations needing clean-up, and where editors work their tails off trying to get well-meaning contributors to realize that a machine translation is worse than no article. It is not only generally incomprehensible, it's often inaccurate: translation programs do not actually translate, they are modified searches that match groups of words to translations of similar groups of words available online, and they tend to not only be studded with untranslated words (especially in highly inflected languages) and contextual howlers, but to translate names, to rearrange and combine elements from different parts of the sentence, and to simply omit things—including negatives. The editors doing this are well-meaning and often new to en.wikipedia, but the WMF is showing either simply massive ignorance of how languages work and what translating involves, or contempt for editors at the receiving wiki. Frankly, I gave up my last respect for the WMF when I read that e-mail, and I've been very sad not to see any protest about it. I salvaged one such translation, Autonomous Port of Abidjan (and completed the article), and I did my best to fix up the French translation of an article I'd written here, but otherwise I've just kept away in despair and a desire not to hurt the editors who are doing what the WMF tells them is right and good. Yes, there are a startling number of topics on other Wikipedias that are not covered here on en. and that I wish someone would write up. (A translation is often not the best approach anyway.) But this is most emphatically not the solution. 
IMO the PNT people should have a large say in what we do with the articles created up to this point with this tool. But it's urgent, if at all possible, to get the WMF to stop pushing it. (I really have no hope we can get it removed.) Yngvadottir (talk) 04:20, 26 July 2016 (UTC)[reply]
    I regret that you feel disrespected. I greatly respect the work done on WP:PNT. I test drove the automated translation tool once, with spectacularly bad results (the article created was speedily deleted as a machine translation). I thought that it could help the folks at WP:PNT if it could handle the templates, categories etc and leave them to handle the text; but it doesn't do this. Hawkeye7 (talk) 04:49, 26 July 2016 (UTC)[reply]
    I believe that's exactly what it does, though. At least, when I've tried it out, the tool handles citations and templates (within limits, since not all templates are available on all wikis) but actively encouraged me to translate the text myself and directly warned against using machine translation (to the point, if memory serves, of disabling the ability to save the page if my text was the same as the in-tool machine translation – although, if machine translation isn't enabled here, then of course it won't have any way to detect someone copying garbage from another online translation tool).
    At any rate, this is the Language team's product, and Runa and Amir are both aware of this conversation. Due to timezones, etc., it may take them a while to figure out what the main sources of the problems are, but I have found that they are very responsive and helpful, even if they might be unable to magically solve every single problem instantly.  :-) Whatamidoing (WMF) (talk) 11:16, 27 July 2016 (UTC)[reply]
    Content Translation is used to create thousands of good articles and we understand that you had to delete many articles by a user who uploaded unedited machine translations. Machine translation was not supposed to be enabled in the English Wikipedia, but it got enabled in error during a configuration change, and now it's disabled. I apologize for this confusion. We are actively monitoring the numbers of articles being created and deleted, and we are listening to community feedback. In addition, we would also like to reiterate that there is a specific warning that is shown to editors if unedited machine translated content beyond a threshold is retained. This may not always be harmful, especially in cases where the quality of machine translated content is very high (e.g. Spanish-Catalan, English-Russian and many others) and we respect the editor's discretion about retaining, discarding or modifying it. The modified content is also available for use by the developers of the machine translation system to improve the quality of the system. Thanks.--Runa Bhattacharjee (WMF) (talk) 03:30, 28 July 2016 (UTC)
@Runa Bhattacharjee: I don't understand: you say machine translation in en:wp "got enabled in error during a configuration change, and now it's disabled"; but your graph shows translations into English rising steadily over the past year to reach over 4,000, of which 1,000 deleted. JohnCD (talk) 09:17, 28 July 2016 (UTC)[reply]
@Runa Bhattacharjee: After some experimenting, I think I do understand. Can you confirm my understanding: translation into any language can only be done starting in that language's WP. Machine translation into that language may be disabled, but the other facilities of the translation tool (formatting, references etc) are still available.
If so, it might be possible one day to enable machine translation into English while restricting use of the tool by means of a user-right to those who can show that they are fluent enough in English to clean up the machine-translated output. JohnCD (talk) 09:45, 28 July 2016 (UTC)[reply]
I also put a reply to this in a new subthread below; I think it would be helpful for this not to get lost in the middle of a big discussion. BethNaught (talk) 09:19, 28 July 2016 (UTC)

It's called Content Transcrapulator. This "tool" transforms content into crap. I was yelling about this mess a year ago and gave up. From a technical standpoint, it produces some horrible code: hundreds of span tags in an article that do nothing, ISBNs done as external wikilinks, and lovely tables. It still can't translate dates in a reference. WMF doesn't care. Tickets filed last year are still open with the bugs still happening. This is a good tool in the hands of somebody who knows both languages and cares enough to fix the article. From the list of new pages done by CX, one can see the majority are done by editors with very few edits. Bgwhite (talk) 07:10, 26 July 2016 (UTC)

  • If that was the main problem it would be bad enough. From my point of view, it's not even the main problem. I have a number of specific concerns and a major general concern about it. Specific concerns are things like How does the algorithm handle double-negatives? (E.g. stick "There ain't no way Private Smith's a Nazi, sir!" into the algorithm and see if it produces a denial or an allegation). Also, How does the algorithm cope when there's one word in one language and several in another? (German has two words for "drown", which are ertrinken and ertraenken. One is a tragedy and the other is an accusation of murder, so it's rather important to choose the correct one.) This is the kind of thing I meant when I said that machine translations can misrepresent or even invert the meaning of the original text.

    The major general concern is Who is responsible for these edits? Most of these articles are not well-sourced by en.wiki standards even when fully compliant with the policies of the source language Wikipedia. If we had confidence that the people making these "translations" understood both the text they're copying from and the text they're copying to, then that's one thing; but if they don't, then who's accountable for the problems here? And, to put a slightly sharper point on it... who's liable for any inaccuracies in the translation? Isn't it the WMF who provide the tool?

    I think this is the "I see no ships" approach to machine translation. (Translate that into Japanese with your tool and ask a Japanese person what they understand by it...)—S Marshall T/C 07:45, 26 July 2016 (UTC)[reply]

    Google translate, doing a round trip from English to French and back, gives, "There is no private means Smith is a Nazi, sir!" GoldenRing (talk) 08:44, 27 July 2016 (UTC)
  • From WP:Translation: "Translation takes work. Machine translation almost always produces very low quality results. Wikipedia consensus is that an unedited machine translation, left as a Wikipedia article, is worse than nothing." That needs to be made much clearer in the instructions.
This tool is useful only if the user is sufficiently fluent in the target language to clean up the output into acceptable prose, and understands the source language well enough to be sure that the meaning has not been scrambled or distorted. It is evidently not being used like that. I suggest that:
  • Use of the tool on en:wp should be made the subject of a user right, like AWB, granted to users who can demonstrate fluency in English and who declare that they will only translate from source languages they understand well enough to be sure that translations are accurate. The right could be withdrawn from users who repeatedly provide bad translations.
  • Output from the tool should go automatically into Draft space, and be put into a new category "Machine translation needing cleanup". (A list like WP:PNT would rapidly become unmanageable). The translator would need explicitly to move the result to mainspace once confident of its quality, or leave it for others working the category to do that. There might need to be a G13-like mechanism to clear out abandoned draft translations.
JohnCD (talk) 11:40, 26 July 2016 (UTC)
The tool has no utility. Someone with adequate understanding of the original and adequate fluency in English will produce a better translation, faster, from scratch than by running the text through a machine translator and then attempting to correct that text. Translating articles into fluent English often involves reordering things (different languages and different Wikipedias have different conventions) and almost always involves changing the balance of what is explained and what is merely summarized (quite apart from the issue of sourcing raised by S Marshall, which often leads me to omit or summarize passages I can't source). What the tool does for a competent translator is add an extra step of demanding checking and correction. (Even on the level at which machine translations supposedly function best, rendering individual words, it's shocking how often I find myself submitting improvements to Google.) The points about chunky code in the resulting article are also concerning. Where computerized assistance might be useful is checking the links in the original and automatically replacing them with the Wikidata-linked equivalent in the target language, or if there is none, with the ILL template (not checking where the links in the original go to—either omitting the links or leaving them as unnecessary redlinks—is one of the failures I used to run into most often when I used to work at PNT), but so far as I know that functionality is not offered. Honestly, I can't see any use for this whatsoever. It's not a crutch, it's a pitfall. Yngvadottir (talk) 13:20, 26 July 2016 (UTC)[reply]
Agreed, all automated translation should be discouraged. It helps no one.·maunus · snunɐɯ· 13:24, 26 July 2016 (UTC)
Let's not throw out the baby with the bath. I'd prefer two modifications to your summary statement: the current state of automated translations is not good enough to use as is. The use of an automated translation service may be a useful first step toward translation but should not currently be considered an acceptable end product.--S Philbrick(Talk) 18:51, 26 July 2016 (UTC)[reply]
There is no reason to expect that the state of automated translation is going to become acceptable within the next decade. And generally automated translation is not helpful as a first step, because if you are good enough to find the errors then you would have been good enough to do the translation yourself in the first place.·maunus · snunɐɯ· 14:18, 27 July 2016 (UTC)
Well, CX isn't "automated translation" (try it out and see for yourself...), so the agreed-upon limitations of automated translation aren't necessarily very relevant. Also, as a point of fact, many professional translators use machine translation as an adjunct, because they think that it saves them time. (Whether to use it depends upon both the language pair, personal preferences, and the text in question – a translator who normally doesn't use it might choose to use it for text that contains mostly a list of proper nouns, for example.) Whatamidoing (WMF) (talk) 09:38, 28 July 2016 (UTC)[reply]
  • I would disagree with JohnCD when he suggests that machine translation output should be dumped into draft space. Category:Wikipedia articles needing cleanup after translation is already overflowing with very old material. Very few editors are interested in dealing with it because if you do have the dual fluency required, you'll always find it so much quicker, easier, and more fun, to do your own translations from scratch than to fix someone else's.—S Marshall T/C 19:03, 26 July 2016 (UTC)[reply]
  • @S Marshall: My thought was that, if we are going to have crap, it would be better to have it in draft space than in mainspace. There would need to be some mechanism like WP:CSD#G13 for clearing out abandoned drafts. But certainly better not to have crap at all, if we can stop it somehow.
The answer to your question, Who is responsible for these edits? is surely: the user who posts the article. It is their responsibility to ensure that the results they produce by using the tool are up to standard; they can't abdicate responsibility and blame the tool. There is a secondary responsibility on the WMF, akin to the responsibility of someone who sells an assault rifle without checking that the buyer is someone who can be trusted with it. JohnCD (talk) 14:11, 27 July 2016 (UTC)[reply]
  • If used according to directions, the tool is fairly useful in laying out the basic structure of the article. The translation it puts in is, I think, simply the machine translation, with all the defects of that program. The virtue of using it is that it translates and formats some of the basic page elements, which are otherwise somewhat tricky if you are not familiar with the other language's WP. The text needs the same attention as all machine translations--as the instructions say, you need a good knowledge of English, with an adequate knowledge of the technical terms in the subject field--the same good knowledge you'd need to write or translate an article from scratch--and a moderate knowledge of the source language. How much knowledge of the source language you need depends on the complexity and type of the article, on how well that language is handled by the program, and on how well you know the conventions of WP articles in that subject. Like gtranslate, it is best used as essentially a dictionary in context, not as a guide to correct syntax or grammar. The point of it, I think, is that it lets people who can handle the languages, but don't know WP formatting, translate an article. Those people will typically be European university or secondary school students, and I think that's the group that was in mind as the ideal target audience. It works less effectively for most Americans, and it works terribly for those whose knowledge of English is sketchy--as are many editors here at enWP. (My own experience is not as much translating articles as in fixing up bad translations, and I usually do it with the source article also open. By now I know some of the typical errors of gtranslate, and I know to watch for them. I also know in many subject fields the instances where a word-for-word translation is not idiomatic English--in other words, I know what the article ought to sound like.)
We need to set some limitations on the use, but I think it has to be done by making much more prominent warnings about the requirements, and warning people off at an early stage.
I suggest that those people here with true bilingual ability are not necessarily the best group to judge the usefulness (though of course they are the best group to judge the quality of the result)--to put it directly, their standards are too high. The enWP does not insist on a professional grade of English from its contributors, and many original articles by those with a sketchy knowledge of English have to be essentially corrected as if they were poor translations. DGG ( talk ) 00:14, 30 July 2016 (UTC)
@DGG: Respectfully, our standards are not too high. We don't demand a professional grade of English - grammar can always be corrected. If that were the only issue, we could fix it. What we do demand, however, is that our content meet certain core policies including Verifiability, NPOV, Original Research and BLP. Many of these low quality articles come from wikis with very different requirements, especially regarding sourcing. A number of the ones I've come across were fraught with blatantly unsourced (or sourced to things that don't meet RS) contentious statements about living persons. I know you have long been a voice in favor of retaining and rescuing content that is not "professional grade", but if you have looked over examples of the garbage this tool has put out (especially the obvious BLP implications) and still don't see a problem, then perhaps our standards aren't the ones that need to be adjusted. The WordsmithTalk to me 00:42, 30 July 2016 (UTC)[reply]
@The Wordsmith: I was referring to the standard of English, not the standards of WP:N, WP:V etc. The variation in WP standards between the different WPs is a problem no matter how skilled the translator. It depends on the WP. deWP, for example, has a higher notability standard than we do, but uses form references to standard reference works, which need to be expanded into the sort of references used in enWP. esWP does have a lower standard of notability (more exactly, a lower level of scrutiny of the submitted articles), but most of the items here from esWP do meet our standard for notability. This has nothing to do with the CX tool. DGG ( talk ) 00:52, 30 July 2016 (UTC)
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Where do we go from here?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


There is clearly consensus that the current situation is not tenable. The three main suggestions so far are:

  • Disable the tool completely
  • Allow the tool only to trusted users as a userright
  • Default the tool's output to Draft:space

It would be useful if we had some feedback from the WMF staff responsible for this, but in the meantime what is the way forward? Do we need to create an RfC on the tool? The last thing we want is a Visual Editor type fiasco; hopefully the fact that this tool is creating poor quality articles on a regular basis will mean that there is more consensus between all parties. Black Kite (talk) 20:49, 26 July 2016 (UTC)

I'd support 2 & 3 (together, not separately), but wouldn't be averse to (1). Given that the WMF clearly think this tool is the dog's bollocks, I suspect they'll refuse to allow it to be disabled, but an RFC with an overwhelming consensus might force their hand, so I'd say taking the "waste everyone's time going through the motions" route is the only one that will work. ‑ Iridescent 20:58, 26 July 2016 (UTC)[reply]
After conducting the experiment of dip-sampling earlier entries, changing my vote to "shut it down altogether". At a conservative estimate, perhaps one creation in every 20 made using this tool is appropriate; we need to be rid of it by any means necessary. ‑ Iridescent 22:17, 26 July 2016 (UTC)
I'd support 2 or 3, with 2 preferable by a wide margin. Providing a tool to all users to mass-create pages is silly. ~ Rob13Talk 21:06, 26 July 2016 (UTC)
I'd support 2 and oppose 3. Please don't drop raw machine translations into draft space on the assumption that editors with dual fluency will fix them for you, as this will not happen.—S Marshall T/C 21:09, 26 July 2016 (UTC)[reply]
Since I envisage (2) including some variant of "prove you can actually speak both languages before we flip the bit", I'm hopeful that people will fix their own translations. My personal preference would be for the machine translations to be dumped into the translating editor's userspace for them to clean up themselves, rather than into a general midden. ‑ Iridescent 21:22, 26 July 2016 (UTC)[reply]
Fourth option (because we've never been able to get the WMF to disable any of their abominations): block its use on English Wikipedia, perhaps with an edit filter. We already have a policy that machine translations are deletable. These are machine translations. @S Philbrick: There is no baby. It takes longer to check and fix the translation than it does to do it right the first time, and there's a risk of missing a major inaccuracy (such as an omitted negative or running together of names to create one person out of two; I've seen both frequently when using Google translate, and this is merely an adaptation of Google translate). The option of dropping passages of the original article into Google translate or another machine translation program as an aid already exists, as does the option of copying the original article in edit mode to save time on markup and see the links and reference URLs, or in fact there is the option of formally importing the article to draft or user space, little used though it is on English Wikipedia. Moreover I just found out today via a site I am not supposed to link to that the tool doesn't translate the "File:" prefix. Competent translators don't need this tool and in fact are hampered by it. Instead it's suckering editors new to English Wikipedia into creating terrible articles that get deleted and risk getting them blocked, like poor Cadejoblanco, who may not have known enough English to defend him/herself. Pumpie went from a perennial admin candidate to a community-banned editor in large part because of his terrible translations. Use of this tool needs to be stopped. It's hurting the encyclopedia and new editors. Yngvadottir (talk) 21:36, 26 July 2016 (UTC)[reply]
@Yngvadottir: Myself, I find it much easier to fix articles than to write them (for psychological rather than linguistic reasons), and similarly I find it much easier to fix bad translations than to start from scratch. Others prefer to write than to fix, in one or both of these cases. Both modes of working are acceptable and needed. They take slightly different skills and a different mindset. There's room for both you and me. DGG ( talk ) 00:57, 30 July 2016 (UTC)
As a point of fact, the WMF has disabled multiple software projects on this wiki at community request, including the Article Feedback Tool, Mood Bar, and others. Whatamidoing (WMF) (talk) 09:38, 28 July 2016 (UTC)
As far as I can tell, it will take developer time to make a permission system for this - we could likely cobble together something with the abusefilter - but would need a threshold where this is "accepted" (e.g. 500/30 perhaps?) - but consensus will need more than just this AN post to gather. Perhaps an RfC? -- xaosflux Talk 22:13, 26 July 2016 (UTC)
  • Sure, the proper process is important. We do need to temporarily halt these article creations while those gears are grinding. What's the simplest way to achieve that?—S Marshall T/C 22:30, 26 July 2016 (UTC)[reply]
  • A nuclear option editfilter would be very simple - disallow every edit that generates the ContentTranslator tag. Black Kite (talk) 22:47, 26 July 2016 (UTC)[reply]
  • Isn't that likely to draw the ire of the WMF without community consent? ~ Rob13Talk 22:51, 26 July 2016 (UTC)
  • Yes, I expect they'll be less than chuffed.—S Marshall T/C 22:55, 26 July 2016 (UTC)[reply]
  • Almost certainly; I was merely pointing out that it was doable. An RfC with wide visibility would be the way to go, I suspect. Black Kite (talk) 22:57, 26 July 2016 (UTC)[reply]
  • The edit filter isn't quite a magic wand, but is powerful. I've created a filter for LOGGING ONLY see log, this filter will log content translations that result in new pages, by users with less than 5000 enwiki edits. — xaosflux Talk 23:15, 26 July 2016 (UTC)[reply]
  • Note my own edit appears because I first set it to 50,000 edits for testing. — xaosflux Talk 23:23, 26 July 2016 (UTC)[reply]
  • Shut it down; my run-in with this tool indicated a useless translation, with the usual problems of uncited text and non-reliable sources. SandyGeorgia (Talk) 23:21, 26 July 2016 (UTC)[reply]
  • And for those who don't speak Spanish, most of the original text that appears to be cited in the Aymara Lorenzo sample is not actually cited at all, and is original research -- the text cited is not supported by the sources given in most instances. Cleanup of this kind of mess involves more than knowing the language and fixing the (somewhat obvious) translation errors; it involves reading every source to check for copyvio and correct the poorly sourced text that originated from another Wiki. Translations are generally allowing non-reliable, uncited text into en.wiki unless the translator verifies every source, which rarely happens. And that's without getting into the huge problem of copyvio; many times, people translating do not understand that a direct translation from a source is copyvio. SandyGeorgia (Talk) 02:36, 27 July 2016 (UTC)[reply]
@SandyGeorgia:, these sourcing problems are present in original articles written here also. Our normal process of working allows poorly cited text in enWP also, unless it gets the FA level of scrutiny. And a direct translation from another WP is not copyvio if attributed--and using the tool will I think automatically provide the attribution. DGG ( talk ) 01:06, 30 July 2016 (UTC)[reply]
Yes, it does provide automatic attribution in the edit summary. That's IMO one of the reasons to prefer translation-via-dedicated tools versus translation-via-pasting-wikitext. (Surely no one seriously believes that translation won't happen, if the tool doesn't exist?) Whatamidoing (WMF) (talk) 09:01, 1 August 2016 (UTC)[reply]
  • After reading over some of the material, and reading Yngvadottir's two insightful comments, I believe we should shut it down. Option 2 is indeed an option but sheesh, do we need more user rights? Option 3 will only add to the workload for those poor schmucks working on drafts, and--as some of us know very well--drafts don't always get the attention they deserve. Shut it down. Let the WMF make an argument for why this should be allowed, and let them be aware of the problems. Shut it down. Drmies (talk) 23:31, 26 July 2016 (UTC)[reply]
  • Using the AbuseFilter can be a bit jarring for the editor that gets hit with the block, here is what they would see
Old message

User filter message in translate tool

xaosflux Talk 23:49, 26 July 2016 (UTC)[reply]
The "Warn" mode is almost the same, but seems to cause a bit of a display bug before it allows the edit on the second submit. — xaosflux Talk 23:55, 26 July 2016 (UTC)[reply]
Please note, the only "turn it off" options I can see right now would be the abuse filter, or developer actions; this can be run "from" any project - and dumps the edits to the project of the language that is selected. — xaosflux Talk 23:59, 26 July 2016 (UTC)[reply]
  • Shut it down. "She is board member of the Institute Press and Society, chapter Caracas.[1] IActually is correspondent of international medias and manage his own company consulting in communications." I'd really like to know how that sentence (which lasted three days in articlespace) fits with the WMF's directive that all BLPs need to be handled with care and sensitivity. --NeilN talk to me 23:59, 26 July 2016 (UTC)[reply]
@NeilN:, this is the sort of text that is very easy to fix. And all that's necessary for BLP sourcing is a check if the ref shows the subject is managing their own company. (and that's the same problem with original enWP BLPs. ) DGG ( talk ) 01:05, 30 July 2016 (UTC)[reply]
@DGG: It's easy to fix - it took three days to spot. Having "she is" and "manage his" makes us look like idiots to the subject and the general reader. --NeilN talk to me 01:42, 30 July 2016 (UTC)[reply]
Actually, I think anyone would just assume that the text was composed by a non-native speaker and has not yet been edited; we have a good deal of that level of grammar from purely original writing here, though most of it comes from another continent. Almost every article dealing with India needs to be proof-read for this sort of thing. We will inevitably get more, and we need to fix them, but what's being reported here is perhaps 0.01% of the problem. DGG ( talk ) 05:15, 30 July 2016 (UTC)[reply]
  • With custom error message:
Original draft of error message

User filter message in translate tool.

xaosflux Talk 00:11, 27 July 2016 (UTC)[reply]
Can we use that message with the "do not enter" icon? --NeilN talk to me 00:14, 27 July 2016 (UTC)[reply]
That part is easy, I changed the draft message (located here: MediaWiki:Abusefilter-warning-cx to the "NO" symbol). — xaosflux Talk 00:17, 27 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Activating Abuse Filter

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


We need to have a clear consensus that adding the abuse filter is the way the English Wikipedia Community wants to go on this. The current filter (Special:AbuseFilter/782) is set to LOG when:

  1. The page is edited or created
  2. The user has less than 5000 edits
  3. Some other technical stuff to map it to the edit

This is not a good long-term fix, and if going forward, a phab ticket should be opened to disable English as a translatable target. While any sysop can flip on this filter, I don't think most of us are ready to without strong community support behind it. — xaosflux Talk 00:14, 27 July 2016 (UTC)[reply]

It is just a couple of clicks to change the "log" to "warn and disallow"; watching the log first will ensure that false positives are not occurring - while allowing to see who would have been impacted. I'm suggesting the high edit count value so that "experienced" editors can still test if necessary. — xaosflux Talk 00:19, 27 July 2016 (UTC)[reply]
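The conditions listed above, and the switch between "log" and "warn and disallow", can be sketched roughly as follows. This is only an illustration in Python, not the actual rule for Special:AbuseFilter/782 (which is written in the AbuseFilter rule language); the tag name `contenttranslation`, the `Edit` class and the function names are all assumptions made for the example.

```python
from dataclasses import dataclass

# Assumed values for illustration only: the tag name is hypothetical, and the
# threshold is the 5000-edit figure proposed in this discussion.
CX_TAG = "contenttranslation"
EDIT_THRESHOLD = 5000


@dataclass(frozen=True)
class Edit:
    """A page edit or creation, reduced to the fields the sketch needs."""
    tags: frozenset
    user_edit_count: int


def filter_matches(edit: Edit) -> bool:
    """True when the edit carries the translation tag and the user is below the threshold."""
    return CX_TAG in edit.tags and edit.user_edit_count < EDIT_THRESHOLD


def filter_action(edit: Edit, mode: str = "log") -> str:
    """Return 'allow', 'log' or 'warn_and_disallow' for the given filter mode."""
    if not filter_matches(edit):
        return "allow"
    if mode == "log":
        return "log"  # the edit is saved, and a log entry is recorded
    return "warn_and_disallow"  # the user is warned, then the edit is refused


if __name__ == "__main__":
    newcomer = Edit(tags=frozenset({CX_TAG}), user_edit_count=120)
    print(filter_action(newcomer, mode="log"))
    print(filter_action(newcomer, mode="disallow"))
```

In this sketch, moving from logging to blocking is a single change of `mode`, which mirrors the "couple of clicks" described above; watching the `log` results first shows who would have been affected.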

Pinging our WMF community liaison WhatamIdoing. Don't suppose the WMF would turn this off for now given the concerns above? --NeilN talk to me 00:23, 27 July 2016 (UTC)[reply]

  • Support activating the filter. RE Neil, yeah cos the WMF has a good track record of listening to the EN-WP community when its technical fuckups that are implemented without adequate testing cause problems... Only in death does duty end (talk) 00:29, 27 July 2016 (UTC)[reply]
    • It's a good chance to see if/how we've moved on from the VE/Flow/Superprotect era. --NeilN talk to me 00:46, 27 July 2016 (UTC)[reply]
  • 5000 is rather extreme, isn't it? An excessively high bar, whatever one thinks about Content Translation. — This, that and the other (talk) 01:06, 27 July 2016 (UTC)[reply]
  • Support for all the reasons discussed above. If the WMF were a user, they'd be blocked for disruptive editing if they made these sorts of translation-spam edits repeatedly. I'd rather see sysop-only for testing purposes over the arbitrary 5,000 edit limit, but I won't let that stand in the way of my support. Eventually, I think we should have a userright to allow access to this tool since it is potentially useful to start from a machine translation if you're genuinely translating the material. It's the difference between writing sentences from scratch and copy-editing; one is easier if you know what you're doing, even if you start with something fairly shitty. ~ Rob13Talk 01:23, 27 July 2016 (UTC)[reply]
    This, that and the other and BU Rob13 I think a proper permission system would be better than this - but 5000 should be a high enough bar to include sysops, plus users that certainly should know better. We can set it at anything, but lower values (as opposed to groups) will require more case evaluations for the filter. — xaosflux Talk 01:41, 27 July 2016 (UTC)[reply]
    Are many of these edits coming from users with more than 500 edits? If not, we could co-opt extended confirmed. In the end, though, a proper system at RFP is likely needed.
    So far no, and that may be a good enough threshold where "I'm new" stops being a good excuse for being reckless. — xaosflux Talk 02:11, 27 July 2016 (UTC)[reply]
    I'm not really sure where to make this comment, but I want to mention that the main driving force behind Content Translation was to allow small wikis to expand their coverage by translating articles from larger wikis. Although we haven't heard anything from the WMF here so far, I would imagine that they would be reasonably amenable to a request to limit use of the tool on this wiki, as we really aren't the core "target audience" for the tool, and it seems to be causing more trouble than anything else on this wiki. — This, that and the other (talk) 02:05, 27 July 2016 (UTC)[reply]
  • Support activating the filter; I would prefer it to be absolutely forbidden to use this tool. I've just made a start on cleaning up Dunja Hayali, which looked superficially good but had a word that was linked to the German equivalent of altar server translated as "measurement servant", reversed the work relationship of her father and her mother, consistently omitted possessive apostrophes, and failed to translate the bits in the citation templates so that it would be possible to see where they were published and when they were dated. It even for some reason converted a named reference into the same reference defined twice? This thing just hoodwinks people into thinking it will save them time, and the risk is high of even the most experienced translators missing an error it has introduced into the text. That's why we've always been against machine translations! Yngvadottir (talk) 01:31, 27 July 2016 (UTC)[reply]
Adding: with a break to devour a pizza, one edit conflict, and some additional edits to articles I came across while doing the task, it took me almost 4 hours to clean up that article. I work thoroughly but also faster than many translators I know. This is not a time-saver, and the option of consulting a machine translation for specific passages will remain available without this tool. This is one of a cluster of translations by another new user. I've left a note for them on their user talk page asking them to check the translations, but the WMF hasn't done them any favors leading them to think we want such poor work. And who's to say they couldn't have done better working from the original wiki-code; the assumptions behind this tool disrespect the new users as well as those of us over here who have believed the slogans about Wikipedia being almost finished and its being time for quality. Yngvadottir (talk) 05:15, 27 July 2016 (UTC)[reply]
  • Strong support for flipping it on as quickly as possible. The Foundation will likely be frowny that their new LiquidThreads/Flow/Superprotect that they put up without consulting us wasn't as well received as they thought, but if they think it's such a great thing then they can clean up the mountains of gibberish it seems to be spewing out. I wouldn't be opposed to turning it into a Userright along the lines of Rollbacker, but that takes developer time and first we need to focus on stopping the immediate harm before we have to resort to something as drastic as summary mass deletion. The WordsmithTalk to me 02:20, 27 July 2016 (UTC)[reply]
  • Support To stop this disruption to the encyclopedia I support this nuclear option while the details are being worked out. We will see how the WMF reacts but the tool definitely seems like a net negative to have on enwiki at this time. --Majora (talk) 02:26, 27 July 2016 (UTC)[reply]
The Wordsmith and Majora This is not really a "sky is falling" situation yet (there have been 0 hits to the filter since it was put in for logging, other than my own tests). — xaosflux Talk 02:41, 27 July 2016 (UTC)[reply]
I understand that. Never said it was falling. But the restriction should be put into effect until (and if) the WMF agrees to disable it on enwiki. Like any organization, the WMF isn't exactly known for fast decisions. Which is why I said this nuclear option should be done only until the details are worked out with the WMF (if they are worked out at all). --Majora (talk) 03:00, 27 July 2016 (UTC)[reply]
Thank you, with the hit rate being very low right now, we have time to collect community commentary here - should it rise we are ready to go. — xaosflux Talk 03:08, 27 July 2016 (UTC)[reply]
  • Strong support It's done enough damage already by creating tons of crap articles; not just bad translations, but many of them of questionable notability as well. OhNoitsJamie Talk 04:43, 27 July 2016 (UTC)[reply]
  • Support - The filter will be a net positive, seemingly unlike the tool itself.Godsy(TALKCONT) 05:15, 27 July 2016 (UTC)[reply]
  • Support as the closest thing we have to making this a permission for trusted users, which given the numerous problems with the tool, are the only ones who should have access. Would prefer to get rid of the new page criterion, since one problem with the translation tool is that it can wholesale replace existing English Wikipedia articles (e.g. [3]). ---- Patar knight - chat/contributions 06:43, 27 July 2016 (UTC)[reply]
  • Strong support - The current situation appears to be untenable. Translations which are rough are one thing - translations which introduce factual inaccuracies that are hard to check are another. Because the hit rate is low, this does not appear to be currently time sensitive, so I would encourage holding off until the WMF responds or the hit rate increases. Tazerdadog (talk) 06:54, 27 July 2016 (UTC)[reply]
  • I suppose I'm the proposer for this measure, but in case it wasn't obvious, I support temporarily disabling the tool throughout en.wiki, until we can reach community consensus and receive technical advice on a process that will enable us to restrict access to it to people we trust. I can see that people like Rosiestep are doing good work with it. However, other people are using it to fill our encyclopaedia with semi-comprehensible semi-sourced material that falls well below community standards. If the content translator tool was an editor, it would have been blocked. It's costing too much volunteer time in patrolling, filtering, and speedily deleting relative to its value on the English Wikipedia.—S Marshall T/C 07:24, 27 July 2016 (UTC)[reply]
  • Strong support - translation tool is a net negative. As an alternative to the 'more than 5000 edits', we could consider to enable it for editors who have a 'given right' (or a combination of the two - more than 5000 edits, OR having been granted a 'given right'). --Dirk Beetstra T C 08:14, 27 July 2016 (UTC)[reply]
  • Strong support - I created many Phabricator tasks about CX a year ago; almost none have been fixed since, and the tool keeps producing a lot of garbage, especially for inexperienced editors (who are attracted by the advertisement made on some wikis for an unstable tool...). I reported the many problems not long ago to the WMF liaison on frwiki to no avail. And the concerns about proper translations are not dealt with by the WMF team that develops this tool. --NicoV (Talk on frwiki) 11:53, 27 July 2016 (UTC)[reply]
    For those reading French, an example of the amount of work for other people (regular contributors, administrators...) when a new contributor insists in using CX and explains that the problems are not his fault but the tool only as it is in beta... fr:Discussion utilisateur:KAES D NYJRI (and there were other discussions on other pages also...). --NicoV (Talk on frwiki) 13:42, 27 July 2016 (UTC)[reply]
  • Support Machine translations are not adequate for an article and are of little or no use to someone who is capable of doing a good job of translation. En.wp already has a PAG against using machine translation for article text so that consensus carries through to this matter. Also, my preference is to keep the minimum edit requirement on the filter high or, if possible, have it prohibit below a given threshold but still warn above. (My support is not conditional on either of those though.) JbhTalk 12:37, 27 July 2016 (UTC)[reply]
  • Note I updated the logging filter for ALL edits using this tool, not only "page creations" - as the support above is trending towards using the tool at all, not just using it to be the first editor of the page. — xaosflux Talk 13:31, 27 July 2016 (UTC)[reply]
  • Support The risk of disaster is too great to allow unchecked use of this "tool". Blackmane (talk) 14:21, 27 July 2016 (UTC)[reply]
  • Support as the necessary short-term fix. I still feel that the Translator tool has potential, if its use can be properly controlled, but now that I understand better how it works, I see that it would be difficult to impose a user-right control. A user on another WP can pick up an article there and invoke the translator tool to send it here. That route would have to be blocked, with an explanation that translation into English could only be done from en-wp by users with the appropriate permission here. (This begins to sound rather like Brexit!) JohnCD (talk) 14:34, 27 July 2016 (UTC)[reply]
  • Support This tool wasn't intended for enwiki, but it doesn't surprise me that WMF staff are so incompetent that they let it be used there. Nothing changes. Black Kite (talk) 15:13, 27 July 2016 (UTC)[reply]
  • Support (reasoning above). With appreciation for Black Kite, Xoas, Yngv and others who got on this; had the same or similar attention been given by admins to that other WMF-inspired disaster that filled the project with garbage that regular editors couldn't keep up with (Wikipedia:Student assignments), we might have less of a problem of declining participation today. SandyGeorgia (Talk) 15:44, 27 July 2016 (UTC)[reply]
  • Comment because it will probably get lost below. In regards to the comments that this is some almighty number of articles that we can't keep up with; approximately 800-1000 articles are created per day on the English Wikipedia at the moment, with an average of 11 of those being through Content Translation. Sam Walton (talk) 15:49, 27 July 2016 (UTC)[reply]
  • Support, per a total lack of confidence that the WMF will deactivate it from their end. Regarding the 800–1000 non-translated articles per day, the comparison isn't germane; because most of the translations at least claim to be referenced articles on notable figures, as things stand New Page Patrollers can't send the things for G1 or BLPPROD as appropriate, as they do for normal gibberish articles and dubious biographies. ‑ Iridescent 16:03, 27 July 2016 (UTC)[reply]
  • Support strongly. I really didn't want to get into too much hyperbole, but allowing free-for-all use of this tool is one of the stupidest moves I've seen here in quite some time, with the results clearly bearing that out. Use of the tool in the way it's being used breaches policy here too, as machine translations are not allowed for obvious good reasons. (And thanks to those who picked up on this.) Boing! said Zebedee (talk) 16:09, 27 July 2016 (UTC)[reply]
    • This isn't my product, but it appears that, here at the English Wikipedia, you can only use this tool if you create an account, login, find the item in your preferences, and deliberately enable it. These practical restrictions are probably why this "free-for-all use" is resulting in only ~1% of page creations each day here. Realistically, we're probably getting more poor translations from unattributed copy-pasting of translations into plain wikitext than from this tool. Whatamidoing (WMF) (talk) 09:38, 28 July 2016 (UTC)[reply]
  • It seems kind of dishonest for me to write "support" here when I have no confidence that the WMF won't retaliate against whichever admin turns this on for daring to oppose one of their boondoggles, when I wouldn't be willing to be that admin myself. But yes, use of this "tool" needs to stop. —Cryptic 16:44, 27 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Technical note

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I activated the filter, which currently disallows the action and warns the user. However, it just gives a generic warning about possible unconstructive editing. We may want to design a dedicated warning explaining that the content translation tool is not to be used by editors with fewer than 5K edits.--Ymblanter (talk) 20:36, 27 July 2016 (UTC)[reply]

The message was updated to the custom message discussed above, it has room for improvement. — xaosflux Talk 20:50, 27 July 2016 (UTC)[reply]
The filter appears to be working, first warning the user that it is disabled, then disallowing the change. Tested as the 2 edits from my test account see log. — xaosflux Talk 20:56, 27 July 2016 (UTC)[reply]
Good work, well done Xaosflux. What does the landing page currently say? Black Kite (talk) 21:00, 27 July 2016 (UTC)[reply]
Black Kite Currently displays this warning first MediaWiki:Abusefilter-warning-cx, then the default "NOPE" one. — xaosflux Talk 22:22, 27 July 2016 (UTC)[reply]
  • Landing page? Would someone mind drafting up a Wikipedia:xxxxxxxxxxxxxxx page that describes that this is disabled, conditions for use, etc - that we can use as a landing page for any error messages? Would be good to cite any policies as well. — xaosflux Talk 20:45, 27 July 2016 (UTC)[reply]
Working on it now at WP:ContentTranslationTool. The community is free to join me there. Tazerdadog (talk) 21:13, 27 July 2016 (UTC)[reply]
  • Tazerdadog's draft guideline made me think, maybe we could try to have pages created with this tool default to Draftspace with an AfC-translated-submitted template?  · Salvidrim! ·  21:41, 27 July 2016 (UTC)[reply]
    • I like this idea. Sam Walton (talk) 22:15, 27 July 2016 (UTC)[reply]
    • As the initial proposer, I also like it. Is it technically practical? Tazerdadog (talk) 22:20, 27 July 2016 (UTC)[reply]
      • That would be a software change, doesn't sound "complex", but will need the language team developers to implement it. — xaosflux Talk 22:31, 27 July 2016 (UTC)[reply]
        • Could we hack it with a bot (or even a human) to tag and move the submissions to draftspace shortly after they are submitted? Or does the edit filter stop that? Tazerdadog (talk) 22:45, 27 July 2016 (UTC)[reply]
          • The edit filter is currently preventing the page from being written - if disabled: the pages are already using Tags - so it is possible a bot could move all these pages after they come in, but it will be messy. — xaosflux Talk 22:54, 27 July 2016 (UTC)[reply]
  • If it is not too complex, one could write in the warning-message a simple how-to to explain how to save the page in draft-space. Editors who then run into the filter can simply save it there. Excluding the filter on draft-space only is peanuts. --Dirk Beetstra T C 11:02, 28 July 2016 (UTC)[reply]
Beetstra Have you tried the translator? The current interface doesn't let you pick a namespace, and doesn't give you a block of wikitext to copy into a new window. If I missed something please let me know - perhaps you can write a quick how-to and we can incorporate it? — xaosflux Talk 11:45, 28 July 2016 (UTC)[reply]
@Xaosflux: - I was just considering to make the filter block anything but draft-namespace (and maybe personal userspace), and the warning and block message to tell the people to copy-and-paste it into draft workspace. You don't want this stuff to be lying around anywhere else. --Dirk Beetstra T C 12:10, 28 July 2016 (UTC)[reply]
@Beetstra: I think we are missing each other somewhere here :D The CXT does not allow you to go anywhere except for the (main) namespace, and the filter should only be catching edits by the tool ("technically" it could catch other edits, but very unlikely). Have you tried the tool to see what it presents the editor? It is available here: Special:ContentTranslation - it does not present you with the wikitext to copy and paste. — xaosflux Talk 14:01, 28 July 2016 (UTC)[reply]
Not true. If your title for the article includes a namespace, then the translation can be published to that namespace.
Note, too, that I started this translation at https://simple.wikipedia.org/wiki/Special:ContentTranslation but that it was published at the Spanish Wikipedia (and over-wrote the redirect there – I'd meant for it to end up at simple:Wikipedia:Sandbox). Whatamidoing (WMF) (talk) 15:06, 28 July 2016 (UTC)[reply]
Whatamidoing (WMF) Agree - I started a section at the end of this, to also state that we are not edit-filtering those (782 only impacts (main)). While it is "possible" to drop this into another namespace - someone using the tool would have to specifically know to type that in. If we want to keep a filter in place, we could add a direction to the warning message to tell the editor to put "Draft" or something else in there. — xaosflux Talk 15:25, 28 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Some data?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Above there have been claims of this tool creating a load of new work for new page patrollers and implications that almost every article created through this tool is bad. Call me skeptical but could we get some data on this? How many pages are made with this tool per day relative to the total number of new pages? How many of the articles created are below the average quality of a new article created without the tool? How many new editors make their first edits with the tool and go on to be productive editors?

It would also be good if someone could clarify how machine translation relates to this tool - some users above seem to think that ContentTranslation only auto translates articles, whereas it was my understanding that users are supposed to manually translate (I could be wrong, I know very little about this tool). Sam Walton (talk) 08:26, 27 July 2016 (UTC)[reply]

  • Here's the feed of recent changes; the last few days mostly contain decent articles because the chaff has been deleted or redirected. Going further down the list will give you an idea, though. Where the editor is experienced, the article is usually good. Where they are not, or do not have a good grasp of English, they are not. Hence the edit-limit being proposed above. Black Kite (talk) 10:04, 27 July 2016 (UTC)[reply]
  • Also, pinging people with more technical knowledge than me, would it be simple to produce a list of all the pages created using ContentTranslator? Black Kite (talk) 10:07, 27 July 2016 (UTC)[reply]
    Black Kite A database query for the tag would be a good way, for some immediate data - this is pages created in the last month (excluding pages that were deleted). Since the logging abuse filter was added, one page has been created. — xaosflux Talk 11:22, 27 July 2016 (UTC)[reply]
    @Xaosflux and Black Kite: https://quarry.wmflabs.org/query/11275 - that took more wrangling than I would have liked. I think you can download the data but there's 40 pages worth -- samtar talk or stalk 12:41, 27 July 2016 (UTC)[reply]
    Thank you Samtar - I moved to a page for easy viewing here: CX Translation report. — xaosflux Talk 13:13, 27 July 2016 (UTC)[reply]
    Over what period of time were these pages created? Sam Walton (talk) 14:41, 27 July 2016 (UTC)[reply]
    @Samwalton9: Would you like me to run another query including date? -- samtar talk or stalk 14:54, 27 July 2016 (UTC)[reply]
    No don't worry, doing some investigation of my own now. Sam Walton (talk) 14:59, 27 July 2016 (UTC)[reply]
    For what it's worth, here's the query -- samtar talk or stalk 15:01, 27 July 2016 (UTC)[reply]
    Although it would have been trivial for Sam to click on the oldest of the 3,603 entries and see that it was July 2015. Thanks Samtar regardless. Black Kite (talk) 15:06, 27 July 2016 (UTC)[reply]
    I didn't know if they were in date order. Sam Walton (talk) 15:12, 27 July 2016 (UTC)[reply]
    Can we get a total number of created pages for the past year Samtar? Sam Walton (talk) 15:18, 27 July 2016 (UTC)[reply]
@Samwalton9: https://quarry.wmflabs.org/query/11288 covers 01/01/2016 00:00:00 until 27/07/2016 00:00:00 order from most recent. There are 2586 articles -- samtar talk or stalk 15:29, 27 July 2016 (UTC)[reply]
Thanks, though I should have been clearer; I meant the total number of pages created (not just through ContentTranslation). I'm interested to know what fraction of articles were created with the tool, to find out if it really is swamping NPP. Sam Walton (talk) 15:32, 27 July 2016 (UTC)[reply]
Ah, no worries - https://quarry.wmflabs.org/query/11290 is trying its best (still running) but given its current execution time it's either going to be many thousands of articles or it'll just time out -- samtar talk or stalk 15:41, 27 July 2016 (UTC)[reply]
Don't worry - found the data elsewhere. Something like 800-1000 new articles per day in total; approximately 11 from ContentTranslation. Hardly swamping NPP then. Sam Walton (talk) 15:46, 27 July 2016 (UTC)[reply]
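For readers who want to check the arithmetic behind these figures, here is a minimal sketch in Python. It uses only numbers quoted in this thread (roughly 11 ContentTranslation pages per day, 800-1000 new articles per day, and 2586 pages from the query covering 1 January to 27 July 2016); the function names are made up for the example, and the results are only as reliable as those inputs.

```python
from datetime import date


def cx_share(cx_per_day: float, total_per_day: float) -> float:
    """Fraction of daily new articles that come from the translation tool."""
    return cx_per_day / total_per_day


def pages_per_day(page_count: int, start: date, end: date) -> float:
    """Average pages per day over the interval from start up to (not including) end."""
    return page_count / (end - start).days


if __name__ == "__main__":
    # Figures quoted in the thread: about 11 CX pages out of 800-1000 per day.
    low = cx_share(11, 1000)
    high = cx_share(11, 800)
    print(f"CX share of new articles: between {low:.2%} and {high:.2%}")

    # Cross-check against the query count of 2586 pages for 1 Jan - 27 Jul 2016.
    rate = pages_per_day(2586, date(2016, 1, 1), date(2016, 7, 27))
    print(f"Average from the query count: {rate:.1f} pages per day")
```

Both routes give a rate on the order of a dozen pages per day, a little over 1% of new articles. That says nothing about how much reviewer time each translated page takes.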
I don't believe anyone ever said it was "swamping" NPP, but the issue is that it's been running for a year creating sub-standard articles, which is why I asked for a full list of the articles produced. 20:44, 27 July 2016 (UTC)
It may be only the opening post that states the tool is "creating masses of work for new page patrollers, editors and admins", but others shared the sentiment that this tool was creating a lot of additional work, when the data shows that only around 1% of pages are created using it. The other concern, that the pages are all or primarily sub-standard appears to still be poorly evidenced beyond a few choice examples of bad pages. Has anyone verified that the tool is resulting in a substantially greater fraction of poor pages than average? I remain skeptical. Sam Walton (talk) 21:55, 27 July 2016 (UTC)[reply]
Well, I can't argue with your opinion, but I would suggest you work your way through the list posted above of the 3,603 pages (less the ones created by experienced editors) and decide for yourself whether they are a net positive to an English language encyclopedia, and then decide whether you think the tool that enables any editor from any wiki to create such pages (and even to overwrite existing pages with their work) is a good idea. Black Kite (talk) 22:04, 27 July 2016 (UTC)[reply]
Not sure where to add this in, but Samwalton, your arguments about the number of articles translated relative to the overall are unconvincing because ... no matter the forum (DYK, GA, FAC, new article patrol, etc), I often am reminded how few editors we have here who are willing to do or have the skills to do translation work. It requires sufficient knowledge of the original language and the target language, and the ability to read sources in the original language. It is much more time consuming work than checking a plain ole plain ole English-English article. If there are only a couple of us doing the Spanish articles, for example, your comparisons to overall project numbers are apples and oranges. Best, SandyGeorgia (Talk) 22:48, 27 July 2016 (UTC)[reply]

Sam, in reply to your question about machine translation: CX is supposed to provide no machine translation here. It did temporarily, for a very small list of languages, as a result of a mistake in a config file. I believe that about 600 articles were affected by this (almost all of which were from the Spanish Wikipedia to English).

Within the CX tool, "machine translation" – if you enable it; it's optional – means that it provides, separately for each paragraph, a fully editable copy of the machine translation in the column where you would otherwise manually type your translation, complete with sources and links. Depending upon the language pair, you may have multiple options for the translation service (none of which are Google Translate).

If you don't change the supplied text for more than a couple of paragraphs, then it displays a large warning and a link to Wikipedia:Translation#How to translate. Editors can choose to disregard the warning, just like editors can (and do) copy machine translations out of Google Translate into plain wikitext pages. In some circumstances (e.g., lists), disregarding the warning may even be appropriate.

If you want to know more about how it works, then give it a try. You just need to enable it in your preferences (under "Beta"). Nothing gets published automatically, so it's safe to play around in the tool. Whatamidoing (WMF) (talk) 09:38, 28 July 2016 (UTC)[reply]

The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Suggestion

The following discussion is an archived record of a request for comment. Please do not modify it. No further edits should be made to this discussion. A summary of the conclusions reached follows.
The purpose of this discussion was to consider the best way to codify the current practice of summarily speedily deleting the thousands of substandard pages produced semi-automatically by the Content Translation Tool (CXT). The initiator of the RfC, Tazerdadog, included a straw poll section where users could object to amending the criteria for speedy deletion in general. The community seems to have ignored that and jumped straight into the four proposed methods of implementation. Over six days have passed, and multiple participants have requested that this be closed. AN is a central community noticeboard, and notification was posted at WT:CSD on the day the RfC launched, and at WP:CENT since 1 August. Since consensus is now clear, there is no need to keep the discussion open.

The community has agreed to create a new series of temporary criteria for speedy deletion labelled "exceptional circumstances" for situations which overwhelm the normal deletion process.

There is consensus to implement two criteria under this new series:

  • X1: the Neelix redirects
  • X2: substandard pages created by the Content Translation Tool (CXT)

These criteria are labelled "temporary" because they would no longer be needed once their specific situation is resolved. For example, X2 would expire when all 3,603 CXT articles have been checked (or, I suppose, cleared to a point where normal processes would no longer be overwhelmed).

The community rejects the proposals to temporarily expand the scopes of G1 and G6 to include CXT pages on the basis that these deletions are too unrelated to those existing criteria; many editors warned against inappropriately applying these as "catch-all" criteria. There was some support for expanding A2 instead of creating X2, but that was a minority view. Additionally, there was some opposition to including the Neelix redirects as "X1" on the basis that it would make deletion histories confusing and that deleting redirects is closer to a technical deletion (G6); however, this is also a minority view.

There is a consensus, however, to approach these new temporary criteria with a degree of caution. Note that they do not authorize administrators to delete any page created by CXT. Administrators must apply judgment and speedily delete only CXT articles that would obviously require more effort to fix than to start from scratch. When in doubt, consider tagging it, moving it to draft space, or nominating it to PROD or AfD.

As with any speedy deletion criterion, administrators must take care to check that all versions of the page in the edit history also fall under the criteria for speedy deletion—in this case, ensure there was no preexisting content prior to the Content Translation Tool—before deleting. Respectfully, Mz7 (talk) 02:43, 3 August 2016 (UTC)[reply]

Until the above is sorted out, we're still left with the issue of thousands (or tens of thousands) of substandard articles which would require tens of thousands of editor-hours to make them coherent, much less in line with Wikipedia policy. The bulk of them would still likely end up clogging up the AFD, PROD and BLPPROD processes. Therefore, for the duration of the cleanup, I request that the community approve a temporary extension to CSD G6 (Housekeeping) to include substandard articles created using the Content Translation tool. A similar temporary extension was approved at ANI in November of last year to clean up questionable redirects created by Neelix, with success. The WordsmithTalk to me 14:23, 27 July 2016 (UTC)[reply]

Just for clarity, there are 3603 articles (some of which have been deleted) -- samtar talk or stalk 14:56, 27 July 2016 (UTC)[reply]
I suggest that admins are allowed to delete anything that would require large amounts of work to fix into an acceptable article, as G6 (or R3 in the case of redirects). This would include all/most of the articles created by the unfortunate User:Cadejoblanco. If such articles are really notable, they can be reinstated by reliable editors. Black Kite (talk) 15:09, 27 July 2016 (UTC)[reply]
<wonk>If we go this route, can we please not call it G6? G6's are supposed to have absolutely zero loss of content and be absolutely uncontestable by anyone editing with even a modicum of good faith; this is neither. G6's recent use as a dumping ground for "anything we don't like, but we don't think can get consensus for a new speedy criterion for" is a terrible practice, and further encourages admins to use it as "anything I don't like, but can't shoehorn into an actual criterion". If we think we should speedy these, the way to do it is to remove G1's (patent nonsense) restriction against material machine-translated by the Content Translation tool, or perhaps machine-translated material in general, for however long it takes to resolve this in a way that won't get the WMF to go on another superprotect-style rampage. —Cryptic 16:12, 27 July 2016 (UTC)[reply]
  • Given the (minuscule) amount of effort it takes to create one of these and the scale of the problem, speedy deletion is the only appropriate and proportional response. I agree that we should enact a temporary speedy deletion criterion until the backlog is gone, and I agree with Cryptic that it should not be an extension of G6 but a new, unique speedy deletion criterion for the circumstances. Let's rename the Neelix provision "X1" (exceptional circumstances #1), and enact speedy deletion criterion "X2: Poor quality article created by the content translation tool."—S Marshall T/C 16:57, 27 July 2016 (UTC)[reply]
    I agree that G6 is not the ideal place for this, but creating a new CSD, and especially an entirely new category of CSD, historically has required at least one RfC, widely published for 30 days, and afterwards further discussion on how to implement that and getting it into Huggle, Twinkle etc. While creating a separate category would be a better long-term solution, doing so would likely take at least 2 months to implement. A temporary extension to G6 has precedent for being approved by consensus on a Noticeboard within just a few days, so that is why I suggested that. In any case, at my next available opportunity I intend to begin summary deletion under WP:IAR, since the established deletion policy was not created to handle circumstances like these. I would encourage other administrators to go through the list and do the same. The WordsmithTalk to me 17:05, 27 July 2016 (UTC)[reply]
    I very much understand that. My position is that once we've turned the tool off we'll have time to progress the various RfCs needed. I doubt if this is the last time we'll need an "exceptional circumstances" criterion, and I expect the problem will take way more than two months for the community to sort out. Yngvadottir took 4 hours to clean up one article, and she's quick; I seriously doubt if I could match her speed.—S Marshall T/C 17:16, 27 July 2016 (UTC)[reply]
    Can we just use G6 (or as Cryptic suggests G1) please? Deleting admins can always use the note section to explain. We've got a lot of stuff that needs to be deleted and can't wait for a month of an RfC to do so. Black Kite (talk) 20:46, 27 July 2016 (UTC)[reply]
    30 days is the default length of an RFC, there's no reason we can't make a 1 week or 72 hour RFC. Tazerdadog (talk) 22:23, 27 July 2016 (UTC)[reply]
    • There's no reason we need to discuss longer solely because of what letter-number code we use. The important decision is whether to delete, not bikeshedding about what to put in the deletion summary. (But I'll bikeshed a bit more anyway - there's no need for a letter-number code at all. Linking to WP:CXT would not just be enough, but superior.) —Cryptic 22:48, 27 July 2016 (UTC)[reply]
  • Why can't we use prod? There is no great hurry to remove the rubbish, and it gives a chance for improvement if objectors want to assist. Graeme Bartlett (talk) 10:54, 29 July 2016 (UTC)[reply]

Straw poll

Should the criteria for speedy deletion be amended to allow the deletion of pages created using the page creation tool without a full deletion discussion if the administrator believed the page would not survive a XfD discussion under the snowball clause? (Note: This is intended to be an abbreviated discussion, lasting about 3 days before a close for implementation.) Tazerdadog (talk) 23:21, 27 July 2016 (UTC)[reply]

Support expansion. The ease of deleting these articles must be proportional to the ease of creating them, and using AfD or PROD would overwhelm these processes. Tazerdadog (talk) 23:21, 27 July 2016 (UTC)[reply]

Method of implementation

Assuming that the straw poll section above supports including these pages in WP:CSD, how should it be implemented?

Reword G1

The section at G1 would be temporarily reworded to remove the exclusion of poorly translated material, and to explicitly permit pages created with the Content Translation Tool to be deleted if the deleting admin believes the page would not survive an XfD under the snowball clause.

  • Support, and I would go further and say that it doesn't need to be temporary. Poorly translated material is nonsense, there's no need to have this exception in place. I agree with what was said above that no article is better than poorly translated article. -- Tavix (talk) 03:38, 28 July 2016 (UTC)[reply]
  • Support, a valid solution--Ymblanter (talk) 06:00, 28 July 2016 (UTC)[reply]
  • Second preference if the community is against X2. --Redrose64 (talk) 07:47, 28 July 2016 (UTC)[reply]
  • Oppose Expanding G1 in such a significant way should require a fuller RFC. BethNaught (talk) 09:13, 28 July 2016 (UTC)[reply]
  • Strong oppose G1 is overused in any case, by beginners at NPP who think it should be used for short but meaningful articles. It should not be used when the meaning can be determined. DGG ( talk ) 00:16, 30 July 2016 (UTC)[reply]
  • Oppose Articles that are simply machine translated should be simply deleted. The ones created in good faith should not. We should not delete something just because it was created with CX. Doc James (talk · contribs · email) 12:43, 30 July 2016 (UTC)[reply]
  • Oppose Patent nonsense should not be redefined to mean "anything an admin guesses would get a bunch of delete votes". Wnt (talk) 13:11, 2 August 2016 (UTC)[reply]
Expand G6

The G6 criterion would have another temporary extension added, very similar to the Neelix redirects.

  • Strong oppose G6 is already overloaded and badly misused, this would make those problems worse not better. Thryduulf (talk) 01:41, 28 July 2016 (UTC)[reply]
  • Oppose, this would not be a technical deletion.--Ymblanter (talk) 06:02, 28 July 2016 (UTC)[reply]
  • Per Runa's comment, the problematic articles being discussed here were indeed the result of a technical error (a MediaWiki misconfiguration until gerrit:301065, 2016-07-26, 7 UTC). Mass-deleting the articles created while the misconfiguration was in place would be a fix for a technical error and would not set a precedent for normal times. (Also, editors might find out that outside that problematic timespan nothing is on fire; a mass deletion once is a measure of limited disruption, compared to a permanent abuse filter or other interface impediment.) Nemo 06:34, 28 July 2016 (UTC)[reply]
  • Oppose Already misused as a catch-all "none of the others fit" criterion. --Redrose64 (talk) 07:47, 28 July 2016 (UTC)[reply]
  • Oppose per above. JohnCD (talk) 08:32, 28 July 2016 (UTC)[reply]
  • Oppose Neelix was a very different case, as these articles do have actual content (even if it is bad) and so I think "technical deletion" is inappropriate. BethNaught (talk) 09:13, 28 July 2016 (UTC)[reply]
  • Oppose - G6 shouldn't be overused. עוד מישהו Od Mishehu 09:50, 2 August 2016 (UTC)[reply]
  • Oppose. Editors chose to use this tool and not improve the output, so it is not a technical error but an error in editor judgment. Wnt (talk) 13:12, 2 August 2016 (UTC)[reply]
Create X2

An "exceptional circumstances" series of temporary CSD criteria would be created for situations that would overwhelm normal processes. Neelix redirects would become X1, while pages created with the Content Translation Tool would become X2.

  • Support, as the cleanest solution. Tazerdadog (talk) 23:42, 27 July 2016 (UTC)[reply]
  • Support iff there is consensus to speedy delete these. I haven't decided whether I support that yet or not. Thryduulf (talk) 01:43, 28 July 2016 (UTC)[reply]
  • Support. I am not sure why this is a temporary criterion - we can as well make it permanent, saying "Machine translated text which requires more effort to clean it up than to create a good one from scratch", but fine with me anyway.--Ymblanter (talk) 06:00, 28 July 2016 (UTC)[reply]
  • Support as my first preference. It keeps them all together, doesn't mix them with the other uses of an existing criterion. --Redrose64 (talk) 07:47, 28 July 2016 (UTC)[reply]
  • Support X2 as a temporary measure. It should be temporary because it is (rightly) difficult and requires long discussion to establish a new permanent CSD, but there is probably enough participation here to establish consensus for a temporary one. Having a specific tag, as opposed to using G6, would enable the deletions to be undone if consensus on more mature consideration decided they were a mistake. JohnCD (talk) 08:31, 28 July 2016 (UTC)[reply]
  • Support with caveats PageTranslationTool should be X1. We should not rename Neelix to X1 because many G6 Neelix deletions have been carried out and it would make the history more confusing, and also because deleting a redirect is very different, and much closer to a technicality, to deleting a (bad) article with content. BethNaught (talk) 09:13, 28 July 2016 (UTC)[reply]
  • Partial support. The idea is good, but before we enact any temporary criterion, we need to define its periods of activity: when will it cease to be official? Perhaps we should adopt Ymblanter's suggestion of making it permanent (I have no opinion either way on that), but saying "It's temporary" without providing further guidance is foolish. Idea is good; we need to get rid of gibberish one way or another, and unless someone's willing to spend hours and hours improving articles about folks who were in youth measurement servant, those articles need to be trashed on WP:TNT grounds. Nyttend (talk) 10:43, 28 July 2016 (UTC)[reply]
  • Support This is a clean solution and is clearly needed. TNT is applicable here. --Lemongirl942 (talk) 11:59, 28 July 2016 (UTC)[reply]
  • Support All in all, I think this is the best solution. We have to be careful, though, that we don't overuse it. Katietalk 15:17, 28 July 2016 (UTC)[reply]
  • Support This is the cleanest solution. It gives flexibility when this type of problem arises while maintaining the principle that CSD criteria are clear and well delineated. JbhTalk 15:40, 28 July 2016 (UTC)[reply]
  • Oppose Neelix redirects becoming X1 per BethNaught. Neutral on X2. Godsy(TALKCONT) 17:36, 28 July 2016 (UTC)[reply]
+1 -- Tavix (talk) 17:39, 28 July 2016 (UTC)[reply]
  • Support as proposer.—S Marshall T/C 18:59, 28 July 2016 (UTC)[reply]
  • Support with caveats per Nyttend. I would include the constraints that any approved use of an "exceptional circumstances" criterion needs to have a clearly stated condition for when to finish its use, either as an expiration date or a well-defined backlog that can be depleted. Diego (talk) 08:38, 29 July 2016 (UTC)[reply]
  • Question Would "Content Translation Tool" be the only justification necessary for the deletion of any article without regard to whether the article is actually acceptable, or does the page need to be A) Created using the Content Translation Tool, AND B) the administrator believes the page would not survive an XfD discussion under WP:SNOW? Tony Tan · talk 11:54, 29 July 2016 (UTC)[reply]
    Try reading the actual question, perhaps? The proposal is whether to allow the deletion of pages created using the page creation tool without a full deletion discussion if the administrator believed the page would not survive a XfD discussion under the snowball clause?, which is fairly unambiguous; we're talking about those pages which no reasonable editor would support keeping. If you look at the list, you'll see admins are already unilaterally deleting the worst of the crap regardless, so this discussion is as much about "making it official" as anything else. ‑ Iridescent 12:02, 29 July 2016 (UTC)[reply]
    @Iridescent: I am sorry if my question was a little repetitive. Yes, I did read the "actual question" above, and that is why my wording is so similar. I just wanted to be sure that this proposal is actually the same one, because if you scroll up a bit you can see that the formatting is at least a little confusing. It seems like what we are voting on here is the way to implement it if the straw poll section above supports including these pages in WP:CSD, but only one user has so far voted in that straw poll so I am not sure what that means for the overall question above. Hence, I wanted to make sure that everyone is on the same page, because I was confused. Tony Tan · talk 17:40, 29 July 2016 (UTC)[reply]
    When I suggested it, I meant for it to apply to articles like this or this but not to articles like this. The sniff-test I had in mind was, is it easier to develop the translation into a proper article or to nuke it and start from scratch?—S Marshall T/C 19:17, 29 July 2016 (UTC)[reply]
    @Tony Tan: This is why multi-question RfCs are not such a good idea, see User talk:Redrose64#RFC !vote. --Redrose64 (talk) 20:48, 29 July 2016 (UTC)[reply]
  • Support, after clarification by S Marshall. However, as Thryduulf mentioned below, there should be a requirement that the poorly translated content be the only content in the page revision history - i.e. there is no non-speedy deletable version in the page history. Tony Tan · talk 19:27, 29 July 2016 (UTC)[reply]
  • Wait there a sec. I understand the noble impulse that makes us want to put strict limits on the use of this deletion criterion, but we don't have a lot of sysops with dual fluency and the ones who do often have other important tasks. It isn't fair to ask a sysop to make a judgment about an article where all the sources are in a language they can't read. The AfD process won't work for exactly the same reason. This can only work with trust.

    The deleting sysop can only ask, are they prepared to believe that the person who tagged X2 knows what they're talking about?—S Marshall T/C 19:50, 29 July 2016 (UTC)[reply]

  • @S Marshall: All the restriction adds is a requirement to check the page history to make sure that the CX tool did not overwrite a pre-existing page. If all versions of a page are poorly translated garbage then speedy delete, if there is a version that is anything other than poorly translated garbage then revert to that and don't speedy delete (unless of course that version is speedy deleteable for some other reason). Thryduulf (talk) 11:07, 30 July 2016 (UTC)[reply]
  • Agreed; I trust this will be clear from the wording.—S Marshall T/C 13:55, 30 July 2016 (UTC)[reply]
  • Strongly oppose for both. Neelix redirects are sometimes useful, and many of them cannot be deleted on sight. Of those that go to RfD, some do get kept. As for translation, oppose unless the time use is strictly limited to this particular batch. As a general solution, it will lose us too much salvageable work. DGG ( talk ) 00:19, 30 July 2016 (UTC)[reply]
  • Agreed that the translation speedy deletion criterion should be limited to the 3,603 articles listed at User:Xaosflux/Sandbox16.—S Marshall T/C 13:55, 30 July 2016 (UTC)[reply]
  • Oppose deleting everything created with CX. The content should be deleted because it is bad not because it was made with a specific tool. Doc James (talk · contribs · email) 12:49, 30 July 2016 (UTC)[reply]
  • In practice each of the 3,603 articles is being reviewed. Someone with dual fluency will make the decision about whether it's bad enough to tag.—S Marshall T/C 13:55, 30 July 2016 (UTC)[reply]
  • Support, although I prefer an expanded A2. - HyperGaruda (talk) 14:08, 31 July 2016 (UTC)[reply]
  • Support a good reason for speedy deletion and this will make future emergent cleanups easier too. --Tom (LT) (talk) 06:56, 2 August 2016 (UTC)[reply]
  • Support (and disable the tool completely until it is a lot better, if ever). Fram (talk) 08:30, 2 August 2016 (UTC)[reply]
  • Start slow. If you're making temporary policies, don't rush to delete everything. Just delete the auto-translated articles that are unsourced first, then reassess and have the people who have looked at them try to designate a next worst category. It should be clear that not *all* are going to be badly written let alone worthy of deletion, and it would be best to approach that dividing line carefully. Also, consider batch moves to Draft: rather than deletion for those that are not altogether worthless. Wnt (talk) 13:21, 2 August 2016 (UTC)[reply]
Expand A2

A2 seems most appropriate. Currently this is "Foreign language articles that exist on another Wikimedia project. This applies to articles having essentially the same content as an article on another Wikimedia project". A mechanical translation is arguably much the same content and it's a closely related issue. Andrew D. (talk) 10:23, 28 July 2016 (UTC)[reply]

  • Support extended A2 (or X2, but I prefer A2). CXT translations are by definition based on foreign language articles that exist on other Wikimedia projects. I proposed a BLP-type of PROD in the past, but considering the proposals, speedies seem to be back on the menu, so yes please. Wow, I did not realise that my ANI case for Cadejoblanco would cause such a discussion with its own subpage! Feeling a bit proud :) - HyperGaruda (talk) 14:04, 31 July 2016 (UTC)[reply]
  • Support - this looks like the best solution. If the page is nothing more than the output of WMF software on a text from another Wikipedia, then the logic behind A2 fits fairly well here. עוד מישהו Od Mishehu 09:49, 2 August 2016 (UTC)[reply]
  • Oppose. These are not foreign language articles and have the obvious intent of being translations to English. This doesn't fit the purpose and might encourage deletions of a lot of bystander content, like articles with a big quotation of a foreign language source such as a poem. Wnt (talk) 13:15, 2 August 2016 (UTC)[reply]
Well, they are not English either. Trust me, it is quite easy to tell the difference between an unedited machine translation (for which this CSD is meant) and a human translation by a non-fluent person. It's not like we are losing information or compromising hours of work, because most of these "translations" are nothing more than other-language wikipages that have gone through a process of ctrl+c, ctrl+v and "ok" clicks. - --HyperGaruda (talk) 13:55, 2 August 2016 (UTC)[reply]
Comments not specific to a single criterion
  • Whichever criterion is chosen, I think it is important that it explicitly includes a requirement that the poorly translated content be the only content in the page revision history - i.e. there is no non-speedy deletable version in the page history. Without this I think we run the risk of deleting pre-existing content that has been overwritten by this tool, which would be an abuse of the speedy deletion process. Anyone found not making sure of this should be trouted at minimum, with desysoppings for those who repeatedly or egregiously do not do this. Thryduulf (talk) 17:04, 28 July 2016 (UTC)[reply]
    Support including this important requirement. Tony Tan · talk 19:30, 29 July 2016 (UTC)[reply]
    Support. This was an oversight in the wording of the initial proposal, and should of course be implemented. Tazerdadog (talk) 21:47, 29 July 2016 (UTC)[reply]
    Support Good catch. We need to remember there will be vandals looking in who just want to look for the easiest way to get us to delete an article, and overwriting it with dreck is such a way. Wnt (talk) 13:17, 2 August 2016 (UTC)[reply]
  • We may have issues just about the standards of other language Wikipedias. For example, at one point as an exercise I translated the Spanish version (manually) to Draft:José Gregorio Vielma Mora; but I couldn't move it to main space. The Spanish version just didn't have nearly as much sourcing as a BLP needs, and finding sources for all the statements is difficult for me because I don't speak Spanish well enough. It may be that we can get best results by encouraging the use of the tool from specific foreign language Wikis with better sourcing at first, or by enlisting projects on the source wikis to clean up articles prior to translation. Wnt (talk) 13:26, 2 August 2016 (UTC)[reply]

Closing

The amount of time specified in the RFC has passed, and comments are starting to slow. Could an uninvolved experienced editor please assess the consensus and close the discussion? Thank you. Tazerdadog (talk) 21:06, 31 July 2016 (UTC)[reply]

@Tazerdadog: No it hasn't. The RfC began at 14:23, 27 July 2016, and adding 30 days to that gives 14:23, 26 August 2016 - still a long way off. --Redrose64 (talk) 21:56, 31 July 2016 (UTC)[reply]
(edit conflict) @Tazerdadog: I am uninvolved in this discussion. I understand that the RfC was only intended to last 3 days from the start, but are you sure that 3 days is enough time to produce a valid consensus that modifies one of our core deletion policies? I haven't thoroughly reviewed the discussion, but it seems to me that this would actually set a very significant precedent: creating specifically defined temporary speedy deletion criteria for "exceptional circumstances" on a case-by-case basis. The Neelix discussion was designed to be a WP:IAR extension to an already-existing speedy deletion criterion (and that discussion was open for more than six days), but this discussion appears to propose completely new speedy deletion criteria in a way that we've never done before. Wouldn't more time and more input (perhaps a notice on WP:CENT) be advisable? Mz7 (talk) 22:00, 31 July 2016 (UTC)[reply]
@Mz7: The goal was to codify the current IAR deletions so that they could happen in an orderly way. The consensus seems strong to me, but I could be mistaken. Ideally, I'd ask for a preliminary implementation of the current consensus without closing the RFC, and allowing it to continue for the full 30 days with whatever advertising is deemed appropriate. If you feel it's too early, and the discussion needs to be run longer/better advertised, say as much and I will happily respect your decision. Ultimately, I do not own the RFC, and this is my request as an ordinary involved editor. Tazerdadog (talk) 00:43, 1 August 2016 (UTC)[reply]
@Redrose64: the default length of an RFC is 30 days, but other lengths may be used. 3 days was advertised from the start. Perhaps my judgement that 3 days would be enough to produce a sufficiently strong consensus was in error, but that is a different question. Tazerdadog (talk) 00:43, 1 August 2016 (UTC)[reply]
No, your comment "This is intended to be an abbreviated discussion, lasting about 3 days before a close for implementation" is in the "Straw poll" level 4 subsection; the RfC is in the enclosing "Suggestion" level 3 subsection, which also encloses this "Closing" level 4 subsection. So although the straw poll lasts 3 days, the RfC lasts 30. --Redrose64 (talk) 08:38, 1 August 2016 (UTC)[reply]
I feel like it was reasonably clear that if you read the discussion in context, it was supposed to last 3 days. Nevertheless, if the community feels a full 30 is better, then I won't strongly object. Tazerdadog (talk) 19:01, 1 August 2016 (UTC)[reply]
As I understand it, there's no particular rush to get this passed at the moment. RfCs don't absolutely have to last a full 30 days—according to WP:RFC#Ending RfCs, it depends on how much interest there is in the issue and if editors are still commenting. With regards to this RfC, because its outcome seems to be rather significant, I definitely want to make sure there's enough time for broader input. Three days seems too short for that. Mz7 (talk) 03:32, 2 August 2016 (UTC)[reply]
  • I've added a notice to WP:CENT for this discussion, since a change to the speedy deletion criteria seems to be of community-wide interest. Mz7 (talk) 03:44, 2 August 2016 (UTC)[reply]
    Thanks for adding the discussion to CENT. I'll bring this up in a week or so after the community has had time to discuss. Tazerdadog (talk) 06:42, 2 August 2016 (UTC)[reply]
  • No, stop. The consensus is totally clear and to my certain knowledge sysops are already performing summary speedy deletions under this provision. Please stop wasting time on needless fussing about procedure and close this.—S Marshall T/C 08:08, 2 August 2016 (UTC)[reply]
@S Marshall: It's now been almost a week, so at this point, I think you are right. I just have one question (and a few editors brought this up before): as these proposed speedy deletion criteria are "temporary", when and how should they be withdrawn? Should they be set to expire after a specific time frame or should they remain indefinitely until there is a consensus that they are no longer needed? Mz7 (talk) 23:21, 2 August 2016 (UTC)[reply]
I'd think borrowing the idea from the Neelix redirects that the speedy deletion criterion should expire when the backlog has been cleared or the deletions have slowed to the point where they would not overwhelm the other deletion processes would be best. The speedy criteria can be withdrawn when either 1) the 3603 pages listed/50,000 redirects listed have been deleted or checked, or 2) The rate of deletions falls to the point where XfD or PROD can handle it. Tazerdadog (talk) 01:14, 3 August 2016 (UTC)[reply]
Indeed. These "X" speedies are distinguished from others because they're limited in scope to a specific situation -- Neelix redirects or the 3,603 articles subject to X2. They would last until all the Neelix redirects have been checked or until all 3,603 articles have been examined. It's the limited scope that enables them to be enacted after an abbreviated RfC. The way to make them expire quicker would be to muck in and check some!—S Marshall T/C 02:13, 3 August 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Reply to Runa: bringing out of the big thread

Content Translation is used to create thousands of good articles and we understand that you had to delete many articles by a user who uploaded unedited machine translations. Machine translations were not supposed to be enabled in the English Wikipedia, but it got enabled in error during a configuration change, and now it's disabled. I apologize for this confusion. We are actively monitoring the numbers of articles being created and deleted, and we are listening to community feedback. In addition, we would also like to reiterate that there is a specific warning that is shown to editors if unedited machine translated content beyond a threshold is retained. This may not always be harmful, especially in cases where the quality of machine translated content is very high (e.g. Spanish-Catalan, English-Russian and many others) and we respect the editor’s discretion about retaining, discarding or modifying it. The modified content is also available for use by the developers of the machine translation system to improve the quality of the system. Thanks.--Runa Bhattacharjee (WMF) (talk) 03:30, 28 July 2016 (UTC)

@Runab WMF: The edit filter is still catching incoming translations. I presume (and can you clarify) that the tool is still enabled, but it does not provide machine translations into English any more? But that is only part of the problem. Other users have complained about it producing bad formatting and messy markup, and poor-quality content, among other concerns. Would you care to respond to suggestions in this thread about fully disabling the tool on English Wikipedia, or implementing a permissions system? BethNaught (talk) 09:10, 28 July 2016 (UTC)[reply]

You are correct: the tool can be enabled in your preferences, but there is no ability to use machine translation options within it. Whatamidoing (WMF) (talk) 09:41, 28 July 2016 (UTC)[reply]
I don't think that's going to be good enough. The community has other concerns beyond machine translation. Firstly the quality of the articles coming in is poor even if we made the translations perfect. People are writing their translations over better existing articles, making articles which already exist at a different title, or making articles that should not exist due to notability, verifiability, or neutral point of view. Secondly, there is no guarantee that people won't simply use Google Translate to paste a translation in. Thirdly, the markup syntax produced by the tool is often abysmal, with span tags all over the place, and references breaking. We are asking for the tool to either be disabled completely for translations into English, or preferably, for a permissions system to be put into place. A third possibility that has been less explored is to have the output of the tool be submitted as AFC drafts in draft space. Tazerdadog (talk) 09:52, 28 July 2016 (UTC)[reply]
Hello BethNaught, that is correct. Machine translation to English is disabled on the Content Translation tool, which is a beta feature. During the past months we have also been working on several issues related to the markup problems that have been reported in the past and mentioned in this discussion, and indeed, some of them were already fixed. Did you have any specific ones in mind which we can perhaps check?
As I mentioned earlier, the tool is already limited in scope as a beta feature and has been developed to conform with editing policies on the wiki where it is active and be as close as possible to the usual ways a new article is created on the wiki. We do our best to take existing AbuseFilters, editing guidelines and user permissions into account before an article can be published via the tool. We also display appropriate warnings to inform users of unedited machine translation, overwriting existing articles and other errors. Editors can also choose to publish the translation under the draft namespace or their user namespace and then move them to the main namespace.
Content Translation has been active on all Wikipedias for many months now and as more users are exploring the tool we are also learning about use cases that need attention from the development team. We are monitoring the discussion here and elsewhere to fully understand the impact of particular incidents. Meanwhile we have an FAQ page which has more details. Also, I wanted to point you to the Content Translation statistics page, which I realized was not mentioned during the discussion here and may be helpful for some of the data that is being collected. We’ll also be happy to help you collect more relevant data that that page doesn’t display. Thanks.--Runa Bhattacharjee (WMF) (talk) 13:07, 28 July 2016 (UTC)[reply]
@Runab WMF: I gather from your answer that you are not interested in disabling ContentTranslation or implementing a permissions system for it, and that therefore English Wikipedia will have to take such actions as it sees fit to control its use. Thank you for clarifying your position, even if you didn't directly respond "to suggestions in this thread about fully disabling the tool on English Wikipedia, or implementing a permissions system" like I asked you to. BethNaught (talk) 13:55, 28 July 2016 (UTC)[reply]
Runab WMF - In regards to "Editors can also choose to publish the translation under the draft namespace or their user namespace" - I've been testing with this tool, and the only option available is "Publish translation" - I'm not seeing any option to direct the edit to any other location. How would a translator know the namespace to type in there? — xaosflux Talk 14:11, 28 July 2016 (UTC)[reply]
I created a new discussion section at the end of this discussion - for options to put in other namespaces. Our existing 5000 edit block is only applying to "Articles", so if a translator puts in "Draft:" or some other namespace by hand they can still create the page. — xaosflux Talk 14:19, 28 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Lower/Change the 5000-Edit Bar?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I understand that this tool has been causing a lot of frustration for everyone, and that many people using it are committing violations against enwiki policy at the expense of other editors. However, looking at the list of recent changes, we can see that there are also perfectly acceptable edits that are constructive. It seems like in the hands of experienced editors from this wiki, such as @Rosiestep:, the content translation tool can be used in legitimate ways that are beneficial to the English Wikipedia.

I love the translation tool... when it works. It has not been working for Spanish→English translation in the last couple of days, including right now. Can someone check out what's up with that? Also I wish it supported more languages, e.g. all Romance languages → English. --Rosiestep (talk) 13:47, 28 July 2016 (UTC)[reply]

When I tried this tool a few days ago, it never offered to automatically translate text. It simply presented a useful interface where I can look at both sets of text side-by-side and do my own translation, while copying over links and formatting. However, now that an edit count of 5000 is required to use the tool, I can no longer create articles with the help of this tool because I am 1500 short of that count. Given that most of the irresponsible edits were created by users who are completely new to this wiki, I think the 5000 edit limit is a little high.

Could we consider lowering the edit bar to something more moderate, such as 500? 1000? 2000? Anyone with more than 500 edits and 30 days (WP:EXTENDEDCONFIRMED) should know the rules here regarding article creation, translation, and WP:CITE. Tony Tan · talk 10:49, 28 July 2016 (UTC)[reply]

Just stumbled upon an editor who got caught in the filter (Shadowowl) who has retired from Wikipedia until the translation tool is fixed - is our "hardline" a little too hard? Could the edit count be reduced to say a trivial amount (500 edits?) to weed out only the newest of editors? -- samtar talk or stalk 10:43, 28 July 2016 (UTC)[reply]
Moved from own section above, agree entirely with edit count lowering sentiment -- samtar talk or stalk 10:59, 28 July 2016 (UTC)[reply]
Given that prior to their blanking it their talkpage consisted entirely of warnings (also note the banner at the top), I'd say that if anything this is proof that the high bar is working, since if it were dropped to 500 this editor would have slipped through the net. (Note also their stated intention to rack up minor edits to reach the threshold, rather than actually stop making mistakes.) ‑ Iridescent 11:03, 28 July 2016 (UTC)[reply]
Perhaps I picked the worst example imaginable - I still think EC would weed out the worst of it. No matter what we do (bar disabling completely), we're still going to have to patrol and mop up -- samtar talk or stalk 11:07, 28 July 2016 (UTC)[reply]
Shadowowl gave me a barnstar for improving one of their translated pages. It was a page on a Swedish location, translated from the tiny stub on Dutch Wikipedia rather than the short but multi-paragraph page on Swedish Wikipedia. The translation program had mixed up two different pieces of information, creating a mess, and had failed to bring over the coordinates template, which I would have thought was pretty easy for an automated tool to do. Shadowowl might well have translated it better without the program. I believe they are an example of a new(ish) editor being misled by the WMF's providing this dangerous tool—and by the message they are sending that machine translations are a good way to create content. As I said above, in my view the thing is a pitfall, not a crutch, and it's hurting editors new to this wiki, as well as hurting the wiki. Yngvadottir (talk) 16:14, 28 July 2016 (UTC)[reply]
Pinging This, that and the other and xaosflux, who have previously mentioned the 5000-edit limit. Tony Tan · talk 11:27, 28 July 2016 (UTC)[reply]
I'm not opposed to having this changed to whatever the community decides - the initial trend was along the "no one should use this ever" line - I proposed the 5000 bar as a way to let testing continue. — xaosflux Talk 11:32, 28 July 2016 (UTC)[reply]
  • Please see the next section on namespace restrictions - the edit filter currently only applies to (article). — xaosflux Talk 14:20, 28 July 2016 (UTC)[reply]

Lower requirements to WP:EXTENDEDCONFIRMED

  • Support - currently 5000 edits is weeding out constructive use of this tool. As Tony Tan mentions above, any editor who is at 500 edits and 30 days should be aware of key policies. I also have seen that the tool does not offer to automatically translate text, so currently a lot of the very poor quality articles we saw shouldn't be that much of a problem. The use of the filter and tag means these edits can be easily tracked and patrolled -- samtar talk or stalk 11:03, 28 July 2016 (UTC)[reply]
    Given security implications below, I'd like to clarify that this reduction is not an endorsement of the tool remaining active on the English Wikipedia -- samtar talk or stalk 18:53, 28 July 2016 (UTC)[reply]
  • Oppose per reasons stated above. No disrespect, but "any editor who is at 500 edits and 30 days should be aware of key policies" is nonsense, as a glance at pretty much any ANI thread will tell you. Having tried to clean up some of the crap from this list, I can see no legitimate reason for keeping it active at all, let alone making it easier to access. ‑ Iridescent 11:07, 28 July 2016 (UTC)[reply]
I think I understand what you are saying, that there are users with 500 edits and 30 days who would abuse this tool if given access. I also think the same would be true for any other feature or privilege, including the fundamental ability to create and edit articles. The response to this should not be to get rid of those features altogether, but to educate users about the policy and take action to revert, warn, and block the offenders if necessary. Moreover, there are legitimate uses of the content translation tool, as evidenced by the not-insignificant constructive edits in the list that I have linked to above. Given that most of the "garbage" created using this tool is by new users, I do not think that a 5000-edit limit is justified. The limit only needs to be enough to stop complete newcomers. Tony Tan · talk 11:24, 28 July 2016 (UTC)[reply]
  • Support lowering to 500 edits (namely, extendedconfirmed). Users above this threshold should know better; if they don't, there should be few enough of them that our usual processes for someone who is creating substandard content (namely, discussing with them on their user talk page) should be sufficient. — This, that and the other (talk) 11:48, 28 July 2016 (UTC)[reply]
  • Support until the community can come up with a better way of allowing only certain users to access this tool (e.g. a separate user right). This is on the basis that very, very few vandals reach as high as 500 edits without being blocked, and that the current 5000-edit requirement is too restrictive. Although it's true that even users who are extended confirmed may not have a good grasp of basic policies, they should've at least figured out how to edit constructively by then. I feel like the current filter may stop those who legitimately want to translate pages but who don't yet meet the requirements, for example, those whose main Wiki isn't the English Wikipedia, and who are translating because they have a good knowledge of another language. Omni Flames (talk) 11:56, 28 July 2016 (UTC)[reply]
  • Support since it now appears that machine-translation to English, which was the most troubling aspect, was only enabled by mistake and has now been disabled again. JohnCD (talk) 13:47, 28 July 2016 (UTC)[reply]
  • Support reluctantly. It is no secret that I dislike the very idea of 500/30 as contradicting the nature of Wikipedia (the free encyclopedia that anyone can edit) and so I'm hesitant to extend its use in any case. However, this is an optional tool and not actually restricting anybody from editing articles. We also have clear evidence of harm being done. In this instance, 500/30 is the most logical threshold. The WordsmithTalk to me 13:56, 28 July 2016 (UTC)[reply]
  • Strong Support. I have made my strong opposition to ECP very clear, but even that is better than this (seriously, five thousand edits?). I was shocked when I saw something about a 5000-edit requirement just to help translate content. At this rate, Wikipedia will basically commit suicide—all these anti-newbie restrictions are being implemented with no regard for the even more adverse effects it will have on our editor retention (which is terrible as is). Biblio (talk) WikiProject Reforming Wikipedia. 14:23, 28 July 2016 (UTC)[reply]
  • Oppose. The tool hoodwinks people into thinking that translation is easy and that we want new articles created by translation. It isn't and we don't: we want adequate new articles created by any responsible method, and very often that isn't translation of an article on another Wikipedia. I'd rather see it banned altogether; as a second best, a very high threshold is needed to make editors realize that machine-aided translation is deprecated, not encouraged. As I have said in the discussions above, translation from the original Wikipedia in edit mode is still available, and faster and better than the output of this tool (which balks at trivial templates and clogs the page with unnecessary code). That includes pasting sections of text into an automatic translation program as a conscious decision. Yngvadottir (talk) 16:04, 28 July 2016 (UTC)[reply]
    I agree that using this edit filter is less preferable than disabling the tool (given below concerns) - what I don't like, however, is this high bar sending the message to new semi-users (who have gotten over the whole write about my dog phase) that they're "not good enough" to use common sense when translating. Maybe it doesn't send that message, but anything which causes a "them and us" situation needs to be carefully reviewed, at least in my opinion -- samtar talk or stalk 16:20, 28 July 2016 (UTC)[reply]
@Samtar: I agree with you to a certain extent. I'd rather it were simply banned, and the security concern with the private drafts that is raised above is another reason to simply nuke it. But failing that, the message needs to be not "We just don't trust you yet" but "This is a bad tool, to be used with extreme caution". The thing doesn't work well: it doesn't do the things it should do, like find the equivalent wikilinks and render citation and other templates, and like all machine translation it produces garbage at worst and at best easily misleads one into an inaccurate translation that takes considerable time, effort, and expertise to correct. Unfortunately there's a widespread perception that translation is pretty easy on a word by word basis. Pumpie was an extreme example, but within the past year we've had an RfA candidate with an unreasonably high opinion of their translation abilities, and cleaning up bad translations is a huge part of what the volunteers at WP:PNT do, despite its name. By promoting this tool, the WMF is validating this idea, it's wrong, and I'm particularly concerned about the new editors who are being lured into doing something they won't be thanked for, but I'm also concerned about the encyclopedia. Again, the article I spent almost 4 hours fixing up—and there is of course a list of articles I'd prefer to be working on—looked superficially good. But I have no confidence the WMF can be persuaded to eliminate the thing, especially since it is seductive to inexperienced translators, so second best is a very high bar. Yngvadottir (talk) 16:50, 29 July 2016 (UTC)[reply]
  • Oppose per Yngvadottir. BethNaught (talk) 16:34, 28 July 2016 (UTC)[reply]
  • Strong Oppose per Yngvadottir - We can't disable the tool, and unless the community were to approve its use, I'd even support stronger restrictions on it. I certainly don't support lowering the requirements as is suggested here. Godsy(TALKCONT) 17:29, 28 July 2016 (UTC)[reply]
  • Support. Now that the tool cannot be used to post raw machine translations onto en.wiki, it's appropriate to reduce the minimum edit count.—S Marshall T/C 19:08, 28 July 2016 (UTC)[reply]
  • Oppose - Given the number of problems evident with this tool, I'm not convinced ANYONE should be using it. Support raising edit threshold to a number high enough that it catches all users. Tazerdadog (talk) 22:15, 28 July 2016 (UTC)[reply]
How much have you used the tool? Doc James (talk · contribs · email) 12:38, 30 July 2016 (UTC)[reply]
Nothing beyond light testing because I do not possess the competency in any foreign language to do a translation properly. My comment was informed by observations of the finished results of the tool by other users, which, in my opinion, were bad enough often enough to warrant shutting the tool down. The tool translates poorly and executes markup poorly; the only thing it has going for it is a fancy interface. Tazerdadog (talk) 13:46, 30 July 2016 (UTC)[reply]
  • Support if experienced people want to try it out, I don't think we should prevent that. They should at least be able to clean up after themselves. Graeme Bartlett (talk) 10:57, 29 July 2016 (UTC)[reply]
  • Support. Editors with at least 500 edits and 30 days should be "experienced" enough to be held accountable for their actions. They should know the rules here, and if they misuse a tool, we as a community can take the appropriate actions to warn, revert, and block them. Given that most, if not all, of the substandard articles were created by new editors, allowing extended confirmed users to experiment should not be a problem. Tony Tan · talk 12:07, 29 July 2016 (UTC)[reply]
  • Question Support per Tony Tan's answer to my question. Is it possible to set this to mean 500 edits to English Wikipedia (or, more generally, whatever the target language is)? Per a point that JohnCD made above, there's a risk that--even with machine translation disabled--an automated tool like this might induce editors who are experienced in one language to start replicating articles in other languages irrespective of the editor's mastery of the target language(s). I realize there are warnings against doing that, but the problem with limited language skills is that you don't know how many errors you're introducing. I think it is very, very sound thinking to require an editor have shown themselves able to make 500 (almost certainly smaller, more easily rectifiable-if-errant) edits to the target language Wikipedia before they're given the keys to an automated tool to create entries there. I think the concerns outlined above about the quantity of improper entries the tool could generate--and how much more time it takes to sort those out than it did to wreak the havoc--are really legitimate. So I'd really want the 500/30 cut-off to be specified to the target language's site. Innisfree987 (talk) 18:50, 29 July 2016 (UTC)[reply]
Will take this opportunity to add that I'm against solving this particular problem (not a comment on the security issue, which I leave to more expert hands) by sending everything to draftspace--whether an editor has to look at the entry again in draft before they can publish will not change whether they have the language skills necessary to evaluate their own contributions in the target language. Innisfree987 (talk) 18:54, 29 July 2016 (UTC)[reply]
@Innisfree987: Yes, the 500 edits and 30 days requirement is referring solely to the account on the English Wikipedia. Anything we pass here will only affect the English Wikipedia, so it would not be possible for us to determine what happens if the target language is not English. When the target language is English, this proposal would require the user to be WP:EXTENDEDCONFIRMED on this Wikipedia. Tony Tan · talk 19:19, 29 July 2016 (UTC)[reply]
Great thanks! In that case I'm satisfied: this seems roughly equivalent to the level of responsibility accorded folks with the power to review at AfC, so using the same standard makes sense to me. Innisfree987 (talk) 19:30, 29 July 2016 (UTC)[reply]
  • Support It would be desirable to use existing limits, to avoid having too many different levels of permission. DGG ( talk ) 01:13, 30 July 2016 (UTC)[reply]
  • Support 5000 is an overreaction. Restrictions should start at lower thresholds and only be increased if the problem persists. Which I doubt it will, certainly not on the scale seen here. Any single cases that slip through can be dealt with on an individual basis. The articles all get tagged after all. If I'm proved wrong then we can revisit the issue. Acer (talk) 08:40, 30 July 2016 (UTC)[reply]
  • Support Gah 5000, seriously. Drop it to a 100. This was closed a little early IMO[4] and maybe we should revisit it Doc James (talk · contribs · email) 12:34, 30 July 2016 (UTC)[reply]
  • Support. Fortunately, or unfortunately, I missed the hysteria above, but 5000 posts especially is just a silly kneejerk reaction. Lankiveil (speak to me) 03:11, 31 July 2016 (UTC).[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Create a whitelist

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Rather than reduce the edit count required, what if we simply assigned the confirmed (not autoconfirmed) userright to users we trust to use the tool correctly, but who do not have the required edit count? We could then set the edit filter to trigger only if the user is not confirmed or doesn't have 5000 edits. I believe this is technically possible, but this is not my area of expertise, so correct me if I am wrong. Tazerdadog (talk) 21:03, 28 July 2016 (UTC)[reply]

Other rights that could be used to circumvent technical problems include (ep-enroll), or autopatrolled. Tazerdadog (talk) 21:09, 28 July 2016 (UTC)[reply]

That's a massive misuse of the user groups system. If the whitelist were of modest size it would be possible to directly build it in to the filter. BethNaught (talk) 21:17, 28 July 2016 (UTC)[reply]
If we want a usergroup for this, getting a group created is fairly easy, the only "permission" the group would need is "read" (this was done for the OTRS users a ways back before global groups). Then the filter would be able to look for "sysop or newgroup". I'm pretty much opposed to maintaining an edit filter of usernames. Also Tazerdadog you can't assign a "permission" to someone, only a group. — xaosflux Talk 21:32, 28 July 2016 (UTC)[reply]
OK, we should do that. Of note, both confirmed and autopatrolled are groups, unless I am mistaken, so either of those would technically work. However, a new userright is the cleanest solution. Tazerdadog (talk) 21:35, 28 July 2016 (UTC)[reply]
Actually, autopatrolled might work. They are users trusted with ability of creating articles of at least minimum acceptable quality, and this is exactly what we require here.--Ymblanter (talk) 06:04, 29 July 2016 (UTC)[reply]
Confirmed won't work here. That would result in the users who were granted confirmed for the purposes of machine translation being mixed up with those who were granted it for getting the permissions in autoconfirmed before meeting the requirements for autoconfirmed. That means that we'd have users who don't even meet the requirements for autoconfirmed yet being able to use the translation tool because of a different reason. Omni Flames (talk) 04:50, 30 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Non (main) namespace creations

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


PLEASE NOTE: The existing edit filter is only blocking ARTICLE namespace creations; it does appear that CXT allows for creation of pages in other namespaces, though it does not make this clear to editors. If we want these to go to Draft: for example, we can change the edit filter WARNING message to inform the translator to type "Draft:" in front of their title - and it will go through. (They can put any valid namespace they can write to in there, e.g. User:aaaaaa/sandbox/Title.) — xaosflux Talk 14:17, 28 July 2016 (UTC)[reply]

Now that we know this CAN write to other namespaces - a few options:
  • Do nothing - If you figure out to put a namespace: in the CXT it will work
  • Add a message to the filter warning and/or WP:CXT informing user HOW to do #1 above on purpose
  • Adjust the filter to act on all or more namespaces (disallowing edits)
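As an illustration of how the first and third options differ, here is a minimal Python sketch of the decision the filter is described as making. It is not the actual edit filter rule, which is not quoted in this discussion: the `Creation` record, the `is_disallowed` function and its parameters are hypothetical names, and only the 5000-edit bar and the article-only scope are taken from the comments above.

```python
from dataclasses import dataclass

# Namespace numbers used on the English Wikipedia.
ARTICLE_NS = 0
USER_NS = 2
DRAFT_NS = 118


@dataclass
class Creation:
    """Hypothetical record of a page creation (not a real MediaWiki object)."""
    namespace: int
    editor_edit_count: int
    via_content_translation: bool


def is_disallowed(creation, min_edits=5000,
                  restricted_namespaces=frozenset({ARTICLE_NS})):
    """Return True if the creation would be blocked under the sketched rule.

    Defaults mirror the behaviour described in the thread: only creations
    tagged as coming from the translation tool are checked, only in the
    article namespace, and only for editors below the edit-count bar.
    """
    if not creation.via_content_translation:
        return False  # ordinary page creations are not affected
    if creation.namespace not in restricted_namespaces:
        return False  # e.g. a title typed with a "Draft:" prefix goes through
    return creation.editor_edit_count < min_edits


# An editor with 3500 edits is stopped in article space but not in Draft:
# unless the restricted set is widened, as in the third option.
editor_article = Creation(ARTICLE_NS, 3500, True)
editor_draft = Creation(DRAFT_NS, 3500, True)
all_spaces = frozenset({ARTICLE_NS, USER_NS, DRAFT_NS})
print(is_disallowed(editor_article))
print(is_disallowed(editor_draft))
print(is_disallowed(editor_draft, restricted_namespaces=all_spaces))
```

Under this sketch, lowering the bar to 500 would amount to passing `min_edits=500`; a real rule would also need to decide how group membership (for example sysop) interacts with the count.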
Discussion

With the amount of attention this page has, I'm a bit surprised no one has an opinion on this? — xaosflux Talk 23:33, 28 July 2016 (UTC)[reply]

  • I'm still thinking about this, I'm leaning towards fully disabling it for all namespaces so that people don't make the translation as a draft and then immediately move it to mainspace, but I'm not sure yet. Tazerdadog (talk) 23:41, 28 July 2016 (UTC)[reply]
    If a newbie does that, wouldn't that submission enter into the AfC process? I don't see a problem with allowing editors access to the tool for creating experimental content; once you remove the "automatic dumping of garbage directly in main space" process, the number of editors who "know enough to create and move an article but not enough to stay out of problems" should be low. Diego (talk) 08:51, 29 July 2016 (UTC)[reply]
    The submission would not enter into the AFC process automatically - all afc submissions should be in draftspace, but not all draftspace pages are submitted to AFC. My only worry is that editors will see it as one more step they have to follow, and will do it heedless of the consequences. This is probably unfounded. Perhaps the compromise solution is to disable it for all major namespaces except draft, and then just manually watch to make sure people aren't moving the pages rapidly. Tazerdadog (talk) 09:19, 29 July 2016 (UTC)[reply]
    If there's going to be a warning template for using the tool, it could even have instructions to include the draft in the AfC process. And yeah, it can make sense to restrict it for drafts (though I think it should be allowed either in Draft space or User space); I don't see the need of this tool for anything other than drafting translations, and any other exceptional usage could be handled as a temporary draft step. Diego (talk) 09:40, 29 July 2016 (UTC)[reply]
    I'd be happy with either draftspace or userspace, with the instructions saying something to the effect of "If you feel that your translation is ready for inclusion in English Wikipedia, please save it as Draft:TITLE, and add {{subst:Whatever the AFC Template is nowadays}} to the top of the article." Tazerdadog (talk) 10:16, 29 July 2016 (UTC)[reply]
  • We could add a message to the filter warning about this, but we should not adjust the filter to disallow all namespaces. This translation tool, in the hands of a responsible editor, can be very useful for contributing in a constructive way. I for one find translating (manually) using this tool to be much easier than manually copying and pasting wikitext. Moreover, I don't need to manually look up the corresponding wikilinks in the English Wikipedia— it does so for me. I only need to verify that the links are correct, which saves time and is much more convenient than trying to locate every article. Tony Tan · talk 12:22, 29 July 2016 (UTC)[reply]
Agreed about usefulness of tool for qualified editors. Innisfree987 (talk) 19:43, 29 July 2016 (UTC)[reply]
  • Correct me if I'm wrong but is the upshot of move to draft still being enabled that as long as an editor is not literally brand-new, any editor could just save to draft and then post? I think this does not go far enough to make sure the tool is only being used by qualified editors. Even adding one more step like this, a semi-automated tool in the wrong hands can introduce errors into Wikipedia at a pace that vastly outstrips the editorial capacity to review for problems. (Even now, new page patrols is backlogged by more than 8000 entries; I shudder to think what AfD would look like.) As I say above, I think the 500/30 rule would work well to mitigate this risk, and I think it should definitely apply to draft creation as well. Innisfree987 (talk) 20:01, 29 July 2016 (UTC)[reply]
    • I suppose it would demonstrate that the user is able to read and follow directions in English. We could put the direction for saving to User or Draft space in WP:CXT - then they would have to go read it to see how. — xaosflux Talk 23:57, 29 July 2016 (UTC)[reply]
      • It'd be a start, but since we don't, for instance, view being able to read and follow directions in English as sufficient competence to use AFCH, I'm still wary about whether it's enough to use this automated tool. Especially someone who's edited a different Wikipedia a fair amount might well be able to sort out the process without having anything like the level of language skill necessary to produce a translation not riddled with error.
I feel a bit bad coming down so hard on this but I really think translating, and most especially translating with an automated tool, differs in a very essential way from creating a new article from scratch: instead of creating content sentence by sentence with only what you can pull from your own brain, you are starting with content on the page and the principal task is to find all the places where it needs to be corrected to be accurate in the target language. A person with limited skills in X language is (with some exceptions, to be sure!) not likely to write more than a few sentences in X language--it's just too much work! The small number of sentences they write might have some errors, sure, but the overall quantity of the errors they might introduce is reasonably limited, and can be adequately handled by peer review. With translating, by contrast, a person can easily take even tens of thousands of words from language Y and they need to be able to competently check the entirety for accuracy in language X (which is a much, much higher level of language competency than reading a few lines of directions properly.) Unless they can do that, all they're doing is creating a literally-overwhelming amount of clean-up for the editors who do speak both X and Y (already in short supply, almost no matter what two languages you pick); a huge snarl in whatever deletion process would apply; and/or, worst of all, erroneous accounts for readers. Extending a 500/30 limit to the draft thing might have some small effect in discouraging a handful of new editors, but I think it's outweighed by how very far the measure would go toward limiting translations so flawed that they are worse for readers than nothing. Innisfree987 (talk) 15:51, 30 July 2016 (UTC)[reply]
  • CXT is a useful tool. We are using it extensively for translations of medical content into other languages. I guess sending the newly created articles to "draft" space automatically when people translate into EN would decrease frustration by those using the tool. Doc James (talk · contribs · email) 12:33, 30 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Serious problem with CX: private storage

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


One of the big problems with Gather that led to the community demanding its removal was the fact that it provided storage of content on Wikipedia servers which was accessible to only the uploading user. I do not know the exact technical details but I cannot find a way to view other users' pre-publication translations. Given that arbitrary text (and hence images, via base64) can be uploaded to ContentTranslation—I have tested this myself—this enables Wikipedia as a kind of dead-drop message exchange service. At phab:T39992#424332 Jdforrester explains why this is a very bad idea. @Runab WMF: What is done to guard against abuse of this feature? Should not administrators or functionaries, or indeed all editors, be able to view other users' translation drafts? Is this possible but poorly advertised? BethNaught (talk) 14:59, 28 July 2016 (UTC)[reply]

A valid issue - I've just tested the same. I don't think it would be possible to view another user's draft at all - it's all stored within the Special: 'namespace' and is specific to the logged in user. Pre-published usage isn't logged anywhere public (by the looks of it) -- samtar talk or stalk 15:31, 28 July 2016 (UTC)[reply]
From the FAQs:
Can anybody read the text that is saved automatically while I am writing the translation?
No. It is accessible only to the translator until the publishing.
-- samtar talk or stalk 15:39, 28 July 2016 (UTC)[reply]
Oh, this is bad. Really bad. Katietalk 18:10, 28 July 2016 (UTC)[reply]
And Runa's stats show that they have thousands of drafts sitting there available only to the account that created them, opening up the WMF to legal issues. Tazerdadog (talk) 21:16, 28 July 2016 (UTC)[reply]
Yeah, the Foundation really needs to consider an emergency shutdown and consult with its counsel before turning it back on. If they decide to keep it, is there some Mediawiki: bit we could modify to keep people from accessing it from en.wp? There must be a way to at least hide it locally, because I'm almost certain the community doesn't want that sort of issue here. The WordsmithTalk to me 21:25, 28 July 2016 (UTC)[reply]
Who are the appropriate WMF legal people to ping? Tazerdadog (talk) 21:41, 28 July 2016 (UTC)[reply]
I'm not entirely sure this isn't a DEFCON 1, Jimbo Wales-level issue. I am not exaggerating in the slightest when I say that the entire project could get shut down by the US government if this were abused, and I don't see any indication in the other dialogue in this section that WMF is taking our concerns very seriously. Katietalk 21:48, 28 July 2016 (UTC)[reply]
  • Doesn't this sort of depend where the translation draft is saved to? It may be stored locally on the user's own machine.—S Marshall T/C 21:58, 28 July 2016 (UTC)[reply]
It's server-side, just tested, can access on my phone a draft I made on my laptop. Pinging Mpaulson (WMF), interim general counsel. BethNaught (talk) 22:06, 28 July 2016 (UTC)[reply]
Testing now. - Confirmed that it's server side, and I can only see the draft while logged in. Tazerdadog (talk) 22:09, 28 July 2016 (UTC)[reply]
Has anyone used the emergency wikimedia email, and do y'all think that we need to in this case? not something we need to do Tazerdadog (talk) 22:13, 28 July 2016 (UTC)[reply]
Emergency is usually for threats to editors' life, legal is more appropriate. I've made phab:T141576, if someone wants to comment there. --Vituzzu (talk) 22:14, 28 July 2016 (UTC)[reply]
Please do not use the emergency email. Clogging up that resource is not something that should be done and this is nowhere near what that email is intended for. --Majora (talk) 22:15, 28 July 2016 (UTC)[reply]
I am not getting the gravity of this. Can someone explain how it is illegal in the US or different than say someone writing a private email to themself and then sharing the password with someone else as a method to quietly share content? Doc James (talk · contribs · email) 01:29, 29 July 2016 (UTC)[reply]
(I am far from being a lawyer) I believe this has to do with our Safe harbor (law) status and/or subjecting WMF to NSL's and other warrants because we "could" be housing private data. — xaosflux Talk 01:57, 29 July 2016 (UTC)[reply]
Massive email providers are constantly served with those types of requests. — xaosflux Talk 01:58, 29 July 2016 (UTC)[reply]
Massive email providers also have the resources to handle all those warrants, National Security letters, etc, including an army of lawyers. Wikimedia has one Interim General Counsel, and there is probably no easy method of retrieving these drafts without getting a Database Admin to manually extract them. We can't handle the sort of load necessary to comply with that, and if somebody abuses it and the WMF bobbles the case, it could be disastrous. Every form of access to nonpublic information has a strict set of controls before it is implemented, and nowhere else in Wikimedia can I think of a place where accounts can store data on Wikimedia servers without some form of oversight. The WordsmithTalk to me 02:19, 29 July 2016 (UTC)[reply]
Seems like the simple solution would be to make them open for viewing. Doc James (talk · contribs · email) 03:08, 29 July 2016 (UTC)[reply]
FYI: [5][6][7]. MER-C 04:31, 29 July 2016 (UTC)[reply]
  • I posted a comment about necessary architectural changes here. Pinging Risker whose checklist will be coming into play. BethNaught (talk) 06:28, 29 July 2016 (UTC)[reply]

This is one of the worst cases of ignorant fear mongering I've ever seen, perhaps wikipedians should leave Computer security and Privacy to those who are qualified and truly understand it, or better inform themselves by reading an article [8].

First, the contents of content translation are definitely hidden, but so is revision deletion, and "super revision" deletion. But that's not really the most problematic thing, the fact of the matter is that Stewards can use RevisionDeletion to hide illegal material and share it among themselves or anyone they want to see it on their computers. Secondly, the user options [9] allows arbitrary storage of any illegal material on wikimedia servers, and that can be used without content translation. Thirdly all files on commons can host hidden illegal materials[10]. Lastly, all openly editable wiki pages can host secretly written materials, by storing / vandalising pages and using the revision history to obtain them later on. This is true of all sites on the internet that host any user contributed content.

Also, the same link you use to support your fear mongering talks about infocalypse [11]. So really, if you want full security then all of the above needs to be disabled, which basically means disabling editing on wikipedia because even live articles can be used the same way. 197.218.88.52 (talk) 07:43, 29 July 2016 (UTC)[reply]

Hello IP, and welcome to wikipedia. I think you fundamentally misunderstand the problem here. Revision deletion can only be done by admins, and the content is still viewable by admins. Oversighting can only be done by people who have disclosed their identities to the WMF, and are trusted by the WMF not to abuse it. The content translation drafts are hidden better than even oversighted material in terms of the number of people who could view or delete it, and can be created by anyone for any purpose including nefarious ones. While it is possible to embed a message into a picture on Wikipedia, it's also something that we cannot reasonably prevent, and would be in view of all. The uploader of the image would be publicly logged. Similarly, revisions of a page that are not deleted are public, and a message in one of those is publicly attributed to the user. (Public = I can see the username, and any checkuser has the technical ability to see the IP address of the user that made the change). This is not the case with the content translation tool, which is why it's a big deal. Tazerdadog (talk) 08:05, 29 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

Little help please?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I'm going through the list in Xaosflux's userspace, which is a fairly Augean job, and I've just prodded two that failed my sniff test, created by User:Arcituno. Then I looked at his userpage and saw he was permablocked as a sock. Please could some kindly-disposed sysop G5 those two (I've bolded them on the page for ease of finding). Also, is there any way to automatically add the names of the creators to that list? We'll likely find problems quicker with that info.—S Marshall T/C 20:16, 28 July 2016 (UTC)[reply]

Deleted. The WordsmithTalk to me 20:22, 28 July 2016 (UTC)[reply]
  • Just to note that most of Arcituno's page creations were already G5 deleted by me. I believe that some people wanted to restore those based on some argument that machine translation edits weren't really the sock's edits so if that was done, that may require another review. -- Ricky81682 (talk) 02:35, 31 July 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


References

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


We need reference templates that work / are compatible across all languages. The lack of these is an ongoing problem for those who work on translations. Until we get these turning off the sea of red would be nice.[12] This request came in at number three on the 2015 Community Wishlist Survey but unfortunately there were blockers.[13] Doc James (talk · contribs · email) 01:22, 29 July 2016 (UTC)[reply]

In between now and then, you could solve most of this by importing enwiki's most popular CS1 citation templates – as is, no translation, under the English name – to other Wikipedias. Four (probably) rounds of Special:Import per target wiki, and CX would have no red error messages. Alternatively, someone could manually crawl through every major template and set up English (and probably Spanish, French, and German) aliases for every parameter. This is probably worth it, eventually, for the biggest wikis, but at most smaller ones, simply importing a plain copy is probably the way to go. WhatamIdoing (talk) 08:53, 29 July 2016 (UTC)[reply]
pasting a plain copy of the rendered text is what I do with other language's templates of this sort now. It works, and causes no additional problems. DGG ( talk ) 01:20, 30 July 2016 (UTC)[reply]
Weird bug: if you copy from the saved page, and paste into the visual editor, then you may pick up some unexpected HTML (e.g., <abbr>...</abbr>) from some templates. Also, links to Wikipedia articles may be processed as external links to that page (a big problem if you're copying from your sandbox.) The solution in the visual editor is to always open the page ("Edit" mode, not "Read" mode) before copying from a Wikipedia page. Whatamidoing (WMF) (talk) 09:28, 1 August 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Content created

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Looking here and I am seeing a fair bit of good content being created.

Stuff like this:

This tool is helping to make EN Wikipedia less US centric which is a good thing. Doc James (talk · contribs · email) 12:56, 30 July 2016 (UTC)[reply]

I spent quite some time working on Metrobus (Tegucigalpa). I first saw it in this version. It is not long, try to read it from the beginning to the end. Now it is at this version, and even stubbifying would still require a lot of work. Sure, we get coverage of things which we would never get otherwise, or maybe only after a very long time, but the investment of time needed to fix it is disproportionately big.--Ymblanter (talk) 13:48, 30 July 2016 (UTC)[reply]
  • Please note, that new content can still be created - the automated machine translation components are no longer running, but manual translations are currently back on for extendedconfirmed editors and up. — xaosflux Talk 02:20, 1 August 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Little more help please?

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


I'd really be very grateful indeed if a few more people could help to work through the list of content tool created articles. At the moment it's largely me and Yngvadottir. I can only translate from French and German. Yngvadottir is a great deal more talented, but from her userboxes, she too is quite focused on Northern European tongues. We could really use some help from people who speak more diverse languages, particularly Spanish.—S Marshall T/C 15:39, 30 July 2016 (UTC)[reply]

I have been doing Spanish for some time already, but there are too many of them.--Ymblanter (talk) 15:49, 30 July 2016 (UTC)[reply]
I speak a small amount of German, and am CSDing some articles. ThePlatypusofDoom (talk) 19:05, 1 August 2016 (UTC)[reply]
So far, it looks like they've reviewed about 100 articles in detail, and almost all of them are being kept (93% are non-speediable at the moment). However, as the review is mostly chronological/non-random, it's too early to be settling in on exact numbers. The list has an unfortunate sea of redlinks due to encoding errors for accented characters. I have just fixed a few hundred of them, but more work in that regard might be helpful to the people doing the reviews. WhatamIdoing (talk) 09:04, 2 August 2016 (UTC)[reply]
That's a very optimistic reading of the figures (no surprise there). Looking at User:Xaosflux/Sandbox16 articles 1 to 50, I see 4 that are deleted, 7 not in the mainspace, 1 created incorrectly over an existing page and reverted, 1 "translated" from Simple English, and 3 redirected. This is among the oldest batch, so most of the problematic articles will already be deleted here. In any case, that's 16 of the 50 not useful or checked (non mainspace), or 9 of 43 in mainspace with major problems. The amount of errors and problematic articles increased significantly afterwards (when the WMF rolled out the machine translation). Fram (talk) 10:15, 2 August 2016 (UTC)[reply]
Xaosflux's list doesn't include any article that was deleted before the list was created, so all of the really obvious rubbish that was created and immediately deleted won't appear. Black Kite (talk) 13:06, 2 August 2016 (UTC)[reply]
All you can tell from looking at what we've done so far is how the early adopters' translations looked if they got past NPP.—S Marshall T/C 15:17, 2 August 2016 (UTC)[reply]
... and you can also tell that some of the early versions are raw machine translations. I've heard people say "it wasn't enabled at that point", but I'm looking at page creations like this one, and I'll look you in the eye and tell you that content's been translated by an algorithm. A bad algorithm. I wonder if I'll find a BLP like that.—S Marshall T/C 15:34, 2 August 2016 (UTC)[reply]
S Marshall, please note - that there can be a difference between an editor using machine translation in general, and them using machine translation that was actually integrated to the CXT. Nothing is preventing them even now from copy pasting to google translate, etc. — xaosflux Talk 16:18, 2 August 2016 (UTC)[reply]
Okay, I accept that. The problem in that instance was probably between keyboard and chair. It does look like a good tool, light years ahead of other WMF's releases that I've criticised in the past. It's being misused and, I'm starting to think, the marketing for it might perhaps usefully be adjusted in response to our feedback.—S Marshall T/C 16:44, 2 August 2016 (UTC)[reply]
Those numbers mean that the translations are faring better than new pages created de novo by new editors in wikitext – almost all of which get tagged as having problems, and a huge fraction of which are eligible for CSD. I'm willing to agree that most of them won't pass GA at the moment, but so far, new-translation-by-new-editor seems to be producing higher quality pages than new-article-by-new-editor. WhatamIdoing (talk) 17:29, 2 August 2016 (UTC)[reply]
No, that's comparing apples with oranges, and is actually just simply wrong. The majority of new-articles-by-new-editor are deleted not because they're badly written, but because they're not notable. The subjects of new-translations-by-new-editor are generally - but not always - notable (they wouldn't usually exist on the other wiki if they weren't) but that doesn't mean that their quality, unless they're done by an experienced editor fluent in both languages, isn't generally very sub-standard. Have you actually read through the quality of the articles on the list? Black Kite (talk) 17:56, 2 August 2016 (UTC)[reply]
Whatamidoing, think before you post. You are looking at the pages which have survived for a year, not new creations. In general, if you would look at 50 pages created a year ago and still surviving today, you wouldn't normally find (at least) 4 that are speedy-deletable and 3 others that get redirected after scrutiny, with a lot of the others tagged for major problems. And of course, if you look at the more recent ones, which kickstarted this discussion, the quality (and survival rate) is much, much lower. Why do you think there is so much support for a new CSD criterion here? Perhaps most others have actually looked at the evidence and given it some thought, instead of producing knee-jerk "the WMF is holy and their products are superior" reactions (oh, right, this post was not made with your WMF account so isn't an officially sponsored one but your personal opinion, right...). According to Special:ContentTranslationStats, 4369 pages have been created via this tool on enwiki. But at the moment this discussion started, this had already dropped to 3583 surviving pages, of which quite a few are already deleted by now and a lot more will be deleted over the next days and weeks. Fram (talk) 07:19, 3 August 2016 (UTC)[reply]
  • When this discussion started, i.e., at ANI on 24 July 2016, Special:ContentTranslationStats says that there had been 4,280 translations published (ever) and 1,083 deleted.
  • To make a comparison between two groups, you need two numbers (i.e., the percentage of surviving new translations and the percentage of surviving new articles by new editors). You appear to only have one number there. WhatamIdoing (talk) 09:40, 3 August 2016 (UTC)[reply]
  • Regarding percentage of new articles by new editors, is that really the correct metric? I imagine that most users of the content translation tool have some experience on their home Wikipedias. They might not have the en-wiki experience, but they have some experience. This makes a good comparison difficult beyond "This rate is unacceptably high and likely worse than nothing" Tazerdadog (talk) 09:54, 3 August 2016 (UTC)[reply]
  • "Those numbers mean that the translations are faring better than new pages created de novo by new editors in wikitext – almost all of which get tagged as having problems, and a huge fraction of which are eligible for CSD. I'm willing to agree that most of them won't pass GA at the moment, but so far, new-translation-by-new-editor seems to be producing higher quality pages than new-article-by-new-editor." (WhatamIdoing, here, 17:29 2 August 2016). "To make a comparison between two groups, you need two numbers[...]" (WhatamIdoing, here, 09:40 3 August 2016). You are quick to criticize others when they provide numbers, but when you make statements and comparisons, your figures are either totally absent or wrong, and (as has been said) you compare apples and oranges, all to get the desired result (please the hand that feeds you). If you have anything actually fact-based and relevant to contribute, please do, but otherwise it would save us a lot of time and you a lot of embarrassment if you would just drop out of the discussion. Fram (talk) 10:32, 3 August 2016 (UTC)[reply]
  • I'll put this here for want of a better idea, but it really relates to multiple statements that have been made above. There are well over 3,000 articles on that list to be checked, although some continue to be speedy deleted. Most take a considerable time to check, and we have a tremendous need for people who can check articles translated from Russian, Polish, Hebrew, Arabic ... One of the reasons checking takes so long, and also one of the reasons NPP has often passed these through (although I am still finding some that have never been marked as unreviewed), is that the problems range all the way from incomprehensibility (the unnameable Wikipedia criticism site flagged a couple of those today) through relatively easy fixes if you can read the original, to something superficially fine but inaccurate, and the last category are potentially the most damaging to the encyclopedia and to living people. I'm afraid these articles are demonstrating that, contrary to several statements above, it's really hard to check a translation. A couple of us mentioned above the BLP translation from German that included "— itself is a Catholic and was in youth measurement servant" (the problem word meant altar server). The other day I found a BLP translation from Spanish by a well respected and extremely competent Wikipedian where the tool had translated the section on their performance work—not only were there glories of prose such as "it Produced and it Drove Program of TV: Pcia. Of Saint Luis: Nattiva (Channel 13)." but titles of films, plays and tv shows were partially translated ("The trasnoche of the truck driver"), all radio broadcasters were rendered "Irradiate", names were stripped of accented characters and sometimes translated, with the surname Cuervo rendered "Crooked" and a professional name piped to read "Monkey", and the references section was headed "Referencias". Just really bad raw machine translation and pretty bad from a BLP perspective.
The translator had missed it. So had the first editor to check the article for the list of articles to be checked. The mess is much greater than one might think at a quick glance even after some deletions. The tool is hoodwinking even experienced editors. We're desperately short of copyeditors up to the task of checking and fixing the articles; this is why PNT is so backlogged. This tool needs to be deprecated. It is misleading, and translation for translation's sake is not helpful. Yngvadottir (talk) 19:52, 3 August 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.

The following discussion is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Did WMF just roll out further?

When I look at my contributions now, there's a prominent icon inviting me to contribute via the translation tool. I don't think that was there a few hours ago. And I note that the translation tool seems to be offering translations between any Wiki language pair. Supposedly the tool can translate Navajo articles into Japanese. Plantdrew (talk) 03:46, 2 August 2016 (UTC)[reply]

I'm seeing it too, and it definitely looks like a further rollout of the tool. This seems very insensitive from the WMF's end. Tazerdadog (talk) 06:41, 2 August 2016 (UTC)[reply]
  • Again, "the translation tool" does not translate articles. "The users" translate articles. And, if an editor knows both of those languages well enough, then why would anyone want to prohibit such translations?
  • The "prominent icon" appears when you opt in. There have been no changes to the software. See, e.g., these screenshots and this feedback (from an editor who doesn't do translations) from more than a year ago. There is a long list of design tasks here, and you are welcome to add your ideas. Whatamidoing (WMF) (talk) 07:26, 2 August 2016 (UTC)[reply]
  • The opt-in explains that, and this is likely a non-issue. The community welcomes sufficiently competent editors to submit translations. However, the tool was mostly producing articles of unacceptable quality before the edit filter. Tazerdadog (talk) 07:48, 2 August 2016 (UTC)[reply]
  • Thanks for taking our concerns to the WMF, dear community liaison. Please add my idea to phabricator, as I don't edit there: simply disable the translation tool from enwiki. Most articles created are crap, translations in progress are not accessible, and the tool sucks (even with machine translation disabled here). I just enabled it, tried to translate an article from nlwiki, couldn't copy (or access) references from the original, and couldn't change the title (or namespace) again before publishing. The suggestions include pages that already exist (even with the very same title, like Mont Saint-Michel or Gothic novel). For enwiki, the tool is basically useless and generates more problems than it solves. Oh, and perhaps next time show a bit more sensitivity when replying to people's concerns here. Or simply make it clear that you are a one-way community liaison, that may help as well. Fram (talk) 07:49, 2 August 2016 (UTC)[reply]
+1 Tazerdadog (talk) 08:48, 2 August 2016 (UTC)[reply]
Quite an apt username really ;) Muffled Pocketed 10:07, 3 August 2016 (UTC)[reply]
Plantdrew If you use the tool once (even to look at it) it opts you in to the "beta feature" and then displays those icons and links on your interface. You can easily turn those off by opting back out of the beta feature here: Special:Preferences#mw-prefsection-betafeatures - just uncheck the box and they will be gone. — xaosflux Talk 12:55, 2 August 2016 (UTC)[reply]
Thanks. Plantdrew (talk) 16:40, 2 August 2016 (UTC)[reply]
The discussion above is closed. Please do not modify it. Subsequent comments should be made on the appropriate discussion page. No further edits should be made to this discussion.


Machine translation gadget

There is currently a gadget called GoogleTrans which allows the straight dropping of google translate into the content translation tool. (See here). I just did a test, and I was able to produce a machine translated article into english without leaving wikipedia using this gadget. Pinging the creator of the gadget: @Endo999:. I do not think this gadget should be present on the English wikipedia, and certainly not when it seems to explicitly endorse machine translations. Fortunately, it doesn't get around the edit filter, but it still sends a terrible message. Tazerdadog (talk) 09:02, 3 August 2016 (UTC)[reply]

Thank you, I didn't remember about that gadget; I surely can make good use of it. That's the kind of tools that may be invaluable time savers in the hands of us who know how to use them, making the difference between translating a stub right now when you first stumble upon it (thanks to the kick-start of having part of the work already done), or leaving it for another day (and never coming back to it).
Given that the CXT tool has been restricted to experienced editors, and that the GoogleTrans gadget needs to be explicitly activated, the combination of the two won't be in the hands of inexperienced newbies in the way that created the current backlog. The GoogleTrans doesn't insert translated content into text fields, it merely shows the translation in a pop-up; so I don't agree that it "explicitly endorses machine translations". Any editor with your experience should know better than copy-paste machine translated text unedited into an article. Diego (talk) 10:39, 3 August 2016 (UTC)[reply]
I am the creator of the GoogleTrans gadget and it does do Machine Translation Under The HTML Markup when used in the Content Translation system. I have used this to translate 226 articles from the frwiki to the enwiki and got all of them reviewed okay. The Machine Translation is a starting point. You still have to manually change each and every sentence to get the grammar and meaning right. It's not very sensible to ban it because, without human followup, it produces a bad article. The point is that it is a tool to quicken the translation of easy to medium difficulty articles, especially for good language pairs like English-French. Wikipedia, itself, uses both Apertium and Yandex translation engines to do machine translation and these have been used to good effect in the Catalan and Spanish wikipedias. GoogleTrans does the same thing as Apertium in the Content Translation system, except it uses Google Translate, which most people feel is a better translation engine. As Diego says this needs to be explicitly turned on, so it tends to restrict usage to competent editors. To stress the point, Machine Translation, as done by GoogleTrans gadget, is a starting point, it is not the end product. Human intervention is required to massage the MT into decent destination language text and grammar, but Machine Translation can help start the translation quite a bit. Wikipedia feels that Machine Translation is worth doing, because it has it as a feature (using both Yandex and Apertium machine translation engines) Endo999 (talk) 11:45, 3 August 2016 (UTC)[reply]
Except that we have a policy against machine translation on en.wikipedia, because the requirements for correcting its output are far higher than users tend to realise; in fact it is easier and faster to translate from scratch than to spend the necessary time and effort comparing the original with the translation to find the errors. Hence the whole long discussion above and the agreement that machine translations can be deleted as such. Yngvadottir (talk) 19:13, 3 August 2016 (UTC)[reply]
There is no policy against Machine Translation on the enwiki. That would have to be posted on the Content Translation blog, and it isn't. I've done 226 of these articles successfully and I can tell you there is more editing for non text issues, like links around dates coming from the frwiki, editing getting references right, manual changing of TAGS because their parameter headings are in the origin language. The actual translation work postprocessing, when polished up by a person competent in the destination language, is far less than you say. But style differences between the wikis take more of the editors' time. Endo999 (talk) 19:44, 3 August 2016 (UTC)[reply]
The policy is at WP:MACHINETRANSLATION, and has been in force for a decade. ‑ Iridescent 19:49, 3 August 2016 (UTC)[reply]
The policy is against unedited machine translation. It doesn't apply to using machine translation as a starting point to be cleaned up by hand. Diego (talk) 20:12, 3 August 2016 (UTC)[reply]
MACHINETRANSLATION isn't a policy. It isn't even a guideline. WhatamIdoing (talk) 16:14, 8 August 2016 (UTC)[reply]
I have never claimed that Machine Translation first drafts are good enough for articles on the enwiki. They aren't, but responsible use of Machine Translation, as a first draft, that is then worked on to become readable and accurate in the destination language is quite okay and even helpful. Endo999 (talk) 20:40, 3 August 2016 (UTC)[reply]
The consensus is pretty clear that unless you are translating at a professional level, machine translation is a trap. It looks good at first glance, but often introduces bad and difficult to detect errors, such as missed negations or cultural differences. Even if a human caught 9 out of 10 of these errors, the translation would be grossly unacceptable and inaccurate. I'd request that this gadget be disabled, or at minimum, de-integrated from the content translation tool. Tazerdadog (talk) 23:33, 3 August 2016 (UTC)[reply]
Well. I'm pretty far from being a fan of machine translations, but it's always been possible to copy/paste from Google Translate. Anyone autoconfirmed can do that without going to all the trouble of finding and enabling this gadget. The problem is fundamentally behavioural rather than technological. The specific problem behaviour is putting incomprehensible or misleading information in the mainspace. Over-reliance on machine translation is a cause of this, but we can't prevent or disable machine translation entirely, and there's not much point trying. I think the position we should adopt is that it is okay to use machine-aided translations provided you don't put them in the mainspace until they've been thoroughly checked by someone who reads the source language and writes the target language fluently. I suggest the approach we take to Endo999's tool is to add some warnings and instructions rather than try to disable it.—S Marshall T/C 23:51, 3 August 2016 (UTC)[reply]
Don't forget that the use of Machine Translation in the Content Translation system is expanding all the time, and I am pretty much the only regular user of my GoogleTrans gadget for translation purposes. Why is the gadget being singled out? Yandex machine translation is being turned on by the Content Translation people all the time for various languages, like Ukrainian and Russian. The Catalan and Spanish wikipedias are at the forefront of machine translation for article creation and they are not being flamed like this. I reiterate that the majority of edits to my frwiki-to-enwiki articles are over differences between frwiki and enwiki conventions for an article rather than over the translation. The treatment of dates and athletic times is one such difference. You need to do postediting after the document has been published in order to please the editors of the destination wiki. This usually has nothing to do with the translated text but is actually the treatment of links, the treatment of dates, the removal of underlines in links, the adding of categories, the transfer of infoboxes, the addition of references (the fiwiki is particularly good for references of track and field athletes), and other wiki standards (that are different from the origin wiki). There's always going to be some postediting of translated articles because of these nontranslation-specific items. It's just inherent in wiki-to-wiki article movement. Endo999 (talk) 00:24, 4 August 2016 (UTC)[reply]
We don't care about what happens on other wikipedia language versions, basically. Some are happy to have 99% bot-created articles, some hate bot-created articles. Some are happy with machine-translated articles, some aren't. It may be true that "the use of Machine Translation in the Content Translation system is expanding all the time", but at enwiki, such a recent "expansion" started all this as the results were mostly dreadful. "Why is the gadget being singled out? Yandex machine translation is being turned on by the Content Translation people all the time for various languages, like Ukrainian and Russian." Your gadget is in use on enwiki, what gadgets they use on ruwiki or the like is of no concern to us. We "single out" tools in use on enwiki, since this is an enwiki-only discussion. And this discussion is not about the long list of more cosmetic things you give at the end (or else I would start a rant about your many faux-bluelinks to frwiki articles in enwiki articles, a practice I truly dislike), it is (mostly) about quality of translation, comprehensibility and accuracy. Yours are a lot better than most articles created with ContentTranslation, luckily. Fram (talk) 07:07, 4 August 2016 (UTC)[reply]
  • @Endo999: I just happened to check Odette Ducas, one of your translations from French. You had Lille piped to read "Little". This is a good illustration of how easy it is to miss errors, and it's not fair, in fact counterproductive, to encourage machine-based translation and depend on other editors to do the necessary painstaking checking. Yngvadottir (talk) 17:20, 6 August 2016 (UTC)[reply]
Thanks for catching that error (Lille translated as Little). I had seen and corrected that problem in a later article on a French female track athlete from Lille, but didn't correct the earlier translated article. Don't forget that Wikipedia is about ordinary people creating Wikipedia articles and through the ARGUS (many eyes) phenomenon having many people correct articles so they become good articles. This is one example of that. Wikipedia is not about translation being restricted to language experts or simply experts for article creation. Your argument does tend towards that line of thought. Endo999 (talk) 18:56, 6 August 2016 (UTC)[reply]
I don't think it does (for one thing, all you can know of my level of expertise is what I demonstrate). The wiki method is about trusting the wisdom of the crowd: this tool hoodwinks people. It's led you to make a silly error you wouldn't have otherwise made, and it's led to at least one eager new editor being indeffed on en.wikipedia. It rests on condescending assumptions that the editing community can't be left to decide what to work on, in what order. (Not to mention the assumptions about how other Wikipedias must be delighted to get imported content just because.) Yngvadottir (talk) 19:04, 6 August 2016 (UTC)[reply]

I'd like to retract my compliment about Endo999's use of his translation tool. I have just speedy deleted his machine translation of Fatima Yvelain, which was poorly written (machine translation) and a serious BLP violation. Fram (talk) 08:01, 12 August 2016 (UTC)[reply]

Almost every one of the articles I have translated, using the GoogleTrans gadget, has already been reviewed by other editors and passed. I can only translate the existing French, which is sometimes not well written. In Fatima Yvelain's case I transferred over all the sources from the frwiki article. Can you tell me which reference didn't work out? You've deleted the article, without the ordinary seven day deletion period, so you deleted the article without any challenges. Are you and a few other reviewers systematically going through every article I have translated looking for things to criticize? Endo999 (talk) 01:54, 13 August 2016 (UTC)[reply]
That's how Wikipedia rolls; it's the easiest way to demonstrate supposed incompetence, and since incompetence on the part of the creator reflects on the tool, it is therefore the easiest way in which to get the tool removed (along with phrases such as "I'd like to retract my compliment", which I hate as much as Fram hates faux-bluelinks). Simples. jcc (tea and biscuits) 11:00, 13 August 2016 (UTC)[reply]
Having just checked the article for myself, if it was really "reviewed by other editors and passed" it reflects just as badly on those other editors as it does on you, given that it contained an entire paragraph of grossly libellous comments sourced entirely to an alleged reference which is on a completely unrelated topic and doesn't mention the subject once. (The fr-wikipedia article still contains the same paragraph, complete with fake reference.) Checking the review log for the page in question, I see no evidence that the claim that anyone else reviewed it is actually true. ‑ Iridescent 15:46, 13 August 2016 (UTC)[reply]

I realise this isn't a vote, but I agree with Tazerdadog that having such a tool easily available is sending the wrong message. It needs to be restricted to experienced users, with plenty of warnings around it. Deb (talk) 13:28, 17 August 2016 (UTC)[reply]

You are all panicking. There's nothing wrong with using the GoogleTrans gadget with the Content Translation system if the appropriate editing happens alongside it. The ordinary review process can uncover articles that are not translated well enough. I'm being punished for showing ingenuity here. Punishing innovation is a modern trait, I find. Endo999 (talk) 07:31, 18 August 2016 (UTC)[reply]
No, our review processes are not adequate for this. Both the problems with translated articles, and the unrelated but similar problems with tool-created articles (now discussed at WP:ANI), show the problems we have in detecting articles which superficially look all right (certainly when made by editors with already some edits) but which are severely deficient nevertheless, and in both cases the problems were worse because tools made the mass creation of low quality articles much easier. While this is the responsibility of the editors, not the tools, it makes sense to dismiss tools which encourage such creations. Fram (talk) 09:19, 18 August 2016 (UTC)[reply]

User:Fram per "We don't care about what happens on other wikipedia language versions" please speak only for yourself. Some of us care deeply what happens in other language versions of Wikipedia. User:Endo999's tool is not a real big issue. It does appear that the Fatima Yvelain article needs to have its references checked / improved before translation. And of course the big thing with translation is that to end up with good content you need to start with good content. Doc James (talk · contribs · email) 17:41, 18 August 2016 (UTC)[reply]

We, on enwiki, don't care about what happens at other language versions: such discussions belong either at that specific language or at a general site (Wikimedia). These may involve the same people of course. 19:45, 18 August 2016 (UTC)

Do people feel that a RFC on this topic would be appropriate/helpful? The discussion seems to have fixated on minute analyses of Endo999's editing, which is not the point. The discussion should be on whether the presence of the gadget is an implicit endorsement of machine translated materials, and whether its continued presence sends the wrong message. Tazerdadog (talk) 22:33, 23 August 2016 (UTC)[reply]

Yes, I believe an RfC would be helpful assuming it is well prepared.--Ymblanter (talk) 05:49, 24 August 2016 (UTC)[reply]
The GoogleTrans gadget has been running on the enwiki for the last 7 years and has 29,000 people who load the gadget when they sign into Wikipedia. It's quite a successful gadget and certainly, wiki to wiki translators have concentrated on the gadget because while they may know English (when they are translating articles between the enwiki and their home wikis) they like to get the translation of a word every once in a while. Endo999 (talk) 17:05, 24 August 2016 (UTC)[reply]
Discussion on this matter seems to have mostly died down, but I was unaware of this discussion until now and I feel the need to speak up on behalf of translation tools. I don't believe the tool being discussed here is the one I am using, since *it* does not provide a machine translation into English. However. I do put English into French based on the machine translation. I repeat, *based on*. Many of my edits to date have been translation and cleanup after translation, so I am probably close to an ideal use case. The tool, Yandex.Translate, appeared on my French wikipedia account and I do find it useful, although it produces text that needs to be gone over 4-5 times, as, yes, it sometimes creates inappropriate wikilinks, often in the case where a word can mean a couple of different things and the tool picks the wrong one. And it consistently translates word by word. I have submitted a feature request for implementation of some basic rules -- for example in German the verb is always the last word in the sentence and in French the word order is almost always "dress blue" not "blue dress". But there are many many MANY articles with word order problems on Wikipedia; it's just usually more subtle than that when the originating editor was human but not a native English speaker. So it's a little like fixing up the stilted unreferenced prose of someone who can't write but yea verily does know MUCH more about the topic than I do. And has produced a set of ideas, possibly inelegantly expressed, I would not have conceived of. The inelegant writing is why we have all this text in a *wiki*
For the record, I agree that machine-translated text is an anathema and have spent way too many hours rescuing articles from its weirdnesses, such as "altar" coming through as "furnace branch" in Notre-Dame de la Garde. BUT. Used properly, machine translation is useful. For one thing it is often correct about the translation for specific obscure words. I deeply appreciated this when, for example, I was doing English into French on a bio of a marauding Ottoman corsair who, at one point or another, invaded most of the Mediterranean. I am an English speaker who was educated in French and has spent years operating in French, but the equivalent terms for galleon, caravel, Papal States, apse and nave, for example, not to mention Crusader castles and Aegean islands, weren't at the tip of my tongue. Its suggestions needed to be verified, but so do Google Search results. I could look these words up, sure, and do anyway, but Yandex gives my carpal tendons a break, in that I can do one thing at a time, i.e. translate a bit of text like "he said" then check to make sure that wikilink is correct, move down to the next paragraph and do some other simple task like correcting word order while I mull why it is that the suggested translation sounds awkward, walk away and come back... All of this is possible without the tool, but more difficult, and takes much longer. I have translated more articles in the past month, at least to a 0.95 version, than I had in the entire previous several years I've been editing Wikipedia. Since the tool suggests articles that exist on one wikipedia but not the other, I am also embarking on translations I otherwise would not, because of length or sheer number of lookups needed to refresh my memory on French names for 16th-century Turkish or Albanian settlements or for product differentiation or demand curve or whatever.
Or simply because while the topic may be important it's fundamentally tedious and needs to be taken in small doses, like some of the stuff I've been doing with French jurisprudence and which is carefully labeled, btw, as a translation in progress on those published articles that are still approaching completion.
I agree that such tools should not be available to people who don't have the vocabulary to use them. I don't really have suggestions as to what the criteria should be, but there is a good use for them. They -- or at least this tool -- do however make it possible to publish a fully-formed article, which reduces the odds of cranky people doing a speedy delete while you are pondering French template syntax for {{cn}} or whatever. This has happened to me. The tool is all still kinda beta and the algorithm does ignore special characters, which I hope they remedy soon. (In other words ê becomes e and ç becomes c etc.) Also, template syntax differs from one wiki to another so infoboxes and references often error out when the article is first published. Rule of thumb, possibly: don't publish until you can spend the hour or so chasing this sort of thing down. And the second draft is usually still a bit stilted and in need of an edit for idiom. But the flip side of that is that until you do publish, the tool keeps your work safe from cranky people and in one place, as opposed to having to reinvent the version management wheel or wonder whether the draft is in Documents or on the desktop. Some people complain within 3 minutes of publication that the article has no references without taking the time to realize that the article is a translation of text that has no references. As the other editor said above, translation tools aren't magic and won't provide a reference that isn't there or fix a slightly editorial or GUIDEBOOK tang to language -- this needs to come next as a separate step. When references are present the results are uneven, but I understand that this issue *is* on the other hand on the to-do list. Anyway, these are my thoughts on the subject; as you can see I have thunk quite a few of them and incidentally have reported more than one bug.
But we are all better off if people like me do have these tools, assuming that there is value in French wikipedia finding out about trade theory and ottoman naval campaigns, and English wikipedia learning about the French court system. Elinruby (talk) 08:39, 31 August 2016 (UTC)[reply]


List of articles to consider for the newly-enacted X2


Articles created by block-evading sock using the WMF translation tool

My attention was drawn at a site I should not link to and therefore will not name (however, the thread title is "The WMF gives volunteers another 100K articles to check") to the fact that Duckduckstop created several articles using the WMF translation tool. They were blocked on 5 April as a sock of a blocked user, and their edits are thus revertable. I checked one translated article as a test, John of Neumarkt, and I've seen worse, but it is clearly based on a machine translation and contains at least one inaccurate and potentially misleading passage: "Auch in Olmütz hielt sich Johannes nur selten auf" does not mean "Also in Olomouc, John held only rarely"; it means he rarely spent any time there, but a reader might either not understand that or think it meant he rarely claimed the title. Wikipedia:Administrators' noticeboard/CXT/Pages to review contains thousands of pages, the vast majority still to be checked. Only a few of us are working there. I feel guilty having taken a few days off to write 2 new articles. I haven't looked through Duckduckstop's page creations to see what proportion were created with the translation tool, but that one has not been substantially edited by anyone else. I suggest that in this emergency situation, it and others that fall into both categories—translation tool, and no substantial improvements by other editors—be deleted under the provision for creations by a blocked/banned user. Yngvadottir (talk) 17:42, 27 August 2016 (UTC)[reply]

Hi there. I have had a look today at that list, but haven't really been posting comments since as far as I could see nobody else has been there in several days. I do not know what happened with duckduckstop but as to the articles on the list
- I do not understand why an article about a French general who invaded several countries under Napoleon is nominated for deletion, as far as I can tell solely based on authorship? Do we not trust the content because of the person who wrote it? Can someone explain this to me? I glanced at the article quickly and the English seems fine. This is a serious question; I don't get it. Also, why did we delete Genocide in Guatemala? It was already redlinked when I noticed it, but unless the article was truly astonishingly bad, I would have made an effort to clean that one up. Personally. Considering some of the stuff that's been on the "cleanup after translation" list the past few years --- we have had articles on individual addresses in Paris. We have lists of, say, songs on a 1990s album in Indonesian, sheriffs of individual municipalities in Wales (one list per century), and government hierarchies in, well, pretty much everywhere.
- I have a suggestion: The person who decides that we need a set of articles for each madrasa in Tunis, water tower in Holland or mountain in Corsica is responsible for finishing the work on the articles in the set to a certain standard. Which can be quite low, incidentally. I have no objection to some of the association football and track and field articles that are being nominated for deletion. They may not be sparkling, enthralling prose but they are there and tell you, should you want to know, who that person is. Similarly the articles about figures in the literature of Quebec, while only placeholders, do contain information and are preferable to nothing. Although I don't see machine translation as the huge problem some people apparently do, the translation tool also does need work. It might be nice if it sent articles to user space by default, and the articles could then be published from there after polishing. Elinruby (talk) 14:26, 8 September 2016 (UTC)[reply]
Guatemalan genocide was redirected, not deleted, for being a very poor translation, resulting in sentences like this (one sentence!): "The perpetración of systematic massacres in Guatemala arose of the prolonged civil war of this country centroamericano, where the violence against the citizenship, native mayas of the rural communities of the country in his majority, has defined in level extensivo like genocide -of agreement to the Commission for the Esclarecimiento Historical- according to the crimes continued against the minoritary group maya ixil settled between 1981 and 1983 in the northern demarcation of the department of The Quiché, in the oil region of the north Transversal Band, with the implication of extermination in front of the low demographic density of the etnia -since it #finish to begin to populate the region hardly from the decade of 1960- and the migration forced of complete communities to the border region in search of asylum in Chiapas, Mexico , desarraigadas by the persecution; in addition to becoming like procedure of tactical State of earth arrasada, tortures, disappearances, «poles of development» -euphemism for fields of concentration- and recurrent outrages against the women and girls ixiles, many of them dying by this cause, crimes of lesa humanity against of all the international orders of Human rights." Fram (talk) 08:56, 12 September 2016 (UTC)[reply]
Heh. That's not unusual. But see there *is* an article, which was my primary concern. I should have checked before using it as an example. Here is the point I was trying to make. Since apparently I didn't, let me spell it out. -- I have put in a considerable amount of time on the "cleanup after translation" list so yes, I absolutely agree that horrible machine translations exist. I have cleaned many of them up. But. Many of the articles we keep are extremely trivial. Many get deleted that seem somewhat important, actually, just not to the particular person who AfD's them. I have seen articles on US topics get kept because of a link to Zazzle. (!) Perhaps my POV is warped by the current mess I am trying to straighten out in the articles on the French court system, but it seems to me that the English wiki is rather dismissive of other cultures. (Cour d'assises != Assizes, just saying; this is what we call a cognate.) That is all; just something that has been bothering me. Elinruby (talk) 05:59, 13 September 2016 (UTC)[reply]

The interim period ends today

But most articles have not been reviewed--it will apparently take many months. Of the ones still on the list that I have reviewed, I am able to find at least one-third which are worth rescuing and which I am able to rescue. We need a long continuation. If this is not agreed here, we will need to discuss it on WP:ANB. I would call the discussion "Emergency postponement of CSD X2". DGG ( talk ) 04:15, 6 May 2017 (UTC)[reply]

My understanding was that we were still working out how to begin the vaccination process. I'm happy if we simply moved to draft space instead of deleting at the end of the two weeks, but I'm not sure if that would address your concern. Tazerdadog (talk) 22:15, 6 May 2017 (UTC)[reply]

@S Marshall, Elinruby, Cryptic, No such user, Atlantic306, DGG, Acer, Graeme Bartlett, Mortee, Xaosflux, HyperGaruda, Ymblanter, BrightR, and Tazerdadog:
I call "reltime" on the section title! ;-) But seriously, it does end in a few days, and although I've been active in pushing to stick with the current date (June 6) to finalize this, so I almost hate to say this, but I'd like to ask for a short postponement, for good cause. This is due to two different things that have happened in the last few days, that materially change the picture, imho:
  • CXT Overwrites - this issue about CXT clobbering good articles of long-standing, was raised some time ago, and languished, but has been revived recently, and we now (finally!) have the list of overwrites we were looking for in order to attack this problem: around 200 of them. All that remains to completely solve this for good, is to go through the list, and if the entry also appears in WP:CXT/PTR, strike it. See WP:CXT/PTR/Clobbers for details.
  • Asian language review - this was stalled for lack of skilled translator/proofreaders in these and other languages. In response to a suggestion by Elinruby, I made an overture a few days ago about starting a recruitment effort. Since time is so short, rather than wait for a response, I went ahead and started one at WP:CXT/PTR/By language. In just three days[a] this has started to bear fruit, with editors working on articles in Gujarati, Hindi, Hungarian, Farsi, Romanian and Arabic; with over 50 or 60 analyzed. I'm ready to ramp up the recruitment effort on Turkish, Chinese, Japanese, Russian, and more European languages (hopefully with the help of others here) but this does need some time as it's only got started literally in the last few days.
A postponement would give us the time to save all the clobbers, and make a significant dent in the articles from Asian and other languages for which we don't have a lot of expertise. Mathglot (talk) 06:19, 1 June 2017 (UTC)[reply]

Notes

  1. ^ That is to say, four days less than it took Dr. Frank-N-Furter to make Rocky a man.
My understanding is that the clobbers have all been taken care of. This leaves the Asian language articles. I'm sure that if someone with the needed language skills comes along in the future, admins would be more than happy to mass-undelete the drafts so that they could be reviewed. However, I don't see a reason to postpone in the hope that this will occur. Tazerdadog (talk) 18:45, 2 June 2017 (UTC)[reply]
Clobbers *are* taken care of, because we (two of us) have been taking care of them. Asian (and other) languages have plenty of translators that could take care of them, it's not a matter of "hoping" for anything in the future, they exist now, so all we have to do is continue the effort begun only a few (5) days ago here. Going forward, this should be even more efficient, now we have the results of Cryptic's queries 19218 and 19243 created only today, and wikified here: WT:CXT/PTR/By language. We have editors working on Gujarati, Hindi, Bengali, Arabic, Romanian, and Hungarian, with more in the pipeline. This is a ton of progress in five days, and I wish it had been thought of a month ago, but it wasn't, and we are where we are. A postponement will simply allow ongoing evaluations by editors who were recruited less than a week ago and are delivering fast results to continue instead of being cut off, and additional languages to be handled. Go look at WP:CXT/PTR/By language to see what has been accomplished so far, and at what speed. @Cryptic and Elinruby:. Mathglot (talk) 07:52, 3 June 2017 (UTC)[reply]
Still need to recruit de, bg and ru. Also still very distracted by real life -- I have had one parent die and another go into hospice in the course of this project, and we have still gotten all this done, so it's not like we are dragging this out into never-never land. A majority of these articles are rescuable, esp as we bring in new editors who are not burned out by re-arranging the word order of the sentences for the 10,000th time. I think the really stellar articles have all been flagged now, but we have still found some very recently and I have said this before. Beyond the really stellar though are the many many not-bad articles and the more mediocre ones that are nonetheless easier to fix than to do over. I am in favor of an extension, personally, though as we all know I would not have started this at all if it were up to me. Many of the really bad articles were already at PNT.
I will be flying almost all day today but will check into wikipedia tonight. Elinruby (talk) 16:19, 3 June 2017 (UTC)[reply]
I'm involved in many other things, and get here as I can, and each time I do, I find more that can and should be rescued. There are whole classes of articles, like those of small towns or sports stadiums, which have merely been assumed to be of secondary importance and not actually looked at. If we delete now, we will be judging articles by the title. It is very tempting to easily remove all the junk by removing everything, but that's the opposite of sensible, and the opposite of WP:PRESERVE. DGG ( talk ) 09:45, 5 June 2017 (UTC)[reply]
I found a couple today that kind of amazed me, they were so good. But let's play out the Chinese fire drill. I'm afraid we're going to find out that we've all done a huge amount of work to delete 30 articles that need to be deleted and 350 whose authors will not contribute again. Anyway. I have not touched stadiums, personally, because I suspect they will be deleted for notability so why? Ditto all these people with Olympic gold medals because I already have plenty to do without getting involved with articles that are certain to be deleted, not to mention all the Argentinian actresses and whatnot.... grumble. Gonna go recruit some Chinese and Norwegians, because the articles are just going into some other namespace we can still send links to, right? Elinruby (talk) 02:14, 6 June 2017 (UTC)[reply]
The plan is to draftify the articles prior to deletion, but I think deletion can be postponed basically indefinitely once they are draftified Tazerdadog (talk) 02:31, 6 June 2017 (UTC)[reply]
@Tazerdadog: I'd like to be sure of that. This is why you lose editors, wikipedia... anyway. Am cranky at the moment. Let me get done what I can with this and then I'll have some things to say. Hopefully some intelligent and civil things. Are we really getting articles from PootisHeavy still? Elinruby (talk) 02:46, 6 June 2017 (UTC)[reply]
@Elinruby: Fixing pings like you just did doesn't work. Pings only work if you sign your post in the same edit and do nothing but add content. Pppery 02:49, 6 June 2017 (UTC)[reply]
@Elinruby: The second part of the statement I made above is a departure from the established consensus as I understand it. The plan which achieved consensus was to draftify, hold in draft for just long enough to check for massive clerical errors, and then delete. I floated the above statement to try to gain consensus to hold the articles in draft space for longer (or indefinitely). While it is important to get potential BLP violations and gross inaccuracies out of mainspace in a timely manner, I don't think it is nearly so important to delete the drafts, especially if salvageable-to-good articles are regularly being pulled out of them. Tazerdadog (talk) 03:06, 6 June 2017 (UTC)[reply]
@Tazerdadog: Thanks for the suggestion. I think that would help limit the potential for damage and it would alleviate some of my concerns. My assessment differs widely from what I keep reading on this board, but, hey. If anyone cares I would say that 10% of these articles are stellar and very advanced and sophisticated translations. Don't need a thing. Another 10% are full or partial translations, quite correct, of articles that do not meet en.wiki standards for references or tone but do faithfully reflect the translated article. Many of these are extremely boring unless you are doing nitty-gritty research into something like energy policy in Equatorial Guinea, but they then become important... About another 5% I cannot read at all and let's say another 10% are heavy going and require referencing one or more equivalent articles in other languages. Say 5% if anyone ever gets around to dealing with PootisHeavy. The rest are... sloppy English but accurate, unclear but wikilinked, or some other intermediate or mixed level. This has not, in my opinion, been a good use of my time and I have stopped doing any translations, personally, until we get some sanity here. The whole process, it seems to me, simultaneously assumes that translation is easy and also that it is of no value. If Wikipedia does not value translation then -- argh. It just makes me to see a good organization eat its own foot this way, is all. Off to see if I can catch us a Nepali speaker ;) Elinruby (talk) 03:26, 6 June 2017 (UTC)[reply]
@Elinruby: No Nepali speakers needed, there are no Nepali articles in the batch, afaict. Also, Pootis stopped translating following a March 23 addition to his talk page, currently at #53 on the page.
@Tazerdadog: Whatever kind of draft/quarantine/hyperspace button you press, I plan to carry on with some of the Asian and other languages recruitment which we only recently got started on (which is going great, btw, and we could use some more help over there if anyone wants to volunteer). I'll want to modify the editor recruitment template so that it can blue-link articles in whatever new location they reside in, so hopefully it will be a nice, systematic mapping of some sort so a dumb template can easily be coded to figure out the new location, given the old one. Just wanted to mention that, so that you can keep it in mind when you come up with the move schema. Naturally, if it's just a move to Draft namespace, then it will be an easy fix to the template.
There is one article in Nepali. I have not invited anyone for it yet, though I did do some of the less populated languages like Latvian, Indonesian and Polish. I have several answers (da, es, pt as I recall) and most articles passed. I will put translated templates and strike those articles shortly. And yes, I just now struck one today. Anything about 3-d modeling is notable imho and I will work on it as long as I can read it at all. Also some of the bad translations about historical documents may be fixable given the response we are getting. If either of you gets enough help/time there are quite a few es/pt/de articles that I did that I believe to be correct but cannot myself certify in terms of the translated template. Elinruby (talk) 00:29, 8 June 2017 (UTC)[reply]
@DGG: I withdraw my aspersions on the section title name. This offer valid for twenty-four hours. Mathglot (talk) 08:24, 6 June 2017 (UTC)[reply]
I do not understand what you mean by this. I assume you mean you are withdrawing the attempt to start mass deletions immediately. If not, please let me know--for I will then proceed to do what I can to prevent them--and, if possible, to try to change policy so that no X- speedy criteria can ever again be suggested. The more of these translations I look at, the more I find that should be rescued. DGG ( talk ) 23:59, 7 June 2017 (UTC)[reply]