Wikipedia talk:Large language models
This is the talk page for discussing improvements to the Large language models page.
Archives (index): 1, 2, 3, 4, 5, 6, 7 · Auto-archiving period: 14 days
This project page has been mentioned by multiple media organizations.
This project page does not require a rating on Wikipedia's content assessment scale. It is of interest to several WikiProjects.
Chatbot to help editors improve articles
This section is pinned and will not be automatically archived.
I wrote a user script called WikiChatbot. It works by selecting text in an article and then clicking one of the buttons on the right to enquire about the selected text. It includes many functions. For example, it can summarize and copyedit the selected text, explain it, and provide examples. The chat panel can also be used to ask specific questions about the selected text or the topic in general. The script uses the AI model GPT 3.5. It requires an API key from OpenAI. New OpenAI accounts can use it freely for the first 3 months with certain limitations. For a more detailed description of all these issues and examples of how the script can be used, see the documentation at User:Phlsph7/WikiChatbot.
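(For readers curious how such a script works under the hood, here is a minimal sketch of the selection-to-API flow described above, assuming the standard OpenAI Chat Completions endpoint. The function name and prompt wording are illustrative only; this is not the actual WikiChatbot code.)

```javascript
// Hypothetical sketch: send the text the user has selected in the article
// to the OpenAI Chat Completions API and return the model's reply.
// "apiKey" must be supplied by the user; "task" is e.g. "Summarize".
async function askAboutSelection(apiKey, task) {
  const selected = window.getSelection().toString();
  if (!selected) {
    return null; // nothing highlighted in the article
  }
  const response = await fetch('https://api.openai.com/v1/chat/completions', {
    method: 'POST',
    headers: {
      'Content-Type': 'application/json',
      'Authorization': 'Bearer ' + apiKey
    },
    body: JSON.stringify({
      model: 'gpt-3.5-turbo',
      messages: [{ role: 'user', content: task + ':\n\n' + selected }]
    })
  });
  const data = await response.json();
  return data.choices[0].message.content;
}

// Example: askAboutSelection(myKey, 'Summarize the following text');
```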
I was hoping to get some feedback on the script in general and how it may be improved. I tried to follow WP:LLM in writing the documentation of the chatbot. It would be helpful if someone could take a look to ensure that it is understandable and that the limitations and dangers are properly presented. I also added some examples of how to use edit summaries to declare LLM usage. These suggestions should be checked. Feel free to edit the documentation page directly for any minor issues. I'm also not sure how difficult it is to follow the instructions so it would be great if someone could try to set up the script, use it, and explain which steps were confusing. My OpenAI account is already older than 3 months so I was not able to verify the claims about the free period and how severe the limitations are. If someone has a younger account or is willing to open a new account to try it, that would be helpful.
Other feedback on the idea in general, on its problems, or on new features to implement is also welcome. Phlsph7 (talk) 12:45, 12 July 2023 (UTC)
- I meant to reply to this sooner. This is awesome and I'm interested in this (and related ideas) related to writing / reading with ML. I'll try to have a play and give you some feedback soon. Talpedia 10:18, 17 July 2023 (UTC)
- Related: see also m:ChatGPT plugin. Mathglot (talk) 07:22, 18 July 2023 (UTC)
- Whilst I rather like the ability of this nifty little script to do certain things, I do have some criticism. These functions strike me as extremely risky, to the point that they should probably be disabled:
- "is it true?" - ChatGPT likely uses Wikipedia as a source, and in any case, we want verifiability, not truth. I feel quite strongly, based on several other reasons too, that this function should be disabled and never see the light of day again.
- "is it biased?" - ChatGPT lacks the ability to truly identify anything more than glaring "the brutal savages attacked the defenceless colonist family" level bias (i.e. something that any reasonably aware human should spot very quickly indeed). Best left to humans.
- "is this source reliable?" - Same as the first one, this has so much potential to go wrong that it just shouldn't exist. Sure it might tell you that Breitbart or a self-published source isn't reliable, but it may also suggest that a bad source is reliable, or at least not unreliable.
- I don't think that any amount of warnings would prevent misuse or abuse of these functions, since there will always be irresponsible and incompetent people who ignore all the warnings and carry on anyway. By not giving them access to these functions, it will limit the damage that these people would cause. Doing so should not be a loss to someone who is using the tool responsibly, as the output generated by these functions would have to be checked so completely that you might as well just do it without asking the bot.
- The doc page also needs a big, obvious warning bar at the top, before anything else, making it clear that use of the tool should be with considerable caution.
- The doc page also doesn't comment much on the specific suitability of the bot for various tasks, as it is much more likely to stuff up when using certain functions. It should mention this, and also how it may produce incorrect responses for the different tasks. It also doesn't mention that ChatGPT doesn't give wikified responses, so wikilinks and any other formatting (bold, italics, etc.) must be added manually. The "Write new article outline" function also seems to suggest unencyclopaedic styles, with a formal "conclusion", which Wikipedia articles do not have.
- Also, you will need to address the issue of WP:ENGVAR, as ChatGPT uses American English, even if the input is in a different variety of English. Mako001 (C) (T) 🇺🇦 01:14, 23 July 2023 (UTC)
- You can ask it to return wikified responses and it will do so with a reasonably good success rate. -- Zache (talk) 03:03, 23 July 2023 (UTC)
- @Mako001 and Zache: Thanks for all the helpful ideas. I removed the buttons. I gave a short explanation at Wikipedia:Village_pump_(miscellaneous)#Feedback_on_user_script_chatbot and I'll focus here on the issues with the documentation. I implemented the warning banner and added a paragraph on the limitations of the different functions. That's a good point about the English variant being American, so I mentioned that as well. I also explained that the response text needs to be wikified before it can be used in the article.
- Adding a function to wikify the text directly is an interesting idea. I'll experiment a little with that. The problem is just that the script is not aware of the existing wikitext. So if asked to wikify a paragraph that already contains wikilinks then it would ignore those links. This could be confusing to editors who only want to add more links. Phlsph7 (talk) 09:12, 23 July 2023 (UTC)
- I made summaries/translations/etc. by giving wikitext as input to ChatGPT instead of plaintext. However, the problem here is how to get the wikitext from the page in the first place. -- Zache (talk) 09:48, 23 July 2023 (UTC)
- In principle, you can already do that with the current script. To do so, go to the edit page, select the wikitext in the text area, and click one of the buttons or enter your command in the chat panel of the script. I got it to add wikilinks to an existing wikitext and a translation was also possible. However, it seems to have problems with reference tags and kept removing them, even when I told it explicitly not to. I tried it for the sections Harry_Frankfurt#Personhood and Extended_modal_realism#Background, both with the same issue. Maybe this can be avoided with the right prompt. Phlsph7 (talk) 12:09, 23 July 2023 (UTC)
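(A side note for anyone implementing the "get the wikitext in the first place" step discussed above: the standard MediaWiki Action API can return a page's raw wikitext directly. A minimal sketch, with the function name illustrative and error handling omitted:)

```javascript
// Sketch: fetch the current raw wikitext of a page via the MediaWiki
// Action API, so it can be passed to the model instead of plaintext.
async function getWikitext(title) {
  const url = 'https://en.wikipedia.org/w/api.php'
    + '?action=parse&prop=wikitext&formatversion=2&format=json'
    + '&origin=*&page=' + encodeURIComponent(title);
  const data = await (await fetch(url)).json();
  return data.parse.wikitext; // raw wikitext, refs and wikilinks included
}

// Example: getWikitext('Large language model').then(console.log);
```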
- Thanks for setting this up. I've recently had success drafting new Wikipedia articles by feeding the text of up to 5 RS into GPT4-32k through openrouter.com/playground and simply asking it to draft the article. It does a decent job with the right prompt. You can see an example at Harrison Floyd. I'll leave more details on the talk page of User:Phlsph7/WikiChatbot, but I wanted to post here for other interested parties to join the discussion. Nowa (talk) 00:02, 20 September 2023 (UTC)
- Thanks for the information. I've responded to you at Talk:Harrison_Floyd#Initial_content_summarized_from_references_using_GPT4 so that we don't have several separate discussions about the same issue. Phlsph7 (talk) 07:44, 20 September 2023 (UTC)
- Ran into a brick wall I thought might be helpful to know about. I've been working on the bios of people associated with Spiritual_warfare#Spiritual_Mapping_&_the_Charismatic_movement. GPT-4 and LLaMA refused to read the RS, claiming that it was "abusive". I can see from their point of view why that is, but nonetheless, RS is RS, so I just read it manually. Between that and the challenges of avoiding copyvios, I'm a bit sour on the utility of LLMs for assisting in writing new articles. It's just easier to do it manually. Having said that, the Bing chatbot does have some utility in finding RS relative to Google. Much less crap. Nowa (talk) 00:35, 9 October 2023 (UTC)
If we're going to allow LLM editing, this is a great tool to guide editors to the specific use cases that have community approval (even if those use cases are few to none at this point). I found it to be straightforward and easy to use. –dlthewave ☎ 16:06, 23 July 2023 (UTC)
- There is no policy or guideline disallowing the use of LLM or other machine learning tools. No need for any approval unless that changes. MarioGom (talk) 17:29, 11 February 2024 (UTC)
Seeking positive suggestions - maybe for a 'good usage' section
I request in this thread folks provide some suggestions for positive ways LLMs can or should be used, and suggestions for a title if it becomes a separate section for such. The essay here is highly negative - my editor counts 154 "not"s here, and guidance such as WP:LLMCOMM is all about blocking it out. There doesn't seem to be content here reflective of external remarks that WP has a strategy to use AI, such as here or here or AI Strategy Brief for Editors - STABLE - 2025-02-10 here, or Artificial intelligence in Wikimedia projects, and official WP policy WP:BOTPOL also seems to regard it as Skynet. But let's face it -- most users are getting search assist or Alexa or something first along with or instead of WP, and it would be good to be value added not value excluded.
I'd like to see some *useful* suggestions stated, ideas on how to maybe use AI. Surely folks can think of some things, unless they'd like to step aside for folks out there smarter than me who can?
I'm not looking for AI to just be some enhanced editor that humans supervise closely -- I'm looking at AI to do things humans cannot do or do poorly. As an initial start I will offer a few notions.
- Creating initial drafts for articles or article sections, identifying major themes in external sources and differentiating pop-press from authoritative content.
- Providing evidence for TALK discussions - uncovering points while taking the personality out of it and avoiding claims of cherry-picking.
- (I've seen a lot of TALK and RFC discussions with wild claims of WEIGHT or Consensus or inability to look at all relevant WP guidances ...)
- Citation generator - finding and listing good reference works in good style that editors can look at
Cheers Markbassett (talk) 15:29, 29 October 2025 (UTC)
- I've seen zero evidence that LLMs can do any of those things to any useful degree of reliability.
- 'Initial drafts': Firstly, and quite obviously, LLMs have no access whatsoever to sources unless they are online. This would make identifying 'major themes' deeply problematic even without the many other issues LLMs have with citations. Which begin with the simple fact that LLMs routinely 'cite' articles with sources selected not because they actually support content, but because the title suggests they might. And with astonishing frequency mangle up the citations in doing so, if they don't hallucinate them entirely. At best, LLM output might find the occasional useful source for article content, but you'll need to go through whole slews of mangled titles and god-knows-what to find such stuff - a very inefficient and untrustworthy search engine. Note also that as all current LLMs have (as far as I'm aware at least) been trained on large quantities of Wikipedia output, any draft on anything of any complexity covered by Wikipedia is liable to generate content violating WP:CIRCULAR, only without the citation to Wikipedia to admit to it.
- 'Evidence for TALK': Unless you've actually checked the sourcing, it isn't 'evidence'. And it is already cherry-picked, due to online bias, the biases inherent in prompting an LLM to produce what it thinks is appropriate content for Wikipedia - which as we've repeatedly seen, frequently fails to conform to core policies, and in the case of Grok at least, is deliberately prompted to reflect the political stance of the owner.
- 'Citation generator'. NO. NOT EVER. NOT UNDER ANY CIRCUMSTANCES. Absolutely cannot be trusted, per my earlier comments. They mangle citations. They invent them. They 'cite' based on titles, rather than on the content of the document they are supposedly referring to.
- In summary, beyond maybe using LLM output as an extra (inefficient and inaccurate) source search, they are functionally useless, and liable to lead all but the most careful and experienced contributors (who should need this sort of help least) right up the garden path. AndyTheGrump (talk) 23:07, 29 October 2025 (UTC)
- Andy - Perhaps you need a newer or better LLM too - because I think I've seen them doing these three things better than human average in WP, and think these are suggested uses turning up in a generic search.
- In any case, this thread topic is for brainstorming and contributing suggested uses. Please try to give some positive ideas on where and how to use LLM, or ways disliked least, because without such the natural consequence is people use it in unguided ways since all uses are equally valued. If you have no preference, then you won't particularly mind the preferences of others - but I'm looking for some additional ideas here. Cheers
- Markbassett (talk) 19:55, 31 October 2025 (UTC)
- We can't base recommendations to use LLMs on your personal experience. If you can point to e.g. peer-reviewed research which supports your claims, fine, but meanwhile, we KNOW that LLMs hallucinate. And we KNOW that it has been mathematically proven that this is inherent to the software, and can't be fixed. And no, I'm not going to invent fictitious reasons to use LLMs. If you want an honest appraisal, you don't go around insisting only on 'positive ideas'. And yes, I do mind what people do with this stuff - which is why I'm advocating that they shouldn't. AndyTheGrump (talk) 20:28, 31 October 2025 (UTC)
- Grump - I think you're WP:OFFTOPIC. This thread was clearly stated as asking for positive suggestions on where/what/how such should be used for WP, not about more doubts or more barriers or asking where things are imperfect all over - humans and LLMs. If you're saying you want to remove all the essay negatives that do not meet the peer-reviewed research standard, then say so - I suppose that would fit the thread as less of the negatives would sort of count as a 'positive'. If you want to seek research before adding an application suggestion, feel free to go ahead and do so. If you want to suggest such a peer-reviewed forum that explores applications which we should look to, feel free to do that - not an idea per se but maybe helpful to find such. Meanwhile, out in the world folks are using List of large language models and List of chatbots, and it's going to get in WP and/or replace WP... I think this essay may partly choose which way the WP future might go. Cheers Markbassett (talk) 00:23, 4 November 2025 (UTC)
- I don't give a toss if you think that I'm off-topic. It is grossly inappropriate to try to restrict commentary on a controversial subject solely to those who support your position. That isn't a topic, it is an attempt to manipulate opinion by improper means. AndyTheGrump (talk) 07:16, 4 November 2025 (UTC)
- The only marginally good use I've seen posited is that of translation for non-English speakers needing to interact with this project. Otherwise, every use of LLM/AI I've seen on EN.WP is just a shit-show. —Locke Cole • t • c • b 00:59, 4 November 2025 (UTC)
- OK, thanks - that's one. Markbassett (talk) 13:34, 5 November 2025 (UTC)
- @Locke Cole Sounds like we haven't met. Hi, I am Polygnotus (allegedly). Polygnotus (talk) 21:34, 5 November 2025 (UTC)
- A thread on finding positive uses for llms that opens with using llms to generate article text is not helping its cause. CMD (talk) 04:48, 4 November 2025 (UTC)
- Again, the topic is asking to provide notion(s) for how to maybe use AI/LLM. Cheers Markbassett (talk) 13:46, 5 November 2025 (UTC)
- I used an AI image generator to add a funny image to WP:OMGWTF (I did manually manipulate it in Photoshop though, so it technically isn't 100% AI). But AI images in article-space are a non-starter for obvious reasons. Other than those edge cases, AI/LLM content should generally be avoided and care should be taken with using an AI/LLM in article research as such use is very easy to misread or misjudge (or simply be wrong) and can lead to undue influence in article editing. It's simply not worth the risk.
- Like for example, I just saw an esteemed editor with a long history on the project repeatedly use LLM in their edits, and despite protests claiming they were checking the output, many errors were found and ultimately that editor was indef blocked. —Locke Cole • t • c • b 17:57, 5 November 2025 (UTC)
- OK, thanks. That's another, and seems within WP:AIIMAGES - the AI guidance being still in the works is obvious there at its discussion link. Sorry to hear about the editor getting an indef ban; that seems symptomatic of excesses in the topic area and lack of a clear undo. I think it should aim more to be proportionate and providing a positive good. Cheers Markbassett (talk) 07:07, 7 November 2025 (UTC)
- @Markbassett See T360489 and my userspace. Polygnotus (talk) 21:32, 5 November 2025 (UTC)
- Thanks. Much more detail and tech than my little brainstorming in this thread.
- I think T360489 seems largely a list of potential candidates for officially-accepted WP bot service.
- (Though I'm thinking we may often already know better before an AI points it out but have difficulty doing right. Like me and my diet or weight ;-) ) Cheers Markbassett (talk) 07:23, 7 November 2025 (UTC)
- @Markbassett I am experimenting with having an LLM provide feedback on articles, write edit summaries, and help verify whether claims are supported by the source provided. In all cases the human makes the decision; the LLM only provides information. Polygnotus (talk) 09:53, 7 November 2025 (UTC)
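(For the claim-verification use just described, the workflow amounts to building a prompt around the claim and a source excerpt and letting a human review the answer. A hypothetical sketch; the prompt wording is illustrative, not Polygnotus's actual setup:)

```javascript
// Sketch: build an "is this claim supported?" prompt for an LLM.
// The model only reports; an editor makes the final judgment.
function buildVerificationPrompt(claim, sourceExcerpt) {
  return 'Source excerpt:\n\n' + sourceExcerpt + '\n\n'
    + 'Does this excerpt support the following claim? Answer "supported", '
    + '"not supported", or "unclear", then quote the relevant passage.\n\n'
    + 'Claim: ' + claim;
}

// Example: buildVerificationPrompt('X was founded in 1901', excerptText);
```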
LLM comment?
An editor posted an RFC review request at WP:AN (specifically, Wikipedia:Administrators' noticeboard#Request for Review of RfC Closure: Talk:Floppy disk). A number of editors, myself included, believe the entire request to be LLM/AI-written. The request was hatted and eventually closed (with a number of admins endorsing the closure). The request was made again, and in their most recent comment they claim they wrote the request.
Could we get some experienced eyes to take a second look and help determine if the editor is writing their own comments or utilizing AI? Thanks! —Locke Cole • t • c • b 05:23, 31 October 2025 (UTC)
- I'm perhaps out of turn here since I was involved in that RFC from randomly looking at RFC/A. But in hopes this helps: it doesn't seem to be LLM output, and not something to chase anyway.
- I think it was *not* a LLM output.
- (a) the Tom94022 challenge is present in his Sandbox2 at 12 successive edits across 3 days, with enough time between edits that it seems plausible for manual reviewing and manual changing, and
- (b) Tom94022 has previously demonstrated a long-winded pain-in-the-ass nature and deep involvement in tech topics widely as visible from his Talk page and edit history. Ding him for BLUDGEON perhaps, but don't think the whole thing was just popped out of a LLM.
- And I think that LLM question is really irrelevant -- because in short the closer put forward only two fairly generic lines declaring "consensus", and after Talking with ed17 an editor put forward an excessively long request asking for review to look at the strength of arguments in light of policy per WP:CONSENSUS. If someone will just do that it would seem a lot less effort than the already spent effort on long-winded rejections and commentary. Markbassett (talk) 21:11, 31 October 2025 (UTC)
- "... the Tom94022 challenge is present in his Sandbox2 at 12 successive edits across 3 days with enough time between edits that seems plausible for manual reviewing and manual changing"
- Inspection of the edit history tells another tale. As FaviFake noted, Special:Diff/1318588111 shows Tom inserting text with markdown syntax, which is a telltale sign of LLM usage. The other thing I'm noticing in that edit is that the inserted text has inserted line-breaks, which is also consistent with LLM use. I'd like someone uninvolved (but ideally good at spotting LLM use) to take a look at the initial HATGPT'd request and either opine here or at the still-open WP:AN discussion, as Tom is insisting he wrote the initial request. And what's more distressing to me is that his two most recent replies also show signs of LLM use...
- And this is more user-conduct related (and likely something for AN/I if this keeps up), but if he continues to insist he wrote it, then there's a WP:CIR issue here, because WP:V explicitly says the opposite of what he's claiming in his request ("Reliably sourced information must be included. Removal requires lack of sources, not editorial preference." vs. "While information must be verifiable for inclusion in an article, not all verifiable information must be included. Consensus may determine that inclusion of a verifiable fact or claim does not improve an article, and other policies may indicate that the material is inappropriate. Such information should be omitted or presented instead in a different article."). If he admits it was LLM generated, then we can at least give him the benefit of the doubt that it was an AI hallucination.
- @FaviFake and TonySt: Pinging the other two editors who noted LLM use. —Locke Cole • t • c • b 16:29, 1 November 2025 (UTC)
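(The "markdown syntax as a telltale sign" heuristic mentioned above is straightforward to mechanize: wikitext and markdown use different formatting conventions, so markdown syntax appearing in pasted wikitext suggests copied LLM output. A hypothetical sketch; the patterns are illustrative, not an exhaustive or authoritative detector:)

```javascript
// Sketch: flag markdown-style formatting, which wikitext does not use,
// as a possible sign of pasted LLM output.
function looksLikeMarkdown(wikitext) {
  const patterns = [
    /^#{1,6} /m,               // markdown headings; wikitext uses == Heading ==
    /\*\*[^*\n]+\*\*/,         // markdown bold; wikitext uses '''bold'''
    /\[[^\]\n]+\]\([^)\n]+\)/, // markdown links [label](url)
    /^- /m                     // markdown dash bullets; wikitext uses *
  ];
  return patterns.some(function (re) { return re.test(wikitext); });
}

// Example: looksLikeMarkdown('**Conclusion**') === true
```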
- I'll quote the relevant comments of mine:
- "Comment: [...] The "smart" quotes, large number of headings, each numbered, the incorrect capitalisation of every heading and bolded text, the extensive amount of boldface usage, the long bulleted lists using em dashes, shortcuts not being linked and instead being italicised, the incorrect spacing around slashes, the unnecessarily detailed and frequent references, the repetition of the words "as viewed through the lens of Wikipedia policy" three times, the unnecessary table, and the eerily uncommon words used make me think this is AI-generated." — Wikipedia:Administrators' noticeboard § c-FaviFake-20251027192200-Locke Cole-20251027190400
- "See Special:Diff/1318491635 and the rest of their sandbox's history. [...] Here's an edit in which they pasted markdown formatting, and another edit which created a slew of weird, empty references. I'd say this closure review was not written by them." — Wikipedia:Administrators' noticeboard § c-FaviFake-20251027193000-SnowFire-20251027192700 and Wikipedia:Administrators' noticeboard § c-FaviFake-20251027194300-Locke Cole-20251027193000
- For these reasons, I indeed believe there is an extremely high chance all their three comments were mostly generated using AI. FaviFake (talk) 16:34, 1 November 2025 (UTC)
- Again, I don't think that a human filing a request should be overlooked for a much harder/longer question, but is the question that interests folks whether the editor did 'enough' edits to the believed LLM output that they avoid the policy of WP:MEATBOT? Cheers Markbassett (talk) 14:16, 2 November 2025 (UTC)
- What? FaviFake (talk) 14:18, 2 November 2025 (UTC)
- I'm asking then why (or if) this is a topic involved with a request about an RFC close appeal? Perhaps it is just a side interest of LockeCole thinking that "entire request to be LLM/AI-written."
- My note of Tom's 12 successive edits had Locke noting edit 811 where "text inserted has line-breaks" and "text inserted has markdown syntax". Seems reassuring about not "entire request to be LLM/AI-written", but still thinking that part of the edits came from LLM or other paste. (And adding personal distress that two AN replies are thought to also have 'signs' of LLM usage.) So… interesting, but what is the WP policy in question here and does it relate to the RFC review request? Is it whether it is 'enough' edits that they avoid the policy of WP:MEATBOT? Something else? It still seems a long, unnecessary and yucky detour rather than just do a bit more than a two-line close and/or just say that nobody desires to do the optional review. Cheers Markbassett (talk) 03:16, 3 November 2025 (UTC)
- "what is the WP policy in question here"
- There's currently a discussion waiting to be closed at WP:RFCL about this exact issue. Policies may change any day now. FaviFake (talk) 15:26, 3 November 2025 (UTC)
- ???? Not seeing any relevant discussion at RFCL -- this did not say a location there, and still no policy named as here in question. The original RFC close and blowing off the review request both ended a couple of days ago, no review 'in light of' any policies was done, and all that before this thread even started, so I'm not seeing what this discussion is about. Still thinking the "entire request to be LLM/AI-written" sounded not so, and personally I still think it irrelevant - a human entered a request for review; it's the RFC content up for review, not the request up for inclusion. Anyway, think I'm done here -- over and out. Cheers Markbassett (talk) 00:54, 4 November 2025 (UTC)
- i was referring to Wikipedia:Closure requests#Wikipedia:Village pump (policy)/Archive 205#LLM/AI generated proposals? FaviFake (talk) 05:08, 4 November 2025 (UTC)
- Umm ??? I can't see a relevance. The Wikipedia:Administrators' noticeboard#Request for Review of RfC Closure: Talk:Floppy disk) and the belief "that the entire request was LLM/AI-written" has gotten to discussion of LLM like that other request two months ago about a requested move and whether LLM input is becoming undetectable. I maybe need it spelled out more, but I am not seeing a particular policy point named in that for this discussion nor anything here stating the policy point at stake. The TALK here asked for a look at the Floppy Disk discussion and got evidence counterindicating the belief that Floppy disk appeal "entire request was LLM/AI-written", so comments in that other discussion which may be relevant seem like WP:HATGPT does not apply on this and "We really should not care whether it is or isn't AI-generated, that's just wasting everybody's time trying to determine something that is irrelevant."
- Somehow I don't think it is a feasible option nor intent to UNhat and restart the request from HATGPT now being counter-indicated -- nor to offer the editor in question some apology about the hatting -- so there seems no action under consideration here.
- Again - it seems far simpler, faster, easier, and more respectful of WP:AGF to note a human made a request for a close review, so either review the close or choose to not review the close. Only the text at the RFC and the close is up for review in that. It even seems inappropriate to have content in the request affect the outcome, because that's not part of what the RFC closer had to work with, is not what the request asked to be reviewed, and additional post-close evidence or single-voice arguing should not be factors in such a review. Maybe it would be more useful to alter the WP:CLOSECHALLENGE guideline, limiting the length of the request to shortly identifying the RFC and the concern. Cheers Markbassett (talk) 15:34, 4 November 2025 (UTC)
- Similar discussion: Courtesy link: Wikipedia talk:Talk page guidelines § c-Markbassett-20251105143100-Chipmunkdavis-20250905092100. FaviFake (talk) 16:49, 5 November 2025 (UTC)
Should vs. must?
I had assumed that LLMDISCLOSE was mandatory, but I now realize it's optional. That seems totally broken, and so I'm wondering if anyone here could help enlighten me as to why it was written that way. I ask this with an eye to making this conversation a pre-RfC to make LLM disclosure mandatory. CaptainEek Edits Ho Cap'n!⚓ 19:37, 2 November 2025 (UTC)
- Probably to offer some flexibility as is the norm. Even COI disclosure is only a should, must only comes into play with PAID and that is mandated by the TOU. Maybe not as important to be precise when wording an essay as opposed to PAGs, but still usually best to try. 184.152.65.118 (talk) 22:01, 2 November 2025 (UTC)
- Previous discussion about making it mandatory is at Wikipedia:Village pump (policy)/Archive 205#Alternative approach: make transparency policy. --Tryptofish (talk) 00:52, 3 November 2025 (UTC)
- @Tryptofish that seems like a rather decent pre-RfC. I see the point about a technical solution, but I think that a policy solution would probably be necessary first to both be a stopgap and to spur on the technical solution. It seems like the path forward is to propose that LLMDISCLOSE be made mandatory and elevated to policy? You obviously have way more experience in this arena so I'll happily defer to you, just don't want this to wither on the vine :) CaptainEek Edits Ho Cap'n!⚓ 17:27, 3 November 2025 (UTC)
- Thanks. I'm going to give a more detailed reply below, but it applies to both what you are saying and what isaacl is saying. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)
- I submitted a closure request for that village pump discussion a week and a half ago. — Newslinger talk 20:41, 3 November 2025 (UTC)
- Thanks for that, too. Pity it got archived without a consensus being found. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)
- I'm mildly hopeful the LLMDISCLOSE may make the cut, the rest was my being unaware of the prior RFC and basically re-asking for something the community had already supported. But even if LLMDISCLOSE doesn't make policy, I think tackling this one piece at a time will prove more useful than trying to push one grand proposal (unless it's easily broken into pieces and put up as a Watchlist notice, etc for a month). —Locke Cole • t • c • b 23:32, 3 November 2025 (UTC)
- Thanks for that, too. Pity it got archived without a consensus being found. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)
- @Tryptofish that seems like a rather decent pre-RfC. I see the point about a technical solution, but I think that a policy solution would probably be necessary first to both be a stopgap and to spur on the technical solution. It seems like the path forward is to propose that LLMDISCLOSE be made mandatory and elevated to policy? You obviously have way more experience in this arena so I'll happily defer to you, just don't want this to wither on the vine :) CaptainEek Edits Ho Cap'n!⚓ 17:27, 3 November 2025 (UTC)
- Generally speaking, the community doesn't object to all uses of programs to assist with writing. The concerns are about using programs that generate original content, with greater detail than any human input used to trigger the generation. Thus in my view, any guidance should focus on how the program is being used, rather than the underlying technology (which in any case might not be readily apparent to the end user).
- I think from a holistic perspective, a key question is what happens next if disclosure is required? If it's just a precursor to removing the text, then maybe we should be banning generated text from mainspace instead (as with a disclosure requirement, of course this only works for those who read and follow guidance). If it's to queue up edits for examination, then what's the best format for disclosure that assists with this (template used on talk page, perhaps?) and do we need to organize more volunteers to manage and process the queue? Can we build more tools to help with analyzing edits (including presumably the vast majority of problematic edits from editors who won't comply with any relevant guidance)? isaacl (talk) 17:30, 3 November 2025 (UTC)
- I've been paying very close attention to as many community discussions about these issues as I can find, and yes, we definitely need to find a carefully calibrated middle ground between no regulation and too much regulation. It's very clear to me that many members of the editing community don't want a complete ban on LLM use, and want to be careful that it doesn't become something that editors might weaponize against one another, and from my own observation, I've come to agree that this is a risk that we need to avoid. At the same time, by now, most of us have seen use of LLM-generated content that is obviously disruptive.
- I've been working (I promise I have!) on trying to pull together a draft proposal for a potential policy (a full policy page, including, but going beyond, disclosure), that others could then workshop further before putting it to the community for an RfC. It's not easy, and I've been swamped with other stuff, both on and off site. So I want to promise CaptainEek that I won't let it wither on the vine – but I expect that it will take a while before it ripens on the vine. --Tryptofish (talk) 21:20, 3 November 2025 (UTC)
- User:CaptainEek - Why disclosure was written that way -- the archives show the discussion in Oct 2023 about an RFC showing this page was clearly not going to be promoted to either policy or guideline, so it was relabeled as an essay, and to make the phrasing more essay-like the "must"s were replaced with "should"s to "better reflect its status as an essay". See also the discussion about LLMDISCLOSE lacking clarity on how to disclose and being incentivised to hide it completely instead of disclosing. Cheers Markbassett (talk) 05:30, 4 November 2025 (UTC)
Maybe mentioning the new guideline higher on the page
Heard of the new LLM guideline but didn't know its name, landed on this essay page, expected to see it in the disambiguation, and it was not in the disambiguation. I know the guideline is in fact linked lower on the page, but the point is, it should probably be mentioned at the top of the page. -- Lampyscales (🐍 | C) 16:25, 25 November 2025 (UTC)
I made this page an information page
See title. I did it in Special:Diff/1325613952. Feedback is welcome. I think this change is for the better. SuperPianoMan9167 (talk) 03:19, 4 December 2025 (UTC)