Project:Village pump (proposals)
== Request for comments on research study ==
As part of a research project at the University of Michigan, we have been developing a new system for fixing broken links on the web. Given a broken link to a web page, our system, FABLE (which stands for Finding Aliases for Broken Links Efficiently), attempts to find the new URL at which that same page now exists on the web; please see the web page for the FABLE project (https://webresearch.eecs.umich.edu/fable/) for more details on how our system works. To gauge the accuracy of the URL replacements identified by FABLE, we are hoping to conduct the following study. We have run FABLE on a subset of links that have been marked as permanently dead ([[:Category:Articles with permanently dead external links]]). For each such link where FABLE has found the new URL for the permanently dead link, we plan to make a post on the Talk page of the corresponding article seeking feedback on whether the new URL identified by FABLE looks correct. We have developed a bot ([[User:FABLEBot]]) to make such posts, and an example of the posts it would make is on my Talk page ([[User talk:HarshaMadhyastha]]). Before we file for approval for our bot, I am posting here to seek your comments and concerns about this study. Thank you! [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 15:02, 9 August 2022 (UTC)
:Hi @[[User:HarshaMadhyastha|HarshaMadhyastha]]. Thank you for thinking of Wikipedia for this project and for consulting the community before proceeding with this idea. I think the general direction of this project is good, but I am quite worried about the number of talk page messages such a bot will generate. Talk page messages are "expensive" in that they stay around as clutter for a long time (forever on many pages), and bots generally do not leave messages on article talk pages. There are a couple of other designs that could work better. One is to have an off-wiki website interface on which authenticated users can approve or disapprove particular URL changes. Another is to create a single page on-wiki that lists all, or many, of the proposed link replacements in one place, similar to [[Wikipedia:Database reports]] (e.g. [[Wikipedia:Database reports/Potential biographies of living people (1)]]), that users can work through in bulk. Let me know if we can further discuss these ideas. Best, '''[[User:L235|KevinL]]''' (<small>aka</small> [[User:L235|L235]] '''·''' [[User talk:L235#top|t]] '''·''' [[Special:Contribs/L235|c]]) 15:46, 9 August 2022 (UTC)
::Oh, I just remembered the example of "bot posting talk page messages" not going well. It was {{user|InternetArchiveBot}}, which posted "External links modified" sections on talk pages and really irritated community members ([https://en.wikipedia.org/w/index.php?title=Special%3AContributions&target=InternetArchiveBot&namespace=1&tagfilter=&start=&end=2017-01-01&limit=100 example]), and if I remember correctly got blocked for it. <br>Separately, one other thing that you probably shouldn't do is direct people to an off-wiki site just to provide feedback. (Off-wiki sites that are hosted on Toolforge and use OAuth to actually perform the change, such as [https://iabot.toolforge.org/index.php?page=runbotsingle], are probably OK.) They can provide feedback on-wiki, perhaps in a templated or tabled form that's machine-readable. Hope that's OK with your study methodology. I don't think you'll get community support otherwise. Best, '''[[User:L235|KevinL]]''' (<small>aka</small> [[User:L235|L235]] '''·''' [[User talk:L235#top|t]] '''·''' [[Special:Contribs/L235|c]]) 15:49, 9 August 2022 (UTC)
::::{{u|InternetArchiveBot}} stopped posting on talk pages after [[Wikipedia:Village_pump_(proposals)/Archive_145#Disable_messages_left_by_InternetArchiveBot|this 2018 discussion]], with the main motivation being (if I recall correctly) that the error rate was low, and the action performed wasn't significant enough to be worth posting about. [[User:Uanfala|Uanfala]] ([[User talk:Uanfala|talk]]) 16:48, 9 August 2022 (UTC)
:::Thank you for the feedback [[User:L235]]! I certainly appreciate the concern regarding clutter on Talk pages. For now, our plan was to post on the Talk pages of at most 200 articles, as our goal at the moment is just to gauge the accuracy of our system's output. Do you think the community would support such a limited one-time study? One of our motivations for posting on Talk pages was that users who are watching the Talk page of an article are more likely to have the necessary context for the external links included in the article.
:::Alternatively, I really like your idea of creating a single page which lists all proposed link replacements. We are working on creating such a page on our project site, but we could instead create it on-wiki, if that would make it more palatable. Do you have any suggestions/thoughts on what such a page should look like? Would it suffice to have a table with columns for "Wiki article", "Dead link in article", and "Potential replacement URL for dead link"? What would be the best way to seek user feedback on which of the rows in the table are correct? [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 16:42, 9 August 2022 (UTC)
::::Hi @[[User:HarshaMadhyastha|HarshaMadhyastha]], one other concern about the talk page messages is that you aren't going to get many responses. If you do 200 pages, I'd be surprised if you got 20 people reviewing the link.{{pb}} Re the page, two more columns you can add are: (1) the full citation (not just the link), for context, and (2) a column for users to mark whether the correction is good or not. Best, '''[[User:L235|KevinL]]''' (<small>aka</small> [[User:L235|L235]] '''·''' [[User talk:L235#top|t]] '''·''' [[Special:Contribs/L235|c]]) 16:52, 9 August 2022 (UTC)
:::::Hi @[[User:L235]], a quick follow-up question: if we were to create a page which shows a table of the URL replacements that we have discovered, any thoughts/recommendations on how we would draw the community's attention to this page? Thank you. [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 13:54, 10 August 2022 (UTC)
: Some ideas that are probably better than talk page messages: Use a tool similar to https://oabot.toolforge.org, or just make edits to the article directly and trust page watchers to correct any incorrect matchings. [[User:Pppery|* Pppery *]] [[User talk:Pppery|<sub style="color:#800000">it has begun...</sub>]] 16:01, 9 August 2022 (UTC)
::Thank you. We were concerned that having our bot directly edit articles would receive pushback, since some of the URL replacements we find are likely to be incorrect. So far, by our estimation, we get only 5% wrong, but that's still more than 0 :)
::What safeguards would you recommend we put in place in order to get approval to directly edit articles? [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 16:46, 9 August 2022 (UTC)
::: That's really up to [[WP:BAG|the bot approvals group]], not me. They will likely approve the bot for a trial of some small number of edits, then evaluate the fixes for themselves and/or see if anyone complains. But remember that even an incorrect repair does not cause much harm, since the original dead URL is not very useful. [[User:Pppery|* Pppery *]] [[User talk:Pppery|<sub style="color:#800000">it has begun...</sub>]] 17:11, 9 August 2022 (UTC)
:This sounds really useful! I don't see any major problems with talk page posts, and it may even be the preferable option during the pilot run. However, I agree that it may be better to eventually skip it. Maybe the bot can just update the url, add a custom tag (some variant of {{verification needed}}), and allow editors to either approve the edit by removing the tag or to reject it by reverting the bot's edit. There should be a way to catch those accepts and rejects automatically, right? A link for editor feedback can also be available within the tag (or its corresponding help page) and in the bot's edit summary. [[User:Uanfala|Uanfala]] ([[User talk:Uanfala|talk]]) 16:48, 9 August 2022 (UTC)
*'''support pilot''' The effect on Wikipedia is 200 talk page messages, which is acceptable. If this proceeds then I expect registration at [[:meta:Research:Projects]], publishing a plan to publish research findings in a way that the Wikimedia community can accept, and a commitment to responding to questions and comments after posting the messages. I was one of the people who complained about the similar Internet Archive project, but in that case, there were several hundred thousand talk page messages posted, and that high number was the project. 200 seems reasonable; if someone complains then a lower number could be negotiated but to me this seems fine. If you are able to give more to this project, then commit to more documentation or publishing a peer reviewed paper, as the Wikimedia community appreciates that. Thanks. [[User:Bluerasberry|<span style="background:#cedff2;color:#11e">''' Bluerasberry '''</span>]][[User talk:Bluerasberry|<span style="background:#cedff2;color:#11e">(talk)</span>]] 16:53, 9 August 2022 (UTC)
:Thank you all for the comments, suggestions, and feedback. Greatly appreciated! My students and I will discuss how best to proceed and circle back here soon.
:We have been working for a couple of years now on improving FABLE's coverage, accuracy, and efficiency. Looking forward to applying our work to help tackle link rot on Wikipedia. [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 20:23, 9 August 2022 (UTC)
A few scenarios
#If the goal is to simply evaluate the potential, instead of 200 messages, you could simply consolidate the suggested urls on a centralized page for review. This could be in the bot's own userspace and wouldn't need any big community input so long as it stays in userspace. See [[WP:BOTUSERSPACE]] for the relevant guidance.
#If the goal is an actual bot, and the bad match rate is unknown, the OABot model would be a good one, where the bot is making a suggestion and a human approves. This would be a semi-automated tool and wouldn't need a [[WP:BRFA]].
#If the goal is an actual bot, and the bad match rate is low/acceptable, the bot could make its own edits. But there would still be a need for an extensive trial and a [[WP:BRFA]].
In all cases [[WP:BOTPOL]] should be reviewed. <span style="font-variant:small-caps; whitespace:nowrap;">[[User:Headbomb|Headbomb]] {[[User talk:Headbomb|t]] · [[Special:Contributions/Headbomb|c]] · [[WP:PHYS|p]] · [[WP:WBOOKS|b]]}</span> 11:53, 11 August 2022 (UTC)
:Thank you for the input.
:Currently, we are definitely leaning towards option 1. Our main concern is: how will those who are interested in providing feedback on the URL replacements we find learn about our page's existence? This is why we are thinking that, once we create our page that lists all the URL replacements that we have found so far, we would also post on the Talk pages of 200 of these articles (after getting bot approval, of course). These Talk page messages would point users back to our consolidated page, thereby helping raise awareness of its existence.
:Any thoughts/suggestions? [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 18:03, 11 August 2022 (UTC)
{{Re|HarshaMadhyastha}} once you have results, you can simply post a notice here (or maybe [[WP:VPM]] would be better) and at [[WP:BOTN]], and that should give you plenty of feedback. If you target specific types of articles like "mostly medical articles", you can also post a notice at related WikiProjects, like WikiProject Medicine. <span style="font-variant:small-caps; whitespace:nowrap;">[[User:Headbomb|Headbomb]] {[[User talk:Headbomb|t]] · [[Special:Contributions/Headbomb|c]] · [[WP:PHYS|p]] · [[WP:WBOOKS|b]]}</span> 02:13, 12 August 2022 (UTC)
:Sounds good. Thank you very much for the pointers!
:We are currently simply churning through all articles listed in [[:Category:Articles with permanently dead external links]] in alphabetical order. [[User:HarshaMadhyastha|HarshaMadhyastha]] ([[User talk:HarshaMadhyastha|talk]]) 02:29, 12 August 2022 (UTC)