User Details
- User Since
- Oct 1 2018, 2:19 PM
- Availability
- Available
- IRC Nick
- isaacj
- LDAP User
- Isaac Johnson
- MediaWiki User
- Isaac (WMF)
Mon, Dec 1
Mentors have a link on their Mentor Dashboard that will navigate them to a filtered view of RecentChanges, limited to their mentees:
Mentors have two different mentee filters available in Recent Changes:
Oh I love this @KStoller-WMF -- thank you for the correction!
Wed, Nov 26
Updates:
- I updated the code for extracting passages to also grab past questions asked via Growth's mentorship module (i.e. sections on user talk pages that match the format Question from... (<date>)) and questions asked via help-me templates on user talk pages (presence of a help-me-* template); a rough extraction sketch follows this list. I'm working on generating their embeddings so they can be added to the question-bank corpus in the prototype.
- Moyan shared her code for her previous experiment with providing feedback to new editors via AI: https://github.com/phoebexxxx/newcomer-llms-user-study/tree/main
- The core functionality is a nearest-neighbor index over several core content policies for RAG purposes, combined with an instruction to the agent (gpt-4o-mini) to rephrase the participant's question for better retrieval. I have a working nearest-neighbor index, but I think that "please rephrase this question for..." step is a key piece of functionality to explore as we prototype workflows with an LLM.
- I spoke with @Trizek-WMF about his experiences/thoughts around mentorship. My summary below:
- Answers are often quite slow with 1:1 mentorship (I've been seeing this too in the data).
- Lots and lots of repeat questions (I've been seeing this too in the data).
- A number of editors think their mentor is a bot or AI. On one hand, that makes me think having a bot respond to newcomer questions (one idea we have) could exacerbate this; on the other, it's a reminder to emphasize that they also have a human mentor who can provide more context/support/etc. It also might be an opportunity to more clearly set expectations for the newcomer.
- Sometimes newcomers seem to think their mentors are responsible when things don't go well for them. That's hard to do much about, but it makes me wonder whether there aren't ways to help mentors better track their mentees so they can step in earlier (if needed) -- e.g., alerts when a mentee is reverted, or a form of RecentChanges that is automatically filtered to their mentees. I don't think the latter exists, but it should be possible to build: rc_actor is a field in RecentChanges, so the hard part is deploying a table that has the actor IDs for a mentor's mentees.
- Because it takes a while for mentors to respond or mentees to return for the answer, pages have often been archived. While DiscussionTools should fix this issue, in reality the "This topic could not be found on this page, but it does exist on the following page:..." message might be missed (or perhaps just confusing for a newcomer?).
- Different wikis definitely have different systems/norms around mentorship. French Wikipedia for instance doesn't really use help-me templates but does have a Teahouse equivalent (Forum des nouveaux).
- Mentorship is not recognized within spaces like Admin bids (in the same way that e.g., experience patrolling is). That's partly cultural but might also be a function of how hard it is to summarize one's impact via mentorship. This is an opportunity for making available more statistics about positive outcomes from mentorship.
- Some potential issues with answers: many experienced editors use wikitext but newcomers are on VE; rules evolve and so old answers may not always be right; rules evolve and so documentation may be behind; many "rules" aren't written down in a formal way.
- He thought a bot that can help handle the repetitive questions very quickly would be welcomed by many folks as it would relieve pressure on quick responses and handle the less interesting inquiries.
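For reference, here's roughly the kind of extraction described in the first bullet above. It's a hedged sketch, not the pipeline code: it assumes the section headings literally follow the "Question from ... (<date>)" convention used by the Growth mentorship module and that help-me requests show up as transcluded templates whose names start with "help me"; the regex and function names are illustrative.

```python
# Illustrative sketch (not the production pipeline): pull candidate newcomer
# questions out of one user talk page's wikitext.
import re
import mwparserfromhell

QUESTION_HEADING = re.compile(r"^Question from .+\(.+\)\s*$", re.IGNORECASE)

def extract_newcomer_questions(wikitext):
    """Return (mentorship_sections, has_help_me) for one user talk page."""
    code = mwparserfromhell.parse(wikitext)

    # Level-2 sections whose heading matches "Question from <user> (<date>)"
    mentorship_sections = []
    for section in code.get_sections(levels=[2], include_headings=True):
        headings = section.filter_headings()
        if headings and QUESTION_HEADING.match(headings[0].title.strip_code().strip()):
            mentorship_sections.append(section.strip_code().strip())

    # Any transcluded template whose name starts with "help me" (help-me-* family)
    has_help_me = any(
        str(t.name).strip().lower().startswith("help me")
        for t in code.filter_templates()
    )
    return mentorship_sections, has_help_me
```

The extracted sections would then be embedded and added to the question-bank corpus, using the same setup as the search prototype described in the Nov 20 update below.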
Fri, Nov 21
This is very exciting @JMonton-WMF ! I think @fkaelin is the best person to answer both of your questions.
Thu, Nov 20
Updates:
- I built a prototype for natural-language search overtop the Help/Policy namespaces on English Wikipedia. There's a backend API that I can use for testing against an eventual dataset of queries (if we have explicit "correct" answers) and a UI for exploration. The API/UI show the natural-language search results alongside what is returned by our existing keyword search as guided by these entrypoints curated by editors.
- The nearest-neighbor searches are brute-forced (as opposed to using an approximate index), so each query takes a second or two. I'm using the Qwen3-Embedding-0.6B model for embeddings; anecdotally it showed a strong improvement over the much smaller standard sentence-transformers models (a minimal sketch of the brute-force setup is below). I suspect adding a reranking model would help even more, but that would require storing the text too (not just the embeddings) and slow things down a good bit further.
- This is actually the third iteration -- the first one covered all of the Help/Wikipedia namespaces but was way too messy with all the admin noticeboards etc. The second used only top-level pages (no subpages) to remove all that discussion, but that was too coarse: I lost some important Q&A archives, and even though the results were higher quality, they mixed together very different contexts -- e.g., policies, help documentation, Q&A. So in this current iteration, I have explicitly separated out the different sources so that in theory they could be separately contextualized for an end-user -- e.g., here are similar questions, here is relevant policy, here's some how-to, etc.
Tagging you @Trokhymovych as I think you mentioned having (similar?) issues with a Qwen re-ranking model as well that seemed to relate to the torch version?
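For anyone curious what the brute-force setup looks like, here's a minimal sketch. It assumes the Qwen3-Embedding-0.6B model is loaded via sentence-transformers with normalized embeddings (so cosine similarity is just a dot product); the placeholder corpus and variable names are mine, not the prototype's actual code.

```python
# Minimal brute-force nearest-neighbor sketch over pre-embedded Help/Policy passages.
# Normalized embeddings mean cosine similarity reduces to a dot product.
import numpy as np
from sentence_transformers import SentenceTransformer

model = SentenceTransformer("Qwen/Qwen3-Embedding-0.6B")

# Placeholder corpus; in the prototype this is passages from the Help/Policy namespaces.
passages = [
    "Wikipedia:Verifiability -- articles must be based on reliable, published sources.",
    "Help:Referencing for beginners -- how to add citations with the VisualEditor.",
]
passage_embs = model.encode(passages, normalize_embeddings=True)

def search(query, top_k=5):
    # Qwen3-Embedding ships a query prompt in its sentence-transformers config;
    # drop prompt_name if the model you use doesn't define one.
    query_emb = model.encode([query], prompt_name="query", normalize_embeddings=True)
    scores = passage_embs @ query_emb[0]          # brute-force cosine similarity
    top = np.argsort(-scores)[:top_k]
    return [(float(scores[i]), passages[i]) for i in top]

print(search("how do I add a source to an article?"))
```

A reranker would slot in after this top-k step, which is why it would need the passage text stored and not just the embeddings.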
Tue, Nov 18
Fri, Nov 14
Just dropping a few quick thoughts in case they're helpful if work gets picked up in this space -- they've been sitting in my head for a bit and I'm happy to finally have somewhere to put them:
- Relevant lit:
- How Grounded is Wikipedia? A Study on Structured Evidential Support and Retrieval -- this is the closest analog to this task in that part of their work was checking Wikipedia claims against their sources. They report that 30% of claims failed verification (for biographies) using GPT-4o-mini as the model for claim extraction + verification. Useful deep-dive into how to find relevant evidence for a claim (they find re-ranking via LLMs to be important as a final step after a basic retrieval stage).
- Detecting Corpus-Level Knowledge Inconsistencies in Wikipedia with Large Language Models -- fact-checking Wikipedia with itself; so again not exactly the same, but some very solid data pipelines to consider from these folks regarding finding statements relevant to a claim.
- Improving Wikipedia verifiability with AI -- this was about the broader challenge of finding "better" citations for a claim, but contains some useful ideas etc.
- WikiCheck: An end-to-end open source Automatic Fact-Checking API based on Wikipedia -- this was about fact-checking external claims with Wikipedia, so the opposite question to what's posed here, but with some shared methods.
- A German newspaper also did a deep analysis of a sample of articles on dewiki. You can see annotations from a bunch of dewiki editors after they went through them (table), as well as Signpost coverage. They found 20% of pages with outdated info and ~15% with incorrect info (not just outdated).
- Considerations:
- This is a classification task, and fine-tuned smaller models still tend to do comparably to or better than LLMs if there is good data available for fine-tuning. There are the {{failed_verification}} templates as mentioned above that could be used for this. But because this is a relatively generic task (not particularly wiki-specific) and Wikipedia is so central to the fact-checking sphere (it often appears in their datasets), my understanding is that more generic fine-tuned models should be pretty appropriate for our context even without training on our own very Wikipedia-specific examples. So we wouldn't necessarily need a huge dataset of those.
- I've played around a bit in this space and my experience was that the data pre-processing is just as important as, if not more important than, choosing the right model. For example:
- There's the question of extracting the claim with its appropriate context from Wikipedia. This is a lot easier if it's an Edit Check, so in-context in VisualEditor, where we can capture specific instances of e.g., a new sentence added + a citation. But if you're applying a model to existing content (Suggested Edit), it's a bit trickier to capture the specific claim with enough context to understand it but not so much that you're really fact-checking multiple claims. Folks vary between using basic heuristics -- e.g., grabbing just the sentence, the whole paragraph, etc. -- and using LLMs to extract the specific claim and adapt to varying levels of context. The latter is probably more effective -- see Dense X Retrieval: What Retrieval Granularity Should We Use?.
- There's the question of fetching the text of the source being cited -- with AI etc. destroying the internet, we're seeing a lot more paywalled content, and we'd have to make sure that we don't return a ton of false positives just because the external website blocked the request to some degree (a rough fetch-and-sanity-check sketch follows this list). Relying on Internet Archive links can potentially help with this, but I've heard that websites have started to block them as well (e.g., Reddit block).
- There's the question of cleaning the source HTML and extracting just the relevant text, not all the boilerplate, menus, etc. This generally isn't a big deal in this context because you just need to find one statement that supports (or contradicts) the claim, so some noisy text is tolerable, but it can slow things down if you're also processing a bunch of it with LLMs. Models with longer context windows also reduce the importance of this.
- There's the question of ranking the potential evidence for what's most relevant so that not all of it has to be checked. Mostly I see folks recommend a basic similarity-based ranking followed by more complex re-ranking of the top few candidates with an LLM.
- Suggested first steps:
- If you all want to pick this up, I'd start with building a small-ish dataset of positive and negative examples (even just starting with 20 of each would probably be okay though 50 of each would be better).
- If it's for Edit Check, I'd grab a random sample of recent content adds and manually check them. If you're having trouble finding failed-verification examples, I'd narrow down to those that were reverted on the assumption that there'd be more failed-verifications in those. For each negative example, I'd grab a positive example from the same article.
- If it's for a Suggested Edit, I'd grab some of the {{failed_verification}} sentences and a few claims with citations in those articles that don't have that template applied (so you have a semi-balanced dataset of positive and negative claims).
- Once you have the set of claims + citations, I'd then scrape the sources of all of those and see how effective that is. That should already give you a good sense qualitatively of the scale of the challenge. And then run that small dataset through a few LLMs or existing fine-tuned language models to see how they do. That should hopefully be reasonably quick and give a decent idea of what level of accuracy you can expect with a basic setup.
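Since the source-fetching step above is where I'd expect the most silent failures, here's a hedged sketch of what a basic fetch-and-sanity-check could look like. The thresholds and the notion of a "suspicious" response are illustrative guesses, not validated values; a real pipeline would want retries, archive fallbacks, and proper text extraction.

```python
# Illustrative sketch: fetch a cited URL and flag responses that are likely
# blocks/paywalls so they aren't counted as "failed verification" downstream.
import requests

SUSPICIOUS_STATUS = {401, 402, 403, 429}   # auth/paywall/rate-limit style responses
MIN_TEXT_BYTES = 2000                      # arbitrary guess: very short pages are suspect

def fetch_source(url, timeout=15):
    """Return (status, text_or_None). status is 'ok', 'blocked', or 'error'."""
    headers = {"User-Agent": "claim-verification-prototype (research; contact on-wiki)"}
    try:
        resp = requests.get(url, headers=headers, timeout=timeout)
    except requests.RequestException:
        return "error", None
    if resp.status_code in SUSPICIOUS_STATUS:
        return "blocked", None
    if resp.status_code != 200 or len(resp.text) < MIN_TEXT_BYTES:
        # Tiny or non-200 responses are more likely a block/interstitial than real content.
        return "blocked", None
    return "ok", resp.text

status, html = fetch_source("https://example.com/some-cited-article")
print(status)
```

Only the "ok" responses would move on to text extraction and evidence ranking; "blocked"/"error" ones should probably be set aside rather than scored.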
Updates:
- I started to look into the {{Help me}} template (notebook + ping @MGerlach as the person who flagged this pathway to me). The code is hacky because we don't have a nice content diff dataset for talk pages, so I had to find Help me sections post-hoc and then try to guess who added the request etc. (a simplified API-based sketch follows this list). Still, there were at least 1700 instances on English Wikipedia of editors whose account was <= 10 days old using the template, so this could be a good dataset to mine for more newcomer questions. These almost exclusively happen on the newcomer's user page (usage on article talk pages is much more likely to come from more experienced editors).
- I met with Moyan and Tiziano (external researchers) to discuss some ideas about where this could go. We're going to meet again in early December, and they're both excited about the space. Looking ahead, we will work to expand the qualitative coding I'm currently doing of Newcomer Homepage questions (and I think I'll add in the Help Me questions from newer users). This has already revealed quite a bit, but we'd then choose one potential space for intervention, build out a prototype, and evaluate it. Some of the potential intervention ideas (please chime in if you have others) that have already come out of our discussions:
- Natural-language search of Policy/Help namespaces. This is what I came into the project thinking about and will very likely still pursue because it should be effective: these namespaces are relatively constrained in size, not super dynamic, contain a fair bit of jargon, and have many massive/diverse pages that challenge the utility of keyword search. This is also great for prototyping because it's almost purely back-end and easy to incorporate into tooling to test out if we get to that point. Plus it aligns nicely with other Semantic Search work that's happening.
- Same as above but over FAQ / Question spaces only. Essentially, rather than directly providing the answer, this would help editors find similar questions and see how other editors responded (with answers, requests for clarification, cautions about breaking policies, etc.).
- LLM agent to help editors rewrite their questions so they are easier to answer. This could support better Search as well, but also ensure there's enough context for an editor to answer directly as opposed to having to first ask for a follow-up (with all the newcomer drop-off that occurs the longer the conversation goes). I like this as a nicely-constrained and principled use of AI that doesn't get in between the interactions between editors (it just tries to ease things from the sidelines). Some similarities to the ideas proposed by Cristian Danescu (meta), but harder to prototype because you need it installed by newcomers, so that either requires essentially a full Product deployment or a very limited field study at edit-a-thons where you could individually install it for folks.
- "I'm just a human" auto-responder for mentors. This is kinda a combination of the above two ideas but with more interesting prototyping opportunities. Essentially the idea would be that when a mentee asks a question on their mentor's talk page, if the mentor has opted in, a bot would automatically collect that question, query an AI agent, and post a quick-follow up depending on the level of context provided. Probably always included is some boilerplate language about how editors are people and might not be active in this moment so please be patient and check back. If the question has enough info, maybe the response includes a few relevant links from on-wiki documentation / question banks based on the Search prototype. Maybe if the question is lacking context, the bot asks the editor to clarify. Maybe the AI even tries to answer the question. This could be configurable as well -- e.g., an editor could opt in to just the Search links but no answer or just the Clarification component but not the others.
- Tool for helping newcomers keep track of the questions they've asked. It'd be great to be able to track whether the question was answered etc., but that gets a lot trickier because questions get moved around as pages get archived. Easiest would be to just retain the original section link and allow the DiscussionTools extension to handle discovery of the section even if it's been moved. The improved Thank/Reply functionality would then help the editor figure out how to follow up.
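For anyone who wants to poke at the {{Help me}} pathway mentioned above, here's a simplified sketch using the public MediaWiki API rather than the dumps. It only checks the talk-page owner's registration date against today (the real analysis checked account age at the time the request was added), and the namespace/limit parameters are just reasonable defaults -- treat it as a starting point, not the notebook's actual logic.

```python
# Simplified sketch: find user talk pages currently transcluding Template:Help me
# and flag ones whose owner registered recently. Uses the public MediaWiki API.
import requests
from datetime import datetime, timezone, timedelta

API = "https://en.wikipedia.org/w/api.php"
SESSION = requests.Session()
SESSION.headers["User-Agent"] = "help-me-exploration-sketch (research)"

def help_me_talk_pages(limit=50):
    """Yield titles of User talk pages that embed Template:Help me."""
    params = {
        "action": "query", "format": "json", "list": "embeddedin",
        "eititle": "Template:Help me", "einamespace": 3, "eilimit": limit,
    }
    data = SESSION.get(API, params=params).json()
    for page in data["query"]["embeddedin"]:
        yield page["title"]

def registration_date(username):
    """Return the account registration datetime (or None if unavailable)."""
    params = {
        "action": "query", "format": "json", "list": "users",
        "ususers": username, "usprop": "registration",
    }
    data = SESSION.get(API, params=params).json()
    reg = data["query"]["users"][0].get("registration")
    return datetime.fromisoformat(reg.replace("Z", "+00:00")) if reg else None

cutoff = datetime.now(timezone.utc) - timedelta(days=10)
for title in help_me_talk_pages():
    owner = title.split(":", 1)[1].split("/")[0]   # "User talk:Name/sub" -> "Name"
    reg = registration_date(owner)
    if reg and reg >= cutoff:
        print(f"Recently registered account with a help request: {title}")
```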
Wed, Nov 12
PhD Symposium wrapped up! I was not able to attend but reportedly the day went smoothly.
Fri, Nov 7
Thanks for calling that out @KStoller-WMF ! In my informal conversations with lots of experienced editors, exhaustion is definitely a factor though the motivation/desire to help still exists. I definitely came into this wondering how to help mentees but the more I do it, I think the "how do we help mentors get more enjoyment out of the process" question is also crucial.
Thu, Nov 6
Oooh good find! Just repasting link with question-mark included for easier discovery: https://en.wikipedia.org/wiki/Wikipedia:Village_pump_(technical)#Find_articles_that_have_recently_had_a_specific_word_added?
Oct 31 2025
Weekly update:
- No major update, as other urgent work took up most of my time. I did have a good discussion with Moyan Zhou of UMN about the role of AI in mentorship, though, which sparked some thoughts about how AI could potentially help newcomers rephrase their questions and help mentors dig up links etc. to make it easier to respond, while largely staying out of the middle of the relationship, where that human connection is important.
Oct 28 2025
Hey folks - a few things to share with the broader group:
- Thanks for sending along your notebooks for review! We're working our way through them and appreciate the patience. Given that the deadline for final submission is November 3rd, please submit within the next day if you are intending to, so there is time for feedback.
- The final application requests a project timeline. My suggestion: break down week-by-week what would be required to turn your notebook into a fully-functioning app that organizers could use. Please include time for making improvements and learning new skills if you need it. Overall, though, we're generally more interested in your notebook and your responses to the other questions, so don't sweat it if you're not really sure how to do this.
- Some common themes across the notebooks:
- We value independent thought, so feel free to try out ideas. That said, please explain the reasoning behind your choices -- e.g., why did you use a certain threshold for "good" vs. "bad"; why did you choose the signals you went with; why did you choose the articles you went with; etc. There is no right answer, so it's much more important that we understand how you got to the code you produced.
- Please please please check the outputs of your functions. Are the results correct and what you expected? GenAI can be helpful with code but it also can steer you very wrong.
- Remember to take a look at your public link (https://public-paws.wmcloud.org/User:<your-username>/Micro-Task-Generator.ipynb) before submitting as sometimes code/comments can be hard to read in the public form (good use of Markdown cells often helps with this).
- Feel free to delete the example code that we provided but do make sure to answer each of the TODOs.
Oct 27 2025
Oct 23 2025
so the way I see it, it means we may not need a new classifier for chat agents.
@Mayakp.wiki that's fine by me then -- I'm mainly concerned with a shifting definition rather than the exact boundaries, and it sounds like you are planning to stick with ChatGPT/Perplexity as Search Engines as opposed to moving them to their own category at a later point. Thanks!
Another quick thought: I was curious how many mentees actually had an email address (necessary to get a notification that their mentor had responded if they've logged out of Wikipedia) and it's up around 91%, so that's not necessarily a major issue here as far as mentee drop-off. EDIT: after adding an authentication check, it's only 67% of mentees, so perhaps a larger factor.
Weekly update:
- I started coding up some questions from Wikipedia Teahouse. After 5 of them, I'm going to pause though. They tend to be far more detailed/advanced and I think out-of-scope for my goals at the moment. These are questions that almost certainly do need the level of detail/context that an editor can provide in their reply (i.e. bad fit for just surfacing documentation). The Teahouse folks are also largely doing a good job of responding pretty quickly -- e.g., 3 of the 5 questions got responses in ~10 minutes. It's telling that in all three of those cases, the conversation was much more in-depth than the usual question + single response (9, 5, and 6 responses) and actually saw the question-asker continue to engage. For the other two (2 hours and 13 hours to first response), the question-asker never re-engaged.
- Mentees essentially never thank their mentor (despite occasionally using this feature to thank others) and often don't respond to their initial thread if the question isn't answered in the first ~10 minutes. We may want to nudge mentees to thank their mentor when their response is helpful. The more accessible Thanks link on talk pages (details) should be a big help when it's deployed to English Wikipedia but perhaps there's a good place to nudge mentees to use this functionality when they appreciate a mentor response (as it's still slightly hidden).
- I talked with a number of folks at WikiConference North America about this work, which led to some interesting ideas:
- Mentorship has (at least) two goals: giving the question-asker specific feedback on what they should do next (competency) and advising on broader norms within Wikipedia (relatedness). The former is what I think we might address better via improved Search over documentation, while the latter is what is still important to preserve as a human interaction.
- At some point, it might be valuable to consider what data could help mentors in assessing their work. This would have to be done carefully because folks are doing this out of their own goodwill and you don't want to transform it into another chore or just plain work -- i.e. it shouldn't feel like grading or surveillance. That said, statistics on mentee survival/success, response times (maybe too surveillance-y?), or other outcome-related data might help in surfacing particularly successful mentors or identifying areas for improvement. So maybe, e.g., a public top-list of the best mentors by engagement/outcomes, and then folks can privately view their own statistics about response time etc.
- In relation to discussions around the progression system (T395678), mentors might eventually be folks who could "sign off" on someone achieving a basic level of skills. This could be purely for feedback purposes or to help build the confidence of new editors, a way for a mentee to "graduate" out of the mentorship program if mentors feel they have too many folks on their plate, or perhaps even tied to receiving some sort of user access level if that's deemed helpful.
- I'm less convinced about this but leaving it here as a thought: we may want to institute some sort of back-up similar to how Help me templates work. E.g., if a mentee isn't receiving a response within some timeframe, other editors could be pinged. Perhaps more appropriate would be doing that if e.g., the mentor has not edited in the last 24 hours? Generally I see some issues with slow responses on Growth Homepage mentor questions (especially as compared to Teahouse) though it's been pretty rare that a mentor doesn't respond in e.g., 24 hours so I don't really think this is a problem that needs to be solved.
- Next steps: I'm realizing that there are a lot of approaches to getting feedback (many listed in the description of this task) but it might be helpful to describe them in a bit more organized way -- e.g., whether it pings an individual, a small group, or a large group; how easy to use; how discoverable; etc. This will also help me in deciding whether I want to continue coding up the Newcomer Homepage Mentor questions or switch to a third source.
Just commenting here too as a duplicate of T406531#11303547: I personally would leave in place the expectation that referrers start with http or https. My read is that that behavior is largely coming from bots who are improperly mocking up a referrer. I don't see nearly the volume that Krinkle saw in January and the majority of it is being labeled as automated (query below). Given that it's also not acceptable behavior per the specs, I'd lean towards us enforcing the expectation of having legitimate referers as a further check against bot data.
Thanks for the ping! A few thoughts but don't let this block the work if you all want to proceed:
- I personally would leave in place the expectation that referrers start with http or https. I can go comment on T383088 too but my read is that that behavior is largely coming from bots who are improperly mocking up a referrer. I don't see nearly the volume that Krinkle saw in January and the majority of it is being labeled as automated (query below). He noted that it's not actually acceptable behavior but believed it to be an issue caused by some privacy extensions perhaps. It's a judgment call but I'd prefer that we expect legitimate referers as a further check against bot data.
- +1 to changing IPs to unknown -- seems reasonable and I assume not a major impact on our data so consistency is less important.
- I'd be cautious about adding ChatGPT and Perplexity into our Search Engine definition. This is a broader philosophical thing, so I don't think there's any right answer, but my thoughts: I don't think "Search Engines" are actually well-defined anymore, and by all means, Google is very chat-agenty these days, so the boundaries are getting more and more blurred. That said, if we plan to establish a "chat agent" referer class, then please don't temporarily put ChatGPT/Perplexity into the Search Engine data, as it'll just cause confusion as to why there's a temporary blip in the data.
Oct 21 2025
Just quickly chiming in on words/references:
- My mwedittypes library can do this. There's also a UI/API for it if you're curious to see what it looks like: https://wiki-topic.toolforge.org/diff-tagging. That's all hosted on Cloud Services and not actively maintained, though, so let me know before using it in anything live -- but it's fine for exploring, prototyping, etc.
- The references are relatively straightforward. The default wikitext-based approach (what's happening in the UI above) just counts the <ref> tags that are present in the wikitext. That means it will miss some things -- e.g., see PAWS:references-wikitext-vs-html.ipynb -- but is probably good enough for the use-case of analytics. You can also do it via HTML, which will be far more accurate and is implemented in the Python library (just not exposed via the UI/API). The other difference is that on the HTML side, I distinguish between references (i.e. new sources in the reflist at the bottom) and citations (i.e. in-line usages of those references). The wikitext one is really counting citations, though I think we could adjust it to capture both if desired.
- Words are more complex. Two things:
- What is considered "text" in an article: the library currently strips out references, templates, images, lists, categories, and a few other things (code). Essentially aiming for gathering the core text in the article. This could be over-written though if you all are interested in a different set of elements. Using HTML here also brings some additional flexibility -- e.g., if you wanted to count words in infoboxes but not clean-up templates for instance.
- How do you count "words" in text: once you have the core text, there is still the challenge of counting up words. For whitespace-delimited languages like English, that's pretty trivial (split on whitespace, and because you don't care about specifics, you don't have to worry too much about cleaning up punctuation or stuff like that). For non-whitespace-delimited languages like Chinese or Thai, it's a lot trickier. We do have another library (mwtokenizer) for doing this, but it gives you the sorts of tokens you might hear discussed in the context of LLMs -- i.e. they aren't guaranteed to be true words but are instead common sequences of characters, so sometimes full words but sometimes just chunks of words. For the moment, mwedittypes just falls back to reporting how many characters were changed, but I've been meaning to incorporate the mwtokenizer logic, so happy to talk about that if you're interested. (A very rough sketch of the wikitext-based counting follows this list.)
- Of note, if you go with mwedittypes, you'd get some other elements for free -- e.g., how many images were added, how many clean-up templates were removed (HTML only), if an infobox was added (HTML only), and presumably any other element that you might want to report on.
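To make the wikitext-based counting concrete, here's a very rough sketch of the kind of thing described above. It is not mwedittypes' actual implementation: the regexes only approximate the stripping the library does (no handling of nested templates, lists, etc.), and the word count only makes sense for whitespace-delimited languages.

```python
# Rough approximation of wikitext-based citation and word counting.
# Not the mwedittypes implementation -- just the idea in a few regexes.
import re

REF_OPEN = re.compile(r"<ref[\s>/]", re.IGNORECASE)          # paired and self-closing refs
REF_BLOCK = re.compile(r"<ref[^>/]*>.*?</ref>", re.IGNORECASE | re.DOTALL)
TEMPLATE = re.compile(r"\{\{.*?\}\}", re.DOTALL)             # naive: ignores nesting

def count_citations(wikitext):
    """Count in-line <ref> usages (what the wikitext approach in the UI reports)."""
    return len(REF_OPEN.findall(wikitext))

def count_words(wikitext):
    """Very naive 'core text' word count for whitespace-delimited languages."""
    text = REF_BLOCK.sub(" ", wikitext)
    text = TEMPLATE.sub(" ", text)
    text = re.sub(r"\[\[Category:[^\]]*\]\]", " ", text, flags=re.IGNORECASE)
    return len(text.split())

sample = "Paris is the capital of France.<ref>Some source</ref> {{Infobox city}}"
print(count_citations(sample), count_words(sample))
```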
Oct 10 2025
I'm getting started with the first task and need a few clarifications. The question is very open-ended; there are a number of relevant articles on climate change -- some based on recent happenings, some on basic concepts, some describing past events, etc.
- As we are aiming for a list of newcomers, should I only target pages that need minor changes?
- Is it okay to think from a perspective of adding new information, fact checking, removing biased content etc.?
- Do I link such articles to my answers?
I'm a little stumped as to how to approach this. Could anyone provide insights? I just need to understand what I am working with here, thanks!
@shreya-bhagat thanks for this thoughtful question! The intent is for it to be open-ended -- there won't be any one right solution here. The questions you ask make me think that you're thinking carefully about it, which is the most important part. Feel free to choose articles whose content you're more familiar with too (it doesn't have to just be the globally "most important" content). Whichever route you go, just explain why you chose the set that you did. You'll then use that set of articles for your remaining analyses.
A general reflection too: it's really powerful to go through these editor journeys via the questions they're asking mentors and their Contribution history and to try to figure out what was going on. Many journeys are quite short (unfortunately), with plenty of misconceptions evident from their actions, but it's really interesting to see them try creating user pages, getting help, making edits, etc. And then it's very heartwarming when you see an editor figure it out and keep editing!
Weekly update:
- I began considering what it would mean to extract nice structured datasets of Q&A from pages -- e.g., WP:Teahouse Archives -- but paused that effort as I realized that a) it was non-trivial, and b) I wasn't fully sure yet what I would want to extract, so better to return to the question after spending some more time with the data. I want to process the HTML, and that's also a consideration: HTML will make it much easier to extract e.g., policy/help links and nice clean text from the conversations (a rough sketch of what that might look like follows this update). Because I'm really only interested in the final question + answers, it's okay that HTML largely locks me into working with the current snapshot as opposed to the full history of the conversation. There are some existing parsers for wikitext + talk pages if I decide to change direction -- they likely wouldn't work exactly for my needs but might have some of the logic around e.g., extracting timestamps, usernames, etc.
- I pivoted instead to starting some qualitative coding of editor Q&A. I began with newcomer questions via the Newcomer Homepage mentor module largely because there were some discussions happening about the impact of that module that I thought might benefit from more data. I grabbed 100 random mentor questions from English Wikipedia (query below) and have gotten through 15 of them (thanks to @TAndic for helping me think through my codebook). Still very small sample but some early takeaways:
- 5 did not really receive responses (2 mentors seemed to be generally inactive at that time, 1 was a case of the question simply being ignored, 2 were cases of the question being off-topic/unintelligible and eventually reverted).
- Of the 10 with responses: 2 mentor responses came within ~20 minutes; 5 responses came in 12-20 hours; 2 took 1.5 days, and 1 came a month later.
- The mentor responses were largely helpful/kind -- sometimes directly answering the question, sometimes asking for clarification. Mentees almost never responded back or thanked them, though. More common actually was the mentee making a follow-up in a new section (twice) or on their own talk page (once). Only twice did they actually follow up on the original question.
- Of the questions where the intention was clearer, 7 were about editing existing articles and 4 were about creating new articles. Most questions were generic (e.g., "how do I create an article?") and probably would have benefited from some follow-up questions/answers. The needs were pretty diverse (general workflow, questions about policies, questions about wikitext/syntax, help with approving articles, etc.)
- There were reasonable COI concerns in 4 of the questions. On the flip side, several of the newcomers were clearly acting in good faith and just trying to figure things out. For many it was unclear (generic question and not enough other activity to judge).
- The outcomes for these 15 aren't great though a few mentees made it through:
- No contributions for a month after question and then returned to edit occasionally
- Asked again about their draft article on different talk page and on Commons for some reason, but then stopped editing
- Made two more edits to their draft article about a month later, but it was eventually declined for notability reasons and they never edited again
- Kept editing but most of it was reverted for lack of sources. Eventually blocked.
- Never edited beyond the question
- Never edited beyond the question
- Never edited beyond the question
- Made edit but was reverted. Then made more policy-conforming edit and hasn't edited since. Likely COI though.
- Never made the edit they asked about or edited again. The page is still broken 2 years later from their initial attempts
- Never edited beyond the question
- Never edited beyond the question
- Figured it out and kept editing
- Figured it out and kept editing
- Fixed a typo, asked a follow-up in the wrong place, and then stopped
- Unclear what was going on with mentee but eventually they dropped off
- Next steps for me will be to pull some samples from other sources to diversify my sample. Once I have a better sense of what's out there, I'll return to the question of whether to try to more automatically extract some of this or continue in a more manual fashion.
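If the Teahouse-archive extraction idea from the first bullet gets picked back up, a starting point might look something like the sketch below. It's hedged: it assumes the Parsoid HTML served by the Wikimedia REST API (where wikilinks carry rel="mw:WikiLink" and hrefs like "./Wikipedia:Verifiability"), and the page title is just an example, not a specific target.

```python
# Rough sketch: pull section headings and policy/help links out of a Teahouse
# page's Parsoid HTML (via the Wikimedia REST API). Illustrative only.
import requests
from bs4 import BeautifulSoup

def fetch_parsoid_html(title):
    url = f"https://en.wikipedia.org/api/rest_v1/page/html/{requests.utils.quote(title, safe='')}"
    resp = requests.get(url, headers={"User-Agent": "teahouse-extraction-sketch (research)"})
    resp.raise_for_status()
    return resp.text

def extract_questions_and_links(html):
    """Return a list of (section heading, [policy/help links in that section])."""
    soup = BeautifulSoup(html, "html.parser")
    results = []
    # Parsoid wraps each section in a <section> tag with its heading as a direct child.
    for section in soup.find_all("section"):
        heading = section.find("h2", recursive=False)
        if heading is None:
            continue
        links = [
            a.get("href", "")[2:]
            for a in section.find_all("a", attrs={"rel": "mw:WikiLink"})
            if a.get("href", "").startswith(("./Wikipedia:", "./Help:"))
        ]
        results.append((heading.get_text(strip=True), links))
    return results

html = fetch_parsoid_html("Wikipedia:Teahouse")  # an archive page would work the same way
for heading, links in extract_questions_and_links(html)[:5]:
    print(heading, links)
```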
Oct 9 2025
Echoing the welcome to everyone -- it's great to see so many applicants and all the conversation!
Oct 7 2025
FYI chiming in as I've done a bit of this in the past but am happy to pass it off to others!
Oct 3 2025
Sep 29 2025
Sep 26 2025
Adding a comment for documentation purposes:
FYI if you all ever want to support rev ID for this model, it's pretty simple to implement. A few choices:
- Map any given revision ID to its current page ID and then go from there. On one hand, this might be confusing if e.g., someone submits a revision ID from 10 years ago but gets a prediction based on content from today. On the other hand, this model is the rare model where the topic really is a concept that exists outside of the page and we're just using the page to predict it. So we don't expect it to change with every revision (if it did, that would be more of a bug than a feature). So I think it'd be reasonable to just map any revision ID to the current data. It's also way simpler.
- Actually support arbitrary revisions of pages as the source of features. For this, you'd need to extract the wikilinks manually from the page. Some code below. It's slower because now you're fetching page HTML, but still probably pretty performant.
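A hedged sketch of both options (not necessarily the exact snippet referenced above): option 1 maps a revision ID to its current page via the Action API, and option 2 pulls the wikilinks out of that specific revision's Parsoid HTML, where links carry rel="mw:WikiLink". The endpoint choices and parsing details are assumptions to adapt as needed.

```python
# Sketch of two ways to support revision IDs for the model's features.
import requests
from bs4 import BeautifulSoup

HEADERS = {"User-Agent": "revid-support-sketch (research)"}

def revid_to_current_page(revid, wiki="en.wikipedia.org"):
    """Option 1: map a revision ID to its current page (page ID + title)."""
    params = {"action": "query", "format": "json", "revids": revid}
    data = requests.get(f"https://{wiki}/w/api.php", params=params, headers=HEADERS).json()
    page = next(iter(data["query"]["pages"].values()))
    return page["pageid"], page["title"]

def wikilinks_for_revision(title, revid, wiki="en.wikipedia.org"):
    """Option 2: extract wikilinks from a specific revision's Parsoid HTML."""
    url = f"https://{wiki}/api/rest_v1/page/html/{requests.utils.quote(title, safe='')}/{revid}"
    html = requests.get(url, headers=HEADERS).text
    soup = BeautifulSoup(html, "html.parser")
    links = set()
    for a in soup.find_all("a", attrs={"rel": "mw:WikiLink"}):
        href = a.get("href", "")
        if href.startswith("./"):
            links.add(href[2:].split("#")[0])  # "./Paris#History" -> "Paris"
    return sorted(links)
```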
Sep 19 2025
Obviously let's keep this focused on tone-check, but just a reminder of our related use-case for a potential time topic filter, where you'd do something like articletopic:time>1960 (T397375).
Sep 18 2025
A few updates:
- Notebook for collecting Citoid params for each language (where available): https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Citations/t374554-gather-citoid-params.ipynb
- Notebook for building representative article HTML dataset for every language to explore Citoid coverage etc.: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Citations/t374554-html-random-dataset.ipynb
- Some descriptive stats on a few common Citoid parameters and how well-structured references are across the different Wikipedia languages: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Citations/t374554-citoid-coverage.ipynb
- Exploration of one important parameter (date) and its non-Citoid year backup: https://public-paws.wmcloud.org/User:Isaac%20(WMF)/Citations/t374554-date-vs-year-enwiki.ipynb
Sep 17 2025
Hey @Alexey_Skripnik -- glad you discovered the dataset! What you're running into is not the anonymity threshold but actually the geographic filtering that happens. You can see more details in T348504 but the relevant country policy that's being enforced here: https://foundation.wikimedia.org/wiki/Legal:Wikimedia_Foundation_Country_and_Territory_Protection_List. What that means unfortunately is that we would need to apply differential privacy to this dataset to obscure the counts, so it would be a much larger lift than just another round of aggregation. We also would need to recalculate the raw data as the filter is applied upfront. I think that's unlikely to be prioritized on our end but I'll watch for opportunities to push for the more long-term solution of switching the data pipeline over to using differential privacy.
Sep 15 2025
Just a heads-up that there is some data available about common user-agent components, though I'm not sure if it meets the needs of this task as it's more structured/split as opposed to the raw string:
- https://analytics.wikimedia.org/published/datasets/periodic/reports/metrics/browser/
- You can visualize the above via https://analytics.wikimedia.org/dashboards/browsers/#all-sites-by-os
- And I believe this is the relevant code if interested about implementation details: https://gerrit.wikimedia.org/g/analytics/refinery/+/edfea882db211d861d52833bedf3f4a62d522317/hql/browser/general/browser_general_iceberg.hql
Sep 10 2025
What might be happening in those cases is that they exceed the entityUsageModifierLimits (in production, up to 33 different property IDs per entity are tracked, beyond that it gets collapsed into “uses any and all statements”).
Oh interesting @Lucas_Werkmeister_WMDE - I was not aware but very cool functionality! Yeah, so then the question I guess is whether this collapse happens a lot and how much RC noise would be saved by allowing it to collapse into "any and all identifiers" instead.
Unsolicited idea: would it be feasible to also provide Lua functionality to fetch just the Wikidata identifiers for an item? When I analyzed Wikidata usage on English Wikipedia a few years back (which isn't representative of how wikis in general use Wikidata but still captures some of the trends), my conclusion was that a large amount of usage was for fetching identifiers (taxonbar, authority control, etc.). While these templates were fetching the whole item because that was the only reasonable approach (and so triggering lots of unrelated property updates in RecentChanges), they only wanted the identifiers part: https://meta.wikimedia.org/wiki/Research:External_Reuse_of_Wikimedia_Content/Wikidata_Transclusion
Sep 4 2025
Marking this stalled for now. We have talked with a number of Product teams but don't have a clear team who is positioned to take the lead on pushing these changes through just yet. This seems to reflect the general challenge that many teams would benefit from these changes and are doing work in this space, but no one team owns the topic model and the changes are based on community feedback (as opposed to a Product OKR). I'll continue to look for further stakeholders/opportunities.
Aug 13 2025
Aug 12 2025
Resolving this. A question remains of what this would look like next year. There's a September 5th deadline for EACL/ACL: https://www.aclweb.org/portal/content/eaclacl-2026-joint-call-workshops but EMNLP/AACL will release a call later in the fall.
More fine-grained takeaways:
Finished! I'll record some of my takeaways below and then close this task out.
Update:
- We accepted 11 submissions. Camera-readies are due in late August but Sheridan handles that so I don't have to do any validation (as with ACL workshops).
- Next question to tackle will be how to have folks present at the conference and what the program looks like. My co-chair will be taking the lead on that though as he will be in-person and knows the community better than me.
Aug 6 2025
... You can do that by adding a fake parameter to the page, and it can be anything as the site is static. It just needs to be a new URL, e.g. https://research.wikimedia.org/report.html?r=12
Ahhh thanks for the tip! I'd been trying to force a refresh but that wasn't having any effect. Now I know!
Thanks @DDeSouza! Confirmed that I can see it live (weirdly only on Chrome desktop or if I switch my IP address via a VPN, but I won't pretend to understand how the internet caches things, and presumably most people don't have this problem), so now resolving the task. Communication to wiki-research-l: https://lists.wikimedia.org/hyperkitty/list/[email protected]/thread/RRISCJJQ3SWUKT6YJ7JW5GGCKDRKD52D/
Aug 5 2025
Looks good to me - thanks @DDeSouza ! Ready to publish now.
Jul 31 2025
is this task still valid?
Yes - the articlecountry model is on LiftWing but the two dependencies listed in this task are static and have no official way of updating beyond re-running my Jupyter notebooks. Additionally, T387041: Generate Airflow DAG for creating article-country SQLite DB lists a third dependency that is also still valid. I don't know if this would fall under REng or ML at this point though.
I can vouch for Aaron - the username was provided to me over Slack by his account and matches his wiki account.
Jul 29 2025
Thanks @DDeSouza ! We're still waiting on the final sign-off for the Forward but I'll let you know when that happens. Let me know when the other sections are ready and I'll review.
Jul 28 2025
I was thinking of adding an access_method column, which would then be populated with desktop and mobile web (just like in the webrequest table) to keep it looking as similar to the webrequest table as possible. What are y'all's thoughts about that?
Oh yep, that would work for me and is probably even simpler! It captures things at the current-URL stage, but that should still generally be reflective of which UI the person was using when they clicked the link.
Thanks @Milimetric ! One thought: would it be easier to just record whether the previous URL is mobile or desktop, as opposed to the four values mobile_to_mobile, mobile_to_desktop, desktop_to_desktop, desktop_to_mobile? The thinking is that the language switch presumably always either honors the previous URL, or, if it does change, it's not because the user requested it but because e.g., the page automatically redirected to desktop or something like that. So from an analysis perspective, you probably care most about which type of page the person started on (because that tells you which type of UI they were using when they triggered the switch), but where they ended up doesn't tell you anything additional. Hopefully that simplifies the code a little bit and is easier to query as well (a tiny classification sketch below). @CMyrick-WMF I welcome your thoughts as well!
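To illustrate the simpler single-column version, here's a tiny hedged sketch of classifying the previous URL as mobile or desktop. It relies only on the fact that mobile Wikimedia URLs use a ".m." host segment (e.g., en.m.wikipedia.org); the function name and the "unknown" fallback are my own placeholders.

```python
# Tiny sketch: classify a referring Wikimedia URL as mobile or desktop
# based on the ".m." host segment (e.g., en.m.wikipedia.org).
from urllib.parse import urlparse

def access_method_of(url):
    host = urlparse(url).hostname or ""
    if not host:
        return "unknown"
    return "mobile web" if ".m." in f".{host}." else "desktop"

assert access_method_of("https://en.m.wikipedia.org/wiki/Paris") == "mobile web"
assert access_method_of("https://en.wikipedia.org/wiki/Paris") == "desktop"
```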
Jul 18 2025
Also, every section but the Forward is now stable and can be safely copied over. The Forward won't be stable until the 28th at the earliest. So I think a goal could be to have everything but that section ready for review on the 28th; then we can hopefully review it while also providing you with the final section, and be ready to go on the 30th.
Adding another ask to be done by July 30th: @DDeSouza could you have https://wikiworkshop.org/2025/ redirected to https://meta.wikimedia.org/wiki/Wiki_Workshop_2025? That way we can link to a more canonical URL in the report but have it go to the right place. I assume that just means duplicating the current redirect at https://wikiworkshop.org/ to also be at https://wikiworkshop.org/2025/.
Jul 17 2025
Thanks @Aklapper!
(Meh, the GitLab account and the Phab account use different email addresses and the Phab account is linked to some personal account instead, but I assume that is because no WMF SUL account has been provided by WMF to Kaylea, for unknown but unfortunate reasons?)
Thanks @DDeSouza ! The draft doc we're working from is here (internal only) and I've marked the sections that are stable with [ready] so they can be copied into a draft MR if you want to get a head start. I'll also let you know when the rest of the sections are finalized. Feel free to leave me questions here or in the doc if there's any oddities that you notice though I'll be out next week so likely won't be able to respond until July 28th.
@brennen: this is for work that will be documented in T399696: GitLab Private Repository Request for: research/npov-workstream-research so if you get a chance to take a look at that task as well and approve or send us any follow-up questions, that'd be much appreciated.
Chiming in to vouch for Kaylea -- she's contracting with us on the Research team and shared this task with me internally on slack
Seeing the updates now -- thanks all and no harm done! Good reminder to us to manually verify the live changes too
Jul 16 2025
@DDeSouza flagging that we're almost ready to start moving forward with the next Research Report. We're aiming to have the content ready to deploy on July 30th, though most of it will be ready for you to put into an MR before then; it's just a matter of waiting for a few final confirmations before deploying. I'll follow up shortly with some more concrete asks and try to make most of them this week so you have at least a week to prepare, but I figured I'd give you a heads-up now that we have a clearer timeline for the work.
Thanks @DDeSouza - looks good to me to release (I don't see the changes publicly - not sure if you're holding until confirmation or somehow the merge was unsuccessful). And thanks for including Miriam's title update as well.
Jul 15 2025
Jul 11 2025
Update:
- 18 submissions to the Symposium that we're gathering reviewers for. This is a nice load: most of them appear to be good quality, and last year they accepted 20, which was evidently far too many, so this should allow us to accept a reasonable proportion without overloading the day. They had 98 submissions last year, and I suspect a large part of the difference is the location (South Korea) and reduced travel from the US.
Updates:
- Proceedings accepted
- Schedule largely finalized (both keynotes) but still looking for a local Wikimedian for a conversation/AMA. Turns out that Europe+August=Vacation is still alive and well :)
Jul 10 2025
Jul 9 2025
Jul 8 2025
Jul 7 2025
Jul 2 2025
Resolving this epic! A quick summary of the very large amount of work that was accomplished:
- Background report: https://meta.wikimedia.org/wiki/Research:Develop_a_working_definition_for_moderation_activity_and_moderators
- Focus on decentralized aspects of moderation: https://meta.wikimedia.org/wiki/Research:Crowdsourced_Content_Moderation
- Code: https://gitlab.wikimedia.org/repos/research/who-are-moderators
- Patrolling data dashboard (internal): https://superset.wikimedia.org/superset/dashboard/605/ (documentation)
- This work is now feeding into FY25-26 Annual Plan's WE1.3 key result, which focuses on increasing moderation actions. It has also helped inform adjacent areas of work such as the NPOV workstreams.
- It identified the importance of HTML diff data and laid the groundwork for building that dataset (T380874), though prioritization is pending the direction taken by the Moderator Tools team. It would also have benefited from productionized edit diffs (T351225) but we worked around that dependency.
