Parsoid/Round-trip testing

The Parsoid code includes a round-trip testing system for testing code changes. It consists of a server that hands out tasks and presents results, and clients that run the tests and report back to the server. The code is in the testreduce repo, which is an automatic mirror of the repo in Gerrit. The round-trip testing setup has been fully puppetized.

There's a private instance of the server on testreduce1002 that currently tests a representative set of about 160,000 pages from Wikipedias in different languages. You can access the web service at https://parsoid-rt-tests.wikimedia.org/

Private setup

The instructions to set up a private instance of the round-trip test server can be found here. A MySQL database is needed to keep the set of pages and the testing results.

RT-testing setup

The coordinator runs on testreduce1002.eqiad.wmnet. The RT-testing clients also run on testreduce1002. You need access to a bastion server on wikimedia.org to access testreduce1002. See SSH configuration for access to production on wikitech. The clients access the Parsoid REST API that runs on parsoidtest1001.eqiad.wmnet.

The clients are managed and restarted by systemd, and the config is in /lib/systemd/system/parsoid-rt-client.service. Please do not modify the config on testreduce1002 directly: it will be overwritten by the puppet runs that happen every 30 minutes. Any necessary changes should be made in puppet and deployed.

To {stop,restart,start} all clients on a VM (not normally needed):

# On testreduce1002
sudo service parsoid-rt-client stop
sudo service parsoid-rt-client restart
sudo service parsoid-rt-client start

Client logs are in systemd journals and can be accessed as:

### Logs for the parsoid-rt-client service on testreduce1002
# equivalent to tail -f <log-file>
sudo journalctl -f -u parsoid-rt-client
# equivalent to tail -n 1000
sudo journalctl -n 1000 -u parsoid-rt-client

### Logs of the parsoid-rt testreduce server
sudo journalctl -f -u parsoid-rt

### Logs for the parsoid service
sudo journalctl -f -u parsoid

In the current setup, the testreduce clients talk to a global parsoid service that runs on parsoidtest1001. So, look at the logs of the parsoid service on parsoidtest1001 to find problems / bugs with the parsoid code being tested. These logs are also mirrored to Kibana which you can find on this dashboard.

Starting a test run

Before starting a run, it's best to check that no tests are currently running on another Parsoid commit. Run sudo journalctl -f -u parsoid-rt-client on testreduce1002 and verify that the clients report "The server does not have any work for us right now".
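
The idle check can also be scripted. The following is a minimal sketch, assuming that the most recent non-empty journal line reflects the current client state. The helper name rt_clients_idle is hypothetical and not part of the Parsoid tooling. With several clients logging at once, treat the result as a heuristic only.

```shell
# Message quoted in the text above.
IDLE_MSG='The server does not have any work for us right now'

# Hypothetical helper: reads log lines on stdin and succeeds (exit 0)
# only if the last non-empty line contains the idle message.
rt_clients_idle() {
  rt_last_line=$(awk 'NF { l = $0 } END { print l }')
  case $rt_last_line in
    *"$IDLE_MSG"*) return 0 ;;
    *) return 1 ;;
  esac
}

# Usage on testreduce1002 (not executed here):
#   sudo journalctl -n 50 -u parsoid-rt-client | rt_clients_idle && echo "clients are idle"
```

A non-zero exit status means the latest log line is something other than the idle message, so a run may still be in progress.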

To start rt-testing a particular parsoid commit, run the following command on your local computer from your checked-out copy of Parsoid:

# To test a commit which is already merged on Parsoid's main/master branch:
bin/start-rt-test.sh <sha-of-parsoid-code-to-rt-test>

# Add -u if your .ssh/config does not automatically use the correct userid:
# Example: bin/start-rt-test.sh -u ssastry 645beed2

# You can also test a commit which has not yet been merged:
bin/start-rt-test.sh --gerrit <changeid>

This updates the parsoid checkout on parsoidtest1001 and testreduce1002, and restarts the parsoid-php and parsoid-rt-client services.

(Note: if it complains "detected dubious ownership in repository at /srv/parsoid-testing" go ahead and ssh into both testreduce1002 and parsoidtest1001 and run the command that it suggests: git config --global --add safe.directory /srv/parsoid-testing.)

Updating the round-trip server code

The rt-server code lives in /srv/testreduce. You can run npm install on testreduce1002 if you need to update node modules.

cd /srv/testreduce
git pull
sudo service parsoid-rt restart

Running the regression script

After an rt run, we compare diffs with previous runs to determine whether we've introduced new semantic differences. However, since the runs happen on different dates and use production data, there is some natural churn to account for. The regression script automates rerunning the rt script on a handful of pages to determine whether there are any true positives.

# Use https://parsoid-rt-tests.wikimedia.org/commits to select a commit pair to
# check, then click "Regressions" at the top, and pass the resulting URL to this
# command run on your local machine
# Make sure that an rt run isn't in progress on testreduce1002 / parsoidtest1001
php tools/RegressionTesting.php --url https://parsoid-rt-tests.wikimedia.org/regressions/between/<oracle>/<commit>

# Alternatively, you can manually create a list of titles to check, and use:
php tools/RegressionTesting.php -t <title-file> <oracle> <commit>

# Again, `-u <bastion-uid>` can be added if it's not already in your `.ssh/config`

Note that the script checks out the specified commits while running, but it doesn't do anything about dependencies: Parsoid runs in integrated mode with the latest production mediawiki version(s) and their corresponding mediawiki-vendor packages (depending on the Host field used in the request). So, at present, the script isn't appropriate when dependency versions are bumped between the two commits. The <oracle> ("known good") and <commit> ("to be tested") can be anything that git recognizes, including tag names. They don't necessarily have to correspond to the hashes you provided to the rt server, although usually that's what you'll use.
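
When scripting the --url form, the regressions URL can be assembled from the commit pair. This is a small sketch, assuming the URL keeps the regressions/between/<oracle>/<commit> form shown in the command above. The helper name regressions_url is hypothetical.

```shell
# Hypothetical helper: build the URL passed to --url from a commit pair.
# The optional third argument overrides the rt server base URL.
regressions_url() {
  if [ -z "$1" ] || [ -z "$2" ]; then
    echo "usage: regressions_url <oracle> <commit> [base-url]" >&2
    return 1
  fi
  printf '%s/regressions/between/%s/%s\n' \
    "${3:-https://parsoid-rt-tests.wikimedia.org}" "$1" "$2"
}

# Usage (not executed here):
#   php tools/RegressionTesting.php --url "$(regressions_url <oracle> <commit>)"
```

For the URL form, the two hashes should be a commit pair that the rt server already knows about, as selected on the commits page.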

Crashers prevent the script from running and may need to be pruned. They can then be tested individually as follows on testreduce1002.

# On parsoidtest1001, check out the commit you want to test
cd /srv/parsoid-testing
git checkout somecommit

# On testreduce1002, run the roundtrip script
node bin/roundtrip-test.js --proxyURL http://parsoidtest1001.eqiad.wmnet:80 --parsoidURL http://DOMAIN/w/rest.php --domain en.wikipedia.org "Sometitle"

Finally: be sure to check on the parsoid-tests dashboard for notices and errors; that will be our first warning of logspam created by our tested release.

Running Parsoid tools on parsoidtest1001

Parsoid will run in integrated mode on parsoidtest1001 from /srv/parsoid-testing but it requires use of the MWScript.php wrapper in order to configure mediawiki-core properly. More information on mwscript is at Extension:MediaWikiFarm/Scripts. A sample command would look like:

$ echo '==Foo==' | sudo -u www-data php /srv/mediawiki/multiversion/MWScript.php /srv/parsoid-testing/bin/parse.php --wiki=hiwiki --integrated

Parts of that command are often abbreviated with a helper alias, mwscript, in your shell to make invocations easier.

The maintenance/parse.php script in mediawiki-core also has a --parsoid option that may be useful.

Tracing / dumping with MWScript.php wrapper

If you need to trace / dump via the --trace and --dump CLI options with the mwscript wrapper, you will need to use the --logFile option in parse.php to get the logs.

Proxying requests to parsoidtest1001

Historically, Parsoid's HTTP API hasn't been exposed externally. To query it, requests can be proxied through parsoidtest1001:

NO_PROXY="" no_proxy="" curl -x parsoidtest1001.eqiad.wmnet:80 http://en.wikipedia.org/w/rest.php/en.wikipedia.org/v3/page/html/Dog

The NO_PROXY="" no_proxy="" business is to ignore the environment variables that are set, which override the explicit proxy from -x.
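
For repeated queries, the proxied request can be wrapped in small shell helpers. This is a sketch only: the function names are hypothetical, and the URL layout is copied from the single example above. Titles are passed through with spaces turned into underscores; other special characters are not URL-encoded.

```shell
# Hypothetical helper: build the REST URL for a page's Parsoid HTML.
# $1 = wiki domain, $2 = page title (spaces become underscores).
parsoid_html_url() {
  parsoid_title=$(printf '%s' "$2" | tr ' ' '_')
  printf 'http://%s/w/rest.php/%s/v3/page/html/%s\n' "$1" "$1" "$parsoid_title"
}

# Hypothetical helper: fetch the page through parsoidtest1001, with the
# proxy-exclusion variables emptied for this one curl invocation.
parsoid_fetch() {
  NO_PROXY="" no_proxy="" curl -x parsoidtest1001.eqiad.wmnet:80 \
    "$(parsoid_html_url "$1" "$2")"
}

# Usage (not executed here):
#   parsoid_fetch en.wikipedia.org Dog
```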

Troubleshooting

Disk space issues

As mentioned in the script, https://wikitech.wikimedia.org/wiki/Parsoid/Common_Tasks#Freeing_disk_space explains how to free disk space on testreduce1002.

Wrong commit on parsoidtest1001

It can happen that, when starting rt-testing, the commit is not updated on parsoidtest1001. One reason for that can be a permission issue when checking out the repository, which may not interrupt the script. This may show up as unexpected results in the rt-tests or in the logs ("I should not get these errors with this commit").

  Warning: It's probably a good idea to keep sudo service parsoid-rt-client status running in a terminal for the duration of the operations, to check that the service doesn't get restarted.
  • Check on parsoidtest1001 that the commit in /srv/parsoid-testing is the same as the one on the top of the list in https://parsoid-rt-tests.wikimedia.org/commits
  • If they are different, congratulations, you found out what was weird in your rt-testing.
  • Fixing things and restarting:
    • Log in to testreduce1002
    • If the rt-testing is still running, run sudo service parsoid-rt-client stop. Check that it's indeed stopped with sudo service parsoid-rt-client status.
    • Clear the database results for the previous execution. The full procedure for this requires some tweaking and additional documentation:
      • Start mysql with mysql -u testreduce -p testreduce, use the password in /etc/testreduce/parsoid-rt.settings.js
      • delete from results where commit_hash='<hash>'; (note: this can take a few minutes, probably up to 10)
      • delete from stats where commit_hash='<hash>';
      • delete from perfstats where commit_hash='<hash>';
      • SOMETHING TBD needs to be done on the pages table to clear the crashers. Deleting the lines corresponding to the commit is probably a Bad Idea. It seems plausible that something along the lines of update pages set claim_num_tries=0, claim_timestamp=null, latest_stat=null, latest_result=null, latest_score=0, num_fetch_errors=0 where claim_hash='<hash>' might do the trick, but this needs to be confirmed.
      • It's also unclear whether it's necessary/useful/problematic to remove the corresponding line in the commits table. If doing that, it seems necessary to restart the parsoid-rt server so that the "known commits" are cleared and the line gets re-added. Maybe. Probably just update the timestamp and be done with it.
  • Restart the rt-testing as if nothing had happened.
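
The commit comparison in the first check above can be sketched as a small shell helper. It assumes that either hash may be abbreviated, so it accepts a pair where one is a prefix of the other. The helper name same_commit is hypothetical. The checked-out hash can be read with git -C /srv/parsoid-testing rev-parse HEAD on parsoidtest1001.

```shell
# Hypothetical helper: compare two commit hashes, either of which may be
# abbreviated. Returns 0 on match, 1 on mismatch, 2 if a hash is too short
# (fewer than 7 characters) to compare safely.
same_commit() {
  [ "${#1}" -ge 7 ] && [ "${#2}" -ge 7 ] || return 2
  case $1 in "$2"*) return 0 ;; esac
  case $2 in "$1"*) return 0 ;; esac
  return 1
}

# Usage on parsoidtest1001 (not executed here); <hash-from-commits-page>
# is the hash at the top of the rt server's commits list:
#   same_commit "$(git -C /srv/parsoid-testing rev-parse HEAD)" <hash-from-commits-page> \
#     || echo "checkout does not match the commit under test"
```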

Migrating to a K8s based MW setup

Currently we are in the process of migrating away from testreduce100x and parsoidtest100x. The custom bare metal nodes need to be decommissioned since they are the last mediawiki/php nodes. Here is an overall description of the migration steps.

Kubernetes based mw-parsoid env

We have a new Kubernetes-based env to use for Parsoid testing. It enables overriding the Parsoid codebase under /srv/parsoid-testing by giving devs write access (both for pulling specific commits and for manual intervention). The setup is similar to mw-debug, which lets requests be routed to a specific backend via an HTTP header.

In our case, the following request GETs the page using the overridden dev Parsoid version, specifically the one on eqiad:

curl https://en.wikipedia.org/wiki/Earch -H "X-Wikimedia-Debug: k8s-mw-parsoid-eqiad"

There are two envs: k8s-mw-parsoid-eqiad, k8s-mw-parsoid-codfw. The parsoid files can be accessed via the experimental nodes.

# ssh mw-experimental.{codfw,eqiad}.wmnet
# cd /srv/parsoid-testing

The mw-experimental instance is only used for mounting the overridden version of Parsoid. It is not the server that runs mediawiki/php.
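
The header-based routing described above can be wrapped in a small helper that accepts only the two known envs. This is a sketch: the function name parsoid_debug_header is hypothetical, and the header values are the two env names listed above.

```shell
# Hypothetical helper: map a datacenter name to the routing header.
parsoid_debug_header() {
  case $1 in
    eqiad|codfw)
      printf 'X-Wikimedia-Debug: k8s-mw-parsoid-%s\n' "$1" ;;
    *)
      echo "unknown datacenter: $1 (expected eqiad or codfw)" >&2
      return 1 ;;
  esac
}

# Usage (not executed here):
#   curl https://en.wikipedia.org/wiki/Dog -H "$(parsoid_debug_header codfw)"
```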

New testreduce instance

In the new setup, we can use mw-parsoid from public traffic (via custom header routing). So, we don't need an internal instance of mediawiki for parsoid testing. There is a new instance of testreduce in cloudvps in our team's project (wikitextexp).

The webservice is exposed here: https://parsoid-rt-tests.wmcloud.org/

The server is accessible via SSH here: ssh ctt-rt-testing-01.wikitextexp.eqiad1.wikimedia.cloud

The setup is the same as on testreduce1002.

Testing scripts usage

Currently we have two scripts for our weekly chores: bin/start-rt-test.sh and tools/RegressionTesting.php. The default values are still targeting the bare metal infrastructure.

To run them against the new environment in the interim, the script calls should be adapted as follows:

# UID is a read-only variable in bash, so use a different name for the ssh user.
SSH_USER=<your-uid>
# Known-good commit to compare against, and the commit under test.
ORACLE=<oracle>
COMMIT=<commit>
PARSOID_HOST=mw-experimental.eqiad.wmnet
TESTREDUCE_HOST=ctt-rt-testing-01.wikitextexp.eqiad1.wikimedia.cloud

./bin/start-rt-test.sh -u "$SSH_USER" --parsoid-host "$PARSOID_HOST" --testreduce-host "$TESTREDUCE_HOST" --no-restart-fpm --deploy-mw-parsoid "$COMMIT"

php tools/RegressionTesting.php -u "$SSH_USER" \
  --url "https://parsoid-rt-tests.wmcloud.org/regressions/between/$ORACLE/$COMMIT" \
  --testreduce-server "$TESTREDUCE_HOST" \
  --parsoidtest-server "$PARSOID_HOST" \
  --proxyURL http://localhost:8080 \
  --restartPHP false \
  --headers '{"X-Wikimedia-Debug":"k8s-mw-parsoid-eqiad"}' \
  --updateTestreduce 0 \
  --deploy-mw-parsoid

When we completely migrate to the new env, the overridden args will be used as defaults.

Getting shell access

Currently this is a bit hacky, but there is a script that devs can run from their home directory on the deployment node.

To get a bash shell to the running mw-parsoid container:

> ssh deployment.eqiad.wmnet
> # Download https://phabricator.wikimedia.org/P84273 as mw-experimental-shell
> chmod +x mw-experimental-shell
> ./mw-experimental-shell mw-parsoid eqiad

To fetch files from the running container to the deployment node:

> ssh deployment.eqiad.wmnet
> # Download https://phabricator.wikimedia.org/P92440 as mw-experimental-fetch
> chmod +x mw-experimental-fetch
> ./mw-experimental-fetch mw-parsoid eqiad /tmp/foo ~/localtmp

NOTE: Devs have write access only on /tmp. Make sure that ad-hoc scripts write to a folder like /tmp/out-$(date +%s)
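
A minimal sketch of the output-folder convention from the note above. The helper name make_outdir is hypothetical. It creates a timestamped directory under /tmp and prints its path, so that an ad-hoc script can write its output there.

```shell
# Hypothetical helper: create a fresh, timestamped output directory under
# /tmp and print its path. Fails if the directory already exists.
make_outdir() {
  outdir="/tmp/out-$(date +%s)"
  mkdir "$outdir" && printf '%s\n' "$outdir"
}

# Usage inside the container (not executed here); my-adhoc-script is a placeholder:
#   out=$(make_outdir) && my-adhoc-script > "$out/result.txt"
```

The printed path can then be passed to the fetch script above to copy the results to the deployment node.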

Deploying mw-parsoid

mw-parsoid is not deployed with the rest of the train. We need to manually deploy to use the current mediawiki images. Our test scripts do this as part of the orchestration steps. For manual deployments:

> ssh deployment.eqiad.wmnet
> cd /srv/deployment-charts/helmfile.d/services/mw-parsoid
> helmfile -e eqiad apply
> helmfile -e codfw apply

Grafana dashboard

There is a dedicated Grafana dashboard, prepared by the SREs, for the mw-parsoid instance here.

Currently tested steps

  •   Done Start roundtrip testing for a specific commit
  •   Done Start regression testing for a specific commit
  •   Done Trigger a roundtrip-test locally using custom headers for routing
  •   Done Get shell to the running container for manually running commands from deployment node
  •   Done Copy remote file/dir from a running container to deployment node
  •   Done Manual edit of /srv/parsoid-testing via ssh mw-experimental.{eqiad,codfw}.wmnet
  •   Done Dashboard for mw-parsoid specific logs on logstash

Todo / Roadmap

Please look at the general Parsoid roadmap.

Server UI and other usability improvements

We recently changed the server to use a templating system to separate the code from the presentation. Further improvements can now be made to the presentation itself.

Ideas for improvement:
  • Improve pairwise regressions/fixes interface on commits list bug 52407. Done!
  • Flag certain types of regressions that we currently search for by eye: create views with
    • Regressions introducing exactly one semantic/syntactic diff into a perfect page, and
    • Other introductions of semantic diffs to pages that previously had only syntactic diffs.
  • Improve diffing in results views:
    • Investigate other diffing libraries for speed,
    • Apply word based diffs on diffed lines,
    • Diff results pages between revisions to detect new semantic/syntactic errors,
    • Currently new diff content appears before old, which is confusing; change this.
  • Have a "clear last run" script
  • Automatically prune the results DB after each run, so (a) the latest run is guaranteed to be preserved, and (b) we never have to manually run the free-disk-space script before starting an rt test run.