gitlab-docker-runner-v2.analytics instance is inaccessible via SSH
Closed, Resolved (Public)

Description

I recently created instance gitlab-docker-runner-v2.analytics.eqiad1.wikimedia.cloud.

Now I am trying to SSH into it and get the following error:

ssh -J [email protected] xcollazo@gitlab-docker-runner-v2.analytics.eqiad1.wikimedia.cloud
Connection closed by UNKNOWN port 65535

In verbose mode I get:

...
debug1: Authentications that can continue: publickey
debug1: Next authentication method: publickey
debug1: get_agent_identities: bound agent to hostkey
debug1: get_agent_identities: ssh_fetch_identitylist: agent contains no identities
debug1: Will attempt key: /Users/xcollazo/.ssh/xcollazo_wmf_non_prod ED25519 SHA256:8McG5lNISHCG8P0GOdcdzWIPNTc4WoGlfD7ZH6rAjSc explicit
debug1: Offering public key: /Users/xcollazo/.ssh/xcollazo_wmf_non_prod ED25519 SHA256:8McG5lNISHCG8P0GOdcdzWIPNTc4WoGlfD7ZH6rAjSc explicit
debug1: Server accepts key: /Users/xcollazo/.ssh/xcollazo_wmf_non_prod ED25519 SHA256:8McG5lNISHCG8P0GOdcdzWIPNTc4WoGlfD7ZH6rAjSc explicit
debug1: channel 0: FORCE input drain
Connection closed by UNKNOWN port 65535
debug1: channel 0: free: direct-tcpip: listening port 0 for gitlab-docker-runner-v2.analytics.eqiad1.wikimedia.cloud port 22, connect from 127.0.0.1 port 65535 to UNKNOWN port 65536, nchannels 1
Killed by signal 1.

Note I can SSH fine to other instances under the analytics project.

Event Timeline

The Cloud-Services project tag is not intended to have any tasks. Please check the list on https://phabricator.wikimedia.org/project/profile/832/ and replace it with a more specific project tag to this task. Thanks!

taavi subscribed.

The instance has Puppet roles applied that do not compile:

root@gitlab-docker-runner-v2:~# sudo run-puppet-agent
Info: Using environment 'production'
Info: Retrieving pluginfacts
Info: Retrieving plugin
Info: Loading facts
Error: Could not retrieve catalog from remote server: Error 500 on SERVER: Server Error: Evaluation Error: Error while evaluating a Resource Statement, Evaluation Error: Error while evaluating a Function Call, Class[Docker::Configuration]: expects a value for parameter 'settings' (file: /srv/puppet_code/environments/production/modules/docker/manifests/init.pp, line: 22, column: 5) on node gitlab-docker-runner-v2.analytics.eqiad1.wikimedia.cloud
Warning: Not using cache on failed catalog
Error: Could not retrieve catalog; skipping run

Because the Puppet compilation is failing, the settings needed for logging in have not been applied to the instance.

This is not an infrastructure issue, thus untagging us.
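The diagnosis above rests on a single line of the agent output: the catalog fails to compile because a class is missing a required parameter. As a minimal sketch, assuming the error text always follows the format quoted in this task (other Puppet versions may word it differently), the class, parameter, and manifest location could be pulled out of a captured agent log like this. The function name `find_missing_parameters` and the pattern are hypothetical helpers for illustration, not part of Puppet or any Wikimedia tooling.

```python
import re

# Pattern modelled only on the error line quoted in this task (an assumption;
# it is not an official or exhaustive description of Puppet's error format).
MISSING_PARAM = re.compile(
    r"Class\[(?P<cls>[^\]]+)\]: expects a value for parameter '(?P<param>[^']+)'"
    r".*?\(file: (?P<file>[^,]+), line: (?P<line>\d+)"
)


def find_missing_parameters(agent_output: str) -> list:
    """Return one dict per missing-parameter error found in the agent output."""
    results = []
    for match in MISSING_PARAM.finditer(agent_output):
        results.append(
            {
                "class": match.group("cls"),
                "parameter": match.group("param"),
                "file": match.group("file"),
                "line": int(match.group("line")),
            }
        )
    return results


# The error line from this task, copied verbatim from the agent output.
SAMPLE = (
    "Error: Could not retrieve catalog from remote server: Error 500 on SERVER: "
    "Server Error: Evaluation Error: Error while evaluating a Resource Statement, "
    "Evaluation Error: Error while evaluating a Function Call, "
    "Class[Docker::Configuration]: expects a value for parameter 'settings' "
    "(file: /srv/puppet_code/environments/production/modules/docker/manifests/init.pp, "
    "line: 22, column: 5) on node gitlab-docker-runner-v2.analytics.eqiad1.wikimedia.cloud"
)

if __name__ == "__main__":
    for error in find_missing_parameters(SAMPLE):
        print(
            f"{error['class']} is missing '{error['parameter']}' "
            f"({error['file']}:{error['line']})"
        )
```

For the log in this task, the sketch reports that `Docker::Configuration` is missing `settings`, which points at the role configuration rather than at the SSH setup itself.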

Aklapper renamed this task from "gitlab-docker-runner-v2.analytics instance is innaccesible via SSH" to "gitlab-docker-runner-v2.analytics instance is inaccessible via SSH". Thu, Nov 13, 9:01 PM

@taavi, I removed all Puppet roles, and I still cannot SSH in.

If the initial set of Puppet runs fails, the instance gets stuck in a weird state. I have manually kicked off a run, which might or might not get it working again.

Got it, thank you!

xcollazo claimed this task.

Accessible again, closing.