For long-running operations, it often helps to model the active job as a REST resource with its own structure and/or sub-resources.
For example, starting a job may return a response such as
202 Accepted
Location: https://example.com/jobs/123
At that URL, the client will get a structure such as
{
"status":"running"
}
as long as the job is running,
{
"status":"finished",
"result":"https://example.com/jobs/123/result"
}
when it is completed and a result is available, or
{
"status":"interaction-required",
"prompt":"xyz service not available, please restart it or cancel job.",
"continue":"https://example.com/jobs/123/continue/<token>"
}
when the job needs input from the user. The job would continue (retrying access to xyz) after the client POSTs to the continue URL, or would be cancelled by, for example, a DELETE on the job URL. The continue URL would include an idempotency token, as suggested by @NPSF3000, so that a repeated or delayed request cannot accidentally answer the next interaction.
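From the client's side, the flow above could look roughly like the following Python sketch. The function name `follow_job` and the injected callables (`get_json`, `post`, `delete`, `ask_user`) are assumptions made for illustration, standing in for whatever HTTP library and UI the client actually uses; only the status values and field names come from the structures above.

```python
import time
from typing import Callable, Optional


def follow_job(
    job_url: str,
    get_json: Callable[[str], dict],   # hypothetical: GET the URL, return parsed JSON
    post: Callable[[str], None],       # hypothetical: POST to the URL
    delete: Callable[[str], None],     # hypothetical: DELETE the URL
    ask_user: Callable[[str], bool],   # hypothetical: show prompt, True means "continue"
    poll_seconds: float = 2.0,
) -> Optional[str]:
    """Poll the job resource; return the result URL, or None if the user cancelled."""
    while True:
        job = get_json(job_url)
        status = job["status"]
        if status == "finished":
            return job["result"]
        if status == "interaction-required":
            if ask_user(job["prompt"]):
                # The continue URL already carries the token for this interaction.
                post(job["continue"])
            else:
                delete(job_url)
                return None
        elif status != "running":
            raise ValueError(f"unexpected job status: {status!r}")
        time.sleep(poll_seconds)
```

Passing the transport in as callables keeps the sketch independent of any particular HTTP client; a real client would probably also want timeouts and error handling around each request.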
Which kinds of interaction are possible, and how the client presents them, would need to be designed around the specific needs of these jobs. The main point is that starting the operation returns not just the location of the result but the location of a reified job object that can be queried and manipulated.
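On the server side, such a reified job could be sketched as a small state object that produces the representations shown earlier. This is only an in-memory illustration under assumed names (`Job`, `require_interaction`, `resume`, `finish`, `representation` are all hypothetical); persistence, routing, and the actual work of the job are left out.

```python
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass
class Job:
    job_id: str
    base_url: str = "https://example.com/jobs"
    status: str = "running"
    prompt: Optional[str] = None
    token: Optional[str] = None

    @property
    def url(self) -> str:
        return f"{self.base_url}/{self.job_id}"

    def require_interaction(self, prompt: str) -> None:
        """Pause the job and issue a fresh token for this one interaction."""
        self.status = "interaction-required"
        self.prompt = prompt
        self.token = secrets.token_urlsafe(16)

    def resume(self, token: str) -> bool:
        """Handle a POST to the continue URL; return True only if the job resumed."""
        if self.status != "interaction-required" or token != self.token:
            return False  # stale or repeated request: ignore it
        self.status = "running"
        self.prompt = None
        self.token = None
        return True

    def finish(self) -> None:
        self.status = "finished"
        self.prompt = None
        self.token = None

    def representation(self) -> dict:
        """Build the JSON body served at the job URL."""
        if self.status == "finished":
            return {"status": "finished", "result": f"{self.url}/result"}
        if self.status == "interaction-required":
            return {
                "status": "interaction-required",
                "prompt": self.prompt,
                "continue": f"{self.url}/continue/{self.token}",
            }
        return {"status": "running"}
```

Because each interaction gets its own token and `resume` rejects any token that is not the current one, a duplicated or late POST to an old continue URL has no effect on a later prompt.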