The Wayback Machine - https://web.archive.org/web/20220315162239/https://github.com/aws/aws-sam-cli/pull/3509
Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix Lambda invoke in user networks #3509

Open
wants to merge 2 commits into
base: develop
Choose a base branch
from

Conversation

awilkins
Copy link

@awilkins awilkins commented Dec 1, 2021

Which issue(s) does this change fix?

#2837
(at least for the subset of cases that are caused by using a user defined network in a docker stack)

Why is this change necessary?

The Docker CLI client won't let you attach containers to a user
network and the bridge at the same time, presumably for a good
reason.

One good reason seems to be that name resolution doesn't
work - which means that if you pass the service name of e.g.
localstack on the same user network to provide endpoints for AWS
APIs for testing, it's not going to work.

Disconnecting the container from the bridge network seems to fix this.

Secondly, if SAM is running on a container inside the same user
network (the likeliest scenario for using one?) the Lambda ports are
not available on localhost, and the remapped ports are not available
inside the network.

How does it address the issue?

Therefore if the container is in a user network :

  • Detach it from the bridge network
  • Set its hostname to the first 12 chars of the container ID
  • Set its rapid port to 8080

Then invocations work properly.

What side effects does this change have?

  • Detaches containers in a user network from the bridge adapter
    • Not sure if this has any deleterious effects, because I can't imagine a scenario where someone would want their Lambda invokers on both networks
  • Stops lamba executors in a user network from working if SAM is executing outside the docker stack
    • Probably needs fixing. Not sure if the code has an existing method of working this out.

Checklist

  • Add input/output type hints to new functions/methods
  • Write design document (Do I need to write a design document?)
  • Write unit tests
  • Write/update functional tests
  • Write/update integration tests
  • make pr passes
  • make update-reproducible-reqs if dependencies were changed
  • Write documentation

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.

The Docker CLI client won't let you attach containers to a user
network and the bridge at the same time, presumably for a good
reason. One good reason seems to be that name resolution doesn't
work - which means that if you pass the service name of e.g.
`localstack` on the same user network to provide endpoints for AWS
APIs for testing, it's not going to work. Disconnecting the container
from the bridge network seems to fix this.

Secondly, if SAM is running on a container inside the same user
network (the likeliest scenario for using one?) the Lambda ports are
not available on `localhost`, and the remapped ports are not available
inside the network.

Therefore if the container is in a user network :

- Detach it from the bridge network
- Set its hostname to the first 12 chars of the container ID
- Set its rapid port to 8080

Then invocations work properly.
@jfuss
Copy link
Contributor

@jfuss jfuss commented Dec 1, 2021

@awilkins I hesitate with this. I get your reasoning but I don't think we should assume it is bad. We use a docker client, which eventually communicates to the docker daemon. If it was bad or not allowed, I assume docker wouldn't let you do it at all.

What do you mean by: "Stops lamba executors in a user network from working if SAM is executing outside the docker stack"? Are you saying that SAM CLI won't be able to call a container over localhost if a network is provided?

It's been a while since I have dug into the docker networking but I am unsure what side effects this may have. Customers may be using networks and docker.host.internal. We would need to add integration with cases we can think of and validate this is actually safe to be doing.

@awilkins
Copy link
Author

@awilkins awilkins commented Dec 1, 2021

I think hesitation is entirely correct, because clearly this is a core bit of SAM's APIG support. But I figured opening a PR was a good context to discuss what needs fixing before it would be ready to merge.

If it was bad or not allowed, I assume docker wouldn't let you do it at all.

The CLI client refuses to let you attach containers to both networks :

❯ docker run --net lambda --net bridge -it bash
docker: conflicting options: cannot attach both user-defined and non-user-defined network-modes.
See 'docker run --help'.

The client library doesn't seem to check this though (it literally just posts the request to the daemon, and I assume the daemon doesn't check either).

What do you mean by: "Stops lamba executors in a user network from working if SAM is executing outside the docker stack"

If you've set --docker-network to a value other than host, the patch will make SAM try to invoke the lambdas over their "real" host/port {container_id[:12]}:8080 rather than their exposed host/port localhost:{remapped_port} ; really this should only happen if SAM is running on a container in a network that can reach them (ie, the same user network).

Customers may be using networks and docker.host.internal. We would need to add integration with cases we can think of and validate this is actually safe to be doing.

Agreed. It's working great for my case (which is a docker-compose stack of api-tests --> sam-api --> [invoke_containers] --> localstack using the --exit-code-from api-tests flag to get the test outcome), but I don't want to spoil anyone else's fun.

Guess there's some design docs in my future.

@jfuss
Copy link
Contributor

@jfuss jfuss commented Dec 2, 2021

@awilkins

The CLI client refuses to let you attach containers to both networks :

The command you provided isn't what SAM CLI does. We first create the container and then attach it to a network if one is provided. Not sure if that matters but could be a difference that impacts how the docker cli (or daemon) works.

If you've set --docker-network to a value other than host, the patch will make SAM try to invoke the lambdas over their "real" host/port {container_id[:12]}:8080 rather than their exposed host/port localhost:{remapped_port} ; really this should only happen if SAM is running on a container in a network that can reach them (ie, the same user network).

I need to think more here but --docker-network is for putting the local Lambda Container into a specific network. SAM CLI still needs access to call the container from localhost (we have to make sure not to break the non SAM CLI running in a container case).

Guess there's some design docs in my future.

Maybe not :). What I want to make sure is if we do add this we can add tests that verify it works and doesn't regress what we already support. SAM CLI was never really built for running in a container, as it needs Docker. There are ways around it but not something we really design for. Everything that doesn't use docker is fine but Docker in Docker + networking is overhead (and frankly much easier if you don't run in docker but I digress).

I am not going to be able to get back to this until next week. I am going to assign it to myself and come back to this. Again, I don't want to block these docker in docker cases but we have to be very careful because we don't want to break customers using the CLI on their system and not within docker.

@jfuss jfuss self-assigned this Dec 2, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment