The Wayback Machine - https://web.archive.org/web/20200812161737/https://github.com/PrefectHQ/prefect/issues/3100

Auth error when registering a flow with s3 storage #3100

Open
n-batalha opened this issue Aug 6, 2020 · 6 comments

Comments

@n-batalha n-batalha commented Aug 6, 2020

Description

When I register a flow with S3 storage, I get an auth error. A potential reason is flagged below under Environment.

I made it reproducible with the script below. Save it as example.py and make your env vars available:

export AWS_ACCESS_KEY_ID=REDACTED
export AWS_SECRET_ACCESS_KEY=REDACTED
export AWS_SESSION_TOKEN=REDACTED

I also have export PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS='{"ACCESS_KEY": "REDACTED", "SECRET_ACCESS_KEY": "REDACTED"}' set, although the docs say that we should use the normal config for boto.

Now run:

import os

import boto3
from prefect import Parameter, Flow, task
from prefect.environments import LocalEnvironment
from prefect.environments.storage import S3


# testing boto directly
s3 = boto3.resource("s3")

for bucket in s3.buckets.all():
    print(bucket.name)



# testing boto via Prefect
@task
def say_hello(person: str) -> None:
    print("Hello, {}!".format(person))


with Flow("Say hi!") as flow:
    name = Parameter("name")
    say_hello(name)


storage = S3(bucket="REDACTED")
# also tried, after setting AWS_CREDENTIALS:
# storage = S3(bucket="REDACTED", secrets=["AWS_CREDENTIALS"])


flow.storage = storage

flow.environment = LocalEnvironment(
    metadata={"image": "REDACTED"}
)

flow.register()

I get:

  File "example.py", line 36, in <module>
    flow.register()
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/prefect/core/flow.py", line 1575, in register
    registered_flow = client.register(
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/prefect/client/client.py", line 677, in register
    serialized_flow = flow.serialize(build=build)  # type: Any
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/prefect/core/flow.py", line 1429, in serialize
    storage = self.storage.build()  # type: Optional[Storage]
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/prefect/environments/storage/s3.py", line 184, in build
    raise err
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/prefect/environments/storage/s3.py", line 177, in build
    self._boto3_client.upload_fileobj(
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/boto3/s3/inject.py", line 539, in upload_fileobj
    return future.result()
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/s3transfer/futures.py", line 106, in result
    return self._coordinator.result()
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/s3transfer/futures.py", line 265, in result
    raise self._exception
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/s3transfer/tasks.py", line 126, in __call__
    return self._execute_main(kwargs)
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/s3transfer/tasks.py", line 150, in _execute_main
    return_value = self._main(**kwargs)
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/s3transfer/upload.py", line 692, in _main
    client.put_object(Bucket=bucket, Key=key, Body=body, **extra_args)
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/botocore/client.py", line 316, in _api_call
    return self._make_api_call(operation_name, kwargs)
  File "/Users/nbatalha/Library/Caches/pypoetry/virtualenvs/pipelines-P6yz7rn1-py3.8/lib/python3.8/site-packages/botocore/client.py", line 635, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (InvalidAccessKeyId) when calling the PutObject operation: The AWS Access Key Id you provided does not exist in our records.

Expected Behavior

Flow is uploaded

Reproduction

See above. If you need I can provide poetry.lock.

Environment

NOTE how the boto env vars are not listed below, that could be the issue.

prefect diagnostics
{
  "config_overrides": {
    "server": {
      "telemetry": {
        "enabled": true
      }
    }
  },
  "env_vars": [
    "PREFECT__LOGGING__LEVEL",
    "PREFECT__CONTEXT__SECRETS__CRUNCHBASE_API_KEY",
    "PREFECT__CLOUD__AUTH_TOKEN",
    "PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS"
  ],
  "system_information": {
    "platform": "macOS-10.15.6-x86_64-i386-64bit",
    "prefect_version": "0.12.6",
    "python_version": "3.8.3"
  }
}

@joshmeek joshmeek commented Aug 6, 2020

Hi @n-batalha, what happens if you initialize a boto3 client directly and call upload_fileobj to put something in your bucket? The error suggests that this isn't something to do with Prefect but with your access key.
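
The check suggested here could be sketched roughly as follows. This is only an illustration and not part of Prefect: the helper name `upload_probe` and the default key are made up, and the client is passed in as an argument so the same function works with any boto3 S3 client.

```python
import io


def upload_probe(client, bucket: str, key: str = "prefect-auth-probe.txt") -> str:
    """Upload a tiny in-memory object with the given S3 client and return its key.

    `client` is expected to be a boto3 S3 client, for example boto3.client("s3"),
    which resolves credentials from the ambient environment.
    """
    payload = io.BytesIO(b"credential check")
    # upload_fileobj is the same boto3 call that fails in the traceback above
    client.upload_fileobj(payload, bucket, key)
    return key


# Example usage (requires boto3 and valid AWS credentials):
#   import boto3
#   upload_probe(boto3.client("s3"), "REDACTED")
```

If this succeeds while flow.register() fails, the two code paths are probably not using the same credentials.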

@n-batalha n-batalha commented Aug 6, 2020

@joshmeek the same credentials work for uploads: I use them to store Prefect results in S3 and for a direct boto upload.

@n-batalha n-batalha commented Aug 6, 2020

@joshmeek to be 100% sure, I edited the reproducible example above to do an upload to the same bucket, and it works.

@joshmeek joshmeek commented Aug 6, 2020

🤔 Be sure that you aren't unintentionally reading AWS credentials from one of the other default locations that boto3 uses: https://boto3.amazonaws.com/v1/documentation/api/latest/guide/configuration.html

FWIW I ran your code exactly and had no issue uploading to a bucket:

➜ python s3storagetest.py
...my buckets printed here...
Result check: OK
[2020-08-06 11:51:08] INFO - prefect.S3 | Uploading say-hi/2020-08-06t11-51-08-712556-00-00 to my-bucket
Flow: https://cloud.prefect.io/tenant-name/flow/flow-id
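
One way to see which of boto3's default locations supplied the credentials is to inspect the session's resolved credentials. The sketch below is a hedged illustration: `describe_credentials` is a made-up helper, and it only reports metadata (never the secret itself). It relies on the `method` attribute that botocore sets on resolved credentials, which names the provider that was used (for example environment variables versus a shared credentials file).

```python
def describe_credentials(session) -> dict:
    """Report where a boto3 session found its credentials, without exposing secrets.

    `session` is expected to be a boto3.Session (or anything with a compatible
    get_credentials() method).
    """
    creds = session.get_credentials()
    if creds is None:
        # boto3 could not find credentials in any of its default locations
        return {"found": False}
    return {
        "found": True,
        "method": creds.method,  # name of the provider that supplied the credentials
        "access_key_suffix": creds.access_key[-4:],
        "has_session_token": bool(creds.token),
    }


# Example usage (requires boto3):
#   import boto3
#   print(describe_credentials(boto3.Session()))
```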

@n-batalha n-batalha commented Aug 6, 2020

@joshmeek good heads up. I believe the env vars take precedence so it should work for both.

The docs state (the part here) that the same setup is required as boto, but there is a bit going on in here:

  • https://github.com/PrefectHQ/prefect/blob/master/src/prefect/environments/storage/s3.py#L201-L203

def get_boto_client(
    resource: str, credentials: dict = None, use_session: bool = False, **kwargs: Any
) -> "boto3.client":
    """
    Utility function for loading boto3 client objects from a given set of credentials.

    Args:
        - resource (str): the name of the resource to retrieve a client for
        - credentials (dict, optional): a dictionary of AWS credentials used to
            initialize the Client; if not provided, will attempt to load the
            Client using ambient environment settings
        - use_session (bool, optional): a boolean specifying whether to load
            this client using a session or not; defaults to `False`
        - **kwargs (Any, optional): additional keyword arguments to pass to boto3

    Returns:
        - Client: an initialized and authenticated boto3 Client
    """
    aws_access_key = None
    aws_secret_access_key = None
    aws_session_token = None

    if credentials:
        aws_access_key = credentials["ACCESS_KEY"]
        aws_secret_access_key = credentials["SECRET_ACCESS_KEY"]
        aws_session_token = credentials.get("SESSION_TOKEN")
    else:
        ctx_credentials = prefect.context.get("secrets", {}).get("AWS_CREDENTIALS", {})
        aws_access_key = ctx_credentials.get("ACCESS_KEY")
        aws_secret_access_key = ctx_credentials.get("SECRET_ACCESS_KEY")
        aws_session_token = ctx_credentials.get("SESSION_TOKEN")

    if use_session:
        # see https://boto3.amazonaws.com/v1/documentation/api/latest/guide/resources.html?#multithreading-multiprocessing # noqa
        session = boto3.session.Session()
        return session.client(
            resource,
            aws_access_key_id=aws_access_key or kwargs.pop("aws_access_key_id", None),
            aws_secret_access_key=aws_secret_access_key
            or kwargs.pop("aws_secret_access_key", None),
            aws_session_token=aws_session_token
            or kwargs.pop("aws_session_token", None),
            **kwargs
        )
    else:
        return boto3.client(
            resource,
            aws_access_key_id=aws_access_key or kwargs.pop("aws_access_key_id", None),
            aws_secret_access_key=aws_secret_access_key
            or kwargs.pop("aws_secret_access_key", None),
            aws_session_token=aws_session_token
            or kwargs.pop("aws_session_token", None),
            **kwargs
        )

Removing ~/.aws had no impact (still fails), but removing the PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS env var allows the script above to run. I can see the env var above being picked up for the boto creds, as opposed to leaving it to boto entirely.
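
To make the observation concrete, the credential selection in get_boto_client can be mirrored in a few lines without calling boto3 at all. The function below is a simplified sketch (the name `resolve_explicit_credentials` is made up) that returns the keyword arguments the quoted code would end up passing to boto3. Note that a key pair passed explicitly to a boto3 client takes precedence over the AWS_* environment variables. One possible explanation for the error, which is an assumption and not confirmed in this thread: if the keys are temporary (the AWS_SESSION_TOKEN export in the description suggests they might be), a secret that holds only ACCESS_KEY and SECRET_ACCESS_KEY would be sent without its session token.

```python
from typing import Optional


def resolve_explicit_credentials(
    credentials: Optional[dict], context_secrets: dict
) -> dict:
    """Mirror the credential selection quoted above, minus the boto3 call.

    `context_secrets` stands in for prefect.context.get("secrets", {}).
    """
    if credentials:
        access_key = credentials["ACCESS_KEY"]
        secret_key = credentials["SECRET_ACCESS_KEY"]
        token = credentials.get("SESSION_TOKEN")
    else:
        ctx = context_secrets.get("AWS_CREDENTIALS", {})
        access_key = ctx.get("ACCESS_KEY")
        secret_key = ctx.get("SECRET_ACCESS_KEY")
        token = ctx.get("SESSION_TOKEN")
    return {
        "aws_access_key_id": access_key,
        "aws_secret_access_key": secret_key,
        "aws_session_token": token,
    }


# A secret shaped like the one in this issue: key pair only, no SESSION_TOKEN.
secrets = {"AWS_CREDENTIALS": {"ACCESS_KEY": "A", "SECRET_ACCESS_KEY": "S"}}
resolved = resolve_explicit_credentials(None, secrets)
print(resolved["aws_session_token"])  # None: any AWS_SESSION_TOKEN in the env is not consulted here
```

With no AWS_CREDENTIALS secret at all, every value resolves to None and boto3 is left to find credentials on its own, which matches the behaviour reported after unsetting the env var.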

I am leaving this open as it should work when both PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS and the boto env vars are present?

@n-batalha n-batalha commented Aug 6, 2020

Btw, while debugging I can see that the PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS values are correctly parsed in here, and the values themselves seem fine. They are pulled from the other env vars via:

export PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS='{"ACCESS_KEY": "'${AWS_ACCESS_KEY_ID}'", "SECRET_ACCESS_KEY": "'${AWS_SECRET_ACCESS_KEY}'"}'
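
For comparison, the same secret could be assembled in Python with json.dumps, which avoids the nested shell quoting. This is a hedged sketch rather than a documented Prefect recipe: `build_aws_credentials_secret` is a made-up helper, and including SESSION_TOKEN when AWS_SESSION_TOKEN is set is an assumption based on the key that the get_boto_client code quoted earlier reads.

```python
import json
import os


def build_aws_credentials_secret(env=None) -> str:
    """Build the JSON value for PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS.

    Reads the standard AWS_* variables from `env` (defaults to os.environ).
    """
    env = os.environ if env is None else env
    secret = {
        "ACCESS_KEY": env["AWS_ACCESS_KEY_ID"],
        "SECRET_ACCESS_KEY": env["AWS_SECRET_ACCESS_KEY"],
    }
    token = env.get("AWS_SESSION_TOKEN")
    if token:
        # Only present for temporary credentials
        secret["SESSION_TOKEN"] = token
    return json.dumps(secret)


# Example usage:
#   os.environ["PREFECT__CONTEXT__SECRETS__AWS_CREDENTIALS"] = build_aws_credentials_secret()
```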