Jana Hockenberger

Automatically Upload Lambda Layer Packages to S3

Why use Lambda Layers?

The benefits of AWS Lambda Layers should not be missed. Lambda Layers give you the possibility to package code once and reuse it across different functions.

In large environments, all Lambda Layers are usually located in one specific account and then shared via layer version permissions so that the functions in other accounts can use them.
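
To illustrate the sharing step, here is a minimal sketch of how a layer version could be shared with a whole organization using boto3. The layer name, version number and organization ID are placeholders and not taken from this article's setup.

import boto3

lambdaClient = boto3.client('lambda')

# Hypothetical example: allow every account in the organization to use
# version 1 of a layer called "my-shared-layer"
lambdaClient.add_layer_version_permission(
    LayerName='my-shared-layer',        # placeholder layer name
    VersionNumber=1,                    # placeholder version
    StatementId='org-wide-access',
    Action='lambda:GetLayerVersion',
    Principal='*',
    OrganizationId='o-xxxxxxxxxx'       # placeholder organization ID
)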

You can use Lambda Layers either to share your own helper code or to make entire packages available that are not included in the Lambda runtimes. This blog article provides a solution for uploading Lambda layer packages with complex folder structures to an S3 bucket. As you should always deploy your resources with Infrastructure as Code, a manual upload is not a satisfying solution here.

Deep-Dive into the Lambda Function

All steps are covered in one Lambda function. Depending on your preference, you can also split them up. In this example, I will guide you through all steps executed in one function.

Summary of the procedure

The base procedure is as follows:

1. The package is placed in our Customizations for Control Tower (CfCT) repository.
2. A Lambda function checks the path where the package is located and recursively loops through all folders, subfolders and files.
3. The same folder and file structure is then recreated in the local /tmp directory, using the non-persistent storage that every Lambda function provides by default.
4. The local directory is zipped and uploaded to S3, where another automation picks up the zip file and adds it to the Lambda layers.

Architecture

This blog article focuses only on the upload part to S3. This part is quite tricky, because the folder structure is not known in advance and the automation has to stay dynamic enough to handle further packages with different folder structures that should be added as layers.

Necessary environment variables

This example uses Bitbucket as the repository hosting service. A prerequisite is a working CodeStar Connection between the CfCT pipeline and Bitbucket, including authentication, so that the necessary access to the repository is in place.

BITBUCKET_WORKSPACE_NAME = os.environ['WORKSPACE_NAME']
BITBUCKET_REPO_NAME = os.environ['REPOSITORY_NAME']
BITBUCKET_TOKEN_PARAMETER = os.environ['BITBUCKET_TOKEN_PARAMETER']
BITBUCKET_BRANCH_NAME = os.environ['BRANCH_NAME']
BUCKET_NAME = os.environ['BUCKET_NAME']
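
The original snippets do not show the module-level imports. Reconstructed from the code used throughout this article, they would look roughly like this:

import datetime
import os
import shutil

import boto3
import requests
from botocore.exceptions import ClientError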

Corresponding environment variables should be set on the Lambda function, containing all necessary Bitbucket information: the Bitbucket workspace, the repository name, the name of the SSM parameter holding the API token, and the branch name.

Another environment variable is a list of the folder paths where the packages are located. The S3 bucket to which the zip file will be uploaded should also be set as an environment variable.
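
The folder path list is not part of the snippet above. A minimal sketch could look like this; the variable name FOLDER_PATHS and the comma-separated format are assumptions, not taken from the article:

# Hypothetical: comma-separated list of repository paths that contain layer packages,
# e.g. "lambda/layers/package-a,lambda/layers/package-b"
FOLDER_PATHS = [
    path.strip() for path in os.environ['FOLDER_PATHS'].split(',') if path.strip()
]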

How to trigger the Lambda function

The Lambda function gets triggered as soon as the CodePipeline starts. You can achieve this with a separate CodePipeline or with an EventBridge rule that listens for the corresponding CloudTrail event.
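
As a hedged illustration (not the article's actual setup), such an EventBridge rule could match the StartPipelineExecution CloudTrail event and target the Lambda function. The rule name and function ARN below are placeholders, and a CloudTrail trail plus the Lambda invoke permission for EventBridge are assumed to exist already:

import json

import boto3

events = boto3.client('events')

# Hypothetical rule: react to StartPipelineExecution calls recorded by CloudTrail
eventPattern = {
    "source": ["aws.codepipeline"],
    "detail-type": ["AWS API Call via CloudTrail"],
    "detail": {
        "eventSource": ["codepipeline.amazonaws.com"],
        "eventName": ["StartPipelineExecution"]
    }
}

events.put_rule(
    Name='trigger-layer-upload',    # placeholder rule name
    EventPattern=json.dumps(eventPattern),
    State='ENABLED'
)
events.put_targets(
    Rule='trigger-layer-upload',
    Targets=[{
        'Id': 'layer-upload-lambda',
        'Arn': 'arn:aws:lambda:eu-central-1:111122223333:function:layer-upload'  # placeholder ARN
    }]
)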

First, the API token is retrieved from an encrypted SSM parameter to be able to set up the connection to the Bitbucket repository.

def getApiToken():
    ssmClient = boto3.client('ssm')
    try:
        response = ssmClient.get_parameter(
            Name=BITBUCKET_TOKEN_PARAMETER,
            WithDecryption=True
        )
        token = response['Parameter']['Value']
        return token
    except ClientError as e:
        print(e)
        raise e

Collecting all items from the repository

The Lambda function loops through every entry of the folder path variable and calls the getFolder method, roughly as sketched below.
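
A minimal sketch of that entry point, assuming the hypothetical FOLDER_PATHS list from above and the processFolder helper shown later in this article (the handler structure itself is not taken from the original code):

def lambda_handler(event, context):
    token = getApiToken()
    s3Client = boto3.client('s3')

    # Hypothetical outer loop: process every configured package path
    for folderPath in FOLDER_PATHS:
        newFolderPath = getFolder(token, folderPath, s3Client)
        processFolder(newFolderPath, s3Client)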

In the getFolder method, the base URL for Bitbucket is assembled and the token is set in the headers variable. This step is necessary to access the remote repository.

def getFolder(token, folderPath, s3Client):
    print(f"Checking Folder Path {folderPath}")
    baseUrl = f'https://api.bitbucket.org/2.0/repositories/{BITBUCKET_WORKSPACE_NAME}/{BITBUCKET_REPO_NAME}/src/{BITBUCKET_BRANCH_NAME}/{folderPath}'
    print(f"Base Url: {baseUrl}")
    headers = {'Authorization': f'Bearer {token}'}

After that, the getAllItems method gets called. An empty list is initialized, and with the help of the requests package a GET request is sent to capture all files from the provided folder path in the repository. Because the Bitbucket API paginates its responses, the method follows the 'next' link until all items have been collected.

def getAllItems(url, headers):
    files = []
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()  
        data = response.json()
        files.extend(data.get('values', []))

        url = data.get('next', None)
    return files

Back in the getFolder method, the localFolderPath gets set to /tmp, because this is where the package structure should be temporarily saved. With the help of the os package and its makedirs function, a folder with the same name gets created in the Lambda function's environment.

    localFolderPath = os.path.join('/tmp', folderPath.lstrip('/'))
    os.makedirs(localFolderPath, exist_ok=True)
    print(f"Created local folder: {localFolderPath}")
    newFolderPath = ""

    # Collect all items of the current folder from the repository
    repoItems = getAllItems(baseUrl, headers)

Local creation of the package structure

Now comes the complicated part: The function iterates through all items retrieved from the getAllItems method using a for-loop. The item object looks like this:

{
  "path": "lambda/layers/xxxxx",
  "commit": {
    "hash": "xxxxx",
    "links": {
      "self": {
        "href": "https://api.bitbucket.org/2.0/repositories/xxxxx/commit/xxxxx"
      },
      "html": {
        "href": "https://bitbucket.org/xxxxx/commits/xxxxx"
      }
    },
    "type": "commit"
  },
  "type": "commit_file",
  "attributes": [],
  "escaped_path": "lambda/layers/xxxx",
  "size": 1779,
  "mimetype": "text/x-python",
  "links": {
    "self": {
      "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
    },
    "meta": {
      "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
    },
    "history": {
      "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
    }
  }
}

The item path and the item type of the current entry get saved into two variables.

If the itemType equals commit_directory, the old path plus the name of the item gets set as the newFolderPath, and the folder gets created in the /tmp directory as well. After that, the getFolder function gets called again with the new folder path.

    for item in repoItems:
        itemPath = item['path']
        itemType = item['type']

        # Check whether itemType is a directory
        if itemType == 'commit_directory':
            itemPath = itemPath.split('/')[-1]
            newFolderPath = os.path.join(folderPath, itemPath).lstrip('/')

            # Folder gets created locally
            localSubfolderPath = os.path.join('/tmp', newFolderPath)
            os.makedirs(localSubfolderPath, exist_ok=True)
            print(f"Found folder and created local subfolder: {localSubfolderPath}")

            # getFolder function gets called again with new folder path
            getFolder(token, newFolderPath, s3Client)

If the itemType equals commit_file, the file name gets read out of the whole path. Then the url variable gets set to point to the file in the Bitbucket repository, and the file gets downloaded and created in the /tmp directory under the correct subfolder.

        elif itemType == 'commit_file':
            print(f"Found file: {itemPath}")
            fullPath = item['path']
            pathParts = fullPath.split('/')
            fileName = pathParts[-1]

            # Get the file content from Bitbucket and upload it to S3
            url = f'https://api.bitbucket.org/2.0/repositories/{BITBUCKET_WORKSPACE_NAME}/{BITBUCKET_REPO_NAME}/src/{BITBUCKET_BRANCH_NAME}/{folderPath}/{fileName}'
            headers = {'Authorization': f'Bearer {token}'}
            response = requests.get(url, headers=headers)
            if response.status_code == 200:
                localFilePath = os.path.join(localFolderPath, fileName)
                with open(localFilePath, 'wb') as file:
                    file.write(response.content)
                print(f"Found file and created local file: {localFilePath}")
        else:
            print(f"Unknown item type: {itemType} for {itemPath}")
    return newFolderPath

Zip and upload of the package to S3

After the for loop is finished, the processFolder function gets called to read out the package name of the Lambda package.
The addFolderArchive function gets called next.
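
The processFolder code itself is not shown at this point in the article. A minimal sketch, assuming it simply splits the path returned by getFolder into the parent path and the package name before handing both to addFolderArchive, could look like this:

def processFolder(newFolderPath, s3Client):
    # Hypothetical sketch: derive the package (folder) name from the path
    # and pass it on to addFolderArchive for zipping and uploading
    folderPath, folderName = os.path.split(newFolderPath.rstrip('/'))
    addFolderArchive(folderName, folderPath, s3Client)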


This method is responsible for zipping the package structure and uploading it to S3. First the current timestamp is generated, then the name of the zip file is set. With the shutil package the folder gets zipped via the make_archive function and uploaded to the bucket provided by the environment variable. The last step is to save the package name, with the timestamp at the end, in an SSM parameter, which can then be used further in the part where the layer itself gets created and shared with the other accounts.

def addFolderArchive(folderName, folderPath, s3Client):
    timeStamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    zipFileName = f'{folderName}_{timeStamp}.zip'

    fullFolderPath = f'/tmp/{folderPath}/{folderName}'
    tempZipFile = f'/tmp/{zipFileName}'

    # Zip File gets created and uploaded to S3
    shutil.make_archive(tempZipFile[:-4], 'zip', fullFolderPath)
    s3Client.upload_file(tempZipFile, BUCKET_NAME, zipFileName)

    # SSM Parameter gets set with package name
    ssmClient = boto3.client('ssm')
    ssmClient.put_parameter(
        Name=f'/org/layer/package/{folderPath}/{folderName}/zipArchive',
        Description=f'Archive name for {folderName} in s3 Bucket {BUCKET_NAME}',
        Value=zipFileName,
        Type='String',
        Overwrite=True
    )

    # Local files are removed
    if os.path.exists(tempZipFile):
        os.remove(tempZipFile)
    if os.path.exists(fullFolderPath):
        shutil.rmtree(fullFolderPath)
    print(f"Folder {folderName} archived as {zipFileName} and uploaded to S3")


Finally, the local zip file and folder get removed from the Lambda environment, and the function is finished.
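
The downstream automation that turns the uploaded zip into a layer is out of scope for this article, but for illustration it could read the SSM parameter and publish the layer version roughly like this. The layer name, parameter path and runtime list are placeholders:

lambdaClient = boto3.client('lambda')
ssmClient = boto3.client('ssm')

# Hypothetical downstream step: read the archive name written by addFolderArchive
# and publish the zip file from S3 as a new layer version
zipFileName = ssmClient.get_parameter(
    Name='/org/layer/package/lambda/layers/example/zipArchive'  # placeholder parameter name
)['Parameter']['Value']

lambdaClient.publish_layer_version(
    LayerName='example-layer',    # placeholder layer name
    Content={'S3Bucket': BUCKET_NAME, 'S3Key': zipFileName},
    CompatibleRuntimes=['python3.12']    # placeholder runtime list
)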

The whole Lambda code can be found in my GitHub account.

About Me

Hi! My name is Jana, I live in the southwest of Germany, and when I'm not smashing weights in the gym I love to architect solutions in AWS, making my life and my customers' lives easier.

My computer science journey started as an On-Premise System Administrator and developed over time into an AWS Architect. As I know both the "old" and the "new" world, I know common pain points in architectures and can provide solutions that solve them, making them not only more efficient but also cheaper!

I enjoy learning, and as the AWS portfolio is evolving all the time, I try to stay up to date by getting certified and checking out newly launched products and services.

If you want to either lift your environment to the cloud or leverage your already migrated environment to use more of the cloud services, hit me up or check out Public Cloud Group GmbH!

If you want to support me, you can buy me a coffee!

About PCG

Public Cloud Group supports companies in their digital transformation through the use of public cloud solutions.

With a product portfolio designed to accompany organisations of all sizes on their cloud journey, and competence that is synonymous with highly qualified staff whom clients and partners like to work with, PCG is positioned as a reliable and trustworthy partner for the hyperscalers: relevant, with repeatedly validated competence and credibility.

We have the highest partnership status with the three relevant hyperscalers: Amazon Web Services (AWS), Google, and Microsoft. As experienced providers, we advise our customers independently on cloud implementation, application development, and managed services.
