Why use Lambda Layers?
The benefits of AWS Lambda Layers should not be missed. With Lambda Layers you can package code once and reuse it across different functions.
In large environments, all Lambda Layers are usually located in one dedicated account and shared via Layer Version Permissions so that the functions in the other accounts can use them.
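For illustration, sharing a layer version with another account can look like the following minimal boto3 sketch. The layer name, version number, and account ID are placeholders, not values from this article:

import boto3

lambdaClient = boto3.client('lambda')

# Allow a single consumer account to use version 1 of the shared layer
lambdaClient.add_layer_version_permission(
    LayerName='my-shared-layer',            # placeholder layer name
    VersionNumber=1,
    StatementId='share-with-workload-account',
    Action='lambda:GetLayerVersion',
    Principal='111122223333'                # placeholder consumer account ID
)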
You can use Lambda Layers either for your own helper code or to make entire packages available that are not included in the Lambda runtimes. This blog article provides a solution for uploading Lambda layer packages with complex folder structures to an S3 bucket. As you should always deploy your resources with Infrastructure as Code, a manual upload is not a satisfying solution here.
Deep-Dive into the Lambda Function
All steps are covered in one Lambda function. Depending on your preference, you can also split them up. In this example I will guide you through all steps being executed in one function.
Summary of the procedure
The base procedure is as follows:
The package is placed in our Customizations for Control Tower (CfCT) repository. A Lambda function checks the path where the package is located and recursively loops through all folders, subfolders, and files.
The solution leverages the non-persistent storage provided by default in every Lambda function. The same folder and file structure then gets created in the local /tmp directory.
After that, the local directory gets zipped and uploaded to S3, where another automation picks up the zip file and adds it as a Lambda layer.
This blog article focuses only on the upload part to S3. This part is quite tricky, because the folder structure is not known in advance and the automation has to stay dynamic enough to also handle further packages with different folder structures that should be added as layers.
Necessary environment variables
This example uses Bitbucket as the repository hosting service. A prerequisite is a working CodeStar Connection for the CfCT pipeline, including authentication that grants the necessary access to the repository.
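The snippets in this article assume the following imports at the top of the Lambda module. They are listed here only for completeness; the article itself shows just the relevant excerpts:

import datetime
import os
import shutil

import boto3
import requests
from botocore.exceptions import ClientError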
BITBUCKET_WORKSPACE_NAME = os.environ['WORKSPACE_NAME']
BITBUCKET_REPO_NAME = os.environ['REPOSITORY_NAME']
BITBUCKET_TOKEN_PARAMETER = os.environ['BITBUCKET_TOKEN_PARAMETER']
BITBUCKET_BRANCH_NAME = os.environ['BRANCH_NAME']
BUCKET_NAME = os.environ['BUCKET_NAME']
Corresponding environment variables should be set in the Lambda function containing all the necessary Bitbucket information: the Bitbucket workspace, the repository name, the token parameter, and the branch name.
Another environment variable is a list of the folder paths where the packages are located. The S3 bucket to which the zip file will be uploaded should also be set as an environment variable.
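How the folder path list is modelled is not shown in the article; one possible approach is a comma-separated environment variable. The variable name FOLDER_PATHS is a placeholder used only for this sketch:

# e.g. FOLDER_PATHS = "lambda/layers/requests,lambda/layers/pandas"
FOLDER_PATHS = [p.strip() for p in os.environ['FOLDER_PATHS'].split(',') if p.strip()]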
How to trigger the Lambda function
The Lambda function gets triggered as soon as the CodePipeline starts. You can achieve this with a separate CodePipeline or with an EventBridge rule that listens for the corresponding CloudTrail event.
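As one possible variant, an EventBridge rule can also listen for the native CodePipeline state-change event instead of the CloudTrail event. A minimal sketch; the rule name, pipeline name, and function ARN are placeholders:

import json
import boto3

events = boto3.client('events')

# React whenever the pipeline starts a new execution
events.put_rule(
    Name='cfct-pipeline-started',   # placeholder rule name
    EventPattern=json.dumps({
        'source': ['aws.codepipeline'],
        'detail-type': ['CodePipeline Pipeline Execution State Change'],
        'detail': {
            'state': ['STARTED'],
            'pipeline': ['Custom-Control-Tower-CodePipeline']   # placeholder pipeline name
        }
    })
)
events.put_targets(
    Rule='cfct-pipeline-started',
    Targets=[{
        'Id': 'layer-upload-lambda',
        'Arn': 'arn:aws:lambda:eu-central-1:111122223333:function:layer-upload'   # placeholder ARN
    }]
)

Note that EventBridge also needs permission to invoke the function, for example via a resource-based policy on the Lambda function.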
First, the API token is retrieved from an encrypted SSM parameter to be able to set up the connection to the Bitbucket repository.
def getApiToken():
    ssmClient = boto3.client('ssm')
    try:
        response = ssmClient.get_parameter(
            Name=BITBUCKET_TOKEN_PARAMETER,
            WithDecryption=True
        )
        token = response['Parameter']['Value']
        return token
    except ClientError as e:
        print(e)
        raise e
Collecting all items from the repository
The Lambda function loops through every entry of the folder path variable and calls the getFolder method.
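The handler itself is not shown in the article; a minimal sketch of how the pieces could be wired together, assuming the comma-separated FOLDER_PATHS variable sketched above:

def lambda_handler(event, context):
    s3Client = boto3.client('s3')
    token = getApiToken()
    for folderPath in FOLDER_PATHS:
        # Mirror each configured repository folder into /tmp
        getFolder(token, folderPath, s3Client)
    return {'statusCode': 200}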
In the getFolder method, the base URL for Bitbucket is assembled and the token is set in the headers variable. This step is necessary to access the remote repository.
def getFolder(token, folderPath, s3Client):
    print(f"Checking Folder Path {folderPath}")
    baseUrl = f'https://api.bitbucket.org/2.0/repositories/{BITBUCKET_WORKSPACE_NAME}/{BITBUCKET_REPO_NAME}/src/{BITBUCKET_BRANCH_NAME}/{folderPath}'
    print(f"Base Url: {baseUrl}")
    headers = {'Authorization': f'Bearer {token}'}
After that, the getAllItems method gets called. An empty list gets initialized, and with the help of the requests package a GET request is made to capture all the files from the provided folder path in the repository.
def getAllItems(url, headers):
    files = []
    while url:
        response = requests.get(url, headers=headers)
        response.raise_for_status()
        data = response.json()
        files.extend(data.get('values', []))
        url = data.get('next', None)
    return files
Back in the getFolder method, the localFolderPath gets set to a path under /tmp, because this is where the package structure should be temporarily saved. With the help of the os package and its makedirs function, a folder with the same name gets created in the Lambda function's environment.
localFolderPath = os.path.join('/tmp', folderPath.lstrip('/'))
os.makedirs(localFolderPath, exist_ok=True)
print(f"Created local folder: {localFolderPath}")
newFolderPath = ""
Local creation of the package structure
Now comes the complicated part: The function iterates through all items retrieved from the getAllItems method using a for-loop. The item object looks like this:
{
    "path": "lambda/layers/xxxxx",
    "commit": {
        "hash": "xxxxx",
        "links": {
            "self": {
                "href": "https://api.bitbucket.org/2.0/repositories/xxxxx/commit/xxxxx"
            },
            "html": {
                "href": "https://bitbucket.org/xxxxx/commits/xxxxx"
            }
        },
        "type": "commit"
    },
    "type": "commit_file",
    "attributes": [],
    "escaped_path": "lambda/layers/xxxx",
    "size": 1779,
    "mimetype": "text/x-python",
    "links": {
        "self": {
            "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
        },
        "meta": {
            "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
        },
        "history": {
            "href": "https://api.bitbucket.org/2.0/repositories/xxxxx"
        }
    }
}
The item path and the item type of the current file get saved into two variables. If the itemType equals commit_directory, the old path plus the name of the item gets set as the newFolderPath, and the folder gets created in the /tmp directory as well. After that, the getFolder function gets called again with the new folder path.
for item in repoItems:
    itemPath = item['path']
    itemType = item['type']
    # Check whether itemType is a directory
    if itemType == 'commit_directory':
        itemPath = itemPath.split('/')[-1]
        newFolderPath = os.path.join(folderPath, itemPath).lstrip('/')
        # Folder gets created locally
        localSubfolderPath = os.path.join('/tmp', newFolderPath)
        os.makedirs(localSubfolderPath, exist_ok=True)
        print(f"Found folder and created local subfolder: {localSubfolderPath}")
        # getFolder function gets called again with new folder path
        getFolder(token, newFolderPath, s3Client)
If the itemType equals commit_file, the file name gets read out of the whole path. Then the url variable gets set to point to the file in the Bitbucket repository, and the file gets downloaded and created in the /tmp directory under the correct subfolder.
    elif itemType == 'commit_file':
        print(f"Found file: {itemPath}")
        fullPath = item['path']
        pathParts = fullPath.split('/')
        fileName = pathParts[-1]
        # Get the file content from Bitbucket and save it locally (the upload to S3 happens later)
        url = f'https://api.bitbucket.org/2.0/repositories/{BITBUCKET_WORKSPACE_NAME}/{BITBUCKET_REPO_NAME}/src/{BITBUCKET_BRANCH_NAME}/{folderPath}/{fileName}'
        headers = {'Authorization': f'Bearer {token}'}
        response = requests.get(url, headers=headers)
        if response.status_code == 200:
            localFilePath = os.path.join(localFolderPath, fileName)
            with open(localFilePath, 'wb') as file:
                file.write(response.content)
            print(f"Found file and created local file: {localFilePath}")
    else:
        print(f"Unknown item type: {itemType} for {itemPath}")

return newFolderPath
Zip and upload of the package to S3
After the for loop is finished, the processFolder function gets called to read out the name of the Lambda package.
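The processFolder function itself is not shown in the article; a rough sketch of what it might do, assuming it simply derives the package name from the last path segment and hands over to addFolderArchive:

def processFolder(folderPath, s3Client):
    # Assumption: the last path segment is the package folder that ends up in the layer zip
    folderName = folderPath.rstrip('/').split('/')[-1]
    parentPath = '/'.join(folderPath.rstrip('/').split('/')[:-1])
    addFolderArchive(folderName, parentPath, s3Client)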
The addFolderArchive function gets called next.
This method is responsible for uploading the package structure to S3. First the current timestamp is generated, then the name of the zip file is set. With the shutil package, the folder gets zipped via the make_archive function and uploaded to the bucket provided by the environment variable. The last step is to save the package name, with the timestamp at the end, in an SSM parameter, which can then be used further to build the part where the layer itself gets created and shared with the other accounts.
def addFolderArchive(folderName, folderPath, s3Client):
    timeStamp = datetime.datetime.now().strftime("%Y%m%d%H%M%S")
    zipFileName = f'{folderName}_{timeStamp}.zip'
    fullFolderPath = f'/tmp/{folderPath}/{folderName}'
    tempZipFile = f'/tmp/{zipFileName}'
    # Zip File gets created and uploaded to S3
    shutil.make_archive(tempZipFile[:-4], 'zip', fullFolderPath)
    s3Client.upload_file(tempZipFile, BUCKET_NAME, zipFileName)
    # SSM Parameter gets set with package name
    ssm_client = boto3.client('ssm')
    ssm_client.put_parameter(
        Name=f'/org/layer/package/{folderPath}/{folderName}/zipArchive',
        Description=f'Archive name for {folderName} in s3 Bucket {BUCKET_NAME}',
        Value=zipFileName,
        Type='String',
        Overwrite=True
    )
    # Local files are removed
    if os.path.exists(tempZipFile):
        os.remove(tempZipFile)
    if os.path.exists(fullFolderPath):
        shutil.rmtree(fullFolderPath)
    print(f"Folder {folderName} archived as {zipFileName} and uploaded to S3")
The local zip file and the temporary folder get removed from the Lambda environment, and the function is finished.
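The downstream automation that actually creates the layer from the uploaded archive is out of scope for this article. For completeness, a minimal sketch of what it could look like, reusing the BUCKET_NAME value from the environment variables above; the SSM parameter path, layer name, and runtimes are placeholders:

lambdaClient = boto3.client('lambda')
ssmClient = boto3.client('ssm')

# Read the archive name written by addFolderArchive and publish a new layer version
zipFileName = ssmClient.get_parameter(
    Name='/org/layer/package/lambda/layers/requests/zipArchive'   # placeholder parameter path
)['Parameter']['Value']

lambdaClient.publish_layer_version(
    LayerName='requests-layer',                                   # placeholder layer name
    Content={'S3Bucket': BUCKET_NAME, 'S3Key': zipFileName},
    CompatibleRuntimes=['python3.12']
)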
The whole Lambda code can be found in my GitHub account.
About Me
Hi! My name is Jana, I live in the southwest of Germany, and when I'm not smashing weights in the gym I love to architect solutions in AWS, making my life and my customers' lives easier.
My computer science journey started as an On-Premises System Administrator and developed over time into an AWS Architect. As I know both the "old" and the "new" world, I know the common pain points in architectures and can provide solutions that make them not only more efficient but also cheaper!
I enjoy learning, and as the AWS portfolio is evolving all the time, I try to stay up to date by getting certified and checking out newly launched products and services.
If you want to lift your environment to the cloud, or want to leverage your already migrated environment to use more of the cloud services, hit me up or check out Public Cloud Group GmbH!
If you want to support me, you can buy me a coffee!
About PCG
Public Cloud Group supports companies in their digital transformation through the use of public cloud solutions.
With a product portfolio designed to accompany organisations of all sizes on their cloud journey, and with competence that is synonymous with highly qualified staff that clients and partners like to work with, PCG is positioned as a reliable and trustworthy partner for the hyperscalers, with repeatedly validated competence and credibility.
We have the highest partnership status with the three relevant hyperscalers: Amazon Web Services (AWS), Google, and Microsoft. As experienced providers, we advise our customers independently on cloud implementation, application development, and managed services.