The Wayback Machine - https://web.archive.org/web/20200813185308/https://github.com/topics/aws-glue
Here are 44 public repositories matching this topic...
Updated
Aug 12, 2020
Python
Glue scripts for converting AWS Service Logs for use in Athena
Updated
Sep 12, 2019
Python
COVID Response - Analytics, AI, and data science API and sample notebooks
Updated
Apr 23, 2020
Jupyter Notebook
A data catalog for database tables and columns to track PII and PHI.
Updated
Jul 24, 2020
Python
Bring your own data Labs: Build a serverless data pipeline based on your own data
Learn how to use Kinesis Firehose, AWS Glue, S3, and Amazon Athena by streaming and analyzing Reddit comments in real time. 100-200 level tutorial.
Updated
Jun 26, 2020
Python
Data Lake as Code, featuring ChEMBL and OpenTargets
Updated
Aug 13, 2020
Jupyter Notebook
🌉 Reference implementation for granting cross-account AWS Glue Data Catalog access from Amazon Athena
Updated
Aug 13, 2020
Python
Build and Deploy A Serverless Data Pipeline on AWS
Updated
Nov 1, 2019
Python
Automate the daily partitioning of your CloudTrail bucket in Athena
Updated
Jul 20, 2020
JavaScript
🐋 Docker image for AWS Glue Spark/Python
Updated
May 26, 2020
Dockerfile
Discover how you can migrate from traditional deployments to serverless architectures with AWS
Updated
Feb 1, 2019
JavaScript
A CLI to manage and monitor permissions in AWS Lake Formation
Updated
Mar 31, 2020
Python
Automated data quality suggestions and analysis with Amazon Deequ on AWS Glue
Updated
Aug 7, 2020
Scala
Demo for building Real Time Data Collection Pipeline on AWS
Updated
Jan 8, 2019
JavaScript
AWS Glue tutorial for data developers.
Updated
Sep 2, 2019
Python
Proof of Value Terraform Scripts to utilize Amazon Web Services (AWS) Security, Identity & Compliance Services to Support your AWS Account Security Posture.
Terraform module which creates Glue resources on AWS
AWS ETL example via AWS DMS & AWS Glue
Example of how to set SBT up for local development of AWS Glue Scripts
Updated
May 15, 2020
Scala
AWS Glue Libraries are additions and enhancements to Spark for ETL operations.
Updated
May 13, 2020
Python
Personal take on GraphDB + AML with AWS Neptune + Glue + Lambda.
Updated
Nov 13, 2018
Python
AWS Glue - Incremental Pull Script
Updated
Apr 10, 2019
Python
A project demonstrating how to build a data pipeline that scrapes data with the Twitter API and ingests it into Amazon S3 through a Kinesis Firehose delivery stream.
Updated
May 4, 2020
Python
Financial data streaming and analysis with AWS Kinesis and Athena
Updated
Jun 9, 2020
Jupyter Notebook
Understanding DevOps concepts, with hands-on practice and research using AWS developer tools
Updated
Jun 15, 2020
Java
Discover how you can migrate from traditional deployments to serverless architectures with AWS
Updated
Feb 26, 2019
JavaScript
Pexip Infinity log analysis on the AWS cloud
Continuing with my case study on reading a big data file, this is the fifth part of my trilogy :-) on how I got on reading a big'ish file with C, Python, spark-python and spark-scala, AWS Elastic MapReduce, and AWS Athena.
Updated
Jun 25, 2018
Python
Resources for AWS Big data certification preparation
Updated
Mar 18, 2020
TSQL