I'm trying to export Sentinel-2 vegetation indices and texture features sampled at points using Google Earth Engine. I have a large number of points and features, so instead of exporting all the features from all points at once, I am exporting features from one point at a time inside a loop.

For each point, I run the following export task:

task = ee.batch.Export.table.toDrive(
    collection=sampled_data,
    description=f'S2_{point_number}',
    fileNamePrefix=f'S2_{point_number}',
    folder='S2_textures',
    fileFormat='CSV'
)
task.start()

The Problem: Even though I set folder='S2_textures' in Export.table.toDrive, Earth Engine is creating a new folder called S2_textures each time in my Google Drive.

I expected all the CSV files to be saved into one single S2_textures folder.

Question: How can I ensure that all the exported CSVs go into a single folder without creating multiple "S2_textures" folders?

  • I'm encountering the same error with code that previously worked correctly. A new folder with the same name is being created for each task I run, even though a folder with that name already existed before running the code. Maybe EE changed something? It's strange because the documentation still states: "The Google Drive Folder that the export will reside in. Note: (a) if the folder name exists at any level, the output is written to it, (b) if duplicate folder names exist, output is written to the most recently modified folder, (c) if the folder name does not exist, a new [...]" Commented Apr 28 at 11:45
  • Same issue here... previously working code is now creating a new folder with the exact same name for each export task. This looks like a mistake on the Google Earth Engine side. Hopefully this will be solved soon. Jason Commented Apr 30 at 16:31
  • I am having the same problem with code that previously worked. I have even switched from a service key to OAuth 2.0. Nothing I have done fixes it. What is interesting is that I can write to my Google Drive, but when I try to use the Python API to list folders, they are not visible. Strange behavior. A colleague who just joined is able to download to a single folder. Commented May 8 at 14:08
  • I'm experiencing the same thing. This issue started recently without any changes to my code, so it seems to be something on the GEE side... Commented Aug 13 at 13:26
  • I don't have the reputation to answer, but if anyone is still running into this, it seems to have been fixed in newer versions of earthengine-api. I had this issue with 1.1.5 and lower versions, but not with 1.6.4. Commented Aug 21 at 16:54
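
To see which client version is installed before deciding whether an upgrade is worth trying, something like the following could be used. This is only a sketch: `parse_version` and `installed_ee_version` are hypothetical helper names, and the threshold `(1, 6, 4)` is simply the version reported as working in the last comment, not a confirmed fix version.

```python
import re
from importlib import metadata

# Version reported as working in the comment thread; an assumption,
# not an officially documented fix version.
REPORTED_WORKING = (1, 6, 4)


def parse_version(version):
    """Turn '1.6.4' or '1.1.5rc0' into a tuple of its three leading integers."""
    match = re.match(r'(\d+)\.(\d+)\.(\d+)', version)
    if match is None:
        raise ValueError(f'Unrecognised version string: {version!r}')
    return tuple(int(part) for part in match.groups())


def installed_ee_version():
    """Return the installed earthengine-api version string, or None if absent."""
    try:
        return metadata.version('earthengine-api')
    except metadata.PackageNotFoundError:
        return None


version = installed_ee_version()
if version is None:
    print('earthengine-api is not installed')
elif parse_version(version) < REPORTED_WORKING:
    print(f'earthengine-api {version}: consider pip install --upgrade earthengine-api')
else:
    print(f'earthengine-api {version}: at or above the version reported as working')
```

Note that another commenter further down could not reproduce the problem on 1.1.5rc0, so the version number alone may not explain the behaviour.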

2 Answers


In the meantime, my workaround is to merge all the unwanted folders into the original one (the one I was already using) after everything has been exported. That way the Drive structure is what I expected and it's clean again. Here's a piece of code to do that if you are working with the Google Drive associated with the service account used for Google Earth Engine:

# Imports
#==========

from dotenv import load_dotenv 
import os
import ee
from pydrive.auth import GoogleAuth
from pydrive.drive import GoogleDrive
from oauth2client.service_account import ServiceAccountCredentials


# Parameters
#==========

env_variable_for_path_to_key_file = 'GEE_KEY_FILE'
service_account_address = 'XXX@YYY.iam.gserviceaccount.com' # Replace XXX and YYY or the full address to the one you need
google_drive_folder_name = 'GEE_exports'


# Set Google Drive
#==========

# Load .env file
load_dotenv()

# Access the environment variable
path_key_file = os.getenv(env_variable_for_path_to_key_file)

# Authenticate with the service account
credentials = ee.ServiceAccountCredentials(service_account_address, path_key_file)
ee.Initialize(credentials)

# Authenticate to Google Drive (of the Service account)
gauth = GoogleAuth()
gauth.credentials = ServiceAccountCredentials.from_json_keyfile_name(path_key_file, scopes=['https://www.googleapis.com/auth/drive'])

google_drive_object = GoogleDrive(gauth)


# Combine all same-name folders to one, the oldest
#==========

def merge_duplicate_folders(google_drive_object, folder_name):
    # 1. Find all folders with the given name
    folder_list = google_drive_object.ListFile({
        'q': f"title = '{folder_name}' and mimeType = 'application/vnd.google-apps.folder' and trashed = false"
    }).GetList()

    if len(folder_list) <= 1:
        print(f"No duplicates to merge. Found {len(folder_list)} folder(s) named '{folder_name}'.")
        return

    print(f"Found {len(folder_list)} folders named '{folder_name}'. Merging...")

    # 2. Choose one folder as the target (the oldest one in this example)
    folder_list = sorted(folder_list, key=lambda f: f['createdDate'], reverse=False)
    target_folder = folder_list[0]
    target_id = target_folder['id']
    print(f"Using folder '{target_folder['title']}' ({target_id}) as the target.")

    # 3. Move contents of other folders to the target
    for dup_folder in folder_list[1:]:
        print(f"Processing duplicate folder: {dup_folder['title']} ({dup_folder['id']})")
        # Get all files in this duplicate folder
        files = google_drive_object.ListFile({
            'q': f"'{dup_folder['id']}' in parents and trashed=false"
        }).GetList()
        for f in files:
            print(f"  Moving file: {f['title']}")
            # Replace the parent list so the file moves into the target folder
            f['parents'] = [{'id': target_id}]
            f.Upload()  # Push the changed metadata to Drive
        # Delete the now-empty folder
        print(f"  Deleting duplicate folder: {dup_folder['id']}")
        dup_folder.Delete()

    print("✅ Merge complete.")

merge_duplicate_folders(google_drive_object, google_drive_folder_name)

The way EE Drive export deals with folders is unfortunate, IMO. If a folder with the specified name already exists, it will be reused. If not, a new folder is created at the root. When launching multiple concurrent tasks, I'm not sure at what point that check happens. Based on the behaviour you're seeing, I'd guess it happens immediately when the task is submitted, leading to multiple folders being created.

A workaround for this is to explicitly create the folder beforehand. You can use the Google Drive API for that. An example is here.

This is something I do myself when I need to control exactly which folder an export ends up in. But I first make sure the folder gets a unique name by appending a timestamp to it. This prevents exports from ending up in another folder that by chance happens to have the same name.
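
The two steps described here (create the folder up front, give it a timestamped name) could be sketched as follows. This assumes an authenticated PyDrive `GoogleDrive` object such as the `google_drive_object` built in the other answer; the helper names `unique_folder_name` and `ensure_folder` are hypothetical, and the timestamp format is an arbitrary choice.

```python
from datetime import datetime, timezone

FOLDER_MIME = 'application/vnd.google-apps.folder'


def unique_folder_name(base, now=None):
    """Append a UTC timestamp so the name is unlikely to clash with an existing folder."""
    now = now or datetime.now(timezone.utc)
    return f"{base}_{now.strftime('%Y%m%dT%H%M%S')}"


def ensure_folder(drive, folder_name):
    """Return the ID of a Drive folder named folder_name, creating it if missing.

    `drive` is assumed to be an authenticated pydrive.drive.GoogleDrive object.
    """
    query = (f"title = '{folder_name}' and mimeType = '{FOLDER_MIME}' "
             "and trashed = false")
    existing = drive.ListFile({'q': query}).GetList()
    if existing:
        # Reuse the first match instead of creating another folder
        return existing[0]['id']
    folder = drive.CreateFile({'title': folder_name, 'mimeType': FOLDER_MIME})
    folder.Upload()  # Creates the empty folder in Drive
    return folder['id']
```

The idea would be to call `ensure_folder(google_drive_object, name)` once before the export loop, with `name = unique_folder_name('S2_textures')`, and then pass that same `name` as `folder=` to every `Export.table.toDrive` call. Whether this avoids the duplicate folders on affected setups is not guaranteed, as the comments below point out.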

  • Unfortunately what this ignores is that the behaviour is not the consequence of the code alone. As OP, commenters here, myself and colleagues of mine are experiencing, previously functional code has been broken by a change made behind the scenes by Google. Having an explicitly (even non-programmatically) created target folder does NOT prevent this bug from manifesting. Commented May 20 at 15:24
  • I am unable to reproduce this, with version 1.1.5rc0 at least. Files are exported into previously created folders, even when they're deeply nested. Commented May 21 at 12:36
