You can import records from the following object storage providers, depending on the cloud where your index is hosted:

Index location | AWS S3 | Google Cloud Storage | Azure Blob Storage |
---|---|---|---|
AWS | ✅ | ✅ | ✅ |
GCP | ❌ | ✅ | ✅ |
Azure | ❌ | ✅ | ✅ |
For example, if you want to import records into two namespaces, `example_namespace1` and `example_namespace2`, your directory structure would look like this (the sketch below uses placeholder bucket and file names):
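```
s3://BUCKET_NAME/IMPORT_DIR/
├── example_namespace1/
│   ├── 0.parquet
│   └── 1.parquet
└── example_namespace2/
    ├── 0.parquet
    └── 1.parquet
```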
To import into the default namespace of an index, name the namespace directory `__default__`. The default namespace must be empty.

Each Parquet file must contain the following columns:

Column name | Parquet type | Description |
---|---|---|
`id` | `STRING` | Required. The unique identifier for each record. |
`values` | `LIST<FLOAT>` | Required. A list of floating-point values that make up the dense vector embedding. |
`metadata` | `STRING` | Optional. Additional metadata for each record. To omit from specific rows, use `NULL`. |
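The docs don't prescribe a tool for producing these files; as a sketch, here is one way to write a conforming file with `pyarrow` (the file name and values are illustrative, and the metadata is serialized as a JSON string, consistent with the `STRING` type above):

```python
import json

import pyarrow as pa
import pyarrow.parquet as pq

# Schema matching the required columns for import.
schema = pa.schema([
    ("id", pa.string()),                 # required: unique record ID
    ("values", pa.list_(pa.float32())),  # required: dense vector embedding
    ("metadata", pa.string()),           # optional: JSON string, or None to omit
])

table = pa.Table.from_pydict(
    {
        "id": ["rec1", "rec2"],
        "values": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
        "metadata": [json.dumps({"genre": "comedy"}), None],
    },
    schema=schema,
)

pq.write_table(table, "0.parquet")
```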
For example, if you create two namespaces, `example_namespace1` and `example_namespace2`, and upload 4 Parquet files into each, your directory structure would look as follows after the upload (again with placeholder names):
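```
s3://BUCKET_NAME/IMPORT_DIR/
├── example_namespace1/
│   ├── 0.parquet
│   ├── 1.parquet
│   ├── 2.parquet
│   └── 3.parquet
└── example_namespace2/
    ├── 0.parquet
    ├── 1.parquet
    ├── 2.parquet
    └── 3.parquet
```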
Use the `start_import` operation to start an asynchronous import of vectors from object storage into an index.
For `uri`, specify the URI of the bucket and import directory containing the namespaces and Parquet files you want to import. For example:

- Amazon S3: `s3://BUCKET_NAME/IMPORT_DIR`
- Google Cloud Storage: `gs://BUCKET_NAME/IMPORT_DIR`
- Azure Blob Storage: `https://STORAGE_ACCOUNT.blob.core.windows.net/CONTAINER_NAME/IMPORT_DIR`
For `integration_id`, specify the Integration ID of the Amazon S3, Google Cloud Storage, or Azure Blob Storage integration you created. The ID is found on the Storage integrations page of the Pinecone console.
For `error_mode`, use `CONTINUE` or `ABORT`. With `ABORT`, the operation stops if any records fail to import. With `CONTINUE`, the operation continues on error, but there is no notification about which records, if any, failed to import. To see how many records were successfully imported, use the `describe_import` operation.

The `start_import` operation returns an `id` that you can use to check the status of the import:
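A minimal sketch with the Python SDK; the API key, index host, bucket URI, and integration ID are placeholders:

```python
from pinecone import Pinecone, ImportErrorMode

pc = Pinecone(api_key="YOUR_API_KEY")
index = pc.Index(host="INDEX_HOST")

# Start an asynchronous import from the bucket's import directory.
response = index.start_import(
    uri="s3://BUCKET_NAME/IMPORT_DIR",
    integration_id="INTEGRATION_ID",      # from the Storage integrations page
    error_mode=ImportErrorMode.CONTINUE,  # or ImportErrorMode.ABORT
)

# The response contains the import id, e.g. {"id": "101"}.
print(response.id)
```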
Note that an import can show a status of `InProgress` but be `100.0` percent complete. Once all the imported records are indexed and fully available for querying, the import operation is set to `Completed`.
To check the status of an import, use the `describe_import` operation with the import ID:
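For example, continuing the sketch above with the placeholder import ID `"101"`:

```python
import_details = index.describe_import(id="101")
print(import_details)
```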
The response contains the import details, including the `status`, `percent_complete`, and `records_imported`:
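An illustrative response shape; the values here are invented for the example:

```json
{
  "id": "101",
  "uri": "s3://BUCKET_NAME/IMPORT_DIR",
  "status": "Completed",
  "percent_complete": 100.0,
  "records_imported": 10000000
}
```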
If an import fails, the response contains an `error` field with the reason for the failure:
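Again illustrative; the `error` text depends on the failure:

```json
{
  "id": "102",
  "status": "Failed",
  "percent_complete": 0.0,
  "records_imported": 0,
  "error": "..."
}
```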
Use the `list_imports` operation to list all of the recent and ongoing imports. By default, the operation returns up to 100 imports per page. If the `limit` parameter is passed, the operation returns up to that number of imports per page instead. For example, if `limit=3`, up to 3 imports are returned per page. Whenever there are additional imports to return, the response includes a `pagination_token` for fetching the next page of imports.
With the Python SDK, `list_imports` paginates automatically.
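A sketch of automatic pagination; the `id` and `status` attributes correspond to the import details described above:

```python
# Iterate over all imports; the SDK fetches further pages as needed.
for imp in index.list_imports():
    print(f"id: {imp.id}, status: {imp.status}")
```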
The `cancel_import` operation cancels an import if it is not yet finished. It has no effect if the import is already complete.
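A sketch, reusing the placeholder import ID from above:

```python
index.cancel_import(id="101")
```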
Import requests are subject to the following limits:

Metric | Limit |
---|---|
Max size per import request | 2 TB or 200,000,000 records |
Max namespaces per import request | 10,000 |
Max files per import request | 100,000 |
Max size per file | 10 GB |
To import into the `__default__` namespace of an index, the default namespace must be empty.