Documentation Index
Fetch the complete documentation index at: https://arize-ax.mintlify.dev/docs/llms.txt
Use this file to discover all available pages before exploring further.
Log predictions and actuals for classification, regression, ranking, and object detection models. Monitor drift, performance, and data quality.
Key Capabilities
- Stream or batch logging
- Embedding features for drift detection
- SHAP values for explainability
- Tags and metadata for segmentation
- Support for delayed actuals
- Export data for offline analysis
Stream Logging
Log individual predictions in real time as they occur in production. `log_stream` sends the request asynchronously and returns a `concurrent.futures.Future`; call `.result()` on the future to block until the request completes.
```python
from arize.ml.types import ModelTypes, Environments

future = client.ml.log_stream(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    prediction_id="unique-prediction-id",
    prediction_label=("not fraud", 0.85),
    actual_label=("fraud", 1.0),
    features={
        "transaction_amount": 150.0,
        "merchant_category": "online_retail",
        "user_age": 32,
    },
    embedding_features={
        "user_embedding": ([0.1, 0.2, ...], "user_123"),
    },
)
print(f"Logged prediction: {future.result().status_code}")
```
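Actuals that arrive after the prediction ("delayed actuals" in the capability list above) can be sent in a later `log_stream` call that carries only the actual, keyed to the same `prediction_id`. A minimal sketch, assuming the same parameters as the example above; the `log_delayed_actual` helper is illustrative, not part of the SDK:

```python
def log_delayed_actual(client, space_id, model_name, model_type,
                       environment, prediction_id, actual_label):
    # A later log_stream call with only the actual; Arize joins it
    # to the earlier prediction by matching prediction_id.
    return client.ml.log_stream(
        space_id=space_id,
        model_name=model_name,
        model_type=model_type,
        environment=environment,
        prediction_id=prediction_id,
        actual_label=actual_label,
    )
```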
Add custom tags for segmentation and filtering.
```python
future = client.ml.log_stream(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    prediction_id="unique-prediction-id",
    prediction_label=("not fraud", 0.85),
    features={"transaction_amount": 150.0},
    tags={"risk_level": "high", "merchant_type": "new"},
)
```
Batch Logging
Log predictions in bulk from historical data or batch processing pipelines.
```python
from arize.ml.types import Schema, EmbeddingColumnNames
import pandas as pd

# Define schema to map DataFrame columns
schema = Schema(
    prediction_id_column_name="prediction_id",
    timestamp_column_name="prediction_ts",
    prediction_label_column_name="predicted_label",
    actual_label_column_name="actual_label",
    feature_column_names=["feature_1", "feature_2", "feature_3"],
    embedding_feature_column_names={
        "text_embedding": EmbeddingColumnNames(
            vector_column_name="text_vector",
            link_to_data_column_name="text_content",
        ),
    },
)
```
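The schema maps column names, so the DataFrame passed to `log` must actually contain them. A minimal sketch of a matching `prod_df`; the values are illustrative, and only the column names come from the schema above:

```python
import pandas as pd
from datetime import datetime, timezone

prod_df = pd.DataFrame({
    "prediction_id": ["pred-001", "pred-002"],
    "prediction_ts": [datetime.now(timezone.utc)] * 2,
    "predicted_label": ["not fraud", "fraud"],
    "actual_label": ["not fraud", "fraud"],
    "feature_1": [150.0, 2300.0],
    "feature_2": [0.3, 0.9],
    "feature_3": [1, 7],
    # Columns referenced by EmbeddingColumnNames:
    "text_vector": [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]],
    "text_content": ["order #123 refund", "wire transfer request"],
})
```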
```python
# Log batch data
response = client.ml.log(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=prod_df,
    schema=schema,
    model_version="1.2.0",
)
print(f"Log status: {response.status_code}")
```
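The batch `log` call returns a response whose `status_code` the example prints. In a pipeline it is worth failing loudly rather than printing; a small sketch, where everything beyond the `status_code` attribute shown above is an assumption rather than an SDK guarantee:

```python
def check_log_response(response):
    # Raise if the batch upload was rejected, so failures are not
    # silently dropped in a scheduled pipeline.
    if response.status_code != 200:
        raise RuntimeError(
            f"Arize batch log failed with status {response.status_code}"
        )
    return response
```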
With SHAP Values
Include SHAP values for model explainability.
```python
schema = Schema(
    prediction_id_column_name="prediction_id",
    prediction_label_column_name="predicted_label",
    feature_column_names=["feature_1", "feature_2"],
    shap_values_column_names={
        "feature_1": "shap_feature_1",
        "feature_2": "shap_feature_2",
    },
)

response = client.ml.log(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    model_type=ModelTypes.SCORE_CATEGORICAL,
    environment=Environments.PRODUCTION,
    dataframe=df_with_shap,
    schema=schema,
)
```
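The `shap_values_column_names` mapping points each feature at a column holding its per-row SHAP value, so those columns must exist in `df_with_shap` before logging. A sketch that attaches values computed elsewhere (for example with the `shap` package); the matrix shape, its values, and the `shap_` prefix are assumptions chosen to match the schema above:

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({
    "prediction_id": ["p1", "p2"],
    "predicted_label": ["fraud", "not fraud"],
    "feature_1": [2300.0, 40.0],
    "feature_2": [0.9, 0.1],
})

# shap_matrix: one row per prediction, one column per feature,
# in the same order as the feature columns.
shap_matrix = np.array([[0.42, -0.13],
                        [-0.05, 0.02]])

df_with_shap = df.copy()
for i, feature in enumerate(["feature_1", "feature_2"]):
    df_with_shap[f"shap_{feature}"] = shap_matrix[:, i]
```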
Export Data
Export ML model data for offline analysis, custom processing, or archival.
```python
from datetime import datetime
from arize.ml.types import Environments

start_time = datetime.strptime("2024-01-01", "%Y-%m-%d")
end_time = datetime.strptime("2026-01-01", "%Y-%m-%d")

# Export to DataFrame
df = client.ml.export_to_df(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    environment=Environments.PRODUCTION,
    model_version="1.2.0",
    start_time=start_time,
    end_time=end_time,
)
print(f"Exported {len(df)} records")
```
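Once exported, the DataFrame can be sliced offline with ordinary pandas. A sketch computing accuracy per segment; the column names mirror the stream-logging example above but are assumptions about what your export contains:

```python
import pandas as pd

# Stand-in for the exported DataFrame.
df = pd.DataFrame({
    "prediction_label": ["fraud", "not fraud", "fraud", "not fraud"],
    "actual_label": ["fraud", "fraud", "fraud", "not fraud"],
    "merchant_category": ["online_retail", "online_retail",
                          "travel", "travel"],
})

# Accuracy sliced by a feature or tag column.
df["correct"] = df["prediction_label"] == df["actual_label"]
accuracy_by_segment = df.groupby("merchant_category")["correct"].mean()
print(accuracy_by_segment)
```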
Export to Parquet
```python
client.ml.export_to_parquet(
    space_id="your-space-id",
    model_name="fraud-detection-v1",
    environment=Environments.PRODUCTION,
    start_time=start_time,
    end_time=end_time,
    path="./model_data_export.parquet",
)
```
Export capabilities:
- Time-range filtering
- DataFrame or Parquet output
- Efficient Arrow Flight transport for large exports
- Progress bars for long-running exports
Supported Model Types
| Model Type | Use Case |
|---|---|
| SCORE_CATEGORICAL, MULTI_CLASS | Multi-class classification |
| BINARY_CLASSIFICATION | Binary classification |
| NUMERIC, REGRESSION | Regression tasks |
| RANKING | Ranking and recommendation systems |
| OBJECT_DETECTION | Computer vision object detection |
| GENERATIVE_LLM | Use `client.spans` instead for LLMs |
Supported Environments
| Environment | Description |
|---|---|
| PRODUCTION | Live production traffic |
| TRAINING | Training dataset |
| VALIDATION | Validation/test dataset |
| TRACING | For LLM traces (use `client.spans`) |