The Wayback Machine - https://web.archive.org/web/20200619082106/https://github.com/awslabs/amazon-sagemaker-examples/issues/971

SageMaker-ModelMonitoring - how does it work with non-Amazon algorithms? Also needs better documentation #971

Open
sermolin opened this issue Jan 6, 2020 · 0 comments

@sermolin sermolin commented Jan 6, 2020

The notebook seems to use a pre-trained model from https://github.com/awslabs/amazon-sagemaker-examples/blob/master/introduction_to_applying_machine_learning/xgboost_customer_churn/xgboost_customer_churn.ipynb. The notebook should refer to the data schema from the above example when discussing generated traffic and suggested constraints.

Cell Deploy the model to Amazon SageMaker. This requires more explanation and a documentation reference to https://sagemaker.readthedocs.io/en/stable/model.html

DataCaptureConfig documentation is needed. I could not find a detailed description of each parameter and its acceptable values. For example, where is sampling_percentage defined?
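To make the question concrete, here is a minimal sketch of the request structure that a data capture configuration appears to translate into. The field names (EnableCapture, InitialSamplingPercentage, DestinationS3Uri, CaptureOptions) are my understanding of the DataCaptureConfig block in the CreateEndpointConfig API, not something the notebook states. The helper name build_data_capture_config, its default values, and the bucket name are hypothetical and chosen only for illustration.

```python
from typing import Any, Dict, List, Optional


def build_data_capture_config(
    destination_s3_uri: str,
    enable_capture: bool = True,
    sampling_percentage: int = 100,  # default chosen for this sketch only
    capture_modes: Optional[List[str]] = None,
) -> Dict[str, Any]:
    """Hypothetical helper: build a DataCaptureConfig-style request dictionary."""
    # Assumption: the percentage is an integer in the range 0..100.
    if not 0 <= sampling_percentage <= 100:
        raise ValueError("sampling_percentage must be between 0 and 100")

    # Assumption: requests ("Input") and responses ("Output") can be captured.
    modes = capture_modes if capture_modes is not None else ["Input", "Output"]
    for mode in modes:
        if mode not in ("Input", "Output"):
            raise ValueError(f"unsupported capture mode: {mode!r}")

    return {
        "EnableCapture": enable_capture,
        "InitialSamplingPercentage": sampling_percentage,
        "DestinationS3Uri": destination_s3_uri,
        "CaptureOptions": [{"CaptureMode": mode} for mode in modes],
    }


# Example with a made-up bucket name: capture every request and response.
config = build_data_capture_config(
    "s3://example-bucket/datacapture", sampling_percentage=100
)
print(config)
```

If this mapping is right, sampling_percentage would correspond to InitialSamplingPercentage, and the notebook could document each parameter next to the field it sets.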

Cell Create a baselining job with training dataset.

  • How does suggest_baseline() generate constraints? Based on which parameters?
  • This seems to be a Spark job (based on the output log). What resources does the suggest_baseline() Spark job consume, and how much do they cost?

Cell Explore the generated constraints and statistics
baseline_statistics() seems to apply only to built-in algorithms with pre-built containers. It leverages Deequ, computes KLL sketches, etc. Please provide an example of how the statistics work with non-Amazon algorithms, such as open-source XGBoost.
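For reference, the kind of per-column summary being discussed can be illustrated with a minimal sketch. This is not Deequ and not SageMaker's implementation, and it does not compute KLL sketches. The function name summarize_columns, the column names, and the sample rows are all hypothetical. The sketch takes only the dataset as input and assumes every column is numeric.

```python
import csv
import io
import statistics
from typing import Dict, List


def summarize_columns(csv_text: str) -> Dict[str, Dict[str, float]]:
    """Hypothetical helper: per-column summary of a numeric CSV with a header."""
    reader = csv.DictReader(io.StringIO(csv_text))
    names = reader.fieldnames or []
    values: Dict[str, List[float]] = {name: [] for name in names}
    missing: Dict[str, int] = {name: 0 for name in names}

    for row in reader:
        for name in names:
            cell = (row.get(name) or "").strip()
            if cell == "":
                missing[name] += 1  # empty cell counts as a missing value
            else:
                values[name].append(float(cell))

    summary: Dict[str, Dict[str, float]] = {}
    for name in names:
        present = values[name]
        nan = float("nan")
        summary[name] = {
            "num_present": len(present),
            "num_missing": missing[name],
            "mean": statistics.mean(present) if present else nan,
            "std_dev": statistics.pstdev(present) if present else nan,
            "min": min(present) if present else nan,
            "max": max(present) if present else nan,
        }
    return summary


# Made-up sample data with one missing value in feature_b.
sample = "feature_a,feature_b\n1.0,10\n2.0,\n3.0,30\n"
stats = summarize_columns(sample)
print(stats)
```

An example along these lines in the notebook, run against data for a model that does not use a built-in container, would show whether the generated statistics depend on the algorithm at all.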

Cell Create a Schedule:
How can schedule_cron_expression be set to run every few minutes for development purposes? Currently, Amazon SageMaker only supports hourly integer rates between 1 hour and 24 hours:
https://docs.aws.amazon.com/sagemaker/latest/dg/model-monitor-schedule-expression.html
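The supported expressions can be sketched as follows. The cron formats are based on my reading of the linked page, so treat them as assumptions. The function names are hypothetical and are not part of the SageMaker SDK. Under this reading the minute field is always 0, which is why a sub-hourly schedule cannot be expressed.

```python
def hourly_cron() -> str:
    """Expression for a run at the top of every hour."""
    return "cron(0 * ? * * *)"


def daily_cron(hour: int = 0) -> str:
    """Expression for one run per day at the given UTC hour (0-23)."""
    if not 0 <= hour <= 23:
        raise ValueError("hour must be between 0 and 23")
    return f"cron(0 {hour} ? * * *)"


def every_x_hours_cron(hour_interval: int, starting_hour: int = 0) -> str:
    """Expression for a run every `hour_interval` hours (1-24)."""
    if not 1 <= hour_interval <= 24:
        raise ValueError("hour_interval must be an integer between 1 and 24")
    if not 0 <= starting_hour <= 23:
        raise ValueError("starting_hour must be between 0 and 23")
    return f"cron(0 {starting_hour}/{hour_interval} ? * * *)"


print(hourly_cron())               # every hour
print(daily_cron(9))               # once a day at 09:00 UTC
print(every_x_hours_cron(12, 17))  # every 12 hours, starting at 17:00 UTC
```

The shortest interval these helpers can produce is one hour, which matches the limitation described above.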
