When is AWS SageMaker not the best choice?
1. Cost
- It would be easier, and much cheaper, to run MLflow on ECS with Spot instances than to use SageMaker.
- There are examples of SageMaker working with MLflow: https://cosminsanda.com/posts/experiment-tracking-with-mlflow-inside-amazon-sagemaker/
- I still think it is better to containerize the work with Docker and push it to ECS or EKS.
2. Flexibility
- SageMaker Experiments requires every training job to run through the SageMaker training API, which means paying SageMaker prices for each of them.
- This could be an issue if you do not want to restrict yourself to SageMaker-supported algorithms.
- Once I was told of this limitation, SageMaker Experiments looked much less useful to me.
3. Cloud Vendor Limitation
- If you need tracking across cloud vendors, MLflow is the natural choice.
- For example, tracking metrics from jobs that run on Azure or GCP.
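As a rough sketch of what cross-vendor tracking could look like: every job, wherever it runs, points at the same MLflow tracking server and tags its run with the cloud it came from. The fallback server URL, the experiment name, the tag keys, and both helper names are assumptions made up for illustration; only the `mlflow` calls themselves are part of MLflow.

```python
import os


def run_tags(cloud: str, job_id: str) -> dict[str, str]:
    """Build the tags that say where a run came from (hypothetical tag keys)."""
    return {"cloud": cloud.lower(), "job_id": job_id}


def log_metrics_to_shared_server(cloud: str, job_id: str, metrics: dict[str, float]) -> str:
    """Log one run to a shared MLflow server and return its run ID."""
    import mlflow  # imported lazily so run_tags() works without mlflow installed

    # Assumption: MLFLOW_TRACKING_URI holds the shared server URL;
    # the localhost fallback is only a placeholder.
    mlflow.set_tracking_uri(os.environ.get("MLFLOW_TRACKING_URI", "http://localhost:5000"))
    mlflow.set_experiment("cross-cloud-demo")  # hypothetical experiment name

    with mlflow.start_run(run_name=f"{cloud}-{job_id}") as run:
        mlflow.set_tags(run_tags(cloud, job_id))
        mlflow.log_metrics(metrics)
    return run.info.run_id
```

A job on Azure and a job on GCP would both call `log_metrics_to_shared_server(...)` with their own `cloud` value, so the runs land in one experiment and can be filtered by the `cloud` tag.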
4. Documentation & Community Support
- Documentation and published solutions to known issues are limited, unless you pay for AWS Premium Support to get help.
MLflow Automatic Logging
https://www.mlflow.org/docs/latest/tracking.html#automatic-logging
The following libraries support autologging:
- Scikit-learn
- TensorFlow and Keras
- Gluon
- XGBoost
- LightGBM
- Statsmodels
- Spark
- Fastai
- PyTorch
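A minimal sketch of autologging with one of the libraries above (scikit-learn), assuming `mlflow` and `scikit-learn` are installed and that logging to the default tracking location is acceptable. The function name, dataset, and model are arbitrary choices for illustration, and exactly which parameters and metrics get recorded depends on the library and the MLflow version.

```python
def train_with_autolog() -> str:
    """Fit a small scikit-learn model with autologging on; return the MLflow run ID."""
    # Imported lazily so this module loads even where mlflow/scikit-learn are absent.
    import mlflow
    from sklearn.datasets import load_iris
    from sklearn.linear_model import LogisticRegression

    # Turn on autologging for every supported library that is installed.
    mlflow.autolog()

    X, y = load_iris(return_X_y=True)
    with mlflow.start_run() as run:
        # No explicit log_param/log_metric calls: fit() triggers the logging.
        LogisticRegression(max_iter=200).fit(X, y)
    return run.info.run_id
```

Calling `train_with_autolog()` and then opening the MLflow UI should show a run for the fit without any hand-written logging code.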
References:
- https://towardsdatascience.com/mlops-with-mlflow-and-amazon-sagemaker-pipelines-33e13d43f238
- https://towardsdatascience.com/5-simple-steps-to-mlops-with-github-actions-mlflow-and-sagemaker-pipelines-19abf951a70
- https://aws.amazon.com/blogs/machine-learning/managing-your-machine-learning-lifecycle-with-mlflow-and-amazon-sagemaker/
- https://neptune.ai/blog/amazon-sagemaker-alternatives
- https://neptune.ai/blog/best-mlops-platforms-to-manage-machine-learning-lifecycle