Project GitHub

https://github.com/aws/sagemaker-spark

SageMakerEstimator

@Xin:

  • Basically, this translates Spark Estimator artifacts into a SageMaker Training Job, then hosts the trained model on an endpoint that runs a Docker container image (pulled from Amazon ECR).
  • After the SageMaker Training Job is created, the Estimator polls AWS until the job reports success.
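
The polling described in the second note can be sketched roughly as follows. This is a minimal illustration in plain Python, not the library's actual code: `wait_for_training_job`, `describe_status`, the interval and timeout defaults, and the fake status sequence are all hypothetical names and values introduced here. The status strings are the ones SageMaker reports for a training job (`InProgress`, `Completed`, `Failed`, `Stopped`).

```python
import time
from typing import Callable

# Statuses after which the job will not change any more.
TERMINAL_STATUSES = {"Completed", "Failed", "Stopped"}


def wait_for_training_job(
    describe_status: Callable[[], str],
    poll_interval_s: float = 5.0,   # assumed value, not the library default
    timeout_s: float = 3600.0,      # assumed value, not the library default
    sleep: Callable[[float], None] = time.sleep,
) -> str:
    """Poll describe_status() until the job completes; raise on failure or timeout."""
    waited = 0.0
    while True:
        status = describe_status()
        if status == "Completed":
            return status
        if status in TERMINAL_STATUSES:
            raise RuntimeError(f"training job ended with status {status}")
        if waited >= timeout_s:
            raise TimeoutError(f"training job still {status} after {waited}s")
        sleep(poll_interval_s)
        waited += poll_interval_s


# Hypothetical stand-in for a real status call: in progress twice, then done.
_fake_statuses = iter(["InProgress", "InProgress", "Completed"])
final_status = wait_for_training_job(
    lambda: next(_fake_statuses),
    sleep=lambda seconds: None,  # skip real sleeping in this sketch
)
```

In a real setup, `describe_status` would wrap a call to the SageMaker API that returns the current training job status; the injected `sleep` only exists so the sketch can run instantly.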

Adapts a SageMaker learning algorithm to a Spark Estimator. Fits a SageMakerModel by running a SageMaker Training Job on a Spark Dataset. Each call to fit() submits a new SageMaker Training Job, creates a new SageMaker Model, and creates a new SageMaker Endpoint Config. A new Endpoint is either created during fit(), or the returned SageMakerModel is configured to create one when its transform is called.
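
The difference between creating the Endpoint up front and deferring it to transform can be illustrated with a toy model. This is a sketch under stated assumptions, not the sagemaker-spark API: `ToyEstimator`, `ToyModel`, `create_endpoint_on_fit`, and the generated resource names are hypothetical, and no AWS calls are made.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import List, Optional

_ids = count(1)  # gives each fit() call its own resource names


@dataclass
class ToyModel:
    """Hypothetical stand-in for the model object returned by fit()."""
    endpoint_config: str
    endpoint: Optional[str] = None

    def ensure_endpoint(self) -> str:
        # Create the endpoint only if it does not exist yet.
        if self.endpoint is None:
            self.endpoint = f"endpoint-for-{self.endpoint_config}"
        return self.endpoint

    def transform(self, rows: List[float]) -> List[float]:
        self.ensure_endpoint()
        # A real model would send rows to the endpoint; this sketch echoes them.
        return list(rows)


@dataclass
class ToyEstimator:
    """Hypothetical stand-in for the estimator; records the jobs it submits."""
    create_endpoint_on_fit: bool
    submitted_jobs: List[str] = field(default_factory=list)

    def fit(self, rows: List[float]) -> ToyModel:
        # rows is unused here; a real fit() would train on it.
        n = next(_ids)
        self.submitted_jobs.append(f"training-job-{n}")  # new job per call
        model = ToyModel(endpoint_config=f"endpoint-config-{n}")
        if self.create_endpoint_on_fit:
            model.ensure_endpoint()
        return model


eager = ToyEstimator(create_endpoint_on_fit=True).fit([1.0, 2.0])
lazy = ToyEstimator(create_endpoint_on_fit=False).fit([1.0, 2.0])
lazy_endpoint_before = lazy.endpoint  # still None: nothing created yet
lazy.transform([3.0])                 # first transform creates the endpoint
```

Deferring creation avoids paying for a running endpoint when the fitted model is never used for inference, at the cost of a slower first transform.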

On fit, the input dataset is serialized with the specified trainingSparkDataFormat using the specified trainingSparkDataFormatOptions and uploaded to an S3 location specified by trainingInputS3DataPath.
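
As a rough picture of the serialize-then-upload step, the sketch below writes rows in one format with a format option and stores the result under an S3-style path. Everything here is an assumption for illustration: `serialize_rows`, `upload`, the `"csv"` format, the `delimiter` option, the bucket name, and the in-memory dict standing in for S3 are hypothetical, and the real estimator delegates serialization to Spark's data source writers.

```python
import csv
import io
from typing import Dict, Sequence


def serialize_rows(
    rows: Sequence[Sequence[float]],
    data_format: str,
    options: Dict[str, str],
) -> bytes:
    """Serialize rows using the given format and its options (csv only here)."""
    if data_format != "csv":
        raise ValueError(f"unsupported format in this sketch: {data_format}")
    buf = io.StringIO()
    writer = csv.writer(
        buf,
        delimiter=options.get("delimiter", ","),
        lineterminator="\n",
    )
    writer.writerows(rows)
    return buf.getvalue().encode("utf-8")


def upload(store: Dict[str, bytes], s3_path: str, name: str, payload: bytes) -> str:
    """Store the payload under s3_path/name in a dict that stands in for S3."""
    key = f"{s3_path.rstrip('/')}/{name}"
    store[key] = payload
    return key


fake_s3: Dict[str, bytes] = {}
payload = serialize_rows([[1.0, 0.5], [0.0, 2.5]], "csv", {"delimiter": ";"})
key = upload(fake_s3, "s3://example-bucket/training-input/", "part-00000.csv", payload)
```

Here the first two arguments play the roles of trainingSparkDataFormat and trainingSparkDataFormatOptions, and the path passed to `upload` plays the role of trainingInputS3DataPath.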

Reference