Meson: Workflow Orchestration for Netflix Recommendations

@Xin: to me, this is basically a wrapper around Mesos and SparkML/R, with some custom features for data preparation and model selection

  • kind of lame to me, maybe because the post is 2 years old
  • Kubeflow is open source and has a better UI

https://netflixtechblog.com/meson-workflow-orchestration-for-netflix-recommendations-fc932625c1d9

High-level use case

  • Selecting a set of users — This is done via a Hive query to select the cohort for analysis
  • Cleansing / preparing the data — A Python script that creates 2 sets of users for ensuring parallel paths
  • In the parallel paths, one uses Spark to build and analyze a global model with HDFS as temporary storage.
    • The other uses R to build region (country) specific models. The number of regions is dynamic based on the cohort selected for analysis.
  • Validation — Scala code that tests for the stability of the models when the two paths converge.
    • In this step we also go back and repeat the whole process if the model is not stable.
  • Publish the new model — Fire off a Docker container to publish the new model to be picked up by other production systems
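
To make the shape of that use case concrete, here is a minimal sketch of the steps as a dependency graph, grouped into "waves" of steps that could run in parallel. This is not Meson code: the object name, step names, and both functions are made up for illustration, and the "repeat if the model is not stable" loop is left out since it is not part of the static graph.

```scala
// Hypothetical model of the use case as a DAG; none of these names come from Meson.
object RecoPipelineSketch {
  // For each step, the set of steps that must finish before it can start.
  // The number of regional training steps depends on the regions passed in.
  def dependencies(regions: Seq[String]): Map[String, Set[String]] = {
    val regional: Map[String, Set[String]] =
      regions.map(r => s"trainRegionModel[$r]" -> Set("prepareData")).toMap
    Map(
      "selectUsers"      -> Set.empty[String],
      "prepareData"      -> Set("selectUsers"),
      "trainGlobalModel" -> Set("prepareData"),
      "validate"         -> (regional.keySet + "trainGlobalModel"),
      "publish"          -> Set("validate")
    ) ++ regional
  }

  // Kahn-style layering: each wave holds the steps whose dependencies are all done.
  def waves(deps: Map[String, Set[String]]): List[List[String]] = {
    @annotation.tailrec
    def loop(remaining: Map[String, Set[String]],
             done: Set[String],
             acc: List[List[String]]): List[List[String]] =
      if (remaining.isEmpty) acc.reverse
      else {
        val ready = remaining.collect {
          case (step, ds) if ds.subsetOf(done) => step
        }.toList.sorted
        require(ready.nonEmpty, "cycle detected in workflow")
        loop(remaining -- ready, done ++ ready, ready :: acc)
      }
    loop(deps, Set.empty, Nil)
  }
}
```

Calling `RecoPipelineSketch.waves(RecoPipelineSketch.dependencies(Seq("US", "Canada")))` puts the global and per-region training steps in the same wave, which is the "parallel paths" part of the list above.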

Meson offers a Scala-based DSL for defining workflows (snippet from the post)

val getUsers = Step("Get Users", ...)
val wrangleData = Step("Wrangle Data", ...)
...
val regionSplit = Step("For Each Region", ...)
val regionJoin = Step("End For Each", ...)
val regions = Seq("US", "Canada", "UK_Ireland", "LatAm", ...)
val wf = start -> getUsers -> wrangleData ==> (
  trainGlobalModel -> validateGlobalModel,
  regionSplit **(reg = regions) --< (trainRegModel, validateRegModel) >-- regionJoin
) >== selectModel -> validateModel -> end
// The same workflow, if verbs are preferred over operators
val wf = sequence(start, getUsers, wrangleData) parallel {
  sequence(trainGlobalModel, validateGlobalModel)
  sequence(regionSplit,
           forEach(reg = regions) sequence(trainRegModel, validateRegModel) forEach,
           regionJoin)
} parallel sequence(selectModel, validateModel, end)
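
Meson itself is not open source, so the snippet above cannot be run as-is. As a rough idea of how a DSL like this can be built in plain Scala, here is a toy sketch. Everything in it (`Flow`, `Step`, `Sequence`, `Parallel`, `parallel`, `forEach`, `stepNames`) is hypothetical and is not Meson's API; it only borrows the `->` operator for sequencing and uses named helpers for the fork and for-each parts.

```scala
// Toy workflow algebra; every name here is hypothetical and unrelated to Meson's real classes.
object WorkflowDslSketch {
  sealed trait Flow {
    // Sequential composition: run `this`, then `next`.
    def ->(next: Flow): Flow = Sequence(List(this, next))
  }
  final case class Step(name: String) extends Flow
  final case class Sequence(parts: List[Flow]) extends Flow
  final case class Parallel(branches: List[Flow]) extends Flow

  // Fork into independent branches that join before the following step.
  def parallel(branches: Flow*): Flow = Parallel(branches.toList)

  // Expand one branch per item, e.g. one training path per region.
  def forEach(items: Seq[String])(body: String => Flow): Flow =
    Parallel(items.map(body).toList)

  // Tiny interpreter: list leaf step names in one valid (serialized) execution order.
  def stepNames(flow: Flow): List[String] = flow match {
    case Step(n)      => List(n)
    case Sequence(ps) => ps.flatMap(stepNames)
    case Parallel(bs) => bs.flatMap(stepNames)
  }

  val regions = Seq("US", "Canada")

  // Same overall shape as the workflow in the post, with fewer steps.
  val wf: Flow =
    Step("Get Users") -> Step("Wrangle Data") -> parallel(
      Step("Train Global") -> Step("Validate Global"),
      forEach(regions)(r => Step(s"Train $r") -> Step(s"Validate $r"))
    ) -> Step("Select Model")
}
```

The point of the sketch is that the workflow is just a value: a real scheduler would walk the same tree and hand each `Step` to an executor instead of collecting names.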

More UI screenshots are in the original post

I have to say, Uber’s ML platform UI is a bit cooler ;p