Artificial Intelligence Engineering – The Google Cloud Solution (Part 3 of 3)

In parts 1 and 2 of this series, we covered the challenges our client in the entertainment industry was facing with the automation of their machine learning models and how we designed an AI (Artificial Intelligence) engineering solution for them on Microsoft Azure. In part 3, we describe a similar solution that we designed on Google Cloud.

The Solution

Google Cloud offers AI Platform Pipelines for MLOps solutions, which not only facilitate developing, deploying and running ML workloads but also provide pipeline versioning, metadata tracking, model experiments and visualization.

We found Google Cloud AI Platform’s customizable prebuilt pipeline templates very handy for accelerating the model build process with data from BigQuery or Google Cloud Storage.

A Google Kubernetes Engine (GKE) cluster was created automatically when the Google AI Platform Pipeline was initiated. Alternatively, we could use a pre-created cluster managed from the Cloud AI Platform UI.

Google Cloud AI Platform Pipelines support the Kubeflow Pipelines SDK for fully customizable pipelines and the TensorFlow Extended (TFX) SDK for ML pipelines based on TensorFlow.

GCP AI Platform Pipelines integrate with GCP managed services such as BigQuery, Dataflow and Cloud Functions, as well as other GCP AI services.

GCP AI Platform’s built-in Hypertune library eased the hyperparameter tuning process and also helped us choose the best-performing model from the various hyperparameter tuning runs.
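
To illustrate the selection step, here is a minimal plain-Python sketch of picking the best run from a set of tuning trials. It does not call AI Platform or the Hypertune library; the Trial class, the hyperparameter names and the metric values are all hypothetical stand-ins for whatever a tuning job reports.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Trial:
    """One hyperparameter tuning run (hypothetical structure)."""
    trial_id: str
    hyperparameters: Dict[str, float]
    metric: float  # objective value reported by the run, e.g. validation accuracy


def best_trial(trials: List[Trial], maximize: bool = True) -> Trial:
    """Return the trial with the best objective value."""
    if not trials:
        raise ValueError("no trials to compare")
    chooser = max if maximize else min
    return chooser(trials, key=lambda t: t.metric)


# Made-up results, purely for illustration.
trials = [
    Trial("1", {"learning_rate": 0.10, "max_depth": 4}, 0.81),
    Trial("2", {"learning_rate": 0.05, "max_depth": 6}, 0.86),
    Trial("3", {"learning_rate": 0.01, "max_depth": 8}, 0.84),
]

winner = best_trial(trials)
print(winner.trial_id, winner.hyperparameters)
```

Passing maximize=False covers objectives such as loss or error, where a lower value is better.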

GCP Explainable AI’s features, such as AI explanations, What-if analysis and continuous evaluation, made the models easy to explain.

The components of the machine learning orchestration pipeline were re-usable and shareable to allow rapid and reliable experimentation. Kubeflow’s friendly user interface (UI) provided us with visuals on detailed pipeline steps and tasks.
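
The idea of re-usable components can be sketched in plain Python. This is not the Kubeflow Pipelines SDK, where each step would be a containerized component. It only shows, under simplified assumptions, how the same step functions can be wired into different pipelines and run in dependency order. All function names and data are hypothetical, and the sketch needs Python 3.9 or later for graphlib.

```python
from graphlib import TopologicalSorter
from typing import Callable, Dict, List, Set, Tuple

# Reusable components: each reads the shared context and returns new entries for it.
def ingest(ctx: dict) -> dict:
    return {"rows": [1.0, 2.0, 3.0, 4.0]}  # stand-in for reading from a data source

def prepare(ctx: dict) -> dict:
    peak = max(ctx["rows"])
    return {"features": [r / peak for r in ctx["rows"]]}  # scale to the range 0..1

def train(ctx: dict) -> dict:
    features = ctx["features"]
    return {"model": sum(features) / len(features)}  # toy "model": the mean feature

def evaluate(ctx: dict) -> dict:
    return {"score": round(ctx["model"], 3)}

COMPONENTS: Dict[str, Callable[[dict], dict]] = {
    "ingest": ingest, "prepare": prepare, "train": train, "evaluate": evaluate,
}

def run_pipeline(depends_on: Dict[str, Set[str]]) -> Tuple[List[str], dict]:
    """Run the named components in an order that respects their dependencies."""
    order = list(TopologicalSorter(depends_on).static_order())
    ctx: dict = {}
    for name in order:
        ctx.update(COMPONENTS[name](ctx))
    return order, ctx

# Two pipelines assembled from the same components.
full_order, full_ctx = run_pipeline(
    {"prepare": {"ingest"}, "train": {"prepare"}, "evaluate": {"train"}}
)
prep_order, prep_ctx = run_pipeline({"prepare": {"ingest"}})

print(full_order, full_ctx["score"])
print(prep_order, prep_ctx["features"])
```

The second pipeline re-uses ingest and prepare unchanged, which is the property that makes experimentation cheap: a new experiment only has to swap or add the steps that differ.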

Model deployment, which follows data processing, building and training, was handled by linking Google Cloud Build to the model’s GitHub repository. A Cloud Build job was configured to compile, build, train and deploy the model, using triggers for code updates or data refreshes. Alternatively, one could use Cloud Functions and Cloud Run for simpler model deployments.
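
As a rough sketch of what such a build job can look like, the Python below generates a Cloud Build configuration as JSON, which Cloud Build accepts alongside YAML. This is not the client's actual configuration. The trainer/train.py script and its --model-dir flag, the bucket, the model name and the runtime version are hypothetical placeholders, and $SHORT_SHA is only populated for builds started by a trigger.

```python
import json


def build_config() -> dict:
    """Return a Cloud Build config with install, train and deploy steps."""
    # Each triggered build writes its model under a path named after the commit.
    model_dir = "gs://${_BUCKET}/models/$SHORT_SHA"
    return {
        "steps": [
            {
                "id": "install-dependencies",
                "name": "python:3.9",
                "entrypoint": "pip",
                "args": ["install", "-r", "requirements.txt", "--user"],
            },
            {
                "id": "train-model",
                "name": "python:3.9",
                "entrypoint": "python",
                # Hypothetical training script and flag.
                "args": ["trainer/train.py", "--model-dir", model_dir],
            },
            {
                "id": "deploy-model",
                "name": "gcr.io/cloud-builders/gcloud",
                "args": [
                    "ai-platform", "versions", "create", "v_$SHORT_SHA",
                    "--model=${_MODEL_NAME}",
                    "--origin=" + model_dir,
                    "--framework=tensorflow",
                    "--runtime-version=${_RUNTIME_VERSION}",
                ],
            },
        ],
        # User-defined substitutions must start with an underscore.
        # The values below are placeholders, not real resources.
        "substitutions": {
            "_BUCKET": "my-bucket",
            "_MODEL_NAME": "my_model",
            "_RUNTIME_VERSION": "2.11",
        },
        "timeout": "3600s",
    }


print(json.dumps(build_config(), indent=2))
```

Saving the output as cloudbuild.json in the repository and pointing a Cloud Build trigger at that file would run the three steps in sequence on each matching push. A data-refresh trigger would need its own event source, which this sketch does not cover.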

The Result

We were able to design an end-to-end machine learning workflow using Google Cloud’s AI Platform, which supports template-based pipeline construction, versioning, artifact lineage and metric tracking of models and datasets. We found that Google’s fully automated CI/CD solutions powered by Cloud Build are a good choice for a complete Augmented AI experience, as they can facilitate pipeline runs based on event triggers.

The complete stack of services powered by Google not only accelerates the machine learning process but also lets data scientists, data engineers and machine learning engineers collaborate by sharing and working on multiple versions of artifacts and pipelines for experimentation. This reduces re-work on data preparation and feature engineering and makes it easy to update and re-deploy machine learning models in production.

Our take on Google Cloud AI Platform

Google Cloud AI Platform provides seamless creation of an end-to-end ML pipeline, from ingesting data to preparing, discovering, training, deploying and serving ML models. AI Platform has managed notebook instances integrated with BigQuery, Cloud Dataproc and Cloud Dataflow, making the entire data-to-machine-learning cycle easy.

One aspect worth mentioning is the option of exporting BigQuery ML models to TensorFlow’s SavedModel format in Google Cloud Storage and deploying them using Google Cloud AI Platform, which truly democratizes the use of machine learning. Microsoft Azure provides a similar solution, where Azure Synapse machine learning models converted to ONNX format are readable via Azure Machine Learning.
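
For reference, the export is driven by BigQuery ML's EXPORT MODEL statement. The sketch below only builds that statement as a string; the project, dataset, model and bucket names are hypothetical, and running the statement (for example from the BigQuery console or a client library) is left out.

```python
def export_model_sql(model_id: str, gcs_uri: str) -> str:
    """Build a BigQuery ML EXPORT MODEL statement for the given model and target."""
    if not gcs_uri.startswith("gs://"):
        raise ValueError("export target must be a Cloud Storage URI (gs://...)")
    if "`" in model_id or "'" in gcs_uri:
        raise ValueError("unexpected quote character in model id or URI")
    return f"EXPORT MODEL `{model_id}`\nOPTIONS (URI = '{gcs_uri}')"


# Hypothetical names, for illustration only.
statement = export_model_sql(
    "my-project.my_dataset.my_model",
    "gs://my-bucket/exported/my_model",
)
print(statement)
```

The exported directory can then serve as the origin when creating a model version on AI Platform.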

Databricks recently partnered with Google Cloud, offering seamless integration across data and AI services on Google Cloud.

As in Azure, in Google Cloud one would also need to assemble multiple GCP services to achieve the desired outcomes.

We found that the drag-and-drop features of Microsoft Azure’s machine learning designer are very user-friendly and require little to no coding for creating and deploying simple machine learning projects. Google Cloud tends to be more data-scientist-friendly, meaning that one needs some development experience to work with GCP AI tools, but the same holds for advanced machine learning on Microsoft Azure. Both TensorFlow and Kubeflow are powerful machine learning tools, developed and open-sourced by Google. Building ML pipelines with them requires some work, but they are more customizable than Azure Machine Learning.

Organizations that are already using on-prem TensorFlow or Kubeflow pipeline solutions may find it very easy to move to Google Cloud.

The future of AI on Google Cloud is bright, and we can’t wait to see a fully automated AI Engineering environment completely incorporated with CI/CD features along with Google’s AutoML solutions all being offered as part of Google’s AI Platform services.

Connect with me for further information on ML Automation Architecture design and implementation on either Google Cloud or Microsoft Azure.
