Article Source
KServe (Kubeflow KFServing) Live Coding Session
KFServing graduated from Kubeflow and is now KServe. MLOps Community Meetup #83! Last Wednesday, Demetrios Brinkmann and Alexey Grigorev talked to Theofilos Papapanagiotou, Data Science Architect at Prosus.
This is the practical session following up on MLOps Meetup #40, Hands-on Serving Models Using KFServing (https://youtu.be/VtZ9LWyJPdc).
Abstract
We start with the serialization of TensorFlow, PyTorch, and SKLearn models into files and the deployment of an inference service on a Kubernetes cluster. Great MLOps means great model monitoring, so we then look at inference service metrics, model server metrics, payload logs, and class distributions.
For AI ethics in production, we use the explainer pattern with many different explainers, fairness detectors, and adversarial-attack detectors. For integrations, we use the transformer pattern to preprocess the inference request as well as to enrich it with online features from a feature store.
Bio
Theo is a recovering Unix engineer with 20 years of work experience in telcos, across internet services, video delivery, and cybersecurity. He is also a university student for life: BSc in CS (1999), MSc in Data Communications (2008), and MSc in AI (2017).
Nowadays he calls himself an ML Engineer, as this role expresses his passion for systems engineering and machine learning. His analytical thinking is driven by curiosity and a hacker’s spirit, and his skills span a variety of areas: statistics, programming, databases, distributed systems, and visualization.
Related links
- Serialization of TensorFlow models using the SavedModel format: https://www.tensorflow.org/guide/saved_model
- Serialization of PyTorch models using a model archive: https://github.com/pytorch/serve/blob/master/model-archiver/README.md
- Serialization of SKLearn models using pickle (a sketch covering all three formats follows this list): https://scikit-learn.org/stable/model_persistence.html
- Model serving using TensorFlow Serving, TorchServe, and MLServer (a sample prediction request follows this list): https://www.tensorflow.org/tfx/guide/serving, https://pytorch.org/serve/, https://github.com/SeldonIO/MLServer
- Deployment of a KServe InferenceService on a Kubernetes cluster; we use kind on Docker Desktop (a deployment sketch follows this list). Installation of KServe on kind: https://kserve.github.io/website/, https://kind.sigs.k8s.io/, https://github.com/theofpa/kserve-tutorial/blob/main/quick_install_kind.sh
- Model monitoring: inference service metrics, model server metrics, payload logs, and class distributions (a payload-logging sketch follows this list)
- CloudEvents consumer to Elasticsearch: https://github.com/theofpa/event-es
- For AI ethics in production, the explainer pattern with many different explainers, fairness detectors, and adversarial-attack detectors from Seldon Alibi and Trusted-AI (an explainer sketch follows this list): https://github.com/SeldonIO/alibi, https://github.com/Trusted-AI
- For integrations, the transformer pattern to preprocess the inference request as well as to enrich it with online features from Feast (a transformer sketch follows this list): https://github.com/kserve/kserve/tree/master/docs/samples/v1beta1/transformer/feast
- Installation of Feast locally on kind/Docker Desktop: https://github.com/theofpa/kserve-tutorial/tree/main/kserve_feast
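
A minimal sketch of the three serialization formats above (model architectures and file paths are illustrative):

```python
import pickle

import tensorflow as tf
import torch
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression

# TensorFlow: export a trained Keras model in the SavedModel format.
tf_model = tf.keras.Sequential([tf.keras.Input(shape=(4,)), tf.keras.layers.Dense(3)])
tf.saved_model.save(tf_model, "export/tf_model/0001")

# PyTorch: save the weights; TorchServe then packages them with a handler
# into a .mar archive using the torch-model-archiver CLI, e.g.
#   torch-model-archiver --model-name iris --version 1.0 \
#     --serialized-file model.pt --handler base_handler
pt_model = torch.nn.Linear(4, 3)
torch.save(pt_model.state_dict(), "model.pt")

# SKLearn: pickle the fitted estimator to a file that the sklearn
# model server can later load from the storage URI.
X, y = load_iris(return_X_y=True)
skl_model = LogisticRegression(max_iter=200).fit(X, y)
with open("model.pickle", "wb") as f:
    pickle.dump(skl_model, f)
```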
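
Deploying the InferenceService on the kind cluster is usually done with kubectl apply on a YAML manifest; the same object can also be created from Python with the kserve SDK. A sketch, assuming the public sklearn iris example model from the KServe docs:

```python
from kubernetes import client
from kserve import (KServeClient, V1beta1InferenceService,
                    V1beta1InferenceServiceSpec, V1beta1PredictorSpec,
                    V1beta1SKLearnSpec)

# An InferenceService with a single sklearn predictor.
isvc = V1beta1InferenceService(
    api_version="serving.kserve.io/v1beta1",
    kind="InferenceService",
    metadata=client.V1ObjectMeta(name="sklearn-iris", namespace="default"),
    spec=V1beta1InferenceServiceSpec(
        predictor=V1beta1PredictorSpec(
            sklearn=V1beta1SKLearnSpec(
                storage_uri="gs://kfserving-examples/models/sklearn/1.0/model"
            )
        )
    ),
)

# Create it on the cluster and block until the service reports ready.
kclient = KServeClient()
kclient.create(isvc)
kclient.wait_isvc_ready("sklearn-iris", namespace="default")
```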
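
Once the service is ready, TensorFlow Serving, TorchServe, and MLServer are all reachable through KServe's inference protocol; the v1 flavour shown here posts instances to the :predict endpoint (host and port are illustrative for a port-forwarded kind ingress):

```python
import requests

# KServe v1 inference protocol (shared with TensorFlow Serving):
# POST the instances to the model's :predict endpoint.
url = "http://localhost:8080/v1/models/sklearn-iris:predict"
headers = {"Host": "sklearn-iris.default.example.com"}  # Knative routing header
payload = {"instances": [[6.8, 2.8, 4.8, 1.4], [6.0, 3.4, 4.5, 1.6]]}

resp = requests.post(url, json=payload, headers=headers)
print(resp.json())  # e.g. {"predictions": [1, 1]}
```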
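
For payload logging, the predictor can be given a logger that emits each request/response pair as a CloudEvent to a sink such as the event-es consumer above, which indexes them into Elasticsearch. A sketch, assuming the SDK's V1beta1LoggerSpec and an illustrative in-cluster sink URL:

```python
from kserve import V1beta1LoggerSpec, V1beta1PredictorSpec, V1beta1SKLearnSpec

# Same sklearn predictor as before, now shipping payloads as CloudEvents.
predictor = V1beta1PredictorSpec(
    logger=V1beta1LoggerSpec(
        mode="all",  # log both requests and responses
        url="http://event-es.default.svc.cluster.local",  # illustrative sink
    ),
    sklearn=V1beta1SKLearnSpec(
        storage_uri="gs://kfserving-examples/models/sklearn/1.0/model"
    ),
)
```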
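
On the explainer side, a minimal local sketch of an Alibi anchor explanation (the dataset and model are illustrative; in KServe the same explainer runs as a separate component of the InferenceService):

```python
from alibi.explainers import AnchorTabular
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

# A toy model whose predictions we want to explain.
data = load_iris()
clf = RandomForestClassifier().fit(data.data, data.target)

# Fit the anchor explainer on the training distribution, then explain
# one prediction with a human-readable precondition ("anchor").
explainer = AnchorTabular(clf.predict, feature_names=data.feature_names)
explainer.fit(data.data)
explanation = explainer.explain(data.data[0])
print(explanation.anchor)  # e.g. ['petal width (cm) <= 0.80']
```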
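
Finally, the transformer in the Feast sample is a small Python service that sits in front of the predictor and overrides preprocess to enrich requests with online features. A sketch of its shape (class and helper names are illustrative, the Feast lookup is stubbed out, and the exact kserve.Model signatures vary between releases):

```python
import kserve

class FeastTransformer(kserve.Model):
    """Enriches incoming requests with online features before prediction."""

    def __init__(self, name: str, predictor_host: str):
        super().__init__(name)
        self.predictor_host = predictor_host
        self.ready = True

    def preprocess(self, inputs: dict, headers: dict = None) -> dict:
        # A real implementation would query the Feast online store here
        # (e.g. FeatureStore.get_online_features) keyed by entity ids in
        # the request, and append the features to each instance.
        enriched = [self._enrich(instance) for instance in inputs["instances"]]
        return {"instances": enriched}

    def _enrich(self, instance):
        # Hypothetical helper standing in for the Feast online-store lookup.
        return instance

if __name__ == "__main__":
    transformer = FeastTransformer("sklearn-iris", predictor_host="sklearn-iris-predictor")
    kserve.ModelServer().start([transformer])
```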