Inference as a service is seeing wide adoption both in the cloud and in on-premises data centers. Hosting model servers such as TensorFlow* Serving, OpenVINO™ Model Server, or Seldon Core* on Kubernetes* is an effective way to achieve scalability and high availability for these workloads. Nevertheless, the task of configuring the Kubernetes load balancer can [...]
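
The post's configuration details are elided above, so as a rough illustration only, the sketch below uses the official Kubernetes Python client to deploy a replicated model server (OpenVINO Model Server is used as the example) and expose it behind a LoadBalancer Service. The image tag, model path, port numbers, replica count, and resource names are illustrative assumptions, not values taken from the original article.

# Minimal sketch, assuming the official `kubernetes` Python client is installed
# and a kubeconfig with cluster access is available locally.
from kubernetes import client, config

def build_deployment() -> client.V1Deployment:
    # Assumed image, model location, and port; adjust for your own model server.
    container = client.V1Container(
        name="ovms",
        image="openvino/model_server:latest",
        args=["--model_name=resnet",
              "--model_path=gs://example-bucket/resnet",
              "--port=9000"],
        ports=[client.V1ContainerPort(container_port=9000)],
    )
    template = client.V1PodTemplateSpec(
        metadata=client.V1ObjectMeta(labels={"app": "ovms"}),
        spec=client.V1PodSpec(containers=[container]),
    )
    spec = client.V1DeploymentSpec(
        replicas=3,  # multiple replicas for throughput and availability
        selector=client.V1LabelSelector(match_labels={"app": "ovms"}),
        template=template,
    )
    return client.V1Deployment(
        api_version="apps/v1", kind="Deployment",
        metadata=client.V1ObjectMeta(name="ovms"), spec=spec,
    )

def build_service() -> client.V1Service:
    # A LoadBalancer Service spreads inference requests across the replicas.
    return client.V1Service(
        api_version="v1", kind="Service",
        metadata=client.V1ObjectMeta(name="ovms"),
        spec=client.V1ServiceSpec(
            type="LoadBalancer",
            selector={"app": "ovms"},
            ports=[client.V1ServicePort(port=9000, target_port=9000)],
        ),
    )

if __name__ == "__main__":
    config.load_kube_config()  # reads the local kubeconfig
    client.AppsV1Api().create_namespaced_deployment(
        namespace="default", body=build_deployment())
    client.CoreV1Api().create_namespaced_service(
        namespace="default", body=build_service())

Equivalent kubectl manifests would work just as well; the point is simply that the Deployment provides the scaling and self-healing, while the Service provides the load balancing the article goes on to discuss.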


[...]


The post Inference at Scale in Kubernetes appeared first on Blogs@Intel.


Source: Intel – Blog