Democratizing AI: How GKE Makes Machine Learning Accessible (2024)

Generative AI has kept the GKE product team busy over the last year. This article is a curated list of the features we have recently released on GKE that are especially useful for machine learning, artificial intelligence, and large language models. It also lists some open source and community projects that work well on GKE.

This article is largely based on content authored originally by Nathan Beach with the help of Marcus Johansson.

Graphics processing units (GPUs) are a very common type of hardware accelerator, used to perform resource-intensive tasks such as machine learning (ML) inference and training and large-scale data processing. In GKE Autopilot and Standard, you can attach GPU hardware to nodes in your clusters and then allocate GPU resources to containerised workloads running on those nodes.

  • A3 VM, powered by NVIDIA H100 GPUs, is generally available: The A3 VM is optimised for GPU supercomputing and offers 3x faster training and 10x greater networking bandwidth compared to the prior generation. A3 can also operate at scale, enabling users to scale models to tens of thousands of NVIDIA H100 GPUs.
  • G2 VM with NVIDIA L4 GPUs offers great inference performance per dollar: The G2 VM became generally available earlier this year, and we recently announced MLPerf results for the G2, including up to a 1.8x improvement in performance per dollar compared to a comparable public cloud inference offering.
  • GPU slicing on GKE: When using GPUs with GKE, Kubernetes allocates one full GPU per container even if the container needs only a fraction of the GPU for its workload, which can lead to wasted resources and cost overruns. To improve GPU utilisation, multi-instance GPUs allow you to partition a single NVIDIA A100 GPU into up to seven slices. Each slice can be allocated independently to one container on the node.
  • GPU dashboard available on the GKE cluster details page: When you view the details of a specific GKE cluster in the Cloud Console, the Observability tab now includes a dashboard for GPU metrics. This provides visibility into the utilisation of GPU resources, including utilisation by GPU model and by Kubernetes node.
  • Autopilot now supports L4 GPUs, in addition to its existing support for NVIDIA T4, A100, and A100 80GB GPUs.
  • Automatic GPU driver installation is available in GKE 1.27.2-gke.1200 and later, which lets you install NVIDIA GPU drivers on nodes without manually applying a DaemonSet.
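
To make the allocation model in the list above concrete, here is a minimal sketch that builds Pod manifests requesting either a full GPU or a multi-instance GPU slice. The helper `gpu_pod_manifest`, the Pod names, and the image `example.com/inference:latest` are hypothetical and only for illustration. The node selector keys and the `nvidia.com/gpu` resource name follow the GKE conventions as we understand them, so check them against the current GKE documentation before relying on them.

```python
import json


def gpu_pod_manifest(name, image, accelerator, gpu_count=1, partition_size=None):
    """Build a Pod manifest (as a dict) that requests NVIDIA GPUs on GKE."""
    # Select nodes by the GPU model attached to them.
    node_selector = {"cloud.google.com/gke-accelerator": accelerator}
    if partition_size is not None:
        # Multi-instance GPU: target nodes whose GPUs are split into slices of this size.
        node_selector["cloud.google.com/gke-gpu-partition-size"] = partition_size
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            "nodeSelector": node_selector,
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image name
                    "resources": {"limits": {"nvidia.com/gpu": gpu_count}},
                }
            ],
        },
    }


# Hypothetical workloads: one full L4 GPU, and one 1g.5gb slice of an A100.
full_gpu = gpu_pod_manifest("l4-inference", "example.com/inference:latest", "nvidia-l4")
sliced_gpu = gpu_pod_manifest(
    "a100-slice",
    "example.com/inference:latest",
    "nvidia-tesla-a100",
    partition_size="1g.5gb",
)
print(json.dumps(sliced_gpu, indent=2))
```

Because kubectl accepts JSON as well as YAML, the printed manifest can be saved to a file and applied with `kubectl apply -f`. On a partitioned node, a limit of one `nvidia.com/gpu` corresponds to one slice rather than one whole GPU.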

Tensor Processing Units (TPUs) are Google’s custom-developed application-specific integrated circuits (ASICs) used to accelerate machine learning workloads. Compared to GPUs, which are general-purpose processors that support many different applications and software, TPUs are optimised to handle the massive matrix operations used in neural networks at high speed. GKE supports adding TPUs to nodes in the cluster to train machine learning models.
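
As a rough illustration of how a workload asks for TPUs, the sketch below builds a Pod manifest that requests TPU chips. The helper `tpu_pod_manifest`, the Pod name, and the image are hypothetical. The node selector keys, the `google.com/tpu` resource name, and the example values (`tpu-v4-podslice`, topology `2x2x1`, four chips) are assumptions based on our reading of the GKE TPU documentation and should be verified for your cluster version.

```python
import json


def tpu_pod_manifest(name, image, accelerator, topology, chips):
    """Build a Pod manifest (as a dict) that requests TPU chips on GKE."""
    tpu_resources = {"google.com/tpu": chips}
    return {
        "apiVersion": "v1",
        "kind": "Pod",
        "metadata": {"name": name},
        "spec": {
            # Select nodes by TPU type and by the physical layout of the chips.
            "nodeSelector": {
                "cloud.google.com/gke-tpu-accelerator": accelerator,
                "cloud.google.com/gke-tpu-topology": topology,
            },
            "containers": [
                {
                    "name": name,
                    "image": image,  # hypothetical image name
                    "resources": {
                        "requests": dict(tpu_resources),
                        "limits": dict(tpu_resources),
                    },
                }
            ],
        },
    }


# Hypothetical training Pod; the accelerator, topology, and chip count are assumed values.
tpu_pod = tpu_pod_manifest(
    "tpu-training", "example.com/trainer:latest", "tpu-v4-podslice", "2x2x1", 4
)
print(json.dumps(tpu_pod, indent=2))
```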

Ray (ray.io) is an open-source framework for scaling Python applications across multiple nodes in a cluster. Ray provides a simple API for building distributed, parallelized applications, especially deep learning applications.
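
To give a feel for the Ray API, here is a minimal sketch of its task model. The functions `square` and `run_with_ray` are hypothetical names chosen for illustration, and the example assumes the `ray` package is installed. Without a configured cluster address it runs on a local Ray instance; on GKE the same code would run against a Ray cluster.

```python
def square(x):
    """Plain Python function that Ray can run as a remote task."""
    return x * x


def run_with_ray(values):
    """Fan calls to square() out as Ray tasks and gather the results in order."""
    import ray  # imported lazily so this module loads even where Ray is not installed

    # Starts a local Ray instance unless a cluster address is configured.
    ray.init(ignore_reinit_error=True)
    remote_square = ray.remote(square)  # wrap the function as a Ray remote function
    # .remote() schedules a task and returns an object reference immediately.
    futures = [remote_square.remote(v) for v in values]
    return ray.get(futures)  # block until every task has finished
```

Calling `run_with_ray([1, 2, 3])` schedules three independent tasks that Ray is free to place on any worker, which is the pattern that lets the same Python code spread across the nodes of a cluster.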

Visit g.co/cloud/gke-aiml for helpful resources about running AI workloads on GKE.
