Source: https://scaleoutsean.github.io/2023/01/12/beegfs-eseries-hybrid-cloud-spot-ocean-spark.html

Burst on-prem GPU workloads from BeeGFS/E-Series clusters to Spot Ocean for Spark in the cloud

12 Jan 2023 -

3 minute read

Problem statement

Enterprises with analytics, HPC and Deep Learning workloads that have high-bandwidth storage requirements use BeeGFS with NetApp E-Series.

For various reasons they may need to burst-to-cloud. Some of the main challenges in this process:

  • Data replication from on-premises BeeGFS to the cloud
  • Storage performance in the cloud
  • Cost of compute resources in the cloud

Data replication

For obvious reasons (granularity), file and object replication are generally a better choice than volume replication in this use case.

To copy BeeGFS files to the cloud you may use a file sync tool of your choice: rsync, rclone, etc.

Alternatively, NetApp has a subscription (charged per hour) service called Cloud Sync.

I wrote about various ways to sync files and objects here. If you use Cloud Sync, automation is available via the Cloud Sync API.

Data replication from the cloud to on-premises is usually not a problem because we’re talking just about the results (a few KB to a few GB, perhaps). To avoid having to open the enterprise firewall to incoming connections (or, even worse, use a VPN), simply post your results to the cloud provider’s Object Store and download them from there using Cloud Sync or rclone.
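As a sketch of both directions with rclone: `cloud:` below is assumed to be an rclone remote you have already configured (S3, Azure Blob, GCS, etc.), and the bucket and path names are placeholders.

```sh
# Push input data from the on-prem BeeGFS mount to cloud object storage.
# Parallelism flags help on a high-bandwidth BeeGFS filesystem.
rclone sync /mnt/beegfs/datasets cloud:burst-input/datasets \
  --transfers 32 --checkers 64 --fast-list --progress

# Later, pull only the results back on-premises.
# This is an outbound connection, so no inbound firewall holes are needed.
rclone copy cloud:burst-output/results /mnt/beegfs/results --progress
```

`sync` makes the destination match the source (it deletes extraneous files), while `copy` only adds or updates files, which is the safer choice for pulling results.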

Storage

For Big Data analytics and DL/ML workloads it usually pays to use fast storage because that saves compute costs. Cloud GPUs aren’t exactly cheap, so if you use BeeGFS on-premises, you likely want to use it in the public cloud for similar workloads.

The creators of BeeGFS, ThinkParQ, have created BeeOND, a subscription service that’s based on BeeGFS running on hyperscaler hardware. Back in 2019 it was possible to get close to 100 GiB/s from such clusters (see this example from Azure).

Compared to ONTAP-based cloud storage, BeeOND is limited in terms of data management features: backup, snapshots, etc. If you need to protect your cloud data before the BeeOND subscription is terminated, make a copy in the hyperscaler’s Object Storage.
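For completeness, this is roughly how a BeeOND instance is started and stopped with the `beeond` helper that ships with BeeGFS; the node list, data path, and mountpoint below are placeholders, and the exact flags should be checked against the BeeOND documentation for your version.

```sh
# Placeholder node list: one hostname per line, all reachable via ssh
cat > nodefile <<'EOF'
node01
node02
node03
EOF

# -n: node list, -d: local storage/metadata path on each node, -c: mountpoint
beeond start -n nodefile -d /mnt/nvme/beeond -c /mnt/beeond

# ... run workloads against /mnt/beeond ...

# Tear the instance down when done (on-demand data is discarded,
# so copy anything you need to Object Storage first)
beeond stop -n nodefile -L -d
```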

CSI drivers
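
The workflow later in this post deploys the BeeGFS CSI driver into the Kubernetes cluster so that pods can mount the BeeGFS/BeeOND filesystem. A minimal sketch of a StorageClass and PVC for it might look like this, assuming the driver (`beegfs.csi.netapp.com`) is already installed; the management host IP and base path are placeholder assumptions.

```sh
kubectl apply -f - <<'EOF'
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: beegfs-dyn
provisioner: beegfs.csi.netapp.com
parameters:
  sysMgmtdHost: "10.0.0.10"     # placeholder: BeeGFS/BeeOND management node
  volDirBasePath: k8s/dyn       # base directory for dynamically provisioned volumes
reclaimPolicy: Delete
volumeBindingMode: Immediate
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: beegfs-scratch
spec:
  accessModes:
    - ReadWriteMany             # BeeGFS supports shared access across nodes
  storageClassName: beegfs-dyn
  resources:
    requests:
      storage: 100Gi
EOF
```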

GPU compute nodes

We want to avoid unnecessary cost of GPU compute resources.

To do that we can use a Spot.io service called Spot Ocean for Spark.

Spot Ocean for Spark

Spot doesn’t seem to build GPU clusters from scratch, but it does allow you to import existing clusters and let Spot control and manage them.

This means we can build a cluster with GPU-based worker nodes and tell Spot Ocean for Spark to use it.

The next question is whether the containers used by Spot Ocean for Spark have CUDA. Spot Ocean for Spark uses its own images; or, to be perfectly correct, containers based on its base images (this will come in useful later).

At the time of writing this post, the images don’t seem to have CUDA libraries in them. Because we don’t have to use Spot’s stock images and can use custom Docker images based on Spot’s base images, we can easily start with those and create custom containers with CUDA drivers.

Here’s information about the official Spark images which can be used as a base, and this is the list.
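A sketch of the custom-image approach: the base image tag below is a placeholder (pick one from Spot’s published list), the CUDA package version is an example that should match what your hyperscaler’s GPU nodes recommend, and the build assumes the NVIDIA apt repository is configured in the base image or added beforehand.

```sh
cat > Dockerfile <<'EOF'
# Placeholder: substitute one of Spot's Ocean-for-Spark base images
FROM <spot-ocean-spark-base-image>

USER root
# Install CUDA runtime libraries (example version; assumes the NVIDIA
# apt repository is available in the image)
RUN apt-get update && \
    apt-get install -y --no-install-recommends cuda-runtime-11-8 && \
    rm -rf /var/lib/apt/lists/*
EOF

# Build and push to the private registry the cluster can pull from
docker build -t my-registry.example.com/spark-cuda:latest .
docker push my-registry.example.com/spark-cuda:latest
```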

Performance monitoring

For short-lived clusters I’d probably use the hyperscaler’s monitoring and the CLI tools built into BeeGFS. Why?

  • Cost optimization is done by Spot
  • The cluster will be deleted anyway

Alternatively, use the BeeGFS monitoring plugin for Grafana (running on-premises or in the free Grafana Cloud tier); either would connect to a small, long-running InfluxDB v1 container that could be left running if bursting to the cloud happens frequently enough.
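A sketch of that long-running InfluxDB v1 container, with the beegfs-mon side shown only as commented configuration; the hostname is a placeholder and the exact key names should be verified against the BeeGFS documentation for your release.

```sh
# Small, long-running InfluxDB v1 instance for beegfs-mon to write to
docker run -d --name beegfs-influx \
  -p 8086:8086 \
  -v influx-data:/var/lib/influxdb \
  influxdb:1.8

# beegfs-mon (on the BeeGFS management/monitoring node) then points at it,
# e.g. in /etc/beegfs/beegfs-mon.conf:
#   dbType     = influxdb
#   dbHostName = <influx-host>    # placeholder
#   dbHostPort = 8086
#   dbDatabase = beegfs_mon
```

Grafana (on-premises or Grafana Cloud) would then use this InfluxDB as its data source for the BeeGFS dashboards.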

Workflow

The entire workflow would look like this:

  • Preparation
    • Build images with CUDA version recommended by hyperscaler and store them in private registry
  • Replication
    • Stand up a minimal BeeOND cluster
    • Replicate data to BeeOND
  • Compute
    • Grow the BeeOND cluster to the required number of nodes
    • Stand up a Kubernetes cluster with GPU-based nodes and deploy BeeGFS CSI
    • Import the cluster to Spot Ocean for Spark
    • Run Spark and other workloads
  • Terminate temporary environment
    • Copy results to Object Storage or back to on-premises
    • Scale down to zero or destroy Kubernetes cluster
    • Destroy BeeGFS cluster

Users who burst to the cloud frequently could scale clusters up and down rather than re-create them every time.
