ack-arena

ack-arena is an ACK component that installs and manages Arena in your ACK clusters. Install it from the ACK console in a few clicks, without manual setup.

What is Arena

Arena is a lightweight client for managing machine learning workloads on Kubernetes. It covers the complete ML lifecycle (data preparation, model development, training, and prediction), so data scientists can focus on model work rather than cluster management.

Arena integrates with Alibaba Cloud services and supports GPU sharing and Cloud Paralleled File System (CPFS). It runs optimized deep learning frameworks to maximize the utilization of heterogeneous computing resources.

Why use ack-arena

ack-arena simplifies Arena installation. Installing open-source Arena manually requires configuring Kubernetes resources, namespaces, role-based access control (RBAC), and operators. ack-arena automates this through the ACK console, reducing setup to a few clicks.

ack-arena stays integrated with ACK. ack-arena deploys Arena as a managed component in your cluster, keeping it aligned with ACK's service updates and GPU resource management.

Usage notes

For installation instructions, see Install Arena.

Release notes

Version 0.8.5-264b96a-aliyun

Release date: 2021-06-08

Images:

  • registry.cn-beijing.aliyuncs.com/acs/arena-binary-installer:0.8.5-264b96a-aliyun

  • registry.cn-beijing.aliyuncs.com/acs/arena-deploy-manager:0.8.5-264b96a-aliyun
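
Each image reference above combines a registry host, a repository path, and a tag that matches the component version. As a minimal sketch, the following Python helper splits such a reference into its parts. Reading the tag as `<version>-<commit>-<vendor>` is an assumption inferred from the tags listed on this page, not a documented format, and `parse_image_ref` and `ImageRef` are hypothetical names used only for illustration.

```python
from typing import NamedTuple


class ImageRef(NamedTuple):
    registry: str
    repository: str
    version: str
    commit: str
    vendor: str


def parse_image_ref(ref: str) -> ImageRef:
    """Split '<registry>/<repository>:<tag>' into its parts.

    Assumption: the tag has the shape '<version>-<commit>-<vendor>',
    inferred from the tags in these release notes.
    """
    name, sep, tag = ref.rpartition(":")
    if not sep:
        raise ValueError(f"no tag in image reference: {ref!r}")
    # The first path segment is treated as the registry host.
    registry, _, repository = name.partition("/")
    parts = tag.split("-")
    if len(parts) != 3:
        raise ValueError(f"unexpected tag shape: {tag!r}")
    version, commit, vendor = parts
    return ImageRef(registry, repository, version, commit, vendor)


ref = parse_image_ref(
    "registry.cn-beijing.aliyuncs.com/acs/arena-binary-installer:0.8.5-264b96a-aliyun"
)
print(ref.repository, ref.version, ref.commit)
```

A helper like this can be handy when checking that both images of a release carry the same tag before mirroring them to another registry.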

New features

  • Added support for NVIDIA Triton inference tasks.

  • Added the arena-uninstall command to uninstall Arena.

Bug fixes

  • Previously, running arena top node showed the total GPU count as 0. Now the command reports the correct GPU count.

  • Previously, RBAC permissions for managing CronJobs were not granted during installation. Now the required permissions are included.

Impact: No impact on workloads.

Version 0.8.0-ba37c8a-aliyun

Release date: 2021-04-06

Images:

  • registry.cn-beijing.aliyuncs.com/acs/arena-binary-installer:0.8.0-ba37c8a-aliyun

  • registry.cn-beijing.aliyuncs.com/acs/arena-deploy-manager:0.8.0-ba37c8a-aliyun

New features

  • Added Arena SDK for Python and Arena SDK for Java.

  • Added support for Seldon inference tasks.

  • Added support for generating kubeconfig scripts for multiple tenants.

  • Added support for customizing the startup sequence of roles in TensorFlow training jobs.

Bug fixes

  • Previously, Apache Spark jobs could not be submitted. Now Spark job submission works as expected.

  • Previously, the LogViewer URL could not be retrieved when no chief pods were provisioned. Now the URL is returned correctly regardless of pod state.

Impact: No impact on workloads.

Version 0.7.1-3559f56-aliyun

Release date: 2021-01-27

Images:

  • registry.cn-beijing.aliyuncs.com/acs/arena-binary-installer:0.7.1-3559f56-aliyun

  • registry.cn-beijing.aliyuncs.com/acs/arena-deploy-manager:0.7.1-3559f56-aliyun

Bug fixes

  • Previously, et-operator was not installed in the arena-system namespace. Now it is installed correctly.

Impact: No impact on workloads.

Version 0.7.0-c6f5800-aliyun

Release date: 2021-01-25

Images:

  • registry.cn-beijing.aliyuncs.com/acs/arena-binary-installer:0.7.0-c6f5800-aliyun

  • registry.cn-beijing.aliyuncs.com/acs/arena-deploy-manager:0.7.0-c6f5800-aliyun

New features

  • Added Arena SDK for Go.

  • Added Prometheus Service support.

  • Added the -g option to arena get to display GPU information.

  • Added the -c option to arena logs to specify a container.
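
The two new options can be sketched as argument lists suitable for `subprocess.run`. This is an illustration only: the job name `tf-job` and the container name `tensorflow` are hypothetical, the helper functions are not part of Arena, and placing each option after the job name is an assumption. Check the built-in help of the Arena client for the exact syntax in your version.

```python
from typing import List, Optional


def arena_get_argv(job: str, show_gpu: bool = False) -> List[str]:
    """Build the argument list for `arena get`; -g requests GPU information."""
    argv = ["arena", "get", job]
    if show_gpu:
        argv.append("-g")
    return argv


def arena_logs_argv(job: str, container: Optional[str] = None) -> List[str]:
    """Build the argument list for `arena logs`; -c selects a container."""
    argv = ["arena", "logs", job]
    if container is not None:
        argv += ["-c", container]
    return argv


# Hypothetical job and container names, printed rather than executed.
print(" ".join(arena_get_argv("tf-job", show_gpu=True)))
print(" ".join(arena_logs_argv("tf-job", container="tensorflow")))
```

To run a command, pass the resulting list to `subprocess.run(argv, check=True)` on a machine where the Arena client is installed and configured for your cluster.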

Improvements

  • Updated the output formats of arena list, arena get, arena serve list, and arena serve get.

Bug fixes

  • Previously, running arena serve delete could delete multiple jobs unintentionally. Now the command deletes only the specified job.

Impact: No impact on workloads.