name: musa-torch-coding description: Transcribe audio via OpenAI Audio Transcriptions API (Whisper). homepage: https://platform.openai.com/docs/guides/speech-to-text metadata: { "openclaw": { "emoji": "☁️", "requires": { "bins": ["curl"], "env": ["OPENAI_API_KEY"] }, "primaryEnv": "OPENAI_API_KEY", }, }

MUSA Torch Coding

Guide for generating PyTorch code that runs on Moore Threads (摩尔线程) MUSA GPUs using torch_musa.

Overview

MUSA (Metaverse Unified System Architecture) is Moore Threads' GPU computing platform. This skill helps generate code that:

Runs on Moore Threads GPUs via torch_musa
Converts CUDA code to MUSA-compatible code
Sets up proper environments (conda v1.2/v1.3)
Follows MUSA best practices

Key Differences: CUDA vs MUSA

CUDA	MUSA
`torch.cuda`	`torch.musa`
`torch.device("cuda")`	`torch.device("musa")`
`torch.cuda.is_available()`	`torch.musa.is_available()`
`backend='nccl'`	`backend='mccl'`
`torch.cuda.device_count()`	`torch.musa.device_count()`
`torch.cuda.get_device_name()`	`torch.musa.get_device_name()`

Environment Setup

⚠️ Important: MUSA Uses Pre-configured Conda Environments

DO NOT install PyTorch, vLLM, or related packages manually. MUSA environments are custom-built and include:

MUSA-specific PyTorch builds (not compatible with standard PyTorch)
MUSA-customized vLLM versions
MUSA drivers and SDK integration

Installing standard packages from PyPI will break the environment.

Conda Environment (v1.2/v1.3)

MUSA provides pre-configured conda environments. Common environment names:

v1.2 - MUSA SDK v1.2 environment
v1.3 - MUSA SDK v1.3 environment (newer)

# List available MUSA environments
conda env list | grep -E "(v1\.2|v1\.3|musa)"

# Activate the appropriate environment
conda activate v1.2  # or v1.3

# Verify MUSA availability
python -c "import torch_musa; import torch; print(torch.musa.is_available())"

Environment Detection & Setup

If no MUSA conda environment is detected:

Check if MUSA is installed:

which musaInfo  # Should show musaInfo path
ls /usr/local/musa/  # MUSA SDK location

If MUSA is not set up:
- Use the musa-env-setup skill for complete environment installation
- The skill covers SDK installation, conda setup, and vLLM-MUSA configuration
Common conda environment locations:
- /opt/conda/envs/
- ~/conda/envs/
- /usr/local/conda/envs/

Key Environment Variables

Variable	Purpose
`MUSA_VISIBLE_DEVICES=0,1,2,3`	Control visible GPU IDs
`MUSA_LAUNCH_BLOCKING=1`	Synchronous kernel launch
`MUDNN_LOG_LEVEL=INFO`	Enable MUDNN logging
`TORCH_SHOW_CPP_STACKTRACES=1`	Show C++ stack traces

Code Generation Rules

When generating PyTorch code for MUSA:

Always import torch_musa

import torch_musa  # Must import before using torch.musa

Use torch.device("musa")

device = torch.device("musa") if torch.musa.is_available() else torch.device("cpu")
tensor = torch.tensor([1.0, 2.0], device=device)

Use 'mccl' for distributed training

dist.init_process_group(backend='mccl', ...)

Mixed precision (AMP) is supported

from torch.cuda.amp import autocast, GradScaler  # Same API

TensorCore optimization available
- Set torch.backends.musa.matmul.allow_tf32 = True for TensorFloat32

Model Templates

For common model types, see templates in references/:

reference.md - Complete MUSA API reference

Common Tasks

Check GPU Availability

import torch
import torch_musa

print(f"MUSA available: {torch.musa.is_available()}")
print(f"Device count: {torch.musa.device_count()}")
print(f"Device name: {torch.musa.get_device_name(0)}")

Training Loop Pattern

import torch_musa

# Device setup
device = torch.device("musa") if torch.musa.is_available() else torch.device("cpu")

# Model and data to device
model = model.to(device)
inputs = inputs.to(device)

# Training (same as CUDA)
optimizer.zero_grad()
outputs = model(inputs)
loss = criterion(outputs, targets)
loss.backward()
optimizer.step()

Distributed Training (DDP)

import torch.distributed as dist
import torch_musa

# Initialize with mccl backend
dist.init_process_group(backend='mccl', rank=rank, world_size=world_size)

# Create process group on MUSA
torch.cuda.set_device(local_rank)  # torch_musa extends torch.cuda API

Code Conversion

When converting existing CUDA code to MUSA:

Add import torch_musa at the top
Replace cuda with musa in device strings
Replace nccl with mccl for distributed backend
Keep all other PyTorch API calls unchanged

Troubleshooting

Device not found: Ensure user is in render group: sudo usermod -aG render $(whoami)
Library not found: Check LD_LIBRARY_PATH includes /usr/local/musa/lib/
Build issues: Clean and rebuild: python setup.py clean && bash build.sh
Docker issues: Use --env MTHREADS_VISIBLE_DEVICES=all

Reference

For detailed API reference and examples, see references/reference.md.

musa-torch-coding

Description

MUSA Torch Coding

Overview

Key Differences: CUDA vs MUSA

Environment Setup

⚠️ Important: MUSA Uses Pre-configured Conda Environments

Conda Environment (v1.2/v1.3)

Environment Detection & Setup

Key Environment Variables

Code Generation Rules

Model Templates

Common Tasks

Check GPU Availability

Training Loop Pattern

Distributed Training (DDP)

Code Conversion

Troubleshooting

Reference

Reviews (0)

Comments (0)

Compatible Platforms

Links

Pricing

Related Configs

self-improving-agent

Self Improving Agent

Find Skills

Summarize