Quickstart

This guide will walk you through:

  • defining a task in a simple YAML format

  • provisioning a cluster and running a task

  • using the core SkyPilot CLI commands

Be sure to complete the installation instructions before continuing with this guide.

Hello, SkyPilot!

Let’s define our very first task, a simple Hello, SkyPilot! program.

Create a directory anywhere on your machine:

$ mkdir hello-sky
$ cd hello-sky

Copy the following YAML into a hello_sky.yaml file:

resources:
  # Optional; if left out, automatically pick the cheapest cloud.
  cloud: aws
  # 1x NVIDIA V100 GPU
  accelerators: V100:1

# Working directory (optional) containing the project codebase.
# Its contents are synced to ~/sky_workdir/ on the cluster.
workdir: .

# Typical use: pip install -r requirements.txt
# Invoked under the workdir (i.e., can use its files).
setup: |
  echo "Running setup."

# Typical use: make use of resources, such as running training.
# Invoked under the workdir (i.e., can use its files).
run: |
  echo "Hello, SkyPilot!"
  conda env list

This defines a task with the following components:

  • resources: cloud resources the task must run on (e.g., accelerators or instance type)

  • workdir: the working directory containing project code that will be synced to the provisioned instance(s)

  • setup: commands that must be run before the task is executed (invoked under workdir)

  • run: commands that run the actual task (invoked under workdir)

All these fields are optional.

To launch a cluster and run a task, use sky launch:

$ sky launch -c mycluster hello_sky.yaml

Tip

The first run may take a few minutes. Feel free to read ahead in this guide.

Tip

You can use the -c flag to give the cluster an easy-to-remember name. If not specified, a name is autogenerated.

The sky launch command does much of the heavy lifting:

  • selects an appropriate cloud and VM based on the specified resource constraints;

  • provisions (or reuses) a cluster on that cloud;

  • syncs up the workdir;

  • executes the setup commands; and

  • executes the run commands.

Within a few minutes, the cluster will finish provisioning and the task will run. The output will show Hello, SkyPilot! followed by the list of installed Conda environments.

Execute a task on an existing cluster

Once you have an existing cluster, use sky exec to execute a task on it:

$ sky exec mycluster hello_sky.yaml

The sky exec command is more lightweight; it

  • syncs up the workdir (so that the task may use updated code); and

  • executes the run commands.

Provisioning and setup commands are skipped.
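To make the difference between the two commands concrete, here is a small sketch that lists the stages each command performs, as described in this guide. The stage names are illustrative labels chosen for this example, not SkyPilot identifiers, and nothing here calls SkyPilot itself.

```python
# Stages performed by each command, as described in this guide.
# The stage names are illustrative labels, not SkyPilot identifiers.
LAUNCH_STAGES = [
    "select cloud and VM",
    "provision or reuse cluster",
    "sync workdir",
    "run setup commands",
    "run run commands",
]
EXEC_STAGES = ["sync workdir", "run run commands"]


def skipped_by_exec():
    """Return the sky launch stages that sky exec does not repeat."""
    return [stage for stage in LAUNCH_STAGES if stage not in EXEC_STAGES]


if __name__ == "__main__":
    for stage in skipped_by_exec():
        print("skipped by sky exec:", stage)
```

In practice this means a change to the setup commands only takes effect through sky launch, while a change to code under the workdir is picked up by sky exec.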

Bash commands are also supported, such as:

$ sky exec mycluster python train_cpu.py
$ sky exec mycluster --gpus=V100:1 python train_gpu.py
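If you submit such commands from a script, a small helper can assemble the argument list. The sketch below is one possible way to do it: the function name sky_exec_argv is hypothetical, and the only sky exec options it uses are the ones shown above (the cluster name and --gpus). It builds the command without running it.

```python
import shlex


def sky_exec_argv(cluster, command, gpus=None):
    """Build the argument list for `sky exec <cluster> [--gpus=...] <command>`.

    Hypothetical helper for illustration; it only assembles the command.
    """
    argv = ["sky", "exec", cluster]
    if gpus is not None:
        argv.append(f"--gpus={gpus}")
    # Split the Bash command into words, as a shell would.
    argv.extend(shlex.split(command))
    return argv


if __name__ == "__main__":
    argv = sky_exec_argv("mycluster", "python train_gpu.py", gpus="V100:1")
    # Print the command line that would be run.
    print(shlex.join(argv))
    # To actually submit it: subprocess.run(argv, check=True)
```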

For interactive or monitoring commands, such as htop or gpustat -i, use ssh instead (see below) to avoid the overhead of job submission.

View all clusters

Use sky status to see all clusters (across regions and clouds) in a single table:

$ sky status

This may show multiple clusters, if you have created several:

NAME       LAUNCHED     RESOURCES             COMMAND                            STATUS
gcp        1 day ago    1x GCP(n1-highmem-8)  sky cpunode -c gcp --cloud gcp     STOPPED
mycluster  4 mins ago   1x AWS(p3.2xlarge)    sky exec mycluster hello_sky.yaml  UP

SSH into clusters

Simply run ssh <cluster_name> to log into a cluster:

$ ssh mycluster

Multi-node clusters work too:

# Assuming 3 nodes.

# Head node.
$ ssh mycluster

# Worker nodes.
$ ssh mycluster-worker1
$ ssh mycluster-worker2

These host names work because SkyPilot adds the appropriate entries to ~/.ssh/config.
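When scripting over every node of a cluster, the host names can be generated instead of typed out. The following sketch assumes the naming pattern seen in the 3-node example above (the cluster name for the head node, then <cluster>-worker1, <cluster>-worker2, and so on); the function ssh_aliases is hypothetical, and the pattern is inferred from that example only.

```python
def ssh_aliases(cluster, num_nodes):
    """Return SSH host names for a cluster, head node first.

    Assumes the pattern shown in this guide: the head node uses the
    cluster name, and workers are numbered starting from 1.
    """
    if num_nodes < 1:
        raise ValueError("num_nodes must be at least 1")
    workers = [f"{cluster}-worker{i}" for i in range(1, num_nodes)]
    return [cluster] + workers


if __name__ == "__main__":
    for host in ssh_aliases("mycluster", 3):
        print("ssh", host)
```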

Transfer files

After a task’s execution, use rsync or scp to download files (e.g., checkpoints):

$ rsync -Pavz mycluster:/remote/source /local/dest  # copy from remote VM
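The same download can be driven from a script. Here is a minimal sketch that builds the rsync command shown above; the helper name rsync_download_argv is hypothetical, and the paths in the example are the placeholders from this guide, not real locations.

```python
def rsync_download_argv(cluster, remote_path, local_path):
    """Build an rsync command that copies from the cluster to this machine.

    Flags: -P keeps partial files and shows progress, -a preserves file
    attributes and recurses, -v is verbose, -z compresses during transfer.
    """
    return ["rsync", "-Pavz", f"{cluster}:{remote_path}", local_path]


if __name__ == "__main__":
    argv = rsync_download_argv("mycluster", "/remote/source", "/local/dest")
    print(" ".join(argv))
    # To actually copy the files: subprocess.run(argv, check=True)
```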

For uploading files to the cluster, see Syncing Code and Artifacts.

Stop/terminate a cluster

When you are done, run sky stop mycluster to stop the cluster. To terminate a cluster instead, run sky down mycluster. Find more commands that manage the lifecycle of clusters here.

Next steps

Congratulations! In this quickstart, you have launched a cluster, run a task, and interacted with SkyPilot’s CLI.

Next steps: