Interactive Nodes

Interactive Nodes

SkyPilot provides interactive nodes, the user’s personal work servers in the clouds. These are single-node VMs that can be quickly accessed by convenient CLI commands:

  • sky gpunode

  • sky cpunode

  • sky tpunode

Interactive nodes are normal SkyPilot clusters. They allow fast access to instances without requiring a task YAML specification.

Workflow

Use sky gpunode to get a node with GPU(s):

$ # Create and log in to a cluster with the
$ # default name, "sky-gpunode-<username>".
$ sky gpunode

$ # Or, use -c to set a custom name to manage multiple clusters:
$ # sky gpunode -c node0

Use --gpus to change the type and the number of GPUs:

$ sky gpunode  # By default, use 1 K80 GPU.
$ sky gpunode --gpus V100
$ sky gpunode --gpus V100:8

$ # To see available GPU names:
$ # sky show-gpus

Directly set a cloud and an instance type, if required:

$ sky gpunode --cloud aws --instance-type p2.16xlarge

See all available options and short keys:

$ sky gpunode --help

SkyPilot also provides sky cpunode for CPU-only instances and sky tpunode for TPU instances (only available on Google Cloud Platform).

To log in to an interactive node, either re-type the CLI command or use ssh:

$ # If the cluster with the default name exists, this will directly log in.
$ sky gpunode

$ # Equivalently:
$ ssh sky-gpunode-<username>

$ # Use -c to refer to different interactive nodes.
$ # sky gpunode -c node0
$ # ssh node0

Because SkyPilot exposes SSH access to clusters, this means clusters can be easily added into tools such as Visual Studio Code Remote.

Since interactive nodes are just normal SkyPilot clusters, sky exec can be used to submit jobs to them.

Interactive nodes can be stopped, restarted, and terminated, like any other cluster:

$ # Stop at the end of the work day:
$ sky stop sky-gpunode-<username>

$ # Restart it the next morning:
$ sky start sky-gpunode-<username>

$ # Terminate entirely:
$ sky down sky-gpunode-<username>

Note

Since sky start restarts a stopped cluster, auto-failover provisioning is disabled—the cluster will be restarted on the same cloud and region where it was originally provisioned.

Getting multiple nodes

By default, interactive clusters are a single node. If you require a cluster with multiple nodes (e.g., for hyperparameter tuning or distributed training), use num_nodes in a YAML spec:

# multi_node.yaml
num_nodes: 16
resources:
  accelerators: V100:8
$ sky launch -c my-cluster multi_node.yaml

To log in to the head node:

$ ssh my-cluster