Using Ray Notebook

This guide explains how to access the built-in Ray Notebook in a Ray application and run a simple distributed application. This pre-configured, cloud-based JupyterLab environment lets you write and debug Ray programs in your browser, removing the need to set up a complex local development environment.

Scenario

Local development environments often lack the computing resources for large-scale data processing or machine learning model training. Configuring them can also be complex and lead to inconsistencies with production environments.

Ray Notebook solves this problem. It pre-installs JupyterLab on a Ray application, allowing developers to access it directly through a browser and use the entire application's computing resources for interactive development, thereby simplifying the transition from a local prototype to a distributed run.

Before you begin

Obtain the connection address and configuration information for your Ray application, such as the public Jupyter URL and the value of secret.jupyterlab.password used in the procedure. You must also add the private or public IP address of your development environment to the application whitelist.
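If the login page does not load later, a quick reachability check from the development environment can help separate a whitelist or address problem from a browser problem. The following is a minimal sketch, not an official tool: `can_reach` is a hypothetical helper, and it assumes the Jupyter endpoint accepts plain TCP connections on the host and port taken from your public Jupyter URL.

```python
import socket


def can_reach(host: str, port: int, timeout: float = 3.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within timeout.

    Hypothetical helper for illustration; host and port are assumed to come
    from the public Jupyter URL of your Ray application.
    """
    try:
        # create_connection resolves the host and opens a TCP connection.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers DNS failures, refused connections, and timeouts.
        return False
```

For example, `can_reach("jupyter.example.com", 443)` (a placeholder host and port) returns False if the connection is refused, times out, or the name cannot be resolved. A successful connection only shows that the port is reachable; it does not by itself confirm that the whitelist is configured as intended.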

Procedure

Follow these steps to access the built-in Ray Notebook in a Ray application using public network access.

  1. Log in to JupyterLab:

    1. In a browser, open the public Jupyter URL you obtained.

    2. In the "Password or token" field, enter the value of secret.jupyterlab.password to open the JupyterLab interface.

  2. Develop a Ray program: In JupyterLab, open the Launcher by navigating to File > New Launcher. From the Launcher, create a notebook, console, or Python file by clicking Python 3 (ipykernel) under Notebook, Python 3 (ipykernel) under Console, or Python File under Other.

  3. Run code: For example, in a Notebook, enter and run the following code in a code cell.

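    import time

    import ray


    @ray.remote
    def retrieve_task(item, db):
        # Simulate a lookup whose latency grows with the index.
        time.sleep(item / 10.0)
        return item, db[item]


    if __name__ == "__main__":
        database = [
            "Learning", "Ray", "Flexible", "Distributed", "Python", "for",
            "Machine", "Learning"
        ]
        # Initialize the Ray runtime for this session.
        ray.init()
        # Store the list in the object store once so every task can share it.
        db_object_ref = ray.put(database)
        # Launch the lookups in parallel; each call returns an object reference.
        retrieve_refs = [
            retrieve_task.remote(item, db_object_ref) for item in [0, 2, 4, 6]
        ]
        # ray.get blocks until all tasks finish and returns results in order.
        for data in ray.get(retrieve_refs):
            print(data)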

    The output is as follows:

    (0, 'Learning')
    (2, 'Flexible')
    (4, 'Python')
    (6, 'Machine')