Distributed deployment

This topic describes how to deploy ossimport in distributed mode. Distributed deployment is supported only on Linux, not on Windows.

Prerequisites

  • Select at least two machines for the migration cluster. Designate one machine as the Master and the others as Workers.

  • Ensure that the Master can connect to the Workers using Secure Shell (SSH).

  • Ensure that all Workers have the same username and password.

    Note

    The Master connects to the Workers over SSH. You can configure the logon information for the Workers in the sys.properties file.
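
One way to confirm the SSH prerequisite is to try a non-interactive logon from the Master to each Worker before you deploy. The following is a minimal sketch, not part of ossimport. It assumes that key-based logon is already set up (with password logon, drop `-o BatchMode=yes` and enter the password when prompted). The user name and IP addresses in the example call are placeholders, and `SSH_BIN` is a hypothetical override for dry runs.

```shell
#!/usr/bin/env bash
# Sketch: check that the Master can open an SSH session to each Worker.
# SSH_BIN is a hypothetical override used for dry runs; it defaults to ssh.
SSH_BIN="${SSH_BIN:-ssh}"

check_workers() {
  # $1 = logon user, remaining arguments = Worker hosts
  user="$1"; shift
  failed=0
  for host in "$@"; do
    # BatchMode=yes makes ssh fail instead of prompting for a password.
    if "$SSH_BIN" -o BatchMode=yes -o ConnectTimeout=5 "$user@$host" true </dev/null; then
      echo "OK   $host"
    else
      echo "FAIL $host"
      failed=1
    fi
  done
  return "$failed"
}

# Example (placeholder user and addresses):
# check_workers ossuser 192.168.1.6 192.168.1.7
```

The function returns a non-zero status if any Worker is unreachable, so it can gate a later deploy step.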

Download and install ossimport

  1. Download ossimport

    Download the distributed version: ossimport-2.3.7.tar.gz.

  2. Install ossimport

    Note

    Perform the following operations on the machine that you designated as the Master.

    1. Log on to the server and run the following command to create the ossimport directory.

      mkdir -p $HOME/ossimport

    2. Go to the directory that contains the compressed package and run the following command to decompress the package to the specified directory.

      tar -zxvf ossimport-2.3.7.tar.gz -C $HOME/ossimport

      The file structure after decompression is as follows:

      ossimport
      ├── bin
      │   ├── console.jar     # JAR package for the Console module
      │   ├── master.jar      # JAR package for the Master module
      │   ├── tracker.jar     # JAR package for the Tracker module
      │   └── worker.jar      # JAR package for the Worker module
      ├── conf
      │   ├── job.cfg         # Job configuration file template
      │   ├── sys.properties  # System operational parameter configuration file
      │   └── workers         # Worker list
      ├── console.sh          # Command line interface, currently supported only on Linux
      ├── logs                # Log directory
      └── README.md           # README document, read carefully before use

      • OSS_IMPORT_HOME: The root directory of ossimport. The default is the directory that the package was decompressed to, $HOME/ossimport. To use a different directory, run export OSS_IMPORT_HOME=<dir> or add this setting to the $HOME/.bashrc file. We recommend that you use the default directory.

      • OSS_IMPORT_WORK_DIR: The working directory of ossimport. This is specified by the workingDir configuration item in conf/sys.properties. The recommended directory is $HOME/ossimport/workdir.

      • Use absolute paths for OSS_IMPORT_HOME and OSS_IMPORT_WORK_DIR, such as `/home/<user>/ossimport` and `/home/<user>/ossimport/workdir`, respectively.
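
The directory settings above can be checked from the shell before you continue. The following is a minimal sketch under stated assumptions: it assumes that conf/sys.properties uses plain `key=value` lines, and `get_prop` and `is_absolute_path` are hypothetical helpers introduced here for illustration. Only the `workingDir` item and the two directory names come from this topic.

```shell
#!/usr/bin/env bash
# Sketch: set OSS_IMPORT_HOME and check that workingDir is an absolute path.
export OSS_IMPORT_HOME="${OSS_IMPORT_HOME:-$HOME/ossimport}"

get_prop() {
  # Hypothetical helper: print the value of key $1 from properties file $2.
  # Assumes plain key=value lines; if the key appears twice, the last one wins.
  sed -n "s/^[[:space:]]*$1[[:space:]]*=[[:space:]]*//p" "$2" | tail -n 1
}

is_absolute_path() {
  # Succeeds only if $1 starts with a forward slash.
  case "$1" in
    /*) return 0 ;;
    *)  return 1 ;;
  esac
}

# Example:
# work_dir="$(get_prop workingDir "$OSS_IMPORT_HOME/conf/sys.properties")"
# is_absolute_path "$work_dir" || echo "workingDir should be an absolute path: $work_dir"
```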

Configuration

A distributed deployment has three configuration files: conf/sys.properties, conf/job.cfg, and conf/workers.

  • conf/job.cfg: The configuration file template for jobs in distributed mode. Before you start the data migration, modify the parameters as needed.

  • conf/sys.properties: The system operational parameter configuration file. You can configure parameters, such as the working directory and Worker operational parameters, in this file.

  • conf/workers: The list of Worker nodes.

Important
  • Confirm the parameters in sys.properties and job.cfg before you submit a job. Parameters cannot be modified after a job is submitted.

  • Finalize the Worker list in the workers file before you start the service. You cannot add or delete Workers after the service starts.
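
Because the Worker list cannot be changed after the service starts, it can help to sanity-check conf/workers first. The following is a minimal sketch. It assumes that the file holds one host or IP address per line, which this topic does not specify, and `check_workers_file` is a hypothetical helper, not an ossimport command.

```shell
#!/usr/bin/env bash
# Sketch: sanity-check a Worker list before starting the service.
# Assumption: the file holds one host or IP address per line.

check_workers_file() {
  # $1 = path to the workers file
  file="$1"
  # Trim surrounding whitespace and drop blank lines.
  entries="$(sed -e 's/^[[:space:]]*//' -e 's/[[:space:]]*$//' -e '/^$/d' "$file")"
  if [ -z "$entries" ]; then
    echo "no workers listed"
    return 1
  fi
  # uniq -d prints each entry that occurs more than once.
  dups="$(printf '%s\n' "$entries" | sort | uniq -d)"
  if [ -n "$dups" ]; then
    echo "duplicate entries: $(echo $dups)"
    return 1
  fi
  echo "workers listed: $(printf '%s\n' "$entries" | wc -l | tr -d ' ')"
}

# Example:
# check_workers_file "$HOME/ossimport/conf/workers"
```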

Run the service

  • Run a migration task

    In a distributed deployment, a migration task generally runs in the following order: modify the job configuration file, deploy the service, purge any existing job with the same name, submit the job, start the migration service, view the job status, retry failed subtasks, and stop the migration service. Each step is described below.

    • Deploy the service by running bash console.sh deploy in the Linux terminal. This command deploys ossimport to all machines based on the conf/workers configuration.

      Note

      Before you deploy, make sure that you have finished modifying the conf/job.cfg and conf/workers configuration files.

    • Purge a job with the same name. If you have run a job with the same name and need to run it again, you must purge the original job first. You do not need to run the purge command if the job has not been run before or if you want to retry failed subtasks. In the Linux terminal, run bash console.sh clean job_name.

    • Submit a data migration job. ossimport does not accept two jobs with the same name. If a job with the same name already exists, purge it with the clean command first. To submit a job, run bash console.sh submit [job_cfg_file] in the Linux terminal. The job_cfg_file parameter specifies the job configuration file and is optional. If you omit it, the default configuration file $OSS_IMPORT_HOME/conf/job.cfg is used. This file is the job configuration template, which you can modify as needed. By default, $OSS_IMPORT_HOME is the directory where console.sh is located.

    • Start the service by running bash console.sh start in the Linux terminal.

    • View the job status by running bash console.sh stat in the Linux terminal.

    • Retry failed subtasks. Subtasks may fail due to network issues or other reasons. The retry operation applies only to failed subtasks. Successful subtasks are not retried. In the Linux terminal, run bash console.sh retry [job_name]. The job_name parameter is optional. If you specify a job_name, the failed subtasks of that job are retried. If you do not specify a job_name, the failed subtasks of all jobs are retried.

    • Stop the service by running bash console.sh stop in the Linux terminal.

    Tips:

    • If you enter an incorrect parameter, bash console.sh automatically displays the correct command format.

    • Use absolute paths for directories in configuration files and when you submit jobs.

    • The job configuration refers to the configuration items in job.cfg.

      Important

      Configuration items cannot be modified after the job is submitted.
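
The steps for running a migration task can be chained in a small wrapper script. The following is a minimal sketch that uses only the console.sh subcommands named in this topic (deploy, clean, submit, start, and stat). It assumes that console.sh returns a non-zero exit status when a step fails, which this topic does not state. `console` and `run_migration` are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: run the workflow steps in order and stop at the first failure.
OSS_IMPORT_HOME="${OSS_IMPORT_HOME:-$HOME/ossimport}"

console() {
  # Run console.sh from the ossimport root directory.
  ( cd "$OSS_IMPORT_HOME" && bash console.sh "$@" )
}

run_migration() {
  # $1 = absolute path to the job configuration file
  # $2 = optional name of a previously run job to purge before submitting
  job_cfg="$1"
  purge_job="${2:-}"
  console deploy || return 1
  if [ -n "$purge_job" ]; then
    console clean "$purge_job" || return 1
  fi
  console submit "$job_cfg" || return 1
  console start || return 1
  console stat
}

# Example (placeholder path and job name):
# run_migration /home/user/ossimport/conf/job.cfg my_job
```

The retry and stop commands are left out on purpose, because when to run them depends on the job status that stat reports.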

  • Common causes of job failure

    • A file in the source directory was modified during the upload. The log/audit.log file contains a SIZE_NOT_MATCH error. In this case, the original file was uploaded successfully, but the modifications were not uploaded to OSS.

    • A source file was deleted during the upload, which caused the operation to fail.

    • The source filename does not comply with OSS naming conventions. For example, a filename cannot start with a forward slash (/) or be empty. This non-compliance causes the upload to OSS to fail.

    • The data source file failed to download.

    • The program exits unexpectedly, and the job status is Abort. In this case, contact us for support.

  • Job status and logs

    After a job is submitted, the Master node breaks it down into tasks. The Worker nodes execute the tasks, and the Tracker collects the task statuses. After the job is complete, the contents of the workdir directory are as follows:

    workdir
    ├── bin
    │   ├── console.jar     # JAR package for the Console module
    │   ├── master.jar      # JAR package for the Master module
    │   ├── tracker.jar     # JAR package for the Tracker module
    │   └── worker.jar      # JAR package for the Worker module
    ├── conf
    │   ├── job.cfg         # Job configuration file template
    │   ├── sys.properties  # System operational parameter configuration file
    │   └── workers         # Worker list
    ├── logs
    │   ├── import.log      # Migration log
    │   ├── master.log      # Master log
    │   ├── tracker.log     # Tracker log
    │   └── worker.log      # Worker log
    ├── master
    │   ├── jobqueue                 # Stores jobs that are not yet broken down
    │   └── jobs                     # Stores job running statuses
    │       └── xxtooss              # Job name
    │           ├── checkpoints      # Checkpoint records for the Master breaking down a Job into Tasks
    │           │   └── 0
    │           │       └── ED09636A6EA24A292460866AFDD7A89A.cpt
    │           ├── dispatched       # Tasks that have been assigned to Workers but are not yet complete
    │           │   └── 192.168.1.6
    │           ├── failed_tasks     # Failed tasks
    │           │   └── A41506C07BF1DF2A3EDB4CE31756B93F_1499348973217@192.168.1.6
    │           │       ├── audit.log     # Task operational log. You can view the cause of errors in this log.
    │           │       ├── DONE          # Flag for a successful task. Empty if failed.
    │           │       ├── error.list    # List of task errors. You can view the list of error files here.
    │           │       ├── STATUS        # Task status flag file. The content is Failed or Completed, indicating whether the subtask failed or succeeded.
    │           │       └── TASK          # Task description information
    │           ├── pending_tasks    # Unassigned tasks
    │           └── succeed_tasks    # Successfully run tasks
    │               └── A41506C07BF1DF2A3EDB4CE31756B93F_1499668462358@192.168.1.6
    │                   ├── audit.log    # Task operational log. You can view the cause of errors in this log.
    │                   ├── DONE         # Flag for a successful task.
    │                   ├── error.list   # List of task errors. Empty on success.
    │                   ├── STATUS       # Task status flag file. The content is Failed or Completed, indicating whether the subtask failed or succeeded.
    │                   └── TASK         # Task description information
    └── worker  # Status of tasks that are running on the Worker. Managed by the Master after completion.
        └── jobs
            ├── local_test2
            │   └── tasks
            └── local_test_4
                └── tasks

    Important
    • For information about the job execution, see logs/import.log.

    • To find the cause of a task failure, see master/jobs/${JobName}/failed_tasks/${TaskName}/audit.log.

    • To find the files that failed to migrate for a task, see master/jobs/${JobName}/failed_tasks/${TaskName}/error.list.

    • The preceding logs are for troubleshooting purposes only. Do not build dependencies on this content in your services or applications.
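
For manual troubleshooting, the failed_tasks paths above can be walked with a short script. The following is a minimal sketch that follows the directory layout shown in this topic. It assumes that error.list holds one entry per line, and `show_failed_tasks` is a hypothetical helper name. In line with the note above, treat it as an ad hoc aid and not as something to build services on.

```shell
#!/usr/bin/env bash
# Sketch: list the failed tasks of a job and the files that failed to migrate.

show_failed_tasks() {
  # $1 = ossimport working directory, $2 = job name
  failed_dir="$1/master/jobs/$2/failed_tasks"
  if [ ! -d "$failed_dir" ]; then
    echo "no failed_tasks directory for job $2"
    return 0
  fi
  for task in "$failed_dir"/*; do
    [ -d "$task" ] || continue   # skip non-directories and an unmatched glob
    echo "task: $(basename "$task")"
    if [ -s "$task/error.list" ]; then
      # Indent each error.list entry under its task.
      sed 's/^/  failed: /' "$task/error.list"
    fi
  done
}

# Example (placeholder working directory, job name from the tree above):
# show_failed_tasks /home/user/ossimport/workdir xxtooss
```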

Verify migration results

ossimport does not verify migrated files, so it cannot guarantee the correctness and consistency of the migration results. After a migration task is complete, you must verify the data consistency between the source and destination.

If you delete the source data before you verify data consistency between the source and destination, you are responsible for any resulting data loss.
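
One simple consistency check is to compare file lists with sizes from both sides. The following is a minimal sketch and not an ossimport feature. It assumes that the source is a local directory and that you have separately produced a destination manifest in the same "<size> <path>" format (how to list the destination is outside this sketch). `build_manifest` and `diff_manifests` are hypothetical helpers. Matching sizes do not prove that the contents are identical, so a checksum comparison is stronger where available.

```shell
#!/usr/bin/env bash
# Sketch: compare "<size> <path>" manifests of the source and the destination.

build_manifest() {
  # Print "<size-in-bytes> <relative-path>" for every regular file under $1.
  ( cd "$1" && find . -type f | LC_ALL=C sort | while IFS= read -r f; do
      printf '%s %s\n' "$(wc -c < "$f" | tr -d ' ')" "${f#./}"
    done )
}

diff_manifests() {
  # $1 = source manifest file, $2 = destination manifest file.
  # Prints the differing lines; returns 0 only when the manifests match.
  a="$(mktemp)"; b="$(mktemp)"
  LC_ALL=C sort "$1" > "$a"
  LC_ALL=C sort "$2" > "$b"
  diff "$a" "$b"; rc=$?
  rm -f "$a" "$b"
  return "$rc"
}

# Example (placeholder paths):
# build_manifest /data/source > /tmp/source.manifest
# diff_manifests /tmp/source.manifest /tmp/destination.manifest
```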

Common errors and troubleshooting

For more information, see Common errors and troubleshooting.