Installation and deployment (MongoDB-compatible syntax)

If your application is built on the MongoDB protocol but you want to run it on a relational database that has atomicity, consistency, isolation, and durability (ACID) properties and supports complex queries, use PolarDB for PostgreSQL (PolarFlex). This topic describes how to deploy a PolarDB for PostgreSQL (PolarFlex) cluster that is compatible with MongoDB syntax. This lets you connect to and use the database with MongoDB clients and drivers without changing your application code.

Feature overview

PolarDB for PostgreSQL (PolarFlex) supports the MongoDB protocol and syntax through a built-in compatibility layer. When you enable MongoDB compatibility mode, the database instance listens on two ports simultaneously:

  • PostgreSQL protocol port (default: 1523): Accepts and processes standard SQL requests.

  • MongoDB protocol port (default: 27030): Accepts and processes requests from MongoDB clients.
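
Because both listeners belong to the same instance, clients reach them at the same host address and differ only in the port. The following is a minimal shell sketch that assumes the default ports listed above. The endpoints function, the PG_PORT and MONGO_PORT variables, and the example host name are illustrative and are not part of the product.

```shell
#!/bin/sh
# Default ports from this topic; override them if your deployment differs.
PG_PORT="${PG_PORT:-1523}"
MONGO_PORT="${MONGO_PORT:-27030}"

# endpoints HOST: prints one URI per protocol for the given host.
# (Hypothetical helper; credentials and database names are omitted.)
endpoints() {
    printf 'postgresql://%s:%s\n' "$1" "$PG_PORT"
    printf 'mongodb://%s:%s\n' "$1" "$MONGO_PORT"
}

# Example host name is a placeholder.
endpoints db.example.internal
```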

Prerequisites

  1. Obtain the required software package for deployment. The PolarDB for PostgreSQL (PolarFlex) database engine includes MongoDB syntax compatibility.

  2. The deployment environment must meet the following hardware and software requirements:

    Hardware

      • Server architecture: Only the x86 architecture is supported.

      • CPU: ≥ 2 cores

      • Memory: ≥ 8 GB

      • Disk:

        • System disk: The root partition (/) must have at least 10 GB of free space.

        • Data disk: At least one local data disk with a capacity of 256 GB or more.

      • Network: 1 GE or higher.

    Software

      • Operating system: CentOS 7.2 or later, or other Red Hat-compatible operating systems such as Kylin, UOS, or Anolis OS.

        Note: Set the character set to LANG=en_US.UTF-8.

      • File system: The file system for the local disk is ext4.

      • glibc version: ≥ 2.15

        Note: Run the following command to check the glibc version. Check whether the output includes GLIBC_2.15 or a later version.

        strings /lib64/libc.so.6 | grep "^GLIBC_2."

      • Python: Python 3 is installed, and the python command is a symbolic link to it.

        Note: Run the following commands to install Python 3 and create the symbolic link:

        sudo yum install -y python3
        sudo ln -sf $(which python3) /usr/bin/python
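
Some of the requirements above can be checked with a short script before you start. The following is a sketch only: version_ge and preflight are hypothetical helper names, the script relies on getconf, nproc, and GNU sort -V being available, and it covers only the glibc and CPU requirements.

```shell
#!/bin/sh
# version_ge A B: succeeds when dotted version A is >= B (uses GNU sort -V).
version_ge() {
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# preflight: prints an OK or FAIL line for the glibc and CPU requirements.
preflight() {
    # getconf prints a line such as "glibc 2.17" on glibc-based systems.
    glibc="$(getconf GNU_LIBC_VERSION 2>/dev/null | awk '{print $2}')"
    cores="$(nproc 2>/dev/null || echo 0)"
    if [ -n "$glibc" ] && version_ge "$glibc" 2.15; then
        echo "OK   glibc $glibc"
    else
        echo "FAIL glibc ${glibc:-unknown} (need >= 2.15)"
    fi
    if [ "$cores" -ge 2 ]; then
        echo "OK   $cores CPU cores"
    else
        echo "FAIL $cores CPU cores (need >= 2)"
    fi
}
```

Call preflight on each target host. Memory, disk, and file system checks are left out because acceptable thresholds depend on how your system reports them.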

Billing

PolarDB for PostgreSQL (PolarFlex) is a commercial product. Deployment and use require a purchased license. Contact us to obtain a 30-day free trial version for a single node. After the trial period ends, the database is automatically throttled.

Notes

Before you start the deployment, read the following limits and recommendations:

  • Version compatibility: The MongoDB-compatible syntax feature currently supports only PostgreSQL 16. Make sure that you use the dedicated software package for that version.

  • User permissions: Run all installation operations as the root user.

  • Configurations that cannot be modified after deployment: The following two parameters cannot be changed after the cluster is created. To change them, you must uninstall and then redeploy the cluster.

    • compatibility_mode: Compatibility mode. To enable the MongoDB compatibility feature, set this parameter to mongodb in the configuration file.

    • polardb_data_root_dir: The database file directory. The default path, /var/lib/thirdDB, is on a partition that might have insufficient space. Set this parameter to a path on a data disk with sufficient capacity based on your data volume planning.

  • Configuration file format: The configuration file must be in strict YAML format. Do not use tab characters.

  • Password security: When you use the one-click deployment script, enclose passwords that contain special characters, such as !, in single quotation marks (' ') so that the shell passes them to the script unchanged. Otherwise, the script might fail.
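
Single quotation marks protect every character except a single quotation mark itself. The following sketch shows one common way to quote an arbitrary password for the shell. shell_quote is a hypothetical helper and is not part of the deployment package.

```shell
#!/bin/sh
# shell_quote STRING: prints STRING wrapped in single quotation marks.
# An embedded single quotation mark is rewritten as '\'' (close the quote,
# add an escaped quote, reopen the quote).
shell_quote() {
    printf "'%s'\n" "$(printf '%s' "$1" | sed "s/'/'\\\\''/g")"
}

# Inside single quotation marks, $$ and ! are passed through literally.
shell_quote 'Pa$$w0rd!'
```

In double quotation marks, the same password would be altered, because $$ expands to the process ID of the shell.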

Procedure

Follow these steps to deploy and verify the cluster.

Step 1: Install the cluster management tool (pdbcli)

  1. Obtain the package named polarflex-mongodb-${version}-${build-date}.tar.gz and upload it to the target host.

  2. Log on to the host as the root user and run the following commands to decompress and install the management tool. The following example uses version 2.3.2.4:

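    # Set the version and the build date to match the package file name
    version=2.3.2.4
    build_date="<build-date>"   # Replace with the build date in your package file name
    # Create a working directory
    mkdir -p polarflex-mongodb-${version}
    # Decompress the package to the working directory
    tar -C polarflex-mongodb-${version}/ -xf polarflex-mongodb-${version}-${build_date}.tar.gz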
    # Go to the working directory and run the installation script
    cd polarflex-mongodb-${version}/
    ./scripts/install.sh
  3. After the installation is complete, run the pdbcli version command to verify the installation. If the command returns a version number, the pdbcli tool is installed successfully.

Step 2: Prepare and check the configuration file

The configuration file defines key information for the cluster, such as its topology, ports, and database parameters. The default file is config.yaml, which is located in the polarflex-mongodb-${version}/ working directory created in Step 1. You can select and modify a template file based on your deployment architecture.

  1. Select a configuration template

    To simplify the configuration of config.yaml, templates for different scenarios are provided in the same directory.

    • config_template.yaml: A template for a cluster with one primary node and two standby nodes. This architecture ensures high availability (HA) with a recovery time objective (RTO) of less than 60 seconds and a recovery point objective (RPO) of 0.

    • config_master_slave.yaml: A template for a cluster with one primary node and one standby node. This architecture ensures HA with an RTO of less than 60 seconds, but it cannot guarantee an RPO of 0.

    • config_single_node.yaml: A template for a single-node cluster. This architecture does not ensure high availability or reliability and is not recommended for production environments.

  2. Modify key configurations

    Open your selected template file. Modify the following two parameters to enable the MongoDB compatibility feature and plan for data storage:

    • Compatibility mode

      # ... Other configurations ...
      compatibility_mode: mongodb # Enable MongoDB-compatible syntax mode
      # ... Other configurations ...
    • File directory

      # ... Other configurations ...
      polardb_data_root_dir: /path/to/your/data_directory # Change this to a path on a high-capacity data disk
      # ... Other configurations ...

    You also need to enter the IP address (ansible_host) for each host according to your environment. The following example shows the configuration for a cluster with one primary node and two standby nodes:

    # Modify the content of the configuration file based on your deployment environment. After you finish, you can run pdbcli validate to check for configuration issues.
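    # The configuration file must be in YAML format: write settings as "key: value", not "key=value", and do not use tab characters.
    # The only exception is the list under polardb_custom_params, whose entries are database parameters in "name = value" form.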
    
    # The following configuration file example shows how to build a three-host cluster. The host IP addresses are 10.XX.XX.1,
    # 10.XX.XX.2, and 10.XX.XX.3. Configure the file as needed.
    all:
      children:
        cm:
          hosts:
            host01: null
            host02: null
            host03: null
          var: null
        db:
          hosts:
            host01:
              polardb_polar_hostid: 1
            host02:
              polardb_polar_hostid: 2
              polardb_node_type: standby
            host03:
              polardb_polar_hostid: 3
              polardb_node_type: standby
          vars:
            polardb_custom_params:
            - max_standby_streaming_delay = 900000
            - max_connections = 3300
            - polar_max_super_conns = 1500
            - max_slot_wal_keep_size = 64000
            - log_statement = 'ddl'
            polardb_service_restart_sec: 5
            hugepage_enabled: off
        proxy:
          hosts:
            host01: null
            host02: null
          var: null
      hosts:
        host01:
          ansible_host: 10.XX.XX.1 # HOST01_IP [Required]
        host02:
          ansible_host: 10.XX.XX.2 # HOST02_IP [Required]
        host03:
          ansible_host: 10.XX.XX.3 # HOST03_IP [Required]
      vars:
        ansible_group_priority: 99
        ansible_python_interpreter: /usr/bin/python
        cluster_id: polardb1
        cm_consensus_port: 7001
        cm_service_port: 5001
        cm_tls_service_port: 6001
        cm_db_sync_mode: SYNC
        polardb_data_root_dir: /var/lib/thirdDB
        license_dir: license
        polardb_enable_direct_io: false
        polardb_multi_instance_per_host: true
        polardb_polar_enable_pfs_mode: false
        polardb_port: 1523
        polardb_proxy_port: 12369
        polardb_proxy_port_rwlb: 12370
        polardb_proxy_admin_port: 12371
        polardb_storage_mode: local_filesystem_mode
        polardb_user: polar1
        primary_db_host: host01
        ue_node_driver_service_port: 12355
        password_encrypt: true
        lc_ctype: en_US.UTF8
        compatibility_mode: mongodb # Enable MongoDB-compatible syntax mode
        # Each parameter can be passed in through environment variable. Here is an example.
  3. Validate the configuration

    After you modify the file, run the pdbcli validate command to check the configuration file for syntax errors.
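
Before you run pdbcli validate, you can catch two common mistakes with standard tools. The following is a sketch under stated assumptions: lint_config is a hypothetical helper, and it checks only for tab characters and for the compatibility_mode setting described in this step. It does not replace pdbcli validate.

```shell
#!/bin/sh
TAB="$(printf '\t')"

# lint_config FILE: fails when FILE contains a tab character or does not set
# compatibility_mode to mongodb. Problems are reported on stderr.
lint_config() {
    if grep -n "$TAB" "$1" >&2; then
        echo "error: tab characters found in $1 (lines listed above)" >&2
        return 1
    fi
    if ! grep -E -q '^[[:space:]]*compatibility_mode:[[:space:]]*mongodb([[:space:]]|#|$)' "$1"; then
        echo "error: compatibility_mode is not set to mongodb in $1" >&2
        return 1
    fi
    echo "basic checks passed: $1"
}
```

For example, run lint_config config.yaml in the working directory, and then run pdbcli validate.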

Step 3: Pre-configure the operating system

Before you create the cluster, ensure that Transparent Huge Pages (THP) are disabled on the operating system.

  1. Run the cat /sys/kernel/mm/transparent_hugepage/enabled command to check the status.

  2. The active setting is the value in square brackets. If the output is not always madvise [never], run the following command as the root user to disable THP:

    echo never > /sys/kernel/mm/transparent_hugepage/enabled
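
The check in this step can be scripted. The following sketch extracts the active THP mode from the status string. thp_mode is a hypothetical helper name.

```shell
#!/bin/sh
# thp_mode STATUS: prints the value enclosed in square brackets,
# which is the mode currently in effect (for example, "never").
thp_mode() {
    printf '%s\n' "$1" | sed -n 's/.*\[\([a-z]*\)\].*/\1/p'
}

thp_mode 'always madvise [never]'    # prints: never
```

On a host, pass the file content instead, for example thp_mode "$(cat /sys/kernel/mm/transparent_hugepage/enabled)". A value written to /sys with echo generally does not persist across reboots, so recheck the setting after the host restarts.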

Step 4: One-click installation and cluster creation

The polarflex-deploy.sh script is provided in the working directory to simplify cluster deployment. This script automates steps such as setting up password-free Secure Shell (SSH) access, generating the config.yaml file, and installing and creating the cluster.

Run the command that corresponds to your planned cluster architecture. Replace 10.XX.XX.X and {{password}} with the actual host IP address and the root password.

  • Deploy a cluster with one primary node and two standby nodes

    bash polarflex-deploy.sh -m "10.XX.XX.1" -p '{{password}}' -m "10.XX.XX.2" -p '{{password}}' -m "10.XX.XX.3" -p '{{password}}'
  • Deploy a cluster with one primary node and one standby node

    bash polarflex-deploy.sh -m "10.XX.XX.1" -p '{{password}}' -m "10.XX.XX.2" -p '{{password}}'
  • Deploy a single-node cluster

    bash polarflex-deploy.sh -m "10.XX.XX.1" -p '{{password}}'
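
The three commands above follow one pattern: a -m and -p pair for each host. The following sketch prints the command for any list of hosts so that you can review it before you run it. deploy_command is a hypothetical helper, and it assumes that all hosts share the same root password and that the password contains no single quotation mark.

```shell
#!/bin/sh
# deploy_command PASSWORD HOST...: prints (does not run) the deployment
# command with one -m/-p pair per host, in the order given.
deploy_command() {
    pw="$1"
    shift
    cmd="bash polarflex-deploy.sh"
    for host in "$@"; do
        cmd="$cmd -m \"$host\" -p '$pw'"
    done
    printf '%s\n' "$cmd"
}

# Placeholders as used in this topic; replace them with real values.
deploy_command '{{password}}' 10.XX.XX.1 10.XX.XX.2
```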

Step 5: Verify the deployment

After the deployment is complete, perform the following checks to confirm that the cluster status is Normal.

  1. Check the cluster status

    Run the pdbcli status command. If the status of components such as cluster_manager, master, standby, and proxy is RUNNING, the cluster was deployed successfully.

    Using config file: ./config.yaml
    Cluster Status:
    {
    	"phase": "RunningPhase",
    	"cluster_manager": [
    		{
    			"endpoint": "172.xxx.xxx.xxx:5001",
    			"phase": "RUNNING"
    		},
    		{
    			"endpoint": "172.xxx.xxx.xxx:5001",
    			"phase": "RUNNING"
    		},
    		{
    			"endpoint": "172.xxx.xxx.xxx:5001",
    			"phase": "RUNNING"
    		}
    	],
    	"master": {
    		"endpoint": "172.xxx.xxx.xxx:1523",
    		"cust_id": "0",
    		"work_path": "/var/lib/thirdDB/clusters/polardb1",
    		"phase": "RUNNING",
    		"start_at": "2025-03-13 19:41:50"
    	},
    	"standby": [
    		{
    			"endpoint": "172.xxx.xxx.xxx:1523",
    			"cust_id": "0",
    			"work_path": "/var/lib/thirdDB/clusters/polardb1",
    			"phase": "RUNNING",
    			"start_at": "2025-03-13 19:42:02",
    			"sync_status": "SYNC"
    		},
    		{
    			"endpoint": "172.xxx.xxx.xxx:1523",
    			"cust_id": "0",
    			"work_path": "/var/lib/thirdDB/clusters/polardb1",
    			"phase": "RUNNING",
    			"start_at": "2025-03-13 19:42:02",
    			"sync_status": "SYNC"
    		}
    	],
    	"proxy": [
    		{
    			"endpoint": "172.xxx.xxx.xxx:12369",
    			"phase": "RUNNING"
    		},
    		{
    			"endpoint": "172.xxx.xxx.xxx:12369",
    			"phase": "RUNNING"
    		}
    	],
    	"plugins": [
    		{
    			"name": "golang-manager",
    			"status": "Plugin run err topo is null errCount 2"
    		}
    	],
    	"disk": {
    		"state": "UNLOCK",
    		"quota": "UNSET",
    		"usage": ""
    	}
    }
  2. Check PostgreSQL protocol connectivity

    By default, the cluster creates the user admin (password: postgres) and the database admin_db. Run the following command to test connectivity on the PostgreSQL port:

    PGPASSWORD=postgres /u01/polardb_pg/bin/psql -h localhost -p1523 -U admin -d admin_db -c 'show polardb_version'

    If the PolarDB version number is returned, such as PolarDB V2.0.16.8.0, the database is installed correctly and you are using the correct database kernel version.

  3. Check MongoDB protocol connectivity

    Use a MongoDB client, such as mongosh, to connect to the MongoDB port (default: 27030) and verify that the compatibility feature is working correctly.

    1. Install the mongosh client (example for CentOS)

      sudo vim /etc/yum.repos.d/mongodb-org-6.repo
      
      # Paste the following content into the file
      [mongodb-org-6.0]
      
      name=MongoDB Repository
      baseurl=https://repo.mongodb.org/yum/redhat/8/mongodb-org/6.0/x86_64/
      gpgcheck=1
      enabled=1
      gpgkey=https://pgp.mongodb.com/server-6.0.asc
      
      # Exit vim and run the following command to install mongodb-org
      sudo yum install -y mongodb-org
    2. Connect and test

      In the command, replace {{HOST01_IP}} with the actual IP address of the primary node and {{polar_mongodb_port}} with the MongoDB listener port (default: 27030).

      mongosh "mongodb://test_user:superuser_20250530@{{HOST01_IP}}:{{polar_mongodb_port}}"

      After a successful connection, the prompt changes to admin_db>. You can insert data to verify the connection:

      db.users.insertMany([
        { name: "Alice", age: 25, email: "alice@example.com" },
        { name: "Bob", age: 30, email: "bob@example.com" },
        { name: "Charlie", age: 22, email: "charlie@example.com" }
      ])

      If the data is inserted successfully, the MongoDB compatibility feature is working correctly.
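
The status check in substep 1 can also be automated. The following sketch reads the output of pdbcli status and succeeds only if every "phase" field is RUNNING or RunningPhase. cluster_healthy is a hypothetical helper, and it assumes that the output has the same shape as the sample shown in substep 1.

```shell
#!/bin/sh
# cluster_healthy: reads status output on stdin. Succeeds when at least one
# "phase" line exists and none has a value other than RUNNING or RunningPhase.
cluster_healthy() {
    phases="$(grep '"phase":' || true)"
    [ -n "$phases" ] || return 1
    bad="$(printf '%s\n' "$phases" | grep -v -e '"RUNNING"' -e '"RunningPhase"' || true)"
    [ -z "$bad" ]
}
```

For example, run pdbcli status | cluster_healthy && echo "cluster is healthy". Empty input is treated as a failure so that an error from pdbcli is not mistaken for a healthy cluster.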

Troubleshooting and O&M

During daily use and troubleshooting, you might need to check service processes or view log files.

Check service processes

After the database starts, you can check the process list of the operating system to confirm the service status. The process ID (PID) of the main database process is recorded in the postmaster.pid file in the data directory.

  • PolarDB for PostgreSQL (PolarFlex) core processes

    • postgres: logger: The log printing process.

    • postgres: checkpointer: The periodic checkpoint process.

    • postgres: background writer: The periodic dirty page flushing process.

    • postgres: walwriter: The Write-Ahead Logging (WAL) log flushing process.

    • postgres: autovacuum launcher: The automatic cleanup scheduling process.

    • postgres: stats collector: The statistics information collection process.

  • MongoDB compatibility component process

    • python2 /u01/polar_documentdb_daemon/polar_documentdb_daemon.py: The daemon process for the MongoDB compatibility component. It is responsible for health checks and automatic restarts.
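
The following sketch uses the postmaster.pid file to check whether the main database process is alive. The helper names are hypothetical. The sketch assumes the standard PostgreSQL layout, in which the first line of postmaster.pid is the PID, and it expects the data directory of your node as its argument.

```shell
#!/bin/sh
# postmaster_pid DATADIR: prints the PID stored on the first line of
# DATADIR/postmaster.pid.
postmaster_pid() {
    head -n 1 "$1/postmaster.pid"
}

# pid_alive PID: succeeds when a process with that PID exists and the
# current user is allowed to signal it.
pid_alive() {
    kill -0 "$1" 2>/dev/null
}

# check_db DATADIR: reports whether the main database process is running.
check_db() {
    pid="$(postmaster_pid "$1" 2>/dev/null)" || pid=""
    if [ -n "$pid" ] && pid_alive "$pid"; then
        echo "postmaster running (PID $pid)"
    else
        echo "postmaster not running"
        return 1
    fi
}
```

To list the individual processes named above, you can also run ps -ef | grep 'postgres:' on the host.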

View log files

By default, all log files are saved in the log subdirectory of the data directory. If your cluster ID is polardb1 and you have not modified the data directory, the path is:

cd /var/lib/thirdDB/clusters/polardb1/log/
ls

The log file related to the MongoDB compatibility feature is:

polar_documentdb_daemon.log: The log for the MongoDB compatibility component daemon process.
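
When you troubleshoot the compatibility component, it can help to filter the daemon log for likely problems. The following is a sketch only: recent_problems is a hypothetical helper, and the keywords are a guess at what the daemon writes, so adjust them to the actual log format.

```shell
#!/bin/sh
# recent_problems LOGFILE [COUNT]: prints the last COUNT (default 20) lines
# that mention an error, a failure, or a Python traceback (case-insensitive).
recent_problems() {
    grep -i -E 'error|fail|traceback' "$1" | tail -n "${2:-20}"
}
```

For example, run recent_problems /var/lib/thirdDB/clusters/polardb1/log/polar_documentdb_daemon.log if your cluster uses the default path shown above.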

FAQ

If the installation fails, how do I clean up the environment and reinstall?

If the deployment is interrupted or fails, run the cleanup command first. Then, restart the process from Step 4: One-click installation and cluster creation. Do not rerun the deployment script without first cleaning up the environment.

pdbcli delete cluster && pdbcli uninstall cluster