Connect to DataV


DataV is a visualization product from Alibaba Cloud. You can use its graphical interface to easily build professional visualization applications, which enriches how you present log analysis data. This topic describes how to connect Simple Log Service to DataV to display data on a large screen.

Prerequisites

Background information

Real-time dashboards are widely used in large-scale online promotions and are based on a stream processing architecture. The architecture consists of the following modules:

  • Data ingestion: Collects data from each source in real time.

  • Intermediate storage: Uses Kafka-like queues to decouple the production and consumption systems.

  • Real-time computing: A key part of data processing. It subscribes to real-time data and uses computation rules to process data in a window.

  • Result storage: Stores computation results in SQL and NoSQL databases.

  • Visualization: Calls APIs to retrieve and display the results.
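The real-time computing module processes data "in a window". As a minimal sketch of that idea, assuming fixed-size (tumbling) windows and events given as `(unix_timestamp, payload)` pairs, the following Python counts events per window. The function name and the sample events are hypothetical and are not part of any Alibaba Cloud product.

```python
from collections import Counter


def tumbling_window_counts(events, window_seconds=60):
    """Count events per fixed-size (tumbling) time window.

    `events` is an iterable of (unix_timestamp, payload) pairs.
    Returns a dict that maps each window's start timestamp to its event count.
    """
    counts = Counter()
    for timestamp, _payload in events:
        # Align the timestamp to the start of the window that contains it.
        window_start = timestamp - (timestamp % window_seconds)
        counts[window_start] += 1
    return dict(counts)


# Hypothetical sample events: (unix timestamp in seconds, log line).
sample_events = [
    (1509897600, "GET /"),
    (1509897615, "GET /agenda"),
    (1509897661, "GET /"),
]
window_counts = tumbling_window_counts(sample_events)
```

A production stream processor would also handle late or out-of-order events and would write each window's result to the result storage module. This sketch leaves both out.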

Alibaba Group provides many mature products for these tasks. The typical options are as follows:

[Image: typical product options for each module]

You can connect Simple Log Service directly to DataV by using the query and analysis API to display data on a large screen.

[Image: Simple Log Service connected to DataV]

Features

Two computing methods are available based on data volume, real-time requirements, and business needs.

  • Real-time computing (stream processing): Fixed computations on changing data.

  • Offline computing (data warehouse + offline computing): Changing computations on fixed data.

In log analysis scenarios that require high real-time performance, Simple Log Service provides a mechanism to index data in LogHub in real time. You can directly query and analyze the data using LogSearch/Analytics. LogSearch/Analytics provides the following benefits:

  • Fast: Query billions of data records within one second with up to five conditions. You can analyze and aggregate hundreds of millions of data records within one second with up to five aggregation dimensions and a GroupBy clause. No waiting or pre-calculation of results is required.

  • Real-time: In 99.9% of cases, logs can be displayed on the large screen within one second of being generated.

  • Dynamic: Whether you modify statistical methods or backfill data, the results are refreshed and displayed in real time without waiting for recalculation.

LogSearch/Analytics has the following limits:

  • Data volume: A single computation is limited to 10 billion rows. If you exceed this limit, you must specify a time range.

  • Computing flexibility: Computations are limited to SQL-92 syntax. Custom user-defined functions (UDFs) are not supported.

Add a Simple Log Service (SLS) data source

  1. Access the DataV console.

  2. On the Workbench page, click Data Preparation > Data Source in the navigation pane on the left. On the Data Source page, click New Data Source.

  3. From the Type dropdown, select Simple Log Service (SLS).

  4. Enter the required details for Simple Log Service (SLS).


    Parameter: Description

    • Custom Data Source Name: The display name of the data source. You can choose any name.

    • AppKey: The AccessKey ID of an account that has permission to access the target SLS.

    • AppSecret: The AccessKey Secret of an account that has permission to access the target SLS.

    • EndPoint: The endpoint of the SLS service. Set it based on the network type and region of your SLS service. For more information, see the endpoint document. For example, for a VPC network in the Shanghai region, enter https://cn-shanghai-intranet.log.aliyuncs.com.

  5. After you enter the information, click OK to add the data source.

    The new data source will be automatically listed in the data source directory.
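The EndPoint value depends on the region and the network type. The sketch below builds the value from those two inputs. The `-intranet` form is taken from the Shanghai VPC example above. Applying the same pattern to other regions, and dropping the suffix for public network access, are assumptions that you should check against the endpoint document. `build_sls_endpoint` is a hypothetical helper, not part of DataV or any SLS SDK.

```python
def build_sls_endpoint(region, intranet=False):
    """Build an SLS endpoint URL for a region.

    Assumption: endpoints follow the pattern of the documented example,
    https://cn-shanghai-intranet.log.aliyuncs.com, and the public endpoint
    omits the "-intranet" suffix. Verify against the endpoint document.
    """
    suffix = "-intranet" if intranet else ""
    return f"https://{region}{suffix}.log.aliyuncs.com"


# VPC (internal network) access in the Shanghai region, as in the example above.
vpc_endpoint = build_sls_endpoint("cn-shanghai", intranet=True)
```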

Use a Simple Log Service (SLS) data source

  1. Access the DataV console.

  2. On the Workbench page, hover over the data dashboard you want to edit and click Edit.

    Important

    If you have not created a data dashboard on the Workbench page, see Create a PC Dashboard Using a Template.

  3. On the canvas editing page, select a widget from the canvas.

    If the canvas has no widgets, add one first. For more information, see Widget Canvas Operations.

  4. In the widget configuration panel on the right, select Data Source.


  5. In the Set Data Source section, set Data Source Type to Simple Log Service (SLS).

  6. From the Select An Existing Data Source dropdown, select the SLS data source that you configured.

  7. In the Query field below, enter the query parameters.

    Specify the query parameters as a JSON object. The following example shows the parameters that you can set:

    {
      "projectName": "test",
      "logStoreName": "access-log",
      "topic": "test",
      "from": 1509897600,
      "to": 1509984000,
      "query": "",
      "line": 100,
      "offset": 0
    }

    Note

    For more information about the syntax of the query parameter, see query syntax and features.

  8. Click View Data Return Results to check the returned data.
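The `from` and `to` values in the sample query parameters look like Unix timestamps in seconds. As a sketch under that assumption, the following Python builds the same JSON object for a one-day range. The UTC+8 time zone and the helper name `build_query_params` are illustrative assumptions and do not come from DataV. Only the field names are taken from the sample above.

```python
import json
from datetime import datetime, timedelta, timezone

# Assumption: the day boundary is taken in China Standard Time (UTC+8).
CST = timezone(timedelta(hours=8))


def build_query_params(project, logstore, day_start, query="", topic="",
                       line=100, offset=0):
    """Return query parameters that cover the 24 hours starting at `day_start`.

    `day_start` must be a timezone-aware datetime. `from` and `to` are
    written as Unix timestamps in seconds.
    """
    start = int(day_start.timestamp())
    end = int((day_start + timedelta(days=1)).timestamp())
    return {
        "projectName": project,
        "logStoreName": logstore,
        "topic": topic,
        "from": start,
        "to": end,
        "query": query,
        "line": line,
        "offset": offset,
    }


params = build_query_params(
    "test", "access-log", datetime(2017, 11, 6, tzinfo=CST), topic="test"
)
# Paste the resulting text into the Query field.
query_json = json.dumps(params, indent=2)
```

With these inputs the helper reproduces the `from` and `to` values of the sample, which correspond to midnight on November 6, 2017 (UTC+8) and midnight on the following day.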

Example: Adjust a real-time dashboard for Apsara Conference website access based on different statistical methods

During the Apsara Conference, a temporary requirement arises to calculate the nationwide access volume for the conference website and display it on a real-time dashboard. You have already configured full log collection and enabled the query and analysis feature in Simple Log Service. Therefore, you only need to enter a query and analysis statement. During this process, the requirements are adjusted as follows:

  • Original requirement: On the first day of the Apsara Conference, you need to calculate the number of unique visitors (UVs) for the day.

    Query the data of the forward field in all NGINX access logs. This field records one or more IP addresses of the visiting user, with one forward field per log. Use approx_distinct(forward) to calculate the number of unique IP addresses. To obtain the number of UVs from midnight on the first day of the conference to the current time, use the following statement.

    * | select approx_distinct(forward) as uv

  • First requirement adjustment: On the second day of the Apsara Conference, the requirement is adjusted to calculate the user access data for the yunqi.aliyun.com domain name.

    You can add a host filter condition for a real-time query. Use the following statement.

    host:yunqi.aliyun.com | select approx_distinct(forward) as uv

  • Second requirement adjustment: While collecting the statistics, you find that the forward field in the NGINX access logs can contain multiple IP addresses. Only the first IP address should be counted.

    Use the following statement.

    host:yunqi.aliyun.com | select approx_distinct(split_part(forward,',',1)) as uv

  • Third requirement adjustment: On the third day of the Apsara Conference, a condition is added. You need to exclude visits from users who arrive by clicking ads in UC Browser, and calculate the nationwide access volume from the remaining unique IP addresses.

    In this case, you can add a not filter condition. Use the following statement.

    host:yunqi.aliyun.com not URL:uc-iflow | select approx_distinct(split_part(forward,',',1)) as uv

    Figure 2. Example
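To make the effect of the final statement concrete, the following Python sketch emulates its logic locally on a few hypothetical log records. It is only an illustration. It computes an exact distinct count, whereas approx_distinct returns an estimate. It also treats the `not URL:uc-iflow` filter as a plain substring check, which may differ from how Simple Log Service matches search terms. The field names host, forward, and URL come from the statements above. The sample data and the function name are assumptions.

```python
def count_unique_visitors(logs, host, excluded_url_term=None):
    """Count distinct first IP addresses in `forward` for one host.

    Mirrors the filter `host:<host> not URL:<term>` followed by
    split_part(forward, ',', 1), using an exact set instead of an estimate.
    """
    visitors = set()
    for log in logs:
        if log.get("host") != host:
            continue
        if excluded_url_term and excluded_url_term in log.get("URL", ""):
            continue
        # split_part(forward, ',', 1) keeps the text before the first comma.
        first_ip = log.get("forward", "").split(",")[0].strip()
        if first_ip:
            visitors.add(first_ip)
    return len(visitors)


# Hypothetical NGINX access log records.
sample_logs = [
    {"host": "yunqi.aliyun.com", "forward": "192.0.2.1, 10.0.0.1", "URL": "/index"},
    {"host": "yunqi.aliyun.com", "forward": "192.0.2.1", "URL": "/agenda"},
    {"host": "yunqi.aliyun.com", "forward": "198.51.100.7", "URL": "/index?from=uc-iflow"},
    {"host": "other.example.com", "forward": "203.0.113.5", "URL": "/"},
]

uv_for_host = count_unique_visitors(sample_logs, "yunqi.aliyun.com")
uv_without_uc_ads = count_unique_visitors(
    sample_logs, "yunqi.aliyun.com", excluded_url_term="uc-iflow"
)
```

In this sample, the first two records share the same first IP address, so they count as one visitor. The third record is dropped only when the uc-iflow exclusion is applied.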