Appendix: Typical scenarios of data exploration

This topic describes common issues that occur when data exploration tasks run on different compute engines, along with the settings that resolve them.

The scenarios are detailed below.

| Compute engine | Category | Cause | Solution |
| --- | --- | --- | --- |
| Cloudera Data Platform 7.x | Node error | The optimizer feature of the engine is not functioning properly. | `set hive.optimize.shared.work = false;` |
| AsiaInfo DP5.3 | Memory overflow | Memory overflows on both the Map and Reduce sides. | `set mapreduce.map.memory.mb=10150;`<br>`set mapreduce.map.java.opts=-Xmx6144m;`<br>`set mapreduce.reduce.memory.mb=10150;`<br>`set mapreduce.reduce.java.opts=-Xmx8120m;` |
| E-MapReduce 3.x, E-MapReduce 5.x, CDH 5.x, CDH 6.x, FusionInsight 8.x, Cloudera Data Platform 7.x, AsiaInfo DP5.3 | Slow execution speed | The number of concurrent job executions is set too low. | `set hive.exec.parallel=true;`<br>`set hive.exec.parallel.thread.number=16;` |
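
As a rough illustration of the last scenario, the sketch below places the two parallelism settings ahead of a query in the same Hive session. It assumes the task's statements are submitted to Hive as one session, so that `set` statements apply to the query that follows them. The tables `demo_orders` and `demo_refunds` are hypothetical and stand in for whatever tables the exploration task reads. Any speedup depends on the execution engine and on whether the query actually contains independent stages.

```sql
-- Session-level settings: they affect only statements run later in this session.
set hive.exec.parallel=true;
set hive.exec.parallel.thread.number=16;

-- Hypothetical tables, used only for illustration.
-- Each branch of the UNION ALL is an independent aggregation,
-- so Hive may schedule the two branches at the same time when parallel execution is on.
SELECT 'orders' AS src, COUNT(*) AS row_cnt FROM demo_orders
UNION ALL
SELECT 'refunds' AS src, COUNT(*) AS row_cnt FROM demo_refunds;
```

The same pattern applies to the other scenarios: the `set` statements go first, and the statement that failed or ran slowly follows them in the same session. For the memory settings, the `-Xmx` heap size is kept below the corresponding `memory.mb` container size, as in the values listed above, so that the JVM leaves room for non-heap memory inside the container.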