This topic describes the system events of E-MapReduce.

Event type Event name Event description Event status Event level
Flow EMR-110401002 The workflow succeeded Normal Info
Flow EMR-110401003 The workflow was submitted Normal Info
Flow EMR-110401004 The job was submitted Normal Info
Flow EMR-110401005 The workflow node started Normal Info
Flow EMR-110401006 The workflow node status was checked Normal Info
Flow EMR-110401007 The workflow node completed Normal Info
Flow EMR-110401008 The workflow node ended Normal Info
Flow EMR-110401009 The workflow node was canceled Normal Info
Flow EMR-110401010 The workflow was canceled Normal Info
Flow EMR-110401011 The workflow was rerun Normal Info
Flow EMR-110401012 The workflow was resumed Normal Info
Flow EMR-110401013 The workflow was paused Normal Info
Flow EMR-110401014 The workflow ended Normal Info
Flow EMR-110401015 The workflow node failed Normal Info
Flow EMR-110401016 The job failed Normal Info
Flow EMR-210401001 The workflow failed Normal Info
Flow EMR-210401003 The workflow node timed out while starting Normal Info
Flow EMR-210401004 The job timed out while starting Normal Info
Scaling Scaling:ScalingActivity:Failed The scaling activity failed Normal Critical
Scaling Scaling:ScalingActivity:Timeout The scaling activity timed out Normal Critical
Scaling Scaling:ScalingActivity:Rejected Auto scaling was rejected Normal Critical
StatusNotification StatusCheck Service component status Failed Warn
Exception Agent:EcmAgentHeartbeatExpired The EcmAgent heartbeat message expired Critical Critical
Maintenance Maintenance:HDFS.DataNode.DataTransferPortUnAvailable The data transfer port of the DataNode is unavailable Critical Critical
Maintenance Maintenance:HDFS.DataNode.ExitUnexpected The DataNode process exited unexpectedly Critical Critical
Maintenance Maintenance:HDFS.DataNode.IpcPortUnAvailable The IPC port of the DataNode is unavailable Critical Critical
Maintenance Maintenance:HDFS.DataNode.OOM.UnableToCreateNewNativeThread An OOM error occurred on the DataNode: unable to create a new native thread Critical Critical
Maintenance Maintenance:HDFS.NameNode.IpcPortUnAvailable The IPC port of the NameNode is unavailable Critical Critical
Maintenance Maintenance:HDFS.NameNode.LoadFsImageException An exception occurred while the NameNode was loading the FsImage Critical Critical
Maintenance Maintenance:HDFS.NameNode.LowAvailableDiskSpaceAndInSafeMode Insufficient disk space put the NameNode in safe mode Critical Critical
Maintenance Maintenance:HDFS.NameNode.OOM An OOM error occurred on the NameNode Critical Critical
Maintenance Maintenance:HDFS.NameNode.ResourceLow The NameNode is low on resources Critical Critical
Maintenance Maintenance:HDFS.NameNode.SyncJournalFailed The NameNode failed to synchronize the journal Critical Critical
Maintenance Maintenance:HDFS.ZKFC.ActiveStandbySwitchOccured ZKFC triggered an active/standby switchover of the NameNode Critical Critical
Maintenance Maintenance:HDFS.ZKFC.TransportLevelExceptionInMonitorHealth A transport-level exception occurred while ZKFC was monitoring the health of the NameNode Critical Critical
Maintenance Maintenance:HDFS.ZKFC.UnableToConnectToQuorum ZKFC cannot connect to the ZooKeeper quorum Critical Critical
Maintenance Maintenance:HDFS.ZKFC.UnableToStartZKFC ZKFC cannot start Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.DataBaseCommunicationLinkFailure The communication link to the HiveMetaStore database failed Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.DataBaseConnectionFailed The connection to the HiveMetaStore database failed Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.DataBaseDiskQuotaUsedup The disk space of the HiveMetaStore database is used up Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.JdbcCommunicationException A JDBC communication exception occurred on the HiveMetaStore Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.MaxQuestionsExceeded The HiveMetaStore exceeded the maximum number of queries Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.MaxUpdatesExceeded The HiveMetaStore exceeded the maximum number of updates Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.MaxUserConnectionExceeded The HiveMetaStore exceeded the maximum number of user connections Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.OomOccured An OOM error occurred on the HiveMetaStore Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.ParseConfError The HiveMetaStore configuration file could not be parsed Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.RequiredTableMissing A table required by the HiveMetaStore is missing Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.hiveServer2PortUnAvailable HIVE.HiveMetaStore.hiveServer2Port is unavailable Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.ConnectToZkTimeout HiveServer2 timed out while connecting to ZooKeeper Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.ErrorParseConf HiveServer2 could not parse its configuration Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.ErrorStartingHiveServer An error occurred while starting HiveServer2 Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.FailedInitMetaStoreClient HiveServer2 failed to initialize the MetaStore client Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.FailedToConnectToMetaStoreServer HiveServer2 failed to connect to the MetaStore server Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.HiveServer2OOM An OOM error occurred on HiveServer2 Critical Critical
Maintenance Maintenance:HOST.OomFoundInVarLogMessage An OOM error was found in /var/log/message on the host Critical Critical
Maintenance Maintenance:SPARK.SparkHistory.OomOccured An OOM error occurred on SparkHistory Critical Critical
Maintenance Maintenance:YARN.JobHistory.ExitUnExpectedly The JobHistory service exited unexpectedly Critical Critical
Maintenance Maintenance:YARN.JobHistory.StartingError An error occurred while starting the JobHistory service Critical Critical
Maintenance Maintenance:YARN.NodeManager.DeadNodeDetected A dead NodeManager node was detected Critical Critical
Maintenance Maintenance:YARN.NodeManager.OOM An OOM error occurred on NodeManager Critical Critical
Maintenance Maintenance:YARN.NodeManager.StartingError An error occurred while starting NodeManager Critical Critical
Maintenance Maintenance:YARN.NodeManager.UnHealthyForDiskFailed A disk failure made the NodeManager unhealthy Critical Critical
Maintenance Maintenance:YARN.ResourceManager.ErrorInStarting An error occurred while starting ResourceManager Critical Critical
Maintenance Maintenance:YARN.ResourceManager.ErrorInTransitionToActiveMode An error occurred while ResourceManager was transitioning to Active mode Critical Critical
Maintenance Maintenance:YARN.ResourceManager.ExitUnexpected ResourceManager exited unexpectedly Critical Critical
Maintenance Maintenance:YARN.ResourceManager.OOM An OOM error occurred on ResourceManager Critical Critical
Maintenance Maintenance:YARN.ResourceManager.UnkownHostException An UnknownHostException occurred on ResourceManager Critical Critical
Maintenance Maintenance:YARN.ResourceManager.ZKRMStateStoreCannotConnectZK ZKRMStateStore in the YARN service cannot connect to ZooKeeper Critical Critical
Maintenance Maintenance:YARN.TimelineServer.ErrorInStarting An error occurred while starting TimelineServer Critical Critical
Maintenance Maintenance:YARN.TimelineServer.ExistUnexpectedly TimelineServer exited unexpectedly Critical Critical
Maintenance Maintenance:ZOOKEEPER.UnableToRunQuorumServer ZooKeeper cannot run QuorumServer Critical Critical
Maintenance Agent:Maintenance.EcmAgentTimeout EcmAgent has been disconnected for a long time Critical Critical
Maintenance Maintenance:ZOOKEEPER.LeaderPortUnAvailable The LeaderPort of ZooKeeper is unavailable Critical Critical
Maintenance Maintenance:ZOOKEEPER.ClientPortUnAvailable The ClientPort of ZooKeeper is unavailable Critical Critical
Maintenance Maintenance:YARN.NodeManager.UnHealthyNodesExist Unhealthy nodes exist in YARN Critical Critical
Maintenance Maintenance:YARN.ResourceManager.ActiveStandbySwitch An active/standby switchover occurred on ResourceManager Critical Critical
Maintenance Maintenance:YARN.ResourceManager.PortUnAvailable The service port of ResourceManager is unavailable Critical Critical
Maintenance Maintenance:STORM.Nimbus.ThriftPortUnAvailable STORM.Nimbus.ThriftPort is unavailable Critical Critical
Maintenance Maintenance:HDFS.DataNode.FailueVolumes The DataNode has failed disks Critical Critical
Maintenance Maintenance:HDFS.JournalNode.RpcPortUnAvailable The RPC port of the JournalNode is unavailable Critical Critical
Maintenance Maintenance:HDFS.NameNode.ActiveStandbySwitch An active/standby switchover occurred on the NameNode Critical Critical
Maintenance Maintenance:HDFS.NameNode.BlockCapacityNearUsedUp The block capacity of the NameNode is nearly used up Critical Critical
Maintenance Maintenance:HBASE.HMaster.IpcPortUnAvailable The IPC port of HBASE.HMaster is unavailable Critical Critical
Maintenance Maintenance:HBASE.HRegionServer.IpcPortUnAvailable The IPC port of HBASE.HRegionServer is unavailable Critical Critical
Maintenance Maintenance:HBASE.ThriftServer.ServicePortUnAvailable The service port of HBASE.ThriftServer is unavailable Critical Critical
Maintenance Maintenance:HUE.OozieAdminPortUnAvailable The admin port of Oozie is unavailable Critical Critical
Maintenance Maintenance:HUE.PortUnAvailable The service port of HUE is unavailable Critical Critical
Maintenance Maintenance:HDFS.NameNode.InSafeMode The NameNode has been in safe mode for too long Critical Critical
Maintenance Maintenance:HDFS.NameNode.MissingBlock HDFS has missing blocks Critical Critical
Maintenance Maintenance:HIVE.HiveMetaStore.PortUnAvailable The port of HIVE.HiveMetaStore is unavailable Critical Critical
Maintenance Maintenance:HOST.HighMemoryUsage Memory usage is too high Critical Critical
Maintenance Maintenance:HOST.LowDiskForMntDisk The available space on /mnt/disk1 is too low Critical Critical
Maintenance Maintenance:HOST.LowRootfsDisk The available space on the disk that hosts the root file system is too low Critical Critical
Maintenance Maintenance:HIVE.HiveServer2.CannotConnectByAnyURIsProvided Cannot connect to HiveServer2 through any of the provided URIs Critical Critical
Maintenance Maintenance:HDFS.DataNode.TooManyDataNodeDead HDFS has too many dead DataNodes Critical Critical
Maintenance Maintenance:HDFS.DataNode.ExceptionInSecureMain An exception occurred in SecureMain of the DataNode Critical Critical
Maintenance Maintenance:HDFS.DataNode.OomForJavaHeapSpace An OOM error was caused by Java heap space Critical Critical
Maintenance Maintenance:HDFS.DataNode.DeadDataNodesExist Dead DataNodes exist in HDFS Critical Critical
Maintenance Maintenance:HDFS.NameNode.TooMuchDfsCapacityUsed Too much HDFS storage capacity is used Critical Critical
Maintenance Maintenance:HDFS.NameNode.TooMuchBlockCapacityUsed Too much HDFS block capacity is used Critical Critical
Maintenance Maintenance:HDFS.NameNode.RpcPortCallQueueLengthTooLong The RPC call queue of the NameNode is too long Critical Critical
Maintenance Maintenance:HDFS.NameNode.WriteToJournalNodeTimeout HDFS timed out while writing to the JournalNode Critical Critical
Maintenance Maintenance:HDFS.NameNode.BothActive Both NameNodes are in the Active state Critical Critical
Maintenance Maintenance:HDFS.NameNode.BothStandy Both NameNodes are in the Standby state Critical Critical
Maintenance Maintenance:HDFS.NameNode.ExitUnexpectely The NameNode exited unexpectedly Critical Critical
Maintenance Maintenance:HDFS.NameNode.DirectoryFormatted An HDFS directory was formatted Critical Critical
Maintenance Maintenance:HDFS.DataNode.VolumeFailuresExist Failed disks exist in HDFS Critical Critical
Maintenance Maintenance:HDFS.NameNode.CorruptBlocksOccured Corrupt blocks exist in HDFS Critical Critical
Maintenance Maintenance:YARN.JobHistory.PortUnAvailable The service port of JobHistory is unavailable Critical Critical
Maintenance Maintenance:HOST.LowAbsoluteFreeMemory The absolute amount of free memory on the host is too low Critical Critical
Maintenance Maintenance:HOST.TooManyProcessesOnMasterHost Too many processes are running on the master node Critical Critical
Maintenance Maintenance:HOST.VmHostShutDown The host shut down Critical Critical
Maintenance Maintenance:HOST.VmHostStartUp The host started up Critical Critical
Maintenance Maintenance:HOST.CpuStuck The host CPU is stuck Critical Critical
Maintenance Maintenance:HDFS.ZKFC.PortUnAvailable The HDFS.ZKFC port is unavailable Critical Critical
Maintenance Maintenance:HDFS.NameNode.TooMuchHeapUsedByTooManyFilesAndBlocks Too many files and blocks cause heavy heap memory consumption Critical Critical
Maintenance Maintenance:HDFS.NameNode.TooMuchDataNodeCapacityUsed Too much DataNode capacity is used Critical Critical
Maintenance Maintenance:YARN.NodeManager.ErrorRebootingNodeStatusUpdater An error occurred while NodeManager was rebooting NodeStatusUpdater Critical Critical
Maintenance Maintenance:ZOOKEEPER.PeerPortUnAvailable The peer port of ZooKeeper is unavailable Critical Critical
Maintenance Maintenance:YARN.WebAppProxyServer.PortUnAvailable The service port of YARN.WebAppProxyServer is unavailable Critical Critical
Maintenance Maintenance:YARN.ResourceManager.BothInStandby Both ResourceManager nodes are in the Standby state Critical Critical
Maintenance Maintenance:YARN.ResourceManager.BothInActive Both ResourceManager nodes are in the Active state Critical Critical
Maintenance Maintenance:YARN.TimelineServer.PortUnAvailable The YARN.TimelineServer port is unavailable Critical Critical
Maintenance Maintenance:YARN.ResourceManager.CouldNotTransitionToActive ResourceManager cannot transition to the Active state Critical Critical
Maintenance Maintenance:ZOOKEEPER.LeaderFollowerSwitch A leader/follower switchover occurred in ZooKeeper Critical Critical
Maintenance Maintenance:YARN.ResourceManager.InvalidConf.CannotFoundRM_HA_ID Invalid ResourceManager configuration: RM_HA_ID cannot be found Critical Critical
Maintenance Maintenance:YARN.NodeManager.LostNodesExist Lost NodeManager nodes exist Critical Critical
Maintenance Maintenance:YARN.ResourceManager.PortUnAvailable The service port of YARN.ResourceManager is unavailable Critical Critical
EMRStudio EMRStudio:Airflow.AirflowScheduler.DagRunFailure.WARNING An Airflow DAG whose alert level is set to WARNING failed to run Warn Warn
EMRStudio EMRStudio:Airflow.AirflowScheduler.DagSLAFailure.CRITICAL An Airflow DAG whose alert level is set to CRITICAL failed its SLA Critical Critical
EMRStudio EMRStudio:Airflow.AirflowScheduler.DagSLAFailure.WARNING An Airflow DAG whose alert level is set to WARNING failed its SLA Warn Warn
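
Because the description column can contain spaces, a row of the table above cannot be split into its five columns on whitespace alone. The following Python sketch shows one possible way to recover the columns. It assumes that the event type and event name are always single tokens and that the event status and event level are always the last two tokens, which holds for every row listed here. `EventRow` and `parse_event_row` are hypothetical names introduced for illustration; they are not part of any E-MapReduce API, and the sample descriptions are paraphrased.

```python
from collections import Counter
from typing import NamedTuple


class EventRow(NamedTuple):
    """One row of the system event table (hypothetical helper type)."""
    event_type: str
    name: str
    description: str
    status: str
    level: str


def parse_event_row(line: str) -> EventRow:
    """Split one table row into its five columns.

    Assumption: the first two tokens are the type and name, the last two
    are the status and level, and everything in between is the description.
    """
    tokens = line.split()
    if len(tokens) < 5:
        raise ValueError(f"expected at least 5 tokens, got {len(tokens)}: {line!r}")
    description = " ".join(tokens[2:-2])
    return EventRow(tokens[0], tokens[1], description, tokens[-2], tokens[-1])


# Sample rows taken from the table; the descriptions are paraphrased.
SAMPLE_ROWS = [
    "Flow EMR-110401002 The workflow succeeded Normal Info",
    "Scaling Scaling:ScalingActivity:Failed The scaling activity failed Normal Critical",
    "StatusNotification StatusCheck Service component status Failed Warn",
    "Maintenance Maintenance:HDFS.NameNode.OOM An OOM error occurred on the NameNode Critical Critical",
]

events = [parse_event_row(row) for row in SAMPLE_ROWS]

# Count events per level and collect the names of the Critical ones.
level_counts = Counter(event.level for event in events)
critical_names = [event.name for event in events if event.level == "Critical"]

if __name__ == "__main__":
    print(level_counts)
    print(critical_names)
```

Anchoring on the first two and last two tokens keeps the parser independent of the description's language and length, so the same function works whether the descriptions are a single word or a full sentence.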