Enable heap dumps for Apache Hadoop services in Linux-based HDInsight

Heap dumps contain a snapshot of the application's memory, including the values of variables at the time the dump was created. They are therefore useful for diagnosing problems that occur at run time.

Services

You can enable heap dumps for the following services:

  • Apache HCatalog: templeton
  • Apache Hive: hiveserver2, metastore, derbyserver
  • MapReduce: jobhistoryserver
  • Apache YARN: resourcemanager, nodemanager, timelineserver
  • Apache HDFS: datanode, secondarynamenode, namenode

You can also enable heap dumps for the mapper and reducer processes run by HDInsight.

Basic information on configuring heap dumps

Heap dumps are enabled by passing options (or parameters) to the JVM when a service starts. For most Apache Hadoop services, you can change the shell script used to start the service to pass these options.

Each script contains an export for *_OPTS that holds the options passed to the JVM. For example, the hadoop-env.sh script contains an export line (the HADOOP_NAMENODE_OPTS entry) that holds the options for the NameNode service.

Mapper and reducer processes differ slightly because they are child processes of the MapReduce service. Each mapper or reducer process runs in a child container, and two entries contain the JVM options for these processes. Both are in mapred-site.xml:

  • mapreduce.admin.map.child.java.opts
  • mapreduce.admin.reduce.child.java.opts

Note

We recommend using Apache Ambari to change the scripts and settings in mapred-site.xml because Ambari handles replicating changes to nodes in the cluster. See Using Apache Ambari for the detailed steps.

Enable heap dumps

The following option enables heap dumps in the event of an "OutOfMemoryError":

The + sign indicates that this option is enabled. By default, it is disabled.

Warning
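As a minimal sketch of how such an option could be passed, the line below appends the standard HotSpot flag -XX:+HeapDumpOnOutOfMemoryError to the NameNode options. It assumes the export lives in hadoop-env.sh and that HADOOP_NAMENODE_OPTS is the entry you want to change; the actual line in your cluster's script may carry other options as well.

```shell
# Sketch of an export line as it might appear in hadoop-env.sh (assumption).
# ${HADOOP_NAMENODE_OPTS:-} keeps whatever options were already set.
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} -XX:+HeapDumpOnOutOfMemoryError"

# Print the resulting options so the change can be checked by eye.
echo "HADOOP_NAMENODE_OPTS=${HADOOP_NAMENODE_OPTS}"
```

Appending to the existing value, rather than overwriting it, avoids dropping options the service already depends on.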

Heap dumps are not enabled by default for Hadoop services on HDInsight clusters because the dump files can be large. If you enable them for troubleshooting, be sure to disable them once you've reproduced the problem and collected the dump files.

Dump location

The default location for the dump file is the current working directory. You can control where the file is saved by using the following option:

For example, setting this option to /tmp causes the dumps to be saved in the /tmp directory.
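The dump location can be sketched with the standard HotSpot flag -XX:HeapDumpPath. The example below assumes /tmp as the target directory and HADOOP_NAMENODE_OPTS as the entry being changed; both are illustrative choices, not requirements.

```shell
# Assumption: dumps should be written to /tmp instead of the working directory.
HEAP_DUMP_OPTS="-XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp"

# Append the heap dump options to whatever the NameNode already uses.
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} ${HEAP_DUMP_OPTS}"

echo "HADOOP_NAMENODE_OPTS=${HADOOP_NAMENODE_OPTS}"
```

Because dump files can be large, it is worth choosing a directory on a volume with enough free space.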

Scripts

You can also trigger a script when an OutOfMemoryError occurs. For example, you can trigger a notification so that you know the error has occurred. Use the following option to trigger a script when an OutOfMemoryError occurs:
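A sketch of this, using the standard HotSpot flag -XX:OnOutOfMemoryError, might look as follows. The script path /usr/local/bin/oom-notify.sh is hypothetical, as is the choice of HADOOP_NAMENODE_OPTS as the entry to change.

```shell
# Hypothetical script path; replace it with the script deployed on your nodes.
OOM_SCRIPT="/usr/local/bin/oom-notify.sh"

# Ask the JVM to run the script when an OutOfMemoryError is thrown.
export HADOOP_NAMENODE_OPTS="${HADOOP_NAMENODE_OPTS:-} -XX:OnOutOfMemoryError=${OOM_SCRIPT}"

echo "HADOOP_NAMENODE_OPTS=${HADOOP_NAMENODE_OPTS}"
```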

Note

Because Apache Hadoop is a distributed system, any script used must be present on all nodes in the cluster where the service is running.

The script must also be in a location accessible to the account that runs the service, and it must have execute permissions. For example, you could store the scripts in a common directory and grant read and execute permissions on them.
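The permission step can be sketched as below. The file name oom-notify.sh, its contents, and the log path are hypothetical; a temporary directory is used so the sketch runs without root, whereas on a cluster you would pick a directory that exists on every node.

```shell
# Hypothetical location: a temporary directory keeps this sketch harmless.
script_dir="$(mktemp -d)"
script_path="${script_dir}/oom-notify.sh"

# Hypothetical notification script: append a timestamped line to a log file.
cat > "${script_path}" <<'EOF'
#!/bin/sh
echo "$(date) OutOfMemoryError detected" >> /tmp/oom-notify.log
EOF

# Owner gets full access; group and others get read and execute.
chmod 755 "${script_path}"

# Show the resulting permissions.
ls -l "${script_path}"
```

The same file and permissions would need to be present on each node where the service runs.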

Using Apache Ambari

Proceed as follows to change the configuration for a service:

  1. In a web browser, navigate to the Ambari web UI for your cluster. The address contains the name of your cluster.

  2. Use the list on the left to select the service area you want to change, for example HDFS. In the middle area, select the Configs tab.

  3. In the Filter field, enter opts. Only items containing this text are displayed.

  4. Find the *_OPTS entry for the service you want to enable heap dumps for, and add the options you want to enable. In the illustrated example, the options are added to the HADOOP_NAMENODE_OPTS entry.

    Note

    When enabling heap dumps for the child mapper or reducer process, look for the fields labeled mapreduce.admin.map.child.java.opts and mapreduce.admin.reduce.child.java.opts.

    Use the Save button to save the changes. You can enter a brief note describing the changes.

  5. After the changes are applied, the Restart required icon is displayed next to one or more services.

  6. Select each service that requires a restart, and use the Service Actions button to select Turn On Maintenance Mode. Maintenance mode prevents the service from generating alerts when you restart it.

  7. After enabling maintenance mode, use the Restart button for the service to select Restart All Effected.

    Note

    The entries for the Restart button may be different for other services.

  8. After the services have been restarted, use the Service Actions button to select Turn Off Maintenance Mode. This allows Ambari to resume monitoring for alerts for the service.
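After the restart, one way to confirm that a service picked up the option is to inspect the command line of its JVM. The helper below is a hypothetical sketch: has_heap_dump_flag is not part of Hadoop or Ambari, and the sample command line is made up for illustration.

```shell
# Returns success when the given JVM command line contains the heap dump flag.
has_heap_dump_flag() {
  case "$1" in
    *-XX:+HeapDumpOnOutOfMemoryError*) return 0 ;;
    *) return 1 ;;
  esac
}

# Hypothetical command line; on a node you might pass the output of
# `ps -o args= -p <pid>` for the service's JVM instead.
sample_cmdline="java -XX:+HeapDumpOnOutOfMemoryError -XX:HeapDumpPath=/tmp org.apache.hadoop.hdfs.server.namenode.NameNode"

if has_heap_dump_flag "${sample_cmdline}"; then
  echo "heap dumps enabled"
else
  echo "heap dumps not enabled"
fi
```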