Tag archive: Java

HBase and ZooKeeper debugging


I came across some scenarios where an application (e.g. MapReduce) communicating with HBase through YARN could silently fail with a timeout like the following:

    2017-01-30 19:42:03,657 DEBUG [main] org.apache.hadoop.hbase.client.ConnectionManager$HConnectionImplementation: locateRegionInMeta parentTable=hbase:meta, metaLocation=, attempt=9 of 35 failed; retrying after sleep of …

Continue reading

Posted in Uncategorized

Checking the YARN child execution environment


Never go out without this:

    $ sudo -u yarn jps
    27343 YarnChild
    4156 NodeManager
    27292 Jps

    $ sudo strings -f /proc/27343/environ
    /proc/27343/environ: STDERR_LOGFILE_ENV=/var/log/hadoop-yarn/containers/application_1485807340469_0019/container_1485807340469_0019_01_000003/stderr
    /proc/27343/environ: SHELL=/bin/bash
    /proc/27343/environ: TERM=linux
    /proc/27343/environ: HADOOP_HOME=/usr/lib/hadoop
    /proc/27343/environ: YARN_PID_DIR=/var/run/hadoop-yarn
    /proc/27343/environ: NM_HOST=ip-172-31-5-156.us-west-2.compute.internal
    /proc/27343/environ: HADOOP_PREFIX=/usr/lib/hadoop
    /proc/27343/environ: YARN_OPTS= -XX:OnOutOfMemoryError='kill -9 %p' …

Continue reading
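The same inspection can be scripted. The following is a minimal sketch, assuming a Linux /proc filesystem and enough privileges to read the target process. The helper names `parse_environ` and `read_process_environ` are hypothetical and not part of any Hadoop tooling.

```python
from pathlib import Path
from typing import Dict


def parse_environ(raw: bytes) -> Dict[str, str]:
    """Split a NUL-separated /proc/<pid>/environ blob into a dict."""
    env = {}
    for entry in raw.split(b"\0"):
        if not entry:
            continue
        # Split on the first '=' only, since values may contain '=' themselves.
        key, sep, value = entry.decode("utf-8", errors="replace").partition("=")
        if sep:
            env[key] = value
    return env


def read_process_environ(pid: int) -> Dict[str, str]:
    """Read the environment of a running process (hypothetical helper).

    Needs permission to read /proc/<pid>/environ, e.g. run as root
    or as the user that owns the process.
    """
    return parse_environ(Path(f"/proc/{pid}/environ").read_bytes())


# Sample bytes built from a few of the variables shown in the excerpt.
sample = (
    b"SHELL=/bin/bash\0"
    b"HADOOP_HOME=/usr/lib/hadoop\0"
    b"YARN_OPTS= -XX:OnOutOfMemoryError='kill -9 %p'\0"
)
env = parse_environ(sample)
print(env["HADOOP_HOME"])
```

Having the variables in a dict makes it easy to compare the environment of two containers, for example to spot a differing HADOOP_HOME.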


Debugging Java Threads


Which Java process is using most of the CPU:

    $ ps u -C java

Generate the Java thread dump:

    $ jstack -l PID > PID-threads.txt

From the Java threads I can count:

    $ awk '/State: / { print }' < …

Continue reading
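Counting threads per state can also be sketched in a few lines of Python. This assumes the dump contains the usual `java.lang.Thread.State: <STATE>` lines that jstack prints. The function name `count_thread_states` and the sample dump are made up for illustration.

```python
import re
from collections import Counter

# jstack prints one "java.lang.Thread.State: <STATE>" line per Java thread.
STATE_RE = re.compile(r"java\.lang\.Thread\.State:\s+(\w+)")


def count_thread_states(dump_text: str) -> Counter:
    """Count how many threads are in each state in a jstack dump."""
    return Counter(STATE_RE.findall(dump_text))


# Made-up, heavily shortened dump used only to exercise the function.
sample_dump = """
"main" #1 prio=5 runnable
   java.lang.Thread.State: RUNNABLE
"worker-1" #12 daemon prio=5 waiting on condition
   java.lang.Thread.State: WAITING (parking)
"worker-2" #13 daemon prio=5 in Object.wait()
   java.lang.Thread.State: WAITING (on object monitor)
"timer" #14 daemon prio=5 waiting on condition
   java.lang.Thread.State: TIMED_WAITING (sleeping)
"""

states = count_thread_states(sample_dump)
for state, count in states.most_common():
    print(state, count)
```

To use it on a real dump, read the file written by jstack and pass its text to `count_thread_states`.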


Changing the default Java version


If you have more than one Java version installed on your Linux server (Red Hat flavor), you can change the default using the 'alternatives' command:

    [hadoop@ip-172-31-36-252 ~]$ sudo /usr/sbin/alternatives --config java

    There are 2 programs which provide 'java'.

      Selection    Command
    -----------------------------------------------
    *+ …

Continue reading


FileInputFormat vs. CombineFileInputFormat


When you put a file into HDFS, it is split into blocks of 128 MB (the default block size for HDFS on EMR). Consider a file big enough to consume 10 blocks. When you read that file from HDFS as an input … Continue reading
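As a quick check of the block arithmetic, here is a minimal sketch that computes how many blocks a file occupies. The 128 MB default comes from the text above; the function name `hdfs_block_count` is hypothetical.

```python
import math

BLOCK_SIZE_MB = 128  # default HDFS block size on EMR, per the text above


def hdfs_block_count(file_size_mb: float, block_size_mb: int = BLOCK_SIZE_MB) -> int:
    """Number of HDFS blocks needed to store a file of the given size."""
    if file_size_mb <= 0:
        return 0
    # The last block may be only partially filled, hence the ceiling.
    return math.ceil(file_size_mb / block_size_mb)


print(hdfs_block_count(1280))  # a 1280 MB file fills exactly 10 blocks
print(hdfs_block_count(1281))  # one extra MB spills into an 11th block
```

Since FileInputFormat typically produces about one input split per block for a splittable file, this number is a rough proxy for how many map tasks such a file generates.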


Consider boosting spark.yarn.executor.memoryOverhead


This is a very specific error related to how the Spark executor and its YARN container coexist. You will typically see errors like this one in the application container logs:

    15/03/12 18:53:46 WARN YarnAllocator: Container killed by YARN for exceeding memory …

Continue reading
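To see why the overhead matters, the sketch below estimates the memory Spark requests from YARN per executor: the executor heap plus the overhead. It is an approximation under stated assumptions. The default overhead is taken as the larger of a 384 MB floor and a fraction of the executor memory; that fraction has differed between Spark versions (0.07 and 0.10 have both been used), so it is a parameter here. YARN's rounding up to its allocation increment is not modeled, and the function name is hypothetical.

```python
from typing import Optional


def yarn_container_request_mb(
    executor_memory_mb: int,
    overhead_mb: Optional[int] = None,
    overhead_fraction: float = 0.10,  # assumption: version-dependent default
    overhead_min_mb: int = 384,       # assumption: commonly documented floor
) -> int:
    """Estimate executor memory plus memoryOverhead, in MB."""
    if overhead_mb is None:
        # No explicit spark.yarn.executor.memoryOverhead: use the assumed default.
        overhead_mb = max(overhead_min_mb, int(executor_memory_mb * overhead_fraction))
    return executor_memory_mb + overhead_mb


# 4 GB executor with the assumed default overhead.
print(yarn_container_request_mb(4096))
# Same executor with the overhead boosted explicitly to 1 GB.
print(yarn_container_request_mb(4096, overhead_mb=1024))
```

When off-heap usage pushes the process past this total, YARN kills the container, which is why raising the overhead rather than the heap is the usual remedy.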


YARN / MapReduce memory settings


On Hadoop 1, we used mapred.child.java.opts to set the Java heap size for the TaskTracker child processes. With YARN, that parameter has been deprecated in favor of:

mapreduce.map.java.opts: this parameter is passed to the JVM for mappers. … Continue reading
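The JVM heap set through mapreduce.map.java.opts has to fit inside the YARN container the mapper runs in. The sketch below derives a heap setting from a container size. It assumes the container size is set through mapreduce.map.memory.mb (a property not shown in the excerpt above) and uses a heap of 80% of the container, which is a common rule of thumb and not a value taken from this post. The function name is hypothetical.

```python
from typing import Dict


def map_memory_settings(container_mb: int, heap_fraction: float = 0.8) -> Dict[str, str]:
    """Derive mapper settings where the JVM heap fits inside the container.

    heap_fraction is an assumed rule of thumb; the remainder is left
    for non-heap JVM memory such as thread stacks and native buffers.
    """
    heap_mb = int(container_mb * heap_fraction)
    return {
        "mapreduce.map.memory.mb": str(container_mb),
        "mapreduce.map.java.opts": f"-Xmx{heap_mb}m",
    }


settings = map_memory_settings(2048)
for name, value in settings.items():
    print(f"{name}={value}")
```

Setting the heap equal to the container size leaves no room for non-heap memory, and the container risks being killed for exceeding its limit.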
