Archivo de la etiqueta: Java

FileInputFormat vs. CombineFileInputFormat


When you put a file into HDFS, it is converted to blocks of 128 MB. (Default value for HDFS on EMR) Consider a file big enough to consume 10 blocks. When you read that file from HDFS as an input … Seguir leyendo

Publicado en Uncategorized | Etiquetado , , , | Deja un comentario

Consider boosting spark.yarn.executor.memoryOverhead


This is a very specific error related to the Spark Executor and the YARN container coexistence. You will typically see errors like this one on the application container logs: 15/03/12 18:53:46 WARN YarnAllocator: Container killed by YARN for exceeding memory … Seguir leyendo

Publicado en Uncategorized | Etiquetado , , | Deja un comentario

YARN / Map Reduce memory settings


On Hadoop 1, we used to use mapred.child.java.opts to set the Java Heap size for the task tracker child processes. With YARN, that parameter has been deprecated in favor of: mapreduce.map.java.opts – These parameter is passed to the JVM for mappers. … Seguir leyendo

Publicado en Uncategorized | Etiquetado , , , | Deja un comentario

Adding a JAR path to Hadoop classpath


This is simple, but it is a frequent question: If we need to add some specific path pointing to a thirdparty library we can run a command like the following: $ export HADOOP_CLASSPATH=$HADOOP_CLASSPATH:/home/hadoop/.versions/Cascading-2.5-SDK/binary/cascading/*:/home/hadoop/.versions/Cascading-2.5-SDK/binary/cascading/lib/cascading-core/* Here I am adding two directories to … Seguir leyendo

Publicado en Uncategorized | Etiquetado , , | Deja un comentario

Hive: dealing with Out of Memory and Garbage Collector errors.


This is the common error: java.lang.OutOfMemoryError: GC overhead limit exceeded This error will occur in several Java environments, but, in particular, with Hive, is pretty common when big structures or several thousands objects are stored in memory. According to Sun, … Seguir leyendo

Publicado en Uncategorized | Etiquetado , , , | Deja un comentario

Simple Java Telnet Port Scanner


It can be improved in many ways, but.. import java.io.*;  import java.net.*;  import java.util.*;  import java.util.TimerTask;  //import org.apache.commons.*;//import org.apache.commons.net.telnet.TelnetClient;  class Connectivity extends TimerTask  {      public static void main(String args[])      {          try          {              System.out.println(“Please enter ip … Seguir leyendo

Publicado en Uncategorized | Etiquetado | Deja un comentario

Testing Java Cryptography Extension (JCE) is installed


If JCE is already installed, you should see on that the jar files ‘local_policy.jar’ and ‘US_export_policy.jar’ are on $JAVA_HOME/jre/lib/security/ But, we can test it: import javax.crypto.Cipher; import java.security.*; import javax.crypto.*; class TestJCE { public static void main(String[] args) { boolean … Seguir leyendo

Publicado en Uncategorized | Etiquetado , | Deja un comentario