Author Archive: hvivani

About hvivani

sysadmin, developer, RHCSA

MapReduce: Compression and Input Splits

This is something that always raises doubts: when considering compressed data that will be processed by MapReduce, it is important to check whether the compression format supports splitting. If it does not, the number of map tasks may not be what you expect. … Continue reading
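As a rough illustration of the point above, the sketch below estimates how many map tasks a single input file would get. It assumes that gzip files (`.gz`) cannot be split and therefore go to one mapper, while everything else is divided by the split size. The function name and the example sizes are hypothetical, and the real FileInputFormat logic has more details (for example a slop factor on the last split) that are ignored here.

```shell
# Rough estimate of the number of map tasks for ONE input file.
# Usage: estimate_map_tasks <file_name> <file_size_bytes> <split_size_bytes>
# Assumption: only the .gz extension is treated as non-splittable here.
estimate_map_tasks() {
    file_name=$1
    file_size=$2
    split_size=$3
    case "$file_name" in
        *.gz)
            # gzip streams cannot be split, so the whole file goes to one mapper
            echo 1
            ;;
        *)
            # splittable input (plain text, bzip2, ...): ceiling of size / split size
            echo $(( (file_size + split_size - 1) / split_size ))
            ;;
    esac
}
```

For a 1 GiB file and a 128 MiB split size, `estimate_map_tasks logs.bz2 1073741824 134217728` gives 8, while the same file as `logs.gz` gives 1.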

Posted in Uncategorized

yarn: change configuration and restart resource manager on a live cluster

This procedure changes the YARN configuration on a live cluster, propagates the changes to all the nodes, and restarts the YARN ResourceManager. Both commands list all the nodes in the cluster and then filter the DNS name to … Continue reading
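The excerpt is cut off before the commands themselves, so the following is only a sketch of the "list the nodes, then filter the DNS name" step. It assumes the usual `yarn node -list` layout, where the first column is `host:port` and the second column is the node state. The function name `extract_node_hosts` is hypothetical, and keeping only RUNNING nodes is a choice made for this sketch.

```shell
# Reads `yarn node -list` output on stdin and prints one host name per line.
# Assumption: column 1 is "host:port" and column 2 is the node state.
extract_node_hosts() {
    awk '$2 == "RUNNING" { split($1, parts, ":"); print parts[1] }'
}
```

On a cluster it could be fed with `yarn node -list | extract_node_hosts`. What is then done with each host name (copying the configuration, restarting the service) depends on the distribution and is not shown here.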

Hadoop 1 vs Hadoop 2 – How many slots do I have per node ?

This is a topic that always starts a discussion… In Hadoop 1, the number of tasks launched per node was specified via the settings mapred.tasktracker.map.tasks.maximum and mapred.tasktracker.reduce.tasks.maximum. These settings are ignored in Hadoop 2. In Hadoop 2 with … Continue reading
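The excerpt stops before describing the Hadoop 2 side, so what follows is general background rather than the post's own method. Under YARN there are no fixed slots. A rough upper bound on concurrent map containers per node is the memory the NodeManager offers (`yarn.nodemanager.resource.memory-mb`) divided by the memory requested per map task (`mapreduce.map.memory.mb`). The function name and the numbers are hypothetical, and the sketch ignores vcores and the scheduler's minimum allocation rounding.

```shell
# Rough upper bound on concurrent containers per node, by memory only.
# Usage: containers_per_node <node_memory_mb> <container_memory_mb>
# Assumption: vcores and scheduler rounding are ignored.
containers_per_node() {
    node_memory_mb=$1
    container_memory_mb=$2
    # integer division: a partial container cannot be scheduled
    echo $(( node_memory_mb / container_memory_mb ))
}
```

For example, `containers_per_node 8192 1024` gives 8, and `containers_per_node 8192 1536` gives 5.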

Hadoop useful commands

- Copy fromLocal/ToLocal from/to S3:
  $ bin/hadoop fs -copyToLocal s3://my-bucket/myfile.rb /home/hadoop/myfile.rb
  $ bin/hadoop fs -copyFromLocal job5.avro s3://my-bucket/input
- Merge all the files from one folder into one single file:
  $ hadoop jar ~/lib/emr-s3distcp-1.0.jar --src s3://my-bucket/my-folder/ --dest s3://my-bucket/logs/all-the-files-merged.log --groupBy '.*(*)' --outputCodec … Continue reading

Generate a public key from a private key

I need to keep this handy:

ssh-keygen -y -f ~/.ssh/test-key.pem > ~/.ssh/test-key.pem.pub

First check that the permissions on test-key.pem are 600.
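A minimal sketch that combines the permission check and the key generation in one step. The function names are hypothetical, and the check relies on the first ten characters of `ls -ld` output being the mode string, which is an assumption about the platform's `ls`.

```shell
# Succeeds only when the file mode is exactly 600 (-rw-------).
has_mode_600() {
    mode=$(ls -ld "$1" | cut -c1-10)
    [ "$mode" = "-rw-------" ]
}

# Writes <key>.pub next to the private key, refusing keys with other permissions.
public_key_from_private() {
    key=$1
    if ! has_mode_600 "$key"; then
        echo "permissions on $key are not 600; run: chmod 600 $key" >&2
        return 1
    fi
    ssh-keygen -y -f "$key" > "$key.pub"
}
```

Usage would be `public_key_from_private ~/.ssh/test-key.pem`, which leaves the public key in `~/.ssh/test-key.pem.pub`.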

Hadoop: HDFS find / recover corrupt blocks

1) Search for corrupt files: A command like 'hadoop fsck /' will show the status of the filesystem and any corrupt files. This command will ignore lines with nothing but dots and lines talking about replication: hadoop fsck … Continue reading
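The exact command is cut off in the excerpt, so this is one possible way to do the filtering it describes, not necessarily the author's. It drops lines made only of dots and lines that mention replication. The function name is hypothetical, and matching on the substring "replica" is an assumption about how fsck words those lines.

```shell
# Reads fsck output on stdin; drops dot-only progress lines and replication messages.
# Assumption: replication messages contain the substring "replica" (case-insensitive).
filter_fsck_output() {
    grep -Ev '^\.+$' | grep -iv 'replica'
}
```

On a cluster it would be used as `hadoop fsck / | filter_fsck_output`.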

Simple Java Telnet Port Scanner

It can be improved in many ways, but...

import java.io.*;
import java.net.*;
import java.util.*;
import java.util.TimerTask;
//import org.apache.commons.*;
//import org.apache.commons.net.telnet.TelnetClient;

class Connectivity extends TimerTask
{
    public static void main(String args[])
    {
        try
        {
            System.out.println("Please enter ip … Continue reading
