Tag archive: Linux

Changing the default Java version


If more than one Java version is installed on your Linux server (Red Hat flavor), you can change the default with the 'alternatives' command:

[hadoop@ip-172-31-36-252 ~]$ sudo /usr/sbin/alternatives --config java
There are 2 programs which provide 'java'.
  Selection    Command
-----------------------------------------------
*+ … Continue reading
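For scripted setups where the interactive menu is inconvenient, a minimal sketch along these lines may help. It assumes the `--set` option of the Red Hat `alternatives` tool and a Java binary that is already registered as an alternative; the function name `set_default_java` and the example path are hypothetical.

```shell
# Hypothetical helper: switch the default 'java' without the interactive menu.
# Assumes the given binary is already registered with 'alternatives'.
set_default_java() {
  java_bin=$1
  # Refuse paths that are missing or not executable before touching the system.
  if [ ! -f "$java_bin" ] || [ ! -x "$java_bin" ]; then
    echo "not an executable file: $java_bin" >&2
    return 1
  fi
  sudo /usr/sbin/alternatives --set java "$java_bin"
}

# Example call (the path is an assumption; pick one listed by
# 'alternatives --display java' on your own server):
# set_default_java /usr/lib/jvm/jre-1.8.0-openjdk/bin/java
```

The path check runs first, so a typo fails fast instead of reaching `sudo`.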

Posted in Uncategorized

Monitoring HTTP requests on the fly


Install httpry:

sudo yum install httpry

or build it from source:

$ sudo yum install gcc make git libpcap-devel
$ git clone https://github.com/jbittel/httpry.git
$ cd httpry
$ make
$ sudo make install

Then run:

sudo httpry -i eth0

The output will look like: httpry version … Continue reading


Control characters in the vi editor on Linux


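Show hidden/control characters in vi:
:set list

Hide hidden/control characters in vi:
:set nolist

Remove hidden/control characters in vi (type ^M as Ctrl-V followed by Ctrl-M; note that the second command deletes the last character of every line, whatever it is):
:%s/^M//g
:%s/.$//g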
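Outside the editor, roughly the same inspection and cleanup can be sketched in shell. The function names below are hypothetical, and `cat -A` assumes GNU coreutils, which is the usual case on Linux.

```shell
# Print a file with control characters made visible, similar to ':set list'.
# 'cat -A' is a GNU coreutils option; carriage returns show up as ^M.
show_control_chars() {
  cat -A "$1"
}

# Write a copy of the file to stdout with every carriage return removed,
# the shell counterpart of ':%s/^M//g'.
strip_cr() {
  tr -d '\r' < "$1"
}

# Example usage (file names are placeholders):
# show_control_chars notes.txt
# strip_cr notes.txt > notes.clean.txt
```

Unlike `:%s/.$//g`, `tr -d '\r'` only removes carriage returns, so lines that were already clean are left untouched.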


YARN: execute a script on all the nodes in the cluster


This is more of a Linux scripting topic, but sometimes we have a Hadoop (YARN) cluster running and need to run a post-install script or task on all the nodes in the cluster: for i in `yarn node … Continue reading
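The loop in the excerpt is cut off, so the following is not the author's script but a separate sketch of the same idea. It assumes that `yarn node -list` prints one row per node in the form `<host>:<port> RUNNING <host>:<http-port> <containers>`, and that key-based SSH login to every node is already set up; the function names `yarn_hosts` and `run_on_all_nodes` are hypothetical.

```shell
# Read 'yarn node -list' output on stdin and print one hostname per RUNNING node.
# Assumes data rows look like: "<host>:<port> RUNNING <host>:<http-port> <containers>".
yarn_hosts() {
  awk '$2 == "RUNNING" { split($1, parts, ":"); print parts[1] }'
}

# Run a command on every RUNNING node over SSH (assumes key-based login).
# 'ssh -n' keeps ssh from swallowing the host list on stdin.
run_on_all_nodes() {
  yarn node -list 2>/dev/null | yarn_hosts | while read -r host; do
    echo "== $host"
    ssh -n "$host" "$@"
  done
}

# Example usage (the script path is a placeholder):
# run_on_all_nodes 'bash /tmp/post_install.sh'
```

Keeping the parsing in its own function makes it possible to check it against saved `yarn node -list` output before running anything on the cluster.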


Indexing Common Crawl Metadata on Elasticsearch using Cascading


If you want to explore how to parallelize data ingestion into Elasticsearch, please have a look at this post I wrote for Amazon AWS: http://blogs.aws.amazon.com/bigdata/post/TxC0CXZ3RPPK7O/Indexing-Common-Crawl-Metadata-on-Amazon-EMR-Using-Cascading-and-Elasticsearch It explains how to index Common Crawl metadata into Elasticsearch using Cascading connector … Continue reading

Posted in Mis Publicaciones, Uncategorized

How Ganglia works


What is Ganglia? Ganglia is a scalable distributed monitoring system for high-performance computing systems such as clusters and Grids. It is based on a hierarchical design targeted at federations of clusters. It leverages widely used technologies such as XML … Continue reading


Back to the basics: Creating a SPEC file from a Maven project


1) Build the package with the provided pom.xml:
$ mvn package

2) Rebuild the RPM structure:
$ mvn -DskipTests=true rpm:rpm

A structure like the following will be created:
/target/rpm/<app_name>/BUILD
/target/rpm/<app_name>/RPMS
/target/rpm/<app_name>/SOURCES
/target/rpm/<app_name>/SPECS
/target/rpm/<app_name>/SRPMS
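After the two Maven steps, a quick check of the generated tree can be sketched as follows. The function name `check_rpm_tree` is hypothetical, and the assumption that the generated SPEC file ends in `.spec` and sits under SPECS is mine, not something the steps above state.

```shell
# Hypothetical helper: verify the directory tree described above.
# $1: path to target/rpm/<app_name> inside the Maven project
check_rpm_tree() {
  base=$1
  for dir in BUILD RPMS SOURCES SPECS SRPMS; do
    if [ ! -d "$base/$dir" ]; then
      echo "missing: $base/$dir" >&2
      return 1
    fi
  done
  # List any generated SPEC files (assumed to live under SPECS).
  find "$base/SPECS" -name '*.spec' -type f
}

# Example usage (replace <app_name> with the real application name):
# check_rpm_tree target/rpm/<app_name>
```

The function returns a non-zero status as soon as one of the five directories is missing, which makes it easy to use in a build script.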
