FileDescriptors and HBase

Though HBase runs on HDFS, its need for open file handles is close to that of any regular database: it keeps a lot of file descriptors open. By default, Linux limits the number of open file descriptors to 1024. You can check the limit by issuing: $ ulimit -n (which prints 1024). To change [...]
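Since HBase runs inside a JVM, the same limit can also be inspected from Java. The following is a minimal sketch, not something from the original post: it assumes a HotSpot/OpenJDK runtime on a Unix-like system, where the platform bean implements com.sun.management.UnixOperatingSystemMXBean. The class name FdLimitCheck and its methods are hypothetical names chosen for illustration.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;

class FdLimitCheck {
    // Returns the file descriptor limit seen by this JVM,
    // or -1 if the runtime does not expose it (e.g. non-Unix platforms).
    static long maxFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getMaxFileDescriptorCount();
        }
        return -1;
    }

    // Returns the number of descriptors currently open in this JVM,
    // or -1 if the runtime does not expose it.
    static long openFileDescriptors() {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        if (os instanceof com.sun.management.UnixOperatingSystemMXBean) {
            return ((com.sun.management.UnixOperatingSystemMXBean) os)
                    .getOpenFileDescriptorCount();
        }
        return -1;
    }

    public static void main(String[] args) {
        System.out.println("max fds:  " + maxFileDescriptors());
        System.out.println("open fds: " + openFileDescriptors());
    }
}
```

The value reported as the maximum reflects the limit the JVM process was started with, so it can differ from what ulimit -n prints in another shell.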



Hardware configuration for a PoC/test Hadoop cluster

Daemons: Before going deeper into hardware configuration, let's first look at the different daemons in Hadoop: NameNode, SecondaryNameNode, JobTracker, DataNode and TaskTracker. In a small cluster, NN, SNN and JT are on the same machine. DN and TT are always on the same [...]



Enum in Java

Those of us who did extensive development in Java before Java 1.5 know the pain faced before enum was introduced. The solution used to be type-safe enums, a design pattern that simulated enum functionality as closely as possible. The introduction of enums changed [...]
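To make the contrast concrete, here is a minimal sketch of the two approaches side by side. It is an illustration under assumed names (OldStyleColor, Color and EnumDemo are hypothetical and not taken from the original post): first the hand-written type-safe enum pattern, then the equivalent built-in enum.

```java
class EnumDemo {
    // With a real enum, constants can be used directly in a switch statement.
    static String describe(Color c) {
        switch (c) {
            case RED:
                return "stop";
            case GREEN:
                return "go";
            default:
                throw new IllegalArgumentException("unknown color: " + c);
        }
    }

    public static void main(String[] args) {
        System.out.println(OldStyleColor.RED);       // prints RED
        System.out.println(Color.valueOf("GREEN"));  // prints GREEN
        System.out.println(describe(Color.RED));     // prints stop
    }
}

// Pre-Java 5 type-safe enum pattern: a final class with a private
// constructor and a fixed set of public static final instances.
final class OldStyleColor {
    public static final OldStyleColor RED = new OldStyleColor("RED");
    public static final OldStyleColor GREEN = new OldStyleColor("GREEN");

    private final String name;

    private OldStyleColor(String name) {
        this.name = name;
    }

    @Override
    public String toString() {
        return name;
    }
}

// Java 5+ equivalent: the compiler supplies the instances, toString(),
// valueOf(), values() and switch support.
enum Color { RED, GREEN }
```

The pattern class gives type safety, but lookup by name, iteration over all values and use in a switch all have to be written by hand, which the enum keyword provides for free.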



Tuning different Hadoop parameters

dfs.replication sets the file replication factor. The default value is 3. To add or modify this property, edit hdfs-site.xml, which by default is located at $HADOOP_HOME/conf/hdfs-site.xml. Example: dfs.replication = 4. dfs.block.size: HDFS is designed [...]
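As a rough illustration of how such a property sits in hdfs-site.xml, the sketch below parses a small configuration string with the JDK's built-in XML parser and looks up dfs.replication. It assumes the usual configuration/property/name/value layout of Hadoop site files; the class HdfsSiteReader and its lookup method are hypothetical helpers and not part of Hadoop's own API.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

class HdfsSiteReader {
    // Sample content in the assumed hdfs-site.xml layout; the value 4
    // mirrors the dfs.replication example in the text.
    static final String SAMPLE =
          "<configuration>"
        + "  <property>"
        + "    <name>dfs.replication</name>"
        + "    <value>4</value>"
        + "  </property>"
        + "</configuration>";

    // Returns the value of the named property, or null if it is absent.
    static String lookup(String xml, String key) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            NodeList props = doc.getElementsByTagName("property");
            for (int i = 0; i < props.getLength(); i++) {
                Element prop = (Element) props.item(i);
                String name = prop.getElementsByTagName("name").item(0)
                        .getTextContent().trim();
                if (name.equals(key)) {
                    return prop.getElementsByTagName("value").item(0)
                            .getTextContent().trim();
                }
            }
            return null;
        } catch (Exception e) {
            throw new IllegalStateException("could not parse configuration", e);
        }
    }

    public static void main(String[] args) {
        System.out.println("dfs.replication = " + lookup(SAMPLE, "dfs.replication"));
    }
}
```

A property that is missing from the file returns null here; in a real cluster Hadoop would fall back to its built-in default (3 for dfs.replication, as stated above).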



Creating a Simple SOAP Web Service using Maven in 5 Steps

SOAP: SOAP stands for Simple Object Access Protocol. In this SOAP service we'll have 4 classes: TimeService.java, TimeServiceImpl.java, TimePublisher.java and TimeClient.java. pom.xml: The pom.xml for this project is bare-bones, as Java 6 onwards provides native [...]



Who is this SecondaryNameNode anyway

SecondaryNameNode is one of the most confusing things in Hadoop. There are not many books on Hadoop. I happened to be reading one, and I quit the moment I saw this sentence: "The NameNode is a single point of failure, and on failure it will stop all the operations of the [...]

