Q1A Tech Tips
Information for developers tagged by technical specialty


Yes, you can do it using output committers. Output committers: Hadoop makes sure a job either succeeds or fails gracefully. This is done via OutputCommitter, which is accessible from OutputFormat through OutputFormat.getOutputCommitter(): public abstract OutputCommitter [...]
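
The "succeeds or fails gracefully" guarantee can be illustrated with a stage-then-publish pattern: output is written to a temporary directory and only moved into place when the work finishes without error. The following is a minimal Python sketch of that general idea, not Hadoop's OutputCommitter API. The function name run_job_with_commit and the "_temporary_" prefix are hypothetical, and the sketch assumes the final output directory does not exist yet.

```python
import os
import shutil
import tempfile


def run_job_with_commit(output_dir, write_output):
    """Run write_output(tmp_dir) and publish to output_dir only on success."""
    parent = os.path.dirname(os.path.abspath(output_dir))
    # Setup: stage results in a temporary directory next to the final one.
    tmp_dir = tempfile.mkdtemp(prefix="_temporary_", dir=parent)
    try:
        write_output(tmp_dir)
    except Exception:
        # Abort: discard partial output so no half-written results remain.
        shutil.rmtree(tmp_dir, ignore_errors=True)
        raise
    # Commit: move the staged directory into its final place.
    os.rename(tmp_dir, output_dir)
    return output_dir
```

A caller passes a function that writes files into the directory it is given. If that function raises, nothing appears at output_dir.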

RishiYadav
05/28/2013 at 19:38

Remove the host from mapred.exclude.
Remove the host from hdfs.exclude.
$ hadoop mradmin -refreshNodes
$ hadoop dfsadmin -refreshNodes
$ hadoop-daemon.sh start tasktracker
$ hadoop-daemon.sh start datanode
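
The first two steps amount to deleting one hostname line from each exclude file. A minimal Python sketch of that edit follows, assuming the exclude file holds one hostname per line. The function name remove_host and the hostnames in the usage note are hypothetical, and the refreshNodes commands above still have to be run afterwards.

```python
def remove_host(exclude_text, hostname):
    """Return the exclude-file text without the given hostname."""
    # Keep every line except the one naming the host to bring back.
    kept = [line for line in exclude_text.splitlines()
            if line.strip() != hostname]
    # Preserve a trailing newline when any hosts remain.
    return "\n".join(kept) + ("\n" if kept else "")
```

For example, remove_host("node1\nnode2\nnode3\n", "node2") returns the text with only node1 and node3 left.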

RishiYadav
05/27/2013 at 13:48

If your cluster does not have an excludes file, add one in hdfs-site.xml:

  Property: dfs.hosts.exclude
  Value: /usr/local/hadoop/conf/excludes
  Description: Names a file that contains a list of hosts excluded from the cluster.

Add the hostname of the node you want to remove to mapred.exclude [...]
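
hdfs-site.xml stores each setting as a property element with a name and a value. To check which excludes file a cluster is configured with, the file can be parsed with a few lines of Python. This is a minimal sketch: get_property and SAMPLE are hypothetical names, and the sample XML simply mirrors the property and path given above.

```python
import xml.etree.ElementTree as ET


def get_property(config_xml, name):
    """Return the value of a named Hadoop property, or None if absent."""
    root = ET.fromstring(config_xml)
    # Hadoop config files list <property> elements under <configuration>.
    for prop in root.findall("property"):
        if prop.findtext("name") == name:
            return prop.findtext("value")
    return None


# Sample configuration assumed for illustration.
SAMPLE = """<configuration>
  <property>
    <name>dfs.hosts.exclude</name>
    <value>/usr/local/hadoop/conf/excludes</value>
  </property>
</configuration>"""

print(get_property(SAMPLE, "dfs.hosts.exclude"))
```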

RishiYadav
05/27/2013 at 13:47

Intermediate data is written to the local disk, not to HDFS.

RishiYadav
05/21/2013 at 18:53

If all replicas of one or more blocks of a file become unavailable, the file is considered corrupt, and any attempt to access it will raise an exception. To check the health of the Hadoop filesystem, Hadoop has an "fsck" command, like Linux. fsck generates a summary report that lists the [...]

RishiYadav
05/06/2013 at 19:12

Hadoop stores data in the form of blocks. Each block is replicated across the cluster according to the replication factor, which is 3 by default. The default block size is 64 MB. A file is divided into blocks when it is moved into the cluster. A file can be divided into multiple blocks, but one block can [...]
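
The arithmetic behind these defaults can be worked through in a few lines of Python. This is a rough sketch using the 64 MB block size and replication factor of 3 stated above. The function name block_layout is hypothetical, and the sketch assumes the final block only occupies as much space as the data it holds.

```python
import math

BLOCK_SIZE_MB = 64       # default block size stated above
REPLICATION_FACTOR = 3   # default replication factor stated above


def block_layout(file_size_mb, block_size_mb=BLOCK_SIZE_MB,
                 replication=REPLICATION_FACTOR):
    """Estimate how a file of the given size is split and replicated."""
    blocks = math.ceil(file_size_mb / block_size_mb)
    # The last block holds whatever is left after the full blocks.
    last_block_mb = file_size_mb - (blocks - 1) * block_size_mb if blocks else 0
    return {
        "blocks": blocks,
        "last_block_mb": last_block_mb,
        "block_replicas": blocks * replication,
        "raw_storage_mb": file_size_mb * replication,
    }


print(block_layout(200))
```

For a 200 MB file this gives 4 blocks (three of 64 MB and one of 8 MB), 12 block replicas across the cluster, and about 600 MB of raw storage.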

RishiYadav
04/30/2013 at 13:26

One easy way to differentiate between the old and new Hadoop APIs is by package. Old API packages are identifiable by mapred, or to put it precisely, they are subpackages of the org.apache.hadoop.mapred package; new API packages are identifiable by mapreduce, or to put it precisely, subpackages of [...]
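
The package rule can be expressed as a simple prefix check on a fully qualified class name. The following Python sketch is illustrative only. The function name hadoop_api_generation is hypothetical, and the new API prefix org.apache.hadoop.mapreduce is taken from the Hadoop class library rather than from the truncated text above.

```python
OLD_API_PREFIX = "org.apache.hadoop.mapred."
NEW_API_PREFIX = "org.apache.hadoop.mapreduce."


def hadoop_api_generation(class_name):
    """Classify a fully qualified class name as old or new MapReduce API."""
    # The trailing dots keep "mapred" from also matching "mapreduce".
    if class_name.startswith(NEW_API_PREFIX):
        return "new"
    if class_name.startswith(OLD_API_PREFIX):
        return "old"
    return "unknown"


for name in ("org.apache.hadoop.mapred.JobConf",
             "org.apache.hadoop.mapreduce.Job"):
    print(name, "->", hadoop_api_generation(name))
```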

RishiYadav
04/25/2013 at 19:41

delete from Users where (rowid, email_address) not in (select min(rowid), email_address from Users group by email_address);
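
The statement keeps the row with the lowest rowid for each email address and deletes the rest. Its effect can be tried out on a small in-memory table. The sketch below uses SQLite through Python's sqlite3 module, which is an assumption made for illustration since the original database is not named. It compares on rowid alone, which is sufficient because rowid already identifies a single row. The sample names and addresses are made up.

```python
import sqlite3


def dedupe_users(conn):
    """Delete all but the lowest-rowid row for each email address."""
    conn.execute(
        "DELETE FROM Users WHERE rowid NOT IN "
        "(SELECT MIN(rowid) FROM Users GROUP BY email_address)"
    )
    conn.commit()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Users (name TEXT, email_address TEXT)")
conn.executemany(
    "INSERT INTO Users VALUES (?, ?)",
    [("Ann", "ann@example.com"),
     ("Ann B", "ann@example.com"),   # duplicate address, inserted later
     ("Bob", "bob@example.com")],
)
dedupe_users(conn)
rows = conn.execute(
    "SELECT name, email_address FROM Users ORDER BY rowid").fetchall()
print(rows)
```

After the delete, only the first "Ann" row and the "Bob" row remain.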

RishiYadav
04/24/2013 at 16:45

With so many Hadoop versions floating around, choosing one becomes a challenge. The choice is simple, though: always go for the most stable version, which at the time of this writing is 1.0.4. YARN, which is also called Hadoop 2, is making progress towards its first stable release. YARN [...]

RishiYadav
04/20/2013 at 10:22

This error mostly occurs when you move from pseudo-distributed mode to distributed mode. The cause is simple: the listening ports for Hadoop are not open. If you are using Ubuntu and the NameNode is running on port 8020, the command to open the port is: ufw allow 8020
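
Before and after changing the firewall, it can help to confirm from another machine whether the port is actually reachable. A minimal Python sketch of such a check follows. The function name is_port_open and the hostname in the usage comment are hypothetical, and a False result only says the connection failed, not why.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port can be established."""
    try:
        # create_connection raises OSError on refusal, timeout, or DNS failure.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Example (hostname assumed for illustration):
# is_port_open("namenode.example.com", 8020)
```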

RishiYadav
04/08/2013 at 21:47