
The Hadoop write path is slightly more complicated than the read path.

First, the client library sends a request to the NameNode to create a file with the given name. After checking permissions, the NameNode creates the filesystem metadata for the file. No blocks are allocated yet.

The response tells the client that the request to open the file was successful and that it can start writing data.
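
The create step can be illustrated with a small toy model. This is only a sketch in plain Python, not the Hadoop API: the class ToyNameNode, its create method, and the set of allowed writers are hypothetical names made up for illustration. It shows the one point that matters here: after a successful create, the file has metadata but an empty block list.

```python
class ToyNameNode:
    """Hypothetical stand-in for the NameNode's namespace (not the real HDFS API)."""

    def __init__(self, allowed_writers):
        self.allowed_writers = set(allowed_writers)
        self.namespace = {}  # path -> metadata

    def create(self, path, user):
        # Permission check happens before any metadata is written.
        if user not in self.allowed_writers:
            raise PermissionError(f"{user} may not write {path}")
        # Metadata only: the block list stays empty until data is written.
        self.namespace[path] = {"owner": user, "blocks": []}
        return True  # tells the client it may start writing


namenode = ToyNameNode(allowed_writers=["alice"])
ok = namenode.create("/logs/app.log", "alice")
print(ok, namenode.namespace["/logs/app.log"]["blocks"])
```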

The client starts writing data to the stream, which is split into packets that are queued in memory.
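
As a rough illustration of the packet queue, the sketch below splits a byte string into fixed-size packets and queues them in order. The function name enqueue_packets and the use of a deque are assumptions for illustration; the 64 KB packet size is also an assumption chosen for the example, and the real client does more work per packet (checksums, headers).

```python
from collections import deque

PACKET_SIZE = 64 * 1024  # assumed packet size for this sketch


def enqueue_packets(data: bytes, packet_size: int = PACKET_SIZE) -> deque:
    """Split data into packets of at most packet_size bytes, queued in order."""
    queue = deque()
    for seq, offset in enumerate(range(0, len(data), packet_size)):
        # Each queue entry is (sequence number, payload).
        queue.append((seq, data[offset:offset + packet_size]))
    return queue


data_queue = enqueue_packets(b"x" * 150_000)
print([(seq, len(payload)) for seq, payload in data_queue])
# -> [(0, 65536), (1, 65536), (2, 18928)]
```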

A separate thread in the client consumes this queue and asks the NameNode for a set of DataNodes to which the first block will be written. The client connects to the first DataNode in the list, which connects to the second, and so on, forming a replication pipeline. Data packets are sent to the first DataNode, which forwards them along the pipeline to the second and third DataNodes. Once a packet has been written, acknowledgements travel back through the pipeline to the client.
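
The replication pipeline can be sketched as a chain of nodes, each storing a packet and forwarding it to the next. This is a simplified in-memory model, not Hadoop code: the class ToyDataNode and the helper build_pipeline are hypothetical, and the sketch assumes that acknowledgements are collected on the way back up the chain, so the client sees one combined acknowledgement per packet.

```python
class ToyDataNode:
    """Hypothetical datanode that stores a packet and forwards it downstream."""

    def __init__(self, name, downstream=None):
        self.name = name
        self.downstream = downstream
        self.stored = []

    def receive(self, packet):
        self.stored.append(packet)
        # Forward first; the downstream acks come back before this node acks.
        acked = self.downstream.receive(packet) if self.downstream else []
        return acked + [self.name]


def build_pipeline(names):
    """Link the nodes so that names[0] is the head the client connects to."""
    node = None
    for name in reversed(names):
        node = ToyDataNode(name, downstream=node)
    return node


head = build_pipeline(["dn1", "dn2", "dn3"])
acks = head.receive(b"packet-0")
print(acks)  # -> ['dn3', 'dn2', 'dn1']
```

The client only ever talks to the head of the chain, which is why it needs just one connection no matter how many replicas are written.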

This continues until the block is full, at which point the client asks the NameNode for another set of DataNodes for the next block.
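
The per-block loop might look like the sketch below. Everything here is a toy model: BlockAllocator, add_block, and write_file are hypothetical names, and the round-robin choice of datanodes is a simplification made for this example, not how the NameNode actually places replicas. The point is only that the client goes back to the NameNode once per block.

```python
class BlockAllocator:
    """Hypothetical NameNode stand-in that hands out datanodes per block."""

    def __init__(self, datanodes, replication=3):
        self.datanodes = datanodes
        self.replication = replication
        self.blocks = []  # one list of target datanodes per allocated block

    def add_block(self):
        # Round-robin placement: an assumption for illustration only.
        start = len(self.blocks) % len(self.datanodes)
        targets = [
            self.datanodes[(start + i) % len(self.datanodes)]
            for i in range(self.replication)
        ]
        self.blocks.append(targets)
        return targets


def write_file(allocator, total_bytes, block_size):
    """Request a new set of datanodes each time a block fills up."""
    written = 0
    while written < total_bytes:
        allocator.add_block()  # one NameNode request per block
        # Writing the packets through the pipeline is omitted here.
        written += min(block_size, total_bytes - written)
    return allocator.blocks


allocator = BlockAllocator(["dn1", "dn2", "dn3", "dn4"])
for targets in write_file(allocator, total_bytes=300, block_size=128):
    print(targets)
```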

Rishi Yadav
04/10/2013 at 18:13