Hadoop Distributed File System

  • Formal
  • The Hadoop Distributed File System (HDFS) is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications.
  • Practical
  • In a large cluster, thousands of servers both host directly attached storage and execute user application tasks. By distributing storage and computation across many servers, the resource can grow with demand while remaining economical at every size. HDFS stores file system metadata on a dedicated server called the NameNode; application data are stored on other servers called DataNodes. All servers are fully connected and communicate with each other using TCP-based protocols. Unlike Lustre and PVFS, the DataNodes in HDFS do not rely on data protection mechanisms such as RAID to make the data durable. Instead, like GFS, HDFS replicates file content on multiple DataNodes for reliability. Besides ensuring durability, this strategy has the added advantage that data transfer bandwidth is multiplied, and there are more opportunities to locate computation near the needed data.
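
The NameNode/DataNode split and per-file replication are visible directly in the HDFS client API. Below is a minimal sketch in Java using the standard org.apache.hadoop.fs.FileSystem interface; it assumes a hadoop-client dependency on the classpath and a reachable NameNode (the address hdfs://namenode:8020 and the path /user/example/hello.txt are placeholders, not values from this article).

```java
import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.fs.FSDataOutputStream;
import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;

public class HdfsWriteExample {
    public static void main(String[] args) throws Exception {
        Configuration conf = new Configuration();
        // Hypothetical NameNode address; replace with your cluster's fs.defaultFS.
        conf.set("fs.defaultFS", "hdfs://namenode:8020");
        // Client-requested replication factor: each block of the file is
        // copied to three DataNodes (the HDFS default).
        conf.set("dfs.replication", "3");

        FileSystem fs = FileSystem.get(conf);
        Path path = new Path("/user/example/hello.txt");

        // The client asks the NameNode (metadata server) where to place blocks,
        // then streams the data to a pipeline of DataNodes over TCP.
        try (FSDataOutputStream out = fs.create(path)) {
            out.writeUTF("Hello, HDFS!");
        }

        // The replication factor can also be changed per file after creation.
        fs.setReplication(path, (short) 2);

        fs.close();
    }
}
```

Note that replication is a per-file attribute rather than a cluster-wide constant, which is what lets operators trade storage overhead for read bandwidth and data locality on a file-by-file basis.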
