The Google File System is not as state-of-the-art as you may think. It is not a distributed file system in the strict sense (unlike Global File System or GPFS). Its model has three components: the clients that use the file system, a master server that does housekeeping and holds metadata,...
Kandula et al. (2009) The Nature of Datacenter Traffic: Measurements & Analysis (IMC)
A paper from Microsoft on data center traffic patterns.
Gu and Grossman (2007) UDT: UDP-based Data Transfer for High-Speed Wide Area Networks (ComNet)
Proposes a transport protocol built on UDP to improve throughput on networks with a high bandwidth-delay product (BDP).
Dean and Ghemawat (2004) MapReduce: Simplified Data Processing on Large Clusters (OSDI)
The Google paper on MapReduce.
Gu and Grossman (2009) Lessons Learned From a Year's Worth of Benchmarks of Large Data Clouds (MTAGS)
Sector/Sphere, an alternative implementation of MapReduce-style computation from the University of Illinois at Chicago.