
HDFS Federation

The Single NameNode Bottleneck

In a standard HDFS deployment, a single NameNode stores the entire namespace in memory. As clusters grow to hundreds of petabytes, this becomes a bottleneck in three ways:

Limit               Impact
RAM capacity        Namespace size capped by NameNode heap (~1 billion files per 200 GB RAM)
Single namespace    All tenants share one directory tree; no isolation
Throughput          All metadata operations serialized through one JVM
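
As a quick sanity check on the first row: 200 GB of heap spread over 1 billion files is about 200 bytes per file, in line with the common guidance that each file, together with its block entries, costs the NameNode a few hundred bytes of memory.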

HDFS Federation solves this by allowing multiple independent NameNodes, each managing its own portion of the namespace.

How Federation Works

┌────────────────────────────────────────────────────────┐
│                    HDFS Federation                     │
│                                                        │
│   NameNode A          NameNode B          NameNode C   │
│   /user /tmp       /data /warehouse          /logs     │
│        │                  │                   │        │
│  Block Pool A       Block Pool B         Block Pool C  │
│        │                  │                   │        │
└────────┼──────────────────┼───────────────────┼────────┘
         │                  │                   │
┌────────▼──────────────────▼───────────────────▼────────┐
│                Shared DataNode Cluster                 │
│     (DataNodes report all block pools to all NNs)      │
└────────────────────────────────────────────────────────┘

Key concepts:

  • Block Pool: Each NameNode owns a block pool: the set of blocks it manages on the DataNodes. DataNodes serve all block pools simultaneously (see the directory listing after this list).
  • Namespace Volume: A NameNode + its block pool together form a namespace volume. Failure of one does not affect others.
  • ViewFs: A client-side mount table that maps paths to the correct NameNode (hdfs://nn-a/user, hdfs://nn-b/data, etc.).
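
On disk, each block pool appears as its own subdirectory under every DataNode storage directory. A hypothetical listing (the storage path and pool IDs here are illustrative; the BP-<id>-<NameNode IP>-<creation time> naming scheme is real):

# On any DataNode, one BP-* directory per block pool
ls /data/1/dfs/dn/current/
# BP-1073741825-10.0.0.11-1700000000000   <- NameNode A's pool
# BP-1073741826-10.0.0.12-1700000000001   <- NameNode B's pool
# VERSION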

Configuration

hdfs-site.xml — enable federation

<!-- Define two NameNode services -->
<property>
  <name>dfs.nameservices</name>
  <value>ns1,ns2</value>
</property>

<!-- NameNode for ns1 -->
<property>
  <name>dfs.namenode.rpc-address.ns1</name>
  <value>namenode1.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns1</name>
  <value>namenode1.example.com:9870</value>
</property>

<!-- NameNode for ns2 -->
<property>
  <name>dfs.namenode.rpc-address.ns2</name>
  <value>namenode2.example.com:8020</value>
</property>
<property>
  <name>dfs.namenode.http-address.ns2</name>
  <value>namenode2.example.com:9870</value>
</property>
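
A quick sanity check that clients can see both nameservices is to echo the configuration back with hdfs getconf:

# Should print: ns1,ns2
hdfs getconf -confKey dfs.nameservices

# Resolve the RPC address of each NameNode
hdfs getconf -confKey dfs.namenode.rpc-address.ns1
hdfs getconf -confKey dfs.namenode.rpc-address.ns2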

core-site.xml — ViewFs mount table

<property>
  <name>fs.defaultFS</name>
  <value>viewfs://clusterX</value>
</property>

<!-- Mount /user and /tmp to NameNode 1 -->
<property>
  <name>fs.viewfs.mounttable.clusterX.link./user</name>
  <value>hdfs://ns1/user</value>
</property>
<property>
  <name>fs.viewfs.mounttable.clusterX.link./tmp</name>
  <value>hdfs://ns1/tmp</value>
</property>

<!-- Mount /data to NameNode 2 -->
<property>
  <name>fs.viewfs.mounttable.clusterX.link./data</name>
  <value>hdfs://ns2/data</value>
</property>

With ViewFs configured, clients use normal paths (/user/alice, /data/sales) and the client library transparently routes to the correct NameNode.
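
For example, with the mount table above in effect, everyday commands need no changes; the path prefix alone selects the NameNode:

# Served by ns1 (mounted at /user)
hdfs dfs -ls /user/alice

# Served by ns2 (mounted at /data)
hdfs dfs -put sales.csv /data/sales/

# Fully qualified equivalent of the first command
hdfs dfs -ls viewfs://clusterX/user/alice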

Format and Start

Format each NameNode separately, passing the same cluster ID to each so that DataNodes can register with all of them:

# On namenode1: format with an explicit cluster ID
hdfs namenode -format -clusterId hadoop-cluster-01

# On namenode2: format with the same cluster ID
# (-force skips the interactive confirmation prompt)
hdfs namenode -format -clusterId hadoop-cluster-01 -force

# Start the daemons
hdfs --daemon start namenode   # run on each NameNode host
hdfs --daemon start datanode   # run on every DataNode host
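
To confirm that the DataNodes registered with both namespaces, run a report against each NameNode (via the -fs generic option); both reports should list the same set of DataNodes:

hdfs dfsadmin -fs hdfs://ns1 -report
hdfs dfsadmin -fs hdfs://ns2 -report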

Balancer Across Pools

The HDFS Balancer operates per namespace. Run it against each NameNode independently:

hdfs balancer -fs hdfs://ns1
hdfs balancer -fs hdfs://ns2
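
In a federated cluster the balancer also accepts a -policy flag: datanode (the default) balances storage at the DataNode level, while blockpool balances each block pool as well:

# Balance each block pool, not just overall DataNode utilization
hdfs balancer -fs hdfs://ns1 -policy blockpool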

When to Use Federation

  • Namespace exceeds ~500 million files on a single NameNode
  • Multiple teams/projects need isolated namespaces with separate quotas
  • You need to split metadata throughput across multiple NameNodes
  • Planning for multi-tenant cluster with chargeback per namespace

For clusters under 300–400 million files, standard High Availability (active/standby NameNode pair) is sufficient and simpler to operate.