I recently rebuilt my Ceph lab to more closely mirror a real production deployment, rather than the usual "it works, but don't look too closely" lab setup.
The goals were simple but non-negotiable:
- 3 MONs (odd quorum)
- 2 MGRs (HA control plane)
- Host-level fault domain
- Replication size = 3
- RGW (S3) only — no CephFS, no RBD
- Clean DNS (no /etc/hosts hacks)
This post walks through the exact process I used to deploy a clean, repeatable Ceph RGW cluster using cephadm on Ubuntu, with explicit placement control and zero surprises.
Cluster Design – Nodes & IPs
Monitor / Manager Nodes (Control Plane)
- ceph-mon01 — MON + MGR — 172.16.1.81
- ceph-mon02 — MON + MGR — 172.16.1.82
- ceph-mon03 — MON + MGR — 172.16.1.83
RGW (S3 Gateway)
- ceph-rgw01 — RGW Gateway — 172.16.1.86
OSD Storage Nodes
- ceph-osd01 — OSD Node (6 × 80 GB disks) — 172.16.1.91
- ceph-osd02 — OSD Node (6 × 80 GB disks) — 172.16.1.92
- ceph-osd03 — OSD Node (6 × 80 GB disks) — 172.16.1.93
Cluster Requirements
- Ceph deployed via cephadm
- Object storage only (RGW / S3)
- Replication size = 3
- Root SSH enabled (lab convenience)
- DNS must exist (forward + reverse) for all hosts
- ❌ /etc/hosts is not allowed for Ceph nodes
Prepare All Nodes
Set FQDN Hostnames (run the correct command on each node)
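The original commands aren't reproduced here, but on Ubuntu this is one hostnamectl call per node. The domain suffix below is a placeholder; use whatever zone your DNS actually serves, and make sure the FQDN matches the forward record exactly.

```bash
# On ceph-mon01 (repeat on each node with its own name).
# "lab.example.com" is a placeholder domain -- substitute your real DNS zone.
hostnamectl set-hostname ceph-mon01.lab.example.com
hostname -f   # confirm the node now reports its FQDN
```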
DNS Sanity Check (run on every node)
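Something along these lines is what I'd run: forward and reverse lookups must both succeed for every host in the tables above (the names and IPs below come straight from the cluster design).

```bash
# Forward lookups: every cluster host should resolve by name
for h in ceph-mon01 ceph-mon02 ceph-mon03 ceph-rgw01 ceph-osd01 ceph-osd02 ceph-osd03; do
  getent hosts "$h" || echo "FORWARD LOOKUP FAILED: $h"
done

# Reverse lookups: every IP should map back to a hostname
for ip in 172.16.1.81 172.16.1.82 172.16.1.83 172.16.1.86 172.16.1.91 172.16.1.92 172.16.1.93; do
  name=$(dig +short -x "$ip")
  if [ -n "$name" ]; then echo "$ip -> $name"; else echo "REVERSE LOOKUP FAILED: $ip"; fi
done
```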
If this doesn’t work, stop here — Ceph will absolutely punish bad DNS later.
Install Required Packages
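The exact package list from the original isn't shown; cephadm's documented prerequisites on Ubuntu boil down to a container runtime, LVM, Python, and time sync, so a reasonable stand-in is:

```bash
# Prerequisites for cephadm-managed nodes (run on every node)
apt update
apt install -y podman lvm2 chrony python3 curl
systemctl enable --now chrony   # Ceph is unforgiving about clock skew
```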
Enable Root SSH (lab choice)
Ensure:
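The exact sshd setting isn't quoted above; what matters is that root can log in with a key, since cephadm will manage the nodes over SSH as root in this lab.

```bash
# Allow root SSH logins (lab only -- do not do this in production)
sed -i 's/^#\?PermitRootLogin.*/PermitRootLogin yes/' /etc/ssh/sshd_config
systemctl restart ssh   # Ubuntu's sshd service unit is named "ssh"
```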
Bootstrap Ceph (on ceph-mon01 only)
Install cephadm
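The post doesn't say which install method was used; on Ubuntu the distro package is the simplest route (the upstream installer script from ceph.com works too).

```bash
# Install the cephadm bootstrap tool on ceph-mon01
apt install -y cephadm
cephadm version
```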
Bootstrap the cluster
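A minimal bootstrap against mon01's IP from the table above; any extra flags (dashboard credentials, registry settings, and so on) are up to you.

```bash
# Bootstrap the first MON + MGR on ceph-mon01
cephadm bootstrap --mon-ip 172.16.1.81
```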
Temporary placement safety (important!)
This prevents Ceph from temporarily placing MON/MGR daemons on random hosts before we apply labels.
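I'd pin both services to the bootstrap host until the labels exist, roughly like this (the host name must match how the node is registered, so check `ceph orch host ls` first):

```bash
# Pin MON and MGR to the bootstrap node until host labels are in place
ceph orch apply mon --placement="ceph-mon01"
ceph orch apply mgr --placement="ceph-mon01"
```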
Fix SSH Key Distribution
Cephadm generates its SSH key here:
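With a default bootstrap, the cluster's public key lands in the standard location under /etc/ceph:

```bash
ls -l /etc/ceph/ceph.pub
```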
Copy it to all nodes:
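A loop like the one below does it; `ssh-copy-id -f -i` with the cluster public key is the documented way to authorize cephadm on new hosts.

```bash
# Push the cephadm public key to every other cluster node
for host in ceph-mon02 ceph-mon03 ceph-rgw01 ceph-osd01 ceph-osd02 ceph-osd03; do
  ssh-copy-id -f -i /etc/ceph/ceph.pub root@"$host"
done
```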
➕ Add Hosts to the Cluster
From the cephadm shell on mon01:
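The original invocations aren't reproduced here; adding each host together with its IP keeps the orchestrator independent of whatever resolver the shell container sees. Use the hostnames exactly as each node reports them (short names shown below for readability).

```bash
# (inside "cephadm shell" on ceph-mon01)
ceph orch host add ceph-mon02 172.16.1.82
ceph orch host add ceph-mon03 172.16.1.83
ceph orch host add ceph-rgw01 172.16.1.86
ceph orch host add ceph-osd01 172.16.1.91
ceph orch host add ceph-osd02 172.16.1.92
ceph orch host add ceph-osd03 172.16.1.93

ceph orch host ls   # all 7 hosts should be listed
```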
Label Hosts (this is where control happens)
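The label names below (mon, mgr, rgw, osd) and the choice of which two hosts carry mgr are my assumptions, not something the post dictates; any scheme works as long as the placement specs later reference the same labels.

```bash
# Control-plane labels
for h in ceph-mon01 ceph-mon02 ceph-mon03; do
  ceph orch host label add "$h" mon
done
ceph orch host label add ceph-mon01 mgr
ceph orch host label add ceph-mon02 mgr

# Gateway and storage labels
ceph orch host label add ceph-rgw01 rgw
for h in ceph-osd01 ceph-osd02 ceph-osd03; do
  ceph orch host label add "$h" osd
done
```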
Deploy Additional MONs
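With labels in place, expanding the monitor set can be driven by the label instead of hard-coded hosts (assuming the `mon` label from the previous step):

```bash
# Grow the monitor set to every host carrying the "mon" label
ceph orch apply mon --placement="label:mon"
ceph orch ps --daemon-type mon   # watch the new MONs come up
```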
Lock MON and MGR Placement (don’t skip this)
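Pinning both the daemon count and the explicit hosts is the belt-and-braces version; the mgr host pair matches the labels I assumed earlier.

```bash
# Exactly 3 MONs on exactly these hosts
ceph orch apply mon --placement="3 ceph-mon01 ceph-mon02 ceph-mon03"

# Exactly 2 MGRs (one active, one standby)
ceph orch apply mgr --placement="2 ceph-mon01 ceph-mon02"
```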
This guarantees:
- Exactly 3 MONs
- Exactly 2 MGRs
- Zero daemon drift
Verify:
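```bash
ceph orch ls                      # mon should show 3/3, mgr 2/2 against their placements
ceph orch ps --daemon-type mon    # one mon daemon per mon host
ceph orch ps --daemon-type mgr    # two mgr daemons
ceph -s                           # quorum and active/standby mgr at a glance
```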
Deploy OSDs
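The original drive spec isn't shown; for a lab where every blank disk should become an OSD, the simplest correct option is the all-available-devices spec (a drivegroup YAML gives finer control if you need it).

```bash
# Consume every clean, unused disk on every managed host as an OSD
ceph orch apply osd --all-available-devices

# Watch the OSD daemons get created across the three osd hosts
ceph orch ps --daemon-type osd
```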
You should end up with 18 OSDs total (6 per node).
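A quick way to confirm the count and the host-level CRUSH layout:

```bash
ceph osd stat   # expect "18 osds: 18 up ..., 18 in ..."
ceph osd tree   # each host bucket should hold 6 OSDs
```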
☁️ Deploy RGW (S3)
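A single RGW daemon pinned to the gateway host; the service name `s3` and the port are my choices here, not something the post mandates.

```bash
# One RGW daemon on ceph-rgw01, listening on port 8080 (name and port are arbitrary)
ceph orch apply rgw s3 --placement="1 ceph-rgw01" --port 8080

ceph orch ps --daemon-type rgw   # confirm the gateway is running
```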
Set Replication Size = 3
RGW system pools (after RGW deploy)
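Once RGW is running it creates its system pools (.rgw.root, default.rgw.log, and friends); looping over everything with "rgw" in the name avoids missing one. min_size 2 is the usual companion to size 3, added here as my own choice.

```bash
# Set replica count on every RGW-related pool that exists so far
for pool in $(ceph osd pool ls | grep rgw); do
  ceph osd pool set "$pool" size 3
  ceph osd pool set "$pool" min_size 2
done

ceph osd pool ls detail | grep rgw   # verify size 3 / min_size 2 everywhere
```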
Defaults for future bucket pools
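Bucket index and data pools are created lazily on first use, so set the cluster-wide defaults now; these config options apply to any pool created afterwards.

```bash
# Any pool created from now on defaults to 3 replicas (min 2 to stay writable)
ceph config set global osd_pool_default_size 3
ceph config set global osd_pool_default_min_size 2
```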
✅ Final Result
At this point you have:
- 3-node MON quorum
- HA MGR (1 active, 1 standby)
- Host-level CRUSH failure domain
- All RGW pools replicated ×3
- Future buckets inherit correct durability
- A Ceph lab that behaves like production
If you’ve ever wondered why Ceph gets a bad reputation — it’s usually because people skip steps like DNS, placement locking, or CRUSH validation. Don’t do that.
Happy clustering!