My old friend Brian Moses usually publishes one or two DIY NAS builds each year. If you’re looking to build your own storage server for your home office, and you’re interested in having a shopping list that will just work, you should check out Brian’s builds. He usually alternates between an economical DIY NAS build and an insane, top-dollar NAS build.

If you have questions that aren’t answered in his build blogs, you’ve come to the right place. I’m sure I’m going to be repeating things Brian has said over the years. I will be expanding on some of those things, and hopefully touching on some other topics.

Some of Brian's NAS server builds

These are tips and general rules of thumb. I am not aiming for precision. When you’re done reading this post, I want you to be able to quickly make comparisons between things. I don’t want you to have to do math. I want the decisions you have to make to seem obvious.

Most of what I say in this post will be true 90% of the time. There are always exceptions. There will always be use cases where this information will break down. I’ll do my best to make note of these occasions, but we won’t be diving deep into any of those situations here.

A hard disk is faster than Gigabit Ethernet

Network speeds are always published in bits per second. Storage speeds are always published in bytes per second. This makes for some awkward misunderstandings when people say things like “gigs per second” without qualifying whether they’re talking about bits or bytes.

You might see that your hard disk tops out at “150 meg per second,” and assume that it will only use 15% of the bandwidth on your “1000 meg per second” Gigabit Ethernet network. As it turns out, there are eight bits in a byte, so you have to multiply the hard disk’s 150 megabytes per second throughput by eight to see that it is pumping out 1.2 gigabits per second’s worth of data.

Hot swap bays

There’s other overhead to account for, too. Each layer of the network stack whittles away at your theoretical 1,000 megabits per second, and the file-sharing layer eats up the most.

A good rule of thumb to convert from raw network speed to storage speed is to just divide by ten. Your Gigabit Ethernet network can move files around at about 100 megabytes per second.
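If you’d rather make the computer do the arithmetic, here are both of those conversions as a tiny Python sketch. The numbers are the rules of thumb from this post, not benchmarks:

    def megabits_to_megabytes(megabits_per_second):
        # Eight bits per byte.
        return megabits_per_second / 8

    def usable_file_throughput(megabits_per_second):
        # Rule of thumb: divide raw network speed by ten to cover
        # protocol and file-sharing overhead.
        return megabits_per_second / 10

    print(megabits_to_megabytes(1000))   # 125.0 MB/s -- theoretical Gigabit
    print(usable_file_throughput(1000))  # 100.0 MB/s -- realistic over Samba or NFS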

You don’t need a big, fancy RAID or SSDs to max out one or two Gigabit Ethernet ports.

A solid-state drive is almost as fast as 10-Gigabit Ethernet

The ancient Crucial M500 SSD in my desktop computer can just about max out a SATA version 2.0 port. My measurements back in 2014 said this drive could manage 300 megabytes per second.

If you use our rule of thumb, you’ll know that my 5-year-old solid-state drive can fill up about 1/3 of a 10-Gigabit Ethernet pipe. SATA version 3.0 is twice as fast, and even the cheapest SSDs can easily reach 500 megabytes per second now.

CX4 cable for InfiniBand

Without accounting for redundancy, you only need two SSDs or six 7200-RPM hard disks to keep up with 10-Gigabit Ethernet.
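Here’s where counts like that come from. The 500 and 170 megabytes per second figures below are ballpark numbers for a cheap SATA SSD and a modern 7200-RPM disk, not measurements of any particular drive:

    import math

    link_mb_per_sec = 1000  # 10-Gigabit Ethernet after the divide-by-ten rule

    for name, drive_mb_per_sec in [("SATA SSD", 500), ("7200-RPM disk", 170)]:
        count = math.ceil(link_mb_per_sec / drive_mb_per_sec)
        print(f"{count} x {name} to fill a 10-Gigabit pipe")  # 2 SSDs, 6 disks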

10-Gigabit Ethernet is slower than many NVMe drives

Our rule of thumb says 10-Gigabit Ethernet will max out at around 1 gigabyte per second. In the real world, it won’t do quite that well, but it will get close.

The Samsung EVO Plus NVMe drive can move data at 3.5 gigabytes per second.

Serving files doesn’t require a gargantuan processor

When I put a 20-gigabit InfiniBand card into my virtual machine host, it was running an AMD Athlon 5350 CPU. This is a slow, laptop-grade, power-sipping CPU. It was slow when it was new in 2015, and it would be even slower today.

I don’t have adequate PCIe slots to fully utilize my ancient InfiniBand cards. They are PCIe 2.0 x8 cards sitting in PCIe 3.0 x4 slots. My available slots just aren’t wide enough, so my tiny InfiniBand network maxes out at around 7.5 gigabits per second.

I know it seems like I’m veering off course here, but I needed those two paragraphs to make sure you understand that the CPU wasn’t much of a bottleneck.

Motherboard with Ryzen 5 1600

My original blog post says I hit 320 megabytes per second, and that I was able to double that throughput by running multiple file-transfer operations at the same time. Let’s call that 600 megabytes per second. That low-end, laptop-grade CPU can manage 600 megabytes per second over Samba.

I have since upgraded that server to a Ryzen 5 1600. I still have the same theoretical limitations from the PCIe bus, so I am not able to break 700 megabytes per second.

If you’re serving files over plain, old Gigabit Ethernet, any Intel or AMD CPU made in the last 10 or 15 years will have no trouble keeping up. Those old processors might not keep up with 10-Gigabit Ethernet, but you absolutely won’t need a high-end processor either!

Streaming video is easy work

I’m always seeing folks asking what kind of crazy hardware they need to stream video around the house. Is Gigabit Ethernet fast enough? Do I need lots of hard drives? Do I need SSDs?

Streaming video doesn’t require much network or disk performance at all. An HD Blu-ray disc maxes out at 35 megabits per second, while Ultra HD Blu-ray discs max out at 128 megabits per second. This doesn’t mean movies are encoded to max out these numbers, but this is the upper limit on what the media can deliver. In fact, the 128 megabits per second number only applies to triple-layer discs. Most movies won’t be hitting numbers like that.

These numbers are tiny! If you have more than six or seven televisions, you might start to run out of bandwidth on your Gigabit Ethernet port, and that’s only if every one of them is pulling a rare 128-megabit-per-second UHD stream. I don’t even know if these sorts of Blu-ray discs exist yet!
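If you want to check my math on that, it only takes a couple of lines. A minimal sketch, using the raw Gigabit number and the worst-case UHD bitrate from above:

    # How many worst-case UHD Blu-ray streams fit through one Gigabit port?
    port_megabits = 1000   # raw Gigabit Ethernet, before any overhead
    stream_megabits = 128  # triple-layer UHD Blu-ray maximum
    print(port_megabits / stream_megabits)  # 7.8 -- call it six or seven after overhead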

Even in the worst-case scenarios, streaming your prerecorded video doesn’t require any heavy lifting.

When streaming isn’t just streaming

A lot of people seem to use Plex for their video-streaming needs. Plex can do plain old streaming. That’s where Plex just pulls the video off the hard drive and sends it directly to the client in its existing form.

Plex is also capable of transcoding video. This allows the Plex server to convert 4K content down to something like 1080p, 720p, or 480p. Maybe one of your televisions can’t decode a 4K video. Maybe your phone is streaming from your Plex server over the Internet, and you just don’t have enough bandwidth to handle the original, full-quality stream.

Depending on your needs, this can require a ton of CPU or GPU power. If you need to send the streams to a phone using your data plan, you probably can’t get around this. If quality in these situations isn’t all that important, you probably don’t need all that much processing power. Encoding to 4K requires about four times as much processing power as encoding to 1080p, and encoding to 1080p requires about six times as much power as encoding to 480p.
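Stack those rough multipliers together and you can see why transcoding budgets balloon. A quick sketch, using nothing but the hand-wavy ratios from the paragraph above:

    # Relative CPU cost of encoding, normalized to 480p = 1.
    cost_480p = 1
    cost_1080p = 6 * cost_480p  # roughly 6x the work of 480p
    cost_4k = 4 * cost_1080p    # roughly 4x 1080p, so about 24x 480p
    print(cost_1080p, cost_4k)  # 6 24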

I’ve seen people build Plex servers with expensive CPUs and GPUs just to stream to lower-spec devices around the house. I think that’s a terrible idea. Set-top boxes that can play 4K content are inexpensive now. They also sip power compared to a beast of a machine that you might need for transcoding!

I guess we can get away with calling this a rule of thumb. If you’re transcoding video, you’re not just building a NAS anymore.

Random access kills performance

Input/output operations per second (IOPS). Seeks per second. This is how we measure random access.

To seek to a random location, a spinning hard disk has to move its read/write head to the correct track. Once the head is in position, the disk has to wait for the data to rotate underneath it. The platter spins at a constant speed, so on average that wait is half a revolution, and there’s no way to hurry it along.

This limits a single hard disk to around 100 to 200 IOPS. This improves a bit as you add more disks to your RAID, but spinning disks will never approach the thousands, tens of thousands, or even hundreds of thousands of IOPS of a solid-state drive.
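Those numbers fall straight out of the mechanics. Here’s a rough estimate, assuming the 8.5-millisecond average seek time that’s typical of a 7200-RPM desktop drive:

    # Back-of-the-envelope IOPS for a spinning disk.
    rpm = 7200
    avg_seek_ms = 8.5                      # typical desktop-drive spec
    half_rotation_ms = 0.5 * 60_000 / rpm  # ~4.2 ms of rotational latency
    print(round(1000 / (avg_seek_ms + half_rotation_ms)))  # about 79 IOPS

Shorter seeks, command queueing, and faster spindles are how real drives claw their way toward the top of that 100-to-200 range.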

Even if you built your NAS without using spinning media, network latency has an impact on your maximum IOPS, too.

Networks and old-school hard drives are good at sequential access, but pretty poor when it comes to random access.

Larger drives have better throughput than smaller drives

This isn’t always true, but it is often true when you’re talking about spinning media.

We’ve been using 7200-RPM hard drives for decades. I want to say my first 7200-RPM disks were a pair of 13 GB IBM Deskstar drives. Every disk spinning at 7200 RPM has the same rotational latency, but the amount of data that passes under the read/write head on each revolution keeps going up.

The math to determine exactly how much faster things get hurts my head, and it really isn’t important. The important thing to remember is that they don’t make any 7200-RPM desktop hard drives today that are slower than Gigabit Ethernet!

This is sometimes true with solid-state drives as well. Larger SSDs tend to have more flash chips to spread reads and writes across. You can think of it as having a RAID of flash chips attached to your SATA port.

Keep this in mind when you’re shopping, but don’t assume that it is true. A 2-terabyte drive from five years ago might have four 500-gigabyte platters. A similar drive manufactured today might only have a single 2-terabyte platter. The newer drive should be faster.

Don’t make assumptions. If performance truly matters, pay attention to benchmarks.

Don’t buy identical drives!

When you buy a Dell or HP server, it will be loaded up with a stack of identical drives. In fact, there’s a good chance that most servers in every rack in your data center have the same make and model of hard drive.

This isn’t a big deal when you’re operating at that kind of scale. We aren’t that big. We’re building a homelab NAS. We don’t have shelves full of spare parts. We don’t have a tech who will show up in four hours with a replacement drive. We need things to keep working.

If you buy six identical drives, you might get lucky. On the other hand, you might draw the short straw. Every now and then there’s a bad batch of drives, or an entire product line that winds up being terrible. When you start reading the news that your choice of hard drive has a defect, and they’re failing like crazy, you’ll start to worry about your data.

Hedge your bets. Don’t buy six identical drives. Buy from two or three different manufacturers. When you read the news that the fast, reasonably priced IBM Deskstar drives that you bought are destined to succumb to an early death, you won’t have to freak out. Only two of your six drives are Deskstars, and you have a RAID 6. Both Deskstars could fail at the same time, and you won’t even notice!

Assume disks will fail

Disks fail all the time. Backblaze posts fantastic write-ups about their hard disk failure rates, and they have a huge number of disks. Their failure rates are usually under 5% for the first two or three years, then the chance of failure increases quite a bit.

I’m not even going to attempt to do the math here. I’m terrible at math.

If you have one disk with a 5% chance of failure, that’s not too bad. If you have 10 disks, and each disk has a 5% chance of failing, then the odds of losing one in the first year or two are quite high.
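If you do want to see the arithmetic, it’s a one-liner. A minimal sketch, assuming each disk fails independently at that 5% annual rate:

    # Probability that at least one of ten disks fails in a year,
    # assuming independent 5% annual failure rates.
    disks, annual_failure_rate = 10, 0.05
    p_at_least_one = 1 - (1 - annual_failure_rate) ** disks
    print(f"{p_at_least_one:.0%}")  # 40%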

Assume drives will fail. Budget accordingly.

RAID is not a backup!

RAID helps prevent downtime. RAID doesn’t do much of anything to ensure the integrity of your data.

If you have a single drive, and it fails, your server is going down. You’ll have to replace the drive, reinstall your operating system, and restore from backup.

If a disk in your RAID fails, the server keeps on ticking. You can swap in a new disk, the array will rebuild, and you’ll be ready for your next disk failure. If you don’t have hot-swap-capable gear, you might have to shut down the server to replace the drive. This is a minor inconvenience compared to restoring from backup!

RAID does not protect you from:

  • Disk controller failure
  • A bug in your SATA driver
  • Too many disks failing
  • Accidentally deleting things
  • Buggy software deleting or corrupting data
  • Ransomware
  • Malicious users messing with your stuff

If your data is worth storing, then it is worth backing up. Off-site backups are even better!

Data you spend time creating is worth backing up

Those photos and videos of your kids? You should absolutely back those up. One or more copies of important stuff like that had better be stored at an off-site location. Those copies could be in the cloud, or on a hard drive stashed at someone else’s house. You can’t go back and take those photos again.

Pat's Backups on his Seafile server

On the other hand, you don’t need to back up your Steam library. Grand Theft Auto V is 91 gigabytes. If you lose your local copy, it doesn’t matter. You just have to log back into Steam, click the install button, and before long you’ll have your game reinstalled.

Here’s our rule of thumb. If you created the data, it is worth backing up. Photos, videos, save games, spreadsheets, or blog posts. If you spent your valuable time creating a file, you should be backing it up, and at least one of those backups should be stored somewhere outside your home.

Choosing a RAID level

I have an entire blog post dedicated to the topic of choosing the right RAID level for your needs, so I’m going to keep this brief.

RAID 5 is unsafe. The odds of encountering a read error while rebuilding a RAID 5 array increased to well above 90% when hard disks reached the 300- to 400-GB range. RAID 6 works just like RAID 5, except you lose one more disk’s worth of space to redundancy. Use RAID 6 instead.

Here’s another somewhat accurate rule of thumb for you. RAID 6 read performance is similar to RAID 0. Every time you add a drive, read speeds increase. Use twice as many disks, and you’ll see roughly twice as much performance. RAID 6 write speeds tend to feel about as fast as a single drive.
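Here’s that rule of thumb expressed as a few lines of Python. These are ballpark estimates for comparing configurations, not benchmarks, and real arrays will vary:

    def raid6_estimate(disks, drive_mb_per_sec):
        # Rule of thumb: reads stripe across the array like RAID 0,
        # writes feel roughly like a single drive, and two disks'
        # worth of space goes to parity.
        usable_disks = disks - 2
        read_mb = disks * drive_mb_per_sec
        write_mb = drive_mb_per_sec
        return usable_disks, read_mb, write_mb

    print(raid6_estimate(6, 170))  # (4, 1020, 170) for six 7200-RPM disks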

Use RAID 6 or RAID-Z2 if you’re storing backups, videos, or photos.

If you need that extra write performance, choose RAID 10. If you only have two disks, use RAID 1.

Never use RAID 0.

Hot-swap bays are overrated

When I was a professional server and network engineer, hot-swap bays were awesome. We had racks full of servers from the same manufacturer. They all used the same disk enclosures, and we only had two or three different sizes of disk. When a drive died, you grabbed a fresh drive from the shelf, plugged it in place of the failed disk, and waited for someone to pick up the dead drive.

It doesn’t work like that at home. I have one server that could benefit from hot-swap bays. Even if I had two machines here at home, odds are good that they’d have completely different enclosures and use different sizes of disk.

The manufacturer isn’t going to put my replacement drives into sleds for me. I’m going to have to pull a disk out of the server, unscrew it from the sled, screw the fresh disk into the sled, then slide it back into the case.

It isn’t much more work for me to pop the side off my case and swap out a disk. The side panel on my server is held on by thumb screws, and my case doesn’t require any tools for installing 3.5” hard disks.

Yes. I need to power down my home server to swap a drive. This is just my home. I can tolerate 15 minutes of downtime every once in a great while.

Don’t forget that many DIY servers at home are built from consumer-grade parts. SATA is supposed to support hot swapping. Sometimes it just doesn’t work, and you’ll have to reboot anyway.

If space isn’t at a premium, you can find plenty of ATX tower cases that will hold 8 or more drives for well under $100. Most of the nice cases with hot-swap bays are closer to $300.

Growing ZFS can be inefficient and costly

You can’t yet add disks to an existing ZFS RAID-Z vdev. If you create a RAID-Z2 with 6 disks, it will always have 6 disks.

You have two upgrade paths when you need more space. You can replace all 6 disks with larger disks. You swap out one disk at a time, then let the pool resilver before swapping the next. After swapping the last disk, your zpool will grow on its own, as long as the pool’s autoexpand property is turned on.
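If you go the replace-every-disk route, the whole dance can be scripted. This is a minimal sketch, not a tested tool: the pool name “tank” and the device names are hypothetical, you would run it as root, and zpool wait needs a reasonably recent OpenZFS:

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # Let the pool grow on its own once every disk has been replaced.
    run("zpool", "set", "autoexpand=on", "tank")

    # Swap one disk at a time, waiting for each resilver to finish.
    for old, new in [("sda", "sdg"), ("sdb", "sdh")]:  # ...and so on
        run("zpool", "replace", "tank", old, new)
        run("zpool", "wait", "-t", "resilver", "tank")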

You can also add a second 6-disk RAID-Z2, either as a new vdev in the same pool or as an entirely separate zpool. Either way, you’re now paying two disks’ worth of redundancy in each RAID-Z2 array. This is a little wasteful.
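How wasteful? Here’s a quick comparison, assuming a dozen equal-sized disks:

    # Fraction of raw capacity you can actually use.
    def raidz2_usable(disks):
        return (disks - 2) / disks  # two disks per array go to parity

    print(f"{raidz2_usable(6):.0%}")   # 67% -- one 6-disk RAID-Z2
    print(f"{8 / 12:.0%}")             # 67% -- two 6-disk RAID-Z2 arrays
    print(f"{raidz2_usable(12):.0%}")  # 83% -- one 12-disk array

Two 6-disk RAID-Z2 arrays burn four disks on parity, where a Linux RAID 6 grown to 12 disks would only burn two.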

Growing Linux MD RAID is cheap and easy

Linux’s MD RAID system is extremely flexible. You can easily add additional disks to most existing arrays. When you run out of space, you can grow your RAID 5, 6, or 10 by adding a single disk to the array.
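Here’s what that looks like in practice, wrapped in Python to match the other sketches in this post. The array and device names are hypothetical, you’d run it as root, and the reshape itself can take many hours:

    import subprocess

    def run(*cmd):
        subprocess.run(cmd, check=True)

    # Add a seventh disk to an existing six-disk RAID 6, then reshape.
    run("mdadm", "--add", "/dev/md0", "/dev/sdh")
    run("mdadm", "--grow", "/dev/md0", "--raid-devices=7")

Once the reshape finishes, you still have to grow the filesystem sitting on top of the array with resize2fs, xfs_growfs, or whatever matches your filesystem.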

Linux’s RAID 10 implementation is a little different than most people expect, as it allows you to run an array with an odd number of disks.

I’ve already written an article about choosing a RAID level for your home server, and it goes into much greater detail about this.

Don’t overthink it

I can already see that this blog post is going to be more than 3,000 words long by the time I get to the end. This isn’t even detailed information. I could easily write another 3,000 words about almost any individual heading in this post.

Don’t overthink your build. Unless you have unique needs, you don’t need the fastest CPU. You don’t need to find the perfect drives. You don’t need the ideal RAID configuration. You don’t need a perfect backup solution.

If you don’t want to think about anything, just build Brian Moses’s EconoNAS. He has chosen economical parts. He has hard drive recommendations. He’s even bought all this hardware, assembled it, and tested it. You’ll know all this stuff fits together and functions, and there shouldn’t be any weird gotchas!

Let’s just make a list of the tips and rules!

I guess this is the TL;DR summary:

  • Disks are faster than Gigabit Ethernet
  • SSDs are almost as fast as 10-Gigabit Ethernet
  • A NAS doesn’t require much CPU, especially for Gigabit
  • Streaming video doesn’t require much bandwidth
  • If you’re transcoding video, you’re not just building a NAS anymore
  • Sequential access is fast and easy
  • Random access is slow and may require SSD or RAM caching
  • Big drives are often faster than small drives
  • Build your RAID with drives from different manufacturers
  • RAID is not a backup!
  • ZFS helps prevent silent data corruption
  • Linux MD RAID can be grown more easily than ZFS
  • Hot-swap bays are overrated
  • The largest hard disks are rarely a good value
  • Data you create is worth backing up
  • Don’t overthink it

One entry on the list isn’t mentioned anywhere else in this post.

Conclusion

If you found this blog post, you’re already thinking about building a NAS. You may have been directed here from one of Brian’s DIY NAS build blogs, or you may have been asking Google about something I’ve covered up above.

Either way, I hope I managed to answer your question, and I encourage you to get building! There’s nothing magical about a NAS. A NAS server is just a computer. As I said earlier, don’t overthink it. You probably have an idea of what you need, so get out there and get building!

Did I do a good job with these tips? Did I leave anything important out? I left a ton of information out in an attempt to keep this under 3,000 words, and we still wound up at 3,400, so I’m certain someone will have questions! Please feel free to ask a question in the comments, or stop by the Butter, What?! Discord server to chat with Brian or me about NAS builds and other fun things!