Tuesday, July 13, 2021

Setting up home NAS - Part 4 - Raspberry Pi to build locally redundant NAS for home

This is the 5th post in the "Setting up home NAS" series. Below are the previous posts. You don't need to read them to follow this one, but they give some background on the series.

Introduction

I was running the model below for my home NAS. Video is the main contributor to my storage needs. My DJI Osmo Pocket produces high-bitrate videos. Then there are the RAWs.
  • Primary
    • A 1TB USB drive attached to the router, working as NAS at \\192.168.1.1\share
    • Files are synced to my personal computer, which has a 500GB drive, so effectively only 500GB is backed up. SyncToy works fine for on-demand sync
  • Secondary (after publishing to YouTube)
    • Combine all the raw videos into a single video and upload it to YouTube with private visibility
    • One backup to another external disk that has 2TB.
    • That means during editing, the 2 copies are on the router-attached USB drive and the personal laptop. After publishing, one copy is on YouTube and the other is on the separate 2TB hard disk.
As the COVID-19 pandemic slowed down and places opened up, I got chances to visit more places, which means more files in primary storage. One trip brings close to 10GB. The 500GB drive is almost at 3/4 of its capacity.

Problem

All my previous posts explained a problem and a solution, with a new problem every time. This time the storage is again about to reach its capacity, and I had to do something to increase the NAS storage space. As always, I need two copies of the files to be kept. It does not have to be an enterprise-grade NAS.

Requirement

This time I did some requirement engineering. It's nothing more than looking at how much data I generated in the past and forecasting how much more I would be in 

Below are some facts I could see
  • After I bought the DJI Osmo Pocket, a lot of videos are coming out of trips
  • One trip produces 10GB on average
  • We make around 1.5 trips per month on average, so ~15GB
  • We shoot some videos at home as well. 2.5-3 GB
  • Adding the above + a buffer of 2GB ~ 20GB per month ~ 240GB/year ~ 1200GB / 1.2TB for 5 years
    • The buffer is required for technical video downloads and anything else that may come our way.
  • There have to be 2 copies, resulting in 2,400GB. But once published, one backup is on YouTube itself with private visibility. So the final requirement is 1.8TB
There can be 2 systems similar to what I have now. Once published, the videos can be moved to secondary.
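
The forecast above can be reproduced with a few lines of shell arithmetic. This is only a sketch of the estimate: the variable names are made up, and the home videos are rounded up to 3GB so the monthly total lands on the 20GB used above. The 1.8TB figure is not derived here, since it depends on how much of the second copy is assumed to live on YouTube.

```shell
#!/bin/bash
# Monthly volume in GB, using the estimates from the list above.
trips_gb=15      # ~1.5 trips per month at ~10GB per trip
home_gb=3        # home videos, upper end of 2.5-3GB
buffer_gb=2      # downloads and anything unexpected

monthly_gb=$(( trips_gb + home_gb + buffer_gb ))
yearly_gb=$(( monthly_gb * 12 ))
five_year_gb=$(( yearly_gb * 5 ))
two_copies_gb=$(( five_year_gb * 2 ))

echo "per month:           ${monthly_gb} GB"
echo "per year:            ${yearly_gb} GB"
echo "5 years, one copy:   ${five_year_gb} GB"
echo "5 years, two copies: ${two_copies_gb} GB"
```

Changing trips_gb or buffer_gb is a quick way to see how sensitive the 5-year number is to the assumptions.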

Options

Is it time to think big or just add more drives? My YouTube channels are still in their infancy, with no hope of getting monetized in the near future. The Google ads revenue from my blogs is just enough to renew the https://joymononline.in site.

If I spend more and I lose interest in YouTubing, there will be trips but just photos coming out from those. Photos can easily be stored in 2 places by using Google Photos and one hard drive.

Currently, I have one more 1TB internal hard disk, which was taken from the personal laptop when it was upgraded to SSD. In total: 1TB hard disks x 2, a 2TB hard disk x 1, and 500GB on the personal laptop.
Below are the options I could find 
  1. RaspberryPi 4 or a similar low-powered device. Connect the 1TB hard drives to it and use it as primary. Secondary continues as it is
  2. Dedicated NAS device.
  3. Build a server machine by sourcing refurbished parts.
  4. Store in the cloud. OneDrive/Google drive etc...
If budget were not a constraint, I would simply go with the cloud option.

Solution

Finally, I decided to go with RaspberryPi 4. Below is the rationale
  • It's low power
  • It can be used for something else
    • If I lose interest in YouTubing, which reduces the data generation.
    • If OneDrive or Google Drive drastically reduce their prices so that I can afford 2TB.
    • If our total family income increases and there is more budget for tech things, cloud storage would become affordable. Yes, we do some family budgeting on the expenses. It is a big topic on its own
  • I am already planning to learn Linux and would like to learn the shell way. Why don't I learn with my own problems?
  • To advance my career, I would like to learn more about enterprise IT concerns. Dealing with my own problems would be a good starting point
  • Obviously, my requirement of 1.8TB for the next 5 years can be satisfied with this setup. The only problem would be the slow publishing of videos which causes more videos to stay in primary storage which has only 1TB capacity.
The architecture would be similar to what I am using now
  • Primary storage
    • 1TB connected to RPi, exposed as an SMB shared path
    • Backup 1TB attached to the router. 
    • Daily sync from the RPi-connected drive to the router-attached one
    • We will see later why I can't just connect 2 HDDs to the RPi 4, which has 2 USB 3 ports
  • Secondary
    • Same setup as now

Home NAS using Raspberry Pi

Finally, let's get down to business. Let us look at the decisions, the steps, and the issues faced.

Decisions

There are some decisions to be made

Size of RPi

The RPi 4 comes in different sizes based on the RAM. Many users say it needs only 2GB to run as NAS, and 8GB would definitely be overkill. So I settled on the 4GB RPi.

Powered USB Hub

This is more about electricity than about setting something up. The basic point is that any electrical equipment needs a power source. When we connect a hard disk to the RasPi via a USB port, the RasPi is the power source of the HDD. If the RasPi cannot provide enough power, the HDD cannot function normally. We know the voltage is 5V, but that is not all. We need to know the maximum power (in Watts) the HDD needs during its operation. A rotational HDD needs the most power when it starts spinning up, so the RasPi must be able to supply that peak even though the drive needs less once it is running. To connect the dots, we also need to consider the current, which is measured in Amperes and usually marked on devices as A, Amps, or mA (milliamps). Below is the relation that we learned in school.

W = VA

or

W=V*A (just for developers)

If one device specifies W and V and another specifies V and A, we can do this calculation to see whether they match up.

Specs

It's time to read the specs that nobody likes to read. The RasPi specs say that devices on all 4 USB ports together can draw 1.2A of current. The voltage is 5V, so the maximum power available across all the USB ports is 1.2A * 5V = 6W.

In my case, I have a 2.5" laptop drive that needs 1.2A during spin-up and 0.8A during operation. That means I can connect only one HDD directly to the RasPi at a time without an external power source. To connect both HDDs to the RasPi, I would need a USB hub with its own power source, or another external HDD that is powered by a separate adapter.

This is the reason why I still had to connect the backup HDD of primary storage to the router. The router has enough power to drive an HDD.
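
The power check above can be written down as a small shell sketch. The numbers are the ones quoted in this section (1.2A total USB budget, 1.2A spin-up, 5V); the function names are made up for illustration, and the math is done in milliamps and milliwatts to stay with integers.

```shell
#!/bin/bash
usb_budget_ma=1200   # total current the RasPi USB ports can supply together
spinup_ma=1200       # what my 2.5" drive draws while spinning up

# W = V * A, expressed as milliwatts = volts * milliamps.
milliwatts() {
    echo $(( $1 * $2 ))
}

# Succeeds if the given number of drives can spin up within the USB budget.
fits_on_pi() {
    [ $(( $1 * spinup_ma )) -le "$usb_budget_ma" ]
}

echo "USB power budget: $(milliwatts 5 "$usb_budget_ma") mW"
fits_on_pi 1 && echo "1 drive: fits within the budget"
fits_on_pi 2 || echo "2 drives: needs a powered hub or a self-powered drive"
```

The spin-up current is the number to check against, not the lower operating current, because all drives may spin up at the same moment after a reboot.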

RPi OS

There are many operating systems available for RasPi. We can even have Windows using its IoT Core edition. But I decided to go with Raspberry Pi OS (Raspbian) as it is targeted at this device.

HDD FileSystem

I prefer to go with NTFS rather than the Linux native ext format. The main reason is that if the RasPi fails, I can connect the HDDs to my personal Windows laptop. I am planning to slowly migrate to Linux, starting with a USB Linux installation with persistent storage.

Sync method

Our aim is to have one HDD connected to the RasPi and another to the router, and to sync them. One more decision to be made is how to sync these 2 HDDs. I could find 2 options, though they do not solve the same problem and are not mutually exclusive.

  • RAID

There are already articles available on how to set up. Here is one good article on the same. I don't recommend this approach as this is mirroring. As soon as we do a change, it's replicated. No way to get the old copy in case we accidentally delete something.

  • RSYNC

This is a utility command which can sync 2 folders. The command has to execute to get the sync to happen. We can schedule easily during off-hours. Since it's home use, any time starting at 1 AM is fine. But the time depends on how your family members use the system.

There are disadvantages as well. If we copy something in the morning and the primary hard disk fails in the afternoon, we lose the data, as the sync happens only the next day at 1 AM. Also, if we accidentally delete something and notice it only next week, there is no way to retrieve it.

I decided to go with this approach. Below is the link
https://www.howtogeek.com/139433/how-to-turn-a-raspberry-pi-into-a-low-power-network-storage-device/

Steps to get NAS up and running from RPi

These steps are mainly based on the tutorial linked above in the RSYNC section, with some additional steps. Only the deviations and the issues faced are documented in the rest of this article. It is better to read that tutorial before reading further.

Let us start with something that is not in that tutorial.

Setting time zone

When we set up the RasPi for the first time, the time zone is set to the UK, and so is the time. If you are not in the UK, it is better to change the time zone. Changing the time zone updates the time accordingly. I am not adding the steps here as they are easy to find on Google.

Mount points

The tutorial mentioned above already has an instruction to install the NTFS driver. That has to be done before mounting the NTFS formatted drives. 

There are 2 mountings to be done. 

The first is mounting the USB drive connected to the RasPi. To check whether the USB drive is detected, use the command below
lsblk
It lists the disks connected. Get the device name from the list. It will most likely be /dev/sda1 if there is only one USB drive connected. In my case, I am using the first partition. Below is the command to mount
sudo mount -t auto /dev/sda1 /media/usbseagate1tbc/
The command has the values I used. The /dev/sda1 is applicable only in my scenario. The second argument, the mount point, is just a folder under /media. To my understanding, it can be under /mnt as well. It's just naming.

Next is mounting the external network path which exposes the router attached disk to RasPi. Below goes the command
sudo mount -t cifs -o username=<user name>,password=<pass>,vers=1.0,uid=pi  //192.168.1.1/USB_Media  /media/router
This is a little different from mounting the HDD connected to the RasPi. Also, this step is not in the tutorial. Here we have to do the following
  • Make sure the RasPi is able to connect to the router IP using the SMB port
  • My router supports only SMB version 1. Fortunately, the RasPi didn't complain about that.
  • The user name and password depend on what the router supports. Some routers use the same credentials that are used to log in to the router.
  • The IP of the router is the same 192.168.1.1, but the path depends on how many partitions are available on the HDD.
  • The uid is the user id that accesses the mounted external share. 
This step was not straightforward. I had to try many steps mentioned in forums before I finally got it working. Some things didn't work and some helped.
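
Both mount commands above are lost on reboot. One common way to make them persistent is /etc/fstab, and the sketch below only prints two candidate lines for review. Everything beyond the device, share, and mount point names used above is an assumption: that the ntfs-3g driver from the tutorial is installed, that the pi user has uid and gid 1000, and that the router credentials are kept in a hypothetical root-only file /root/.routercreds containing username= and password= lines.

```shell
#!/bin/bash
# Print candidate /etc/fstab lines; nothing is written by this function.
# nofail lets the RasPi boot even if a drive is missing, and _netdev
# delays the network share until networking is up.
print_fstab_entries() {
    cat <<'EOF'
/dev/sda1  /media/usbseagate1tbc  ntfs-3g  defaults,nofail,uid=1000,gid=1000  0  0
//192.168.1.1/USB_Media  /media/router  cifs  credentials=/root/.routercreds,vers=1.0,uid=pi,nofail,_netdev  0  0
EOF
}

print_fstab_entries
```

After reviewing the output, the lines could be appended with print_fstab_entries | sudo tee -a /etc/fstab and checked with sudo mount -a. Since /dev/sda1 can change when more drives are plugged in, a UUID= entry taken from lsblk -f is a sturdier choice for the first column.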

Expose SMB share from RasPi 

The steps to install samba and expose a path from the RasPi are the same as in the tutorial, so just refer to it. One difference was that the command to restart the samba server is as follows
sudo /etc/init.d/smbd restart

Access share from the windows machine

The user is a standard user created. Its credentials will be used to access the share from other machines using SMB.
 
Below is the command to map the network path to x: drive in windows machines.
net use x: \\raspberrypi\usbseagate1tbshare /user:raspberrypi\<username>

eg: net use x: \\raspberrypi\usbseagate1tbshare /user:raspberrypi\j3dshareuser
Unfortunately, I faced an issue with this and had to spend 3-4 hours debugging it. Details are given in the Issues section.

rsync

The rsync command is the same as what is used in the tutorial. Only difference is that I added the progress indicator
rsync /media/usbseagate1tbc/Media/ /media/router/Media -av --progress --delete
Hope it's self-explanatory. I added it to check the speed.
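
Because --delete removes files from the destination, it can be worth previewing a sync first. The sketch below wraps rsync with the -n (dry run) flag so that nothing is copied or deleted. The function name preview_sync is made up for this example.

```shell
#!/bin/bash
# Show what a sync would copy or delete, without changing anything.
# -n is rsync's dry-run flag; the other flags match the command above.
preview_sync() {
    rsync -avn --delete "$1" "$2"
}

# Example call with the paths used in this post:
# preview_sync /media/usbseagate1tbc/Media/ /media/router/Media
```

Files that would be removed from the destination show up in the output as lines starting with "deleting".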

CRON

Below is the entry in crontab. The idea is similar to the tutorial, but with added functionality to log the rsync status to a log file. The first field is the minute, so this entry runs the script once a day at 1:00 AM.
0 1 * * * /usr/local/bin/nassync.sh
Below are the contents of the shell script that is stored in the nassync.sh file. I faced an issue when setting up time-based log files directly from CRON without the .sh file. Details are in the Issues section.
#!/bin/bash
# Mirror both folders to the router share; --delete removes files that no longer exist in the source
rsync /media/usbseagate1tbc/Media/ /media/router/Media -av --progress --delete >> "/var/log/j3dnas/$(date +%F_%H-%M-%S.log)" 2>&1
rsync /media/usbseagate1tbc/materials-video/ /media/router/materials-video/ -av --progress --delete >> "/var/log/j3dnas/mat_video.log" 2>&1
Hope the commands are self-explanatory. The .sh file should be accessible to, and executable by, the user who set up the CRON job. I used the pi user itself. It is better to run the .sh file from the pi user's PuTTY session before adding it to crontab.

Planning to improve this script to send a notification of any failure.
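
One possible direction for that improvement is sketched below. It is not the script I run. It assumes the same paths as above, drops --progress to keep the log files small, and writes a line to syslog with logger when something fails. The function names and the "run" argument are made up for the example. It also refuses to sync unless both drives are mounted, since running rsync with --delete against an unmounted, empty source folder would wipe the backup.

```shell
#!/bin/bash
# Sketch of a more defensive nassync.sh (hypothetical names throughout).
SRC=/media/usbseagate1tbc
DST=/media/router
LOG_DIR=/var/log/j3dnas

# Succeeds only if every given path is an active mount point.
all_mounted() {
    local path
    for path in "$@"; do
        if ! mountpoint -q "$path"; then
            echo "not mounted: $path" >&2
            return 1
        fi
    done
}

# Sync one folder under SRC to DST; on failure, leave a line in syslog.
sync_folder() {
    local name=$1 log=$2
    if ! rsync -av --delete "$SRC/$name/" "$DST/$name/" >> "$log" 2>&1; then
        logger -t nassync "rsync failed for $name, see $log"
        return 1
    fi
}

main() {
    if ! all_mounted "$SRC" "$DST"; then
        logger -t nassync "sync skipped: a drive is not mounted"
        return 1
    fi
    sync_folder Media "$LOG_DIR/$(date +%F_%H-%M-%S).log"
    sync_folder materials-video "$LOG_DIR/mat_video.log"
}

# Only act when called as "nassync.sh run", so the file can be sourced safely.
if [ "${1:-}" = "run" ]; then
    main
fi
```

With this variant the crontab command would be /usr/local/bin/nassync.sh run. A real notification (mail, chat message) could be hooked in at the two places where logger is called.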

Issues

Let us discuss the issues faced.

Access from different homegroup Windows computer

The first issue faced (mentioned in the previous section) was about accessing the share from non-HOMEGROUP computers. I tried to change the Local policy but had no access to change it. The error that I got when using the net use command to map the network share to the Windows drive letter was

The password or user name is invalid for \\server. 

Enter the password for 'server\Test' to connect to 'server': 
System error 1326 has occurred. 

The user name or password is incorrect.

The password obviously matches, as I am able to connect from HOMEGROUP computers. I even tried exposing the share from the RasPi without authentication, along with many combinations mentioned in this thread.

Finally, I found a step from the SuperUser forum. It worked. It's a registry change.

Location - HKLM\SYSTEM\CurrentControlSet\Control\Lsa
Data type - DWORD 
Key name - LmCompatibilityLevel
Value - 5

If it does not exist, create it.

Case sensitivity

This is obviously because I am not familiar with the Linux file system. I know it's case sensitive, and I try to use all lowercase for folder and file names. But 3-4 times I got stuck because some characters in long paths had capitals in between.

The interesting thing I noticed is that ls doesn't care but the cp command does. More on this can be read in my blog post about Linux case sensitivity in mounted network path - ls v/s cp commands.
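
A quick way to spot the names that break an all-lowercase convention is to let find list them. This is a small sketch, and the function name list_uppercase_names is made up. It prints every file or folder under the given root whose own name contains at least one uppercase letter.

```shell
#!/bin/bash
# List entries below a root folder whose name contains an uppercase letter.
# -mindepth 1 skips the root folder itself; -name matches only the last
# path component, so parent folders are reported on their own line.
list_uppercase_names() {
    find "$1" -mindepth 1 -name '*[[:upper:]]*'
}

# Example call with the mount point used in this post:
# list_uppercase_names /media/usbseagate1tbc
```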

Shell script in CRON cmd

What we really want is that every CRON job execution produces a log file named by time. The expression below should have worked
0 1 * * * rsync /media/usbseagate1tbc/Media/ /mnt/router/Media -av --progress --delete >> "/var/log/j3dnas/$(date +%F_%H-%M-%S.log)" 2>&1
Unfortunately, it didn't work. 

Then I thought of invoking a shell script from CRON that has the rsync commands, and it worked. I dug deep into why the $(date... expression, which inserts the date and time into the path, didn't work when executed from CRON. In short, cron treats an unescaped % in the command as a newline, so the command gets cut off at the first %. The result is another blog post "Linux - script in command shell v/s triggered from cron"
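
To see why the direct crontab entry breaks, it helps to look at what cron actually hands to the shell. In a crontab, an unescaped % ends the command, and the rest is passed as standard input. The sketch below models that with plain shell string handling. It is a simplified model (it assumes the line contains no \% at all), and the function name is made up.

```shell
#!/bin/bash
# Print the part of a crontab command that cron passes to the shell:
# everything before the first '%'.
cron_visible_part() {
    printf '%s\n' "${1%%\%*}"
}

cron_visible_part 'rsync src/ dst/ >> "/var/log/j3dnas/$(date +%F_%H-%M-%S.log)" 2>&1'
# prints: rsync src/ dst/ >> "/var/log/j3dnas/$(date +

# Escaped form that keeps the command in one piece inside a crontab
# (each % written as \%):
# 0 1 * * * rsync ... >> "/var/log/j3dnas/$(date +\%F_\%H-\%M-\%S).log" 2>&1
```

The truncated command ends in the middle of a quoted string, which is why the job fails from CRON although the same line works in an interactive shell. Moving the command into a .sh file avoids the escaping entirely.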

This may look simple, but it is not for people like me who are new to the Linux world. This is just the basics, and it doesn't cover what happens when the RasPi reboots. Things need to be redone if the RasPi fails and is replaced with another RasPi. Hopefully, I can script everything and publish it. 

In case anyone has a better idea, please add it to the comments.
