Fast multithreaded backup of terabytes from Singapore using Backblaze
April 09, 2015
Ever since I found Backblaze (and I don’t know why I did not find it earlier), I have been quite happy with it; it is money well spent. I currently have 4 USB drives on top of 2.5TB of internal space. That adds up to a lot of space, and I only use 4TB of it. This is my home setup, with tons of VMs and development stuff lying around. I recently lost a fairly new 2TB Western Digital drive for no good reason. It just died… physically, taking 1.5TB of VMs and snapshots with it. It was a terrible day: I spent 2-3 hours opening the drive up and confirming it was completely dead. I should have seen the signs…
I decided online backup was the way to go. While slow, at least it would keep my data recoverable without adding more drives. I looked at Amazon S3 as my first choice. However, with terabytes of data the cost would become VERY VERY prohibitive. I don’t mind paying, but to make economic sense online backup has to cost the same as or less than buying drives. I could just as easily have built a RAID 1 by buying lots of cheap drives and putting them into a NAS for backup.
A (few) quick Google searches turned up Backblaze. A lot of sysadmins swear by it. I took a look at what these guys do and this post, and knew they try very hard to make data backup affordable and reliable.
Their plan amazed me: US$50 for 1 year of unlimited backup per system, and it includes attached drives. On top of that, they can ship your data back to you on drives. While I have a 1Gbps connection, it would still be slow to pull a few terabytes from across the world, and it would not be a healthy use of time.
I downloaded the trial. I had some issues getting it running initially because the interface isn’t very intuitive, but once you know what’s where it pretty much runs on its own as it starts its first backup. So off I went, happily, to sign up and back up everything.
Unfortunately the upload speed was painfully slow… somewhere in the region of 200-500KB/s at best. By my estimate it would take a whole 3 years to transfer 5TB of data, if not more. I emailed support to ask if there was an option for multiple parallel uploads, since their application used only ONE upload thread. They replied that they would “soon” be releasing multithreaded upload/download. I waited, and 3 weeks later I still had over 90% of my files, by volume, left to upload. However, Backblaze was smart enough to leave the larger files for last, so the number of at-risk files was a few thousand. I usually keep my PC on at all times, but I don’t mind restarting now and then for forced Windows 8 updates. This time I have had it on non-stop for almost 21 days, ever since installing Backblaze. Something is better than nothing, I say.
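As a rough sanity check on that kind of estimate, here is a minimal Python sketch of the arithmetic. It assumes a constant transfer rate and decimal units (1 TB = 10^12 bytes, 1 KB = 10^3 bytes); the function name `transfer_days` is made up for this illustration and has nothing to do with Backblaze's client.

```python
SECONDS_PER_DAY = 86_400
TB = 10**12  # decimal terabyte (assumption: decimal, not binary, units)
KB = 10**3   # decimal kilobyte


def transfer_days(total_bytes: float, bytes_per_second: float) -> float:
    """Days needed to move total_bytes at a constant bytes_per_second."""
    if bytes_per_second <= 0:
        raise ValueError("rate must be positive")
    return total_bytes / bytes_per_second / SECONDS_PER_DAY


if __name__ == "__main__":
    # The two peak rates quoted above, applied to 5 TB.
    for rate_kb in (200, 500):
        days = transfer_days(5 * TB, rate_kb * KB)
        print(f"{rate_kb} KB/s -> about {days:.0f} days")
```

Those peak rates give a best case: at a constant 200-500KB/s the arithmetic lands at roughly four to ten months, and any time spent below that rate pushes the total further out.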
I was hypothesizing (read: daydreaming) about what could change in my life over the next 3 years before the first backup actually came to a close… what would I be doing on the day I discovered it had actually completed? Perhaps I would celebrate and then find out a few more files had been added to the never-ending list… or maybe Joe from support was telling the truth.
I hate their site when it comes to finding any new information, so I googled for the answer once again, hoping someone had made a workaround… I could not believe the search result… It had the word “Multithreaded” in it. In my entire programming life, MT has never made me as happy as it did then.
So, the GOOD news! The latest version of Backblaze enables multithreaded uploads. Despite the physical distance between Backblaze and my local desktop, and that conspiring bast**d of an RTT, I could now upload in parallel and use the max of my 500Mbps upload speed. Well, anything better than the 2Mbps I was getting normally, anyway :)
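Why RTT matters so much here: a single TCP stream cannot have more than one window of data in flight per round trip, so its throughput is capped at roughly window size divided by RTT, and several parallel streams each get their own window. The Python sketch below only illustrates that rule of thumb. The 64 KB window (the TCP maximum without window scaling) and the 200 ms round trip are assumptions picked for illustration, not values measured on my connection or taken from Backblaze, and both function names are hypothetical.

```python
def max_stream_mbps(window_bytes: int, rtt_seconds: float) -> float:
    """Upper bound for one TCP stream: one window per round trip, in Mbps."""
    if rtt_seconds <= 0:
        raise ValueError("RTT must be positive")
    return window_bytes * 8 / rtt_seconds / 1e6


def aggregate_mbps(window_bytes: int, rtt_seconds: float,
                   streams: int, link_mbps: float) -> float:
    """Combined bound for parallel streams, capped by the link's upload speed."""
    return min(streams * max_stream_mbps(window_bytes, rtt_seconds), link_mbps)


if __name__ == "__main__":
    WINDOW = 65_535   # assumed: max TCP window without window scaling
    RTT = 0.2         # assumed: 200 ms round trip
    LINK = 500.0      # the 500Mbps upload speed mentioned above
    for n in (1, 4, 16):
        print(f"{n} stream(s): up to {aggregate_mbps(WINDOW, RTT, n, LINK):.1f} Mbps")
```

Under these assumed numbers a single stream tops out at a couple of Mbps no matter how fast the link is, which is the same ballpark as the 2Mbps above. Real clients and TCP stacks are more complicated than this, so treat it as a back-of-the-envelope model rather than an explanation of what Backblaze actually does.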
Excited, I could not wait and installed the latest version. Sure enough, the option was there, as per their pitch. With one thread my transfer rate was shown as 2.35 Mbps, which I could easily confirm with other tools and my UBNT router.
With the thread count set to 4, backups were already flying. The file names whizzed by in the Backblaze control panel. They were still small files, though, as the program had found new tiny useless fragments of my entire Raspberry Pi mirror to back up again. Who cares anyway. The only way I could tell whether the speed was actually being maximized was to wait for large files to be transferred. I was already seeing a 10x improvement.
I am currently able to get a consistent 25-30 Mbit/s. The next step is to contact my ISP, MyRepublic, and get them to do a better job of this. A lower RTT could make a world of difference.