Processing large quantities of high-resolution images can overwhelm even fast computers. While the most popular performance factors in the computer world are CPU frequency and the number of GPUs installed, storage efficiency is often underestimated. Yet it can give a bigger performance boost than you may expect.
This article presents the concept of a fast storage architecture optimized for serious photo & graphic workstations. It focuses on a daily photo editing workflow, including intense batch processing with popular software like Adobe Photoshop™. The text summarises my experience in storage design on various PC/Mac systems. System architecture on both platforms is basically identical, so you should be able to expand either easily with the same hardware available on the market.
The basic rule in my storage philosophy is one drive = one volume. In other words, every volume should be a separate physical device (drive or disk array).
DO NOT USE PARTITIONS
If you use several partitions of one disk in simultaneous processes, they all compete for the same physical drive, so the I/O operations available to each are roughly divided by n, where n = number of partitions on the disk. This usually causes a proportional drop in performance. Independent drives are a much more efficient solution. They are also easier to maintain and much safer: if a drive fails, you lose only one logical volume.
Here are the general requirements for our storage system:
- for optimal maintenance, security and backup, the system and applications are stored on an individual, single drive (SYSTEM),
- as is common for many apps, some fast, independent temporary/scratch disk space is required (TEMP),
- currently edited files should be stored on a separate drive for convenient access (INPUT),
- the system should be optimised for batch processing with additional fast OUTPUT storage,
- there should be one main storage volume of decent capacity for storing all data (RAID),
- the BACKUP volume covers the main storage RAID, SYSTEM and INPUT volumes.
Now, let’s sketch a graph presenting the disk layout:
Arrows represent general data flow directions.
It turns out we need six basic devices. The number of disk volumes might seem a little crazy, but I guarantee it’s very simple to maintain and gives you much more comfort than you would expect.
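The six volumes and their backup coverage can also be written down as data. This is only a minimal sketch in Python: the `Volume` class, the `ORBITS` list and the `backup_sources` helper are names I made up for illustration, and the role descriptions simply paraphrase the requirements listed earlier.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Volume:
    name: str        # volume label used in this article
    role: str        # what the volume is for
    backed_up: bool  # True if the BACKUP volume covers it


# Hypothetical description of the layout; one entry per physical device.
ORBITS = [
    Volume("SYSTEM", "operating system and applications", True),
    Volume("TEMP", "temporary/scratch space for applications", False),
    Volume("INPUT", "currently edited files", True),
    Volume("OUTPUT", "results of batch processing", False),
    Volume("RAID", "main storage for all data", True),
    Volume("BACKUP", "copy of SYSTEM, INPUT and RAID", False),
]


def backup_sources(volumes):
    """Return the names of the volumes that BACKUP has to cover."""
    return [v.name for v in volumes if v.backed_up]


if __name__ == "__main__":
    print(len(ORBITS), "devices, BACKUP covers:", ", ".join(backup_sources(ORBITS)))
```

Keeping the layout in one place like this makes it easy to check later that every volume you care about is actually included in the backup job.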
Our system deserves a cool name, so let’s find one. An acronym is always a good choice. The first letters of our volume names give STIORB, which doesn’t really sound good. Scrambling the letters, only two sane anagrams come up: BISTRO and ORBITS. Of course I pick the second one, otherwise the article would be titled Take your workstation storage to the BISTRO, which might lead to some misunderstandings. If you find more interesting acronyms falling out of your volume names, share them in the comments!
OK, let’s get back to Earth.
To meet the requirements of the ORBITS layout, your computer should be a modern-generation (2010 or later) Windows workstation, “boxed” MacPro or Hackintosh, capable of connecting several drives through the mainboard connectors or via additional PCIe cards.
Most image processing tasks require fast sequential read/write speeds from the storage devices. The following hardware recommendations are based on this assumption. As a hackintosh user, I opt for cross-platform compatible hardware.
PCIe SSD: future is now
When there are no compromises and/or budget is not a limitation, go with PCIe SSD. This is a fairly new approach to storage, where the SATA interface is bypassed and the device is connected directly to the motherboard’s PCIe bus. The bandwidth of the 4 PCIe lanes used by these drives is far higher than that of SATAIII. While the SATAIII speed limit is 600MB/s, PCIe storage easily exceeds 2000 MB/s.
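The bandwidth gap can be checked with simple arithmetic. The sketch below assumes the published line rates and encodings (SATAIII: 6 Gbit/s with 8b/10b encoding; PCIe 2.0: 5 GT/s per lane with 8b/10b; PCIe 3.0: 8 GT/s per lane with 128b/130b). The helper name `usable_mb_per_s` is mine, and protocol overhead, which lowers real-world figures further, is ignored.

```python
def usable_mb_per_s(gigatransfers_per_s, payload_bits, total_bits, lanes=1):
    """Bandwidth in MB/s left after line-encoding overhead (protocol overhead ignored)."""
    payload_bits_per_s = gigatransfers_per_s * 1e9 * payload_bits / total_bits
    return payload_bits_per_s / 8 / 1e6 * lanes


sata3 = usable_mb_per_s(6.0, 8, 10)                 # 600 MB/s
pcie2_x4 = usable_mb_per_s(5.0, 8, 10, lanes=4)     # 2000 MB/s
pcie3_x4 = usable_mb_per_s(8.0, 128, 130, lanes=4)  # about 3938 MB/s

if __name__ == "__main__":
    for name, value in [("SATAIII", sata3), ("PCIe 2.0 x4", pcie2_x4), ("PCIe 3.0 x4", pcie3_x4)]:
        print(f"{name}: {value:.0f} MB/s")
```

Note that under these assumptions a PCIe 2.0 x4 slot tops out at about 2000 MB/s, so getting beyond that figure requires PCIe 3.0 lanes.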
CYLINDER MAC PRO USERS
Apple introduced this technology in the “cylinder” MacPro6,1, but with only one connector slot available in the chassis. That’s why storage expansion is possible only via Thunderbolt (which is nothing more than a PCIe x4 interface integrated with DisplayPort) or USB3 ports.
In the ORBITS architecture PCIe SSDs are ideal candidates for the SYSTEM, TEMP, INPUT and OUTPUT volumes. However, the motherboard must have M.2 slots or several PCIe 2.0 x4 slots available. Here are my recommendations:
1. Samsung SM951 AHCI (for Mac):
M.2 NAND PCIe SSD flash drive. Available in three capacities: 128 / 256 / 512 GB. If you compare the specifications you may notice they are practically the same, except for the write speeds, which are 600 / 1200 / 1500 MB/s respectively. If you can afford it, pick the 512 GB version as it has the best parameters. Note that you have to purchase the AHCI version of this device.
2. Samsung 950 Pro (for PC):
V-NAND NVMe SSD, also in M.2 format. Available in two variants: 256 / 512 GB. It’s an even better choice, but remember these drives use a different protocol, NVMe instead of AHCI.
Here’s a nice video by RamCity showing how to mount an M.2 SSD in a capable motherboard:
If your motherboard does not have dedicated M.2 connectors you will need an adapter to attach this little fellow to your workstation. I recommend the Lycom DT-120 adapter as a proven and reliable solution:
It’s not the cheapest, but for the price you get an almost enterprise-grade solution. Let’s sum it up. The 512 GB Samsung 950 Pro SSD bundled with the Lycom adapter costs about $350 (Feb ’16). To cover our needs four drives are required, which gives $1400 total. For that price you get 2 TB of super fast SSD storage. Compare it with the MSRPs of other similar products:
- Intel 750 Series 1.2TB: $1500
- Intel P3700 Series 2.0TB: $4500
- Fusion-io ioMemory SX350 1.25TB: $5700
If you examine the specifications and prices of various client- and enterprise-grade storage drives and compare them with the presented option, you’ll easily notice its fantastic price / performance ratio.
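To put a number on that ratio, here is a small sketch that computes dollars per gigabyte from the prices and capacities quoted above. The only assumptions are mine: the four-drive set is counted as 4 x 512 GB, the other capacities are taken as decimal gigabytes, and the names `options` and `dollars_per_gb` are illustrative.

```python
# (price in USD, capacity in GB), taken from the figures quoted in this article
options = {
    "4x Samsung 950 Pro 512GB + Lycom adapter": (4 * 350, 4 * 512),
    "Intel 750 Series 1.2TB": (1500, 1200),
    "Intel P3700 Series 2.0TB": (4500, 2000),
    "Fusion-io ioMemory SX350 1.25TB": (5700, 1250),
}


def dollars_per_gb(price_usd, capacity_gb):
    """Price of one gigabyte of storage."""
    return price_usd / capacity_gb


if __name__ == "__main__":
    # Print the options from cheapest to most expensive per gigabyte.
    for name, (price, gb) in sorted(options.items(), key=lambda kv: dollars_per_gb(*kv[1])):
        print(f"{name}: ${dollars_per_gb(price, gb):.2f} per GB")
```

This compares price per capacity only; speed, endurance and warranty differ between these products and are not part of the calculation.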
SATA: standard solution
If for some reason you can’t afford PCIe storage, you can go a much cheaper way: SATA SSD storage, which is the standard in computer hardware.
Be aware that SATA bus speed varies by generation. SATAI is 150MB/s, SATAII is 300MB/s and SATAIII is 600MB/s. Remember: a SATAIII SSD connected to a SATAII interface will work at only half of its interface speed at best!
To ensure optimal results, verify that your mobo and disk interfaces match.
Given the huge variety of mainboard types and standards, there is no single solution for optimal connections. Your mobo may have some limitations here, not only in the quantity but also the quality (e.g. SATAII vs. SATAIII) of connectors. Take this into consideration before any hardware purchase.
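The rule behind this warning is that a SATA link runs at the speed of its slower end. A minimal sketch follows; the function names are mine, and the 550 MB/s drive in the example is a hypothetical figure, not a measurement.

```python
# Interface limits per SATA generation, in MB/s
SATA_MB_PER_S = {"SATAI": 150, "SATAII": 300, "SATAIII": 600}


def link_speed(drive_gen, port_gen):
    """A SATA link runs at the speed of the slower of the two ends."""
    return min(SATA_MB_PER_S[drive_gen], SATA_MB_PER_S[port_gen])


def effective_speed(drive_max_mb_s, drive_gen, port_gen):
    """Best-case sequential speed: limited by the drive itself and by the link."""
    return min(drive_max_mb_s, link_speed(drive_gen, port_gen))


if __name__ == "__main__":
    # Hypothetical SATAIII SSD capable of 550 MB/s
    print(effective_speed(550, "SATAIII", "SATAIII"))
    print(effective_speed(550, "SATAIII", "SATAII"))
```

Run against a list of your motherboard ports, a check like this shows at a glance which connectors are worth giving to the fastest drives.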
As Samsung currently leads in flash memory design, I also recommend their SATA SSDs. Any modern 840, 850 or 750 series model is a perfect choice.
Browse other manufacturers’ offerings too, as there is plenty of good hardware on the market. I have found Adata products very affordable and reliable, having had good experiences with both previous and current generations of their drives. In 3 years of using the SP series I never had a failure. Another three SX900 drives, after a year of almost 24/7 work, still show 100% remaining life.
For the ORBITS sake use 4 SSDs for the SYSTEM, TEMP, INPUT & OUTPUT drives.
- SYSTEM stores all system & user data, including applications. Pick a drive of at least 128GB.
- TEMP should have the same capacity as the system drive. It should be the fastest device possible, so consider setting up a software RAID0 volume from two SATAIII SSDs (if attaching a 5th device is possible).
- INPUT capacity should meet your personal requirements.
- OUTPUT doesn’t have to be big, but it definitely should have fast write speeds.
WHEN THERE ARE NO SATA CONNECTORS
In case your motherboard is short of SATA connectors but has free PCIe slot(s), you may need an expansion card that lets you mount SSD drive(s) in a PCIe slot. A good example is the Sonnet Tempo SSD:
also available in a Pro Plus version:
You may also find similar cards from other third-party hardware vendors.
MAIN STORAGE (RAID)
To set up main storage in an optimal way, think about your data: which files do you access daily, weekly, monthly or once every year or more?
After two decades of computer work and collecting tons of various data I came to a conclusion: the simple solution is always the best. I keep all my most important files on one large volume. Here’s why:
- I have one main photo library built over a long period of time,
- working for the same clients, I often go back to files created even a long time ago,
- the structure of linked documents is always preserved,
- I like to have everything in one place,
- it’s easy to back up.
Exclude from main storage closed projects you never go back to. If you collect large raw video footage, store it on separate external HDDs, Blu-ray discs or even tape backup (which is experiencing a remarkable renaissance nowadays).
Anyhow, capacity is more important than speed here, so we use HDDs instead of SSDs.
Since I don’t want to sacrifice speed, I decided to use a RAID0 of 4 hard disks.
FOUR ANTS DO THE WORK 4x FASTER THAN ONE ANT
The idea of parallel data processing is as old as the computer itself. RAID is an abbreviation of Redundant Array of Independent Disks. By using multiple devices, RAID lets you enhance the speed or safety (or both) of the storage.
There are two basic RAID levels: RAID0 (data striping) and RAID1 (data mirroring).
In RAID0, data is split (“striped”) by x, where x = number of disks in the array. Ideally, both read and write times are divided by x too (i.e. when saving a 1GB file to a 4xHDD RAID0 volume, every member disk saves only 250 MB, which is why it’s up to 4x faster). There is no redundancy here, so this option is not fail-safe; if any member disk fails, you lose the whole array (and its data).
In RAID1, data is “mirrored” between member disks, and this is where the “redundant” adjective comes from. Note that write speed is limited to the speed of a single device, while read speed can be multiplied by up to x (the files are already on all member disks, so the trick is in the readout process). RAID1 is fully redundant: if one member disk fails, your array is still operational.
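The difference between the two levels is easy to see in a toy model. The sketch below is not how a real controller works internally; it only shows where the bytes end up. The `stripe` and `mirror` functions and the tiny 2-byte chunk size are assumptions made for illustration.

```python
def stripe(data: bytes, disks: int, chunk: int = 2):
    """RAID0 idea: deal fixed-size chunks to the member disks in round-robin order."""
    members = [bytearray() for _ in range(disks)]
    for offset in range(0, len(data), chunk):
        members[(offset // chunk) % disks] += data[offset:offset + chunk]
    return [bytes(m) for m in members]


def mirror(data: bytes, disks: int):
    """RAID1 idea: every member disk holds a full copy of the data."""
    return [data for _ in range(disks)]


if __name__ == "__main__":
    payload = b"ABCDEFGHIJKLMNOP"
    print(stripe(payload, 4))  # each disk stores a quarter of the data
    print(mirror(payload, 2))  # each disk stores all of the data
```

With striping, losing one member leaves gaps in every file, while with mirroring any surviving member still holds everything. That is the whole trade-off between the two levels.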
You can of course combine the above into RAID 01 or 10. There are more RAID levels, so do some further reading, starting with this article at Wikipedia.
The choice of RAID level should match the specific data usage and hardware capabilities. A RAID for a server database needs different features than one for 4K video editing.
What HDD should you buy? For best reliability consider enterprise-grade or NAS/RAID dedicated disks. I recommend WD or HGST (formerly Hitachi) products. There are two HDD form factors: 3.5″ & 2.5″. I recommend 2.5″ drives as they offer decent speeds, low power consumption and modest case space requirements. However, the largest-capacity HDDs are still 3.5″ drives. I use and recommend the WD Red™ (or newer Re™) series. These are reasonably priced, fast and reliable hard drives.
Check your computer’s expansion capabilities before any hardware purchase.
AVOID “GREEN”/“ECO” DISKS IN RAID
Because they have a special built-in power-saving mechanism, their use in RAID systems should be avoided. In short: these disks have their own procedures for going to “sleep”, which interfere with the array’s power management. This can quickly lead to array failure and the need for data recovery.
To create a RAID array you will need an appropriate hardware controller. Many motherboards have such solutions built in, so check your mobo’s capabilities first.
Software vs. hardware RAID
All modern operating systems can create so-called software RAID volumes. If your mobo has free HDD connectors, you can connect the drives and create a RAID volume in the appropriate system utility (Disk Management in Windows or Disk Utility in Mac OS X). The main objection here is the additional CPU utilisation. However, for small arrays it’s negligible, oscillating around 1-2% of CPU resources. This is quite a fair solution for low-cost workstations.
An independent hardware RAID controller (or HBA, Host Bus Adapter) has its own computing unit which processes all RAID calculations, so the impact on the main CPU is minimised. Dedicated hardware generally outperforms a general-purpose processor executing software instructions, so this should usually be the more efficient solution.
Many well-equipped motherboards have a built-in RoC (RAID on Chip) which can handle simple RAIDs, and this solution is comparable to a low-cost HBA PCIe expansion card. Many mobos with an Intel chipset onboard feature additional SATA or SAS connectors for use with the integrated RST (Rapid Storage Technology).
There is a huge variety of RAID controllers on the market. Beyond SATA HBAs you may choose SAS (Serial Attached SCSI) solutions, most of which are backward compatible with SATA. I prefer this option as it is a newer, faster and more extensible solution. An additional bonus is easier connection of drives using one special cable (if you connect 4 drives via SATA, you need four data cables).
The controller is usually an additional PCIe adapter card which you plug into your motherboard. There are no special requirements here, but make sure you have a proper slot available (usually an x4 slot is required).
I recommend Areca RAID adapters as they are reliable and supported by virtually all operating systems. For example, the Areca 1883i RAID controller, although not cheap, is worth recommending:
This is an 8-port (2×4) PCIe 3.0 12Gb/s adapter with SAS drive connectors. Here is the product website and here you may find its review. The price (ca. $600) is rather high, but I can assure you it is worth it.
More affordable hardware is made by HighPoint Technology. Their range includes PCIe storage solutions of various grades. The RocketRAID 2720SGL is known for a very good price/value ratio (~$200).
HighPoint also offers Mac storage solutions on their hptmac.com website.
If you’re looking for a really cheap solution, try to find used controllers from Dell, IBM, Fujitsu or other hardware manufacturers. Usually they’re rebranded HBAs with chips from leading storage companies like AvagoTech™ (formerly LSI) or SiliconImage. Good examples are the Dell H310 and IBM M5014, as they are actually LSI 9211-8i controllers (based on the very popular LSI SAS2008 chip).
When using a SAS HBA you’ll probably need a proper HDD cable. Note that there is a variety of connector generations and types. If you’re not sure which cable type to buy, this Wikipedia section should help.
Once your hardware is collected, connect everything properly and create an array. Some adapters are set up in the BIOS (during POST), some come with system setup apps, and others, like Areca, provide browser-based management tools.
If you’re a writer, you can write the same thing again; if you’re an engineer, you can draw your bridge again, but if you’re a photographer… well, most often you just can’t take the same picture again. That’s why data backup is especially important for a digital photographer.
Causes of data loss may be more or less surprising: storage hardware failure, theft, fire or even a lightning strike (I have experienced this personally). It can happen to you, so be prepared. ALWAYS MAKE BACKUPS.
VERY IMPORTANT NOTICE
Note that we decided to build main storage as a non-redundant RAID0 volume. A RAID0 array fails as soon as any one of its member drives fails, so the probability of losing the volume is considerably higher than for a single physical drive. This forces us to provide a constant (or at least very frequent) backup solution.
If you do not provide one, you run a high risk of losing your precious data!
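How much higher is that risk? A rough estimate, assuming drives fail independently of each other: if one drive fails within a given period with probability p, a RAID0 of n drives is lost with probability 1 - (1 - p)^n. The 3% figure below is a hypothetical failure rate chosen only for illustration, not a measured value, and the function name is mine.

```python
def raid0_failure_probability(p_disk: float, disks: int) -> float:
    """Chance that at least one of `disks` independent drives fails, which kills a RAID0."""
    return 1 - (1 - p_disk) ** disks


if __name__ == "__main__":
    p = 0.03  # hypothetical chance of a single drive failing within a year
    for n in (1, 2, 4):
        print(f"{n} drive(s): {raid0_failure_probability(p, n):.1%}")
```

With these assumed numbers a four-drive RAID0 is lost roughly four times as often as a single drive, which is exactly why the BACKUP volume is not optional in this layout.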
Now it becomes clear that the BACKUP storage is crucial here. As it stores all your data, it should be a proven and reliable solution.
Moreover, it should be external and (if possible) portable. What would you take with you in case of fire? I would take all my work to keep it safe. This is not as crazy as it sounds, and it’s very convenient if you have to share large amounts of data between offices.
BACKUP storage recommendations
Our BACKUP should be a four-HDD external device able to work as a redundant RAID (level 10 or 5) volume. Of course it should be connected to your workstation via the fastest interface available. Most modern products offer popular fast interfaces (Thunderbolt, USB3, SATAIII), so check which is best for your rig.
The backup philosophy of the ORBITS layout assumes one large redundant BACKUP storage for the crucial data volumes in your workstation: SYSTEM, INPUT (working volume) and RAID.
So the BACKUP drive should cover the total capacity of these three volumes. For future needs, make it double. There is never too much storage. Compare the prices of 2TB / 4TB / 6TB / 8TB drives and pick the best “$ per 1GB” ratio.
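The sizing rule can be turned into a quick calculation. The sketch below assumes the usual usable capacities of a four-drive array (RAID10 keeps half of the raw space, RAID5 loses one drive’s worth to parity) and counts 1 TB as 1000 GB. The volume sizes in the example are hypothetical, and all function names are mine.

```python
def required_backup_gb(system_gb, input_gb, raid_gb, headroom=2.0):
    """Total size of the backed-up volumes, doubled for future needs."""
    return (system_gb + input_gb + raid_gb) * headroom


def usable_gb(level, disks, disk_gb):
    """Usable capacity of a redundant array built from identical drives."""
    if level == "RAID10":
        return disks * disk_gb / 2   # half of the raw space holds mirror copies
    if level == "RAID5":
        return (disks - 1) * disk_gb  # one drive's worth of space holds parity
    raise ValueError(f"unsupported level: {level}")


def smallest_drive_gb(required_gb, level, disks=4, sizes=(2000, 4000, 6000, 8000)):
    """Smallest drive size from the list that satisfies the requirement, or None."""
    for size in sorted(sizes):
        if usable_gb(level, disks, size) >= required_gb:
            return size
    return None


if __name__ == "__main__":
    # Hypothetical workstation: 256 GB SYSTEM, 512 GB INPUT, 4 x 1000 GB RAID0
    need = required_backup_gb(256, 512, 4 * 1000)
    print("required:", need, "GB")
    print("RAID5 needs drives of", smallest_drive_gb(need, "RAID5"), "GB")
    print("RAID10 needs drives of", smallest_drive_gb(need, "RAID10"), "GB")
```

For the same requirement RAID5 gets by with smaller drives than RAID10, so include the redundancy level when comparing the “$ per 1GB” of different drive sizes.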
I recommend two groups of products:
1. Thunderbolt
The CalDigit T4 is a good example here. For many years, CalDigit has been known as a manufacturer of top-line storage solutions. Nothing has changed here. The five-year warranty on their products should dispel any concerns. The T4 is an ideal candidate for our needs: great performance and reliability. However, your computer must have a Thunderbolt connector available (for use with 4 HDDs, 1st-generation Thunderbolt is more than sufficient). The price is $1699 for the 16TB option.
2. eSATA / USB3
There is no eSATA or USB3 device that would beat Thunderbolt, but the ~300 MB/s they usually deliver is a fair level of speed. There is a variety of such devices on the market and many of them are built on the same controller chips (for example the very popular JMB393). The differences are in design and, of course, price.
I have had very good experiences with the STARDOM SohoRAID SR4-WBS3:
This is rock-stable external storage with a beautiful design and a variety of connectors. Since I set it up with 4 WD Red drives it has been working on my desk for two years without a single hitch. Every type of connection works at the advertised speed. Email support exists, and helped me reset a lost array password by the second reply. The MSRP of the 16TB SohoRAID is about $1200, but I’m sure buying the drives separately may be cheaper. What I especially like about the SohoRAID is the solid top grip, which makes carrying this heavy device very easy.
Create BACKUP volume
Once you have finally obtained the hardware, connect and configure it (or do it in reverse order). Some devices are configured with on-board buttons/switches, others using the included software / drivers.
Set up a redundant RAID level, 10 or 5, and finally create one large BACKUP volume using the system disk utility. Your backup device is ready.
The presented solution allows a quick restore of your workstation to the state of the last backup, so that nothing saved before that backup is lost (that’s why it’s good to back up regularly and frequently).
Note that I excluded the TEMP and OUTPUT volumes. There is no reason to back up temporary data, nor the OUTPUT, assuming the corresponding INPUT exists, of course.
Software for the BACKUP
All right, we have our hardware, so let’s talk about, last but not least, software for our backup needs.
Although I’m an addicted OS X user, I don’t use Time Machine. Its main feature is storing multiple historical versions of data, which I don’t need because I have my own system of versioning files.
What I need is simple incremental backup software, able to copy an entire volume file by file (not as a disk image) independently.
The iBackup software meets these criteria. Simple, effective and free.
Once you set up a configuration for every included volume (SYSTEM, INPUT, RAID), you’re ready to back up the most important volumes of your workstation in one place. It is as simple as pushing the “Backup Now” button, or it can run in an automated manner. Perfect.
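To show what “incremental, file by file” means in practice, here is a minimal sketch in Python. It is not how iBackup works internally (I don’t know its implementation); it is just one simple way to do it, using the standard library only. The function name and the rule for deciding that a file has changed (different size, or a newer modification time in the source) are my assumptions.

```python
import shutil
from pathlib import Path


def incremental_backup(source: Path, target: Path) -> list:
    """Copy new or changed files from source to target, keeping the folder structure.

    Returns the list of files written to the target. Files deleted from the
    source are NOT removed from the target in this sketch.
    """
    copied = []
    for src in source.rglob("*"):
        if not src.is_file():
            continue
        dst = target / src.relative_to(source)
        if dst.exists():
            s, d = src.stat(), dst.stat()
            # Unchanged: same size and the copy is at least as recent as the source.
            if s.st_size == d.st_size and s.st_mtime <= d.st_mtime:
                continue
        dst.parent.mkdir(parents=True, exist_ok=True)
        shutil.copy2(src, dst)  # copy2 also preserves the modification time
        copied.append(dst)
    return copied
```

A call such as `incremental_backup(Path("/Volumes/INPUT"), Path("/Volumes/BACKUP/INPUT"))` (hypothetical mount points) would be repeated for each of the SYSTEM, INPUT and RAID volumes. The second run copies only what has changed since the first one.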
The presented system and storage philosophy come from my personal customs, habits and needs. I’m aware this is something personal and every workshop is different, but I hope my article will give you some direction for tuning your rig.
ORBITS storage architecture is suitable for the majority of image processing software and lets you work more efficiently while keeping the shop tidy.
If you have any suggestions or ideas for further development, please share them in the comments.