Internxt, Sia, Storj: Blockchain Decentralized Cloud Storage
Blockchain has become a buzzword during the last decade, a term that for many is just so much technobabble. But blockchain technology is here to stay and it is steadily expanding to corner larger portions of the data market, including cloud storage solutions.
Google, for instance, uses a massive amount of metadata attached to your data to create new software, serve you advertisements, and extend its hold over various portions of the tech sector. Google is an advertising company, so this makes sense, but most users don’t realize it is happening (or realize it but don’t understand it well enough to know why they should care). Besides, where is the competition? Dropbox, one of the largest names in the industry, costs a large chunk of monthly change, and data within Dropbox isn’t encrypted in such a way that only the uploader can access it!
Cost, privacy, and security are elements where providers of new decentralized cloud storage say they excel… but is this true? We need to consider the difference between decentralization and distribution. Distributed networks are fairly common; the term just means that data is spread across more than one location. Decentralized (generally speaking in this case) means that your data is not passing through a middle-man. There absolutely are projects that allow for true decentralized file sharing, but these are sometimes a bit more complicated to set up and operate than an average Dropbox subscription. Plenty of other “blockchain” cloud storage systems just use blockchain as a sort of pointer for where the data is being stored, and the system is ultimately controlled by a single actor (such as a company). Understanding which is which can be tricky.
The pros of blockchain and decentralized storage
Decentralized and distributed data storage is very likely here to stay, and blockchain storage is likely to become one of the main “blocks” upon which that decentralization takes place. It offers some native strengths, such as redundancy, data transparency, security, and the potential for far better storage-to-cost ratios than traditional storage providers offer.
- Redundancy: Most blockchain storage providers use a technique where uploaded information is split into “shards” and distributed to a number of data storage locations around the world. These shards are all encrypted and need to be recombined to form the whole, which can only be accomplished using the data owner’s private key. The information itself lives on the various nodes while the blockchain records a unique identifier for that information. This means that when you upload your data it’s actually getting stored in multiple locations around the world, so if a disaster strikes in one place it’s unlikely to affect your data.
- Data transparency: Everything that happens on the blockchain stays on the blockchain; that’s the whole point of the system. Every “transaction” to or from the blockchain (every time data is uploaded or downloaded) is recorded on the chain along with a signature called a “hash”, a unique combination of symbols that represents that unique data (and there is a huge range of things that go into creating this unique hash). This means that it’s very hard to fool the blockchain into thinking something has happened if it hasn’t really happened: if data is sent, anywhere, then the blockchain records that.
- Security: Most decentralized platforms use public-key cryptography to provide zero-knowledge encryption, meaning the provider never holds your key. Only you can unlock your data and see what’s inside. The process sometimes referred to as “sharding” also comes into play; your data is spread out across a bunch of hosting servers as chunks of the original and a complete copy is never stored remotely, so even if one of these chunks were decrypted somehow, it would only be a useless collection of partial data. The original only becomes useful when it’s recombined with your key. Finally, these systems rely on other features to ensure that your data remains intact even if a number of hosts were to go offline; you actually only need a small percentage of hosts available to recombine your data and access it.
- Storage to cost: This form of storage usually spreads out the storage space across many different nodes. These are people and organizations around the world that provide storage space in some form. Shards are downloaded into their storage whenever the blockchain says they should be, and can then be requested at any point in time. A complex set of features ensures that these providers maintain the data, and the innate encryption features of the shard process ensure that your data remains secure. Because data is split like this, no single provider needs to host a huge data server to store your information. This distributes the cost across a huge group of providers and lowers the ultimate end cost to you, the user.
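To make the shard idea a little more concrete, here is a minimal Python sketch of splitting a file into shards, recording a hash identifier for each shard (the kind of thing a ledger might store instead of the data itself), and rebuilding a shard after one host goes offline. Everything in it is an assumption made for illustration: the function names are hypothetical, encryption is left out entirely, and simple XOR parity stands in for the stronger erasure codes that real networks typically use. It is not how any particular provider implements storage.

```python
import hashlib
from functools import reduce
from typing import Optional


def split_into_shards(data: bytes, k: int) -> list[bytes]:
    """Split data into k equal-length shards, zero-padding the tail."""
    size = -(-len(data) // k)  # ceiling division
    padded = data.ljust(size * k, b"\0")
    return [padded[i * size:(i + 1) * size] for i in range(k)]


def xor_bytes(a: bytes, b: bytes) -> bytes:
    """Byte-wise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))


def make_parity(shards: list[bytes]) -> bytes:
    """XOR of all shards; enough to rebuild any ONE missing shard."""
    return reduce(xor_bytes, shards)


def recover(shards: list[Optional[bytes]], parity: bytes) -> list[bytes]:
    """Return a full shard list, rebuilding at most one shard marked None."""
    result = list(shards)
    missing = [i for i, s in enumerate(result) if s is None]
    if len(missing) > 1:
        raise ValueError("XOR parity can only repair one lost shard")
    if missing:
        present = [s for s in result if s is not None]
        result[missing[0]] = reduce(xor_bytes, present, parity)
    return result


original = b"quarterly-report.pdf contents go here"
shards = split_into_shards(original, k=4)
parity = make_parity(shards)

# What a ledger might record: one identifier per shard, not the data itself.
ledger_entries = [hashlib.sha256(s).hexdigest() for s in shards]

# Simulate one host going offline, then rebuild its shard from the others.
damaged: list[Optional[bytes]] = list(shards)
damaged[2] = None
rebuilt = recover(damaged, parity)

# Check every shard against its recorded hash before reassembling.
assert [hashlib.sha256(s).hexdigest() for s in rebuilt] == ledger_entries
restored = b"".join(rebuilt)[:len(original)]  # trim the zero padding
```

XOR parity survives the loss of only one shard. That is the main way this sketch falls short of production systems, which are designed to tolerate many hosts disappearing at once.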
Cons of blockchain and decentralized storage
There are a bunch of potential cons to decentralized storage that need to be seriously considered before you store your data this way. They fall into three areas: the lifetime of a decentralized service, the speed of a decentralized service, and “the right to be forgotten”, which creates inherent difficulties with blockchain technology.
Decentralized storage relies on a large network of active nodes: those users who are offering a portion of their available storage capacity to host shards from the network. Most decentralized storage providers pay these users in some form of cryptocurrency, giving them an incentive to maintain high-quality servers (with consistent “uptime” or availability). But that doesn’t guarantee these hosts will stick around. Unless enough people remain excited about a service, or the rewards are significant and consistent enough to merit continued hosting, it’s entirely possible they will leave the system. That doesn’t mean your data would be lost… right away, but if enough of those hosts were to vanish then the whole infrastructure would break down.
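A bit of arithmetic shows why data isn’t lost right away but can eventually become unreachable. The sketch below assumes a file is stored as `n` shards, that any `k` of them are enough to rebuild it, and that each host is online independently with probability `p`. The function name and the numbers (80 shards, 29 needed) are illustrative assumptions for this example, not figures taken from any provider discussed here.

```python
from math import comb


def availability(n: int, k: int, p: float) -> float:
    """Probability that at least k of n independent hosts are online,
    when each host is online with probability p (binomial tail)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


# Illustrative values only: 80 shards stored, any 29 rebuild the file.
for p in (0.9, 0.7, 0.5, 0.3):
    chance = availability(80, 29, p)
    print(f"host uptime {p:.0%}: file retrievable with probability {chance:.6f}")
```

Under these assumptions the file stays retrievable with high probability while most hosts are reliable, and the odds fall off sharply once typical uptime drops toward the fraction of shards actually needed. Real networks also repair lost shards over time, which this simple model ignores.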
Likewise, the system relies upon providers who can offer solid hosting, and that means speed. Decentralized cloud services can be slower than traditional ones if their network isn’t filled with providers who offer fast, always-accessible nodes. They can also be a lot faster, because they might rely on nodes that are far closer to your physical location than would normally be the case, and they can download from several such nodes simultaneously, so it’s not a cut-and-dried issue.
One of the largest issues that can frustrate potential users is how limited the right to be forgotten is on a blockchain. Blockchains ensure security by making every “block” of data depend on the block before it, so a record cannot be removed or altered after the fact without invalidating every block that follows. Some efforts have gone towards solving this problem; however, it’s largely unclear from the perspective of an average user whether their request for complete data deletion will be (or even can be) honored by the blockchain.
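The dependency between blocks is easier to see in a toy example. The Python sketch below builds a tiny chain in which each block stores the hash of the previous one, then shows that blanking or removing a record makes the chain fail validation. The block layout and function names are hypothetical, chosen for illustration; real blockchains add consensus, signatures, and much more.

```python
import hashlib
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first block


def block_hash(index: int, payload: str, prev_hash: str) -> str:
    """Hash a block's contents together with the previous block's hash."""
    record = json.dumps(
        {"index": index, "payload": payload, "prev": prev_hash}, sort_keys=True
    )
    return hashlib.sha256(record.encode()).hexdigest()


def build_chain(payloads: list[str]) -> list[dict]:
    """Link each payload to its predecessor through the stored hash."""
    chain, prev = [], GENESIS
    for i, payload in enumerate(payloads):
        h = block_hash(i, payload, prev)
        chain.append({"index": i, "payload": payload, "prev": prev, "hash": h})
        prev = h
    return chain


def is_valid(chain: list[dict]) -> bool:
    """Recompute every hash and check each link back to its predecessor."""
    prev = GENESIS
    for block in chain:
        if block["prev"] != prev:
            return False
        if block["hash"] != block_hash(block["index"], block["payload"], prev):
            return False
        prev = block["hash"]
    return True


chain = build_chain(
    ["alice uploads file A", "bob uploads file B", "alice downloads file A"]
)
print(is_valid(chain))  # True

# Try to "forget" Bob's record by blanking it out.
chain[1]["payload"] = ""
print(is_valid(chain))  # False: the stored hash no longer matches the contents
```

Removing the block outright fails in the same way, because the block after it still points at the hash of the record that is now missing.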
In this article I cover some of the top distributed blockchain storage providers.