Blockchain vs. database
Keep in mind first that the public and the press at large tend to define both of these terms imprecisely, and that these categories of data storage are in some ways converging in any case.
What are blockchains?
Blockchains share certain characteristics with a handful of distributed database and data store types, particularly immutability and granular record- or cell-level encryption. As mainstream distributed databases add these features, it will be even more difficult to distinguish them from current-day blockchains.
When we researched NoSQL databases a couple of years ago, we learned that Couchbase and Datomic, for example, are immutable, and that Accumulo and its commercial counterpart Sqrrl have cell-level encryption. (See "The rise of immutable data stores.") Recall also that the Hadoop Distributed File System (HDFS) is append-only, or immutable, and that streaming data architectures such as the Kappa architecture are centered around immutable data stores.
Hybrid blockchain/NoSQL databases and stores such as BigchainDB only promise to blur the lines further. But as of today, blockchains still constitute a different class of data storage because of their distinctive transaction-confirming and consensus-building capabilities.
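To make the "collective immutable recording" idea more concrete, here is a minimal sketch in Python of a hash-chained, append-only ledger. It is an illustration only, not how any particular blockchain is implemented: the function names (`block_hash`, `append_block`, `is_valid`) and the block layout are assumptions made for this example, and the sketch deliberately leaves out the consensus step that distinguishes real blockchains from other immutable stores.

```python
import hashlib
import json


def block_hash(index, prev_hash, payload):
    """Hash a block's contents together with the previous block's hash."""
    body = json.dumps(
        {"index": index, "prev_hash": prev_hash, "payload": payload},
        sort_keys=True,
    )
    return hashlib.sha256(body.encode("utf-8")).hexdigest()


def append_block(chain, payload):
    """Append a new block; earlier blocks are never rewritten."""
    prev_hash = chain[-1]["hash"] if chain else "0" * 64
    index = len(chain)
    block = {
        "index": index,
        "prev_hash": prev_hash,
        "payload": payload,
        "hash": block_hash(index, prev_hash, payload),
    }
    chain.append(block)
    return block


def is_valid(chain):
    """Recompute every hash and check each link back to its predecessor."""
    prev_hash = "0" * 64
    for i, block in enumerate(chain):
        if block["index"] != i or block["prev_hash"] != prev_hash:
            return False
        if block["hash"] != block_hash(i, prev_hash, block["payload"]):
            return False
        prev_hash = block["hash"]
    return True
```

Because each block's hash covers the previous block's hash, changing an old record invalidates every later link, which is what makes tampering evident. In a real blockchain, many independent nodes would also have to agree on which chain is authoritative.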
What’s a data store, and how does it differ from a database?
A data store is a stripped-down data storage medium that can sometimes serve in the place of a database, depending on the use case.
Data stores tend not to adhere, at least not rigorously, to ACID (Atomicity, Consistency, Isolation, Durability) principles, for reasons of simplicity and scaling. So they have advantages in certain use cases that do not require immediate consistency. Keep in mind that vendors will call the data stores they offer “databases”, and they do keep adding features to them, so over time they become less data store-like.
When most people think of a “database”, what they’re referring to is something along the lines of Oracle, MySQL or SQL Server: a relational database that assumes a tabular data model and has a bias toward transactional, often numerical data. These databases are chock-full of features and are still evolving, but aren’t all things to all people.
Also consider that traditional databases that are ACID compliant aren’t always optimal in other respects. For example, they destroy superseded data that could be useful by overwriting it, and they treat metadata as a separate, second-class citizen stored in a secondary location instead of as data. By contrast, semantic graph or RDF databases concentrate on enriching data description and contextualization and store metadata with the rest of the data.
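The difference between overwriting superseded data and keeping it can be sketched in a few lines of Python. Both classes below are hypothetical toys written for this comparison, not the API of any real database: `OverwritingStore` keeps only the latest value per key, while `AppendOnlyStore` records every write as a new version so that earlier values remain readable.

```python
class OverwritingStore:
    """Keeps only the current value; the superseded value is lost."""

    def __init__(self):
        self._rows = {}

    def put(self, key, value):
        self._rows[key] = value

    def get(self, key):
        return self._rows[key]


class AppendOnlyStore:
    """Records every write as a new version; nothing is overwritten."""

    def __init__(self):
        self._log = []  # list of (version, key, value) tuples

    def put(self, key, value):
        version = len(self._log) + 1
        self._log.append((version, key, value))
        return version

    def get(self, key, as_of=None):
        # Scan newest-first for the latest write at or before `as_of`.
        for version, k, value in reversed(self._log):
            if k == key and (as_of is None or version <= as_of):
                return value
        raise KeyError(key)

    def history(self, key):
        return [(version, value) for version, k, value in self._log if k == key]
```

With the append-only version, a call such as `get("address", as_of=1)` can still return what the record looked like at version 1, at the cost of storing every version and scanning the log on reads.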
What has happened over the last decade is a proliferation of database and data store types for a range of different purposes. Blockchains (another data storage concept that could eventually belong under the database umbrella) simply assign a higher priority to certain features that permit distributed ledger sharing, collective immutable recording and consensus building. These are quite valuable capabilities that traditional databases haven’t yet been designed to offer.
It’s hard to imagine a future where blockchains stay separate from other data storage types. More blurring seems inevitable as designers attempt to serve a broadening range of purposes along the data lifecycle continuum. But more and more of the data being generated sits at the less permanent, less persistent end of that continuum. Blockchains permit permanent, immutable recordkeeping, and they are much slower than data stores designed to handle and distribute more perishable data. More data will become more or less “permanent”, but permanence comes at a cost that not all data warrants.