
It would be wrong to begin a comparison blog post about PostgreSQL without first acknowledging that it is one of the most reliable and widely used databases in the history of software. The world owes a debt of gratitude to the open source community that has built and supported this important project for the last 30 years. In this post, we simply unpack some of the architectural differences between PostgreSQL and CockroachDB. In the process, we’ll point out where the limitations of a single-server, single-instance architecture might pose challenges for modern cloud infrastructure.

The cloud means scale

Cloud infrastructure promises easy scale for our apps and services at the push of a button. Whether we need more compute or more storage, these resources have become commodities; however, to take advantage of their seemingly limitless availability, the software we use must use them correctly.


Typically, scale for a PostgreSQL database involves deploying it on the largest instance you can afford. Then, when you near capacity on that instance, you upgrade to bigger compute. You’ve increased capacity, but it’s essentially just vertical scale, and you will eventually reach its limit. Once you’ve maxed out vertical scale, you have to shift to horizontal scale, which means manual sharding. Using a shard key, a hash, or whatever scheme is right for you, you’ll need to split the database into smaller pieces and run it on multiple different compute instances. There are lots of different ways to shard a database, but ultimately all manual sharding leads to complex implementations and a fair amount of overhead in actually implementing it, both from an operations point of view and from a development point of view.
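
To make that overhead concrete, here is a minimal sketch of hash-based shard routing in application code. The shard names and the choice of a user ID as the shard key are invented for this example, and real manual sharding also has to cover resharding, cross-shard queries, and distributed transactions.

```python
import hashlib

# Hypothetical connection strings for four manually provisioned PostgreSQL shards.
SHARDS = [
    "postgresql://app@pg-shard-0/appdb",
    "postgresql://app@pg-shard-1/appdb",
    "postgresql://app@pg-shard-2/appdb",
    "postgresql://app@pg-shard-3/appdb",
]

def shard_for(user_id: str) -> str:
    """Route a row to a shard by hashing the shard key (here, a user id)."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return SHARDS[int(digest, 16) % len(SHARDS)]

# Every read and write in the application now has to be routed by hand:
print(shard_for("user-42"))
# ...and resharding, cross-shard joins, and distributed transactions are up to you.
```
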
CockroachDB takes a different approach: horizontal scale is a native capability that, in its most basic form, requires no operational overhead. It is a distributed database where each node can service both reads and writes across all participating nodes. So, with the simple addition of a new node into a cluster, you scale both storage capacity and transactional capacity. Data is not synchronized among the nodes; rather, the cluster uses distributed consensus, writing the data in triplicate across several member nodes. No matter where the data lives, every node can access data anywhere in the cluster. All without any operational overhead other than starting a new node in the cluster!
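
Purely as an illustration of “data written in triplicate” plus consensus, here is a toy sketch in which a write is acknowledged only once a majority of its three replicas accept it. The `Node` class and the quorum arithmetic are invented for this example; CockroachDB’s real consensus protocol (Raft) is far more involved.

```python
# Toy model of "write in triplicate, acknowledge on a majority" -- an
# illustration of the idea only, not CockroachDB's actual Raft machinery.
REPLICATION_FACTOR = 3
QUORUM = REPLICATION_FACTOR // 2 + 1  # 2 of 3

class Node:
    def __init__(self, name: str):
        self.name = name
        self.store = {}      # key -> value held by this replica
        self.alive = True

    def write(self, key: str, value: str) -> bool:
        if not self.alive:
            return False     # a down node cannot acknowledge the write
        self.store[key] = value
        return True

def replicated_write(key: str, value: str, replicas: list) -> bool:
    """Succeed only if a majority of the three replicas acknowledge."""
    acks = sum(node.write(key, value) for node in replicas[:REPLICATION_FACTOR])
    return acks >= QUORUM

nodes = [Node(f"n{i}") for i in range(3)]
nodes[2].alive = False                    # one node is down
print(replicated_write("k", "v", nodes))  # True: 2 of 3 acks still form a quorum
```
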
Active/passive systems work, to a point

High availability (HA) is important because it ensures you can still access your data, and that you have backups, in the event of a failure. For legacy databases such as PostgreSQL, you will typically obtain HA by deploying an active/passive topology. In this configuration, you have a primary and a secondary instance of the database, and you run some sort of synchronization between the two instances to make sure they stay in sync. When the primary fails, the secondary can come online. Then, when the crisis is resolved, you bring the primary back online and return to the original configuration. This approach has worked for some time; however, it is fraught with issues. First, the synchronization between primary and secondary can be complex and may be open to alignment issues between the two systems. Second, the time it takes for the secondary to come online after a failure can be substantial, and you may lose data or transactions in the process. But what’s really scary is the remediation when two databases have been running in production without syncing with each other. How do you know what’s correct within that data?
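
As a deliberately simplified picture of that failover dance, the sketch below promotes a secondary when the primary stops responding. The class and instance names are hypothetical, and the hard parts described above (keeping the two in sync, lost transactions, reconciling diverged data) all sit outside this happy path.

```python
# Toy active/passive failover -- a simplified picture of the pattern above,
# not a production HA manager. Names are hypothetical.
class Instance:
    def __init__(self, name: str, role: str):
        self.name = name
        self.role = role          # "primary" or "secondary"
        self.healthy = True

def failover_if_needed(primary: Instance, secondary: Instance) -> Instance:
    """Promote the secondary when the primary stops responding."""
    if primary.healthy:
        return primary
    # This is where replication lag bites: any transactions the secondary
    # never received from the primary are simply lost.
    secondary.role = "primary"
    return secondary

primary = Instance("pg-primary", "primary")
standby = Instance("pg-standby", "secondary")
primary.healthy = False                      # simulate an outage
active = failover_if_needed(primary, standby)
print(active.name, active.role)              # pg-standby primary
```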

Again, CockroachDB naturally delivers high availability, and it does so without any complex or costly operations. As noted above, data is written in triplicate across several member nodes of the database, so that if an individual node fails, you still have two copies of the data. The database is even smart enough to notice if a replica is missing and will recreate it on another node, ensuring the replicas stay complete.
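
To picture that repair behavior, here is a toy sketch that tops replicas back up to three copies when a node holding one of them disappears. The data structures are invented for this example; this is not CockroachDB’s actual repair logic.

```python
# Toy illustration of re-replication: when a node holding a replica disappears,
# copy the data to another live node to get back to three copies. A sketch of
# the idea only, not CockroachDB's actual repair logic.
REPLICATION_FACTOR = 3

def repair(replica_map: dict, live_nodes: set) -> None:
    """replica_map maps each key to the set of node names holding a copy."""
    for key, holders in replica_map.items():
        holders &= live_nodes                      # drop replicas on dead nodes
        candidates = sorted(live_nodes - holders)  # live nodes without a copy
        while len(holders) < REPLICATION_FACTOR and candidates:
            holders.add(candidates.pop(0))         # recreate the missing replica

replicas = {"k1": {"n1", "n2", "n3"}}
repair(replicas, live_nodes={"n1", "n2", "n4", "n5"})  # n3 has failed
print(replicas)  # e.g. {'k1': {'n1', 'n2', 'n4'}} -- back to three copies
```
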
Geographic scale: expanding your business

Another way to look at scale is the ability to expand your presence into new geographies. With a legacy database like PostgreSQL, you will typically accomplish this by setting up a mirrored instance and then running each database in each region separately.
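
To make concrete what running a separate database per region means for the application, here is a minimal sketch of per-region connection routing using the psycopg2 driver. The region names and connection details are hypothetical; anything that spans regions then has to be coordinated by the application and its operators.

```python
import psycopg2  # standard PostgreSQL driver, assumed to be installed

# Hypothetical, independently operated PostgreSQL instances -- one per region.
REGION_DSNS = {
    "us-east": "host=pg-us-east.example.com dbname=appdb user=app",
    "eu-west": "host=pg-eu-west.example.com dbname=appdb user=app",
}

def connect_for_region(region: str):
    """Open a connection to the database instance owned by this region."""
    try:
        dsn = REGION_DSNS[region]
    except KeyError:
        raise ValueError(f"no database deployed in region {region!r}")
    return psycopg2.connect(dsn)

# conn = connect_for_region("eu-west")
# Anything that spans regions -- global reports, users who relocate, schema
# changes -- has to be handled above the database itself.
```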
