Citus is Postgres that scales out horizontally. We do this by distributing queries across multiple Postgres servers—and as is often the case with scale-out architectures, this scale-out approach provides some great performance gains. And because Citus is an extension to Postgres, you get all the awesome features in Postgres such as support for JSONB, full-text search, PostGIS, and more.
The distributed nature of the Citus extension gives you new flexibility when it comes to modeling your data. This is good. But you’ll need to think about how to model your data and what type of database tables to use. The way you query your data ultimately determines how you can best model each table. In this post, we'll dive into the three different types of database tables in Citus and how you should think about each.
"Your father's lightsaber. This is the weapon of a Jedi Knight. Not as clumsy or random as a blaster. An elegant weapon, for a more civilized age."
—Obi-Wan Kenobi, Star Wars Episode IV: A New Hope
Announcing the release of Citus 6.2
Today I’m happy to announce that we’ve rolled out a new version of our database, Citus 6.2. Because as most of you know, good software never stops evolving. Nor should it. If you want the scoop on the new capabilities in Citus 6.2, just scroll ahead. But before diving in, I need to explain the lightsaber pic. Why? Because usually a picture speaks a thousand words, but sometimes it needs an annotation. :-)
When my colleagues first started on their journey to build Citus, they had a vision of combining the best aspects of relational databases with the elastic scale of NoSQL—to give developers a database that delivers SQL capabilities, at scale.
But vision alone does not make a successful company. The Citus co-founders needed a mix of key ingredients: the right team, good timing, good execution, a willingness to experiment and learn, plus (of course) a good idea.
When George Lucas describes his days before the first Star Wars film, he said he was “searching for just the right ingredients, characters and storyline.” In Lucas’s search for the right mix, he too had to iterate: he wrote four different screenplays before landing on the final version of the original film!
Because our CTO is such a big fan of Star Wars, Ozgun sometimes talks about his vision for Citus in the language of the Jedi: Ozgun has said his aim for Citus was “to create a database as elegant and as powerful as a lightsaber.” Now, I’m more of a Stranger Things fan myself (after all, mornings are for coffee and contemplation) but I get Ozgun’s desire to create a database that gives you the benefits of SQL—at scale.
Distributed databases often require you to give up SQL and ACID transactions as a trade-off for scale. Citus is a different kind of distributed database. As an extension to PostgreSQL, Citus can leverage PostgreSQL’s internal logic to distribute more sophisticated data models. If you’re building a multi-tenant application, Citus can transparently scale out the underlying database in a way that allows you to keep using advanced SQL queries and transaction blocks.
In multi-tenant applications, most data and queries are specific to a particular tenant. If all tables have a tenant ID column and are distributed by this column, and all queries filter by tenant ID, then Citus supports the full SQL functionality of PostgreSQL—including complex joins and transaction blocks—by transparently delegating each query to the node that stores the tenant’s data. This means that with Citus, you don’t lose any of the functionality or transactional guarantees that you are used to in PostgreSQL, even though your database has been transparently scaled out across many servers. In addition, you can manage your distributed database through parallel DDL, tenant isolation, high performance data loading, and cross-tenant queries.
Note: This is a guest blog post by Giuseppe “Pino” de Candia, the creator of Dynamo. We asked Pino to chime in with his thoughts on distributed databases and the trends he sees in this space. You can read more about Pino here.
When Ozgun, one of the founders of Citus Data, emailed me resources on scaling multi-tenant databases for B2B apps and asked me what I thought, all kinds of distributed systems tradeoffs started crossing my mind—along with memories of the forces that shaped Dynamo.
It’s been a decade since my team at Amazon worked on Dynamo, a highly available and scalable key-value store. By the time we started working on the project, Amazon was already going through two transitions.
One of the most common questions we get at Citus is how the rebalancer works–which is understandable. When you have an elastic scaled out database, how easy it is to scale is a key factor into how usable it will be. While we'll happily take time for anyone that's interested and live demo it, this walk through should give you a better idea of how it works for those of you that are curious.
Written byByMetin Doslu | March 15, 2017Mar 15, 2017
For many SaaS products, a common database problem is having one customer that has so much data, it adversely impacts other customers on the shared machine. This leads many to ask, “What do I do with my largest customer?”
Tenant isolation is a great way to solve this issue. Effectively it allows you to control which tenant or customer in particular you want to isolate on a completely new node. By separating a tenant, you get dedicated resources with more memory and cpu processing power.
A number of SaaS applications have data models where they want to have their customers interact with only their data. At the enterprise end you have companies like Salesforce and Workday that fall into this bucket, but we see a ton of small ones as well. If you're just getting started figuring out how you should approach your data so it can scale in the future, it doesn't have to be hard.
Here we're going to walk through an example data model that you can use as a basis for learning how you could apply the same to your own multi-tenant application.
Microservices and NoSQL get a lot of hype, but in many cases what you really want is a relational database that simply works, and can easily scale as your application data grows. Microservices can help you split up areas of concern, but also introduce complexity and often heavy engineering work to migrate to them. Yet, there are a lot of monolithic apps out that do need to scale. If you don't want the added complexity of microservices, but do need to continue scaling your relational database then you can with Citus. With Citus 6.1 we're continuing to make scaling out your database even easier with all the benefits of Postgres (SQL, JSONB, PostGIS, indexes, etc.) still packed in there.
With this new release customers like Heap and ConvertFlow are able to scale from single node Postgres to horizontal linear scale. Citus 6.1 brings several improvements, making scaling your multi-tenant app even easier. These include:
Integrated reference table support
Tenant Isolation
View support on distributed tables
Distributed Vaccum / Analyze
All of this with the same language bindings, clients, drivers, libraries (like ActiveRecord) that Postgres already works with.
Give Citus 6.1 a try today on Citus Cloud, our fully managed database-as-a-service on top of AWS, or read on to learn more about all that’s included in this release.
Written byByMarco Slot | January 17, 2017Jan 17, 2017
Indexes are an essential tool for optimizing database performance and are becoming ever more important with big data. However, as the volume of data increases, index maintenance often becomes a write bottleneck, especially for advanced index types which use a lot of CPU time for every row that gets written. Index creation may also become prohibitively expensive as it may take hours or even days to build a new index on terabytes of data in postgres. As of Citus 6.0, we’ve made creating and maintaining indexes that much faster through parallelization.
Written byByLukas Fittl | January 5, 2017Jan 5, 2017
Today we’re happy to announce our new activerecord-multi-tenant Ruby library, which enables easy scale-out of applications that are built on top of Ruby on Rails and follow a multi-tenant data model.
This Ruby library has evolved from our experience working with customers, scaling out their multi-tenant apps, and patching some restrictions that ActiveRecord and Rails currently have when it comes to automatic query building. It is based on the excellent acts_as_tenant library, and extends it for the particular use-case of a distributed multi-tenant database like Citus.