“Azure Cosmos DB is a fully managed NoSQL database for modern app development” is what I've read from the documentation. However, Azure Cosmos DB for PostgreSQL has just been announced like a new service in Azure. PostgreSQL is supposed to be a SQL database, right? I'll test that in this series of posts, where I run the simplest SQL schema, the old EMP/DEPT, showing what any Relational Database for OLTP should support: unique constraints and foreign keys.
There's a free trial, with no credit card required 😀 and a 7 days limit ☹️
This “Azure Cosmos DB for PostgreSQL” is actually CitusDB. This new service is a marketing rename from Hyperscale. I've been working a lot with Oracle Database. Renaming features by marketing is not new. We can see it from the https://learn.microsoft.com/en-us/azure/postgresql/hyperscale/overview link which is an HTTP redirect to https://learn.microsoft.com/en-us/azure/cosmos-db/postgresql/introduction
Great, I didn't test CitusDB yet in this blog post series about SQL features on distributed databases. Let's create the same sample schema:
citus=> CREATE TABLE dept ( deptno integer NOT NULL, dname text, loc text, description text, CONSTRAINT pk_dept PRIMARY KEY (deptno) ); CREATE TABLE citus=> CREATE TABLE emp ( empno integer generated by default as identity (start with 10000) NOT NULL, ename text NOT NULL, job text, mgr integer, hiredate date, sal integer, comm integer, deptno integer NOT NULL, email text, other_info json, CONSTRAINT pk_emp PRIMARY KEY (empno), CONSTRAINT emp_email_uk UNIQUE (email), CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno), CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno) ); CREATE TABLE
I start to distribute the tables, following the CitusDB documentation:
citus=> SELECT create_distributed_table('dept', 'deptno'); create_distributed_table -------------------------- (1 row) citus=> SELECT create_distributed_table('emp', 'empno'); ERROR: cannot distribute relation: emp DETAIL: Distributed relations must not use GENERATED ... AS IDENTITY. citus=>
Ok, this standard SQL feature exists in PostgreSQL since version 12 (it is also in YugabyteDB even with PostgreSQL 11.2 compatibility by the way) but it is not supported in CitusDB. Let's try with
citus=> drop table emp; DROP TABLE citus=> CREATE TABLE emp ( empno serial NOT NULL, ename text NOT NULL, job text, mgr integer, hiredate date, sal integer, comm integer, deptno integer NOT NULL, email text, other_info json, CONSTRAINT pk_emp PRIMARY KEY (empno), CONSTRAINT emp_email_uk UNIQUE (email), CONSTRAINT fk_deptno FOREIGN KEY (deptno) REFERENCES dept(deptno), CONSTRAINT fk_mgr FOREIGN KEY (mgr) REFERENCES emp(empno) ); citus=> SELECT create_distributed_table('emp', 'empno'); ERROR: cannot create foreign key constraint DETAIL: Foreign keys are supported in two cases, either in between two colocated tables including partition column in the same ordinal in the both tables or from distributed to reference tables citus=>
Another problem, very similar to what I have seen in the first post of this series, we cannot have referential integrity between distributed tables. In this example,
dept is a good candidate for a broadcasted reference table, but OLTP system of records have relationships like ORDERS/ITEMS, ACCOUNT/TRANSACTION, CUSTOMER/PAYMENT. Let's continue with this limitation of not distributing the parent table.
I change the parent table
dept table to not distribute it, as it is not supported for referential integrity, and try to distribute the child table
citus=> SELECT undistribute_table('dept',true); NOTICE: creating a new table for public.dept NOTICE: moving the data of public.dept NOTICE: dropping the old public.dept NOTICE: renaming the new table to public.dept undistribute_table -------------------- (1 row) citus=> SELECT create_reference_table('dept'); NOTICE: local tables that are added to metadata automatically by citus, but not chained with reference tables via foreign keys might be automatically converted back to postgres tables HINT: Executing citus_add_local_table_to_metadata($$public.emp$$) prevents this for the given relation, and all of the connected relations create_reference_table ------------------------ (1 row) citus=> SELECT create_distributed_table('emp', 'empno'); ERROR: cannot create constraint on "emp" DETAIL: Distributed relations cannot have UNIQUE, EXCLUDE, or PRIMARY KEY constraints that do not include the partition column (with an equality operator if EXCLUDE).
Ok, same problem as with many New SQL sharded databases that are not Distributed SQL: anything that involves cross-shard transactions is unsupported. Foreign keys and Unique constraints cannot be created. There's no global integrity enforced by the database. This is annoying in OLTP system of records where there is, in addition to the generated primary key, multiple natural keys. Those need to be validated to avoid duplicates, especially on highly available databases where transactions may be retried.
Let's drop all those constraints to see if we can go further:
citus=> alter table emp drop constraint emp_email_uk; ALTER TABLE citus=> alter table emp drop constraint fk_mgr; ALTER TABLE citus=> SELECT create_distributed_table('emp', 'empno'); create_distributed_table -------------------------- (1 row)
Good, this works. But I had to get rid about foreign key (except to the mostly static reference tables that can be distributed) and all unique secondary indexes.
Without Foreign key, if the application wants to avoid data corruption in a scalable way, it needs to run in Serializable isolation level.
Here is what could be a race condition, one user dropping a manager after checking it has no employees, another user adding an employee:
citus=> -- add the boss in department 10 citus=> insert into dept(deptno) values (10); INSERT 0 1 citus=> insert into emp(empno,ename,deptno,mgr) values(1, 'boss',10, null); INSERT 0 1 citus=> begin transaction isolation level serializable; BEGIN citus=*> -- add an employee referring to the boss citus=*> insert into emp(empno,ename,deptno,mgr) values(2, 'employee',10, 1); INSERT 0 1 -- connect a concurrent session citus=*> \! psql psql (13.7, server 14.5) citus=> begin transaction isolation level serializable; BEGIN citus=*> -- check if boss has employees citus=*> select * from emp; empno | ename | job | mgr | hiredate | sal | comm | deptno | email | other_info -------+-------+-----+-----+----------+-----+------+--------+-------+------------ 1 | boss | | | | | | 10 | | (1 row) citus=*> -- remove the boss as it has no employees citus=*> delete from emp where empno=1; DELETE 1 citus=*> commit; COMMIT -- back to session one commiting its transaction citus=> \q citus=*> commit; COMMIT -- final state: one employee with inexistent boss citus=> select * from emp; empno | ename | job | mgr | hiredate | sal | comm | deptno | email | other_info -------+----------+-----+-----+----------+-----+------+--------+-------+------------ 2 | employee | | 1 | | | | 10 | | (1 row)
This is a surprising result: no error but orphan child. This should not happen with serializable isolation level.
We should not have to read the documentation to know that an successful command actually did nothing. I can find it in the list of CitusDB limitations: https://docs.citusdata.com/en/v11.1/admin_guide/table_management.html?highlight=serializable#limitations.
I also see “No support for tuple locks” there, so no need to test for other alternatives to foreign keys. Locking whole tables is not a solution.
Being Distributed and PostgreSQL compatible is not easy. There are not a lot of players in this area which, like YugabyteDB, support all global constraints, referential integrity, isolation levels like PostgreSQL does. Actually, I know only YugabyteDB doing that but please comment if you think there are others.
CitusDB, like many sharded databases, is good for datawarhouse. They partition the data first and distribute the queries to a few shards, possibly each replicated with a primary-standby. This is sharding on top of monolithic SQL databases.
Distributed SQL like YugabyteDB, on the opposite, provide all SQL features on top of a distributed and replicated storage, which is seen as one global database with global ACID transactions. I detailed this architecture in https://www.yugabyte.com/blog/distributed-sql-yugabytedb-two-layer-architecture/