Does an UPDATE become an implied INSERT?

Cassandra, CQL

Cassandra Problem Overview


For Cassandra, do UPDATEs become an implied INSERT if the selected row does not exist? That is, if I say

 UPDATE users SET name = 'Raedwald' WHERE id = 545127;

and id is the PRIMARY KEY of the users table, and the table has no row with a key of 545127, will that be equivalent to

 INSERT INTO users (id, name) VALUES (545127, 'Raedwald');

I know that the opposite is true: an INSERT for an id that already exists becomes an UPDATE of the row with that id. Older Cassandra documentation talked about inserts actually being "upserts" for that reason.

I'm interested in the case for CQL3, Cassandra version 1.2+.

Cassandra Solutions


Solution 1 - Cassandra

Yes. In Cassandra, UPDATE is synonymous with INSERT. The CQL documentation says the following about UPDATE:

> Note that unlike in SQL, UPDATE does not check the prior existence of the row: the row is created if none existed before, and updated otherwise. Furthermore, there is no mean to know which of creation or update happened. In fact, the semantic of INSERT and UPDATE are identical.

For the semantics to be different, Cassandra would need to do a read to know if the row already exists. Cassandra is write-optimized, so you can generally assume that an ordinary write does not read before writing. The exceptions are counters (unless replicate_on_write = false), where replication on increment involves a read, and conditional writes that use an IF clause, which have to read the current row to evaluate the condition.

Solution 2 - Cassandra

Unfortunately, the accepted answer is not 100% accurate. Inserts differ from updates:

cqlsh> create table ks.t (pk int, ck int, v int, primary key (pk, ck));
cqlsh> update ks.t set v = null where pk = 0 and ck = 0;
cqlsh> select * from ks.t where pk = 0 and ck = 0;

 pk | ck | v
----+----+---

(0 rows)
cqlsh> insert into ks.t (pk,ck,v) values (0,0,null);
cqlsh> select * from ks.t where pk = 0 and ck = 0;

 pk | ck | v
----+----+------
  0 |  0 | null

(1 rows)

Scylla does the same thing.

In Scylla and Cassandra rows are sequences of cells. Each column gets a corresponding cell (or a set of cells in the case of non-frozen collections or UDTs). But there is one additional, invisible cell - the row marker (in Scylla at least; I suspect Cassandra has something similar).

The row marker makes a difference for rows in which all other cells are dead: a row shows up in a query if and only if it has at least one live cell. Thus, if the row marker is live, the row will show up even if all other columns were previously set to null, e.g. by updates.

Inserts create a live row marker, while updates don't touch the row marker, so the two are clearly different. The example above illustrates that. One could argue that row markers are "internal" to Cassandra/Scylla, but as you can see, their effects are visible. Row markers affect your life whether you like it or not, so it is worth keeping them in mind.
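
The rule can be sketched with a toy in-memory model. This is only an illustration of the visibility rule described above, not how Cassandra or Scylla actually store data: the class `ToyTable` and its method names are hypothetical, and timestamps and tombstones are ignored.

```python
class ToyTable:
    """Toy model: a row is a row-marker flag plus a dict of cells."""

    def __init__(self):
        # primary key -> {"marker": bool, "cells": {column: value or None}}
        self.rows = {}

    def _row(self, key):
        return self.rows.setdefault(key, {"marker": False, "cells": {}})

    def insert(self, key, **cells):
        row = self._row(key)
        row["marker"] = True  # INSERT writes a live row marker
        row["cells"].update(cells)

    def update(self, key, **cells):
        # UPDATE writes cells only and leaves the row marker untouched
        self._row(key)["cells"].update(cells)

    def select(self, key):
        row = self.rows.get(key)
        if row is None:
            return None
        # A row is returned only if the marker or at least one cell is live
        live = row["marker"] or any(v is not None for v in row["cells"].values())
        return dict(row["cells"]) if live else None


updated = ToyTable()
updated.update((0, 0), v=None)
print(updated.select((0, 0)))   # None: no live cell and no row marker

inserted = ToyTable()
inserted.insert((0, 0), v=None)
print(inserted.select((0, 0)))  # {'v': None}: the row marker keeps the row visible
```

The two calls mirror the cqlsh session above: the null update leaves nothing to return, while the null insert still yields a row.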

It's sad that no documentation mentions row markers (well, I found this: https://docs.scylladb.com/architecture/sstable/sstable2/sstable-data-file/#cql-row-marker but it's in the context of explaining SSTable internals, which is probably aimed at Scylla developers more than at users).

Bonus: a cell delete:

delete v from ks.t where pk = 0 and ck = 0

is the same as a null update:

update ks.t set v = null where pk = 0 and ck = 0

Indeed, a cell delete doesn't touch the row marker either. It only sets the specified cell to null.

This is different from a row delete:

delete from ks.t where pk = 0 and ck = 0

because row deletes insert a row tombstone, which kills all cells in the row (including the row marker). You could say that row deletes are the opposite of an insert. Updates and cell deletes are somewhere in between.
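
As a rough sketch of how these four operations relate, the following toy model treats a row as a marker flag plus a dict of cells. All function names are hypothetical, and real tombstones carry timestamps that this sketch leaves out.

```python
def new_row():
    # A row in this toy model: a row-marker flag plus a dict of cells
    return {"marker": False, "cells": {}}

def insert(row, **cells):
    row["marker"] = True           # inserts write a live row marker
    row["cells"].update(cells)

def update(row, **cells):
    row["cells"].update(cells)     # updates never touch the marker

def delete_cell(row, column):
    row["cells"][column] = None    # same effect as SET column = null

def delete_row(row):
    # A row delete kills every cell, including the row marker
    row["marker"] = False
    for column in row["cells"]:
        row["cells"][column] = None

def is_visible(row):
    return row["marker"] or any(v is not None for v in row["cells"].values())


a = new_row()
insert(a, v=1)
delete_cell(a, "v")
print(is_visible(a))  # True: the marker written by the insert is still live

b = new_row()
update(b, v=1)
delete_cell(b, "v")
print(is_visible(b))  # False: the update wrote no marker, and the only cell is dead

c = new_row()
insert(c, v=1)
delete_row(c)
print(is_visible(c))  # False: the row delete killed the marker as well
```

Rows `a` and `b` differ only in how they were created, which is why a row written by an update vanishes once its last cell is deleted, while a row written by an insert stays.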

Solution 3 - Cassandra

What you can do, however, is this:

UPDATE table_name SET field = false WHERE key = 55 IF EXISTS;

This will ensure that your update is a true update and not an upsert.
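
A conditional update also reports whether it was applied (cqlsh shows this as an `[applied]` column). The sketch below mimics that behaviour with a plain dict standing in for the table; the function `update_if_exists` and the sample data are hypothetical. In the real database such a write is a lightweight transaction that reads the current row first, so it costs more than a plain UPDATE.

```python
def update_if_exists(table, key, **cells):
    """Apply the update only when the row already exists; return whether it was applied."""
    if key not in table:
        return False              # nothing is written, so no row is created
    table[key].update(cells)
    return True


users = {55: {"field": True}}

print(update_if_exists(users, 55, field=False))  # True: the row existed and was changed
print(update_if_exists(users, 56, field=False))  # False: the row did not exist
print(56 in users)                               # False: no upsert happened
```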

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
--- | --- | ---
Question | Raedwald | View Question on Stackoverflow
Solution 1 - Cassandra | Richard | View Answer on Stackoverflow
Solution 2 - Cassandra | kbr | View Answer on Stackoverflow
Solution 3 - Cassandra | KingOfHypocrites | View Answer on Stackoverflow