Cassandra: text vs varchar

Database Problem Overview


Does anyone know the difference between the two CQL data types text and varchar in Cassandra? The Cassandra documentation describes both types as "UTF-8 encoded string" and nothing more.

Database Solutions


Solution 1 - Database

text is just an alias for varchar!

The documentation:

EDIT
Here's the link to the C* 1.2 docs. The text vs varchar info is still the same, however this document contains some extra datatypes.

EDIT v2: The documentation links have been updated to point to the C* 3 docs. I couldn't find a good alternative for the C* 1.2 docs.

Solution 2 - Database

You probably meant the CQL storage types; if not, disregard my answer.

In CQL there has been an ongoing trend to distance the language from Cassandra's internals. Whether that is a good thing or a bad thing is open to interpretation. What is relevant, however, is that in the latest versions of CQL the developers have been trying to come up with syntax that is more familiar to people who are not deeply versed in Cassandra's internals.

If you take a look at this SO question, you will get a nice illustration of the situation: https://stackoverflow.com/questions/16096009/creating-column-family-or-table-in-cassandra-while-working-datastax-apiwhich-us

In recent CQL versions, some aliases that are alien to Cassandra but very well known to DBAs have started to appear. For example, Cassandra's native ColumnFamily has been aliased as Table, and text is just an alias for varchar and vice versa. Again, whether that is a good thing is a matter of opinion.

So, in conclusion, you can use varchar and text interchangeably.
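
A minimal sketch of what "interchangeable" means in practice, written as a small Python helper that only builds CQL strings (it does not connect to a cluster). The helper name create_table_cql, the keyspace demo, and the table and column names are all hypothetical, chosen purely for illustration; the point is that the two generated statements differ only in the type keyword.

```python
# Hypothetical helper: builds a CQL CREATE TABLE statement as a string.
# Nothing here talks to Cassandra; it only illustrates the syntax.
STRING_TYPES = ("text", "varchar")  # the two interchangeable CQL type names


def create_table_cql(table: str, string_type: str) -> str:
    """Return CQL DDL for a demo table whose string columns use string_type."""
    if string_type not in STRING_TYPES:
        raise ValueError(f"expected one of {STRING_TYPES}, got {string_type!r}")
    return (
        f"CREATE TABLE demo.{table} ("
        f"username {string_type} PRIMARY KEY, "
        f"bio {string_type});"
    )


with_text = create_table_cql("users_a", "text")
with_varchar = create_table_cql("users_b", "varchar")

print(with_text)
print(with_varchar)
```

Either statement should define the same kind of table, so the choice between the two keywords comes down to which name reads better to your team.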

Solution 3 - Database

The Cassandra CQL data types text and varchar are synonyms (aliases) for each other.

  1. At the storage level a varchar value is handled as bytes, much like a blob (the maximum theoretical size of a blob is 2 GB).
  2. The data type associated with text is varchar (meaning that even if you declare a column as text, Cassandra internally treats it as varchar).
  3. Being handled as bytes does not create performance issues, because Cassandra stores all values as byte arrays (blob constants are written in hexadecimal).
  4. Reads will be faster because Cassandra queries the right coordinates using the primary key (partition key, clustering columns), depending on how the table is designed.

Solution 4 - Database

This threw me too when I started with Cassandra.

Both text and varchar are UTF8 encoded strings and are synonyms for each other; that is, they are exactly the same thing.

As a side note, if you come from a relational world such as MS SQL, you might be hesitant to use these types (especially TEXT) as the primary key field of an entity. TEXT in particular is usually associated with big blobs of text content that don't scream "primary key" to one's third-normal-form relational mind. But since all Cassandra types are essentially stored as byte arrays on disk, there is no real significant performance penalty when using them as the primary key.
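
The storage point can be illustrated outside Cassandra. Here is a rough Python sketch, assuming only what the documentation quoted in the question says, namely that a text/varchar value is a UTF-8 encoded string: the value becomes a plain byte sequence, and hexadecimal is just one way of displaying those bytes. The function name utf8_bytes and the sample key are illustrative, and this says nothing about Cassandra's actual on-disk layout.

```python
def utf8_bytes(value: str) -> bytes:
    """Encode a string as UTF-8, the encoding used by text/varchar values."""
    return value.encode("utf-8")


key = "café"                        # sample key with one non-ASCII character
raw = utf8_bytes(key)

print(len(key))                     # number of characters in the string
print(len(raw))                     # number of bytes after UTF-8 encoding
print(raw.hex())                    # the same bytes displayed as hexadecimal
print(raw.decode("utf-8") == key)   # decoding gives back the original string
```

The same encoding applies whichever of the two type names the column was declared with, which is why neither name is the "heavier" choice for a key.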

Solution 5 - Database

It's the same for Cassandra. Both text and varchar are UTF8 encoded strings, so they are exactly the same thing.

CQL Data types

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | darcyy | View Question on Stackoverflow
Solution 1 - Database | Lyuben Todorov | View Answer on Stackoverflow
Solution 2 - Database | Nikola Yovchev | View Answer on Stackoverflow
Solution 3 - Database | Putti | View Answer on Stackoverflow
Solution 4 - Database | Sarel Esterhuizen | View Answer on Stackoverflow
Solution 5 - Database | LucAB | View Answer on Stackoverflow