Data structure for text editor

Data StructuresText Editor

Data Structures Problem Overview


This is an interview question. What data structure would you use to store the text in a text editor?

Data Structures Solutions


Solution 1 - Data Structures

On good old ZX-Spectrum one (or more, I do not know) text exditor used very simple structure.

There was one big buffer, which occupied all free RAM. Text was split in two parts at the cursor. Part before the cursor, was placed at the beginning of the buffer, and the rest at the end of the buffer. As text typed, data simply added to the end of first part, and when cursor is moved, text is copied forth and back.

Buffer layout:

Hello, World!
        ^------Cursor here

+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|H|e|l|l|o|,| |W| <free>  |o|r|l|d|!|
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|                ^         ^        |
begin           cur1      cur2    end

That's, how some edit operations was made:

Type a char:    buffer[cur1++] = character

Backspace:      cur1--

Cursor left:    buffer[--cur2] = buffer[--cur1]

Cursor right:   buffer[cur1++] = buffer[cur2++]

Buffer in action:

             Hello, W..............orld!
Press right          ^             ^
             Hello, Wo..............rld!
Press backspace       ^             ^
             Hello, W...............rld!
Press 0              ^              ^
             Hello, W0..............rld!
                      ^             ^

Solution 2 - Data Structures

Rope

> A rope is essentially a binary tree whose leaves are arrays of characters. A node in the tree has a left child and a right child - the left child is the first part of the string, while the right child is the final part of the string. Concatenation of two ropes simply involves the creation of a new tree node with both ropes as children. To ensure logarithmic time indexing and sub-string operations the resulting rope may need to be balanced. Various balancing strategies are possible. > >The main advantages of ropes as compared to storing strings as character arrays is that they enable much faster concatenation than ordinary strings, and don't require a large contiguous memory space to store a large string. The main disadvantages are greater overall space usage and slower indexing, both of which become more severe as the tree structure becomes larger and deeper. However, many practical applications of indexing involve only iteration over the string, which remains fast as long as the leaf nodes are large enough to benefit from cache effects.

Solution 3 - Data Structures

I know it's too late for an answer, but I found The Craft of Text Editing book really useful. It contains description of several buffer models with their pros and cons. Unfortunately, it doesn't mention Ropes data structure.

Solution 4 - Data Structures

You might find this interesting, even if it does not exactly answer your question:

https://stackoverflow.com/questions/4185624/most-efficient-data-structure-to-add-styles-to-text/4197146

I am hoping that the discussion will go to fascinating places :-)

Solution 5 - Data Structures

As @Vovanium already mentioned the basic theory of how gap buffer can be used, I have implemented a version of C/C++.

Code:

#include <stdio.h>
#define SZ 1000

char leftArray[SZ], rightArray[SZ];
int leftCount, rightCount;
int cursorPos;

/*
 * Private APIs
 */

void printArray(){

	for(register int i = 0; i < leftCount; i++){
		printf("%c", leftArray[i]);
	}

	for(register int i = rightCount - 1; i >= 0; i--){
		printf("%c", rightArray[i]);
	}
	printf("\n");
}

void adjust(int pos){

	while(leftCount > pos){
		rightArray[rightCount++] = leftArray[--leftCount];
	}

	while(leftCount < pos){
		leftArray[leftCount++] = rightArray[--rightCount];
	}
}


/*
 * Public APIs for Text Editor
 */

void init(){

	cursorPos = leftCount = rightCount = 0;
}

void addWord(char word[], int len){
    
    adjust(cursorPos);

	for(register int i = 0; i < len; i++){
		leftArray[leftCount++] = word[i];
	}
	leftArray[leftCount] = 0;
}

void addBackSpace(){
    
    adjust(cursorPos);
    leftCount--;
}

void moveCurson(int newPosition){

    cursorPos = newPosition;
}

void subString(int pos, int length, char result[]){

    	adjust(cursorPos);
    
	int k = 0;
    	int l = 0;
	while(k + pos < leftCount && k < length){
		result[k] = leftArray[pos + k];
		k++;
	}

	length -= k;
	while( l < length){
		result[k++] = rightArray[rightCount - 1 - l];
		l++;
	}
}

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionMichaelView Question on Stackoverflow
Solution 1 - Data StructuresVovaniumView Answer on Stackoverflow
Solution 2 - Data StructuresPeter AlexanderView Answer on Stackoverflow
Solution 3 - Data Structuresroman-kashitsynView Answer on Stackoverflow
Solution 4 - Data StructuresthkalaView Answer on Stackoverflow
Solution 5 - Data StructuresSazzad Hissain KhanView Answer on Stackoverflow