Real time collaborative editing - how does it work?

Javascript, Ajax

Javascript Problem Overview


I'm writing an application in which I'd like to have near real time collaborative editing features for documents (Very similar to Google Documents style editing).

I'm aware of how to keep track of cursor position; that's simple. Just poll the server every half second or second with the current user id, filename, line number and column number, which can be stored in a database, and the return value of this polling request is the position of the other users' cursors.

What I don't know how to do is update the document in such a way that it won't throw your cursor off and force a full reload, as that would be far too slow for my purposes.

This really only has to work in Google Chrome, preferably Firefox as well. I don't need to support any other browser.

Javascript Solutions


Solution 1 - Javascript

The algorithm used behind the scenes for merging collaborative edits from multiple peers is called operational transformation. It's not trivial to implement though.
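
To give a feel for the core idea, here is a minimal sketch that handles plain-text inserts only. The operation shape `{ pos, text, site }` and the `apply` / `transform` helpers are made up for illustration (they are not taken from any particular library), and `site` is an assumed per-user id used only to break ties. A real implementation also has to handle deletes, more than two peers, and server-side ordering.

```javascript
// Sketch only: operational transformation for concurrent text inserts.
// An operation is { pos, text, site }; all names here are hypothetical.

// Apply an insert operation to a document string.
function apply(doc, op) {
  return doc.slice(0, op.pos) + op.text + doc.slice(op.pos);
}

// Rewrite `op` so that it can be applied AFTER the concurrent insert `other`.
// If `other` landed before `op` (or at the same index with a lower site id),
// `op` must shift right by the length of the text `other` inserted.
function transform(op, other) {
  const otherIsFirst =
    other.pos < op.pos || (other.pos === op.pos && other.site < op.site);
  return otherIsFirst ? { ...op, pos: op.pos + other.text.length } : op;
}

const base = "hello world";
const a = { pos: 5, text: ",", site: 1 };  // Alice inserts "," after "hello"
const b = { pos: 11, text: "!", site: 2 }; // Bob appends "!" at the same time

// Each peer applies its own edit first, then the transformed remote edit.
const atAlice = apply(apply(base, a), transform(b, a));
const atBob = apply(apply(base, b), transform(a, b));

console.log(atAlice); // "hello, world!"
console.log(atBob);   // "hello, world!"
```

Both peers end up with the same text even though they applied the edits in a different order, which is the property the transformation step exists to guarantee.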

See also this question for useful links.

Solution 2 - Javascript

Real time collaborative editing requires several things to be effective. Most of the other answers here focus on only one aspect of the problem, namely distributed state (a.k.a. shared mutable state). Operational Transformation (OT), Conflict-Free Replicated Data Types (CRDT), Differential Synchronization, and other related technologies are all approaches to achieving near-real-time distributed state. Most focus on eventual consistency, which allows each participant's state to diverge temporarily but guarantees that all participants' states will converge once editing stops. Other answers have mentioned several implementations of these technologies.

However, once you have shared mutable state, you need several other features to provide a reasonable user experience. Examples of these additional concepts include:

  • Identity: Who the people you are collaborating with are.
  • Presence: Who is currently "here" editing with you now.
  • Communication: Chat, audio, video, etc., that allow users to coordinate actions.
  • Collaborative Cueing: Features that give indications as to what the other participants are doing and/or are about to do.

Shared cursors and selections are examples of Collaborative Cueing (a.k.a. Collaboration Awareness). They help users understand the intentions and likely next actions of the other participants. The original poster was partly asking about the interplay between shared mutable state and collaborative cueing. This is important because the location of a cursor or selection is typically described in terms of positions within the document. The issue is that the location of a cursor (for example) depends on the state of the document it was measured against. When I say my cursor is at index 37, that means character 37 in the document I am looking at. The document you have right now may differ from mine, due to your edits or those of other users, and therefore index 37 in your document may not refer to the same place.
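
The index 37 problem above can be sketched as a small function that shifts a cursor index whenever an edit arrives. This is only an illustration under assumed names: `transformCursor` and the operation shapes `{ type: "insert", pos, text }` and `{ type: "delete", pos, length }` are hypothetical, not any library's API. Whether an insert exactly at the cursor pushes it right is a policy choice; this sketch pushes it.

```javascript
// Sketch only: keep a cursor index meaningful after an edit is applied.
// `cursor` is a character index into the document before the edit.
function transformCursor(cursor, op) {
  if (op.type === "insert") {
    // Text inserted at or before the cursor pushes the cursor right.
    return op.pos <= cursor ? cursor + op.text.length : cursor;
  }
  if (op.type === "delete") {
    // A deletion starting at or after the cursor does not move it.
    if (op.pos >= cursor) return cursor;
    // Otherwise move left, but never past the start of the deleted range
    // (this covers the case where the cursor was inside the deleted text).
    return Math.max(op.pos, cursor - op.length);
  }
  throw new Error(`unknown operation type: ${op.type}`);
}

// My cursor is at index 37 when edits made elsewhere in the document arrive.
const afterInsert = transformCursor(37, { type: "insert", pos: 10, text: "abc" });
const afterDelete = transformCursor(37, { type: "delete", pos: 10, length: 5 });
const insideDelete = transformCursor(37, { type: "delete", pos: 35, length: 10 });

console.log(afterInsert);  // 40
console.log(afterDelete);  // 32
console.log(insideDelete); // 35
```

In a real system this adjustment has to run against the same ordered stream of operations that the concurrency control layer applies to the text, otherwise the cursors drift.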

So the mechanism you use to distribute cursor locations must be somehow integrated into or at least aware of the mechanism of the system that provides concurrency control over the shared mutable state. One of the challenges today is that while there are many OT / CRDT, bidirectional messaging, chat, and other libraries out there, they are isolated solutions that are not integrated. This makes it hard to build an end user system that provides a good user experience, and often results in technical challenges left to the developer to figure out.

Ultimately, to implement an effective real time collaborative editing system, you need to consider all of these aspects; and we haven't even discussed history, authorization, application level conflict resolution, and many other facets. You must build or find technologies that support each of these concepts in a way that makes sense for your use case. Then you must integrate them.

The good news is that applications that support collaborative editing are becoming much more popular. Technologies that support building them are maturing and new ones are becoming available every month. Firebase was one of the first solutions that tried to wrap many of these concepts into an easy to use API. A newcomer, Convergence (full disclosure, I am a founder of Convergence Labs), provides an all-in-one API that supports the majority of these collaborative editing facets and can significantly reduce the time, cost, and complexity of building real time collaborative editing apps.

Solution 3 - Javascript

You don't necessarily need xmpp or wave for this. Most of the work on an opensource implementation called infinote has already been done with jinfinote (https://github.com/sveith/jinfinote). Jinfinote was recently also ported to python (https://github.com/phrearch/py-infinote) to handle concurrency and document state centrally. I currently use both within the hwios project (https://github.com/phrearch/hwios), which relies on websockets and json transport. You don't really want to use polling for this kind of application. Also xmpp seems to complicate things unnecessarily imo.
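
As a rough illustration of what json over websockets can look like, here is a transport-agnostic sketch. The message format `{ type, payload }` and the helpers `encodeMessage` and `createDispatcher` are invented for this example; this is not the infinote or hwios wire protocol. The networking part is left in comments so the snippet runs without a server.

```javascript
// Sketch only: a tiny JSON message envelope for pushing edits and cursors.
// The { type, payload } format is an assumption, not a real protocol.
function encodeMessage(type, payload) {
  return JSON.stringify({ type, payload });
}

// Build a function that parses an incoming message and routes it by type.
function createDispatcher(handlers) {
  return function dispatch(raw) {
    const { type, payload } = JSON.parse(raw);
    // Ignore message types we have no handler for.
    if (!Object.prototype.hasOwnProperty.call(handlers, type)) return false;
    handlers[type](payload);
    return true;
  };
}

// Track the latest known cursor index per user.
const cursors = new Map();
const dispatch = createDispatcher({
  cursor: ({ user, index }) => cursors.set(user, index),
});

// In a browser this would be wired to a WebSocket roughly like so
// (the URL is a placeholder):
//   const socket = new WebSocket("wss://example.invalid/doc");
//   socket.onmessage = (event) => dispatch(event.data);
//   socket.send(encodeMessage("cursor", { user: "alice", index: 12 }));

// Simulate a message pushed by the server.
dispatch(encodeMessage("cursor", { user: "bob", index: 37 }));
console.log(cursors.get("bob")); // 37
```

The point of the push model is that the server sends a message the moment another user moves or types, instead of every client asking once or twice a second.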

Solution 4 - Javascript

After coming upon this question and doing a more careful search, I think the best standalone application to check out would be Etherpad, which runs as a JS browser app and uses Node.js on the server side. The technology behind this is known as operational transformation.

Etherpad was originally a pretty heavyweight application that was bought by Google and incorporated into Google Wave, which failed. The code was released as open source and the technology was rewritten in Javascript for Etherpad Lite, now renamed just "Etherpad". Some of the Etherpad technology was probably also incorporated into Google Docs.

Since Etherpad, there have been various versions of this technology, notably some Javascript libraries that allow for integrating it directly into your web app.

I am the maintainer of the meteor-sharejs package for adding realtime editors directly to a Meteor app, which IMHO is the best of both worlds :)

Solution 5 - Javascript

As Gintautas pointed out, this is done by Operational Transformation. As I understand it, the bulk of the research and development on this feature was done as part of the now-defunct Google Wave project, and is known as the Wave Protocol. Fortunately, Google Wave is open-sourced, so you can get some good code samples at http://code.google.com/p/wave-protocol/

Solution 6 - Javascript

The Google Docs team did a little bit of a case study around how the real time collaboration worked, but I can't find the blog entry.

There is some decent stuff on the Wikipedia page, though: http://en.wikipedia.org/wiki/Collaborative_real-time_editor

Solution 7 - Javascript

I've recently published a repository with a working example of what it seems you're trying to achieve:

https://quill-sharedb-cursors.herokuapp.com

It's based on ShareDB (OT) as the backend and the Quill rich text editor on the frontend.

It basically just wires these pieces together with some extra code to draw the cursors. The code should be fairly simple to understand and to copy over to any specific solution.

Hope it helps with the endeavor.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content Type | Original Author | Original Content on Stackoverflow
Question | Brandon Wamboldt | View Question on Stackoverflow
Solution 1 - Javascript | Gintautas Miliauskas | View Answer on Stackoverflow
Solution 2 - Javascript | Michael MacFadden | View Answer on Stackoverflow
Solution 3 - Javascript | Phrearch | View Answer on Stackoverflow
Solution 4 - Javascript | Andrew Mao | View Answer on Stackoverflow
Solution 5 - Javascript | StriplingWarrior | View Answer on Stackoverflow
Solution 6 - Javascript | arnorhs | View Answer on Stackoverflow
Solution 7 - Javascript | pedrosanta | View Answer on Stackoverflow