Real Time Collaboration Technology Roundup
There are several known approaches available to anyone who wants to create a multi-user application from scratch, or to turn an existing single-user application into a multi-user one. The purpose of this page is to give you an overview of your options. An excerpt from a talk Raphael (our CTO) gave on the matter will allow you to go further.
Real time collaboration implies:
Doing this while keeping the user interface reactive (<100ms latency between a user action and the display of that action), within current network limitations, requires “optimistic replication”: each user edits a local replica of the document, and the changes are sent afterwards. Allowing concurrent editing also means that we can’t rely on any locking mechanism that blocks a document while one user is editing it.
Real time concurrent editing between multiple users must satisfy the following requirements:
- convergence: edits from all users converge to one document state (e.g. XabYc)
- user intention preservation: Y has to remain between b & c no matter what
- causality preservation: the context of actions matters in a collaborative situation
The complexity comes mainly from the fact that one user’s operation can change the context in which another user’s operation is applied. It can therefore alter the result of that second operation (for instance by shifting an index that the second operation relies on) and produce unwanted results: surprising for the users, liable to break the document model, or both.
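The index problem can be reproduced in a few lines. The sketch below is only an illustration: the `insert` helper and the three-character document are assumptions made for the example, not part of any library. Two users start from `abc`; one inserts `X` at the start while the other inserts `Y` between `b` and `c`, and each replica then replays the remote operation exactly as it was received.

```python
def insert(doc: str, pos: int, text: str) -> str:
    """Return a copy of `doc` with `text` inserted at index `pos`."""
    return doc[:pos] + text + doc[pos:]


base = "abc"

# User 1 inserts "X" at index 0; user 2 inserts "Y" at index 2 (between b and c).
# Each replica applies its own edit first, then the remote edit unchanged.
replica_1 = insert(insert(base, 0, "X"), 2, "Y")
replica_2 = insert(insert(base, 2, "Y"), 0, "X")

print(replica_1)  # XaYbc: Y is no longer between b and c
print(replica_2)  # XabYc: the intended result
```

The two replicas end up different, and the first one also breaks the intention of the user who typed `Y`.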
The operational transformation (OT) approach is to transform operations on the fly, through algorithms, to account for changes in context. For instance, instead of inserting a character at position 10, we’ll insert it at position 11 if, in the meantime, another user has inserted a character before that position.
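As a minimal sketch of that idea, and assuming the only operation is a text insertion, a transformation function can shift the position of an operation when a concurrent insertion landed before it. The names `Insert`, `apply_op`, `transform` and the `site` tie-breaker are hypothetical and chosen for the example; real OT systems handle many more operation types and cases.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Insert:
    pos: int    # index at which the text is inserted
    text: str   # text to insert
    site: int   # hypothetical user id, used only to break ties


def apply_op(doc: str, op: Insert) -> str:
    """Return a copy of `doc` with the insertion applied."""
    return doc[:op.pos] + op.text + doc[op.pos:]


def transform(op: Insert, other: Insert) -> Insert:
    """Rewrite `op` so it can be applied after the concurrent `other`."""
    other_is_before = other.pos < op.pos or (
        other.pos == op.pos and other.site < op.site
    )
    if other_is_before:
        # `other` pushed the text to the right, so shift `op` accordingly.
        return Insert(op.pos + len(other.text), op.text, op.site)
    return op


doc = "0123456789abc"
a = Insert(pos=10, text="X", site=1)  # user A types at position 10
b = Insert(pos=3, text="Y", site=2)   # user B concurrently types at position 3

# Each replica applies its local operation, then the transformed remote one.
replica_a = apply_op(apply_op(doc, a), transform(b, a))
replica_b = apply_op(apply_op(doc, b), transform(a, b))

print(transform(a, b).pos)     # 11
print(replica_a == replica_b)  # True
```

With a single operation type, one transformation rule is enough. Each additional operation type has to be transformed against every other one, which is where the implementation cost comes from.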
Notable implementations: Google Apps for Work (Google Docs, Sheets, etc.), Etherpad, Google Wave
Pros:
- Well-established solution for collaborative text editing
- Extensive academic literature
- Successful projects have been built with it
Cons:
- Complexity grows steeply with each operation added, since every new operation has to be transformed against all the existing ones.
- Complexity of the algorithm grows with the number of concurrent users.
- Not a turnkey solution for complex applications
Analysis: OT is probably the first widely shared theory for solving real time collaboration. A common view among experts (us included) is that OT loses an essential piece of information: the context needed to recover the user’s intention. OT tries to solve the problem without handling that context, so complexity climbs steeply as the application grows. As a result, applications using it are likely to compromise on features while their development time keeps going up. This is well described in a quote from Joseph Gentle, who worked on Google Wave:
“Unfortunately, implementing OT sucks. There's a million algorithms with different tradeoffs, mostly trapped in academic papers. The algorithms are really hard and time consuming to implement correctly. ... Wave took 2 years to write and if we rewrote it today, it would take almost as long to write a second time.”
Indeed, Google Wave needed 200 algorithms to handle all the possible combinations between 15 operations.
In our view, you can consider OT for simple projects, ideally web based, provided the libraries you pick offer everything you need (and may need later). Be especially mindful of how undo is implemented: it has to be specific to each user.
Differential synchronization is about finding a semantic difference between two documents and applying the change whenever possible, even if no conflict has occurred. It relies on the ability to compute smart, context-based differences.
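The diff-and-patch step at the heart of this approach can be sketched with Python’s standard `difflib` module. This is a simplified illustration, not the Diff, Match and Patch library: `make_delta` and `apply_delta` are hypothetical helpers, and the patch is applied to an exact copy of the original text, whereas a real implementation also needs fuzzy matching to patch a document that has changed in the meantime.

```python
from difflib import SequenceMatcher


def make_delta(old: str, new: str) -> list:
    """Return the (start, end, replacement) edits that turn `old` into `new`."""
    matcher = SequenceMatcher(a=old, b=new, autojunk=False)
    delta = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            # Only the changed spans travel over the network.
            delta.append((i1, i2, new[j1:j2]))
    return delta


def apply_delta(text: str, delta: list) -> str:
    """Apply the edits from right to left so earlier indices stay valid."""
    for start, end, replacement in reversed(delta):
        text = text[:start] + replacement + text[end:]
    return text


old = "The cat sat on the mat"
new = "The black cat sat on the rug"

delta = make_delta(old, new)
print(apply_delta(old, delta) == new)  # True
```

Only the changed spans are sent, which is what keeps the network impact low.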
Notable implementations: Git, SVN
Libraries/frameworks: Diff, Match and Patch from Neil Fraser.
Pros:
- Captures user intention
- Minimal network impact (only the difference between two states is sent)
Cons:
- Not applicable to RTC
Analysis: This technique is used by tools like Git and SVN. However, since it doesn’t provide a merging solution for every complex conflict and relies on manual conflict management, we can’t consider it a real solution for real time collaboration as defined above. It can still be used to ease collaboration through a kind of manual sync. It is also worth knowing because it brings a great idea: keeping as much context as possible in order to work out the user’s intention.
CRDT (conflict-free replicated data type) considers that transforming operations is too complex and establishes a new model to handle RTC. In this model, objects are never destroyed, only disabled. Moreover, operations that change the model must satisfy three constraints: they can be applied in any order (commutative and associative) and as many times as needed (idempotent, meaning that applying an operation twice gives the same result as applying it once). The great thing is that conflicts then become impossible: if you move an object that has just been “destroyed”, you in fact move a “hidden” object. As a result, though, your model grows monotonically, and designing operations under such constraints quickly becomes difficult, with a share of research and risk attached.
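One of the simplest data types with these properties is a two-phase set, sketched below under the assumption that replicas exchange their whole state. The class and method names are chosen for the example. Removing an item only records a tombstone, and merging two replicas is a set union, which is commutative, associative and idempotent.

```python
from dataclasses import dataclass, field


@dataclass
class TwoPhaseSet:
    added: set = field(default_factory=set)
    removed: set = field(default_factory=set)  # tombstones, never deleted

    def add(self, item) -> None:
        self.added.add(item)

    def remove(self, item) -> None:
        # The object is only hidden; its tombstone stays forever,
        # which is why the state can only grow.
        self.removed.add(item)

    def visible(self) -> set:
        return self.added - self.removed

    def merge(self, other: "TwoPhaseSet") -> "TwoPhaseSet":
        """Union of both halves: the order and repetition of merges don't matter."""
        return TwoPhaseSet(self.added | other.added, self.removed | other.removed)


a, b = TwoPhaseSet(), TwoPhaseSet()
a.add("track-1")
a.add("track-2")
b.add("track-1")
b.remove("track-1")  # replica B hides an object that replica A still shows

merged = a.merge(b)
print(merged.visible())            # {'track-2'}
print(a.merge(b) == b.merge(a))    # True (commutative)
print(merged.merge(b) == merged)   # True (idempotent)
```

The limits described above show up even here: a removed item can never be added back, and the `removed` set keeps growing for the lifetime of the document.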
Notable implementations: Bet365 (counter), League of Legends (chat)
History: The CRDT concept was first formally defined in 2007 by Marc Shapiro and Nuno Preguiça in terms of operation commutativity. Development was initially motivated by collaborative text editing. The concept of semilattice evolution of replicated states was first defined by Baquero and Moura in 1997, and development was initially motivated by mobile computing. The two concepts were later unified in 2011.
Pros:
- Very robust
- Highly scalable
- Libraries with some operations are already available
Cons:
- Hard to design new operations: it can take a long time, with an uncertain outcome (R&D)
- Undo is hard to implement, and performance issues are possible
- System invariants are built into the operations: when the invariants change, you need to redesign the operations
- P2P puts demands on users’ network management skills
In a nutshell: CRDT has proven very efficient at dealing with massive concurrent editing (11K text entries per second for League of Legends’ chat). Anything that requires operations not already available in the libraries is likely to be too much of a burden to implement. It is typically not suited to editing applications with a substantial UI, and is recommended for server infrastructure. You can read more about CRDT and Collaborative Editing from an implementation perspective here.
Flip isn’t based on P2P, which lets it support a total order of events and makes dealing with concurrent editing mathematically much simpler. Data is stored in a place that’s always accessible. Concurrent editing is solved through a set of algorithms based on the “compare&swap” principle. Modifications of the document go through transactions, and Flip’s core, which is a kind of virtual machine, can play a transaction forward and backward. Thanks to this property, Flip can roll back a transaction when the server detects a conflict. As a (meaty) bonus, the same property also handles the undo-redo mechanism.
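The general principle can be sketched as follows. This is not Flip’s API: `Change`, `play_forward`, `play_backward` and `Server` are hypothetical names, and the document is reduced to a plain dictionary in which `None` stands for an absent value. The sketch only shows how a central server can accept a transaction with a compare-and-swap on a revision number, and how a reversible transaction lets a client roll back a rejected edit.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Change:
    """One reversible step: `key` goes from `old` to `new` (None = absent)."""
    key: str
    old: object
    new: object


def _assign(doc: dict, key: str, value: object) -> None:
    if value is None:
        doc.pop(key, None)
    else:
        doc[key] = value


def play_forward(doc: dict, tx: list) -> None:
    for change in tx:
        _assign(doc, change.key, change.new)


def play_backward(doc: dict, tx: list) -> None:
    for change in reversed(tx):
        _assign(doc, change.key, change.old)


class Server:
    """Single authority, so accepted transactions get a total order."""

    def __init__(self) -> None:
        self.doc = {}
        self.revision = 0

    def commit(self, base_revision: int, tx: list) -> bool:
        # Compare-and-swap: accept only if the client built `tx` on the
        # revision the server currently holds.
        if base_revision != self.revision:
            return False
        play_forward(self.doc, tx)
        self.revision += 1
        return True


server = Server()
local_a, local_b = {}, {}

tx_a = [Change("gain", None, 0.5)]
tx_b = [Change("gain", None, 0.8)]
play_forward(local_a, tx_a)  # both users edit their replica optimistically
play_forward(local_b, tx_b)

accepted_a = server.commit(0, tx_a)  # first to arrive is accepted
accepted_b = server.commit(0, tx_b)  # stale base revision: conflict
if not accepted_b:
    play_backward(local_b, tx_b)     # roll the rejected transaction back
```

The same `play_backward` call on an accepted transaction is what an undo would use, which is why reversible transactions give undo-redo almost for free in this kind of design.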
Notable implementations: Ohm Studio (Digital Audio Workstation), others to be announced.
History: Flip was initially designed by and for a small team working on a large project. We needed to create it because no existing solution was up to the task of building the real time collaborative audio software of our dreams. After 7 refactorings and 7 years of R&D and development, we were able to test the technology live in the field with 130k users on 350k projects. We realized it was a unique solution to a problem that many other developers were facing, so we packaged the technology we had developed. Since then we’ve helped small and large teams adopt it, both for existing products and for new, upcoming ones.
Pros:
- Designed for tool apps and the kind of document editing that happens through a UI
- Can handle complex data models
- Undo, Redo, Versioning out of the box
- Easy to integrate, even in apps with a lot of legacy code (even 20 years old)
- Fast & stable
Cons:
- Not optimized for text editing (yet)
- Can support large documents (in MB) but not huge ones (in GB), resources not included (e.g. a video montage is fine)
Analysis: we switched our activity from B2C software publishing to distributing Flip because we think it’s a precious, unique solution with the potential to make real time collaboration a new standard in applications. It can cope with complex needs quickly and reliably. There’s a strong focus on meeting the typical needs and habits of application developers, based on the classic MVC design.
Ultimately, Flip is the sole third-party solution for software publishers with UI-intensive applications in need of real time concurrent editing. It’s also attractive for simpler projects because it simplifies development: not only the strictly RTC-related features but also undo/redo. Indeed, our partners’ team sizes vary from 3 to over 60 people.