Facebook Architecture

FacebookDesign PatternsArchitecture

Facebook Problem Overview


I have been scrounging for articles/info about the architecture at Facebook, the challenges & ways they tackle them. What they use & why they use. How do they scale & what are the design decisions for what they do etc. Main underpinning being to learn. Knowing about sites which handles such massive traffic gives lots of pointers for architects etc. to keep in mind certain stuff while designing new sites. I am sharing what I found.

  1. Facebook Science & Social Graph (Video)
  2. Scale at Facebook
  3. Facebook Chat Architecture
  4. Facebook Blog
  5. Facebook Cassandra Architecture and Design
  6. Facebook Engineering Notes
  7. Quora - Facebook Architecture
  8. Facebook for 600M users
  9. Hadoop & its usage at Facebook
  10. Erlang at Facebook: Chat Architecture
  11. Facebook Performance Caching Evolution
  12. Facebook Connect/Login Architecture

I have 2 more links but unable to post due to restrictions at this site. Also, please share if anyone has anything better (need not be related to Facebook only).

P.S. - I wasn't able to find good places to share this research, hence this initiative. Hope this helps someone.

Facebook Solutions


Solution 1 - Facebook

Well Facebook has undergone MANY many changes and it wasn't originally designed to be efficient. It was designed to do it's job. I have absolutely no idea what the code looks like and you probably won't find much info about it (for obvious security and copyright reasons), but just take a look at the API. Look at how often it changes and how much of it doesn't work properly, anymore, or at all.

I think the biggest ace up their sleeve is the Hiphop. http://developers.facebook.com/blog/post/358 You can use HipHop yourself: https://github.com/facebook/hiphop-php/wiki

But if you ask me it's a very ambitious and probably time wasting task. Hiphop only supports so much, it can't simply convert everything to C++. So what does this tell us? Well, it tells us that Facebook is NOT fully taking advantage of the PHP language. It's not using the latest 5.3 and I'm willing to bet there's still a lot that is PHP 4 compatible. Otherwise, they couldn't use HipHop. HipHop IS A GOOD IDEA and needs to grow and expand, but in it's current state it's not really useful for that many people who are building NEW PHP apps.

There's also PHP to JAVA via things like Resin/Quercus. Again, it doesn't support everything...

Another thing to note is that if you use any non-standard PHP module, you aren't going to be able to convert that code to C++ or Java either. However...Let's take a look at PHP modules. They are ARE compiled in C++. So if you can build PHP modules that do things (like parse XML, etc.) then you are basically (minus some interaction) working at the same speed. Of course you can't just make a PHP module for every possible need and your entire app because you would have to recompile and it would be much more difficult to code, etc.

However...There are some handy PHP modules that can help with speed concerns. Though at the end of the day, we have this awesome thing known as "the cloud" and with it, we can scale our applications (PHP included) so it doesn't matter as much anymore. Hardware is becoming cheaper and cheaper. Amazon just lowered it's prices (again) speaking of.

So as long as you code your PHP app around the idea that it will need to one day scale...Then I think you're fine and I'm not really sure I'd even look at Facebook and what they did because when they did it, it was a completely different world and now trying to hold up that infrastructure and maintain it...Well, you get things like HipHop.

Now how is HipHop going to help you? It won't. It can't. You're starting fresh, you can use PHP 5.3. I'd highly recommend looking into PHP 5.3 frameworks and all the new benefits that PHP 5.3 brings to the table along with the SPL libraries and also think about your database too. You're most likely serving up content from a database, so check out MongoDB and other types of databases that are schema-less and document-oriented. They are much much faster and better for the most "common" type of web site/app.

Look at NEW companies like Foursquare and Smugmug and some other companies that are utilizing NEW technology and HOW they are using it. For as successful as Facebook is, I honestly would not look at them for "how" to build an efficient web site/app. I'm not saying they don't have very (very) talented people that work there that are solving (their) problems creatively...I'm also not saying that Facebook isn't a great idea in general and that it's not successful and that you shouldn't get ideas from it....I'm just saying that if you could view their entire source code, you probably wouldn't benefit from it.

Solution 2 - Facebook

Facebook is using LAMP structure. Facebook’s back-end services are written in a variety of different programming languages including C++, Java, Python, and Erlang and they are used according to requirement. With LAMP Facebook uses some technologies ,to support large number of requests, like

  1. Memcache - It is a memory caching system that is used to speed up dynamic database-driven websites (like Facebook) by caching data and objects in RAM to reduce reading time. Memcache is Facebook’s primary form of caching and helps alleviate the database load. Having a caching system allows Facebook to be as fast as it is at recalling your data.

  2. Thrift (protocol) - It is a lightweight remote procedure call framework for scalable cross-language services development. Thrift supports C++, PHP, Python, Perl, Java, Ruby, Erlang, and others.

  3. Cassandra (database) - It is a database management system designed to handle large amounts of data spread out across many servers.

  4. HipHop for PHP - It is a source code transformer for PHP script code and was created to save server resources. HipHop transforms PHP source code into optimized C++. After doing this, it uses g++ to compile it to machine code.

If we go into more detail, then answer to this question go longer. We can understand more from following posts:

  1. How Does Facebook Work?
  2. Data Management, Facebook-style
  3. https://stackoverflow.com/questions/1009025/facebook-database-design
  4. https://stackoverflow.com/questions/1309231/facebook-walls-database-structure
  5. https://stackoverflow.com/questions/7487476/facebook-like-data-structure

Solution 3 - Facebook

> "Knowing about sites which handles > such massive traffic gives lots of > pointers for architects etc. to keep > in mind certain stuff while designing > new sites"

I think you can probably learn a lot from the design of Facebook, just as you can from the design of any successful large software system. However, it seems to me that you should not keep the current design of Facebook in mind when designing new systems.

Why do you want to be able to handle the traffic that Facebook has to handle? Odds are that you will never have to, no matter how talented a programmer you may be. Facebook itself was not designed from the start for such massive scalability, which is perhaps the most important lesson to learn from it.

If you want to learn about a non-trivial software system I can recommend the book "Dissecting a C# Application" about the development of the SharpDevelop IDE. It is out of print, but it is available for free online. The book gives you a glimpse into a real application and provides insights about IDEs which are useful for a programmer.

Attributions

All content for this solution is sourced from the original question on Stackoverflow.

The content on this page is licensed under the Attribution-ShareAlike 4.0 International (CC BY-SA 4.0) license.

Content TypeOriginal AuthorOriginal Content on Stackoverflow
QuestionSrikar AppalarajuView Question on Stackoverflow
Solution 1 - FacebookTomView Answer on Stackoverflow
Solution 2 - FacebookSomnath MulukView Answer on Stackoverflow
Solution 3 - FacebookJørgen FoghView Answer on Stackoverflow