Which database does Facebook use?

The first component of Apollo is its consensus layer. It is a quorum-based consensus protocol derived directly from Raft, the strong-leader consensus protocol developed at Stanford. This is a distinguishing feature of the Apollo database system.
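To make the quorum idea concrete, here is a minimal sketch of the majority rule that Raft-style protocols rely on. It only illustrates the counting rule; the function names are hypothetical and nothing here is taken from Apollo's actual code.

```python
def majority(cluster_size: int) -> int:
    """Smallest number of nodes that forms a majority of the cluster."""
    return cluster_size // 2 + 1


def is_committed(acks: int, cluster_size: int) -> bool:
    """An entry counts as committed once a majority holds a copy of it.

    `acks` includes the leader's own copy (an assumption of this sketch).
    """
    return acks >= majority(cluster_size)
```

With five nodes, three acknowledgements are enough, so the cluster keeps accepting writes even when two nodes are down.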

The second component of Apollo is the storage system inside the database, which is modelled on RocksDB. Facebook can adapt it to mimic other data structures, including the structure of its older MySQL database. Notably, Apollo is not very amenable to storage customisations.

As a result, Facebook's database teams are working on adding partial MySQL support to the new Apollo storage system. The API is a critical component of any database. In Apollo, users express their preconditions, and Apollo performs the reads or writes and returns the values only if those preconditions hold. Keep in mind that all data inside the database is split into fragments (shards).
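The precondition style described above can be sketched roughly as follows. This is an illustration only: `Shard`, `conditional_write`, `conditional_read` and the helper predicates are hypothetical names, not Apollo's real API, and a lock stands in for whatever mechanism makes a shard-level operation atomic.

```python
import threading
from typing import Any, Callable, Dict, Optional

# A precondition is a predicate over the shard's current contents.
Precondition = Callable[[Dict[str, Any]], bool]


class Shard:
    """Hypothetical in-memory shard; not Apollo's real interface."""

    def __init__(self) -> None:
        self._data: Dict[str, Any] = {}
        self._lock = threading.Lock()

    def conditional_write(self, pre: Precondition, key: str, value: Any) -> bool:
        # Check and write happen under one lock, so they are atomic per shard.
        with self._lock:
            if not pre(self._data):
                return False
            self._data[key] = value
            return True

    def conditional_read(self, pre: Precondition, key: str) -> Optional[Any]:
        # Returns None when the precondition fails or the key is missing.
        with self._lock:
            if not pre(self._data):
                return None
            return self._data.get(key)


def key_absent(key: str) -> Precondition:
    return lambda data: key not in data


def value_equals(key: str, expected: Any) -> Precondition:
    return lambda data: data.get(key) == expected


def all_of(*conditions: Precondition) -> Precondition:
    """Combine several preconditions into one."""
    return lambda data: all(cond(data) for cond in conditions)
```

For example, `shard.conditional_write(key_absent("user:1"), "user:1", "alice")` succeeds only the first time, and `all_of(...)` lets a caller combine several conditions into the precondition of a new operation.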

Therefore, any operation at the shard level in Apollo is atomic. You can combine multiple conditions and reads to build the preconditions of a new operation.

To date, users [...]. For each uploaded photo, Facebook generates and stores four images of different sizes, which translates to a total of 60 billion images and 1[...]. The current growth rate is [...] million new photos per week, which translates to 25 TB of additional storage consumed weekly.

Scribe is a flexible logging system that Facebook uses for a multitude of purposes. Facebook uses Varnish to serve photos and profile pictures, handling billions of requests every day.

HipHop was developed by Facebook and was released as open source in early [...].

A database management system serves as an interface between the database and the application programs, making sure that data is consistently organized and easily accessible.

A database can take different forms. It can be as simple as a text file or as complex as a relational database management system.

Facebook uses a number of databases to keep track of the information shared by people all over the world.

A few of them are described below. Facebook uses MySQL as the primary database management system for structured data such as wall posts, user information, timelines and so on.

This data is replicated between Facebook's data centers. Because a huge number of MySQL servers is easy to manage, providing a good quality of service becomes easier as well. MySQL has flexible replication features, including asynchronous replication, along with other features that protect the data and help keep it intact.

In HBase, data is physically partitioned into units called regions. Each region is served by a single region server, and a region server can be responsible for more than one region.
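The region layout described above can be pictured as a sorted map from start keys to region servers. The following is a simplified sketch with hypothetical names (`RegionMap`, `server_for`); it is not the HBase client API, and it assumes each region is identified only by its start row key.

```python
import bisect
from typing import List, Tuple


class RegionMap:
    """Hypothetical routing table: each region starts at a row key."""

    def __init__(self, regions: List[Tuple[str, str]]) -> None:
        # Each entry is (start_key, region_server); "" covers the lowest keys.
        self._regions = sorted(regions)
        self._starts = [start for start, _ in self._regions]

    def server_for(self, row_key: str) -> str:
        # Find the last region whose start key is <= row_key.
        idx = bisect.bisect_right(self._starts, row_key) - 1
        if idx < 0:
            raise KeyError(f"no region covers row key {row_key!r}")
        return self._regions[idx][1]


# One region server ("rs1") is responsible for two regions here.
regions = RegionMap([("", "rs1"), ("g", "rs2"), ("p", "rs1")])
```

In this example `regions.server_for("apple")` and `regions.server_for("zebra")` both resolve to `rs1`, while `regions.server_for("mango")` resolves to `rs2`.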

When data is written to an HBase table, it is first recorded in the write-ahead log (WAL), commonly known as the HLog. Once it has been written to the HLog, the data is stored in the in-memory MemStore. HBase is a storage system designed to manage huge amounts of structured data spread across many commodity servers.
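The write path above (log first, memory second) can be sketched in a few lines. This is a toy model under stated assumptions: `RegionStore` and its methods are hypothetical names, the log is a plain list rather than a file, and the replay step is added only to show why the log is written first.

```python
from typing import Dict, List, Optional, Tuple


class RegionStore:
    """Toy model of a write-ahead log plus an in-memory store."""

    def __init__(self) -> None:
        self.wal: List[Tuple[str, str]] = []   # stands in for the HLog
        self.memstore: Dict[str, str] = {}     # stands in for the MemStore

    def put(self, row: str, value: str) -> None:
        self.wal.append((row, value))  # 1. record the edit in the log
        self.memstore[row] = value     # 2. then apply it in memory

    def get(self, row: str) -> Optional[str]:
        return self.memstore.get(row)

    def crash(self) -> None:
        # Simulate losing everything that was held only in memory.
        self.memstore = {}

    def replay_wal(self) -> None:
        # Re-apply logged edits in order to rebuild the in-memory state.
        for row, value in self.wal:
            self.memstore[row] = value
```

Because every edit reaches the log before it reaches memory, calling `replay_wal()` after `crash()` restores the latest value of each row.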

RocksDB fits best when we need to store multiple terabytes of data in a single database. Typical use cases include spam detection, which requires fast access to the dataset, and graph search queries that need to scan a data set in real time.

Memcache has been used at Facebook right from the start. It reduces request latency to a large extent, which ultimately provides a smooth user experience. Memcache is a distributed memory caching system, used by big guns in the industry such as Google Cloud.

Once a value has been fetched from the database and stored in Memcache, every subsequent request is served from Memcache until the value is modified. This is where the concept of eventual consistency comes into effect. The instances of an app are geographically distributed. When one instance (a node of a distributed database) is updated in, say, Asia, it takes a while for the change to cascade to all the other running instances of the database, so that they all hold a uniform, consistent value.

This is known as eventual consistency. If a person in America requests the object right at the moment the value is updated in the Asia instance, they will receive the old value from the cache.

Facebook has an infrastructure in place to manage such an ocean of data.
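The caching behaviour and the stale read described above can be sketched as follows. This is a simplified illustration, not Facebook's implementation: the `Region` class, the `replicate` helper and the invalidate-on-write policy are assumptions made for the example, and plain dictionaries stand in for the database and for Memcache.

```python
from typing import Dict, Optional


class Region:
    """Hypothetical data-center region with its own database copy and cache."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.database: Dict[str, str] = {}
        self.cache: Dict[str, str] = {}

    def read(self, key: str) -> Optional[str]:
        # Serve from the cache when possible; otherwise load and cache it.
        if key in self.cache:
            return self.cache[key]
        value = self.database.get(key)
        if value is not None:
            self.cache[key] = value
        return value

    def write(self, key: str, value: str) -> None:
        # Update the local database and drop the now outdated cache entry.
        self.database[key] = value
        self.cache.pop(key, None)


def replicate(source: Region, target: Region, key: str) -> None:
    """Propagate one key to another region and invalidate its cache entry."""
    target.database[key] = source.database[key]
    target.cache.pop(key, None)


asia, america = Region("asia"), Region("america")
asia.write("status", "old")
replicate(asia, america, "status")
first_read = america.read("status")    # loads "old" into America's cache
asia.write("status", "new")            # update lands in Asia only
stale_read = america.read("status")    # still served from America's cache
replicate(asia, america, "status")     # the change finally arrives
fresh_read = america.read("status")
```

Between the write in Asia and the call to `replicate`, the American region keeps answering with the cached old value; once replication catches up, both regions agree again.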

Facebook has open-sourced the exact versions of Hadoop that it runs in production. It has possibly the biggest Hadoop cluster deployment in the world, processing approx. [...]. Facebook Messages uses a distributed database called Apache HBase to stream data to Hadoop clusters.

Another use case is collecting user activity logs in real time in Hadoop clusters. HBase is written in Java.

The Messenger service now uses RocksDB to store user messages. Migrating the Messenger database from HBase to RocksDB enabled Facebook to serve messages from flash memory instead of from spinning hard disks.

Also, the replication topology of MySQL is more compatible with the way Facebook data centers operate in production. This enabled the service to be more available and have better disaster recovery capabilities.

Apache Cassandra is a distributed wide-column store built in-house at Facebook for the Inbox search system. The project runs on top of an infrastructure of hundreds of nodes spread across many data centres. Cassandra is built to maintain a persistent state when nodes fail. Because it is distributed, features such as scalability, high performance and high availability are inherent. At Facebook, it is used to run data analytics on petabytes of data.


