Google's First Server (1998)
'The original hardware (circa 1998) that was used by Google when it was located at Stanford University included:
- Sun Microsystems Ultra II with dual 200 MHz processors and 256 MB of RAM. This was the main machine for the original Backrub system.
- 2 × 300 MHz dual Pentium II servers donated by Intel, with 512 MB of RAM and 10 × 9 GB hard drives between the two. The main search ran on these machines.
- F50 IBM RS/6000 donated by IBM, with 4 processors, 512 MB of memory and 8 × 9 GB hard disk drives.
- Two additional boxes containing 3 × 9 GB and 6 × 4 GB hard disk drives respectively (the original storage for Backrub). These were attached to the Sun Ultra II.
- SDD disk expansion box with another 8 × 9 GB hard disk drives donated by IBM.
- Homemade disk box which contained 10 × 9 GB SCSI hard disk drives.'
'Most of the software stack that Google uses on its servers was developed in-house. According to a well-known Google employee, C++, Java, Python and (more recently) Go are favored over other programming languages. For example, the back end of Gmail is written in Java and the back end of Google Search is written in C++. Google has acknowledged that Python has played an important role from the beginning, and that it continues to do so as the system grows and evolves.
The software that runs the Google infrastructure includes:
- Google Web Server (GWS) – custom Linux-based Web server that Google uses for its online services.
- Google File System and its successor, Colossus
- BigTable – structured storage built upon GFS/Colossus
- Spanner – planet-scale structured storage system, next generation of BigTable stack
- Google F1 – a distributed, quasi-SQL DBMS based on Spanner, replacing a custom version of MySQL.
- Chubby lock service
- MapReduce and Sawzall programming language
- TeraGoogle – Google's large search index (launched in early 2006), designed by Anna Patterson of Cuil fame.
- Caffeine (Percolator) – continuous indexing system (launched in 2010).
- Hummingbird – major search index update, including complex search and voice search.
- Borg – declarative process scheduling software
Google has developed several abstractions which it uses for storing most of its data:
- Protocol Buffers – "Google's lingua franca for data", a binary serialization format which is widely used within the company.
- SSTable (Sorted Strings Table) – a persistent, ordered, immutable map from keys to values, where both keys and values are arbitrary byte strings. It is also used as one of the building blocks of BigTable.
- RecordIO – a sequence of variable-sized records.
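To make the last two abstractions more concrete, here is a minimal Python sketch. It is not Google's implementation or on-disk format: `SSTableSketch` is a hypothetical in-memory stand-in (a real SSTable is persisted to disk), and the 4-byte big-endian length prefix used by `write_records` and `read_records` is an assumption chosen only to illustrate a sequence of variable-sized records.

```python
import bisect
import io
import struct
from typing import Dict, Iterable, Iterator, Optional, Tuple


class SSTableSketch:
    """Hypothetical in-memory stand-in for an ordered, immutable bytes-to-bytes map."""

    def __init__(self, items: Dict[bytes, bytes]) -> None:
        # Sort once at build time; tuples keep the contents immutable afterwards.
        pairs = sorted(items.items())
        self._keys = tuple(k for k, _ in pairs)
        self._values = tuple(v for _, v in pairs)

    def get(self, key: bytes) -> Optional[bytes]:
        # Binary search works because the keys are kept in sorted order.
        i = bisect.bisect_left(self._keys, key)
        if i < len(self._keys) and self._keys[i] == key:
            return self._values[i]
        return None

    def scan(self, start: bytes, end: bytes) -> Iterator[Tuple[bytes, bytes]]:
        # Ordered range scan over keys in [start, end).
        lo = bisect.bisect_left(self._keys, start)
        hi = bisect.bisect_left(self._keys, end)
        for i in range(lo, hi):
            yield self._keys[i], self._values[i]


def write_records(records: Iterable[bytes]) -> bytes:
    """Encode records back to back, each prefixed with its length (assumed framing)."""
    buf = io.BytesIO()
    for rec in records:
        buf.write(struct.pack(">I", len(rec)))
        buf.write(rec)
    return buf.getvalue()


def read_records(data: bytes) -> Iterator[bytes]:
    """Decode the length-prefixed stream produced by write_records."""
    offset = 0
    while offset < len(data):
        (length,) = struct.unpack_from(">I", data, offset)
        offset += 4
        yield data[offset:offset + length]
        offset += length
```

For example, `SSTableSketch({b"b": b"2", b"a": b"1"}).get(b"a")` looks up a key by binary search, and `list(read_records(write_records([b"x", b"hello"])))` round-trips two records of different sizes.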
Most operations are read-only. When an update is required, queries are redirected to other servers, which simplifies consistency issues. Queries are divided into sub-queries, which may be sent to different servers in parallel, reducing latency. To lessen the effects of unavoidable hardware failure, the software is designed to be fault tolerant: when a system goes down, the data is still available on other servers, which increases reliability.'
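The pattern described above (split a query into sub-queries, run them in parallel, and fall back to another server when one is down) can be sketched roughly as follows. This is an illustrative toy, not Google's code: the names `make_replica`, `query_shard` and `scatter_gather` are hypothetical, replicas are plain Python callables, and a failed server is simulated by raising `ConnectionError`.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable, List, Sequence

# A replica is modelled as a function from a query string to matching documents.
Replica = Callable[[str], List[str]]


def make_replica(docs: List[str], up: bool = True) -> Replica:
    """Build a toy replica serving `docs`; `up=False` simulates a failed machine."""
    def search(query: str) -> List[str]:
        if not up:
            raise ConnectionError("replica down")
        return [d for d in docs if query in d]
    return search


def query_shard(replicas: Sequence[Replica], query: str) -> List[str]:
    """Try each replica of one shard in turn until one of them answers."""
    last_error = None
    for replica in replicas:
        try:
            return replica(query)
        except ConnectionError as err:
            last_error = err  # fall through to the next copy of the data
    raise RuntimeError("all replicas failed") from last_error


def scatter_gather(shards: Sequence[Sequence[Replica]], query: str) -> List[str]:
    """Send the sub-query to every shard in parallel and merge the partial results."""
    with ThreadPoolExecutor(max_workers=max(1, len(shards))) as pool:
        partials = pool.map(lambda reps: query_shard(reps, query), shards)
        return sorted(hit for part in partials for hit in part)
```

As a usage example, a shard whose first replica is down still answers through its second copy:

```python
docs_a = ["google file system", "bigtable"]
shard_a = [make_replica(docs_a, up=False), make_replica(docs_a)]
shard_b = [make_replica(["google web server", "spanner"])]
results = scatter_gather([shard_a, shard_b], "google")
```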