[csu540-f05-rpf] Some insight into Google's innards

Robert Futrelle futrelle at ccs.neu.edu
Fri Nov 4 22:03:17 EST 2005


Excerpted from
http://news.com.com/Google+throws+bodies+at+OpenOffice/2100-7344_3-5920762.html

Google is notoriously reluctant to describe the particulars of
its search-computing data center, which served the demands of 380
million people in August. But Chris DiBona, Google's open-source
programs manager, did discuss some details.

The company uses the Linux operating system for its mainstay
search service, he said. Its Linux core begins not with software
from a company such as Red Hat, or Novell's Suse Linux, but
rather from the version that project leader Linus Torvalds posts
periodically to the kernel.org Web site.

Among the open-source technologies used by Google are the Python
programming language and the MySQL database, he said. In
addition, Google's Blogger site uses Apache Web server software
and the Tomcat package for running Java programs on the server.

The GCC compiler software, used to create nearly every
open-source program in existence, also is widely used at Google.

Sun's Java also figures prominently, even though its core is not
open-source. "We make great use of Java at the
company," DiBona said, including for Gmail. The company claims
the Web-based e-mail service has millions of subscribers.

Sun hasn't released the fundamental part of Java--the virtual
machine component--as open-source software. However, the Apache
Software Foundation is working on an open-source Java effort
called Project Harmony, an initiative that now has IBM developer
support.

"I think they'll succeed wildly," DiBona said of Harmony.
"They're so good at this. They say, 'We're going to write this
software,' and it gets done."

Despite Google's liking for open-source software, plenty of
programming at the company is proprietary.

"We're never going to open-source PageRank," DiBona said,
referring to the link-analysis algorithm the company uses to
rank search results. "It's the thing that makes Google
Google."

Open-source output

Google isn't only an open-source software consumer. It's an
open-source producer as well: For example, employees submit
software to the Apache Axis Web services project, DiBona said.

The Mountain View, Calif.-based company also employs some
open-source notables:

* Sean Egan, leader of the GAIM project for instant messaging
software;

* Alex Martelli, a leading Python developer;

* Greg Stein, the Apache Software Foundation chairman and a
manager of the Subversion source code management software; and

* Ben Goodger, the lead programmer of the Firefox Web browser
project, as well as a few other Firefox programmers.

Google also has published several open-source projects, including
tools for debugging software, improving its performance,
monitoring MySQL databases and using AJAX techniques for richer
Web page interfaces.

But so far, there is a significant limit to the group-programming
facet of Google's projects: The company doesn't yet accept
outside contributions.

Some developers have offered the company contributions meant to
improve Google's open-source software--for example, to add 64-bit
support to 32-bit software. That cooperation is awkward right now
for reasons relating to intellectual-property control, DiBona
said.

"We've been slow in being able to accept outside patches," he
said. But the company is working on a contributor license that
lays out patent and copyright terms for outside contributors.
"It's something that pays to be very, very careful about."

The company has helped outside open-source projects, though.
Through a $2 million program called the Google Summer of Code,
the company sponsored 400 college-age students to work on
open-source projects last summer. Each got $4,500 if they met
their goals, which 84 percent did. Another $500 went to each of
the several open-source projects that helped organize the effort,
DiBona said.
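
The Summer of Code figures above can be checked with a little
arithmetic. The sketch below is only a rough reading of the
article's numbers, not Google's actual accounting. It assumes
(the article does not say this) that the $500 was paid once per
student placement, which is the reading under which 400 students
add up to the quoted $2 million. The function name and dictionary
keys are made up for illustration.

```python
# Figures as reported in the article.
STUDENTS = 400
STUDENT_AWARD = 4_500      # dollars, paid only if the student met the goals
COMPLETION_RATE = 0.84     # "84 percent did"
PROJECT_AWARD = 500        # dollars paid to the organizing project

def summer_of_code_budget(students=STUDENTS, completion_rate=COMPLETION_RATE):
    """Estimate payouts under the stated assumptions (hypothetical helper)."""
    # Number of students who met their goals.
    completed = round(students * completion_rate)
    # Student awards are paid only for completed work.
    student_payout = completed * STUDENT_AWARD
    # ASSUMPTION: $500 is paid per student placement, not once per project.
    project_payout = students * PROJECT_AWARD
    # Upper bound if every student had met the goals.
    ceiling = students * (STUDENT_AWARD + PROJECT_AWARD)
    return {
        "completed": completed,
        "student_payout": student_payout,
        "project_payout": project_payout,
        "estimated_total": student_payout + project_payout,
        "ceiling": ceiling,
    }

if __name__ == "__main__":
    for key, value in summer_of_code_budget().items():
        print(f"{key}: {value:,}")
```

Under these assumptions the ceiling comes to $2,000,000, matching
the program size quoted above, while the estimated amount paid
out is lower because 16 percent of students did not meet their
goals.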

Open-source software is good for young programmers, DiBona said,
noting that it gives them real-world problems to solve and
teaches them self-management skills.

"We think open-source is pretty important," DiBona said. "Without
it, the industry would not be as good as it is now to newcomers."


