How to prevent a leak like Tumblr’s
Tumblr took a bit of a tumble today.
http://news.ycombinator.com/item?id=2343330
One of Tumblr’s engineers (presumably) deployed a file to production which contained a critical flaw.
The first character of the file was replaced with an `i` instead of a `<`.
i?php
require_once('chorus/Utils.php');
require_once('chorus/Kestrel.php');
require_once('chorus/DataService.php');
require_once('chorus/Shard.php');
Database::set_defaults(array(
'user' => 'tumblr3',
'password' => 'm3MpH1C0Koh39AQD83TFhsBPlOM1Rx9eW55Z8YWStbgTmcgQWJvFt4',
'database' => 'tumblr3',
// 'write_lock_tables' => '*',
'extended_log' => (idate('G') == 17 && intval(idate('i')) == 56 && trim(`hostname`) == 'web10.tumblr.com')
));
if (__FILE__ == '/var/www/apps/tumblr/config/config.php' || __FILE__ == '/data/tumblr/config/config.php') {
define('ENVIRONMENT', 'production');
if (! defined('DEFAULT_DATABASE')) define('DEFAULT_DATABASE', 'primary');
define('S3_BUCKET', 'data.tumblr.com');
define('ENABLE_PANTHER', true);
define('ENABLE_MEDIA_CDN', true);
define('ASSETS_URL', (ENABLE_MEDIA_CDN && ! (isset($_SERVER['HTTPS']) && $_SERVER['HTTPS']) ? 'http://assets.tumblr.com' : ''));
define('MEMCACHE_HOST', '10.252.0.68');
define('MEMCACHE_VERSION_HOST', '10.252.0.67');
define('VALIDATION_FAILURE_LOG', BASE_PATH . '/validate.log');
# <snip>Yes, that is tumblr’s production database password.
Full source here
Once this error was introduced to production, people viewing any page would see a dump of the first 749 lines of the config file along with some PHP errors. THEN, GoogleBot came along and indexed the whole mess which is why you can still see it in Google’s search results
Learning from the mistakes of others
How can we keep this from happening to us?
First of all, go to any project on your local dev environment, replace the first `<` with `i` in any included file in your project (hint config.php) and see what happens. Chances are you'll see exactly the same thing that happened to Tumblr.
The only solution I see to this is pre-commit syntax checking for committed PHP files.
Here's a tutorial for php syntax checking in SVN and here’s a pre-commit hook script for php syntax checking in GIT.
Basically the way it works is that if you ever commit a PHP file which contains a syntax error, your commit will be blocked and you will have to amend it before you are allowed to commit it to the repository. If Tumblr had done this, they might never have leaked their config file.
[UDPATE] Apparently both PHP’s built-in Syntax Checking and the PEAR package PHP_CodeSniffer won’t pick up on errors such as this. Searching for a valid solution now…
[UDPATE] The best solution I’ve seen so far is to always return 500 responses to clients with a generic error message in production, thus preventing errors like this from bubbling up at the Apache level.
[UDPATE] Turns out PHP returns 200 when it encounters a fatal error. Inconceivable.
[UDPATE] Here’s how to force PHP to return a 500 when it encounters a fatal error. Add this to a prepend file or make it the first 5 lines of your bootstrap file.
function die_with_honor() { header("HTTP/1.1 500 Internal Server Error"); ob_clean(); } register_shutdown_function('die_with_honor');
Comments(10)
Alternatively, just do some basic testing. If this had been tested just once, the problem would have been obvious to even the most lacking of coders.
+1 to Anon
Yes, that’s certainly true. This error should have failed any number of functional tests on their staging box (assuming they have a staging box).
Or.. you could always a programming language under which stuff like this is far more unlikely.
And here’s a Ruby version for git:
https://github.com/cypher/git-ruby-syntax-check
That sucks, looks like it was a vi editor mistake too.
It does seem way too easy to make these kind of errors with php
You could always use the file command on the file.
For example in foo.php I started it with the i?php and then in bar.php I started it with <?php – at a previous employer my subversion pre commit scripts used to check this as well as things like code style with the PEAR Code Sniffer stuff to ensure that we all kept a common coding style.
$ file foo.php
foo.php: ASCII text
$ file bar.php
foo.php: PHP script text
That’s a pretty weird error, also I never hard code my actual db username and db passwords into a file that appears to be as high level as this one is.
I would have something a little lower level, sourced by everything, in a php include file, that is not part of the checkout.
the function’s name is cool imo
Seems pretty amateurish to let code like this ship without review. Then again, tumblr has always seemed pretty amateurish.