Adam Kennedy conjectures that Perl is unparseable, and suggests how to prove it.
How do you implement full-text search for that 10+ million row table, keep up with the load, and stay relevant? Sphinx is good at those kinds of riddles.
Books on Algorithms, Data Structures, Cryptography, Coding Theory, Combinatorics
These submissions from industry and academia are designed to help teach algorithms to students around the world.
Computer Science Unplugged is a series of learning activities that reveals a little-known secret: Computer Science isn't really about computers at all!
CS Unplugged provides an extensive collection of free resources that teach principles of Computer Science such as binary numbers, algorithms and data compression through engaging games and puzzles that use cards, string, crayons and lots of running around.
I spent the summer of 2006 building Code Search, which lets programmers search for source code using regular expressions. That is, it lets you grep through the world's public source code. We originally planned to use PCRE for the regular expression search, until we realized that it used a backtracking algorithm, meaning it is easy to write searches that take exponential time or consume arbitrary stack depth. Since Code Search accepts regular expressions from anyone on the Internet, using PCRE would have left it open to easy denial of service attacks. As an alternative to PCRE, I wrote a new, carefully reviewed regular expression parser wrapped around Ken Thompson's open source grep implementation, which uses a fast DFA.
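To make the contrast between backtracking and automaton-based matching concrete, here is a minimal sketch in Python. It is a toy, not code from PCRE, grep, or RE2: it supports only patterns built from literal characters and optional literals (such as `a?`), and the names `pathological`, `backtrack_match`, and `state_set_match` are invented for this illustration. It counts the work each strategy does on the pattern `a?` repeated n times followed by `a` repeated n times, matched against a string of n `a` characters.

```python
def pathological(n):
    """Build the pattern a?{n}a{n} as (char, optional) pairs, plus the text a{n}."""
    pattern = [("a", True)] * n + [("a", False)] * n
    return pattern, "a" * n


def backtrack_match(pattern, text):
    """Naive backtracking matcher. Returns (matched, steps)."""
    steps = 0

    def go(i, j):
        nonlocal steps
        steps += 1
        if i == len(pattern):
            return j == len(text)
        ch, optional = pattern[i]
        # Greedy: try to consume a character first...
        if j < len(text) and text[j] == ch and go(i + 1, j + 1):
            return True
        # ...and only then try skipping an optional item.
        return optional and go(i + 1, j)

    return go(0, 0), steps


def state_set_match(pattern, text):
    """Track every reachable pattern position at once. Returns (matched, steps)."""
    steps = 0

    def closure(positions):
        # Add the positions reachable by skipping optional items.
        out = set()
        for p in positions:
            out.add(p)
            while p < len(pattern) and pattern[p][1]:
                p += 1
                out.add(p)
        return out

    current = closure({0})
    for c in text:
        nxt = set()
        for p in current:
            steps += 1
            if p < len(pattern) and pattern[p][0] == c:
                nxt.add(p + 1)
        current = closure(nxt)
    return len(pattern) in current, steps


if __name__ == "__main__":
    pattern, text = pathological(14)
    print("backtracking:", backtrack_match(pattern, text))
    print("state set:   ", state_set_match(pattern, text))
```

Both functions agree on the answer, but the backtracking matcher explores every combination of "consume" and "skip" for the optional items before it finds the one that works, so its step count at least doubles each time n grows by one. The state-set version does at most (pattern length + 1) steps per input character. A DFA goes further by precomputing or caching those state sets, but the bound on work per input character is the property that matters for a service accepting patterns from untrusted users.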
Over the next three years, I implemented a handful of new back ends that collectively replaced the grep code and expanded the functionality beyond what is needed for a POSIX grep. The result, RE2, provides most of the functionality of PCRE using a C++ interface very close to PCRE's, but it guarantees linear time execution and a fixed stack footprint. RE2 is now in widespread use at Google, both in Code Search and in internal systems like Sawzall and Bigtable.
As of March 2010, RE2 is an open source project, with all development conducted in public. This article is a tour through the RE2 source code, showing how the techniques in the first two articles apply in a production implementation.