If you type a few letters into the search field over at the Internet Movie Database, you might notice how fast the suggestions appear. That’s because the suggestions aren’t generated dynamically by IMDB’s primary servers. Instead, IMDB serves the JSON data for search suggestions from a CDN as pregenerated static files, which results in a significant speed boost.
For example, if you visit this URL, you’ll get a JSON file of results for Harry Potter films:
http://sg.media-imdb.com/suggests/h/harry.json
The “h” directory means the query starts with an “h,” as they group their result sets alphabetically, and the “harry” part is what was typed into the search box. So if you wanted results that would match Doctor Who, you could use /d/doct.json. (Spaces are replaced with underscores.)
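To make the naming scheme concrete, here is a minimal sketch of how typed text could map to one of those paths. The function name suggestion_path is made up for illustration, and lowercasing the input is my assumption; the only rules taken from the examples above are the first-letter directory and the underscore substitution. It only builds a string and doesn't request anything.

```python
# Base of the example URL shown above.
BASE_URL = "http://sg.media-imdb.com/suggests"


def suggestion_path(query: str) -> str:
    """Map typed text to a path like /h/harry.json (hypothetical helper)."""
    # Assumption: input is trimmed and lowercased before lookup.
    # Spaces become underscores, as noted above.
    key = query.strip().lower().replace(" ", "_")
    if not key:
        raise ValueError("query must not be empty")
    # The directory is the first character of the key.
    return f"/{key[0]}/{key}.json"


if __name__ == "__main__":
    print(BASE_URL + suggestion_path("harry"))
    print(suggestion_path("doct"))
```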
They only seem to have result sets for 4-5 character inputs, though. So you can query “ince” but not “inception.” The latter will result in an error. I guess most searches common enough to be matched in the suggestion box are covered within that limitation.
It’s a clever implementation, and it has to save a lot of computing power on a site that large, in addition to being fast.
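I don’t know how IMDB actually builds these files, but the general idea of pregenerating suggestions is easy to sketch. The version below is purely a guess at the approach: the function name, the JSON shape, the 5-character cap, and the 8-result limit are all my own assumptions, not anything IMDB has documented. It walks a list of titles, collects matches for each short prefix, and writes one static file per prefix into a first-letter directory.

```python
import json
from collections import defaultdict
from pathlib import Path


def build_suggestion_files(titles, out_dir, max_len=5, max_results=8):
    """Write one static JSON file per prefix; return how many were written.

    Hypothetical sketch: max_len, max_results and the JSON layout are
    assumptions, not IMDB's real format.
    """
    by_prefix = defaultdict(list)
    for title in titles:
        key = title.lower().replace(" ", "_")
        for n in range(1, min(max_len, len(key)) + 1):
            prefix = key[:n]
            # Skip prefixes that would not be safe as file names.
            if not all(c.isalnum() or c == "_" for c in prefix):
                continue
            if len(by_prefix[prefix]) < max_results:
                by_prefix[prefix].append(title)

    out = Path(out_dir)
    for prefix, matches in by_prefix.items():
        # Group files by first letter, e.g. h/harry.json
        target = out / prefix[0] / f"{prefix}.json"
        target.parent.mkdir(parents=True, exist_ok=True)
        target.write_text(
            json.dumps({"q": prefix, "results": matches}), encoding="utf-8"
        )
    return len(by_prefix)
```

Run something like this once as a batch job, upload the output directory to a CDN, and every keystroke in the search box becomes a plain static file request instead of a database query.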
(Note that this is not a public API, and IMDB/Amazon probably wouldn’t be happy about you scraping it or anything like that. But it’s a nice thing to learn from.)