cURL is often used on the server end to pull web pages and RSS feeds for parsing, or for interacting with APIs. It’s a nifty tool, and one that I use quite frequently.
cURL can also be used from the command line, where it comes in handy for troubleshooting.
A few months ago, a website I do some occasional work on was infiltrated by a crafty hacker bent on spreading malware. If you visited the website directly, everything looked fine. But if you went to Google or Yahoo, searched for the site, and clicked through, you would be redirected to a page on a server that would attempt to install one of those fake antivirus malware applications. (The site ranked well too, and got at least half of its traffic from search engines.)
I ended up talking to the Google Security team about the problem, and they discovered that the attacker had put some code somewhere on the site that would check the referer [sic] header to see if a visitor was coming from Google or Yahoo, and redirect if they were. A trick like that makes it take much longer for a site’s operator to discover the problem, since visiting the site directly shows nothing wrong. I later tracked down the little XSS attack, which was related to a forum vulnerability, and fixed the problem.
How did Google Security diagnose the site? They used cURL from a command line. They typed, at their Linux/UNIX/Mac system’s prompt:
curl -v --referer http://www.google.com/ www.example.com | more
Then they compared the output to that of
curl -v www.example.com | more
The first one returned the HTML source for a page with a JavaScript redirect to the malware site, while the second one did not. The --referer flag allowed them to pretend that the request they were sending was a user clicking through from Google. (The “| more” part is a *nix command to paginate the output so it doesn’t scroll by faster than you can read it.)
Neat trick, isn’t it?
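If you want to repeat that comparison without eyeballing two screens of HTML, it can be scripted. Here is a minimal sketch, not something Google Security used as far as I know. The function name compare_referer is made up for illustration, and the default referer is simply the one from the commands above.

```shell
# compare_referer URL [REFERER]
# Hypothetical helper: fetches URL twice, once with a Referer header and
# once without, and reports whether the two response bodies differ.
compare_referer() {
    url=$1
    # Assumed default: the same referer used in the commands above.
    referer=${2:-http://www.google.com/}

    # -s hides the progress meter so only the response body is captured.
    with_referer=$(curl -s --referer "$referer" "$url")
    without_referer=$(curl -s "$url")

    if [ "$with_referer" = "$without_referer" ]; then
        echo "identical"
    else
        echo "different"
    fi
}
```

Running compare_referer http://www.example.com/ prints either identical or different. Pages with dynamic content (timestamps, rotating ads) can differ between any two requests, so treat different as a cue to inspect the output by hand rather than as proof of tampering.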
While working on GoCodes recently, I was doing some work with PHP’s header() function. The messages it sends to applications requesting the page aren’t usually visible to users, so how was I to tell if the headers I was sending were working properly? I turned to cURL. I would access a GoCodes URL from cURL like this:
curl -I http://www.webmaster-source.com/go/whatisrss/
The command would return the headers sent by the page: the date, the server software, the character set, and any other headers I was sending, such as the X-Robots-Tag header.
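When you only care about one header, you can filter the output instead of reading the whole list. The following is a rough sketch under a few assumptions: get_header is a made-up name, and the filtering is done with tr and awk, which should be available on most *nix systems.

```shell
# get_header URL NAME
# Hypothetical helper: prints the value of one response header.
# The header name is matched case-insensitively.
get_header() {
    url=$1
    name=$2

    # -I asks for the headers only; -s hides the progress meter.
    # tr strips the carriage returns that end each HTTP header line.
    curl -sI "$url" | tr -d '\r' | awk -v name="$name" '
        BEGIN { name = tolower(name) ":" }
        tolower(substr($0, 1, length(name))) == name {
            value = substr($0, length(name) + 1)
            sub(/^[ \t]+/, "", value)
            print value
        }'
}
```

For example, get_header http://www.webmaster-source.com/go/whatisrss/ X-Robots-Tag would print just that header’s value, or nothing at all if the page doesn’t send it.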
cURL is a great diagnostics tool as well as a page-fetching application for scripts. It can do a lot, and has a lot of uses. Type curl --help at the command line to see all of its capabilities.