Finding a Website’s Favicon with Ruby

Sep 25, 2013 by Matt | Posted in Coding, Featured 97 Comments

For a project I’ve been working on, I wanted to have my Sidekiq worker (which is part of an RSS crawler) discover the favicon for a web site and cache it for later display. It was fun figuring out a way to do this, so I just had to share.

A Brief History of Favicons

Favicons, or “shortcut icons,” can be defined in multiple ways. Like all too many things in web design, browsers handle them in slightly different and mildly incompatible ways, meaning there’s plenty of redundancy. Favicons came to be when Microsoft added them to Internet Explorer 5 in 1999, implementing a feature where the browser would check the server for a file named favicon.ico and display it in certain parts of the UI. The following year, the W3C published a standard method for defining a favicon. Rather than simply having the browser look for a file in the root directory, an HTML document should specify a file in the header with a <link> tag, just like with stylesheets.

Fast forward to the present, and you have a bit of screwiness.

All major web browsers check for the link tag first, and fall back to favicon.ico if it’s not found.
You can define multiple icons in the HTML header. You can have ICO/PNG/GIF formats, as well as different sizes.
Some browsers support larger 32×32 favicons, while others will only use the 16×16 ones. Chrome for Mac prefers the 32×32 ones, and scales them down to 16×16 on Macs without Retina displays.
Big Bad Internet Explorer (version 10 and earlier) only supports ICO files for favicons, not PNGs.

The most compatible way to set up your favicon is to define both 32×32 and 16×16 icons in your header using the PNG format, then make a 16×16 ICO version, name it “favicon.ico,” and drop it into your web root. Browsers that play nicely will use the PNG ones in whatever dimensions they prefer, and IE will fall back to the ICO file.

Writing the Class

Now that the history lesson is out of the way, you can see why there’s a little bit of a challenge here. Depending on how badly you want to find and display that icon, you may have to write logic for the different methods. For this tutorial, I will focus on two: the simplest, which is looking to see if there’s a favicon.ico, and a basic implementation of checking for a link tag defining a shortcut icon.

Before we do anything else, we need to install a few dependencies. Either add them to your Gemfile and do a bundle install, or use the gem install command to install them manually.

HTTParty (gem install httparty)
Nokogiri (gem install nokogiri)

Now require the necessary libraries at the top of a new Ruby file and we can get going.

require "httparty"
require "nokogiri"
require "base64"

We can define a class to make a nice, clean interface for this to keep it modular and easier to reuse. As you can see below, I’ve made a Favicon class and added some accessors for instance variables, as well as an initialize method that assigns the parameter it receives to the @host instance variable before calling the method we will be defining next.

require "httparty"
require "nokogiri"
require "base64"

class Favicon

  attr_reader :host, :uri, :base64

  def initialize(host)
    @host = host
    check_for_ico_file
  end

end

We’ll be implementing the simplest part first. The check_for_ico_file method will send an HTTP GET request to /favicon.ico on the server specified in @host and check to see if a file exists. (The server will send a 200 OK response if it does, and a 404 Not Found error otherwise.) If it does, the URL will be saved to an instance variable and the icon file’s contents will be base64 encoded before being saved to an instance variable as well.

The HTTParty gem is great for this, since it boils a basic HTTP request down to a single method call.

# Check /favicon.ico
def check_for_ico_file
  uri = URI::HTTP.build({:host => @host, :path => '/favicon.ico'}).to_s
  res = HTTParty.get(uri)
  if res.code == 200
    @base64 = Base64.encode64(res.body)
    @uri = uri
  end
end
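A 200 response on its own does not prove that the body is an icon: some servers answer every unknown path with a 200 and an HTML error page. The sketch below is one way to guard against that before caching. The helper name usable_icon_response? and the "reject text/html" rule are assumptions for illustration, not part of the class above.

```ruby
# Hypothetical helper (not part of the tutorial's class): decide whether an
# HTTP response is worth caching as a favicon.
def usable_icon_response?(code, content_type, body)
  return false unless code == 200            # only accept 200 OK
  return false if body.nil? || body.empty?   # an empty body is not an icon
  # Assumption: a text/html body is an error page rather than an image.
  !content_type.to_s.downcase.start_with?("text/html")
end

puts usable_icon_response?(200, "image/x-icon", "icon bytes")
puts usable_icon_response?(200, "text/html; charset=utf-8", "<html></html>")
```

Inside check_for_ico_file, the condition could then be written as usable_icon_response?(res.code, res.headers["content-type"], res.body) instead of res.code == 200.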

If you want, you could go ahead and instantiate the class to try out what we have so far. If you pass it the domain name of a site that uses the /favicon.ico convention, the object should find it without issue.

favicon = Favicon.new("arstechnica.com")

puts favicon.uri
# Outputs http://arstechnica.com/favicon.ico

puts favicon.base64
# Outputs a bunch of base64-encoded gibberish. More on this later

puts favicon.host
# Outputs arstechnica.com

Now let’s handle link tags! The process for that is a little bit more in-depth. First we need to request a web page from the server, such as the index page, and parse it for tags that resemble <link rel="shortcut icon" href="..." />. Then we have to evaluate the contents of href to make sure it’s an absolute URL, and prepend the domain name if it is not. After that, we can finally make a request to get the icon itself and save it.

Still with me? Excellent, now here’s the code to do that. I’ll comment it a little more thoroughly, since it looks messier at a glance.

# Check "shortcut icon" tag
def check_for_html_tag

  # Load the index page with HTTParty and pass the response body to Nokogiri for parsing
  uri = URI::HTTP.build({:host => @host, :path => '/'}).to_s
  res = HTTParty.get(uri)
  doc = Nokogiri::HTML(res.body)

  # Use an xpath expression to tell Nokogiri what to look for.
  # Note that this only matches the exact attribute value rel="shortcut icon".
  doc.xpath('//link[@rel="shortcut icon"]').each do |tag|

    # Skip any tag that has no "href" attribute
    next if tag['href'].to_s.empty?

    # URI.join returns an absolute href unchanged, and resolves a relative
    # (or protocol-relative) href against the index URL we built at the
    # beginning of the method
    iconuri = URI.join(uri, tag['href']).to_s

    # Grab the icon and set the instance variables
    icon_res = HTTParty.get(iconuri)
    if icon_res.code == 200
      @base64 = Base64.encode64(icon_res.body)
      @uri = iconuri
    end

  end

end
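The tag-matching and URL-resolving steps can be tried on their own, without any network access. The following is a minimal sketch that uses only Ruby's standard library. The helper names icon_rel?, resolve_icon_uri and icon_uris are hypothetical, and the [rel, href] pairs stand in for attributes you would read from parsed link tags. It treats rel as a space-separated, case-insensitive list, so rel="icon" and rel="Shortcut Icon" are picked up as well as the exact rel="shortcut icon".

```ruby
require "uri"

# Hypothetical helper: true when a rel attribute contains the "icon" token.
def icon_rel?(rel)
  rel.to_s.downcase.split.include?("icon")
end

# Hypothetical helper: resolve an href against the URL of the page it came from.
# Absolute hrefs come back unchanged; relative and protocol-relative ones are
# completed from the page URL.
def resolve_icon_uri(page_uri, href)
  URI.join(page_uri, href).to_s
end

# links is an array of [rel, href] pairs (an assumption for this sketch).
def icon_uris(page_uri, links)
  links.select { |rel, _href| icon_rel?(rel) }
       .map { |_rel, href| resolve_icon_uri(page_uri, href) }
end

links = [
  ["stylesheet", "/style.css"],
  ["shortcut icon", "/img/favicon.ico"],
  ["icon", "//cdn.example.com/icon.png"],
  ["Shortcut Icon", "http://static.example.com/f.ico"],
  ["apple-touch-icon", "/touch.png"]
]

puts icon_uris("http://example.com/", links)
```

An href that is not a valid URI (one containing spaces, for example) makes URI.join raise URI::InvalidURIError, so a crawler would probably want to rescue that error and move on to the next tag.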

Now there’s one more thing to do before we’re done. The initialize method needs to be tweaked so it calls our newest method:

def initialize(host)
  @host = host
  check_for_ico_file
  check_for_html_tag
end

Now the class will check for the favicon.ico file first, then the HTML tag. If the HTML tag is present, it will take precedence.

Available as a Gist! For your convenience, the results of this tutorial are available as a GitHub Gist.

Using the Class

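Now all you have to do is load the class with a require statement, and grab favicons. (This assumes the class is saved as favicon.rb next to your script; require_relative is used because Ruby 1.9.2 and later no longer search the current directory for a plain require.)

require_relative "favicon"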

favicon = Favicon.new("arstechnica.com")

puts favicon.uri
# Outputs http://static.arstechnica.net/favicon.ico

puts favicon.base64
# Outputs a bunch of base64-encoded gibberish. More on this later

puts favicon.host
# Outputs arstechnica.com

Now…what of that “base64-encoded gibberish?” It’s the perfect format for a little trick called Data URIs, which you can read all about over at CSS-Tricks. If you cache that base64 string somewhere, probably in a database, you can output it like so:

<img width="16" height="16" alt="favicon" src="data:image/x-icon;base64,BLAHBBLAHGIBBERISHGOESHERE" />

It will display like any other image, but won’t use an additional HTTP request, because the image data is already embedded on the page. This makes it perfect for a list of web sites with icons beside them. Instead of kicking off several HTTP requests for individual tiny images, you just embed them right in the page.
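Two details are worth handling when building the data URI. Base64.encode64 inserts a line break after every 60 encoded characters, while Base64.strict_encode64 produces a single unbroken line, which is easier to drop into an HTML attribute. The MIME type in the URI should also match the cached file, which might be an ICO, a PNG or a GIF. Here is a rough sketch under those assumptions. The helper names icon_mime_type and icon_data_uri are made up for this example, and falling back to image/x-icon for anything unrecognized is a simplification.

```ruby
require "base64"

# Hypothetical helper: guess a MIME type from the file's leading "magic" bytes.
def icon_mime_type(bytes)
  data = bytes.b  # binary copy, so the comparison ignores string encoding
  if data.start_with?("\x89PNG".b)
    "image/png"
  elsif data.start_with?("GIF8".b)
    "image/gif"
  else
    # Assumption: anything else is treated as an ICO file.
    "image/x-icon"
  end
end

# Hypothetical helper: build a data URI from raw (not yet encoded) icon bytes.
def icon_data_uri(bytes)
  "data:#{icon_mime_type(bytes)};base64,#{Base64.strict_encode64(bytes)}"
end

# "GIF89a" is the signature that opens a GIF file.
puts icon_data_uri("GIF89a")
```

To use this with the Favicon class, you would store the raw response body (or call Base64.strict_encode64 in place of Base64.encode64) and put the resulting string in the src attribute.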

If you’re unfortunate enough that you must support antique versions of Internet Explorer (version seven or prior) then you can’t use Data URIs, as they were not supported. However, all is not lost. You could conceivably adapt the class and have it write the image data to files on the server instead of base64-encoding them.
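As a starting point for that file-based approach, the sketch below writes the raw icon bytes to disk instead of encoding them. The helper name cache_icon_file, the directory layout and the .ico extension are all assumptions for illustration; a real adaptation would pick the extension from the icon's actual format.

```ruby
require "fileutils"

# Hypothetical helper: write icon bytes to <dir>/<host>.ico and return the path.
def cache_icon_file(dir, host, bytes)
  # Keep only characters that are safe in a file name.
  safe_name = host.gsub(/[^A-Za-z0-9.\-]/, "_")
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "#{safe_name}.ico")
  File.binwrite(path, bytes)  # binary mode, so the bytes are written untouched
  path
end
```

Inside the class, this could be called with res.body wherever @base64 is currently set, and the returned path stored in an instance variable for the view to link to.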

http://www.inspiredgiftgiving.com marquita herald

Great tutorial – there is one other way to install a favicon. If you have a self hosted WordPress site you can simply upload the favicon plugin, install your favicon and it shows up immediately. Easy.
Grabicon

Hi Matt – great article! If your readers want a shortcut way to get free favicons (also written in Ruby, by me) they can try grabicon.com. The benefit over the DIY approach is that instead of waiting 3-4 seconds to retrieve the icon, grabicon caches them, so they’re almost instant.

It also resizes icons to what you request, and generates unique default icons for sites that don’t have one. This allows web/mobile apps to have a uniform user experience because icons are all the same size, and none are missing. Here’s an example:

http://grabicon.com/icon?domain=wikipedia.org

The full docs are on the homepage. Thanks!
