"Amazon Album Covers in Ruby on Rails"
Jan 15, 2008 12:27pm
Lastfm is a free online service that keeps track of the music you listen to. When you play a song in iTunes or another media player, and you have a program running that connects to Lastfm, the song information is sent automatically to the Lastfm servers. This process is called "scrobbling". Besides keeping usage records and stats on my listening habits, Lastfm has another great use: it provides an XML feed of your ten most recently played tracks. I use this feed on my website to display those tracks for everybody to see. It updates automatically, in near real time.
I used to list just the track title and artist, in boring text, but then decided I wanted album covers instead. For a little while I used the picture URL contained in the Lastfm feed, but I soon realized that a spelling error, a mislabeled song, or just a rare song would come back without a cover. Close to 30% of my songs had no cover... Unacceptable! I decided to use Amazon's web services (AWS) to get the pictures with a smarter method of searching. Here's how.
To get started, you need to install the hpricot gem:
gem install hpricot
Then download and install the Ruby/Amazon library:
cd /tmp
wget http://www.caliban.org/files/ruby/ruby-amazon-0.9.2.tar.gz
gzip -d ruby-amazon-0.9.2.tar.gz
tar -xvf ruby-amazon-0.9.2.tar
cd ruby-amazon-0.9.2
ruby setup.rb config
ruby setup.rb setup
ruby setup.rb install
The Ruby/Amazon library is a well-documented way to access Amazon services from Ruby. Once it is installed, sign up for your own Web Services account (free) over at Amazon. Now on to the good stuff.
First, put these requires in your environment.rb:
require 'hpricot'
require 'open-uri'
require 'md5'
require 'amazon/search'
Next, we get the XML feed from Lastfm. You can use my account if you would like, but I imagine you have your own Lastfm account. To get your own feed, just replace 'coneybeare' below with your account name. I use the hpricot gem to extract the data from the XML feed, then call album_cover_fetch to get the album cover art and the URL of the matching Amazon page. My own version uses a cache to save both the URLs and the images, but that is for another post, so for simplicity I will leave it out.
Put this code in application.rb so all views can use it.
helper_method :get_recent_covers

# Returns up to `limit` recent tracks as hashes with
# :artist, :track, :album, :image and :url keys.
def get_recent_covers(limit = 10)
  url = "http://ws.audioscrobbler.com/1.0/user/coneybeare/recenttracks.xml"
  recent_tracks = []
  # open-uri fetches the feed; Hpricot.XML needs the document, not the url string
  lastfm_doc = Hpricot.XML(open(url))

  (lastfm_doc/:track).first(limit).each do |track|
    recent_track = {}
    recent_track[:artist] = (track/:artist).inner_html
    recent_track[:track] = (track/:name).inner_html
    recent_track[:album] = (track/:album).inner_html
    recent_track[:image], recent_track[:url] = album_cover_fetch(recent_track)
    recent_tracks << recent_track
  end
  recent_tracks
rescue
  # feed unreachable or unparsable: show nothing rather than break the page
  []
end
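If you want to try the extraction step without hitting the network or installing hpricot, here is a minimal sketch that parses a hand-written sample with REXML from Ruby's standard library instead. The sample XML, the `child_text` helper, and the `parse_recent_tracks` name are assumptions made for illustration; only the `track`, `artist`, `name`, and `album` element names come from the code above, and the real feed may carry more fields.

```ruby
require 'rexml/document'

# Hand-written sample, shaped like the elements the code above reads.
# The real Lastfm feed may contain additional fields.
SAMPLE_FEED = <<~XML
  <recenttracks user="example">
    <track>
      <artist>Radiohead</artist>
      <name>Reckoner</name>
      <album>In Rainbows</album>
    </track>
    <track>
      <artist>Beck</artist>
      <name>Loser</name>
      <album></album>
    </track>
  </recenttracks>
XML

# Hypothetical helper: text of a named child element, or "" when the
# child is missing or empty.
def child_text(element, name)
  child = element.elements[name]
  child ? child.text.to_s : ''
end

# Hypothetical helper: turn feed XML into an array of hashes,
# keeping at most `limit` tracks.
def parse_recent_tracks(xml, limit = 10)
  doc = REXML::Document.new(xml)
  doc.get_elements('//track').first(limit).map do |track|
    {
      artist: child_text(track, 'artist'),
      track:  child_text(track, 'name'),
      album:  child_text(track, 'album')
    }
  end
end

parse_recent_tracks(SAMPLE_FEED).each do |t|
  puts "#{t[:artist]} - #{t[:track]} (#{t[:album]})"
end
```

The second sample track has an empty album on purpose: that is the case where the cover lookup has to fall back to searching by track alone.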
Next is the code for album_cover_fetch
include Amazon::Search
ASSOCIATES_ID = "INSERT YOUR OWN ID HERE" # Your Amazon Affiliate ID
DEV_TOKEN = "INSERT YOUR OWN TOKEN HERE" # Your Amazon Web Services Key

def album_cover_fetch(song)
  artist, album, track = [song[:artist], song[:album], song[:track]]

  @request = Request.new(DEV_TOKEN, ASSOCIATES_ID)
  begin
    @response = @request.artist_search artist
    products = @response.products
  rescue
    # there was no exact match for artist
    products = []
  end

  if products.empty?
    return [nil,
      "http://www.last.fm/music/" +
      CGI::escape(artist) + "/_/" +
      CGI::escape(track)]
  end

  products.each do |p|
    if !album.nil? && !album.blank? && matches?(album, p.product_name)
      unless p.tracks.nil?
        if p.tracks.include? track
          return [p.image_url_medium, p.url]
        else # song is not an exact match
          p.tracks.each do |t|
            if matches?(t, track)
              return [p.image_url_medium, p.url]
            end
          end
        end
      end # track matching
    end # album match
  end

  # No match yet, lets try to find the song on another album
  product_matches = []
  products.each do |p|
    unless p.tracks.nil?
      p.tracks.each do |t|
        if matches?(t, track, 0)
          #found it somewhere else
          return [p.image_url_medium, p.url]
        elsif matches?(t, track, 2)
          product_matches << p
        end
      end
    end
  end

  product = product_matches.sort{ |a,b|
    a.sales_rank <=> b.sales_rank
  }.first unless product_matches.empty?
  if product.nil?
    return [nil,
      "http://www.last.fm/music/" +
      CGI::escape(artist) + "/_/" +
      CGI::escape(track)]
  else
    return [product.image_url_medium, product.url]
  end
end
- Lines 8-15: Here is the call to the Amazon service. You might want to cache this, as you are not supposed to send Amazon more than one request per second. An artist is unlikely to have new products within a month, so I use a cache that keeps the returned products for a month; that code is not included in this example.
- Lines 17-22: If the artist search returns no products (spelling mistake, no albums, etc.), I return nil and a link to the Lastfm page. When nil is returned for the picture URL, I use a generic blank CD image in place of an album cover.
- Lines 24-38: If products were returned and the album reported by Lastfm is not blank, I locate that album in the product list and check its track listing for the track. If there is an exact match I return right away; otherwise I check each track on the album for a close match (see below).
- Lines 40-66: If Lastfm did not give me an album to look for, or the album did not match anything returned by Amazon, I loop through all the products and all their tracks. A track that matches with a tolerance of 0 is returned immediately; products with a merely similar track are added to the product_matches array, which I then sort by sales rank to return the most popular.
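The month-long product cache mentioned in the first bullet is not shown in this post. As a rough idea of how such a cache could work, here is a minimal in-memory sketch. The `TimedCache` class, its method names, and the 30-day lifetime are assumptions for illustration, not the cache actually used on my site.

```ruby
# Hypothetical in-memory cache with a time-to-live; a sketch only.
class TimedCache
  MONTH = 30 * 24 * 60 * 60 # assumed lifetime, in seconds

  def initialize(ttl = MONTH)
    @ttl = ttl
    @store = {} # key => [value, time stored]
  end

  # Return the cached value for key if it is still fresh; otherwise run
  # the block, store its result, and return that.
  # `now` can be passed in, which makes expiry easy to test.
  def fetch(key, now = Time.now)
    value, stored_at = @store[key]
    return value if stored_at && (now - stored_at) < @ttl

    value = yield
    @store[key] = [value, now]
    value
  end
end

cache = TimedCache.new
calls = 0
lookup = lambda do |artist|
  cache.fetch(artist.downcase) do
    calls += 1 # stands in for the Amazon artist search
    ["products for #{artist}"]
  end
end

lookup.call("Radiohead")
lookup.call("radiohead") # same key, served from the cache
puts calls               # prints 1
```

Wrapping the artist search in something like `fetch` means repeated tracks by the same artist cost one Amazon request instead of many. An in-memory hash only lives as long as the process and is not shared between server processes, so a real setup would more likely store the entries in a file or database.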
Now let's add the helper method matches? and the string-closeness algorithm, levenshtein:
def matches?(s1, s2, tolerance = 2)
  s1 == s2 ||
  s1 =~ /#{Regexp.escape s2}/i ||
  s2 =~ /#{Regexp.escape s1}/i ||
  (levenshtein(s1, s2) <= tolerance)
end

def levenshtein(str1, str2)
  s, t = [str1.unpack('U*'), str2.unpack('U*')]
  n, m = [s.length, t.length]
  return m if (0 == n)
  return n if (0 == m)
  d = (0..m).to_a
  x = nil
  (0...n).each do |i|
    e = i+1
    (0...m).each do |j|
      cost = (s[i] == t[j]) ? 0 : 1
      x = [d[j+1] + 1, e + 1, d[j] + cost].min
      d[j] = e
      e = x
    end
    d[m] = x
  end
  return x
end
- Lines 1-6: The matches? method returns a positive match between two strings if any of 4 conditions is met:
- 1) They are identical
- 2) String 2 is part of String 1 (ignoring case)
- 3) String 1 is part of String 2 (ignoring case)
- 4) They have a Levenshtein edit distance of at most the tolerance (2 by default)
- Lines 8-26: The Levenshtein algorithm is one of the better-known string-closeness algorithms. It calculates the minimum number of single-character edits (insertions, deletions, substitutions) needed to turn one string into the other. For example, 'Apple' and 'apple' return 1, meaning only 1 edit, as do 'Apple' and 'Applw', and 'Apple' and 'Aple'. This is a great way to get correct results despite spelling errors.
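To sanity-check those distances outside of Rails, here is a small standalone sketch. It uses a plain full-table version of the Levenshtein distance, which is easier to follow than the compact version above and is meant to give the same results. The names `edit_distance` and `close_match?` and the sample track titles are made up for this example.

```ruby
# Textbook full-table Levenshtein distance (a sketch, not the code above).
def edit_distance(a, b)
  s, t = a.chars, b.chars
  # rows[i][j] = edits needed to turn the first i chars of s
  # into the first j chars of t
  rows = Array.new(s.length + 1) { |i| [i] + [0] * t.length }
  (0..t.length).each { |j| rows[0][j] = j }

  (1..s.length).each do |i|
    (1..t.length).each do |j|
      cost = s[i - 1] == t[j - 1] ? 0 : 1
      rows[i][j] = [
        rows[i - 1][j] + 1,        # deletion
        rows[i][j - 1] + 1,        # insertion
        rows[i - 1][j - 1] + cost  # substitution (or match)
      ].min
    end
  end
  rows[s.length][t.length]
end

# The same four checks as matches?, under a different (made-up) name.
def close_match?(s1, s2, tolerance = 2)
  s1 == s2 ||
    !(s1 =~ /#{Regexp.escape(s2)}/i).nil? ||
    !(s2 =~ /#{Regexp.escape(s1)}/i).nil? ||
    edit_distance(s1, s2) <= tolerance
end

puts edit_distance('Apple', 'apple')                  # 1 (one substitution)
puts edit_distance('Apple', 'Applw')                  # 1 (one substitution)
puts edit_distance('Apple', 'Aple')                   # 1 (one deletion)
puts close_match?('Karma Police', 'Karma Polise', 0)  # false
puts close_match?('Karma Police', 'Karma Polise')     # true
puts close_match?('Intro', 'Intro (Live)', 0)         # true (substring)
```

The last line shows something worth knowing about the tolerance: even at 0, the case-insensitive substring checks still apply, so a tolerance of 0 means "identical, or one title contains the other" rather than strictly identical.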
Now that all the hard work is done, put this in your view:
<% for track in get_recent_covers(10) %>
  <%= album_cover_tag(track) %>
<% end -%>
... and this in your application_helper.rb:
def album_cover_tag(track)
  if track[:image].nil?
    link_to(image_tag("/PATH/TO/YOUR/BLANK/CD/IMAGE"), track[:url])
  else
    link_to(image_tag(track[:image]), track[:url])
  end
end
