<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Mark's Blog &#187; flickr scripting</title>
	<atom:link href="http://longair.net/blog/tag/flickr-scripting/feed/" rel="self" type="application/rss+xml" />
	<link>http://longair.net/blog</link>
	<description>(occasional miscellanea)</description>
	<lastBuildDate>Tue, 03 Aug 2010 09:59:58 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.0</generator>
		<item>
		<title>Hashing Flickr Photos</title>
		<link>http://longair.net/blog/2009/12/19/hashing-flickr-photos/</link>
		<comments>http://longair.net/blog/2009/12/19/hashing-flickr-photos/#comments</comments>
		<pubDate>Sat, 19 Dec 2009 22:30:27 +0000</pubDate>
		<dc:creator>mark</dc:creator>
				<category><![CDATA[Software]]></category>
		<category><![CDATA[flickr scripting]]></category>

		<guid isPermaLink="false">http://longair.net/blog/?p=405</guid>
		<description><![CDATA[<p>I used to host my photos with a simple set of CGI scripts that basically worked well enough for my simple requirements.  Such web applications are easy and fun to write, but in the end I decided that it wasn&#8217;t worth it because:</p> Hosting large amounts of data on a generic shell account is [...]]]></description>
			<content:encoded><![CDATA[<p><a href="http://www.flickr.com/photos/mhl20/4197661773/"><img class="alignright" src="http://farm3.static.flickr.com/2756/4197661773_d671c5feea_m.jpg" alt="" width="240" height="124" /></a>I used to host my photos with a simple set of CGI scripts that basically worked well enough for my simple requirements.  Such web applications are easy and fun to write, but in the end I decided that it wasn&#8217;t worth it because:</p>
<ul>
<li>Hosting large amounts of data on a generic shell account is typically quite expensive.  <a href="http://www.flickr.com">Flickr</a>&#8217;s <a href="http://www.flickr.com/upgrade/">&#8220;pro&#8221; account subscription</a> is a very good deal in comparison: as long as each photo is beneath 20 megabytes in size, you can upload as many as you like for $24.95 a year.</li>
<li>The community aspect of sites like Flickr is very encouraging &#8211; it&#8217;s lovely to have random people say nice things about your photographs, and occasionally have people use them in articles, etc.</li>
</ul>
<p>(Some people are put off from using Flickr by the appearance of the site, but its API means that there are plenty of alternative front-ends for viewing or presenting your photos, such as <a href="http://www.flickriver.com/photos/mhl20/">flickriver</a>.)</p>
<p>The slight problem with switching to hosting on Flickr was that previously I&#8217;d indexed all my photos by the MD5sum of the original image, so several of my pages had links or inline images that pointed to an MD5sum-based URL on the old site.  It occurred to me that it might be useful in general to have <a href="http://www.flickr.com/groups/api/discuss/72157594497877875/">&#8220;machine tags&#8221;</a> on each photo with a hash or checksum of the image, so that, for example:</p>
<ul>
<li>You can simply check which photos have already been uploaded.</li>
<li>You can find URLs for all the different image sizes, etc. based on the content of the file.</li>
</ul>
<p>Unfortunately, I hadn&#8217;t done this when uploading the files in the first place, so I had to write a script (<a href="http://github.com/mhl/flickr-checksums/blob/master/flickr-checksum-tags.py">flickr-checksum-tags.py</a>) which takes the slightly extraordinary step of downloading the original version of every photo that doesn&#8217;t have the checksum tags to a temporary file, hashing each file, adding the tags and deleting the temporary file.  This adds tags for the MD5sum and the SHA1sum, using a namespace and keys <a href="http://www.flickr.com/groups/api/discuss/72157594497877875/#comment72157594506503786">suggested in this discussion</a>, where someone suggests taking the same approach. These tags are of the form:</p>
<pre>
  checksum:md5=c629c63f8508cfd1a5e6ba6b4b3253a8
  checksum:sha1=df44fc771660fbe7a2d6b2e284ae61e9ed3e377c
</pre>
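<p>As a rough illustration of the hashing step, here is a minimal sketch of how tags in that form could be computed for a local file.  It is not taken from the script itself: the function name <tt>checksum_tags</tt> and the chunk size are assumptions made for this example, and only Python&#8217;s standard <tt>hashlib</tt> module is used.</p>

```python
import hashlib


def checksum_tags(path, chunk_size=65536):
    """Return the two machine tags for the file at `path`.

    The name `checksum_tags` and the chunk size are illustrative
    assumptions, not part of flickr-checksum-tags.py.
    """
    md5 = hashlib.md5()
    sha1 = hashlib.sha1()
    # Read in chunks so that large originals need not fit in memory.
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            md5.update(chunk)
            sha1.update(chunk)
    return [
        "checksum:md5=" + md5.hexdigest(),
        "checksum:sha1=" + sha1.hexdigest(),
    ]
```

<p>Both digests are updated in a single pass over the file, so each downloaded original only has to be read once.</p>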
<p>The same script can return URLs for a given checksum:</p>
<pre>
  # ./flickr-checksum-tags.py -m c629c63f8508cfd1a5e6ba6b4b3253a8 --short
  > http://flic.kr/p/7oQxqK
  # ./flickr-checksum-tags.py -m c629c63f8508cfd1a5e6ba6b4b3253a8 -p
  > [... the Flickr photo page URL, which WordPress insists on turning into an image ...]
  # ./flickr-checksum-tags.py -m c629c63f8508cfd1a5e6ba6b4b3253a8 --size=b
  > http://farm3.static.flickr.com/2552/4196574615_491c6387f8_b.jpg
</pre>
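<p>The URLs above follow a regular pattern, so once the photo has been found they can be rebuilt from a few fields.  The sketch below is an assumption about how that might be done rather than a copy of what the script does: the helper names are made up for this example, the short URL is taken to be the photo ID written in base 58 (without 0, O, I and l), and the farm, server, ID and secret values are simply read off the static URL shown above.</p>

```python
# Alphabet assumed for flic.kr short URLs: base 58, omitting 0, O, I and l.
ALPHABET = "123456789abcdefghijkmnopqrstuvwxyzABCDEFGHJKLMNPQRSTUVWXYZ"


def base58(n):
    """Encode a non-negative integer in base 58 using ALPHABET."""
    digits = ""
    while n:
        n, remainder = divmod(n, 58)
        digits = ALPHABET[remainder] + digits
    return digits or ALPHABET[0]


def short_url(photo_id):
    """Hypothetical helper: build the flic.kr URL for a photo ID."""
    return "http://flic.kr/p/" + base58(int(photo_id))


def static_url(farm, server, photo_id, secret, size):
    """Hypothetical helper: build a sized image URL in the form shown above."""
    return "http://farm%s.static.flickr.com/%s/%s_%s_%s.jpg" % (
        farm, server, photo_id, secret, size)


print(short_url(4196574615))
print(static_url(3, "2552", 4196574615, "491c6387f8", "b"))
```

<p>With the values from the example above, these two calls reproduce the <tt>--short</tt> and <tt>--size=b</tt> results.</p>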
<p>The <a href="http://github.com/mhl/flickr-checksums">repository</a> also has a <a href="http://github.com/mhl/flickr-checksums/blob/master/find-not-uploaded.py">script to pick out files that haven&#8217;t been uploaded</a>, and a <a href="http://github.com/mhl/flickr-checksums/blob/master/flickr-upload.py">simple uploader script</a> which will upload an image and add the checksum tags.  The scripts are based on the very useful Python <a href="http://stuvel.eu/projects/flickrapi">flickrapi module</a> and you&#8217;ll need to put your Flickr API key and secret in <tt>~/.flickr-api</tt>.</p>
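<p>The idea behind picking out files that haven&#8217;t been uploaded can be sketched locally without touching the Flickr API.  This is only an illustration under stated assumptions: the set of MD5sums already present on Flickr is passed in as <tt>uploaded_md5s</tt> (in practice it would have to come from the checksum tags on your photos), and the function names are invented for this example rather than taken from <tt>find-not-uploaded.py</tt>.</p>

```python
import hashlib
import os


def md5_of_file(path, chunk_size=65536):
    """Return the hex MD5 digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_not_uploaded(directory, uploaded_md5s):
    """List files under `directory` whose MD5sum is not in `uploaded_md5s`.

    `uploaded_md5s` is an assumed input: a set of hex digests gathered
    from the checksum:md5 tags of photos that are already on Flickr.
    """
    missing = []
    for root, _dirs, files in os.walk(directory):
        for name in sorted(files):
            path = os.path.join(root, name)
            if md5_of_file(path) not in uploaded_md5s:
                missing.append(path)
    return missing
```

<p>Because the comparison is on file content rather than file name, renamed copies of a photo that is already on Flickr are not reported as missing.</p>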
<p>Anyway, these have been useful for me so may be of some interest to someone out there&#8230;</p>
]]></content:encoded>
			<wfw:commentRss>http://longair.net/blog/2009/12/19/hashing-flickr-photos/feed/</wfw:commentRss>
		<slash:comments>1</slash:comments>
		</item>
	</channel>
</rss>
