Tech stuff, API and programming: remembering what i've uploaded

	stephenh October 17, 2005, 03:11 AM
	So I've synced the first 21 files from Adobe Album up to 23, tags and all. Currently, I remember which photos I've uploaded by putting a string like "album#3" in their caption. Oh, I'll need an end delimiter too, so that a search for album#3 doesn't also return album#30. Anyway, each time sync.rb runs, it goes through all the Album files, searches 23 for that album id in album#id form, and uploads the photo if it doesn't exist yet. I really like this because it keeps all of the sync state on the server. No configuration/history files to worry about on my side (earlier I tried adding a column to the Album Access file but it really screwed things up--I have no idea why). But now it looks ugly having this album#id in the caption, so the hack is starting to bother me. Any ideas on how else I can remember where a photo on 23 came from? Maybe something like a path field, e.g. where it originally was on my hard drive, which I could then search on. I dunno. I think the answer is that I should just do client-side sync state tracking, but I'd rather not. So I'm throwing this out there as a challenge--good ideas would be appreciated. Thanks.
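
A minimal Ruby sketch of the caption marker with an end delimiter, as described in the post above. The helper names and the choice of ";" as the delimiter are assumptions for illustration; they are not taken from sync.rb or the 23 API. How 23's search tokenizes captions is not stated in the thread, so the sketch checks the caption text on the client side.

```ruby
# Hypothetical helpers: names and the ";" end delimiter are assumptions.

# Build the marker stored in a photo's caption, e.g. "album#3;".
def sync_marker(album_id)
  format("album#%d;", Integer(album_id))
end

# True only if the caption carries the marker for exactly this id.
# The trailing ";" keeps id 3 from matching a caption marked for id 30.
def caption_has_marker?(caption, album_id)
  caption.include?(sync_marker(album_id))
end

puts caption_has_marker?("Beach day album#30;", 3)
puts caption_has_marker?("Beach day album#30;", 30)
```

The "album#" prefix plays the role of a start delimiter, so only the end of the id needs an explicit terminator.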

	Steffen Fagerström Christensen Team 23 October 17, 2005, 06:52 AM
	Hm, sounds somewhat extensive. One way of handling this would be to add an identifier to the EXIF data of every local photo. Then you can check against that piece of info through the API when you do your sync... /Stc

	martind October 17, 2005, 10:01 AM
	Erm... maybe I'm missing something, but why don't you simply use tags? That's what they are for, after all: a simple way to create metadata. The caption isn't really a useful place for that. E.g., use tags like "from:flickr", "from:adobe-album", "from:powershot-a70", etc.

	stephenh October 17, 2005, 02:55 PM
	Not a bad idea, except that I need to remember it on a photo-specific basis, e.g.:
	foo.jpg -> tagged with from:album:id=2
	bar.jpg -> tagged with from:album:id=3
	That way the next time my sync script runs, it will come across "foo.jpg" and ask "are there any photos on 23 tagged with from:album:id=2?" (where 2 is the unique id from the Album Access db). If there is one, it knows not to upload it again. If there is not, it knows it's a new photo to POST up to 23. I suppose unique tags wouldn't necessarily be that bad--though 1 per photo might start to clutter the UI a little bit. Good idea though. Ironically, Album uses this very same technique to do some tracking internally--e.g. if I use its "share" feature to send out photos via email, it will give them an internal tag along the lines of "Shared - Recipient Name".

	stephenh October 17, 2005, 03:00 PM
	Hm...that sounds intriguing. I haven't played much with EXIF, but I like what you're saying. If I understand it:
	- Before uploading each file, make sure its EXIF has a field set like albumId=2
	- Search 23 for any photo with albumId=2 in its EXIF
	- If it's there, don't upload
	- If it's not there, it's new, do an upload.
	Just one question: is there an easy way to search the EXIF data? E.g. according to the flickr docs, flickr.photos.search doesn't cover EXIF. Thanks.

	Steffen Fagerström Christensen Team 23 October 17, 2005, 04:23 PM
	Okay, the basics are: This will return a paginated list of all my public photos (since I'm user #456):
	http://www.23hq.com/services/rest/?method=photos.search&api_key=ignore&user_id=456
	If you want to include private photos, include username and password as well:
	http://www.23hq.com/services/rest/?method=photos.search&api_key=ignore&user_id=456&username=steffen&password=(password)
	The list is paginated, so this will return page #4:
	http://www.23hq.com/services/rest/?method=photos.search&api_key=ignore&user_id=456&page=4
	So that should get you a list of all photo_ids. Then read the EXIF data for each one:
	http://www.23hq.com/services/rest/?method=photos.getExif&api_key=ignore&photo_id=60786
	This method doesn't deviate much from tagging, but others won't be able to see the tags, and you'll have the tags hardcoded in the local versions of the files (you'll probably need some toolkit to handle the EXIF tagging, though -- ImageMagick may be a way to go). As always, I'll encourage you to share your solution when you're done... :-) /Stc
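
The request URLs in the post above can be assembled with Ruby's standard library. This is only a sketch: the base URL, method names and parameter names come from the post, while the helper names (rest_url, search_url, exif_url) are made up for illustration. It builds the URLs and makes no network calls.

```ruby
require "uri"

# Base URL as given in the post above.
BASE_URL = "http://www.23hq.com/services/rest/"

# Hypothetical helper: build a REST URL for the given API method.
def rest_url(method, params = {})
  query = { "method" => method, "api_key" => "ignore" }.merge(params)
  "#{BASE_URL}?#{URI.encode_www_form(query)}"
end

# URL for one page of a user's photos (page is optional).
def search_url(user_id, page: nil)
  params = { "user_id" => user_id }
  params["page"] = page if page
  rest_url("photos.search", params)
end

# URL for the EXIF data of a single photo.
def exif_url(photo_id)
  rest_url("photos.getExif", { "photo_id" => photo_id })
end

puts search_url(456, page: 4)
puts exif_url(60786)
```

A sync script would fetch search_url page by page to collect the photo ids, then fetch exif_url for each id. Parsing the responses is left out because the thread does not show the response format.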

	Steffen Fagerström Christensen Team 23 October 17, 2005, 04:25 PM
	Oh yeah, forgot to show you how to get your user id. This will do the trick: http://www.23hq.com/services/rest/?method=test.login&api_key=ignore&username=steffen&password=(password) /Stc

	stephenh October 27, 2005, 03:33 PM
	Just as an update, I'm still planning on sharing when I get done. I had it working great: I was storing client-side sync state as a YAML-serialized map of { filePath: photoId }. Then I'd just loop over all the album images, find which ones didn't have a path/photoId match yet, and upload them. However, after doing 300 of the 600 images I'm trying to upload, some crazy timeout/YAML error corrupted the sync.yaml file and I lost all 300 existing { filePath: photoId } mappings. With no backup. So now my script wants to start at 0 and re-upload 300 images. As you can see, it's not really ready for public consumption yet. I'm mulling over the proposed EXIF tactic and will let you know when I get something more stable worked out.
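
One common way to make a YAML state file like the one in the post above harder to corrupt is to write to a temporary file first, keep a backup of the previous version, and only then rename the new file into place. The sketch below is not the author's script. The helper names and the ".tmp"/".bak" suffixes are assumptions, and the rename is only atomic on POSIX filesystems.

```ruby
require "yaml"
require "fileutils"

# Hypothetical helper: save a { filePath => photoId } map without ever
# leaving a half-written file at `path`.
def save_sync_state(path, state)
  tmp = "#{path}.tmp"
  File.open(tmp, "w") do |f|
    f.write(state.to_yaml)
    f.flush
    f.fsync # push the data to disk before the rename
  end
  # Keep the previous good copy around as a backup.
  FileUtils.cp(path, "#{path}.bak") if File.exist?(path)
  File.rename(tmp, path)
end

# Hypothetical helper: load the map, or start empty if there is no file yet.
def load_sync_state(path)
  return {} unless File.exist?(path)
  YAML.safe_load(File.read(path)) || {}
end
```

Calling save_sync_state after every successful upload means a crash in the middle of a write damages only the ".tmp" file, and the ".bak" copy still holds the state from one save earlier.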

	stephenh October 28, 2005, 07:04 PM
	I think I've got something that's working well. The EXIF recommendation led me down the right path. It made me realize each picture has a date taken that is semi-unique down to the second. That is good enough for me. So, I still have a local cache (backed by ActiveRecord and SQLite this time, so hopefully it will be a bit more robust), but now also a way to re-populate the cache by using photo.search(min_taken_date=d,max_taken_date=d), which should return just the one picture if it's already uploaded. So, I was able to re-populate the cache with the first 300 images I uploaded and not have to go through the pain of deleting/re-uploading them. And now the script is starting to upload the rest. We'll see how it goes. Thanks for the EXIF tip. Using the date taken is saving me a lot of time.
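
The cache rebuild described in the post above could look roughly like this. It is a sketch under stated assumptions: `search` is a stand-in for whatever wraps the min_taken_date/max_taken_date search and is assumed to return an array of photo ids, and a plain Hash stands in for the ActiveRecord/SQLite cache. None of the names come from the actual script.

```ruby
# Hypothetical sketch of re-populating the cache from date-taken searches.
# local_photos: { file path => date-taken string }
# search:       callable taking (min_taken_date, max_taken_date) and
#               returning an array of photo ids (assumed interface)
def rebuild_cache(local_photos, search)
  cache = {}     # path => photo id, for photos already on the server
  ambiguous = [] # paths whose timestamp matched more than one photo
  local_photos.each do |path, taken_at|
    ids = search.call(taken_at, taken_at)
    if ids.length == 1
      cache[path] = ids.first
    elsif ids.length > 1
      ambiguous << path # "semi-unique": two photos shared the same second
    end
    # No match means the photo has not been uploaded yet.
  end
  [cache, ambiguous]
end
```

Paths that end up in neither the cache nor the ambiguous list are the ones still to upload. The ambiguous list covers the case where two photos were taken in the same second, which the date-taken approach cannot tell apart on its own.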

	Thomas Madsen-Mygdal Team 23 November 04, 2005, 07:33 PM
	How's it shaping out?

	stephenh November 05, 2005, 05:11 PM
	Very well. The SQLite cache is fast and robust. And re-building it based on max_taken_date/min_taken_date worked quite well too. A bit slow, but not nearly as slow as re-uploading 300 photos. So now I just need to put it up somewhere. I made some changes to flickr.rb and sent a diff back to its author. But I haven't done much else--this weekend I'll try to throw it up somewhere on the web. Granted, it's command-line only, with no UI and no installer, and it requires Ruby, so it may not be popular with non-techies. Also, thanks a lot for the plus account--so far I've been very pleased with 23. I like the UI better than flickr. I'm not much interested in hanging out in some sort of "cool photo community" that flickr supposedly is. I just want an easy place for friends and relatives to come see photos, and I think 23 fits this quite well.

	verseguru December 17, 2005, 08:17 AM
	On Linux and OS X 10.4 you can use extended attributes (xattr). I use these in PictureSync to keep track of uploaded files, recording their online IDs under an xattr named with the domain of the service provider (e.g. com.23hq=191506). It's faster than a dedicated DB and the data is preserved in the filesystem. I'm not familiar with what equivalents there might be on Windows, but I'd hope there's something similar…

	stephenh March 03, 2006, 06:09 AM
	Finally got around to packaging my Adobe Album -> 23 script up and uploading it. Binary/source/etc. is at: http://stephen.exigencecorp.com Comments/patches/etc. are appreciated. Thanks, Stephen
