"Using Python to Grab Images From a Web Site" was posted by ThinkingSerious and received 5 comments
btruelove said,         
May 2 2008

Overkill IMO. A less trivial example is going to get large and messy quickly. What about when the XML file is on an FTP server, requires authentication, needs a cookie, has to filter downloads by file size, has to go through a proxy, or should spawn multiple processes? A more apt tool is curl or wget (I’m lazy, so I used both). Also, when you keep it at the shell, it’s more natural to pull in other shell commands when needed.

curl -s http://99designs.com/contests/6999/feed | grep -Po "src=\".*(png|jpg)" | grep -o "http.*" | xargs wget -q
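
The one-liner above relies on a greedy `.*` and on `grep -P`, which not every grep build supports. A minimal sketch of the same extraction step follows, assuming the feed puts image URLs in double-quoted `src` attributes; the function name `extract_image_urls` is hypothetical and not part of the original comment.

```shell
# Hypothetical helper (not from the original comment): read feed text on
# stdin and print each PNG/JPG URL found in a double-quoted src attribute.
extract_image_urls() {
  # [^"]* stops at the closing quote, so two src attributes on the same
  # line give two matches instead of one greedy match spanning both.
  grep -Eo 'src="[^"]*\.(png|jpg)"' |
    grep -Eo 'http[^"]*'
}

# Usage (needs network access, so it is left commented out):
# curl -s http://99designs.com/contests/6999/feed | extract_image_urls | xargs wget -q
```

Using `-E` instead of `-P` keeps the sketch usable where Perl-style patterns are unavailable. URLs in single-quoted or unquoted attributes would be missed.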

ThinkingSerious said,         
May 2 2008

Great example. Thanks for your contribution.

michaelangela said,         
October 15 2008

Great tip using grep. I was going to use Python, too. :) My box doesn’t have the -P option :( but I used egrep to similar effect. Also, the items I needed were in tags, so I had to remove those with a sed command. The images were also retrieved from a database and didn’t have an extension, so I did a one-liner loop to rename those.

curl -s http://domain.tld/feed | egrep -o ".*" | egrep -o "(http.*)" | sed -e 's/<[^>]*>//g'
for f in *; do mv "./$f" "${f}.jpg"; done
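
The rename loop above appends `.jpg` to everything in the current directory, including files that already have an extension. A slightly more defensive sketch is shown here, assuming every extensionless file really is a JPEG; the helper name `add_jpg_extension` is made up for illustration.

```shell
# Hypothetical helper: append .jpg to every regular file in a directory
# whose name has no extension yet. Assumes all such files are JPEGs.
add_jpg_extension() {
  dir=${1:-.}
  for f in "$dir"/*; do
    [ -f "$f" ] || continue      # skip directories and an unmatched glob
    case "${f##*/}" in
      *.*) continue ;;           # name already contains a dot, leave it
    esac
    mv -- "$f" "$f.jpg"
  done
}
```

Calling `add_jpg_extension .` after the download does the same job as the loop, but it can be run twice without producing names like `photo.jpg.jpg`.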

Ramblings said,         
October 15 2008

grep instead of python?…

michaelangela said,         
October 15 2008

Ah… forgot to add the

| xargs wget -q

at the end of the curl,egrep line to do the actual downloading.
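
Putting the pieces from this comment together, one possible sketch of the tag-stripping step is below. It assumes the URLs sit between markup tags in the feed, as described above; the function name `urls_from_tagged_feed` is hypothetical, and the real feed layout is not shown in the thread.

```shell
# Hypothetical helper: read tagged feed text on stdin, drop the markup,
# and print one http/https URL per line.
urls_from_tagged_feed() {
  # Replace each <...> tag with a space so neighbouring values stay
  # separated, then pick out whatever looks like a URL.
  sed -e 's/<[^>]*>/ /g' |
    grep -Eo 'https?://[^[:space:]]+'
}

# Usage (needs network access, so it is left commented out):
# curl -s http://domain.tld/feed | urls_from_tagged_feed | xargs wget -q
```

Stripping the tags before matching means the URL pattern never has to deal with a trailing closing tag.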
