Friday 18 May 2012

couchdb: delete document if it exists

I am uploading some files via a script to a couchdb database. In many cases the files are already there, but out of date. So I want to replace them. However, couch complains that there is an 'update conflict', even if I explicitly try to delete the resource without supplying its revision id. So I followed the instructions to just test that a document is present, extracted the rev-id and then composed a delete operation. In bash scripting language, and using curl as the http transmitter it looks like this:

erase_if_exists()
{
    RESULT=`curl -s --head $1`
    if [ ${RESULT:9:3} == "200" ]; then
      REVID=$(expr "$RESULT" : '.*\"\(.*\)\"')
      curl -X DELETE $1?rev=$REVID
    fi
}

Why HEAD? Because HEAD retrieves only the HTTP document headers but not the document, which could be large. Curl can send a HEAD request using curl -X HEAD but it hangs on reading the response, so the above formulation is an alternative that works. The -s tells it not to print out any download progress, and in this function $1 is the URL. The HTTP response code is extracted from the RESULT string using indexing. If it is 200, the document is present. Here's a sample response:

HTTP/1.1 200 OK
Server: CouchDB/1.1.1 (Erlang OTP/R15B)
Etag: "3-04bacb883737ffed82700eacdf4e74f2"
Date: Fri, 18 May 2012 21:17:04 GMT
Content-Type: text/plain;charset=utf-8
Content-Length: 8077
Cache-Control: must-revalidate

The REVID gets extracted from the Etag property via a regular expression, and in the example is set to 3-04bacb883737ffed82700eacdf4e74f2. This is passed in another call to curl as a URL parameter. You also have to embed the username and password into the URL for this to work, or you can use the -u option to specify a username.

No comments:

Post a Comment

Note: only a member of this blog may post a comment.