Friday, 30 March 2012

Bare bones multi-file upload

What I need for HritServer is an import facility that can handle multiple file uploads. I've found many fancy scripts for doing this that are way too complex and too specific. Sometimes all you need is a bare bones solution you can tailor to suit your needs rather than a fully fledged product you have to spend ages understanding and cutting down to size. Also I hate building in dependencies that only increase the tendency for code to break. So here's my simple contribution to the HTML multifile upload problem.

The basic idea is to have one <input type="file"> for every file you want to upload, but to hide all but the current empty one. When the user picks a file with the only visible file input, that element takes the chosen file as its value, adds itself to an invisible list of file inputs, and creates a fresh one for the next selection. To make it easier to see what's already been selected I maintain a secondary list of paths in a table. Next to each entry is a remove button that lets you take out individual entries. That's it. Here is the PHP code to handle the upload on the server side. Replace this with something else if you like, such as Apache file upload in Java:

And here is the HTML, with self-contained JavaScript.

Call the first "upload.php" and the second "upload.html", then put them both in the root document directory of your web server. If you have PHP installed, navigate to /upload.html and take it from there. Because the code is so simple, you can easily add styling or swap in a different server script.

Wednesday, 7 March 2012

Uploading a directory of images to couchdb

I wanted to have a general script to upload a set of images to couchdb. Couch can't store images as documents because documents are JSON. But you can still create a JSON document and add the images to it as attachments. Here's the catch: each upload has to specify the document's current revision id, and that changes after every image you add.

But I had two problems. My directory of images contained sub-directories. I also wanted to access the images directly using a simple URL. So if I had a directory structure like:

    list
        file1.png
        one
            file2.png
            file3.png
        two
            file4.png

I would want the relative URLs to be: /list/file1.png and /list/one/file2.png etc. The trick in writing the script is to extract the revid from the server response. I used awk for that, then used the returned value to upload the next entry in that directory. The second trick is to use %2F not / as a directory separator when creating the docids. Couch doesn't allow nesting in the database structure but you can simulate it by creating documents called:

list
list%2Fone
list%2Ftwo
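
A minimal sketch of the two tricks, not the original script. The helper names docid_for and extract_rev are hypothetical, and the sample response assumes the usual shape of a couch reply to a successful attachment upload, a JSON object with a "rev" field.

```shell
#!/bin/sh
# Hypothetical helpers for illustration; the names are not from the original script.

# Turn a relative directory path into a couch document id by
# replacing every "/" with "%2F", e.g. list/one -> list%2Fone.
docid_for() {
    printf '%s\n' "$1" | sed 's|/|%2F|g'
}

# Print the value of the "rev" field from a JSON response read on
# standard input. Prints nothing if the response has no "rev" field.
extract_rev() {
    awk -F'"' '{
        for (i = 1; i < NF; i++)
            if ($i == "rev" && $(i + 1) ~ /^[ \t]*:[ \t]*$/) { print $(i + 2); exit }
    }'
}

docid_for "list/one"                                               # prints list%2Fone
echo '{"ok":true,"id":"list/one","rev":"1-abc123"}' | extract_rev  # prints 1-abc123
```

Splitting on the double-quote character lets awk treat the JSON keys and values as fields, which is enough for couch's flat responses but is not a general JSON parser.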

The first posting to each of those documents doesn't need a revid, but subsequent ones do. That's just a feature of couch. So here's the script. It's a bit sloppy because if couch responds with an error to any upload it will fall over. Error handling is currently left as an exercise for the reader. I'll put it in later and may update the post then. To use the script put it into a file called upload.sh and then invoke it thus: ./upload.sh images, where "images" is the master image directory.
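
For readers who want a starting point, here is a rough sketch of how such a script could look. It is a reconstruction under assumptions, not the original upload.sh. The server address, the database name "images" and all function names are made up for illustration. It also assumes that none of the documents exist yet and that file names need no URL escaping. Unlike the script described above, it stops with a message when a response contains no rev.

```shell
#!/bin/sh
# Sketch only: COUCH and DB are assumed values and the function names
# are hypothetical. Assumes the target documents do not exist yet.
COUCH="${COUCH:-http://localhost:5984}"
DB="${DB:-images}"

# list/one -> list%2Fone
docid_for() {
    printf '%s\n' "$1" | sed 's|/|%2F|g'
}

# Print the "rev" value from a couch JSON response on stdin.
extract_rev() {
    awk -F'"' '{
        for (i = 1; i < NF; i++)
            if ($i == "rev" && $(i + 1) ~ /^[ \t]*:[ \t]*$/) { print $(i + 2); exit }
    }'
}

# Guess a Content-Type from the file extension.
mime_for() {
    case "$1" in
        *.png) echo "image/png" ;;
        *.jpg|*.jpeg) echo "image/jpeg" ;;
        *.gif) echo "image/gif" ;;
        *) echo "application/octet-stream" ;;
    esac
}

# Upload every regular file directly inside directory $1 as an
# attachment of the document named after that directory.
upload_dir() {
    dir=$1
    rev=""
    url="$COUCH/$DB/$(docid_for "$dir")"
    for f in "$dir"/*; do
        [ -f "$f" ] || continue
        target="$url/$(basename "$f")"
        # The first attachment creates the document; later ones need ?rev=
        [ -n "$rev" ] && target="$target?rev=$rev"
        resp=$(curl -s -X PUT "$target" \
            -H "Content-Type: $(mime_for "$f")" --data-binary @"$f")
        rev=$(printf '%s\n' "$resp" | extract_rev)
        if [ -z "$rev" ]; then
            echo "upload of $f failed: $resp" >&2
            return 1
        fi
    done
}

# Walk the tree rooted at $1, one document per directory,
# stopping at the first directory whose upload fails.
upload_tree() {
    find "$1" -type d | {
        status=0
        while IFS= read -r d; do
            upload_dir "$d" || { status=1; break; }
        done
        [ "$status" -eq 0 ]
    }
}

if [ "$#" -ge 1 ]; then
    upload_tree "$1"
fi
```

Each directory gets a fresh, empty rev because it maps to its own document, so only the second and later files in a directory carry a ?rev= parameter.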