Elucubrations: REST


Thursday, August 23, 2012

REST and Pagination

What is the RESTful way to perform pagination? To my mind, there is only one approach - URI parameters. For example, to return the first ten tunes from our repository:


    http://localhost:8080/musicrest/genre/irish/tune?page=1&size=10

A page is not itself a resource and so these parameters should not be part of the URI path. Nor should the HTTP Range header be hijacked for this purpose. Fortunately, Spray makes it ridiculously easy to fish out the parameters and provide defaults where necessary with its parameters combinator. For example, if we want to use a default page size of 10 we can use:


    pathPrefix("musicrest/genre") {
      path (PathElement / "tune" ) { genre => 
        parameters('page ? 1, 'size ? 10) { (page, size) =>
          get {  ctx => ctx.complete(TuneList(genre, page, size)) }
        }        
      }
    } ~
    ....
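To make the defaulting behaviour concrete without any Spray dependency, here is a minimal sketch in plain Scala. The names PageParams, parseQuery, intParam and pageAndSize are invented for the illustration and are not part of Spray. It is also more forgiving than a real router is likely to be: a malformed number simply falls back to the default.

```scala
object PageParams {
  // Split a raw query string such as "page=2&size=5" into a key/value map.
  def parseQuery(query: String): Map[String, String] =
    query.split("&").toList
      .filter(_.nonEmpty)
      .map(_.split("=", 2))
      .collect { case Array(k, v) => k -> v }
      .toMap

  // Look up an integer parameter, falling back to the default when it is
  // missing or is not a number.
  def intParam(params: Map[String, String], name: String, default: Int): Int =
    params.get(name).flatMap(v => scala.util.Try(v.toInt).toOption).getOrElse(default)

  // Returns (page, size) using the same defaults as the route above.
  def pageAndSize(query: String): (Int, Int) = {
    val params = parseQuery(query)
    (intParam(params, "page", 1), intParam(params, "size", 10))
  }
}
```

With this sketch, PageParams.pageAndSize("page=2&size=5") gives (2, 5) and an empty query string gives the defaults (1, 10).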

The next problem is how best to represent the list of tunes that is returned. As we're developing a transcoding service, it makes sense to support both JSON and XML representations and to return the paging information alongside the tunes. These representations can be returned by means of an appropriate marshaller as discussed earlier. A JSON representation for a couple of tunes might be:


 {
    "tune": [
        {
            "title": "A Fig For A Kiss",
            "rhythm": "slip jig",
            "_id": "a+fig+for+a+kiss-slip+jig"
        },
        {
            "title": "Baltimore Salute, The",
            "rhythm": "reel",
            "_id": "baltimore+salute%2C+the-reel"
        }
    ],
    "pagination": {
        "page": "1",
        "size": "2"
    }
}

and the equivalent XML representation would be:


  <tunes>
     <tune>
       <title>A Fig For A Kiss</title>
       <rhythm>slip jig</rhythm>
       <_id>a+fig+for+a+kiss-slip+jig</_id>
     </tune>
     <tune>
       <title>Baltimore Salute, The</title>
       <rhythm>reel</rhythm>
       <_id>baltimore+salute%2C+the-reel</_id>
     </tune>
     <pagination>
       <page>1</page>
       <size>2</size>
     </pagination>
  </tunes>

Before we look into how to implement this, I need to digress slightly and talk about the components I've chosen to integrate.

Components

It turns out that LilyPond was not really appropriate for transcoding the tunes. Although it produces scores of very high quality, it was slow, and the abc2ly utility would occasionally hiccup (for example, it would get confused by lead-in notes). Instead I now use abcMidi for MIDI production and abc2ps for PostScript (which can then be transcoded to a variety of formats with the standard Linux convert tool). I had originally intended to use Blue Eyes MongoDB for Mongo integration but was put off by the large number of jars brought in through its dependency on Blue Eyes Core. Instead I now use Casbah, which is very easy to work with. I'm still very pleased with Blue Eyes JSON, though. And finally, I have chosen Spray as the web service toolkit.

Implementing Paging

This is very straightforward with Mongo because the URI parameters very closely match Mongo's skip and limit functions. For example, the following Casbah query returns the data we need (here T represents the tune title and R its rhythm):


  def getTunes(genre: String, page: Int, size: Int): Iterator[scala.collection.Map[String, String]] = {
    val mongoCollection = mongoConnection(dbname)(genre)
    val q  = MongoDBObject.empty
    val skip = (page - 1) * size
    val fields = MongoDBObject("T" -> 1, "R" -> 1)
    val s = MongoDBObject("T" -> 1)
    for {
      x <- mongoCollection.find(q, fields).sort(s).skip(skip).limit(size)
    } yield x.mapValues(v => v.asInstanceOf[String])    
  }
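The arithmetic is the only subtle part: pages are numbered from 1, so page n starts after (n - 1) * size entries. A minimal in-memory sketch of the same idea, using drop and take in place of Mongo's skip and limit (the Paging object and pageOf are names made up for this example):

```scala
object Paging {
  // Return the 1-based `page` of `items`, holding at most `size` entries,
  // after sorting alphabetically. drop/take play the role of skip/limit.
  def pageOf(items: Seq[String], page: Int, size: Int): Seq[String] = {
    val skip = (page - 1) * size
    items.sorted.drop(skip).take(size)
  }
}
```

A page beyond the end of the data simply comes back empty, which mirrors what a skip past the end of a collection would give.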

We can use the Iterator returned by getTunes to populate our TuneList class, and provide a toJSON method to produce the output we require:


class TuneList(i: Iterator[scala.collection.Map[String, String]], page: Int, size: Int) {
 
  def toJSON: String = {
    val quotedi = i.map( cols => cols.map( c => c match {
      case ("T", x) => "\"title\"" + ": " + "\"" + x + "\" "
      case ("R", x) => "\"rhythm\"" + ": " + "\"" + x + "\" "
      case (a, b)   => "\"" + a + "\"" + ": " + "\"" + b + "\" "
    }).mkString("{", ",", "}"))
    quotedi.mkString("{ \"tune\": [", ",", "], " + pageInfoJSON + "  }") 
  }
   
  private val pageInfoJSON:String = " \"pagination\" : { \"page\""  + ": \"" + page + "\" ," + "\"size\""  + ": \"" + size + "\" }"
  
}

object TuneList {
  def apply(genre: String, page: Int, size: Int):TuneList = new TuneList(TuneModel().getTunes(genre, page, size), page, size)
}
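One caveat with assembling JSON by string concatenation is that a title containing a double quote or a backslash would produce invalid output. A minimal sketch of an escaping helper that could be applied to each value before quoting it (JsonText, escape and field are hypothetical names, not part of Blue Eyes JSON or of the class above):

```scala
object JsonText {
  // Escape the characters that would otherwise terminate or corrupt a
  // JSON string literal.
  def escape(s: String): String =
    s.flatMap {
      case '"'          => "\\\""
      case '\\'         => "\\\\"
      case '\n'         => "\\n"
      case '\r'         => "\\r"
      case '\t'         => "\\t"
      case c if c < ' ' => "\\u" + "%04x".format(c.toInt)
      case c            => c.toString
    }

  // Render one "name": "value" pair with both sides escaped and quoted.
  def field(name: String, value: String): String =
    "\"" + escape(name) + "\": \"" + escape(value) + "\""
}
```

In practice a JSON library would normally take care of this; the sketch only shows what the hand-rolled version has to watch out for.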

Finally, we can add a toXML method for the XML output, made very easy by the Blue Eyes JSON library:


  def toXML: String = {
      val jvalue = JsonParser.parse(toJSON)
      "<tunes>" + Xml.toXml(jvalue).toString + "</tunes>"
  }

I find it much simpler to generate XML from JSON rather than the other way round.

Friday, July 27, 2012

Content Negotiation with Spray

Please note: this article describes Spray 1.0-M2. The Spray API has altered considerably since this was written - see the more recent post on Spray Migration.

Spray is another attractive framework for building RESTful web services. Again it uses combinators for parsing requests in a similar manner to unfiltered but it offers considerably more options in composing them together, very nicely described here. Spray's approach to content negotiation differs from unfiltered. By and large, the set of supported MIME types, and the matching of appropriate responses to requests remains hidden. Having matched a request, you would typically return a response using the complete function:


    path("whatever") {
      get { requestContext => requestContext.complete("This is the response") }
    }

This has the ability to return a variety of MIME types. This is because each variant of the complete method on RequestContext takes a Marshaller as an implicit parameter:


   def complete [A] (obj: A)(implicit arg0: Marshaller[A]): Unit

I like this because it is likely that you will want to model each RESTful resource with a dedicated type. Then, if you simply provide a marshaller for that type in implicit scope, you have a type-safe response mechanism. It is possible for a marshaller to take responsibility for more than one MIME type, and the Spray infrastructure will then silently handle the content negotiation for you, hooking up an appropriate response content type to that requested. For example, if we want to reproduce the music service so far implemented with unfiltered, we can do it like this:


import cc.spray._
import cc.spray.directives._

import org.bayswater.musicrest.typeconversion.MusicMarshallers._

trait MusicRestService extends Directives {
  
  val musicRestService = {
    path("musicrest") {
      get { _.complete("Music Rest main options") }
    } ~
    path("musicrest/genre") { 
      get { _.complete("List of genres") }
    } ~
    path("musicrest/genre" / PathElement ) { genre =>
      get { _.complete("Genre " + genre) }
    } ~
    pathPrefix("musicrest/genre") {
      path (PathElement / "tune" ) { genre => 
        get { _.complete("List of Tunes" ) }
      } ~
      path (PathElement / "tune" / PathElement ) { (genre, tune) =>         
        get { _.complete(Tune(genre, tune) ) }
      }
    } 
  }
  
}

Notice that when we want to match a URL that represents a tune, we use a Tune type with marshallers for the tune in scope. The marshaller (at the moment) simply encodes Strings, but it is all that is needed to generate the required Content-Type header (pdf, midi, json etc.):


import cc.spray.typeconversion._
import cc.spray.http._
import HttpCharsets._
import MediaTypes._
import java.nio.CharBuffer

trait MusicMarshallers {
  
  val `audio/midi` = MediaTypes.register(CustomMediaType("audio/midi", "midi"))
  
  case class Tune(genre: String, name: String)
  
  implicit lazy val TuneMarshaller = new SimpleMarshaller[Tune] {
    val canMarshalTo = ContentType(`text/plain`) ::
                       ContentType(`application/pdf`) :: 
                       ContentType(`audio/midi`) :: 
                       ContentType(`application/json`) :: 
                       Nil
                       
    def marshal(tune: Tune, contentType: ContentType) = {
      val content = "genre " + tune.genre + " name " + tune.name 
      val nioCharset = contentType.charset.getOrElse(`ISO-8859-1`).nioCharset
      // getBytes returns exactly the encoded bytes, whereas the backing array
      // of an encoded ByteBuffer may carry unused trailing capacity
      HttpContent(contentType, content.getBytes(nioCharset))
    }
  }
}

object MusicMarshallers extends MusicMarshallers

Spray provides a registry of the most common MIME types but this will not include our audio/midi type until the next release. For the time being, we must register the type ourselves.
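Since Spray keeps the negotiation itself out of sight, it may help to see roughly what has to happen. The following is a simplified sketch in plain Scala and makes no claim to be Spray's actual algorithm: it ignores q-values and simply walks the Accept header in order. Negotiation, acceptedRanges, covers and choose are names invented for the example.

```scala
object Negotiation {
  // The types our hypothetical marshaller can produce, in order of preference.
  val supported: List[String] =
    List("text/plain", "application/pdf", "audio/midi", "application/json")

  // Media ranges from an Accept header, ignoring parameters such as q-values.
  def acceptedRanges(accept: String): List[String] =
    accept.split(",").toList
      .map(range => range.takeWhile(_ != ';').trim.toLowerCase)
      .filter(_.nonEmpty)

  // True when a media range (e.g. "audio/*" or "*/*") covers a concrete type.
  def covers(range: String, mediaType: String): Boolean =
    range == "*/*" || range == mediaType ||
      (range.endsWith("/*") && mediaType.startsWith(range.dropRight(1)))

  // Walk the client's ranges in order and return the first supported type
  // that one of them covers; None means nothing acceptable can be produced.
  def choose(accept: String): Option[String] =
    acceptedRanges(accept)
      .flatMap(range => supported.find(mediaType => covers(range, mediaType)))
      .headOption
}
```

Here Negotiation.choose("audio/*;q=0.8, text/html") selects audio/midi, while a request that only accepts image/png gets None, which a server would typically turn into a 406 response.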

Posted Messages

Suppose we wanted to offer a transcoding service, where users post tunes in ABC format and we return them transcoded to the requested type. We can manage this by defining an Abc type and adding an Unmarshaller to MusicMarshallers for that type which converts the POST body.


  case class Abc(notes: String)

  implicit lazy val AbcUnmarshaller = new SimpleUnmarshaller[Abc] {
    val canUnmarshalFrom = ContentTypeRange(`text/plain`)  :: Nil

    def unmarshal(content: HttpContent) = protect {
      val notes =  new String(content.buffer, content.contentType.charset.getOrElse(`ISO-8859-1`).nioCharset)
      Abc(notes)
    }
  }
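The charset fallback in that unmarshaller is worth a closer look, because picking the wrong charset silently corrupts non-ASCII text. A small standalone sketch of the same decoding step, using only java.nio (BodyDecoding and decode are illustrative names, and the charset is passed as an optional name rather than Spray's own charset type):

```scala
import java.nio.charset.Charset

object BodyDecoding {
  // Decode a request body, falling back to ISO-8859-1 when the
  // Content-Type carried no charset, as the unmarshaller above does.
  def decode(body: Array[Byte], charsetName: Option[String]): String = {
    val charset = charsetName
      .map(name => Charset.forName(name))
      .getOrElse(Charset.forName("ISO-8859-1"))
    new String(body, charset)
  }
}
```

If a client sends UTF-8 bytes without declaring the charset, each accented character decodes as two ISO-8859-1 characters, so it pays for clients to state the charset explicitly.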

Spray uses the same approach for unmarshalling that it uses for marshalling. If an implicit unmarshaller for the type is in scope, it will be used to construct the type when we use the content function. The type signatures are:


  def as [A] (implicit arg0: Unmarshaller[A]): Unmarshaller[A]

  def content [A] (unmarshaller: Unmarshaller[A]): SprayRoute1[A]

which we can then use like this:


    path("musicrest/transcode") { 
      post { 
        content(as[Abc]) { abc =>      
          val tune:Tune = parseAbc(abc)
          _.complete(tune) 
        }
      }
    }

and because we're completing with a Tune type, marshalling to the requested type again happens silently. We've had to write our own marshallers and unmarshallers because we're using relatively uncommon MIME types, but Spray provides default marshallers/unmarshallers for the more common types. Incidentally, Spray seems to be refreshingly strict in its adherence to the HTTP spec. Chrome appears to have a bug when submitting forms encoded as text/plain: it mistakenly issues an empty boundary (as if it were a MultiPart type):


Content-Type:text/plain; boundary=

which is immediately rejected by Spray.

Testing

Spray supplies a SprayTest trait which allows you to test the routing logic directly (without having to fire up the container). For example:


 "The MusicRestService" should {
    "return reasonable content for GET requests to the musicrest tune path" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(acceptHeader("text/plain")))) {
        musicRestService
      }.response.content.as[String] mustEqual Right("genre irish name odeas")
    }
  }

It is possible, too, to ensure that the correct Content-Type is produced. Firstly, make sure that the various HTTP headers, charsets and media types are in scope:


import cc.spray.http.HttpHeaders._ 
import cc.spray.http.HttpCharsets._ 
import cc.spray.http.MediaTypes._ 
import org.bayswater.musicrest.typeconversion.MusicMarshallers.`audio/midi`

At this stage in the proceedings, the Content-Type header has not yet been generated. Instead it is part of the content:


    "return PDF MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`application/pdf`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`application/pdf`, `ISO-8859-1`)) 
    }
    "return text/plain MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`text/plain`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`text/plain`, `ISO-8859-1`)) 
    }
    "return audio/midi MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`audio/midi`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`audio/midi`, `ISO-8859-1`)) 
    }

We can also, of course, use Dispatch for client-server testing like we did before, or we can use Spray Client which allows you to do the same sort of marshalling and unmarshalling at the client side.

Thursday, July 5, 2012

REST and Content Negotiation

Suppose you have a resource (in our case a piece of music) and you want to provide it in a variety of formats (plain text, pdf, midi etc.) then what is the RESTful best practice? Opinion seems divided - either you represent the resource just once and then provide a best-fit response by inspecting the Accept header or you represent each form of the resource with a separate URL. I tend to favour the former and want to investigate the support given by unfiltered and Blue Eyes.

Browser Tools

Before we start, it's helpful to do the initial testing from a browser. ModHeader is a simple Chrome plugin that allows you to manipulate the request headers. For example, you can restrict its effect to localhost:8080 and set the MIME type of the request to application/json.

You can then, of course, inspect the request and response headers using the standard Chrome development tools.

unfiltered

Unfiltered is an extremely lightweight toolkit that focuses on the problem at hand. It acts as a Servlet filter and so is dealing fundamentally with HttpServletRequest and HttpServletResponse objects. However, it delivers a very intuitive Scala API built around the type ResponseFunction = HttpServletRequest => HttpServletResponse. These functions can be composed together in a variety of ways, allowing us to recognize the different combinations of request headers we need to identify or to build up appropriate combinations of response headers. Conventional Scala pattern matching is available to allow us to distinguish the various RESTful URLs we need. Finally, these mappings are gathered together in a Plan which is defined as PartialFunction[HttpServletRequest, ResponseFunction]. The portion of a Plan that handles our tunes might look like this:


  def intent = { 
    case req @ GET(Path(Seg("musicrest" :: genre :: "tune" :: tune :: Nil))) => req match {
        case MusicAccepts.Pdf(_) =>  PdfContent ~> ResponseString("Get PDF request genre: " + genre + " tune: " + tune)
        case MusicAccepts.Midi(_) => MidiContent ~> ResponseString("Get midi request genre: " + genre + " tune: " + tune)
        case Accepts.Json(_) => JsonContent ~> ResponseString("Get JSON request genre: "  + genre + " tune: " + tune)
        case _ =>    Ok ~> ResponseString("Get request for genre: " + genre + " tune: " + tune)
    }

    // other Plan cases omitted
  }

At the moment, this simply returns a String response that reflects the request, but the MIME type of the response accords with the type we eventually intend to return. Unfiltered natively handles common MIME types such as XML or JSON but when we stray from the straight and narrow with PDF or MIDI we need to provide a little more help. The DSL is easily extended like this:


import unfiltered.request.Accepts 
import unfiltered.response.ContentType

/** Accepts request header extractor for music extensions */
object MusicAccepts {

  object Midi extends Accepts.Accepting {
    val contentType = "audio/midi"
    val ext = "midi"
  }

  object Pdf extends Accepts.Accepting {
    val contentType = "application/pdf"
    val ext = "pdf"
  }
}

/** Response content type for music extensions */
object MidiContent extends ContentType("audio/midi")
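For readers less familiar with extractors, the mechanism behind this style of matching can be shown in a few lines of plain Scala. This is only a sketch of the idea, not unfiltered's implementation: Request, Accepting, Pdf, Midi and the intents below are simplified stand-ins invented for the example, and the header check is a naive substring test.

```scala
object AcceptMatching {
  // A simplified stand-in for a request: path segments plus headers.
  final case class Request(segments: List[String], headers: Map[String, String])

  // Base class for extractors that succeed when the Accept header mentions
  // the given content type. (Hypothetical; not unfiltered's Accepting trait.)
  abstract class Accepting(contentType: String) {
    def unapply(req: Request): Option[Request] =
      req.headers.get("Accept").filter(_.contains(contentType)).map(_ => req)
  }
  object Pdf  extends Accepting("application/pdf")
  object Midi extends Accepting("audio/midi")

  type Intent = PartialFunction[Request, String]

  // Match the URL shape first, then the Accept header; the result is the
  // content type that would be returned.
  val tunes: Intent = {
    case req @ Request("musicrest" :: _ :: "tune" :: _ :: Nil, _) =>
      req match {
        case Pdf(_)  => "application/pdf"
        case Midi(_) => "audio/midi"
        case _       => "text/plain"
      }
  }

  val genres: Intent = {
    case Request("musicrest" :: _ :: Nil, _) => "text/plain"
  }

  // orElse tries each intent in turn, much as a Plan gathers its cases.
  val plan: Intent = tunes orElse genres
}
```

Because the plan is a partial function, a request that matches none of the cases is simply not defined for it, which leaves room for another handler further down the chain.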

Testing

Dispatch is a Scala wrapper around Apache HttpClient. We can use this to run some integration tests. (I currently use ShouldMatchers with JUnit.) Here is a test that confirms that the MIME types for PDF content agree:


  private val  localHost = :/("localhost", 8080)   
  private val genre = "irish" 
  private val tune = "odeas"

  def testMimePdf() {     
    val h = new Http
    val headers = Map("Accept" -> "application/pdf")
    val req = localHost / "musicrest" / genre / "tune" / tune  
    h(( req <:< headers ) >:> { 
      hs => {      
            val contentType: Option[String] = hs("Content-Type").headOption
            contentType.map{_ should be ("application/pdf")}.getOrElse(fail("No Content-type header"))
            }      
      })
  }

Dispatch uses a rich set of operators for its combinators and these need a bit of explaining. In this example, >:> extracts the response headers. These operators are nicely summarised here. If you need to inspect both the headers and the content of the response, you can use the >+ operator to fork a handler into two (returned as a tuple):


  def testMimeJson() {  
    val h = new Http
    val headers = Map("Accept" -> "application/json")
    val req = localHost / "musicrest" / genre / "tune" / tune  
    val ans = h((req <:< headers) >+ { r => 
      (r >:> { 
         hs => {
                val contentType: Option[String] = hs("Content-Type").headOption
                contentType.map{_ should be ("application/json; charset=utf-8")}.getOrElse(fail("No Content-type header"))
                }
         },      
         r >- {
           content => content should be ("Get JSON request genre: irish tune: odeas")
       }
       )
    }) 
  }

Blue Eyes

Blue Eyes is more ambitious - attempting to be a complete web framework for RESTful services, implemented from scratch using pure functional techniques. One departure from unfiltered is that it does not recognize the Accept header - instead it requires requests to use a Content-Type header. I think this is a legitimate approach - just not the one I was looking for. It uses a similar combinatorial design, and has very good support for common types (particularly XML and JSON). Here's a snippet showing how you can match tune requests and return appropriate responses in these formats:


       } ~
       describe ("get tune  within a genre and type") {
          path("/genre" / 'genreName / "tune" / 'tune) {
            jvalue {
              get { request: HttpRequest[Future[JValue]] =>
                val genre = request.parameters('genreName) 
                val tune  = request.parameters('tune)
                val jTune: Future[HttpResponse[JValue]] = TuneModel(musicRestConfig).jasonTune(genre, tune)
                jTune
              }
            } ~
            xml {
              get { request: HttpRequest[Future[NodeSeq]]  =>
                val genre = request.parameters('genreName)
                val tune  = request.parameters('tune)
                val xmlTune: Future[HttpResponse[NodeSeq]] = TuneModel(musicRestConfig).xmlTune(genre, tune)
                xmlTune
              }
            }
          }
        } ~

Here the tilde combinator allows you to join the partial functions that handle each URL - pattern a or else pattern b or else .... The jvalue and xml combinators handle the mapping between incoming and outgoing Content-Types. I did not find it easy to provide similar combinators for my music types. I think my difficulty was largely because these combinators are expressed in terms of Bijections which are mappings in both directions between a type A and a type B. It wasn't clear to me how this mapped on to the problem at hand. According to forum answers there is an outer scope whose currency is A and an inner scope whose currency is B. But I simply needed to negotiate the type and then map from the type I actually had to the type agreed. Anyway, forum answers were immensely helpful and there appears to be some possibility that the Bijection approach may be rethought. I did like the JSON and Mongo libraries that are supplied as separate jars. The JSON library is derived from Lift Json whereas the Mongo library is new and looks as if it might be a viable alternative to Casbah. But this will have to wait for another post.
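As I understand the idea of an outer scope in A and an inner scope in B, it can be sketched in plain Scala as follows. This is my own illustration and not the Blue Eyes definition: the Bijection trait, its to and from methods, and the wrap helper are all invented for the example.

```scala
object Bijections {
  // A two-way mapping between A and B. Hypothetical trait for illustration;
  // it is not the Blue Eyes definition.
  trait Bijection[A, B] { self =>
    def to(a: A): B
    def from(b: B): A
    // Swap the two directions.
    def inverse: Bijection[B, A] = new Bijection[B, A] {
      def to(b: B): A = self.from(b)
      def from(a: A): B = self.to(a)
    }
  }

  // Example: text <-> UTF-8 bytes.
  val utf8: Bijection[String, Array[Byte]] = new Bijection[String, Array[Byte]] {
    def to(s: String): Array[Byte] = s.getBytes("UTF-8")
    def from(bytes: Array[Byte]): String = new String(bytes, "UTF-8")
  }

  // Wrap a handler written in terms of B (the inner scope) so that callers
  // can work in A (the outer scope): convert on the way in and on the way out.
  def wrap[A, B](bij: Bijection[A, B])(inner: B => B): A => A =
    a => bij.from(inner(bij.to(a)))
}
```

The sketch also shows why this felt like a poor fit for my case: content negotiation only needs a one-way mapping from the type I have to the type agreed, whereas a bijection insists on both directions.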

The Way Forward

I think, as things stand, I will most likely use unfiltered for pattern matching and use the Blue Eyes Mongo and JSON libraries for the heavy lifting. But I am very interested in tracking Blue Eyes core to see how it turns out.

Wednesday, June 20, 2012

Delving Deeper into Scala

OK - here's the plan. Although I've been using Scala commercially for a couple of years, it's been in a somewhat niche area - XML messaging. I'd like to become a little more familiar with some of the newer frameworks and toolkits that are emerging.

One of my interests is playing traditional music. Trad musicians tend to exchange tunes in abc notation - a very simple format developed by Chris Walshaw. Various sites provide tune repositories which also display the dots in a conventional music stave - the one I mostly use is The Session. For example, here's the abc for one of my favourite jigs - O'Dea's:

X: 1
T: O'Dea's
M: 6/8
L: 1/8
R: jig
K: Gmaj
|: G3 GBd|BGD E3|DGB d2 d|edB def|
g3 ged|ege edB|dee edB|gdB A2 B:|
|: c3 cBA|Bdd d2 e|dBG GBd|edB AFD|
GBd gag|ege edB|dee edB|gdB A3:|

I intend to build a RESTful service for traditional tunes. Users would post tunes in abc notation within particular genres. They could then request a given tune in a variety of different formats (for example: plain text (abc), pdf, JSON, midi etc.). I would also probably have a simple one-off transcoding service that didn't save the tune but would allow users to experiment with the abc. All this has been done before of course in various places - tunedb has a huge collection for example. But my main motivation is to investigate the Scala landscape - my current plan is to investigate MongoDB, unfiltered, Blue Eyes, configrity and scalaz7.

Transcoding

So the service is to be built round the ability to transcode abc into various other formats. To do this, I have chosen LilyPond, which produces very high quality pdf images of scores and supports other formats too (such as midi). It also has an add-on that converts abc into its native .ly format.

One drawback to LilyPond is that it only offers a command-line interface, so I'll have to shell out using scala.sys.process. Here's a bash script (abc2pdf.sh) that invokes it:


#!/bin/bash
#############################################
#
# transcode abc to pdf format
#
# usage: abc2pdf.sh srcdir destdir tunename
#
#############################################

EXPECTED_ARGS=3
# exit status for a usage error
E_BADARGS=64

if [ $# -ne $EXPECTED_ARGS ]
then
  echo "Usage: `basename $0` {srcdir destdir tunename}" >&2
  exit $E_BADARGS
fi

# source
abcdir=$1
if [ ! -d $abcdir ]
then
  echo "$abcdir not a directory" >&2   # Error message to stderr.
  exit 1
fi  

# destination
pdfdir=$2
if [ ! -d $pdfdir ]
then
  echo "$pdfdir not a directory" >&2   
  exit 1
fi  

# temporary work directory (we'll reuse src for the time being)
workdir=$1

# source file
srcfile=${abcdir}/${3}.abc
if [ ! -f $srcfile ]
then
  echo "no such file $srcfile" >&2  
  exit 1
fi  

# transcode from .abc to .ly
abc2ly -o $workdir/$3.ly $abcdir/$3.abc
retcode=$?
echo "abc return: " $retcode

# transcode from .ly to .pdf
if [ $retcode -eq 0 ]; then
  echo "attempting to transcode ly to pdf"
  lilypond --pdf -o $pdfdir/$3 $workdir/$3.ly
  retcode=$?
fi

# remove the intermediate .ly file
rm -f $workdir/$3.ly

exit $retcode

Invoking from Scala

And here's some code that uses Process to call the script and return a scalaz Validation that contains either LilyPond's error messages or a file handle to the pdf:


import scala.sys.process.{Process, ProcessLogger}
import java.io.{InputStream, File}
import scalaz.Validation
import scalaz.Scalaz._
import org.streum.configrity.Configuration

trait Transcoder {
  
  def config: Configuration

  private def scriptHome = config[String]("transcode.scriptDir")
    
  // source
  private def abcHome = config[String]("transcode.abcDir")

  // destination
  private def pdfHome = config[String]("transcode.pdfDir")

  def transcode(abcName: String): Validation[String, File] = {
    import scala.collection.mutable.StringBuilder 

    val out = new StringBuilder
    val err = new StringBuilder

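    // ProcessLogger delivers one line at a time without its terminator, so restore it
    val logger = ProcessLogger(
      (o: String) => out.append(o).append("\n"),
      (e: String) => err.append(e).append("\n"))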

    val pb = Process(scriptHome + "abc2pdf.sh " + abcHome + " " + pdfHome + " " + abcName)
    val exitValue = pb.run(logger).exitValue

    exitValue match {
      case 0 => {val fileName = pdfHome  + abcName + ".pdf"
                 val file = new File(fileName)
                 file.success
                 }
      case _ => err.toString.fail
    }
  }  
}

config shows one way to use configrity. We could put our config values into a file (server.conf):


transcode {
   scriptDir = "/home/john/Development/Workspace/BlueEyes/tunedb/lilypond/"
   abcDir = "/var/data/music/abc/"
   pdfDir = "/var/data/music/pdf/"
}

We can then build a Transcoder in a test environment like this:


   class TestTranscoder extends Transcoder {
      override def toString = "test transcoder"
      override val config = {
        println("Loading configuration file")
        Configuration.load("/home/john/Development/Workspace/abcTranscode/conf/server.conf")
      }
    }

and we can test it like this:


    /** a test transcode with an abc file that exists */
    def testGoodABC() {
      val transcoder = new TestTranscoder()
      val validation = transcoder.transcode("Odeas")
      validation.fold( e => fail("file should have been transcoded"),
                       s => s.getName() should be ("Odeas.pdf"))
    }

    /** a test transcode with an abc file that does not exist */
    def testBadABC() {
      val transcoder = new TestTranscoder()
      val validation = transcoder.transcode("NotThere")
      validation.fold( e => (),
                       s => fail("PDF file should not be returned for invalid input"))
    }
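The Transcoder trait above leans on scalaz and configrity, which can obscure how little is needed to shell out. For comparison, here is a minimal standard-library sketch of the same pattern, with Either standing in for Validation. Shell and run are names invented for the example, and the command is passed as a Seq so that paths containing spaces are not split into separate arguments.

```scala
import scala.sys.process.{Process, ProcessLogger}

object Shell {
  // Run a command, returning Right(stdout) on exit code 0 and
  // Left(stderr) otherwise. Each captured line keeps its line break.
  def run(command: Seq[String]): Either[String, String] = {
    val out = new StringBuilder
    val err = new StringBuilder
    val logger = ProcessLogger(
      (line: String) => { out.append(line).append('\n'); () },
      (line: String) => { err.append(line).append('\n'); () })
    val exitCode = Process(command).!(logger)
    if (exitCode == 0) Right(out.toString) else Left(err.toString)
  }
}
```

A call such as Shell.run(Seq("sh", "-c", "echo oops >&2; exit 3")) yields a Left holding the error text, and the result can be folded over in much the same way as the Validation in the tests above. Note that a command that cannot be started at all raises an exception rather than returning a Left.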