
Friday, August 29, 2014

Functional Programming in Scala

Richard Feynman said "what I cannot create I do not understand", notoriously misquoted by Craig Venter as "what I cannot build I do not understand" when he inserted it as a watermark into the DNA of his bacterium with an artificial genome. But Venter has a software engineering approach to biology - the bacterium has a computer for a parent and he went through a series of debugging cycles before eventually the organism would 'boot up'. His statement is fundamental to the proper understanding of all sorts of disciplines, but none more so than software development.

Manning have today published Functional Programming in Scala by Rúnar Bjarnason and Paul Chiusano. The book is constructed around a carefully graded sequence of exercises in which you find yourself attempting to build things of increasing complexity. Its aim is to teach functional programming concepts through the medium of Scala, and this means that you must construct your programs from pure functions, free of side effects.  If, like me, you come from an OO background, this causes a massive rethink - there are whole swathes of the language that you just don't use.  For example, you can't reassign a variable; your functions mustn't throw exceptions; you will not develop class hierarchies.  You will use case classes, but their use is restricted to the creation of algebraic data types. I found that I had to go through the exercises very slowly and deliberately.  The introductory exercises are relatively straightforward and fun to attempt, and your confidence grows, but the material quite soon becomes very challenging indeed.  In fact, whilst studying the various revisions of this book, I was reminded yet again of Feynman - this time of his algorithm:
    Write down the problem.

    Think real hard.

    Write down the solution.
This, as opposed to an approach I've tended to use in OO:
   
    Write down the problem.

    do {
        Think a little bit.

        Type a little bit.

        See what happens

    } until problem solved.

    Write down the solution.  
The problem when lesser mortals attempt Feynman's algorithm is that it tends not to terminate. When it does work, however, you tend to end up with an elegant, terse solution, albeit one often incomprehensible to the casual reader. But the main thing is that you gain enormous understanding - you will get no benefit unless you attempt the exercises.

The material is presented tersely, with just enough explanation of the concept and the Scala syntax you need to express it.  Some of the problems are very tricky and I would occasionally find myself pondering a single question for an hour or two, struggling to get the types to match.  The satisfaction when you do tease out a solution is immense.  The accompanying material on the web site includes pro-forma Scala exercise files for each chapter with placeholders for the answers, usually in the form of stubs like this:
    def compose[A,B,C](f: B => C, g: A => B): A => C =
        ??? 
If you get stuck, hints and solutions for each question are provided, as is a fully fleshed-out answer file for each chapter.  Some later chapters use material developed in earlier ones and so, when attempting a new chapter, you need either to have completed the previous ones or to compile the supplied answers before you can proceed.
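By way of illustration, here is one possible way to fill in the compose stub above (my own attempt, not necessarily the book's model answer):

```scala
// One possible solution to the compose stub: apply g first, then f
def compose[A, B, C](f: B => C, g: A => B): A => C =
  a => f(g(a))

// e.g. increment, then double
val incThenDouble: Int => Int = compose[Int, Int, Int](_ * 2, _ + 1)
```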

There are four main sections. The first is an introduction to FP concepts and the authors have given a lot of thought to the progression of ideas.  This allows you gradually to gain confidence in manipulating functions until finally you're able to develop a lazily-evaluated Stream data type and then a purely functional representation of State. If you just get this far, it's of tremendous benefit.

Then the style of the book changes tack to some extent in three chapters devoted to functional design.  The idea is to illustrate some of the thought processes and alternative approaches that can be involved in groping your way towards the development of a combinator library that tackles a particular design space. Three completely different problems (parallelism, test frameworks and parsers) are used. This is illuminating but sometimes a little at odds with the progression of exercises because your approach and the authors' may legitimately diverge.  This means you may occasionally have to look ahead to the answers to keep yourself on track. What is interesting here is that the three different areas evolve solutions with a deep structure in common and you develop something of an intuition of what it is that makes Monads inevitable.

The third section then discusses Monoids, Monads, Functors and Applicatives.  This is familiar territory if you've read Learn You a Haskell but there is considerably more depth of coverage, and because you've spent so much time thinking about the diverse design problems in the previous section, you may well find that your understanding of these deep abstractions takes a more tangible form.

Section 4 deals with the question - if your program is completely pure, how do you deal with an effect such as IO?  There are many aspects to this and the authors review the strengths and limitations of the IO Monad and discuss alternative approaches to the problem of handling mutable state. The book ends with a fascinating chapter on Streaming IO.  This was a brave thing to do - the discussion is exploratory in style and feels its way towards a thorough explanation of the design principles behind the scalaz-stream library. It seems as though the chapter might have been developed in parallel with the development of the library itself, and you almost feel that you are one of the team. Surprisingly, no mention is made of scalaz; however, you will learn many of the design principles that underpin it.  (This won't give you everything you need for the scalaz syntax, though.  For this, I would recommend learning-scalaz.)

This is a book which has already taught me a great deal and given me a huge amount of pleasure.  It is one that I will keep going back to until I finally feel that I am getting the hang of that mysterious activity which is called functional programming.
 

Friday, May 16, 2014

A SKI Calculator

I have recently been attempting the exercises in Raymond Smullyan's To Mock a Mockingbird. This is a gentle and enjoyable introduction to combinatory logic. He chooses an applicative system of combinators where the objects are birds, and so he names them combinator birds. They live in a variety of different forests where the inhabitants of each forest have distinctive characteristics.

Combinator Birds

Here are some examples of Smullyan's birds:
  Bluebird: Bxyz = x(yz)
  Cardinal: Cxyz = xzy
  Warbler:  Wxy = xyy
  Identity: Ix = x 
These are conventionally written in a form which minimises the bracketing because the terms are considered to be left-associative. They can be re-written in a normal form which makes the association explicit - for example:
  Cardinal: Cxyz = (xz)y
  Warbler:  Wxy = (xy)y
In the early chapters you learn that you can build all the other birds in a forest by applying birds to other birds. In fact, the birds shown above (BCWI) form the basis for a whole class of birds, and this class can also be generated from a different set (BTMI). More strangely, you can build all the birds you need from just two:
  Starling: Sxyz = xz(yz)
  Kestrel:  Kxy = x
It is very easy to generate the Identity bird from these two, and conventionally you use all three to generate the others - hence the name SKI Calculus. The real interest is that this is a Turing-complete language - a fact that has been used in an elegant piece of work to show that the Scala type system is Turing-complete. But this is not my object here - rather it is to take up Smullyan's challenge on page 178 to write a program that converts any combinator bird to a SKI bird. It also makes sense to write an interpreter for a SKI bird so that, when it is applied to the appropriate variables (x, y, z etc.), the original combinator bird is reconstituted. The puzzles in chapter 18 lead you to discover a deterministic algorithm for doing this, which Smullyan sketches out in the answers (although be careful - unfortunately his explanation contains a couple of typos).
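To see just how easy the Identity bird is: SKKx = Kx(Kx) = x, so I = SKK. The reduction can be sketched by encoding S and K as curried Scala functions (an illustrative encoding of my own, not from the book):

```scala
// S and K encoded as curried Scala functions
def S[A, B, C](x: A => B => C)(y: A => B)(z: A): C = x(z)(y(z))
def K[A, B](x: A)(y: B): A = x

// The Identity bird emerges as SKK: SKKx = Kx(Kx) = x
def I[A](x: A): A = S[A, A => A, A](K[A, A => A])(K[A, A])(x)
```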

Parse Tree

A combinator bird is best represented by a binary tree which can distinguish between the bracketed forms in the following way:

This tree can be generated with the following very simple grammar which recognizes birds in normal form:
   expression ::= term term | term 
   term ::= terminal | bracket
   bracket ::= "(" expression ")" 
   terminal ::= "x"| "y" | "z" | "w" | "v" 
A parse tree in Scala then is of course:
   trait Tree
   case class Leaf(value: String) extends Tree
   case class Node(left: Tree, right: Tree) extends Tree
It is then straightforward to build a parser using the parser combinator library - but be careful - this has been moved to its own jar in Scala 2.11 - scala-parser-combinators_2.11.
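If you would rather avoid the library dependency, the normal-form grammar is simple enough for a hand-rolled recursive-descent parser. Here is a sketch (BirdParser is my own name; it redefines the Tree types so as to be self-contained, and assumes the input is already in normal form with no whitespace):

```scala
sealed trait Tree
case class Leaf(value: String) extends Tree
case class Node(left: Tree, right: Tree) extends Tree

object BirdParser {
  // parse an expression starting at position i; return the tree and the next position
  private def expression(s: String, i: Int): (Tree, Int) = {
    val (first, j) = term(s, i)
    if (j < s.length && s.charAt(j) != ')') {
      val (second, k) = term(s, j)   // expression ::= term term
      (Node(first, second), k)
    } else (first, j)                // expression ::= term
  }

  private def term(s: String, i: Int): (Tree, Int) =
    if (s.charAt(i) == '(') {        // bracket ::= "(" expression ")"
      val (t, j) = expression(s, i + 1)
      (t, j + 1)                     // skip the closing ")"
    } else (Leaf(s.charAt(i).toString), i + 1)  // terminal

  def parse(s: String): Tree = expression(s, 0)._1
}
```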

α-Elimination

The algorithm that converts a combinator bird to a SKI bird takes one variable at a time, starting with the outermost variable and removes it progressively from each node in the tree until it vanishes - being replaced by a tree of S,K and I. This process is called α-elimination and is an entirely mechanical process which involves invoking any or all of four Principles which are encoded as follows:
   // α-elimination of α alone is I (Iα = α)
   private def principle1 = I

   // α-elimination of X (α not in X) is KX (KXα = X)
   private def principle2(t: Tree) = Node(K, t)

   // α-elimination of Yα (α not in Y) is Y itself (Yα = Yα)
   private def principle3(t: Tree) = t

   // α-elimination of XY (where X and Y are themselves α-eliminated) is SXY
   private def principle4(l: Tree, r: Tree) = Node(Node(S, l), r)
where we have:
   object SKINodes {
     val S = Leaf("S")
     val K = Leaf("K")
     val I = Leaf("I") 
   }
This is not a very efficient algorithm because you have to look down the tree to detect whether the α-variable you are trying to replace exists in each branch, and it may not find the optimal SKI representation, but it is entirely deterministic, and once you have replaced each variable you obtain a tree whose leaves contain only S, K or I.
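Putting the four principles together, the driver for a single α-elimination might look something like this (a sketch under my reading of Smullyan's algorithm; eliminate and contains are hypothetical names, and the Tree types are redefined for self-containment):

```scala
sealed trait Tree
case class Leaf(value: String) extends Tree
case class Node(left: Tree, right: Tree) extends Tree

val S = Leaf("S"); val K = Leaf("K"); val I = Leaf("I")

// does the α-variable occur anywhere in this subtree?
def contains(t: Tree, v: String): Boolean = t match {
  case Leaf(x)    => x == v
  case Node(l, r) => contains(l, v) || contains(r, v)
}

// remove one variable v from the tree, replacing it with S, K and I
def eliminate(t: Tree, v: String): Tree = t match {
  case Leaf(x) if x == v                             => I          // Principle 1
  case _ if !contains(t, v)                          => Node(K, t) // Principle 2
  case Node(l, Leaf(x)) if x == v && !contains(l, v) => l          // Principle 3
  case Node(l, r) =>                                               // Principle 4
    Node(Node(S, eliminate(l, v)), eliminate(r, v))
  case _ => sys.error("unreachable for well-formed trees")
}
```

For example, eliminating y and then x from the Warbler (Wxy = (xy)y) yields SS(KI), one of its known SKI representations.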

SKI Interpretation

To interpret a SKI tree, you attach it to the required variables at its apex and then walk the tree. I have chosen a left side tree walk where I continue translating each branch until (on looking down it) all SKI nodes have been replaced. At each re-write I effectively apply the left hand operator to the right-hand tree. Because I am dealing with a binary tree but the S and K operators require more than one parameter, I accumulate 'partial' operators flowing up the tree until they are fully satisfied. This is done by adding extra transient nodes to the tree:
  // transient nodes used in SKI interpretation - essentially representations
  // of partial application of S, K or I
  case object S0 extends Tree
  case object K0 extends Tree
  case object I0 extends Tree
  case class K1(child: Tree) extends Tree
  case class S1(child: Tree) extends Tree
  case class S2(left: Tree, right: Tree) extends Tree
Again, this algorithm is not especially efficient because of the look-ahead, but it appears to be deterministic (certainly for the class of birds in Smullyan's forest). It will run out of stack space for outrageously complex birds because I have not bothered to protect the stack by means of tail recursion or trampolining. If you are interested, the code is here.

Wednesday, December 4, 2013

Lenses in Scala

I have just got back from Scala eXchange 2013 where the initial keynote session was from Simon Peyton Jones on Lenses: Compositional data access and manipulation. A presentation in pure Haskell was, in fact, quite a heavyweight start to a Scala conference. However, it turned out to be a lucid explanation of a beautiful abstraction, and one which, at first sight, would appear to be impossible to arrive at given the apparent inconsistency of the types. Have a look at Simon's presentation if you find time - it's well worth the effort.

A lens solves the problem of reading from, and updating, a deeply nested record structure in a purely functional way, but without getting mired in ugly, deeply nested and bracketed code. Anyway, the talk got me interested in finding out what was available with Scala. There seem to be two main contenders at the moment - from scalaz 7 and shapeless. I thought I'd compare the two, to see how they handle the simple name and address record structure used by Simon which is (in scala):
   case class Person (name: String, address: Address, salary: Int)
   case class Address (road: String, city: String, postcode: String)

   // sample Person
   val person = Person("Fred Bloggs", Address("Bayswater Road", "London", "W2 2UE"), 10000)

Lenses from Scalaz 7

A lens itself is a pretty simple affair, consisting only of a pair of functions, one for getting and another for setting the value that the lens describes, and doing this within the context of the container (i.e. the record). In scalaz, these are constructed with a good deal of boilerplate - there are two constructors available to you (lensg and lensu), one which curries the 'setting' function, and the other which does not:
    
   val addressLens = Lens.lensg[Person, Address] (
     a => value => a.copy(address = value),
     _.address
   )   
       
   // or
   
   val nameLens = Lens.lensu[Person, String] (
     (a, value) => a.copy(name = value),
      _.name
   )   
Notice that each lens is defined in terms of two types - that of the value itself and that of its immediate container. Typically, you would then have to get hold of a lens for every atomic value you might wish to access:
   
  val salaryLens = Lens.lensu[Person, Int] (
     (a, value) => a.copy(salary = value),
     _.salary
   )

   val roadLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(road = value),
     _.road
   )

   val cityLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(city = value),
     _.city
   )

   val postcodeLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(postcode = value),
     _.postcode
   )
This soon becomes tedious. However, the true usefulness of lenses comes to light when you compose them:
   
  val personCityLens = addressLens andThen cityLens
  val personPostcodeLens = postcodeLens compose addressLens
i.e. you can stay at the top of your record structure and reach down to the very bowels just by using one of these composite lenses. Scalaz, of course, offers you a shorthand hieroglyphic for these composition functions. So you could write them:
   
  val personCityLens = addressLens >=> cityLens
  val personRoadLens = roadLens <=< addressLens
Now you have your lenses defined, you can use them like this:
  
  def get = salaryLens.get(person)
  // res0: Int = 10000

  def update = salaryLens.set(person, 20000)    
  // res1: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),20000)

  def transform = salaryLens.mod(_ + 500, person) 
  // res2: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),10500)

  def transitiveGet = personCityLens.get(person)
  //res3: String = London

  def transitiveSet:Person = {
    val person1 = personCityLens.set(person, "Wakefield") 
    val person2 = personRoadLens.set(person1, "Eastmoor Road") 
    personPostcodeLens.set(person2, "WF1 3ST")
  }
  // res4: Person = Person(Fred Bloggs,Address(Eastmoor Road,Wakefield,WF1 3ST),10000)
 
With scalaz, you also get good integration with the other type classes you might need to use. So, for example, lenses can be lifted to behave monadically, because the %= method applies the update and returns a State Monad:
  
  def updateViaMonadicState = {
        val s = for {c <- personRoadLens %= {"124, " + _}} yield c
        s(person)
      }
  //res13: (Person, String) = (Person(Fred Bloggs,Address(124, Bayswater Road,London,W2 2UE),10000),124, Bayswater Road)

Lenses from Shapeless

By contrast, Shapeless gets rid of all this boilerplate and allows for composition during the construction of the lenses themselves. It uses a positional notation to define the target field of each lens:
  
  val nameLens     = Lens[Person] >> 0
  val addressLens  = Lens[Person] >> 1
  val salaryLens   = Lens[Person] >> 2
  val roadLens     = Lens[Person] >> 1 >> 0
  val cityLens     = Lens[Person] >> 1 >> 1
  val postcodeLens = Lens[Person] >> 1 >> 2
These can then be used in pretty much the same way as Scalaz 7's lenses. The main difference is that mod becomes modify and function application is curried:
  
  def get = salaryLens.get(person)
  // res0: Int = 10000

  def update = salaryLens.set(person)(20000)    
  // res1: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),20000)

  def transform = salaryLens.modify(person)( _ + 500) 
  // res2: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),10500)

  def transitiveGet = cityLens.get(person)
  // res3: String = London

  def transitiveSet:Person = {
    val person1 = cityLens.set(person)("Wakefield") 
    val person2 = roadLens.set(person1)("Eastmoor Road") 
    postcodeLens.set(person2)("WF1 3ST")
  }
  // res4: Person = Person(Fred Bloggs,Address(Eastmoor Road,Wakefield,WF1 3ST),10000)
As far as I can make out, Shapeless doesn't offer any monadic behaviour.

Which One to Use?

It seems to me that if you're tempted to use lenses, removal of boilerplate is critical. As yet, this doesn't seem possible with Scalaz 7, but I'm sure it's only a matter of time. There are other initiatives as well, for example, macrocosm, an exploration of scala macros, has dynamic lens creation, as does Rillit. But for me, at the moment, Shapeless seems astonishingly elegant.

Sunday, September 29, 2013

CORS Support in Spray

My tradtunedb service is working nicely, but there is a problem with it.  In order to hear a midi tune being played, it has to be rendered.  At the moment, this is done by converting it to a wav file on the server - an expensive process which doesn't scale well.  A better solution is to render the tune in javascript on the browser - but this presents a further problem.  The architecture is such that the front- and back-end have different host names.  And this means that the browser will prevent an XMLHttpRequest to the back-end because of potential security breaches.

Why CORS?

Browsers all maintain a same-origin policy.  One reason this is necessary is that XMLHttpRequest passes the user's authentication tokens. Suppose he is logged in to theBank.com with basic auth and then visits untrusted.com. If there were no restrictions, untrusted.com could issue its own XMLHttpRequest to theBank.com with full authorisation and gain access to that user's private data.

But a blanket ban on cross-origin requests also prohibits legitimate use.  For this reason, most modern browsers support Cross-Origin Resource Sharing (CORS). This is a protocol which allows the browser to negotiate with the back-end server to discover whether or not such requests from 'foreign' domains are allowed by the server, and then to make the actual requests.

When browsers issue requests, they always include the Origin header.  The essence of the protocol is that the server will respond with an Access-Control-Allow-Origin header if that Origin is acceptable to it.  Browsers will then allow the access to go ahead.  For example, here we have an XMLHttpRequest emanating from a script named midijs and hosted on a server running on localhost:9000:
Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Cache-Control:no-cache
Connection:keep-alive
Host:192.168.1.64:8080
Origin:http://localhost:9000
Pragma:no-cache
Referer:http://localhost:9000/midijs
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/28.0.1500.71 Chrome/28.0.1500.71 Safari/537.36
And here is a response from a back-end server that accepts the Origin:
Access-Control-Allow-Credentials:true
Access-Control-Allow-Origin:http://localhost:9000
Content-Length:2222
Content-Type:audio/midi; charset=UTF-8
Date:Sun, 29 Sep 2013 09:13:03 GMT
Server:spray-can/1.1-20130927
This is all that is necessary for GET requests.  Security issues are greater for requests that are not used for pure retrieval and/or where user credentials are supplied.  Here, CORS provides additional headers and a 'preflight request' mechanism which asks permission from the server before it makes the actual request. For more details see http://www.html5rocks.com/en/tutorials/cors/
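For instance, before a DELETE request to a foreign origin, the browser would first issue an OPTIONS 'preflight' request and proceed only if the response grants permission. An illustrative exchange (the values here are made up) might look like:

```
OPTIONS /musicrest/genre/irish/tune/sometune HTTP/1.1
Origin: http://localhost:9000
Access-Control-Request-Method: DELETE

HTTP/1.1 200 OK
Access-Control-Allow-Origin: http://localhost:9000
Access-Control-Allow-Methods: GET, POST, DELETE
Access-Control-Max-Age: 1728000
```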

CORS Support in Spray

The M8 release versions of Spray have no CORS support.  Initial support for the CORS headers has now been integrated into the forthcoming release of Spray. These can be accessed if you develop your own CORS directive which makes use of respondWith directives in order to set the appropriate headers. For example, here's an approach based on that of Cristi Boarlu that works very well:
import spray.http._
import spray.routing._
import spray.http.HttpHeaders._

trait CORSDirectives  { this: HttpService =>
  private def respondWithCORSHeaders(origin: String) =
    respondWithHeaders(      
      HttpHeaders.`Access-Control-Allow-Origin`(SomeOrigins(List(origin))),
      HttpHeaders.`Access-Control-Allow-Credentials`(true)
    )
  private def respondWithCORSHeadersAllOrigins =
    respondWithHeaders(      
      HttpHeaders.`Access-Control-Allow-Origin`(AllOrigins),
      HttpHeaders.`Access-Control-Allow-Credentials`(true)
    )

  def corsFilter(origins: List[String])(route: Route) =
    if (origins.contains("*"))
      respondWithCORSHeadersAllOrigins(route)
    else
      optionalHeaderValueByName("Origin") {
        case None => 
          route        
        case Some(clientOrigin) => {
          if (origins.contains(clientOrigin))
            respondWithCORSHeaders(clientOrigin)(route)
          else {
            // Maybe, a Rejection will fit better
            complete(StatusCodes.Forbidden, "Invalid origin")
          }      
        }
      }
}
And this is how it can be used in my case to protect a route which generates a response for a path that requests a particular file type (e.g. midi):
path(Segment / "tune" / Segment / Segment ) { (genre, tuneEncoded, fileType) =>  
  get { 
    val contentTypeOpt = getContentTypeFromFileType(fileType:String)
    if (contentTypeOpt.isDefined) {
      respondWithMediaType(contentTypeOpt.get.mediaType) { 
        corsFilter(MusicRestSettings.corsOrigin) {
          val tune = java.net.URLDecoder.decode(tuneEncoded, "UTF-8") 
          val futureBin = Tune(genre, tune).asFutureBinary(contentTypeOpt.get)
          _.complete(futureBin)
        }
      }
    }
    else {
     failWith(new Exception("Unrecognized file extension " + fileType))
    }
  } 
} ~    
There is also a discussion, started by Tim Perrett, about providing a cors directive that can wrap routes and just 'do the right thing' with preflight requests and appropriate access control headers, as unobtrusively as possible.

If you want to try this out yourself, you can try one of the release candidates that were published on October 23rd.

Embedded YouTube Videos and Chrome

I have just discovered a related problem to do with rendering embedded YouTube videos in Chrome. I want to extend my trad tunes web site with a comments page attached to each tune. The foot of the page will contain a form allowing the user to add a comment to the page, and this is particularly useful if she finds a video where the tune is being played. YouTube allows you to embed the video into the page which it manages by means of an iframe. Unfortunately, when she posts the tune, Chrome won't display it until the page is refreshed (although all other major browsers will). Instead it reports:
The XSS Auditor refused to execute a script in 'http://localhost:9000/genre/agenre/tune/atune/comments' because its source code was found within the request. The auditor was enabled as the server sent neither an 'X-XSS-Protection' nor 'Content-Security-Policy' header.
What's happening is that the browser must run javascript code to render the iframe, and it again thinks that this code emanates from the server that provided the page, which is thus making a cross-origin request. In this case, the solution is to add a response header instructing the browser to suspend the XSS Auditor check when it processes the page. This is achieved with the X-XSS-Protection header. For example, the Play Framework Action that processes a valid comment might respond like this:
  
    val comment = commentForm.bindFromRequest.get
    commentsModel.add(comment)
    Ok(views.html.comments(commentsModel.getAll, commentForm, genre, tune)(userName)).withHeaders("X-XSS-Protection"->"0")
When Chrome sees this header, it immediately renders the video. Note, this technique doesn't work if you issue a redirect instead of responding directly.

Wednesday, July 10, 2013

Spray Migration

Firstly, apologies for having been silent for such a long time - I've been looking after my parents who were ill and are now, thankfully, recovered. I'd like to say a huge thank-you to Pinderfields Hospital in Wakefield, Yorkshire for saving my dad's life.

Since I've been away, the Scala ecosystem has been moving on unabated. We now have the final release of Scala 2.10.2 and I've been anxious to upgrade my RESTful music application accordingly. This has been enjoyable, but has taken longer than expected, largely because Spray underwent a huge refactoring at version 1.0-M3 which incorporated lots of breaking changes. As of this writing, the current release is labelled 1.2-M8 and is consistent with Akka 2.2.0-RC1. Other libraries that I use have changed too, but Spray has caused the most impact because the application concentrates on content negotiation and Spray has altered this considerably.

The New Spray

So much is new.  The documentation has been completely overhauled and is now much more detailed. If you look at its navigation panel you can get an idea of the massive refactoring into a new module structure that has taken place. Both the github repository and the Spray repo have moved.  Packages have been renamed from “cc.spray...” to simply “spray...” and the group id has changed from “cc.spray” to “io.spray”.  There is a new DSL for testing routes. However, the changes with the biggest impact for me have been those to do with (un)marshalling (which has been moved to the new spray-httpx module) and those handling authentication.

Marshalling

You now use the Marshaller.of method to create a marshaller for a particular type (which can then be picked up implicitly in the routing). For example, if I want to create a marshaller for the handful of different binary types that I use, represented by BinaryImage:
  import spray.httpx.marshalling._
  import MediaTypes._

  class BinaryImage(mediaType: MediaType, stream: BufferedInputStream)

  implicit val binaryImageMarshaller = 
     Marshaller.of[BinaryImage] (`application/pdf`,
                                 `application/postscript`,
                                 `image/png`,
                                 `audio/midi`) {
       (value, requestedContentType, ctx) ⇒ {
           val bytes:Array[Byte] = Util.readInput(value.stream)
           ctx.marshalTo(HttpEntity(requestedContentType, bytes))     
           }
     }
Alternatively, if you have a type A with no marshaller, but you do have a marshaller for a type B, you can provide a marshaller for A by delegating to B's marshaller if you can provide a function from A to B. The definition is:
  def delegate[A, B](marshalTo: ContentType*)
                    (f: A => B)
                    (implicit mb: Marshaller[B]): Marshaller[A]

In my case, nearly all the content is represented as a type Validation[String, A] where the String type represents an error and A represents the various types of source or intermediate content. You can provide a meta-marshaller for a Validation which either handles the error or invokes the marshaller for the underlying type:
implicit def validationMarshaller[B](implicit mb: Marshaller[B]) =
    Marshaller[Validation[String, B]] { (value, ctx) ⇒
      value fold (
        e => throw new Exception(e),
        s ⇒ mb(s, ctx)
      )
    }

It seems rather infra dig to have to throw an exception from a Validation. At the moment it is necessary to do so because a Marshaller has no access to the HTTP response headers and so it is not possible (say) to marshall an error to a string-like content type if you've already told the marshaller that it's only dealing in binaries. However, Mathias from Spray has proposed introducing alternative Marshallers, one of which will have access to the headers in an upcoming release so this small blemish can then be removed.

Unmarshalling

Unmarshalling goes in the opposite direction and can be used to construct a user-defined type from a form submission. The interface here has changed a little too. For example:
  implicit val AbcUnmarshaller =
    Unmarshaller[AbcPost](`application/x-www-form-urlencoded`) {
      case HttpBody(contentType, buffer) => {         
        val notes =  new String(buffer, contentType.charset.nioCharset)
        AbcPost(notes)
      }
    }

As you can see, there is now a nice symmetry between Marshallers and Unmarshallers. They can be used within routes like this:
  post {    
    entity(as[AbcPost]) { abcPost =>     
       // do something

And as with marshallers, there are a whole set of basic unmarshallers provided and you can delegate between unmarshallers if you wish.

Authentication

This is another area where the API has changed and here the documentation is lacking. In my case, I need to perform simple authentication of a user name and password by checking with a backend database. This is best achieved by providing a UserPassAuthenticator. You construct one of these by passing it a function that takes an Optional UserPass and returns a Future which contains an Optional result. The standard approach is to return an encapsulation of the user name in a BasicUserContext if the validation passes and return None otherwise:
  val UserAuthenticator = UserPassAuthenticator[BasicUserContext] { userPassOption ⇒ Future(userPassOption match {
        case Some(UserPass(user, pass)) => {
            try {
              println("Authenticating: " + user + ":" + pass)
              if (TuneModel().isValidUser(user, pass) ) Some(BasicUserContext(user)) else None
            } 
            catch {
              case _ : Throwable => None
            }
        }
        case _ => None
      })
    }

I also need to identify the administrator user who has special powers. I do this by first using the UserAuthenticator to establish the login credentials and then composing this future with a check to establish the administrator name:
   
    val AdminAuthenticator = UserPassAuthenticator[BasicUserContext] { userPassOption => {
      val userFuture = UserAuthenticator(userPassOption)
      userFuture.map(someUser => someUser.filter(bu => bu.username == "administrator"))  
      }
    }

Once you have an authenticator, you can use it in routes like this:
   get {     
     authenticate(BasicAuth(UserAuthenticator, "musicrest")) { user =>    
       _.complete("user is valid")    
     }
   }

Testing

As an example of the new testing DSL, here is how you can test the authentication behaviour:
package org.bayswater.musicrest

import org.specs2.mutable.Specification
import spray.testkit.Specs2RouteTest
import spray.http._
import spray.http.HttpHeaders._
import StatusCodes._
import spray.routing._
import org.bayswater.musicrest.model.{TuneModel,User}

class AuthenticationSpec extends RoutingSpec with MusicRestService {
  def actorRefFactory = system
  
  val before = insertUser

  "The MusicRestService" should {
    "request authentication parameters for posted tunes" in {  
       Post("/musicrest/genre/irish/tune") ~> musicRestRoute ~> check 
         { rejection === AuthenticationRequiredRejection("Basic", "musicrest", Map.empty) }
    }
    "reject authentication for unknown credentials" in {
       Post("/musicrest/genre/irish/tune") ~>  Authorization(BasicHttpCredentials("foo", "bar")) ~> musicRestRoute ~> check 
         { rejection === AuthenticationFailedRejection("musicrest") }
    }
    "allow authentication for known credentials (but reject the lack of a POST body as a bad request)" in {
       Post("/musicrest/genre/irish/tune") ~>  Authorization(BasicHttpCredentials("test user", "passw0rd1")) ~> musicRestRoute ~> check 
         { rejection === RequestEntityExpectedRejection }
    }
    "don't allow normal users to delete tunes" in {
       Delete("/musicrest/genre/irish/tune/sometune") ~>  Authorization(BasicHttpCredentials("test user", "passw0rd1")) ~> musicRestRoute ~> check 
         { rejection === AuthenticationFailedRejection("musicrest") }
    }
    "allow administrators to delete tunes" in {
       Delete("/musicrest/genre/irish/tune/sometune") ~>  Authorization(BasicHttpCredentials("administrator", "adm1n1str80r")) ~> musicRestRoute ~> check 
         { entityAs[String] must contain("Tune sometune removed from irish") }
    } 
  }
   
  def insertUser = {
   val dbName = "tunedbtest"
   val collection = "user"
   val tuneModel = TuneModel()
   tuneModel.deleteUser("test user")
   tuneModel.deleteUser("administrator")
   val user1 = User("test user", "test.user@gmail.com", "passw0rd1")
   tuneModel.insertPrevalidatedUser(user1)    
   val user2 = User("administrator", "john.watson@gmx.co.uk", "adm1n1str80r")
   tuneModel.insertPrevalidatedUser(user2)    
  }
   
}

Tuesday, September 25, 2012

Scala Play Framework and Form Validation

Now that the web service is just about complete, I need to provide a web frontend. For this, I have chosen the Scala Play Framework (v 2.0.3). The tutorial introduction for this is excellent and, on the whole, Play 'just works' as advertised. The one slight problem I had was in working out how to do cross-field validation on a form. Validating individual fields is very simple. For example, you might define a form for collecting new user details and provide a regex constraint that determines what a well-constructed name should look like:

val registrationForm = Form(
    mapping (
      "name" -> (text verifying (pattern("""^[A-Za-z]([A-Za-z0-9_-]){5,24}$""".r, error="Names should start with a letter, be at least 5 characters, and may contain underscore or minus"))),
      "email" -> .....

Play associates both errors and help text with each input field in the form. When the form is rendered any error or help text is placed inside span elements that live alongside the field. So, all you need to do in the view is use the appropriate form helpers and these messages will display:

@helper.form(action = routes.Application.processNewUser, 'id -> "newuserform", 'enctype -> "application/x-www-form-urlencoded" ) {
          <fieldset>
            <legend class="new-user" >New User
             <ul>
             <li>
              @helper.inputText(registrationForm("name"), 'required -> "required",  '_class -> "newuser", '_help -> "" )
              </li>
              <li>
              @helper.inputText(registrationForm("email"), 'required -> "required", '_class -> "newuser")
              </li>
             ......
      }

However, if your validation constraint needs to consider multiple input fields, things get slightly more complicated.

Embedded Tuples

Suppose you have a registration form requiring your user to confirm his password. You need to provide both a main and confirmation input field on the form, but it is more efficient if you only have a single password field in the user registration model that underlies the form. So, your user may simply look like this:

case class User(name: String, email: String, password: String)

When you define the form, you can associate the two password fields together with an embedded tuple with a constraint that compares the two fields:

 val registrationForm = Form(
   mapping ( 
      .........
      "password" -> tuple(
        "main" -> text(minLength = 7),
        "confirm" -> text
       ).verifying (
        // Add an additional constraint: both passwords must match
        "Passwords don't match", passwords => passwords._1 == passwords._2
       ))   
      .......

Now you need to do a little more work in the view to persuade the errors to render. Here, we're defining the confirmation field. We have to define a couple of pseudo attributes (starting with an underscore) that explicitly set values which, in the simple forms seen earlier, had been implicit. The '_label attribute sets the label text and the '_error attribute associates the field with the constraint for the overall password:

    <li>
    @helper.inputPassword(registrationForm("password.confirm"), '_label -> "Confirm password", 'required -> "required", '_class -> "newuser",   '_error -> registrationForm.error("password") )
    </li>

This technique is used in the SignUp form in Play's samples.

Form-Level Validation

Alternatively you can provide constraints at the level of the overall form. These differ from the constraints seen so far because they are ad-hoc - in other words, they are not named and thus not attached to any particular input field. Suppose you have successfully registered your user but now want a login form that checks that the user has previously registered:

  val loginForm = Form(
    mapping (
      "name" -> text,
      "password" -> text
    ) (Login.apply)(Login.unapply)
    verifying ("user not registered", f => checkUserCredentials(f.name, f.password))
  ) 

In this case, you can render any error messages by invoking the globalError method and perhaps place it in a span element at the foot of the form:

   @loginForm.globalError.map { error =>
     @error.message
   }

Thursday, August 23, 2012

REST and Pagination

What is the RESTful way to perform pagination? To my mind, there is only one approach - URI parameters. For example, to return the first ten tunes from our repository:

    http://localhost:8080/musicrest/genre/irish/tune?page=1&size=10

A page is not itself a resource and so these parameters should not be part of the URI path. Nor should the HTTP Range header be hijacked for this purpose. Fortunately, Spray makes it ridiculously easy to fish out the parameters and provide defaults where necessary with its parameters combinator. For example, if we want to use a default page size of 10 we can use:

    pathPrefix("musicrest/genre") {
      path (PathElement / "tune" ) { genre => 
        parameters('page ? 1, 'size ? 10) { (page, size) =>
          get {  ctx => ctx.complete(TuneList(genre, page, size)) }
        }        
      }
    } ~
    ....
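Outside Spray, the defaulting that `parameters('page ? 1, 'size ? 10)` performs amounts to no more than this (a hand-rolled sketch over a query-parameter map; `paging` is my own name):

```scala
// Look up the optional query parameters, falling back to page 1 / size 10.
def paging(params: Map[String, String]): (Int, Int) = {
  val page = params.get("page").map(_.toInt).getOrElse(1)
  val size = params.get("size").map(_.toInt).getOrElse(10)
  (page, size)
}
```

The combinator saves you from writing this by hand for every route, and also rejects requests where the parameter fails to parse.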

The next problem is how best to represent the list of tunes that is returned. As we're developing a transcoding service, it makes sense to support both JSON and XML representations and to return the paging information alongside the tunes. These representations can be returned by means of an appropriate marshaller as discussed earlier. A JSON representation for a couple of tunes might be:

 {
    "tune": [
        {
            "title": "A Fig For A Kiss",
            "rhythm": "slip jig",
            "_id": "a+fig+for+a+kiss-slip+jig"
        },
        {
            "title": "Baltimore Salute, The",
            "rhythm": "reel",
            "_id": "baltimore+salute%2C+the-reel"
        }
    ],
    "pagination": {
        "page": "1",
        "size": "2"
    }
}   

and the equivalent XML representation would be:

  <tunes>
     <tune>
       <title>A Fig For A Kiss</title>
       <rhythm>slip jig</rhythm>
       <_id>a+fig+for+a+kiss-slip+jig</_id>
     </tune>
     <tune>
       <title>Baltimore Salute, The</title>
       <rhythm>reel</rhythm>
       <_id>baltimore+salute%2C+the-reel</_id>
     </tune>
     <pagination>
       <page>1</page>
       <size>2</size>
     </pagination>
  </tunes>

Before we look into how to implement this, I need to digress slightly and talk about the components I've chosen to integrate.

Components

It turns out that LilyPond was not really appropriate for transcoding the tunes. Although it produces scores of very high quality, it was slow, and the abc2ly utility would occasionally hiccup (for example, it would get confused by lead-in notes). Instead I now use abcMidi for midi production and abc2ps for postscript (which can then be transcoded to a variety of formats with the standard Linux convert tool). I had originally intended to use Blue Eyes MongoDB for Mongo integration but was put off by the large number of jars brought in through its dependency on Blue Eyes Core. Instead I now use Casbah which is very easy to work with. I'm still very pleased with Blue Eyes JSON, though. And finally, I have chosen Spray as the web service toolkit.

Implementing Paging

This is very straightforward with Mongo because the URI parameters very closely match Mongo's skip and limit functions. For example, the following Casbah query returns the data we need (here T represents the tune title and R its rhythm):

  def getTunes(genre: String, page: Int, size: Int): Iterator[scala.collection.Map[String, String]] = {
    val mongoCollection = mongoConnection(dbname)(genre)
    val q  = MongoDBObject.empty
    val skip = (page -1) * size
    val fields = MongoDBObject("T" -> 1, "R" -> 2)
    val s = MongoDBObject("T" -> 1)
    for {
      x <- mongoCollection.find(q, fields).sort(s).skip(skip).limit(size)
    } yield x.mapValues(v => v.asInstanceOf[String])    
  }
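The skip/limit arithmetic is worth seeing in isolation. On an in-memory sequence the same page is just `drop` and `take` (a sketch; `pageOf` is my own helper, with 1-based page numbers to match the URI parameters):

```scala
// In-memory analogue of Mongo's skip/limit.
def pageOf[A](items: Seq[A], page: Int, size: Int): Seq[A] =
  items.drop((page - 1) * size).take(size)

val titles = (1 to 25).map(n => "tune" + n)
```

A final short page falls out naturally: page 3 of 25 items at size 10 has just 5 entries.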

We can use the Iterator returned by this query to populate our TuneList class, and provide a toJSON method to produce the output we require:

class TuneList(i: Iterator[scala.collection.Map[String, String]], page: Int, size: Int) {
 
  def toJSON: String = {
    val quotedi = i.map( cols => cols.map( c => c match {
      case ("T", x) => "\"title\"" + ": " + "\"" + x + "\" "
      case ("R", x) => "\"rhythm\"" + ": " + "\"" + x + "\" "
      case (a, b)   => "\"" + a + "\"" + ": " + "\"" + b + "\" "
    }).mkString("{", ",", "}"))
    quotedi.mkString("{ \"tune\": [", ",", "], " + pageInfoJSON + "  }") 
  }
   
  private val pageInfoJSON:String = " \"pagination\" : { \"page\""  + ": \"" + page + "\" ," + "\"size\""  + ": \"" + size + "\" }"
  
}

object TuneList {
  def apply(genre: String, page: Int, size: Int):TuneList = new TuneList(TuneModel().getTunes(genre, page, size), page, size)
}   

Finally, we can add a toXML method for the XML output, made very easy by the Blue Eyes JSON library:

  def toXML: String = {
      val jvalue = JsonParser.parse(toJSON)
      "<tunes>" + Xml.toXml(jvalue).toString + "</tunes>"
  }

I find it much simpler to generate XML from JSON rather than the other way round.

Friday, July 27, 2012

Content Negotiation with Spray

Please note: this article describes Spray 1.0-M2. The Spray API has altered considerably since this was written - see the more recent post on Spray Migration.

Spray is another attractive framework for building RESTful web services. Again it uses combinators for parsing requests in a similar manner to unfiltered, but it offers considerably more options for composing them together, very nicely described here. Spray's approach to content negotiation differs from unfiltered's. By and large, the set of supported MIME types, and the matching of appropriate responses to requests, remains hidden. Having matched a request, you would typically return a response using the complete function:

    path("whatever") {
      get { requestContext => requestContext.complete("This is the response") }
    }

This has the ability to return a variety of MIME types because each variant of the complete method on RequestContext takes a Marshaller as an implicit parameter:

   def complete [A] (obj: A)(implicit arg0: Marshaller[A]): Unit

I like this because it is likely that you will want to model each RESTful resource with a dedicated type. Then, if you simply provide a marshaller for that type in implicit scope, you have a type-safe response mechanism. It is possible for a marshaller to take responsibility for more than one MIME type, and the Spray infrastructure will then silently handle the content negotiation for you, hooking up an appropriate response content type to that requested. For example, if we want to reproduce the music service so far implemented with unfiltered, we can do it like this:

import cc.spray._
import cc.spray.directives._

import org.bayswater.musicrest.typeconversion.MusicMarshallers._

trait MusicRestService extends Directives {
  
  val musicRestService = {
    path("musicrest") {
      get { _.complete("Music Rest main options") }
    } ~
    path("musicrest/genre") { 
      get { _.complete("List of genres") }
    } ~
    path("musicrest/genre" / PathElement ) { genre =>
      get { _.complete("Genre " + genre) }
    } ~
    pathPrefix("musicrest/genre") {
      path (PathElement / "tune" ) { genre => 
        get { _.complete("List of Tunes" ) }
      } ~
      path (PathElement / "tune" / PathElement ) { (genre, tune) =>         
        get { _.complete(Tune(genre, tune) ) }
      }
    } 
  }
  
}

Notice that when we want to match a URL that represents a tune, we use a Tune type with marshallers for the tune in scope. The marshaller (at the moment) simply encodes Strings, but it is all that is needed to generate the required Content-Type header (pdf, midi, json etc.):

import cc.spray.typeconversion._
import cc.spray.http._
import HttpCharsets._
import MediaTypes._
import java.nio.CharBuffer

trait MusicMarshallers {
  
  val `audio/midi` = MediaTypes.register(CustomMediaType("audio/midi", "midi"))
  
  case class Tune(genre: String, name: String)
  
  implicit lazy val TuneMarshaller = new SimpleMarshaller[Tune] {
    val canMarshalTo = ContentType(`text/plain`) ::
                       ContentType(`application/pdf`) :: 
                       ContentType(`audio/midi`) :: 
                       ContentType(`application/json`) :: 
                       Nil
                       
    def marshal(tune: Tune, contentType: ContentType) = {
      val content = "genre " + tune.genre + " name " + tune.name 
      val nioCharset = contentType.charset.getOrElse(`ISO-8859-1`).nioCharset
      val charBuffer = CharBuffer.wrap(content)
      val byteBuffer = nioCharset.encode(charBuffer)
      HttpContent(contentType, byteBuffer.array)
    }
  }
}

object MusicMarshallers extends MusicMarshallers  

Spray provides a registry of the most common MIME types but this will not include our audio/midi type until the next release. For the time being, we must register the type ourselves.

Posted Messages

Suppose we wanted to offer a transcoding service, where users post tunes in ABC format and we return them transcoded to the requested type. We can manage this by defining an Abc type and adding an Unmarshaller to MusicMarshallers for that type which converts the POST body.

  case class Abc(notes: String)

  implicit lazy val AbcUnmarshaller = new SimpleUnmarshaller[Abc] {
    val canUnmarshalFrom = ContentTypeRange(`text/plain`)  :: Nil

    def unmarshal(content: HttpContent) = protect {
      val notes =  new String(content.buffer, content.contentType.charset.getOrElse(`ISO-8859-1`).nioCharset)
      Abc(notes)
    }
  }  

Spray uses the same approach for unmarshalling that it uses for marshalling. If an implicit unmarshaller for the type is in scope, it will be used to construct the type when we use the content function. The type signatures are:

  def as [A] (implicit arg0: Unmarshaller[A]): Unmarshaller[A]

  def content [A] (unmarshaller: Unmarshaller[A]): SprayRoute1[A]

which we can then use like this:

    path("musicrest/transcode") { 
      post { 
        content(as[Abc]) { abc =>      
          val tune:Tune = parseAbc(abc)
          _.complete(tune) 
        }
      }
    }

and because we're completing with a Tune type, marshalling to the requested type again happens silently. We've had to write our own marshallers and unmarshallers because we're using relatively uncommon MIME types, but Spray provides default marshallers/unmarshallers for the more common types. Incidentally, Spray seems to be refreshingly strict in its adherence to the HTTP spec. It appears that Chrome has a bug when submitting forms encoded as text/plain: it mistakenly issues an empty boundary (as if it were a multipart type):

Content-Type:text/plain; boundary=

which is immediately rejected by Spray.

Testing

Spray supplies a SprayTest trait which allows you to test the routing logic directly (without having to fire up the container). For example:

 "The MusicRestService" should {
    "return reasonable content for GET requests to the musicrest tune path" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(acceptHeader("text/plain")))) {
        musicRestService
      }.response.content.as[String] mustEqual Right("genre irish name odeas")
    }
  }

It is possible, too, to ensure that the correct Content-Type is produced. Firstly, make sure that the various HTTP headers, charsets and media types are in scope:

import cc.spray.http.HttpHeaders._ 
import cc.spray.http.HttpCharsets._ 
import cc.spray.http.MediaTypes._ 
import org.bayswater.musicrest.typeconversion.MusicMarshallers.`audio/midi`

At this stage in the proceedings, the Content-Type header has not yet been generated. Instead it is part of the content:

    "return PDF MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`application/pdf`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`application/pdf`, `ISO-8859-1`)) 
    }
    "return text/plain MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`text/plain`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`text/plain`, `ISO-8859-1`)) 
    }
    "return audio/midi MIME type for GET tune requests for that type" in {
      testService(HttpRequest(GET, "/musicrest/genre/irish/tune/odeas", List(Accept(`audio/midi`)))) {
        musicRestService
      }.response.content.map(_.contentType) === Some(ContentType(`audio/midi`, `ISO-8859-1`)) 
    }

We can also, of course, use Dispatch for client-server testing like we did before, or we can use Spray Client which allows you to do the same sort of marshalling and unmarshalling at the client side.

Thursday, July 5, 2012

REST and Content Negotiation

Suppose you have a resource (in our case a piece of music) and you want to provide it in a variety of formats (plain text, pdf, midi etc.) - then what is the RESTful best practice? Opinion seems divided: either you represent the resource just once and then provide a best-fit response by inspecting the Accept header, or you represent each form of the resource with a separate URL. I tend to favour the former and want to investigate the support given by unfiltered and Blue Eyes.
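The first option - one URL, best-fit response - boils down to matching the Accept header against the representations the service supports. A framework-free sketch of that matching (my own `negotiate` helper; real negotiation also honours q-values and wildcards, which this ignores):

```scala
// Return the first type in the Accept header that the service supports.
def negotiate(accept: String, supported: Seq[String]): Option[String] = {
  // Strip any parameters (e.g. ";q=0.8") and keep the types in the order offered.
  val requested = accept.split(",").map(_.trim.takeWhile(_ != ';').trim)
  requested.collectFirst { case t if supported.contains(t) => t }
}
```

A None result maps naturally onto a 406 Not Acceptable response.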

Browser Tools

Before we start, it's helpful to do the initial testing from a browser. ModHeader is a simple Chrome plugin that allows you to manipulate the request headers. Here we're restricting the effect just to localhost:8080 and we're setting the MIME type of the request to application/json:




You can then, of course, inspect the request and response headers using the standard Chrome development tools.

unfiltered

Unfiltered is an extremely lightweight toolkit that focuses on the problem at hand. It acts as a Servlet filter and so is dealing fundamentally with HttpServletRequest and HttpServletResponse objects. However, it delivers a very intuitive Scala API built round the type ResponseFunction = HttpServletRequest => HttpServletResponse. These functions can be composed together in a variety of ways allowing us to recognize the different combinations of request headers we need to identify or to build up appropriate combinations of response headers. Conventional scala pattern-matching is available to allow us to distinguish the various RESTful URLs we need. Finally, these mappings are gathered together in a Plan which is defined as PartialFunction[HttpServletRequest, ResponseFunction]. The portion of a Plan that handles our tunes might look like this:

  def intent = { 
    case req @ GET(Path(Seg("musicrest" :: genre :: "tune" :: tune :: Nil))) => req match {
        case MusicAccepts.Pdf(_) =>  PdfContent ~> ResponseString("Get PDF request genre: " + genre + " tune: " + tune)
        case MusicAccepts.Midi(_) => MidiContent ~> ResponseString("Get midi request genre: " + genre + " tune: " + tune)
        case Accepts.Json(_) => JsonContent ~> ResponseString("Get JSON request genre: "  + genre + " tune: " + tune)
        case _ =>    Ok ~> ResponseString("Get request for genre: " + genre + " tune: " + tune)
    }

    // other Plan cases omitted
  }

At the moment, this simply returns a String response that reflects the request, but the MIME type of the response accords with the type we eventually intend to return. Unfiltered natively handles common MIME types such as XML or JSON but when we stray from the straight and narrow with PDF or MIDI we need to provide a little more help. The DSL is easily extended like this:

import unfiltered.request.Accepts 
import unfiltered.response.ContentType

/** Accepts request header extractor for music extensions */
object MusicAccepts {

  object Midi extends Accepts.Accepting {
    val contentType = "audio/midi"
    val ext = "midi"
  }

  object Pdf extends Accepts.Accepting {
    val contentType = "application/pdf"
    val ext = "pdf"
  }
}

/** Response content type for music extensions */
object MidiContent extends ContentType("audio/midi")

Testing

Dispatch is a Scala wrapper around Apache HttpClient. We can use this to run some integration tests. (I currently use ShouldMatchers with JUnit.) Here is a test that confirms that the MIME types for PDF content agree:

  private val  localHost = :/("localhost", 8080)   
  private val genre = "irish" 
  private val tune = "odeas"

  def testMimePdf() {
    val h = new Http
    val headers = Map("Accept" -> "application/pdf")
    val req = localHost / "musicrest" / genre / "tune" / tune
    h(req <:< headers >:> { hs =>
      val contentType: Option[String] = hs("Content-Type").headOption
      contentType.map{_ should be ("application/pdf")}.getOrElse(fail("No Content-type header"))
    })
  }

Dispatch uses a rich set of operators for its combinators and these need a bit of explaining. In this example, >:> extracts the response headers. These operators are nicely summarised here. If you need to inspect both the headers and the content of the response, you can use the >+ operator to fork a handler into two (returned as a tuple):

  def testMimeJson() {
    val h = new Http
    val headers = Map("Accept" -> "application/json")
    val req = localHost / "musicrest" / genre / "tune" / tune
    val ans = h(req <:< headers >+ { r =>
      (r >:> { hs =>
         val contentType: Option[String] = hs("Content-Type").headOption
         contentType.map{_ should be ("application/json; charset=utf-8")}.getOrElse(fail("No Content-type header"))
       },
       r >- { content =>
         content should be ("Get JSON request genre: irish tune: odeas")
       })
    })
  }

Blue Eyes

Blue Eyes is more ambitious - attempting to be a complete web framework for RESTful services, implemented from scratch using pure functional techniques. One departure from unfiltered is that it does not recognize the Accept header - instead it requires requests to use a Content-Type header. I think this is a legitimate approach - just not the one I was looking for. It uses a similar combinatorial design, and has very good support for common types (particularly XML and JSON). Here's a snippet showing how you can match tune requests and return appropriate responses in these formats:

       } ~
       describe ("get tune  within a genre and type") {
          path("/genre" / 'genreName / "tune" / 'tune) {
            jvalue {
              get { request: HttpRequest[Future[JValue]] =>
                val genre = request.parameters('genreName) 
                val tune  = request.parameters('tune)
                val jTune: Future[HttpResponse[JValue]] = TuneModel(musicRestConfig).jasonTune(genre, tune)
                jTune
              }
            } ~
            xml {
              get { request: HttpRequest[Future[NodeSeq]]  =>
                val genre = request.parameters('genreName)
                val tune  = request.parameters('tune)
                val xmlTune: Future[HttpResponse[NodeSeq]] = TuneModel(musicRestConfig).xmlTune(genre, tune)
                xmlTune
              }
            }
          }
        } ~

Here the tilde combinator allows you to join the partial functions that handle each URL - pattern a or else pattern b or else .... The jvalue and xml combinators handle the mapping between incoming and outgoing Content-Types.

I did not find it easy to provide similar combinators for my music types. I think my difficulty was largely because these combinators are expressed in terms of Bijections, which are mappings in both directions between a type A and a type B. It wasn't clear to me how this mapped onto the problem at hand. According to forum answers, there is an outer scope whose currency is A and an inner scope whose currency is B. But I simply needed to negotiate the type and then map from the type I actually had to the type agreed. In any case, the forum answers were immensely helpful and there appears to be some possibility that the Bijection approach may be rethought.

I did like the JSON and Mongo libraries that are supplied as separate jars. The JSON library is derived from Lift Json whereas the Mongo library is new and looks as if it might be a viable alternative to Casbah. But this will have to wait for another post.
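For what it's worth, the Bijection idea itself is small; the difficulty lies in how Blue Eyes threads it through the combinators. Here is a minimal sketch of the abstraction as described (my own trait, not Blue Eyes' actual definition):

```scala
// A two-way mapping between A and B, invertible in either direction.
trait Bijection[A, B] { self =>
  def apply(a: A): B
  def unapply(b: B): A
  def inverse: Bijection[B, A] = new Bijection[B, A] {
    def apply(b: B): A = self.unapply(b)
    def unapply(a: A): B = self.apply(a)
  }
}

// For example, a bijection between Ints and their decimal string forms.
val intToString = new Bijection[Int, String] {
  def apply(i: Int): String = i.toString
  def unapply(s: String): Int = s.toInt
}
```

The content-type combinators presumably use such a pair to convert between the wire representation and the typed value in both the request and response directions.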

The Way Forward

I think, as things stand, I will most likely use unfiltered for pattern matching and use the Blue Eyes Mongo and JSON libraries for the heavy lifting. But I am very interested in tracking Blue Eyes core to see how it turns out.

Wednesday, June 20, 2012

Delving Deeper into Scala

OK - here's the plan.  Although I've been using Scala commercially for a couple of years, it's been in a somewhat niche area - XML messaging.  I'd like to become a little more familiar with some of the newer frameworks and toolkits that are emerging.

One of my interests is playing traditional music.  Trad musicians tend to exchange tunes in abc notation - a very simple format developed by Chris Walshaw.  Various sites provide tune repositories which also display the dots in a conventional music stave - the one I mostly use is The Session. For example, here's the abc for one of my favourite jigs - O'Dea's:

X: 1
T: O'Dea's
M: 6/8
L: 1/8
R: jig
K: Gmaj
|: G3 GBd|BGD E3|DGB d2 d|edB def|
g3 ged|ege edB|dee edB|gdB A2 B:|
|: c3 cBA|Bdd d2 e|dBG GBd|edB AFD|
GBd gag|ege edB|dee edB|gdB A3:|

The tune goes like this:


I intend to build a RESTful service for traditional tunes. Users would post tunes in abc notation within particular genres. They could then request a given tune in a variety of different formats (for example: plain text (abc), pdf, JSON, midi etc.). I would also probably have a simple one-off transcoding service that didn't save the tune but would allow users to experiment with the abc. All this has been done before, of course, in various places - tunedb has a huge collection, for example. But my main motivation is to investigate the Scala landscape - my current plan is to investigate MongoDB, unfiltered, Blue Eyes, configrity and scalaz 7.

Transcoding

So the service is to be built round the ability to transcode abc into various other formats.  To do this, I have chosen LilyPond, which produces very high quality pdf images of scores and supports other formats too (such as midi). It also has an add-on that converts abc into its native .ly format.

One drawback to LilyPond is that it only offers a command-line interface, so I'll have to shell out using scala.sys.process. Here's a bash script (abc2pdf.sh) that invokes it:

#!/bin/bash
#############################################
#
# transcode abc to pdf format
#
# usage: abc2pdf.sh srcdir destdir tunename
#
#############################################

EXPECTED_ARGS=3

if [ $# -ne $EXPECTED_ARGS ]
then
  echo "Usage: `basename $0` {srcdir destdir tunename}"
  exit 1
fi

# source
abcdir=$1
if [ ! -d $abcdir ]
then
  echo "$abcdir not a directory" >&2   # Error message to stderr.
  exit 1
fi  

# destination
pdfdir=$2
if [ ! -d $pdfdir ]
then
  echo "$pdfdir not a directory" >&2   
  exit 1
fi  

# temporary work directory (we'll reuse src for the time being)
workdir=$1

# source file
srcfile=${abcdir}/${3}.abc
if [ ! -f $srcfile ]
then
  echo "no such file $srcfile" >&2  
  exit 1
fi  

# transcode from .abc to .ly
abc2ly -o $workdir/$3.ly $abcdir/$3.abc
retcode=$?
echo "abc return: " $retcode

# transcode from .ly to .pdf
if [ $retcode -eq 0 ]; then
  echo "attempting to transcode ly to pdf"
  lilypond --pdf -o $pdfdir/$3 $workdir/$3.ly
  retcode=$?
fi

# remove the intermediate .ly file
rm -f $workdir/$3.ly

exit $retcode

Invoking from Scala

And here's some code that uses Process to call the script and return a scalaz Validation that contains either LilyPond's error messages or a file handle to the pdf:

import scala.sys.process.{Process, ProcessLogger}
import java.io.{InputStream, File}
import scalaz.Validation
import scalaz.Scalaz._
import org.streum.configrity.Configuration

trait Transcoder {
  
  def config: Configuration

  private def scriptHome = config[String]("transcode.scriptDir")
    
  // source
  private def abcHome = config[String]("transcode.abcDir")

  // destination
  private def pdfHome = config[String]("transcode.pdfDir")

  def transcode(abcName: String): Validation[String, File] = {
    import scala.collection.mutable.StringBuilder 

    val out = new StringBuilder
    val err = new StringBuilder

    val logger = ProcessLogger(
      (o: String) => out.append(o),
      (e: String) => err.append(e))

    val pb = Process(scriptHome + "abc2pdf.sh " + abcHome + " " + pdfHome + " " + abcName)
    val exitValue = pb.run(logger).exitValue

    exitValue match {
      case 0 => {val fileName = pdfHome  + abcName + ".pdf"
                 val file = new File(fileName)
                 file.success
                 }
      case _ => err.toString.fail
    }
  }  
}
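If you'd rather not pull in scalaz just for this, the standard library's Either carries the same success-or-error shape. A sketch mirroring the exit-code handling above (my own `checkTranscode` helper; the File creation is elided):

```scala
// Success carries the pdf path; failure carries the tool's error output.
def checkTranscode(exitValue: Int, errOutput: String, pdfPath: String): Either[String, String] =
  if (exitValue == 0) Right(pdfPath) else Left(errOutput)
```

scalaz's Validation adds error accumulation and a richer combinator set, which is why I prefer it here, but the basic fold-on-success-or-failure pattern is identical.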

config shows one way to use configrity. We could put our config values into a file (server.conf):

transcode {
   scriptDir = "/home/john/Development/Workspace/BlueEyes/tunedb/lilypond/"
   abcDir = "/var/data/music/abc/"
   pdfDir = "/var/data/music/pdf/"
}

we can then build a Transcoder in a test environment like this:

   class TestTranscoder extends Transcoder {
      override def toString = "test transcoder"
      override val config = {
        println("Loading configuration file")
        Configuration.load("/home/john/Development/Workspace/abcTranscode/conf/server.conf")
      }
    }

and we can test it like this:

    /** a test transcode with an abc file that exists */
    def testGoodABC() {
      val transcoder = new TestTranscoder()
      val validation = transcoder.transcode("Odeas")
      validation.fold( e => fail("file should have been transcoded"),
                       s => s.getName() should be ("Odeas.pdf"))
    }

    /** a test transcode with an abc file that does not exist */
    def testBadABC() {
      val transcoder = new TestTranscoder()
      val validation = transcoder.transcode("NotThere")
      validation.fold( e => (),
                       s => fail("PDF file should not be returned for invalid input"))
    }

Monday, June 11, 2012

Scala in Depth

Josh Suereth's long-awaited book was published a couple of weeks ago.  It is by no means an introductory text - it seems to be aimed predominantly at java developers who already have a good grasp of scala syntax.

I feel that one of the main threats to the scala community is the possibility of it polarising into OO and pure functional factions.  I found Josh's book tremendously helpful in attempting to establish a truly idiomatic scala style and this is not an easy goal to achieve. Throughout the book, he develops rules which summarise best practice.  As expected, these encourage a more functional style, but he is not afraid to say where their over-use becomes awkward, or to point out scala constructs that are best avoided.  At all times, he has in the back of his mind the professional programmer, working in a team, and he wants to make the environment as productive as possible, with no nasty surprises.

For me, the most important chapters were those on implicits and the type system, which are treated very thoroughly and provide a very understandable introduction to the coverage of type classes in the final chapter. As such, this book may also act as a wonderful introduction to Functional Programming in Scala when it arrives.