Elucubrations: 2013

Wednesday, December 4, 2013

Lenses in Scala

I have just got back from Scala eXchange 2013 where the initial keynote session was from Simon Peyton Jones on Lenses: Compositional data access and manipulation. A presentation in pure Haskell was, in fact, quite a heavyweight start to a Scala conference. However, it turned out to be a lucid explanation of a beautiful abstraction, and one which, at first sight, would appear to be impossible to arrive at given the apparent inconsistency of the types. Have a look at Simon's presentation if you find time - it's well worth the effort.

A lens solves the problem of reading from, and updating, a deeply nested record structure in a purely functional way, but without getting mired in ugly, deeply nested and bracketed code. Anyway, the talk got me interested in finding out what was available with Scala. There seem to be two main contenders at the moment - from scalaz 7 and shapeless. I thought I'd compare the two, to see how they handle the simple name and address record structure used by Simon which is (in scala):

   case class Person (name: String, address: Address, salary: Int)
   case class Address (road: String, city: String, postcode: String)

   // sample Person
   val person = Person("Fred Bloggs", Address("Bayswater Road", "London", "W2 2UE"), 10000)

Lenses from Scalaz 7

A lens itself is a pretty simple affair, consisting only of a pair of functions, one for getting and another for setting the value that the lens describes, and doing this within the context of the container (i.e. the record). In scalaz, these are constructed with a good deal of boilerplate - there are two constructors available to you (lensg and lensu), one which curries the 'setting' function, and the other which does not:

    
   val addressLens = Lens.lensg[Person, Address] (
     a => value => a.copy(address = value),
     _.address
   )   
       
   // or
   
   val nameLens = Lens.lensu[Person, String] (
     (a, value) => a.copy(name = value),
      _.name
   )

Notice that each lens is defined in terms of two types - that of the value itself and that of its immediate container. Typically, you would then have to get hold of a lens for every atomic value you might wish to access:

   
  val salaryLens = Lens.lensu[Person, Int] (
     (a, value) => a.copy(salary = value),
     _.salary
   )

   val roadLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(road = value),
     _.road
   )

   val cityLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(city = value),
     _.city
   )

   val postcodeLens = Lens.lensu[Address, String] (
     (a, value) => a.copy(postcode = value),
     _.postcode
   )

This soon becomes tedious. However, the true usefulness of lenses comes to light when you compose them:

   
  val personCityLens = addressLens andThen cityLens
  val personPostcodeLens = postcodeLens compose addressLens

i.e. you can stay at the top of your record structure and reach down to the very bowels just by using one of these composite lenses. Scalaz, of course, offers you a shorthand hieroglyphic for these composition functions. So you could write them:

   
  val personCityLens = addressLens >=> cityLens
  val personRoadLens = roadLens <=< addressLens

Now you have your lenses defined, you can use them like this:

  
  def get = salaryLens.get(person)
  // res0: Int = 10000

  def update = salaryLens.set(person, 20000)    
  // res1: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),20000)

  def transform = salaryLens.mod(_ + 500, person) 
  // res2: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),10500)

  def transitiveGet = personCityLens.get(person)
  //res3: String = London

  def transitiveSet:Person = {
    val person1 = personCityLens.set(person, "Wakefield") 
    val person2 = personRoadLens.set(person1, "Eastmoor Road") 
    personPostcodeLens.set(person2, "WF1 3ST")
  }
  // res4: Person = Person(Fred Bloggs,Address(Eastmoor Road,Wakefield,WF1 3ST),10000)

With scalaz, you also get good integration with the other type classes you might need to use. So, for example, lenses can be lifted to behave monadically because the %= method applies the update and returns a State Monad.:

  
  def updateViaMonadicState = {
        val s = for {c <- personRoadLens %= {"124, " + _}} yield c
        s(person)
      }
  //res13: (Person, String) = (Person(Fred Bloggs,Address(124, Bayswater Road,London,W2 2UE),10000),124, Bayswater Road)

Lenses from Shapeless

By contrast, Shapeless gets rid of all this boilerplate and allows for composition during the construction of the lenses themselves. It uses a positional notation to define the target field of each lens:

  
  val nameLens     = Lens[Person] >> 0
  val addressLens  = Lens[Person] >> 1
  val salaryLens   = Lens[Person] >> 2
  val roadLens     = Lens[Person] >> 1 >> 0
  val cityLens     = Lens[Person] >> 1 >> 1
  val postcodeLens = Lens[Person] >> 1 >> 2

These can then be used in pretty much the same way as Scalaz 7's lenses. The main difference is that mod becomes modify and function application is curried:

  
  def get = salaryLens.get(person)
  // res0: Int = 10000

  def update = salaryLens.set(person)(20000)    
  // res1: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),20000)

  def transform = salaryLens.modify(person)( _ + 500) 
  // res2: Person = Person(Fred Bloggs,Address(Bayswater Road,London,W2 2UE),10500)

  def transitiveGet = cityLens.get(person)
  // res3: String = London

  def transitiveSet:Person = {
    val person1 = cityLens.set(person)("Wakefield") 
    val person2 = roadLens.set(person1)("Eastmoor Road") 
    postcodeLens.set(person2)("WF1 3ST")
  }
  // res4: Person = Person(Fred Bloggs,Address(Eastmoor Road,Wakefield,WF1 3ST),10000)

As far as I can make out, Shapeless doesn't offer any monadic behaviour.

Which One to Use?

It seems to me that if you're tempted to use lenses, removal of boilerplate is critical. As yet, this doesn't seem possible with Scalaz 7, but I'm sure it's only a matter of time. There are other initiatives as well, for example, macrocosm, an exploration of scala macros, has dynamic lens creation, as does Rillit. But for me, at the moment, Shapeless seems astonishingly elegant.

Sunday, September 29, 2013

CORS Support in Spray

My tradtunedb service is working nicely, but there is a problem with it. In order to hear a midi tune being played, it has to be rendered. At the moment, this is done by converting it to a wav file on the server - an expensive process which doesn't scale well. A better solution is to render the tune in javascript on the browser - but this presents a further problem. The architecture is such that the front- and back-end have different host names. And this means that the browser will prevent an XMLHttpRequest to the back-end because of potential security breaches.

Why CORS?

Browsers all maintain a same-origin policy. One reason this is necessary is that XMLHttpRequest passes the user's authentication tokens. Suppose he is logged in to theBank.com with basic auth and then visits untrusted.com. If there were no restrictions, untrusted.com could issue its own XMLHttpRequest to theBank.com with full authorisation and gain access to that user's private data.

But a blanket ban on cross-origin requests also prohibits legitimate use. For this reason, most modern browsers support Cross-origin resource sharing (CORS). . This is a protocol which allows the browser to negotiate with the back-end server to discover whether or not such requests from 'foreign' domains are allowed by the server, and to make the actual requests.

When browsers issue requests, they always include the Origin header. The essence of the protocol is that the server will respond with an Access-Control-Allow-Origin header if that Origin is acceptable to it. Browsers will then allow the access to go ahead. For example, here we have an XMLHttpRequest emanating from a script named midijs and hosted on a server running on localhost:9000:

Accept:*/*
Accept-Encoding:gzip,deflate,sdch
Accept-Language:en-GB,en-US;q=0.8,en;q=0.6
Cache-Control:no-cache
Connection:keep-alive
Host:192.168.1.64:8080
Origin:http://localhost:9000
Pragma:no-cache
Referer:http://localhost:9000/midijs
User-Agent:Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Ubuntu Chromium/28.0.1500.71 Chrome/28.0.1500.71 Safari/537.36

And here is a response from a back-end server that accepts the Origin:

Access-Control-Allow-Credentials:true
Access-Control-Allow-Origin:http://localhost:9000
Content-Length:2222
Content-Type:audio/midi; charset=UTF-8
Date:Sun, 29 Sep 2013 09:13:03 GMT
Server:spray-can/1.1-20130927

This is all that is necessary for GET requests. Security issues are greater for requests that are not used for pure retrieval and/or where user credentials are supplied. Here, CORS provides additional headers and a 'preflight request' mechanism which asks permission from the server before it makes the actual request. For more details see http://www.html5rocks.com/en/tutorials/cors/

CORS Support in Spray

The M8 release versions of Spray have no CORS support. Initial support for the CORS headers has now been integrated into the forthcoming release of Spray. These can be accessed if you develop your own CORS directive which makes use of respondWith directives in order to set the appropriate headers. For example, here's an approach based on that of Cristi Boarlu that works very well:

import spray.http._
import spray.routing._
import spray.http.HttpHeaders._

trait CORSDirectives  { this: HttpService =>
  private def respondWithCORSHeaders(origin: String) =
    respondWithHeaders(      
      HttpHeaders.`Access-Control-Allow-Origin`(SomeOrigins(List(origin))),
      HttpHeaders.`Access-Control-Allow-Credentials`(true)
    )
  private def respondWithCORSHeadersAllOrigins =
    respondWithHeaders(      
      HttpHeaders.`Access-Control-Allow-Origin`(AllOrigins),
      HttpHeaders.`Access-Control-Allow-Credentials`(true)
    )

  def corsFilter(origins: List[String])(route: Route) =
    if (origins.contains("*"))
      respondWithCORSHeadersAllOrigins(route)
    else
      optionalHeaderValueByName("Origin") {
        case None => 
          route        
        case Some(clientOrigin) => {
          if (origins.contains(clientOrigin))
            respondWithCORSHeaders(clientOrigin)(route)
          else {
            // Maybe, a Rejection will fit better
            complete(StatusCodes.Forbidden, "Invalid origin")
          }      
        }
      }
}

And this is how it can be used in my case to protect a route which generates a response for a path that requests a particular file type (e.g. midi):

path(Segment / "tune" / Segment / Segment ) { (genre, tuneEncoded, fileType) =>  
  get { 
    val contentTypeOpt = getContentTypeFromFileType(fileType:String)
    if (contentTypeOpt.isDefined) {
      respondWithMediaType(contentTypeOpt.get.mediaType) { 
        corsFilter(MusicRestSettings.corsOrigin) {
          val tune = java.net.URLDecoder.decode(tuneEncoded, "UTF-8") 
          val futureBin = Tune(genre, tune).asFutureBinary(contentTypeOpt.get)
          _.complete(futureBin)
        }
      }
    }
    else {
     failWith(new Exception("Unrecognized file extension " + fileType))
    }
  } 
} ~

There is also a discussion started by Tim Perrett which hopes to be able to wrap routes within a cors directive which will just 'do the right thing' with preflight requests and appropriate access control headers as unobtrusively as possible.

If you want to try this out yourself, you can try one of the release candidates that were published on October 23rd.

Embedded YouTube Videos and Chrome

I have just discovered a related problem to do with rendering embedded YouTube videos in Chrome. I want to extend my trad tunes web site with a comments page attached to each tune. The foot of the page will contain a form allowing the user to add a comment to the page, and this is particularly useful if she finds a video where the tune is being played. YouTube allows you to embed the video into the page which it manages by means of an iframe. Unfortunately, when she posts the tune, Chrome won't display it until the page is refreshed (although all other major browsers will). Instead it reports:

The XSS Auditor refused to execute a script in 'http://localhost:9000/genre/agenre/tune/atune/comments' because its source code was found within the request. The auditor was enabled as the server sent neither an 'X-XSS-Protection' nor 'Content-Security-Policy' header.

What's happening is that the browser must run javascript code to render the iframe, and it again thinks that it emanates from the server that provided the page which is thus making a cross-origin request. In this case, the solution is to add a request header instructing the browser to suspend the XSS Auditor check when it process the page. This is achieved with the X-XSS-Protection header. For example, the Play Framework Action that processes a valid comment might respond like this:

  
    val comment = commentForm.bindFromRequest.get
    commentsModel.add(comment)
    Ok(views.html.comments(commentsModel.getAll, commentForm, genre, tune)(userName)).withHeaders("X-XSS-Protection"->"0")

When Chrome sees this header, it immediately renders the video. Note, this technique doesn't work if you issue a redirect instead of responding directly.

Wednesday, July 31, 2013

Running a Play 2 Application as a Debian Service

Debian Services

If you have developed your Play app, you then need a straightforward way to deploy and administer it in your production Linux environment. The conventional way to do this is to install it as a service, which means you can ensure it starts automatically when the server starts and your system administrators will know how to stop or restart it or check its status. This is an area where the Play documentation seems to be lacking. Here's how I approach things.

Tanukisoft Wrapper

The Tanukisoft Wrapper takes much of the pain away from the work of converting a JVM application into a well behaved service. You need to downloaded a version appropriate to your target machine, then pretty much all you have to do is name your service and edit a single configuration file - wrapper.conf. This lets you set the classpath, the application main class, the JVM parameters, the logging levels and so on. There are then various mechanisms for converting your application into a service but by far the simplest is to to use WrapperSimpleApp as your main class in wrapper.conf and let this hand off to Play's main class by setting it as the first application parameter. You also have a skeletal startup shell script which you rename to the name of the service you wish to use. So for example you can then invoke your application from the bin directory using

./myservice stop

./myservice start

./myservice status

and so on.

Integrating with Play

The play dist command builds a zip file containing all the dependencies which it used to place in the dist directory but since 2.2.0 puts in target/universal. With my tradtunestore application, it builds:

As you can see, all the dependent jars are placed in tradtunestore-1.1-SNAPSHOT/lib and there is also a start script which shows that the main class is named play.core.server.NettyServer. We are going to ignore this script because we are replacing it with Tanuki. All you have to do now is extract the contents of the zip file and place it into your Tanuki lib directory. Now you configure wrapper.conf to pick up all these jars in the classpath and invoke the NettyServer as the sole app wrapper parameter. Here's the relevant section that describes the Java application:

#********************************************************************
# Java Application
#  Locate the java binary on the system PATH:
wrapper.java.command=java
#  Specify a specific java binary:
set.JAVA_HOME=/usr/lib/jvm/jdk1.7.0
wrapper.java.command=%JAVA_HOME%/bin/java

# Tell the Wrapper to log the full generated Java command line.
wrapper.java.command.loglevel=INFO

# Java Main class.  This class must implement the WrapperListener interface
#  or guarantee that the WrapperManager class is initialized.  Helper
#  classes are provided to do this for you.  See the Integration section
#  of the documentation for details.
wrapper.java.mainclass=org.tanukisoftware.wrapper.WrapperSimpleApp

# Java Classpath (include wrapper.jar)  Add class path elements as
#  needed starting from 1
wrapper.java.classpath.1=../lib/*.jar
wrapper.java.classpath.2=../lib/tradtunestore-1.1-SNAPSHOT/lib/*.jar

# Java Library Path (location of Wrapper.DLL or libwrapper.so)
wrapper.java.library.path.1=../lib

# Java Bits.  On applicable platforms, tells the JVM to run in 32 or 64-bit mode.
wrapper.java.additional.auto_bits=TRUE

# Java Additional Parameters
wrapper.java.additional.1=-Dconfig.file=../conf/application.conf

# Initial Java Heap Size (in MB)
#wrapper.java.initmemory=3

# Maximum Java Heap Size (in MB)
#wrapper.java.maxmemory=64

# Application parameters.  Add parameters as needed starting from 1
wrapper.app.parameter.1=play.core.server.NettyServer

init.d

The final step is to install the service into init.d. With Tanuki, this is simple - from your Tanuki bin directory, enter:

sudo ./myservice install

You can then control the service using:

sudo service myservice stop

sudo service myservice start

sudo service myservice status

and so on from any directory.

Wednesday, July 10, 2013

Spray Migration

Firstly, apologies for having been silent for such a long time - I've been looking after my parents who were ill and are now, thankfully, recovered. I'd like to say a huge thank-you to Pinderfields Hospital in Wakefield, Yorkshire for saving my dad's life.

Since I've been away, the Scala ecosystem has been moving on unabated. We now have the final release of Scala 2.10.2 and I've been anxious to upgrade my RESTful music application accordingly. This has been enjoyable, but has taken longer than expected, largely because Spray underwent a huge refactoring at version 1.0-M3 which incorporated lots of breaking changes. As of this writing, the current release is labelled 1.2-M8 and is consistent with Akka 2.2.0-RC1. Other libraries that I use have changed too, but Spray has caused the most impact because the application concentrates on content negotiation and Spray has altered this considerably.

The New Spray

So much is new. The documentation has been completely overhauled and is now much more detailed. If you look at its navigation panel you can get an idea of the massive refactoring into a new module structure that has taken place. Both the github repository and the Spray repo have moved. Packages have been renamed from “cc.spray...” to simply “spray...” and the group id has changed from “cc.spray” to “io.spray”. There is a new DSL for testing routes. However, the changes with the biggest impact for me have been those to do with (un)marshalling (which has been moved to the new spray-httpx module) and those handling authentication.

Marshalling

You now use the Marshaller.of method to create a marshaller for a particular type (and which can be picked up implicitly in the routing). For example, if I want to create a marshaller for a handful of different binary types that I use represented by BinaryImage:

  import spray.httpx.marshalling._
  import MediaTypes._

  class BinaryImage(mediaType: MediaType, stream: BufferedInputStream)

  implicit val binaryImageMarshaller = 
     Marshaller.of[BinaryImage] (`application/pdf`,
                                 `application/postscript`,
                                 `image/png`,
                                 `audio/midi`) {
       (value, requestedContentType, ctx) ⇒ {
           val bytes:Array[Byte] = Util.readInput(value.stream)
           ctx.marshalTo(HttpEntity(requestedContentType, bytes))     
           }
     }

Alternatively, if you have a type A with no marshaller, but you do have a marshaller for a type B, you can provide a marshaller for A by delegating to B's marshaller if you can provide a function from A to B. The definition is:

  def delegate[A, B](marshalTo: ContentType*)
                    (f: A => B)
                    (implicit mb: Marshaller[B]): Marshaller[A]

In my case, nearly all the content is represented as a type Validation[String, A] where the String type represents an error and A represents the various types of source or intermediate content. You can provide a meta-marshaller for a Validation which either handles the error or invokes the marshaller for the underlying type:

implicit def validationMarshaller[B](implicit mb: Marshaller[B]) =
    Marshaller[Validation[String, B]] { (value, ctx) ⇒
      value fold (
        e => throw new Exception(e),
        s ⇒ mb(s, ctx)
      )
    }

It seems rather infra dig to have to throw an exception from a Validation. At the moment it is necessary to do so because a Marshaller has no access to the HTTP response headers and so it is not possible (say) to marshall an error to a string-like content type if you've already told the marshaller that it's only dealing in binaries. However, Mathias from Spray has proposed introducing alternative Marshallers, one of which will have access to the headers in an upcoming release so this small blemish can then be removed.

Unmarshalling

Unmarshalling goes in the opposite direction and can be used to construct a user-defined type from a form submission. The interface here has changed a little too. For example:

  implicit val AbcUnmarshaller =
    Unmarshaller[AbcPost](`application/x-www-form-urlencoded`) {
      case HttpBody(contentType, buffer) => {         
        val notes =  new String(buffer, contentType.charset.nioCharset)
        AbcPost(notes)
      }
    }

As you can see, there is now a nice symmetry between Marshallers and Unmarshallers. They can be used within routes like this:

  post {    
    entity(as[AbcPost]) { abcPost =>     
       // do something

And as with marshallers, there are a whole set of basic unmarshallers provided and you can delegate between unmarshallers if you wish.

Authentication

This is another area where the API has changed and here the documentation is lacking. In my case, I need to perform simple authentication of a user name and password by checking with a backend database. This is best achieved by providing a UserPassAuthenticator. You construct one of these by passing it a function that takes an Optional UserPass and returns a Future which contains an Optional result. The standard approach is to return an encapsulation of the user name in a BasicUserContext if the validation passes and return None otherwise:

  val UserAuthenticator = UserPassAuthenticator[BasicUserContext] { userPassOption ⇒ Future(userPassOption match {
        case Some(UserPass(user, pass)) => {
            try {
              println("Authenticating: " + user + ":" + pass)
              if (TuneModel().isValidUser(user, pass) ) Some(BasicUserContext(user)) else None
            } 
            catch {
              case _ : Throwable => None
            }
        }
        case _ => None
      })
    }

I also need to identify the administrator user who has special powers. I do this by first using the UserAuthenticator to establish the login credentials and then composing this future with a check to establish the administrator name:

   
    val AdminAuthenticator = UserPassAuthenticator[BasicUserContext] { userPassOption => {
      val userFuture = UserAuthenticator(userPassOption)
      userFuture.map(someUser => someUser.filter(bu => bu.username == "administrator"))  
      }
    }

Once you have an authenticator, you can use it in routes like this:

   get {     
     authenticate(BasicAuth(UserAuthenticator, "musicrest")) { user =>    
       _.complete("user is valid")    
     }
   }

Testing

As an example of the new testing DSL, here is how you can test the authentication behaviour:

package org.bayswater.musicrest

import org.specs2.mutable.Specification
import spray.testkit.Specs2RouteTest
import spray.http._
import spray.http.HttpHeaders._
import StatusCodes._
import spray.routing._
import org.bayswater.musicrest.model.{TuneModel,User}

class AuthenticationSpec extends RoutingSpec with MusicRestService {
  def actorRefFactory = system
  
  val before = insertUser

  "The MusicRestService" should {
    "request authentication parameters for posted tunes" in {  
       Post("/musicrest/genre/irish/tune") ~> musicRestRoute ~> check 
         { rejection === AuthenticationRequiredRejection("Basic", "musicrest", Map.empty) }
    }
    "reject authentication for unknown credentials" in {
       Post("/musicrest/genre/irish/tune") ~>  Authorization(BasicHttpCredentials("foo", "bar")) ~> musicRestRoute ~> check 
         { rejection === AuthenticationFailedRejection("musicrest") }
    }
    "allow authentication for known credentials (but reject the lack of a POST body as a bad request)" in {
       Post("/musicrest/genre/irish/tune") ~>  Authorization(BasicHttpCredentials("test user", "passw0rd1")) ~> musicRestRoute ~> check 
         { rejection === RequestEntityExpectedRejection }
    }
    "don't allow normal users to delete tunes" in {
       Delete("/musicrest/genre/irish/tune/sometune") ~>  Authorization(BasicHttpCredentials("test user", "passw0rd1")) ~> musicRestRoute ~> check 
         { rejection === AuthenticationFailedRejection("musicrest") }
    }
    "allow administrators to delete tunes" in {
       Delete("/musicrest/genre/irish/tune/sometune") ~>  Authorization(BasicHttpCredentials("administrator", "adm1n1str80r")) ~> musicRestRoute ~> check 
         { entityAs[String] must contain("Tune sometune removed from irish") }
    } 
  }
   
  def insertUser = {
   val dbName = "tunedbtest"
   val collection = "user"
   val tuneModel = TuneModel()
   tuneModel.deleteUser("test user")
   tuneModel.deleteUser("administrator")
   val user1 = User("test user", "test.user@gmail.com", "passw0rd1")
   tuneModel.insertPrevalidatedUser(user1)    
   val user2 = User("administrator", "john.watson@gmx.co.uk", "adm1n1str80r")
   tuneModel.insertPrevalidatedUser(user2)    
  }
   
}