I must admit that the problem of automatically generating a useful numerical score based on a user’s drawn response has exercised me considerably. I think I am now finally in a position to implement something that might work appropriately, but if there are any mathematical geniuses out there who can come up with a more elegant solution then please do enlighten me.
There are two sets of points, each described by X and Y coordinates. One set is the answer defined by the activity author. The other set is the drawn response created by the activity participant. To score the activity we need to define an algorithm that calculates a measure of the closeness of fit between these two sets of points and the lines/areas that they describe.
This algorithm will ideally be the same for activities involving both drawing lines and drawing areas.
Scoring based on area defined by the two sets of points – my original thought was to calculate the area between the two sets of points, based on the idea that as this area gets smaller the lines/areas defined by the sets become more similar and so a higher score should be given. An issue with this solution is that this area isn’t always going to be a good determinant of fit (e.g. the scenario shown below, where an inaccurate response would be scored highly by this method).
Scoring based on defined tolerance area – another idea was for the activity author to define an area within which they expect the points making up a drawn response to fall. Scores could then be allocated according to whether each drawn point falls within this area, or outside of this area. Again there are scenarios where this method would score inaccurate responses inappropriately (see example below where the drawn response points fall within the tolerance area but are describing an inaccurate response).
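As a rough sketch of the tolerance-area idea in Python: the function names below are hypothetical, the tolerance area is assumed to be a simple polygon given as a list of (x, y) vertices, and the score is assumed to be simply the fraction of drawn points that fall inside it. The point-in-polygon test uses the standard ray-casting approach.

```python
def point_in_polygon(p, polygon):
    """Ray-casting test: True if point p lies inside the polygon.

    polygon is a list of (x, y) vertices in order; the last vertex is
    implicitly joined back to the first.
    """
    x, y = p
    inside = False
    n = len(polygon)
    for i in range(n):
        x1, y1 = polygon[i]
        x2, y2 = polygon[(i + 1) % n]
        # Does this edge straddle the horizontal line through p?
        if (y1 > y) != (y2 > y):
            # x coordinate where the edge crosses that horizontal line
            x_cross = x1 + (y - y1) * (x2 - x1) / (y2 - y1)
            if x < x_cross:
                inside = not inside
    return inside


def fraction_inside(points, polygon):
    """Assumed scoring basis: share of drawn points inside the tolerance area."""
    return sum(point_in_polygon(p, polygon) for p in points) / len(points)


# Tolerance band around a horizontal answer line running from x=0 to x=10
tolerance = [(0, -1), (10, -1), (10, 1), (0, 1)]
# A response that only covers the first part of the answer line
partial_response = [(1, 0), (2, 0), (3, 0)]
print(fraction_inside(partial_response, tolerance))
```

Here every drawn point is inside the band, so the fraction is 1.0 even though the response covers less than a third of the answer line. That is one way (assumed here for illustration) in which this method can reward an inaccurate response.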
Scoring based on distances from the points to the line defined by the other set of points – the final method I have considered is based on calculating the minimum distance from each point of one set to the line defined by the other set of points (see an example of calculating this minimum distance illustrated below).
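A minimal Python sketch of this minimum-distance calculation follows. It assumes the "line" is a polyline made of straight segments joining consecutive points, and the function names are hypothetical. Each point is projected onto each segment, the projection is clamped to the segment's ends, and the smallest resulting distance is kept.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the segment a-b (all (x, y) tuples)."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        # a and b coincide, so the segment is a single point
        return math.hypot(px - ax, py - ay)
    # Position of the projection of p along the segment, clamped to [0, 1]
    t = ((px - ax) * dx + (py - ay) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def min_distance_to_polyline(p, line):
    """Smallest distance from p to any segment of the polyline (2+ points)."""
    return min(point_segment_distance(p, a, b) for a, b in zip(line, line[1:]))


answer_line = [(0, 0), (10, 0)]
print(min_distance_to_polyline((5, 3), answer_line))   # 3.0
print(min_distance_to_polyline((12, 0), answer_line))  # 2.0
```

The second call shows why the clamping matters: the point lies beyond the end of the segment, so the nearest point on the line is the end point itself.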
These distances can be calculated for each drawn point and averaged to give a mean distance per point, which would then be used as the basis of the scoring. When considering this approach it became apparent that there are still problematic scenarios where these distances could be small while the line shape is inaccurate (as with the tolerance area issue illustrated above). To counter this I also considered calculating the difference in the lengths of the two defined lines and the distances between the starting points and between the ending points of the two sets. These additional measures would need to be added to the mean minimum point distance, perhaps each multiplied by a weighting factor, to come up with an overall measure and finally a score. This approach seemed to have promise, but it seemed likely that the appropriate weighting factors would need to be found by trial and error by authors on an activity-by-activity basis. This would make authoring scoring activities particularly time-consuming!
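To make the weighted combination concrete, here is a hedged Python sketch. The function names, the three weights (`w_dist`, `w_len`, `w_ends`) and their default values of 1.0 are all assumptions for illustration, as is the choice to add the start-point gap and the end-point gap together. It also assumes both lines were drawn in the same direction, and it takes the mean minimum point distance as an input rather than computing it.

```python
import math


def polyline_length(line):
    """Total length of the straight segments joining consecutive points."""
    return sum(math.dist(a, b) for a, b in zip(line, line[1:]))


def combined_measure(mean_dist, answer, response,
                     w_dist=1.0, w_len=1.0, w_ends=1.0):
    """Weighted sum of the three measures; smaller means a closer fit.

    mean_dist is the mean minimum point distance, computed separately.
    The weights are placeholders that an author would have to tune.
    """
    length_diff = abs(polyline_length(answer) - polyline_length(response))
    # Assumes both lines start and end at corresponding ends
    end_gap = (math.dist(answer[0], response[0])
               + math.dist(answer[-1], response[-1]))
    return w_dist * mean_dist + w_len * length_diff + w_ends * end_gap


answer = [(0, 0), (10, 0)]
# A response lying exactly on the answer line but covering only half of it
half_response = [(0, 0), (5, 0)]
print(combined_measure(0.0, answer, half_response))  # 10.0
```

In this example the mean point distance is zero, yet the length difference (5) and the end-point gap (5) still penalise the incomplete response. How heavily they should do so is exactly the weighting question that would need tuning per activity.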
Then a couple of days ago I woke up with a bit of a revelation. Instead of just calculating this mean minimum distance for the drawn response points, why not also calculate it for the answer points with respect to the drawn response line? For example, a response that covers only part of the answer line leaves some answer points far from the drawn line, so the second measure grows even when the first stays small. By combining these two measures, one calculated from each perspective, scenarios such as that illustrated above could be scored more appropriately.
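A minimal Python sketch of this two-way measure is given below. The function names are hypothetical, both lines are treated as polylines with at least two points, and the two directional means are combined by a plain average. That last choice is purely an assumption, since the method of combining them is still open.

```python
import math


def _point_to_segment(p, a, b):
    """Shortest distance from point p to the segment a-b."""
    dx, dy = b[0] - a[0], b[1] - a[1]
    seg_len_sq = dx * dx + dy * dy
    if seg_len_sq == 0:
        return math.dist(p, a)
    # Projection of p onto the segment, clamped to its end points
    t = ((p[0] - a[0]) * dx + (p[1] - a[1]) * dy) / seg_len_sq
    t = max(0.0, min(1.0, t))
    return math.dist(p, (a[0] + t * dx, a[1] + t * dy))


def mean_min_distance(points, line):
    """Mean, over points, of each point's minimum distance to the polyline."""
    segments = list(zip(line, line[1:]))
    total = sum(min(_point_to_segment(p, a, b) for a, b in segments)
                for p in points)
    return total / len(points)


def two_way_measure(answer, response):
    """Average of the two directional means (the averaging is an assumption)."""
    return (mean_min_distance(response, answer)
            + mean_min_distance(answer, response)) / 2


answer = [(0, 0), (5, 0), (10, 0)]
partial = [(0, 0), (1, 0), (2, 0)]  # lies on the answer line, covers a fifth of it
print(mean_min_distance(partial, answer))  # 0.0 from the response's perspective
print(mean_min_distance(answer, partial))  # about 3.67 from the answer's perspective
print(two_way_measure(answer, partial))    # about 1.83
```

From the response's perspective the fit looks perfect, but the answer points at (5, 0) and (10, 0) are 3 and 8 units from the drawn line, so the combined measure is no longer zero.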
There are still scenarios where even this method would not be perfect, but this is the best solution that I can identify for now. Next job is to implement it!