There is now a SCORM 1.2 publishing option available on all Drawtivity activities. When delivered within a compatible Learning Management System (e.g. Moodle or Blackboard), the activities will record a percentage score, provided this activity setting is selected.
After the revelation described in the previous blog post I have been working to implement and test this scoring mechanism.
As described in the previous post, this scoring mechanism involves measuring and summing the minimum distances between each drawn point and the answer area or line. The same is then done in the other direction: the minimum distances between each point of the answer area or line and the drawn area or line are measured and summed.
These two totals are added together and divided by the total number of drawn and answer points to calculate a single average distance. This average distance is a single numeric measure of the accuracy of the drawn area or line compared against the answer. The lower this measure, the more accurate the user's drawn response. Conversely, the higher this measure, the greater the average distance between the points and so the lower the accuracy of the drawn response.
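As a rough illustration of the calculation described above, here is a minimal Python sketch. It is not the actual Drawtivity code: the function names, the representation of points as (x, y) tuples, and the `closed` flag (used to treat the points as the outline of an area rather than a line) are all my assumptions for the example. It also assumes distances are measured to the outline of an area and that both point sets are non-empty.

```python
import math


def point_segment_distance(p, a, b):
    """Shortest distance from point p to the straight segment from a to b."""
    (px, py), (ax, ay), (bx, by) = p, a, b
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    if length_sq == 0:
        # a and b coincide, so the segment is a single point
        return math.hypot(px - ax, py - ay)
    # Project p onto the segment and clamp so the foot stays between a and b
    t = ((px - ax) * dx + (py - ay) * dy) / length_sq
    t = max(0.0, min(1.0, t))
    return math.hypot(px - (ax + t * dx), py - (ay + t * dy))


def point_path_distance(p, path, closed=False):
    """Minimum distance from p to the line (or area outline) through path."""
    if len(path) == 1:
        return math.hypot(p[0] - path[0][0], p[1] - path[0][1])
    segments = list(zip(path, path[1:]))
    if closed:
        # Assumption: an area is scored against its outline, so join the
        # last point back to the first
        segments.append((path[-1], path[0]))
    return min(point_segment_distance(p, a, b) for a, b in segments)


def average_distance(drawn, answer, closed=False):
    """Sum the minimum distances in both directions, then average per point."""
    drawn_total = sum(point_path_distance(p, answer, closed) for p in drawn)
    answer_total = sum(point_path_distance(p, drawn, closed) for p in answer)
    return (drawn_total + answer_total) / (len(drawn) + len(answer))
```

For example, with an answer line from (0, 0) to (10, 0) and a drawn line from (0, 0) to (2, 0), every drawn point lies exactly on the answer, so the drawn-to-answer total is 0. The answer-to-drawn total is 8, because the far end of the answer is 8 away from the drawn line, giving a measure of 8 / 4 = 2 rather than a misleading 0.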
To turn this numeric measure into a percentage score, the activity uses upper and lower limits that define the values of the measure corresponding to scores of 0% and 100%. For example, the upper limit may be set to 30, meaning an average distance of 30 scores 0%. Values that exceed this limit also score 0%.
The lower limit may, for example, be set to 5, meaning an average distance of 5 scores 100%. Values below this limit also score 100%.
Values of this average distance measure which fall between these two limits are assigned a percentage score based on a straight line interpolation between these two limits and the respective percentage values (see the graph below).
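The interpolation can be sketched in a few lines of Python. Again this is only an illustration under my own assumptions: the function name and parameter names are made up for the example, and the defaults of 5 and 30 are simply the example limits mentioned above.

```python
def percentage_score(average_distance, lower_limit=5.0, upper_limit=30.0):
    """Map an average distance to a 0-100 score by straight-line interpolation."""
    if upper_limit <= lower_limit:
        raise ValueError("upper_limit must be greater than lower_limit")
    if average_distance <= lower_limit:
        return 100.0  # at or below the lower limit scores full marks
    if average_distance >= upper_limit:
        return 0.0  # at or beyond the upper limit scores nothing
    # Straight line from (lower_limit, 100%) down to (upper_limit, 0%)
    return 100.0 * (upper_limit - average_distance) / (upper_limit - lower_limit)
```

With limits of 5 and 30, an average distance of 17.5 sits exactly halfway between them and so scores 50%.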
These limits are set in the activity settings by the activity author. Increasing the upper limit means that less accurate responses will still receive a non-zero score. Increasing the lower limit increases the allowable inaccuracy for achieving a score of 100%.
These are three screenshots illustrating attempts of varying accuracy and their related scores:
Very inaccurate – 0%
Medium accuracy – 38%
Very accurate – 100%
To demonstrate this scoring mechanism in action, below is a link to a scoring version of the biceps example activity. Have a go at this activity: make multiple attempts to draw the correct area and vary your accuracy to see how your attempts are scored. This activity has the upper and lower limits set at 30 and 5, so you should find that it is possible to get 100% for a very accurate drawing of the area, and that drawing an area well away from the correct area gets 0%. Attempts that are close to correct are scored based on their accuracy, and in general it seems to give appropriate scores (in my testing anyway!).
I must admit that the problem of automatically generating a useful numerical score based on a user’s drawn response has exercised me considerably. I think I am now finally in a position to implement something that might work appropriately, but if there are any mathematical geniuses out there who can come up with a more elegant solution then please do enlighten me.
There are two sets of points described by X and Y coordinates. One set is the answer defined by the activity author. The other set is the drawn response created by the activity participant. To score the activity we need to define an algorithm to calculate a measure based on the closeness of fit between these two sets of points and the lines/areas that they describe.
This algorithm will ideally be the same for activities involving both drawing lines and drawing areas.
Scoring based on area defined by the two sets of points – my original thought was to calculate the area between the two sets of points, based on the idea that as this area gets smaller the lines/areas defined by the sets are more similar and so a higher score should be given. An issue with this solution is that this area isn’t always going to be a good determinant of fit (e.g. the scenario shown below where an inaccurate response would be scored highly by this method).
Scoring based on defined tolerance area – another idea was for the activity author to define an area within which they expect the points making up a drawn response to fall. Scores could then be allocated according to whether each drawn point falls within this area, or outside of this area. Again there are scenarios where this method would score inaccurate responses inappropriately (see example below where the drawn response points fall within the tolerance area but are describing an inaccurate response).
Scoring based on distances from the points to the line defined by the other set of points – the final method I have considered is based on calculating the minimum distance from each point of one set to the line defined by the other set of points (see an example of calculating this minimum distance illustrated below).
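For reference, one standard way to calculate this minimum distance is to project the point onto each straight segment of the other line and take the smallest result. I am assuming here that the "line" is treated as a chain of straight segments between consecutive points. The symbols P (the point), A and B (the ends of one segment) and t are just notation for this sketch.

$$
t = \max\!\left(0,\ \min\!\left(1,\ \frac{(P - A)\cdot(B - A)}{\lVert B - A\rVert^{2}}\right)\right),
\qquad
d(P, AB) = \bigl\lVert P - \bigl(A + t\,(B - A)\bigr) \bigr\rVert
$$

Clamping t to between 0 and 1 keeps the nearest point on the segment itself rather than on its extension beyond A or B. The minimum distance from P to the whole line is then the smallest d(P, AB) over all of its segments.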
These distances can be calculated for each drawn point and averaged to give a mean distance per point, which would then be used as the basis of the scoring. When considering this approach it was apparent there are still problematic scenarios where these distances could be small while the line shape is inaccurate (as with the tolerance area issue illustrated above). To counter this I also considered calculating the difference in the lengths of the two defined lines and the distance between the starting and ending points of the two sets. These additional measures would need to be added to the mean minimum point distance, and perhaps multiplied by weighting factors, to come up with an overall measure and finally a score. This approach seemed to have promise, but it seemed likely that the appropriate factors for each of these measures would need to be defined by trial and error by authors on an activity by activity basis. This would make authoring scoring activities particularly time consuming!
Then a couple of days ago I woke up with a bit of a revelation. Instead of just calculating this mean minimum distance for the drawn response points, why not also calculate it for the answer points with respect to the drawn response line? Then by combining these two measures calculated from each perspective, scenarios such as that illustrated above could be scored more appropriately.
There are still scenarios where even this method would not be perfect, but this is the best solution that I can identify for now. Next job is to implement it!