Finish first R judge draft
parent
4c1a8a94ff
commit
bf2c188791
1 changed files with 27 additions and 1 deletions
28
book.org
|
@ -762,7 +762,7 @@ Dodona and its related software comprises a lot of code.
|
|||
This chapter discusses the technical background of Dodona itself\nbsp{}[cite:@vanpetegemDodonaLearnCode2023] and a stand-alone online code editor, Papyros (\url{https://papyros.dodona.be}), that was integrated into Dodona\nbsp{}[cite:@deridderPapyrosSchrijvenUitvoeren2022].
|
||||
I will also discuss two judges whose development I was involved in.
|
||||
The R judge was written entirely by myself\nbsp{}[cite:@nustRockerversePackagesApplications2020].
|
||||
The TESTed judge came forth out of a prototype I built in my master's thesis\nbsp{}[cite:@vanpetegemComputationeleBenaderingenVoor2018] and was further developed in a master's thesis I supervised\nbsp{}[cite:@strijbolTESTedOneJudge2020].
|
||||
The TESTed judge came forth out of a prototype I built in my master's thesis\nbsp{}[cite:@vanpetegemComputationeleBenaderingenVoor2018] and was further developed in two master's theses I supervised\nbsp{}[cite:@selsTESTedProgrammeertaalonafhankelijkTesten2021; @strijbolTESTedOneJudge2020].
|
||||
|
||||
** Dodona
|
||||
:PROPERTIES:
|
||||
|
@ -1077,13 +1077,26 @@ They also gave some interesting ideas about future additions to Papyros such as
|
|||
|
||||
Because Dodona had proven itself as a useful tool for teaching Python and Java to students, colleagues teaching statistics started asking if we could build R support into Dodona.
|
||||
Since the judge system of Dodona makes this fairly easy, I started working on an R judge soon after.
|
||||
By now, more than 1\thinsp{}250 R exercises have been added, and almost 1 million submissions have been made to R exercises.
|
||||
|
||||
Because R is mostly used for statistics, there are a few extra features that come to mind that are not typically handled by judges, such as handling of data frames and outputting visual graphs (or even evaluating that a graph was built correctly).
|
||||
Another feature that teachers wanted that we had not built into a judge previously was support for inspecting the student's source code, e.g. for making sure that certain functions were or were not used.
|
||||
|
||||
The API for the R judge was designed to follow the visual structure of the feedback table as closely as possible, as can be seen in the sample evaluation code in Listing\nbsp{}[[lst:technicalrsample]].
|
||||
Tabs are represented by different evaluation files.
|
||||
In addition to the =testEqual= function demonstrated in Listing\nbsp{}[[lst:technicalrsample]] there are some other functions to specifically support the requested functionality.
|
||||
=testImage= sets up the R environment so that generated plots (or other images) are sent to the feedback table (as a base64-encoded string) instead of being written to the filesystem.
|
||||
It also makes the test fail if no image was generated, but it does not verify the image contents.
|
||||
=testDF= has some extra functionality for testing the equality of data frames, where it is possible to ignore row and column order.
|
||||
The generated feedback is also limited to 5 lines of output, to avoid overwhelming students (and their browsers) with the entire table.
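To illustrate what ignoring row and column order can mean in practice, the following is a minimal sketch in base R.
It is not the actual implementation of =testDF=: the helper name =equal_ignoring_order= is hypothetical, and the sketch assumes unique column names and at least one column.

#+BEGIN_SRC r
# Hypothetical helper, not part of the R judge API.
# Compares two data frames while ignoring row and column order.
equal_ignoring_order <- function(expected, actual) {
  # Both data frames must have the same set of columns and the same size.
  if (!setequal(names(expected), names(actual))) return(FALSE)
  if (nrow(expected) != nrow(actual)) return(FALSE)

  # Put the columns of the actual data frame in the expected order.
  actual <- actual[, names(expected), drop = FALSE]

  # Sort the rows on all columns, so the original row order is irrelevant.
  sort_rows <- function(df) {
    df <- df[do.call(order, unname(as.list(df))), , drop = FALSE]
    rownames(df) <- NULL
    df
  }

  isTRUE(all.equal(sort_rows(expected), sort_rows(actual)))
}

expected <- data.frame(id = c(1, 2, 3), name = c("a", "b", "c"))
shuffled <- data.frame(name = c("c", "a", "b"), id = c(3, 1, 2))
different <- data.frame(name = c("c", "a", "b"), id = c(3, 1, 4))

same_content <- equal_ignoring_order(expected, shuffled)
other_content <- equal_ignoring_order(expected, different)
#+END_SRC

Limiting the feedback to a few lines could then be done with something like =head(actual, 5)= before printing.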
|
||||
=testGGPlot= can be used to introspect plots generated with ggplot2\nbsp{}[cite:@wickhamGgplot2CreateElegant2023].
|
||||
To test whether students use certain functions, =testFunctionUsed= and =testFunctionUsedInVar= can be used.
|
||||
The latter tests whether the specific function is used when initializing a specific variable.
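Checks like these can be built on top of R's own parser.
The sketch below shows one possible approach using only base R; the helper names =function_used= and =function_used_in_var= are hypothetical and the code is not taken from the judge.
It is deliberately coarse: it matches any symbol with the given name, and only looks at top-level assignments.

#+BEGIN_SRC r
# Hypothetical helpers, not the judge's implementation.

# TRUE if a symbol named fn_name occurs anywhere in the source code.
function_used <- function(code, fn_name) {
  exprs <- parse(text = code, keep.source = FALSE)
  fn_name %in% all.names(exprs, functions = TRUE)
}

# TRUE if fn_name occurs in the right-hand side of a top-level
# assignment to the variable var_name.
function_used_in_var <- function(code, fn_name, var_name) {
  exprs <- parse(text = code, keep.source = FALSE)
  for (e in as.list(exprs)) {
    is_assignment <- is.call(e) &&
      as.character(e[[1]])[1] %in% c("<-", "=")
    if (is_assignment &&
        identical(e[[2]], as.name(var_name)) &&
        fn_name %in% all.names(e[[3]])) {
      return(TRUE)
    }
  }
  FALSE
}

# The student code is only parsed, never executed.
student_code <- "library(stats)\nresult <- t.test(x, y)\nm <- mean(x)"

uses_ttest <- function_used(student_code, "t.test")
uses_median <- function_used(student_code, "median")
mean_in_m <- function_used_in_var(student_code, "mean", "m")
mean_in_result <- function_used_in_var(student_code, "mean", "result")
#+END_SRC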
|
||||
|
||||
#+CAPTION: Sample evaluation code for a simple R exercise.
|
||||
#+CAPTION: The feedback table will contain one context with two testcases in it.
|
||||
#+CAPTION: The first testcase checks whether some t-test was performed correctly, and does this by performing two equality checks.
|
||||
#+CAPTION: The second testcase checks that the $p$ value calculated by the t-test is correct.
|
||||
#+NAME: lst:technicalrsample
|
||||
#+ATTR_LATEX: :float t
|
||||
#+BEGIN_SRC r
|
||||
|
@ -1112,6 +1125,19 @@ context({
|
|||
})
|
||||
#+END_SRC
|
||||
|
||||
Besides the API for teachers creating exercises, encapsulation of student code is another important part of a judge.
|
||||
Students should not be able to access functions defined by the judge, or be able to find the correct solution or the evaluating code.
|
||||
The R judge makes sure of this by making extensive use of environments.
|
||||
This is also reflected in the teacher API: they can access variables or execute functions in the student environment, but this environment has to be explicitly passed to the function generating the student result.
|
||||
In R, all environments except the root environment have a parent, essentially creating a tree structure of environments.
|
||||
In most cases, this tree will actually be a path, but in the R judge, the student environment is explicitly attached to the base environment.
|
||||
This even makes sure that libraries loaded by the judge are not initially available to the student code (thus allowing teachers to test that students can correctly load libraries).
|
||||
The judge itself runs in an anonymous environment, so that even students with intimate knowledge of the inner workings of R and the judge itself would not be able to find this environment.
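The effect of this setup can be illustrated with a small sketch in base R.
It assumes that attaching to the base environment corresponds to using =baseenv()= as the parent; the variable names are hypothetical and the code is not taken from the judge.

#+BEGIN_SRC r
# Hypothetical judge-side state, kept in its own environment.
judge_env <- new.env()
assign("expected_answer", 42, envir = judge_env)

# Student environment whose parent is the base environment
# (assumption: this mirrors what the judge does).
student_env <- new.env(parent = baseenv())

# Evaluate the student's code inside the student environment.
student_code <- "x <- 6 * 7"
eval(parse(text = student_code), envir = student_env)

# Teacher code can read student variables, but only by explicitly
# passing the student environment.
student_x <- get("x", envir = student_env)

# Lookups from the student environment only reach the base package:
# judge state and functions from other packages are not found.
sees_answer <- exists("expected_answer", envir = student_env)
sees_median <- exists("median", envir = student_env)
#+END_SRC

In this sketch, =median= (from the =stats= package) cannot be found from the student environment, because name lookup goes from the student environment straight to the base package.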
|
||||
|
||||
The judge is also programmed very defensively.
|
||||
Every time execution is handed off to student code (or even teacher code), appropriate error handlers and output redirections are installed.
|
||||
This prevents the student and teacher code from e.g. writing to standard output (and thus messing up the JSON expected by Dodona).
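A minimal sketch of such a guarded hand-off, using only base R, could look as follows.
The function name =run_guarded= is hypothetical and this is not the judge's actual code; it only redirects standard output, whereas messages and warnings would need additional handling.

#+BEGIN_SRC r
# Hypothetical helper: run code in the given environment while
# capturing standard output and turning errors into values.
run_guarded <- function(code, env) {
  result <- NULL
  output <- capture.output(
    result <- tryCatch(
      eval(parse(text = code), envir = env),
      error = function(e) e
    )
  )
  list(
    result = result,                    # value of the code, or the error
    output = output,                    # lines written to standard output
    failed = inherits(result, "error")  # TRUE if an error was raised
  )
}

# Output from the code is captured instead of reaching standard output.
ok <- run_guarded('cat("hello\\n"); 1 + 1', new.env(parent = baseenv()))

# Errors are caught, so the caller keeps control.
bad <- run_guarded('stop("boom")', new.env(parent = baseenv()))
#+END_SRC

Because output and errors come back as ordinary values, the caller can decide how to report them in the feedback instead of letting them leak into the generated JSON.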
|
||||
|
||||
** TESTed
|
||||
:PROPERTIES:
|
||||
:CREATED: [2023-10-23 Mon 08:49]
|
||||
|
|