Purdue CS240 Autograder

2013-12-17 project xbu

For historical curiosity, this page records the work on Autograder, an automated grading platform used in Purdue’s CS240 “Programming in C” course. Students submitted code for grading and later received their grades. The platform encouraged students to keep working toward the best score and greatly reduced the TAs’ workload. In Fall 2014 a new project, gitlab-ag, based on GitLab, replaced Autograder.

Autograder has been deprecated. The purpose of this page is to summarize the knowledge and experience gained from this project so that we can build better things in the future.

2014 Fall

Autograder was rewritten from the ground up. We replaced the UML-based (User-Mode Linux) sandbox with Docker because the former occasionally failed on x86-64 Linux kernels. The UI was rebuilt with Twitter Bootstrap. The event queue was rewritten to take advantage of multi-core hardware.

GitHub Repository: https://github.com/xybu/autograder

We stopped using Autograder because the course instructor was too busy to prepare lab assignments far enough in advance for test cases and grading scripts to be written.

2014 Spring

Server / Web Interface:

UI improvements.
Improved self-maintenance capability.
Added a daily submission quota (can be turned off). For example, with a quota of 15, each student may submit at most 15 times per day. This can be used to discourage late starts and reduce the load on the grading system.
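
As an illustration of how such a quota could be enforced, here is a minimal Python sketch. The class name `DailyQuota`, the in-memory counter, and the `limit=None` convention for turning the quota off are assumptions made for this example; they are not taken from Autograder's code.

```python
from collections import defaultdict
from datetime import date


class DailyQuota:
    """Tracks per-student submission counts for each calendar day."""

    def __init__(self, limit=15):
        # limit=None disables the quota entirely (hypothetical convention).
        self.limit = limit
        self._counts = defaultdict(int)  # (student_id, day) -> submissions so far

    def try_submit(self, student_id, day=None):
        """Record a submission and return True, or return False if over quota."""
        day = day or date.today()
        key = (student_id, day)
        if self.limit is not None and self._counts[key] >= self.limit:
            return False
        self._counts[key] += 1
        return True
```

Because the counter is keyed by calendar day, the quota resets at midnight without any cleanup job. A real deployment would need persistent storage rather than an in-memory dictionary.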

From instructors’ perspective:

Admin control panel for instructors. Most common administrative operations can be done through the web interface rather than by direct file operations, which reduces the permission level each TA needs. This should improve system stability, especially when several TAs are modifying the file system.
Improved debug mode. Instead of printing everything (as in Fall 2013), it prints only the information about where the student program goes wrong, so TAs should find it more comfortable to use.
Removed the historical rule that “the total points for an assignment must be 100”. Instructors can now set their own total, for example “60 / 60” for an assignment.

From students’ perspective:

Added a general guidance page that introduces the system, so students no longer need to be taught how to use it in the first lab session.
Students see more information about their status for each assignment.

TODO list:

OS update
move dump to tmpfs
“Regrade my last submission”
How would a self-hosted Git server + hooked grading system work?

2013 Fall

Changes in terms of system stability:

Isolated the TA platform from the student platform for performance, security, and flexibility.
Modularized and standardized the test scripts (e.g., file operations, segfault detection functions, etc.), which makes it easier for TAs to write stable Autograder code.
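
To give a sense of what a shared segfault-detection helper might look like, the sketch below runs a test command and classifies its outcome. It is written in Python for illustration and assumes a POSIX system, where a child killed by a signal reports a negative return code. `run_case` and its result fields are hypothetical names, not Autograder's actual API.

```python
import signal
import subprocess
import sys


def run_case(cmd, stdin_text="", timeout=5):
    """Run one test command and classify how it ended."""
    try:
        proc = subprocess.run(cmd, input=stdin_text, capture_output=True,
                              text=True, timeout=timeout)
    except subprocess.TimeoutExpired:
        return {"status": "timeout", "stdout": ""}
    # On POSIX, a return code of -N means the child was killed by signal N.
    if proc.returncode == -signal.SIGSEGV:
        status = "segfault"
    elif proc.returncode != 0:
        status = "nonzero_exit"
    else:
        status = "ok"
    return {"status": status, "stdout": proc.stdout}


if __name__ == "__main__":
    # Demo: a child that sends itself SIGSEGV stands in for a crashing student program.
    crash = [sys.executable, "-c",
             "import os, signal; os.kill(os.getpid(), signal.SIGSEGV)"]
    print(run_case(crash)["status"])
```

Keeping this classification in one helper means individual grading scripts only compare `status` and `stdout` instead of each reimplementing signal handling.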

Changes on the instructors’ side:

Added debug mode: enabling debug mode gives TAs a lot more information about what happened in one grading session. TAs can diagnose student code and give incisive feedback. By default debug mode is off for students but on for TAs.
Added support for manual offset: if an assignment needs manual grading (e.g., a coding style grade), instructors enter the grade for each student through an interface, and Autograder collects those grades and adds them to the gradebook. In the Fall 2013 semester, the grades of most assignments consisted of 80% Autograder score and 20% hand-graded score for coding standards.
Added submission history: Autograder now records the score of all submissions from each student. This makes it a lot easier to respond to grade/source code retrieval requests from students.
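
The manual offset step can be pictured as a simple merge of two score tables. The Python sketch below assumes, for illustration only, that the Autograder part is scored out of 80 points and the hand-graded part out of 20, mirroring the Fall 2013 split. The function name and the dictionary layout are hypothetical.

```python
def apply_manual_offsets(auto_scores, manual_offsets):
    """Add each student's hand-entered offset to their Autograder score.

    Both arguments map student ID -> points. A student missing from either
    mapping is treated as having 0 points for that part.
    """
    student_ids = set(auto_scores) | set(manual_offsets)
    return {sid: auto_scores.get(sid, 0) + manual_offsets.get(sid, 0)
            for sid in student_ids}
```

For instance, a student with 72 of 80 Autograder points and 18 of 20 style points ends up with 90 in the merged gradebook.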

Util scripts:

Turn-In ID generator: generates turn-in IDs in bulk and matches them with the students in the roster.
Slacker finder: generates a list of the IDs of students who have not submitted a specified assignment.
Moss runner: builds and runs the Moss command for you, so you don’t need to compose and issue long commands every time.
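
The idea behind the slacker finder is a set difference between the roster and the students who submitted. A minimal Python sketch follows. The input shapes (a list of student IDs and a list of `(student_id, assignment)` pairs) are assumptions for this example, not the format Autograder actually stored.

```python
def find_slackers(roster, submissions, assignment):
    """Return the sorted IDs of rostered students with no submission for `assignment`.

    `submissions` is an iterable of (student_id, assignment_name) pairs.
    """
    submitted = {sid for sid, name in submissions if name == assignment}
    return sorted(set(roster) - submitted)
```

Sorting the result keeps the output stable between runs, which makes it easier to compare lists from one day to the next.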

Changes on the students’ side:

UI improvements that present more information in a more attractive way.

What to do in the future:

Students want Autograder to provide more detailed feedback, which suggests that they want it to be a tool for study, not just for grading. We should think about how to meet this demand in the future.
Make a hierarchy of test cases. If test case A fails and test case B depends on A, then skip B.
A user role system so that different users (TAs, students) get different permissions. There has been no strong demand for this so far.
A better workload management mechanism that encourages students to start early and reduces the load on the due date (done in Spring 2014).
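
The test case hierarchy idea could be sketched as follows. This Python example was never part of Autograder. It assumes test cases are listed so that each prerequisite appears before the cases that depend on it, and the tuple layout and status strings are hypothetical.

```python
def run_with_dependencies(cases):
    """Run test cases in order, skipping any whose prerequisite did not pass.

    `cases` is a list of (name, depends_on, test_fn) tuples, where `depends_on`
    is a list of case names and `test_fn()` returns True on success.
    """
    results = {}
    for name, depends_on, test_fn in cases:
        # A failed, skipped, or unknown prerequisite all cause a skip.
        if any(results.get(dep) != "pass" for dep in depends_on):
            results[name] = "skip"
            continue
        try:
            results[name] = "pass" if test_fn() else "fail"
        except Exception:
            # A test that raises counts as failed rather than aborting the run.
            results[name] = "fail"
    return results
```

Because a skipped case is itself not a pass, skips propagate down the chain: if A fails, B (which depends on A) is skipped, and so is any C that depends on B.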

purdue autograding