9 Mar 2007, 3:47 a.m.
A few other notes to consider: Turnitin does not store the actual paper. They store a hash of the paper, weakening the argument that IP is being violated.
[A *hash*? Really, come on. A whole-document hash, certainly not, given their output and use: a single hash could only flag exact copies. A hash of paragraphs or sentences? Maybe, but that gets us closer to being able to reconstruct the actual text, or at least to assess similarity.] If I were building an online plagiarism detection service using very well understood information retrieval methods (term-document matrices, document vectors, and the like), I would find it fairly difficult NOT to store the student's work in a reconstitutable form. --elijah
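
To make the point concrete, here is a rough Python sketch of the two representations mentioned above: per-sentence hashes and a bag-of-words term vector. It is not a description of how Turnitin works. The function names (`sentence_fingerprints`, `term_vector`, `overlap`) and the sample texts are made up for illustration, and a real service would presumably use something more robust than exact sentence matching.

```python
import hashlib
import re
from collections import Counter


def sentence_fingerprints(text):
    """Hash each lowercased sentence (hypothetical helper, naive splitting)."""
    sentences = [s.strip().lower() for s in re.split(r"[.!?]+", text) if s.strip()]
    return {hashlib.sha256(s.encode("utf-8")).hexdigest() for s in sentences}


def term_vector(text):
    """Bag-of-words counts: word order is lost, vocabulary and frequencies are kept."""
    return Counter(re.findall(r"[a-z']+", text.lower()))


def overlap(a, b):
    """Fraction of the hashes in a that also appear in b."""
    return len(a & b) / len(a) if a else 0.0


# Illustrative texts only; nothing here comes from a real submission.
stored = "The quick brown fox jumps. It was a dark and stormy night."
submitted = "It was a dark and stormy night. Nobody saw the fox."

stored_fp = sentence_fingerprints(stored)
submitted_fp = sentence_fingerprints(submitted)

# Sentence hashes alone are enough to measure similarity between papers.
shared = overlap(submitted_fp, stored_fp)

# The term vector still exposes which words the stored paper used, and how often.
vec = term_vector(stored)
```

Even in this toy version, the sentence hashes confirm whether any guessed sentence appears in the stored paper, and the term vector gives back its full vocabulary with counts. Neither is the original text, but both keep a good deal of it.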