The primary goal of the compare_responses function is to compare two student responses and determine whether they are equal, so that the penalty isn't applied twice.
However, in the shortanswer question type, compare_responses internally calls compare_string_with_wildcard with the old state as the pattern, so any unescaped * in the previous response is treated as a wildcard! You can easily see this in the preview mode of a shortanswer question: type a wrong response with * somewhere in it and submit it, then type another wrong response in which the * is replaced with any characters and submit that too. You'll see the first response (the one with *) rather than the last one, because the two are treated as equal.
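To make the problem concrete, here is a minimal sketch in Python (not the actual Moodle code). The function wildcard_match is a hypothetical stand-in for compare_string_with_wildcard, written under the assumption that an unescaped * matches any run of characters and that \* means a literal star:

```python
import re

def wildcard_match(string, pattern, ignore_case=False):
    """Hypothetical stand-in for compare_string_with_wildcard.

    Assumption: an unescaped '*' in the pattern matches any run of
    characters, and a backslash-escaped star is a literal '*'.
    """
    # Split the pattern on every '*' that is not preceded by a backslash.
    parts = re.split(r'(?<!\\)\*', pattern)
    # Turn escaped stars back into literal stars, escape everything else
    # for the regex engine, and join the pieces with '.*'.
    regex = '.*'.join(re.escape(p.replace('\\*', '*')) for p in parts)
    flags = re.DOTALL | (re.IGNORECASE if ignore_case else 0)
    return re.fullmatch(regex, string, flags) is not None

previous_response = "fo*"     # first wrong response, contains a star
current_response = "foobar"   # second wrong response, a different string

# The previous response is used as the pattern, so the two "match".
print(wildcard_match(current_response, previous_response))
# A plain comparison correctly reports that they differ.
print(current_response == previous_response)
```

With the previous response used as the pattern, the second response is reported as matching even though the two strings are different.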
Also, the numerical question type, which inherits from shortanswer, doesn't override the compare_responses function, so it ends up calling compare_string_with_wildcard too.
I think that compare_responses should not use compare_string_with_wildcard; it should use plain string comparison functions (perhaps respecting the question's case-sensitivity setting). Also, test_responses currently calls compare_responses (which is strange, and costs two clones of the state). It should stop doing this once the behaviour of compare_responses changes; it can use test_responses or call compare_string_with_wildcard directly instead.
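As a rough illustration of the proposed behaviour, a plain comparison that optionally honours case sensitivity could look like the Python sketch below. The name responses_equal and the case_sensitive parameter are hypothetical and only stand in for whatever the question's case-sensitivity option provides; this is not the real compare_responses signature:

```python
def responses_equal(previous, current, case_sensitive=True):
    """Plain string comparison of two responses, with no wildcard handling.

    Hypothetical helper: 'case_sensitive' stands in for the question's
    case-sensitivity setting.
    """
    if not case_sensitive:
        # casefold() gives a case-insensitive comparison for Unicode text.
        return previous.casefold() == current.casefold()
    return previous == current

# A star in the previous response is now just an ordinary character.
print(responses_equal("fo*", "foobar"))
# Case is ignored only when the question is not case-sensitive.
print(responses_equal("Paris", "paris", case_sensitive=False))
```

With a comparison like this, a response containing * is only equal to another response that contains the same * in the same place.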
Tim, if you are busy, I can fix it myself and send a patch. This is a necessary cleanup before developing this question type (and a bug fix too). The relations between test_response, check_response and compare_responses are quite strange now.