I would expect all values to be zero, but I assume this is just how the algorithm works.
Is there any non-code documentation on how the algorithm works?

The model outputs a probability distribution over the answers. That is why you get [0.3, 0.3, 0.3] for the last example: the model is equally unsure about all three labels. You can pick the correct answer by setting a threshold on the maximum probability score.
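
To make the thresholding idea concrete, here is a minimal sketch in plain Python. It is not the library's code: the function name `pick_label`, the label names, and the strict `>` comparison are assumptions chosen for illustration.

```python
def pick_label(probs, labels, threshold=0.5):
    """Return the most probable label, or None if the model is too unsure.

    Hypothetical helper for illustration; not part of any library.
    """
    # Index of the highest probability (the first one wins on ties).
    best = max(range(len(probs)), key=lambda i: probs[i])
    # Accept the answer only if its probability is strictly above the threshold.
    if probs[best] > threshold:
        return labels[best]
    return None


labels = ["label_a", "label_b", "label_c"]  # assumed label names
print(pick_label([0.3, 0.3, 0.3], labels))  # no label clears 0.5
print(pick_label([0.1, 0.8, 0.1], labels))  # one label clearly wins
```

With the flat distribution nothing clears the threshold, so the helper returns no answer rather than guessing.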

You can use only one of the three available options [confident_threshold, max_proba, top_n], which are checked in that order of priority. When you set confident_threshold=0.5, you filter out all candidates with a probability less than or equal to 0.5, which in your case is all of them.
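
The priority between the three options can be sketched roughly as follows. This is a plain-Python illustration under assumptions, not the library's actual implementation: the function `select_labels` and its signature are hypothetical, and only the option names and their order come from the description above.

```python
def select_labels(probs, labels, confident_threshold=None, max_proba=False, top_n=None):
    """Pick labels using the first option that is set, in priority order.

    Hypothetical sketch: confident_threshold, then max_proba, then top_n.
    """
    if confident_threshold is not None:
        # Keep only candidates strictly above the threshold; may return nothing.
        return [lab for p, lab in zip(probs, labels) if p > confident_threshold]

    # Rank candidates from most to least probable (ties keep input order).
    ranked = sorted(zip(probs, labels), key=lambda pair: pair[0], reverse=True)
    if max_proba:
        return [ranked[0][1]]
    if top_n is not None:
        return [lab for _, lab in ranked[:top_n]]
    raise ValueError("set one of confident_threshold, max_proba, top_n")


labels = ["label_a", "label_b", "label_c"]  # assumed label names
# The threshold takes priority even though max_proba is also set,
# and with a flat distribution every candidate is filtered out.
print(select_labels([0.3, 0.3, 0.3], labels, confident_threshold=0.5, max_proba=True))
# Without a threshold, max_proba always returns exactly one label.
print(select_labels([0.3, 0.3, 0.3], labels, max_proba=True))
```

If you always want an answer, even for a flat distribution, leaving the threshold unset and relying on max_proba or top_n avoids the empty result.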