Every stop-motion lipsync I've done has been on 1's at 24-30 frames per second. It gives me greater control over the x-sheet and actually syncing the sounds to the action. However, most of the animation appears to have more movement than there should be.
It's said that you should minimize the number of movements the mouth makes in a lipsync (since most people run their words together), but lowering the frame rate by animating on 2's or 3's runs the risk of missing sharp specifics in the audio cues. Also, stop-motion itself tends to resist lowered frame rates and look jerky.
You've summed things up pretty well. Generally, less is more with lip synch, but shooting on 1's gives you better control of the timing. That's not a contradiction; it's a cue to use a lot of short holds on your mouth poses. Richard Williams talks about hitting the mouth pose without inbetweens and then softening it until the next sharp specific. If you have fine control of your puppet, that's ideal. Less exaggeration looks more natural. More exaggeration looks cartoony.
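If it helps to see the "short holds on ones" idea laid out, here is a rough sketch of an x-sheet where each mouth pose is hit on its accent frame and then simply held until the next accent. The function name `build_xsheet`, the shape labels, and the frame numbers are all made up for illustration, and the softening between accents is not modelled; that part is still done by hand on the puppet.

```python
def build_xsheet(accents, total_frames, rest="rest"):
    """Return one mouth shape per frame (shooting on ones).

    accents maps a 0-based frame number to the mouth shape that should
    be hit on that frame. Each shape is held until the next accent.
    """
    sheet = []
    current = rest  # hypothetical neutral pose before the first accent
    for frame in range(total_frames):
        current = accents.get(frame, current)
        sheet.append(current)
    return sheet


# Made-up accents read off an imaginary voice track.
accents = {2: "M", 5: "AH", 9: "OO"}
sheet = build_xsheet(accents, 12)
for frame, shape in enumerate(sheet):
    print(frame, shape)
```

Every frame is still exposed individually, so an accent can land on any frame of the track, but the mouth only changes pose three times in twelve frames.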
Personally I prefer to shoot on ones for lipsynch, especially for exaggerated, cartoony mouth shapes: the difference between one shape and the next is so great that even on ones it can pop. You can't add an extra in-between frame to ease it, because you have to go with the timing of your voice track. If it goes from one extreme shape to another in two frames (which it can), you really benefit from having that one frame in between with a halfway position.
My puppets have mouths with wire in them for limited movement, rather than replacement mouths or clay with extreme Ooh and Aah shapes, so they don't pop as much. But even there I feel like I want all the in-betweens I can get. I normally shoot on ones at 24 (film) or 25 (PAL video) fps.
Actually, to be honest, I don't care much for shooting on twos at 24 or 25 fps, with or without lipsynch. I can see the steppiness at 12 or 12.5 fps. But 15 fps seems to be the magic number for me, so shooting on twos for NTSC 30 fps comes out OK.
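For anyone who wants the arithmetic behind those numbers, here is a small sketch. The function name `effective_pose_rate` is just something invented for this example; it divides the playback rate by the number of frames each pose is held for.

```python
from fractions import Fraction


def effective_pose_rate(playback_fps, frames_per_pose):
    """Distinct poses per second when each pose is held for frames_per_pose frames."""
    return Fraction(playback_fps) / frames_per_pose


for fps in (24, 25, 30):
    for hold in (1, 2, 3):
        rate = effective_pose_rate(fps, hold)
        print(f"{fps} fps, {hold} frame(s) per pose: {float(rate):g} poses per second")
```

On twos, 24 fps gives 12 poses per second, 25 fps gives 12.5, and 30 fps gives 15, which is where the figures above come from.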
You can shoot slower dialog on twos, and also drag the audio track forward by a couple of frames to ensure that the lips do not fall behind the timing of the sounds they make.
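If your editor wants the audio offset as a time rather than a frame count, the conversion is simple. This is only a sketch: `frames_to_ms` is a made-up helper name, and which direction to slide the track depends on how your software lays out its timeline.

```python
def frames_to_ms(frames, fps):
    """Convert a frame count to milliseconds at the given frame rate."""
    return frames * 1000.0 / fps


for fps in (24, 25, 30):
    print(f"2 frames at {fps} fps = {frames_to_ms(2, fps):.1f} ms")
```

So "a couple of frames" works out to somewhere between roughly 67 and 83 ms, depending on the frame rate.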