Self-consistency also builds on CoT, aiming to replace the “naive” greedy decoding used in standard CoT prompting.
The intuition is really simple: a model can generate several plausible reasoning paths for a math question that all arrive at the same correct answer; it can also produce incorrect reasoning paths, but those are much less likely to converge on the same answer.
The procedure:
- Prompt the model with a set of manually written chain-of-thought exemplars
- Sample a diverse set of candidate outputs from the LLM’s decoder (temperature > 0)
- Aggregate the final answers and choose the most consistent one, e.g. by majority vote
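The aggregation step above can be sketched as a majority vote over the final answers parsed from each sampled reasoning path. Everything here is illustrative: the `extract_answer` heuristic assumes each chain ends with "The answer is ...", and the sample strings stand in for real decoder samples.

```python
from collections import Counter

def extract_answer(response: str) -> str:
    # Assumption: each sampled chain ends with "The answer is <x>."
    return response.rsplit("The answer is", 1)[-1].strip(" .")

def self_consistency(responses: list[str]) -> str:
    # Majority vote over the final answers of all sampled reasoning paths.
    answers = [extract_answer(r) for r in responses]
    return Counter(answers).most_common(1)[0][0]

# Hypothetical decoder samples (temperature > 0) for one math question.
samples = [
    "She has 3 + 4 = 7 apples, eats 2, so 5 remain. The answer is 5.",
    "3 + 4 = 7; 7 - 2 = 5. The answer is 5.",
    "First 3 + 4 = 7, then 7 - 2 = 5. The answer is 5.",
    "She ends with 3 + 4 - 1 = 6 apples. The answer is 6.",  # faulty path
]

print(self_consistency(samples))  # → 5
```

Note that the vote is over the extracted *answers* only, so two chains with different intermediate reasoning still reinforce each other as long as they agree on the final result.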