10 questions · need 7/10 to pass.
Q1. For "Your first local inference run", which detail or constraint from the module is accurate?
Q2. Which of these correctly identifies the role of "What happens when you call an LLM" in the broader system?
Q3. Which definition of "Prefill vs. decode: two very different phases" matches what the module established?
Q4. Which statement about "What happens when you call an LLM" is correct?
Q5. Which fact about "Tokenization at serve time" matches the mechanism the module covered?
Q6. When applying "From logits to tokens: sampling" in practice, which of these holds?
Q7. Which statement about "Where the FLOPs and the time actually go" is correct?
Q8. When applying "Logprobs at serve time" in practice, which of these holds?
Q9. For "Prefill vs. decode: two very different phases", which detail or constraint from the module is accurate?
Q10. Which of these claims about "Stop conditions and max tokens" is supported by the module?