Expected Values
Interview prep for expected values, linearity of expectation, conditional expectations, and variance
Fundamental Concepts
Definition of Expected Value
Discrete case: For a discrete random variable X with PMF $p(x)$: $E[X] = \sum_x x \, p(x)$
Continuous case: For a continuous random variable X with PDF $f(x)$: $E[X] = \int_{-\infty}^{\infty} x f(x) \, dx$
Intuition: The expected value is the "center of mass" or weighted average of all possible values, weighted by their probabilities.
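To make the definition concrete, here is a minimal Python sketch (the fair die and sample size are illustrative choices) that computes $E[X]$ directly from the PMF and checks it against a simulated sample mean:

```python
import random

# E[X] for a fair six-sided die, straight from the definition:
# E[X] = sum over x of x * p(x)
pmf = {x: 1 / 6 for x in range(1, 7)}
exact = sum(x * p for x, p in pmf.items())  # 3.5

# Sanity check: the sample mean should approach E[X] for large n
n = 100_000
estimate = sum(random.randint(1, 6) for _ in range(n)) / n

print(f"exact E[X] = {exact}, simulated = {estimate:.3f}")
```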
Expected Value of a Function
For a function $g(X)$:
Discrete case: $E[g(X)] = \sum_x g(x) \, p(x)$
Continuous case: $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$
Key insight: You don't need to find the distribution of $g(X)$ first; just apply $g$ to the values and weight by the original probabilities.
Properties of Expectation
- Constant: $E[c] = c$ for any constant $c$
- Scaling: $E[cX] = c \, E[X]$ for any constant $c$
- Addition: $E[X + Y] = E[X] + E[Y]$ (always true, even if X and Y are dependent)
- Non-negativity: If $X \ge 0$, then $E[X] \ge 0$
- Monotonicity: If $X \le Y$, then $E[X] \le E[Y]$
Essential Identities and Formulas
Linearity of Expectation
Key Formula: $E[X + Y] = E[X] + E[Y]$
General form: For any constants $a_1, \dots, a_n$ and random variables $X_1, \dots, X_n$: $E\left[\sum_{i=1}^{n} a_i X_i\right] = \sum_{i=1}^{n} a_i \, E[X_i]$
Critical insight: This holds regardless of whether the random variables are independent or dependent. This is one of the most powerful tools in probability.
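To see that linearity needs no independence, the sketch below (an illustrative example) takes $Y = X^2$, which is completely determined by $X$, and checks $E[X + Y] = E[X] + E[Y]$ exactly from the PMF:

```python
# Linearity of expectation with strongly dependent variables:
# X is a fair die roll and Y = X**2 is a deterministic function of X.
pmf = {x: 1 / 6 for x in range(1, 7)}

e_x = sum(x * p for x, p in pmf.items())              # E[X] = 3.5
e_y = sum(x**2 * p for x, p in pmf.items())           # E[Y] = E[X^2] = 91/6
e_sum = sum((x + x**2) * p for x, p in pmf.items())   # E[X + Y]

print(e_sum, e_x + e_y)  # both 18.666..., despite total dependence
```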
Law of the Unconscious Statistician (LOTUS)
For a function $g$, you can compute $E[g(X)]$ directly from the distribution of X without finding the distribution of $g(X)$:
Discrete: $E[g(X)] = \sum_x g(x) \, p(x)$
Continuous: $E[g(X)] = \int_{-\infty}^{\infty} g(x) f(x) \, dx$
Why it's useful: Saves the effort of deriving the distribution of transformed variables.
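A quick illustration of LOTUS via Monte Carlo (the choices $X \sim \text{Uniform}(0,1)$ and $g(x) = \sin x$ are arbitrary): we average $g$ over draws of $X$ without ever deriving the distribution of $g(X)$:

```python
import math
import random

# LOTUS: estimate E[g(X)] by averaging g over draws of X,
# never touching the distribution of g(X) itself.
# Here X ~ Uniform(0, 1), g(x) = sin(x); exact answer is 1 - cos(1).
n = 1_000_000
mc = sum(math.sin(random.random()) for _ in range(n)) / n
exact = 1 - math.cos(1)
print(f"Monte Carlo = {mc:.4f}, exact = {exact:.4f}")
```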
Expected Value for Independent Variables
If X and Y are independent: $E[XY] = E[X] \, E[Y]$
Warning: This does NOT hold if X and Y are dependent. Independence is required.
General form: For independent $X_1, \dots, X_n$: $E\left[\prod_{i=1}^{n} X_i\right] = \prod_{i=1}^{n} E[X_i]$
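The sketch below (illustrative, using fair dice) contrasts the independent case, where $E[XY] = E[X]E[Y]$, with the dependent case $Y = X$, where the product rule fails:

```python
# Product rule for expectations: holds under independence, fails without it.
pmf = {x: 1 / 6 for x in range(1, 7)}
e_x = sum(x * p for x, p in pmf.items())  # 3.5

# Independent X, Y (two separate dice): E[XY] = E[X] * E[Y]
e_xy_indep = sum(x * y * pmf[x] * pmf[y] for x in pmf for y in pmf)
print(e_xy_indep, e_x * e_x)  # 12.25, 12.25

# Dependent case Y = X: E[XY] = E[X^2] = 15.1666... != 12.25
e_xy_dep = sum(x * x * p for x, p in pmf.items())
print(e_xy_dep, e_x * e_x)
```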
Law of Iterated Expectations (Tower Property)
Intuition: The expectation of X equals the expectation of the conditional expectation of X given Y: $E[X] = E\big[E[X \mid Y]\big]$.
Discrete form: $E[X] = \sum_y E[X \mid Y = y] \, P(Y = y)$
Continuous form: $E[X] = \int E[X \mid Y = y] \, f_Y(y) \, dy$
When to use: When it's easier to compute conditional expectations than the overall expectation.
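As a worked example (the coin probabilities are made up for illustration): pick a biased coin with probability 0.3 and a fair coin otherwise, then flip it once. The tower property gives $E[X]$ by weighting the conditional means:

```python
import random

# Tower property: E[X] = E[E[X | coin]] for a two-stage experiment.
p_fair, p_biased, prob_pick_biased = 0.5, 0.9, 0.3

# Analytic: weight each conditional mean by the chance of that coin
e_x = p_biased * prob_pick_biased + p_fair * (1 - prob_pick_biased)  # 0.62

# Simulation check
n, heads = 200_000, 0
for _ in range(n):
    p = p_biased if random.random() < prob_pick_biased else p_fair
    heads += random.random() < p
print(e_x, heads / n)  # 0.62, approximately 0.62
```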
Wald's Equation
If $X_1, X_2, \dots$ are i.i.d. with $E[X_i] = \mu < \infty$, and N is a non-negative integer random variable independent of the $X_i$'s: $E\left[\sum_{i=1}^{N} X_i\right] = E[N] \, E[X_1]$
Intuition: Expected sum over a random number of terms equals the expected number of terms times the expected value per term.
Critical requirement: N must be independent of the $X_i$'s (or at least, the stopping rule must not "peek ahead").
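A simulation sketch of Wald's equation (the distributions of N and the $X_i$ are arbitrary illustrative choices, with N drawn independently of the terms):

```python
import random

# Wald's equation: E[X_1 + ... + X_N] = E[N] * E[X_1] when N is
# independent of the i.i.d. X_i.
# Here N is a fair die roll (E[N] = 3.5) and X_i ~ Exponential(rate 0.5),
# so E[X_1] = 2.0 and the predicted mean of the random sum is 7.0.
trials, total = 100_000, 0.0
for _ in range(trials):
    n_terms = random.randint(1, 6)  # N, drawn before (independent of) the X_i
    total += sum(random.expovariate(0.5) for _ in range(n_terms))

print(total / trials)  # approximately 7.0
```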
Variance Identities
Definition: $\text{Var}(X) = E\big[(X - E[X])^2\big] = E[X^2] - (E[X])^2$
Properties:
- $\text{Var}(aX + b) = a^2 \, \text{Var}(X)$ (constants shift but scaling squares the variance)
- For independent X and Y: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y)$
- General: $\text{Var}(X + Y) = \text{Var}(X) + \text{Var}(Y) + 2 \, \text{Cov}(X, Y)$
Law of Total Variance: $\text{Var}(X) = E[\text{Var}(X \mid Y)] + \text{Var}(E[X \mid Y])$
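The law of total variance can be checked numerically; the sketch below uses an illustrative two-component Gaussian mixture where both sides of the identity are easy to compute by hand:

```python
import random

# Law of total variance on a mixture: Y ~ Bernoulli(1/2),
# X | Y=0 ~ N(0, 1) and X | Y=1 ~ N(3, 4).
# E[Var(X|Y)] = (1 + 4)/2 = 2.5 and Var(E[X|Y]) = (1/4) * 3**2 = 2.25,
# so Var(X) should be 4.75.
n = 300_000
xs = []
for _ in range(n):
    if random.random() < 0.5:
        xs.append(random.gauss(0.0, 1.0))
    else:
        xs.append(random.gauss(3.0, 2.0))  # std dev 2 -> variance 4

m = sum(xs) / n
var = sum((x - m) ** 2 for x in xs) / n
print(var)  # approximately 4.75
```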
Advanced Expectation Tricks
Trick 1: Indicator Functions
Express a random variable as a sum of indicators: $X = \sum_{i=1}^{n} I_i$, where $I_i = 1$ if event $A_i$ occurs and 0 otherwise.
Then by linearity: $E[X] = \sum_{i=1}^{n} E[I_i] = \sum_{i=1}^{n} P(A_i)$
Example: Expected number of fixed points in a random permutation $= \sum_{i=1}^{n} P(\text{position } i \text{ is fixed}) = n \cdot \frac{1}{n} = 1$
Why powerful: Avoids computing the full distribution of X.
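A direct check of the fixed-point example (n = 20 is an arbitrary choice; the answer is 1 for every n):

```python
import random

# Expected number of fixed points of a uniform random permutation.
# By linearity over indicators I_i = 1{perm[i] == i}, E = n * (1/n) = 1.
n, trials = 20, 100_000
total_fixed = 0
for _ in range(trials):
    perm = list(range(n))
    random.shuffle(perm)
    total_fixed += sum(perm[i] == i for i in range(n))

print(total_fixed / trials)  # approximately 1.0, regardless of n
```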
Trick 2: Symmetry
If random variables have identical distributions (by symmetry), they have the same expectation.
Example: In a random arrangement, each position has the same expected value for any specific item.
Use: Simplify calculations by identifying symmetric components.
Trick 3: Tail Sum Formula
For a non-negative discrete random variable X taking values in $\{0, 1, 2, \dots\}$: $E[X] = \sum_{k=1}^{\infty} P(X \ge k)$
Continuous version: For $X \ge 0$: $E[X] = \int_0^{\infty} P(X > x) \, dx$
When to use: When computing tail probabilities is easier than computing the full PMF/PDF.
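A quick numerical check of the tail sum formula on a geometric(p) waiting time (p = 0.3 is arbitrary), where $P(X \ge k) = (1-p)^{k-1}$ and the formula should recover $E[X] = 1/p$:

```python
# Tail-sum formula for X ~ Geometric(p) counted in trials:
# E[X] = sum_{k>=1} P(X >= k) = sum_{k>=1} (1-p)**(k-1) = 1/p.
p = 0.3
tail_sum = sum((1 - p) ** (k - 1) for k in range(1, 500))  # truncated series
print(tail_sum, 1 / p)  # both approximately 3.3333
```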
Trick 4: Conditioning on First Step
For sequential/recursive processes: $E[X] = \sum_y E[X \mid \text{first step} = y] \, P(\text{first step} = y)$
Example: Random walks, gambler's ruin, expected time to absorption.
Key: Set up a recursive equation and solve.
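A classic instance, sketched below: the expected number of flips until the first heads. Conditioning on the first flip gives $E = 1 + (1-p)E$, so $E = 1/p$ (p = 0.25 is an illustrative choice):

```python
import random

# First-step conditioning: E = 1 + (1 - p) * E  =>  E = 1/p.
# (One flip always happens; with probability 1-p we are back where we started.)
p = 0.25
analytic = 1 / p  # 4.0

trials, total = 100_000, 0
for _ in range(trials):
    flips = 1
    while random.random() >= p:  # keep flipping until the first heads
        flips += 1
    total += flips
print(analytic, total / trials)  # 4.0, approximately 4.0
```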
Trick 5: Memoryless Property
For exponential random variables (continuous) or geometric random variables (discrete): $P(X > s + t \mid X > s) = P(X > t)$
Intuition: "Forgetting" the past—the remaining time has the same distribution as the original.
Use: Simplifies problems involving waiting times.
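For the exponential case, memorylessness is an algebraic identity of the survival function $P(X > x) = e^{-\lambda x}$; the sketch below (with arbitrary rate and offsets) just evaluates both sides:

```python
import math

# Memorylessness of Exponential(rate): P(X > s+t | X > s) = P(X > t).
rate, s, t = 1.5, 2.0, 0.7  # illustrative values
lhs = math.exp(-rate * (s + t)) / math.exp(-rate * s)  # conditional survival
rhs = math.exp(-rate * t)
print(lhs, rhs)  # identical up to floating-point error
```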
Trick 6: Renewal-Reward Theorem
For a renewal process with rewards: $\text{long-run average reward} = \dfrac{E[\text{reward per cycle}]}{E[\text{cycle length}]}$
When to use: Long-run average problems, queuing theory, inspection paradox.
Trick 7: Indicator Product for Joint Events
For covariance: $\text{Cov}(I_A, I_B) = E[I_A I_B] - E[I_A] \, E[I_B] = P(A \cap B) - P(A) \, P(B)$
Trick 8: Optimal Stopping
For sums of i.i.d. uniform(0,1) random variables, the expected number of terms needed until the sum exceeds 1 is: $E[N] = e$
where $N = \min\{n : U_1 + \cdots + U_n > 1\}$ and $U_i \sim \text{Uniform}(0, 1)$
General principle: Many optimal stopping problems have elegant closed-form solutions.
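This result is easy to check by simulation; the sketch below estimates $E[N]$ and compares it with $e$:

```python
import math
import random

# Estimate E[N], where N = min{n : U_1 + ... + U_n > 1}, U_i ~ Uniform(0,1).
trials, total = 200_000, 0
for _ in range(trials):
    s, count = 0.0, 0
    while s <= 1.0:
        s += random.random()
        count += 1
    total += count

print(total / trials, math.e)  # approximately 2.718, 2.71828...
```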
Trick 9: Geometric Series in Expectations
For geometric random variables or processes involving success probabilities: $\sum_{k=0}^{\infty} q^k = \frac{1}{1 - q}$ and $\sum_{k=1}^{\infty} k \, q^{k-1} = \frac{1}{(1 - q)^2}$ for $|q| < 1$
Use: Computing expectations of geometric-type distributions.
Trick 10: Exchange of Sum and Expectation
When appropriate (Fubini's theorem): $E\left[\sum_{i=1}^{\infty} X_i\right] = \sum_{i=1}^{\infty} E[X_i]$
Condition: Usually requires $\sum_i E[|X_i|] < \infty$ or all $X_i \ge 0$.
Common Distribution Expectations
Discrete Distributions
| Distribution | Notation | $E[X]$ | $\text{Var}(X)$ |
|---|---|---|---|
| Bernoulli | $\text{Bernoulli}(p)$ | $p$ | $p(1-p)$ |
| Binomial | $\text{Binomial}(n, p)$ | $np$ | $np(1-p)$ |
| Geometric (trials) | $\text{Geometric}(p)$ | $1/p$ | $(1-p)/p^2$ |
| Negative Binomial | $\text{NegBin}(r, p)$ | $r/p$ | $r(1-p)/p^2$ |
| Poisson | $\text{Poisson}(\lambda)$ | $\lambda$ | $\lambda$ |
Continuous Distributions
| Distribution | Notation | $E[X]$ | $\text{Var}(X)$ |
|---|---|---|---|
| Uniform | $\text{Uniform}(a, b)$ | $(a+b)/2$ | $(b-a)^2/12$ |
| Exponential | $\text{Exponential}(\lambda)$ | $1/\lambda$ | $1/\lambda^2$ |
| Normal | $N(\mu, \sigma^2)$ | $\mu$ | $\sigma^2$ |
| Gamma | $\text{Gamma}(\alpha, \lambda)$ | $\alpha/\lambda$ | $\alpha/\lambda^2$ |
| Beta | $\text{Beta}(\alpha, \beta)$ | $\frac{\alpha}{\alpha+\beta}$ | $\frac{\alpha\beta}{(\alpha+\beta)^2(\alpha+\beta+1)}$ |
Computing Expectations: Strategies
Strategy 1: Direct Computation
Use the definition directly:
Discrete: Sum over all possible values weighted by probabilities
Continuous: Integrate over the support
When to use: Simple distributions, finite support, tractable integrals.
Strategy 2: Use Linearity
Break the random variable into simpler components: $X = X_1 + X_2 + \cdots + X_n$
Then: $E[X] = E[X_1] + E[X_2] + \cdots + E[X_n]$
When to use: Sums, counting problems, indicator functions.
Strategy 3: Conditioning (Law of Total Expectation)
Break into cases: $E[X] = \sum_i E[X \mid A_i] \, P(A_i)$
or $E[X] = E\big[E[X \mid Y]\big]$
When to use: Natural partition exists, conditional expectations are simpler.
Strategy 4: Symmetry Arguments
If $X_1, \dots, X_n$ are identically distributed (symmetric): $E[X_i] = E[X_1]$ for all $i$, so $E\left[\sum_{i=1}^{n} X_i\right] = n \, E[X_1]$
When to use: Random permutations, random selections, exchangeable variables.
Strategy 5: Tail Sum Formula
For non-negative random variables: $E[X] = \sum_{k=1}^{\infty} P(X \ge k)$ (discrete) or $E[X] = \int_0^{\infty} P(X > x) \, dx$ (continuous)
When to use: Survival/tail probabilities easier to compute than full distribution.
Strategy 6: Recursive Equations
Set up a recursive formula for $E[X]$ and solve, e.g., $E[X] = \sum_y E[X \mid \text{first step} = y] \, P(\text{first step} = y)$
When to use: Markov chains, random walks, first-passage times.
Interview Problem Types
Type 1: Indicator Function Problems
| Given | Find | Approach |
|---|---|---|
| Counting problem (e.g., fixed points, matches) | Expected count | Express as sum of indicators, use linearity: $E[X] = \sum_i P(A_i)$ |
Type 2: Linearity of Expectation Applications
| Given | Find | Approach |
|---|---|---|
| Sum of random variables (possibly dependent) | Expected value of sum | Use $E[X + Y] = E[X] + E[Y]$ regardless of dependence |
Type 3: Conditional Expectation Problems
| Given | Find | Approach |
|---|---|---|
| Problem with natural stages or conditions | Expected value | Use $E[X] = E\big[E[X \mid Y]\big]$ or condition on the first step |
Type 4: Random Sums (Wald's Equation)
| Given | Find | Approach |
|---|---|---|
| Sum of random number of i.i.d. terms | Expected value | Use $E\left[\sum_{i=1}^{N} X_i\right] = E[N] \, E[X_1]$ if N is independent of the $X_i$'s |
Type 5: Geometric Distribution Problems
| Given | Find | Approach |
|---|---|---|
| "First success" or waiting time problems | Expected waiting time | Use memoryless property, geometric distribution: |
Type 6: Continuous Integration Problems
| Given | Find | Approach |
|---|---|---|
| Continuous PDF, possibly on restricted region | $E[X]$ or $E[g(X)]$ | Integrate $x f(x)$ or $g(x) f(x)$ over the support, using LOTUS if needed |
Type 7: Tail Sum Applications
| Given | Find | Approach |
|---|---|---|
| Non-negative random variable with known tail probabilities | Expected value | Use $E[X] = \sum_{k=1}^{\infty} P(X \ge k)$ or $E[X] = \int_0^{\infty} P(X > x) \, dx$ |
Type 8: Recursive Expected Value Problems
| Given | Find | Approach |
|---|---|---|
| Process with recursive/Markov structure | Expected time, cost, or value | Set up an equation for $E[X]$, condition on the first step, solve |
Type 9: Symmetry-Based Problems
| Given | Find | Approach |
|---|---|---|
| Random arrangements, selections with symmetry | Expected value involving positions/selections | Use symmetry to argue $E[X_i] = E[X_1]$ for all $i$, simplify the calculation |
Type 10: Moment Generating Functions
| Given | Find | Approach |
|---|---|---|
| MGF $M_X(t)$ | Moments | Use $E[X^n] = M_X^{(n)}(0)$ (nth derivative at 0) |
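For symbolic practice, a sketch using sympy (an assumed dependency) that recovers the first two moments of an Exponential($\lambda$) from its MGF $M_X(t) = \lambda/(\lambda - t)$:

```python
import sympy as sp

# Moments from an MGF: E[X^n] is the nth derivative of M_X(t) at t = 0.
t, lam = sp.symbols('t lambda', positive=True)
M = lam / (lam - t)  # MGF of Exponential(rate lambda), valid for t < lambda

m1 = sp.simplify(sp.diff(M, t, 1).subs(t, 0))  # E[X]   = 1/lambda
m2 = sp.simplify(sp.diff(M, t, 2).subs(t, 0))  # E[X^2] = 2/lambda**2
print(m1, m2, sp.simplify(m2 - m1**2))         # variance = 1/lambda**2
```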
Common Pitfalls
Pitfall 1: Confusing $E[XY]$ with $E[X] \, E[Y]$
Wrong: Assuming $E[XY] = E[X] \, E[Y]$ when X and Y are dependent
Check: This only holds when X and Y are independent
Pitfall 2: Applying Linearity to Products
Wrong: $E[XY] = E[X] \, E[Y]$ without independence
Correct: Linearity applies to sums: $E[X + Y] = E[X] + E[Y]$ always
Pitfall 3: Forgetting Absolute Convergence
Wrong: Exchanging infinite sum and expectation without justification
Check: Ensure $\sum_i E[|X_i|] < \infty$ or all $X_i \ge 0$
Pitfall 4: Misapplying Wald's Equation
Wrong: Using $E\left[\sum_{i=1}^{N} X_i\right] = E[N] \, E[X_1]$ when N depends on the $X_i$'s
Check: N must be independent of the $X_i$'s (or use the optional stopping theorem)
Pitfall 5: Incorrect Domain of Integration
Wrong: Integrating over wrong limits for restricted support
Check: What values can X actually take? Ensure integration limits match support
Pitfall 6: Forgetting to Condition Properly
Wrong: Computing $E[X]$ directly when conditioning would simplify
Check: Is there a natural partition or first step to condition on?
Additional Helpful Tricks
The Complement for Indicators
For an indicator $I_A$: $E[1 - I_A] = 1 - P(A) = P(A^c)$
Example: Expected number of non-matches $= n - E[\text{number of matches}]$
First-Step Analysis
For Markov processes, condition on the first transition: $h_i = 1 + \sum_j p_{ij} \, h_j$
where $h_i$ is the expected time starting from state i.
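First-step analysis turns into a linear system; the sketch below (assuming numpy is available) solves it for a symmetric random walk on $\{0, \dots, N\}$ absorbed at both ends, where the known answer is $h_i = i(N - i)$:

```python
import numpy as np

# h_i = 1 + (h_{i-1} + h_{i+1}) / 2 for interior states, h_0 = h_N = 0.
# Rearranged: h_i - 0.5*h_{i-1} - 0.5*h_{i+1} = 1, a tridiagonal system.
N = 10
A = np.zeros((N - 1, N - 1))
b = np.ones(N - 1)
for k in range(N - 1):  # row k corresponds to state i = k + 1
    A[k, k] = 1.0
    if k > 0:
        A[k, k - 1] = -0.5
    if k < N - 2:
        A[k, k + 1] = -0.5

h = np.linalg.solve(A, b)
print(h)                                   # expected absorption times
print([i * (N - i) for i in range(1, N)])  # closed form i*(N-i), matches
```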
Splitting Expectations
For non-negative X: $E[X] = E\big[X \cdot I_{\{X \le c\}}\big] + E\big[X \cdot I_{\{X > c\}}\big]$
Use: When X behaves differently on different regions.
Coupling and Comparison
If you can show $X \le Y$ for all outcomes: $E[X] \le E[Y]$
Use: Bounds on expectations, proof by comparison.
Convexity and Jensen's Inequality
If $g$ is convex: $E[g(X)] \ge g(E[X])$
If $g$ is concave: $E[g(X)] \le g(E[X])$
Examples:
- $E[X^2] \ge (E[X])^2$ (since $x^2$ is convex)
- $E[\log X] \le \log E[X]$ (since $\log x$ is concave)
Common Mistakes to Avoid
The "Average of Averages" Fallacy
Wrong: Taking an unweighted average of the conditional averages, $\frac{1}{k} \sum_y E[X \mid Y = y]$, and calling it $E[X]$
Correct: The law of iterated expectations $E[X] = E\big[E[X \mid Y]\big] = \sum_y E[X \mid Y = y] \, P(Y = y)$ always holds, but each conditional average must be weighted by its probability
Assuming E[1/X] = 1/E[X]
Wrong: $E[1/X] = 1/E[X]$
Correct: By Jensen's inequality ($1/x$ is convex for $x > 0$): $E[1/X] \ge 1/E[X]$
Infinite Expectation Traps
Wrong: Assuming all random variables have finite expectation
Example: Cauchy distribution, St. Petersburg paradox
Check: Verify $E[|X|] < \infty$ before using expectation formulas
Quick Reference: Key Formulas
Linearity: $E[aX + bY] = a \, E[X] + b \, E[Y]$
Independence: $E[XY] = E[X] \, E[Y]$
Law of Total Expectation: $E[X] = E\big[E[X \mid Y]\big]$
Wald's Equation: $E\left[\sum_{i=1}^{N} X_i\right] = E[N] \, E[X_1]$
Variance Formula: $\text{Var}(X) = E[X^2] - (E[X])^2$
Tail Sum: $E[X] = \sum_{k=1}^{\infty} P(X \ge k)$ for non-negative integer X
LOTUS: $E[g(X)] = \sum_x g(x) \, p(x)$ or $\int g(x) f(x) \, dx$
Practice Problem Categories
- Coupon collector
- Birthday problem
- Random walks
- Gambler's ruin
- Geometric waiting times
- Poisson processes
- Order statistics
- Secretary problem
- Optimal stopping
- Matching/derangement problems
- Inspection paradox
- Two-envelope problem
- St. Petersburg paradox
- Renewal processes