229. Majority Element II
On LeetCode ->Reformulated question¶
Find all values that appear more than \(\lfloor n/3 \rfloor\) times in an array.
- There can be at most 2 such values.
- Aim for \(O(n)\) time and \(O(1)\) extra space.
Example:
Key trick¶
Use the Boyer-Moore voting idea with 2 candidates.
- More than \(n/3\) means at most 2 winners can exist.
- First pass:
- keep 2 candidates and 2 counters
- cancel triples of distinct values
- Second pass:
- verify the candidates really occur more than \(\lfloor n/3 \rfloor\)
Trap¶
- Forgetting the verification pass.
- The first pass gives only possible candidates.
- Using
elifin the wrong order and breaking candidate replacement logic. - Forgetting that answers can have size 0, 1, or 2.
- Returning duplicates when both candidate slots end up equal.
Why is this question interesting?¶
It looks like counting, but the follow-up forces a stronger invariant.
- It tests whether you can generalize majority voting from \(n/2\) to \(n/3\).
- It is a good example of "few possible winners, so track only a few candidates".
Solve the problem with idiomatic python¶
class Solution:
def majorityElement(self, nums: list[int]) -> list[int]:
# At most two values can appear more than n//3 times.
cand1 = cand2 = None
count1 = count2 = 0
# First pass: find possible candidates.
for x in nums:
if x == cand1:
count1 += 1
elif x == cand2:
count2 += 1
elif count1 == 0:
cand1, count1 = x, 1
elif count2 == 0:
cand2, count2 = x, 1
else:
count1 -= 1
count2 -= 1
# Second pass: verify.
count1 = count2 = 0
for x in nums:
if x == cand1:
count1 += 1
elif x == cand2:
count2 += 1
ans = []
limit = len(nums) // 3
if count1 > limit:
ans.append(cand1)
if count2 > limit:
ans.append(cand2)
return ans
Pytest test¶
import pytest
@pytest.mark.parametrize(
("nums", "expected"),
[
([3, 2, 3], [3]),
([1], [1]),
([1, 2], [1, 2]),
([1, 1, 1, 3, 3, 2, 2, 2], [1, 2]),
([2, 2], [2]),
([1, 2, 3, 4], []),
([0, 0, 0], [0]),
([4, 4, 4, 4, 2, 2, 2, 3, 3], [4]),
([-1, -1, -1, 0, 1, 2, 3], [-1]),
],
)
def test_majority_element(nums, expected):
assert sorted(Solution().majorityElement(nums)) == sorted(expected)
Comment my solution¶
Your solution is correct for many cases, but it does not meet the follow-up space bound.
- It uses a hash map, so space is \(O(m)\) where \(m\) is the number of distinct values.
- This problem’s trick is that you never need to track all counts, only 2 candidates.
- Your early append logic is fine for the map approach, but the interview-optimal answer is Boyer-Moore plus verification.
# pass test but it's not O(1) in space
class Solution:
def majorityElement(self, nums: list[int]) -> list[int]:
k = len(nums) // 3
counts = {}
ans = []
for num in nums:
if num in counts:
if counts[num] <= k:
counts[num] += 1
if counts[num] > k:
ans.append(num)
else:
counts[num] = 1
if k == 0:
ans.append(num)
return ans
Extra¶
In my mind I was confusing time linearity and 1 pass. I also didn't understand why the first loop picks the right candidates. So I asked AI for helped.
One passe and time linearity¶
Yes - 2 passes is still linear time.
- If one pass costs about
n, and another pass costs aboutn, total is:
- In Big-O, constant factors are ignored, so:
- More generally:
- 3 full passes over the array is still
O(n) 100nis stillO(n)n + 5is stillO(n)
- 3 full passes over the array is still
It stops being linear when the work grows faster than proportional to n.
- Examples:
- one loop over array:
O(n) - two separate loops over array:
O(n) - nested loop over array inside array:
O(n^2) - divide array in half recursively with total work proportional to all elements once per level: maybe
O(n log n)
- one loop over array:
For interviews, "linear time" usually means:
- the total number of operations is bounded by
c * n + bfor constantsc, b - so a fixed number of passes is completely fine
Why the first pass works¶
The key fact is:
- If an element appears more than
n/3times, there can be at most 2 such elements.
Because:
- 3 different values each appearing more than
n/3times would total more thann, impossible.
So it is enough to track 2 candidates.
Intuition of the cancellation¶
When you see a new number that matches neither candidate, and both counters are positive, you do:
You can think of that as removing:
- one occurrence of candidate 1
- one occurrence of candidate 2
- the current new value
So you cancel a group of 3 distinct values.
That is the whole trick.
Why this does not lose a real answer¶
Suppose some value x appears more than n/3 times.
- Every cancellation removes at most one
x. - But each cancellation also removes 2 non-
xelements. - So
xcannot be completely wiped out by these triple cancellations, because there are not enough non-xelements to pair against all of its occurrences.
A cleaner way to say it:
- Repeatedly delete triples of distinct values.
- Deleting such triples does not change which values occur more than one third of the time.
- After all possible deletions, at most 2 values can remain.
- Those remaining possible winners are exactly what the first pass tracks.
Why the candidates from pass 1 are only "possible"¶
The first pass guarantees:
- every true answer must be among
cand1andcand2
It does not guarantee:
- both candidates are valid answers
Example:
You may end with some candidates, but no number appears more than 4//3 = 1 times.
So the second pass is required to count and verify.
Conditions that make this correct¶
These are the important ones:
- Threshold is
> n/3 - Therefore there are at most 2 valid answers
- The algorithm maintains at most 2 candidate slots
- When a third different value appears and both slots are occupied, cancel all three
- A true majority-over-
n/3cannot be fully eliminated by this process - Therefore every true answer survives as one of the final candidates
- Final verification confirms which candidates really pass the threshold
Short mental model¶
Use this mental picture:
- for
> n/2, track 1 candidate - for
> n/3, track 2 candidates - for
> n/k, trackk-1candidates
Because there can be at most k-1 such frequent elements.
Tiny example¶
Cancellation view:
- remove
(1,2,3)once - left with
[1,1,2,2]
Now the only possible winners are clearly 1 and 2.
That is what the first pass is simulating, without actually deleting elements.