230. Kth Smallest Element in a BST
On LeetCode ->Reformulated question¶
Given a BST root and an integer k, return the kth smallest value in the tree, where k is 1-indexed.
- Example:
root=[5,3,6,2,4,null,null,1], k=3 -> 3 - Why: inorder traversal of a BST visits values in sorted order:
[1,2,3,4,5,6]
Key trick¶
Use inorder traversal.
- In a BST,
left -> node -> rightyields values in increasing order. - So the answer is simply the
kth visited node. - You can stop early instead of traversing the whole tree.
Trap¶
Common mistakes:
- Forgetting
kis 1-indexed. - Doing a full traversal into a list when early stop is enough.
- Using generic tree logic and missing the BST property.
- Recursive inorder may be fine here, but iterative avoids recursion depth issues.
Why this question is interesting¶
It tests whether you connect a BST property to traversal order.
- Easy brute force: collect and sort.
- Better answer: inorder with early stop in \(O(h + k)\) time and \(O(h)\) space, where \(h\) is tree height.
- Follow-up naturally leads to augmented BSTs with subtree sizes.
Solution with idiomatic Python¶
from typing import Optional
class Solution:
def kthSmallest(self, root: Optional[TreeNode], k: int) -> int:
# Iterative inorder traversal:
# in a BST, inorder visits nodes in sorted order.
stack = []
node = root
while stack or node:
# Go as left as possible first.
while node:
stack.append(node)
node = node.left
node = stack.pop()
k -= 1
# The kth visited node is the kth smallest.
if k == 0:
return node.val
# Then explore the right subtree.
node = node.right
Pytest test¶
import pytest
from utils import tree_build
from solution import Solution
@pytest.mark.parametrize(
("values", "k", "expected"),
[
([3, 1, 4, None, 2], 1, 1),
([3, 1, 4, None, 2], 2, 2),
([3, 1, 4, None, 2], 3, 3),
([5, 3, 6, 2, 4, None, None, 1], 3, 3),
([2, 1, 3], 1, 1),
([2, 1, 3], 2, 2),
([2, 1, 3], 3, 3),
([1], 1, 1),
([5, 3, 7, 2, 4, 6, 8], 7, 8),
],
)
def test_kthSmallest(values, k, expected):
root = tree_build(values)
assert Solution().kthSmallest(root, k) == expected
Comment on my solution¶
Your solution is correct and already good.
-
Strong point:
- It does iterative inorder and stops early.
-
What can be simplified:
- Use the standard
while stack or nodepattern. - Decrement
kdirectly instead of keeping a separatecount. - No need to special-case the initial left descent and right child handling.
- Use the standard
-
Your version rewritten in the more common interview style is shorter and easier to verify.
## Solution
class Solution:
def kthSmallest(self, root: Optional[TreeNode], k: int) -> int:
# inorder walk of tree. The kth in that order is the kth smallest.
node = root
stack = []
while node:
stack.append(node)
node = node.left
count = 0
while stack:
node = stack.pop()
val = node.val
count +=1
if count == k:
return val
if node.right:
node = node.right
while node:
stack.append(node)
node = node.left
Extra¶
Order-statistics tree (BST augmented with subtree sizes)¶
An augmented BST is a normal BST where each node stores extra metadata.
For this follow-up, the useful metadata is:
size- The number of nodes in the subtree rooted at that node.
- Including the node itself.
Example:
5(size=8)
/ \
3(size=4) 6(size=3)
/ \ \
2(size=2) 4(1) 7(size=2)
/ \
1(1) 8(1)
At node `3`:
- left subtree has size `2`
- so:
- the 1st and 2nd smallest are in the left subtree
- the 3rd smallest is `3` itself
- anything larger is in the right subtree
#### Why use subtree sizes in general
They let you answer order-statistic queries efficiently:
- find the `k`th smallest
- find the rank of a value
- "how many values are `< x`?"
- predecessor / successor with rank logic
Without subtree sizes:
- `kthSmallest` is usually done by inorder traversal
- that costs $O(h + k)$, worst case $O(n)$ each query
With subtree sizes:
- `kthSmallest` becomes $O(h)$
- because at each node you know exactly which side to go
That matters when:
- the tree changes often
- and you ask `kthSmallest` often
#### Why the follow-up naturally leads to it
The original solution is:
- do inorder
- stop at the `k`th node
Good for one query.
Bad if you repeatedly do:
- insert
- delete
- `kthSmallest`
- `kthSmallest`
- `kthSmallest`
Each query may walk many nodes again.
So the natural idea is:
- keep extra info updated during insert/delete
- then queries become faster
The exact extra info needed for "kth smallest" is subtree size.
This structure is commonly called:
- an order-statistics tree
- or a BST augmented with subtree sizes
#### How `kthSmallest` works with sizes
At a node:
- let `left_size = size(node.left)`
Then:
- if `k <= left_size`
- answer is in the left subtree
- if `k == left_size + 1`
- current node is the answer
- otherwise
- answer is in the right subtree
- new `k` becomes `k - left_size - 1`
So you move down one path only.
#### Small example
Suppose:
```text
5(size=6)
/ \
3(size=3) 7(size=2)
/ \ /
2(1) 4(1) 6(1)
Find k=4:
- at
5,left_size=3 k == left_size + 1- answer is
5
Find k=5:
- at
5,left_size=3 - go right with
k = 5 - 3 - 1 = 1 - at
7,left_size=1 k <= left_size, go left- at
6,left_size=0,k=1 - answer is
6
Important caveat¶
A plain BST can become skewed.
So complexities are:
- \(O(h)\), not always \(O(\log n)\)
If you want guaranteed \(O(\log n)\):
- use a balanced BST
- like AVL / Red-Black Tree / Treap
- and maintain subtree sizes too
For interview follow-up, saying this is enough.
Implementation for the follow-up¶
Below is a concise implementation of a BST augmented with subtree sizes.
It supports:
- insert
- delete
kth_smallest
Average-case complexity on a regular BST:
- insert: \(O(h)\)
- delete: \(O(h)\)
- kth smallest: \(O(h)\)
from dataclasses import dataclass
from typing import Optional
@dataclass
class Node:
val: int
left: Optional["Node"] = None
right: Optional["Node"] = None
size: int = 1
def _size(node: Optional[Node]) -> int:
return node.size if node else 0
def _recalc(node: Optional[Node]) -> None:
if node:
node.size = 1 + _size(node.left) + _size(node.right)
class OrderStatisticBST:
def __init__(self):
self.root: Optional[Node] = None
def insert(self, val: int) -> None:
self.root = self._insert(self.root, val)
def _insert(self, node: Optional[Node], val: int) -> Node:
if node is None:
return Node(val)
if val < node.val:
node.left = self._insert(node.left, val)
else:
node.right = self._insert(node.right, val)
_recalc(node)
return node
def delete(self, val: int) -> None:
self.root = self._delete(self.root, val)
def _delete(self, node: Optional[Node], val: int) -> Optional[Node]:
if node is None:
return None
if val < node.val:
node.left = self._delete(node.left, val)
elif val > node.val:
node.right = self._delete(node.right, val)
else:
if node.left is None:
return node.right
if node.right is None:
return node.left
# Replace with inorder successor.
succ = self._min_node(node.right)
node.val = succ.val
node.right = self._delete(node.right, succ.val)
_recalc(node)
return node
def _min_node(self, node: Node) -> Node:
while node.left:
node = node.left
return node
def kth_smallest(self, k: int) -> int:
if k < 1 or k > _size(self.root):
raise IndexError("k out of range")
node = self.root
while node:
left_size = _size(node.left)
if k <= left_size:
node = node.left
elif k == left_size + 1:
return node.val
else:
k -= left_size + 1
node = node.right
raise RuntimeError("unreachable")
Example use¶
bst = OrderStatisticBST()
for x in [5, 3, 6, 2, 4, 1]:
bst.insert(x)
print(bst.kth_smallest(1)) # 1
print(bst.kth_smallest(3)) # 3
print(bst.kth_smallest(6)) # 6
bst.delete(3)
print(bst.kth_smallest(3)) # 4
If you want the LeetCode follow-up answer in interview style¶
A concise good answer is:
- If the BST changes often and we query
kthSmallestfrequently, augment each node with the size of its subtree. - Update that size after insert/delete.
- Then
kthSmallestis no longer an inorder scan; at each node, comparekwith the left subtree size and move left or right. - Query becomes \(O(h)\) instead of re-traversing many nodes each time.
- If we also keep the tree balanced, it becomes \(O(\log n)\) per update and query.
One subtle thing you may be missing¶
This is not "store the sorted list somewhere".
It is:
- keep local counts on each node
- so global rank queries can be answered by walking one root-to-leaf path
That is the key idea of augmentation in data structures in general:
- store a little extra information
- maintain it on updates
- use it to answer expensive queries much faster
Follow-up on OrderStatisticBST above implementation¶
It seems your implementation of OrderStatisticBST allows different
nodes with the same value. Am I right? Is it OK? Why did you make
that choice? Isn't more common to allow only different values in BST?
What are the use cases for both? Be concise.
Yes.
- This implementation allows duplicates.
- Because insertion uses:
- So equal values go to the right subtree.
Is it OK?¶
Yes, as long as the BST rule is defined consistently.
Two common variants are:
- strict BST
- left
< node < right - no duplicates
- left
- duplicate-friendly BST
- left
< node, right>= node - or left
<= node, right> node
- left
Both are valid if all operations follow the same rule.
Why I made that choice¶
Mostly for simplicity.
- It makes
inserteasy. - It keeps the structure general.
- It matches many practical "multiset" style uses.
For the follow-up idea, duplicates do not change the core point:
- maintain subtree sizes
- use them to answer rank / kth queries
Isn't it more common to allow only distinct values?¶
For interview BST problems:
- often yes
- because examples usually assume unique keys unless duplicates are mentioned
For real data structures:
- both are common
- it depends on whether the structure is a
setor amultiset
Use cases for distinct-only BST¶
You want set/map semantics.
- membership
- insert if absent
- delete exact key
- ordered dictionary
- unique IDs, usernames, primary keys
Examples:
setdict-like ordered map keyed by unique key
In that case, on insert:
- if value already exists, do nothing
- or update payload instead of creating a new node
Use cases for duplicate-friendly BST¶
You want multiset semantics.
- repeated values are meaningful
- frequency matters
- rank queries over repeated values matter
Examples:
- scores with ties
- event timestamps
- inventory counts by repeated key
- streaming median with repeated numbers
A more common way to support duplicates¶
Instead of creating many equal-valued nodes, store a count per node.
Each node keeps:
valcount- number of times this value appears
size- total number of elements in subtree, including duplicates
That is usually cleaner because:
- tree shape is better than chaining equal keys to one side
- delete is simpler for duplicates
- rank/kth logic stays natural
Example¶
Instead of:
use:
What I would do in an interview¶
If duplicates are unspecified:
- I would ask whether values are unique.
If I must choose silently:
- for a LeetCode BST problem, I would usually assume unique values
- for a reusable order-statistics tree, I would mention both options
- and prefer
countper node if duplicates matter
Distinct-only change¶
If you want unique keys only, change insert to:
Bottom line¶
- Yes, my version allows duplicates.
- That is valid, not a bug.
- For production-like design, duplicates are often better handled with a per-node count.
- For interview BST questions, unique keys are often the default unless stated otherwise.