49. Group Anagrams
On LeetCode ->Reformulated question¶
Group strings that are anagrams of each other.
- Input:
list[str] - Output:
list[list[str]] - Rule: words belong to the same group if their letters and counts match; output order does not matter.
Example:
Key trick¶
Use a canonical key for each word so all anagrams map to the same bucket.
- Simple key:
- sort the letters
- Example:
"eat" -> "aet""tea" -> "aet"
Then group with a dictionary from key to list of words.
Trap¶
- Comparing output directly without normalizing order.
- Forgetting that:
- group order does not matter
- word order inside each group does not matter
- Using sorting per word is fine here, but not noticing there is also a linear-per-word frequency-count key.
Why this question is interesting¶
It tests a very common interview pattern:
- turn each item into a canonical representation
- use hashing to group efficiently
It also checks whether you separate the problem's real requirement from irrelevant output ordering.
Solution with idiomatic Python¶
class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
# Map canonical form -> all words with that form.
groups: dict[str, list[str]] = {}
for s in strs:
# All anagrams have the same sorted-letter representation.
key = "".join(sorted(s))
if key not in groups:
groups[key] = []
groups[key].append(s)
return list(groups.values())
Alternative with defaultdict:
from collections import defaultdict
class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
groups = defaultdict(list)
for s in strs:
key = "".join(sorted(s))
groups[key].append(s)
return list(groups.values())
Pytest test¶
import pytest
def normalize(groups: list[list[str]]) -> list[list[str]]:
return sorted([sorted(group) for group in groups])
@pytest.mark.parametrize(
("strs", "expected"),
[
(
["eat", "tea", "tan", "ate", "nat", "bat"],
[["eat", "tea", "ate"], ["tan", "nat"], ["bat"]],
),
(
[""],
[[""]],
),
(
["a"],
[["a"]],
),
(
["", ""],
[["", ""]],
),
(
["ab", "ba", "abc", "cab", "bac", "foo"],
[["ab", "ba"], ["abc", "cab", "bac"], ["foo"]],
),
(
["zzz", "zz", "z", "zz"],
[["zzz"], ["zz", "zz"], ["z"]],
),
],
)
def test_groupAnagrams(strs, expected):
assert normalize(Solution().groupAnagrams(strs)) == normalize(expected)
Comment on your solution¶
Your solution is good and interview-valid.
- What is good:
- correct idea
- clear helper for canonical key
- good use of a dictionary
- Small simplifications:
- inline
anagram_idif used once return list(groups.values())is enough
- inline
- Complexity:
- \(O(n \cdot k \log k)\) time, where \(k\) is average word length
- \(O(n \cdot k)\) extra space
## Solution
class Solution:
def groupAnagrams(self, strs: list[str]) -> list[list[str]]:
def anagram_id(s):
return "".join(sorted(s))
groups = {}
for s in strs:
ana_id = anagram_id(s)
if ana_id in groups:
groups[ana_id].append(s)
else:
groups[ana_id] = [s]
return [list(anagrams) for anagrams in groups.values()]
Solution().groupAnagrams(["eat","tea","tan","ate","nat","bat"]) # [['eat', 'tea', 'ate'], ['tan', 'nat'], ['bat']]
Solution().groupAnagrams(["", ""]) # [['', '']]
Slightly cleaned-up version of your approach: