Top K Frequent Elements

Problem Statement

Given an integer array nums and an integer k, return the k most frequent elements within the array.

The test cases are generated such that the answer is unique.

You may return the output in any order.

Example 1:

Input: nums = [1,2,2,3,3,3], k = 2
Output: [2,3]

Example 2:

Input: nums = [7,7], k = 1
Output: [7]

Constraints:

  • 1 <= nums.length <= 10^4
  • -1000 <= nums[i] <= 1000
  • 1 <= k <= number of distinct elements in nums

You should aim for a solution with O(n) time and O(n) space, where n is the size of the input array.

You should aim for a solution with O(n) time and O(n) space, where n is the size of the input array.

Hint 1

A naive solution would be to count the frequency of each number and then sort the array based on each element’s frequency. After that, we would select the top k frequent elements. This would be an O(n log n) solution. Though this solution is acceptable, can you think of a better way?

Hint 2

Can you think of an algorithm which involves grouping numbers based on their frequency?

Hint 3

Use the bucket sort algorithm to create n buckets, grouping numbers based on their frequencies from 1 to n. Then, pick the top k numbers from the buckets, starting from n down to 1.

Solution

Sorting

def top_k_frequent(nums: List[int], k: int) -> List[int]:
count = {}
for num in nums:
count[num] = 1 + count.get(num, 0)
arr = []
for num, cnt in count.items():
arr.append([cnt, num])
arr.sort()
res = []
while len(res) < k:
res.append(arr.pop()[1])
return res

Time complexity: O(n log n) Space complexity: O(n)

Heap

import heapq
def top_k_frequent(nums: List[int], k: int) -> List[int]:
count = {}
for num in nums:
count[num] = 1 + count.get(num, 0)
heap = []
for num in count.keys():
heapq.heappush(heap, (count[num], num))
if len(heap) > k:
heapq.heappop(heap)
res = []
for i in range(k):
res.append(heapq.heappop(heap)[1])
return res

Time complexity: O(n log k) Space complexity: O(n + k)

Bucket Sort

def top_k_frequent(nums: List[int], k: int) -> List[int]:
count = {}
freq = [[] for i in range(len(nums) + 1)]
for num in nums:
count[num] = 1 + count.get(num, 0)
for num, cnt in count.items():
freq[cnt].append(num)
res = []
for i in range(len(freq) - 1, 0, -1):
for num in freq[i]:
res.append(num)
if len(res) == k:
return res

Time complexity: O(n) Space complexity: O(n)