H-index

Introduction

Given an array of citations (each citation is a non-negative integer) of a researcher, write a function to compute the researcher’s h-index.

According to the definition of h-index on Wikipedia: “A scientist has index h if h of his/her N papers have at least h citations each, and the other N − h papers have no more than h citations each.”

For example, given citations = [3, 0, 6, 1, 5], which means the researcher has 5 papers in total and each of them had received 3, 0, 6, 1, 5 citations respectively. Since the researcher has 3 papers with at least 3 citations each and the remaining two with no more than 3 citations each, his h-index is 3.

Follow up question: what if the citation is already sorted in ascending order.

Solution

Create an array bucket[] to store the number of papers with same citations in index equal to the number of citation, if the citation is larger than the length of citation list, then increment bucket[n] at the end. bucket[i] + ... + bucket[n] is the number of papers that have at least i citations, then start from the end and move backward until the number of total papers is more than or equal to current index.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
public int hIndex(int[] citations) {
int n = citations.length;
int[] bucket = new int[n+1];
for(int num : citations) {
if(num > n) bucket[n]++;
else bucket[num]++;
}
int count = 0;
for(int i=n; i>=0; i--) {
count += bucket[i];
if(count >= i) return i;
}
return 0;
}

Solution for follow-up question

As the citation list is already sorted in ascending order, we can use binary search to find solution in O(log(N)) time by comparing citations[mid] with len - mid, which is the number of papers that have at least citations[mid] citations. If they are equal, then we find the answer, otherwise update left or right pointer to mid accordingly. It is possible that they are not equal after the whole loop, then return len - right -1 at the end because left and right pointers reach the same position, so we need to select the first available index i that has more than citation[i] citations.

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
public int hIndex(int[] citations) {
int len = citations.length, left = 0, right = len - 1;
if(len == 0) return 0;
while(left <= right) {
int mid = (left + right) / 2;
if(citations[mid] == len - mid) {
return citations[mid];
}
else if(citations[mid] > len - mid) {
right = mid - 1;
}
else {
left = mid + 1;
}
}
return len - right - 1;
}