[LeetCode/LintCode] Top K Frequent Words

0x584a, published 2019-08-15 13:32 / 3163 views

LeetCode version Problem

Given a non-empty list of words, return the k most frequent elements.

Your answer should be sorted by frequency from highest to lowest. If two words have the same frequency, then the word with the lower alphabetical order comes first.

Example 1:
Input: ["i", "love", "leetcode", "i", "love", "coding"], k = 2
Output: ["i", "love"]
Explanation: "i" and "love" are the two most frequent words. Note that "i" comes before "love" due to a lower alphabetical order.

Example 2:
Input: ["the", "day", "is", "sunny", "the", "the", "the", "sunny", "is", "is"], k = 4
Output: ["the", "is", "sunny", "day"]
Explanation: "the", "is", "sunny" and "day" are the four most frequent words, with the number of occurrences being 4, 3, 2 and 1 respectively.

Note:
You may assume k is always valid, 1 ≤ k ≤ number of unique elements.
Input words contain only lowercase letters.
Follow up:
Try to solve it in O(n log k) time and O(n) extra space.
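
Before the heap-based solution, the ordering rule can be made concrete with a simple baseline. The sketch below is not the approach asked for in the follow-up: it sorts every unique word, so it runs in O(n log n) rather than O(n log k). The class name TopKBySorting is made up for illustration.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Baseline sketch (hypothetical class name): count, sort all unique words, keep the first k.
class TopKBySorting {
    static List<String> topKFrequent(String[] words, int k) {
        // Count occurrences of each word.
        Map<String, Integer> counts = new HashMap<>();
        for (String w : words) {
            counts.merge(w, 1, Integer::sum);
        }

        // Higher count first; on equal counts, alphabetically earlier word first.
        List<String> unique = new ArrayList<>(counts.keySet());
        unique.sort((a, b) -> {
            int byCount = Integer.compare(counts.get(b), counts.get(a));
            return byCount != 0 ? byCount : a.compareTo(b);
        });
        return new ArrayList<>(unique.subList(0, k));
    }

    public static void main(String[] args) {
        String[] words = {"i", "love", "leetcode", "i", "love", "coding"};
        System.out.println(topKFrequent(words, 2)); // prints [i, love]
    }
}
```

The comparator here is the "best first" order required for the output. A bounded heap needs the opposite order, so that the weakest candidate is the one removed.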

Solution

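import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

class Solution {
    public List<String> topKFrequent(String[] words, int k) {
        List<String> res = new ArrayList<>();
        if (words.length < k) return res;
        // Count how often each word occurs.
        Map<String, Integer> map = new HashMap<>();
        for (String word: words) {
            map.put(word, map.getOrDefault(word, 0) + 1);
        }
        // Min-heap ordered "worst first": lower count first; on equal counts the
        // alphabetically later word first, so poll() evicts the weakest candidate.
        // Counts are boxed Integers, so compare them by value, not with ==.
        PriorityQueue<Map.Entry<String, Integer>> queue = new PriorityQueue<>(
            (a, b) -> a.getValue().equals(b.getValue()) ? b.getKey().compareTo(a.getKey()) : Integer.compare(a.getValue(), b.getValue())
        );
        for (Map.Entry<String, Integer> entry: map.entrySet()) {
            queue.offer(entry);
            // Keep at most k entries in the heap.
            if (queue.size() > k) queue.poll();
        }
        // The heap yields the weakest entry first, so insert each one at the front.
        while (!queue.isEmpty()) {
            res.add(0, queue.poll().getKey());
        }
        return res;
    }
}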
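
A detail that is easy to get wrong in heap comparators like the one above: the counts are boxed Integer objects, and comparing boxed values by reference is only reliable for small cached values, so ties at larger counts may not be detected. The sketch below, with the made-up names HeapOrderSketch and WORST_FIRST, compares by value and shows the order in which a min-heap hands entries back.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.PriorityQueue;

// Illustrative sketch (hypothetical names): a "worst first" comparator for a bounded min-heap.
class HeapOrderSketch {
    // Lower count first; on equal counts the alphabetically later word first.
    static final Comparator<Map.Entry<String, Integer>> WORST_FIRST = (a, b) -> {
        int byCount = Integer.compare(a.getValue(), b.getValue()); // compares values, not references
        return byCount != 0 ? byCount : b.getKey().compareTo(a.getKey());
    };

    // Returns the words in the order the heap would evict them.
    static List<String> pollOrder(List<Map.Entry<String, Integer>> entries) {
        PriorityQueue<Map.Entry<String, Integer>> heap = new PriorityQueue<>(WORST_FIRST);
        heap.addAll(entries);
        List<String> order = new ArrayList<>();
        while (!heap.isEmpty()) {
            order.add(heap.poll().getKey());
        }
        return order;
    }

    public static void main(String[] args) {
        List<Map.Entry<String, Integer>> entries = new ArrayList<>();
        entries.add(new AbstractMap.SimpleEntry<>("apple", 1000));
        entries.add(new AbstractMap.SimpleEntry<>("banana", 1000));
        entries.add(new AbstractMap.SimpleEntry<>("cherry", 1001));
        System.out.println(pollOrder(entries)); // prints [banana, apple, cherry]
    }
}
```

Reading the eviction order backwards gives the answer order: cherry, apple, banana.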

LintCode version Problem

Find the top k frequent words using the MapReduce framework.

The mapper's key is the document id and its value is the content of the document; words in a document are separated by spaces.

For the reducer, the output should be at most k key-value pairs, which are the top k words and their frequencies in this reducer. The judge will take care of merging the different reducers' results to get the global top k frequent words, so you don't need to handle that part.

The k is given in the constructor of TopK class.

Notice

For words with the same frequency, rank them alphabetically.

/**
 * Definition of OutputCollector:
 * class OutputCollector<K, V> {
 *     public void collect(K key, V value);
 *         // Adds a key/value pair to the output buffer
 * }
 * Definition of Document:
 * class Document {
 *     public int id;
 *     public String content;
 * }
 */

Example

Given document A =

lintcode is the best online judge
I love lintcode

and document B =

lintcode is an online judge for coding interview
you can test your code online at lintcode

The top 2 words and their frequencies should be

lintcode, 4
online, 3
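
The example can be reproduced with a small local simulation of the map, shuffle, and reduce phases. This is only a sketch under stated assumptions: it runs in a single process, the class and method names are made up, and the two lines of each document are assumed to be joined by a single space.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Single-process sketch (hypothetical names) of map -> shuffle -> reduce for word counts.
class MapReduceWordCountSketch {
    // Map phase: emit (word, 1) for each non-empty token of one document.
    static void map(String content, List<Map.Entry<String, Integer>> emitted) {
        for (String word : content.split(" ")) {
            if (!word.isEmpty()) {
                emitted.add(new AbstractMap.SimpleEntry<>(word, 1));
            }
        }
    }

    // Shuffle + reduce phase: group by word, sum the counts, format the top k.
    static List<String> reduceTopK(List<Map.Entry<String, Integer>> emitted, int k) {
        Map<String, Integer> totals = new HashMap<>();
        for (Map.Entry<String, Integer> pair : emitted) {
            totals.merge(pair.getKey(), pair.getValue(), Integer::sum);
        }
        List<Map.Entry<String, Integer>> ranked = new ArrayList<>(totals.entrySet());
        // Higher count first; ties broken alphabetically.
        ranked.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        List<String> top = new ArrayList<>();
        for (int i = 0; i < Math.min(k, ranked.size()); i++) {
            top.add(ranked.get(i).getKey() + ", " + ranked.get(i).getValue());
        }
        return top;
    }

    static List<String> run(String[] documents, int k) {
        List<Map.Entry<String, Integer>> emitted = new ArrayList<>();
        for (String doc : documents) {
            map(doc, emitted);
        }
        return reduceTopK(emitted, k);
    }

    public static void main(String[] args) {
        String[] documents = {
            "lintcode is the best online judge I love lintcode",
            "lintcode is an online judge for coding interview you can test your code online at lintcode"
        };
        for (String line : run(documents, 2)) {
            System.out.println(line); // prints "lintcode, 4" then "online, 3"
        }
    }
}
```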

Tags

Map Reduce

Solution

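import java.util.ArrayList;
import java.util.Comparator;
import java.util.Iterator;
import java.util.List;
import java.util.PriorityQueue;

// Use Pair to store a key-value pair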
class Pair {
    String key;
    int value;

    Pair(String k, int v) {
        this.key = k;
        this.value = v;
    }
}

public class TopKFrequentWords {

    public static class Map {
        // key is the document id and is not used here
        public void map(String key, Document value,
                        OutputCollector<String, Integer> output) {
            // Output the results into output buffer.
            // Ps. output.collect(String key, int value);
            
            String content = value.content;
            String[] words = content.split(" ");
            for (String word : words) {
                if (word.length() > 0) {
                    output.collect(word, 1);
                }
            }
        }
    }

    public static class Reduce {
        // Min-heap holding the current top k pairs; the weakest pair sits at the root.
        private PriorityQueue<Pair> Q = null;
        private int k;

        // Orders pairs "worst first": lower count first; on equal counts the
        // alphabetically later word first.
        private Comparator<Pair> pairComparator = new Comparator<Pair>() {
            public int compare(Pair o1, Pair o2) {
                if (o1.value != o2.value) {
                    return Integer.compare(o1.value, o2.value);
                }
                // if the values are equal, compare keys
                return o2.key.compareTo(o1.key);
            }
        };

        public void setup(int k) {
            // initialize your data structure here
            this.k = k;
            Q = new PriorityQueue<>(k, pairComparator);
        }

        public void reduce(String key, Iterator<Integer> values) {
            // Sum all counts emitted for this word.
            int sum = 0;
            while (values.hasNext()) {
                sum += values.next();
            }

            Pair pair = new Pair(key, sum);
            if (Q.size() < k) {
                Q.add(pair);
            } else {
                // Replace the weakest pair only if the new one ranks higher.
                Pair peak = Q.peek();
                if (pairComparator.compare(pair, peak) > 0) {
                    Q.poll();
                    Q.add(pair);
                }
            }
        }

        public void cleanup(OutputCollector<String, Integer> output) {
            // Output the top k pairs into output buffer.
            // Ps. output.collect(String key, Integer value);
            List<Pair> pairs = new ArrayList<>();
            while (!Q.isEmpty()) {
                pairs.add(Q.poll());
            }

            // The heap yields the weakest pair first, so emit in reverse order.
            int n = pairs.size();
            for (int i = n - 1; i >= 0; --i) {
                Pair pair = pairs.get(i);
                output.collect(pair.key, pair.value);
            }
        }
    }
}
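
The problem statement leaves the merge of reducer outputs to the judge. One way such a merge might work is sketched below, assuming the usual MapReduce guarantee that all values for a given word reach exactly one reducer. Under that assumption each word's count is already final, so the global top k is simply the top k of the combined reducer outputs. The class name, the helper names, and the split of words across two reducers are all hypothetical.

```java
import java.util.AbstractMap;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Sketch (hypothetical names): merge per-reducer top-k lists into a global top k.
class MergeReducersSketch {
    static Map.Entry<String, Integer> entry(String word, int count) {
        return new AbstractMap.SimpleEntry<>(word, count);
    }

    // Assumes every word appears in the output of at most one reducer.
    static List<Map.Entry<String, Integer>> mergeTopK(
            List<List<Map.Entry<String, Integer>>> reducerOutputs, int k) {
        List<Map.Entry<String, Integer>> all = new ArrayList<>();
        for (List<Map.Entry<String, Integer>> output : reducerOutputs) {
            all.addAll(output);
        }
        // Higher count first; ties broken alphabetically.
        all.sort((a, b) -> {
            int byCount = Integer.compare(b.getValue(), a.getValue());
            return byCount != 0 ? byCount : a.getKey().compareTo(b.getKey());
        });
        return new ArrayList<>(all.subList(0, Math.min(k, all.size())));
    }

    public static void main(String[] args) {
        // Hypothetical split of the example's words across two reducers, k = 2.
        List<Map.Entry<String, Integer>> reducer1 = Arrays.asList(entry("lintcode", 4), entry("is", 2));
        List<Map.Entry<String, Integer>> reducer2 = Arrays.asList(entry("online", 3), entry("judge", 2));
        for (Map.Entry<String, Integer> e : mergeTopK(Arrays.asList(reducer1, reducer2), 2)) {
            System.out.println(e.getKey() + ", " + e.getValue()); // prints "lintcode, 4" then "online, 3"
        }
    }
}
```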

Copyright of this article belongs to the author; please do not repost without permission.

When reposting, please cite this article's address: http://systransis.cn/yun/68159.html
