Huffman Coding in Swift

up vote
1
down vote

favorite

I’m learning Swift and decided the Huffman Coding algorithm would be a good exercise while learning this new language. Below is a working version that encodes a string and decodes a string.

Here’s an example of how the API is used:

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

let decode = huffEncoded.decode()

I’d love to learn of anything I’m not doing that would be more swift-like.

I’m wondering if my choice of using a class with a lot of static functions is a good approach or if a struct would be better. They seem like similar solutions but I’m not 100% sure the real difference past structs are value types and classes are reference types.

I'm also not confident in how I used GCD, but I think that using a concurrent queue will be helpful if encoding or decoding a large string since this isn't updating a view.

I’m happy to hear any suggestions on my approach to the algorithm itself

My next step is to make this into a framework so it can be used by other code. Does my approach scale for this?

Thanks for any help!

class Huffman {

    private var key = [String: String]()

    private(set) var code = [String]()

    lazy private var queue: DispatchQueue = {

        return DispatchQueue(label: "huffman-queue", qos: .userInitiated, attributes: .concurrent)

    }()



    init(_ input: String) {

        self.key = Huffman.getKey(for: input)

        self.code = Huffman.encode(for: input, with: self.key)

    }



    func decode() -> String {

        var word = ""

        queue.sync { [unowned self] in

            var reverseKey = [String:String]()

            for (k, v) in self.key {

                reverseKey[v] = k

            }



            for prefix in self.code {

                if let letter = reverseKey[prefix] {

                   word += letter

                }

            }

        }

        return word

    }



    static private func getKey(for input: String) -> [String: String] {

        // sort letter frequency by decreasing count

        let sortedFrequency = Array(input)

            .reduce(into: [String: Int](), { freq, char in

                let letter = String(char)

                return freq[letter] = freq[letter, default: 0] + 1

            })

            .sorted(by: {$0.1 > $1.1})

        // create queue of initial Nodes

        let queue = sortedFrequency.map{ Node(name: $0.key, value: $0.value)}

        // generate key by traversing tree

        return Huffman.generateKey(for: Huffman.createTree(with: queue), prefix: "")

    }



    static private func encode(for input: String, with key: [String: String]) -> [String] {

        var code = [String]()

        let queue = DispatchQueue(label: "huffman-encode-queue", qos: .userInitiated, attributes: .concurrent)

        queue.sync {

            for char in input {

                if let prefix = key[String(char)] {

                    code.append(prefix)

                }

            }

        }

        return code

    }



    static private func generateKey(for node: Node, prefix: String) -> [String: String] {

        var key = [String: String]()

        if let left = node.left, let right = node.right {

            key.merge(generateKey(for: left, prefix: prefix + "0"), uniquingKeysWith: {current,_ in current})

            key.merge(generateKey(for: right, prefix: prefix + "1"), uniquingKeysWith: {current,_ in current})

        }else {

            key[node.name] = prefix

        }

        return key

    }



    static private func createTree(with queue: [Node]) -> Node {

        var queue = queue

        // until we have 1 root node, get subtree of least frequency

        while queue.count > 1 {

            let last = queue.count - 1

            let node1 = queue[last]

            let node2 = queue[last - 1]



            // if we have a third then compare frequency to second

            if let node3 = queue[safe: last - 2], node3.value + node2.value < node2.value + node1.value {

                queue.remove(at: last - 1)

                queue.remove(at: last - 2)

                queue.append(Huffman.createRoot(with: node2, and: node3))

            } else {

                queue.removeLast()

                queue.removeLast()

                queue.append(Huffman.createRoot(with: node1, and: node2))

            }

        }

        return queue[0]

    }



    static private func createRoot(with first: Node, and second: Node) -> Node {

        return Node(name: "(first.name)(second.name)", value: first.value + second.value, left: first, right: second)

    }



}



class Node {

    var name: String

    var value: Int

    var left: Node?

    var right: Node?



    init(name: String, value: Int, left: Node? = nil, right: Node? = nil) {

        self.name = name

        self.value = value

        self.left = left

        self.right = right

    }

}



extension Collection {

    subscript (safe index: Index) -> Element? {

        return indices.contains(index) ? self[index] : nil

    }

}

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

1

Here is an alternative implementation, you might glean some ideas from it.
– Carpsen90
Dec 4 at 22:30

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Sᴀᴍ Onᴇᴌᴀ
Dec 5 at 21:07

add a comment |

up vote
1
down vote

favorite

I’m learning Swift and decided the Huffman Coding algorithm would be a good exercise while learning this new language. Below is a working version that encodes a string and decodes a string.

Here’s an example of how the API is used:

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

let decode = huffEncoded.decode()

I’d love to learn of anything I’m not doing that would be more swift-like.

I’m wondering if my choice of using a class with a lot of static functions is a good approach or if a struct would be better. They seem like similar solutions but I’m not 100% sure the real difference past structs are value types and classes are reference types.

I'm also not confident in how I used GCD, but I think that using a concurrent queue will be helpful if encoding or decoding a large string since this isn't updating a view.

I’m happy to hear any suggestions on my approach to the algorithm itself

My next step is to make this into a framework so it can be used by other code. Does my approach scale for this?

Thanks for any help!

class Huffman {

    private var key = [String: String]()

    private(set) var code = [String]()

    lazy private var queue: DispatchQueue = {

        return DispatchQueue(label: "huffman-queue", qos: .userInitiated, attributes: .concurrent)

    }()



    init(_ input: String) {

        self.key = Huffman.getKey(for: input)

        self.code = Huffman.encode(for: input, with: self.key)

    }



    func decode() -> String {

        var word = ""

        queue.sync { [unowned self] in

            var reverseKey = [String:String]()

            for (k, v) in self.key {

                reverseKey[v] = k

            }



            for prefix in self.code {

                if let letter = reverseKey[prefix] {

                   word += letter

                }

            }

        }

        return word

    }



    static private func getKey(for input: String) -> [String: String] {

        // sort letter frequency by decreasing count

        let sortedFrequency = Array(input)

            .reduce(into: [String: Int](), { freq, char in

                let letter = String(char)

                return freq[letter] = freq[letter, default: 0] + 1

            })

            .sorted(by: {$0.1 > $1.1})

        // create queue of initial Nodes

        let queue = sortedFrequency.map{ Node(name: $0.key, value: $0.value)}

        // generate key by traversing tree

        return Huffman.generateKey(for: Huffman.createTree(with: queue), prefix: "")

    }



    static private func encode(for input: String, with key: [String: String]) -> [String] {

        var code = [String]()

        let queue = DispatchQueue(label: "huffman-encode-queue", qos: .userInitiated, attributes: .concurrent)

        queue.sync {

            for char in input {

                if let prefix = key[String(char)] {

                    code.append(prefix)

                }

            }

        }

        return code

    }



    static private func generateKey(for node: Node, prefix: String) -> [String: String] {

        var key = [String: String]()

        if let left = node.left, let right = node.right {

            key.merge(generateKey(for: left, prefix: prefix + "0"), uniquingKeysWith: {current,_ in current})

            key.merge(generateKey(for: right, prefix: prefix + "1"), uniquingKeysWith: {current,_ in current})

        }else {

            key[node.name] = prefix

        }

        return key

    }



    static private func createTree(with queue: [Node]) -> Node {

        var queue = queue

        // until we have 1 root node, get subtree of least frequency

        while queue.count > 1 {

            let last = queue.count - 1

            let node1 = queue[last]

            let node2 = queue[last - 1]



            // if we have a third then compare frequency to second

            if let node3 = queue[safe: last - 2], node3.value + node2.value < node2.value + node1.value {

                queue.remove(at: last - 1)

                queue.remove(at: last - 2)

                queue.append(Huffman.createRoot(with: node2, and: node3))

            } else {

                queue.removeLast()

                queue.removeLast()

                queue.append(Huffman.createRoot(with: node1, and: node2))

            }

        }

        return queue[0]

    }



    static private func createRoot(with first: Node, and second: Node) -> Node {

        return Node(name: "(first.name)(second.name)", value: first.value + second.value, left: first, right: second)

    }



}



class Node {

    var name: String

    var value: Int

    var left: Node?

    var right: Node?



    init(name: String, value: Int, left: Node? = nil, right: Node? = nil) {

        self.name = name

        self.value = value

        self.left = left

        self.right = right

    }

}



extension Collection {

    subscript (safe index: Index) -> Element? {

        return indices.contains(index) ? self[index] : nil

    }

}

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

1

Here is an alternative implementation, you might glean some ideas from it.
– Carpsen90
Dec 4 at 22:30

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Sᴀᴍ Onᴇᴌᴀ
Dec 5 at 21:07

add a comment |

up vote
1
down vote

favorite

I’m learning Swift and decided the Huffman Coding algorithm would be a good exercise while learning this new language. Below is a working version that encodes a string and decodes a string.

Here’s an example of how the API is used:

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

let decode = huffEncoded.decode()

I’d love to learn of anything I’m not doing that would be more swift-like.

I’m wondering if my choice of using a class with a lot of static functions is a good approach or if a struct would be better. They seem like similar solutions but I’m not 100% sure the real difference past structs are value types and classes are reference types.

I'm also not confident in how I used GCD, but I think that using a concurrent queue will be helpful if encoding or decoding a large string since this isn't updating a view.

I’m happy to hear any suggestions on my approach to the algorithm itself

My next step is to make this into a framework so it can be used by other code. Does my approach scale for this?

Thanks for any help!

class Huffman {

    private var key = [String: String]()

    private(set) var code = [String]()

    lazy private var queue: DispatchQueue = {

        return DispatchQueue(label: "huffman-queue", qos: .userInitiated, attributes: .concurrent)

    }()



    init(_ input: String) {

        self.key = Huffman.getKey(for: input)

        self.code = Huffman.encode(for: input, with: self.key)

    }



    func decode() -> String {

        var word = ""

        queue.sync { [unowned self] in

            var reverseKey = [String:String]()

            for (k, v) in self.key {

                reverseKey[v] = k

            }



            for prefix in self.code {

                if let letter = reverseKey[prefix] {

                   word += letter

                }

            }

        }

        return word

    }



    static private func getKey(for input: String) -> [String: String] {

        // sort letter frequency by decreasing count

        let sortedFrequency = Array(input)

            .reduce(into: [String: Int](), { freq, char in

                let letter = String(char)

                return freq[letter] = freq[letter, default: 0] + 1

            })

            .sorted(by: {$0.1 > $1.1})

        // create queue of initial Nodes

        let queue = sortedFrequency.map{ Node(name: $0.key, value: $0.value)}

        // generate key by traversing tree

        return Huffman.generateKey(for: Huffman.createTree(with: queue), prefix: "")

    }



    static private func encode(for input: String, with key: [String: String]) -> [String] {

        var code = [String]()

        let queue = DispatchQueue(label: "huffman-encode-queue", qos: .userInitiated, attributes: .concurrent)

        queue.sync {

            for char in input {

                if let prefix = key[String(char)] {

                    code.append(prefix)

                }

            }

        }

        return code

    }



    static private func generateKey(for node: Node, prefix: String) -> [String: String] {

        var key = [String: String]()

        if let left = node.left, let right = node.right {

            key.merge(generateKey(for: left, prefix: prefix + "0"), uniquingKeysWith: {current,_ in current})

            key.merge(generateKey(for: right, prefix: prefix + "1"), uniquingKeysWith: {current,_ in current})

        }else {

            key[node.name] = prefix

        }

        return key

    }



    static private func createTree(with queue: [Node]) -> Node {

        var queue = queue

        // until we have 1 root node, get subtree of least frequency

        while queue.count > 1 {

            let last = queue.count - 1

            let node1 = queue[last]

            let node2 = queue[last - 1]



            // if we have a third then compare frequency to second

            if let node3 = queue[safe: last - 2], node3.value + node2.value < node2.value + node1.value {

                queue.remove(at: last - 1)

                queue.remove(at: last - 2)

                queue.append(Huffman.createRoot(with: node2, and: node3))

            } else {

                queue.removeLast()

                queue.removeLast()

                queue.append(Huffman.createRoot(with: node1, and: node2))

            }

        }

        return queue[0]

    }



    static private func createRoot(with first: Node, and second: Node) -> Node {

        return Node(name: "(first.name)(second.name)", value: first.value + second.value, left: first, right: second)

    }



}



class Node {

    var name: String

    var value: Int

    var left: Node?

    var right: Node?



    init(name: String, value: Int, left: Node? = nil, right: Node? = nil) {

        self.name = name

        self.value = value

        self.left = left

        self.right = right

    }

}



extension Collection {

    subscript (safe index: Index) -> Element? {

        return indices.contains(index) ? self[index] : nil

    }

}

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

I’m learning Swift and decided the Huffman Coding algorithm would be a good exercise while learning this new language. Below is a working version that encodes a string and decodes a string.

Here’s an example of how the API is used:

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

let decode = huffEncoded.decode()

I’d love to learn of anything I’m not doing that would be more swift-like.

I’m wondering if my choice of using a class with a lot of static functions is a good approach or if a struct would be better. They seem like similar solutions but I’m not 100% sure the real difference past structs are value types and classes are reference types.

I'm also not confident in how I used GCD, but I think that using a concurrent queue will be helpful if encoding or decoding a large string since this isn't updating a view.

I’m happy to hear any suggestions on my approach to the algorithm itself

My next step is to make this into a framework so it can be used by other code. Does my approach scale for this?

Thanks for any help!

class Huffman {

    private var key = [String: String]()

    private(set) var code = [String]()

    lazy private var queue: DispatchQueue = {

        return DispatchQueue(label: "huffman-queue", qos: .userInitiated, attributes: .concurrent)

    }()



    init(_ input: String) {

        self.key = Huffman.getKey(for: input)

        self.code = Huffman.encode(for: input, with: self.key)

    }



    func decode() -> String {

        var word = ""

        queue.sync { [unowned self] in

            var reverseKey = [String:String]()

            for (k, v) in self.key {

                reverseKey[v] = k

            }



            for prefix in self.code {

                if let letter = reverseKey[prefix] {

                   word += letter

                }

            }

        }

        return word

    }



    static private func getKey(for input: String) -> [String: String] {

        // sort letter frequency by decreasing count

        let sortedFrequency = Array(input)

            .reduce(into: [String: Int](), { freq, char in

                let letter = String(char)

                return freq[letter] = freq[letter, default: 0] + 1

            })

            .sorted(by: {$0.1 > $1.1})

        // create queue of initial Nodes

        let queue = sortedFrequency.map{ Node(name: $0.key, value: $0.value)}

        // generate key by traversing tree

        return Huffman.generateKey(for: Huffman.createTree(with: queue), prefix: "")

    }



    static private func encode(for input: String, with key: [String: String]) -> [String] {

        var code = [String]()

        let queue = DispatchQueue(label: "huffman-encode-queue", qos: .userInitiated, attributes: .concurrent)

        queue.sync {

            for char in input {

                if let prefix = key[String(char)] {

                    code.append(prefix)

                }

            }

        }

        return code

    }



    static private func generateKey(for node: Node, prefix: String) -> [String: String] {

        var key = [String: String]()

        if let left = node.left, let right = node.right {

            key.merge(generateKey(for: left, prefix: prefix + "0"), uniquingKeysWith: {current,_ in current})

            key.merge(generateKey(for: right, prefix: prefix + "1"), uniquingKeysWith: {current,_ in current})

        }else {

            key[node.name] = prefix

        }

        return key

    }



    static private func createTree(with queue: [Node]) -> Node {

        var queue = queue

        // until we have 1 root node, get subtree of least frequency

        while queue.count > 1 {

            let last = queue.count - 1

            let node1 = queue[last]

            let node2 = queue[last - 1]



            // if we have a third then compare frequency to second

            if let node3 = queue[safe: last - 2], node3.value + node2.value < node2.value + node1.value {

                queue.remove(at: last - 1)

                queue.remove(at: last - 2)

                queue.append(Huffman.createRoot(with: node2, and: node3))

            } else {

                queue.removeLast()

                queue.removeLast()

                queue.append(Huffman.createRoot(with: node1, and: node2))

            }

        }

        return queue[0]

    }



    static private func createRoot(with first: Node, and second: Node) -> Node {

        return Node(name: "(first.name)(second.name)", value: first.value + second.value, left: first, right: second)

    }



}



class Node {

    var name: String

    var value: Int

    var left: Node?

    var right: Node?



    init(name: String, value: Int, left: Node? = nil, right: Node? = nil) {

        self.name = name

        self.value = value

        self.left = left

        self.right = right

    }

}



extension Collection {

    subscript (safe index: Index) -> Element? {

        return indices.contains(index) ? self[index] : nil

    }

}

algorithm swift compression

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

edited Dec 5 at 21:07

Sᴀᴍ Onᴇᴌᴀ

8,11461751

asked Dec 4 at 19:10

Turnipdabeets

1265

asked Dec 4 at 19:10

Turnipdabeets

1265

asked Dec 4 at 19:10

Turnipdabeets

1265

1

Here is an alternative implementation, you might glean some ideas from it.
– Carpsen90
Dec 4 at 22:30

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Sᴀᴍ Onᴇᴌᴀ
Dec 5 at 21:07

add a comment |

1

Here is an alternative implementation, you might glean some ideas from it.
– Carpsen90
Dec 4 at 22:30

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Sᴀᴍ Onᴇᴌᴀ
Dec 5 at 21:07

Here is an alternative implementation, you might glean some ideas from it.
– Carpsen90
Dec 4 at 22:30

Please do not update the code in your question to incorporate feedback from answers, doing so goes against the Question + Answer style of Code Review. This is not a forum where you should keep the most updated version in your question. Please see what you may and may not do after receiving answers.
– Sᴀᴍ Onᴇᴌᴀ
Dec 5 at 21:07

add a comment |

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Concurrency

In the encode and decode method you use queue.sync() to dispatch code to a concurrent queue – which means that the current queue is blocked until the computation is finished. If that is the main queue in an iOS or macOS application, UI updates and event handling are blocked for that time.

Therefore using a dispatch queue does not really help here. Also the Huffman encoder cannot know in which context it is executed (such as “user initiated”).

Dispatching the code with GCD – if necessary – should be done by the caller of these methods, not in the Huffman class itself.

The API

In its current form, the class seems of limited use to me.

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

builds the Huffman tree and encodes the given string and stores the results in private member variables. The only thing that I can do with the result is to decode it again. What I am missing are methods to

retrieve the generated Huffman tree so that it can be stored (perhaps in some serialized form) for later decompression,

retrieve the compressed string as a sequence of zeros and ones,

an initializer which takes a (previously created) Huffman tree,

a decode methods which takes a previously compressed string and decodes it, using the given Huffman tree.

Simplifications and other remarks

All properties in class Node are never mutated, and can be declared as constants (with let).

In func decode() the reverse dictionary mapping can be created with

let reverseKey = Dictionary(uniqueKeysWithValues: zip(key.values, key.keys))

instead of a loop. The following loop can be more simply done with compactMap():

let word = code.compactMap({ reverseKey[$0] }).joined()

In func getKey() it is not necessary to create an array from the given string. The return in the closure of the reduce() call is misleading: An assignment does not return anything. += can be used for the increment. Using $n.value instead of $n.1$ in the sort predicate emphasizes that the dictionary is sorted according to its values:

let sortedFrequency = input.reduce(into: [String: Int](), { freq, char in

        freq[String(char), default: 0] += 1

    })

    .sorted(by: {$0.value > $1.value})

In func encode(), compactMap() can be used again instead of a for loop:

let code = input.compactMap( { key[String($0)] } )

In func createTree() a “safe subscript” method (defined as an extension
of the Collection protocol) is used to access the third last queue element if it exists. Testing if queue.count >= 3 would seem more clear to me.

The Huffman tree algorithm

Your algorithm does not generate the optimal code. The reason is that at each step it only considers the last two or three nodes, not the two nodes with the minimal total weight. Here is an example: For the string
"ABCDEFGH" (i.e. 8 distinct characters with equal freqency) your
program generates the codes

"D": 01

"G": 11

"F": 001

"E": 101

"A": 0001

"C": 0000

"H": 1000

"B": 1001

with an average code length of 26/8 = 3.25. (Your output can be different because hash values are randomized since Swift 4.2.) The optimal tree in this case would be a balanced binary tree, where every code has length 3.

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

add a comment |

Your Answer

StackExchange.ifUsing("editor", function () {
return StackExchange.using("mathjaxEditing", function () {
StackExchange.MarkdownEditor.creationCallbacks.add(function (editor, postfix) {
StackExchange.mathjaxEditing.prepareWmdForMathJax(editor, postfix, [["\$", "\$"]]);
});
});
}, "mathjax-editing");

StackExchange.ifUsing("editor", function () {
StackExchange.using("externalEditor", function () {
StackExchange.using("snippets", function () {
StackExchange.snippets.init();
});
});
}, "code-snippets");

StackExchange.ready(function() {
var channelOptions = {
tags: "".split(" "),
id: "196"
};
initTagRenderer("".split(" "), "".split(" "), channelOptions);

StackExchange.using("externalEditor", function() {
// Have to fire editor after snippets, if snippets enabled
if (StackExchange.settings.snippets.snippetsEnabled) {
StackExchange.using("snippets", function() {
createEditor();
});
}
else {
createEditor();
}
});

function createEditor() {
StackExchange.prepareEditor({
heartbeatType: 'answer',
convertImagesToLinks: false,
noModals: true,
showLowRepImageUploadWarning: true,
reputationToPostImages: null,
bindNavPrevention: true,
postfix: "",
imageUploader: {
brandingHtml: "Powered by u003ca class="icon-imgur-white" href="https://imgur.com/"u003eu003c/au003e",
contentPolicyHtml: "User contributions licensed under u003ca href="https://creativecommons.org/licenses/by-sa/3.0/"u003ecc by-sa 3.0 with attribution requiredu003c/au003e u003ca href="https://stackoverflow.com/legal/content-policy"u003e(content policy)u003c/au003e",
allowUrls: true
},
onDemand: true,
discardSelector: ".discard-answer"
,immediatelyShowMarkdownHelp:true
});

}
});

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

StackExchange.ready(
function () {
StackExchange.openid.initPostLogin('.new-post-login', 'https%3a%2f%2fcodereview.stackexchange.com%2fquestions%2f209018%2fhuffman-coding-in-swift%23new-answer', 'question_page');
}
);

Post as a guest

Name

Required, but never shown

1 Answer
1

active

oldest

votes

1 Answer
1

active

oldest

votes

up vote
1
down vote

accepted

Concurrency

Therefore using a dispatch queue does not really help here. Also the Huffman encoder cannot know in which context it is executed (such as “user initiated”).

Dispatching the code with GCD – if necessary – should be done by the caller of these methods, not in the Huffman class itself.

The API

In its current form, the class seems of limited use to me.

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

retrieve the generated Huffman tree so that it can be stored (perhaps in some serialized form) for later decompression,

retrieve the compressed string as a sequence of zeros and ones,

an initializer which takes a (previously created) Huffman tree,

a decode methods which takes a previously compressed string and decodes it, using the given Huffman tree.

Simplifications and other remarks

All properties in class Node are never mutated, and can be declared as constants (with let).

In func decode() the reverse dictionary mapping can be created with

let reverseKey = Dictionary(uniqueKeysWithValues: zip(key.values, key.keys))

instead of a loop. The following loop can be more simply done with compactMap():

let word = code.compactMap({ reverseKey[$0] }).joined()

let sortedFrequency = input.reduce(into: [String: Int](), { freq, char in

        freq[String(char), default: 0] += 1

    })

    .sorted(by: {$0.value > $1.value})

In func encode(), compactMap() can be used again instead of a for loop:

let code = input.compactMap( { key[String($0)] } )

The Huffman tree algorithm

"D": 01

"G": 11

"F": 001

"E": 101

"A": 0001

"C": 0000

"H": 1000

"B": 1001

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

add a comment |

up vote
1
down vote

accepted

Concurrency

Therefore using a dispatch queue does not really help here. Also the Huffman encoder cannot know in which context it is executed (such as “user initiated”).

Dispatching the code with GCD – if necessary – should be done by the caller of these methods, not in the Huffman class itself.

The API

In its current form, the class seems of limited use to me.

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

retrieve the generated Huffman tree so that it can be stored (perhaps in some serialized form) for later decompression,

retrieve the compressed string as a sequence of zeros and ones,

an initializer which takes a (previously created) Huffman tree,

a decode methods which takes a previously compressed string and decodes it, using the given Huffman tree.

Simplifications and other remarks

All properties in class Node are never mutated, and can be declared as constants (with let).

In func decode() the reverse dictionary mapping can be created with

let reverseKey = Dictionary(uniqueKeysWithValues: zip(key.values, key.keys))

instead of a loop. The following loop can be more simply done with compactMap():

let word = code.compactMap({ reverseKey[$0] }).joined()

let sortedFrequency = input.reduce(into: [String: Int](), { freq, char in

        freq[String(char), default: 0] += 1

    })

    .sorted(by: {$0.value > $1.value})

In func encode(), compactMap() can be used again instead of a for loop:

let code = input.compactMap( { key[String($0)] } )

The Huffman tree algorithm

"D": 01

"G": 11

"F": 001

"E": 101

"A": 0001

"C": 0000

"H": 1000

"B": 1001

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

add a comment |

up vote
1
down vote

accepted

Concurrency

Therefore using a dispatch queue does not really help here. Also the Huffman encoder cannot know in which context it is executed (such as “user initiated”).

Dispatching the code with GCD – if necessary – should be done by the caller of these methods, not in the Huffman class itself.

The API

In its current form, the class seems of limited use to me.

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

retrieve the generated Huffman tree so that it can be stored (perhaps in some serialized form) for later decompression,

retrieve the compressed string as a sequence of zeros and ones,

an initializer which takes a (previously created) Huffman tree,

a decode methods which takes a previously compressed string and decodes it, using the given Huffman tree.

Simplifications and other remarks

All properties in class Node are never mutated, and can be declared as constants (with let).

In func decode() the reverse dictionary mapping can be created with

let reverseKey = Dictionary(uniqueKeysWithValues: zip(key.values, key.keys))

instead of a loop. The following loop can be more simply done with compactMap():

let word = code.compactMap({ reverseKey[$0] }).joined()

let sortedFrequency = input.reduce(into: [String: Int](), { freq, char in

        freq[String(char), default: 0] += 1

    })

    .sorted(by: {$0.value > $1.value})

In func encode(), compactMap() can be used again instead of a for loop:

let code = input.compactMap( { key[String($0)] } )

The Huffman tree algorithm

"D": 01

"G": 11

"F": 001

"E": 101

"A": 0001

"C": 0000

"H": 1000

"B": 1001

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

Concurrency

Therefore using a dispatch queue does not really help here. Also the Huffman encoder cannot know in which context it is executed (such as “user initiated”).

Dispatching the code with GCD – if necessary – should be done by the caller of these methods, not in the Huffman class itself.

The API

In its current form, the class seems of limited use to me.

let huffEncoded = Huffman("MISSISSIPPI_RIVER!")

retrieve the generated Huffman tree so that it can be stored (perhaps in some serialized form) for later decompression,

retrieve the compressed string as a sequence of zeros and ones,

an initializer which takes a (previously created) Huffman tree,

a decode methods which takes a previously compressed string and decodes it, using the given Huffman tree.

Simplifications and other remarks

All properties in class Node are never mutated, and can be declared as constants (with let).

In func decode() the reverse dictionary mapping can be created with

let reverseKey = Dictionary(uniqueKeysWithValues: zip(key.values, key.keys))

instead of a loop. The following loop can be more simply done with compactMap():

let word = code.compactMap({ reverseKey[$0] }).joined()

let sortedFrequency = input.reduce(into: [String: Int](), { freq, char in

        freq[String(char), default: 0] += 1

    })

    .sorted(by: {$0.value > $1.value})

In func encode(), compactMap() can be used again instead of a for loop:

let code = input.compactMap( { key[String($0)] } )

The Huffman tree algorithm

"D": 01

"G": 11

"F": 001

"E": 101

"A": 0001

"C": 0000

"H": 1000

"B": 1001

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

edited Dec 4 at 22:53

answered Dec 4 at 21:37

Martin R

15.4k12264

answered Dec 4 at 21:37

Martin R

15.4k12264

answered Dec 4 at 21:37

Martin R

15.4k12264

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

add a comment |

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

Thanks! I need to spend some time thinking through all your suggestions. Wouldn't the last two or three nodes always be of minimal total weight if I've sorted in that order?
– Turnipdabeets
Dec 4 at 21:47

@Turnipdabeets: Here is an example: You start with weights 1, 1, 1, 1, 1, 1. Then the last two are joined: 1, 1, 1, 1, 2. Then 1, 1, 2, 2. Now the first two should be joined, but your algorithm looks only at the last three. – Also for nodes with equal weight the one with lower height should be chosen.
– Martin R
Dec 4 at 22:22

Thanks for that example. I’ll rework on the weights. Can you elaborate on the API improvements? An initializer that takes a previously created Huffman tree is a great idea. But is it better to retrieve and initialize with the tree or the key that I generate from the tree? My thought was that the key is O(n) look up so better than the tree.
– Turnipdabeets
Dec 5 at 15:50

I was also thinking maybe the encrypted code should contain the tree/key rather than separate them… maybe as the first element in the array for code or maybe return a struct with these two values. What are your thoughts about keeping the values together like that? The instance var code is readable but its an array of zeros and ones. Is there a benefit of having a string of zeros and ones vs an array? I’ll have to google around for compressing the string or array of strings.
– Turnipdabeets
Dec 5 at 15:51

add a comment |

draft saved

draft discarded

Thanks for contributing an answer to Code Review Stack Exchange!

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

Use MathJax to format equations. MathJax reference.

To learn more, see our tips on writing great answers.

Some of your past answers have not been well-received, and you're in danger of being blocked from answering.

Please pay close attention to the following guidance:

Please be sure to answer the question. Provide details and share your research!

But avoid …

Asking for help, clarification, or responding to other answers.

Making statements based on opinion; back them up with references or personal experience.

To learn more, see our tips on writing great answers.

draft saved

draft discarded

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Sign up or log in

StackExchange.ready(function () {
StackExchange.helpers.onClickDraftSave('#login-link');
});

Post as a guest

Name

Required, but never shown

Name

Required, but never shown

Name

Required, but never shown

This page is only for reference, If you need detailed information, please check here

搜尋此網誌

Gfrktyl