Note: this is also called the "Moore’s Voting Algorithm".The algorithm simply looks at each element of the sequence:
Keep a candidate element and a counter (initially unknown and zero, respectively).
As we move across the sequence, look at each element:
As we move across the sequence, look at each element:
- If the counter is 0: the element is the candidate and the counter is 1.
- If the counter is not 0: increment if the element is the candidate, decrement if not.
Using this specification, we can implement the algorithm:
A few simple tests show that this is working:: majority ( seq -- elt/f ) [ f 0 ] dip [ over zero? [ 2nip 1 ] [ pick = [ 1 + ] [ 1 - ] if ] if ] each zero? [ drop f ] when ;
This is perhaps not quite idiomatic Factor, can you improve it?[ f ] [ { } majority ] unit-test [ f ] [ { 1 2 } majority ] unit-test [ 1 ] [ { 1 1 2 } majority ] unit-test [ f ] [ { 1 1 2 2 } majority ] unit-test [ 2 ] [ { 1 1 2 2 2 } majority ] unit-test
Here's my solution, but not so sure it's idiomatic:
ReplyDelete:: majority2 ( seq -- f/elt )
0 :> count! f :> candidate!
seq [ count 0 =
[ candidate! 1 count! ]
[ candidate =
[ count 1 + count! ] [ count 1 - count! ] if
] if ] each
count 0 = [ f ] [ candidate ] if ;
What's more disturbing to me is that the input { 1 2 3 1 } for both of our implementations of this algorithm output f, instead of 1.
I don't know why.
I looked at the page that you linked to, and the description matches your own description of the algorithm above too. However, I feel that there's something else missing.
ReplyDeleteI found that the input { 1 1 1 2 2 3 3 } also gives a wonky result. Is this algorithm not always 100% (similar perhaps to a bloom filter which has a % change to give a false positives)?
http://stackoverflow.com/questions/780937/linear-time-voting-algorithm-i-dont-get-it
ReplyDeletemrjbq7, I didn't read the assumptions carefully enough! There needs to be a MAJORITY (duh!) element in the entire sequence of at greater than N/2.
I mistakenly thought that majority meant "highest frequency group", which is a different problem altogether.
Sorry for any confusion.
@otoburb: I think I just realized why. One of the requirements that I didn't notice is that the sequence has an element that exists more than N/2 times.
ReplyDeleteSo, its not just the "most common element" (which I had originally thought), but actually the "majority element", meaning more than half the sequence consists of the element.
Just for fun using tuple and reduce.
ReplyDeleteShorter
[ [ 1 + ] ] [ [ 1 - ] ] if change-n
fails with Cannot apply “call” to a run-time computed value, sadly.
Here's my crack at it:
ReplyDelete:: (majority) ( candidate count elt -- candidate count )
{
{ [ count 0 = ] [ elt 1 ] }
{ [ candidate elt = ] [ candidate count 1 + ] }
[ candidate count 1 - ]
} cond ;
: majority ( seq -- elt ) [ f 0 ] dip [ (majority) ] each drop ;