Polyvoice 3 is a Max 7 object for managing polyphony. While Max has its own polyphony-management objects, there are a number of reasons you might wish to manage polyphony yourself. The first and most obvious is that you can then create a display which shows the notes actually playing. In the demonstration patch, for example, if you click and drag across the keyboard, the older notes turn themselves off automatically. You can also change the maximum number of playing notes dynamically, which is useful even as a performance tool.
1. The Methodology
Polyvoice 3 is a third-generation Max design. The first generation used coll objects to keep lists of playing notes in an external object. The second generation instead used zlist, not requiring an external file and providing vector performance. Those designs, including those in Yofiel's Godel 2, used a programmatic approach for managing polyphony, with standard round-robin assignment and iterative list reorganization to reduce clipped release phases. That is, voice age was indexed by voice number, just as for pitch and gate.
The new design still maintains pitch and gate lists organized by voice number, but for age, it instead uses a queue-based model with a 'reverse-indexed' list. This means that list indices indicate the note's age, and the contents of each index slot are the voice number, whereas before the indices were the voice and the slot contents the age. By this 'queue' method, the content of the age list at index 1 is the voice number of the first played note; at index 2, the voice number of the second played note; and so on. Thus if, for example, three notes are playing, and the second is released, then the voice number in slot 2 is moved to the last index permitted by the maximum number of notes, and all the other values after slot 2 are shifted one to the left. By this method, the most recently released voices are re-used last, permitting the longest possible release phase for each note; moreover, the voice numbers for the playing notes are at the beginning of the list, in the order they were played.
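The release step in that example can be sketched in Python. Max patches have no textual form, so this is only an illustration of the list operation: the function name `release_slot` and the four-voice maximum are assumptions made for the example, not part of the patch.

```python
def release_slot(age_queue, slot):
    """Return a new queue with the voice at 1-based `slot` moved to the end.

    Every entry after `slot` shifts one place to the left.
    """
    i = slot - 1  # Python lists are 0-based, the text counts slots from 1
    return age_queue[:i] + age_queue[i + 1:] + [age_queue[i]]


# Hypothetical state: maximum of 4 voices, voices 1-3 playing in that order.
age_queue = [1, 2, 3, 4]
age_queue = release_slot(age_queue, 2)  # the second-played note is released
print(age_queue)  # [1, 3, 4, 2]
```

After the release, the two playing voices (1 and 3) remain at the front in order played, the unused voice 4 is next in line for allocation, and the just-released voice 2 is reused last.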
There are two main benefits:
- The first benefit of the queue model, compared to others, is that it is easy to change the maximum number of notes without disrupting the existing queue. When the maximum number of permitted notes is increased, the queue length is simply increased, and new notes can make use of the new voices without further ado. When the maximum number of notes is decreased, notes below the new threshold can continue to play without any reset, and voices above the new threshold can be identified simply by slicing the list. Hence Polyvoice 3 detects when the number of voices is lower than before and sends note-off events to the extra notes. As a result, changing the number of playing voices can become a performance tool for the musician, as simply reducing the polyphony thins the sound to the earlier played notes, while increasing the polyphony can deepen the chordal texture.
- Second, a queue model simplifies age-based note processing, because the list of notes in order played is directly generated during play. In the demo patch, an on-screen pitch-bar display exemplifies this benefit, with the oldest pitches on the left and the newest on the right. If a note is turned off in the middle of the sequence, the patch also removes that note from the middle of the bar display, maintaining display of remaining active notes in order played.
The age queue is useful for real-time sequencing and arpeggiation. Sequences of short rapid notes can be very taxing on the CPU, so with this method, one can build the list of ordered notes once, at the time it is played, rather than looking up the age index, finding the voice number, then finding the pitch for that voice. That is, one stage of index lookup is removed, because the queue of notes in order played is made only once. For arpeggiation, one simply cycles through the numbers in the pre-built ordered lists from the object, without looking up the voice for each age index every time that note is reached in the sequence.
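As a rough illustration of the lookup that is saved, the following Python sketch builds the ordered pitch list once from the age queue and then lets an arpeggiator cycle through it. The names `pitches_in_order`, `pitch_by_voice`, and `active_count`, and the sample values, are assumptions for the example rather than names used in the patch.

```python
from itertools import cycle, islice


def pitches_in_order(age_queue, pitch_by_voice, active_count):
    """Pitches of the active voices, oldest first.

    age_queue holds 1-based voice numbers in order played;
    pitch_by_voice is indexed by voice number minus one.
    """
    return [pitch_by_voice[voice - 1] for voice in age_queue[:active_count]]


# Hypothetical state: voice 3 was played first, then voice 1, then voice 2.
age_queue = [3, 1, 2, 4]
pitch_by_voice = [60, 64, 67, 0]

# Built once, when the set of held notes changes.
ordered = pitches_in_order(age_queue, pitch_by_voice, active_count=3)
print(ordered)  # [67, 60, 64]

# The arpeggiator then steps through the pre-built list with no further lookups.
print(list(islice(cycle(ordered), 5)))  # [67, 60, 64, 67, 60]
```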
1.1. Examples
The Godel 3.1 design (currently still in alpha) contains two versions of Polyvoice. The first, in the lower left of the following schematic, stores up to 32 notes with a patch preset, and provides the notes in order played to the arpeggiator. The Polyvoice age queue sorts the notes for the arpeggiator, as described above.
The second Polyvoice version, in the lower half of the following schematic, receives notes from the arpeggiator, which may overlap and have different durations; combines them with direct input from the on-screen keyboard or MIDI; adds chord notes; and allocates the audio voices. The chord generator also creates additional notes of different durations during arpeggiation, so a second allocator is really necessary. This Polyvoice version contains two enhancements: first, when chords are played from the keyboard or MIDI, it turns the chord off when a note-off message is received for the root note. Second, it manages overlapping notes from the arpeggiator, turning them off automatically when the end of their duration is reached.
With 32 voices playing at rapid speed, and in conjunction with other optimizations in this third-generation design, Polyvoice reduced typical CPU load by 5-10% on a 3GHz i7 in the alpha version of Godel 3.1, and also reduced peak CPU load at voice-change events. As noted, Godel 3.1 contains two versions of Polyvoice 3; the first maintains a note store and generates the pitch/gate list for Godel's real-time arpeggiator.
The Polyvoice version in this download is simpler than the implementations above; it is shown and described below.
2. Design Implementation
A large number of variables are needed in every phase of polyphony management, and attempts to divide the design into functional subcomponents result in very large arrays of wiring loops. The patch is therefore in one design sheet. This also reduces the number of queue messages, which was found to be the most significant part of the CPU load in rapid note sequences. For this reason also, the note-on and note-off processing entirely uses LISP-style vector operations instead of iterators.
As it's in one sheet, color coding helps follow the datapath:
- Pink: Initialization.
- Magenta: Max voices.
- Red: Pitch.
- Blue: Velocity.
- Yellow: Age.
- Brown: Pitch+gate I/O.
- Green: Voice Index.
2.1. Kslider Object
If a user is pressing the mouse on the kslider object when it receives a 'set' message, kslider resends the current pitch and velocity values. Of course, this would cause the allocator to create additional notes after it tries to turn off a note on the kslider. In prior designs, I placed a transparent 'note input' kslider object over the top of a 'note display' kslider object. However, then some notes remain turned on in the top kslider when they are actually turned off, requiring two clicks to start a new note instead of one. So new in this version is a simple solution: a little datapath in the top right of the subpatch which filters out identical messages from the kslider object. Now there is no more need for double clicks.
There is still one remaining minor problem: when clicking and dragging more than the maxvoice count, the first clicked note remains highlighted until the mouse is released. This appears to be because the first note is the 'active' note to the kslider object, but it changes to inactive color anyway once the mouse slides to a different note, and there is no option to modify this behavior. So it would be possible to use a transparent kslider overlay to remove that highlight, and to use the new design's output to both layers, to avoid second-click requirements on the top layer, and also hide the active highlight; but really the highlight is a display artefact that does not affect play, so for ease of demonstration, this design just has one kslider object.
As a footnote, the kslider object also appears to be a class derivative of the pict object, so pitch and gate values ascend from the top left corner. Thus one has to subtract the velocity output from 128 to get something approaching a normal velocity value. The Godel design, which hopefully will reach final completion in the next few months, includes velocity scaling panels to properly adjust the amplitude range.
2.2. List Processing
The most difficult part of the implementation was slicing lists where one slice could be empty, depending on the slice point. This is because there is no output from 'zl slice' if that slice segment is empty, and if that is the left operand, then the trigger for the next object has to be created manually. The join object would be useful for combining more than two lists, but it cannot join lists if one of the lists is empty, hence one must combine segments with a string of 'zl join' tuple catenations. For note-off events this was particularly problematic. The current solution moves turned-off notes to the beginning of the queue and then rotates the complete queue to the left. As a result, there's always at least one value in the left operand, but at least one 'zlclear' message is still necessary before starting the list alteration, because when the right operand is empty, remnants from prior operations are not replaced and are re-inserted into new strings, causing data corruption. It may be that some better method will emerge, but this method appears to result in the fewest messages. In Godel 2, which used iterators, 32 voices could create several hundred messages for each note-off event, simply to update the age queue. The current implementation requires fewer than a dozen for the same task.
The second difficulty is that the only object designed to insert values into a list is the 'zl nth' object, which requires a tuple in the right inlet defining the index and value, followed by the list for the data insertion in the left inlet. For storage there is therefore a 'zl reg' object above a 'zl nth' object, with the output of 'zl nth' feeding into the 'zl reg' passive input. Therefore, to load a complete new list, one has to set one of the list values in the 'zl nth' right inlet, then put the entire list into its left inlet, in order to replace the 'zl reg' contents. This single requirement is the main reason for the large number of connections.
2.3. Voice-Count Accumulator
The Max accumulator object requires a bang message to add or subtract a value and generate output. Thus to count voices, there is an add object which sends its result back into its passive input, which provides the same functionality as an accumulator, while simplifying trigger sequences, because only one message is required to add or subtract a value. The reset button sends the add object a '0 0' message to reset its output.
2.4. Note-On Events
On note-on events, the object first checks to see if there are any voices available. If so, it simply sets the output to use the next voice in sequence, then increments the voice-count accumulator, and makes no change to the age queue. If there are no voices available, it first sends a note-off message for the oldest voice, then rotates the age queue to the right, so the oldest voice is in the first slot, and replaces the pitch and gate values with the new note data in the first slot. This is just the standard round-robin approach to ensure that the oldest voice is taken by new notes when the allocator is full.
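The note-on logic can be approximated in Python as below. This is a sketch under stated assumptions, not a transcription of the patch: the `state` dictionary, the `note_on` function, and the event tuples are hypothetical, velocity is stored as the gate value, and the full-allocator case is modelled by moving the oldest voice to the end of the queue so that the stolen voice becomes the newest entry. The patch expresses that step as a rotation, and its internal list orientation may differ from this model.

```python
def note_on(state, pitch, velocity):
    """Allocate a voice for a new note and return the messages to send."""
    events = []
    age = state["age"]
    if state["count"] < state["max_voices"]:
        # A voice is free: take the next one in sequence, queue unchanged.
        voice = age[state["count"]]
        state["count"] += 1
    else:
        # Allocator full: steal the oldest voice (front of the queue).
        voice = age[0]
        events.append(("off", voice, state["pitch"][voice - 1]))
        # ASSUMPTION: the stolen voice becomes the newest queue entry.
        age.append(age.pop(0))
    state["pitch"][voice - 1] = pitch
    state["gate"][voice - 1] = velocity  # assumption: gate store holds velocity
    events.append(("on", voice, pitch))
    return events


# Hypothetical two-voice allocator.
state = {"max_voices": 2, "count": 0,
         "age": [1, 2], "pitch": [0, 0], "gate": [0, 0]}
print(note_on(state, 60, 100))  # [('on', 1, 60)]
print(note_on(state, 64, 90))   # [('on', 2, 64)]
print(note_on(state, 67, 80))   # [('off', 1, 60), ('on', 1, 67)]
print(state["age"])             # [2, 1]
```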
2.5. Note-Off Events
The object first searches for any active notes with the same pitch (voice overflow could have caused the voice to be turned off). If it's found, then the pitch and gate values are set to zero for that voice, and the active voice count decremented. Then the object looks up the index of that voice in the age queue, moves the voice number at that index to the end of the age queue, and shifts all the other voice numbers after its age index left.
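A minimal Python sketch of these note-off steps follows. The `state` dictionary and the `note_off` function are hypothetical names for illustration, and a pitch of zero is treated as marking an empty voice, as the stores are described in this document.

```python
def note_off(state, pitch):
    """Release the voice playing `pitch`; return its voice number, or None."""
    # Zero marks an empty voice, so it can never match a playing note.
    if pitch <= 0 or pitch not in state["pitch"]:
        return None  # e.g. the voice was already stolen by voice overflow
    voice = state["pitch"].index(pitch) + 1  # voices are numbered from 1
    state["pitch"][voice - 1] = 0
    state["gate"][voice - 1] = 0
    state["count"] -= 1
    # Move the voice to the end of the age queue; later entries shift left.
    state["age"].remove(voice)
    state["age"].append(voice)
    return voice


# Hypothetical state: four voices, three notes held in the order 60, 64, 67.
state = {"count": 3, "age": [1, 2, 3, 4],
         "pitch": [60, 64, 67, 0], "gate": [100, 90, 80, 0]}
print(note_off(state, 64))  # 2
print(state["age"])         # [1, 3, 4, 2]
print(note_off(state, 72))  # None
```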
2.6. Pitch, Gate, and Age Lists
On initialization, the pitch and gate stores are filled with zeroes, and the age store is filled with a number sequence '1 2 3...'. On reset the pitch and gate stores are set to all zeroes, but the age queue is not changed. When the pitch and gate stores change, the array is truncated to the maximum number of voices and output from the object as pitch and gate lists. The age queue is slightly different. It only contains values up to the number of available voices.
2.7. Max Voices
When this is changed to a lower value, the object slices the 32-element age list at the new value, creates a list of indexes for voices above the new value, and checks to see which ones of those have pitches above zero. If they do, indicating that the note was on, then note-offs are issued for those pitches, the voice-count accumulator is decremented, and the pitch and gate stores are cleared for that voice. When the max-voice count increases, existing pitch and gate values are untouched, but the object adds new values to the age queue and resets them to the original sequence. Note that this is not ideal, and an incremental design improvement could preserve the existing age queue via buffering for enhanced real-time performance control of playing voices.
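The reduction case might look like this in Python. It is a sketch only: `reduce_max_voices` and the `state` dictionary are hypothetical, and it assumes that the voices to silence are those in queue slots beyond the new maximum, following the earlier remark that reducing polyphony thins the sound to the earlier-played notes. For brevity it truncates the queue directly instead of keeping a fixed 32-element store.

```python
def reduce_max_voices(state, new_max):
    """Lower the voice limit; return (voice, pitch) pairs needing a note-off."""
    keep, cut = state["age"][:new_max], state["age"][new_max:]
    note_offs = []
    for voice in cut:
        pitch = state["pitch"][voice - 1]
        if pitch > 0:  # a pitch above zero means the note was on
            note_offs.append((voice, pitch))
            state["pitch"][voice - 1] = 0
            state["gate"][voice - 1] = 0
            state["count"] -= 1
    state["age"] = keep  # simplification: the sketch discards the cut slots
    state["max_voices"] = new_max
    return note_offs


# Hypothetical state: voices 2, 4, 1 are playing in that order; voice 3 is free.
state = {"max_voices": 4, "count": 3, "age": [2, 4, 1, 3],
         "pitch": [60, 64, 0, 67], "gate": [100, 90, 0, 80]}
print(reduce_max_voices(state, 2))  # [(1, 60)]
print(state["pitch"])               # [0, 64, 0, 67]
print(state["count"])               # 2
```

Only the newest held note (voice 1) is turned off, while the two earliest notes continue to play undisturbed.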
A demonstration patch is available in the Synthcore2 bundle.