Published on December 23, 2024 · Reading time: 4 minutes
So, you have installed CircuitPython onto the development board you own, and you want to use the framebuf library to interact with the display buffer directly. Maybe you are kinda forced to do so, because displayio is not available for your device. Either way, you try some demo code, and… it takes a whole second to simply fill a little OLED screen with text.
Pretty bad, huh? Luckily, you can fix this by adding a few lines of code to your program, which will significantly improve the rendering time.
I have confirmed this code runs well on the Raspberry Pi Pico with a 128x64 SSD1306 OLED display installed. This will probably NOT work with color displays and pixel matrices.
Why is font rendering slow?
The framebuf library is compatible with multiple types of displays, so it exposes a unified API, and as a side effect there is some performance loss. Every character you are going to render needs to be read from the font file (there is no cache), and then it is written to the buffer one pixel at a time.
That doesn’t sound bad on its own, but here’s the catch: rather than setting a single pixel, the code draws a 1×1 px rectangle. Every rectangle needs to be rotated and checked against the screen bounds before the buffer data is actually updated. On top of that, this is a pure Python implementation, so no wonder it is so slow.
# Go through each row in the column byte.
for char_y in range(self.font_height):
    # Draw a pixel for each bit that's flipped on.
    if (line >> char_y) & 0x1:
        framebuffer.fill_rect(
            x + char_x * size, y + char_y * size, size, size, color
        )
If you do not care about screen rotation, and you are sure you won’t go out of bounds, you can try to use a simpler implementation.
How can this be optimized?
It just so happens that the display buffer and font data are arranged in the same way. Monochrome displays with an SSD1306, SH1106 or ST7565 driver have pixels arranged in pages, each one 8 pixels tall. Inside a font file, each column of a glyph is represented by a single byte that can simply be copied to the buffer.
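To make the layout concrete, here is a minimal sketch that runs on a desktop Python as well. It assumes a 128x64 buffer where each byte holds 8 vertically stacked pixels and bit n is row n of the page; the helper names set_pixel and blit_column are made up for illustration and are not part of framebuf. It checks that writing one font column bit by bit gives the same buffer as copying the byte in one go.

```python
WIDTH, HEIGHT = 128, 64  # assumed panel size, matching the display used in this post


def set_pixel(buf, x, y):
    """Set one pixel in a page-organized buffer (8 rows per byte)."""
    buf[WIDTH * (y >> 3) + x] |= 1 << (y & 7)


def blit_column(buf, x, y, column_byte):
    """Copy a whole 8-pixel font column at once; y must be a multiple of 8."""
    buf[WIDTH * (y >> 3) + x] |= column_byte


# One made-up glyph column with rows 0, 2 and 7 lit (bit n = row n).
column = 0b10000101

# Slow path: one buffer update per lit pixel.
slow = bytearray(WIDTH * HEIGHT // 8)
for row in range(8):
    if (column >> row) & 1:
        set_pixel(slow, 10, 16 + row)

# Fast path: a single buffer update for the whole column.
fast = bytearray(WIDTH * HEIGHT // 8)
blit_column(fast, 10, 16, column)

print(slow == fast)
```

Both buffers end up identical, which is why a page-aligned glyph column can be written with a single byte operation.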
The code responsible for drawing text is decoupled from the rest of the framebuffer implementation. We can easily override the draw_char() method of the BitmapFont class:
import struct

from adafruit_framebuf import BitmapFont


class FastBitmapFont(BitmapFont):
    def draw_char(self, char, x, y, framebuffer, color, size=1):
        if y % 8 != 0 or size != 1:
            # Not page-aligned (or scaled text): fall back to the default (slower) implementation.
            return super().draw_char(char, x, y, framebuffer, color, size)
        # Go through each column of the character.
        for char_x in range(self.font_width):
            # Grab the byte for the current column of font data.
            self._font.seek(2 + (ord(char) * self.font_width) + char_x)
            try:
                line = struct.unpack("B", self._font.read(1))[0]
            except RuntimeError:
                continue  # maybe the character isn't there? go to the next one
            # THIS SINGLE LINE REPLACES THE framebuffer.fill_rect() CALL
            framebuffer.buf[framebuffer.width * (y >> 3) + x + char_x] |= line
Add this to your display initialization code:
display = SSD1306_I2C(128, 64, I2C(board.GP21, board.GP20))
display._font = FastBitmapFont()
Now the same program needs only about 200 ms to complete.
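If you want to check the numbers on your own board, a rough timing helper could look like the sketch below. It is an illustration, not the exact benchmark used for this post: average_ms is a made-up name, the workload is a stand-in so the snippet runs anywhere, and the display.text() call in the comment assumes a display object set up as shown above.

```python
import time


def average_ms(func, repeats=5):
    """Call func() `repeats` times and return the mean duration in milliseconds."""
    start = time.monotonic()
    for _ in range(repeats):
        func()
    return (time.monotonic() - start) * 1000 / repeats


# Stand-in workload so the sketch runs anywhere. On the board you would pass
# something like: lambda: display.text("Hello", 0, 0, 1)
elapsed = average_ms(lambda: sum(range(1000)))
print("average: %.1f ms" % elapsed)
```

Averaging a few runs smooths out the limited resolution of time.monotonic() on small boards.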
But wait, there’s more! Have you noticed that in each loop iteration, we read only one byte of glyph data? What if we read all bytes at once and skip the struct module entirely?
from adafruit_framebuf import BitmapFont


class FastBitmapFont(BitmapFont):
    def draw_char(self, char, x, y, framebuffer, color, size=1):
        if y % 8 != 0 or size != 1:
            # Not page-aligned (or scaled text): fall back to the default (slower) implementation.
            return super().draw_char(char, x, y, framebuffer, color, size)
        # Grab all bytes for the current glyph from font data in one read.
        self._font.seek(2 + (ord(char) * self.font_width))
        data = self._font.read(self.font_width)
        # Go through each column of the character.
        for char_x in range(self.font_width):
            framebuffer.buf[framebuffer.width * (y >> 3) + x + char_x] |= data[char_x]
With this custom BitmapFont implementation, we are down to less than 100 ms.
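One caveat: the override ORs glyph bytes into the buffer, so it can only turn pixels on. If you also need dark text on a lit background, the same column copy could be extended roughly as in the sketch below. The function blit_glyph and its parameters are hypothetical stand-ins for framebuffer.buf, framebuffer.width and the bytes read from the font file; I have only reasoned this through on a desktop Python, not timed it on hardware.

```python
def blit_glyph(buf, width, x, y, data, color=1):
    """Copy page-aligned glyph columns into a page-organized buffer.

    buf, width and data are hypothetical stand-ins for framebuffer.buf,
    framebuffer.width and the glyph bytes read from the font file.
    """
    base = width * (y >> 3) + x
    for i, column in enumerate(data):
        if color:
            buf[base + i] |= column          # turn glyph pixels on
        else:
            buf[base + i] &= ~column & 0xFF  # turn glyph pixels off


# A tiny 16x8 "screen", fully lit, with a made-up two-column glyph cleared out of it.
buf = bytearray([0xFF] * 16)
blit_glyph(buf, 16, 2, 0, b"\x7e\x11", color=0)
print(hex(buf[2]), hex(buf[3]))
```

The extra branch costs a little time per column, so it is worth measuring before adopting it.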
At this point, the only way to improve the performance is to either cache glyphs (which does not seem to help), try to increase I2C or SPI frequency, come up with another solution that does not use framebuf internals, or use another programming language.