{ Mario Zechner }

developer • coach • speaker

Boxie - an always offline audio player for my 3 year old

2025-04-26

Channeling the spirit of the Gameboy

At the end of July 2024, I embarked on a journey to learn electronics so I could build little gadgets for my son. In my introductory post I described a few electronic toys he uses a lot, including the Tonie Box, a device that plays audiobooks and has a huge library of content. The Tonie Box is great, until it isn't. As outlined in my original post, it has a bunch of deficiencies. Back then, I did not think I would ever be in a position to build my own replacement.

Fast forward a few months, and I've learned enough electronics to actually build my son's own audio player. Here's the preliminary result:

It's been in "prod" since January 2025. It's his daily driver, be it at the breakfast table or when going for a stroll in his buggy.

What I learned these past months

Turns out it's easier than I thought. I had to pick up a few skills:

I also had to buy myself some tools. Here's my battle station.

Battle Station

The list of all my tools (including links to sites where you may or may not be able to purchase them):

Now, the past few entries on this blog may have given you the idea that I'll teach you everything you need to get to a stage where you can use all the above tools. I'm afraid to inform you that this is not the case. It would take considerable resources on my end, which I'd rather spend on my family and tinkering with my own projects.

However, the above list of tools and things to learn is a good starting point once you're past the "made an LED blink with an Arduino" stage. (SMD) soldering, PCB design, 3D modelling for 3D printing and so on are all things many gifted people have described in depth, either in book or video form, much better than I ever could. See my previous posts for recommendations.

I also found that studying existing designs is a fantastic way to learn, same as with software. For example, all Adafruit products are open source, meaning you can study their schematics and PCB designs for all those little helpful boards they offer. Want to design a power management circuit for your own board? Have a look at their PowerBoost 1000, scroll down to the "Technical Details" section and enjoy the schematics and PCB layout files. Similarly, as I created my own ESP32-based boards, I took inspiration from the designs by Unexpected Maker and Waveshare. Just pick one of their boards, e.g. the excellent Waveshare ESP32-S3 Mini, and find the schematics on their Wiki.

My final recommendation is to get away from Arduino as soon as you possibly can. It's a great way to learn, but eventually, you'll need to go one level deeper. That opens up a whole new world of possibilities. Like porting DOOM to the ESP32-S3 and having it run at 44FPS.

Eventually, you'll write your own Arduino-like framework to handle all the low level stuff, like communicating with SPI displays, driving audio or neopixels, etc. My entry into this is mcugdx, a simple C-API on top of ESP-IDF that can already do a lot (on ESP32-S3 at least). You can also find the above DOOM port in the mcugdx examples. I don't currently provide support or documentation. If you know your way around ESP-IDF, you should be able to figure it out, given the examples and the sdkconfig files found in the repository.

In the remainder of this post, I'll go through the design process of Boxie, walk you through the schematics and PCB layout, show you how I came up with the 3D printed enclosure design, and give an overview of the software I wrote to make it all work.

Like a Gameboy, but for audiobooks

I started by thinking about how I want the boy to interact with the device. I'm in love with the Gameboy form factor, so I wanted a device that is similar in size and shape. It should be portable, so it needed to be battery powered.

Instead of the Tonie Box's ears that control the volume, I wanted to have a knob he can turn to change the volume. To go to the previous or next chapter or song on the Tonie Box, you have to smack it hard. That usually does not work, and if it does, the Tonie figure usually falls off the device. I wanted simple buttons via which he can navigate through chapters and songs.

As for the actual audio content, the Tonie Box basically stores all audio on an SD card. The Tonie figures are just RFID tags that trigger playback of a specific audio file on the SD card. This whole setup requires the Tonie Box to be connected to the internet to download the audio files. It also (potentially) gives the company a lot of data on what the user is listening to.

I hate that. So I decided that the device will be always offline. The content must be brought in "physically", reminiscent of the cassette tapes of my youth. And what better form factor than that of a Gameboy cartridge? With a slight twist. When you insert a cartridge into a Gameboy, the label on the cartridge disappears partially into the slot. Makes sense, you then focus on the display in front to play the game.

While driving around in the car with my boy, we usually listen to his audiobooks via Android Auto and Spotify. That displays the audiobook's cover on the screen on my dashboard. While listening to the audiobook, he'd focus on that cover intensely. The covers usually depict the characters in the story, and are distinct for each story. I thus decided that the cartridge's label should always be fully visible to him, while he listens to the audiobook. This means the cartridge sticks out at the top of the device. Yes, it is a potential breaking point. Just means I need the design to be sturdy.

Finally, I want the device to be fail-safe as much as possible. No need for a power button. Inserting the cartridge will turn it on. Removing a cartridge will turn it off. Ideally, none of the components inside the device will be damaged or explode when the boy inevitably drops it. This includes the battery, which means LiPos are out, NiMH are in.

In summary, the device needs to be:

The cartridge

Since I use an ESP32-S3 as the brain, I opted for micro SD cards as the medium to store audio files. It's super easy to access a FAT filesystem on an SD card via ESP-IDF. Just wire up the SD card's pins to the ESP32-S3's GPIOs and use the SD card API.

To elicit that Gameboy cartridge feel and to make inserting and removing the cartridge easy and sturdy, I designed a custom PCB that holds and exposes the micro SD card's pins, and a 3D printed cover, onto which I could stick the label. You can view the end result in this instructional video for my SO, so she can assemble cartridges as well:

The PCB is a trivial 2 layer board. A micro SD card socket holds the card. Each of the pins is exposed to its corresponding big pad at the bottom of the PCB. Four M2 screw holes let me screw on the 3D printed cover.

To decide on the size, I drew a bunch of rectangles onto millimeter grid paper and picked the one that looked right. Grid paper is your best friend when getting a feel for sizes!

I designed the cover in Fusion 360. I exported the PCB design from EasyEDA as a STEP file and imported it. I created a sketch at the top of the PCB, projected the PCB outline onto it, and extruded it to a depth that looked good. Finally, I added a cutout for the SD card socket, screw holes, and an indented area for the label. That took 5 minutes.

I sent off the PCB design to JLCPCB and ordered a batch of 100 PCBs, along with 100 SD card sockets from LCSC. I also printed 30 covers on my 3D printer.

I then proceeded to hand solder 100 SD card sockets onto 100 PCBs, cursing myself that I didn't order a stencil and didn't use JLCPCB's assembly service. It took 2 nights. I used a toothpick to apply the solder paste to the pads.

Yeah, I'm dumb. You might have noticed the 2 capacitor pads on the PCB layout above. I ended up not populating those, which saved a bit of time.

The cartridge slot connector

How do I connect the cartridge pads to the ESP32-S3? The Gameboy uses a traditional edge connector, with gold plated contacts on the cartridge and a corresponding connector in the cartridge slot.

Since I didn't know how to do "chamfered gold fingers" in EasyEDA, nor how to get JLCPCB to manufacture such PCBs, I opted for a different approach. But not without failing first.

The cartridge design was inspired by one of Abe's projects. In his design, he uses pogo pins on the cartridge reader to make contact with the pads on the cartridge. So I tried the same:

Which did not work at all. It was extremely hard to get the vertical spacing correct. And even with a somewhat OK spacing, actually inserting a cartridge was tough. Too tough for a 3 year old. It was also very easy to bend the soldered pogo pins.

After some thinking, and given the requirement of "a 3 year old should be able to do it", I came up with this:

I use battery springs, which are super sturdy and survive contact with a 3 year old's enthusiasm. I don't even solder anything to them. Instead, I crimp ring terminals onto the wires and screw them, together with the battery springs, into the 3D printed enclosure.

The end result is a mechanism that is easy to assemble, easy to repair, and (so far) indestructible. Here's the first time I inserted a cartridge into the mechanism:

The device's internals were still living on a breadboard at this point, as I hadn't figured out how to actually put them into the enclosure yet.

Selecting a DAC, amp and speaker

The device should play back music read from an SD card on a mono speaker. For that, I need a digital to analog converter (DAC) that takes digital audio data from the ESP32-S3 and transforms and amplifies it to drive the speaker.

There are many integrated circuits available for that task. The MAX98357A is a popular and cheap choice. Adafruit sells an easy to use breakout board for it and you can find dozens of clones on AliExpress and Amazon.

The MAX98357A is a mono amplifier that can output up to 3W into a 4 Ohm speaker at 5V. That's more than enough for my use case. It also speaks the I2S protocol, which is a standard protocol for digital audio. My little mcugdx framework was already able to handle audio output through I2S, so the MAX98357A was a good fit.
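To make "digital audio data" a bit more concrete: what travels over I2S is a stream of PCM samples. Here's a minimal host-side sketch that generates such samples for a test tone. The sample rate, bit depth and amplitude are my assumptions for illustration, not values taken from the Boxie firmware.

```python
import math
import struct

SAMPLE_RATE_HZ = 44_100  # assumed sample rate, not stated in the post
AMPLITUDE = 0.5          # fraction of full scale, leaves some headroom


def sine_pcm16(freq_hz: float, duration_s: float,
               sample_rate_hz: int = SAMPLE_RATE_HZ) -> bytes:
    """Return mono, signed 16-bit little-endian PCM samples of a sine tone."""
    n = round(duration_s * sample_rate_hz)
    samples = [
        int(AMPLITUDE * 32767 * math.sin(2 * math.pi * freq_hz * i / sample_rate_hz))
        for i in range(n)
    ]
    return struct.pack(f"<{n}h", *samples)


# 10 ms of a 440 Hz tone: 441 samples, 2 bytes each.
pcm = sine_pcm16(440.0, 0.01)
print(len(pcm), "bytes")
```

On the device, a buffer like this would be handed to the I2S peripheral, which clocks it out to the amp sample by sample.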

Given the amp specs, I went on a hunt for a good speaker. I ordered a few spec compliant speakers from Amazon and discovered the art of speaker enclosure design.

I eventually ended up with a Visaton FR 7/4, which has pretty good bandwidth and sounds good to my ears. Better than the speaker on the Tonie Box.

ESP32-S3 board with power management and battery charging

Before starting this project, I actually dabbled in designing my own ESP32-S3 board. There are many great boards on the market, with those from Waveshare and Unexpected Maker being my favorite. But I wanted to learn how to design such a board, should I ever need something that's not available on the market.

Turns out, I needed exactly that. None of the existing boards support charging NiMH batteries. They often don't have undervoltage protection built-in, but instead rely on you measuring the voltage via GPIO and then switching the ESP32-S3 into deep sleep. I wanted to avoid that. Some also don't like having both USB-C and a battery connected at the same time. That's a requirement for my device, as I don't want to unplug the batteries just so I can connect a USB-C cable to flash new software or do some debugging.

I went through a few iterations and eventually ended up with this:

The board is meant as a jumping off point for all my project designs. As such, its features are not specific to this project, but rather things I've found useful in all my projects:

Here's the full schematic:

Let me walk you through the schematic real quick.

USB-C

Connecting a USB-C cable powers the battery charger and the device. The 5.1k Ohm resistors on the CC1 and CC2 pins allow the device to draw 0.5A if connected to a USB 2 source, or 0.9A if connected to a USB 3 source. The USB-C connector provides 5V to the system if connected to a USB source like my laptop.
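For the curious, here's a small sketch of how the CC resistors work. The 5.1k pull-down on the device forms a voltage divider with a pull-up inside the source, and the resulting CC voltage tells both sides what's going on. The pull-up values are the ones I recall from the USB Type-C specification for a 5V pull-up; treat them as an assumption and check the spec before relying on them.

```python
RD_OHMS = 5_100     # pull-down on the device side, as in the schematic
V_PULLUP = 5.0      # assumes the source pulls CC up to 5 V

# Source-side pull-up per advertised current (assumed from the Type-C spec).
RP_OHMS = {
    "default USB power": 56_000,  # 0.5 A (USB 2) or 0.9 A (USB 3)
    "1.5 A": 22_000,
    "3.0 A": 10_000,
}


def cc_voltage(rp_ohms: float, rd_ohms: float = RD_OHMS,
               v_pullup: float = V_PULLUP) -> float:
    """Voltage on the CC line for a given source pull-up (plain voltage divider)."""
    return v_pullup * rd_ohms / (rp_ohms + rd_ohms)


for label, rp in RP_OHMS.items():
    print(f"{label}: CC sits at about {cc_voltage(rp):.2f} V")
```

My board never reads the CC voltage. It only needs the pull-downs to be there so a USB-C source turns on its 5V output at all.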

For debugging and flashing, the DN and DP pins are connected to the ESP32-S3 GPIOs 19 and 20. The ESP32-S3 has built-in USB capabilities that handle all the communication. When routing these traces on the PCB, it's important to maintain equal length for both signals (known as a differential pair) to ensure proper USB signal integrity.

ESP32-S3

There are various variants of the ESP32-S3. I picked one with a built-in antenna, 8MB of flash, and 8MB of PSRAM. That's been sufficient for all my projects so far. And I don't need to wire up those components myself. Here's how it is wired up.

This mostly follows the recommendation in the ESP32-S3 hardware design guidelines. I didn't include a crystal needed for accurate deep sleep timing, as I never need that feature.

C1 and C3 are decoupling capacitors. R3 is a pull-up on the enable pin, that gets "overwritten" if the reset button is pressed, which grounds the pin and resets the device. C4 is a debounce capacitor for the reset button. A similar button setup is found for the boot button on the left side. Pretty simple!

The GPIOs are exposed at the edge of the PCB.

Battery charging via BQ25171-Q1

I spent a long time trying to find an IC that can charge both NiMH and LiPo batteries. I eventually settled for the BQ25171-Q1, by Texas Instruments.

Here's the part of the schematic that deals with battery charging:

CN2 on the right side is a 2 pin 2mm PH connector, to which the battery is connected. VBAT is the battery voltage. On the left side of this block you find 5V_USB, which is the 5V from the USB-C connector. If USB-C is not connected, then VBAT powers the system, while the charger IC does basically nothing. If USB-C is connected, then 5V_USB powers the charger IC, which charges the battery. The battery is cut off from the system in that case, which we'll see in the next section.

The remainder of the schematic is made up of components to configure the charger IC.

R7 connected to ISET sets the charge current to approximately 0.5A. R8 pulls the TS pin down to ground, disabling battery temperature monitoring. The PCB layout has provisions to wire up an NTC thermistor instead of soldering the R8 pull-down resistor.

The CHM_TMR pin is connected to a dip switch, which selects one of two resistors. R16 signals to the IC that we want to charge a single-cell LiPo battery at maximally 4.2V, with a 5 hour safety timer. The IC will actually terminate charging earlier and follow the usual LiPo charging curve. R15 signals to the IC that we want to charge 3 NiMH batteries in series, with a max charging voltage of 4.2V and a 4 hour safety timer. This configuration will not disable charging when the voltage reaches 4.2V, but instead keep charging until the safety timer is hit (or the optional thermistor signals that the battery is too hot).
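If you think better in code than in resistors, the two dip switch positions boil down to a small lookup table. This is just my description from above written down as data. The field names are made up for illustration and don't correspond to anything in the datasheet.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChargeMode:
    resistor: str                 # which resistor the dip switch selects
    chemistry: str
    cells_in_series: int
    max_voltage_v: float
    safety_timer_h: int
    terminates_before_timer: bool  # hypothetical name: does charging end on its own?

    def max_voltage_per_cell_v(self) -> float:
        return self.max_voltage_v / self.cells_in_series


MODES = {
    "lipo": ChargeMode("R16", "LiPo", 1, 4.2, 5, True),
    "nimh": ChargeMode("R15", "NiMH", 3, 4.2, 4, False),
}

for name, mode in MODES.items():
    print(f"{name}: {mode.max_voltage_per_cell_v():.2f} V per cell, "
          f"{mode.safety_timer_h} h safety timer")
```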

The VSET pin is connected to a 18k Ohm resistor, which sets the maximal charging voltage to 4.2V for both battery chemistries.

Finally, STAT1 and STAT2 are connected to LEDs, which are connected to VBAT via a 620 Ohm resistor each. STAT1 and STAT2 are open drain outputs; the IC will pull them low if the LEDs should light up.
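A quick back-of-the-envelope check of how much current those status LEDs draw. The LED forward voltage below is an assumption (a typical value for small indicator LEDs), and I ignore the small voltage drop across the open drain output.

```python
R_LED_OHMS = 620
LED_FORWARD_V = 2.0  # assumption: typical indicator LED, not a value from the post


def led_current_ma(vbat: float, vf: float = LED_FORWARD_V,
                   r_ohms: float = R_LED_OHMS) -> float:
    """LED current in mA via Ohm's law; zero if VBAT is below the forward voltage."""
    return max(0.0, (vbat - vf) / r_ohms * 1000.0)


for vbat in (4.2, 3.6, 3.08):
    print(f"VBAT = {vbat:.2f} V -> about {led_current_ma(vbat):.1f} mA per LED")
```

A few milliamps is plenty for an indicator and doesn't meaningfully eat into the charge current.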

Power management

The system can either be powered by USB-C or by the battery. The USB-C voltage is 5V, while the battery voltage can range anywhere from 3-4.2V. The ESP32-S3 requires 3.3V, so we need to regulate the power source voltage down. We also want to prevent the battery from discharging below 3.08V, so we need a voltage monitor that can power down the ESP32-S3. Here's how that's done:

On the left side of the schematic, you see 5V_USB and VBAT. VBAT goes through a P-channel transistor. If USB-C is connected, the gate is pulled high, which will prevent power from the battery from reaching the system. If USB-C is not connected, the gate is pulled low, meaning that power from VBAT can reach the remainder of the system. This is a little trick I saw in Unexpected Maker's ESP32-S3 Feather board. It's really simple power path management!
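Written down as logic, the power path is tiny. This is only a truth-table style model of the behaviour described above, not a circuit simulation, and the function names are mine.

```python
def battery_reaches_system(usb_connected: bool) -> bool:
    """The P-channel transistor conducts only while its gate is pulled low."""
    gate_pulled_high = usb_connected  # 5V_USB pulls the gate high
    return not gate_pulled_high


def active_source(usb_connected: bool, battery_connected: bool) -> str:
    """Which supply feeds the rest of the system."""
    if usb_connected:
        return "usb"
    if battery_connected and battery_reaches_system(usb_connected):
        return "battery"
    return "none"


for usb in (False, True):
    for bat in (False, True):
        print(f"usb={usb!s:5} battery={bat!s:5} -> {active_source(usb, bat)}")
```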

For extra safety, a schottky diode is placed after 5V_USB, just in case power from VBAT or some capacitor manages to sneak through for some reason.

C9 is a decoupling capacitor suggested by the TPS3839 datasheet. That is an ultra low power supply voltage monitor. If the voltage drops below 3.08V, it pulls its RESET pin low. That reset pin is connected to the EN pin of the low drop out voltage regulator to its right. If it is low, the voltage regulator is disabled, cutting off power to the system, thereby not draining the battery further. This is likely not strictly necessary, as the ESP32-S3 and other connected ICs will brown out way before the battery is drained below 3.08V. But it's a good safety net.
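Here's the same cut-off behaviour as a few lines of code, including what the threshold means per cell for the 3 NiMH cells in series. It's a simplified model: I treat the monitor as a plain comparator and ignore any hysteresis or delay the real part may have.

```python
UNDERVOLTAGE_THRESHOLD_V = 3.08
NIMH_CELLS_IN_SERIES = 3


def regulator_enabled(supply_v: float,
                      threshold_v: float = UNDERVOLTAGE_THRESHOLD_V) -> bool:
    """RESET (wired to the regulator's EN pin) goes low below the threshold."""
    return supply_v >= threshold_v


def threshold_per_cell_v(cells: int = NIMH_CELLS_IN_SERIES) -> float:
    return UNDERVOLTAGE_THRESHOLD_V / cells


print(f"cut-off per cell: {threshold_per_cell_v():.3f} V")
for v in (3.6, 3.08, 3.0):
    print(f"{v:.2f} V -> regulator {'on' if regulator_enabled(v) else 'off'}")
```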

The low drop out voltage regulator is a LD56100. It can provide 1A of current with a dropout voltage of only 120mV. The audio player draws no more than 60mA even with the speaker at max volume. That results in an even lower dropout voltage, which means we can make good use of most of the battery capacity.
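To put a rough number on "even lower": if I assume the dropout scales linearly with load current (a simplification, the datasheet curve is the real authority), the estimate looks like this.

```python
V_OUT_NOMINAL = 3.3
DROPOUT_AT_1A_V = 0.120  # from the text: 120 mV at 1 A
LOAD_A = 0.060           # from the text: at most 60 mA


def estimated_dropout_v(load_a: float) -> float:
    """Assumption: dropout is proportional to load current."""
    return DROPOUT_AT_1A_V * load_a / 1.0


def min_input_for_regulation_v(load_a: float) -> float:
    """Lowest input voltage at which the output still holds 3.3 V."""
    return V_OUT_NOMINAL + estimated_dropout_v(load_a)


def output_v(input_v: float, load_a: float) -> float:
    """Below the regulation point the output follows the input minus dropout."""
    return min(V_OUT_NOMINAL, input_v - estimated_dropout_v(load_a))


print(f"dropout at 60 mA: about {estimated_dropout_v(LOAD_A) * 1000:.1f} mV")
print(f"regulation holds down to about {min_input_for_regulation_v(LOAD_A):.3f} V")
```

Under that assumption the regulator holds 3.3V until the battery is almost at 3.3V itself. Below that, the output simply sags along with the battery until the voltage monitor pulls the plug.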

The final output of this block is a clean 3.3V voltage, which powers the ESP32-S3 and the audio amp and speaker.

PCB Layout

Once I had dug through countless datasheets and wired everything up in the schematic, I was finally able to start laying out the PCB. It's like advanced Tetris. I really enjoy it (even though I'm not good at it). Here's how I translated the schematic into a printable layout:

It's a 4 layer board, with two ground planes in the middle. The top hosts the circuits described above, while the bottom exposes most of the GPIOs. The board is breadboard friendly and also has screw holes for mounting in an enclosure. The power traces are thick enough to actually handle up to 1A of current, should the need arise. Though that kind of power could not be delivered via USB-C.
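If you wonder what "thick enough" means, the commonly used IPC-2221 rule of thumb gives a minimum trace width for a given current. Here's a sketch of that formula. The 10°C temperature rise and 1 oz copper weight are my assumptions for illustration, not necessarily what I ordered, so plug in your own fab's numbers.

```python
def min_trace_width_mm(current_a: float, temp_rise_c: float = 10.0,
                       copper_oz: float = 1.0, external: bool = True) -> float:
    """Minimum trace width per the IPC-2221 approximation.

    I = k * dT^0.44 * A^0.725, with the cross section A in square mils.
    """
    k = 0.048 if external else 0.024     # outer layers cool better than inner ones
    area_mil2 = (current_a / (k * temp_rise_c ** 0.44)) ** (1 / 0.725)
    thickness_mil = copper_oz * 1.378    # 1 oz/ft^2 copper is about 1.378 mil thick
    width_mil = area_mil2 / thickness_mil
    return width_mil * 0.0254            # mil -> mm


print(f"1 A, outer layer: {min_trace_width_mm(1.0):.2f} mm")
print(f"1 A, inner layer: {min_trace_width_mm(1.0, external=False):.2f} mm")
```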

If you squint, you can see the differential pair for the USB signals. There are probably a gazillion things that could be done better, but it works and doesn't blow up, which is good enough for me. I'm pretty sure this wouldn't pass any kind of certification necessary to sell it as part of a commercial electronic device.

Soldering the board

I ordered a stencil along with the PCBs, which makes applying the solder paste a lot easier compared to the toothpick method I used for the cartridge PCBs.

Once the paste is applied, I start by dropping capacitors, resistors, tactile switches and other components with big pads onto the board. I then turn on my heating plate and let it melt the solder paste.

With the "easy" components out of the way, I then inspect the solder on the pads for trickier components, like the ESP32-S3, the 1.2mm by 1.6mm LDO voltage regulator, or the USB-C connector. Chances are that the solder paste has melted into a blob or formed bridges between the pads. That tends to happen when too much paste was applied, especially on extremely tiny sub millimeter pads. I then use solder wick or my trusty toothpick to fix up any bridges.

Only when I deem the solder on the pads to be good do I drop the "tricky" components onto the board and hope for the best.

Once everything is soldered, I flash a simple LED blinking sketch to make sure the board is somewhat working. In case it does not, I use my multimeter to painstakingly check for expected voltages across the board. I also use my thermal camera to check for overheating components.

Buttons, knobs and enclosure

I designed the enclosure pretty early on in the process, based on the speaker and cartridge dimensions, and the cartridge slot mechanism. The enclosure consists of a top and bottom half, which are held together by M3 screws.

The top has cutouts for the buttons, volume knob, and speaker. It also has a mounting bracket for the battery springs, which serve as connectors to the cartridge pads. A separate mesh cover is mounted on top of the speaker hole to protect the speaker from my boy's destructive tendencies.

The buttons and knob are simple shapes, with their bottoms extruded, so they don't fall out of the enclosure through the cut outs. In case of the buttons, they sit atop simple tactile switches. The knob has a center cutout so it can be slid onto the potentiometer beneath it.

Turning the top upside down reveals two mounting structures to which a PCB can be screwed beneath the button cutouts. More on that PCB in the next section. You can also see spacers and screw holes for the battery springs, which make up the cartridge slot mechanism. There's also a cutout on the side for the USB-C connector for the ESP32-S3 board.

The bottom is pretty unremarkable. It features a bay for the 3 AAA NiMH batteries, a few spacers so positioning the top on the bottom is easy, and a weird little contraption consisting of what looks like a rod and two screw holes. Here's what that's for.

When a cartridge is inserted, the rod is pushed down, which pushes a switch that connects the battery to the ESP32-S3 board. It's a silly looking mechanism, but it has a few benefits. In the first iteration, I positioned the switch next to the cartridge slot. However, getting the position right was tricky. Also, routing the power wire around the magnet of the speaker led to interesting issues I was unable to debug or fix.

The rod solution just works.

The mother board

As you can see in the last video, the innards of the device were still living on a breadboard at this point. The ESP32-S3 board, the amp, and the buttons and knob needed to be mounted inside the enclosure somehow. I had already designed mounting brackets on the top half of the enclosure, taking into account the size of the tactile switches, the potentiometer, and their 3D printed covers. That allowed me to measure the leftover space in the enclosure inside Fusion 360, based on which I sized a carrier PCB or motherboard. Here's the schematic:

To the left is the MAX98357A audio amplifier break out board. The ESP32-S3 board is at the center. At the bottom are the tactile switches and the potentiometer. And to the right are the connections to the battery springs of the cartridge slot. Everything is wired to appropriate pins on the ESP32-S3 board. The SD card traces also feature pull-ups as necessary.

It's a 2 layer board. The top layer houses the tactile switches and the potentiometer. The bottom layer has 2.54mm headers into which I can plug the MAX98357A breakout board and the ESP32-S3 board. In the top right there are pads for the SD card connections which are wired to the battery springs of the cartridge slot.

Here's the motherboard with tactile switches, potentiometer, and headers soldered to it:

And here it is with the MAX98357A breakout board and the ESP32-S3 board plugged in:

The motherboard is screwed to the top half of the enclosure and wired up with the cartridge slot mechanism.

Then I plug in the ESP32-S3 board and the MAX98357A breakout board and wire up the speaker to the amp, and the battery to the ESP32-S3 board.

(Note that the ESP32-S3 board is an earlier iteration but the principle is the same.)

The trick to getting this all lined up is to actually import 3D models of all components into Fusion 360.

EasyEDA allows exporting a 3D model of the PCB, including the components (if the component has a 3D model). This allows me to experiment with component placement and orientation, try out different PCB layouts, etc. That saves me enormous amounts of time and frustration.

Software

The software is stupid simple and based on my mcugdx framework. The entire source code is a single 390 LOC file that invokes the "easy" mcugdx API to:

You can dig into the mcugdx sources to see how all this is implemented under the hood on top of ESP-IDF.
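To give a feel for the kind of logic a player like this needs, here's a tiny host-side sketch of track navigation and of mapping the volume knob to a gain. It is not taken from the Boxie source. The function names, the 12-bit ADC range and the "stop at the first and last track" behaviour are all assumptions for illustration.

```python
ADC_MAX = 4095  # assumption: 12-bit ADC reading from the potentiometer


def volume_from_adc(raw: int, adc_max: int = ADC_MAX) -> float:
    """Map a raw potentiometer reading to a gain between 0.0 and 1.0."""
    clamped = min(max(raw, 0), adc_max)
    return clamped / adc_max


def next_track(index: int, track_count: int) -> int:
    """Advance one track, staying on the last one at the end."""
    return min(index + 1, track_count - 1)


def previous_track(index: int) -> int:
    """Go back one track, staying on the first one at the start."""
    return max(index - 1, 0)


track = 0
track = next_track(track, track_count=3)
track = next_track(track, track_count=3)
track = next_track(track, track_count=3)  # already at the end, stays put
print("track", track, "volume", round(volume_from_adc(2048), 2))
```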

Bonus content: USB cartridge reader

For shits and giggles, I also designed a USB cartridge reader. Instead of having to unscrew the cover from a cartridge, extract the SD card, and then put it back together, the reader lets me modify the files on a cartridge by just plugging it into my laptop via USB, where it will show up as a mass storage device.

Here's the schematic:

It's really just a fancy wrapper around a Genesys Logic GL823K IC, a USB 2.0 SD/MSPRO Card reader controller.

I've never really used it. When I create a new cartridge, I just stick the SD card into my laptop before assembling the cartridge. But it was a fun side project nonetheless.
