Radxa Dragon Q6A Qualcomm Dragonwing SBC: Breathing Life into AI Edge Computing

Radxa Dragon Q6A

The single board computer (SBC) market is constantly evolving, driven by demand for compact, low-power devices capable of handling increasingly complex workloads. The Radxa Dragon Q6A represents a significant step forward, particularly in the realm of edge AI. Powered by Qualcomm’s QCS6490 processor (based on the Dragonwing architecture), this SBC promises to deliver impressive performance at a competitive price point. This blog post will provide an in-depth look at the Radxa Dragon Q6A, focusing on its key features, AI capabilities, value proposition, and practical considerations for deployment. We’ll also address the downsides outlined by early adopters, offering a balanced perspective for those considering this platform.

Qualcomm Dragonwing Architecture & Hardware Overview

At the heart of the Dragon Q6A sits Qualcomm’s QCS6490 system‑on‑chip, branded as a “Dragonwing” processor. The chip integrates:

  • 1 × Kryo Prime core @ 2.7 GHz (high performance)
  • 3 × Kryo Gold cores @ 2.4 GHz (balanced workloads)
  • 4 × Kryo Silver cores @ 1.9 GHz (efficiency)

This heterogeneous configuration gives you a total of eight CPU cores, allowing the board to handle mixed‑type workloads – from heavy inference tasks on the NPU to background Linux services.

GPU and Video Processing

The integrated Adreno 643 GPU supports OpenGL ES 3.2/2.0/1.1, Vulkan 1.1–1.3, OpenCL 2.2 and DirectX Feature Level 12. For video‑centric AI pipelines (e.g., object detection on live streams) the Adreno Video Processing Unit 633 can decode up to 4K 60 fps H.264/H.265/VP9 and encode up to 4K 30 fps, making it suitable for surveillance or multimedia edge devices.

Memory and Storage

  • LPDDR5 RAM options: 4 GB, 6 GB, 8 GB, 12 GB, 16 GB (5500 MT/s)
  • eMMC/UFS storage: up to 512 GB UFS module or 64 GB eMMC

Connectivity and I/O

| Interface | Details |
| --- | --- |
| Wi‑Fi / Bluetooth | IEEE 802.11a/b/g/n/ac/ax (Wi‑Fi 6) + BT 5.4, two external antenna connectors (note: driver support currently missing in the Windows preview) |
| Ethernet | 1 × Gigabit RJ45 with optional PoE (requires separate PoE HAT) |
| USB | 1 × USB 3.1 OTG Type‑A, 3 × USB 2.0 Host Type‑A |
| HDMI | HDMI 2.0 Type‑A, up to 3840 × 2160 (4K 30 fps) |
| M.2 | Key‑M slot supporting PCIe Gen3 x2 for 2230 NVMe SSDs |
| Camera | 1 × four‑lane CSI + 2 × two‑lane CSI, plus a four‑lane MIPI DSI display connector |
| GPIO | 40‑pin header with UART, I²C, SPI, PWM, 5 V and 3.3 V power rails |

AI Capabilities: A Surprisingly Strong Contender

The most compelling aspect of the Radxa Dragon Q6A is its potential for edge AI applications. Qualcomm’s software ecosystem – the QAIRT SDK, QAI-APP-BUILDER, and the QAI-HUB model library – provides a robust foundation for developing and deploying AI models. Out of the box, these tools support major computer vision (CV), large language model (LLM), and vision language model (VLM) workloads.
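
As a rough sketch of that workflow, here is how compiling a model for the board through QAI Hub’s Python client might look. The device name and the example model are assumptions (check hub.get_devices() for the actual QCS6490 target), not something I have run on this exact board:

import torch
import torchvision
import qai_hub as hub  # pip install qai-hub; requires a free API token

# Example payload: a small torchvision model, traced for export
model = torchvision.models.mobilenet_v2(weights="DEFAULT").eval()
example_input = torch.rand(1, 3, 224, 224)
traced = torch.jit.trace(model, example_input)

# Submit a cloud compile job targeting a QCS6490-class device.
# NOTE: the device name below is an assumption; list valid targets
# with hub.get_devices() and pick the QCS6490 entry.
compile_job = hub.submit_compile_job(
    model=traced,
    device=hub.Device("QCS6490 (Proxy)"),
    input_specs=dict(image=(1, 3, 224, 224)),
)
target_model = compile_job.get_target_model()  # compiled NPU artifact
target_model.download("mobilenet_v2_qcs6490.bin")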

Qualcomm’s AI acceleration is split between the Hexagon Vector Extensions (HVX) DSP and a dedicated Tensor Accelerator. The DSP handles low‑precision operations efficiently, while the Tensor Accelerator provides high throughput for matrix multiplication in modern LLMs and vision transformers. Together they form the backbone of the board’s AI performance.

Early testing indicates impressive performance even on the modest 4/6/8 GB versions. Reports show ~100 tokens/second in prompt processing and over 10 tokens/second in generation with a 4096-token context using Llama 3.2-1B. These figures are highly competitive for an SBC in this price range and suggest the Dragon Q6A can handle real-time AI inference tasks, opening up possibilities for applications like:

  • Computer Vision: Object detection, image classification, facial recognition
  • Natural Language Processing: on-device chat tooling, text summarization, sentiment analysis
  • Edge Analytics: Real-time data processing and anomaly detection
  • Robotics: Autonomous navigation, object manipulation
  • Smart Home Applications: Voice control, personalized automation

The integrated Hexagon Tensor Accelerator is key to this performance. It’s designed specifically for accelerating machine learning workloads, enabling efficient execution of complex models without relying heavily on the CPU or GPU. This translates to lower power consumption and improved responsiveness – critical factors for edge deployments.
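
If you want to sanity-check throughput figures like those above on your own board, a framework-agnostic timing harness is enough. This sketch assumes only a generate_tokens() placeholder: any callable that takes a prompt and yields tokens as they are produced:

import time

def measure_tokens_per_second(generate_tokens, prompt):
    """Times a token generator and returns (token_count, tokens/sec).

    generate_tokens is a placeholder: any callable that takes a prompt
    and yields generated tokens one at a time.
    """
    start = time.monotonic()
    count = 0
    for _ in generate_tokens(prompt):
        count += 1
    elapsed = time.monotonic() - start
    return count, count / elapsed if elapsed > 0 else 0.0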

Software Support & Development Ecosystem

Radxa supports a variety of operating systems including RadxaOS, Ubuntu Linux, Deepin Linux, Armbian, ArchLinux, Qualcomm Linux (Yocto-based), and Windows on Arm. The availability of hardware access libraries for both Linux and Android platforms simplifies development and integration. However, it’s important to note that the software is still under active development and hasn’t reached a stable release state yet. This means users may encounter bugs, need to recompile the kernel, or work with packages at a level that can be challenging for those unfamiliar with Linux subsystems.

Downsides & Practical Considerations

Despite its impressive capabilities, the Radxa Dragon Q6A isn’t without its drawbacks:

  • Limited Availability: Currently, shipments are primarily out of China, which can lead to difficulties and additional expenses for North American customers due to current trade conditions.
  • Thermal Management: The SBC runs hot when executing models, requiring a cooling solution. Radxa doesn’t offer official passive or active cooling systems, necessitating modification of existing solutions designed for other boards. This adds complexity and cost.
  • Software Maturity: As mentioned earlier, the software ecosystem is still evolving. Users should be prepared to debug issues, potentially recompile kernels, and work with Linux packages.

Comparison to Competing Edge AI (8GB) SBCs

| Device | NPU / Accelerator | Approx. Price (USD) | Token Generation Speed* | Prompt Processing Speed |
| --- | --- | --- | --- | --- |
| Radxa Dragon Q6A | Qualcomm Hexagon (QCS6490) + Tensor Accelerator | $85–$100 | 9.7 tokens/s | 110.3 tokens/s |
| Orange Pi 5 | Rockchip RK3588 NPU (Mali‑G610) | $150–$180 | 5.8 tokens/s | 14.8 tokens/s |
| Nvidia Jetson Orin Nano | CUDA GPU cores (NVIDIA Ampere architecture) | $249 | 38.6 tokens/s | 8.8 tokens/s |
| Raspberry Pi 5 | Broadcom BCM2712 CPU + VideoCore VII GPU | $85–$100 | 6.5 tokens/s | 4.3 tokens/s |

*Measured under similar quantization settings, with a batch size of 1, a context length of 4096, and using Llama 3.2-1B.

The Dragon Q6A’s advantage lies in its dedicated Tensor Accelerator that can sustain higher throughput for larger context windows, making it a compelling choice for on‑device LLM inference or multimodal tasks where latency matters.

Inference Pipeline: LLM Prompt Execution Flow

The following Mermaid diagram visualizes the data flow from user input to NPU inference and back to the application:
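
A simplified sketch of that flow (stage names are illustrative):

flowchart LR
    U[User input] --> T["CPU: tokenization"]
    T --> N["NPU: transformer matrix ops"]
    N --> D["CPU: detokenization"]
    D --> A[Application response]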

The diagram highlights that the CPU handles tokenization and detokenization while the heavy matrix operations run on the NPU, keeping latency low and freeing CPU cycles for other tasks such as network handling or monitoring.

Conclusion

The Radxa Dragon Q6A represents an exciting development in the SBC landscape, offering a compelling combination of performance, AI capabilities, and affordability. Its Qualcomm Dragonwing processor and dedicated Hexagon Tensor Accelerator make it well-suited for edge AI applications. However, potential buyers should be aware of the downsides – limited availability, thermal management challenges, and software maturity issues. By carefully addressing these considerations, developers can unlock the full potential of this powerful SBC.

Next, I look forward to putting the Dragonwing line (including the Airbox Q900, once shipments to the US resume) to the test in a Hiwonder robotics application, where I believe it will outshine a traditional Raspberry Pi at the same price point.

From Spooky Ambitions to Practical Lessons: Overwhelming Animatronics Powered by Local VLM

Animatronics Powered by Local VLM

The dream was simple enough: an AI-powered Halloween skeleton, affectionately dubbed “Skelly,” greeting trick-or-treaters with personalized welcomes based on their costumes. The reality, as often happens in the world of rapid prototyping and ambitious side projects, proved… more complicated. This post details the lessons learned from a somewhat chaotic Halloween night deployment, focusing on the privacy and security implications inherent in edge AI systems like Skelly, and outlining strategies for a more controlled – and successful – iteration next year. We’ll dive into the design choices, the unexpected challenges, and how leveraging local Vision Language Models (VLMs) can be a powerful tool for privacy-focused applications.

The Initial Vision: A Local AI Halloween Greeter

The core concept revolved around using a Radxa Zero 3W, a connected USB webcam, a built-in speaker driven by a MAX98357A mono amplifier, and the animatronics of a pre-built Halloween skeleton. The plan was to capture images, feed them into an offline VLM like those available through LM Studio (running on an AMD Strix Halo platform), analyze the costumes (with Google’s Gemma 3 27B), and generate a custom greeting delivered via text-to-speech (TTS) using PiperTTS. The original inspiration came from Alex Volkov’s work on Weights & Biases, utilizing a similar setup with Google AI Studio, ElevenLabs, Cartesia, and ChatGPT.

I opted for a fully offline approach to prioritize privacy. Capturing images that include children requires careful consideration, and sending that data to external APIs introduces significant risks. Local processing eliminates those concerns, albeit at the cost of increased complexity in model management and resource requirements.

The Halloween Night Reality: Overwhelmed by the Queue

The biggest issue wasn’t technical – it was human. We anticipated a trickle of small groups, perhaps one to three treaters approaching Skelly at a time, uttering a polite “trick or treat.” Instead, we were met with waves of ten-plus children lining up like attendees at a concert. The system simply couldn’t handle the rapid influx.

The manual trigger approach – snapping pictures on demand – quickly became unsustainable. We struggled to process images fast enough before the next wave arrived. Privacy concerns also escalated as we attempted manual intervention, leading us to abandon the effort and join our kids in traditional trick-or-treating. The lack of good reproducible artifacts was a direct consequence of these issues; we were too busy firefighting to collect meaningful data.

Security Considerations: A Deep Dive into Edge AI Risks

This experience highlighted several critical risk considerations for edge AI deployments, particularly those involving physical interaction and potentially sensitive data like images of children:

  • Data Capture & Storage: Even with offline processing, the captured images represent a potential privacy breach if compromised. Secure storage is paramount – encryption at rest and in transit (even locally) is essential. Consider minimizing image retention time or implementing automated deletion policies (a minimal encryption-at-rest sketch follows this list).
  • Model Integrity: The VLM itself could be targeted. A malicious actor gaining access to the system could potentially replace the model with one that generates inappropriate responses or exfiltrates data. Model signing and verification are crucial.
  • GPIO Control & Physical Access: The Radxa Zero 3W’s GPIO pins, controlling the animatronics, represent a physical attack vector. Unrestricted access to these pins or the network could allow an attacker to manipulate Skelly in unintended ways.
  • Network Exposure (Even Offline): While we aimed for complete offline operation, the system still had network connectivity for initial model downloads and updates. This creates a potential entry point for attackers.
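
On the encryption-at-rest point, here is a minimal sketch using the cryptography package’s Fernet recipe. Storing the key as a local file is an assumption you would want to harden (tight file permissions at minimum, a TPM or keyring ideally):

from pathlib import Path

from cryptography.fernet import Fernet  # pip install cryptography

KEY_FILE = Path("skelly.key")  # assumption: key kept on local disk

def load_or_create_key() -> bytes:
    """Loads the symmetric key, generating one on first run."""
    if KEY_FILE.exists():
        return KEY_FILE.read_bytes()
    key = Fernet.generate_key()
    KEY_FILE.write_bytes(key)
    KEY_FILE.chmod(0o600)  # restrict the key to the owning user
    return key

def save_encrypted_image(jpeg_bytes: bytes, out_path: Path) -> None:
    """Encrypts a captured JPEG so plaintext never sits on disk."""
    out_path.write_bytes(Fernet(load_or_create_key()).encrypt(jpeg_bytes))

def load_encrypted_image(path: Path) -> bytes:
    """Decrypts a stored image for processing, keeping plaintext in memory."""
    return Fernet(load_or_create_key()).decrypt(path.read_bytes())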

Reimagining Skelly: Controlling the Chaos

Next year’s iteration will focus on mitigating these risks through a combination of controlled interactions, robust security measures, and optimized processing. Here’s the plan:

1. Photo Booth Mode: Abandoning the “ambush” approach in favor of a dedicated photo booth setup. A backdrop and clear visual cues will encourage people to interact with Skelly in a more predictable manner.

2. Motion-Triggered Capture: Replacing voice activation with a motion sensor. This provides a consistent trigger mechanism, allowing us to time image capture and processing effectively.

3. Timing & Rate Limiting: Implementing strict timing controls to prevent overwhelming the system. A delay between captures will allow sufficient time for processing and response generation (see the sketch after this list).

4. Visual Indicators & Auditory Cues: Providing clear feedback to users – a flashing light indicating image capture, a cheerful phrase confirming costume recognition, and a countdown timer before the greeting is delivered. This enhances user experience and encourages cooperation.

5. Enhanced GPIO Controls: Restricting access to the GPIO pins using Linux capabilities or mount namespaces, and limiting physical access to Skelly, to reduce opportunities for tampering.
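
Here’s a minimal sketch of items 2 and 3 together, assuming a PIR-style motion sensor; read_motion() and capture_and_greet() are placeholders for the sensor read and the VLM pipeline:

import time

COOLDOWN_SECONDS = 20  # minimum gap between greetings; tune to taste

def read_motion() -> bool:
    """Placeholder: return True when the motion sensor reports activity."""
    raise NotImplementedError

def capture_and_greet() -> None:
    """Placeholder: capture an image, prompt the VLM, play the greeting."""
    raise NotImplementedError

def main() -> None:
    last_trigger = 0.0
    while True:
        if read_motion():
            now = time.monotonic()
            # Rate limit: ignore triggers until the cooldown has elapsed.
            if now - last_trigger >= COOLDOWN_SECONDS:
                last_trigger = now
                capture_and_greet()
        time.sleep(0.1)  # poll the sensor ~10x per second

if __name__ == "__main__":
    main()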

Leveraging Local VLMs: A Python Example

The power of local VLMs lies in their ability to understand images without relying on external APIs. Here’s a simplified example demonstrating how to capture an image from a USB webcam and prompt Ollama with a costume greeting request using Python:

import base64

import cv2
import requests

# Configuration
OLLAMA_API_URL = "http://localhost:11434/api/generate"  # Adjust if necessary
MODEL = "gemma-3-27B"  # Use a vision-capable model name as shown by `ollama list`
PROMPT = (
    "You are an AI assistant controlling a Halloween animatronic. "
    "Identify the costume in the attached image in one short phrase, "
    "then respond with a friendly greeting that references the costume. "
    "Use a cheerful tone."
)

def capture_image(camera_index=0):
    """Captures a single frame from the specified webcam as JPEG bytes."""
    cap = cv2.VideoCapture(camera_index)
    if not cap.isOpened():
        raise IOError("Cannot open webcam")
    try:
        ret, frame = cap.read()
        if not ret:
            raise IOError("Failed to capture image")
        _, img_encoded = cv2.imencode('.jpg', frame)
        return img_encoded.tobytes()
    finally:
        cap.release()

def prompt_ollama(image_data):
    """Prompts Ollama with the image data and returns the response text."""
    payload = {
        "model": MODEL,
        "prompt": PROMPT,
        # Ollama's generate endpoint accepts attached images as a list
        # of base64-encoded strings.
        "images": [base64.b64encode(image_data).decode('utf-8')],
        "stream": False,  # Set to True for streaming responses
    }
    response = requests.post(OLLAMA_API_URL, json=payload)
    response.raise_for_status()  # Raise an exception for bad status codes
    return response.json()['response']

if __name__ == "__main__":
    try:
        image_data = capture_image()
        greeting = prompt_ollama(image_data)
        print("Generated Greeting:", greeting)
    except Exception as e:
        print("Error:", e)

Important Notes:

  • This is a simplified example and requires the cv2 (OpenCV) and requests libraries. Install them using pip install opencv-python requests.
  • Ensure Ollama is running and the specified model (gemma-3-27B) is downloaded.
  • The image data is encoded as base64 for compatibility with Ollama’s API. Adjust this if your VLM requires a different format.
  • Error handling is minimal; implement more robust error checking in a production environment.

System Flow Diagram: Whisper to Piper via Ollama

Here’s a flow diagram illustrating the complete system architecture:
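
A simplified sketch (component labels are illustrative):

flowchart LR
    W["Whisper: 'trick or treat' wake word"] --> C[Webcam image capture]
    M[Motion sensor] --> C
    C --> O["Ollama VLM: costume description + greeting"]
    O --> P[Piper TTS]
    P --> S["Skelly's speaker"]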

This diagram highlights the key components and data flow: Whisper detects the “trick or treat” wake word (or a motion sensor fires), triggering image capture; Ollama processes the image to generate a costume description and greeting; and Piper TTS converts the text into audio delivered through Skelly’s speaker.

Conclusion: Building Secure & Engaging Edge AI Experiences

The Halloween night debacle served as a valuable learning experience. While the initial vision was ambitious, it lacked the necessary controls and security measures for a real-world deployment. By focusing on controlled interaction, robust security practices, and leveraging the power of local VLMs like those available through Ollama or LM Studio, we can create engaging and privacy-focused edge AI experiences that are both fun and secure. The key is to anticipate potential challenges, prioritize user safety, and build a system that’s resilient against both accidental mishaps and malicious attacks. The future of animatronics powered by local VLM is bright – let’s make sure it’s also safe!

Bringing Skelly to Life: A Radxa Zero 3W Powered AI Halloween Skeleton

Halloween is my favorite time of year, and I love creating interactive experiences for trick-or-treaters. Last year, Alex Volkov’s project on Weights & Biases – building an AI-powered skeleton using a Raspberry Pi – sparked an idea. This year, I wanted to take that concept further, aiming for a smaller footprint and increased processing power by leveraging the Radxa Zero 3W single board computer. This blog post details my journey of transforming a standard Home Depot 3ft Halloween Classics Animated LED Dancing Skeleton into a locally AI-driven greeter. We’ll cover everything from dismantling the original animatronic and wiring up the Radxa Zero 3W to setting the stage for integrating local vision models that recognize costumes.

The Inspiration & Goals

Volkov’s project was brilliant: using online AI services like Google AI Studio, ElevenLabs (for voice), Cartesia, and ChatGPT to create a responsive skeleton that could greet trick-or-treaters. However, relying on cloud services introduces latency, requires a stable internet connection, and raises privacy concerns – not ideal for the often chaotic Halloween night. My goal was to replicate the interactive experience but move all processing local, using a more compact and powerful board than the Raspberry Pi 4. The Radxa Zero 3W seemed like the perfect fit: it packs a significant punch in a tiny form factor, offering Wi-Fi connectivity, Bluetooth, and ample GPIO pins for controlling the animatronic components.

Disassembly & Component Identification: Getting to Know Skelly

The first step was understanding how the original skeleton worked. This involved carefully dismantling the 3ft dancing skeleton. Start by removing the back of the skull and chest plate; this provides access to the control board, battery pack, motors, and speaker. Be gentle – these animatronics aren’t built for extensive tinkering!

Inside the skull, you’ll find a DC motor controlling the mouth movement (yellow positive, white ground wires) and LEDs illuminating the eyes (red positive, black ground wires).

Photo of the skeleton's internal components after removing the skull backplate, highlighting the mouth motor and eye LEDs

Under the chest plate you will see a hardware speaker with two blue wires and another DC motor powering the body/arm movements (positive red, ground black). All these wires converge on a small control board.

Photo of the skeleton's internal components after removing the chest backplate, highlighting the control board, battery pack, body motor, and speaker

It’s crucial to document everything as you go. I took numerous photos and created a wiring diagram to ensure I could reassemble everything correctly (or at least understand where things went if something went wrong!).

Close-up documenting the Home Depot 3ft Halloween Classics Animated LED Dancing Skeleton control board wiring

The original manufacturer uses transistors and capacitors to compensate for the fluctuating battery voltage – typically between 1.4V and 1.66V per cell, so three AA batteries in series reach around 4.2–5V. This is a good reminder that relying solely on the battery pack’s power output isn’t ideal; we’ll address this later.

Wiring Up the Radxa Zero 3W: The Heart of the Operation

The plan was to intercept the signals going to each component – mouth motor, eye LEDs, body motor, and speaker – and control them via the Radxa Zero 3W’s GPIO pins. This required carefully unsoldering these wires from the original control board.

Once unsoldered, I connected each wire to a set of jumper pins, allowing me to easily breadboard and test connections before committing to permanent soldering. This also provides flexibility for future modifications.

Note: I included a 220 Ohm resistor inline to help prevent the eye LEDs from burning out. It’s not required, but it’s recommended to avoid damaging the LEDs during tinkering.

Close-up photo showing the wires from the skeleton's components connected to jumper cables.

Here’s the GPIO pinout I utilized on the Radxa Zero 3W (gpiochip3):

  • PIN_7 (gpiochip3 20) – Mouth motor open/close
  • PIN_11 (gpiochip3 1) – Eye LEDs illumination
  • PIN_15 (gpiochip3 8) – Body motor for dancing movement

The Radxa Zero 3W’s official documentation outlines the 40-pin GPIO interface. Since we’re using pins 12, 40 and 35 for our mono amp (more on that later), this leaves a good selection of readily available pins on gpiochip3 to control relays.

  • Pin 7: GPIO3_C4 (also PWM14_M0)
  • Pin 11: GPIO3_A1
  • Pin 15: GPIO3_B0
  • Pin 16: GPIO3_B1 (also PWM8_M0)

Note: PWM (Pulse-Width Modulation) can be used on Pin 7 and Pin 16 by enabling the correct device tree overlay using rsetup and/or u-boot. This allows for finer control over LEDs (dimming) and DC motor speeds, but wasn’t necessary for this initial implementation.
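
For the curious, here is a minimal sketch of what that finer control looks like through the kernel’s standard sysfs PWM interface once an overlay is enabled. The pwmchip number and channel are assumptions that depend on your overlay and kernel; check /sys/class/pwm/ on your board:

import time
from pathlib import Path

# Assumption: the PWM14_M0 overlay for Pin 7 shows up as pwmchip0, channel 0.
PWM_CHIP = Path("/sys/class/pwm/pwmchip0")
PWM = PWM_CHIP / "pwm0"
PERIOD_NS = 1_000_000  # 1 kHz

def setup() -> None:
    """Exports the PWM channel, sets its period, and enables output."""
    if not PWM.exists():
        (PWM_CHIP / "export").write_text("0")
    (PWM / "period").write_text(str(PERIOD_NS))
    (PWM / "enable").write_text("1")

def set_duty(fraction: float) -> None:
    """Dims the eye LEDs by setting the duty cycle (0.0 to 1.0)."""
    (PWM / "duty_cycle").write_text(str(int(fraction * PERIOD_NS)))

setup()
for level in (0.1, 0.5, 1.0):  # fade the eyes up
    set_duty(level)
    time.sleep(0.5)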

Powering Skelly: The Radxa Zero 3W to the Rescue

Initially, I considered powering the components directly from the battery pack. However, as mentioned earlier, the inconsistent voltage proved problematic. The solution? Leverage the Radxa Zero 3W’s 5V GPIO power rails! This provides a stable and reliable power source for all components.

To manage the current requirements of the motors and LEDs, I incorporated relays inline with each component’s wiring. Relays act as electrically controlled switches, allowing the Radxa Zero 3W to control the flow of power from its 5V output to the skeleton’s components.

Diagram showing how the Radxa Zero 3W controls the skeleton's components via relays

Relay Implementation: The Switching Mechanism

Each relay requires a control pin on the GPIO, which when activated, allows power to flow through it. The wiring is as follows:

  1. Connect the Radxa Zero 3W’s 5V output to both sides of each relay: the module’s VCC pin (coil side) and the common (COM) input on the switched side.
  2. Ground each relay and component to the Radxa Zero 3W’s ground pins.
  3. Connect the control pin on the GPIO to the relay’s control input.
  4. The output power of each relay connects to the corresponding component’s jumper (mouth motor, body motor, eye LEDs).

When the GPIO pin is set HIGH (active state), the relay closes, allowing power to flow from the Radxa Zero 3W to the component. When the pin is LOW, the relay opens, cutting off the power supply. This effectively gives us programmatic control over each animatronic function.
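
In code, that HIGH/LOW control is a one-liner per line offset. Here’s a minimal sketch assuming the libgpiod v1 Python bindings (pip install gpiod) and the gpiochip3 offsets listed earlier; offsets may differ on your image:

import time

import gpiod  # libgpiod v1 Python bindings

# Line offsets on gpiochip3 (see the pinout above)
MOUTH = 20  # Pin 7  (GPIO3_C4)
EYES = 1    # Pin 11 (GPIO3_A1)
BODY = 8    # Pin 15 (GPIO3_B0)

chip = gpiod.Chip('gpiochip3')
mouth = chip.get_line(MOUTH)
mouth.request(consumer='skelly', type=gpiod.LINE_REQ_DIR_OUT)

# HIGH closes the relay and powers the mouth motor; LOW opens it again.
mouth.set_value(1)
time.sleep(0.5)
mouth.set_value(0)

mouth.release()
chip.close()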

Breadboarding & Testing: Bringing it All Together

I started by breadboarding everything – connecting the Radxa Zero 3W, relays, jumpers, and a temporary power source to verify functionality. This is where patience is key! Double-check all connections before applying power. A multimeter is your best friend during this phase. Once I confirmed that each component responded correctly to the GPIO signals, I removed the breadboard and connected everything directly to the jumper wires for a more permanent connection.

Mounting & Final Assembly: Skelly Gets an Upgrade

With the wiring complete, it was time to mount the Radxa Zero 3W inside the skeleton’s chest cavity. I repurposed the original control board’s mounting point and used some 2.5mm standoffs to secure the Radxa Zero 3W in place. This ensured a snug fit without interfering with any existing components.

 Photo showing the Radxa Zero 3W and relays mounted inside the skeleton’s chest cavity

I then stuffed all the “guts” back into the body, wrote a simple web control page using Flask to test the functionality remotely, and ran through final testing. Success! Skelly was now responding to commands from my computer, ready for the next phase: AI integration.
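
That control page boils down to a few Flask routes toggling the relays. A minimal sketch, assuming a set_relay() helper wrapping the gpiod logic above (the routes and port are illustrative, not the exact ones I used):

from flask import Flask

app = Flask(__name__)

RELAYS = {"mouth": 20, "eyes": 1, "body": 8}  # gpiochip3 line offsets

def set_relay(offset: int, state: int) -> None:
    """Placeholder: drive the relay line via gpiod as sketched earlier."""
    raise NotImplementedError

@app.route("/<part>/<int:state>")
def control(part: str, state: int):
    """Toggle a named component, e.g. GET /mouth/1 opens the mouth."""
    if part not in RELAYS:
        return {"error": f"unknown part: {part}"}, 404
    set_relay(RELAYS[part], state)
    return {part: bool(state)}

if __name__ == "__main__":
    app.run(host="0.0.0.0", port=5000)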

The Next Phase: Local Vision Models & Costume Recognition

With Skelly reasonably buttoned up and his basic movements working, I’m moving on to the most exciting part of the project – leveraging a connected camera and locally hosted/trained vision models to determine trick-or-treaters’ costumes. This will involve using tools like LM Studio on an AMD AI workstation to leverage models like Gemma 3 for vision-to-text. I plan to document this process in a future blog post, so stay tuned! But up next: a deep dive on how to leverage device tree overlays and I2S via GPIO to power the existing hardware speaker with a MAX98357A mono amp.

Resources & Further Exploration

This project has been a fantastic learning experience, combining hardware tinkering with software development and AI integration. The Radxa Zero 3W proved to be an excellent platform for this application, offering the power and flexibility needed to bring Skelly to life. I hope this blog post inspires you to create your own interactive Halloween experiences!