AMB82-Mini Embedded AI Vision - Capture Image & Analyze

Overview

In this article, we will build an Embedded AI Vision Camera using the Realtek AMB82-Mini Board. The AMB82-Mini is powered by the Realtek RTL8735BDM SoC, making it a strong, low-power alternative to the ESP32-CAM. It offers built-in IoT security and runs AI camera apps out of the box. You can program it easily in the Arduino IDE.

Using the AMB82-Mini Board, we will build a Real-Time Generative AI Vision Camera. The board streams live video to an ILI9341 TFT screen. A button press freezes the current image. A second press sends an image and a prompt to an AI service (OpenAI, Gemini, or Llama) for analysis.

When the AI returns its response, the board displays the text on the screen. Pressing the button again resumes the live stream. This simple demo covers image capture, AI inference, and on-device display of results. If you are new to the board, go through the Getting Started Guide before following this article.

Bill of Materials

We need the following components for this project. All of them can be purchased through the links below.

S.N.	Component Name	Quantity	Purchase Link
1	Realtek AMB82-Mini Board	1	Amazon | AliExpress
2	ILI9341 TFT LCD Screen	1	Amazon | AliExpress
3	Push Button Switch	1	Amazon | AliExpress
4	Resistor 1 kΩ	1	Amazon | AliExpress
5	3.7V Lithium-ion Battery	1	Amazon | AliExpress
6	Boost Converter Module	1	Amazon | AliExpress
7	Slide Switch	1	Amazon | AliExpress
8	Battery Pin JST Connector	1	Amazon | AliExpress
9	Vero Board/Zero PCB	1	Amazon | AliExpress

Realtek AMB82-Mini IoT AI Camera Board

The Realtek AMB82-Mini IoT AI Camera Board is a development tool designed to streamline the creation of AI network camera applications.

Built around the highly integrated Realtek RTL8735BDM SoC, the board supports low-power 802.11 a/b/g/n WLAN and BLE. The SoC combines an Arm v8M MCU, dual-band Wi-Fi, Bluetooth LE 5, an audio codec, an ISP, an H264/H265 encoder, 128MB of DDR2 memory, and a neural network engine, so a single chip handles video capture, encoding, connectivity, and AI inference.

The AMB82-Mini is optimized for battery-powered devices: it boots in milliseconds and draws current in the mA to µA range, depending on the application. Its TrustZone-based security architecture and dual-band Wi-Fi support secure, high-quality H264/H265 video streaming at low power, which makes it well suited to IoT applications.

In addition, the AMB82-Mini can be programmed with several toolchains and platforms, including RTOS, IAR, GCC, and the Arduino IDE. It is not limited to wireless network camera designs: the on-chip NN engine supports edge AI models for object detection, audio recognition, and facial recognition, so the same board can serve as the basis for a range of intelligent IoT products.

Key Features & Specifications of Realtek AMB82-Mini

MCU: 32-bit Arm v8M, up to 500MHz
NPU: Intelligent Engine @ 0.4 TOPS
Memory: 768KB ROM, 512KB RAM, 16MB Flash, Supports MCM embedded DDR2/DDR3L memory up to 128MB
Wi-Fi: 802.11 a/b/g/n, Dualband 2.4GHz/5GHz Wi-Fi & Wi-Fi simple config
Bluetooth: Bluetooth Low Energy (BLE) 5.1
Security: Hardware cryptographic engine, Secure boot, TrustZone, Wi-Fi WEP, WPA, WPA2, WPA3, WPS
Audio Codec: ADC/DAC/I2S
ISP/Video: HDR/3DNR/WDR; H264/H265/JPEG video encoder, 1080p@30fps + 720p@30fps
Camera module: JXF37 1920×1080 Full HD CMOS image sensor with a wide-angle lens (130° FOV)
Interface: 1 Microphone on Dev Board, 2 Micro USB_B, 1 MicroSD card slot, 2 tact switch button, 3 UART, 2 SPI, 1 I2C, 8 PWM, 2 GDMA, Max. 23 GPIOs

Important Documents Links

Getting Started Guide and Projects Links

AMB82-Mini Embedded AI Vision – Capture Images, Send Prompts, Show Results

Now let’s move to the project part. We’ll build an AMB82-Mini AI Vision Camera that streams live video to an ILI9341 TFT LCD. A push-button freezes the current frame, and a second press sends an image with a prompt over Wi-Fi to OpenAI, Gemini, or Llama for analysis. Finally, the AI-generated result is displayed on-screen.

Block Diagram and Hardware Design

This diagram shows how the AMB82-Mini AI vision camera is powered and controlled:

Power Chain: A 3.7 V LiPo battery feeds a slide switch, then a boost converter steps up to 5 V to run the AMB82-Mini (and its TFT LCD).
Video & UI: The AMB82-Mini drives an ILI9341 TFT LCD to stream live video. A push-button input lets you freeze the current frame and trigger analysis.
Cloud AI: On button press, the AMB82-Mini sends the frozen image over Wi-Fi to public generative AI services (OpenAI, Google Gemini or GroqCloud Llama). The returned result is then rendered back on the TFT display.
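To make the Cloud AI step more concrete, here is a minimal, host-side C++ sketch of how a JPEG frame and a prompt can be packed into a single JSON request. The GenAI library used later in this article hides this step, and its internals are not shown here, so treat this purely as an illustration. The body layout follows the publicly documented OpenAI chat-completions format for image input (a base64 data URL). The function names base64Encode, jsonEscape, and buildVisionRequest are hypothetical and are not part of any library used in this project.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Standard Base64 (RFC 4648) encoder with '=' padding.
std::string base64Encode(const std::vector<uint8_t>& data) {
    static const char kAlphabet[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    out.reserve(((data.size() + 2) / 3) * 4);
    std::size_t i = 0;
    // Full 3-byte groups become 4 output characters.
    while (i + 2 < data.size()) {
        uint32_t n = (uint32_t(data[i]) << 16) | (uint32_t(data[i + 1]) << 8) |
                     uint32_t(data[i + 2]);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += kAlphabet[n & 63];
        i += 3;
    }
    // A trailing 1 or 2 bytes is padded with '='.
    std::size_t rem = data.size() - i;
    if (rem == 1) {
        uint32_t n = uint32_t(data[i]) << 16;
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += "==";
    } else if (rem == 2) {
        uint32_t n = (uint32_t(data[i]) << 16) | (uint32_t(data[i + 1]) << 8);
        out += kAlphabet[(n >> 18) & 63];
        out += kAlphabet[(n >> 12) & 63];
        out += kAlphabet[(n >> 6) & 63];
        out += '=';
    }
    return out;
}

// Minimal escaping (quotes, backslashes, newlines); not a complete JSON encoder.
std::string jsonEscape(const std::string& s) {
    std::string out;
    for (char c : s) {
        switch (c) {
            case '"':  out += "\\\""; break;
            case '\\': out += "\\\\"; break;
            case '\n': out += "\\n";  break;
            default:   out += c;
        }
    }
    return out;
}

// Builds a chat-completions style body: one user message holding the text
// prompt and the image as a base64 data URL.
std::string buildVisionRequest(const std::string& model,
                               const std::string& prompt,
                               const std::vector<uint8_t>& jpeg) {
    return "{\"model\":\"" + jsonEscape(model) +
           "\",\"messages\":[{\"role\":\"user\",\"content\":["
           "{\"type\":\"text\",\"text\":\"" + jsonEscape(prompt) + "\"},"
           "{\"type\":\"image_url\",\"image_url\":{\"url\":"
           "\"data:image/jpeg;base64," + base64Encode(jpeg) + "\"}}]}]}";
}
```

Base64 grows the payload by about one third, which is one reason to send a compressed JPEG frame rather than raw pixels.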

Circuit Diagram and Connections

Here is the circuit diagram for the AMB82-Mini AI Vision Camera project.

A 3.7 V LiPo battery feeds a slide switch (the system’s master on/off), whose output goes into a boost converter that raises the voltage to a stable 5 V. That 5V rail then powers both the AMB82-Mini (via its VUSB pin) and the ILI9341 TFT LCD (via its VCC pin).

The table below lists the connections between the AMB82-Mini and the ILI9341 TFT LCD screen.

TFT LCD Pin	Signal	AMB82-Mini Pin
VCC	+5 V supply	5V (V_USB)
GND	Ground	GND
MOSI	SPI MOSI	13
MISO	SPI MISO	14
SCLK	SPI Clock	15
CS	SPI Chip Select	12
D/C	Data/Command	3
RESET	LCD Reset	4
LED	Backlight	3.3V (VDD33)

The push-button connects between GPIO 7 (configured with INPUT_PULLUP) and GND, with a brief software debounce to detect clean HIGH→LOW→release transitions.
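The sketch in this article debounces the button with short blocking delays, which is fine for this demo. As an alternative, the following is a minimal non-blocking debounce sketch in plain C++ that can be tested on a PC. The Debouncer class and its 50 ms default are assumptions made for illustration; on the board you would feed it digitalRead(BUTTON_PIN) == LOW and millis().

```cpp
#include <cstdint>

// Debounces an active-low button (pressed == LOW with INPUT_PULLUP)
// without blocking the caller.
class Debouncer {
public:
    explicit Debouncer(uint32_t stableMs = 50) : stableMs_(stableMs) {}

    // rawPressed: true while the pin reads LOW. nowMs: current time in ms.
    // Returns true exactly once per press, after the button has been
    // released and the released level has stayed stable for stableMs_.
    bool update(bool rawPressed, uint32_t nowMs) {
        if (rawPressed != lastRaw_) {
            // The raw level changed: restart the stability timer.
            lastRaw_ = rawPressed;
            lastChangeMs_ = nowMs;
        }
        if (nowMs - lastChangeMs_ < stableMs_) return false;  // still settling
        if (lastRaw_ == stable_) return false;                // nothing new
        stable_ = lastRaw_;
        return !stable_;  // report only the pressed -> released transition
    }

private:
    uint32_t stableMs_;
    uint32_t lastChangeMs_ = 0;
    bool lastRaw_ = false;   // last raw reading
    bool stable_ = false;    // last debounced state
};
```

Because update() never waits, the display can keep refreshing while the button settles.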

Hardware Assembly

You can use a Zero-PCB or a Vero Board to assemble the circuit.

If you only want to test the circuit and the project, you can assemble it on a breadboard instead.

That’s all from the hardware section. Now we can start programming the device.

Source Code/Program

After hardware assembly is done, let’s move to the programming part.

First, install the GenAI, VideoStream, AmebaILI9341 and JPEGDEC_Libraries into your Arduino IDE. Next, copy the sketch below and paste it into your Arduino IDE.

Update wifi_ssid and wifi_pass with your network credentials, and paste your OpenAI, Gemini or Llama API key into the matching variable.

In the analysis branch of loop(), keep exactly one of the three calls llm.openaivision(), llm.geminivision() or llm.llamavision() uncommented to select your cloud provider.
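If you switch providers often, commenting and uncommenting blocks can get error-prone. One possible alternative, shown here as a plain C++ sketch, is to pick the provider with a single constant. The Provider enum, kProvider, and modelFor() are hypothetical names introduced only for this illustration; the model names are the ones used in the three calls of the sketch.

```cpp
#include <string>

// The three cloud services supported by the sketch.
enum class Provider { OpenAI, Gemini, Llama };

// Change this one line to switch provider (hypothetical constant,
// not part of the original sketch).
constexpr Provider kProvider = Provider::OpenAI;

// Returns the model name that the sketch passes to each vision call.
std::string modelFor(Provider p) {
    switch (p) {
        case Provider::OpenAI: return "gpt-4o-mini";
        case Provider::Gemini: return "gemini-2.0-flash";
        case Provider::Llama:  return "llama-3.2-90b-vision-preview";
    }
    return "";  // unreachable for valid enum values
}
```

In the Arduino sketch, the same switch could wrap the llm.openaivision(), llm.geminivision() and llm.llamavision() calls so that only the selected one runs.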

#include <SPI.h>
#include <WiFi.h>
#include "GenAI.h"
#include "VideoStream.h"
#include "AmebaILI9341.h"
#include <JPEGDEC_Libraries/JPEGDEC.h>

// ---- Wi-Fi & API keys ----
char wifi_ssid[] = "**************";  // your SSID
char wifi_pass[] = "**************";  // your password
String openAI_key = "";  // OpenAI API key
String Gemini_key = "";  // Gemini API key (comment OpenAI if used)
String Llama_key = "";   // Llama API key (comment OpenAI if used)

// ---- Camera configuration ----
#define CHANNEL 0
VideoSetting config(VIDEO_VGA, CAM_FPS, VIDEO_JPEG, 1);

// ---- TFT display pins & settings ----
#define TFT_CS SPI_SS                     // SPI chip select
#define TFT_DC 3                          // Data/Command control pin
#define TFT_RESET 4                       // Display reset pin
#define ILI9341_SPI_FREQUENCY 80000000UL  // 80 MHz SPI clock
AmebaILI9341 tft(TFT_CS, TFT_DC, TFT_RESET);

// ---- JPEG decoder callback ----
JPEGDEC jpeg;

int JPEGDraw(JPEGDRAW *p) {
  // Render decoded JPEG block to TFT
  tft.drawBitmap(p->x, p->y, p->iWidth, p->iHeight, p->pPixels);
  return 1;  // continue decoding
}

// ---- GenAI client instance ----
WiFiSSLClient client;
GenAI llm;

// ---- Button & font settings ----
#define BUTTON_PIN 7          // GPIO7 wired to GND on press
const uint8_t FONT_SIZE = 2;  // unified font size

// ---- Application states ----
enum State { STREAM, FROZEN, ANALYZED };
State state = STREAM;  // start in streaming mode

// ---- Image buffers & prompt ----
uint32_t img_addr = 0, img_len = 0;
String prompt_msg =
  "Please describe the image, and if there is text, please summarize it";

// ---- Helper: display a centered status message ----
void displayStatus(const String &msg) {
  tft.fillScreen(ILI9341_BLACK);
  tft.setFontSize(FONT_SIZE);
  tft.setForeground(ILI9341_WHITE);
  int16_t x = (tft.getWidth() - msg.length() * (6 * FONT_SIZE)) / 2;
  int16_t y = (tft.getHeight() / 2) - (8 * FONT_SIZE) / 2;
  tft.setCursor(max(0, x), max(0, y));
  tft.println(msg);
}

// ---- Helper: convert IP to string ----
String ipToString(const IPAddress &ip) {
  return String(ip[0]) + "." + String(ip[1]) + "." +
         String(ip[2]) + "." + String(ip[3]);
}

// ---- Initialize Wi-Fi & show IP on screen ----
void initWiFi() {
  displayStatus("Connecting Wi-Fi...");
  WiFi.begin(wifi_ssid, wifi_pass);
  uint32_t start = millis();
  // Wait up to 10 seconds for the connection
  while (WiFi.status() != WL_CONNECTED && millis() - start < 10000) {
    delay(500);
  }
  if (WiFi.status() == WL_CONNECTED) {
    displayStatus("Wi-Fi Connected!");
    delay(500);
    displayStatus("IP: " + ipToString(WiFi.localIP()));
  } else {
    displayStatus("Wi-Fi Failed");
  }
  delay(3000);                    // show for 3 seconds
  tft.fillScreen(ILI9341_BLACK);  // clear before streaming
}

// ---- Helper: split the AI response into fixed-width lines & show it ----
void displayText(const String &msg) {
  tft.fillScreen(ILI9341_BLACK);
  tft.setFontSize(FONT_SIZE);
  tft.setForeground(ILI9341_WHITE);
  tft.setCursor(0, 0);
  int maxC = tft.getWidth() / (6 * FONT_SIZE);  // characters per line
  for (int i = 0; i < msg.length(); i += maxC) {
    tft.println(msg.substring(i, min(i + maxC, msg.length())));
  }
}

// ---- Detect a clean button press (HIGH→LOW→release) ----
bool waitForButtonPress() {
  static bool wasIdle = true;
  bool pressed = (digitalRead(BUTTON_PIN) == LOW);
  if (pressed && wasIdle) {
    delay(50);  // debounce
    while (digitalRead(BUTTON_PIN) == LOW) delay(5);  // wait for release
    wasIdle = false;
    return true;
  }
  if (!pressed) wasIdle = true;
  return false;
}

void setup() {
  Serial.begin(115200);
  pinMode(BUTTON_PIN, INPUT_PULLUP);

  // TFT initialization
  SPI.setDefaultFrequency(ILI9341_SPI_FREQUENCY);
  tft.begin();
  tft.setRotation(3);  // 270°
  tft.fillScreen(ILI9341_BLACK);

  // Boot animation
  for (int i = 0; i < 3; i++) {
    displayStatus("Booting...");
    delay(500);
    tft.fillScreen(ILI9341_BLACK);
    delay(500);
  }

  // Wi-Fi & camera
  initWiFi();
  Camera.configVideoChannel(CHANNEL, config);
  Camera.videoInit();
  Camera.channelBegin(CHANNEL);

  // Warm up camera with a few frames
  delay(500);
  for (int i = 0; i < 3; i++) {
    Camera.getImage(CHANNEL, &img_addr, &img_len);
    jpeg.openFLASH((uint8_t*)img_addr, img_len, JPEGDraw);
    jpeg.decode(0, 0, JPEG_SCALE_HALF);
    jpeg.close();
    delay(100);
  }
  tft.fillScreen(ILI9341_BLACK);
  state = STREAM;
}

void loop() {
  // Button-driven state machine
  if (waitForButtonPress()) {
    if (state == STREAM) {
      // -> FROZEN: capture & display one frame
      Camera.getImage(CHANNEL, &img_addr, &img_len);
      jpeg.openFLASH((uint8_t*)img_addr, img_len, JPEGDraw);
      jpeg.decode(0, 0, JPEG_SCALE_HALF);
      jpeg.close();
      state = FROZEN;
    }
    else if (state == FROZEN) {
      // -> ANALYZED: run AI on frozen frame
      displayStatus("Analyzing...");
      // Note: this grabs a new frame for upload, taken after the on-screen freeze
      Camera.getImage(CHANNEL, &img_addr, &img_len);

      // 1) OpenAI Vision
      String aiReply = llm.openaivision(
        openAI_key, "gpt-4o-mini",
        prompt_msg, img_addr, img_len, client
      );

      // 2) Google Gemini Vision
      // String aiReply = llm.geminivision(
      //   Gemini_key, "gemini-2.0-flash",
      //   prompt_msg, img_addr, img_len, client
      // );

      // 3) GroqCloud Llama Vision
      // String aiReply = llm.llamavision(
      //   Llama_key, "llama-3.2-90b-vision-preview",
      //   prompt_msg, img_addr, img_len, client
      // );

      Serial.println("-- AI Response --");
      Serial.println(aiReply);
      displayText(aiReply);
      state = ANALYZED;
    }
    else {  // ANALYZED -> STREAM
      state = STREAM;
      tft.fillScreen(ILI9341_BLACK);
    }
  }

  // Continuous live streaming in STREAM state
  if (state == STREAM) {
    Camera.getImage(CHANNEL, &img_addr, &img_len);
    jpeg.openFLASH((uint8_t*)img_addr, img_len, JPEGDraw);
    jpeg.decode(0, 0, JPEG_SCALE_HALF);
    jpeg.close();
  }
}
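The displayText() helper in the sketch cuts the AI response every maxC characters, so words can be split across lines. With a 320-pixel-wide landscape screen and the 6 × FONT_SIZE pixel character width the sketch assumes, that is 320 / 12 = 26 characters per line. Below is a minimal C++ sketch of wrapping at word boundaries instead. The function wrapWords() is a hypothetical helper, written with std::string so it can be tried on a PC; porting it to Arduino String is left as an exercise. It treats only spaces as separators.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Splits text into lines of at most maxCols characters, breaking at spaces.
// Words longer than maxCols are cut into maxCols-sized pieces.
std::vector<std::string> wrapWords(const std::string& text, std::size_t maxCols) {
    std::vector<std::string> lines;
    if (maxCols == 0) return lines;

    std::string line;
    std::size_t pos = 0;
    while (pos < text.size()) {
        // Skip any run of spaces before the next word.
        while (pos < text.size() && text[pos] == ' ') ++pos;
        if (pos >= text.size()) break;

        // Extract the next word.
        std::size_t end = text.find(' ', pos);
        if (end == std::string::npos) end = text.size();
        std::string word = text.substr(pos, end - pos);
        pos = end;

        // A word that can never fit on one line is emitted in pieces.
        while (word.size() > maxCols) {
            if (!line.empty()) {
                lines.push_back(line);
                line.clear();
            }
            lines.push_back(word.substr(0, maxCols));
            word.erase(0, maxCols);
        }

        if (line.empty()) {
            line = word;
        } else if (line.size() + 1 + word.size() <= maxCols) {
            line += ' ';
            line += word;
        } else {
            // The word does not fit: start a new line with it.
            lines.push_back(line);
            line = word;
        }
    }
    if (!line.empty()) lines.push_back(line);
    return lines;
}
```

Each returned line could then be passed to tft.println() in place of the fixed-width substring() call.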

Setting Up API Key for OpenAI, Gemini and Llama

Follow the steps below to get an API key for OpenAI, Gemini or Llama. You only need one API key for this project. In my case, I chose OpenAI for image analysis.

OpenAI API key

Go to https://platform.openai.com/ and sign in (or create an account).

In the left sidebar, click API Keys → Create new secret key.

Copy the generated key (it will start with sk-…).

Gemini (Google) API key

Go to the Google Cloud Console: https://console.cloud.google.com/ and sign in.
Create or select a project.
In the left menu, go to APIs & Services → Library, search for “Gemini API” (or “Generative AI” → “Gemini”) and Enable it.
Then go to APIs & Services → Credentials → Create credentials → API key.
Copy your new API key.

GroqCloud Llama API key

Sign in at https://console.groq.com/ (you may need to sign up).
In the dashboard, navigate to API Tokens (or Access → Tokens).
Click Create new token, give it a name, and copy the token.

Working of AMB82-Mini Embedded AI Vision Camera Project

Let us briefly see how the AMB82-Mini Embedded AI Vision Camera project works.

Fig: Flowchart for Working of AMB82-Mini AI Vision Camera Project

On power-up, the AMB82-Mini initializes the ILI9341 TFT and flashes a brief “Booting…” message three times.

It then connects to your Wi-Fi network.

Once connected, it displays its IP address for three seconds.

Next, it initializes the camera and immediately begins live video streaming, continuously fetching JPEG frames and decoding them to the display.

When you press the push-button, the stream freezes on the current frame.

Pressing it a second time sends the captured image, along with your preconfigured prompt, to the selected AI service (OpenAI, Gemini, or Llama) over Wi-Fi.

Once the AI returns its analysis, the device switches into “AI Analysis” mode and renders the text result on the screen.

A third press clears the result and returns the system to live streaming mode.

Some More AI Image Analysis

Image Analysis 1 for Tide Detergent

Fig: Image Analysis 1 for Tide Detergent

The image shows a bottle of Tide laundry detergent, featuring a bright yellow container with a handle. The label includes a fresh green design and may point to information regarding the product’s features or usage. The bottle is positioned on a flat surface with part of a keyboard or another object partially visible in the background.

Image Analysis 2 for Person (Man)

The image shows a person sitting against a plain background. He is wearing a light-colored sweater with a diamond pattern and a collared shirt underneath. His expression appears calm and neutral. The surrounding environment includes some furniture, but details are minimal. There is no text visible in the image.

Image Analysis 3 for Mixed Pickle Bottle

The image features a cylindrical container of “Mixed Pickle.” The container is predominantly yellow with a red lid. It likely contains mixed pickled vegetables. The background is a dark surface that enhances the visibility of the container. There is no additional text visible in the image.