DEV Community

Cover image for Image Generation using Amazon Nova Canvas

Image Generation using Amazon Nova Canvas

🧠 Overview

The Image Generator is a web-based application that allows users to generate custom images using text or voice prompts. It utilizes AWS services such as Amazon Bedrock's Nova Canvas model for image generation and AWS Transcribe for converting voice input into text. The application has a React + HTML/CSS frontend, a FastAPI backend hosted on an EC2 instance, and uses Amazon S3 for image storage.

πŸŽ₯ Image Showcase

Check out these cool images the web-based application generated using Amazon Nova Canvas:

Image 1

Image 2

πŸŽ₯ Demo Video

Want to see it in action?

πŸ‘‰ Watch the full working demo of the Image Generator


✨ Amazon Nova Canvas Model - Why and How I Used It

The Amazon Nova Canvas model was chosen for its:

  • ✨ Photorealistic image generation capabilities.
  • ⏱ Low-latency inference through Amazon Bedrock.
  • 🧠 Structured prompt compatibility, allowing controlled and formatted input.
  • πŸ”Œ Seamless API integration with Bedrock, reducing the need for infrastructure management.

How I Used It (Technical Execution)

I have used the model via the Bedrock API by sending a structured message payload. Here’s how:

  • I have structured the input like a conversation, as expected by Nova Canvas.
  • The following is the core FastAPI code (main.py) that handles prompt input, connects to Amazon Bedrock, and returns the generated image:
from fastapi import FastAPI, Form, UploadFile, File
from fastapi.responses import StreamingResponse, JSONResponse
from fastapi.middleware.cors import CORSMiddleware
from io import BytesIO
import boto3
import base64
import json
import os
import time

app = FastAPI()

app.add_middleware(
    CORSMiddleware,
    allow_origins=["*"],
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Clients
s3_client = boto3.client("s3")
transcribe_client = boto3.client("transcribe")
bedrock_client = boto3.client("bedrock-runtime", region_name="us-east-1")

# S3 bucket
BUCKET_NAME = "bedrock-video-generation-us-east-1-6jywv6"

@app.post("/generate")
async def generate_wallpaper(prompt: str = Form(...)):
    body = {
        "messages": [
            {
                "role": "user",
                "content": [{"text": prompt}]
            }
        ]
    }

    try:
        response = bedrock_client.invoke_model(
            modelId="amazon.nova-canvas-v1:0",
            contentType="application/json",
            accept="application/json",
            body=json.dumps(body)
        )

        response_body = json.loads(response["body"].read())
        base64_image = response_body["output"]["message"]["content"][0]["image"]["source"]["bytes"]

        # Save and return image
        output_path = "output_wallpaper.png"
        with open(output_path, "wb") as f:
            f.write(base64.b64decode(base64_image))

        return StreamingResponse(open(output_path, "rb"), media_type="image/png")

    except Exception as e:
        return JSONResponse(status_code=500, content={"error": str(e)})


@app.post("/transcribe")
async def transcribe_audio(file: UploadFile = File(...)):
    # Step 1: Upload to S3
    job_name = f"transcription-job-{int(time.time())}"
    s3_key = f"uploads/{job_name}.wav"

    s3_client.upload_fileobj(file.file, BUCKET_NAME, s3_key)

    s3_uri = f"s3://{BUCKET_NAME}/{s3_key}"

    # Step 2: Start transcription job
    transcribe_client.start_transcription_job(
        TranscriptionJobName=job_name,
        Media={"MediaFileUri": s3_uri},
        MediaFormat="wav",
        LanguageCode="en-US",
        OutputBucketName=BUCKET_NAME
    )

    # Step 3: Poll until the job finishes
    while True:
        status = transcribe_client.get_transcription_job(TranscriptionJobName=job_name)
        job_status = status["TranscriptionJob"]["TranscriptionJobStatus"]

        if job_status in ["COMPLETED", "FAILED"]:
            break
        time.sleep(2)

    if job_status == "FAILED":
        return JSONResponse(status_code=500, content={"error": "Transcription failed"})

    # Step 4: Get transcript from S3
    transcript_uri = status["TranscriptionJob"]["Transcript"]["TranscriptFileUri"]
    transcript_json = boto3.client("s3").get_object(
        Bucket=BUCKET_NAME,
        Key=f"{job_name}.json"
    )
    transcript_data = json.loads(transcript_json["Body"].read())
    transcript_text = transcript_data["results"]["transcripts"][0]["transcript"]

    return {"prompt": transcript_text}
Enter fullscreen mode Exit fullscreen mode
  • This payload is sent via the FastAPI backend to the Bedrock API using the model ID amazon.nova-canvas-v1:0.
  • The response contains an image blob which is returned to the frontend and rendered for the user.

This allowed users to see their ideas transformed into visuals almost instantly.

  • On the frontend, I have built a React application where users can type or speak their prompts. Here's App.js file that shows how we send the prompt to the backend and render the generated image:
import React, { useState } from 'react';
import axios from 'axios';
import { ReactMediaRecorder } from "react-media-recorder";

function App() {
  const [prompt, setPrompt] = useState("");
  const [image, setImage] = useState(null);
  const [loading, setLoading] = useState(false);

  // Function to handle wallpaper generation
  const handleGenerate = async (textPrompt) => {
    if (!textPrompt) {
      alert("Please provide a prompt.");
      return;
    }

    setLoading(true);
    const formData = new FormData();
    formData.append("prompt", textPrompt);

    try {
      const response = await axios.post("http://98.81.151.118:8000/generate", formData, {
        responseType: 'blob'
      });

      const imageUrl = URL.createObjectURL(response.data);
      console.log("βœ… Image generated:", imageUrl);
      setImage(imageUrl);
    } catch (error) {
      console.error("❌ Error generating wallpaper:", error);
      alert("Error generating image.");
    } finally {
      setLoading(false);
    }
  };

  // Handle audio file upload and transcription

const handleGenerate = async (textPrompt) => {
  setLoading(true);
  const formData = new FormData();
  formData.append("prompt", textPrompt);  // Append the prompt as form data

  try {
    // Send form data instead of JSON
    const response = await axios.post("http://98.81.151.118:8000/generate", formData, {
      responseType: 'blob',  // Ensure the response is a blob (image)
    });

    // Create an object URL from the blob response and set it to the state
    setImage(URL.createObjectURL(response.data));
  } catch (error) {
    console.error("❌ Error generating wallpaper:", error);
    alert("Error generating image");
  } finally {
    setLoading(false);
  }
};


  return (
    <div style={{ padding: "2rem", textAlign: "center" }}>
      <h1>🎨 AI Wallpaper Generator</h1>

      {/* Text input for manual prompt */}
      <input
        type="text"
        value={prompt}
        onChange={(e) => setPrompt(e.target.value)}
        placeholder="Describe your wallpaper..."
        style={{ width: "300px", marginRight: "1rem" }}
      />
      <button onClick={() => handleGenerate(prompt)}>Generate</button>

      <br /><br />

      {/* Audio recorder button */}
      <ReactMediaRecorder
        audio
        render={({ startRecording, stopRecording }) => (
          <button
            onMouseDown={startRecording}
            onMouseUp={stopRecording}
          >
            πŸŽ™οΈ Hold to Speak
          </button>
        )}
        onStop={handleAudioUpload}
      />

      {loading && <p>✨ Generating wallpaper...</p>}

      {/* Image display */}
      {image && (
        <img
          src={image}
          alt="Generated Wallpaper"
          style={{ marginTop: "2rem", maxWidth: "90%", borderRadius: "12px" }}
        />
      )}
    </div>
  );
}

export default App;
Enter fullscreen mode Exit fullscreen mode

In addition to the React frontend, I have also built a lightweight index.html version using plain HTML, JavaScript, and the MediaRecorder API. This ensures the app remains functional in environments where React isn’t available or during testing:

<!DOCTYPE html>
<html lang="en">
<head>
  <meta charset="UTF-8" />
  <title>✨Image Generation using Amazon Nova Canvas</title>
  <style>
    body {
      margin: 0;
      padding: 0;
      font-family: 'Segoe UI', sans-serif;
      background: linear-gradient(to bottom right, #d9a7c7, #fffcdc);
      display: flex;
      flex-direction: column;
      align-items: center;
      padding-top: 40px;
      transition: background-color 0.3s, color 0.3s;
    }

    .container {
      background: rgba(255, 255, 255, 0.8);
      backdrop-filter: blur(10px);
      border-radius: 20px;
      padding: 30px;
      max-width: 600px;
      width: 90%;
      box-shadow: 0 8px 20px rgba(0, 0, 0, 0.15);
      text-align: center;
    }

    h1 {
      margin-bottom: 20px;
      font-size: 28px;
    }

    input[type="text"] {
      padding: 12px 16px;
      font-size: 16px;
      width: 100%;
      border: none;
      border-radius: 10px;
      margin-bottom: 15px;
      box-shadow: inset 0 2px 6px rgba(0,0,0,0.1);
    }

    button {
      padding: 12px 20px;
      font-size: 16px;
      border: none;
      border-radius: 30px;
      margin: 10px 5px;
      cursor: pointer;
      transition: background 0.3s, transform 0.2s;
      box-shadow: 0 4px 8px rgba(0,0,0,0.2);
    }

    button:hover:not(:disabled) {
      transform: translateY(-2px);
    }

    button:disabled {
      background-color: #ccc;
      cursor: not-allowed;
    }

    #generateBtn {
      background: linear-gradient(to right, #4facfe, #00f2fe);
      color: white;
    }

    #recordBtn {
      background: linear-gradient(to right, #ff9966, #ff5e62);
      color: white;
    }

    #stopBtn {
      background: linear-gradient(to right, #f2709c, #ff9472);
      color: white;
      display: none;
    }

    #downloadLink button {
      background: linear-gradient(to right, #11998e, #38ef7d);
      color: white;
    }

    .loading-spinner {
      border: 4px solid #f3f3f3;
      border-top: 4px solid #4fa3f7;
      border-radius: 50%;
      width: 30px;
      height: 30px;
      animation: spin 2s linear infinite;
      margin: 10px auto;
    }

    @keyframes spin {
      0% { transform: rotate(0deg); }
      100% { transform: rotate(360deg); }
    }

    .mic-animation {
      font-size: 40px;
      animation: bounce 1s infinite;
    }

    @keyframes bounce {
      0%, 100% { transform: translateY(0); }
      50% { transform: translateY(-10px); }
    }

    #audioStatus {
      font-size: 16px;
      color: #2e7d32;
      margin-top: 10px;
      display: none;
    }

    img {
      max-width: 100%;
      border-radius: 12px;
      box-shadow: 0 6px 12px rgba(0,0,0,0.2);
    }

    #output {
      display: none;
    }

    .gallery {
      margin-top: 40px;
      display: flex;
      flex-wrap: wrap;
      gap: 16px;
      justify-content: center;
    }

    .gallery-item {
      width: 180px;
      background: white;
      border-radius: 12px;
      overflow: hidden;
      box-shadow: 0 4px 8px rgba(0,0,0,0.15);
      cursor: pointer;
      text-align: center;
      position: relative;
    }

    .gallery-item img {
      width: 100%;
      height: auto;
    }

    .gallery-item p {
      margin: 10px;
      font-size: 14px;
      padding: 0 8px;
    }

    .delete-btn {
      position: absolute;
      top: 10px;
      right: 10px;
      background-color: red;
      color: white;
      border: none;
      border-radius: 50%;
      padding: 5px;
      cursor: pointer;
      font-size: 12px;
    }

    .gallery-item:hover .delete-btn {
      display: block;
    }

    #gallerySearch {
      width: 90%;
      padding: 12px;
      font-size: 16px;
      border-radius: 8px;
      margin-bottom: 20px;
    }

    body.dark-mode {
      background-color: #333;
      color: #fff;
    }

    body.dark-mode .container {
      background: rgba(40, 40, 40, 0.8);
    }

    body.dark-mode .gallery-item {
      background: #444;
    }

    body.dark-mode .gallery-item p {
      color: #eee;
    }
  </style>
</head>
<body>
  <div class="container">
    <h1>✨Image Generation using Amazon Nova Canvas</h1>

    <input type="text" id="prompt" placeholder="Describe your image idea..." />
    <button id="generateBtn" onclick="generate()">Generate Image</button>

    <div id="loading" class="loading-spinner" style="display:none;"></div>

    <div id="output">
      <h3>Here’s your Amazon-Canva-generated Image:</h3>
      <img id="wallpaper" src="" alt="Generated wallpaper" />
      <br>
      <a id="downloadLink" download="image.png">
        <button>Download Image</button>
      </a>
    </div>

    <br />
    <button id="recordBtn" onclick="startRecording()">πŸŽ™οΈ Record Prompt</button>
    <button id="stopBtn" onclick="stopRecording()" disabled>Stop</button>
    <p id="audioStatus"><span class="mic-animation">🎀</span> Recording...</p>

    <br />
    <button onclick="toggleDarkMode()">πŸŒ™ / 🌞 Toggle Dark Mode</button>
  </div>

  <div class="container">
    <h2>πŸ–ΌοΈ Gallery</h2>
    <input type="text" id="gallerySearch" placeholder="Search in gallery..." oninput="searchGallery()" />
    <div class="gallery" id="gallery"></div>
    <button onclick="exportImages()">Export All Images</button>
  </div>

  <!-- Modal for Full Image View -->
  <div id="imageModal" style="display:none;">
    <div style="position:fixed; top:0; left:0; width:100%; height:100%; background: rgba(0, 0, 0, 0.7);">
      <span style="color:white; font-size: 40px; position: absolute; top: 10px; right: 10px; cursor: pointer;" onclick="closeModal()">βœ–</span>
      <img id="modalImage" style="width:100%; max-height: 90%; object-fit: contain; margin-top: 50px;" />
    </div>
  </div>

  <script src="https://cdnjs.cloudflare.com/ajax/libs/jszip/3.10.1/jszip.min.js"></script>
  <script>
    let mediaRecorder, audioChunks = [], history = [];

    async function startRecording() {
      try {
        const stream = await navigator.mediaDevices.getUserMedia({ audio: true });
        document.getElementById("recordBtn").disabled = true;
        document.getElementById("recordBtn").style.display = "none";
        document.getElementById("stopBtn").disabled = false;
        document.getElementById("stopBtn").style.display = "inline-block";
        document.getElementById("audioStatus").style.display = "block";

        mediaRecorder = new MediaRecorder(stream);
        mediaRecorder.ondataavailable = e => audioChunks.push(e.data);
        mediaRecorder.onstop = async () => {
          const audioBlob = new Blob(audioChunks, { type: 'audio/wav' });
          audioChunks = [];
          document.getElementById("audioStatus").style.display = "none";
          await transcribeAudio(audioBlob);
        };
        mediaRecorder.start();
      } catch (err) {
        console.error("Mic error:", err);
        alert("Microphone access denied.");
      }
    }

    function stopRecording() {
      if (mediaRecorder && mediaRecorder.state === "recording") {
        mediaRecorder.stop();
        document.getElementById("recordBtn").disabled = false;
        document.getElementById("recordBtn").style.display = "inline-block";
        document.getElementById("stopBtn").disabled = true;
        document.getElementById("stopBtn").style.display = "none";
      }
    }

    async function transcribeAudio(audioBlob) {
      const formData = new FormData();
      formData.append("file", audioBlob, "audio.wav");
      try {
        const res = await fetch("http://98.81.151.118:8000/transcribe", {
          method: "POST",
          body: formData
        });
        const data = await res.json();
        document.getElementById("prompt").value = data.prompt;
        await generate();
      } catch (err) {
        alert("Transcription failed.");
      }
    }

    async function generate() {
      const prompt = document.getElementById("prompt").value.trim();
      if (!prompt) return alert("Enter a prompt first!");

      document.getElementById("loading").style.display = "block";
      document.getElementById("generateBtn").disabled = true;

      try {
        const res = await fetch("http://98.81.151.118:8000/generate", {
          method: "POST",
          headers: { "Content-Type": "application/x-www-form-urlencoded" },
          body: new URLSearchParams({ prompt })
        });

        if (!res.ok) throw new Error("Failed to generate image");

        const blob = await res.blob();
        const imageUrl = URL.createObjectURL(blob);

        document.getElementById("wallpaper").src = imageUrl;
        document.getElementById("downloadLink").href = imageUrl;
        document.getElementById("output").style.display = "block";

        addToHistory(prompt, imageUrl);
      } catch (err) {
        alert("Error: " + err.message);
      }

      document.getElementById("loading").style.display = "none";
      document.getElementById("generateBtn").disabled = false;
    }

    function addToHistory(prompt, imageUrl) {
      history.unshift({ prompt, imageUrl });
      localStorage.setItem("wallpaperHistory", JSON.stringify(history));
      renderGallery();
    }

    function renderGallery() {
      const gallery = document.getElementById("gallery");
      gallery.innerHTML = "";

      history.forEach((item, index) => {
        const div = document.createElement("div");
        div.className = "gallery-item";
        div.innerHTML = `
          <img src="${item.imageUrl}" onclick="viewImage('${item.imageUrl}')" />
          <p><strong>Image ${index + 1}</strong></p>
          <p>${item.prompt}</p>
          <button class="delete-btn" onclick="deleteFromHistory(${index})">❌</button>
        `;
        gallery.appendChild(div);
      });
    }

    function viewImage(url) {
      document.getElementById("modalImage").src = url;
      document.getElementById("imageModal").style.display = "block";
    }

    function closeModal() {
      document.getElementById("imageModal").style.display = "none";
    }

    function deleteFromHistory(index) {
      history.splice(index, 1);
      localStorage.setItem("wallpaperHistory", JSON.stringify(history));
      renderGallery();
    }

    function searchGallery() {
      const query = document.getElementById("gallerySearch").value.toLowerCase();
      const filteredHistory = history.filter(item => item.prompt.toLowerCase().includes(query));
      renderFilteredGallery(filteredHistory);
    }

    function renderFilteredGallery(filteredHistory) {
      const gallery = document.getElementById("gallery");
      gallery.innerHTML = "";
      filteredHistory.forEach((item, index) => {
        const div = document.createElement("div");
        div.className = "gallery-item";
        div.innerHTML = `
          <img src="${item.imageUrl}" onclick="viewImage('${item.imageUrl}')" />
          <p><strong>Image ${index + 1}</strong></p>
          <p>${item.prompt}</p>
          <button class="delete-btn" onclick="deleteFromHistory(${index})">❌</button>
        `;
        gallery.appendChild(div);
      });
    }

    function exportImages() {
      const zip = new JSZip();
      history.forEach((item, index) => {
        zip.file(`image${index + 1}.png`, fetch(item.imageUrl).then(res => res.blob()));
      });

      zip.generateAsync({ type: "blob" }).then(content => {
        const link = document.createElement("a");
        link.href = URL.createObjectURL(content);
        link.download = "images.zip";
        link.click();
      });
    }

    function toggleDarkMode() {
      document.body.classList.toggle("dark-mode");
    }

    window.onload = () => {
      const saved = localStorage.getItem("wallpaperHistory");
      if (saved) {
        history = JSON.parse(saved);
        renderGallery();
      }
    };
  </script>
</body>
</html>
Enter fullscreen mode Exit fullscreen mode


πŸ—οΈ Project Architecture

User β†’ React Frontend
     β†’ FastAPI Backend
         β†’ /transcribe β†’ AWS Transcribe (voice to text)
         β†’ /generate  β†’ Amazon Nova Canvas via Bedrock (text to image)
         β†’ (Gallery*) β†’ Session-based temporary history
Enter fullscreen mode Exit fullscreen mode

Flow

πŸ› οΈ Technologies Used

Frontend:

  • ReactJS
  • MediaRecorder API
  • Axios
  • Vanilla JS fallback (index.html)

Backend:

  • FastAPI (Python)
  • Hosted on Amazon EC2 (Amazon Linux 2)

AWS Services:

  • Amazon Bedrock (Nova Canvas) – AI-generated images
  • AWS Transcribe – Converts voice to text
  • Amazon S3 – Optional audio storage

πŸ” Application Flow

Text Prompt:

  1. User types a description (e.g., "black car with green lights")
  2. Sent to /generate
  3. Image returned and displayed for download

Voice Prompt:

  1. User records voice with MediaRecorder
  2. Audio sent to /transcribe
  3. Transcribed to text via AWS Transcribe
  4. Used as input for /generate
  5. Image generated and displayed

πŸ“‘ Backend Endpoints

POST /transcribe

  • Input: Audio file
  • Process:
    • Save audio blob
    • Send to AWS Transcribe
    • Return transcribed text as { "prompt": "..." }

POST /generate

  • Input: Prompt string (form-urlencoded)
  • Process:
    • Create structured message
    • Invoke Nova Canvas model
    • Return image blob

πŸ–₯️ Frontend Modes

  • Text Input: User can type any creative idea and press Generate
  • Voice Input: Uses MediaRecorder to capture audio
  • Result: Image is displayed and can be downloaded

Fallback HTML version (index.html) includes:

  • Input field and generate button
  • Record button (asks for microphone permission)
  • Real-time rendering of generated image

πŸ—£οΈ Voice-to-Image Flow (Deep Dive)

  1. Recording via MediaRecorder
  2. Upload to /transcribe
  3. Transcription via AWS Transcribe
  4. Automatic generation via /generate with transcribed prompt
  5. Rendering on the frontend

Key Design Considerations:

  • Use of temporary file handling for audio blob
  • Parsing transcription response and routing to generation
  • Handling microphone permissions and browser compatibility

βœ… Be sure microphone access is allowed in the browser.


🎯 Conclusion

This project demonstrates how voice and text can serve as natural inputs for generating rich visuals using generative AI. By combining Amazon Bedrock, Nova Canvas, and AWS Transcribe, I have built a smooth, scalable app that lets users bring their imagination to life.


Top comments (0)