AI Officer Institute
Generative AI for Business · Mission 04

Prompting Perfect Visuals

Words got the strategy to leadership. Now visuals take it to market. Build a repeatable visual system — not one image at a time, but a framework your team can run without you.

Brief 1

The 5-Element Framework

Most people open an AI image tool, type "professional product photo of an energy drink," and get something that looks like a stock photo from 2019. It is technically an image. It is not a visual system.

The gap between a generic AI image and a professional visual is the same gap you closed in Mission 3 between a bare prompt and a structured deliverable. Same principle. Different medium. Structure wins.

Five elements. Define all five before you generate anything.

1. Subject - What is in the image? Be specific. Not "energy drink" but "a single can of BOLT energy drink, matte black with electric blue accent stripe, condensation on the surface."

2. Style - What visual style? Photorealistic, editorial, flat illustration, 3D render, cinematic. If you do not know your options, ask AI.

3. Environment - Where is this? Studio with white backdrop, outdoor gym setting, minimalist desk, neon-lit bar counter. The environment sets the context.

4. Lighting and Mood - How does it feel? Bright and clean for product shots. Moody and dramatic for lifestyle. Warm and inviting for social. Lighting is the single biggest lever for changing the feel of an image.

5. Details and Enhancements - What finishing touches? Depth of field, lens flare, grain, specific camera angle, text overlay space, negative space for copy placement.

Worked Example: Bare Prompt vs. 5-Element

Bare prompt: "Create an image of an energy drink." Result: Generic can, generic background, generic lighting. Could be any product from any company. You would never put this on a landing page.

5-Element prompt:
- Subject = single BOLT can with matte black finish and electric blue accent, condensation visible
- Style = photorealistic product photography
- Environment = dark studio with single spotlight from above
- Lighting and Mood = dramatic, high contrast, premium feel
- Details = shallow depth of field, slight reflection on surface below, negative space on right for headline text

Result: Looks like it came from a professional photo shoot. On-brand. Ready for the press release.

Same model. Same tool. The framework is the difference.
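
The briefs ask for a .json prompt but do not show one, so the following is only one possible sketch, not a required format. It stores the five decisions from the worked example in a small Python structure, refuses to serialize if any element is empty, and prints the JSON. The class name `FiveElementPrompt`, the variable `bolt_hero`, and the key names are assumptions chosen for illustration; no image tool is known to require this exact schema.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class FiveElementPrompt:
    """Hypothetical container for the five decisions; field names are illustrative."""
    subject: str
    style: str
    environment: str
    lighting_and_mood: str
    details: list[str]

    def missing(self) -> list[str]:
        """Return the names of any elements that were left empty."""
        return [name for name, value in asdict(self).items() if not value]

    def to_json(self) -> str:
        """Serialize to JSON, refusing to proceed if any element is undefined."""
        gaps = self.missing()
        if gaps:
            raise ValueError(f"Define these elements first: {', '.join(gaps)}")
        return json.dumps(asdict(self), indent=2)


# Values copied from the worked example above.
bolt_hero = FiveElementPrompt(
    subject="single BOLT can, matte black finish, electric blue accent, condensation visible",
    style="photorealistic product photography",
    environment="dark studio with single spotlight from above",
    lighting_and_mood="dramatic, high contrast, premium feel",
    details=[
        "shallow depth of field",
        "slight reflection on surface below",
        "negative space on right for headline text",
    ],
)

print(bolt_hero.to_json())
```

The `missing()` check mirrors the rule "define all five before you generate anything": a bare prompt such as "energy drink" fills only the subject and would be rejected before it reaches a tool.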

Want to go deeper? Ask your AI Buddy:

"I need to create a professional product image for BOLT energy drink. Before I write any prompt, walk me through the 5-Element Framework. For each element, give me three options that would work for a premium energy drink targeting health-conscious millennials. Then help me choose and write the .json prompt."

Key Insight

AI does not know what good looks like unless you tell it. The 5-Element Framework is how you tell it. Five decisions, defined before you generate, produce results that look professional instead of generic. This is the same principle from every mission: your input drives the output. In Mission 3 it was RACE and CRA. Here it is Subject, Style, Environment, Lighting and Mood, and Details. Different framework, same discipline.

Brief 2

VISION: Six Elements for Video

Video starts with a controlled image. You do not describe a video from scratch. You take an image you already generated using the 5-Element Framework and extend it into motion with purpose.

V - Visual Anchor. Your starting image. This is what the viewer sees in frame one. It must be a controlled, on-brand image you already generated.

I - Intention. What emotion or message must land? Is this a product reveal? A lifestyle moment? A call to action? The intention drives every other decision.

S - Sequence. The movement or action, step by step. Camera pulls back to reveal the environment. Hand reaches for the can. Condensation drips. Define the sequence before you generate.

I - Immersion. The sensory layer. Lighting shifts, sound design cues, texture details. What makes this feel real? What makes the viewer stop scrolling?

O - Orbit. Camera motion and perspective. Static, slow zoom, pan left, orbit around subject, first-person POV. The camera movement carries the story.

N - Narrative Arc. How does the story evolve from start to finish? Even in a 6-second clip, there is a beginning, a shift, and an end.

Worked Example: From Image to Video

You have your BOLT product shot from Brief 1. Now you extend it.

- Visual Anchor: The dramatic product shot - BOLT can, dark studio, spotlight from above.
- Intention: Premium product reveal. The viewer should feel "I want that."
- Sequence: Camera starts tight on condensation dripping down the can. Slowly pulls back to reveal the full can in the spotlight. Final frame: BOLT brand in the lower third.
- Immersion: Sound of condensation drip. Low ambient hum. Spotlight intensifies as the camera pulls back.
- Orbit: Slow dolly back, slight upward tilt to make the can feel heroic.
- Narrative Arc: Mystery (what am I looking at?) to reveal (BOLT) to desire (I want that).

Six decisions. Made before you generate. The result is a 6-second clip that feels intentional, not random.
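
As a rough illustration of what a .json VISION prompt could look like, the sketch below collects the six decisions from the worked example and emits them in acronym order. The key names, the `build_vision_prompt` helper, and the `bolt_hero.png` filename are all hypothetical; this is not a documented Google Flow or Runway schema, so treat it as a starting structure to adapt.

```python
import json

# Keys follow the VISION acronym; the two "I" elements are spelled out to stay distinct.
# Illustrative assumption only, not a format defined by any video tool.
VISION_KEYS = [
    "visual_anchor",
    "intention",
    "sequence",
    "immersion",
    "orbit",
    "narrative_arc",
]


def build_vision_prompt(**elements) -> str:
    """Return a JSON string holding the six VISION elements in acronym order."""
    missing = [key for key in VISION_KEYS if not elements.get(key)]
    if missing:
        raise ValueError(f"Undefined VISION elements: {', '.join(missing)}")
    unknown = sorted(set(elements) - set(VISION_KEYS))
    if unknown:
        raise ValueError(f"Not part of VISION: {', '.join(unknown)}")
    return json.dumps({key: elements[key] for key in VISION_KEYS}, indent=2)


# Values taken from the worked example above.
bolt_reveal = build_vision_prompt(
    visual_anchor="bolt_hero.png",  # hypothetical filename for the Brief 1 image
    intention="Premium product reveal; the viewer should feel 'I want that.'",
    sequence=[
        "Camera starts tight on condensation dripping down the can",
        "Slowly pulls back to reveal the full can in the spotlight",
        "Final frame: BOLT brand in the lower third",
    ],
    immersion=[
        "Sound of condensation drip",
        "Low ambient hum",
        "Spotlight intensifies as the camera pulls back",
    ],
    orbit="Slow dolly back, slight upward tilt",
    narrative_arc=["mystery", "reveal", "desire"],
)

print(bolt_reveal)
```

Keeping the sequence and narrative arc as ordered lists makes the beginning, shift, and end of the clip explicit, which is easier to review than a single paragraph.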

Want to go deeper? Ask your AI Buddy:

"I have a product image for BOLT energy drink that I want to turn into a short video clip. Walk me through the VISION Framework. For each of the six elements, give me options that would work for a premium product reveal on social media. Then help me choose and write the .json video prompt for Google Flow."

Key Insight

Video without structure is just moving noise. VISION gives you six decisions to make before you generate, so the output has purpose. Do not write the prompt from scratch. Define the elements, let AI write the structured prompt, review it, run it. The AI Officer builds repeatable visual systems, not one-off clips.

Brief 3

Three Types of Visual AI Tools

You need to know what is under the hood. Not to become a designer. To make the right tool decision for your program.

LLMs with image generation. ChatGPT (DALL-E), Claude, Gemini (Imagen, Nano Banana). You are already using these. They generate images from text prompts and they are getting better fast. Good for most business visuals.

Embedded AI. Canva, Figma, Adobe - AI built into design tools you already use. Good when you need AI inside an existing workflow. But you are often paying for the wrapper, not just the model.

Purpose-built tools. Midjourney for artistic images. Runway for video. Google Flow for video generation. Flux and Stable Diffusion for specialized control. Use these when the LLMs cannot give you what you need.

When to Use What

For most business visuals - product shots, social graphics, presentation images - the LLMs are enough. Start there. For video, Google Flow or Runway. For brand-level artistic work, Midjourney.

The AI Officer's question is always the same: what does the program need? Match the tool to the outcome, not the hype.

Key Insight

Many visual AI apps are wrappers around the same handful of models. Before your team signs up for another subscription, ask what is under the hood. If it is DALL-E with a pretty interface, you can likely do the same thing in ChatGPT without paying for another tool. The AI Officer knows the difference.

Brief 4

What Is Style DNA?

One good image is easy. Eleven images and two videos that all look like they came from the same brand? That is a system.

Style DNA is a document that locks your brand's visual identity. It defines what stays fixed across every visual (colors, typography style, overall mood) and what varies by audience and platform (composition, environment, energy level).

Think of it like brand guidelines, but for AI-generated visuals. Anyone on the team picks up the Style DNA document, plugs it into a prompt, and gets on-brand output without asking you a single question.

Fixed vs. Variable

Fixed elements stay the same across every image and video: color palette, typography style, logo treatment, overall mood, quality level.

Variable elements shift by audience and platform: composition, environment, energy level, camera angle, aspect ratio, details.

The LinkedIn image feels different from the Instagram image. The training document header feels different from the TikTok thumbnail. But all of them are unmistakably BOLT. That is what Style DNA does.

Worked Example: Building BOLT's Style DNA

Fixed: Matte black primary, electric blue accent, clean sans-serif typography, premium and aspirational mood, photorealistic style.

Variable by platform:
- LinkedIn = professional, clean, studio setting
- Instagram = lifestyle, vibrant, environmental
- TikTok = dynamic, close-up, high energy
- Internal documents = minimal, structured, informational

With this document, anyone on Dana's team generates visuals that feel like BOLT. Without it, every image looks like it came from a different company.
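
One way to make the fixed-versus-variable split concrete is to keep the two groups in separate structures and merge them per platform. The sketch below does that with the BOLT values from the worked example. The names `FIXED`, `VARIABLE_BY_PLATFORM`, and `style_for`, along with the key names, are illustrative assumptions rather than a prescribed Style DNA format.

```python
import json

# Fixed elements: identical in every prompt. Values come from the worked example.
FIXED = {
    "palette": "matte black primary, electric blue accent",
    "typography": "clean sans-serif",
    "mood": "premium and aspirational",
    "style": "photorealistic",
}

# Variable elements: one treatment per platform.
VARIABLE_BY_PLATFORM = {
    "linkedin": "professional, clean, studio setting",
    "instagram": "lifestyle, vibrant, environmental",
    "tiktok": "dynamic, close-up, high energy",
    "internal_documents": "minimal, structured, informational",
}


def style_for(platform: str) -> dict:
    """Combine the fixed brand elements with one platform's variable treatment."""
    if platform not in VARIABLE_BY_PLATFORM:
        raise KeyError(f"No Style DNA entry for platform: {platform}")
    return {
        **FIXED,
        "platform": platform,
        "treatment": VARIABLE_BY_PLATFORM[platform],
    }


print(json.dumps(style_for("instagram"), indent=2))
```

Because every platform goes through the same function, a new associate cannot accidentally change the palette or mood; only the treatment differs, and a platform missing from the document fails loudly instead of producing an off-brand guess.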

Want to go deeper? Ask your AI Buddy:

"I need to build a Style DNA document for BOLT energy drink. Ask me about the brand's personality, target audience, visual preferences, and the platforms we will be publishing on. Then create a Style DNA template that defines what stays fixed and what varies by audience and platform."

Key Insight

The leadership test for Mission 4 is not "can you make a good image?" It is "could Dana hand this entire visual system to a new marketing associate on day one and get consistent, on-brand visuals without a single meeting?" If the answer is yes, you built a system. If not, you made some pictures. The AI Officer builds systems.

Brief 5

Practice Challenges

Everything from Missions 1 through 3 becomes visual. Dana has the strategy, the data, the go-to-market brief, and the content pack. Now she needs the visuals to take BOLT to market.

Challenge 1: Bare Prompt vs. 5-Element (10 min)
Generate the same image twice - once with a bare prompt, once with the 5-Element Framework. See the difference structure makes.

Challenge 2: Structured Image Build (15 min)
Use the full image workflow to create a professional BOLT product shot. .json prompt required.

Challenge 3: VISION Video Build (15 min)
Take your Challenge 2 image and extend it into a 6-second video using the VISION Framework. .json prompt required.

Challenge 4: Build BOLT's Style DNA (15 min)
Create the Style DNA document. Define fixed and variable elements. Map variables to personas and platforms.

Start the Practice Challenges: https://lab.ai-officer.com/program/785403/mission/2267521

Downloads:
- Download: Mission 4 Challenge Guide
- Download: Mission 4 Words to Know
- Download: Mission 4 Prompt Library

REQUIRED FOR CERTIFICATION

Final Project: The BOLT Visual Campaign (60-90 min)

This is the work that earns your Mission 4 certification and completes your Generative AI Essentials series. This is the capstone. Everything from four missions comes together here.

Produce the complete visual package for BOLT mapped to the content pack from Mission 3.

Office of Revenue (6 images): Press release hero, blog post graphic, LinkedIn image, Instagram image, Facebook image, TikTok/X image.

Office of Talent (2 images): Internal training document header, team memo header.

Office of Innovation (2 images): Product storyboard scene, user story concept visual.

Office of Operations (1 image): Customer complaint response catalog header.

Videos (2 clips): Blog post video clip and internal training video clip. Both using VISION Framework and Google Flow.

For each image, submit: the image, the .json prompt, your 5-Element decisions and why, and how the fixed elements held while the variables shifted.

For each video, submit: the video, the starting image, the .json VISION prompt, and your six VISION element decisions and why.
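
With eleven images and two videos, each carrying four submission items, a simple checklist helps confirm nothing is missing before you submit. The sketch below encodes the deliverables and items listed above and reports what is still outstanding. The item labels such as `json_prompt` and the `incomplete` helper are hypothetical names chosen for illustration.

```python
# Deliverables as listed in the final project brief.
IMAGE_DELIVERABLES = {
    "Office of Revenue": [
        "press release hero", "blog post graphic", "LinkedIn image",
        "Instagram image", "Facebook image", "TikTok/X image",
    ],
    "Office of Talent": ["internal training document header", "team memo header"],
    "Office of Innovation": ["product storyboard scene", "user story concept visual"],
    "Office of Operations": ["customer complaint response catalog header"],
}
VIDEO_DELIVERABLES = ["blog post video clip", "internal training video clip"]

# Submission items per deliverable; the labels are illustrative.
IMAGE_ITEMS = {"image", "json_prompt", "element_decisions", "fixed_vs_variable_notes"}
VIDEO_ITEMS = {"video", "starting_image", "json_vision_prompt", "vision_decisions"}


def incomplete(submissions: dict[str, set[str]]) -> dict[str, set[str]]:
    """Map each deliverable to the submission items it still lacks."""
    all_images = [name for group in IMAGE_DELIVERABLES.values() for name in group]
    gaps = {}
    for names, required in ((all_images, IMAGE_ITEMS), (VIDEO_DELIVERABLES, VIDEO_ITEMS)):
        for name in names:
            lacking = required - submissions.get(name, set())
            if lacking:
                gaps[name] = lacking
    return gaps


# Example: only one deliverable has been started.
draft = {"press release hero": {"image", "json_prompt"}}
for name, lacking in incomplete(draft).items():
    print(f"{name}: missing {sorted(lacking)}")
```

An empty result from `incomplete` means every deliverable has all four items attached; it does not check quality, so the three questions under Before You Submit still apply.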

Before You Submit: Three questions. If the answer to all three is yes, you built a system.

1. Does every visual stay within BOLT's Style DNA?
2. Does every visual feel right for its specific audience?
3. Could Dana hand this entire package to a new associate on day one and get on-brand visuals without a single meeting?

Launch Final Project: https://lab.ai-officer.com/program/785403/mission/2267521

SECTION 3: WRAP-UP

Key Takeaways

AI does not know what good looks like unless you tell it. The 5-Element Framework for images and the VISION Framework for video are how you tell it. Five decisions for stills. Six decisions for motion. Defined before you generate.

Your input drives the output. Same principle, every mission. In Mission 1 it was mindset. In Mission 2 it was data. In Mission 3 it was logic frameworks. Here it is visual frameworks. The discipline is always the same: structure before generation.

One good image is easy. A system is leadership. Style DNA is the document that makes it possible. When anyone on the team can generate on-brand visuals without asking you, that is a system. That is what you built.

The AI Officer builds systems, not images. The test is never "can I make something that looks good?" The test is "can someone else follow my system and get the same quality without me in the room?"

You have completed the Generative AI Essentials series. Four missions. Mindset. Data. Logic. Visuals. You have built the personal foundation for leading AI. Next: the Agentic AI series, where you take everything you have learned and scale it to your team.

Your Commitment

Before you close out: name one habit you will start, stop, or continue this week. Not a tool. A behavior. Something your manager would notice.

Checkpoint

Question 1: You need to create a professional product image for a client pitch. What do you do first?

A) Open an AI image tool and describe what you want
B) Look for a similar stock photo to use as a reference
C) Define all five elements of the 5-Element Framework before touching the tool
D) Ask a designer to give you a brief

Correct answer: C. Structure before generation. AI does not know what good looks like unless you define it first. Five elements, all defined, produce professional results.

Question 2: What is a Visual Anchor in the VISION Framework?

A) The text overlay or logo that anchors the brand in the video
B) The controlled starting image that video is built from
C) The central message or emotion the video must deliver
D) The aspect ratio and format specification for the platform

Correct answer: B. Video does not start from a text description - it starts from a still frame you have already generated and approved. The Visual Anchor ensures the video begins on-brand and visually controlled.

Question 3: What is Style DNA and why does it matter?

A) A document that shows which AI tools your brand is approved to use
B) A template for writing 5-Element Framework prompts faster
C) A document that defines fixed and variable elements of your brand's visual identity so anyone can generate on-brand visuals without a briefing
D) A copyright registration for AI-generated brand assets

Correct answer: C. Style DNA is the system document that makes visual consistency possible at scale. Without it, every image looks like it came from a different company. With it, a new associate can generate on-brand visuals on day one.

Question 4: When should you use a purpose-built visual AI tool like Midjourney or Google Flow instead of the LLM you already use?

A) Always - purpose-built tools produce better results
B) Never - stick with one tool to build mastery
C) When the program requires artistic or video output that the LLMs cannot deliver at the required quality level
D) When you want to avoid paying for a premium subscription

Correct answer: C. For most business visuals, the LLMs are enough. Use purpose-built tools when the program needs something the LLMs cannot reliably produce - artistic editorial images, controlled video generation, or specialized output.

Question 5: You have generated eleven images for the BOLT campaign. How do you know if you built a system?

A) All eleven images look identical
B) The images all used the same .json prompt
C) A new team member could use your Style DNA document and .json templates to generate on-brand visuals without asking you a single question
D) You ran VRC on every image before submitting

Correct answer: C. That is the leadership test. Not "did I make good images?" but "could someone else follow my system and get the same quality without me in the room?" That is the AI Officer standard.

Certificate of Completion

AI Essentials Program
Prompting Perfect Visuals - Mission 4 Complete

Generative AI Specialist Certification Earned

Outstanding work, Cadet. You have completed all four missions of Generative AI Essentials and earned your Generative AI Specialist Certification.

- Mission 1: You learned the other 50% - what it means to lead AI, not just use it.
- Mission 2: You fixed the fuel - clean data before anything else.
- Mission 3: You built the logic - frameworks that produce professional output your leadership trusts.
- Mission 4: You built the visual system - repeatable, on-brand, scalable.

You now have the personal foundation for leading AI programs. The mindset, the data discipline, the logic frameworks, and the visual systems. You have proven you can produce professional-grade output across text, analysis, and visuals.

Next: the Agentic AI series. Where you stop leading yourself and start leading your team.

Progress: Mission 1 (done) | Mission 2 (done) | Mission 3 (done) | Mission 4 (done)

Your Next Step: Agentic AI Essentials

Issued by AI Officer Institute
Instructors: Dave Hajdu and David Nilssen
dave@ai-officer.com | ai-officer.com

Course Experience Survey

[Survey placeholder - link to be added by Kate]

Words to Know

For definitions of all key terms from this mission, see Mission 4 Words to Know or visit: https://aiofficer.sg.larksuite.com/sync/[Mission4_Words_Link]

You can always ask your AI Buddy to explain any of these concepts in more detail. That is what he is there for.

Prompt Library

For copy-paste prompts from this mission, see Mission 4 Prompt Library or visit: https://aiofficer.sg.larksuite.com/sync/[Mission4_Prompts_Link]

Don't write prompts. Buddy it.
