Transcripts
1. Introduction to AI Avatars: If you have been curious
about AI avatars, but feel overwhelmed by tools, demos, and mixed results,
you are in the right place. This course is created
by Bros Academy based on real production experience
from Bros AI Studio. In our studio, we don't
just experiment with AI. We use it to create AI
avatars, animated characters, full AI-driven cartoons,
music video clips, and commercial advertising
videos for real use cases. AI avatars don't fail
because of bad tools. They fail because of
an unclear workflow. This course is not a collection of random features or demos. It's a practical
end-to-end process that we actually use
in our own projects, from choosing a character
to creating a talking, moving AI avatar video. You will see how we design
consistent characters, write short scripts
that work for AI, generate natural
voices, and apply lip sync. No hype, no one-click magic, just what works and why. This course will not
promise perfect results, viral videos or instant income. What it will give you
is a clear structure, realistic starting point, and the confidence to experiment
without guessing. By the end, you won't just have a finished AI avatar video. You will understand how
and why it was made. If you want a grounded,
practical introduction to AI avatars used
in real projects, this course is for you.
Let's get started.
2. Module 1: AI Avatars - Types, Use Cases & Choosing Your Direction: Module one: AI avatars, types, use cases, and choosing
your direction. Before we start working
with tools and visuals, let's take a step back and talk about AI avatars in general. In this module, we'll look
at what AI avatars are, where they are used,
and how to choose the right direction
without overthinking it. When we talk about AI avatars, we mean digital characters created with the
help of AI tools. These avatars can
represent real people or they can be completely
fictional characters. The main purpose is communication
to explain something, tell a story, or
deliver a message. AI avatars are already used
in many different areas. You will often see them in social media content,
educational videos, marketing and advertising, and even presentation or
storytelling projects. They are flexible tools, and their role depends on
how you want to use them. There are several main
types of AI avatars. Some are realistic and based on real photos, others
are stylized, cartoon-like, or semi-realistic, and some are
fully fictional characters. To be honest, there is
no single correct type. Each approach has
its own strengths. Realistic avatars are usually
created from real photos. They can feel very
personal and engaging, especially when
representing a real person. At the same time, they come
with higher expectations for realism and require more
control and consistency. Fictional or stylized avatars are not based on real people. They offer more
creative freedom and are often easier to
maintain visually. They are also more forgiving
when it comes to motion, lip sync, and small
imperfections. Before choosing an avatar, it's important to ask yourself
a few simple questions. Why do you need this avatar?
Where will it be used? Do you want it to represent
you or a character, and how realistic does
it really need to be? One important thing
to remember is that you are not locked
into one choice. You can always
change your avatar later or create more than one. What matters most is not
making a perfect decision, but gaining experience
by actually trying. In this course, we will show you both approaches in practice. We will create a realistic
avatar based on real photos, and we will also generate a fictional avatar from scratch. This way, you understand
how each approach works and which one
fits your goals best. All right, that's the
end of this module. Let's quickly summarize
what we have learned. In this module, we
have learned what AI avatars are, where
they are used, and the main types
you will encounter. You have also seen how
to think about choosing an avatar direction without pressure or fear of
making the wrong choice. In the next module, we
will start building the visual foundation for your avatar and move
into hands on practice. In the next module, we will move from concepts to practice. You will see how AI
avatars are actually created using different
tools and approaches. We will work with real photos, generate fictional
characters, and focus on building a
consistent visual foundation that you can reuse later for
video and animation.
3. Module 2: Visual Foundation - Creating Your Avatar in Practice: Module two: visual foundation, creating your
avatar in practice. Welcome to Module
two. In this module, we finally move from
theory to practice. I will walk you
through how we create AI avatars step by step using
different tools and setups. We are not aiming
for perfection here. The goal is to understand the process and
learn how to create avatars that are consistent
and usable in real projects. We'll see a few common approaches, from realistic avatars based on real photos to more stylized,
cartoon-like characters. We'll use different
tools along the way, but don't get too attached
to any specific one. The tools are just examples; the workflow is what
really matters. As you watch, try to notice
the small decisions: lighting, angles, prompt details. Those little things often make a bigger difference
than you would expect. We'll also spend time on
angles and consistency, creating multiple views of the same character and preparing images for video or animation. And don't stress about
remembering everything. Focus on understanding why
things are done a certain way. You can always come
back later when you start building
your own avatar. All right, let's start with a very powerful tool that I
use a lot: Higgsfield AI. Higgsfield is a generative
AI platform that brings multiple image and video models together in one interface. Instead of being limited
to a single model, it lets you experiment
with different models, cinematic tools and
generation modes depending on what you
are trying to create. You can generate
images and videos, experiment with cinematic shots, control camera angles,
create scene variations, and explore different
visual styles. That's why it's a great tool
for creative production, marketing, and
visual storytelling. Let's start creating our avatar. The first thing we
need is a good prompt. For that, I usually go
straight to ChatGPT. You can use other
models as well, such as Gemini, to help
generate the prompt. In this example,
I want to create a super realistic avatar of myself using my own
photos as a reference. I want the avatar
to be recognizable, so I'll keep some
clear visual details like wearing a blue hoodie. I also want a studio
microphone in front of me and some green plants
maybe in the background. I usually use photos like this. Remember that your reference image should
show just you in the frame, preferably as a portrait
or a close-up, in good quality, and most importantly, with your face looking
straight at the camera. Once you add your
reference image, let the AI do its magic, and let's wait for the prompt. All right, let's check
what ChatGPT gave us. Let's take a quick
look at the prompt. If everything looks good, just copy it and let's
get back to Higgsfield AI. The first way to create
your avatar is by training it using your own
photos with Higgsfield Soul. This is actually my wife's
favorite option right now. To get started, click Image in the top left corner and select
the Higgsfield Soul model. My wife and I already have
our Soul avatars trained, so I won't go through the full process from scratch here. But what you need to do is
click Generate New and upload around 15 to
20 photos of yourself. Keep in mind that it's
best to include a mix of close-up shots
and full-body photos. This helps the model
and full body photos. This helps the model
learn your face and body in different
positions and angles. For my avatar training, I use the same photos
I showed you earlier. Try to use similar
images of yourself, good lighting, clear face, and no other people
on the frame. The training process
take a bit of time. From what I remember, it usually takes about
ten or 20 minutes, so just be patient here. Once your avatar is ready, select it and past the prompt, what we got from ChatGPT. Since this course,
we are creating YouTube shorts,
TikTok style videos. Let's choose the
916 aspect ratio. For resolution, I usually go with the highest
available option. Right now, it's two K. Just a quick note
about the credits. Generating four images
costs two credits, and honestly, that's not much, especially compared
to other tools. And remember when
I said Higgsfield is an AI model aggregator? That's one of the
biggest advantages. You can switch between different
models while paying for one subscription instead of
juggling five separate tools, which would be way
more expensive. Next up is my personal favorite at the moment, Nano Banana Pro. Here we will use
the same prompt and the same reference image and
generate the avatar again. I will generate two
images using Nano Banana. At the moment, one image in 2K resolution
costs two credits, so keep that in mind when
you are testing things. After that, let's
switch to Seedream 4.5. Same setup again, same prompt
and same reference image. I think by now you are starting to see what
we are doing here. The idea is to show you several different image
generation models using exactly the same input. This gives you more
options to choose from and makes it easier
to compare results. You might like the look of one
model more than another, and that's totally fine. The goal is to
understand how to test, compare and choose the model
that works best for you. Next, I want to
show you Kling AI. You can actually use Kling
models inside Higgsfield too. But when I'm specifically
working with Kling, I usually go straight
to the Kling website and generate images there. I will explain why I
prefer that a bit later. It will make sense once
you see the workflow. For now, let's generate
our avatar using the O1 model or whatever latest version is
available on the site. On the left side of the screen, click the O1 button, then switch
to image generation. Paste the prompt and upload
your reference photo. If you have a Kling
subscription, you can generate
images for free. I have a subscription,
so I'm going to select the free generation option
by clicking this button. I will set four outputs, but you can choose up to nine if you want more variations. And for resolution, let's
go with 2K. See, with a subscription, it
shows zero credits. If you don't have
a subscription, generating one image usually
costs about one credit. All right, if everything looks
good, let's hit Generate. While the images are generating, I want to explain why I
often use Kling AI directly
instead of running it
through Higgsfield, even though having one subscription
can be cheaper. The reason is pretty simple. At Bros Academy, we are active
creators and we generate a lot of images and videos both for ourselves
and for clients. That means we need
a lot of credits. Kling AI has a really nice system that lets you earn
credits for free. Let me show you our
profile in Kling. We post our work
here regularly and sometimes participate
in different contests. Just posting your creatives doesn't earn you
credits by itself, but you do earn credits
when someone recreates your images or videos.
On top of that, you get a small commission
from each recreation. People can also like your work and follow you if they
enjoy your style. So if you want to,
you can actually build an audience inside
Kling as well. For example, yesterday, I
earned 160 free credits. Not my best result, but
that's totally fine. Some days are
better than others. Over the past few months, I have earned more than
32,000 free credits this way. If I were to buy those credits, it would cost roughly 400 bucks. On top of that, I earned
about 50,000 free credits from other
activities offered in the past. Kling also has a
referral program. At the moment I'm
recording this, if you use my referral link, we both get 500 free credits
for your first generations. It's a win-win for both of us. I will put the link
in the description. All right, let's take a look at the results that Kling generated. I actually really like
how this one turned out. Before downloading the image, I recommend upscaling it to get the best
possible quality. It's a small step, but it
makes a noticeable difference. Now let's get back
to Higgsfield AI and check what the different
models produced there. Personally, my favorite result
here is from Seedream 4.5. But in your case,
the best result might come from a different model, and that's totally fine. The goal here isn't to
pick the one correct model. My goal is simply to show you the options so you can choose
what works best for you. Go ahead and download the
image you like the most. And that's it for now. We just created our
first realistic avatar based on a photo reference. We tested several models and picked the one
we like the most. In the next module, we will
take this avatar further. We will generate more images
of the same character, but with different angles. So our video feels more
dynamic and natural. And if you'd like a more cartoon-style
avatar, don't worry. I will also show you how to turn your realistic avatar into a pixel-style character
in just a few clicks. See you in the next module.
4. Module 2.1: Creating Multiple Angles and Styles for Your AI Avatar: Module 2.1: creating
multiple angles and styles for your AI avatar. In this module, we will take
your avatar a step further. We will focus on creating multiple angles of
the same character, so your videos feel more
natural and dynamic. You will see how to turn
your realistic avatar into a more stylized
pixel style version, if that's the direction
you want to explore. Alright, let's get
back to Higgsfield and choose the Nano
Banana Pro model. I have already picked the image I like the most
and downloaded it. Now I want to generate different camera
angles for this image while keeping the
same character and the same environment
as in the reference. Let's select our image
and update the prompt. Make sure that the 9:16 aspect ratio is selected. Then
click Generate. I will generate four variations, but feel free to experiment with fewer or more outputs and
see what works best for you. All right, as you can see, we got a few different
options here. They're not perfect,
but that's all right. In some cases, it helps
to be more specific in the prompt and clearly describe the camera
angle you want. But for now, I'm
mainly showing you the workflow and the
options that are available. Next, I want to show
you another tool inside Higgsfield AI that is based
on the Nano Banana Pro model. It's called Shots. Let's go
to the top of the screen, click Apps, and find the Shots app. This app generates nine
different camera angles from a single uploaded image.
We use it quite often. So let me quickly show you a few examples of the work
that was generated with it. It's a great tool for telling your story in a
more cinematic way. A lot of people use avatars with just one image or
one camera angle. If you want to stand out,
it helps to do something a bit more interesting or at least understand
how it's done. All right, let's
upload our image. Give it a moment, then double
check the aspect ratio. In our case, 9:16
works perfectly. The generation
costs four credits. Let's click Generate
and wait about one or two minutes for the results. All right, we have
got nine shots here. I like a few of them, so I'm going to pick four
and upscale those. Each image upscale
costs two credits. You can also download
the images without upscaling if you don't need it, or upscale them up to 4x, which costs more
credits, of course. All right, let's
start the upscaling. It usually takes about
two or three minutes, so let's be patient and wait. Okay, now we can see
all the images we got. Let's download them and save everything into a
separate folder. And later, maybe in
the next modules, I will generate a few
more angles where the avatar is looking straight at the camera,
like in these shots. Now, remember I said that I
would show you how to turn your avatar into a
cartoon-style character, just in case you want that look? You can generate
a cartoon avatar from scratch using
prompts from ChatGPT, of course, but today we
will keep it simple. We will use Nano Banana Pro in Higgsfield with
a short prompt. I'm going to use the
same reference image and ask for a pixel
style version while keeping the
character features and environment as close as
possible to the original. Let's generate it,
and then we will jump into Kling AI to
compare the results. That way, you will see
how different tools handle the exact same task, and you can choose
what you like best. All right, in Kling, select the O1 model and switch
to image generation. Upload the reference image, paste the same prompt we used in Higgsfield, and
set four outputs. And now let's hit Generate. Now let's get back
to Higgsfield AI and take a look at what we got. So what do you think, guys? Personally, I think
this looks really good. At this stage, you can easily experiment by changing
things like clothing, hairstyle or small
details just by tweaking the prompt and using
the image as a reference. It's a very flexible setup. Now let's check what
we got from Kling AI. Hm. In my opinion, the results look better with the Nano
Banana Pro model. The Kling version feels a bit too simple and cartoony
for my taste. That said, you might
feel differently, and that's totally fine. There is no single
best result here. My goal is simply to show you different options so
you can test them yourself and decide what works best for your style
and your projects. All right, that's
it for this module. Let's quickly summarize
what we have learned here. You learned how to expand a single avatar image into
multiple views and styles. We explored how to generate different camera angles while
keeping the same character and environment and
why this matters for creating more dynamic and
natural looking videos. You also saw how
different AI models can produce very
different results, even when using the same
prompt and reference image, and why there is no
single right choice. Finally, we looked at
how a realistic avatar can be transformed into a more cartoon-style
version and how to compare results
across different tools. The key takeaway here is the
workflow: testing options, comparing results, and choosing what fits your project
and your taste. In the next module, we'll
focus on story and structure. We'll use ChatGPT
to come up with a simple script and turn our idea into a short
video scenario. You will see how to go from
a rough concept to a clear, usable script that works
well for a short-form video. See you in the next module.
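The advice to describe the camera angle clearly in the prompt can be sketched as a small template. This is only an illustration under assumptions: the prompt wording and the list of angles are hypothetical, not the exact prompt used in the course.

```python
# Hypothetical prompt template. The wording is an assumption, not the course prompt.
BASE_INSTRUCTION = (
    "Use the attached image as the reference. Keep the same character, "
    "clothing, and environment. Change only the camera angle: {angle}. "
    "Aspect ratio 9:16."
)

# Example angles only; describe whichever views your video needs.
ANGLES = [
    "front view, looking straight at the camera",
    "slight side angle",
    "slightly top-down shot",
]


def build_angle_prompts(angles):
    """Return one prompt per camera angle, identical except for the angle text."""
    return [BASE_INSTRUCTION.format(angle=angle) for angle in angles]


for prompt in build_angle_prompts(ANGLES):
    print(prompt)
```

Keeping everything except the angle identical makes it easier to see whether a difference in the result comes from the angle or from the model.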
5. Module 3: Script & Story - Writing Short video scenario: Module three: script and story, writing a short
video scenario. Hey, everyone, and
welcome to Module three. In this module, we are going to start playing with ideas for our video using the avatar
we have already created. We will talk about AI, but don't worry, not in a
boring or super technical way. The goal is to come up with ideas that are interesting even for people who
aren't really into AI. We will use ChatGPT to
help us brainstorm. We will tell it that
we want to create a short YouTube
Shorts-style video and ask for ten fun, engaging
ideas around AI. Let's see what it comes up with and pick something we like. All right, let's take a look
at what ChatGPT came up with. I read through
all the ideas, and just keep in
mind, you don't have to stick to AI as a topic. You can take any
subject you like and use the same approach
to create videos. The main thing is the workflow. Once you finish this course, you will have a clear way to go from an idea to a
finished short video. After that, everything else
depends on your imagination. Uh huh. One idea
really stood out to me: "AI won't replace you, someone using AI will." It
sounds a bit provocative, which is exactly what we need. Let's tweak it slightly and
add "in 2026" to the idea. Now I will ask ChatGPT to
write a 20-to-30-second script with a strong hook in
the first 3 seconds, so people don't just
scroll past the video. Let's see what
ChatGPT gave us next. All right, let's
read what we got. ChatGPT actually
generated a script broken down second by second, which is super helpful, especially for short videos. If this version already feels good to you,
that's totally fine. You can stop right here
and move forward with it. But here's a small trick
when working with ChatGPT. You usually get better results
if you give it a role. In our case, I want ChatGPT
to imagine that it's a YouTuber with ten years of experience and a
skilled public speaker. Then I will ask it to rewrite the script using
that perspective. Now let's see what
kind of results we get and compare it with
the previous version. Alright, let's read the result. Personally, I like
this version more. I really like how the
script starts with "In 2026, AI won't replace you." At that moment, the viewer can relax a bit because a lot of people are genuinely afraid that AI will replace
them in the future. And then just a
few seconds later, the avatar says,
someone using AI will. That's where the feeling flips. The viewer might think, wait, what? What do you mean? And that curiosity makes
them want to keep watching. The video also ends with a provocative question,
which is great. It can motivate people to leave a comment or react to the video, and that naturally helps with the engagement
and the algorithm. Let's copy the script
and paste it into a separate Google Docs file
to keep everything organized. We will come back to this
document in the next module. And that's it for this module. Take a short break,
grab a coffee, or do a few push
ups, reset a bit, then get ready for
the next module where we'll turn the
script into a voiceover. I will show you how to do that
in ElevenLabs in a way where most people won't even realize
it's an AI-generated voice. See you in the next module.
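The three prompting steps from this module (brainstorm ideas, ask for a timed script with a hook, then rewrite with a role) can be written down as reusable text templates. This is a minimal sketch: the exact wording is an assumption, and the function names are hypothetical helpers that only build strings to paste into ChatGPT.

```python
# Hypothetical prompt wording; adapt it to your own topic and tone.
def brainstorm_prompt(topic: str, count: int = 10) -> str:
    """Step 1: ask for a list of video ideas."""
    return (
        "I want to create a short YouTube Shorts-style video. "
        f"Give me {count} fun, engaging ideas around {topic}."
    )


def script_prompt(idea: str, min_seconds: int = 20, max_seconds: int = 30,
                  hook_seconds: int = 3) -> str:
    """Step 2: ask for a timed script with a strong opening hook."""
    return (
        f'Write a {min_seconds}-to-{max_seconds}-second script for this idea: "{idea}". '
        f"Start with a strong hook in the first {hook_seconds} seconds."
    )


def with_role(prompt: str, role: str) -> str:
    """Step 3: give the model a role before the actual request."""
    return f"Act as {role}. {prompt}"


idea = "In 2026, AI won't replace you, someone using AI will."
role = "a YouTuber with ten years of experience and a skilled public speaker"

print(brainstorm_prompt("AI"))
print(script_prompt(idea))
print(with_role("Rewrite the script from that perspective.", role))
```

Saving the prompts next to the script in the same document keeps the whole process repeatable for the next video.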
6. Module 4: Turning Scripts into Speech - Turning Scripts into Speech: Module four: voice generation, turning scripts into speech. Hey, everyone, and
welcome to Module four. In the previous module,
we focused on writing a script that works
well for AI avatars. Now it's time to give
the script a voice. In this module, we will look
at how to turn your text into natural sounding speech
using AI voice generation. This is a very important
step because voice plays a huge role in how believable and comfortable
your avatar feels. We will be using ElevenLabs
for this part of the workflow. ElevenLabs is an AI voice
platform that allows you to generate
realistic speech from text, work with different
voice styles and control how the voice sounds
and delivers your script. It supports things like
text to speech generation, voice libraries with different
tones and personalities, voice design and voice tools for longer content like
audio books or videos. You don't need to
use every feature. We will focus on
what's actually useful for AI avatars and
video voiceovers. You can sign up for ElevenLabs for free using your
Google account. Every month you get 10,000 free credits for
voice generation. For most beginners, this is more than enough
to get started. In practice, that
amount of credits is usually enough to create
around five or six videos similar to the one we are
building in this course, so you don't need to worry about paying for anything right away. Just follow along,
test the workflow, and see how everything works
using the free plan. In our past projects, we have used ElevenLabs in
many different contexts. We have used it to
record voiceovers for online courses, to create dialogue and narration
for AI cartoons, to voice YouTube videos
and short-form content, and to produce clean, consistent audio for different
types of videos. We are not going to cover all of these use cases in detail here. I mention them to give you
a context and to show how flexible this tool
can be in real projects. This module will focus only
on what you need right now: using ElevenLabs to turn
your script into clear, natural-sounding speech that works well with lip
sync and animation. You will also find a
link to ElevenLabs in the course resource document with useful links
attached to this course. As always, don't worry about
memorizing every setting. Focus on understanding
the process, how to choose a voice
or how small changes in text or delivery
affect the final result. By the end of this module, you'll be able to confidently
turn your script into spoken audio that's ready to
be used with your AI avatar. Before we start
generating the voiceover, let's take a quick look at
the voice library in ElevenLabs. On the left side of the
screen, click on Voices. Here you can explore
a large variety of voices that are
already available. ElevenLabs also gives you the
option to clone your own voice. You can upload an audio
sample with your voice and generate voiceover
without recording every time. You will notice that voices are organized in different ways. You can browse by language, by styles or use case, for example, narration,
social media, or advertising. You can also filter
voices by gender, age, and other characteristics. For our case, we are looking
for something closer to a social media or
advertisement style voice. I have already chosen
a voice for my avatar, but don't feel like you
need to use the same one. Take a moment to
explore the filters, listen to a few options, and choose the voice
you like the most. At the end of the day, this
part is very subjective. It's mostly a matter of taste. Now let's move on and start
generating our voiceover. On the left side of the screen, click on Text to Speech. As I said, I have already chosen a voice for this project. It's Alex, a young
American male voice. Here you can also choose
the voice generation model. All the available
models are good and each one is designed for
slightly different purposes. For my avatar, I'm going
to use the v3 model. At the time I'm recording this, it's the latest model available. If you are going
through this course later and you see
newer models added, a good general rule is to
try the latest one first. In most cases, newer models
offer better quality, more natural delivery or
improved lip sync behavior. Now let's get back
to Google Docs where we saved our video script. From here, we can simply
copy the text and paste it into ElevenLabs to
generate the voiceover. In my case, the script is less than 900 characters long, and ElevenLabs allows you to generate up to 5,000
characters at once. Technically, we could generate the entire voiceover in
a single audio file. However, there is an important
thing to keep in mind. Later we'll be using this audio
for lip sync and we don't want our avatar to be talking for the
entire video length. We'll be applying lip sync
only for specific parts of the video using
different camera angles that we prepared earlier. Some sections we'll also cover the avatar with stock footage
or additional visuals. Another reason for
this approach is cost. Generating long lip sync videos can consume quite
a lot of credits. If you are making
just one video, that might be totally fine. But if you plan to
create many videos, splitting your voiceover
into smaller parts can help you save a
significant amount of credits. So for now, let's generate the first part of the voiceover
and listen to the result. In 2026, AI won't replace you. That line gets repeated a lot. But here's the part people skip. In 2026, AI won't replace you. That line gets repeated a lot. But here's the part people skip. With the v3 model, ElevenLabs usually gives you two different
variations to choose from. Don't worry about the
sound quality right now. I'm recording my screen, so the audio you
hear is compressed. Once you download the file, you will hear how good the
final quality actually is. Listen to both options, choose the one you like the
most, and then download it. Now let's take the next
part of our script, paste it into ElevenLabs
and hit generate. The generation process
is very fast, as you can see. Someone using AI will. Same job, same title. Very different results. Someone using AI will. Same job, same title. Very
different results. Once you've chosen the
option you like the most,
part of the script. You will simply repeat
the same process until all parts of your
script are generated. I'm not going to record every
single repetition here. The goal of this module is
to show you the workflow and help you to understand how to approach voice generation, not to waste your time watching the same steps over
and over again. I will finish generating the remaining parts
in the background, and then we will move forward
to the next step. All right, now all the parts of our script have been turned
into voiceovers. I have downloaded
all the files into a separate folder just to keep everything organized and
easy to work with later. One important thing
to note here: I downloaded all the files in WAV format. The
reason is simple. WAV gives you the best
possible audio quality, which is especially
important when you use this audio for
lip sync and animation. Starting with high
quality audio helps avoid problems later and gives
better final results. All right, at this point, we have turned your
script into a voice. You have seen how
to choose a voice, how to work with
generation models, and how to prepare
clean audio files that are ready for
the next step. You also learned why it
often makes sense to split a script into smaller parts and how this approach
can save time, credits, and give you
more flexibility later. Most importantly, you now have high quality voice files that work well for
animation and lip sync. In the next module, we will take these voice files and
move on to lip sync. You will see how to
apply lip sync in practice using
different AI tools and how the same
audio can produce different results
depending on the workflow. We'll compare two tools side by side and talk honestly
about what works well and what doesn't
and what to pay attention to when choosing
a lip sync solution. When you are ready, let's
move on to the next module.
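The splitting approach from this module can be planned before opening ElevenLabs. The sketch below is an illustration under assumptions: the 5,000-character limit is the figure mentioned in the module and may differ on your plan, the file names are hypothetical, and only the two script parts heard in the recording are included.

```python
# Limit mentioned in the module; check the current value for your plan.
MAX_CHARS_PER_GENERATION = 5_000

# The two script parts heard in this module. Add the rest of your script here.
script_parts = [
    "In 2026, AI won't replace you. That line gets repeated a lot. "
    "But here's the part people skip.",
    "Someone using AI will. Same job, same title. Very different results.",
]


def plan_voiceover(parts, limit=MAX_CHARS_PER_GENERATION):
    """Check each part against the character limit and assign a WAV file name."""
    plan = []
    for index, text in enumerate(parts, start=1):
        if len(text) > limit:
            raise ValueError(
                f"Part {index} has {len(text)} characters, over the {limit} limit"
            )
        plan.append({
            "file": f"part_{index:02d}.wav",  # hypothetical naming scheme
            "characters": len(text),
            "text": text,
        })
    return plan


plan = plan_voiceover(script_parts)
for item in plan:
    print(item["file"], item["characters"], "characters")
print("Total characters:", sum(item["characters"] for item in plan))
```

Numbered file names keep the downloaded audio in script order, which helps later when each part is paired with an avatar angle.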
7. Module 5: Comparing AI Tools - Comparing AI Tools: Module five, lip
sync in practice. Hey, everyone, and
welcome to Module five. In this module, we will
take the voiceovers from the previous module and make
lip sync videos in practice. We will apply the same audio to the same character
using two AI tools, Kling Avatar and HeyGen, so we can clearly compare how each one handles
lip sync and movement. The goal here isn't to
find a perfect tool, but to understand the
difference and choose the lip sync result that
works best for our video. By the end of this module, we will select the final version and use it in the last
stage of the editing. Let's get back to Google Docs, where we saved our script. In the previous module, we generated the voiceover in ElevenLabs by working with the
script in smaller parts. For each audio piece, I noted which avatar shot or stock footage
will go with it. So we basically ended up with a simple
written storyboard. I planned the order of the
avatar shots to keep things moving and avoid staying
on one angle for too long. I also generate a
few extra angles and pick the ones
I like the most. For this video, I will
use three main angles, a front view, a
slight side angle, and a slightly top down shot, and we will switch
between them to keep the video feeling
dynamic and engaging. Now let's move over
to the Kling AI website. In the top left corner,
click AI Tools. As I mentioned earlier, Kling offers a wide range
of tools and models. But for this lesson, we're interested only in
Avatar 2.0, or simply the
latest Avatar model available if you are
watching this course later. The team at Kling releases
updates quite often, so using the newest version
is usually a safe choice. You will also see that Kling offers
a set of pre-made avatars. They can be useful for quick
tests or short-term tasks. But for this project,
we are taking a more professional approach and using our own custom avatar. Click Upload Image on the left and upload your avatar image. Once the image is loaded, upload your first voiceover
by clicking Upload Audio. Before generating the video, you can choose the output
resolution: HD or Full HD. I usually go with the
highest option available. It costs more
credits, of course, but it gives you the
best possible quality, which is especially important for close-up talking avatars. One Full HD lip sync
generation costs 48 credits, which is not cheap, but
lip sync is generally slightly more expensive than
regular video generation. Below, you will see an option
to add a prompt if you want to describe the avatar's
behavior in more detail. In many cases, Kling
automatically suggests a prompt, and in my experience, it actually works quite well. To keep things simple, we will use the
suggested prompt and see how the results turn out. All right, let's move on and check which avatar angle comes next
in our storyboard. Now we go back to Kling, upload the next
part of our audio, click Upload Image, select the next avatar
angle, and upload it. Then we repeat the same process for the remaining audio parts. All right, everything is set. All the pieces of our puzzle
are now generating. All that's left to do is wait for the results and take a look at
how everything turned out. All right, our videos
are generated. So let's take a
look at what we got. "In 2026, AI won't replace you. That line gets repeated a lot, but here's the part people skip." So what do you think?
Compared to the previous version, Kling has clearly improved the realism of the
avatar's emotions. It already feels much better, and I think it will only
keep improving from here. Personally, I like the result. The dialogue feels alive, not stiff or plastic. Let's move on and check
the next generation. "Someone using AI will.
Same job. Same title. Very
different results." This one
also looks really good. I don't see any noticeable
defects or artifacts here, so I think we can
safely keep it. Let's continue and
see what we got next. "A developer without AI writes
everything from scratch. A developer with
AI ships faster, fixes bugs earlier, and
focuses on real problems." Hm. This one I like a bit less. During the head turn, it feels like the head becomes
slightly smaller. It's not a critical issue. So for now, I will keep it, but it's something to be
aware of. Let's move on. "So, in 2026, the question
isn't will AI take my job."
This generation looks great. I will definitely keep
this one and move forward. "The question is..."
All right, and here
we have a quick shot that works well as a transition. That fits our video perfectly. "Will you be the one using it or the one
competing against it? Which side are you on?" All right. Now let's save all
the clips into one folder. As you can see,
applying lip sync to our avatar is a pretty
straightforward process. Next, we will do the same
thing in another application, HeyGen, which is currently one
of the leading tools in lip sync. We will compare the
results and then choose the best shots
for our final video. HeyGen is an AI
platform focused on creating talking avatars
and lip sync videos. It's widely used for educational content,
marketing videos, and social media,
and it's known for stable lip sync results
and an easy-to-use workflow. The last time we used HeyGen
was about four months ago. Back then, we used it to create short videos for
YouTube and TikTok, as well as short-form films. Since then, HeyGen has
released a new model, and that's exactly what we
are going to test today. HeyGen has limits on
how many avatars you can create depending
on your subscription, and additional avatars
require extra payment. I have already reached
the limit on my account, so my wife registered
a separate account. That way, we can
properly show you the full avatar creation process inside HeyGen and walk
through it step by step. HeyGen offers several
subscription plans, including a free one. With the free plan, you can
make a few generations and get a feel for how the service works before committing
to anything. For this lesson, we are
using the 25 euro plan, mainly to properly test the new model and
show you the process, and also because three free videos wouldn't be enough
for today's example. We also want to get the
best possible quality. That said, your setup
might be different. In some cases, a
Kling subscription alone might already be enough. It really depends on
your needs and workflow. All right, now let's move
on to creating our avatar. On the left side of the
screen, click Avatars. Here, you will see that HeyGen gives you two main
options to choose from. You can either clone a
real person, for example, yourself, or create a virtual
character from an image. In our case, we'll go
with the second option, creating a virtual
character from an image, since that fits
our workflow best. Next, we upload
our avatar image. As you can see, HeyGen shows examples of which
images work best, but our avatars are perfectly
suitable for lip sync, so there is nothing
to worry about here. Click Upload and move
on to the next step. Here we enter the basic
information for our avatar. There is nothing special here, so you don't need to spend
much time on this part. Our avatar is now created. To add a voice, click
on the avatar, and then click ZEN Video. You will see many
different options here. You can use voices
from HeyGen's library. As far as I remember, they recently partnered
with ElevenLabs, which we used for our voiceover. But since our audio
is already prepared, we'll upload our own file. Click Upload Audio in
the top left corner. Upload the first audio file
and check how it sounds. "In 2026, AI won't replace you. That line gets repeated a lot, but here's the part people skip." If everything sounds
good, click Add Audio. Once the audio is added, go to the top right corner,
click Generate Video, make sure all the
settings are set to maximum quality and that
there is no watermark, and then click Submit. All right, while
the video is generating, we can move on and create
the next lip sync. Here I select the
six audio file, which I mark in a Google Docs as the one that should be
used with this avatar. Just like before,
we click Generate, make sure all the settings are set to highest
available quality, and then click Submit. While we are waiting for
the next generation, let's take a look at the
result from our first one. In 2026, AI won't replace you. That line gets repeated a lot, but here's the part people skip. To me, it looks very realistic. I'm pretty sure
that most people, if they see this
avatar in their feet, wouldn't even realize
that it's AI. Now, let's compare it with the same video created in Clink. In 2026, AI won't replace you. That line a lot. A replace. That line gets repeated a lot. But here's the part people skip. In the Klink version, the face
feels a bit more plastic, and the emotions
look slightly more expressive compared
to the hygien result. What do you think? Now, let's take a look at
second generation. So, in 2026, the question
isn't will AI take my job. Nice. I really like
this one again. And let's also compare it with the version
created in link. So, in 2026, the question
isn't will AI take my job. So in 2026, the question
isn't AI take my job? In the clink version,
the Avatar thiefs are not visible and the overall quality feels a bit less detail
compared to Hagen. Hagen doesn't hide the thief, and because of that, the result
feels more natural to me. Overall, I think I prefer
the HeyGen version here. All right, we have created
two lip sync videos. Now we need to create the next one using a different
avatar image. For that, we go back to the Avatars section to
create a new avatar, click New Look, upload
the next avatar image, and then click Create Look. Using the same process,
let's also create our third and final
avatar by uploading the next image and
clicking Create Look. Now we take our second avatar and move on to
creating lip sync. Let's check
Google Docs to see which audio files should be
used for this avatar. Okay, here we have two audio tracks that
need to be applied. Before applying them, let's
quickly double check. "AI doesn't replace..." "A developer without AI writes
everything from scratch. A developer with
AI ships faster, fixes bugs earlier, and
focuses on real problems." Yes, that's exactly
what we need. We upload the audio and add it. Next, we follow the
same familiar process. Click Generate, add a
description if needed, and make sure all the settings are set to the maximum quality. This time, the generation
costs three credits, since the audio file is a bit longer than the previous
one. That's fine. In my case, I have
enough credits. Then we click Generate. Alright, almost
everything is generated. Let's take a look
at what we got. "A developer without AI writes
everything from scratch. A developer with
AI ships faster, fixes bugs earlier, and
focuses on real problems." To me, this looks really good. There is no head deformation
like we saw in Kling. Let's compare the two versions. "A developer without AI writes
everything from scratch. A developer with
AI ships faster, fixes bugs earlier, and
focuses on real problems." Honestly, both options could be used with a bit of
post production work, but once again, HeyGen is my
favorite here. Let's move on. "The question is..."
Here, everything looks fine. This clip is too short to really compare,
so let's move on. "Someone using AI will. Same job, same title. Very different results." This one also turned out great. No weird glitches or awkward
gestures from the avatar. Let's compare this version with the one generated in Kling. "Someone using AI will. Same job, same title.
Very different results." The Kling version is a
bit more expressive. In some cases, that
can work well, but because of this
expressiveness, it becomes slightly more
noticeable that it's AI. Now let's take a look
at the final generation. "Will you be the one using it or the one
competing against it? Which side are you on?" This one turned out to be
a great closing shot for this video, with a question that works as a call to action, encouraging viewers
to leave a comment. In my opinion, HeyGen
handles this really well. And even though I'm a
big fan of Kling, which I personally use for
about 80% of my tasks, when it comes to
realistic avatars, HeyGen currently feels
stronger to me. That said, if you are creating a cartoon style avatar, or if you only have a
Kling subscription, or you need to apply lip sync to a shot in an animated project, Kling does a great job. I have used it many
times for those cases, and I can definitely recommend it
for that kind of work. Now let's download all the
files into a separate folder. You have seen the full
process and the results, and from here, you can decide what works best for
your own situation. My goal was to show you
the available options. Alright, let's wrap
up this module. We already have a finished
talking avatar. There are just a
few pieces left: creating background music
for our video using AI, and then bringing everything
together in post production. We are almost at
the finish line. If you have made it this far, there is no reason to stop now. See you in the next module.
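As a side note to this module, the written storyboard from Google Docs can also be kept as a small data structure. This is only a rough sketch and not part of the workflow shown in the lesson. The file names, the shot order, and the helper names are hypothetical, and it assumes a flat cost of 48 credits per Full HD generation, as quoted for Kling above. Actual pricing may depend on clip length and plan.

```python
from dataclasses import dataclass

# Assumption: flat price per Full HD lip sync generation, as quoted in this module.
CREDITS_PER_FULL_HD_GENERATION = 48


@dataclass
class Shot:
    audio_part: str  # hypothetical file name of one voiceover piece
    visual: str      # "avatar" or "stock"
    angle: str = ""  # avatar angle; left empty for stock footage


# Hypothetical storyboard: one entry per audio piece, in playback order.
storyboard = [
    Shot("part_01.mp3", "avatar", "front"),
    Shot("part_02.mp3", "avatar", "side"),
    Shot("part_03.mp3", "stock"),
    Shot("part_04.mp3", "avatar", "top-down"),
    Shot("part_05.mp3", "avatar", "front"),
]


def estimate_credits(shots, credits_per_generation=CREDITS_PER_FULL_HD_GENERATION):
    """Estimate credits for lip sync: only avatar shots need a generation."""
    avatar_count = sum(1 for shot in shots if shot.visual == "avatar")
    return avatar_count * credits_per_generation


def repeated_angles(shots):
    """Find back-to-back avatar shots that stay on the same angle."""
    repeats = []
    for previous, current in zip(shots, shots[1:]):
        both_avatar = previous.visual == "avatar" and current.visual == "avatar"
        if both_avatar and previous.angle == current.angle:
            repeats.append((previous.audio_part, current.audio_part))
    return repeats


print(estimate_credits(storyboard))  # 4 avatar shots * 48 credits = 192
print(repeated_angles(storyboard))   # empty list: no angle is used twice in a row
```

A check like this is mostly useful before generating, since every extra lip sync generation costs credits.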
8. Module 6: Creating Background Sound - Creating Background Sound: Module six, AI
Music in practice. Hi and welcome to
the next module. In this module, we will
focus on generating background music for
our video using AI. I will show you
how to quickly create music that fits the
mood and pacing of our avatar video without spending hours searching
through stock music libraries. We will use two AI tools
that I personally work with, and we will also
use ChatGPT to help us write a clear prompt for
the kind of music we need. The goal here isn't to
create a perfect soundtrack, but to generate clean, usable background
music that supports the video and works
well in the final edit. All right, to understand
what kind of music works best for this type of
video, let's ask ChatGPT. When you're creating this kind of content for the first time, it's totally normal
not to know what music is actually popular or
works well for this format. So instead of guessing, we
will use ChatGPT to help us figure it out and give us a few good directions
to start from. Okay, let's take a moment and go through the options
ChatGPT came up with. Out of all the options, I like the second one, subtle
cinematic underscore, the most. ChatGPT mentioned that
this style works really well for public
speaking style delivery, and I agree. It supports the voice without
distracting from it. So let's take the next step. I will ask ChatGPT
to write a prompt for generating music
in this exact style. Okay, here's our prompt. We will copy it and use it in two different AI tools to
generate background music. The first app we are
going to use is Suno. Suno is an AI tool that is mainly used for generating
music from text prompts. You can create background music, full songs, instrumentals, or simple atmospheric tracks, all just by describing
the mood and style. It's especially popular for
background music for videos, social media content,
demos, and experiments with quick music ideas, without needing
music production skills. One thing I really like about Suno is that it's
really easy to use. You don't need to understand music theory or mess
with complex settings. You just describe what you want and it gives you a result. For our case, we will
use Suno to generate subtle cinematic
background music that supports the voice and doesn't
distract from the message. We will use the same prompt we prepared earlier,
generate the music, and then later compare it with another AI tool to see which result fits
our video better. Let's quickly talk about
Suno subscriptions. There is a free plan, and there are paid options with more
credits and features. For our goal in this course, generating short background
music for a video, the free plan is totally enough. You can already create music, test prompts, and get a feel
for how everything works. If later you decide to generate a lot of music and need
commercial rights, you can always upgrade. But to follow along
with this course, you don't need to pay anything. I will add the links to
Suno and Producer AI in the course resources and PDF file so you can
easily find them later. The second app, as you have already
guessed, is Producer AI. It's not as popular as Suno, but I have been using it
for quite a long time. It has all the features I personally need for my workflow. When it comes to subscriptions, Producer AI is slightly cheaper than Suno, but the
difference isn't huge. There is also a free plan, which is more than
enough if you're generating music just for
personal use or learning. I'm currently on the
Startup plan for eight bucks because I use the music for
commercial projects. But that's a separate topic. For this course, the free
version is totally fine. The main point here is to
compare the results and see which tool fits your
style and workflow better. As you can see, I have generated quite a long list
of tracks here. There are actually
a lot of them. The last time I
generated music in Producer AI was a few weeks ago, but everything is still
saved and easy to access. Over time, this
becomes really handy. You build your own small library of tracks that you can reuse, compare, or take
inspiration from later. All right, let's
start generating. First, in Producer AI,
click New Session. In the chat window that appears, paste the prompt we
got earlier from ChatGPT and click
Submit message. While the music is generating there, let's switch to Suno. In Suno, go to the left
corner and click Create. We only need background
music without vocals, so make sure to
select Instrumental. Now paste the same
prompt and click Create. This way, you're
generating music in two different tools at the same time using
the exact same prompt, which makes the
comparison much clearer. Okay, let's listen to what
Producer AI gave us. Hmm. It's not bad, but it feels a bit too slow, maybe even a little boring. Let's try to make it more
interesting and add more beats. While we are waiting for
the new generation, let's listen to what
Suno created for us. I actually really like the last part of the track. I think it fits our
video pretty well. So let's go ahead
and download it. As you can see,
with the free plan, you can download the track
only in MP3 format. But for our video, that's
more than enough. We don't need anything
more complex here. All right, let's wrap
up this module. In this module, we wrote
a prompt for our music, figured out what kind of
background music works best for our format, and generated music
using different AI tools. We compared the results, picked what we liked, and now we have all the main pieces ready. There is just one puzzle piece left, putting
everything together. In the next module,
we will move to post production and assemble
the final video in CapCut. That's where everything
comes together. See you in the next module.
9. Module 7: Final Assembly in Practice - Editing the Video in CapCut: Module seven, editing
the video in CapCut. Welcome to the final module. In this module, I
will walk through my own video project
and show you how everything comes
together in CapCut. We will go step by step through the key features I
used, from placing the visuals and voiceover to adding music and small
finishing touches. I'm not aiming to show
a perfect edit here. The goal is to share a
clear, practical workflow so you can understand the logic and then experiment on your own. All right, let's
move into CapCut. First, I will show you the
final result I ended up with. After that, we will
go through everything step by step and break
it all down together. "In 2026, AI won't replace you. Someone using AI will. Same job, same title, very
different results. A designer without AI spends
all day on one concept. A designer with AI explores ten directions before lunch
and refines the best one. A developer without AI writes
everything from scratch. A developer with
AI ships faster, fixes bugs earlier, and
focuses on real problems. AI doesn't replace professions. It replaces hesitation. It replaces resistance. It replaces people who wait. So in 2026, the question
isn't will AI take my job? The question is,
will you be the one using it or the one
competing against it? Which side are you on?" All right, this is how my
final version turned out. I don't know if you
noticed, but while editing, I felt that the
original script was a bit too long and slightly
boring at the start. So I trimmed the opening and cut a small
part of the script, and honestly, it feels
much better now. Not everyone will watch a
one-minute video till the end, and our goal is to deliver the main idea clearly,
not to drag it out. So here's what I did next. I imported a folder with all the files I needed
for this project. I also downloaded a few
stock video clips. We could have generated
those with AI as well, but in this case, downloading stock footage was simply faster. I'm using those clips
to fill the parts of the script where the talking
avatar is not visible. I split the screen
into two parts. On top, I placed a stock clip
and below it, the avatar. The idea here is to avoid that first reaction
of "This looks boring. Skip." Instead, the viewer has something to
look at right away, which helps keep
their attention. I also added a bit of motion to the first shot with
the talking avatar, a subtle zoom effect. To do this in CapCut, go to the very first frame of the clip and add a keyframe. Then move to the
end of the clip, slightly increase the scale to the level you want and add
another keyframe there. This creates a smooth
zoom that adds a bit of life and
energy to the shot. Next, I switch to another
shot of the avatar, but from a different angle. Right after that, I
add a short cut to a full screen stock
clip and then bring the layout back to
the split screen again. All of these small
transitions and changes help keep the
viewer's attention, and that's really the
most important part. If you don't catch
someone's attention in the first second or two, they probably won't
stick around to see what the video is about
or what comes next. So the goal here
isn't to be fancy. It's to keep the
video visually alive and give the viewer a
reason to keep watching. After that, I switch
to a stock clip that visually supports the
part about designers. This helps reinforce
the message and makes the idea clearer without
over explaining. This is also where
I added the first subtle hint to subscribe, a small sticker placed
in a visible spot. The key here is to keep it
light and non-intrusive. You don't want to push too hard because that can easily
turn people off. Think of it more as
a gentle reminder, not a call to action
shouted at the viewer. After that, I didn't
overcomplicate it. I kept the layout simple, a split screen with
a stock clip on top,
shown from another angle. Then I decided to
give the viewer a short break from
seeing my avatar's face. For this part, I felt the words themselves
were strong enough, so I played with a very
minimal visual approach, large text appearing
on a black background. I selected the text,
went to animation, and chose a Zoom In animation with the longest
possible duration. I repeated the same setup
for all three words. This creates a clean
pause in the visuals, lets the message land, and helps reset the viewer's attention
before moving on. Next, we move into the
final part of the video. Here, the avatar
appears full screen with different camera angles
switching through the scene. I also alternate
those shots with a subtle zoom effect on more static frames using
a closer camera view. This helps to keep the
ending visually interesting and prevents it from
feeling flat or repetitive. Another important element
here is the background music. The track we have chosen
has a fairly dramatic tone, so I lowered the
volume quite a bit. The goal is for the music to
stay in the background, supporting the mood without
distracting from the voice. Honestly, this part
isn't even mandatory. You could always
add music later, directly in TikTok or YouTube. I mainly wanted to show you the music generation process
as part of the workflow. Whether you use it in
your own project or not, it's completely up to you. The next very important
detail is captions. To create them in CapCut, go to text at the
top of the screen. Choose auto captions, select English as the language
and click Generate. CapCut has a really large
library of caption styles, different fonts, animations and layouts for all kinds of looks. If you're editing in CapCut, I highly recommend
spending some time exploring them and choosing what feels right for your style. Another important element is
transitions between clips. Just like with captions, CapCut offers a wide variety
of transitions. You can find them by
clicking transitions at the top of the screen.
Don't overuse them. A few simple transitions
usually work best. And sometimes it's
also helpful to gently remind the viewer
to like or subscribe. For that, we can use stickers. You can find stickers right next to the transitions section. There are tons of them, and new ones are
added all the time: arrows, highlights, outlines,
callouts, and more. They can be really useful
when you want to point at something specific on the screen or guide the viewer's attention. Take some time to
explore this section. It's more powerful than
it might seem at first. And the final
step is audio mixing. AI voiceovers are
already pretty good, but they aren't perfect yet. Sometimes you will hear
longer pauses between phrases, so it's a good idea to trim the audio a bit to make
it sound more natural. At the point where
two audio clips meet, I usually add a
short fade-out at the end of the first clip and a smooth fade-in at the
beginning of the next one. In some cases, I even slightly
overlap the audio clips. This helps avoid awkward pauses
and keeps the flow going. Since we are working
with short form videos, pace and rhythm really matter,
all the techniques I used to create this video. So yeah, congratulations. You made it to the end of the
course. And that's a wrap. In this course, you learned
how to go from an idea to a finished short video using
AI: creating an avatar, generating visuals,
writing a script, adding voice and music, and finally putting everything together in post production. Thanks for choosing our course. We truly tried to share
our real experience, not theory, but a practical workflow
you can actually use. If this course gave you
new useful skills, we would really appreciate it if you left a positive review. It helps us grow and continue creating practical, honest
content like this. And if you'd like
to keep learning,
feel free to check out
our other guides and continue developing your
skills in this direction. Thanks again for being
here, your Bros Academy.