Transcripts
1. Introduction: Hello everyone. Welcome to the voice cloning
course made by me, Ahad Dragon, also known
as Ahad al Belushi. Anyway, I'm going to teach you this amazing RVC
software that is out there and it is the most
realistic voice cloning software that is out right now. It's even better than ElevenLabs and other things that you
might have heard of. RVC stands for Retrieval-based Voice Conversion,
and I'm going to teach you how you're
going to utilize that and use it to the best
of its abilities. So what is the core structure of what I'm going to teach you?
First of all, I'm going to teach
you how to install RVC locally on your PC. We're going to go step by
step through how you can successfully install it, and it works for Mac and
Windows as well. And secondly, I'm
going to teach you and explain the whole UI or
user interface of RVC. And you're going to
need to understand it before you do
anything; otherwise you won't be able
to even clone your voice. Then we're going to extract
the necessary amount of clean voice data without
any background noise. And then at the very end, we're going to train
the voice model. And I'm going to explain every section so that
you avoid the errors, the common errors
that everyone makes, including me when I first
started out on this whole journey. After we have cloned
the voice model, at the end of the course, I'm going to show
you the applications of your own voice model, including the most
popular one: AI cover songs. I bet you all want to
make some AI cover songs, especially with your own
voice singing in Spanish or Japanese or any other
language you can think of. This wouldn't be possible if I didn't teach you how to
make it sound perfect. Because even if you
make your cover, if it sounds weird,
then what's the point? It needs to sound
exactly like your voice. There are
so many details you need to learn
when you make your AI cover songs or do
other things like converting a voice from female
to male or vice versa. At the very end of
this whole course, I'm going to teach you
how to use that voice, the cloned voice
model, in real time. That is the most
intriguing part for me. There will
be a class project in which you have to
clone your voice. And after you have
cloned your voice, you need to use it properly to make
an AI cover song. So that's it; let's get right into the course.
2. Installing RVC: Guys, welcome to the first part of this voice cloning class. The first thing we're
going to do is to install the RVC software. Let's
get right into it. You've got to go to Google
first and type GitHub RVC. Then you scroll
down until you see Retrieval-based
Voice Conversion, which is what RVC stands for. Then you've got to go
down until you see the green thing which says Latest over here and
updated. Click on it. Then you have two versions, either for Nvidia GPU users
or AMD or Intel GPU users. If you have an Nvidia graphics card, download the one above. Otherwise, if you
have AMD or Intel, download the one on the bottom. Since I have Nvidia,
I'm going to download this one and then I'm going
to wait for it to download. I've already installed it, so I'm going to skip this part. But basically what
you're going to do is, after it's downloaded, you've got to extract it. And then you'll have a
folder that says RVC beta 0717, or whatever version you've downloaded,
or something like that. You double-click on it. You've got to scroll down until
you find the file called go-web.bat, and you right-click on it, choose Show
more options if you're using Windows 11,
and Send to Desktop. The reason we do this is
that we don't want to go through the folder every
time we want to start RVC. You can right-click on the shortcut
if you want and rename it to RVC. I've
already done that. As you can see here, I've
already got my RVC installed. So you got to do that and then you have successfully
installed RVC. Continue on the next
part of the video.
3. How To Clone Your Voice: All right then. So now that
you have installed RVC, it's about time that
you clone your voice. Well, before we start using RVC, you need to actually have
some recordings of your voice. Either you record your voice or, if you have some old voice notes, you can just combine
them in editing software and
then clone from that. But the audio needs to be clean and in high quality, without
any background noise. If you don't have
that, go ahead and grab the best microphone
you have at home. Whether that is your phone
or like this mic over here. Yeah, go ahead and record it. You can record your
voice in maybe like a voice recorder on your
phone or anywhere else. As you can see right
now, I have already recorded my voice previously, so I already have prepared
for it to be cloned. So here you can go
ahead and listen to how it sounds: "My name is Ahad
Dragon. Hello, guys." So this is a good example of how the voice data should sound, without any background
noise and in high quality. All right, so let's go right
ahead and start the RVC. So you double click on it and
then you'll see a command prompt (CMD) window open up and you
just wait for it here. You got to have some patience. It might take a minute
or two for it to run. So as you can see right now, it will open up in
your default browser. In my case, it opened up in Google Chrome. And here we are. So when RVC starts up, your mind will be blown by how many settings and
options you have. But don't worry, I'll
explain it all shortly. It's all easy. You just need to
understand them. There's a bunch of
text and so many things, but you don't necessarily need
to adjust everything here. You have model inference here. You will be able to use your model after you've trained it. Here we will use this
thing for removing the vocals from songs and also removing the instruments
from songs so you can create an AI
cover of your voice. And here in Train Tab you will be able to
clone your voice. This is where we will start. And here it's like an
experimental thing where you can take two voice models,
mix them together, and you can get a
unique voice model, kind of like in fantasy stories or in anime
like Dragon Ball. If you watch Dragon Ball,
Goku and Vegeta fuse together and then they have one unique voice,
or something like that. And here in Export, I'm not sure what this is, but I think ONNX is
some kind of model format. So here you can export
your voice model to ONNX. And here you have some
frequently asked questions, but in my experience, they're not really that useful. If you have any questions, I will have better answers
to them than in here. Here, I think it's like
just some general knowledge in case you run into any issues. But not every error
is listed over here. Anyway, so this is where we
will start in train tab. So let's go ahead
into the train tab. As you can see, there
will be so many things. We're going to start
from the top all the way to the bottom,
where we're actually going to start
training the voice. First of all, in step one, we need to name our voice model. Here you've got to make
sure you don't use special characters other than an underscore or maybe
a minus. If you use any special
characters other than those, because this is using Python
as its programming language, it will run into errors. So you've got to name it
something like, in my case,
Ahad_Belushi. Next, we're going to skip those two, and
we need to choose version two because we want the best quality
possible, right? We want to use RVC v2,
which is the latest and has better features. Then, as you can see
here, we now have three options. You
can choose a lower one for, I'm guessing, a
smaller voice model. But since we want the best, we need to choose
the best option, so 48k. And here it says
pitch guidance for the model, which is required for singing and optional for speech. It says optional, but I
kind of disagree with that, because without it, you won't have emotion in
your voice model, and your voice model will
not have multiple pitches. It will only have one pitch, and it will sound monotone. It kind of goes like
this: "do re mi fa so." If you set it to false, that's
how it's going to sound. So you've got to set it to true, so it can say "do re mi fa so" and
have different pitches. Then it says the
number of CPU processes used for pitch extraction
and data processing. If you want it to be accurate, set it to the maximum. And then step two: here
we're going to bring in the voice data that
you prepared earlier. It needs the
path of the training folder, so you need to go to the
training folder. For example, I have mine in Files
and then Voices. And
my voice is right here. What we've got to do is go back,
right-click
on the folder where our voice data is located,
and choose Copy as path. Then we go back to the browser, highlight all of this,
and paste with Ctrl+V. Then there's the ID of the singer or speaker, which
is your voice model. You could set any number,
but just leave it at zero. I prefer to leave it as the default so we don't run into any issues. Then we click on Process Data, and then we wait. This box will be
highlighted in orange, which means that it is running, so we just got to wait
for it to finish. Once it finishes, it will stop. This is how it will
look when it finishes. It will say preprocess as
well, and it will say Success. In the next part of step two, it will
show you your GPU. If you have more than one GPU, it will show you 0, 1, 2, and here you can
select your GPU. If you
only have one GPU, just leave it at zero.
Otherwise, you should select the best
GPU that you have. In my case, this is the best GPU and the only GPU that
I have, which is zero. And that is an RTX 3080. This actually works with any and every laptop or
PC out there, but the better your laptop, the faster the results. It won't take as long to
train the voice model if you have a better PC. Then you've got to select
the algorithm for pitch extraction. For the best
quality, select Harvest. There's PM and there's Dio, but they're relatively worse. Select Harvest, then click
on Feature Extraction. Now you just
have to wait for it. You can scroll if you want. You can take a look
at the stuff that's happening, but you just
have to wait for it. You can also see the changes in real time over here
in the command prompt. And also, one more
thing: do not close it. I mean, you can minimize the
command prompt, the CMD, but do not click on the X. Otherwise, this whole
program won't work, because it's actually
running in this window; the browser is just a user interface for
us to access it. So now it's done.
It says all-feature-done and it has stopped glowing. We go down to step three. And here we're going to set the settings for training
our voice model. First, you've got to set the epochs. Before we get
into the save frequency, we've got to get into epochs so you understand what they are. What are epochs? Epochs are basically
how many times the AI is going to train on your voice. If you
set it to one, it's going to
train one time; then 20 times, 200 times, and the higher the better. But as a rule of thumb, this is how it should be. I'm going to open up
my notes to show you. If your voice audio
is under 10 minutes, then the epochs should be
between 100 and 200. If your voice audio is equal to or greater
than 10 minutes, then you can go to 200-300. Anything above 300 doesn't really make
much of a difference. If you ever run into issues while setting up these epochs, try lowering them and then
try clicking on Train again, because sometimes, maybe
if your laptop is multitasking or
your GPU is not that strong, I'm guessing it
could have some errors. So you should retrain your voice model
using lower epochs. For the sake of time in this video, I'm going to set
the epochs to 20. Save frequency is how many epochs pass before the software saves
the training model. I usually leave it at five
so that I don't lose progress. But if you go higher,
like 200 or 300, you can set it to 25 or
50, whatever you feel like. It depends on how fast your PC trains the
voice model and on how many
epochs you set. Leave this batch size
per GPU at the default. Even "save only the latest checkpoint
file to save disk space," just leave it at the default. The next one defaults to No;
if you click on Yes, it says large datasets will
consume a lot of GPU memory and may not provide much
speed improvement. Leave it at No. Here, click on Yes, just to be safe, because when it saves you'll
be able to use that model. Then you have the pretrained
base model G path and D path. Those are for
when you retrain your voice model, and since
you're just starting out, you don't need to think
too much about those. But if you want to retrain your models after
you've trained them, you can go to the
RVC folder and find the G path and the D
path of the voice model, and then just paste
the paths over here so that you can
improve that voice model. And then enter the GPU that
you're using, which is zero. At the very end, you've got to click on Train Model. Now you just have to wait. You can look at the command prompt, the CMD, and just wait for it to finish up a bunch of stuff. And then you can
see that it will say something like epoch one, epoch two, and so on until it reaches the end of the number
of epochs that you set. Then it will say success
at the very end. As you can see over here, it says that training is done
and the program is closed, and then it says saving
final checkpoint, success. If it says that,
that means you have successfully cloned
your voice. The model is ready to use right now in any application
that you can think of. In the next video, I'm going
to show you how to use
the cloned voice model.
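The naming rule, the epoch rule of thumb, and the save frequency from this lesson can be written down as a small Python sketch. None of this is part of RVC: `is_safe_model_name`, `suggested_epochs`, and `save_points` are hypothetical helper names, the thresholds are simply the ones quoted in the lesson, and the idea that a checkpoint is written at every multiple of the save frequency is an assumption.

```python
import re

# Hypothetical helpers, not part of RVC. They only encode the rules of thumb
# mentioned in this lesson.


def is_safe_model_name(name: str) -> bool:
    """Allow only letters, digits, underscores and minus signs."""
    return re.fullmatch(r"[A-Za-z0-9_-]+", name) is not None


def suggested_epochs(audio_minutes: float) -> tuple[int, int]:
    """Return the (low, high) epoch range suggested for an amount of voice audio."""
    if audio_minutes < 0:
        raise ValueError("audio_minutes must not be negative")
    if audio_minutes < 10:
        return (100, 200)
    return (200, 300)


def save_points(total_epochs: int, save_every: int) -> list[int]:
    """List the epochs at which a checkpoint would be saved.

    Assumption: a save happens at every multiple of the save frequency.
    """
    return list(range(save_every, total_epochs + 1, save_every))


if __name__ == "__main__":
    print(is_safe_model_name("my_voice-01"))
    print(is_safe_model_name("my voice!"))
    print(suggested_epochs(8))
    print(save_points(20, 5))
```

With the values used in the video (20 epochs, save frequency 5), `save_points(20, 5)` lists the four points at which a checkpoint would be expected under that assumption.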
4. Using Voice Model in Conversion: Right now it's about time to
use the voice model in RVC. Last time we trained the
voice model in the Train tab. Now it's available for use
in Model Inference. If you go to Model
Inference, click "refresh the voice list and index
path," and open this up, you're going to
see a bunch of models, because I've already
cloned many voices. Of course, you need
permission to clone voices. Don't just clone voices. All right, if you
go through this, you're going to find
your voice model. In my case, I have the cloned
voice from the last video, which is Ahad Belushi. So now you select it and then
you can use it over here. So there will be a
bunch of settings, but right now we're going to use this only for speech to speech. The way we're going to do
that is to take any audio from YouTube or somewhere like that, download it, and then
place that audio here. Then we're going to convert the voice of the
person to our voice. So let's do that.
I've gone ahead and downloaded an MP3
audio of this video, "Thoughts on humanity,
fame and love." Shah Rukh Khan, the famous
Indian actor, said some really nice things
in it, and I wanted to see how it would sound if it were me who
said that stuff. So I've downloaded the audio,
and now I'm going to use it in RVC to convert his
voice into my voice. So let's go right
ahead and do that. First up, you got
to go ahead and find the path of the audio whose voice you want to convert
into the voice of
the voice model. So you right-click on the file, instead of the folder
this time, choose Copy as path, paste it over
here, and then you're ready, as long as the conversion is male to male or
female to female. If it's from male to female, you've got to change this option right here. As you can see, it even recommends values: it says plus 12 key for male to female conversion
and negative 12 key for female to male conversion. Right now it's male to male,
so that means we just
leave it at zero. But if it were male to
female, like to Rihanna or someone else, we'd have to increase
that by 12. If it were female to
female, zero as well. But if it were female to male, like Rihanna to my voice, I would set
it to negative 12. For example, if you
were a woman and you were converting Michael
Jackson's voice to your voice, you'd have to set it to positive 12 for it
to work properly. So right now, since
it's the same gender, we just leave it at zero. And then we've got to go down
and set this to crepe, because it has the best
quality for voice conversion. And here there are a
bunch of other settings. You can go through them
and see what they do.
I'm going to just give you a
few explanations. This one is for filtering
breathiness, and this one is for
preventing artifacts. But you need to balance them. Each setting does something
one way or another, like this one over here for mimicking the volume of
the original vocals,
whether high or low.
This one is for
resampling the audio, and this one is for accent strength. You get the idea.
The recommended setting is just to slightly
decrease the accent strength, and that's all, because if it's too high it
could lead to artifacts. Now we're all ready
to go and click on Convert. Now as you can see over
here, it's loading, so we just have to wait
for it to finish loading. Then we will get our results soon while our audio
is being converted. Down here it
says batch conversion. This is for converting
multiple audio files. I've never used it, so I'm not sure how it's going to work, but I'm guessing it does
what it says that it does. You don't need to really
go through all this. You can just focus on
the one on the top. So there we go. Our
audio is ready; let's hear it. So yeah, when the audio
is being converted, it converts literally
everything, like the sound of clapping, the sound of music, and so on. That's why you need
to bring an audio that only has speech
or someone's voice. If you have any other noise, it will convert that as well. But it's fine in this case because it's only at the beginning. Let's skip ahead for a bit. "Seems a bit careless of me. Now, I do remember the night my father died and I remember the driver of a neighbor who was driving us to the hospital. He mumbled something
about that people don't tip so well and
walk away into the..." So yeah, you can see that the
voice has been converted. But unfortunately, since the original audio
has some echo and the sound of the audience, the crowd and
all that noise, they all factored into the quality of the
final conversion. So you've got to find
something that sounds good in order
for you to convert it. We're going to get
into that in the cover song lesson. But before that: this
is speech to speech, right? I'm going to
show you real quick how you can use your voice
for text to speech. Unfortunately, though,
text to speech requires you to pay for a
subscription to ElevenLabs. As you can see here, it says
get started, then you sign up, and
then it asks you to pay. And when you pay, I'll
show you what you can do. All right, so we're in ElevenLabs. Since I've already paid for the subscription previously,
I can access it. So how do you use it
for text to speech, You click on the
plus sign over here, then you click on
Instant Voice Cloning. Then you name it whatever
you want. For example, my
name, Ahad Belushi. Then you've got to click over here and select your voice, not the voice model. You've got to select the recording that you recorded
earlier. All right, I'm going to select the file of my voice recording,
then confirm that I have
the rights and consent, click Add Voice, and
just wait for it. And now you
can use your voice. You can type anything in
any language and it works. So let's say I'm going to
type something like "Hello there, I am Dragon." "Hello there, I am Dragon." I mean, sometimes it
sounds a bit weird. You got to play around
with the settings. Keep moving forward, everyone. The more data that you get, the better the results. Let's see how I
sound in Spanish. Oh nice. That's me speaking
Spanish apparently. So this is how you
use text to speech. So that's it for this part, and in the next part, we're
going to try out a cover song.
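The transpose recommendation from this lesson can be sketched in Python. The sketch assumes that one "key" in the RVC interface is one semitone, so that 12 keys shift the pitch by a full octave; the function names are hypothetical and only illustrate the arithmetic.

```python
# Hypothetical helpers: they encode the +12 / -12 / 0 starting points that the
# RVC interface recommends in this lesson.
# Assumption: one "key" is one semitone, so 12 keys equal one octave.


def suggested_transpose(source: str, target: str) -> int:
    """Starting transpose value for converting a source voice to a target model."""
    valid = {"male", "female"}
    if source not in valid or target not in valid:
        raise ValueError("source and target must be 'male' or 'female'")
    if source == target:
        return 0
    return 12 if source == "male" else -12


def pitch_ratio(semitones: int) -> float:
    """Frequency multiplier for a shift of that many semitones (equal temperament)."""
    return 2.0 ** (semitones / 12)


if __name__ == "__main__":
    for src, dst in [("male", "male"), ("male", "female"), ("female", "male")]:
        shift = suggested_transpose(src, dst)
        print(f"{src} -> {dst}: transpose {shift:+d}, pitch x{pitch_ratio(shift):.2f}")
```

Under that assumption, plus 12 doubles the pitch and negative 12 halves it. Treat the result as a starting value and adjust it by ear.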
5. AI Cover Song W/RVC: Welcome everyone. It's
about time that we use our voice model
for an AI cover song. So let's get right into it. First of all, you need to go to
YouTube and choose any song that you'd like to convert into your voice. In my case, I would like to have an AI cover of "Diamonds." So first you've got to download
the audio of the song. You can use any
website or tool to download the audio. I'm going to use the one I have, which is 4K
Video Downloader. I recommend it. Once you've
downloaded your song, you need to rename it so that
it doesn't cause any errors. You need to remove the spaces
and press Enter. Let's get right into the
voice conversion work. We've got to go to
the vocals and accompaniment tab,
which is the instrument
separation tab. Over here, you should clear the path
field, because it has issues. Do not use the path.
Instead, drag the audio file and drop it over here. Then, in the model list,
which is not the voice model, you need to select HP3
all vocals. The other models do
different kinds of things. It's explained over here;
some of them are
for removing echo or getting only the main vocal. But for the purpose
of this tutorial, you should use all vocals. And then here,
there's an issue. It says here the
output for vocals, and here the output
for accompaniment. In fact, it's actually
the opposite. This one should be
the output path for instruments, and this one should
be the output path for vocals. I guess they
mixed them
up while naming them. So what I
like to do is have a folder for vocals
and one for accompaniment, which is the instruments, just like this right over here: Instruments plus Vocals. When you
go into it, there's an instrument folder and
a vocals folder. Right-click on the
vocals folder, choose Copy as path, and paste it in
the accompaniment field, because as I mentioned, they're
mixed up for some reason. Then we go back, right-click on
the instrument folder, copy its path, and paste it in the vocals field. Then we can export
in any format.
I would like to
export in MP3. Then we click on Export. Then we can look at the command prompt, the CMD, and wait
for it to do its magic. So as you can see here,
it's doing its work. It's separating the vocals from the instrument so that
we can use the vocals. All right, so when it's
done, it might give you red text which says "no
such file or directory." It gives you the false impression that maybe you ran
into an error, but it actually worked. The proof is that over
here it says success. So you can go right into
the folders and see the instruments and
vocals. Go into vocals.
For some reason
the file name will say instrument, but it is not the instrument. I think it means that the
instrument was removed, so only the vocals are left. We're going to use the vocals to convert the voice of
Rihanna into my voice. Let's get right into Model
Inference over here. Since we're doing a
conversion from female to male, Rihanna to me, set this to negative 12. Then
we've got to copy the path of the vocals file, which for some reason
says instrument, and paste it over here. Then
I think we're all done. We click on Convert and let RVC do its magic. All right, so it is all done, but this is only the vocals. Let's listen to it a bit. All right, so you
heard the vocals. Now we need to actually mix the vocals with
the instrument so it sounds like a full AI cover. You can do that with any editing software that you have, Premiere Pro or something else. I'm going to show you
an example of that. First you've got to click
on the three dots over here and download the
audio to your PC. Then you can rename it to
something like your name; in my
case, Ahad Diamonds
Vocals. There you go. Let's go right ahead and use editing software
to make a nice cover. When you mix your vocals
with the instrument, note that the instrument file is not called the
instrument over here. When you go to the instrument folder,
for some reason it's
called vocals. I don't know why RVC mixes it up. It's not the vocals, guys; it's actually the instrument.
We place it over here, and bang. Now you can go ahead and listen to the cover in your voice. Wow. Since this is AI, it
might not sound perfect, but, well, at least we know how I sound
when I sing this, don't we? So that will be
your class project. You should show me
which song you used and show me the
final result of your AI cover after
separating the vocals and instruments, after converting the vocals
into your voice, and after mixing
both
your custom vocals,
which are made by AI, and the original
instrumental together. All right, you might need
some editing skills. I'm not going to teach you that; you should learn it yourself. But show me the final results
and impress me. All right, guys, I'll
see you in the next lesson.
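Two steps from this lesson lend themselves to a short Python sketch: renaming the downloaded song so that it has no spaces, and mixing the converted vocals back with the instrumental. The lesson does the mixing in an editor such as Premiere Pro; the sketch below instead builds a command for FFmpeg's `amix` filter, which is a swapped-in alternative and assumes FFmpeg is installed. The function names and file names are hypothetical.

```python
import re
from pathlib import Path

# Hypothetical helpers. The file names below are placeholders, not files from
# the lesson.


def without_spaces(filename: str) -> str:
    """Replace each run of whitespace in a file name with a single underscore."""
    return re.sub(r"\s+", "_", filename.strip())


def build_mix_command(vocals: Path, instrumental: Path, output: Path) -> list[str]:
    """Build an FFmpeg command that overlays two audio files with the amix filter.

    This stands in for mixing in an editor. amix may lower the level of each
    input, so the volume can need adjusting afterwards.
    """
    return [
        "ffmpeg",
        "-i", str(vocals),
        "-i", str(instrumental),
        "-filter_complex", "amix=inputs=2:duration=longest",
        str(output),
    ]


if __name__ == "__main__":
    print(without_spaces("Rihanna - Diamonds.mp3"))
    cmd = build_mix_command(
        Path("my_vocals.mp3"), Path("instrumental.mp3"), Path("cover.mp3")
    )
    print(" ".join(cmd))
    # To actually run it: import subprocess; subprocess.run(cmd, check=True)
```

The command is only built and printed here, so the sketch runs even without FFmpeg; passing the list to `subprocess.run` would perform the mix.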
6. Voice Conversion Without GPU: All right guys. In this
part I'm going to show you how to use your
voice model for an AI cover song without
a graphics card. This also works on your
phone, by the way, as long as you have
a .pth voice model file prepared already, or maybe you've downloaded one
from somewhere like Discord,
prepared by somebody else. I'm going to show
you how to do that. First of all, you need
to open up your browser, whether that is Google
Chrome or something else, and then search for Kits AI, just like that,
as you
can see over here. Press Enter, and then
you'll have those results. Click on My Voices. Basically, this website is going to allow you
to make an AI cover song just by using the engine
that runs on its servers, but it also has some limitations, and I'll get to those. First, you need to go
ahead and sign in. When you sign in, it
will ask for your email. If you have Gmail already, just continue with Google and wait for it to load.
So it will load. Then you will see over here that the limitation is that
if you're doing this without any subscription,
for free, you'll have a time limit. For example, right now
I have 15 minutes left. That means I can only convert up to 15 minutes of audio,
not more than that. And I think it regenerates. As you can see over here, it will refresh on, I think,
the 14th of January. So I think it's monthly;
it will refresh
every month. There's also a character limit
for text to speech, but I recommend using text
to speech on ElevenLabs. It's not really that
good over here. I mean, as far as text to
speech is concerned, I tested it
a week or two
ago, and it's not that good. So we're only going
to focus on the AI cover song. To make your voice work
over here, you need to
click on Uploaded in the My Voices section and then
click Upload your first voice, or maybe it will be your second or third. After you've done this, you're going to click on Voice Model over
here and you're going to look for your .pth
file wherever you've put it. If you don't
know where it's located, you need to go to
the RVC folder that we set up
in the previous videos. You'll find
your .pth file in the weights folder in
the RVC beta software. When you go there, click on weights and you'll
have your .pth files. Maybe you'll only have
one, since you're
starting out and this is
your first voice model. So you'll have the
.pth file somewhere, maybe at the top.
Double-click on that. And then you can name it. I'm going to name it my own name because it's my
voice. And then go down. You can also add a display
image if you'd like. I'm going to skip that, but I'll show you anyway. You click on that,
go to Pictures, and then you can just pick your own image or whoever it
is, and then click Upload Model. As you can see on the bottom right over
there, it's uploading
the
voice model that we prepared in RVC v2,
or maybe one you've downloaded from Discord, so that it can be
used on the website. Once you've uploaded it, you can use this on your phone. It doesn't even require
a strong GPU or anything. It
works without a GPU. You don't need
a very strong laptop. You only need
an Internet connection and a browser, and that's it. That means it can
probably even work on your Nokia. As far as I'm concerned, it can work with any phone or laptop you can think of, as long as you've uploaded the voice
model to the website. After you've uploaded your
voice model,
you can see you have
different options over there. For voice input, we have audio file, where you can either drop the vocals or enter the link
of the song on YouTube. There are other options over here, like recording your own voice or somebody else's and then converting it into
the model. So it's similar to
the RVC voice conversion that I showed you in
the previous videos. We also have
advanced settings; we're going to get to those. Right now we're
focusing on the AI cover song, so I'm going to go to youtube.com and
search for "Country Roads," one of the most popular
songs known to planet Earth, and click on it. Sorry about that.
That was a bit loud. Anyway, I'm going to copy the
link of "Country Roads," click on Enter YouTube link,
and paste it over here. And there we
go, we have it. So we go to Advanced settings. And here it's very important, just like I showed you
in the previous video: when we convert voices, we need to be careful
with the pitch. If it's male to male or
female to female (right now it's male to male), zero is fine. But sometimes you need
to adjust even if it's the same gender. And
if it's a different gender, for example male to female, you need to increase
it by 12. If it's female
to male, the opposite, that means negative 12. And if it's female to female, just like male to
male, it's zero. But sometimes we need to adjust according to the pitch
of the person's voice, because we don't
have only one pitch for male and one
pitch for female. Every person on this planet has
a different pitch. Generally speaking, though,
when
you convert between male and female, it's usually somewhere between negative
12 and 12. Anyway, right now it's
John Denver's voice to
my voice, male to male. That means zero over here. I mean, I can also adjust the pitch if I feel like
it doesn't sound good, but I'm just going to
leave it at zero for now. For conversion strength, I would
recommend decreasing it, maybe to 75% or 70%, because high
values, as it states here, may lead to overcorrection
and artifacts. And model volume, just
leave it as it is. That's the volume
of our voice model. Then here I recommend just
turning off all of these. The compressor is basically for when the song gets too loud. High pass and low pass
are just
audio effects. If it's too noisy, there's a gate
for that, kind of like the compressor.
Then there are the post-processing
effects. You can add chorus
to make your voice sound like,
I don't know how to explain it, but it's kind of like autotune. Reverb and delay are like echo. And the
compressor is basically what we
mentioned over here:
if it's getting too loud,
it will compress it and make it lower. But we're just going
to leave it as it is. We're going to click on Convert. Then we can wait. This is also nice. It's very simplified,
in my opinion, much simpler than in RVC,
because when you put the link, basically it's going to
do everything for you. It's going to split
the vocals from the instruments, then it's going to
convert the vocals to your voice, and then it's going to mix
both of them together. So you don't need
to do all that;
you just need to
give it the link
and some
settings, and then boom. Now all you have
to do is wait a bit and
then it will work. Remember, this works with
laptops and with phones,
anything that you
have. Right now, we used a YouTube link to make the AI cover song. It will do everything by itself,
vocals or your own speech, you can also put them here, because this is not
just for AI cover songs. This is also for when you have a separate vocal that
you've already prepared, maybe like a custom one, I don't know, from somewhere
outside of YouTube. Then you can just put it over here and boom, it works.
Or if you have some kind of speech, just like with the RVC conversion that I
showed you in the previous videos, you put it here and then, same concept,
you go to advanced settings and then you can adjust the pitch and stuff.
It's just similar stuff but different inputs. As you've seen previously,
either you put a YouTube link or you put your own audio. That audio could be
speech, could be vocals, could be anything. Or you could record audio, which
is also here in the options, and there are some others. Then there was one
last thing. It's like a beta thing, but it doesn't matter right now.
Right now, it took around 2 minutes to make this. So let's hear it, shall we?
And also, you can download it, by the way, and you can
also share it and stuff. So let's hear it. So once you click on play, you
need to wait. "Almost heaven, West Virginia." Yeah, it sounds pretty good,
right? Wow. The cover song, like, it has advanced. It sounds so much better
now. Anyway, back to the topic. So when you go to audio file, either you put
your own vocals or whatever voice you have that you want to convert, or you
bring a link from YouTube, or you record over here, or free demo audio.
I'm not sure. It says you can try some model for free, but it doesn't matter
anyway. So that's it for this video. This is the website. It's called
app.kits.ai. But do not go and type this in your browser, like app dot kits
dot AI, because for some reason it gives you an error. What I would suggest
is you type it over here, "app Kits AI", and then you go into this link over
here, because that's how it works for me. Yeah, that's it. And I'll
see you in the next class.
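As a rough sketch of the pitch rule of thumb from this lesson: the table below just writes down the starting values mentioned above, and the second helper shows why 12 is the magic number, assuming the pitch setting is measured in semitones (12 semitones is one octave, which doubles the frequency). The names `PITCH_OFFSETS`, `suggested_pitch` and `frequency_ratio` are made up for illustration and are not part of Kits or RVC.

```python
# Starting pitch offsets from the lesson, in semitones (assumed unit).
# Keys are (voice in the input audio, voice of the model).
# All names here are hypothetical, for illustration only.
PITCH_OFFSETS = {
    ("male", "female"): 12,
    ("female", "male"): -12,
    ("male", "male"): 0,
    ("female", "female"): 0,
}


def suggested_pitch(source: str, target: str) -> int:
    """Return the starting pitch offset for a source-to-target conversion."""
    return PITCH_OFFSETS[(source, target)]


def frequency_ratio(semitones: float) -> float:
    """Equal-temperament ratio: each semitone multiplies frequency by 2**(1/12)."""
    return 2.0 ** (semitones / 12.0)


if __name__ == "__main__":
    for pair in PITCH_OFFSETS:
        shift = suggested_pitch(*pair)
        print(pair, shift, round(frequency_ratio(shift), 3))
```

These are only starting points. As mentioned in the lesson, you may still need to nudge the value up or down depending on the actual voices involved.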
7. FINAL: Real Time Conversion Using Okada: Hello everyone. Welcome to the last part of
this video lesson. It's going to be about using the voice in real time,
using a software called Okada. So let's go ahead and install it.
First thing you need to do is go to Google and search up Okada GitHub.
After that, you'll get those results. Enter this link, click on it, then
scroll till you see a Table of Contents. Over here, you click on this one.
Disclaimer: this does not work on Mac. Do not try, even if it says Mac.
Do not install on Mac because it doesn't work; Mac is too weak for it.
Windows usually works for it. If you're a Windows user, go ahead and
download this one. If you're a Mac user, you should just buy a Windows PC.
when you get here, you scroll down, then you'll
find different versions. As you can see on the bottom left of the screen
in the corner, when you hover on the links, you can see the
different versions. I look at the latest version
and download it, in my case, at the current date, which is 121-22-0203 it seems that
this is the latest version. Go ahead and download that. Click on it and then
clad right over here and then wait
for it to download. I've already
downloaded one before. I'm going to cancel it. I'm going to show you how
what you're going to do. Basically, you will
have this zip file, you need to click on it. And then if you're
using a Windows 78, a different version, you
will have those options. Maybe you will not have wind, maybe you'll have this on that. Or if you have
this extract here. Once you do that, you'll have this folder right over
here. Double click on it. Double click again. Yeah, you'll have these types of folders
and files over here. We need to look for something which says, let me see,
I think it's called start_http. Yeah, there it is. This is the executable
file. What we're going to do is right click on it, then Show more options,
and Send to, Desktop. Click, then you'll have it on the Desktop right over
here. You can go ahead and rename it if you want. I'll call it "2" because
I already have it installed. Then you double click on it. Once you click on
it, it will take a while. On the first run it will show download bars and
installation stuff. After that, once it finishes, it will pop up with this
white window. Yeah. Also, when you run it again, you don't have to go
through the installation thing, you just double click on it. That's it.
Once you have this window, click on Start. Yeah, at first you'll have four
models, or maybe five preinstalled models that you can use. They're all
like female voices. If you go down, you'll have
different kinds of settings. Maybe this will be confusing, but I'll explain
it right now. First up, you've got to choose your model. For example, I
click on a female model right now, then it's ready to go, and you can change
the gain in and out. But one important thing is you've got to change the
tune. Just like in RVC, when it's female to male, you've got to tune it,
also male to female, and maybe sometimes male to male or female to female,
depending on the pitch of the voice. Right now, since my voice is deeper, I
need to increase it. I'm going to increase it by positive 12, just like we
did in the other software. But in the other software, we were changing the
model's voice to our voice. Here I'm changing my voice to this person's
voice. It's the opposite. It's like reversed. Since it's male to female
right now, I have to make it higher pitch, positive 12. You can also adjust
the ins and outs if you want, but just leave it as it is, and then you go
down here. Also, it's better to use this with
a microphone if you have one, an external mic just like this one over here.
It will work so much better than a mic that's on the laptop. But you can go
ahead and try and see. Sometimes it works, but you might run into some echo
issues. To combat that, you have these options over here called echo,
suppression 1, and suppression 2. They work, but not the best. If you have
an external mic, it will work much, much better. And then here you should
select rmvpe_onnx and then scroll down. Leave those as they are, 256 and
4096. Those are like the best ones. GPU: if it's on CPU, change it to GPU 0.
Then here you must select your input, and that's the microphone that you're
going to use. In my case it's the Elgato Wave:3 mic. Output is where you're
going to hear the voice. Before I teach you how to use it on Discord, first
we're going to need to hear it. I'm going to select my headphones so I can
hear it. And then go down here to advanced settings. Change the Trancate to
300 and RVC quality to high. You must always change this to high once you
restart to get the best results. After that, you can go right ahead and
click on Start, so you can hear yourself talking. You've got to give it a
minute and then just speak. Hello there. Hello there. My name is Ahad
Dragon. I'm here to teach you
how to fight enemies. Yeah, as you've heard, it works really nicely. And you
can also use your RVC model that you've trained in previous lessons. How are
you going to do that? First, we're going to locate your RVC model. How to
locate it: go to Desktop. Remember that shortcut that we made for RVC?
Right click on it, click on Open File Location, and then scroll up till you
find the folder called weights. Maybe it will be located differently for
you. If you didn't find it, it could be somewhere in logs. Either you look
for it yourself, or you search over here for weights by typing "weights"
and pressing Enter. And then it will show you the location of the folder.
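If the weights folder is hard to find by hand, the same search can be done with a few lines of Python. This is only a sketch under assumptions: the RVC location at the bottom is a placeholder you would replace with the folder that Open File Location shows you, and the helper name `find_pth_models` is made up for illustration.

```python
from pathlib import Path


def find_pth_models(rvc_root: Path, folder_name: str = "weights") -> list[Path]:
    """Collect .pth files from every folder named `folder_name` under rvc_root.

    The folder name is compared case-insensitively, so "Weights" also matches.
    """
    if not rvc_root.is_dir():
        return []
    models: list[Path] = []
    for entry in rvc_root.rglob("*"):
        if entry.is_dir() and entry.name.lower() == folder_name.lower():
            models.extend(sorted(entry.glob("*.pth")))
    return models


if __name__ == "__main__":
    # Hypothetical install location; replace it with your own RVC folder.
    root = Path.home() / "Desktop" / "RVC"
    for model in find_pth_models(root):
        print(model)
```

Each path it prints is a model file you could later pick in the upload dialog.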
You go right there. And then once you find it, right click on it, Show more
options if you're on Windows 11, and then Send to, Desktop. That way you
will have easy access to your models, because they're all located here on
the desktop. Right now, I have different models over here. You might have
one from the previous lesson. Then when you go back to Okada, you can go to
Edit. Scroll down to an empty, like blank, slot. In my case, it's 12 right
now, because I have others that are occupied. Click on Upload. Click Select
File, then go to the folder location of your model. Go to Desktop and then
scroll till you find the weights folder that you've created. You can also
rename it if you'd like. Double click on it. Then select the PTH file that
you have for the model that you created already. For example, my voice,
Ahad V2. And then I click Upload. One more thing: Okada requires an
Internet connection, so make sure you have a great Internet connection.
Yeah, once you've selected that and clicked on Upload, just close, and then
you will find your model somewhere over here. It will be the last one that
you've created. So I have it right here. Click on it. Since it's my voice,
it will sound like my voice, so I don't need to tune it or anything. So I'm
going to click on Start. Hello there. Hello. My name is Ahad Dragon. My name
is Ahad Dragon. I am here to teach you to
fight enemies. Fight enemies. So yeah, that's my voice. Now I'm going to
teach you how you can use those voices in real time in Discord or other
apps as well. I'm going to show you on Discord, and then I'm going to give
you a general idea of how you can use it in other software. I'm not going
to show you all the other software; I'm just going to show you Discord so
you get an idea of how you can use it elsewhere. Let's go to Google and
search for VB-Cable. Once you search for that, you'll have different links
over here. You click on the first one, then you download the Windows one,
because either way, this is only going to work on Windows. It's going to
download, give it like 5 seconds, I guess. And then you click on it. Then
you need to extract this. Right click and Show in folder, then you right
click again. Then, just like with the previous zip file, we're going to
extract. Once you extract it, you'll have this pop up. Scroll down to find
VBCABLE_Setup_x64. Right click on it and Run as administrator. Then after
that, here it should say Install Driver or something. Since I already have
it installed, it says Remove Driver, so I'm not going to do that. I already have it.
And then after that, you're ready to use your real time voice conversion.
First of all, you need to restart the Okada software, so we're going to do
that. We're going to restart it. Yeah, just wait for it. As you will notice
here, it's faster. I didn't show you how long the installation takes the
first time because I don't want to take much time from you. But basically,
the first time is going to be longer. The second time is going to be as
fast as this. Yeah, once we have the VB-Audio cable installed, we scroll
down. We'll leave the input as our mic, as it is. And then the output, we
change it to CABLE Input. Basically, VB-Audio cable is like a virtual cable
that we cannot see, which connects the voice changer software that you're
going to use to change your voice and the software that you're going to use
it in, like Discord right now. First we connected the output of the voice
changer to the input of, what's it called, the cable. Basically, it's going
to input the voice into the virtual cable. Then we're going to connect the
output of the cable to the input of Discord. Here in the output we selected
CABLE Input, right? Then we go to Discord. We go to Settings, we
go to Voice and Video. And then in the input device, I'm going to change it
to CABLE Output. Now if I use my voices over here, if I click on Start, it's
going to work. Then after that, if you want to use it for software that does
not have input and output settings, you can go ahead and change the default
by going to the Windows settings, by searching "settings". And then here you
search for "input", and it'll show you sound input. Click on it. Then here
you just change the input to CABLE Output, this one. Once you do that (I'm
not going to do that right now because otherwise you won't hear me), you're
ready to go prank people or just have fun, have a laugh with your friends
and that kind of stuff. Yeah, so all in all, that's the end of the project.
I hope you've enjoyed it. Make sure you finish
your project that I've assigned you in the previous lessons. And also I hope
you'll leave a rating for me and for this class. And maybe, I don't know if
there is a comment section, but tell me what you think or message me or
something. Tell me if you've enjoyed it and if you would like different
lessons. I'm here to help you. And if you have any other questions, ask me
in the comments or wherever, in DMs. If you'd like me to do any kind of
service for you, I am available on Fiverr. You can find me as SamarB at
Maya 8156, and those are my gigs. I can teach you face to face, like one to
one live, and I can copywrite for you. I have art. I can create RVC v2
models for you. If you'd like, hit me up on Fiverr. I'll see you next time. Bye bye.